Privacy and Data Ownership Need to be at the Core of our Technology

April 18, 2018

We need to talk. Not just you and me, society as a whole.  The foundation of free society is eroding and we have only a brief time to shore it up. We need to talk about data privacy and balancing it with technological advancement.


Emerging danger

Technology constantly introduces new features and new techniques and opens up new possibilities. Most of the time this is good: things get easier, faster, and cheaper. Tasks which were difficult and prohibitively expensive a few years earlier are suddenly one click away as fun Snapchat filters.

These changes can be hard to keep track of but aren’t really dangerous if you don’t keep up. Life won’t dramatically change if you miss out on a few generations of the latest gimmicks and graphic design trends.

But something pervasive and of unprecedented importance has happened in technology over the last decade. Society is now walking along the edge of a privacy precipice, and if we aren’t very careful with the next few steps we might never recover from the fall.


Why are things this way?

I think it is important to look at what has led to this state of things. Beginning around 2000, those of us in computer science started exploring things like “data mining”, “big data”, and most recently “machine learning”. Like most technology, there was no moral intent behind these techniques; they were just methodologies that emerged as interesting approaches to tough problems.

What really makes these technologies unique from a societal perspective is not the code. It is the input these systems require. They need information … lots and lots of information.

Anyone who is well versed in machine learning can confirm that one of the early considerations when planning to use this type of “AI” is identifying a source of massive amounts of the right kind of data for that problem.


Unquenchable thirst for data

The impressive results from data analysis have set off a new gold rush for data. For many companies, finding data has become the most imperative task, bar none.

This has led to services offering every enticement they can think of to obtain this precious commodity.  Facebook isn’t alone but is the king of this. In exchange for your checking “I accept” on their Terms of Service, they offer unlimited usage, boundless storage, and ever-growing functionality.  They offer convenience, power and entertainment. All for a few simple permissions on your cell phone.


Quest for excellence

Once Facebook had access to it all, they splashed in the pools of data they amassed.  Much like a child visiting the ocean for the first time, they played with it to see what kinds of castles they could build. Matching algorithms, suggestions, predictions. Finding new ways to leverage this data was the sure way to corporate recognition and business success.

And they excelled at it.  Facebook is better at it than anyone.  NOBODY knows *you* better than they do.


After you have it all, what comes next?

Looking for more led to partners willing to push the boundaries.  Individuals and companies were driven by the same quest for more knowledge extracted from this data.

Facebook’s claim has always been that it’s a matching service: a company details the type of customers it wants to target, and Facebook places the ads for that company. The companies and marketing firms were never meant to have first-hand access to profiles.

But sharing this trove with like-minded academic researchers surely seemed logical. Simple. Easy.

This led to Aleksandr Kogan, the researcher behind the personality quiz “thisisyourdigitallife”. Under the pretense of academic use, this app asked its users for access to some personal information and for access to their networks, which provided Kogan with a baseline profile of every friend they had. 270,000 consenting users turned into 50 million profiles.
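
To see how a small number of consents can fan out into a much larger set of profiles, here is a minimal sketch. The friend graph, the user names, and the `exposed_profiles` helper are all hypothetical and only illustrate the mechanism; this is not how Facebook's API or Kogan's app was actually written. The last lines just divide the two figures quoted above.

```python
# Hypothetical toy friend graph: each user maps to the set of their friends.
friends = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "erin"},
    "carol": {"alice"},
    "dave": {"alice", "erin"},
    "erin": {"bob", "dave"},
}


def exposed_profiles(consenting, friend_graph):
    """Return every profile visible to an app that can read its users' friends."""
    exposed = set(consenting)
    for user in consenting:
        # Friends never clicked "I agree", but their profiles come along anyway.
        exposed |= friend_graph.get(user, set())
    return exposed


consenting_users = {"alice"}
exposed = exposed_profiles(consenting_users, friends)
print(len(consenting_users), "consenting user(s),", len(exposed), "exposed profile(s)")

# Ratio implied by the figures quoted in the text.
profiles_per_consenting_user = 50_000_000 / 270_000
print(round(profiles_per_consenting_user), "profiles per consenting user, roughly")
```

In the toy graph a single consent exposes four profiles. With the numbers quoted above, each consenting user accounts for roughly 185 profiles on average.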

Which led to Cambridge Analytica. Which led to…

Ethics can be a slippery thing. It was likely easy to justify each of these steps. Users loved what they were getting, right?  They clicked “I agree”. This is a fair trade… right?


Fundamental principles

Fundamental principles are being tested for the first time. Fiction has explored the concept of “big brother watching you” for decades. But the reality is that it’s been impossible to watch *everyone* in a free society. Until now.

With freedom comes an implied expectation of privacy. Can freedom exist without protection of privacy? I don’t think so. We *need* the ability to explore aspects of ourselves without fear of exploitation. Society benefits from the individuals in the society being themselves. Thus, protecting privacy is as important as freedom itself.

The phrase “peeping into the bedroom window” makes people uncomfortable. Certain things are held sacred, and the norms, rules and laws of society are supposed to protect its members when they are the most vulnerable.

It’s arguable that Kogan was not technically peeping in the window in this situation. Through his app, he was invited into the bedroom, then broke a promise of confidentiality by providing what he saw (and recorded) to Cambridge Analytica. Facebook has noted that this was not a security breach or a hack, but a breach of their terms of use for that data.

This new usage of data analytics has created hidden paths and invitations that threaten the privacy of the members of our society every bit as much as a peeping Tom at the bedroom window.  Even more so since you don’t need anything as obvious as a ladder to gain this knowledge.


End of a great experiment?

So, should we call for an end to Big Data?  Is machine learning a dark art that should be banned … too powerful to be used?

I don’t think so, for several reasons. This box has been opened. These techniques work and will be used. Banning research doesn’t stop it; it just pushes it underground.

Besides, there is HUGE societal benefit from the results of some machine learning.  We need to explore how to learn more from it, not less. We just need to do it responsibly.

We maintain that we don’t need to collect every shred of available data on our community and their interactions with Mycroft to create a fantastic user experience.

But we do need some data, so we’re using an opt-in mechanism to allow our community members to choose if any of their information is stored. We’re building Open Datasets – anonymized information and recordings from our opted-in community members, available for use by anyone.
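
As a rough illustration of what an opt-in gate can look like, here is a minimal sketch. The `Interaction` record, its field names, and `build_open_dataset` are assumptions made up for this example; they are not Mycroft's actual code or schema. Note also that dropping the user identifier is only a first step, since the content of a recording can still identify someone.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """Hypothetical record of one user interaction."""
    user_id: str
    opted_in: bool
    utterance: str


def build_open_dataset(interactions):
    """Keep only records from opted-in users and drop the user identifier."""
    return [
        {"utterance": item.utterance}
        for item in interactions
        if item.opted_in  # nothing is stored unless the user chose to share
    ]


interactions = [
    Interaction(user_id="u1", opted_in=True, utterance="what time is it"),
    Interaction(user_id="u2", opted_in=False, utterance="call my doctor"),
]
open_dataset = build_open_dataset(interactions)
print(open_dataset)
```

The important design choice is the default: a record is excluded unless the user has explicitly opted in, rather than included unless they opt out.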

We think seriously about how to collect, store, and use data responsibly and transparently.

What else could we do?



Modern technologies will help in this challenge. The same way we were able to use pervasive computing and networking to harvest and analyze data in unprecedented ways, we can also use these to create personal protections in ways that were previously impossible.

With tools like encryption and the blockchain I believe we can not only create privacy protections as policy but also guarantee them. I’ll be expanding on these ideas publicly soon.

For now, I’d like to hear if I’m alone in my thinking.  Is this a real danger? Are we vulnerable? Does anyone care?

Join us on the Forums to share your thoughts.