Can social media algorithms predict our behaviour?
With every app capturing our data each time we use it - and many capturing it even when we're not using it, just so long as it's open - big data may know more about us than we know about ourselves.
And when they call it big, they mean BIG. Every page-view, every update, every message, every "read more"... It all feeds the borg. The statistics about data are ever-changing, but the human race now captures around as much data every two days as it did in all of history prior to 2003. And almost 80% of that comes from social media. Our tweets, our snaps, and our likes are telling anyone with the ability to crunch the numbers who we are, what we think, and where we stand on critical issues. And, of course, how to sell things to us.
That an advertiser might know your teenage daughter is pregnant well before you do is now a given. It seems almost inevitable that, with the amount of data and the computing power available to process it, algorithms will soon be able to predict our decisions before we've even thought of making them.
After all, we give away a lot of data through social media. Our updates and tweets show our emotional states and sentiments about almost anything you care to name. Our friends lists and followers show who we interact with and the intimacy of our relationships. Our pictures show our biometric data - and the biometric data of the people we're with (meaning even your Nan, who doesn't have a mobile and thinks Instagram is something to do with Western Union, has some of her data there in the cloud). And everywhere we go, every time we take out our cellphones, we give away our location.
But is it inevitable? Is the answer to the meaning of life really all in the data? At first pass, you'd be foolish to say no. In almost every other circumstance, the more we know about something, the better we're able to predict its behaviour.
Except that's not quite true. Heisenberg's uncertainty principle is the quantum notion that the position and the velocity of a particle cannot both be measured exactly at the same time, even in theory. More data about speed or position can get us closer and closer, but ultimately some things can't be known, even with all the data in the world.
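In textbook form (the standard statement of the principle, not anything specific to social media), the idea puts a hard floor on how precisely both quantities can be pinned down at once:

```latex
\Delta x \, \Delta p \;\geq\; \frac{\hbar}{2}
```

Here $\Delta x$ is the uncertainty in position, $\Delta p$ the uncertainty in momentum (mass times velocity), and $\hbar$ the reduced Planck constant. No amount of extra data pushes the product of the two below that floor.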
And when it comes to pre-internet data, Edward Lorenz - the mathematician whose work gave us chaos theory and the notion of the butterfly effect - studied one of the few things humanity has been keeping data on for millennia: the weather. Weather, he concluded, was ultimately too chaotic for real long-term prediction.
No matter how much data on the atmosphere and weather systems we collect, no matter how big or fast the computers we use to analyse that data, we'll probably never be able to get a long-range forecast further out than two weeks. At least, not with any degree of accuracy.
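Lorenz's point is easy to see numerically. The sketch below - a toy illustration, not Lorenz's original model run; the parameter values are the textbook defaults and the step size is an arbitrary choice - integrates his famous three-equation convection system twice, with starting points differing by one part in a hundred million, and watches the trajectories fly apart:

```python
# Toy demonstration of sensitive dependence on initial conditions,
# using the Lorenz system (sigma=10, rho=28, beta=8/3) and simple
# forward-Euler integration. All parameter choices are illustrative.

def lorenz_step(x, y, z, dt=0.005, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz equations by one Euler step."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

def separation_after(steps, nudge=1e-8):
    """Distance between two trajectories that start `nudge` apart."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + nudge, 1.0, 1.0)
    for _ in range(steps):
        a = lorenz_step(*a)
        b = lorenz_step(*b)
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

# A nudge of 0.00000001 grows by many orders of magnitude.
print(separation_after(4000))
```

The same mechanism is why a forecast's usable horizon is measured in days, not months: any measurement error in today's atmosphere, however tiny, compounds exponentially.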
Human behaviour is, ultimately, far more chaotic than weather systems. When it comes to algorithmic prediction, Netflix - back when it was a DVD mailout service, rather than a handy way to use all your bandwidth and most of your weekend watching Friends - offered US$1 million to anyone who could improve its recommendation engine by 10%. It released scads of data in the hope someone could create an algorithm just that bit better at knowing whether a particular customer would enjoy a movie or not.
The prize wasn't won for nearly three years, and it took the combined efforts of three teams, merged into one, to get infinitesimally over the 10% mark. By then, Netflix had moved on.
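For context, the contest's yardstick was root-mean-square error (RMSE) between predicted and actual star ratings. A quick sketch of the metric, using the widely reported final figures - Netflix's Cinematch baseline scored roughly 0.9525 and the winning entry roughly 0.8567; treat both as approximate, and the toy ratings below as invented:

```python
# Root-mean-square error: the metric the Netflix Prize was judged on.
def rmse(predicted, actual):
    """RMSE between predicted and actual ratings (lower is better)."""
    assert len(predicted) == len(actual)
    return (sum((p - a) ** 2 for p, a in zip(predicted, actual))
            / len(predicted)) ** 0.5

# Invented example: three predictions vs a viewer's actual star ratings.
print(rmse([3.5, 4.0, 2.0], [4, 4, 3]))

# The prize demanded a 10% reduction in RMSE versus Cinematch.
# Widely reported (approximate) final scores:
cinematch, winner = 0.9525, 0.8567
improvement = 100 * (cinematch - winner) / cinematch
print(f"{improvement:.2f}%")  # just over the 10% bar
```

Note how modest the win is in absolute terms: roughly a tenth of a star per rating, after three years of effort by some of the best teams in the field.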
Even with the ever-growing amount of data pumping out through social media, and even as we get better at interpreting that data, we will probably never be able to predict the behaviour of an individual. A person is just too chaotic. And even were they not, the uncertainty principle still holds - data might predict an intention to act, but there are far too many other factors that could get in the way.
But if a person is too chaotic to predict, people are not. Analysis of Facebook data predicted the success of both Brexit and Trump. A rise in tweets containing words that correspond with increased social agitation and "mobilisation" has been found to pre-empt riots in the US. And our discussions of health woes can help map the spread of illness.
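Why should crowds be predictable when individuals aren't? The statistical intuition is the law of large numbers: individual outcomes are noisy, but their average settles down. A toy sketch - simulated coin flips standing in for individual voters, with the 60/40 lean an entirely arbitrary assumption:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

def one_voter():
    """One simulated voter: leans one way 60% of the time (invented figure)."""
    return 1 if random.random() < 0.60 else 0

# Any single voter is a coin flip; a hundred thousand barely wobble.
n = 100_000
share = sum(one_voter() for _ in range(n)) / n
print(f"predicted vote share: {share:.3f}")  # lands close to 0.600
```

No single voter's choice is knowable in advance, yet the aggregate share is nailed down to within a fraction of a percent - which is exactly the gap between predicting a person and predicting people.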
So the age of privacy may well be over. Even for the most cautious individual, there will be data linking their face to their acquaintances to their spending. But while Big Brother may now go by Big Data, he still can't tell what we're thinking. Not until we share it on Facebook, anyway.