## Friday, 26 September 2014

### BIG DATA: A Powerful New Resource for the 21st Century

by Dirk Helbing

This chapter is a free translation of an introductory article originally published in Die Volkswirtschaft - Das Magazin für Wirtschaftspolitik (5/2014).

#### Abstract

Information and communication technology (ICT) is the economic sector that is developing most rapidly in the USA and Asia and generates the greatest value added per employee. Big Data - the algorithmic discovery of hidden treasures in large data sets - creates new economic value. The development is increasingly understood as a new technological revolution. Switzerland could establish itself as data bank and Open Data pioneer in Europe and turn into a leading place in the area of information technologies.

When the social media portal WhatsApp, with its 450 million users, was recently sold to Facebook for $19 billion, almost half a billion dollars was made per employee. "Big Data" is changing our world. The term, coined more than 15 years ago, means data sets so big that one can no longer cope with them using standard computational methods. Big Data is increasingly referred to as the oil of the 21st century. To benefit from it, we must learn to "drill" and "refine" data, i.e. to transform them into useful information and knowledge. The global data volume doubles every 12 months. Therefore, in just two years, we produce as much data as in the entire history of humankind.

Tremendous amounts of data have been created by four technological innovations:

• the Internet, which enables our global communication
• the World Wide Web, a network of globally accessible websites that evolved after the invention of the Hypertext Transfer Protocol (HTTP) at CERN in Geneva
• the emergence of social media such as Facebook, Google+, WhatsApp, or Twitter, which have created social communication networks, and
• the emergence of the "Internet of Things", which also allows sensors and machines to connect to the Internet.

Soon there will be more machines than human users on the Internet.

#### Data sets bigger than the largest library

Meanwhile, the data sets collected by companies such as eBay, Walmart or Facebook reach the size of petabytes (1 million billion bytes) - one hundred times the information content of the largest library in the world: the U.S. Library of Congress. The mining of Big Data opens up entirely new possibilities for process optimization, identification of interdependencies, and decision support.
However, Big Data also comes with new challenges, which are often characterized by four criteria:

• volume: the file sizes and number of records are huge,
• velocity: the data evaluation often has to be done in real-time,
• variety: the data is often very heterogeneous and unstructured,
• veracity: the data is likely to be incomplete, not representative, and to contain errors.

Therefore, completely new algorithms, i.e. new computational methods, had to be developed. Because it is inefficient for Big Data processing to load all relevant data into a shared memory, the processing must take place locally, where the data resides, on potentially thousands of computers. This is accomplished with massively parallel computing approaches such as MapReduce or Hadoop. Big Data algorithms detect interesting interdependencies in the data ("correlations"), which may be of commercial value, for example, between weather and consumption or between health and credit risks. Today, even the prosecution of crime and terrorism is based on the analysis of large amounts of behavioral data.

#### What do applications look like?

Big Data applications are spreading like wildfire. They facilitate personalized offers, services and products. One of the greatest successes of Big Data is automatic speech recognition and processing. Apple's Siri understands you when you ask for a Spanish restaurant, and Google Maps can lead you there. Google Translate interprets foreign languages by comparing them with a huge collection of translated texts. IBM's Watson computer even understands human language. It can not only beat experienced quiz show players, but even take care of customer hotlines - often better than humans. IBM has recently decided to invest $1 billion to further develop and commercialize the system.
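As an aside on the massively parallel processing mentioned above, the following is a minimal Python sketch of the MapReduce idea, not of Hadoop itself. The partitions, the example records, and the function names `map_partition` and `merge_counts` are made up for illustration. Each partition stands for data stored on a separate machine: the map step counts words locally, and the reduce step merges the small partial results.

```python
from collections import Counter
from functools import reduce

# Hypothetical data: each inner list stands for the records held by one machine.
partitions = [
    ["rain umbrella", "sun icecream"],
    ["rain coat", "sun icecream sun"],
]


def map_partition(records):
    """Map step: count words locally, where the data resides."""
    counts = Counter()
    for record in records:
        counts.update(record.split())
    return counts


def merge_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


# In a real cluster the map calls would run in parallel on different machines;
# here they run one after another for simplicity.
partial_counts = [map_partition(p) for p in partitions]
total_counts = reduce(merge_counts, partial_counts, Counter())

print(total_counts["sun"])   # 3
print(total_counts["rain"])  # 2
```

Only the compact partial counts travel between machines, not the raw records, which is the point of processing the data where it resides.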

Of course, Big Data plays an important role in the financial sector. Approximately seventy percent of all financial market transactions are now made by automated trading algorithms. In just one day, the entire money supply of the world is traded. Such quantities of money also attract organized crime, so financial transactions are scanned by Big Data algorithms for abnormalities in order to detect suspicious activities. The company BlackRock uses similar software, called "Aladdin", to successfully speculate with funds amounting to multiple times the gross domestic product (GDP) of Switzerland.

Box 1:
To get an overview of the ICT trends, it is worthwhile to look at Google, with its more than 50 software platforms. The company invests nearly $6 billion in research and development annually. Within just one year, Google has introduced self-driving cars, invested heavily in robotics, and started a Google Brain project to add intelligence to the Internet. Through the purchase of Nest Labs, Google has also invested $3.2 billion in the "Internet of Things". Furthermore, Google X has been reported to have around 100 secret projects in the pipeline.

#### The potential is great...

No country today can afford to ignore the potential of Big Data. The additional economic potential of Open Data alone - i.e. of data sets that are made available to everyone - is estimated by McKinsey to be between 3,000 and 5,000 billion dollars globally each year [2]. This can benefit almost all sectors of society. For example, energy production and consumption can be better matched with "smart metering", and energy peaks can be avoided. More generally, new information and communication technologies allow us to build "smart cities". Resources can be managed more efficiently and the environment protected better. Risks can be better recognized and avoided, thereby reducing unintended consequences of decisions and identifying opportunities that would otherwise have been missed. Medicine can be better adapted to the patients, and disease prevention may become more important than curing diseases.

#### ... but also the implicit risks

Like all technologies, Big Data also implies risks. The security of digital communication has been undermined. Cyber crime, including data, identity and financial theft, is quickly spreading on an ever greater scale. Critical infrastructures such as energy, financial and communication systems are threatened by cyber attacks. They could, in principle, be made dysfunctional for an extended time period.

Moreover, while common Big Data algorithms are used to reveal optimization potentials, their results may be unreliable or may not reflect causal relationships. Therefore, a naive application of Big Data algorithms can easily lead to wrong conclusions. The error rate in classification problems (e.g. the distinction between "good" and "bad" risks) is often relevant. Issues such as wrong decisions or discrimination must be seriously considered. Therefore, one must find effective procedures for quality control. In this connection, universities will likely play an important role. One must also find effective mechanisms to protect privacy and the right of informational self-determination, for example, by applying the Personal Data Purse [1] concept.

#### The digital revolution creates an urgency to act

Information and communication technologies are going to change most of our traditional institutions: our educational system (personalized learning), science (Data Science), mobility (self-driving cars), the transport of goods (drones), consumption (see Amazon and eBay), production (3D printers), the health system (personalized medicine), politics (more transparency), and the entire economy (with co-producing consumers, so-called prosumers). Banks are losing more and more ground to algorithmic trading and to alternative payment systems such as Bitcoin, PayPal and Google Wallet. Moreover, a substantial part of the insurance business takes place in financial products such as credit default swaps. For the economic and social transformation into a "digital society", we may have perhaps just 20 years. This is an extremely short time period, considering that the planning and construction of a road often requires 30 years or more.

The foregoing implies an urgent need for action on the technological, legal and socio-economic level. Some years ago, the United States started a Big Data research initiative amounting to 200 million dollars, followed by further substantial investments. In Europe, the FuturICT project (www.futurict.eu) has developed concepts for the digital society within the context of the EU flagship competition. Other countries have already started to implement this concept; for example, Japan has recently launched a $100 million 10-year project at the Tokyo Institute of Technology. In addition, numerous other projects exist, particularly in the military and security sector, which often have multiples of the budgets mentioned above.

#### Switzerland can become a European driver of innovation for the digital era

Switzerland is well positioned to benefit from the digital age. However, it is insufficient to reinvent and build upon already existing technologies in Switzerland. New inventions that will shape the digital age must be made. The World Wide Web was once invented in Switzerland, and the largest civil Big Data competence in the world exists at CERN; however, the USA and Asian countries have so far taken the lead in commercializing Big Data. With the NSA controversy, the ubiquity of wireless communication sensors, and the "Internet of Things", a new opportunity is emerging.

With targeted support of ICT activities at its universities, Switzerland could take the lead in Europe's research and development. Swiss academia has excelled with the scientific coordination of three out of six finalists of the EU FET flagship competition.
At the moment, however, the focus is only on the digital modeling of the human brain and on robotics. From 2017 onwards, though, the ETH domain plans to invest increasingly in the area of Data Science, the emerging research field centered around the scientific analysis of data.

In view of the fast development of the ICT area, the huge economic potential, and the transformative power of these technologies, prioritized, broad and substantial financial support is a matter of Swiss national interest. With its basic democratic values, legal framework and ICT focus, Switzerland is well prepared to become Europe's innovation driver for the digital age.

Box 2:
How will the digital revolution change our economy and society? How can we use this as an opportunity for us and reduce the related risks? For illustration, it is helpful to recall the factors that enabled the success of the automobile age: the invention of cars and of systems of mass production; the construction of public roads, gas stations, and parking lots; the creation of driving schools and driver licenses; and last but not least, the establishment of traffic rules, traffic signs, speed controls, and traffic police.
What are the technological infrastructures and the legal, economic and societal institutions needed to make the digital age a big success? This question would set the agenda of the Innovation Alliance. A partial answer is already clear: we need trustworthy, transparent, open, and participatory ICT systems, which are compatible with our values. For example, it would make sense to establish the emergent "Internet of Things" as a Citizen Web. This would enable self-regulating systems through real-time measurements of the state of the world, which would be possible with a public information platform called the "Planetary Nervous System". It would also facilitate a real-time measurement and search engine: an open and participatory "Google 2.0."

To protect privacy, all data collected about individuals should be stored in a Personal Data Purse and, given informed consent, processed in a decentralized way by third-party Trustable Information Brokers, allowing everyone to control the use of their sensitive data. A Micro-Payment System would allow data providers, intellectual property right holders, and innovators to get rewards for their services. It would also encourage the exploration of new and timely intellectual property right paradigms ("Innovation Accelerator"). A pluralistic, User-centric Reputation System would promote responsible behavior in the virtual (and real) world. It would even enable the establishment of a new value exchange system called "Qualified Money," which would overcome weaknesses of the current financial system by providing additional adaptability.
A Global Participatory Platform would empower everyone to contribute data, computer algorithms and related ratings, and to benefit from the contributions of others (either free of charge or for a fee). It would also enable the generation of Social Capital such as trust and cooperativeness, using next-generation User-controlled Social Media. A Job and Project Platform would support crowdsourcing, collaboration, and socio-economic co-creation. Altogether, this would build a quickly growing Information and Innovation Ecosystem, unleashing the potential of data for everyone: business, politics, science, and citizens alike.

[1] Y.-A. de Montjoye, E. Shmueli, S. S. Wang, and A. S. Pentland (2014) openPDS: Protecting the Privacy of Metadata through SafeAnswers.

[2] McKinsey & Company (2013) Open data: Unlocking innovation and performance with liquid information.

## Tuesday, 23 September 2014

### Creating ("Making") a Planetary Nervous System as Citizen Web

by Dirk Helbing

The goal of the Planetary Nervous System is to create an open, public, intelligent software layer on top of the "Internet of Things" as the basic information infrastructure for the emerging digital societies of the 21st century.

After the development of the Computer, Internet, the World Wide Web, Smartphones and Social Media, the evolution of our global information and communication systems will now be driven by the "Internet of Things" (IoT). Based on wirelessly connected sensors and actuators,[1] it will connect "things" (such as machines, devices, gadgets, robots, sensors, and algorithms) with things, and things with people.

Already now, more things than people are connected to the Internet. In 10 years' time, it is expected that something like 150 billion sensors will be connected to the IoT. Given such masses of sensors everywhere around us -- sensors in our coffee machine, our fridge, our toothbrush, our shoes, our fire alarm etc. -- the IoT could easily turn into a dystopian surveillance nightmare, if largely controlled by one company or by the state. For the IoT to be successful, people need to be able to trust the new information and communication system, and they need to be able to exert their right of informational self-determination, which also requires the possibility to protect privacy.

Most likely, the only way to establish such a trustable, privacy-respecting IoT is to build it as a Citizen Web. Citizens would deploy the sensors in their homes, gardens, and offices themselves, and they would decide themselves what sensor information to open up (i.e. decrypt), and for whom (and for how long). In other words, the citizens would be in control of the information streams. A software platform such as open Personal Data Store (openPDS)[2] would allow everyone to manage the access to personal data produced by the IoT.
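To illustrate what "deciding what to open up, for whom, and for how long" could mean in software, here is a minimal Python sketch. It is not the openPDS interface; the class `AccessGrant`, the function `may_read`, and the sensor and recipient names are hypothetical. The sketch also leaves out encryption entirely and only models the bookkeeping of who may read which sensor stream until when.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessGrant:
    """One decision by a citizen: which sensor, for whom, and until when."""
    sensor: str
    recipient: str
    expires_at: datetime


def may_read(grants, sensor, recipient, now):
    """Return True only if an unexpired grant covers this sensor and recipient."""
    return any(
        g.sensor == sensor and g.recipient == recipient and now < g.expires_at
        for g in grants
    )


# Hypothetical example: a garden thermometer is opened to a weather service for 30 days.
start = datetime(2014, 9, 23, 12, 0)
grants = [
    AccessGrant("garden_temperature", "weather_service", start + timedelta(days=30)),
]

print(may_read(grants, "garden_temperature", "weather_service", start + timedelta(days=1)))
print(may_read(grants, "garden_temperature", "advertiser", start + timedelta(days=1)))
print(may_read(grants, "garden_temperature", "weather_service", start + timedelta(days=31)))
```

In this sketch access is denied by default: a stream is readable only while an explicit, unexpired grant exists, so the first check succeeds and the other two fail.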

#### What are the benefits of having an "Internet of Things"?

• One can perform real-time measurements of the (biological, technological, social and economic) world around us
• This information can be turned into (real-time) maps of our world[3] and serve as compasses for decision-makers, enabling them to take better decisions and more effective actions, considering externalities
• One can build self-organizing and self-regulating systems, based on real-time feedback and adaptation[4]

Uses of these kinds will be enabled by a software layer that we call the "Planetary Nervous System" (PNS) or just "Nervous". It offers new possibilities that will allow humanity to overcome some long-standing problems (such as systemic instabilities or "tragedies of the commons" like environmental degradation), and to change the world for the better.

#### Basic Elements of the Planetary Nervous System

• Sensor Kits and Smartphones, to measure the environment
• Algorithms and filters to encrypt information or degrade it such that it is not sensitive anymore[5]
• Ad hoc network / mesh net (e.g. FireChat) to enable direct communication between wirelessly communicating sensors
• Server architecture to collect, manage and process data
• A data analytics layer and possibly a search engine and Collective Intelligence/Cognitive Computing layer on top
• An open Personal Data Store (such as openPDS) to empower users to exercise their right of informational self-determination
• An app-store-like Global Participatory Platform (GPP) to share data, algorithms, and ratings
• An editor allowing non-expert users to combine inputs and outputs in playful, creative ways
• A multi-dimensional reputation and micro-payment system
• A project platform to allow the Nervous community to coordinate and self-organize their activities and projects

We will build two variants of the Planetary Nervous System App for smart devices such as smartphones: Nervous and Nervous+. While Nervous would not save original sensor data, Nervous+ would potentially do so. Nervous is meant for users who are concerned about their personal data, while Nervous+ offers additional functionality for people who are happy to share data of all kinds. Hence, users can choose the system they prefer.

Both Planetary Nervous System Apps would offer a rich Open Data stream accessible to everyone. They would build something like a "real-time data streaming Wikipedia", allowing people and companies to build services and products on top. The PNS is hence an attempt to enable and catalyze new creative jobs in times when the digital revolution is expected to eliminate about 50% of the conventional jobs of today.

#### Creating a public good, and business and non-profit opportunities for everyone, through maximum openness, transparency, and participation
The main goal of the PNS project is to create a public good, namely the basic information infrastructure for the emerging digital societies of the 21st century. Besides providing Open Data streams, the Planetary Nervous System may nevertheless offer some premium services to people and/or institutions, who pay for the services or have qualified to receive them for free (such as committed scientists or citizens). "Qualification" means contributions made to the components of the Planetary Nervous System, but also a responsible use of the information services. In this way, we want to reduce malicious uses of the powerful functionality of Nervous+ as much as possible.

The profits created by the PNS would be managed, for example, by a benefit corporation, which is committed to improving social and/or environmental conditions. The largest share of the profits should be used to fund the science, research and development that advance the PNS and the services built on top of it. Profits created with inventions of the PNS shall also be used to support the PNS project.

As the PNS project wants to grow a public good for everyone, it is committed to opening up its source code, as far as this is not expected to create security issues or dangers to human rights. Depending on the competitive situation the PNS is in, the publication may be done with a delay (usually less than 2 years). To minimize delays, we will create incentives for early sharing.

The goal of this strategy is to catalyze an open information and innovation ecosystem. Others will be able to use our code (and other people's open source code), modify it and share it back. The same will apply to data, Apps, and other contributions. In this way, the Nervous community will benefit maximally from contributions of other Nervous members, and everyone can build on functionality that has been created by others.

Contributions of volunteers will be acknowledged by mentioning the respective creators by name (unless they prefer to stay anonymous or pseudonymous). In addition, contributions will be rewarded by ratings, reputational values, or scores, which may later be used to get access to premium services. These would include larger query or data volumes ("power users"), earlier access to code that will be publicly released with a delay, or further benefits. The PNS project may also hand out medals or prizes for outstanding contributions, or highlight them in social or public media.

#### The role of Citizen Science

For the Planetary Nervous System to be successful, it is crucial to grow a large community of users. Moreover, the underlying logic of sharing, bottom-up involvement and informational self-determination demands that everyone is encouraged to contribute to the creation of the system itself. The system would hence be built in a similar way to Wikipedia or OpenStreetMap. In fact, the success of OpenStreetMap is based on the contributions of 1.5 million volunteers worldwide.

This is why the Nervous project wants to engage with Citizen Science, to grow the Planetary Nervous System as a Citizen Web. As a basis of citizen engagement, the Nervous Team will provide (a) kits containing sets of sensors[6] and actuators (e.g. a basic kit, and several extension kits) and (b) a GPP portal, where people can download (and upload) algorithms ("Apps"), which will run on the sensors and thereby produce certain kinds of functionalities.

The Citizen Science community will be engaged in certain measurement tasks (e.g. "measure the noise distribution in your city as a function of time", or "measure data enabling weather predictions"). It will also be encouraged to come up with innovative ways to use sensor data and turn them into outputs (i.e. to produce new code or modify existing code, thereby creating new Apps). For this, the PNS team will provide tools (such as an editor), allowing non-expert users to transform inputs into outputs in playful, creative ways. Playfulness, fun and reputation are hence offered in exchange for contributing to the development and spreading of the PNS. As a result, we will get new measurement procedures for science, and adaptive feedback processes to create self-regulating systems.

#### The Planetary Nervous System is already being built; see @NervousNet on Twitter. To join our development team as a volunteer, in the same way that Linux, Wikipedia or OpenStreetMap are being created, please contact Dirk Helbing: dhelbing (at) ethz.ch

[1] Actuators are devices that can induce change, for example, motors.
[2] see http://openpds.media.mit.edu/ and http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0098790
[3] Such maps may include environmental maps depicting environmental changes and who causes them, resource maps visualizing resources and who uses them, as well as risk maps, crisis maps, or conflict maps.
[4] Smart city applications are often of this kind.
[5] For example, one may average over personal data and throw away the original data. Microphone data may be degraded by using a lowpass filter, such that spoken words can no longer be understood, leaving basically just a noise level measurement. Access to unfiltered video or microphone data would require explicit approval, similar to accepting a call. Data shared with a restricted circle will be encrypted.
[6] Such sensors may measure temperature, noise, humidity, health, or anything else.
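
The degradation idea from footnote [5] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: instead of a real lowpass filter it uses a simpler stand-in, namely one root-mean-square (RMS) level per window of samples, and the function name `noise_levels`, the window size, and the sample values are made up.

```python
import math


def noise_levels(samples, window):
    """Return one RMS level per full window of raw samples.

    Only the aggregated levels are returned, so speech content cannot be
    reconstructed from the result.
    """
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        levels.append(math.sqrt(sum(x * x for x in chunk) / window))
    return levels


# Made-up microphone samples: a quiet stretch followed by a louder one.
raw = [0.0, 3.0, -3.0, 0.0, 4.0, 4.0, -4.0, -4.0]
levels = noise_levels(raw, window=4)
del raw  # the original data is thrown away, only the noise levels are kept

print(levels)
```

The second window yields a level of exactly 4.0, the first a lower one, so the result still shows that it got louder without retaining what was said.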

## Monday, 22 September 2014

### BIG DATA SOCIETY: Age of Reputation or Age of Discrimination?

By Dirk Helbing

If we want Big Data to create societal progress, more transparency and participatory opportunities are needed to avoid discrimination and ensure that it is used in a scientifically sound, trustable, and socially beneficial way.

Have you ever "enjoyed" an extra screening at the airport because you happened to sit next to someone from a foreign country? Have you been surprised by a phone call offering a special service or product, because you visited a certain webpage? Or do you feel your browser reads your mind? Then, welcome to the world of Big Data, which mines the tons of digital traces of our daily activities such as web searches, credit card transactions, GPS mobility data, phone calls, text messages, Facebook profiles, cloud storage, and more. But are you sure you are getting the best possible product, service, insurance or credit contract? I am not.

Like every technology, Big Data has some side effects. Even if you are not concerned about losing your privacy, you should be worried about one thing: discrimination. A typical application of Big Data is to distinguish different kinds of people: terrorists from normal people, good from bad insurance risks, honest tax payers from those who don't declare all income ... You may ask, isn't that a good thing? Maybe on average it is, but what if you are wrongly classified? Have you checked the information collected by the Internet about your name or gone through the list of pictures Google stores about you? Even more scary than how much is known about you is the fact that quite some of this information does not fit. So, what if you are stopped by border control, just because you have a name similar to that of a criminal suspect? If so, you may be traumatized for quite some time.

Where does the problem originate? Normally, the groups of people to distinguish are overlapping -- their data points are not well separated. Therefore, mining Big Data comes with the statistical problem of false positives and false negatives [1]. That is, some people get an unintended advantage, while others suffer an unfair disadvantage -- an injustice hard to accept. Even under the overly optimistic assumption that the data mining algorithm has an accuracy of 99.9%, applying it to 200 million people means that hundreds of thousands of people will be treated wrongly. In medicine, the approach of mass screenings is therefore highly controversial [2]. Are you willing to sacrifice your breast or prostate for a wrongly diagnosed cancer? Probably not, but it happens more often than you think.
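
The arithmetic behind this can be checked with a small Python sketch. The 99.9% accuracy and the 200 million people come from the text above. The base rate of 1 in 100,000 and the assumption that errors are equally likely in both groups are purely illustrative and not taken from any real screening program.

```python
population = 200_000_000
accuracy = 0.999  # the overly optimistic figure from the text

# Expected number of wrongly classified people.
misclassified = round(population * (1 - accuracy))
print(misclassified)  # 200000

# Illustrative assumption: 1 in 100,000 people truly belongs to the group
# being searched for, and both groups are misclassified at the same rate.
base_rate = 1 / 100_000
true_members = population * base_rate
true_positives = true_members * accuracy
false_positives = (population - true_members) * (1 - accuracy)

# Share of flagged people who were flagged correctly.
share_correct_flags = true_positives / (true_positives + false_positives)
print(f"{share_correct_flags:.1%}")
```

Under these assumptions only about 1% of the people flagged by the algorithm would actually belong to the group in question. The rarer the searched property, the more the false positives dominate.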

Similarly, tens of thousands of honest people are unintentionally mixed up with terrorists. So, how can you be sure you are getting your loan on fair terms, and do not have to pay a higher interest rate, just because someone in your neighborhood defaulted? Can you still afford to live in an easy-going multi-cultural quarter, or do you have to move to another neighborhood to get a reasonable loan? And what about the tariff of your health insurance? Will you have to pay more, just because your neighbors do not go jogging? Will we have to put pressure on our Facebook friends, colleagues, and neighbors, just to avoid possible future discrimination? And what would be the features that play out positively or negatively? How much Coke on our credit card bill will be acceptable to our health insurance? Is it ok to drink a glass of wine, or better not? What about another cup of coffee or tea? Can we still eat meat, or will we get punished for it with higher monthly rates? Would there be a right way of living at all, or would everyone just be discriminated against for some behaviors, while perhaps being rewarded for others? The latter is surely the case.

This might be fine if everybody benefited on average, but unfortunately this is rather unlikely. Some would be lucky and others would be unlucky, i.e. inequality would grow. But similar to stock markets, it would be difficult to tell in advance who would benefit and who would lose. This is not just because of the random distribution of individual properties, but also because the parameters of the data mining algorithms can be determined only with a limited accuracy. However, even tiny parameter changes may produce dramatically different results (a fact known as "sensitivity" or "butterfly effect") [3]. In other words, while the miners of Big Data may pretend to take more scientific, better and fairer decisions, the results will often have a considerable amount of arbitrariness. Many data miners probably don't know about this or don't care. But the fact that lots of algorithms produce outputs without warnings of their limitations creates a dangerous overconfidence in their results. Moreover, note that the choice of the model can be even more critical than the choice of parameters [4]. That's basically why people say: "Don't believe any statistics that you haven't produced yourself."

The problem is reminiscent of the experiences made with financial innovations. People used models without questioning their validity enough. It was discovered too late that financial innovations may have negative effects and destabilize the markets. One example is the excessive use of credit default swaps, which package risks in ways that buyers no longer seem to understand. The consequence of this was a financial meltdown that the public will have to pay for over at least another decade or two. It is no wonder that trust in the financial system dropped dramatically, with serious economic implications (no trust means no lending). This time, we should not make the same mistakes, but rather use Big Data in a trustworthy, transparent, and beneficial way. To reap the benefits of personalized medicine, for example, we need to make sure that personal medical data will not be used to the disadvantage of patients who are willing to share their data in favor of creating a public good -- a better understanding of diseases and how to cure them.

In fact, we have worked hard to overcome discrimination against people on the basis of gender, race, religion, or sexual orientation. Should we now extend discrimination to hundreds or even thousands of variables, just because Big Data allows us to do so? Probably not! But how can we protect ourselves from such discrimination? In order to prevent the information age from becoming an age of discrimination fueled by Big Data, we need informational justice. This includes establishing (1) suitable quality standards, as for medical drugs, (2) proper testing, and (3) fair compensation schemes. Otherwise people will quickly lose trust in Big Data. This requires us to decide what collateral damage for individuals would be considered tolerable or not. Moreover, we need to distinguish between “healthy” and “toxic” innovations, where “healthy” means innovations that produce long-term benefits for the economy and society (see Information Box below). That is, the overall benefit should be bigger than the disadvantage caused by false positives, such that the corresponding individuals can be compensated for unfair treatments.

There are two fundamentally different ways to ensure a "healthy" use of Big Data and allow victims of discrimination to defend their interests. The classical approach would be to create a dedicated government agency or institution that establishes detailed regulations, in particular quality standards, certification procedures, and effective punitive schemes for violations. But there is a second approach -- one that I believe could be more effective for companies and citizens than complicated legal and executive procedures. This framework would be based on next-generation reputation systems creating feedback loops that support self-regulation.

How would such a next-generation reputation system work? The proposal is to establish a Global Participatory Platform [5], i.e. a public store for models and data. It would work a bit like an app store, but people and companies could upload more than apps: they could also upload data sets, algorithms (e.g. statistical methods, simulation models, or visualization tools), and ratings. Everybody could use these data sets for free or for a fee, and add user feedback. It would be as if we could submit not only queries to Google, but also algorithms to determine the answers. In this way, we could better control the quality of results extracted from the data.

So, assume we stored all data collected about individuals in a data bank (for reasons of data security, decentralized and encrypted storage would be preferable). Moreover, assume that everyone could submit algorithms to be run on these data sets. The algorithms would be able to perform certain operations within the bounds of privacy laws and other regulations. For example, they could generate aggregate information and statistics, while privacy-invasive queries violating user consent would not be executed. Moreover, if executable files of the algorithms used by insurance companies or other companies using Big Data were uploaded as well, scientists and citizens could judge their statistical properties and verify that undesirable discrimination effects are below commonly accepted thresholds. This would ensure that quality standards are met and continuously improved.
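The idea that submitted algorithms may compute aggregate statistics but not privacy-invasive results can be illustrated with a minimal sketch. Everything named here (`Record`, `run_aggregate`, `QueryRefused`, the purpose labels, and in particular the minimum group size) is a hypothetical illustration and not part of the proposed platform; the group-size threshold is one assumed way of refusing queries that would single out individuals.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class Record:
    """One stored data point plus the purposes its owner consented to."""
    owner: str
    value: float
    consented_purposes: FrozenSet[str]


class QueryRefused(Exception):
    """Raised when a submitted algorithm is not allowed to run."""


def run_aggregate(
    records: Iterable[Record],
    purpose: str,
    aggregate: Callable[[List[float]], float],
    min_group_size: int = 5,  # assumed threshold, not taken from the text
) -> float:
    """Run an aggregate statistic over consenting records only."""
    # Keep only the values whose owners consented to this purpose.
    values = [r.value for r in records if purpose in r.consented_purposes]
    # Refuse to answer when the group is too small to hide individuals.
    if len(values) < min_group_size:
        raise QueryRefused(
            f"fewer than {min_group_size} consenting records for {purpose!r}"
        )
    return aggregate(values)


# Six people consent to research use, one only to marketing use.
records = [
    Record(f"person{i}", float(i), frozenset({"research"})) for i in range(1, 7)
]
records.append(Record("person7", 50.0, frozenset({"marketing"})))

research_mean = run_aggregate(records, "research", mean)
```

In this sketch, the research query returns a mean over six consenting records, whereas the same call with the purpose `"marketing"` raises `QueryRefused`, because only one person consented and the result would expose that person's value.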

The advantages of such a transparent and participatory approach are multifold for business, science, and society alike: (1) results can be verified or falsified, thereby uncovering possible methodological issues, (2) the quality of Big Data algorithms and data will increase more quickly, (3) “healthy” innovation and economic profits will be stimulated, (4) the level of trust in the algorithms, data and conclusions will increase, and (5) an "information ecosystem" will be grown, creating an enormous amount of new business opportunities, to fully unleash the potential of Big Data.

I fully agree with the US Consumer Data Privacy Bill of Rights [6] stating that “Trust is essential to maintaining the social and economic benefits that networked technologies bring to the United States and the rest of the world.” A report on personal data as a new asset class, published by the World Economic Forum, therefore suggests a “New Deal on Data” [7]. This includes establishing a data ecosystem that creates a balance between the interest of companies, citizens, and the state. Important elements of this would be: transparency, more control by citizens over their personal data, and the ability for individuals to participate in the value generated with their personal data.

This has implications for the design of the Global Participatory Platform I am proposing. Data collected about individuals would be stored in a personal data purse. Individuals could add to and comment on the data, have them corrected if factually wrong, and determine who could use them for what kind of purpose, in line with the regulations regarding privacy and self-determination. When personal data are used, both the user and the company that collected the data would earn a small amount through micropayments. Finally, to keep misuse of data and malicious applications at a low level, there would be a reputation system, which would act like a social immune system.

Reputation and recommender systems are quickly spreading all over the Web. People can rate products, news, and comments. In exchange, Amazon, eBay, TripAdvisor and many other platforms offer recommendations. Such recommendations are beneficial not only for the user, who tends to get a better service, but also for a company offering the product or service, as a higher reputation allows it to charge a higher price [8]. However, it is not good enough to leave it to a company to decide what recommendations we get, because then we don't know how much we are being manipulated. We want to look at the world from our own perspective, based on our own values and quality criteria. It would be terrible if everyone ended up reading the same books and listening to the same music. Therefore, it is important that recommender systems do not undermine socio-diversity.

Diversity is an important factor for innovation, social well-being, and societal resilience [9]. It deserves to be protected in the very same way as biodiversity. Modern societies need a complex interaction pattern of diverse people and ideas, not average people who all do the same things. The socio-economic misery in many countries of the world is clearly correlated with the loss of socio-economic diversity. While some level of norms and standardization appears to be favorable, too much homogeneity turns out to be bad. This also implies that we need to be careful about discriminating against people who are different -- such discrimination may undermine socio-diversity.

Today's personalized recommender systems endanger socio-diversity as well. They are manipulating people’s opinions and decisions, thereby imposing a certain perspective and value system on them. This can seriously undermine the “wisdom of crowds” [10], which is central to the functioning of democracies. The "wisdom of crowds" requires independent information gathering and decision-making -- a principle not sufficiently respected by most recommender systems [11].

How could we, therefore, build "pluralistic" reputation and recommender systems, which support socio-economic diversity, and are also less prone to manipulation attempts? First, one should distinguish three kinds of user feedback: facts (linked to information that allows them to be checked), advertisements (if there is a personal benefit for posting them), and opinions (all other feedback). Second, user feedback could be given in an anonymous, pseudonymous, or personally identifiable way. Third, users should be able to choose among many different reputation filters and recommender algorithms. Just imagine, we could set up the filters ourselves, share them with our friends and colleagues, modify them, and rate them. For example, we could have filters recommending us the latest news, the most controversial stories, the news that our friends are interested in, or a surprise filter. So, we could choose among a set of filters that we find most useful. Considering credibility and relevance, the filters would also put a stronger weight on information sources we trust (e.g. the opinions of friends or family members), and neglect information sources we do not want to rely on (e.g. anonymous ratings). For this, users would rate information sources as well, i.e. other raters. Therefore, spammers would quickly lose reputation and, with this, their influence on the recommendations made.
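A personal reputation filter of this kind can be sketched in a few lines. The sketch below assumes a simple weighted average as the aggregation rule and a 1-to-5 rating scale; the class names, weights, and rater names are hypothetical and only illustrate how the three distinctions (kind of feedback, identity mode, trust in the rater) could enter a filter that each user configures differently.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Feedback:
    rater: str
    kind: str      # "fact", "advertisement", or "opinion"
    identity: str  # "anonymous", "pseudonymous", or "identified"
    score: float   # rating on an assumed 1-5 scale


@dataclass
class ReputationFilter:
    """One user's personal filter: how much weight each feedback gets."""
    rater_trust: Dict[str, float]      # trust in individual raters (0..1)
    identity_weight: Dict[str, float]  # weight per identity mode
    kind_weight: Dict[str, float]      # weight per kind of feedback
    default_trust: float = 0.5         # trust in raters not yet rated

    def weight(self, fb: Feedback) -> float:
        # Unknown identity modes or kinds get weight 0 and are ignored.
        return (
            self.rater_trust.get(fb.rater, self.default_trust)
            * self.identity_weight.get(fb.identity, 0.0)
            * self.kind_weight.get(fb.kind, 0.0)
        )

    def reputation(self, feedback: Iterable[Feedback]) -> Optional[float]:
        """Trust-weighted mean score, or None if nothing carries weight."""
        total = 0.0
        weighted = 0.0
        for fb in feedback:
            w = self.weight(fb)
            total += w
            weighted += w * fb.score
        return weighted / total if total > 0 else None


feedback = [
    Feedback("alice", "opinion", "identified", 5.0),
    Feedback("spam42", "advertisement", "anonymous", 1.0),
    Feedback("bob", "fact", "pseudonymous", 3.0),
]

# A filter that trusts friends and ignores anonymous ratings and ads.
trusting = ReputationFilter(
    rater_trust={"alice": 1.0, "bob": 0.5},
    identity_weight={"identified": 1.0, "pseudonymous": 0.5, "anonymous": 0.0},
    kind_weight={"fact": 1.0, "opinion": 1.0, "advertisement": 0.0},
)

# A filter that counts every piece of feedback equally.
naive = ReputationFilter(
    rater_trust={},
    identity_weight={"identified": 1.0, "pseudonymous": 1.0, "anonymous": 1.0},
    kind_weight={"fact": 1.0, "opinion": 1.0, "advertisement": 1.0},
    default_trust=1.0,
)
```

Applied to the same three ratings, the two filters yield different reputation values, which is the point of a pluralistic system: the result depends on the perspective the user chooses. Lowering a rater's entry in `rater_trust` is the sketch's counterpart of rating the raters, so a spammer rated down by a user loses influence on that user's recommendations.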

In sum, the system of personal reputation filters would establish an “information ecosystem”, in which increasingly good filters will evolve by modification and selection, thereby steadily enhancing our ability to find meaningful information. Then, the pluralistic reputation values of companies and their products (e.g. insurance contracts or loan schemes) would give a pretty differentiated picture, which can also help the companies to develop better customized and more successful products.

References

[2] G. Gigerenzer, W. Gaissmaier, E. Kurz-Milcke, L.M. Schwartz, and S. Woloshin (2008) Helping doctors and patients make sense of health statistics, Psychological Science in the Public Interest 8(2), 53-96.

[3] I. Kondor, S. Pafka, and G. Nagy (2007) Noise sensitivity of portfolio selection under various risk measures. Journal of Banking & Finance 31(5), 1545-1573.

[4] T. Siegfried (2010) Odds are, it's wrong, Science News 177(7), p. 26ff, see http://www.sciencenews.org/view/feature/id/57091/description/Odds_Are_Its_Wrong; J.P.A. Ioannidis (2005) Why most published research findings are false, PLoS Medicine 2(8): e124.

[5] S. Buckingham Shum, K. Aberer, A. Schmidt, S. Bishop, P. Lukowicz et al. (2012) Towards a global participatory platform: Democratising open data, complexity science and collective intelligence. EPJ Special Topics 214, 109-152.

[6] The White House (2012) Consumer data privacy in a networked world: A framework for protecting privacy and promoting innovation in the global digital economy, see http://www.whitehouse.gov/sites/default/files/privacy-final.pdf

[7] World Economic Forum (2011) Personal Data: The Emergence of a New Asset Class, see www3.weforum.org/docs/WEF_ITTC_PersonalDataNewAsset_Report_2011.pdf

[8] W. Przepiorka (2013) Buyers pay for and sellers invest in a good reputation: More evidence from eBay. The Journal of Socio-Economics 42, 31-42.

[9] S.E. Page (2007) The Difference (Princeton University Press, Princeton).

[10] J. Lorenz, H. Rauhut, F. Schweitzer, and D. Helbing (2011) How social influence can undermine the wisdom of crowd effect. Proceedings of the National Academy of Sciences of the USA 108(28), 9020-9025.

[11] T. Zhou, Z. Kuscsik, J-G. Liu, M. Medo, J.R. Wakeling, and Y-C. Zhang (2010) Solving the apparent diversity-accuracy dilemma of recommender systems. Proceedings of the National Academy of Sciences of the USA 107, 4511-4515.

[12] B.A. Huberman (2012) Big data deserve a bigger audience, Nature 482, 308.

[13] F. Berman and V. Cerf (2013) Who will pay for public access of research data? Science 341, 616-617.

[14] D. Helbing (2013) Economics 2.0: The natural step towards a self-regulating, participatory market society, Evolutionary and Institutional Economics Review 10(1), 3-41.

#### Information Box: How to define quality standards for data mining

Assume that the individuals in a population of N people fall into one of two classes. Let us consider people of kind 1 “desirable” (e.g. honest citizens, good insurance risks) and people of kind 2 “undesirable” (criminals, bad insurance risks, etc.). We represent the number of people classified as kind 1 and 2 by N1 and N2, respectively. Let the rate of false positives, that is, individuals classified as kind 2 who actually are of kind 1 and therefore face unjustified discrimination, be given by α, and the rate of false negatives be β. Then, the actual number of people of kind 1 is (1-β)N1+αN2, and the actual number of people of kind 2 is (1-α)N2+βN1. Furthermore, assume that the classification creates an advantage of A>0 for people classified as kind 1, but a disadvantage of -D<0 for people classified as kind 2. Then, each falsely positive classified person has a double disadvantage of -(A+D), because he or she should have received the advantage A but suffers the disadvantage -D instead. This will be considered unfair and call the legitimacy of the procedure into question. False negatives, in contrast, i.e. those who are classified as “desirable” but are in fact “undesirable”, enjoy a double advantage of (A+D). They may also create an extra damage -E to society. Overall, the classification produces a gain of G=N1[(1-β)A+β(A+D)] for individuals classified to be of kind 1 and a cost of -C, with C=N2[(1-α)D+α(D+A)], for individuals classified to be of kind 2. The overall benefit to society would be B=G-C-E. Unfortunately, there is no guarantee that it would be positive.

To demonstrate this, let us assume a business application of Big Data, in which the economic profit P (e.g. by selling cheaper insurance contracts to people of kind 1) is a fraction f of the gain, i.e. P=fG. If applied to many people, the application may be profitable even if the fraction f<1 is quite small. Moreover, from the point of view of a company, discrimination may be rewarding even if it has an overall disadvantage to people (i.e. if the overall benefit B is negative). This is because a company typically cares about its own profits and its customers, but not about everybody else. Clearly, if some insurance contracts get cheaper, others will have to be more expensive. In the end, people with high risks will no longer be offered insurance, or only at an unaffordable price, so some victims of accidents may not be compensated at all for their damage.

Even if B is positive, the profit P may be smaller than the unjust disadvantage U, which is the price that false positives have to pay. Such a business model would create a situation that I will call a "discrimination tragedy," where citizens have to pay the price for economic profits, even though they are not getting a good service in exchange.

It is, therefore, in the public interest to establish binding standards for the "healthy" use of Big Data algorithms, regulating the required predictive power and the acceptable values of α, D, B and U. A cost-benefit analysis suggests demanding B>0 (there is a benefit) and B>U (the benefit is high enough to compensate for unjust treatment). Moreover, the number of false positives αN2 and the disadvantage D should be below some acceptable thresholds. Today, these values are often unknown, and that means we have no idea what economic and societal benefits or damages are actually created by current applications of Big Data.
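The quantities of this Information Box can be worked through numerically. The following sketch is only an illustration under stated assumptions: it treats the cost C as a positive magnitude that is subtracted in B=G-C-E, and it assumes that the unjust disadvantage is U=αN2(A+D), i.e. the double disadvantage summed over all false positives, since the text describes U only in words. All numbers are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    n1: float       # N1: people classified as kind 1
    n2: float       # N2: people classified as kind 2
    alpha: float    # false-positive rate among those classified as kind 2
    beta: float     # false-negative rate among those classified as kind 1
    a: float        # advantage A > 0 for people classified as kind 1
    d: float        # disadvantage D > 0 for people classified as kind 2
    e: float = 0.0  # extra damage E >= 0 to society from false negatives

    def gain(self) -> float:
        """G = N1[(1-beta)A + beta(A+D)]."""
        return self.n1 * ((1 - self.beta) * self.a + self.beta * (self.a + self.d))

    def cost(self) -> float:
        """C = N2[(1-alpha)D + alpha(D+A)], taken as a positive magnitude."""
        return self.n2 * ((1 - self.alpha) * self.d + self.alpha * (self.d + self.a))

    def benefit(self) -> float:
        """B = G - C - E."""
        return self.gain() - self.cost() - self.e

    def unjust_disadvantage(self) -> float:
        """Assumed form U = alpha * N2 * (A + D); not given explicitly in the text."""
        return self.alpha * self.n2 * (self.a + self.d)

    def profit(self, f: float) -> float:
        """P = f * G for a company capturing a fraction f of the gain."""
        return f * self.gain()

    def is_healthy(self) -> bool:
        """Check the two conditions B > 0 and B > U."""
        b = self.benefit()
        return b > 0 and b > self.unjust_disadvantage()


# Invented example values.
case = Classification(n1=900, n2=100, alpha=0.1, beta=0.05, a=10, d=20, e=500)

G = case.gain()                 # 900 * (9.5 + 1.5)  -> about 9900
C = case.cost()                 # 100 * (18 + 3)     -> about 2100
B = case.benefit()              # 9900 - 2100 - 500  -> about 7300
U = case.unjust_disadvantage()  # 0.1 * 100 * 30     -> about 300
P = case.profit(0.02)           # 0.02 * 9900        -> about 198
```

With these invented numbers the conditions B>0 and B>U hold, yet a company capturing only 2% of the gain earns less than the unjust disadvantage (P<U), which corresponds to the "discrimination tragedy" described above.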

## Wednesday, 10 September 2014

### HAVE WE OPENED PANDORA’S BOX? We must move beyond 9/11

By Dirk Helbing

We continue FuturICT’s essays and discussion on Big Data, the ongoing Digital Revolution and the emergent Participatory Market Society, written since 2008 in response to the financial and other crises. If we want to master the challenges, we must analyze the underlying problems and change the way we manage our techno-socio-economic systems.

## 1.1.   Global financial, economic and public spending crisis

In March 2008, Markus Christen, James Breiding and I worried so much about the stability of the financial system that we felt urged to write a newspaper article to alert the public [65]. Unfortunately, at that time, the public was not ready to listen. Newspaper editors found our analysis too complex. We responded that a financial crisis would be impossible to prevent if newspapers failed to explain the complexity of problems like this to their audience. Just a few months later, Lehman Brothers collapsed, which gave rise to a large-scale crisis. It made me think about the root causes of economic problems [3–6] and of global crises in general [1, 2], and the need to address a change in conventional economic thinking [65-67]. But my collaborators and I not only saw the financial crisis coming. We also voiced the surveillance problem early on and the political vulnerability of European gas supply. We studied conflict in Israel, the spreading of diseases, and new response strategies to earthquakes and other disasters. Shortly afterwards, all of this turned out to be pretty visionary... When I attended a Global Science Forum in 2008 organized by the OECD [9], most people still expected that the problems in the US real estate market and the banking system could be fixed. However, it was already clear to me, and probably to many complexity scientists, that they would cause cascade effects and trigger a global economic and public spending crisis, which we would not recover from for many years. At that time, I said that nobody understood our financial system, our economy, and our society well enough to grasp the related problems and to manage them successfully.
Therefore, I proposed to invest in a large-scale project in the social sciences (including economics) in the very same way as we have invested billions into the CERN elementary particle accelerator, the ITER fusion reactor, the GALILEO satellite system, space missions, astrophysics, the human genome projects, and more. I stressed that, in the 21st century, we would require a “knowledge accelerator” to keep up with the pace at which our societies are faced with emerging problems [10]. Today, business and politics are often based on scientific findings that are 30 to 50 years old, or not based on evidence at all. This is no longer sufficient to succeed in a quickly changing world. We would need a kind of Apollo project, but not one to explore our universe; rather one that would focus on the Earth and what was going on there, and why.

## 1.2.   Need of a “knowledge accelerator”

As a consequence, the VISIONEER support action funded by the European Commission (www.visioneer.ethz.ch) worked out four white papers proposing large-scale data mining, social supercomputing, and the creation of an innovation accelerator [7]. Already back in 2011, VISIONEER was also pointing out the privacy issues of modern information and communication technologies, and it even made recommendations on how to address them [8]. Then, in response to the European call for two 10-year-long one billion EURO flagship projects in the area of Future Emerging Technologies (FET), the multidisciplinary FUTURICT consortium was formed to turn this vision into reality (www.futurict.eu). Thousands of researchers world-wide, hundreds of universities, and hundreds of companies signed up for this. 90 million EURO matching funds were waiting to be spent in the first 30 months. But while the project was doing impressively well, to everyone’s surprise it was in the end not funded, even though we proposed an approach aiming at ethical information and communication technologies [8, 11], with a focus on privacy and citizen participation [12]. This possibly meant that governments had decided against FuturICT’s open, transparent, participatory, and privacy-respecting approach, and that they might invest in secret projects instead. If this were the case, a worrying digital arms race would result. Therefore, while spending my Easter holidays 2012 in Sevilla, I wrote a word of warning with the article “Google as God?” [57]. Shortly afterwards, Edward Snowden’s revelations of global mass surveillance shocked the world, including myself [13]. These unveiled past and current practices of secret services in various countries and criticized them as illegal. Even though an informed reader could have expected a lot of what was then reported, much of it simply surpassed the limits of imagination.
The sheer extent of mass surveillance, the lack of any limits to the technical tools developed, and the way they were used frightened and alarmed many citizens and politicians. The German president, Joachim Gauck, for example, commented: “This affair [of mass surveillance] concerns me a lot.... The worry that our phone calls and emails would be recorded and stored by a foreign secret service hampers the feeling of freedom -- and with this there is a danger that freedom itself will be damaged.” [14] Nevertheless, many important questions have still not been asked: How did we get into this system of mass surveillance? What was driving these developments? Where will they lead us? And what if such powerful information and communication technologies were misused? Answering these questions remains one of the endeavours of the FuturICT community.

## 1.3.   We are experiencing a digital revolution

One of the important insights is: We are in the middle of a digital revolution -- a third industrial revolution after the one turning agricultural societies into industrial ones, and these into service societies. This will fundamentally transform our economy and lead us into the “digital society” [15]. I claim that not only citizens, but also most businesses and politicians, failed to notice this process early enough. By the time we got a vague glimpse of what might be our future, it had already pervaded our society, in the same way as the financial crisis had infected most parts of our economy. Again, we have difficulty identifying the responsible people -- we are facing a systemic issue.
Rather than blaming companies or people, our effort should be to raise awareness of the implications of the techno-socio-economic systems we have created (intended and unintended, positive and negative ones), and to point the way to a brighter future. As it turns out, we do in fact have better alternatives. But before I discuss these, let me first give a reasonably short summary of the current insights into the side effects of information and communication technologies, as far as they must concern us.

## 1.4.   Threats to the average citizen

Let me begin with the implications of mass surveillance for citizens. It is naive and just wrong to assume mass surveillance would not matter for an average citizen who is not engaged in any criminal or terrorist activities. The lists of terror suspects comprise a million names [17]; other sources even report a multiple of this. It became known that the majority of people on these lists are neither terrorists nor linked to any. Furthermore, since friends of friends of contact persons of suspects are also under surveillance, basically everyone is under surveillance [18]. Of course nobody would argue against preventing terrorism. However, mass surveillance [19] and surveillance cameras [20] haven’t been significantly more effective in preventing crime and terror than classical investigation methods and security measures, but they have various side effects. For example, tens of thousands of innocent subjects had to undergo extended investigation procedures at airports [21]. In connection with the war on drugs, there have even been 45 million arrests [22], many of which appear to be based on illegal clues from surveillance [23]. Nevertheless, the war on drugs failed, and US Attorney General Eric Holder finally concluded: “Too many Americans go to too many prisons for far too long, and for no truly good law enforcement reason” [24]. Recently, many people have also been pursued for tax evasion. While I am not trying to defend drug misuse or tax evasion, we certainly see a concerning transition from the principle of assumed innocence to a situation where everyone is considered to be a potential suspect [25]. This is undermining fundamental principles of our legal system, and implies threats for everyone. In an over-regulated society, it is unlikely that there is anybody who would not violate any laws over the time period of a year [26]. So, everyone is guilty, in principle.
People (and companies) are even increasingly getting into trouble for activities that are legal but socially undesirable, i.e. we are increasingly seeing phenomena that are reminiscent of “witch hunting.” For example, in December 2013, thousands of people got sued by a law firm for watching porn [27]. For the first time, many people became aware that all of their clicks on the Internet were recorded by companies, and that their behavior was tracked in detail.

## 1.5.   Threats so big that one cannot even talk about them

On the side of the state, such tracking is being justified by the desire to protect society from danger, and child pornography is often given as one of the reasons. Again, nobody would argue against the need to protect children from abuse, but this time the subject is so taboo that most people are not even aware of what exactly is being talked about. You can’t really risk looking up information on the Internet, and you are advised to delete photographs depicting yourself when you were a child. Only recently, we have learned that Internet companies report thousands of suspects of child pornography [28]. It is not known what percentage of these people have ever touched a child in an immoral way, or paid money for unethical pictures or video materials. This is particularly problematic, as millions of private computers are hacked and used to send spam mails [29]; illegal material might easily be among them. Note that passwords of more than a billion email accounts have recently been illegally collected [30]. This might imply that almost everyone living in a first world country can be turned into a criminal by putting illegal materials on one of their digital devices. In other words, if you stand in somebody’s way, he or she might now be able to send you to prison, even if you have done nothing wrong. The evidence against you can be easily prepared. Therefore, your computer and your mobile device become really dangerous for you. It is no wonder that two thirds of all Germans no longer trust that Internet companies and public authorities use their personal data in proper ways only; half of all Germans even feel threatened by the Internet [31].

## 1.8.   Political and societal risks

However, the age of Big Data also implies considerable political and societal risks. The most evident threat is probably that of cyberwar, which seriously endangers the functionality of critical infrastructures and services. This creates risks that may materialize within milliseconds, for extended time periods, and potentially for large regions [48]. Therefore, a nuclear response to cyber-attacks is considered to be an option [49]. Some countries also work on automated programs for responsive cyber-attacks [50]. However, as cyber-attacks are often arranged such that they appear to originate from a different country, this could easily lead to responsive counterstrikes on the wrong country, one that has not been the aggressor. But there are further dangers. For example, political careers become more vulnerable to what politicians have said or done many years back, as it can all be easily reconstructed. Such problems do not require that these people have violated any laws; it might just be that the social norms have changed in the meantime. This makes it difficult for personalities (typically people with non-average characters) to make a political career. Therefore, intellectual leadership, pointing the way into a different, better future, might become less likely. At the same time, Big Data analytics is being used for personalized election campaigns [51], which might determine a voter’s political inclination and undermine the fundamental democratic principle of voting secrecy. With lots of personal and social media data it also becomes easier to give a speech saying exactly what the people in a particular city would like to hear, but this, of course, does not mean the promises will be kept. Moreover, if the governing political leaders have privileged access to data, this can undermine a healthy balance of power between competing political parties.

## 1.9.   Are the secret services democratically well controlled?

It has further become known that secret services, also in democratic countries, manipulate discussions in social media and Internet content, including evidence, by so-called “cyber magicians” [52]. In South Korea, the prime minister is even said to have been tweeted into office by the secret services [53]. But secret services do not always act in accord with the ruling politicians. In Luxembourg, for example, it seems they have arranged terror attacks (besides other crimes) to get a higher budget approved [54]. They have further spied on Luxembourg’s prime minister Jean-Claude Juncker, who lost control over the affair and even his office. One may therefore hope that, in his current role as the president of the European Commission, he will be able to establish proper democratic control of the activities of secret services. In fact, there is a serious and realistic danger that criminals might get control of the powers of secret services, which should be protecting society from organized crime. Of course, criminals will always be attracted by Big Data and cyber powers to use them in their interest, and they will often find ways to do so. In Bulgaria, for example, a politician is said to have been trying to get control over the country’s secret services for criminal business. The Bulgarian people have been demonstrating for many weeks to prevent this from happening [55].

## 1.10. What kind of society are we heading to?

Unfortunately, one must conclude that mass surveillance and Big Data haven’t increased societal, economic, and cyber security. They have made us ever more vulnerable. We, therefore, find our societies on a slippery slope. Democracies could easily turn into totalitarian kinds of societies, or at least “democratorships,” i.e. societies in which politicians are still voted for, but in which the citizens no longer have any significant influence on the course of events and state of affairs. The best examples for this are probably the secret negotiations of the ACTA and TTIP agreements, i.e. laws intended to protect intellectual property and to promote free trade regimes. These include parallel legal mechanisms and court systems, which would take non-transparent decisions that the public would nevertheless have to pay for. It seems that traditional democracies are more and more transformed into something else. This would perhaps be ok, if it happened through an open and participatory debate that takes the citizens and all relevant stakeholders on board. In history, societies have undergone transformations many times, and I believe the digital revolution will lead us to another one. But if politicians or business leaders acted as revolutionaries trying to undermine our constitutional rights, this would sooner or later fail. Remember that the constitution (at least in many European countries) explicitly demands from everyone to protect privacy and family life, to respect the secrecy of non-public information exchange, to protect us from misuse of personal data, and to grant the possibility of informational self-determination, as these are essential functional preconditions of free, humane, and livable democracies [56]. The secret services should be protecting us from those who question our constitutional rights and don’t respect them. Given the state of affairs, this would probably require something like an autoimmune response.
It often seems that not even public media can protect our constitutional rights efficiently. This is perhaps because they are not able to reveal issues that governments share with them exclusively under mutually agreed confidentiality.

## 1.11. “Big governments” fueled by “Big Data”

In the past years, some elites have increasingly become excited about the Singaporean “big government” model, i.e. something like an authoritarian democracy ruled according to the principle of a benevolent, “wise king,” empowered by Big Data [57]. As logical as it may sound, such a concept may be beneficial up to a certain degree of complexity of a society, but beyond this point it limits the cultural and societal evolution [15, 57]. While the approach to take decisions like a “wise king” might help to advance Singapore and a number of other countries for some time, in a country like Germany or Switzerland, which gain their power and success by engaging in balanced and fair solutions in a diverse and well educated society with a high degree of civic participation, it would be a step backwards. Diversity and complexity are a precondition for innovation, societal resilience, and socio-economic well-being [15]. However, we can benefit from complexity and diversity only if we enable distributed management and self-regulating systems. This requires restricting top-down control to what cannot be managed in a bottom-up way. That is where the transformative potential of information and communication systems really lies: information technology can now enable the social, economic and political participation and coordination that was just impossible to organize before. It is now the time for a societal dialogue about the path that the emerging digital society should take: either one that is authoritarian and top-down, or one that is based on freedom, creativity, innovation, and participation, enabling bottom-up engagement [16]. I personally believe a participatory market society is offering the better perspectives for industrialized service societies in the future, and that it will be superior to an authoritarian top-down approach. Unfortunately, it seems we are heading towards the latter.
But it is important to recognize that the dangers of the current Big Data approach are substantial, and that there is nobody who could not become a victim of it. It is crucial to understand and admit that we need a better approach, and that it was a mistake to engage in the current one.

## 1.12. We must move beyond September 11

The present Big Data approach seems to be one of the many consequences of September 11, which did not change our world for the better. By now, it has become clear that the “war on X” approach (where “X” stands for drugs, terror, or other countries) does not work. Feedback, cascade and side effects have produced many unintended results. In the meantime, one is trying to find “medicines” against the side effects of the medicines that were applied to the world before. The outcome of the wars in Iraq and Afghanistan can hardly be celebrated as a success. Rather, we see that these wars have destabilized entire regions. We are faced with increased terrorism by people who consider themselves freedom fighters, a chaotic aftermath of the Arab Spring revolutions, devastating wars in Syria, Israel and elsewhere, an invasion of religious warriors, increased unwelcome migration, the poverty-related spread of dangerous diseases, and larger-than-ever public spending deficits. Torture, Guantanamo, secret prisons, drones and an aggressive cybersecurity approach have not managed to make the world a safer place [58]. As these problems demonstrate, globalization means that problems in other parts of the world will sooner or later feed back on us [59]. In other words, if we want to have a better and peaceful life in the future, we must make sure that others around us also find peaceful and reasonable living conditions. To better understand the often unexpected and undesirable feedback, cascade and side effects occurring in the complex interdependent systems of our globalized world, it is important to develop a Global Systems Science [1]. For example, it has recently been pointed out, even from unexpected quarters such as Standard and Poor’s, that too much inequality endangers economic and societal progress [60]. It is also important to recognize that respect and qualified trust are a more sustainable basis for socio-economic order than power and fear [61, 68]. 
I believe the dangerous aspect of mass surveillance is that its impact will become obvious only over a period of many years. By the time we notice it, it might be too late to protect ourselves from harm. Like nuclear radiation, mass surveillance has effects that one cannot feel directly, but it nevertheless causes structural damage, in this case to democratic societies. Mass surveillance undermines trust and legitimacy. However, trust and legitimacy are the glue that holds societies together: they create the power of our political representatives and public institutions. Without trust, a society becomes unstable.

## 1.13. What needs to be done

It is not unreasonable to be afraid of the “ghosts out of the bottle” that mass surveillance released. Some people consider it to be one of the things that escaped from Pandora’s Box in the aftermath of September 11. But hope remains. What can we do? First, to ensure accountability, it seems necessary to record each access to personal data (including the computational operation and the exact data set it was executed on). Second, one must restore the public’s lost trust, which requires a sufficient level of transparency. For example, the log files of data queries executed by secret services and other public authorities would have to be accessible to independent and sufficiently empowered supervising authorities. Similarly, log files of data queries executed by companies should be regularly checked by independent experts such as qualified scientists or citizen scientists. To be able to trust Big Data analytics, the public must know that it is scientifically sound and compliant with the values of our society and constitution. This also requires that users, customers, and citizens have the right to legally challenge the results of Big Data analytics. For this, Big Data analytics must be made reproducible, such that the quality and legal compliance of data mining results can be checked by independent experts. Furthermore, it should be ensured that the power of Big Data is not used against the legitimate interests of people. For example, I recommend using it to enable people, scientists, companies and politicians to take better decisions and more effective actions rather than applying it for the sake of large-scale law enforcement. The use of Big Data for criminal investigation should, therefore, be restricted to activities that endanger the foundations of a well-functioning society. It might further be necessary to punish data manipulation and data pollution, no matter who engages in it (including secret services). 
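To make the first recommendation more concrete, the following is a minimal sketch of how each access to personal data could be recorded in a way that independent supervisors can later check. It is an illustration under stated assumptions, not a description of any existing system: the class `AccessLog`, its fields (`who`, `operation`, `dataset_hash`, `timestamp`) and the hash-chaining scheme are hypothetical choices made for this example. Each entry stores the operation, a fingerprint of the exact data set it was executed on, and the hash of the previous entry, so that later changes to the log become detectable.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(payload: dict) -> str:
    """Return a SHA-256 hex digest of a dict, serialized deterministically."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass
class AccessLog:
    """Hypothetical append-only log; every entry is chained to its predecessor."""
    entries: list = field(default_factory=list)

    def record(self, who: str, operation: str, dataset: bytes, timestamp: str) -> dict:
        """Log one data access: who ran which operation on which exact data set."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        body = {
            "who": who,
            "operation": operation,
            # Fingerprint of the data set, so the log need not store the data itself.
            "dataset_hash": hashlib.sha256(dataset).hexdigest(),
            "timestamp": timestamp,
            "prev_hash": prev_hash,
        }
        entry = dict(body, entry_hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; return False if any entry was altered or reordered."""
        prev_hash = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True
```

In such a design, a supervising authority would only need the log itself to call `verify()`. Checking that a logged operation was run on the claimed data would additionally require comparing `dataset_hash` with a fingerprint of the archived data set. A real deployment would also need protection against replacing the whole log, for example by publishing the latest hash at regular intervals, which this sketch does not attempt.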
Given the many instances of data manipulation today, data traces should not be considered as evidence in themselves. Furthermore, for the sake of just and legitimate sanctioning systems, it must be ensured that sanctions are not applied in an arbitrary and selective way. In addition, the number of criminal investigations triggered by data analytics must be kept low and controlled by the parliament. Otherwise, in an over-regulated society, Big Data analytics might be misused by the elites to shape society according to their taste, and this would surely end in a disaster sooner or later. In particular, the use of Big Data should not get in the way of freedom and innovation, as these are important functional success principles of complex societies. It is also important to recognize that the emergent digital society will require particular institutions, as was the case for the industrial and the service societies. This includes data infrastructures implementing a “new deal on data” [62], which would give users control over their own data and allow them to benefit from profits created with them. This can be done with the “Personal Data Purse” approach, which has recently been developed to comply with the constitutional right of informational self-determination [63]. Further infrastructures and institutions needed by the digital society are addressed in [15].

## 1.14. A better future, based on self-regulation

Finally, I recommend engaging in the creation of self-regulating systems. These can be enabled by real-time measurements, which the sensor networks underlying the emerging “Internet of Things” will increasingly allow. Interestingly, such applications can support socio-economic coordination and order based on self-organization, without requiring the storage of personal or other sensitive data. In other words, the production of data and their use for self-regulating systems would be temporary and local, thereby enabling efficient and desirable socio-economic outcomes while avoiding dystopian surveillance scenarios. I am convinced that this is the information-based way into a better future and, therefore, I will describe further details of this approach in an upcoming book on the self-regulating digital society [64]. While our writings to date have focused more on concerns related to current trends and developments, a forthcoming book will focus on the question of what we can do to promote a “happy end.”

#### Acknowledgments

I would like to thank many friends and colleagues, in particular the world-wide FuturICT community, for the inspiring discussions and the continued support.

#### References

1. D. Helbing (2013) Globally networked risks and how to respond. Nature 497, 51-59, see http://www.researchgate.net/publication/236602842_Globally_networked_risks_and_how_to_respond/file/60b7d52ada0b3d1494.pdf; also see http://www.sciencedaily.com/releases/2013/05/130501131943.htm

2. D. Helbing (2010) Systemic Risks in Society and Economics. International Risk Governance Council (IRGC), see http://www.researchgate.net/publication/228666065_Systemic_risks_in_society_and_economics/file/9fcfd50bafbc5375d6.pdf

3. D. Helbing and S. Balietti (2010) Fundamental and real-world challenges in economics. Science and Culture 76(9-10), 399-417, see http://www.saha.ac.in/cmp/camcs/Sci_Cul_091010/17%20Dirk%20Helbing.pdf

4. P. Ormerod and D. Helbing (2012) Back to the drawing board for macroeconomics, in D. Coyle (ed.) What’s the Use of Economics?: Teaching the Dismal Science after the Crisis, London Publishing Partnership, see http://volterra.co.uk/wp-content/uploads/2013/03/2_Back-to-the-Drawing-Board-for-Macroeconomics.pdf

5. D. Helbing and A. Kirman (2013) Rethinking economics using complexity theory. Real-World Economics Review 64, 23-52, see http://www.paecon.net/PAEReview/issue64/HelbingKirman64.pdf; also see http://futurict.blogspot.ch/2013/04/how-and-why-our-conventional-economic_8.html

6. D. Helbing (2013) Economics 2.0: The natural step towards a self-regulating, participatory market society. Evolutionary and Institutional Economics Review 10, 3-41, see https://www.jstage.jst.go.jp/article/eier/10/1/10_D2013002/_pdf; also see http://www.todayonline.com/singapore/new-kind-economy-born

7. D. Helbing, S. Balietti et al. (2011) Visioneer special issue: How can we Learn to Understand, Create and Manage Complex Techno-Socio-Economic Systems? http://epjst.epj.org/index.php?option=com_toc&url=/articles/epjst/abs/2011/04/contents/contents.html

8. D. Helbing and S. Balietti, Big Data, Privacy, and Trusted Web: What Needs to Be Done, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2322082

9. OECD Global Science Forum (2008) Applications of Complexity Science for Public Policy: New Tools for Finding Unanticipated Consequences and Unrealized Opportunities, http://www.oecd.org/science/sci-tech/43891980.pdf

10. The FuturICT Knowledge Accelerator: Unleashing the Power of Information for a Sustainable Future, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1597095; see also http://arxiv.org/abs/1304.0788

11. J. van den Hoven et al. (2012) FuturICT - The road towards ethical ICT, Eur. Phys. J. Special Topics 214, 153-181, http://link.springer.com/article/10.1140/epjst/e2012-01691-2#page-1

12. S. Buckingham Shum et al. (2012) Towards a Global Participatory Platform, Eur. Phys. J. Special Topics 214, 109-152, http://link.springer.com/article/10.1140/epjst/e2012-01690-3#page-1

13. For an overview of the Snowden revelations, see http://www.theguardian.com/world/the-nsa-files

14. Heise Online (July 25, 2013) Bundespräsident Gauck “sehr beunruhigt” über US-Überwachung, http://www.heise.de/newsticker/meldung/Bundespraesident-Gauck-sehr-beunruhigt-ueber-US-Ueberwachung-1924026.html; for further interesting quotes see http://www.spiegel.de/international/europe/eu-officials-furious-at-nsa-spying-in-brussels-and-germany-a-908614.html

15. D. Helbing (June 12, 2014) What the Digital Revolution Means for Us, Science Business, http://www.sciencebusiness.net/news/76591/What-the-digital-revolution-means-for-us, see also [16]

16. See the videos http://www.youtube.com/watch?v=I_Lphxknozc and http://www.youtube.com/watch?v=AErRh_yDr-Q

17. The Intercept (August 5, 2014) Barack Obama’s secret terrorist-tracking system, by the numbers, https://firstlook.org/theintercept/article/2014/08/05/watch-commander/

18. Foreign Policy (July 17, 2013) 3 degrees of separation is enough to have you watched by the NSA, http://complex.foreignpolicy.com/posts/2013/07/17/3_degrees_of_separation_is_enough_to_have_you_watched_by_the_nsa; see also “Three degrees of separation” in http://www.theguardian.com/world/interactive/2013/nov/01/snowden-nsa-files-surveillance-revelations-decoded#section/1

19. The Washington Post (January 12, 2014) NSA phone record collection does little to prevent terrorist attacks, group says, http://www.washingtonpost.com/world/national-security/nsa-phone-record-collection-does-little-to-prevent-terrorist-attacks-group-says/2014/01/12/8aa860aa-77dd-11e3-8963-b4b654bcc9b2_story.html?hpid=z4; see also http://securitydata.newamerica.net/nsa/analysis

20. M. Gill and A. Spriggs: Assessing the impact of CCTV. Home Office Research, Development and Statistics Directorate (2005), https://www.cctvusergroup.com/downloads/file/Martin%20gill.pdf; see also BBC News (August 24, 2009) 1,000 cameras ‘solve one crime’, http://news.bbc.co.uk/2/hi/uk_news/england/london/8219022.stm

21. Home Office (September 2012) Review of the Operation of Schedule 7, https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/157896/consultation-document.pdf; also see http://www.theguardian.com/commentisfree/2013/aug/18/david-miranda-detained-uk-nsa

22. National Geographic (January 22, 2013) The war on drugs is a “miserable failure”, http://newswatch.nationalgeographic.com/2013/01/22/the-war-on-drugs-is-a-miserable-failure/

23. Electronic Frontier Foundation (August 6, 2013) DEA and NSA team up to share intelligence, leading to secret use of surveillance in ordinary investigations, https://www.eff.org/deeplinks/2013/08/dea-and-nsa-team-intelligence-laundering

24. The Guardian (August 12, 2013) Eric Holder unveils new reforms aimed at curbing US prison population, http://www.theguardian.com/world/2013/aug/12/eric-holder-smart-crime-reform-us-prisons

25. The Intercept (March 7, 2014) Guilty until proven innocent, https://firstlook.org/theintercept/document/2014/03/07/guilty-proven-innocent/; see also http://www.huffingtonpost.com/2014/08/15/unlawful-arrests-police_n_5678829.html

26. J. Schmieder (2013) Mit einem Bein im Knast, http://www.amazon.com/Mit-einem-Bein-Knast-gesetzestreu-ebook/dp/B00BOAFXKM/ref=sr_1_1?ie=UTF8; see also http://www.spiegel.tv/filme/magazin-29122013-verboten/

27. Spiegel (December 9, 2013) Redtube.com: Massenabmahnungen wegen Porno-Stream, http://www.spiegel.de/netzwelt/web/porno-seite-redtube-abmahnungen-gegen-viele-nutzer-a-938077.html; see also http://www.spiegel.de/netzwelt/web/massenabmahnungen-koennen-laut-gerichtsurteil-ein-rechtsmissbrauch-sein-a-939764.html

29. MailOnline (January 12, 2013) The terrifying rise of cyber crime: Your computer is currently being targeted by criminal gangs looking to harvest your personal details and steal your money, http://www.dailymail.co.uk/home/moslive/article-2260221/Cyber-crime-Your-currently-targeted-criminal-gangs-looking-steal-money.html; see also https://firstlook.org/theintercept/article/2014/03/12/nsa-plans-infect-millions-computers-malware/

30. New York Times (August 5, 2014) Russian Hackers Amass Over a Billion Internet Passwords, http://www.nytimes.com/2014/08/06/technology/russian-gang-said-to-amass-more-than-a-billion-stolen-internet-credentials.html?_r=0

31. Spiegel Online (June 5, 2014) Umfrage zum Datenschutz: Online misstrauen die Deutschen dem Staat, http://www.spiegel.de/netzwelt/web/umfrage-deutsche-misstrauen-dem-staat-beim-online-datenschutz-a-973522.html

32. Focus Money Online (August 8, 2014) Test enthüllt Fehler in jeder zweiten Schufa-Auskunft, http://www.focus.de/finanzen/banken/ratenkredit/falsche-daten-teure-gebuehren-test-enthuellt-fehler-in-jeder-zweiten-schufa-auskunft_id_4046967.html

33. Versicherungsbote (July 2, 2013) Wirtschaftsspionage durch amerikanischen Geheimdienst NSA - Deutsche Unternehmen sind besorgt, http://www.versicherungsbote.de/id/89486/Wirtschaftsspionage-durch-amerikanischen-Geheimdienst-NSA/; see also http://pretioso-blog.com/der-fall-enercon-in-der-ard-wirtschaftspionage-der-usa-durch- and http://www.tagesschau.de/wirtschaft/wirtschafsspionage100.html

34. Zeit Online (April 17, 2014) Blackout, http://www.zeit.de/2014/16/blackout-energiehacker-stadtwerk-ettlingen

35. Stuxnet, see http://en.wikipedia.org/wiki/Stuxnet, http://www.zdnet.com/blog/security/stuxnet-attackers-used-4-windows-zero-day-exploits/7347, http://www.itworld.com/security/281553/researcher-warns-stuxnet-flame-show-microsoft-may-have-been-infiltrated-nsa-cia

36. PC News (July 31, 2014) Researchers warn about ‘BadUSB’ exploit, http://www.pcmag.com/article2/0,2817,2461717,00.asp

37. Mail Online (October 31, 2013) China is spying on you through your KETTLE: Bugs that scan wi-fi devices found in imported kitchen gadgets, http://www.dailymail.co.uk/news/article-2480900/China-spying-KETTLE-Bugs-scan-wi-fi-devices-imported-kitchen-gadgets.html

39. MIT Technology Review (October 8, 2013) NSA’s own hardware backdoors may still be a “Problem from hell”, http://www.technologyreview.com/news/519661/nsas-own-hardware-backdoors-may-still-be-a-problem-from-hell/; see also http://www.theguardian.com/world/2013/sep/05/nsa-gchq-encryption-codes-security, http://www.eteknix.com/expert-says-nsa-have-backdoors-built-into-intel-and-amd-processors/, http://en.wikipedia.org/wiki/NSA_ANT_catalog

40. RT (July 3, 2014) NSA sued for hoarding details on use of zero day exploits, http://rt.com/usa/170264-eff-nsa-lawsuit-0day/; see also http://www.wired.com/2014/04/obama-zero-day/

41. Private Wifi (March 31, 2014) New drone can hack into your mobile device, http://www.privatewifi.com/new-drone-can-hack-into-your-mobile-device/; see also http://securitywatch.pcmag.com/hacking/314370-black-hat-intercepting-calls-and-cloning-phones-, http://www.npr.org/blogs/alltechconsidered/2013/07/15/201490397/How-Hackers-Tapped-Into-My-Verizon-Cellphone-For-250, http://www.alarmspy.com/phone_hacking_9.html

42. The Guardian (July 31, 2013) XKeyscore: NSA tool collects ‘nearly everything a user does on the internet’, http://www.theguardian.com/world/2013/jul/31/nsa-top-secret-program-online-data; see also http://en.wikipedia.org/wiki/XKeyscore

43. Business Insider (June 10, 2013) How a GED-holder managed to get ‘top secret’ government clearance, http://www.businessinsider.com/edward-snowden-top-secret-clearance-nsa-whistleblower-2013-6

44. The Guardian (September 16, 2013) Academics criticise NSA and GCHQ for weakening online encryption, http://www.theguardian.com/technology/2013/sep/16/nsa-gchq-undermine-internet-security

45. BBC News (April 10, 2014) Heartbleed bug: What you need to know, http://www.bbc.com/news/technology-26969629; see also http://en.wikipedia.org/wiki/Heartbleed

46. Huff Post (July 25, 2014) Apple May Be Spying On You Through Your iPhone, http://www.huffingtonpost.com/2014/07/26/apple-iphones-allow-extra_n_5622524.html; see also http://tech.firstpost.com/news-analysis/chinese-media-calls-apples-iphone-a-national-security-concern-227246.html

47. The Guardian (May 20, 2014) Chinese military officials charged with stealing US data as tensions escalate, http://www.theguardian.com/technology/2014/may/19/us-chinese-military-officials-cyber-espionage; see also http://www.nytimes.com/2006/06/07/washington/07identity.html, http://blog.techgenie.com/editors-pick/data-theft-incidents-to-prevent-or-to-cure.html, http://articles.economictimes.indiatimes.com/keyword/data-theft/recent/5

48. See http://en.wikipedia.org/wiki/Cyberwarfare and http://en.wikipedia.org/wiki/Cyber-attack and http://www.wired.com/2013/06/general-keith-alexander-cyberwar/all/

49. The National Interest (June 24, 2013) Cyberwar and the nuclear option, http://nationalinterest.org/commentary/cyberwar-the-nuclear-option-8638

50. Wired (August 13, 2014) Meet MonsterMind, the NSA Bot That Could Wage Cyberwar Autonomously, http://www.wired.com/2014/08/nsa-monstermind-cyberwarfare/

51. InfoWorld (February 14, 2013) The real story of how big data analytics helped Obama win, http://www.infoworld.com/d/big-data/the-real-story-of-how-big-data-analytics-helped-obama-win-212862; see also http://www.technologyreview.com/featuredstory/509026/how-obamas-team-used-big-data-to-rally-voters/; Nature (September 12, 2012) Facebook experiment boosts US voter turnout, http://www.nature.com/news/facebook-experiment-boosts-us-voter-turnout-1.11401; see also http://www.sbs.com.au/news/article/2012/09/13/us-election-can-twitter-and-facebook-influence-voters

52. RT (February 25, 2014) Western spy agencies build ‘cyber magicians’ to manipulate online discourse, http://rt.com/news/five-eyes-online-manipulation-deception-564/; see also https://firstlook.org/theintercept/2014/02/24/jtrig-manipulation/, https://firstlook.org/theintercept/2014/07/14/manipulating-online-polls-ways-british-spies-seek-control-internet/, http://praag.org/?p=13752

53. NZZ (August 18, 2014) Ins Amt gezwitschert? www.nzz.ch/aktuell/startseite/ins-amt-gezwitschert-1.18202760

54. Secret services are there to stabilize democracies? The reality looks different, https://www.facebook.com/FuturICT/posts/576176715754340

55. DW (June 26, 2013) Bulgarians protest government of ‘oligarchs’, http://www.dw.de/bulgarians-protest-government-of-oligarchs/a-16909751; see also Tagesschau.de (July 24, 2013) Zorn vieler Bulgaren ebbt nicht ab, http://www.tagesschau.de/ausland/bulgarienkrise102.html

56. From the census ruling (Volkszählungsurteil) of the German Federal Constitutional Court (BVerfGE 65, 1 ff; NJW 84, 419 ff, see http://de.wikipedia.org/wiki/Volkszählungsurteil), in translation: “A social order, and a legal order enabling it, in which citizens can no longer know who knows what about them, when, and on what occasion, would not be compatible with the right to informational self-determination. Anyone who is unsure whether deviant behavior is being noted at any time and permanently stored, used, or passed on as information will try not to attract attention through such behavior. [...] This would impair not only the individual’s chances of personal development but also the common good, because self-determination is an elementary functional condition of a free democratic community founded on its citizens’ capacity to act and to participate. From this it follows: under the modern conditions of data processing, the free development of personality presupposes the protection of the individual against unlimited collection, storage, use, and disclosure of his or her personal data. This protection is therefore covered by the fundamental right of Art. 2(1) in conjunction with Art. 1(1) of the Basic Law (GG). In this respect, the fundamental right guarantees the authority of the individual to decide, in principle, for himself or herself on the disclosure and use of his or her personal data.” See also Alexander Rossnagel (August 28, 2013) “Big Data und das Konzept der Datenschutzgesetze”, http://www.privacy-security.ch/2013/Download/Default.htm

57. Foreign Policy (2014) The Social Laboratory, see http://www.foreignpolicy.com/articles/2014/07/29/the_social_laboratory_singapore_surveillance_state; also see Dirk Helbing (March 27, 2013) Google as God? Opportunities and Risks of the Information Age, http://futurict.blogspot.ie/2013/03/google-as-god-opportunities-and-risks.html; see also From crystal ball to magic wand: The new world order in times of digital revolution, https://www.youtube.com/watch?v=AErRh_yDr-Q

58. Live Science (January 6, 2011) U.S. torture techniques unethical, ineffective, http://www.livescience.com/9209-study-torture-techniques-unethical-ineffective.html; see also http://en.wikipedia.org/wiki/Effectiveness_of_torture_for_interrogation and http://www.huffingtonpost.com/2014/04/11/cia-harsh-interrogations_n_5130218.html; The Guardian (January 7, 2013) US drone attacks ‘counter-productive’, former Obama security adviser claims, http://www.theguardian.com/world/2013/jan/07/obama-adviser-criticises-drone-policy; see also http://www.huffingtonpost.com/2013/05/21/us-drone-strikes-ineffective_n_3313407.html and http://sustainablesecurity.org/2013/10/24/us-drone-strikes-in-pakistan/; NationalJournal (April 30, 2014) The NSA isn’t just spying on us, it’s also undermining Internet security, http://www.nationaljournal.com/daily/the-nsa-isn-t-just-spying-on-us-it-s-also-undermining-internet-security-20140429; see also http://www.slate.com/blogs/future_tense/2014/07/31/usa_freedom_act_update_how_the_nsa_hurts_our_economy_cybersecurity_and_foreign.html

59. D. Helbing et al. (2014) Saving human lives: What complexity science and information systems can contribute. J. Stat. Phys., http://link.springer.com/article/10.1007%2Fs10955-014-1024-9

60. Time (August 5, 2014) S&P: Income Inequality Is Damaging the Economy, http://time.com/3083100/income-inequality/

61. Physics Today (July 2013) Qualified trust, not surveillance, is the basis of a stable society, http://scitation.aip.org/content/aip/magazine/physicstoday/news/10.1063/PT.4.2508; see also the Foreword in Consumer Data Privacy in a Networked World (February 2012) http://www.whitehouse.gov/sites/default/files/privacy-final.pdf, which starts: “Trust is essential to maintaining the social and economic benefits that networked technologies bring to the United States and the rest of the world.”

62. World Economic Forum (2011) Personal Data: Emergence of a New Asset Class, see http://www.weforum.org/reports/personal-data-emergence-new-asset-class

63. Y.-A. de Montjoye, E. Shmueli, S. S. Wang, and A. S. Pentland (2014) openPDS: Protecting the Privacy of Metadata through SafeAnswers, http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0098790; see also ftp://131.107.65.22/pub/debull/A12dec/large-scale.pdf, http://infoclose.com/protecting-privacy-online-new-system-would-give-individuals-more-control-over-shared-digital-, http://www.taz.de/!131892/, http://www.taz.de/!143055/

64. Dirk Helbing (August 14, 2014) The world after Big Data: Building the self-regulating society, see https://www.youtube.com/watch?v=I_Q_-Pk-btY or https://www.youtube.com/watch?v=I_Lphxknozc

65. How and Why Our Conventional Economic Thinking Causes Global Crises http://futurict.blogspot.de/2013/04/how-and-why-our-conventional-economic_8.html

66. “Networked Minds” Require A Fundamentally New Kind of Economics (19 March 2013) http://futurict.blogspot.de/2013/03/networked-minds-require-fundamentally_19.html?spref=fb and see also How Natural Selection Can Create Both Self- and Other-Regarding Preferences, and Networked Minds (19 March 2013) http://www.nature.com/srep/2013/130319/srep01480/pdf/srep01480.pdf

67. Global Networks Must Be Redesigned (1 May 2013) http://phys.org/news/2013-05-global-networks-redesigned-professor.html#jCp

68. From Technology-Driven Society to Socially Oriented Technology - The Future of Information Society - Alternatives to Surveillance (9 July 2013) http://futurict.blogspot.ie/2013/07/from-technology-driven-society-to.html