Opportunities for Social Theory in the Age of Big Data
Great leaps in human history are the result of understanding something new, sometimes in the most unexpected ways. Landing on the moon required understanding how an apple falls to the ground, winning WWII required understanding computing, and the creation of massive internet searches required understanding social influence.
What does it take to make human societies more decent, happy and creative? It certainly requires understanding cities.
But what does it take to understand cities? Don’t citizens, mayors, and myriads of organizations understand cities well enough? Don’t we have the technologies and the engineering to create clean, efficient cities? Isn’t more and bigger data enough to guide such improvements?
I believe that the answer to these questions is no. Specifically, we do not understand cities well enough to systematically tackle the problems that matter the most: issues of persistent poverty, inequality, social mobility, and the promotion of human creativity and innovation. These are not only the pivotal problems of American cities: creating a sustainable urbanized world hinges on our ability to tackle them everywhere (Bettencourt 2014a).
While the physical environment of cities is certainly important, and their infrastructure and services are vital enablers of modern life, the city is above all a complex, intertwined, and dynamic web of multidimensional social and economic relations (Jacobs 1970; Sampson 2012; Bettencourt 2014b).
For this reason, understanding the most critical issues of cities depends on fundamental progress in the social sciences. New and more extensive data is pouring in about many aspects of people’s lives, their choices, and their structure and development in built spaces worldwide. This is challenging old frameworks and stimulating new advances. Here I provide what, in my opinion, are the most promising directions for social theory in the era of big data, with cities and urban life at their fulcrum.
How Big Data works (so far)
Enthusiasts of big data hope to solve difficult human problems without the need for scientific theory (Anderson 2008). This proposition may be anathema to a (social) scientist, but there are actually many examples of how this is possible when difficult problems all obey the same logic. Formalizing this logic allows us to use data and information technologies at their best and also identify the limits of such strategies (Bettencourt 2014c).
Effective urban services, from running a transit system to picking up trash, are typically easy to improve using better operational data for three specific reasons: 1) clear targets of optimization can be defined and measured easily (e.g. waiting time), 2) measurements can be made locally and continuously in time using new technologies, and 3) performance can be controlled in simple ways (dispatching) through the active management of resources to ensure a target standard of service.
Most emerging uses of big data in cities have this flavor: they concentrate on measuring the short-term status of urban environments as aids to on-demand operations. This defines what is becoming known as urban analytics: ways to keep a finger on the pulse of the city in order to act more effectively and quickly.
The logic of these uses of data is no different from that found in other technologies specified by engineering theory. From error-free computing to smooth airplane flight, from effective home thermostats to self-driving cars, fast measurement and continuous course-correction keep complex engineered systems operating in optimal and stable ways. This can be done without sophisticated scientific theory so long as the system remains sufficiently close to its desired state.
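The feedback logic described above can be illustrated with a minimal sketch. It assumes a toy model, not drawn from any real transit system, in which waiting time equals a fixed demand divided by the number of vehicles in service. The function name, the proportional gain, and all numbers are hypothetical choices made only for illustration.

```python
def simulate_dispatch(target_wait=5.0, gain=0.5, steps=50, demand=12.0):
    """Proportional feedback control of a service level (toy model).

    Assumption: wait = demand / vehicles. At each step the operator measures
    the waiting time and dispatches (or withdraws) vehicles in proportion to
    the gap between the measured wait and the target wait.
    """
    vehicles = 1.0  # hypothetical starting fleet in service
    history = []
    for _ in range(steps):
        wait = demand / vehicles          # 1) measure the target quantity
        history.append(wait)
        error = wait - target_wait        # 2) compare with the service standard
        # 3) course-correct; keep the fleet strictly positive
        vehicles = max(0.1, vehicles + gain * error)
    return history
```

No model of why demand is what it is appears anywhere in the loop: measurement and correction alone bring the waiting time to the target, which is the sense in which such problems can be handled without deeper theory. The sketch offers no analogous recipe when the target itself is unclear.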
But socioeconomic problems such as education, poverty, and justice escape the logic of feedback control solutions for several reasons. First, it is not entirely clear what should be measured or optimized quantitatively. As we know, obvious choices, such as raising standardized test scores or buying more sophisticated police equipment, may or may not contribute to the sustainable improvement of people’s lives. These problems are also multidimensional, and their causality runs in both directions, often making simple interventions ineffective: is a city rich because it has good infrastructure? Or does it have good infrastructure because it is rich?
For these reasons, many socioeconomic issues are known to planners as wicked problems (Rittel and Webber 1973). Understanding and solving wicked problems is where the social sciences really shine. The rest is engineering.
The Planner’s Problem and the Role of Self-Organization
If a technologist tells you that (s)he is going to solve all problems of cities using big data, you should be suspicious. The central roadblock is the social Planner’s Problem. Economists and urbanists have known this for a long time as they came to understand the role of information in human societies (Hayek 1945; Alexander 1964) and the ability of markets to solve massive coordination problems. But the best formal argument comes from computer science: it states that the amount of computation (not data!) necessary to optimally organize a society of heterogeneous agents is prohibitively large. This makes brute force computational solutions intractable and brings theory back in.
Board games are an excellent example of this problem: how does one solve chess or Go? Describing the game’s configuration is not the problem. These games are difficult, instead, because they involve a massive number of configurations, each with a different value. This number is so large that the configurations cannot be listed and evaluated exhaustively as a method to find the best solution.
Now think of a city with millions of people. Imagine, for a moment, that the social planner (Mayor) is handed this “game” to solve. The Mayor can, with some effort and some technology, measure the city intensely, noting every citizen and every place across several quantities many times a second. It turns out that this does not generate that much data by modern standards. But can the Mayor now determine the optimal strategy going forward? The problem is a little bit like chess, but with millions of different pieces (people, places) and rules of play. It is totally intractable, involving numbers of possibilities that are myriads of times larger than the number of particles in our universe.
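The contrast between the amount of data and the amount of computation can be made concrete with some rough arithmetic. The sketch below is only illustrative: the sampling rates, the 8 bytes per measurement, the simplification that each person faces a single binary choice, and the commonly quoted figure of about 10^80 particles in the observable universe are all assumptions introduced here, not numbers from the argument above.

```python
import math

# Order-of-magnitude estimate often quoted for particles in the observable
# universe; used here purely as an assumed benchmark.
PARTICLES_LOG10 = 80


def data_rate_bytes_per_second(people, quantities, samples_per_second,
                               bytes_per_value=8):
    """Raw measurement volume: grows only linearly with the population."""
    return people * quantities * samples_per_second * bytes_per_value


def log10_configurations(people, choices_per_person=2):
    """log10 of the number of joint configurations, choices ** people.

    Working in logarithms avoids building astronomically large integers.
    """
    return people * math.log10(choices_per_person)


def people_needed_to_exceed(log10_target, choices_per_person=2):
    """Smallest population whose joint configurations exceed 10 ** log10_target."""
    return math.floor(log10_target / math.log10(choices_per_person)) + 1
```

Under these assumptions, data_rate_bytes_per_second(1_000_000, 10, 10) is 800 million bytes per second, a volume that grows in step with the population. By contrast, people_needed_to_exceed(PARTICLES_LOG10) returns 266: a group of only 266 people, each facing one yes-or-no choice, already has more joint configurations than the assumed particle count, and for a million people the count has over 300,000 digits.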
This, in a nutshell, means that the decent city, the city of people and socioeconomic organizations, is impossible to plan in detail. But cities are systems with specific properties and dynamics that can, in principle, be better harnessed to promote positive change. This is why we need theory.
The Theory Challenge
The social sciences have developed uniquely insightful theory with no counterpart in the natural sciences. However, there is a convergence between theory in the social and natural sciences that is latent in cities but that has not yet happened. This, in my opinion, is the most fertile territory for new developments.
All fundamental theories in the natural sciences are theories of process, not of form or structure. This is equally true of a hydrogen atom and of a tiger, which are structures that can be produced (and destroyed) by quantum mechanics and evolution by natural selection. Specific macroscopic structures, such as an ice crystal or a cloud, are exceedingly hard to predict. But the elementary change in these patterns, however complicated, is simple, making it a productive focus for theory.
A re-framing of social and economic theory in terms of non-equilibrium statistical processes in heterogeneous systems (cities) is still in its early stages. There are three important areas: i) the co-evolution of built space and socioeconomic life; ii) individual learning and adaptation as the basis for human development and economic growth; iii) the co-development of social structures (networks and organizations) with these changes. I chose these three areas because they are deep and difficult but also obvious and pivotal for policy.
The relationship between built space and socioeconomic life is the central problem of urban planning (Alexander 1964; Lynch 1984). Planners have often assumed that certain built environments are conducive to specific choices and lifestyles, but such determinism proved misplaced. At the finest level, there are certainly functional causal effects: for example, neighborhoods with poor services inflict costs on their residents in terms of time and effort to deal with the simplest things in life. But what are the specific characteristics of services and physical spaces that sustainably improve people’s lives? Can they be achieved incrementally? How do they take off, generating virtuous cycles of human development and service improvement? Clear answers to these questions would profoundly change our ability to generate effective policy.
Second, current theory often models humans as social agents with certain identities, reputation, human capital, preferences, and so on. However, the defining characteristic of humans is their adaptability: the city is a rich learning environment, for good and for ill. Our ability to learn and adapt is limited by the places and people we can access, and thus by the social networks we can build (Fischer 1982; Sampson 2008, 2012). Cognitive scientists are starting to show how many human cognitive biases (Galesic et al. 2012; see also Wilson 1987) can be explained by limited social and spatial horizons. Cities are the places where social horizons change. How can these processes be harnessed to create faster individual human development and more effective organizations?
Finally, social networks are always dynamic, discrete, and finite, and are actively built by individuals as they search for changes in their lives and estimate who and what can assist them. These arguments are staples of sociological theory, but better models that quantitatively connect network and cognitive changes with their social and economic consequences for the individual and the city would shed much light on human societies and their growth and development.
In closing, I’d like to note that it is a curious fact that some of the most advanced statistical theory in physics and biology stemmed historically from questions about people and their social behavior (Ball 2002). Most of these questions remain unanswered by physical and biological theory despite its ability to analyze and model large amounts of (social) data. Bigger socioeconomic data challenges social theory to grow to better explain processes of learning, development and change in human societies across many levels of organization. As this happens, I believe social theory will move back to the center of all scientific theory and become the cornerstone of our understanding of the most complex systems in nature and how to manage them in decent and open-ended ways.
C. Alexander (1964) Notes on the Synthesis of Form (Harvard University Press, Cambridge MA).
C. Anderson (2008) The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
P. Ball (2002) The Physical Modeling of Society: a Historical Perspective. Physica A 314: 1–14.
L. M. A. Bettencourt (2014a) Mass Urbanization Could Lead to Unprecedented Human Creativity -- But Only if We Do it Right. Huffington Post http://www.huffingtonpost.com/luis-bettencourt/mass-urbanization-creativity_b_5670222.html
L. M. A. Bettencourt (2014b) The Origins of Scaling in Cities. Science 340: 1438-1441.
L. M. A. Bettencourt (2014c) The Uses of Big Data in Cities. Big Data 2: 12-22.
C. S. Fischer (1982) To Dwell among Friends: Personal Networks in Town and City (University of Chicago Press, Chicago IL).
M. Galesic, H. Olsson, J. Rieskamp (2012) Social Sampling Explains Apparent Biases in Judgments of Social Environments. Psychological Science 23: 1515–1523.
F. A. Hayek (1945) The Use of Knowledge in Society. American Economic Review 35: 519-530.
J. Jacobs (1970) The Economy of Cities (Vintage, New York, NY).
K. Lynch (1984) Good City Form (MIT Press, Cambridge MA).
H. W. J. Rittel, M. M. Webber (1973) Dilemmas in a General Theory of Planning. Policy Sciences 4: 155-169.