Field of Science


Carl Sagan's 1995 prediction of our technocratic dystopia

In 1995, just a year before his death, Carl Sagan published a bestselling book called “The Demon-Haunted World” which lamented what Sagan saw as the increasing encroachment of pseudoscience on people’s minds. It was an eloquent and wide-ranging volume. Sagan was mostly talking about obvious pseudoscientific claptrap such as alien abductions, psychokinesis and astrology. But he was also an astute observer of human nature who was well-educated in the humanities. His broad understanding of human beings led him to write the following paragraph which was innocuously buried in the middle of the second chapter.

“I have a foreboding of an America in my children's or grandchildren's time -- when the United States is a service and information economy; when nearly all the manufacturing industries have slipped away to other countries; when awesome technological powers are in the hands of a very few, and no one representing the public interest can even grasp the issues; when the people have lost the ability to set their own agendas or knowledgeably question those in authority; when, clutching our crystals and nervously consulting our horoscopes, our critical faculties in decline, unable to distinguish between what feels good and what's true, we slide, almost without noticing, back into superstition and darkness.”
As if these words were not ominous enough, Sagan follows up just a page later with another paragraph which is presumably designed to reduce us to a frightened, whimpering mass.

“I worry that, especially as the Millennium edges nearer, pseudoscience and superstition will seem year by year more tempting, the siren song of unreason more sonorous and attractive. Where have we heard it before? Whenever our ethnic or national prejudices are aroused, in times of scarcity, during challenges to national self-esteem or nerve, when we agonize about our diminished cosmic place and purpose, or when fanaticism is bubbling up around us - then, habits of thought familiar from ages past reach for the controls.

The candle flame gutters. Its little pool of light trembles. Darkness gathers. The demons begin to stir.”

What’s striking about this writing is its almost clairvoyant prescience. The phrases “fake news” and “post-factual world” were not in use in Sagan’s time, but he is clearly describing them when he talks about people being “unable to distinguish between what feels good and what's true”. And the rise of nationalist prejudice seems to have occurred almost exactly as he described.

It’s also interesting how Sagan’s prediction of the outsourcing of manufacturing mirrors the concerns of so many people who voted for Trump. The difference is that Sagan was not taking aim at immigrants, partisan politics, China or similar factors; he was simply seeing the disappearance of manufacturing as an essential consequence of its tradeoff with the rise of the information economy. We are now acutely living that tradeoff and it has cost us mightily.

One thing that’s difficult to say is whether Sagan was also anticipating the impact of technology on the displacement of jobs. Automation had already been around in the 90s and the computer was becoming a force to reckon with, but speech and image recognition, and the subsequent impact of machine learning on these tasks, were in their fledgling days. Sagan didn't know about these fields; nonetheless, the march of technology also feeds into his concern about people gradually descending into ignorance because they cannot understand the world around them, even as technological comprehension stays in the hands of a privileged few.

In terms of people “losing the ability to set their own agendas or question those in power”, how many of us, let alone those in power, can grasp the science and technology behind deep learning, climate change, genome editing or even our iPhones? And yet these tools are subtly inserting themselves into pretty much all aspects of life, and there will soon be a time when no part of our daily existence is untouched by them. Yet it will also be a time when we use these technologies without understanding them, essentially entrusting them with our lives, liberties and pursuit of happiness. Then, if something goes wrong, as it inevitably does with any complex system, we will be in deep trouble because of our lack of comprehension. Not only will there be chaos everywhere, but because we mindlessly used technology as a black box, we won't have the first clue about how to fix it.

Equally problematic is the paradox that as technology becomes more user-friendly, it becomes easier and easier to apply with abandon, without understanding its strengths and limitations. My own field of computer-aided drug design (CADD) is a good example. Twenty years ago, software tools in my field were the realm of experts. But graphical user interfaces, slick marketing and cheap computing power have now put them in the hands of non-experts. While this has led to a useful democratization of these tools, it has also led to their abuse and overapplication. For instance, most of these techniques have been used without a proper understanding of statistics, not only leading to incorrect results being published but also to a waste of resources and time in the always time-strapped pharmaceutical and biotech industries.

This same paradox is now going to underlie deep learning and AI, which are far more hyped and consequential than computer-aided drug design. Yesterday I read an interview with computer scientist Andrew Ng from Stanford who enthusiastically advocated that millions of people be taught AI techniques. Ng and others are well-meaning, but what's not discussed is the potential catastrophe that could arise from putting imperfect tools in the hands of millions of people who don't understand how they work and who suddenly start applying them to important aspects of our lives. To illustrate the utility of large-scale education in deep learning, Ng gives the example of how the emergence of commercial electric installations suddenly led to a demand for large numbers of electrical engineers. The difference is that electricity was far more deterministic and far better understood than AI is today. If it went wrong we largely knew how to fix it because we knew enough about the behavior of electrons, wiring and circuitry.

The problem with many AI algorithms like neural nets is that not only are they black boxes, but their exact utility is still a big unknown. In fact, AI is such a fledgling field that even the experts don't really understand its domains of applicability, so it's too much to expect that people who acquire AI diplomas in a semester or two will do any better. I would rather have a small number of experts develop and use imperfect technology than have millions adopt technologies which are untested, especially when they are being used not just in our daily lives but in critical services like healthcare, transportation and banking.

As far as “those in power” are concerned, Sagan hints at the fact that they may no longer be politicians but technocrats. Both government and Silicon Valley technocrats have already taken over many aspects of our lives, and their hold seems only to tighten. One little-appreciated take on that recent Google memo fiasco came from journalist Elaine Ou, who focused on a very different aspect of the incident: the way it points toward the technological elite carefully controlling what we read, digest and debate based on their own social and political preferences. As Ou says,

“Suppressing intellectual debate on college campuses is bad enough. Doing the same in Silicon Valley, which has essentially become a finishing school for elite universities, compounds the problem. Its engineers build products that potentially shape our digital lives. At Google, they oversee a search algorithm that seeks to surface “authoritative” results and demote low-quality content. This algorithm is tuned by an internal team of evaluators. If the company silences dissent within its own ranks, why should we trust it to manage our access to information?”

I personally find the idea that technological access can be controlled by the political or moral preferences of a self-appointed minority to be deeply disturbing. Far from all information being freely available at our fingertips, such control will instead ensure that we increasingly read the biased, carefully shaped perspective of this minority. For example, this recent event at Google has laid bare the social opinions of several of its most senior personnel as well as of those engineers who more directly control the flow of vast amounts of information permeating our lives every day. The question is not whether you agree or disagree with their views; it's that there's a good chance these opinions will increasingly and subtly – sometimes without their proponents even knowing it – embed themselves into the pieces of code that influence what we see and hear pretty much every minute of our hyperconnected world. And this is not about simply switching the channel. When politics is embedded in technology itself, you cannot really switch the channel until you switch the entire technological foundation, something that's almost impossible to accomplish in an age of oligopolies. This is an outcome that should worry even the most enthusiastic proponent of information technology, and it certainly should worry every civil libertarian. Even Carl Sagan was probably not thinking about this when he was talking about “awesome technological powers being in the hands of a very few”.

The real fear is that ignorance borne of technological control will be so subtle, gradual and all-pervasive that it will make us slide back, “almost without noticing”, not into superstition and darkness but into a false sense of security, self-importance and connectivity. In that sense it would very much resemble the situation in “The Matrix”. Politicians have used the strategy for ages, but ceding it to all-powerful machines enveloping us in their byte-lined embrace will be the ultimate capitulation. Giving people the illusion of freedom works better than any actual efforts at curbing freedom. Perfect control works when those who are controlled keep on believing the opposite. We can be ruled by demons when they come disguised as Gods.

A Manhattan Project for AI?

Neuroscientist and AI researcher Gary Marcus has an op-ed in the NYT in which he bemoans the lack of international collaboration in AI, a limitation that Marcus thinks is significantly hampering progress in the field. He says that AI researchers should consider a global effort akin to CERN: a massively funded, wide-ranging project to solve specific problems in AI that would benefit from the expertise of hundreds of independent researchers. This hivemind effort could potentially clear the AI pipeline of several clogs which have held back progress.

On the face of it this is not a bad idea. Marcus's opinion is that both private and public research has some significant limitations which a meld of the two could potentially overcome.

"Academic labs are too small. Take the development of automated machine reading, which is a key to building any truly intelligent system. Too many separate components are needed for any one lab to tackle the problem. A full solution will incorporate advances in natural language processing (e.g., parsing sentences into words and phrases), knowledge representation (e.g., integrating the content of sentences with other sources of knowledge) and inference (reconstructing what is implied but not written). Each of those problems represents a lifetime of work for any single university lab.

Corporate labs like those of Google and Facebook have the resources to tackle big questions, but in a world of quarterly reports and bottom lines, they tend to concentrate on narrow problems like optimizing advertisement placement or automatically screening videos for offensive content. There is nothing wrong with such research, but it is unlikely to lead to major breakthroughs. Even Google Translate, which pulls off the neat trick of approximating translations by statistically associating sentences across languages, doesn’t understand a word of what it is translating.

I look with envy at my peers in high-energy physics, and in particular at CERN, the European Organization for Nuclear Research, a huge, international collaboration, with thousands of scientists and billions of dollars of funding. They pursue ambitious, tightly defined projects (like using the Large Hadron Collider to discover the Higgs boson) and share their results with the world, rather than restricting them to a single country or corporation. Even the largest “open” efforts at A.I., like OpenAI, which has about 50 staff members and is sponsored in part by Elon Musk, is tiny by comparison.

An international A.I. mission focused on teaching machines to read could genuinely change the world for the better — the more so if it made A.I. a public good, rather than the property of a privileged few."

This is a good point. For all its commitment to blue sky research, Google is not exactly the Bell Labs of 2017, and except for highly targeted research like that done at Verily and Calico, it's still committed to work that has more or less immediate applications to its flagship products. And as Marcus says, academic labs suffer from limits to capacity that keep them from working on the big picture.

A CERN for AI wouldn't be a bad idea, but it would be different from the real CERN in some key aspects. Most notably, unlike discovering the Higgs boson, AI has immense potential social, economic and political ramifications. Thus, keeping the research at a CERN-like facility open and free for all would be a steep challenge, with governments and individuals constantly vying for a piece of the pie. In addition, there would be important IP issues if corporations were funding this endeavor. And even CERN had to contend with paranoid fears of mini black holes, so one can only imagine how much the more realistic (albeit more modest) fears of AI would be blown out of proportion.

As interesting as a CERN-like AI facility is, I think another metaphor for a global AI project would be the Manhattan Project. Now let me be the first to say that I consider most comparisons of Big Science projects to the Manhattan Project to be glib and ill-considered; comparing almost any peacetime project with necessarily limited resources to a wartime project that benefited from a virtually unlimited supply of resources brought to bear on it with great urgency will be a fraught exercise. And yet I think the Manhattan Project supplies at least one particular ingredient for successful AI research that Marcus does not really talk about. It's the essential interdisciplinary nature of tackling big problems like nuclear weapons or artificial intelligence.

What seems to be missing from a lot of the AI research taking place today is scientists from all disciplines working closely together in an open, free-for-all environment. That is not to say that individual scientists have not collaborated in the field, and it's also not to say that fields like neuroscience and biology have not given computer scientists a lot to think about. But a practical arrangement in which generally smart people from a variety of fields work intensely on a few well-defined AI problems still seems to be missing.

The main reason why this kind of interdisciplinary work may be key to cracking AI is very simple: in a very general sense, there are no experts in the field. It's too new for anyone to really claim expertise. The situation was very similar on the Manhattan Project. While physicists are most associated with the atomic bomb, without specialists in chemistry, metallurgy, ordnance, engineering and electronics the bomb would have been impossible to create. More importantly, none of these people were experts in nuclear weapons, and they had to make key innovations on the fly. Take the key idea of implosion, perhaps the most important and most novel scientific contribution to emerge from the project: Seth Neddermeyer, who had worked on cosmic rays before the war, came up with the initial idea of implosion that made the Nagasaki bomb possible. But Neddermeyer's idea would not have taken practical shape had it not been for the under-appreciated British physicist James Tuck, who came up with the ingenious design of surrounding the plutonium core with explosives of different detonation velocities that would focus the shockwave inward toward the core, similar to how a lens focuses light. And Tuck's design would not have seen the light of day had they not brought in an expert in the chemistry of explosives, George Kistiakowsky.

These people were experts in their own well-defined fields of science, but none of them were experts in nuclear weapons design, and they were making it up as they went along. But they were generally smart and capable people, capable of thinking widely outside their immediate sphere of expertise and of producing at least parts of ideas which they could then hand over, in a sort of relay, to others holding different parts.

Similarly, nobody in the field of AI is an expert, and just as with nuclear weapons then, the field is still new enough and wide enough for all kinds of generally smart people to make contributions to it. So along with a global effort, we should perhaps have a kind of Manhattan Project of AI that brings together computer scientists, neuroscientists, physicists, chemists, mathematicians and biologists at the minimum to dwell on the field's outstanding problems. These people don't need to be experts or know much about AI at all, and they don't even need to know how to implement every idea they have, but they do need to be idea generators, they need to be able to bounce ideas off of each other, and they need to be able to pursue odd leads and loose ends and try to see the big picture. The Manhattan Project worked not because of experts pursuing deep ideas but because of a tight deadline and a concentrated effort by smart scientists who were encouraged to think outside the box as much as possible. Except for the constraints of wartime urgency, it should not be hard to replicate that effort, at least in its essentials.

Want to know if you are depressed? Don't ask Siri just yet.

"Tell me more about your baseline calibration, Siri"
There's no dearth of articles claiming that the "wearables revolution" is around the corner and that we aren't far from the day when every aspect of our health is recorded every second, analyzed and sent to the doctor for rapid diagnosis and treatment. That's why it was especially interesting for me to read this new analysis from computer scientists at Berkeley and Penn that should temper the soaring enthusiasm that pervades pretty much all things "AI" these days.

The authors are asking a very simple question in the context of machine learning (ML) algorithms that claim to predict your mood - and by proxy mental health issues like depression - based on GPS and other data. What's this simple question? It's one about baselines. When any computer algorithm makes a prediction, one of the key questions is how much better this prediction is compared to some baseline. Another name for baselines is "null models". Yet another is "controls", although controls themselves can be artificially inflated. 

In this case the baseline can be of two kinds: personal baselines (self-reported individual moods) or population baselines (the mood of a population). What the study finds is not too pretty. The authors analyze a variety of literature on mood-reporting ML algorithms and find that in about 77% of cases the studies use meaningless baselines that overestimate the performance of the ML models with respect to predicting mood swings. The reason is that the baselines used in most studies are population baselines rather than the more relevant personal baselines. The population baseline assumes a constant average state for all individuals, while the individual baseline assumes an average state for each individual, with those states differing between individuals.

Clearly, doing better than the population baseline is not very useful for tracking individual mood changes, and this is especially true since the authors find greater errors for population baselines compared to individual ones; these larger errors can simply obscure model performance. The paper also considers two datasets and tries to figure out how to improve the evaluation of models on them using a metric the authors call "user lift", which measures how much better the model is compared to the baseline.
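
To make the two baselines and the idea of "user lift" concrete, here is a minimal sketch in Python using entirely synthetic mood data (none of it from the paper). The exact definition of "user lift" in the paper may differ; here it is simply taken as the model's error improvement over the personal baseline.

import numpy as np

rng = np.random.default_rng(0)
n_users, n_days = 50, 60

# Assumed toy data: each user has a stable personal mean mood plus day-to-day noise.
user_means = rng.normal(5.0, 1.5, size=n_users)
moods = user_means[:, None] + rng.normal(0.0, 0.7, size=(n_users, n_days))

# Population baseline: predict the same grand mean for everyone.
pop_pred = np.full_like(moods, moods.mean())

# Personal baseline: predict each user's own average mood.
personal_pred = np.broadcast_to(moods.mean(axis=1, keepdims=True), moods.shape)

# Stand-in for a hypothetical mood-prediction model; for illustration it is
# made only slightly better than the personal baseline.
model_pred = personal_pred + 0.1 * (moods - personal_pred)

def mae(pred):
    return np.abs(pred - moods).mean()

print(f"Population baseline MAE: {mae(pop_pred):.3f}")
print(f"Personal baseline MAE:   {mae(personal_pred):.3f}")
print(f"Model MAE:               {mae(model_pred):.3f}")
print(f"'User lift' over the personal baseline: {mae(personal_pred) - mae(model_pred):.3f}")

Run on data like this, the model comfortably beats the population baseline, but its advantage over the personal baseline, the number that actually matters for an individual, is much smaller.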

I will let the abstract speak for itself:

"A new trend in medicine is the use of algorithms to analyze big datasets, e.g. using everything your phone measures about you for diagnostics or monitoring. However, these algorithms are commonly compared against weak baselines, which may contribute to excessive optimism. To assess how well an algorithm works, scientists typically ask how well its output correlates with medically assigned scores. Here we perform a meta-analysis to quantify how the literature evaluates their algorithms for monitoring mental wellbeing. We find that the bulk of the literature (∼77%) uses meaningless comparisons that ignore patient baseline state. For example, having an algorithm that uses phone data to diagnose mood disorders would be useful. However, it is possible to over 80% of the variance of some mood measures in the population by simply guessing that each patient has their own average mood - the patient-specific baseline. Thus, an algorithm that just predicts that our mood is like it usually is can explain the majority of variance, but is, obviously, entirely useless. Comparing to the wrong (population) baseline has a massive effect on the perceived quality of algorithms and produces baseless optimism in the field. To solve this problem we propose “user lift” that reduces these systematic errors in the evaluation of personalized medical monitoring."

That statement about being able to explain over 80% of the variance in some mood measures simply by guessing an average mood for every individual should stand out. It means that simple informed guesswork based on an average "feeling" is as good as the model and yet eminently useless, since it predicts no variability and is therefore of little practical utility.
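
To see why the arithmetic can work out this way, here is a tiny illustration in Python with assumed numbers (not taken from the paper): when differences between people's long-run average moods dwarf the day-to-day swings within a person, the patient-specific average explains most of the total variance by construction.

# Assumed, illustrative variance components (not from the paper).
between_person_var = 4.0   # spread of long-run average moods across people
within_person_var = 0.8    # day-to-day variation around each person's average

# The personal baseline leaves only the within-person variance unexplained.
total_var = between_person_var + within_person_var
r2_personal_baseline = between_person_var / total_var
print(f"Variance explained by each patient's own average: {r2_personal_baseline:.0%}")
# Roughly 83% here, yet this "predictor" never signals that anyone's mood has changed.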

I find this paper important because it should put a dent in the often inflated enthusiasm about wearables these days. It also illustrates the dangers of what is called "technological solutionism": simply because you can strap a watch or device on your body to measure various parameters, and simply because you have enough computing power to analyze the resulting stream of data, does not mean the results will be significant. You record because you can, you analyze because you can, you conclude because you can. What the authors find about tracking moods can apply to tracking other kinds of important variables like blood pressure and sleep duration. Every time the questions must be: am I using the right baseline for comparison? And am I doing better than the baseline? Hopefully the authors can use larger and more diverse datasets and find out similar facts about other such metrics.

I also found this study interesting because it reminds me of a whole lot of valid criticism in the field of molecular modeling that we have seen over the last few years. One of the most important questions there is about null models. Whenever your latest and greatest FEP/MD/informatics/docking study is claimed to have done exceptionally well on a dataset, the first questions should be: is it better than the null model? And have you defined the null model correctly to begin with? Is your model doing better than a simpler method? And if it's not, why use it, and why assign a causal connection between your technique and the relevant result?
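
As a sketch of the kind of check this implies, here is a generic Python example, not tied to any particular program, dataset or published study, that compares a hypothetical "fancy" scoring method against a trivially simple null model (say, ranking compounds by a single property like molecular weight) using a rank-based ROC AUC. All scores are synthetic.

import numpy as np

rng = np.random.default_rng(2)
n_actives, n_decoys = 50, 500
labels = np.r_[np.ones(n_actives), np.zeros(n_decoys)]

# Hypothetical scores: a "fancy" method and a simple one-property null model.
fancy_score = np.r_[rng.normal(1.0, 1.0, n_actives), rng.normal(0.0, 1.0, n_decoys)]
null_score = np.r_[rng.normal(0.6, 1.0, n_actives), rng.normal(0.0, 1.0, n_decoys)]

def roc_auc(scores, labels):
    """Rank-based AUC: the probability that a random active outscores a random decoy."""
    order = scores.argsort()
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(f"AUC, fancy method: {roc_auc(fancy_score, labels):.2f}")
print(f"AUC, null model:   {roc_auc(null_score, labels):.2f}")
# The question is not whether the first number looks good, but whether the gap
# between the two is large enough to justify the extra cost and the causal claim.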

In science there are seldom absolutes. Studies like this show us that every new method needs to be compared with what came before it. When older, simpler approaches have already paved the way, new ones are compelled to do better; otherwise they merely create the illusion of doing well.