To Improve Algorithms, Embed Human Principles Into Code

‘The Ethical Algorithm,’ a new book from computer scientists Michael Kearns and Aaron Roth, describes the social challenges of automation and offers a new approach for creating socially aware algorithms.

[Image: Michael Kearns and Aaron Roth in front of a chalkboard. Their book describes how algorithms can inadvertently share private information or perpetuate racial and gender biases, and offers a set of principled solutions that can help researchers design the next generation of socially aware algorithms.]

This is an excerpt adapted from “The Ethical Algorithm: The Science of Socially Aware Algorithm Design” by Michael Kearns and Aaron Roth of Penn’s School of Engineering and Applied Science, published by Oxford University Press.

In December 2018, the New York Times obtained a commercial dataset containing location information collected from phone apps whose nominal purpose is to provide mundane things like weather reports and restaurant recommendations. Such datasets contain precise locations for hundreds of millions of individuals, each updated hundreds of times a day. Commercial buyers of such data will generally be interested in aggregate information, but the data is recorded by individual phones. It is superficially anonymous, without names attached, but there is only so much anonymity you can promise when recording a person’s every move.

From this data, the New York Times was able to identify a 46-year-old math teacher named Lisa Magrin. She was the only person who made the daily commute from her home in upstate New York to the middle school where she works, 14 miles away. And once someone’s identity is uncovered in this way, it’s possible to learn a lot more about them. The Times followed Lisa’s data trail to Weight Watchers, to a dermatologist’s office, and to her ex-boyfriend’s home. Just a couple of decades ago, this level of intrusive surveillance would have required a private investigator or a government agency. Now, it is simply the by-product of widely available commercial datasets.
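
How can a trail with no name attached give someone away? Because a home address and a workplace, taken together, are close to a unique fingerprint. The Python sketch below, using entirely invented device IDs and coordinates, shows the basic join: intersect the devices seen overnight at a known home with the devices seen during working hours at a known workplace.

```python
# Invented location pings: (device_id, latitude, longitude, hour_of_day).
pings = [
    ("dev-481", 42.47, -73.76, 2),   # overnight pings at one address: home
    ("dev-481", 42.47, -73.76, 23),
    ("dev-481", 42.61, -73.87, 10),  # weekday pings at another: workplace
    ("dev-481", 42.61, -73.87, 14),
    ("dev-112", 42.47, -73.76, 3),   # a neighbor: same home area, other job
    ("dev-112", 42.70, -73.80, 11),
]

HOME = (42.47, -73.76)   # publicly knowable: where the target lives
WORK = (42.61, -73.87)   # publicly knowable: where the target works

at_home, at_work = set(), set()
for device, lat, lon, hour in pings:
    if (lat, lon) == HOME and (hour >= 22 or hour <= 5):
        at_home.add(device)          # seen at the home address overnight
    if (lat, lon) == WORK and 8 <= hour <= 17:
        at_work.add(device)          # seen at the workplace during the day

# The "anonymous" identifier satisfying both constraints is unique.
print(at_home & at_work)             # {'dev-481'}
```

With hundreds of location updates per device per day, that intersection almost always contains exactly one device, and a name can then be attached to it from public knowledge of where a person lives and works.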

It’s not only privacy that has become a concern as data gathering and analysis proliferate: Algorithms aren’t simply analyzing the data that we generate with our every move; they are also being used to actively make decisions that affect our lives. When you apply for a credit card, your application may never be examined by a human being. Instead, an algorithm pulling in data about you from many different sources might automatically approve or deny your request.
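
To see how little human judgment such a pipeline needs to involve, here is a deliberately toy sketch: attributes about an applicant are pulled from several independent data sources, combined into a single score, and compared against a threshold. Every field name, weight, and threshold below is invented for illustration.

```python
# A toy, fully automated decision pipeline. All data sources, field names,
# weights, and the threshold are invented for illustration.

def credit_decision(applicant_id, bureau, employment, transactions):
    # Pull attributes about one applicant from independent datasets.
    late = bureau[applicant_id]["late_payments"]
    income = employment[applicant_id]["annual_income"]
    overdrafts = transactions[applicant_id]["overdrafts_12mo"]

    # A toy linear score standing in for a proprietary model.
    score = 0.00001 * income - 1.5 * late - 0.8 * overdrafts
    return ("approve" if score >= 0 else "deny"), round(score, 2)

bureau = {"a-17": {"late_payments": 2}}
employment = {"a-17": {"annual_income": 52_000}}
transactions = {"a-17": {"overdrafts_12mo": 1}}

# No human ever reads the application; the threshold decides.
print(credit_decision("a-17", bureau, employment, transactions))
# ('deny', -3.28)
```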

In many states, algorithms based on what is called machine learning are also used to inform bail, parole, and criminal sentencing decisions. All this raises questions not only of privacy but also of fairness, along with a variety of other basic social values, including safety, transparency, accountability, and even morality.

If we are going to continue to generate and use huge datasets to automate important decisions, we have to think seriously about some weighty topics. These include limits on the use of data and algorithms and the corresponding laws, regulations, and organizations that would determine and enforce those limits. But we must also think seriously about addressing the concerns scientifically — about what it might mean to encode ethical principles directly into the design of the algorithms that are increasingly woven into our daily lives.
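
To give one concrete sense of what encoding an ethical principle into an algorithm can look like, consider differential privacy, a central topic of the book: instead of relying on promises about how data will be used, the algorithm itself guarantees that its output reveals almost nothing about any single person’s record. Below is a minimal sketch of the standard Laplace mechanism for counting queries; the dataset and query are invented.

```python
import random

def private_count(records, predicate, epsilon=0.5):
    """Release a count under epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon suffices.
    Smaller epsilon means stronger privacy and a noisier answer.
    """
    true_count = sum(1 for r in records if predicate(r))
    # A Laplace(scale=1/epsilon) sample, via the difference of two
    # independent exponential samples.
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# Invented dataset: did each person visit a clinic?
records = [{"visited_clinic": v} for v in (True, False, True, True, False)]
print(private_count(records, lambda r: r["visited_clinic"]))
# Prints something near the true count of 3: useful in aggregate, while
# any single record's presence is statistically masked.
```

The privacy cost of an analysis thus becomes an explicit, tunable design parameter rather than an afterthought: smaller values of epsilon buy stronger privacy at the price of noisier answers.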

You might be excused for some skepticism about imparting moral character to an algorithm. An algorithm, after all, is just a human artifact or tool, like a hammer, and who would entertain the idea of an ethical hammer? Of course, a hammer might be put to an unethical use — as an instrument of violence, for example — but this can’t be said to be the hammer’s fault. Anything ethical about the use or misuse of a hammer can be attributed to the human being who wields it.

But algorithms — especially those deploying machine learning — are different. They are different both because we allow them a significant amount of agency to make decisions without human intervention and because they are often so complex and opaque that even their designers cannot anticipate how they will behave in many situations.

Unlike with a hammer, it is usually not so easy to blame a particular misdeed of an algorithm directly on the person who designed or deployed it. There are many instances in which algorithms leak sensitive personal information or discriminate against one demographic or another. But how exactly do these things happen? Are violations of privacy and fairness the result of incompetent software developers or, worse yet, the work of evil programmers deliberately coding racism and back doors into their programs?

The answer is a resounding no, but the real reasons for algorithmic misbehavior are perhaps even more disturbing than human incompetence or malfeasance, which we are at least more familiar with and have some mechanisms for addressing. Society’s most influential algorithms, from Google search and Facebook’s News Feed to credit scoring and health risk assessment algorithms, are generally developed by highly trained engineers who are carefully applying well-understood design principles. The problems actually lie within those very principles, most specifically those of machine learning.
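
A small synthetic example illustrates how faithfully applying one of those well-understood principles, minimizing overall error, can misbehave with no incompetence or malice anywhere in the pipeline. In the invented dataset below, a majority group outnumbers a minority group nine to one, and the feature pattern that predicts well for the majority is exactly backwards for the minority.

```python
# Synthetic rows: (feature, true_label, group). The majority group's labels
# follow the feature; the minority group's labels follow the opposite rule.
data = [(x, 1 if x > 5 else 0, "majority") for x in range(10)] * 9
data += [(x, 0 if x > 5 else 1, "minority") for x in range(10)]

def error_rate(threshold, rows):
    """Fraction of rows misclassified by the rule: predict 1 if x > threshold."""
    wrong = sum(1 for x, label, _ in rows if (1 if x > threshold else 0) != label)
    return wrong / len(rows)

# Standard training: pick the threshold that minimizes *overall* error.
best = min(range(10), key=lambda t: error_rate(t, data))

for group in ("majority", "minority"):
    rows = [row for row in data if row[2] == group]
    print(group, error_rate(best, rows))
# Prints: majority 0.0, minority 1.0. The model is 90% accurate overall,
# yet wrong on every single member of the minority group.
```

Nothing in the standard objective even notices the disparity; addressing it requires stating the fairness constraint explicitly and building it into the training process, which is the kind of remedy the book explores.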

Continue reading this excerpt of ‘The Ethical Algorithm’ at Penn Today.
