Special Issue of Synthese 2021
Decision Theory and the Future of AI
Editors:
Yang Liu (University of Cambridge)
Stephan Hartmann (LMU Munich)
Huw Price (University of Cambridge)
DOI: https://doi.org/10.1007/s11229-021-03316-z
Introduction
In the long run, the development of artificial intelligence (AI) is likely to be one of the biggest technological revolutions in human history. Even in the short run it will present tremendous challenges as well as tremendous opportunities. The more we do now to think through these complex challenges and opportunities, the better the prospects for the kind of outcomes we all hope for, for ourselves, our children, and our planet.
Thinking through these challenges needs a new kind of interdisciplinary research community. Many sources of expertise and insight are likely to be relevant, and this community needs to be very well-connected in several dimensions – ‘horizontally’ between academic disciplines, ‘vertically’ to the policy and technology worlds, and of course geographically. AI is a global technology, and many of the challenges and opportunities of AI will be global in nature. Accordingly, getting AI right is not just an engineering challenge, but also a challenge for many other societal and academic sectors, including the humanities. Put another way, there is an engineering challenge of a ‘sociological’ kind, about how best to foster the necessary research community.
The field of decision theory is ideally placed to make contributions here, at several levels. AI innovations, including techniques from machine learning, are increasingly used to make decisions with significant social and ethical consequences, ranging from determining the news feeds on social media to making sentencing and parole recommendations in the criminal justice system. Decision theory provides and studies the standards by which such decisions are evaluated and improved. What is a rational decision? How can we train machines to make rational decisions? What is the relationship between human decision-making and machine decision-making? How can one make machine decision-making transparent (i.e. understandable to a human agent)? What role does cognitive science play in these developments?
Perhaps even more importantly, the field of decision theory itself is highly interdisciplinary, with a strong presence in disciplines such as philosophy, mathematical logic, economics, psychology, and cognitive science, amongst others. In addition, of course, it has foundational links to computer science and machine learning. So it is also well placed to contribute to the sociological challenge. It offers very fertile ground in which to foster the kind of rich interdisciplinary community needed for the challenges of AI, both short term and long term.
This special issue stems from a conference series established with these goals in mind. Decision Theory and the Future of AI began in 2017 as a collaboration between the Leverhulme Centre for the Future of Intelligence (CFI) and the Centre for the Study of Existential Risk (CSER) at Cambridge, and the Munich Center for Mathematical Philosophy (MCMP) at LMU Munich. The first two conferences were held at Trinity College, Cambridge, in 2017 and at LMU Munich in 2018. The first meeting outside Europe was held at ANU, Canberra, in 2019, in conjunction with ANU’s Humanising Machine Intelligence project. A fourth conference was planned at PKU, Beijing, in 2020, before Covid intervened. We will be back!
Several of the papers in this special issue were presented at one of these conferences, while others were submitted in response to an open call for papers. The range of topics, and even more so the range of authors and their home disciplines and affiliations, is a tribute to the richness of the territory, both in intellectual and in community-building terms.
Contributions
Approval-directed agency and the decision theory of Newcomb-like problems
Caspar Oesterheld
Decision theorists disagree about how instrumentally rational agents, i.e., agents trying to achieve some goal, should behave in so-called Newcomb-like problems, with the main contenders being causal and evidential decision theory. Since the main goal of artificial intelligence research is to create machines that make instrumentally rational decisions, the disagreement pertains to this field. In addition to the more philosophical question of what the right decision theory is, the goal of AI poses the question of how to implement any given decision theory in an AI. For example, how would one go about building an AI whose behavior matches evidential decision theory’s recommendations? Conversely, we can ask which decision theories (if any) describe the behavior of any existing AI design. In this paper, we study what decision theory an approval-directed agent, i.e., an agent whose goal it is to maximize the score it receives from an overseer, implements. If we assume that the overseer rewards the agent based on the expected value of some von Neumann–Morgenstern utility function, then such an approval-directed agent is guided by two decision theories: the one used by the agent to decide which action to choose in order to maximize the reward and the one used by the overseer to compute the expected utility of a chosen action. We show which of these two decision theories describes the agent’s behavior in which situations.
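To see how the overseer's decision theory can fix the behaviour of an approval-directed agent, consider the following minimal Python sketch. It is our own toy construction, not the paper's model: a standard Newcomb payoff table, an overseer that scores actions either evidentially or causally, and an agent that simply picks the action with the highest overseer score. The accuracy and prior figures, and the names other, payoff, edt_score, cdt_score and approval_directed_choice, are invented for illustration.

# Toy Newcomb problem: a predictor fills an opaque box with 1,000,000
# iff it predicted "one-box"; a transparent box always holds 1,000.
ACCURACY = 0.99          # P(prediction matches the actual choice)
PRIOR_ONE_BOX = 0.5      # causal overseer's prior that "one-box" was predicted

def other(action):
    return "two-box" if action == "one-box" else "one-box"

def payoff(action, predicted):
    opaque = 1_000_000 if predicted == "one-box" else 0
    return opaque if action == "one-box" else opaque + 1_000

def edt_score(action):
    # Evidential overseer: condition the prediction on the chosen action.
    return (ACCURACY * payoff(action, action)
            + (1 - ACCURACY) * payoff(action, other(action)))

def cdt_score(action):
    # Causal overseer: the prediction is causally independent of the action,
    # so use the fixed prior over predictions.
    return (PRIOR_ONE_BOX * payoff(action, "one-box")
            + (1 - PRIOR_ONE_BOX) * payoff(action, "two-box"))

def approval_directed_choice(score):
    # The approval-directed agent simply maximises the overseer's score.
    return max(["one-box", "two-box"], key=score)

print(approval_directed_choice(edt_score))   # one-box
print(approval_directed_choice(cdt_score))   # two-box

In this simplified setting the agent's choice is fixed entirely by the overseer's decision theory; the paper analyses the general case and shows which of the two decision theories describes the agent's behaviour in which situations.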
Desirability foundations of robust rational decision making
Marco Zaffalon & Enrique Miranda
Recent work has formally linked the traditional axiomatisation of incomplete preferences à la Anscombe-Aumann with the theory of desirability developed in the context of imprecise probability, by showing in particular that they are the very same theory. The equivalence has been established under the constraint that the set of possible prizes is finite. In this paper, we relax such a constraint, thus de facto creating one of the most general theories of rationality and decision making available today. We provide the theory with a sound interpretation and with basic notions, and results, for the separation of beliefs and values, and for the case of complete preferences. Moreover, we discuss the role of conglomerability for the presented theory, arguing that it should be a rationality requirement under very broad conditions.
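For readers new to desirability, a small worked illustration of its most basic rationality condition may help. The following Python sketch is our own, restricted to a finite possibility space and not drawn from the paper: it checks whether a finite set of desirable gambles avoids sure loss, i.e. whether no non-negative combination of the gambles is uniformly negative. The example gambles and the linear-programming encoding are assumptions made purely for illustration.

import numpy as np
from scipy.optimize import linprog

def avoids_sure_loss(gambles):
    """gambles: array of shape (n, k) -- n gambles defined on k states.
    The set incurs sure loss iff some convex combination of the gambles
    is strictly negative in every state."""
    n, k = gambles.shape
    # Variables: weights lam (n of them) and a slack t.
    # Minimise t subject to sum_i lam_i * f_i(w) <= t for every state w,
    # with lam >= 0 and sum(lam) = 1.  Sure loss iff the optimum t < 0.
    c = np.zeros(n + 1); c[-1] = 1.0
    A_ub = np.hstack([gambles.T, -np.ones((k, 1))])   # f(w).lam - t <= 0
    b_ub = np.zeros(k)
    A_eq = np.concatenate([np.ones(n), [0.0]]).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0, None)] * n + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.fun >= 0

# Two states; accepting both gambles in `bad` guarantees a loss.
bad = np.array([[ 1.0, -2.0],
                [-2.0,  1.0]])
ok  = np.array([[ 1.0, -0.5],
                [-0.5,  1.0]])
print(avoids_sure_loss(bad))  # False
print(avoids_sure_loss(ok))   # True

The paper itself works at a much higher level of generality (infinite sets of prizes, conglomerability), but this simple condition conveys the flavour of desirability-based rationality: acceptable commitments must not combine into a sure loss.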
Subjective causal networks and indeterminate suppositional credences
Jiji Zhang, Teddy Seidenfeld & Hailin Liu
This paper has two main parts. In the first part, we motivate a kind of indeterminate, suppositional credence by discussing the prospects for a subjective interpretation of a causal Bayesian network (CBN), an important tool for causal reasoning in artificial intelligence. A CBN consists of a causal graph and a collection of interventional probabilities. The subjective interpretation in question would take the causal graph in a CBN to represent the causal structure that is believed by an agent, and the interventional probabilities in a CBN to represent suppositional credences. We review a difficulty noted in the literature with such an interpretation, and suggest that a natural way to address the challenge is to adopt a generalization of CBNs that allows indeterminate credences. In the second part, we develop a decision-theoretic foundation for such indeterminate suppositional credences, by generalizing a theory of coherent choice functions to accommodate some form of act-state dependence. The upshot is a decision-theoretic framework that is not only rich enough to, so to speak, ground the probabilities in a subjectively interpreted causal network, but also interesting in its own right, in that it accommodates both act-state dependence and imprecise probabilities.
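Since the argument presupposes some familiarity with causal Bayesian networks, a minimal illustration may be useful. The Python sketch below is our own construction, not the authors': it encodes a three-variable network with a common cause and contrasts an observational conditional probability with the corresponding interventional probability obtained by the usual truncated-factorisation rule for do-interventions. All variable names and numbers are invented.

# A tiny causal Bayesian network:  Z -> X,  Z -> Y,  X -> Y  (all binary).
P_Z1 = 0.3                                  # P(Z = 1)
P_X1_given_Z = {1: 0.9, 0: 0.2}             # P(X = 1 | Z = z)
P_Y1_given_XZ = {(1, 1): 0.9, (1, 0): 0.6,  # P(Y = 1 | X = x, Z = z)
                 (0, 1): 0.5, (0, 0): 0.1}

def p_z(z):
    return P_Z1 if z == 1 else 1 - P_Z1

def p_x_given_z(x, z):
    return P_X1_given_Z[z] if x == 1 else 1 - P_X1_given_Z[z]

def p_joint(z, x, y):
    py1 = P_Y1_given_XZ[(x, z)]
    return p_z(z) * p_x_given_z(x, z) * (py1 if y == 1 else 1 - py1)

# Observational: P(Y = 1 | X = 1), which is confounded by Z.
obs = (sum(p_joint(z, 1, 1) for z in (0, 1))
       / sum(p_joint(z, 1, y) for z in (0, 1) for y in (0, 1)))

# Interventional: P(Y = 1 | do(X = 1)), by the truncated factorisation:
# drop the factor P(X | Z) and fix X = 1.
do = sum(p_z(z) * P_Y1_given_XZ[(1, z)] for z in (0, 1))

print(obs, do)   # the two generally differ when Z confounds X and Y

On the subjective reading discussed in the paper, the interventional value plays the role of a suppositional credence, and the proposed generalization would allow such credences to be indeterminate (set-valued) rather than sharp.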
Intuition, intelligence, data compression
Jens Kipper
The main goal of my paper is to argue that data compression is a necessary condition for intelligence. One key motivation for this proposal stems from a paradox about intuition and intelligence. For the purposes of this paper, it will be useful to consider playing board games—such as chess and Go—as a paradigm of problem solving and cognition, and computer programs as a model of human cognition. I first describe the basic components of computer programs that play board games, namely value functions and search functions. I then argue that value functions both play the same role as intuition in humans and work in essentially the same way. However, as will become apparent, using an ordinary value function is just a simpler and less accurate form of relying on a database or lookup table. This raises our paradox, since reliance on intuition is usually considered to manifest intelligence, whereas usage of a lookup table is not. I therefore introduce another condition for intelligence that is related to data compression. This proposal allows that even reliance on a perfectly accurate lookup table can be nonintelligent, while retaining the claim that reliance on intuition can be highly intelligent. My account is not just theoretically plausible, but it also captures a crucial empirical constraint. This is because all systems with limited resources that solve complex problems—and hence, all cognitive systems—need to compress data.
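The contrast between a lookup table and a compressed value function can be made vivid with a toy game; the following Python sketch is our own illustration, not Kipper's. In the subtraction game, players alternately remove 1 to 3 counters from a pile and the player who takes the last counter wins; an exhaustive search can tabulate the value of every position, yet the entire table compresses to the one-line rule "a position is winning iff the pile size is not divisible by 4". The function names table_value and compressed_value are hypothetical.

from functools import lru_cache

MOVES = (1, 2, 3)   # a player may remove 1, 2 or 3 counters

@lru_cache(maxsize=None)
def table_value(n):
    """Exhaustive game-tree search: True iff the player to move wins
    from a pile of n counters.  Effectively builds a lookup table."""
    if n == 0:
        return False   # no move available: the previous player has won
    return any(not table_value(n - m) for m in MOVES if m <= n)

def compressed_value(n):
    """The same information compressed to a constant-size rule."""
    return n % 4 != 0

# The short rule reproduces the whole table.
assert all(table_value(n) == compressed_value(n) for n in range(200))
print("table and compressed rule agree on all positions checked")

On the proposal sketched in the abstract, it is roughly this sort of compression, replacing an explicit table by a much shorter rule that generates the same evaluations, that separates intelligent reliance on a value function (or on intuition) from mere reliance on a lookup table.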
A classification of Newcomb problems and decision theories
Kenny Easwaran
Newcomb-like problems are classified by the payoff table of their act-state pairs, and the causal structure that gives rise to the act-state correlation. Decision theories are classified by the one or more points of intervention whose causal role is taken to be relevant to rationality in various problems. Some decision theories suggest an inherent conflict between different notions of rationality that are all relevant. Some issues with causal modeling raise problems for decision theories in the contexts where Newcomb problems arise.
Causal concepts and temporal ordering
Reuben Stern
Though common sense says that causes must temporally precede their effects, the hugely influential interventionist account of causation makes no reference to temporal precedence. Does common sense lead us astray? In this paper, I evaluate the power of the commonsense assumption from within the interventionist approach to causal modeling. I first argue that if causes temporally precede their effects, then one need not consider the outcomes of interventions in order to infer causal relevance, and that one can instead use temporal and probabilistic information to infer exactly when X is causally relevant to Y in each of the senses captured by Woodward’s interventionist treatment. Then, I consider the upshot of these findings for causal decision theory, and argue that the commonsense assumption is especially powerful when an agent seeks to determine whether so-called “dominance reasoning” is applicable.
Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
Tom Everitt, Marcus Hutter, Ramana Kumar & Victoria Krakovna
Can humans get arbitrarily capable reinforcement learning (RL) agents to do their bidding? Or will sufficiently capable RL agents always find ways to bypass their intended objectives by shortcutting their reward signal? This question impacts how far RL can be scaled, and whether alternative paradigms must be developed in order to build safe artificial general intelligence. In this paper, we study when an RL agent has an instrumental goal to tamper with its reward process, and describe design principles that prevent instrumental goals for two different types of reward tampering (tampering with the reward function itself, and tampering with the input to the reward function, i.e. RF-input tampering). Combined, the design principles can prevent reward tampering from being an instrumental goal. The analysis relies on causal influence diagrams to provide intuitive yet precise formalisations.
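For readers unfamiliar with causal influence diagrams, the following Python sketch may convey the idea. It is our own simplification, not the authors' formal criterion: a diagram is represented as a directed graph, and we check whether the agent's decision can causally influence the node representing its own reward function, the kind of structural feature the paper's design principles aim to remove. Node names are invented for illustration.

from collections import deque

# A toy causal influence diagram for a reward-tampering setting.
# Edges point from causes to effects; "observed_reward" is the utility node.
edges = {
    "action":          ["state", "reward_function"],   # agent can alter its RF
    "state":           ["observed_reward"],
    "reward_function": ["observed_reward"],
    "observed_reward": [],
}

def reachable(graph, start):
    """All nodes causally downstream of `start` (breadth-first search)."""
    seen, queue = {start}, deque([start])
    while queue:
        for succ in graph.get(queue.popleft(), []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

# If the decision node can reach the reward-function node, then changing the
# reward function is one causal route to higher observed reward -- the
# structural situation in which tampering can become an instrumental goal.
print("reward_function" in reachable(edges, "action"))   # True: tampering path exists

# Deleting that edge (a crude stand-in for the paper's design principles)
# removes the path, so the decision no longer influences the reward function.
edges["action"].remove("reward_function")
print("reward_function" in reachable(edges, "action"))   # False

The paper's actual incentive criteria are graphical but considerably more refined than bare reachability; the sketch is only meant to convey the kind of structural analysis a causal influence diagram supports.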
Acknowledgements
We would like to thank the Cambridge-LMU Strategic Partnership for support for the conference series, and our institutions (CFI, CSER, and the Faculty of Philosophy at Cambridge, and MCMP at LMU Munich) for administrative support. We also thank the authors and referees of the papers for all their work. Yang Liu and Huw Price gratefully acknowledge the support of the Leverhulme Trust, and of a grant from Templeton World Charity Foundation (TWCF0128); the opinions expressed in this publication are those of the authors and do not necessarily reflect the views of TWCF.