Workshop on Decision Theory & the Future of Artificial Intelligence
27–28 July 2018
This workshop will continue in the tradition established last year of bringing together philosophers, decision theorists, and AI researchers in order to promote research at the nexus between decision theory and AI. Our plan for the second installment is to make connections between decision theory and burgeoning research programs that may play a prominent role in the near future of the discipline – e.g., quantum information theory, social network analysis, and causal inference.
Hans Briegel (University of Innsbruck)
Tina Eliassi-Rad (Northeastern University)
Dominik Janzing (Amazon Development Center)
Christian List (London School of Economics)
Aidan Lyon (University of Maryland)
Teresa Scantamburlo (University of Bristol)
Wolfgang Spohn (University of Konstanz)
27 July 2018
9:25 Stephan Hartmann: Opening
9:30 Hans Briegel: Transparency of Classical and Quantum Mechanical Artificial Intelligence
10:45 Tom Everitt: Goal Alignment in Reinforcement Learning
11:45 Teresa Scantamburlo: The Progress of Machine Decisions
14:30 Dominik Janzing: Principle of Independence of Mechanisms in Machine Learning and Physics
15:15 Benjamin Eva: The Similarity of Causal Structure
16:30 Christian List: Representing Moral Judgments: The Reason-Based Approach
28 July 2018
9:30 Wolfgang Spohn: From Nash to Dependency Equilibria
10:45 Johannes Treutlein: How the Decision Theory of Newcomblike Problems Differs Between Humans and Machines
11:45 Tina Eliassi-Rad: Just Machine Learning
14:30 Aidan Lyon: Deciding AI
15:15 Zoe Cremer: Preserving AI Control via the Use of Theoretical Neuroinformatics in Empirical Ethics
16:00 Huw Price: Closing
Transparency of classical and quantum mechanical artificial intelligence
In the first part of the talk, I will review recent work on the classical model of projective simulation (PS) for learning and agency, including its applications in robotics and quantum experiments. The model can be quantized, leading to a speed-up in the decision-making capacity of the agent. In the second part, I will address the problems of transparency and interpretability of learning systems. I will discuss these problems in the specific context of the PS model but also beyond, including quantum models for artificial learning agents.
Preserving AI control via the use of theoretical neuroinformatics in empirical ethics
It will be crucial to remain in control of the values implemented by autonomous, artificially intelligent agents. This proposal shows how neuroinformatics could be applied to construct a quantifiable decision model that can guide AI systems to remain aligned with empirically derived human ethics. The combination of machine learning and non-invasive brain imaging can abstract a shared cognitive space in ethical decision-making across humans, directly from neural computation. This shared space can be used to build a model of human ethics, and its assumptions can be verified by reverse inference. The experimental paradigm is theoretically capable of processing naturalistic stimuli, reducing biases in theory selection, and improving decision-making by empirically estimating the Coherent Extrapolated Volition, while still retaining AI alignment.
Just Machine Learning
Fairness in machine learning is an important and popular topic these days. “Fair” machine learning approaches are supposed to produce decisions that are probabilistically independent of sensitive features (such as gender and race) or their proxies (such as zip codes). Some examples of probabilistic fairness measures here include precision parity, true positive parity, and false positive parity across pre-defined groups in the population (e.g., whites vs. non-whites). Most literature in this area frames the machine learning problem as estimating a risk score. For example, Jack’s risk of defaulting on a loan is 8, while Jill's is 2. Recent papers -- by Kleinberg, Mullainathan, and Raghavan (arXiv:1609.05807v2, 2016) and Alexandra Chouldechova (arXiv:1703.00056v1, 2017) -- present an impossibility result on simultaneously satisfying three desirable fairness properties when estimating risk scores for groups with differing base rates in the population. I take a broader notion of fairness and ask the following two questions: Is there such a thing as just machine learning? If so, is just machine learning possible in our unjust world? I will describe a different way of framing the problem and will present some preliminary results.
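The parity measures mentioned above can be made concrete in a few lines. The following is a minimal sketch; the function name, data, and group labels are hypothetical, not taken from the papers cited.

```python
def group_rates(y_true, y_pred, groups, group):
    """True-positive and false-positive rates for one pre-defined group."""
    tp = fn = fp = tn = 0
    for t, p, g in zip(y_true, y_pred, groups):
        if g != group:
            continue
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else None
    fpr = fp / (fp + tn) if fp + tn else None
    return tpr, fpr

# Toy labels and predictions for two groups "a" and "b".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

# True positive parity (resp. false positive parity) would require the TPRs
# (resp. FPRs) to match across groups; here group "a" has (TPR, FPR) =
# (0.5, 0.5) while group "b" has (1.0, 0.0), so neither parity holds.
```

A classifier satisfying all such parities simultaneously is exactly what the cited impossibility results rule out when base rates differ across groups.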
The Similarity of Causal Structure
Benjamin Eva (with Stephan Hartmann and Reuben Stern)
Does y obtain under the counterfactual supposition that x? The answer to this question is famously thought to depend on whether y obtains in the most similar world(s) in which x obtains. What this notion of ‘similarity’ consists in is controversial, but in recent years, graphical causal models have proved incredibly useful in getting a handle on considerations of similarity between worlds. One limitation of the resulting conception of similarity is that it says nothing about what would obtain were the causal structure to be different from what it actually is, or from what we believe it to be. In this paper, we explore the possibility of using graphical causal models to resolve counterfactual queries about causal structure by introducing a notion of similarity between causal graphs. Since there are multiple principled senses in which a graph G∗ can be more similar to a graph G than a graph G∗∗, we introduce multiple similarity metrics, as well as multiple ways to prioritize the various metrics when settling counterfactual queries about causal structure.
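One principled sense in which a graph can be "closer" to another is the number of edge edits separating them. The structural Hamming distance, common in causal discovery, is a minimal illustration; this is my example of such a metric, not necessarily one of the metrics the authors introduce.

```python
def structural_hamming_distance(g1, g2):
    """Number of edge additions, deletions, and reversals needed to turn
    directed graph g1 into g2 (edges given as (parent, child) pairs)."""
    g1, g2 = set(g1), set(g2)
    mismatched = g1 ^ g2          # edges present in exactly one of the graphs
    dist, seen = 0, set()
    for (a, b) in mismatched:
        if (a, b) in seen:
            continue
        if (b, a) in mismatched:  # same adjacency, opposite orientation:
            seen.add((b, a))      # one reversal, counted once
        dist += 1
    return dist

# Reversing x -> y costs 1 edit, so this graph pair is at distance 1:
# structural_hamming_distance({("x","y"), ("y","z")}, {("y","x"), ("y","z")})
```

Since a single number like this ignores, e.g., which interventions the two graphs disagree about, it is easy to see why multiple metrics, and a way of prioritizing them, are needed.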
Goal Alignment in Reinforcement Learning
Tom Everitt (with Marcus Hutter)
Constructing artificial intelligence with goals that are aligned with our human values will be crucial if we want to be able to control systems with intelligence exceeding our own. In this extended abstract we summarize the main reasons why a reinforcement learning system may have goals misaligned with ours: Reward hijacking, misspecified reward functions, motivated value selection, and observation corruption. We also briefly describe promising techniques for mitigating these problems. Reward hijacking can be prevented by simulation optimization, which tells the agent to optimize its current reward function in the future; misspecified reward functions can be mitigated by an interactively learned reward function; motivated value selection can be addressed by combining rich reward data with a number of other techniques; observation corruption can be managed by interactively learned reward functions and action-observation grounding.
Principle of Independence of Mechanisms in Machine Learning and Physics
Understanding science by identifying mechanisms that work independently of others is at the heart of scientific methodology.
In causal Bayesian networks, for instance, the joint distribution P(X_1,...,X_n) of n variables is decomposed into the product of causal conditionals, that is, conditional distributions of every variable, given its direct causes. The idea that every causal conditional in this factorization represents an independent mechanism is widespread in the literature, but our research suggests that the implications need to be further explored:
- In causal inference, it suggests entirely novel approaches. In particular, the principle that P(cause) and P(effect | cause) contain no information about each other sometimes makes it possible to distinguish between cause and effect in bivariate statistics. Further, the principle suggests new methods for detecting hidden common causes.
- Roughly speaking, it also suggests that semi-supervised learning only works in the 'anticausal' direction, that is, when the cause is predicted from the effect. In the causal direction, that is, when the effect is predicted from the cause, unlabelled points are pointless because knowing more about P(cause) does not help to better infer the relation between cause and effect.
- When joint distributions change across different data sets, some causal conditionals may have remained constant, which can be helpful for machine learning across different environments.
We have, moreover, argued that a related principle, stating the 'independence of initial state and dynamical law', reproduces the standard thermodynamic arrow of time in physics -- which thus relates asymmetries between cause and effect with asymmetries between past and future.
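The decomposition these points rest on can be written explicitly; here pa_i denotes the values of the direct causes (parents) of X_i in the causal graph:

```latex
P(x_1,\dots,x_n) \;=\; \prod_{i=1}^{n} P\big(x_i \mid \mathrm{pa}_i\big)
```

The independence-of-mechanisms principle then says that the individual factors P(x_i | pa_i) carry no information about one another; the bivariate case is exactly the claim about P(cause) and P(effect | cause) above.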
- Peters, Janzing, Schoelkopf: Elements of Causal Inference, MIT Press 2017.
- Janzing, Schoelkopf: Detecting non-causal artifacts in multivariate linear regression models, ICML 2018.
- Schoelkopf, Janzing, Peters, Sgouritsa, Zhang, Mooij: On causal and anticausal learning, ICML 2012.
- Janzing, Chaves, Schoelkopf: Algorithmic independence of initial condition and dynamic law in thermodynamics and causal inference, NJP 2016.
- Janzing: On the entropy production of time series with unidirectional linearity, JSP 2010.
Representing Moral Judgments: The Reason-Based Approach
An agent’s “practical moral judgments” can be defined as the agent’s judgments as to which options (e.g., actions) can be permissibly chosen in various contexts. How can such moral judgments be cognitively represented? Clearly, a purely extensional approach – simply storing a database in memory which explicitly encodes which options are permissible in each context – is not very efficient or feasible in general, and it would also make the acquisition of moral judgments (“moral learning”) informationally far too demanding. A more promising approach is the “consequentialization approach”, familiar from traditional utilitarianism and much-discussed in formal ethics. According to it, an agent represents his or her moral judgments in terms of a universal betterness ordering over the options that he or she might encounter and then takes the morally permissible options in each context to be the most highly ranked feasible ones. Although this approach constitutes a clear improvement over the purely extensional approach, it still has significant limitations. First of all, consequentialist representations of moral judgments are not always possible, because of the non-consequentialist or relativist nature of those judgments; and secondly, consequentialist representations are cumbersome and not very illuminating, especially when the number of possible options is large. We argue that moral judgments can be plausibly represented in terms of a “reasons structure”: a specification of which properties are morally relevant in each context, and how choice-worthy different bundles of properties are. This way of representing moral judgments not only offers greater power and flexibility, but it also suggests a plausible mechanism for moral learning: agents come to acquire a reasons structure, perhaps on the basis of a limited database of moral experiences (a “learning database”), and they then extrapolate the moral judgments encoded in this reasons structure to new contexts.
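A reasons structure as described can be sketched as data: a context assigns each option its bundle of morally relevant properties, and a weighing order ranks bundles by choice-worthiness. Everything below (names, properties, scores) is an illustrative toy, not the authors' formalism.

```python
# Choice-worthiness of property bundles; higher score = more choice-worthy.
WEIGHING = {
    frozenset({"keeps promise", "harms no one"}): 2,
    frozenset({"harms no one"}): 1,
    frozenset({"breaks promise"}): 0,
}

def permissible(options, relevant_properties):
    """Return the options whose property bundles are maximally choice-worthy
    in this context (the reason-based analogue of 'most highly ranked')."""
    scores = {o: WEIGHING[frozenset(relevant_properties[o])] for o in options}
    best = max(scores.values())
    return {o for o, s in scores.items() if s == best}

# One context: each option with its morally relevant properties.
context = {
    "tell the truth": {"keeps promise", "harms no one"},
    "stay silent": {"harms no one"},
    "lie": {"breaks promise"},
}
```

The compactness point is visible here: the agent stores one weighing over property bundles and can extrapolate to any new context whose options instantiate those properties, instead of memorizing a permissibility verdict per context.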
The Progress of Machine Decisions
In the last few years, several fields have turned towards artificial intelligence to solve their own problems, reframing routine decisions in machine learning terms (see the case of criminal justice as a paradigmatic example). While some recent applications achieved impressive predictive accuracy, the consequences of machine decisions raise serious questions about the benefits and the risks of machine learning: Is machine learning an adequate solution? Should we accept its presuppositions and conclusions? In other words, should we trust it?
These questions challenge the standard approach to machine learning assessment and call for a broader notion of progress encompassing both empirical and non-empirical factors.
In this talk I will discuss the problem-solving effectiveness of machine decisions, drawing on concrete case studies and Larry Laudan’s philosophy of science. The discussion will point out how the field is facing new conceptual problems due to the tensions between machine learning solutions and deep-rooted social values such as privacy, fairness, accountability and transparency. I will argue that the rise of such conceptual difficulties is bringing profound changes in the field and has the potential to enrich the underlying research tradition. Yet overlooking the conceptual nature of some problems, by placing too much emphasis on empirical considerations, may constitute a serious impediment to the progress of the field and impoverish its overall assessment.
From Nash to Dependency Equilibria
As is well known, Nash equilibria assume the causal independence of the decisions and the actions of the players. While the independence of the actions is constitutive of normal form games, the independence of the decisions may and should be given up. This leads to the wider and quite different notion of a dependency equilibrium; e.g., cooperation in the single-shot prisoners’ dilemma is a dependency equilibrium. The talk will argue that this notion is meaningful and significant and will sketch some of its consequences.
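The prisoners' dilemma example can be sketched numerically. Under a dependency equilibrium, each action is evaluated by its expected payoff conditional on being chosen, under a joint distribution in which the players' decisions are dependent. The payoff numbers and the degree of correlation below are hypothetical illustrations.

```python
# One-shot Prisoner's Dilemma payoffs for the row player (illustrative values):
# mutual cooperation 3, mutual defection 1, sucker 0, temptation 5.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def conditional_expected_payoff(joint, my_action):
    """Row player's expected payoff conditional on choosing my_action,
    under a joint distribution over both players' action profiles."""
    mass = sum(p for (a, b), p in joint.items() if a == my_action)
    if mass == 0:
        return None  # conditioning on a probability-zero action is undefined
    return sum(PAYOFF[(a, b)] * p
               for (a, b), p in joint.items() if a == my_action) / mass

# Perfectly correlated decisions: the players cooperate together or defect
# together (a dependence that Nash equilibrium rules out by assumption).
joint = {("C", "C"): 0.9, ("D", "D"): 0.1, ("C", "D"): 0.0, ("D", "C"): 0.0}

# Conditional on cooperating the row player expects 3; conditional on
# defecting, only 1. So cooperation is conditionally optimal for both
# players -- the defining feature of a dependency equilibrium.
```

The contrast with Nash is that here the conditioning changes the opponent's expected action; with an independent joint distribution, defection would dominate as usual.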
How the decision theory of Newcomblike problems differs between humans and machines
Newcomblike problems divide the philosophical community into adherents of evidential decision theory (EDT) and causal decision theory (CDT)—“one-boxers” and “two-boxers”, respectively. Although normative theories of rational choice can be applied to rational agents in general, the debate about these problems has often been focused on human decision-making in particular. However, as the capabilities of machines increase, it becomes more important to apply the debate to artificial agents, as well. In particular, the new problem arises of which decision theory to implement into these agents. In this talk, I outline three ways in which the decision theory of Newcomblike problems differs between humans and machines. I argue that, due to these differences, the theory for machines requires its own discussion.
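The EDT/CDT divergence in Newcomb's problem can be made explicit with a small calculation; the predictor accuracy and payoffs below are the standard illustrative numbers, not figures from the talk.

```python
# Newcomb's problem: the opaque box holds $1,000,000 iff one-boxing was
# predicted; the transparent box always holds $1,000. The predictor is
# correct with probability 0.99 (a hypothetical accuracy).
ACC, MILLION, THOUSAND = 0.99, 1_000_000, 1_000

def edt_value(action):
    """Evidential expected value: the action is treated as evidence
    about what was predicted."""
    p_predicted_one_box = ACC if action == "one-box" else 1 - ACC
    opaque = p_predicted_one_box * MILLION
    return opaque + (THOUSAND if action == "two-box" else 0)

def cdt_value(action, p_predicted_one_box):
    """Causal expected value: the prediction is already causally fixed, so
    the same belief about it is used for both actions."""
    opaque = p_predicted_one_box * MILLION
    return opaque + (THOUSAND if action == "two-box" else 0)

# EDT recommends one-boxing (0.99 * 1e6 beats 0.01 * 1e6 + 1e3), while CDT
# recommends two-boxing for every fixed belief p_predicted_one_box, since
# two-boxing adds $1,000 regardless.
```

For an artificial agent, the designer must choose which of these two evaluation rules to implement before deployment, which is one way the debate shifts when moving from humans to machines.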
Main University Building, Geschwister-Scholl-Platz 1, D-80539 München, Germany