I think we were talking recently about explanation of reasoning in AI systems. This thesis proposal is relevant.
---
Frank C. Wimberly
140 Calle Ojo Feliz,
Santa Fe, NM 87505
505 670-9918
---------- Forwarded message ---------
From: Diane Stidle <[hidden email]>
Date: Mon, Nov 30, 2020, 8:04 AM
Subject: Reminder: Thesis Proposal - December 1, 2020 - Nicolay Topin - Unifying State and Policy Level Explanations for Reinforcement Learning
To: [hidden email] <[hidden email]>, <[hidden email]>
Thesis Proposal
Date: December 1, 2020
Time: 10:00am (EST)
Speaker: Nicolay Topin
Zoom Meeting Link: https://cmu.zoom.us/j/99269721240?pwd=a3c5QytZbE01a0w4WEpIS3RpSjFSdz09
Meeting ID: 992 6972 1240
Password: 068976
Title: Unifying State and Policy Level Explanations for Reinforcement Learning
Abstract:
In an off-policy reinforcement learning setting, an agent observes interactions with an environment to learn a policy that maximizes reward. Before the agent is allowed to follow its learned policy, a human operator can use explanations to gauge the agent's competency and try to understand its behavior. Policy-level behavior explanations illustrate the long-term behavior of the agent. Feature importance explanations identify the features of a state that affect the agent's action choice for that state. Experience importance explanations show which past experiences led to the current behavior. Previous explanation methods have provided a subset of these information types, but not all three at once. In this thesis, we address the problem of creating explanations for a reinforcement learning agent that include the full set of information types. We contribute a novel explanation method that unifies and extends these existing explanation types.
We have created a method for producing feature importance explanations by learning a decision tree policy using reinforcement learning. This method formulates the tree-learning problem as a Markov decision process, so standard off-policy learning algorithms can be used to learn an optimal decision tree.
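To make the formulation concrete, here is a minimal sketch of how tree learning might be cast as an MDP, assuming a base environment with observations normalized to [0, 1] that exposes reset() and step() returning (obs, reward, done); the wrapper class, its bound-box state, and the split-action indexing are illustrative stand-ins, not the proposal's actual construction.

import numpy as np

class TreeBuildingMDP:
    """Wraps a base environment so that a policy learned over bound-states
    is, in effect, a decision tree over the original features."""

    def __init__(self, base_env, n_features, n_base_actions, thresholds):
        self.base_env = base_env            # environment whose policy we explain
        self.n_features = n_features
        self.n_base_actions = n_base_actions
        self.thresholds = thresholds        # candidate split values, assumed in (0, 1)

    def reset(self):
        self.obs = self.base_env.reset()
        self.lows = np.zeros(self.n_features)   # current feature box: [low, high]
        self.highs = np.ones(self.n_features)
        return self._state()

    def _state(self):
        # The learner observes only the bounds, never the raw observation,
        # so its deterministic policy is a function of the tree node alone.
        return np.concatenate([self.lows, self.highs])

    def step(self, action):
        if action < self.n_base_actions:
            # Leaf action: act in the base environment, restart the descent.
            self.obs, reward, done = self.base_env.step(action)
            self.lows[:] = 0.0
            self.highs[:] = 1.0
            return self._state(), reward, done
        # Split action: refine the box on the side containing the current
        # observation (an internal node of the induced tree).
        feat, idx = divmod(action - self.n_base_actions, len(self.thresholds))
        t = self.thresholds[idx]
        if self.obs[feat] < t:
            self.highs[feat] = min(self.highs[feat], t)
        else:
            self.lows[feat] = max(self.lows[feat], t)
        return self._state(), 0.0, False    # splits yield no immediate reward here

Because the wrapped state contains only the bounds, any deterministic policy learned by a standard off-policy algorithm in this MDP can be read off directly as a decision tree.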
Likewise, we have created an algorithm for summarizing policy-level behavior as a Markov chain over abstract states. Our approach uses a set of decision trees to map states to abstract states.
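A minimal sketch of this summarization step, assuming an abstract() function (standing in for the proposal's decision-tree mapping) that maps each raw state to a hashable abstract-state id; the function name and trajectory format are assumptions:

from collections import defaultdict

def build_markov_chain(trajectories, abstract):
    # Count transitions between abstract states observed under the policy.
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:               # traj: a sequence of raw states
        for s, s_next in zip(traj, traj[1:]):
            counts[abstract(s)][abstract(s_next)] += 1
    # Normalize each row of counts into transition probabilities.
    chain = {}
    for a, nexts in counts.items():
        total = sum(nexts.values())
        chain[a] = {b: n / total for b, n in nexts.items()}
    return chain

The resulting chain can then be inspected or visualized to show how the agent moves between high-level situations over the long run.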
In addition, we have introduced a method for creating experience importance explanations, which identifies sets of similarly treated inputs and shows how these sets influenced training.
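As one illustration of the idea (not the proposal's actual algorithm), experiences could be grouped by the tree leaf that handles them, with each group's influence estimated by retraining without it; leaf_of, train_policy, and evaluate are hypothetical helpers.

from collections import defaultdict

def experience_importance(experiences, leaf_of, train_policy, evaluate):
    # Group experiences the learned tree treats identically: same leaf.
    groups = defaultdict(list)
    for exp in experiences:                 # exp = (state, action, reward, next_state)
        groups[leaf_of(exp[0])].append(exp)
    baseline = evaluate(train_policy(experiences))
    # Score each group by the performance drop when training without it.
    importance = {}
    for leaf in groups:
        held_out = [e for other, grp in groups.items() if other != leaf for e in grp]
        importance[leaf] = baseline - evaluate(train_policy(held_out))
    return importance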
We propose two lines of future work. First, we will integrate the two decision tree explanations (for feature importance explanations and policy-level behavior explanations) via a shared state featurization. Second, we will extend the experience importance explanation algorithm to identify important experiences both for abstract state division and for the agent's choice of which features to examine.
Thesis committee:
Manuela Veloso (Chair)
Tom Mitchell
Ameet Talwalkar
Marie desJardins (Simmons University)
- .... . -..-. . -. -.. -..-. .. ... -..-. .... . .-. .
FRIAM Applied Complexity Group listserv
Zoom Fridays 9:30a-12p Mtn GMT-6 bit.ly/virtualfriam
un/subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
archives: http://friam.471366.n2.nabble.com/
FRIAM-COMIC http://friam-comic.blogspot.com/