I think we were talking recently about explanation of reasoning in AI systems. This thesis proposal is relevant.
---
Frank C. Wimberly
140 Calle Ojo Feliz,
Santa Fe, NM 87505
505 670-9918
---------- Forwarded message ---------
From: Diane Stidle <[hidden email]>
Date: Mon, Nov 30, 2020, 8:04 AM
Subject: Reminder: Thesis Proposal - December 1, 2020 - Nicolay Topin - Unifying State and Policy Level Explanations for Reinforcement Learning
To: [hidden email] <[hidden email]>, <[hidden email]>
Thesis Proposal
Date: December 1, 2020
Time: 10:00am (EST)
Speaker: Nicolay Topin
Zoom Meeting Link: https://cmu.zoom.us/j/99269721240?pwd=a3c5QytZbE01a0w4WEpIS3RpSjFSdz09
Meeting ID: 992 6972 1240
Password: 068976
Title: Unifying State and Policy Level Explanations for Reinforcement Learning
Abstract:
In an off-policy reinforcement learning setting, an agent observes interactions with an environment to learn a policy that maximizes reward. Before the agent is allowed to follow its learned policy, a human operator can use explanations to gauge the agent's competency and try to understand its behavior. Policy-level behavior explanations illustrate the long-term behavior of the agent. Feature importance explanations identify the features of a state that affect the agent's action choice for that state. Experience importance explanations show which past experiences led to the current behavior. Previous explanation methods have provided a subset of these information types, but not all three at once. In this thesis, we address the problem of creating explanations for a reinforcement learning agent that include the full set of information types. We contribute a novel explanation method that unifies and extends these existing explanation types.
We have created a method for producing feature importance explanations by learning a decision tree policy using reinforcement learning. This method formulates the tree-learning problem as a Markov decision process, so standard off-policy learning algorithms can be used to learn an optimal decision tree.
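To make the formulation concrete, here is a minimal sketch of how tree learning might be cast as an MDP, assuming a base environment with observations normalized to [0, 1] that exposes reset() and step() returning (obs, reward, done); the wrapper class, its bound-box state, and the split-action indexing are illustrative stand-ins, not the proposal's actual construction.

import numpy as np

class TreeBuildingMDP:
    """Wraps a base environment so that a policy learned over bound-states
    is, in effect, a decision tree over the original features."""

    def __init__(self, base_env, n_features, n_base_actions, thresholds):
        self.base_env = base_env            # environment whose policy we explain
        self.n_features = n_features
        self.n_base_actions = n_base_actions
        self.thresholds = thresholds        # candidate split values, assumed in (0, 1)

    def reset(self):
        self.obs = self.base_env.reset()
        self.lows = np.zeros(self.n_features)   # current feature box: [low, high]
        self.highs = np.ones(self.n_features)
        return self._state()

    def _state(self):
        # The learner observes only the bounds, never the raw observation,
        # so its deterministic policy is a function of the tree node alone.
        return np.concatenate([self.lows, self.highs])

    def step(self, action):
        if action < self.n_base_actions:
            # Leaf action: act in the base environment, restart the descent.
            self.obs, reward, done = self.base_env.step(action)
            self.lows[:] = 0.0
            self.highs[:] = 1.0
            return self._state(), reward, done
        # Split action: refine the box on the side containing the current
        # observation (an internal node of the induced tree).
        feat, idx = divmod(action - self.n_base_actions, len(self.thresholds))
        t = self.thresholds[idx]
        if self.obs[feat] < t:
            self.highs[feat] = min(self.highs[feat], t)
        else:
            self.lows[feat] = max(self.lows[feat], t)
        return self._state(), 0.0, False    # splits yield no immediate reward here

Because the wrapped state contains only the bounds, any deterministic policy learned by a standard off-policy algorithm in this MDP can be read off directly as a decision tree.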
Likewise, we have created an algorithm for summarizing policy-level behavior as a Markov chain over abstract states. Our approach uses a set of decision trees to map states to abstract states.
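A minimal sketch of this summarization step, assuming an abstract() function (standing in for the proposal's decision-tree mapping) that maps each raw state to a hashable abstract-state id; the function name and trajectory format are assumptions:

from collections import defaultdict

def build_markov_chain(trajectories, abstract):
    # Count transitions between abstract states observed under the policy.
    counts = defaultdict(lambda: defaultdict(int))
    for traj in trajectories:               # traj: a sequence of raw states
        for s, s_next in zip(traj, traj[1:]):
            counts[abstract(s)][abstract(s_next)] += 1
    # Normalize each row of counts into transition probabilities.
    chain = {}
    for a, nexts in counts.items():
        total = sum(nexts.values())
        chain[a] = {b: n / total for b, n in nexts.items()}
    return chain

The resulting chain can then be inspected or visualized to show how the agent moves between high-level situations over the long run.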
In addition, we have introduced a method for creating experience importance explanations, which identifies sets of similarly treated inputs and shows how these sets influenced training.
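As one illustration of the idea (not the proposal's actual algorithm), experiences could be grouped by the tree leaf that handles them, with each group's influence estimated by retraining without it; leaf_of, train_policy, and evaluate are hypothetical helpers.

from collections import defaultdict

def experience_importance(experiences, leaf_of, train_policy, evaluate):
    # Group experiences the learned tree treats identically: same leaf.
    groups = defaultdict(list)
    for exp in experiences:                 # exp = (state, action, reward, next_state)
        groups[leaf_of(exp[0])].append(exp)
    baseline = evaluate(train_policy(experiences))
    # Score each group by the performance drop when training without it.
    importance = {}
    for leaf in groups:
        held_out = [e for other, grp in groups.items() if other != leaf for e in grp]
        importance[leaf] = baseline - evaluate(train_policy(held_out))
    return importance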
We propose two lines of future work. First, we will integrate the two decision tree explanations (for feature importance explanations and policy-level behavior explanations) via a shared state featurization. Second, we will extend the experience importance explanation algorithm to identify important experiences both for abstract state division and for the agent's choice of which features to examine.
Thesis committee:
Manuela Veloso (Chair)
Tom Mitchell
Ameet Talwalkar
Marie desJardins (Simmons University)
- .... . -..-. . -. -.. -..-. .. ... -..-. .... . .-. .
FRIAM Applied Complexity Group listserv
Zoom Fridays 9:30a-12p Mtn GMT-6 bit.ly/virtualfriam
un/subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
archives: http://friam.471366.n2.nabble.com/
FRIAM-COMIC http://friam-comic.blogspot.com/