seldonian.RL.Agents.keyboard_gridworld.Keyboard_gridworld

class Keyboard_gridworld(env_description)

Bases: Agent

__init__(env_description)

An agent used for debugging the gridworld environment. Not intended for public use.

__repr__()

Return repr(self).

Methods

choose_action(observation)

Choose an action given an observation. To be overridden.

Parameters:

observation – The current observation of the agent, type depends on environment.
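As a rough illustration of overriding choose_action for keyboard-driven debugging, here is a minimal sketch. It does not reproduce the library's implementation: SketchAgent is a stand-in for the Agent base class, and KEY_TO_ACTION, KeyboardSketch, and read_key are hypothetical names whose key bindings are assumed for the example.

```python
class SketchAgent:
    """Stand-in for the documented Agent base class; not the library class."""

    def choose_action(self, observation):
        raise NotImplementedError


# Hypothetical key bindings; the real agent's bindings are not documented here.
KEY_TO_ACTION = {"w": 0, "s": 1, "a": 2, "d": 3}


class KeyboardSketch(SketchAgent):
    def __init__(self, read_key=input):
        # read_key is injectable so the agent can be driven without a terminal.
        self.read_key = read_key

    def choose_action(self, observation):
        # Re-prompt until a recognised key is entered.
        while True:
            key = self.read_key(f"obs={observation} move [w/a/s/d]: ")
            key = key.strip().lower()
            if key in KEY_TO_ACTION:
                return KEY_TO_ACTION[key]


# Scripted keystrokes stand in for a human typing: "x" is rejected, "D" accepted.
keys = iter(["x", "D"])
agent = KeyboardSketch(read_key=lambda prompt: next(keys))
action = agent.choose_action(observation=0)
```

With the default read_key=input, the same class would prompt on the terminal at every step, which is the kind of manual control a debugging agent is typically used for.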

get_params()

Retrieve the parameters of the agent’s policy.

get_policy()

Retrieve the agent’s policy object.

get_prob_this_action(observation, action)

Get the probability of taking a given action when the environment is in a given observation. To be overridden.

Parameters:
  • observation – The current observation of the agent, type depends on environment.

  • action – The current action of the agent, type depends on environment.
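One way a subclass might implement get_prob_this_action is with a tabular softmax policy. The sketch below assumes integer observations and actions and a table of preferences named theta; SoftmaxSketch and its constructor arguments are hypothetical and are not taken from the library.

```python
import math


class SoftmaxSketch:
    """Hypothetical tabular softmax policy; not the library's implementation."""

    def __init__(self, num_observations, num_actions):
        # One row of action preferences per observation, all starting at zero.
        self.theta = [[0.0] * num_actions for _ in range(num_observations)]

    def get_prob_this_action(self, observation, action):
        prefs = self.theta[observation]
        # Subtract the maximum before exponentiating for numerical stability.
        largest = max(prefs)
        exps = [math.exp(p - largest) for p in prefs]
        return exps[action] / sum(exps)


agent = SoftmaxSketch(num_observations=9, num_actions=4)
prob = agent.get_prob_this_action(observation=0, action=2)
total = sum(agent.get_prob_this_action(0, a) for a in range(4))
```

With all preferences equal, every action is equally likely, and the probabilities over the actions for one observation sum to one.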

set_new_params(theta)

Update the parameters of the agent’s policy to theta.

Parameters:

theta – policy parameters
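The relationship between get_policy, get_params, and set_new_params can be sketched as a simple round trip. This assumes the agent holds a policy object that stores the parameters; PolicySketch and ParamAgentSketch are hypothetical stand-ins, and copying theta on assignment is a design choice of this sketch rather than documented behavior.

```python
import copy


class PolicySketch:
    """Hypothetical policy object that only stores its parameters."""

    def __init__(self, theta):
        self.theta = theta


class ParamAgentSketch:
    """Stand-in agent exposing the three documented accessor names."""

    def __init__(self, theta):
        self.policy = PolicySketch(theta)

    def get_policy(self):
        return self.policy

    def get_params(self):
        return self.policy.theta

    def set_new_params(self, theta):
        # Copy so later changes to the caller's list do not alter the policy.
        self.policy.theta = copy.deepcopy(theta)


agent = ParamAgentSketch(theta=[0.0, 0.0])
new_theta = [0.5, -0.5]
agent.set_new_params(new_theta)
new_theta[0] = 99.0  # mutate the caller's copy after the update
params = agent.get_params()
```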

update(observation, next_observation, reward, terminated)

Update the agent’s parameters according to the learning rule. To be overridden.

Parameters:
  • observation – The current observation of the agent, type depends on environment.

  • next_observation – The observation of the agent after an action is taken

  • reward – The reward for taking the action

  • terminated (bool) – Whether next_observation is a terminal observation
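The argument order of update suggests an interaction loop in which the agent acts, the environment responds, and the resulting transition is passed back to the agent. The following is a sketch of such a loop under stated assumptions: CorridorSketch is a hypothetical toy environment, its step return value is invented for the example, and RecordingAgentSketch simply stores each transition instead of learning from it.

```python
class CorridorSketch:
    """Hypothetical 1-D environment: start at cell 0, episode ends at cell 2."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        # Action 1 moves one cell to the right; any other action stays put.
        if action == 1:
            self.position += 1
        terminated = self.position == 2
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated


class RecordingAgentSketch:
    """Stand-in agent that records transitions rather than learning."""

    def __init__(self):
        self.transitions = []

    def choose_action(self, observation):
        return 1  # always move right

    def update(self, observation, next_observation, reward, terminated):
        self.transitions.append(
            (observation, next_observation, reward, terminated)
        )


env = CorridorSketch()
agent = RecordingAgentSketch()
observation, terminated = 0, False
while not terminated:
    action = agent.choose_action(observation)
    next_observation, reward, terminated = env.step(action)
    agent.update(observation, next_observation, reward, terminated)
    observation = next_observation
```

A learning agent would replace the body of update with its learning rule, using terminated to avoid bootstrapping from a terminal observation.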