seldonian.RL.environments.mountaincar.Mountaincar

class Mountaincar

Bases: Environment

__init__()

Classic Mountaincar environment with hardcoded position and velocity bounds. The three actions are -1 (force left), 0 (no force), and 1 (force right).

Variables:
  • env_description (Env_Description) – contains attributes describing the environment

  • terminal_state (bool) – Whether the terminal observation has been reached

  • time (int) – The current timestep

  • position (float) – The 1D physical position of the car, initialized at -0.5.

  • velocity (float) – The 1D velocity of the car, initialized at 0.0.

  • max_time (int) – Maximum allowed timestep

__repr__()

Return repr(self).

Methods

check_valid_mc_action(action)

Checks to ensure a valid action was taken.

Parameters:

action – A proposed action at the current obs
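A minimal sketch of this kind of check, assuming the only valid actions are the three listed in the class description (-1, 0, 1). The function name check_valid_action and the choice to raise ValueError are hypothetical; the source does not say how the real method reports an invalid action.

```python
# Assumed action set, taken from the class description:
# -1 = force left, 0 = no force, 1 = force right.
VALID_ACTIONS = (-1, 0, 1)


def check_valid_action(action):
    """Return the action unchanged if it is valid, otherwise raise ValueError.

    Hypothetical helper for illustration only; it is not the library method.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(
            f"Invalid action {action!r}; expected one of {VALID_ACTIONS}"
        )
    return action
```

Validating before the dynamics are applied keeps an out-of-range force from silently changing the velocity.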

create_env_description()

Creates the environment description object.

Parameters:

num_states – The number of states

Returns:

Environment description for the obs and action spaces

Return type:

Env_Description

get_env_description()

Get the environment description. Override this method in a child class implementation.

get_observation()

Get the position and velocity at the current timestep

position_and_termination_update()

Update the position given the current velocity. Check to see if we have gone outside position bounds. Also check to see if we have reached the goal position.
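The update described above can be sketched as a pure function. The bounds (-1.2 and 0.5), the goal position (0.5), and the choice to zero the velocity at the left wall are assumptions borrowed from the classic mountain car formulation; the values hardcoded in this class may differ. The name update_position is hypothetical.

```python
# Assumed constants from the classic mountain car problem,
# not confirmed against this class.
POSITION_MIN = -1.2
POSITION_MAX = 0.5
GOAL_POSITION = 0.5


def update_position(position, velocity):
    """Advance the position by one step of the current velocity.

    Returns (new_position, new_velocity, reached_goal).
    """
    position += velocity
    # The goal check uses the unclipped position.
    reached_goal = position >= GOAL_POSITION
    if position < POSITION_MIN:
        # Hitting the left wall: clamp the position and stop the car (assumed).
        position = POSITION_MIN
        velocity = 0.0
    elif position > POSITION_MAX:
        position = POSITION_MAX
    return position, velocity, reached_goal
```

For example, update_position(-1.19, -0.07) clamps the car at the left bound with zero velocity, while update_position(0.48, 0.07) reports that the goal was reached.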

reset()

Go back to the initial observation and timestep.

start_visualizing()

Turn on visualization debugger.

stop_visualizing()

Turn off visualization debugger.

terminated()

Get the terminal observation

transition(action)

Transition between states given an action and return a reward.

Parameters:

action – A possible action at the current obs

Returns:

reward for reaching the next obs

update_velocity(action)

Apply the velocity update rule

Parameters:

action – A possible action at the current obs
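As a sketch of what such a rule usually looks like, the classic mountain car formulation updates the velocity as v + 0.001 * action - 0.0025 * cos(3 * position) and clips the result to [-0.07, 0.07]. These constants and the function signature below are assumptions from that textbook formulation, not values confirmed for this class.

```python
import math

# Assumed constants from the classic mountain car problem.
VELOCITY_MIN = -0.07
VELOCITY_MAX = 0.07
FORCE = 0.001      # scale of the applied action
GRAVITY = 0.0025   # scale of the slope-dependent pull


def update_velocity(position, velocity, action):
    """Return the clipped velocity after applying action at position.

    action is expected to be -1, 0, or 1.
    """
    velocity += FORCE * action - GRAVITY * math.cos(3 * position)
    # Keep the velocity inside the assumed bounds.
    return max(VELOCITY_MIN, min(VELOCITY_MAX, velocity))
```

Because the gravity term is larger than the applied force, the car cannot simply drive up the hill from the valley floor and has to build momentum by rocking back and forth.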

visualize()

Print out the current observation; useful for debugging. Override this method in a child class implementation.