seldonian.RL.Agents.simglucose_custom_fixedarea_random_agent.SimglucoseFixedAreaAgent

class SimglucoseFixedAreaAgent(bb_crmin, bb_crmax, bb_cfmin, bb_cfmax, cr_shrink_factor, cf_shrink_factor)

Bases: Agent

__init__(bb_crmin, bb_crmax, bb_cfmin, bb_cfmax, cr_shrink_factor, cf_shrink_factor)

An agent used for the simglucose problem studied in this example: https://seldonian.cs.umass.edu/Tutorials/examples/diabetes/

Parameters:
  • bb_crmin (float) – The bounding box minimum value in CR space.

  • bb_crmax (float) – The bounding box maximum value in CR space.

  • bb_cfmin (float) – The bounding box minimum value in CF space.

  • bb_cfmax (float) – The bounding box maximum value in CF space.

  • cr_shrink_factor – The factor by which to shrink the CR size.

  • cf_shrink_factor – The factor by which to shrink the CF size.

__repr__()

Return repr(self).

Methods

choose_action(obs)

Return a (CR, CF) pair by sampling from uniform random distributions. The bounds of these distributions (crmin, crmax, cfmin, cfmax) are obtained by applying a sigmoid to the theta values (policy weights).

Parameters:

obs – The current observation of the agent; its type depends on the environment.

Returns:

array of actions
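The sampling step described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the library's implementation: the function name sample_cr_cf, the two-element theta, the rule "box size = bounding-box width / shrink factor", and the way the sigmoid positions the box inside the bounding box are all hypothetical. The numeric values in the usage lines are illustrative as well.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sample_cr_cf(theta, bb_crmin, bb_crmax, bb_cfmin, bb_cfmax,
                 cr_shrink_factor, cf_shrink_factor, rng):
    """Hypothetical sketch: sample (CR, CF) uniformly from a fixed-size box.

    Assumes shrink factors >= 1 so the box fits inside the bounding box.
    """
    # ASSUMPTION: the box size is the bounding-box width divided by the shrink factor.
    cr_size = (bb_crmax - bb_crmin) / cr_shrink_factor
    cf_size = (bb_cfmax - bb_cfmin) / cf_shrink_factor

    # ASSUMPTION: sigmoid(theta) in (0, 1) slides the lower edge of the box
    # across the range that keeps the whole box inside the bounding box.
    crmin = bb_crmin + sigmoid(theta[0]) * (bb_crmax - bb_crmin - cr_size)
    cfmin = bb_cfmin + sigmoid(theta[1]) * (bb_cfmax - bb_cfmin - cf_size)
    crmax = crmin + cr_size
    cfmax = cfmin + cf_size

    # Draw CR and CF independently and uniformly within their bounds.
    cr = rng.uniform(crmin, crmax)
    cf = rng.uniform(cfmin, cfmax)
    return (cr, cf), (crmin, crmax, cfmin, cfmax)


rng = random.Random(0)  # seeded for reproducibility
action, bounds = sample_cr_cf([0.0, 0.0], 5.0, 15.0, 10.0, 50.0, 2.0, 2.0, rng)
```

With theta at zero the sigmoid returns 0.5, so in this sketch the sampling box sits in the middle of the bounding box. The observation plays no role in the sketch, since the description above makes the bounds depend on the policy weights only.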

get_params()

Retrieve the parameters of the agent’s policy

get_policy()

Retrieve the agent’s policy object

get_prob_this_action(observation, action)

Get the probability of a given action, provided the environment is in a given observation. Intended to be overridden.

Parameters:
  • observation – The current observation of the agent, type depends on environment.

  • action – The current action of the agent, type depends on environment.
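For intuition about what an override of this method might return, the sketch below computes the probability density of a (CR, CF) action when both values are drawn independently and uniformly from a rectangle. It is a hedged illustration only: the helper name uniform_box_density and its explicit bound arguments are hypothetical, and the source does not state how the library evaluates this quantity.

```python
def uniform_box_density(action, crmin, crmax, cfmin, cfmax):
    """Hypothetical helper: density of a uniform distribution over a CR x CF box."""
    cr, cf = action
    inside = (crmin <= cr <= crmax) and (cfmin <= cf <= cfmax)
    if not inside:
        # Actions outside the sampling box have zero density.
        return 0.0
    # Inside the box the density is constant: 1 / (box area).
    return 1.0 / ((crmax - crmin) * (cfmax - cfmin))


# Illustrative values only.
p_inside = uniform_box_density((10.0, 30.0), 7.5, 12.5, 20.0, 40.0)
p_outside = uniform_box_density((14.0, 30.0), 7.5, 12.5, 20.0, 40.0)
```

Because the box area is fixed in this sketch, the density is the same for every action inside the box and depends on the policy weights only through where the box is placed.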

set_new_params(new_params)

Set the parameters of the agent

Parameters:

new_params – array of weights

update(observation, next_observation, reward, terminated)

No-op, but it must be implemented.

Parameters:
  • observation – The current observation of the agent, type depends on environment.

  • next_observation – The observation of the agent after an action is taken

  • reward – The reward for taking the action

  • terminated (bool) – Whether next_observation is the terminal observation