seldonian.RL.Agents.simglucose_custom_fixedarea_random_agent.SimglucoseFixedAreaAgent
- class SimglucoseFixedAreaAgent(bb_crmin, bb_crmax, bb_cfmin, bb_cfmax, cr_shrink_factor, cf_shrink_factor)
Bases:
Agent
- __init__(bb_crmin, bb_crmax, bb_cfmin, bb_cfmax, cr_shrink_factor, cf_shrink_factor)
An agent used for the simglucose problem studied in this example: https://seldonian.cs.umass.edu/Tutorials/examples/diabetes/
- Parameters:
bb_crmin (float) – The bounding box minimum value in CR space.
bb_crmax (float) – The bounding box maximum value in CR space.
bb_cfmin (float) – The bounding box minimum value in CF space.
bb_cfmax (float) – The bounding box maximum value in CF space.
cr_shrink_factor – The factor by which to shrink the CR size.
cf_shrink_factor – The factor by which to shrink the CF size.
- __repr__()
Return repr(self).
Methods
- choose_action(obs)
Return a (CR, CF) pair sampled from uniform random distributions. The bounds of those distributions (crmin, crmax, cfmin, cfmax) are obtained by applying a sigmoid to the theta values (the policy weights).
- Parameters:
obs – The current observation of the agent. Its type depends on the environment.
- Returns:
An array of actions.
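The sampling scheme described for choose_action can be sketched in plain Python. This is a minimal illustration, not the library's implementation. The page does not state how theta maps to the bounds, so the helper theta_to_interval, the two-element theta, the fixed interval width of (bb_max - bb_min) / shrink_factor, and the numeric values used below are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function mapping any real number to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def theta_to_interval(theta, bb_min, bb_max, shrink_factor):
    """Hypothetical mapping from one policy weight to a sampling interval.

    Assumption: the interval width is fixed at (bb_max - bb_min) / shrink_factor,
    and sigmoid(theta) slides that interval inside [bb_min, bb_max].
    Requires shrink_factor >= 1 so the interval fits in the bounding box.
    """
    width = (bb_max - bb_min) / shrink_factor
    low = bb_min + sigmoid(theta) * (bb_max - bb_min - width)
    return low, low + width


def choose_action_sketch(theta, bb_crmin, bb_crmax, bb_cfmin, bb_cfmax,
                         cr_shrink_factor, cf_shrink_factor, rng=random):
    """Sample (CR, CF) uniformly from the box implied by theta = (theta_cr, theta_cf)."""
    crmin, crmax = theta_to_interval(theta[0], bb_crmin, bb_crmax, cr_shrink_factor)
    cfmin, cfmax = theta_to_interval(theta[1], bb_cfmin, bb_cfmax, cf_shrink_factor)
    return rng.uniform(crmin, crmax), rng.uniform(cfmin, cfmax)


# Illustrative values only; they are not taken from the tutorial.
rng = random.Random(0)
cr, cf = choose_action_sketch((0.3, -1.2), 5.0, 15.0, 15.0, 45.0, 2.0, 3.0, rng=rng)
```

Under these assumptions the sigmoid keeps the sampled box inside the bounding box for any real-valued theta, so the weights can be optimized without constraints.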
- get_params()
Retrieve the parameters of the agent’s policy.
- get_policy()
Retrieve the agent’s policy object.
- get_prob_this_action(observation, action)
Get the probability of a given action, given that the environment is in the given observation. To be overridden.
- Parameters:
observation – The current observation of the agent, type depends on environment.
action – The current action of the agent, type depends on environment.
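The page marks get_prob_this_action as something to be overridden and does not say how this agent computes it. As a hedged illustration only: for a policy that samples (CR, CF) uniformly from a rectangle, one natural choice is the uniform density, which is constant inside the rectangle and zero outside. The function name uniform_box_density and its arguments are hypothetical and are not part of the library.

```python
def uniform_box_density(action, crmin, crmax, cfmin, cfmax):
    """Density of a uniform distribution over [crmin, crmax] x [cfmin, cfmax].

    Hypothetical helper: returns 1 / area for actions inside the box
    and 0.0 for actions outside it. Assumes crmax > crmin and cfmax > cfmin.
    """
    cr, cf = action
    inside = crmin <= cr <= crmax and cfmin <= cf <= cfmax
    if not inside:
        return 0.0
    return 1.0 / ((crmax - crmin) * (cfmax - cfmin))


# Illustrative values: a 5 x 10 box has density 1 / 50 everywhere inside it.
p_inside = uniform_box_density((8.0, 25.0), 7.5, 12.5, 20.0, 30.0)
p_outside = uniform_box_density((13.0, 25.0), 7.5, 12.5, 20.0, 30.0)
```

Because the actions are continuous, the returned value is a probability density rather than a probability, so it can exceed 1 when the box area is smaller than 1.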
- set_new_params(new_params)
Set the parameters of the agent.
- Parameters:
new_params – array of weights
- update(observation, next_observation, reward, terminated)
A no-op for this agent, but it must be implemented.
- Parameters:
observation – The current observation of the agent, type depends on environment.
next_observation – The observation of the agent after an action is taken.
reward – The reward for taking the action.
terminated (bool) – Whether next_observation is the terminal observation