experiments.baselines.baselines.RLExperimentBaseline

class RLExperimentBaseline(model_name, policy, env_kwargs={'gamma': 1.0})

Bases: object

__init__(model_name, policy, env_kwargs={'gamma': 1.0})

Base class for all RL experiment baselines. All RL experiment baselines must have at least the two methods below. Depending on the constraint, other methods may be required. When the constraint involves an importance sampling variant, e.g., one of the “J_pi_new_” variants, a method:

get_probs_from_observations_and_actions(self, theta, observations, actions, behavior_action_probs)

is also required.
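As an illustration only, the sketch below shows one way such a method could look for a tabular softmax policy. Everything beyond the method name and its argument list is an assumption: the class name TabularSoftmaxSketch is hypothetical, theta is assumed to be a table of weights indexed as theta[observation][action], observations and actions are assumed to be integer indices, and the method is assumed to return the probability of each logged action under the policy defined by theta. The behavior_action_probs argument is accepted but unused here.

```python
import math


class TabularSoftmaxSketch:
    """Hypothetical example class; not part of the library."""

    def get_probs_from_observations_and_actions(
        self, theta, observations, actions, behavior_action_probs
    ):
        # Assumption: theta[obs][act] is the weight of action `act`
        # in observation `obs`, and the policy is a softmax over actions.
        probs = []
        for obs, act in zip(observations, actions):
            row = theta[obs]
            max_weight = max(row)  # subtract the max for numerical stability
            exps = [math.exp(w - max_weight) for w in row]
            probs.append(exps[act] / sum(exps))
        return probs


# Two observations, two actions; action 1 is favored in observation 0.
theta = [[0.0, math.log(3.0)], [0.0, 0.0]]
sketch = TabularSoftmaxSketch()
new_probs = sketch.get_probs_from_observations_and_actions(
    theta, observations=[0, 1], actions=[1, 0], behavior_action_probs=[0.5, 0.5]
)
```

An importance sampling estimate would then typically weight each step by the ratio of these probabilities to the corresponding behavior_action_probs.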

Parameters:
  • model_name – The string name to give the model. This will be used as the prefix for the directory in which the model’s results are saved.

  • policy – A seldonian.RL.Agents.Policies.Policy.Policy object.

  • env_kwargs – Keyword arguments specific to the RL environment.

__repr__()

Return repr(self).

Methods

set_new_params(weights)

Set new policy parameters given model weights.

train(dataset, **kwargs)

Train the model using a Seldonian dataset object. This contains the episodes generated using the behavior policy. Must return the trained policy parameters.
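The two required methods can be sketched as follows. This is a self-contained illustration under stated assumptions, not the library's implementation: it does not import or subclass RLExperimentBaseline, and StubPolicy, MeanRewardBaselineSketch, and the mean-reward training rule are hypothetical. It also assumes the dataset exposes an episodes attribute whose items carry observations, actions, and rewards sequences, and that the policy offers get_params() and set_new_params().

```python
from types import SimpleNamespace


class StubPolicy:
    """Hypothetical stand-in for a policy holding a table of weights."""

    def __init__(self, num_observations, num_actions):
        self.params = [[0.0] * num_actions for _ in range(num_observations)]

    def set_new_params(self, weights):
        self.params = [list(row) for row in weights]

    def get_params(self):
        return self.params


class MeanRewardBaselineSketch:
    """Hypothetical baseline exposing the two required methods."""

    def __init__(self, model_name, policy, env_kwargs=None):
        self.model_name = model_name
        self.policy = policy
        # Mirrors the documented default of {'gamma': 1.0}.
        self.env_kwargs = {"gamma": 1.0} if env_kwargs is None else env_kwargs

    def set_new_params(self, weights):
        # Push the given weights into the wrapped policy.
        self.policy.set_new_params(weights)

    def train(self, dataset, **kwargs):
        # Illustrative rule: weight = mean reward seen per (observation, action).
        params = self.policy.get_params()
        totals = [[0.0] * len(row) for row in params]
        counts = [[0] * len(row) for row in params]
        for episode in dataset.episodes:
            steps = zip(episode.observations, episode.actions, episode.rewards)
            for obs, act, reward in steps:
                totals[obs][act] += reward
                counts[obs][act] += 1
        weights = [
            [t / c if c else 0.0 for t, c in zip(total_row, count_row)]
            for total_row, count_row in zip(totals, counts)
        ]
        self.set_new_params(weights)
        return weights  # train() must return the trained policy parameters


# Toy data standing in for episodes generated by the behavior policy.
episode = SimpleNamespace(
    observations=[0, 0, 1], actions=[0, 1, 1], rewards=[1.0, 3.0, 2.0]
)
dataset = SimpleNamespace(episodes=[episode])
baseline = MeanRewardBaselineSketch("mean_reward_sketch", StubPolicy(2, 2))
trained_params = baseline.train(dataset)
```

Returning the parameters from train() while also storing them on the policy keeps both paths consistent: a caller can use the return value directly or read the weights back from the policy.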