seldonian.RL.RL_model.RL_model
- class RL_model(policy, env_kwargs)
Bases: SeldonianModel
- __init__(policy, env_kwargs)
Base class for all RL models.
- Parameters:
policy (Policy) – A policy parameterization
env_kwargs (dict) – Keyword arguments for the environment, such as gamma, the discount factor
- __repr__()
Return repr(self).
Methods
- get_probs_from_observations_and_actions(new_params, observations, actions, action_probs)
Get action probabilities under the policy with new parameters. This is a thin wrapper that calls the policy method of the same name.
- Parameters:
new_params – New policy parameter weights to set
observations – Array of observations
actions – Array of actions
action_probs – Array of action probabilities from the behavior policy
- Returns:
Array of action probabilities under the new policy
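The wrapper pattern described above can be illustrated with a minimal, self-contained sketch. It does not import the Seldonian library: `ToySoftmaxPolicy` and `ToyRLModel` are hypothetical stand-ins written for this example, and the tabular softmax parameterization and integer-coded observations and actions are assumptions, not the library's implementation. Only the method name, its four arguments, and the `gamma` key come from the documentation above.

```python
import numpy as np


class ToySoftmaxPolicy:
    """Hypothetical tabular softmax policy (not part of the Seldonian library)."""

    def __init__(self, n_observations, n_actions):
        # One weight per (observation, action) pair.
        self.params = np.zeros((n_observations, n_actions))

    def get_probs_from_observations_and_actions(
        self, new_params, observations, actions, action_probs
    ):
        # Set the new weights, then look up pi(a | o) for each (o, a) pair.
        # This toy policy ignores action_probs (the behavior-policy
        # probabilities); it is accepted only to match the documented signature.
        self.params = np.asarray(new_params, dtype=float)
        logits = self.params[observations]  # shape (n, n_actions)
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        return probs[np.arange(len(actions)), actions]


class ToyRLModel:
    """Hypothetical stand-in that mirrors the documented wrapper pattern."""

    def __init__(self, policy, env_kwargs):
        self.policy = policy
        self.env_kwargs = env_kwargs

    def get_probs_from_observations_and_actions(
        self, new_params, observations, actions, action_probs
    ):
        # Delegate to the policy method of the same name.
        return self.policy.get_probs_from_observations_and_actions(
            new_params, observations, actions, action_probs
        )


model = ToyRLModel(ToySoftmaxPolicy(2, 2), env_kwargs={"gamma": 0.9})

observations = np.array([0, 1, 1])
actions = np.array([0, 1, 0])
behavior_probs = np.array([0.5, 0.5, 0.5])
new_params = np.array([[0.0, 0.0], [0.0, np.log(3.0)]])

new_probs = model.get_probs_from_observations_and_actions(
    new_params, observations, actions, behavior_probs
)
print(new_probs)
```

The returned array has one entry per (observation, action) pair, in the same order as the inputs.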