seldonian.parse_tree.nodes.RLAltRewardBaseNode

class RLAltRewardBaseNode(name, alt_reward_number, lower=-inf, upper=inf, conditional_columns=[], **kwargs)

Bases: NewPolicyPerformanceBaseNode

__init__(name, alt_reward_number, lower=-inf, upper=inf, conditional_columns=[], **kwargs)

A base node for computing the importance sampling (IS) estimate using an alternate reward, i.e., one other than the primary reward. There can be an arbitrary number of alternate rewards, so the alt_reward_number attribute identifies which alternate reward to reference. Alternate rewards are 1-indexed, so to reference the second alternate reward the base node string would be “J_pi_new_IS_[2]”. Inherits all of the attributes and methods of NewPolicyPerformanceBaseNode, and therefore of BaseNode. On-policy evaluation is possible via calculate_value(on_policy=True).

Parameters:
  • name (str) – The name of the node, e.g. “J_pi_new_IS_[1]”

  • alt_reward_number (int) – Which alternate reward to use when calculating the IS estimate. 1-indexed.

  • lower (float) – Lower confidence bound

  • upper (float) – Upper confidence bound

  • conditional_columns (List(str)) – When calculating confidence bounds on a measure function, condition on these columns being == 1
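
For example, a minimal construction sketch (the import path follows this page; the surrounding parse-tree setup, dataset, and model are omitted):

    import numpy as np
    from seldonian.parse_tree.nodes import RLAltRewardBaseNode

    # Bound the IS estimate computed with the second alternate reward.
    node = RLAltRewardBaseNode(
        name="J_pi_new_IS_[2]",  # base node string; alternate rewards are 1-indexed
        alt_reward_number=2,
        lower=-np.inf,
        upper=np.inf,
    )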

__repr__()

Overrides Node.__repr__()

Methods

calculate_bounds(**kwargs)

Calculate confidence bounds given a bound_method, such as the Student’s t-test.

Returns:

A dictionary mapping the bound name to its value, e.g., {“lower”:-1.0, “upper”: 1.0}

calculate_data_forbound(**kwargs)

Prepare data inputs for confidence bound calculation.

Returns:

data_dict, a dictionary containing the prepared data

calculate_value(on_policy=False, **kwargs)

Calculate the value of the node given model weights, etc. This is the expected value of the base variable, not the bound.

Parameters:

on_policy (Boolean) – If True, uses episodes generated by the new policy parameterization to calculate the return. If False, estimates the return using an off-policy estimate.
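
Illustrative only: with on_policy=True, the value reduces to a Monte Carlo average of discounted returns over episodes generated by the new policy. A minimal sketch with hypothetical inputs (a list of per-episode alternate-reward sequences):

    import numpy as np

    def on_policy_value(episode_alt_rewards, gamma=1.0):
        # Mean discounted return over episodes run under the new policy.
        returns = [
            np.dot(gamma ** np.arange(len(r)), r) for r in episode_alt_rewards
        ]
        return np.mean(returns)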

compute_HC_lowerbound(data, datasize, delta, **kwargs)

Calculate a high-confidence lower bound. Used in the safety test.

Parameters:
  • data (numpy ndarray) – Vector containing base variable evaluated at each observation in dataset

  • datasize (int) – The number of observations in the safety dataset

  • delta (float) – Confidence level, e.g. 0.05

Returns:

lower, the high-confidence lower bound
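
As a sketch of one common bound_method, the Student’s t-test lower bound takes the following form (illustrative, not the library’s code; the corresponding upper bound mirrors it with a plus sign):

    import numpy as np
    from scipy.stats import t

    def ttest_lowerbound(data, datasize, delta):
        mean = np.mean(data)
        std = np.std(data, ddof=1)  # unbiased sample standard deviation
        # One-sided (1 - delta) lower confidence bound on the mean
        return mean - (std / np.sqrt(datasize)) * t.ppf(1 - delta, datasize - 1)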

compute_HC_upper_and_lowerbound(data, datasize, delta_lower, delta_upper, **kwargs)

Calculate high-confidence lower and upper bounds. Used in the safety test. The confidence levels for the lower and upper bounds do not have to be equal.

Depending on the bound_method, this is not always equivalent to calling compute_HC_lowerbound() and compute_HC_upperbound() independently.

Parameters:
  • data (numpy ndarray) – Vector containing base variable evaluated at each observation in dataset

  • datasize (int) – The number of observations in the safety dataset

  • delta_lower (float) – Confidence level for the lower bound, e.g. 0.05

  • delta_upper (float) – Confidence level for the upper bound, e.g. 0.05

Returns:

(lower,upper) the high-confidence lower and upper bounds.

compute_HC_upperbound(data, datasize, delta, **kwargs)

Calculate a high-confidence upper bound. Used in the safety test.

Parameters:
  • data (numpy ndarray) – Vector containing base variable evaluated at each observation in dataset

  • datasize (int) – The number of observations in the safety dataset

  • delta (float) – Confidence level, e.g. 0.05

Returns:

upper, the high-confidence upper bound

mask_data(dataset, conditional_columns)

Mask features and labels using a joint AND mask where each of the conditional columns is True.

Parameters:
  • dataset (dataset.Dataset object) – The candidate or safety dataset

  • conditional_columns (List(str)) – List of columns for which to create the joint AND mask on the dataset

Returns:

The masked data, restricted to rows where all conditional columns are True

Return type:

numpy ndarray
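
A minimal sketch of the joint AND mask, assuming the conditional columns live in a pandas DataFrame (hypothetical function and variable names, not the library’s implementation):

    import pandas as pd

    def joint_and_mask(df, conditional_columns):
        # Keep only the rows where every conditional column equals 1.
        mask = (df[conditional_columns] == 1).all(axis=1)
        return df[mask].values  # numpy ndarray, matching the return type above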

predict_HC_lowerbound(data, datasize, delta, **kwargs)

Calculate a high-confidence lower bound that we expect to pass the safety test. Used in candidate selection.

Parameters:
  • data (numpy ndarray) – Vector containing base variable evaluated at each observation in dataset

  • datasize (int) – The number of observations in the safety dataset

  • delta (float) – Confidence level, e.g. 0.05

Returns:

lower, the predicted high-confidence lower bound
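
Illustrative sketch, assuming the t-test bound_method: a standard Seldonian candidate-selection heuristic doubles the deviation term, anticipating that the bound will be recomputed on the held-out safety data. The same widening applies to the other predict_* methods below.

    import numpy as np
    from scipy.stats import t

    def predict_ttest_lowerbound(data, datasize, delta):
        mean = np.mean(data)
        std = np.std(data, ddof=1)
        # Factor of 2 widens the interval relative to compute_HC_lowerbound.
        return mean - 2.0 * (std / np.sqrt(datasize)) * t.ppf(1 - delta, datasize - 1)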

predict_HC_upper_and_lowerbound(data, datasize, delta_lower, delta_upper, **kwargs)

Calculate high-confidence lower and upper bounds that we expect to pass the safety test. Used in candidate selection. The confidence levels for the lower and upper bounds do not have to be equal.

Depending on the bound_method, this is not always equivalent to calling predict_HC_lowerbound() and predict_HC_upperbound() independently.

Parameters:
  • data (numpy ndarray) – Vector containing base variable evaluated at each observation in dataset

  • datasize (int) – The number of observations in the safety dataset

  • delta_lower (float) – Confidence level for the lower bound, e.g. 0.05

  • delta_upper (float) – Confidence level for the upper bound, e.g. 0.05

Returns:

(lower,upper) the predicted high-confidence lower and upper bounds.

predict_HC_upperbound(data, datasize, delta, **kwargs)

Calculate a high-confidence upper bound that we expect to pass the safety test. Used in candidate selection.

Parameters:
  • data (numpy ndarray) – Vector containing base variable evaluated at each observation in dataset

  • datasize (int) – The number of observations in the safety dataset

  • delta (float) – Confidence level, e.g. 0.05

Returns:

upper, the predicted high-confidence upper bound

zhat(model, theta, data_dict, sub_regime, **kwargs)

Calculate unbiased estimates of this node’s base variable, evaluated at each observation.

Parameters:
  • model (models.SeldonianModel object) – The machine learning model

  • theta (numpy ndarray) – model weights

  • data_dict (dict) – Contains inputs to model, such as features and labels

Returns:

A vector of unbiased estimates of the measure function
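
Illustrative only: the per-episode ordinary importance sampling estimate that this node bounds, computed from an alternate reward stream rather than the primary one (episode fields and names here are hypothetical):

    import numpy as np

    def episode_is_estimate(pi_new_probs, pi_b_probs, alt_rewards, gamma=1.0):
        # Product of per-step action-probability ratios (new vs. behavior policy).
        rho = np.prod(np.asarray(pi_new_probs) / np.asarray(pi_b_probs))
        # Discounted return computed from the alternate reward stream.
        discounts = gamma ** np.arange(len(alt_rewards))
        return rho * np.dot(discounts, np.asarray(alt_rewards))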