seldonian.seldonian_algorithm.SeldonianAlgorithm¶
- class SeldonianAlgorithm(spec)¶
Bases: object
- __init__(spec)¶
Object for running the Seldonian algorithm and for inspecting the candidate selection and safety test steps.
- Parameters:
spec (Spec object) – The specification object with the complete set of parameters for running the Seldonian algorithm
- __repr__()¶
Return repr(self).
- candidate_safety_split(frac_data_in_safety)¶
Split dataset into candidate and safety sets. Regime-agnostic.
- Parameters:
frac_data_in_safety – Fraction of data used in safety test. The remaining fraction will be used in candidate selection
- Returns:
- supervised_learning: F_c, F_s, L_c, L_s, S_c, S_s, n_candidate, n_safety
where F=features, L=labels, S=sensitive attributes
- reinforcement_learning: E_c, E_s, S_c, S_s, n_candidate, n_safety
where E=episodes, S=sensitive attributes
- custom regime: D_c, D_s, S_c, S_s, n_candidate, n_safety
where D=data, S=sensitive attributes
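The split described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper name `split_candidate_safety`, the unshuffled ordering, and the rounding rule are all assumptions made for illustration.

```python
def split_candidate_safety(rows, frac_data_in_safety):
    """Hypothetical helper: split rows into candidate and safety portions.

    Assumes no shuffling and that the safety set takes the trailing rows.
    """
    if not 0.0 < frac_data_in_safety < 1.0:
        raise ValueError("frac_data_in_safety must be strictly between 0 and 1")
    n_safety = int(frac_data_in_safety * len(rows))
    n_candidate = len(rows) - n_safety
    return rows[:n_candidate], rows[n_candidate:], n_candidate, n_safety


# Toy supervised-learning data: 8 samples with one feature each.
features = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8]]
labels = [0, 1, 0, 1, 1, 0, 1, 0]

# Features and labels are split with the same fraction so they stay aligned.
F_c, F_s, n_candidate, n_safety = split_candidate_safety(features, 0.25)
L_c, L_s, _, _ = split_candidate_safety(labels, 0.25)
```

With `frac_data_in_safety=0.25`, a quarter of the rows are held out for the safety test and the rest are left for candidate selection.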
- candidate_safety_split_addl_datasets(frac_data_in_safety, addl_dataset, batch_size, constraint_str, base_node)¶
Split an additional dataset into candidate and safety sets. Regime-agnostic.
- Parameters:
frac_data_in_safety – Fraction of data used in safety test. The remaining fraction will be used in candidate selection.
addl_dataset – The dataset to split
batch_size – The batch size provided by the user (may be None)
constraint_str – The constraint string for the parse tree for which this additional dataset is to be used.
base_node – The base node within the constraint string for which this additional dataset is to be used.
- Returns:
- supervised_learning: F_c, F_s, L_c, L_s, S_c, S_s, n_candidate, n_safety
where F=features, L=labels, S=sensitive attributes
- reinforcement_learning: E_c, E_s, S_c, S_s, n_candidate, n_safety
where E=episodes, S=sensitive attributes
- custom regime: D_c, D_s, S_c, S_s, n_candidate, n_safety
where D=data, S=sensitive attributes
- candidate_selection(write_logfile=False)¶
Create the candidate selection object
- Parameters:
write_logfile – Whether to write out a pickle file containing details of candidate selection
- evaluate_primary_objective(branch, theta)¶
Get value of the primary objective given model weights, theta, on either the candidate selection dataset or the safety dataset. This is just a wrapper for primary_objective where data is fixed.
- Parameters:
branch (str) – ‘candidate_selection’ or ‘safety_test’
theta (numpy.ndarray) – model weights
- Returns:
the value of the primary objective evaluated for the given branch at the provided value of theta
- Return type:
float
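The idea of a wrapper with fixed data can be sketched as a closure that looks up the dataset for the requested branch. Everything below is a stand-in for illustration: `mean_squared_error`, `make_branch_evaluator`, and the toy datasets are hypothetical and are not part of the library.

```python
def mean_squared_error(theta, features, labels):
    """Toy primary objective: MSE of a linear model with weights theta."""
    predictions = [sum(w * x for w, x in zip(theta, row)) for row in features]
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)


def make_branch_evaluator(primary_objective, datasets):
    """Return evaluate(branch, theta) with the data for each branch fixed."""
    def evaluate(branch, theta):
        if branch not in datasets:
            raise ValueError(f"Unknown branch: {branch!r}")
        features, labels = datasets[branch]
        return primary_objective(theta, features, labels)
    return evaluate


# Hypothetical candidate and safety data, keyed by the two branch names.
datasets = {
    "candidate_selection": ([[1.0], [2.0]], [2.0, 4.0]),
    "safety_test": ([[3.0]], [5.0]),
}
evaluate = make_branch_evaluator(mean_squared_error, datasets)

cs_value = evaluate("candidate_selection", [2.0])
st_value = evaluate("safety_test", [2.0])
```

Here the caller only supplies the branch name and theta, and the data for that branch is already bound inside the evaluator.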
- get_cs_result()¶
Get the dictionary returned from running candidate selection
- get_importance_weights(branch, theta)¶
Get the importance weights from the model weights, theta, evaluated either on the candidate data or safety data.
- Parameters:
branch (str) – ‘candidate_selection’ or ‘safety_test’
theta (numpy.ndarray) – model weights
- Returns:
an array of importance weights (floats) the same length as the number of episodes in the data (depending on which branch was chosen)
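As a rough illustration of what one weight per episode means, the sketch below uses the standard per-episode importance weight from off-policy evaluation: the product over time steps of the new policy's action probability divided by the behavior policy's. Whether the library computes its weights exactly this way is an assumption, and the episode layout used here is hypothetical.

```python
import math


def importance_weight(new_probs, behavior_probs):
    """Product over time steps of pi_new(a|s) / pi_behavior(a|s)."""
    return math.prod(p_new / p_b for p_new, p_b in zip(new_probs, behavior_probs))


# Hypothetical episodes: per-step action probabilities under each policy.
episodes = [
    {"new": [0.5, 0.5], "behavior": [0.25, 0.5]},
    {"new": [0.25], "behavior": [0.5]},
]

# One weight per episode, so the result has the same length as the data.
weights = [importance_weight(ep["new"], ep["behavior"]) for ep in episodes]
```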
- get_st_upper_bounds()¶
Get the upper bounds on each constraint evaluated on the safety data from the last time the safety test was run.
- Returns:
upper_bounds_dict, a dictionary whose keys are the constraint strings and whose values are the upper bounds for the corresponding constraints
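A dictionary of this shape can be scanned to see which constraints held. The sketch below assumes the usual Seldonian convention that a constraint is satisfied when its upper bound is at most zero. The helper `failed_constraints` and the constraint strings and values are made up for illustration.

```python
def failed_constraints(upper_bounds_dict, threshold=0.0):
    """Return the constraint strings whose upper bound exceeds the threshold."""
    return [
        constraint_str
        for constraint_str, upper_bound in upper_bounds_dict.items()
        if upper_bound > threshold
    ]


# Hypothetical result in the documented shape: constraint string -> upper bound.
upper_bounds_dict = {
    "abs((PR | [M]) - (PR | [F])) - 0.2": -0.05,
    "FPR - 0.1": 0.03,
}

failed = failed_constraints(upper_bounds_dict)
```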
- parse_trees¶
The base_node_bound_dict specifies the bounding method for each base node. Any base nodes not in this dictionary will be bounded using the default method
- run(write_cs_logfile=False, debug=False)¶
Runs the Seldonian algorithm using the spec object.
- Parameters:
write_cs_logfile – Whether to write candidate selection log file
debug – Whether to print out debugging info
- Returns:
(passed_safety, solution). passed_safety indicates whether the solution found during candidate selection passes the safety test. solution is the optimized model weights found during candidate selection or ‘NSF’.
- Return type:
Tuple
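Because solution may be either model weights or the string ‘NSF’, callers typically branch on the returned tuple. The sketch below shows one way to do that. The helper `describe_result` is hypothetical, and the tuples passed to it stand in for what `SeldonianAlgorithm(spec).run()` would return.

```python
def describe_result(passed_safety, solution):
    """Summarize a (passed_safety, solution) tuple as a short message."""
    # Check the type first: comparing an array of weights to a string with ==
    # is not a reliable scalar test.
    if isinstance(solution, str) and solution == "NSF":
        return "No solution found"
    if not passed_safety:
        return "Candidate solution failed the safety test"
    return f"Safe solution with {len(solution)} weights"


# Stand-ins for values returned by run(); a list is used in place of an array.
msg_safe = describe_result(True, [0.4, -1.2, 0.7])
msg_failed = describe_result(False, [0.4, -1.2, 0.7])
msg_nsf = describe_result(False, "NSF")
```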
- run_candidate_selection(write_logfile=False, debug=False)¶
Run candidate selection
- Parameters:
write_logfile – Whether to write out a pickle file containing details of candidate selection
debug – Boolean flag for whether to run in debug mode.
- Returns:
candidate_solution, the model weights obtained from running candidate selection, or ‘NSF’ if an error occurred during candidate selection.
- Return type:
numpy.ndarray or str
- run_safety_test(candidate_solution, batch_size_safety=None, debug=False)¶
Runs safety test using solution from candidate selection.
- Parameters:
candidate_solution – model weights from candidate selection or other process
debug – Whether to print out debugging info
- Returns:
(passed_safety, solution). passed_safety indicates whether the solution found during candidate selection passed the safety test. solution is the optimized model weights found during candidate selection or ‘NSF’.
- Return type:
Tuple(Bool,numpy.ndarray or str)
- safety_test()¶
Create the safety test object
- set_initial_solution(verbose=False)¶
Set the self.initial_solution parameter by evaluating the initial_solution_fn if provided, otherwise use the default.