seldonian.seldonian_algorithm.SeldonianAlgorithm

class SeldonianAlgorithm(spec)

Bases: object

__init__(spec)

Object for running the Seldonian algorithm and for getting introspection into candidate selection and the safety test.

Parameters:

spec (Spec object) – The specification object with the complete set of parameters for running the Seldonian algorithm
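
A minimal usage sketch. It assumes a Spec object was built earlier and saved with the toolkit's pickle utilities; the path spec.pkl is illustrative:

    from seldonian.utils.io_utils import load_pickle
    from seldonian.seldonian_algorithm import SeldonianAlgorithm

    # Load a previously constructed specification object (hypothetical path)
    spec = load_pickle("spec.pkl")

    # Build the algorithm object, then run candidate selection and the safety test
    SA = SeldonianAlgorithm(spec)
    passed_safety, solution = SA.run()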

__repr__()

Return repr(self).

candidate_safety_split(frac_data_in_safety)

Split dataset into candidate and safety sets. Regime-agnostic.

Parameters:

frac_data_in_safety – Fraction of data used in the safety test. The remaining fraction will be used in candidate selection.

Returns:

supervised_learning: F_c, F_s, L_c, L_s, S_c, S_s, n_candidate, n_safety

where F=features, L=labels, S=sensitive attributes

reinforcement_learning: E_c, E_s, S_c, S_s, n_candidate, n_safety

where E=episodes, S=sensitive attributes

custom regime: D_c, D_s, S_c, S_s, n_candidate, n_safety

where D=data, S=sensitive attributes
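
For the supervised learning regime, a sketch of inspecting the split directly (assuming SA from the sketch above; the 0.6 safety fraction is arbitrary):

    # Hold out 60% of the data for the safety test
    (F_c, F_s, L_c, L_s, S_c, S_s,
     n_candidate, n_safety) = SA.candidate_safety_split(frac_data_in_safety=0.6)

    print(f"candidate points: {n_candidate}, safety points: {n_safety}")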

candidate_safety_split_addl_datasets(frac_data_in_safety, addl_dataset, batch_size, constraint_str, base_node)

Split an additional dataset into candidate and safety sets. Regime-agnostic.

Parameters:
  • frac_data_in_safety – Fraction of data used in safety test. The remaining fraction will be used in candidate selection.

  • addl_dataset – The additional dataset to split.

  • batch_size – The batch size provided by the user (may be None).

  • constraint_str – The constraint string for the parse tree for which this additional dataset is to be used.

  • base_node – The base node within the constraint string for which this additional dataset is to be used.

Returns:

supervised_learning: F_c, F_s, L_c, L_s, S_c, S_s, n_candidate, n_safety

where F=features, L=labels, S=sensitive attributes

reinforcement_learning: E_c, E_s, S_c, S_s, n_candidate, n_safety

where E=episodes, S=sensitive attributes

custom regime: D_c, D_s, S_c, S_s, n_candidate, n_safety

where D=data, S=sensitive attributes
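
A sketch of calling this helper for an additional dataset attached to a single base node; addl_dataset, the constraint string, and the base node name are hypothetical:

    # Split a hypothetical extra dataset used only for bounding the FPR base node
    split = SA.candidate_safety_split_addl_datasets(
        frac_data_in_safety=0.6,
        addl_dataset=addl_dataset,
        batch_size=None,
        constraint_str="FPR <= 0.2",
        base_node="FPR",
    )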

candidate_selection(write_logfile=False)

Create the candidate selection object

Parameters:

write_logfile – Whether to write out a pickle file containing details of candidate selection

evaluate_primary_objective(branch, theta)

Get the value of the primary objective given model weights, theta, evaluated on either the candidate selection dataset or the safety dataset. This is a wrapper for primary_objective with the data held fixed.

Parameters:
  • branch (str) – ‘candidate_selection’ or ‘safety_test’

  • theta (numpy.ndarray) – model weights

Returns:

the value of the primary objective evaluated for the given branch at the provided value of theta

Return type:

float
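
For example, after a successful run() one might compare the primary objective on both data splits (a sketch, assuming solution is a weight vector rather than ‘NSF’):

    # Primary objective on the candidate-selection data vs. the safety data
    cs_value = SA.evaluate_primary_objective("candidate_selection", solution)
    st_value = SA.evaluate_primary_objective("safety_test", solution)
    print(f"candidate: {cs_value:.4f}, safety: {st_value:.4f}")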

get_cs_result()

Get the dictionary returned from running candidate selection

get_importance_weights(branch, theta)

Get the importance weights from the model weights, theta, evaluated either on the candidate data or safety data.

Parameters:
  • branch (str) – ‘candidate_selection’ or ‘safety_test’

  • theta (numpy.ndarray) – model weights

Returns:

an array of importance weights (floats) whose length equals the number of episodes in the data for the chosen branch
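
A sketch for the reinforcement learning regime, assuming solution is a weight vector returned by run():

    # One importance weight per episode in the safety data
    is_weights = SA.get_importance_weights("safety_test", solution)
    print(len(is_weights), "episodes in the safety split")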

get_st_upper_bounds()

Get the upper bounds on each constraint evaluated on the safety data from the last time the safety test was run.

Returns:

upper_bounds_dict, a dictionary where the keys are the constraint strings and the values are the upper bounds for those constraints.
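
A sketch of inspecting the bounds after the safety test has been run at least once:

    upper_bounds_dict = SA.get_st_upper_bounds()
    for constraint_str, upper_bound in upper_bounds_dict.items():
        # A constraint is satisfied with high confidence when its upper bound is <= 0
        print(f"{constraint_str}: upper bound = {upper_bound:.4f}")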

parse_trees

The base_node_bound_dict specifies the bounding method for each base node. Any base nodes not in this dictionary will be bounded using the default method.

run(write_cs_logfile=False, debug=False)

Runs the Seldonian algorithm using the spec object.

Parameters:
  • write_cs_logfile – Whether to write candidate selection log file

  • debug – Whether to print out debugging info

Returns:

(passed_safety, solution). passed_safety indicates whether the solution found during candidate selection passed the safety test. solution is the optimized model weights found during candidate selection, or ‘NSF’ (No Solution Found).

Return type:

Tuple
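
A sketch of the typical calling pattern, following the return description above:

    passed_safety, solution = SA.run(write_cs_logfile=True, debug=False)

    if passed_safety:
        # solution holds the model weights that passed the safety test
        print("Safe solution found")
    else:
        # Candidate selection returned 'NSF' or the safety test failed
        print("No Solution Found")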

run_candidate_selection(write_logfile=False, debug=False)

Run candidate selection

Parameters:
  • write_logfile – Whether to write out a pickle file containing details of candidate selection

  • debug – Boolean flag for whether to run in debug mode.

Returns:

candidate_solution, the model weights obtained from running candidate selection, or ‘NSF’ if an error occurred during candidate selection.

Return type:

numpy.ndarray or str
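
A sketch of running candidate selection on its own and then retrieving the optimization details via get_cs_result() (assuming that dictionary is populated by this call):

    candidate_solution = SA.run_candidate_selection(write_logfile=True)

    if not (isinstance(candidate_solution, str) and candidate_solution == "NSF"):
        cs_result = SA.get_cs_result()  # dictionary of candidate-selection details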

run_safety_test(candidate_solution, batch_size_safety=None, debug=False)

Runs safety test using solution from candidate selection.

Parameters:
  • candidate_solution – model weights from candidate selection or another process

  • batch_size_safety – The batch size to use when evaluating the safety test on the safety data (may be None)

  • debug – Whether to print out debugging info

Returns:

(passed_safety, solution). passed_safety indicates whether the candidate solution passed the safety test. solution is the optimized model weights found during candidate selection, or ‘NSF’ if the safety test did not pass.

Return type:

Tuple(bool, numpy.ndarray or str)
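
Continuing the previous sketch, the candidate solution can then be checked against the held-out safety data:

    # Evaluate the candidate solution on the safety split
    passed_safety, solution = SA.run_safety_test(
        candidate_solution, batch_size_safety=None
    )
    print("Passed safety test:", passed_safety)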

safety_test()

Create the safety test object

set_initial_solution(verbose=False)

Set the self.initial_solution attribute by evaluating the initial_solution_fn if provided; otherwise, use the default.
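
A short sketch; reading the resulting initial_solution attribute follows from the description above:

    SA.set_initial_solution(verbose=True)
    theta_init = SA.initial_solution  # starting weights for candidate selection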