seldonian.parse_tree.parse_tree.ParseTree¶
- class ParseTree(delta, regime, sub_regime, columns=[], custom_measure_functions={})¶
Bases: object
- __init__(delta, regime, sub_regime, columns=[], custom_measure_functions={})¶
Class to represent a parse tree for a single behavioral constraint
- Parameters:
delta (float) – Confidence level for the constraint. Specifies the maximum probability that the algorithm can return a solution that violates the behavioral constraint. This gets broken up into smaller deltas for the base nodes.
regime (str) – The category of the machine learning algorithm, e.g., supervised_learning or reinforcement_learning
sub_regime (str) – The sub-category of the ML algorithm, e.g., classification or regression for supervised learning. Use ‘all’ for RL.
columns (List(str)) – The names of the columns in the dataframe. Used to determine if conditional columns provided by user are appropriate.
- Variables:
root (nodes.Node object) – Root node which contains the whole tree via left and right child attributes. Gets assigned when tree is built by create_from_ast()
constraint_str (str) – The string expression for the behavioral constraint
n_nodes (int) – Total number of nodes in the parse tree
n_base_nodes (int) – Number of base variable nodes in the parse tree. Does not include constants. If a base variable, such as PR | [M], appears more than once in the constraint_str, each appearance contributes to n_base_nodes
base_node_dict (dict) – Keeps track of unique base variable nodes, their confidence bounds and whether the bounds have been calculated for a given base node already. Helpful for handling case where we have duplicate base nodes
n_unique_bounds_tot (int) – The total number of unique confidence bounds that need to be computed over all unique base nodes. This is set by assign_bounds_needed()
node_fontsize (int) – Fontsize used for graphviz visualizations
available_measure_functions (list) – A list of measure functions for the given regime and sub-regime, e.g. “Mean_Error” for supervised regression or “PR”, i.e. Positive Rate, for supervised classification.
- __repr__()¶
Return repr(self).
Methods
- _abs(a)¶
Absolute value of a confidence interval
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
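As a rough illustration of what taking the absolute value of an interval involves, here is a minimal sketch based on standard interval arithmetic. The function name `interval_abs` is hypothetical and this is not necessarily how `_abs` is implemented in the library.

```python
def interval_abs(a):
    """Return an interval containing |x| for every x in a = (lower, upper)."""
    lower, upper = a
    if lower >= 0:
        # Entirely non-negative: the interval is unchanged.
        return (lower, upper)
    if upper <= 0:
        # Entirely non-positive: negate and swap the endpoints.
        return (-upper, -lower)
    # The interval straddles zero, so the smallest magnitude is 0.
    return (0, max(-lower, upper))


print(interval_abs((-3, 2)))
```

Note that an interval straddling zero always yields a lower bound of 0, regardless of how wide it is.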
- _add(a, b)¶
Add two confidence intervals
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
b (tuple) – Confidence interval like: (lower,upper)
- _assign_bounds_helper(node, lower_needed, upper_needed, **kwargs)¶
Helper function to traverse the parse tree and assign which bounds we need to calculate on the base nodes.
- Parameters:
node (Node object) – node in the parse tree
lower_needed (bool) – Whether lower bound needs to be calculated
upper_needed (bool) – Whether upper bound needs to be calculated
- _assign_deltas_helper(node, weight_method, **kwargs)¶
Helper function to traverse the parse tree and assign delta values to base nodes.
- Parameters:
node (Node object) – node in the parse tree
weight_method (str) – How you want to assign the deltas to the base nodes
- _assign_infl_factors_helper(node, method, factors)¶
Helper function to traverse the parse tree and assign bound inflation factors to base nodes.
- Parameters:
node (Node object) – node in the parse tree
method (str) – How you want to assign the bound inflation factors to the base nodes
factors – If an integer and method==“constant”, that integer is applied to all bounds. If method==“manual”, this needs to be a vector of bound inflation factors to assign to each unique base node.
- _ast2pt_node(ast_node)¶
From an ast.AST node object, create one of the node objects from Nodes
- Parameters:
ast_node (ast.AST node object) – node in the ast tree
- _ast_tree_helper(ast_node)¶
From a given node in the ast tree, make a node in the tree and recurse to children of this node.
- Parameters:
ast_node (ast.AST node object) – node in the ast tree
- _div(a, b)¶
Divide two confidence intervals
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
b (tuple) – Confidence interval like: (lower,upper)
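The following sketch shows one conventional way to divide two intervals. The name `interval_div` is hypothetical, and returning an unbounded interval when the denominator contains zero is an assumption made here for illustration, not a documented behavior of `_div`.

```python
import math


def interval_div(a, b):
    """Return an interval containing x / y for every x in a and y in b."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    if b_lo <= 0 <= b_hi:
        # Assumption: a denominator that can be zero gives no useful bound.
        return (-math.inf, math.inf)
    # Otherwise the extremes occur at the endpoint combinations.
    quotients = [a_lo / b_lo, a_lo / b_hi, a_hi / b_lo, a_hi / b_hi]
    return (min(quotients), max(quotients))


print(interval_div((1, 2), (2, 4)))
```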
- _evaluator_helper(node, **kwargs)¶
Helper function for traversing through the tree to evaluate the constraint
- Parameters:
node (Node object) – node in the parse tree
- _exp(a)¶
Exponentiate a confidence interval
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
- _log(a)¶
Take log of a confidence interval
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
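Because the logarithm is monotonically increasing, an interval can be mapped endpoint by endpoint. The sketch below assumes that a non-positive lower endpoint maps to negative infinity and that a fully non-positive interval is an error; both choices, and the name `interval_log`, are assumptions for illustration rather than the documented behavior of `_log`.

```python
import math


def interval_log(a):
    """Return an interval containing log(x) for every positive x in a."""
    lower, upper = a
    if upper <= 0:
        raise ValueError("log is undefined on a fully non-positive interval")
    # Assumption: treat a non-positive lower endpoint as log -> -infinity.
    log_lower = math.log(lower) if lower > 0 else -math.inf
    return (log_lower, math.log(upper))


print(interval_log((0, 1)))
```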
- _max(a, b)¶
Get the maximum of two confidence intervals
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
b (tuple) – Confidence interval like: (lower,upper)
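For maximum (and, symmetrically, minimum) the endpoints can be combined independently. This is a minimal sketch using hypothetical function names, not the library's own code.

```python
def interval_max(a, b):
    """Interval containing max(x, y) for every x in a and y in b."""
    return (max(a[0], b[0]), max(a[1], b[1]))


def interval_min(a, b):
    """Interval containing min(x, y) for every x in a and y in b."""
    return (min(a[0], b[0]), min(a[1], b[1]))


print(interval_max((0, 2), (1, 1.5)))
print(interval_min((0, 2), (1, 1.5)))
```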
- _min(a, b)¶
Get the minimum of two confidence intervals
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
b (tuple) – Confidence interval like: (lower,upper)
- _mult(a, b)¶
Multiply two confidence intervals
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
b (tuple) – Confidence interval like: (lower,upper)
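Multiplication is the case where signs matter most: the extremes of the product can come from any pairing of endpoints. A minimal sketch for finite endpoints, with the hypothetical name `interval_mult`:

```python
def interval_mult(a, b):
    """Return an interval containing x * y for every x in a and y in b.

    Assumes finite endpoints; infinite endpoints would need extra care
    because 0 * inf is nan in floating point.
    """
    products = [x * y for x in a for y in b]
    return (min(products), max(products))


print(interval_mult((-2, 3), (4, 5)))
```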
- _parse_subscript(ast_node)¶
Helper function for dealing with base nodes with subscripts.
- Parameters:
ast_node (ast.AST node object) – node in the ast tree
- Returns:
node_class - which node class to use for this base node
node_kwargs - keyword arguments used to build the base node
- _pow(a, b)¶
Get the confidence interval on pow(a,b) where a and b are both intervals
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
b (tuple) – Confidence interval like: (lower,upper)
- _preprocess_constraint_str(s)¶
Check whether inequalities are present and move everything to one side, so that the final constraint string is in the form: {constraint_str} <= 0
Also performs some validation checks to make sure the string that was passed is valid.
- Parameters:
s (str) – mathematical expression written in Python syntax from which we build the parse tree
- Returns:
String for g, the expression constrained by g <= 0
- Return type:
str
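To make the "move everything to one side" step concrete, here is a minimal sketch. The helper `to_leq_zero` is hypothetical, does no validation, and the exact string the library produces (parenthesization, spacing) may differ.

```python
def to_leq_zero(s):
    """Rewrite 'lhs <= rhs' or 'lhs >= rhs' as an expression g with g <= 0."""
    if "<=" in s:
        lhs, rhs = (part.strip() for part in s.split("<=", 1))
        # lhs <= rhs  is equivalent to  lhs - rhs <= 0
        return f"({lhs})-({rhs})"
    if ">=" in s:
        lhs, rhs = (part.strip() for part in s.split(">=", 1))
        # lhs >= rhs  is equivalent to  rhs - lhs <= 0
        return f"({rhs})-({lhs})"
    # No inequality present: assume the string already represents g.
    return s.strip()


print(to_leq_zero("abs(Mean_Error) <= 0.1"))
print(to_leq_zero("PR >= 0.5"))
```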
- _propagate_value(node)¶
Helper function for propagating values
- Parameters:
node (Node object) – node in the parse tree
- _propagator_helper(node, **kwargs)¶
Helper function for traversing through the tree and propagating confidence bounds
- Parameters:
node (Node object) – node in the parse tree
- _protect_nan(bound, bound_type)¶
Treat nan as negative infinity if it occurs in a lower bound and as positive infinity if it occurs in an upper bound
- Parameters:
bound (float) – The value of the upper or lower bound
bound_type (str) – ‘lower’ or ‘upper’
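The rule described above can be sketched in a few lines. The standalone function below is an illustration with a hypothetical name; raising on an unknown `bound_type` is an added assumption.

```python
import math


def protect_nan(bound, bound_type):
    """Replace nan with the least informative value for that bound type."""
    if math.isnan(bound):
        if bound_type == "lower":
            return -math.inf
        if bound_type == "upper":
            return math.inf
        # Assumption: any other bound_type is treated as a caller error.
        raise ValueError(f"bound_type must be 'lower' or 'upper', got {bound_type!r}")
    return bound


print(protect_nan(float("nan"), "lower"))
print(protect_nan(0.3, "upper"))
```

Mapping nan to the widest possible bound keeps the interval valid, at the cost of making it uninformative.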
- _sub(a, b)¶
Subtract two confidence intervals
- Parameters:
a (tuple) – Confidence interval like: (lower,upper)
b (tuple) – Confidence interval like: (lower,upper)
- _validate_delta_vector(delta_vector)¶
Checks that the supplied delta vector has the correct length. If it does not sum to self.delta, it is normalized so that it does.
- Parameters:
delta_vector – 1D array of delta values to assign to the unique base nodes
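A minimal sketch of the length check and normalization described above. The standalone signature (passing `delta` and the number of unique base nodes explicitly) and the choice of `ValueError` are assumptions for illustration.

```python
def validate_delta_vector(delta_vector, delta, n_unique_base_nodes):
    """Check the length, then rescale so the entries sum to delta."""
    if len(delta_vector) != n_unique_base_nodes:
        raise ValueError(
            f"Expected {n_unique_base_nodes} deltas, got {len(delta_vector)}"
        )
    total = sum(delta_vector)
    if total <= 0:
        raise ValueError("delta_vector must have a positive sum")
    # Rescaling preserves the relative weights between base nodes.
    return [d * delta / total for d in delta_vector]


print(validate_delta_vector([1, 1, 2], delta=0.05, n_unique_base_nodes=3))
```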
- _validate_infl_factors(method, factors)¶
Checks to make sure supplied factors has correct dtype and size given the method.
- Parameters:
method (str) – How you want to assign the bound inflation factors to the base nodes
factors – If an integer and method==“constant”, that integer is applied to all bounds. If method==“manual”, this needs to be a vector of bound inflation factors to assign to each unique base node.
- assign_bounds_needed(**kwargs)¶
Depth-first search through the tree to decide which bounds need to be computed on each child node. It is not always necessary to compute both lower and upper bounds, because in the end all that matters is the upper bound of the root node.
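To see why only some bounds are needed, consider subtraction: the upper bound of `A - B` depends on the upper bound of `A` and the lower bound of `B`. The sketch below walks a toy tree made of tuples and strings. The tree encoding, the function name, and the conservative "request both" fallback are assumptions for illustration and do not reflect the library's node classes.

```python
def bounds_needed(node, lower_needed, upper_needed, out):
    """Record, per base variable, which of (lower, upper) must be computed.

    A node is either a base-variable name (str) or a tuple (op, child, ...).
    """
    if isinstance(node, str):
        prev = out.get(node, (False, False))
        out[node] = (prev[0] or lower_needed, prev[1] or upper_needed)
        return
    op, *children = node
    if op == "add":
        for child in children:
            bounds_needed(child, lower_needed, upper_needed, out)
    elif op == "sub":
        left, right = children
        bounds_needed(left, lower_needed, upper_needed, out)
        # The requirement flips for the term being subtracted.
        bounds_needed(right, upper_needed, lower_needed, out)
    else:
        # Other operators (e.g. abs, mult): conservatively request both.
        for child in children:
            bounds_needed(child, True, True, out)


needed = {}
# Only the upper bound of the root is required.
bounds_needed(("sub", "A", "B"), False, True, needed)
print(needed)
```

Here `A` needs only its upper bound and `B` only its lower bound, which halves the number of bounds to compute for this constraint.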
- assign_deltas(weight_method='equal', **kwargs)¶
Assign the delta values to the base nodes in the tree.
- Parameters:
weight_method (str, defaults to ‘equal’) – How you want to assign the deltas to the base nodes. The default ‘equal’ splits delta equally among unique base nodes. ‘manual’ allows specifying the weights as an array.
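The ‘equal’ method can be pictured as follows: duplicates of a base node are counted once, and delta is divided evenly among the unique ones. This is a minimal sketch with a hypothetical helper name and a plain dict standing in for the tree's bookkeeping.

```python
def split_delta_equally(delta, base_node_names):
    """Divide delta evenly among the unique base node names."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique_names = list(dict.fromkeys(base_node_names))
    share = delta / len(unique_names)
    return {name: share for name in unique_names}


# "PR | [M]" appears twice in the constraint but receives a single share.
deltas = split_delta_equally(0.05, ["PR | [M]", "PR | [F]", "PR | [M]"])
print(deltas)
```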
- assign_infl_factors(method='constant', factors=2)¶
Assign the bound inflation factors (for candidate selection) to the base nodes in the tree.
- Parameters:
method (str, defaults to ‘constant’) – How you want to assign the bound inflation factors to the base nodes. The default ‘constant’ assigns the value of factors to all bounds. If method==“manual”, then factors should be a vector of values to use for the bounds, whose length is equal to the total number of unique bounds across all base nodes.
factors – If an integer and method==“constant”, that integer is applied to all bounds. If method==“manual”, this needs to be a vector of bound inflation factors to assign to each unique base node.
- build_tree(constraint_str, delta_weight_method='equal', delta_vector=[], infl_factor_method='constant', infl_factors=2)¶
Convenience function for building the tree from a constraint string, subdividing the tree delta to deltas for each base node, and assigning which nodes need upper and lower bounding.
- Parameters:
constraint_str (str) – mathematical expression written in Python syntax from which we build the parse tree
delta_weight_method (str, defaults to 'equal') – How you want to assign the deltas to the base nodes. The default ‘equal’ splits up delta equally among unique base nodes
delta_vector – 1D array of delta values to assign to the unique base nodes.
infl_factor_method (str, defaults to 'constant') – How you want to assign the inflation factors to the base nodes. The default ‘constant’ applies a constant factor to each base node bound. ‘manual’ allows assigning unique inflation factors to each base node bound.
infl_factors – The bound inflation factors. Int if infl_factor_method=“constant”, array-like if infl_factor_method=“manual”.
- create_from_ast(s)¶
Create the node structure of the tree given a mathematical string expression, s
- Parameters:
s (str) – mathematical expression written in Python syntax from which we build the parse tree
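Since the constraint is written in Python syntax, the standard-library `ast` module can show the structure such a string has before it is turned into parse tree nodes. The snippet below only inspects the AST. The example constraint string and the use of `mode="eval"` are assumptions for illustration, and no library node classes are involved.

```python
import ast

# A conditional base variable such as PR | [M] is valid Python syntax:
# it parses as a bitwise-or between a name and a one-element list.
tree = ast.parse("abs((PR | [M]) - 0.5) - 0.1", mode="eval")
root = tree.body  # the outermost operation: <abs(...)> - 0.1

print(type(root).__name__, type(root.op).__name__)

inner = root.left.args[0]      # (PR | [M]) - 0.5
conditional = inner.left       # PR | [M]
print(type(conditional.op).__name__, conditional.left.id,
      [elt.id for elt in conditional.right.elts])
```

The left and right children of each `BinOp` correspond naturally to the left and right child attributes of a binary tree node.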
- evaluate_constraint(**kwargs)¶
Evaluate the constraint itself (not bounds). Postorder traverse (left, right, root) through the tree and calculate the values of the base nodes, then propagate the values using propagation logic
- make_viz(title)¶
Make a graphviz diagram from a root node
- Parameters:
title (str) – The title you want to display at the top of the graph
- make_viz_helper(root, graph)¶
Helper function for make_viz(). Recurses through the parse tree and adds nodes and edges to the graph
- Parameters:
root (Node object) – root of the parse tree
graph (graphviz.Digraph object) – The graphviz graph object
- propagate(node)¶
Helper function for propagating confidence bounds
- Parameters:
node (Node object) – node in the parse tree
- propagate_bounds(**kwargs)¶
Postorder traverse (left, right, root) through the tree and calculate confidence bounds on base nodes, then propagate bounds using propagation logic
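A postorder traversal visits both children before their parent, so each internal node can combine bounds that are already available. The sketch below uses a hypothetical `SketchNode` class with pre-filled leaf bounds and supports only addition and subtraction. It illustrates the traversal order, not the library's actual node classes or bound computation.

```python
class SketchNode:
    """Toy binary tree node; leaves carry bounds, internal nodes an operator."""

    def __init__(self, name, left=None, right=None, bounds=None):
        self.name = name
        self.left = left
        self.right = right
        self.bounds = bounds


def propagate_postorder(node):
    """Fill in node.bounds for every internal node, children first."""
    if node is None:
        return
    propagate_postorder(node.left)
    propagate_postorder(node.right)
    if node.left is None and node.right is None:
        return  # leaf: bounds were supplied up front
    a, b = node.left.bounds, node.right.bounds
    if node.name == "add":
        node.bounds = (a[0] + b[0], a[1] + b[1])
    elif node.name == "sub":
        node.bounds = (a[0] - b[1], a[1] - b[0])
    else:
        raise ValueError(f"unsupported operator: {node.name}")


# Represents: A - 0.125, where A has bounds (0.25, 0.5)
root = SketchNode(
    "sub",
    left=SketchNode("A", bounds=(0.25, 0.5)),
    right=SketchNode("0.125", bounds=(0.125, 0.125)),
)
propagate_postorder(root)
print(root.bounds)
```

The upper bound of the root is the quantity that ultimately matters for a constraint of the form g <= 0.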
- reset_base_node_dict(reset_data=False)¶
Reset base node dict so that any bounds or values stored are removed. However, keeps the delta values and bound inflation factors for each bound that were set when the tree was built. If those need to be reset, create an entirely new parse tree.
- Parameters:
reset_data (bool) – Whether to reset the cached data for each base node. This is needed less frequently than one needs to reset the bounds.