seldonian.models.trees.sktree_model.SeldonianDecisionTree

class SeldonianDecisionTree(**dt_kwargs)

Bases: ClassificationModel

__init__(**dt_kwargs)

A Seldonian decision tree model that re-labels leaf node probabilities from a vanilla decision tree built using SKLearn’s DecisionTreeClassifier object.

Variables:

classifier – The SKLearn classifier object

__repr__()

Return repr(self).

Methods

fit(features, labels, **kwargs)

A wrapper around SKLearn’s fit() method. Returns the leaf node probabilities of SKLearn’s built tree.

Parameters:
  • features (numpy ndarray) – Features

  • labels (1D numpy array) – Labels

Returns:

Leaf node probabilities (of predicting the positive class only), ordered from left to right
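
The left-to-right ordering described above can be illustrated with plain scikit-learn. This is a minimal sketch, not the library's own implementation: the helper name leaf_probs_left_to_right and the synthetic dataset are assumptions made for illustration. It walks the fitted tree depth-first, visiting the left child before the right, and normalizes each leaf's per-class values to get the positive-class probability.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def leaf_probs_left_to_right(clf):
    """Positive-class probability of every leaf, visiting leaves left to right.

    Hypothetical helper for illustration; assumes a binary classifier.
    """
    tree = clf.tree_
    probs = []
    stack = [0]  # node 0 is the root
    while stack:
        node = stack.pop()
        left = tree.children_left[node]
        right = tree.children_right[node]
        if left == right:  # sklearn marks a leaf with both children set to -1
            value = tree.value[node][0]  # per-class counts or fractions
            probs.append(value[1] / value.sum())
        else:
            stack.append(right)  # pushed first so the left child is popped first
            stack.append(left)
    return np.array(probs)


# Synthetic binary classification data (an assumption, not from the source)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
leaf_probs = leaf_probs_left_to_right(clf)
```

Normalizing by value.sum() keeps the sketch independent of whether the installed scikit-learn version stores counts or fractions in tree_.value.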

forward_pass(X)

Do a forward pass through the sklearn model.

Parameters:

X (numpy ndarray) – model features

Returns:

probs_pos_class: the vector of probabilities, leaf_nodes_hit: the ids of the leaf nodes that were hit by each sample. These are needed for computing the Jacobian.
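
The two return values can be reproduced with scikit-learn alone. The following is a rough sketch under stated assumptions: forward_pass_sketch is a hypothetical stand-in, not the method's actual body, and the data is synthetic. It uses the classifier's apply() method to find the leaf each sample lands in, then reads the positive-class probability from that leaf's stored values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def forward_pass_sketch(clf, X):
    """Return (probs_pos_class, leaf_nodes_hit) for a fitted binary tree.

    Hypothetical stand-in for illustration only.
    """
    leaf_nodes_hit = clf.apply(X)  # id of the leaf node each sample ends up in
    values = clf.tree_.value[leaf_nodes_hit, 0, :]  # shape (n_samples, n_classes)
    probs_pos_class = values[:, 1] / values.sum(axis=1)
    return probs_pos_class, leaf_nodes_hit


# Synthetic binary classification data (an assumption, not from the source)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
probs_pos_class, leaf_nodes_hit = forward_pass_sketch(clf, X)
```

For an unmodified tree, these probabilities should agree with the second column of clf.predict_proba(X).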

get_jacobian(ans, theta, X)

Return the Jacobian d(forward_pass)_i/dtheta_j, where i runs over datapoints and j runs over model parameters.

Parameters:
  • ans – The result of the forward pass function evaluated on theta and X

  • theta – The weight vector, which isn’t used in this method

  • X – The features

Returns:

J, the Jacobian matrix
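
To show the shape and meaning of such a Jacobian, here is a simplified sketch. It assumes the parameters are the leaf values themselves, so that entry (i, j) is 1 when sample i lands in leaf j and 0 otherwise. That is an assumption made for illustration: if the parameters are weights that are transformed into probabilities, each column would pick up an extra chain-rule factor. The function name indicator_jacobian and the leaf ids are hypothetical.

```python
import numpy as np


def indicator_jacobian(leaf_nodes_hit, leaf_ids):
    """J[i, j] = 1.0 if sample i hit the j-th leaf (left to right), else 0.0.

    Hypothetical helper; assumes the parameters are the leaf values directly.
    """
    leaf_nodes_hit = np.asarray(leaf_nodes_hit)
    leaf_ids = np.asarray(leaf_ids)
    # Broadcasting compares every sample's leaf id against every leaf id
    return (leaf_nodes_hit[:, None] == leaf_ids[None, :]).astype(float)


leaf_ids = [2, 3, 5, 6]            # made-up leaf node ids, ordered left to right
leaf_nodes_hit = [3, 6, 2, 3, 5]   # made-up leaf id hit by each of 5 samples
J = indicator_jacobian(leaf_nodes_hit, leaf_ids)

# With this J, multiplying by the leaf values recovers each sample's prediction
leaf_values = np.array([0.1, 0.4, 0.7, 0.9])
per_sample = J @ leaf_values
```

Each row of J contains exactly one nonzero entry, because every sample ends in exactly one leaf.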

get_leaf_node_probs()

Retrieve the leaf node probabilities of the current tree, ordered from left to right.

predict(theta, X, **kwargs)

Call the autograd primitive (a workaround since our forward pass involves an external library)

Parameters:
  • theta (numpy ndarray) – model weights (not probabilities)

  • X (numpy ndarray) – model features

Returns:

model predictions (numpy ndarray, same shape as labels)

set_leaf_node_values(probs)

Update the leaf node probabilities (in practice, the per-label numbers stored in each leaf node).

Parameters:

probs – The vector of probabilities to set on the leaf nodes from left to right