seldonian.models.trees.skrandomforest_model.SeldonianRandomForest

class SeldonianRandomForest(**rf_kwargs)

Bases: ClassificationModel

__init__(**rf_kwargs)

A Seldonian random forest model that re-labels the leaf node probabilities of the vanilla decision trees built by SKLearn’s RandomForestClassifier object.

Variables:
  • classifier – The SKLearn classifier object

  • n_trees – The number of decision trees in the forest
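
A minimal construction sketch is shown below. It assumes that the **rf_kwargs are forwarded to SKLearn’s RandomForestClassifier (the keyword names n_estimators and max_depth are therefore assumptions here, not part of this class’s documented signature).

    from seldonian.models.trees.skrandomforest_model import SeldonianRandomForest

    # Hypothetical keyword arguments, assumed to be passed through to
    # sklearn.ensemble.RandomForestClassifier
    model = SeldonianRandomForest(n_estimators=5, max_depth=3)
    print(model.n_trees)     # number of decision trees in the forest
    print(model.classifier)  # the underlying SKLearn classifier object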

__repr__()

Return repr(self).

Methods

fit(features, labels, **kwargs)

A wrapper around SKLearn’s fit() method. Returns the leaf node probabilities of the trees SKLearn builds for the forest. Assigns leaf node ids to a list of lists, where each sublist contains the ids for a single tree, ordered from left to right.

Parameters:
  • features (numpy ndarray) – Features

  • labels (1D numpy array) – Labels

Returns:

Flattened array of leaf node probabilities (of predicting the positive class) for all trees, ordered left to right within a given tree.
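
A minimal usage sketch for fit(), again assuming the constructor keyword arguments are passed through to SKLearn’s RandomForestClassifier; the synthetic data below is purely illustrative.

    import numpy as np

    from seldonian.models.trees.skrandomforest_model import SeldonianRandomForest

    rng = np.random.default_rng(0)
    features = rng.normal(size=(200, 4))       # 2D feature matrix
    labels = (features[:, 0] > 0).astype(int)  # 1D binary labels

    model = SeldonianRandomForest(n_estimators=5, max_depth=3)
    leaf_probs = model.fit(features, labels)   # flattened positive-class leaf probabilities
    print(leaf_probs.shape)                    # one entry per leaf node, across all trees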

forward_pass(X)

Predict the probability of the positive class for each sample in X.

Parameters:

X – Feature matrix

Returns:

probs_pos_class, the vector of positive-class probabilities, and leaf_nodes_hit, the ids of the leaf nodes that were hit by each sample. The leaf node ids are needed for computing the Jacobian.
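
Continuing from the fit() sketch above (so model and features are assumed to be defined as there); unpacking the result into two values is an assumption based on the description of the return value.

    probs_pos_class, leaf_nodes_hit = model.forward_pass(features)
    print(probs_pos_class.shape)  # one positive-class probability per sample
    print(len(leaf_nodes_hit))    # leaf node ids hit by the samples, kept for the Jacobian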

get_jacobian(ans, theta, X)

Return the Jacobian d(forward_pass)_i/dtheta_{j+1}, where i runs over datapoints and j runs over model parameters. Here, the forward pass is 1/n * sum_k { forward_k(theta, X) }, where forward_k is the forward pass of a single decision tree. We can therefore compute the Jacobian for each tree separately, horizontally stack the results, and scale by 1/n.

Parameters:
  • ans – The result of the forward pass function evaluated on theta and X

  • theta – The weight vector, which isn’t used in this method

  • X – The features

Returns:

J, the Jacobian matrix
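
The stacking described above can be illustrated with plain NumPy, independent of the toolkit: dummy per-tree Jacobians J_k are horizontally stacked and scaled by 1/n, where n is the number of trees. This sketches only the structure of the result, not the toolkit’s implementation.

    import numpy as np

    n_trees, n_samples, n_leaves_per_tree = 3, 5, 4
    rng = np.random.default_rng(0)

    # Dummy per-tree Jacobians: J_k[i, j] = d forward_k_i / d theta_kj
    per_tree_jacobians = [rng.random((n_samples, n_leaves_per_tree)) for _ in range(n_trees)]

    # Full Jacobian of the averaged forward pass: horizontally stack and scale by 1/n
    J = np.hstack(per_tree_jacobians) / n_trees
    print(J.shape)  # (n_samples, n_trees * n_leaves_per_tree)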

get_leaf_node_probs()

Retrieve the leaf node probabilities from the current forest of trees, ordered left to right within each tree.

Returns:

Flattened array of leaf node probabilities (of predicting the positive class) for all trees, ordered left to right within a given tree.
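
Continuing from the fit() sketch above, the current leaf probabilities can be read back at any point:

    current_probs = model.get_leaf_node_probs()
    print(current_probs)  # flattened positive-class leaf probabilities, tree by tree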

predict(theta, X, **kwargs)

Call the autograd primitive (a workaround, since our forward pass involves an external library).

Parameters:
  • theta (numpy ndarray) – model weights (not probabilities)

  • X (numpy ndarray) – model features

Returns:

model predictions

Return type:

numpy ndarray same shape as labels
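
A usage sketch, continuing from the fit() sketch above. The parameterization of theta is an assumption: a weight vector with one entry per leaf node across all trees (the docs note these are weights, not probabilities), initialized to zeros purely for illustration.

    import numpy as np

    theta = np.zeros(len(leaf_probs))        # hypothetical initial weights, one per leaf node
    y_pred = model.predict(theta, features)  # positive-class probability per sample
    print(y_pred.shape)                      # numpy ndarray, same shape as labels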

set_leaf_node_values(probs)

Update the leaf node values, i.e., the number of samples in each leaf that get categorized as 0 or 1, using the new probabilities, probs.

Parameters:

probs – A flattened array of the leaf node probabilities from all trees
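
Continuing from the fit() sketch above, new probabilities are written back into the SKLearn trees. Re-using the fitted probabilities here is purely illustrative; in practice these would presumably be the re-labeled probabilities produced by the Seldonian optimization.

    model.set_leaf_node_values(leaf_probs)  # write new probabilities into the leaf nodes
    print(model.get_leaf_node_probs())      # should now reflect the values just set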