Step
A single step in a machine learning pipeline that wraps a scikit-learn estimator.
This class represents an individual processing step that can either transform data (transformers like StandardScaler, SelectKBest) or make predictions (classifiers like KNeighborsClassifier, LinearSVC). Steps can be combined into Pipeline objects to create complex ML workflows.
Parameters
typ
type
The scikit-learn estimator class (must inherit from BaseEstimator).
required
name
str
A unique name for this step. If None, generates a name from the class name and ID.
None
params_tuple
tuple
Tuple of (parameter_name, parameter_value) pairs for the estimator. Parameters are automatically sorted for consistency.
()
Attributes
typ
type
The scikit-learn estimator class.
name
str
The unique name for this step in the pipeline.
params_tuple
tuple
Sorted tuple of parameter key-value pairs.
Examples
Create a scaler step:
>>> from xorq.ml import Step
>>> from sklearn.preprocessing import StandardScaler
>>> scaler_step = Step(typ=StandardScaler, name="scaler")
>>> scaler_step.instance
Create a classifier step with parameters:
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn_step = Step(
...     typ=KNeighborsClassifier,
...     name="knn",
...     params_tuple=(("n_neighbors", 5), ("weights", "uniform")),
... )
>>> knn_step.instance
Notes
The Step class is frozen (immutable) using attrs.
All estimators must inherit from sklearn.base.BaseEstimator.
Parameter tuples are automatically sorted for hash consistency.
A step is fitted to data with the fit() method, which returns a FittedStep.
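The immutability and sorting notes above can be illustrated with a small standalone sketch. It does not use xorq or attrs: `StepSketch` and `sort_params` are hypothetical names, and the standard-library `dataclasses` module stands in for attrs. The point is only that sorting the parameter pairs makes two steps built with the same parameters in a different order compare and hash equally.

```python
from dataclasses import dataclass, FrozenInstanceError


def sort_params(pairs):
    """Return the (name, value) pairs ordered by parameter name."""
    return tuple(sorted(pairs, key=lambda kv: kv[0]))


@dataclass(frozen=True)
class StepSketch:
    # Hypothetical stand-in for Step; not the xorq implementation.
    name: str
    params_tuple: tuple = ()

    def __post_init__(self):
        # A frozen dataclass blocks normal assignment, so the sorted
        # tuple is stored through object.__setattr__.
        object.__setattr__(self, "params_tuple", sort_params(self.params_tuple))


a = StepSketch("knn", (("weights", "uniform"), ("n_neighbors", 5)))
b = StepSketch("knn", (("n_neighbors", 5), ("weights", "uniform")))

# Same parameters in a different order give equal, equally hashed objects.
same_hash = hash(a) == hash(b)

# Attribute assignment on a frozen instance raises an error.
try:
    a.name = "other"
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the instances are hashable and order-insensitive, they can serve as dictionary or cache keys, which is presumably why the notes mention hash consistency.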
Methods
fit
fit(expr, features=None, target=None, storage=None, dest_col=None)
Fit this step to the given expression data.
Parameters
expr
Expr
The xorq expression containing the training data.
required
features
tuple of str
Column names to use as features. If None, infers from expr.columns.
None
target
str
Target column name. Required for prediction steps.
None
storage
Storage
Storage backend for caching fitted models.
None
dest_col
str
Destination column name for transformed output.
None
Returns
FittedStep
A fitted step that can transform or predict on new data.
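The defaulting of `features` described above might work roughly as follows. This is a sketch under stated assumptions, not xorq's code: `resolve_features` is a hypothetical helper, and excluding the target column from the inferred features is an assumption that the text does not state.

```python
def resolve_features(columns, features=None, target=None):
    """Pick the feature columns for a fit call.

    Hypothetical helper: explicit features win; otherwise every column
    except the target is used (the exclusion is an assumption).
    """
    if features is not None:
        return tuple(features)
    return tuple(col for col in columns if col != target)


columns = ["sepal_length", "sepal_width", "species"]

# No features given: infer them from the available columns.
inferred = resolve_features(columns, target="species")

# Explicit features are passed through unchanged.
explicit = resolve_features(columns, features=("sepal_length",), target="species")
```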
from_fit_predict
from_fit_predict(fit, predict, return_type, klass_name=None, name=None)
Create a Step from custom fit and predict functions.
Parameters
fit
callable
Function to fit the model.
required
predict
callable
Function to make predictions.
required
return_type
DataType
The return type for predictions.
required
klass_name
str
Name for the generated estimator class.
None
name
str
Name for the step.
None
Returns
Step
A new Step with a dynamically created estimator type.
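One way to picture a "dynamically created estimator type" is a class built at runtime with `type()` from the two callables. The sketch below is illustrative only: `make_estimator_type` is a hypothetical helper, the callable signatures `fit(X, y)` and `predict(model, X)` are assumptions, and the real method also takes a `return_type` and requires a BaseEstimator subclass, both omitted here.

```python
def make_estimator_type(fit, predict, klass_name="CustomEstimator"):
    """Build a class whose fit/predict methods delegate to plain functions."""

    def _fit(self, X, y):
        # Keep whatever the user's fit function returns as the model.
        self.model_ = fit(X, y)
        return self

    def _predict(self, X):
        return predict(self.model_, X)

    return type(klass_name, (object,), {"fit": _fit, "predict": _predict})


def fit_mean(X, y):
    """Toy 'model': the mean of the training targets."""
    return sum(y) / len(y)


def predict_mean(model, X):
    """Predict the stored mean for every input row."""
    return [model for _ in X]


MeanEstimator = make_estimator_type(fit_mean, predict_mean, klass_name="MeanEstimator")
est = MeanEstimator().fit([[1], [2], [3]], [10.0, 20.0, 30.0])
preds = est.predict([[4], [5]])
```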
from_instance_name
from_instance_name(instance, name=None, deep=False)
Create a Step from an existing scikit-learn estimator instance.
Parameters
instance
object
A scikit-learn estimator instance.
required
name
str
Name for the step. If None, generates from instance class name.
None
Returns
Step
A new Step wrapping the estimator instance.
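Wrapping an existing instance plausibly amounts to recording its class and its current parameters. The sketch below shows that idea without xorq or scikit-learn: `TinyEstimator` and `describe_instance` are hypothetical, the `get_params(deep=...)` method mimics the scikit-learn convention, and the lower-cased default name is an assumption about how a name could be derived from the class name.

```python
class TinyEstimator:
    """Hypothetical stand-in for a scikit-learn estimator."""

    def __init__(self, n_neighbors=5, weights="uniform"):
        self.n_neighbors = n_neighbors
        self.weights = weights

    def get_params(self, deep=False):
        # Mirrors the scikit-learn convention of returning constructor params.
        return {"n_neighbors": self.n_neighbors, "weights": self.weights}


def describe_instance(instance, name=None, deep=False):
    """Return (class, name, sorted parameter pairs) for an estimator instance."""
    typ = type(instance)
    params_tuple = tuple(sorted(instance.get_params(deep=deep).items()))
    if name is None:
        # Assumed naming rule, for illustration only.
        name = typ.__name__.lower()
    return typ, name, params_tuple


typ, name, params_tuple = describe_instance(TinyEstimator(n_neighbors=3))
```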
from_name_instance
from_name_instance(name, instance, deep=False)
Create a Step from a name and estimator instance.
Parameters
name
str
Name for the step.
required
instance
object
A scikit-learn estimator instance.
required
Returns
Step
A new Step wrapping the estimator instance.
set_params
set_params(**kwargs)
Create a new Step with updated parameters.
Parameters
**kwargs
Parameter names and values to update.
{}
Returns
Step
A new Step instance with updated parameters.
Examples
>>> from xorq.ml import Step
>>> from sklearn.neighbors import KNeighborsClassifier
>>> knn_step = Step(typ=KNeighborsClassifier, name="knn")
>>> updated_step = knn_step.set_params(n_neighbors=10, weights="distance")
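Since a Step is immutable, an update has to produce a new parameter tuple instead of modifying the existing one. A minimal sketch of that merge, independent of xorq (`updated_params` is a hypothetical helper, not part of the library):

```python
def updated_params(params_tuple, **kwargs):
    """Merge new values into a tuple of (name, value) pairs.

    Returns a new sorted tuple; the input tuple is left untouched.
    """
    merged = dict(params_tuple)
    merged.update(kwargs)
    return tuple(sorted(merged.items()))


original = (("n_neighbors", 5), ("weights", "uniform"))

# Override one value and add a parameter that was not set before.
updated = updated_params(original, n_neighbors=10, algorithm="auto")
```

The original tuple keeps its old values, which matches the documented behaviour that set_params returns a new Step.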