Custom Estimators
FlowMatchingBDT accepts any scikit-learn-compatible regressor via the estimator parameter. Each flow step clones this estimator and, by default, wraps it in MultiOutputRegressor so that it can predict a multi-dimensional velocity.
from sklearn.datasets import make_moons
data, _ = make_moons(n_samples=500, noise=0.1, random_state=0)
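The clone-and-wrap behaviour described above can be illustrated with scikit-learn alone. The following is a minimal sketch, not the library's internal code: n_steps, step_models, and the stand-in targets are hypothetical names and values chosen for illustration, and the targets are not real velocity targets.

```python
from sklearn.base import clone
from sklearn.datasets import make_moons
from sklearn.multioutput import MultiOutputRegressor
from sklearn.tree import DecisionTreeRegressor

data, _ = make_moons(n_samples=500, noise=0.1, random_state=0)

# The user-supplied template estimator; it is never fitted directly.
base = DecisionTreeRegressor(max_depth=3, random_state=0)

# Assumption: one independent, unfitted copy per flow step (n_steps is illustrative).
n_steps = 3
step_models = [MultiOutputRegressor(clone(base)) for _ in range(n_steps)]

# Stand-in 2D targets with the same shape as the data (not real velocity targets).
targets = -data
step_models[0].fit(data, targets)

pred = step_models[0].predict(data[:5])
print(pred.shape)                        # (5, 2)
print(len(step_models[0].estimators_))   # 2: one fitted tree per output dimension
```

Because every step works on its own clone, fitting one step leaves both the template and the other steps unfitted.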
Using Decision Tree
from sklearn.tree import DecisionTreeRegressor
from flowmatching_bdt import FlowMatchingBDT
model = FlowMatchingBDT(
    estimator=DecisionTreeRegressor(),
    n_flow_steps=5,
    n_duplicates=10,
)
model.fit(data)
Using Extra Trees
from sklearn.ensemble import ExtraTreesRegressor
from flowmatching_bdt import FlowMatchingBDT
model = FlowMatchingBDT(
    estimator=ExtraTreesRegressor(n_estimators=10),
    n_flow_steps=5,
    n_duplicates=10,
)
model.fit(data)
Requirements for Custom Estimators
Any estimator you pass must:
- Implement fit(X, y), which is called with noised samples X and velocity targets y
- Implement predict(X), which is called during Euler integration to predict the velocity field
- Be compatible with sklearn.base.clone, because the estimator is cloned once per flow step
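The three requirements above can be sketched with a small hand-written estimator. ShrunkenLinearRegressor is a hypothetical class invented for this example, not part of FlowMatchingBDT or scikit-learn. It is exercised here through scikit-learn's own clone and MultiOutputRegressor, on synthetic data rather than real velocity targets.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, clone
from sklearn.multioutput import MultiOutputRegressor


class ShrunkenLinearRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical ridge-style regressor, used only to illustrate the interface."""

    def __init__(self, alpha=1.0):
        # Store constructor arguments unchanged so that clone() can copy them.
        self.alpha = alpha

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Append a bias column, then solve the regularised normal equations
        # (the bias is regularised too, for brevity).
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])
        gram = Xb.T @ Xb + self.alpha * np.eye(Xb.shape[1])
        self.coef_ = np.linalg.solve(gram, Xb.T @ y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])
        return Xb @ self.coef_


# Synthetic data with a two-dimensional target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([[1.0, -2.0], [0.5, 3.0]])

template = ShrunkenLinearRegressor(alpha=0.1)
step_copy = clone(template)  # works because __init__ only stores its arguments
wrapped = MultiOutputRegressor(step_copy).fit(X, y)
pred = wrapped.predict(X[:4])
print(pred.shape)  # (4, 2)
```

Inheriting from BaseEstimator supplies get_params and set_params, which is what clone relies on. An estimator that does work in __init__ beyond storing its arguments may not clone correctly.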
By default, the estimator is wrapped in MultiOutputRegressor, so it only needs to support single-output regression. If your estimator natively handles multi-output targets (for example, neural networks or scikit-learn's tree-based regressors such as DecisionTreeRegressor), set multi_output=True to skip the wrapper:
model = FlowMatchingBDT(
    estimator=DecisionTreeRegressor(),
    multi_output=True,
    n_flow_steps=5,
    n_duplicates=10,
)
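To check whether an estimator handles multi-output targets natively before setting multi_output=True, one option is to fit it directly on a two-dimensional target and inspect the prediction shape. This sketch uses only scikit-learn. The targets are stand-in values rather than real velocity targets, and the variable names are illustrative.

```python
from sklearn.datasets import make_moons
from sklearn.multioutput import MultiOutputRegressor
from sklearn.tree import DecisionTreeRegressor

data, _ = make_moons(n_samples=500, noise=0.1, random_state=0)
targets = -data  # stand-in 2D targets, not real velocity targets

# Native multi-output: a single tree predicts both output dimensions at once.
native = DecisionTreeRegressor(max_depth=3, random_state=0).fit(data, targets)

# Wrapped: one separate tree is fitted per output dimension.
wrapped = MultiOutputRegressor(
    DecisionTreeRegressor(max_depth=3, random_state=0)
).fit(data, targets)

print(native.n_outputs_)                # 2
print(native.predict(data[:5]).shape)   # (5, 2)
print(len(wrapped.estimators_))         # 2
```

An estimator that raises an error or returns a one-dimensional prediction when fitted this way should be left behind the default wrapper.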
Tip
The default HistGradientBoostingRegressor is a good choice for most tabular datasets. Try XGBoost or LightGBM if you need more control over hyperparameters or GPU acceleration.