Probability Paths
A probability path defines how samples are interpolated between the source distribution \(x_0\) (e.g. a Gaussian at \(t{=}0\)) and the target distribution \(x_1\) (real data at \(t{=}1\)). The path determines both the training targets and the dynamics of generation.
Built-in Paths
LinearPath (default)
The simplest path: linear interpolation between source and target. Samples travel in straight lines from source \(x_0\) to target \(x_1\) at constant speed: \(\mu_t = (1 - t)\,x_0 + t\,x_1\), with target velocity \(u_t = x_1 - x_0\).
from flowmatching_bdt import FlowMatchingBDT
from flowmatching_bdt.paths import LinearPath
model = FlowMatchingBDT(path=LinearPath())
PolynomialPath
A generalisation of LinearPath where the speed along each trajectory is non-uniform: \(\mu_t = (1 - t^k)\,x_0 + t^k\,x_1\), with target velocity \(u_t = k\,t^{k-1}\,(x_1 - x_0)\).
For k > 1, samples linger near the source distribution early on and accelerate towards the data manifold. For k = 1, this is equivalent to LinearPath.
from flowmatching_bdt import FlowMatchingBDT
from flowmatching_bdt.paths import PolynomialPath
model = FlowMatchingBDT(path=PolynomialPath(k=2.0))
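To get a feel for how k shapes a trajectory, the sketch below tabulates the schedule and its speed at a few times. It assumes the polynomial schedule is \(s(t) = t^k\), which matches the behaviour described above but has not been checked against the library source. The helpers `poly_schedule` and `poly_speed` are illustrative names, not part of flowmatching_bdt.

```python
def poly_schedule(t, k):
    """Assumed schedule s(t) = t**k (illustrative, not the library's code)."""
    return t ** k


def poly_speed(t, k):
    """Derivative ds/dt = k * t**(k - 1) of the assumed schedule."""
    return k * t ** (k - 1)


# Compare constant speed (k = 1) with a slow start (k = 2).
for k in (1.0, 2.0):
    for t in (0.25, 0.5, 0.75):
        s = poly_schedule(t, k)
        v = poly_speed(t, k)
        print(f"k={k} t={t} s={s:.4f} ds/dt={v:.4f}")
```

With k = 2 the sample has covered only a quarter of the distance by t = 0.5, while its speed keeps growing towards t = 1.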
Creating a Custom Path
Subclass ProbabilityPath and implement two methods:
- compute_mu_t(x0, x1, t): the interpolated point at time t
- compute_flow(x0, x1, t, xt): the target velocity field
Example: Cosine Schedule Path
A cosine schedule moves slowly near both endpoints and fastest at the midpoint:
import numpy as np
from flowmatching_bdt.paths import ProbabilityPath, pad_t_like_x


class CosinePath(ProbabilityPath):
    """Cosine-schedule interpolation between source and target.

    mu_t = (1 - s(t)) * x0 + s(t) * x1
    where s(t) = (1 - cos(pi * t)) / 2

    The velocity is ds/dt * (x1 - x0) = (pi/2) * sin(pi * t) * (x1 - x0).
    """

    def compute_mu_t(self, x0, x1, t):
        t = pad_t_like_x(t, x0)
        s = (1 - np.cos(np.pi * t)) / 2
        return (1 - s) * x0 + s * x1

    def compute_flow(self, x0, x1, t, xt):
        t = pad_t_like_x(t, x0)
        dsdt = np.pi / 2 * np.sin(np.pi * t)
        return dsdt * (x1 - x0)
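A common mistake in a custom path is a compute_flow that does not match the time derivative of compute_mu_t. One way to catch this is a finite-difference check on the schedule itself. The sketch below is a standalone illustration using only the standard library; `cosine_schedule`, `cosine_speed`, and `max_derivative_error` are hypothetical helper names and do not call into flowmatching_bdt.

```python
import math


def cosine_schedule(t):
    """s(t) = (1 - cos(pi * t)) / 2, the schedule used by CosinePath."""
    return (1 - math.cos(math.pi * t)) / 2


def cosine_speed(t):
    """Analytic derivative ds/dt = (pi / 2) * sin(pi * t)."""
    return math.pi / 2 * math.sin(math.pi * t)


def max_derivative_error(schedule, speed, n=99, h=1e-6):
    """Largest gap between the analytic speed and a central difference."""
    worst = 0.0
    for i in range(1, n + 1):
        t = i / (n + 1)  # interior points, so t - h and t + h stay in [0, 1]
        numeric = (schedule(t + h) - schedule(t - h)) / (2 * h)
        worst = max(worst, abs(numeric - speed(t)))
    return worst


error = max_derivative_error(cosine_schedule, cosine_speed)
print(f"max |finite difference - analytic| = {error:.2e}")
```

A large error here would indicate that the velocity and the interpolant are inconsistent, which would give the model incorrect training targets.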
Using Your Custom Path
from sklearn.datasets import make_moons
from flowmatching_bdt import FlowMatchingBDT
data, _ = make_moons(n_samples=500, noise=0.05, random_state=0)
model = FlowMatchingBDT(path=CosinePath(), n_flow_steps=5, n_duplicates=10)
model.fit(data)
samples = model.predict(num_samples=500)
Path Design Guidelines
All paths above follow the general form \(\mu_t = (1 - s(t))\,x_0 + s(t)\,x_1\), where \(s(t)\) is a schedule function that controls the interpolation speed. When designing a custom path, keep these properties in mind:
- Boundary conditions: \(\mu_t\) should satisfy \(\mu_0 = x_0\) (source) and \(\mu_1 = x_1\) (target).
- Smoothness: The velocity \(u_t\) should be finite everywhere in \((0, 1)\). Paths where \(u_t\) diverges near \(t = 0\) (e.g. PolynomialPath with \(k < 1\)) can produce very large training targets at the first flow step.
- Monotonicity: The schedule function \(s(t)\) mapping \([0, 1] \to [0, 1]\) should be monotonically increasing.
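The boundary and monotonicity guidelines can be checked numerically before training. The following is a minimal sketch, assuming the path has the general form above with a scalar schedule \(s(t)\). The helper `check_schedule` is a hypothetical name, not a flowmatching_bdt function, and it samples a grid rather than proving the property, so it does not cover the smoothness guideline.

```python
import math


def check_schedule(s, n=1001, tol=1e-9):
    """Check boundary conditions and monotonicity of s on a grid over [0, 1]."""
    values = [s(i / (n - 1)) for i in range(n)]
    return {
        "starts_at_0": abs(values[0]) <= tol,
        "ends_at_1": abs(values[-1] - 1.0) <= tol,
        "monotone": all(b >= a - tol for a, b in zip(values, values[1:])),
    }


def cosine(t):
    """Schedule used by CosinePath."""
    return (1 - math.cos(math.pi * t)) / 2


def bump(t):
    """Deliberately invalid schedule: rises, then falls back to 0."""
    return math.sin(math.pi * t)


print(check_schedule(cosine))
print(check_schedule(bump))
```

The cosine schedule passes all three checks, while the bump schedule fails both the endpoint and the monotonicity check.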