Whittaker Smoother API¶
The Whittaker smoother is the special case of P-splines where the basis matrix \(B = I\) (identity). Instead of estimating B-spline coefficients, the smoother operates directly on the data vector \(y\), solving:
For non-uniformly spaced data, the standard difference operator \(D\) is replaced by the divided-difference operator \(D_x\) which weights each finite difference by the reciprocal of the gap in \(x\). This ensures the roughness penalty is expressed in the natural units of the independent variable rather than index position.
WhittakerSmoother Class¶
Whittaker smoother with x-aware penalty for non-uniform data.
Solves (W + λ D'D) z = W y where D is either the standard
difference operator (uniform spacing) or the divided-difference
operator (non-uniform spacing).
Parameters:
-
x((array - like, shape(n))) –Sample positions — must be finite. Need not be sorted or uniformly spaced; the smoother sorts internally.
-
y((array - like, shape(n))) –Observed values at each x.
-
lambda_(float, default:100.0) –Smoothing parameter (> 0).
-
penalty_order(int, default:2) –Difference order for the roughness penalty (1 or 2, default 2).
-
weights((array - like, shape(n)), default:None) –Non-negative observation weights. Zero means "missing".
fit ¶
fit() -> WhittakerSmoother
Fit the smoother via a single sparse solve.
Solves (W + λ D'D) z = W y and stores results in
:attr:fitted_values, :attr:ED, and :attr:se_fitted.
Returns:
-
WhittakerSmoother–self, for method-chaining.
Source code in src/psplines/whittaker.py
predict ¶
Interpolate fitted values at new x-locations.
Uses linear interpolation on the fitted (sorted) grid.
Parameters:
-
x_new(array - like) –Query positions.
Returns:
-
NDArray–Interpolated smoothed values.
Source code in src/psplines/whittaker.py
cross_validation ¶
Select lambda via generalized cross-validation (GCV).
Minimises GCV(λ) = (RSS / n) / (1 - ED / n)² over a
bounded search in log₁₀(λ).
Parameters:
-
lambda_bounds((float, float), default:(1e-06, 1000000.0)) –Lower and upper bounds for the λ search.
Returns:
-
best_lambda(float) – -
best_score(float) –
Source code in src/psplines/whittaker.py
v_curve ¶
v_curve(
lambda_bounds: tuple[float, float] = (1e-06, 1000000.0),
num_lambda: int = 81,
) -> tuple[float, float]
Select lambda via the V-curve minimum-distance criterion.
Sweeps a log-uniform grid of λ values, computes
(log RSS, log penalty) at each, and picks the λ where
consecutive-point distance is minimised.
Parameters:
-
lambda_bounds((float, float), default:(1e-06, 1000000.0)) –Lower and upper bounds for the λ grid.
-
num_lambda(int, default:81) –Number of grid points.
Returns:
-
best_lambda(float) – -
best_score(float) –Minimum distance value.
Source code in src/psplines/whittaker.py
Usage Examples¶
Basic Smoothing¶
import numpy as np
from psplines import WhittakerSmoother
# Noisy signal
x = np.linspace(0, 10, 200)
y = np.sin(x) + 0.3 * np.random.default_rng(42).standard_normal(200)
# Fit with a fixed lambda
ws = WhittakerSmoother(x, y, lambda_=1e4, penalty_order=2)
ws.fit()
# Smoothed values (same length as input, in original order)
z = ws.fitted_values
Automatic Lambda Selection¶
# GCV-based selection (recommended)
ws = WhittakerSmoother(x, y)
best_lam, gcv_score = ws.cross_validation()
print(f"Optimal λ = {best_lam:.2f}")
# V-curve selection (alternative)
ws2 = WhittakerSmoother(x, y)
best_lam, v_score = ws2.v_curve()
Non-Uniform Spacing¶
The key advantage over the standard Whittaker smoother: gaps in \(x\) are handled correctly via divided differences.
# Irregularly sampled signal
rng = np.random.default_rng(7)
x = np.sort(rng.uniform(0, 10, 150))
y = np.sin(x) + 0.2 * rng.standard_normal(150)
ws = WhittakerSmoother(x, y, lambda_=100)
ws.fit()
# The smoother automatically uses divided differences
# when x-spacing is non-uniform
Missing Data via Weights¶
Zero weights effectively mark observations as missing; the smoother interpolates through them:
x = np.linspace(0, 10, 200)
y_true = np.sin(x)
y = y_true.copy()
y[80:120] = 99.0 # corrupt a region
weights = np.ones(200)
weights[80:120] = 0.0 # mark as missing
ws = WhittakerSmoother(x, y, lambda_=1e3, weights=weights)
ws.fit()
# ws.fitted_values smoothly interpolates through the gap
Interpolation at New Points¶
ws = WhittakerSmoother(x, y, lambda_=1e3).fit()
# Predict at new locations via linear interpolation
x_new = np.linspace(0, 10, 500)
z_new = ws.predict(x_new)
Unsorted Input¶
The smoother accepts unsorted \(x\) — it sorts internally and returns results in the original order:
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, 100) # not sorted
y = np.sin(x) + 0.1 * rng.standard_normal(100)
ws = WhittakerSmoother(x, y, lambda_=1e3).fit()
# ws.fitted_values[i] corresponds to x[i], not to sorted x
When to Use WhittakerSmoother vs PSpline¶
| Criterion | WhittakerSmoother |
PSpline |
|---|---|---|
| Basis | Identity (\(B = I\)) | B-spline basis |
| Prediction at new points | Linear interpolation | Exact B-spline evaluation |
| Derivatives | Not supported | Analytic via derivative basis |
| GLM families | Gaussian only | Gaussian, Poisson, Binomial |
| Shape constraints | Not supported | Full support |
| Speed | Single sparse solve | Single sparse solve (+ basis construction) |
| Non-uniform spacing | Handled via divided differences | Handled via B-spline basis |
| Ideal for | Fast signal smoothing, spectra, time series | General smoothing, derivatives, GLMs |
Rule of thumb
Use WhittakerSmoother when you want fast, simple smoothing on a fixed grid — especially for uniformly or near-uniformly sampled signals where you don't need derivatives or fancy GLM features. Use PSpline for everything else.