Skip to content

Penalty Matrices API

The penalty module provides functions for constructing difference penalty matrices used in P-spline smoothing.

Functions

Create a sparse difference matrix of shape (n-order)×n.

Parameters:

  • n (int) –

    Length of the coefficient vector α.

  • order (int, default: 2 ) –

    Order d of the finite difference (must be >= 0).

Returns:

  • D ( csr_matrix ) –

    Sparse (n-order)×n matrix implementing d-th order differences. If order=0, returns identity; if order>=n, returns an empty matrix.

Source code in src/psplines/penalty.py
def difference_matrix(n: int, order: int = 2) -> csr_matrix:
    """
    Create a sparse difference matrix of shape (n-order)×n.

    Parameters
    ----------
    n : int
        Length of the coefficient vector α.
    order : int
        Order d of the finite difference (must be >= 0).

    Returns
    -------
    D : csr_matrix
        Sparse (n-order)×n matrix implementing d-th order differences.
        If order=0, returns identity; if order>=n, returns an empty matrix.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    if order == 0:
        # zero-order = identity operator
        return diags([np.ones(n)], [0], shape=(n, n), format="csr")
    if order >= n:
        # no valid differences
        return csr_matrix((0, n))

    # number of rows
    m = n - order
    # offsets for diagonals: 0,1,...,order
    offsets = np.arange(order + 1)
    # build each diagonal of length m
    data = [((-1) ** (order - k)) * comb(order, k) * np.ones(m) for k in offsets]
    D = diags(data, offsets, shape=(m, n), format="csr")
    return D

Usage Examples

First-Order Differences

import numpy as np
from psplines.penalty import difference_matrix

# First-order difference matrix
D1 = difference_matrix(n=5, order=1)
print(D1.toarray())
# [[-1  1  0  0  0]
#  [ 0 -1  1  0  0]
#  [ 0  0 -1  1  0]
#  [ 0  0  0 -1  1]]

Second-Order Differences

# Second-order difference matrix  
D2 = difference_matrix(n=5, order=2)
print(D2.toarray())
# [[ 1 -2  1  0  0]
#  [ 0  1 -2  1  0]
#  [ 0  0  1 -2  1]]

Higher-Order Differences

# Third-order difference matrix
D3 = difference_matrix(n=6, order=3)
print(D3.toarray())
# [[-1  3 -3  1  0  0]
#  [ 0 -1  3 -3  1  0]
#  [ 0  0 -1  3 -3  1]]

Mathematical Background

Difference Operators

The \(p\)-th order difference operator \(\Delta^p\) is defined recursively:

  • \(\Delta^0 \alpha_i = \alpha_i\) (identity)
  • \(\Delta^1 \alpha_i = \alpha_{i+1} - \alpha_i\) (first difference)
  • \(\Delta^p \alpha_i = \Delta^{p-1} \alpha_{i+1} - \Delta^{p-1} \alpha_i\)

Matrix Representation

The difference matrix \(D_p\) has entries such that \((D_p \alpha)_i = \Delta^p \alpha_i\).

For first-order differences (\(p=1\)): \(\(D_1 = \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{pmatrix}\)\)

For second-order differences (\(p=2\)): \(\(D_2 = \begin{pmatrix} 1 & -2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 1 & -2 & 1 \end{pmatrix}\)\)

Properties

  • Dimension: For \(n\) coefficients, \(D_p\) has size \((n-p) \times n\)
  • Rank: The rank of \(D_p\) is \(n-p\) (for typical cases)
  • Sparsity: Each row has at most \(p+1\) non-zero entries
  • Banded structure: All non-zeros lie within \(p+1\) diagonals

Penalty Interpretation

The penalty term \(\lambda \|D_p \alpha\|^2\) penalizes:

  • \(p=1\): Large first differences (rough slopes)
  • \(p=2\): Large second differences (rough curvature)
  • \(p=3\): Large third differences (rough rate of curvature change)

Implementation Details

Sparse Storage

  • Returns scipy.sparse.csr_matrix for efficient storage
  • Memory usage: \(O((n-p) \times (p+1))\) instead of \(O((n-p) \times n)\)
  • Fast matrix-vector multiplication

Numerical Properties

  • Condition number: Increases with penalty order
  • Null space: \(D_p\) has a null space of dimension \(p\)
  • Regularization: The penalty \(\lambda \|D_p \alpha\|^2\) regularizes the fit

Construction Algorithm

The implementation uses efficient sparse matrix construction:

  1. Compute binomial coefficients for the difference operator
  2. Build row indices, column indices, and data arrays
  3. Construct sparse matrix in COO format
  4. Convert to CSR format for efficient operations

Usage in P-Splines

In the P-spline objective function: \(\(\min_\alpha \|y - B\alpha\|^2 + \lambda \|D_p \alpha\|^2\)\)

The penalty matrix appears as \(P = \lambda D_p^T D_p\), leading to the linear system: \(\((B^T B + \lambda D_p^T D_p) \alpha = B^T y\)\)

Penalty Order Selection

  • Order 1: Good for piecewise linear trends
  • Order 2: Most common choice, penalizes curvature
  • Order 3: For very smooth curves, penalizes jerk
  • Higher orders: Rarely needed, may cause numerical issues