Functional PCA - Mathmatical Background

Below is the step-by-step mathematical description of the method. The equations are presented at the level of a single voxel time series. (In the implementation, these calculations are applied in batches for efficiency.)

B-Spline Representation of the Signal

Let \(n\) is number of the timepoints, and \(N\) is the nuber of the voxels. Each voxel’s time series, denoted by

\[y(t) \in \mathbb{R}^{n},\]

is approximated using a spline basis with \(K\) basis functions \(f_k(t)\):

\[y(t) \approx \sum_{k=1}^{K} c_k\, f_k(t),\]

or, in vector-matrix notation,

\[y \approx F\, c,\]

where:

\(F \in \mathbb{R}^{n \times K}\) is the basis matrix with entries \(F_{ik} = f_k(t_i)\),
\(c \in \mathbb{R}^{K}\) is the vector of coefficients.

Regularized Regression for Spline Coefficient Estimation

To estimate the coefficient vector \(c\), the method minimizes the penalized least-squares criterion:

\[J(c) = \|y - F\, c\|^2 + \lambda\, c^\top P\, c,\]

where:

\(\lambda\) is the regularization parameter,
\(P \in \mathbb{R}^{K \times K}\) is the penalty matrix defined as

\[P_{kl} = \int_{T_{\min}}^{T_{\max}} f_k^{(d)}(t)\, f_l^{(d)}(t)\, dt,\]

with \(d\) indicating the derivative order to be penalized (e.g., \(d=2\) to penalize curvature).

The minimizer \(c\) satisfies the penalized normal equations:

\[\left(F^\top F + \lambda P\right) c = F^\top y.\]

Selection of \(\lambda\) for each voxel using Generalized Cross-Validation (GCV)

For selection of \(\lambda\), the following is calculated separately for each voxel. The hat matrix \(H_\lambda \in \mathbb{R}^{n \times n}\) is calculated

\[H_\lambda = F\, \left(F^\top F + \lambda P\right)^{-1} F^\top.\]

Then the generalized cross-validation (GCV) score for each candidate \(\lambda\) is computed as:

\[\text{GCV}(\lambda) = \frac{n\, \| (I - H_\lambda) y \|^2}{\left[\text{trace}(I - H_\lambda)\right]^2},\]

where:

\(I\) is the \(n \times n\) identity matrix,
\(n\) is the number of timepoints.

The optimal \(\lambda\) is chosen as the value that minimizes the GCV score.

Functional Principal Component Analysis (fPCA)

After estimating the coefficient vector \(c\) for each voxel, these coefficients are stored in a matrix

\[C \in \mathbb{R}^{N \times K},\]

where \(N\) is the number of voxels.

The fPCA steps are as follows:

Centering the Coefficient Matrix:

\[\tilde{C} = C - \mu,\quad \text{with}\quad \mu = \frac{1}{N} \sum_{i=1}^{N} C_{i,:}.\]
Compute the Penalized Covariance Matrix \(\Sigma \in \mathbb{R}^{K \times K}\) in the Spline Basis:

\[\Sigma = \frac{1}{N}\, \tilde{C}^\top\, \tilde{C}\, U,\]

where the Gram matrix \(U \in \mathbb{R}^{K \times K}\) is defined as:

\[U_{kl} = \int_{T_{\min}}^{T_{\max}} f_k(t)\, f_l(t)\, dt, k,l=1,2 ... K\]

Eigen-decomposition:

Solve for the eigenvalues \(\gamma\) and eigenvectors \(\phi\):

\[\Sigma\, \phi = \gamma\, \phi.\]

The eigenvalues are sorted in descending order.
Compute Voxel Scores for the Principal Components:

For each voxel \(i\), the scores are computed as:

\[\text{scores}_i = \tilde{c}_i\, U\, \phi.\]
- The computed score indicates the contribution or importance of that voxel for the corresponding principal component.
- Using the spatial information from the original brain mask, these voxel scores are then mapped back to a volume of the brain. This produces an importance map, saved as a file with the brain’s dimensions, where each voxel’s value indicates its contribution to the fPCA component.
Recover Eigenfunctions in the Time Domain:

The eigenfunctions that describe temporal dynamics are given by:

\[\psi = F\, \phi.\]

The resulting function \(\psi(t)=F_{t,:}\,\cdot \, \phi\) is plotted as a graph (intensity plot) that illustrates the time-course of the fPCA component across the entire brain.