Scientist: denario-6 Date: 2026-04-26
A synthetic dataset of 2D random walk trajectories spanning five diffusion regimes, from normal diffusion (alpha = 1.0) to ballistic motion (alpha = 2.0). The dataset is designed to support research on anomalous diffusion characterisation, inference of the anomalous exponent alpha, and comparison of MSD-based versus trajectory-based analysis methods.
/home/node/work/projects/superdiffusion_v1/superdiffusion_trajectories.npy

- NumPy structured array, 50,000 rows (50 trajectories × 1,000 time steps)
- Fields:

| Field | dtype | Description |
|---|---|---|
| trajectory_id | int32 | Integer ID 1–50 identifying each trajectory |
| alpha | float64 | Ground-truth anomalous exponent (1.0, 1.2, 1.5, 1.8, 2.0) |
| hurst_exponent | float64 | Hurst exponent H = alpha/2 (0.5, 0.6, 0.75, 0.9, 1.0) |
| time | float64 | Observation time t = step × dt, dt = 0.01 s, range [0.01, 10.0] s |
| x | float64 | x-coordinate (noisy), metres |
| y | float64 | y-coordinate (noisy), metres |
| msd_true | float64 | Noise-free squared displacement x² + y² of the individual trajectory (not ensemble-averaged), metres² |

import numpy as np
data = np.load('/home/node/work/projects/superdiffusion_v1/superdiffusion_trajectories.npy', allow_pickle=False)
# Access by field name
traj_id = data['trajectory_id']
alpha = data['alpha']
t = data['time']
x = data['x']
y = data['y']
msd_true = data['msd_true']
# Filter to a single trajectory
mask = data['trajectory_id'] == 1
traj1 = data[mask]  # 1000 rows, alpha=1.0 (normal diffusion)

| alpha | H | Regime |
|---|---|---|
| 1.0 | 0.50 | Normal (Brownian) diffusion |
| 1.2 | 0.60 | Mild superdiffusion |
| 1.5 | 0.75 | Moderate superdiffusion |
| 1.8 | 0.90 | Strong superdiffusion |
| 2.0 | 1.00 | Ballistic motion |
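
Because the file stores all trajectories as flat rows, it can be convenient to regroup the positions by regime before analysis. The following is a minimal sketch, assuming every trajectory has the same number of rows; `positions_by_alpha` is a hypothetical helper name, not part of the dataset tooling.

```python
import numpy as np

def positions_by_alpha(data):
    """Group noisy positions by ground-truth alpha.

    Returns a dict {alpha: array of shape (n_trajectories, n_steps, 2)}.
    Assumes (not verified by the source) equal-length trajectories.
    """
    # Sort by trajectory_id first, then by time within each trajectory
    order = np.lexsort((data['time'], data['trajectory_id']))
    data = data[order]

    ids, counts = np.unique(data['trajectory_id'], return_counts=True)
    if not np.all(counts == counts[0]):
        raise ValueError("trajectories have unequal lengths")
    n_steps = counts[0]

    # Stack x and y into one (n_trajectories, n_steps, 2) array
    xy = np.stack([data['x'], data['y']], axis=-1).reshape(len(ids), n_steps, 2)
    # alpha is constant within a trajectory, so take the first row of each
    alphas = data['alpha'].reshape(len(ids), n_steps)[:, 0]

    return {float(a): xy[alphas == a] for a in np.unique(alphas)}
```

With the array loaded as `data`, `positions_by_alpha(data)[1.5]` would return all moderate-superdiffusion trajectories as one array.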
Each trajectory is a 2D fractional Brownian motion (fBm) generated via circulant embedding of the fractional Gaussian noise (fGn) autocovariance:
γ(k) = 0.5 * (|k+1|^(2H) - 2|k|^(2H) + |k-1|^(2H))
The relationship between the anomalous exponent and the Hurst exponent is: α = 2H.
The expected MSD scaling is: ⟨r²(t)⟩ ~ t^α.
Each trajectory has independent x and y components.
Gaussian measurement noise with std=0.05 m is added to x and y positions.
The msd_true field contains the noise-free squared displacement for validation.
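
The generation procedure described above can be sketched as follows. This is an illustrative reconstruction, not the script that produced the dataset: the function names are hypothetical, and the `dt ** H` amplitude scaling is an assumption, since the source does not state the generalised diffusion coefficient.

```python
import numpy as np

def fgn_autocov(k, H):
    """Unit-variance fGn autocovariance gamma(k) from the formula above."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * ((k + 1) ** (2 * H) - 2 * k ** (2 * H) + np.abs(k - 1) ** (2 * H))

def fgn_circulant(n, H, rng):
    """One unit-variance fGn sample of length n via circulant embedding."""
    gamma = fgn_autocov(np.arange(n + 1), H)
    # First row of the 2n x 2n circulant matrix: gamma(0..n), gamma(n-1..1)
    row = np.concatenate([gamma, gamma[-2:0:-1]])
    eig = np.fft.fft(row).real
    eig = np.clip(eig, 0.0, None)  # eigenvalue clipping: negatives set to zero
    m = row.size
    # Complex Gaussian weights; the real part of the FFT has covariance gamma
    z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    w = np.fft.fft(np.sqrt(eig / m) * z)
    return w.real[:n]

def simulate_trajectory(n, H, dt=0.01, noise_std=0.05, rng=None):
    """Return (noisy positions of shape (n, 2), noise-free squared displacement)."""
    rng = np.random.default_rng() if rng is None else rng
    # Independent x and y components; dt**H scaling is an assumption
    clean = np.stack(
        [np.cumsum(fgn_circulant(n, H, rng)) for _ in range(2)], axis=1
    ) * dt ** H
    noisy = clean + rng.normal(0.0, noise_std, size=clean.shape)
    sq_disp = np.sum(clean ** 2, axis=1)  # analogue of msd_true
    return noisy, sq_disp
```

For H = 1.0 the autocovariance is constant, so the embedding has a single non-zero eigenvalue and clipping only removes floating-point round-off; the increments are then identical within a component, which gives straight-line (ballistic) motion.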
- All trajectories start at the origin (x=0, y=0).
- dt = 0.01 s; total duration = 10 s; 1000 steps per trajectory.
- The fGn was generated using circulant embedding with eigenvalue clipping (negative eigenvalues set to zero), which introduces a small approximation at high H.
- Measurement noise (σ = 0.05 m) adds a constant offset of about 4σ² = 0.01 m² to the 2D time-averaged MSD at every lag, which dominates at short lags; fitting procedures should account for it.
- The msd_true field is the instantaneous squared displacement of a single trajectory (not ensemble-averaged), so it is noisy even without measurement noise.
- MSD scaling analysis: compute time-averaged MSD ⟨Δr²(τ)⟩ for each trajectory and fit the power-law slope to estimate α. Compare estimated vs ground-truth α.
- Classification: use trajectory features to classify each trajectory into its diffusion class (5-class problem).
- Hurst exponent estimation: apply DFA (detrended fluctuation analysis), R/S analysis, or wavelet-based methods to estimate H for each trajectory.
- Effect of noise: compare MSD-based α estimates from the noisy positions vs the true MSD, and quantify noise bias at short lags.
- Ergodicity breaking: compute the ergodicity-breaking parameter (EB) and examine its dependence on α.
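
As a starting point for the MSD scaling and noise-effect analyses listed above, the sketch below computes a time-averaged MSD for one trajectory and estimates α from a log-log least-squares fit. The function names are hypothetical, and the optional 4σ² offset subtraction assumes independent Gaussian noise of standard deviation σ on each coordinate; it is one simple correction among several possible ones.

```python
import numpy as np

def time_averaged_msd(xy, max_lag):
    """Time-averaged MSD of one trajectory.

    xy: array of shape (n_steps, 2). Returns (lags in steps, MSD values).
    """
    lags = np.arange(1, max_lag + 1)
    msd = np.array([
        np.mean(np.sum((xy[lag:] - xy[:-lag]) ** 2, axis=1)) for lag in lags
    ])
    return lags, msd

def fit_alpha(lags, msd, dt=0.01, noise_std=0.0):
    """Estimate alpha as the slope of log(MSD) against log(lag time).

    Subtracts the static-noise offset 4 * noise_std**2 before fitting
    and drops lags where the corrected MSD is not positive.
    """
    corrected = msd - 4.0 * noise_std ** 2
    keep = corrected > 0
    slope, _ = np.polyfit(np.log(lags[keep] * dt), np.log(corrected[keep]), 1)
    return slope
```

Calling `fit_alpha` once with `noise_std=0.0` and once with `noise_std=0.05` on the same trajectory gives a direct view of how much the short-lag noise offset biases the estimate. The choice of `max_lag` matters: large lags are averaged over few windows and are statistically unreliable.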