Transfer Learning using ahead::ridge2f on synthetic stocks returns Pt.2: synthetic data generation

Posted on September 9, 2025 by T. Moudiki in Data science

This article was first published on T. Moudiki's Webpage - Python, and kindly contributed to python-bloggers.

In https://github.com/thierrymoudiki/2025-09-05-transfer-learning-ridge2f, I pretrain ahead::ridge2f (also available in Python) on 1000 synthetic stock returns using Bayesian optimization, and test its performance (coverage rate and Winkler score for now) on real market data.
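For readers unfamiliar with the two metrics, here is a minimal sketch of how coverage rate and Winkler score are commonly computed for prediction intervals. The function names `coverage_rate` and `winkler_score` and the `alpha` parameter are illustrative assumptions. The code follows the usual textbook definition of the Winkler score, which may differ in detail from what the repository does.

```python
import numpy as np

def coverage_rate(y, lower, upper):
    """Fraction of observations falling inside [lower, upper]."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))

def winkler_score(y, lower, upper, alpha=0.05):
    """Mean Winkler score for nominal (1 - alpha) prediction intervals.

    Score = interval width, plus a penalty of (2 / alpha) times the distance
    by which the observation falls outside the interval. Lower is better.
    """
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    width = upper - lower
    below = np.maximum(lower - y, 0.0)  # shortfall below the lower bound
    above = np.maximum(y - upper, 0.0)  # excess above the upper bound
    return float(np.mean(width + (2.0 / alpha) * (below + above)))
```

A narrow interval that still contains the observation gets a low score, while a miss is penalized in proportion to its size, so the score rewards intervals that are both sharp and well calibrated.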

Here’s how I obtained the synthetic stock returns:

The model simulates asset returns over time by combining stochastic volatility, regime switching, jumps, and microstructure noise. At each time step \(t\), the return \(r_t\) is given by:

\[r_t = \mu + \sqrt{V_t} \cdot \epsilon_t + J_t + \epsilon_{\text{noise}}\]

where \(\mu\) is the drift term, \(\sqrt{V_t}\) is the volatility at time \(t\), \(\epsilon_t \sim \mathcal{N}(0, 1)\) is a random shock from a standard normal distribution, \(J_t\) is the jump component, and \(\epsilon_{\text{noise}}\) represents microstructure noise. The volatility process \(V_t\) follows a mean-reverting SDE inspired by the Heston model:

\[dV_t = \kappa (\theta - V_t) dt + \sigma_v \sqrt{V_t} dW_t\]

where \(\kappa\) is the mean reversion speed, \(\theta\) is the long-term mean level of \(V_t\), \(\sigma_v\) is the volatility of volatility, and \(W_t\) is a Wiener process.

The jumps are modeled using a Poisson process with intensity \(\lambda_{\text{jump}}\), and the jump size \(J_t\) is drawn from one of three distributions: normal \(\mathcal{N}(0, \sigma_{\text{jump}})\), log-normal \(\log J_t \sim \mathcal{N}(-\frac{1}{2} \sigma_{\text{jump}}^2, \sigma_{\text{jump}})\), or exponential \(J_t \sim \text{Exp}(\sigma_{\text{jump}})\).

Regime switching is modeled as a two-state Markov process, where the volatility parameters \(\kappa\) and \(\theta\) differ between states, and transitions between regimes are governed by a transition matrix \(\mathbf{P}\):

\[\mathbf{P} = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \end{bmatrix}\]

with \(p_{11}\) and \(p_{22}\) being the probabilities of staying in the current state, and \(p_{12}\) and \(p_{21}\) being the probabilities of switching regimes.

Finally, the microstructure noise \(\epsilon_{\text{noise}}\) is modeled as either normal \(\mathcal{N}(0, \text{noise scale})\) or Student’s t-distributed \(t_{\nu}\), capturing small-scale market effects. Together, these components simulate realistic asset return dynamics, reflecting continuous volatility evolution, discrete jumps, regime shifts, and market microstructure noise.
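To make the description concrete, here is a minimal NumPy sketch of one way to simulate such a path. It is an illustration under stated assumptions, not the code used in the repository. The function name `simulate_returns`, all default parameter values, the unit time step, truncating negative variance at zero, independent shocks for returns and variance, summing multiple jumps within a step, reading \(\sigma_{\text{jump}}\) as a standard deviation (normal, log-normal) or as a scale (exponential), and rescaling the Student's t noise by the noise scale are all my own choices for the sketch.

```python
import numpy as np

def simulate_returns(n_steps=500, mu=0.0, v0=1e-4,
                     kappa=(0.05, 0.30), theta=(1e-4, 4e-4), sigma_v=0.002,
                     lambda_jump=0.02, sigma_jump=0.02, jump_dist="normal",
                     P=((0.98, 0.02), (0.05, 0.95)),
                     noise_scale=1e-4, noise_dist="normal", nu=5, seed=None):
    """Simulate r_t = mu + sqrt(V_t) * eps_t + J_t + eps_noise, one step per period.

    All defaults are hypothetical values chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    returns = np.empty(n_steps)
    variances = np.empty(n_steps)
    states = np.empty(n_steps, dtype=int)
    v, state = v0, 0
    for t in range(n_steps):
        # Regime switching: draw the next state from row `state` of P.
        state = int(rng.choice(2, p=P[state]))
        # Euler step for dV_t with dt = 1, using the current regime's kappa and theta.
        # Assumption: negative variance is truncated at zero.
        v_pos = max(v, 0.0)
        v = (v + kappa[state] * (theta[state] - v_pos)
             + sigma_v * np.sqrt(v_pos) * rng.standard_normal())
        v_pos = max(v, 0.0)
        # Jumps: Poisson number of jumps in the step, sizes summed (0 if none).
        n_jumps = rng.poisson(lambda_jump)
        if jump_dist == "normal":
            sizes = rng.normal(0.0, sigma_jump, size=n_jumps)
        elif jump_dist == "lognormal":
            sizes = rng.lognormal(-0.5 * sigma_jump**2, sigma_jump, size=n_jumps)
        elif jump_dist == "exponential":
            sizes = rng.exponential(sigma_jump, size=n_jumps)  # scale = sigma_jump
        else:
            raise ValueError(f"unknown jump_dist: {jump_dist!r}")
        # Microstructure noise: normal, or Student's t rescaled by noise_scale.
        if noise_dist == "normal":
            noise = rng.normal(0.0, noise_scale)
        elif noise_dist == "student":
            noise = noise_scale * rng.standard_t(nu)
        else:
            raise ValueError(f"unknown noise_dist: {noise_dist!r}")
        returns[t] = mu + np.sqrt(v_pos) * rng.standard_normal() + sizes.sum() + noise
        variances[t] = v_pos
        states[t] = state
    return returns, variances, states
```

For example, `returns, variances, states = simulate_returns(n_steps=1000, jump_dist="lognormal", noise_dist="student", seed=123)` produces one synthetic path. Note that with log-normal or exponential jump sizes, jumps are always positive in this sketch.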

More details about this model (actually used in an industrial setting):
