OLS Spread Model Guide
Uses linear regression for optimal hedge ratio calculation and enhanced pairs trading precision.
1. Understanding Linear Regression
Ordinary Least Squares (OLS) finds the best-fit straight line describing the historical relationship between the log prices of the two stocks. Running the regression on log prices captures the percentage-based (multiplicative) relationship between the pair and makes the slope scale-invariant: β does not depend on the absolute price level of either stock. This is the foundation for building a spread intended to be market-neutral.
Regression Equation:
ln(Stock A) = α + β × ln(Stock B) + ε
📌 Why log prices?
Using natural log prices removes price-level bias, focuses the model on relative (percentage) moves, and is consistent with how the platform computes the spread and Z-score. The resulting spread lives in log space, not raw price.
α (Alpha):
The intercept term: the constant offset between ln(Stock A) and β × ln(Stock B). Because the regression includes this intercept, the residuals average to zero over the fitting sample.
β (Beta / Elasticity):
The slope coefficient in log space — an elasticity that measures the percentage sensitivity of Stock A to Stock B. The platform uses β as a scaling factor for position sizing and hedging, not as a direct share count.
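As a rough illustration of the regression described above, the following sketch fits α and β on log prices with NumPy. The function name `fit_log_ols` and the synthetic prices are assumptions made for this example; the platform's own implementation is not shown in this guide and may differ.

```python
import numpy as np

def fit_log_ols(price_a, price_b):
    """Fit ln(A) = alpha + beta * ln(B) by ordinary least squares."""
    log_a = np.log(np.asarray(price_a, dtype=float))
    log_b = np.log(np.asarray(price_b, dtype=float))
    # np.polyfit with degree 1 returns [slope, intercept]
    beta, alpha = np.polyfit(log_b, log_a, 1)
    return float(alpha), float(beta)

# Synthetic data (hypothetical): A is built as exp(0.5) * B**1.2,
# so the fit should recover alpha close to 0.5 and beta close to 1.2.
price_b = np.array([50.0, 52.0, 51.0, 55.0, 58.0, 57.0])
price_a = np.exp(0.5) * price_b ** 1.2
alpha, beta = fit_log_ols(price_a, price_b)
```

Because the synthetic series follows the log-linear relationship exactly, the residuals here are essentially zero; real price data will leave a non-trivial ε.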
2. Calculating the Spread
The spread is the residual series (ε) from the log-price regression. It represents the percentage "mispricing" between the two stocks after accounting for the scaling factor (β) and the intercept (α). Because both inputs are natural logs, the spread lives in log space, not in raw price terms.
Spread Formula:
Spread = ln(Stock A) − (α + β × ln(Stock B))
💡 Key Insight:
The spread must be a stationary time series for mean-reversion trading. Because it is computed in log space, a spread value of zero means the two log prices are exactly in line with their historical relationship — not that the raw prices are equal.
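The spread formula can be sketched in a few lines. The helper name `log_spread` and the example values of α, β, and the prices are hypothetical, chosen only to show how a log-space spread reads as a percentage deviation.

```python
import numpy as np

def log_spread(price_a, price_b, alpha, beta):
    """Spread = ln(A) - (alpha + beta * ln(B)), computed in log space."""
    log_a = np.log(np.asarray(price_a, dtype=float))
    log_b = np.log(np.asarray(price_b, dtype=float))
    return log_a - (alpha + beta * log_b)

# Hypothetical coefficients and prices for illustration only.
alpha, beta = 0.5, 1.2
price_b = np.array([50.0, 60.0])
fair_a = np.exp(alpha) * price_b ** beta       # A exactly on the fitted line
price_a = fair_a * np.array([1.00, 1.05])      # second point is 5% above the line
spread = log_spread(price_a, price_b, alpha, beta)
# spread[0] is approximately 0; spread[1] is approximately ln(1.05)
```

A spread of ln(1.05), roughly 0.049, corresponds to Stock A trading about 5% above the level implied by the fitted relationship, which is why small log-space values can be read as approximate percentage deviations.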
3. Statistical Significance
Before trading, we must ensure the relationship is statistically valid and the spread is mean-reverting.
R² (R-squared):
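Measures how well the regression line fits the data. Look for values above 0.7 as an indication of a strong relationship. A high R² alone does not show that the spread mean-reverts, because two trending price series can fit each other well by coincidence, so it should be read together with the ADF test.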
ADF Test (Stationarity):
The Augmented Dickey-Fuller (ADF) test must confirm that the spread is stationary for mean reversion to be viable. A stationary spread is what it means for the two log-price series to be cointegrated.
🎯 Quality Thresholds:
Look for ADF test p-value < 0.05 and a Statistical Half-Life between 5 and 60 days.
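The guide does not say how R² or the half-life are computed, so the sketch below shows one common approach under stated assumptions: R² from residual and total sums of squares, and the half-life from an AR(1) fit of the spread's one-step change on its lagged level. The function names are hypothetical. The ADF test itself is not implemented here; libraries such as statsmodels provide it as `adfuller`.

```python
import numpy as np

def r_squared(y, y_hat):
    """Share of the variance in y explained by the fitted values y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def half_life(spread):
    """Half-life of mean reversion from an AR(1) fit (an assumed method)."""
    s = np.asarray(spread, dtype=float)
    lagged = s[:-1]
    delta = np.diff(s)
    # Regress the one-step change on the lagged level: delta = c + lam * lagged
    lam, _ = np.polyfit(lagged, delta, 1)
    phi = 1.0 + lam  # implied AR(1) coefficient
    if not 0.0 < phi < 1.0:
        return float("inf")  # no measurable mean reversion
    return float(-np.log(2.0) / np.log(phi))

# Synthetic spread that decays by 10% per period (hypothetical data).
demo_spread = 0.9 ** np.arange(50)
demo_half_life = half_life(demo_spread)  # about 6.6 periods
```

The half-life is expressed in the same units as the sampling interval, so with daily closes a value of about 6.6 would fall inside the 5 to 60 day range quoted above.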
4. Trading Signals
Trading signals are generated using the Z-score of the calculated spread.
📈 Entry: Z-Score > +2.5
Spread is too high. Action: Short Stock A and Long Stock B, sized using β as the platform's scaling factor.
📉 Entry: Z-Score < -2.5
Spread is too low. Action: Long Stock A and Short Stock B, sized using β as the platform's scaling factor.
⚠️ Critical Difference:
Unlike the Ratio Model, the position size for Stock B must be scaled by β. In the log-price model β is an elasticity — it reflects the percentage sensitivity of Stock A to Stock B — and the platform uses it as a scaling factor to size the hedge. It does not represent a fixed number of shares in the same way as a price-level regression.
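The entry rules above can be sketched as follows. This assumes the Z-score is taken against the full-sample mean and standard deviation of the spread; the guide does not specify the window, so treat that as an assumption. The function names and signal labels are hypothetical, and position sizing with β is left out because the platform's sizing rule is not described here.

```python
import numpy as np

ENTRY_Z = 2.5  # entry threshold stated in the guide

def zscore(spread):
    """Standardize the spread with its sample mean and standard deviation."""
    s = np.asarray(spread, dtype=float)
    return (s - s.mean()) / s.std()

def entry_signal(z, entry=ENTRY_Z):
    """Map a single Z-score to the entry action described in the guide."""
    if z > entry:
        return "short_A_long_B"   # spread too high
    if z < -entry:
        return "long_A_short_B"   # spread too low
    return "no_entry"

# Hypothetical spread: flat at zero, then a jump on the last observation.
demo_spread = np.array([0.0] * 9 + [0.1])
latest_z = zscore(demo_spread)[-1]      # approximately 3.0
latest_signal = entry_signal(latest_z)  # "short_A_long_B"
```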
5. Advantages Over Ratio Model
The OLS model offers superior stability and statistical rigor compared to the simple ratio approach.
🎯 Better Hedging:
The OLS-estimated β minimizes the in-sample variance of the spread, which reduces basis risk and gives a cleaner spread signal.
📊 Statistical Validation:
R² and statistical tests provide confidence in the relationship's stability.
Ready to Implement?
Use the Pair Analyzer to run OLS, visualize the spread, and confirm stationarity with the ADF test.
💡 Pro Tip
Use a rolling window (e.g., 60-120 days) for regression calculations to ensure the relationship remains current and relevant to changing market conditions.
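A rolling regression along these lines might look like the sketch below. The function name `rolling_log_ols`, the default 60-observation window, and the synthetic prices are assumptions for illustration; a plain loop is used for clarity rather than speed.

```python
import numpy as np

def rolling_log_ols(price_a, price_b, window=60):
    """Re-fit ln(A) = alpha + beta * ln(B) on each trailing window."""
    log_a = np.log(np.asarray(price_a, dtype=float))
    log_b = np.log(np.asarray(price_b, dtype=float))
    n = len(log_a)
    alphas = np.full(n, np.nan)  # NaN until a full window is available
    betas = np.full(n, np.nan)
    for end in range(window, n + 1):
        sl = slice(end - window, end)
        beta, alpha = np.polyfit(log_b[sl], log_a[sl], 1)
        alphas[end - 1] = alpha
        betas[end - 1] = beta
    return alphas, betas

# Synthetic prices (hypothetical) that follow a fixed log-linear relationship.
price_b = np.linspace(50.0, 80.0, 100)
price_a = np.exp(0.5) * price_b ** 1.2
alphas, betas = rolling_log_ols(price_a, price_b, window=60)
```

Each estimate uses only the trailing window ending at that observation, so the first `window - 1` entries are NaN and no future data leaks into the coefficients.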