Testing Best Practices
Testing quantitative software is mostly about keeping different layers of the stack from inventing different realities. The scalar indicator path, the SIMD path, the GPU path, and the backtest engine can all be internally consistent while still disagreeing with one another. That kind of failure is dangerous because the numbers often still look reasonable.
A useful test suite therefore does more than check whether a function returns a value. It checks whether the reference path and the optimized path still mean the same thing, whether warmup handling stayed intact, whether NaNs and edge cases propagate correctly, and whether the backtest layer is still honoring the timing contract it claims to model.
Test the contract before the speed path
The scalar implementation should be the easiest code to read and the hardest code to misunderstand. That role makes it the reference. Once that path is trustworthy, SIMD or CUDA variants can be compared against it under many input shapes and parameter combinations. If the reference path is vague, every optimized path inherits the ambiguity.
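The comparison described above can be sketched with a simple moving average. This is a minimal illustration, not the project's actual code: `sma_reference`, `sma_rolling`, `paths_agree`, and `parity_holds_everywhere` are hypothetical names, the rolling-sum variant merely stands in for a SIMD or CUDA path, and the NaN warmup convention is an assumption.

```rust
/// Reference path: recompute every window from scratch. Slow, but hard to misread.
/// Assumption: outputs before the first full window are NaN (warmup).
fn sma_reference(data: &[f64], window: usize) -> Vec<f64> {
    let mut out = vec![f64::NAN; data.len()];
    if window == 0 || window > data.len() {
        return out;
    }
    for i in (window - 1)..data.len() {
        let start = i + 1 - window;
        let sum: f64 = data[start..=i].iter().sum();
        out[i] = sum / window as f64;
    }
    out
}

/// Stand-in for an optimized path: a rolling sum that updates incrementally.
fn sma_rolling(data: &[f64], window: usize) -> Vec<f64> {
    let mut out = vec![f64::NAN; data.len()];
    if window == 0 || window > data.len() {
        return out;
    }
    let mut sum = 0.0;
    for i in 0..data.len() {
        sum += data[i];
        if i >= window {
            sum -= data[i - window];
        }
        if i + 1 >= window {
            out[i] = sum / window as f64;
        }
    }
    out
}

/// Two paths agree when NaNs sit in the same positions and all other
/// values differ by at most `tol` (an absolute tolerance).
fn paths_agree(a: &[f64], b: &[f64], tol: f64) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b)
            .all(|(x, y)| (x.is_nan() && y.is_nan()) || (x - y).abs() <= tol)
}

/// Deterministic synthetic input, so any failure reproduces exactly.
fn synthetic_series(len: usize) -> Vec<f64> {
    (0..len).map(|i| 100.0 + (i as f64 * 0.37).sin() * 5.0).collect()
}

/// Sweeps input lengths and window sizes, including invalid combinations.
fn parity_holds_everywhere(tol: f64) -> bool {
    for len in 0..64 {
        let data = synthetic_series(len);
        for window in 0..=(len + 1) {
            let reference = sma_reference(&data, window);
            let optimized = sma_rolling(&data, window);
            if !paths_agree(&reference, &optimized, tol) {
                return false;
            }
        }
    }
    true
}
```

The sweep deliberately includes empty inputs, a zero window, and windows longer than the data, so both paths have to agree on what "no valid output" means as well as on the numbers.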
This is also why edge cases matter. Empty slices, too-short windows, NaN prefixes, all-equal inputs, and invalid parameters sit near the center of indicator work. They are exactly where implementation drift tends to hide.
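A minimal sketch of pinning those edge cases down, assuming a moving average that rejects invalid parameters with an error and marks warmup with NaN. `SmaError`, `sma_checked`, and `first_valid_index` are hypothetical names chosen for illustration; a real indicator may use a different contract, and the point is only that the contract is written down as tests.

```rust
#[derive(Debug, PartialEq)]
enum SmaError {
    ZeroWindow,
    EmptyInput,
    WindowTooLong { window: usize, len: usize },
}

/// Simple moving average with an explicit contract for bad inputs.
/// Assumption: warmup positions are NaN, and NaN inputs propagate.
fn sma_checked(data: &[f64], window: usize) -> Result<Vec<f64>, SmaError> {
    if window == 0 {
        return Err(SmaError::ZeroWindow);
    }
    if data.is_empty() {
        return Err(SmaError::EmptyInput);
    }
    if window > data.len() {
        return Err(SmaError::WindowTooLong { window, len: data.len() });
    }
    let mut out = vec![f64::NAN; data.len()];
    for i in (window - 1)..data.len() {
        let sum: f64 = data[i + 1 - window..=i].iter().sum();
        out[i] = sum / window as f64;
    }
    Ok(out)
}

/// Index of the first finite output, if any.
fn first_valid_index(out: &[f64]) -> Option<usize> {
    out.iter().position(|v| v.is_finite())
}

fn check_edge_cases() {
    // Empty slice, invalid parameter, too-short input.
    assert_eq!(sma_checked(&[], 3), Err(SmaError::EmptyInput));
    assert_eq!(sma_checked(&[1.0, 2.0], 0), Err(SmaError::ZeroWindow));
    assert_eq!(
        sma_checked(&[1.0, 2.0], 3),
        Err(SmaError::WindowTooLong { window: 3, len: 2 })
    );

    // NaN prefix: the first valid output moves back by the prefix length.
    let out = sma_checked(&[f64::NAN, f64::NAN, 1.0, 2.0, 3.0], 2).unwrap();
    assert_eq!(first_valid_index(&out), Some(3));
    assert_eq!(out[3..].to_vec(), vec![1.5, 2.5]);

    // All-equal input: NaN during warmup, then the constant itself.
    let flat = sma_checked(&[7.0; 5], 3).unwrap();
    assert!(flat[..2].iter().all(|v| v.is_nan()));
    assert!(flat[2..].iter().all(|v| *v == 7.0));
}
```

The same cases can then be replayed against each optimized path, so a SIMD or GPU variant cannot quietly adopt a different answer for an empty slice or a NaN prefix.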
Use invariants alongside examples
Example-based tests are necessary, and quant code also benefits from property-style checks that assert invariants across large input spaces. A moving average should preserve constant input after warmup. A volatility measure should never return negative values. A batch API should agree with repeated single-run evaluation for the same parameters.
assert_relative_eq!(scalar_output[i], simd_output[i], epsilon = 1e-12);
assert_relative_eq!(scalar_output[i], gpu_output[i], epsilon = 1e-12);
The exact tolerance depends on the operation, but the pattern should stay strict. Numeric code deserves explicit tolerances and a clear definition of what counts as close enough.
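The three invariants named in this section can be checked over many generated inputs without any external crate. The sketch below is one possible shape, under stated assumptions: the `Lcg` generator, `sma`, `sma_batch`, `rolling_std`, and `check_invariants` are hypothetical names, the batch path uses prefix sums purely so that it differs from the single-run path, and the `1e-9` tolerance is illustrative. A property-testing library would normally replace the hand-rolled generator.

```rust
/// Tiny deterministic generator (a linear congruential generator) so runs reproduce.
struct Lcg(u64);

impl Lcg {
    fn step(&mut self) -> u64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        self.0 >> 11 // keep the top 53 bits
    }
    /// Uniform value in [-1, 1).
    fn unit(&mut self) -> f64 {
        self.step() as f64 / (1u64 << 53) as f64 * 2.0 - 1.0
    }
    /// Integer in [lo, hi], assuming lo <= hi.
    fn range(&mut self, lo: usize, hi: usize) -> usize {
        lo + (self.step() as usize) % (hi - lo + 1)
    }
}

/// Single-run moving average; NaN marks warmup.
fn sma(data: &[f64], window: usize) -> Vec<f64> {
    let mut out = vec![f64::NAN; data.len()];
    if window == 0 || window > data.len() {
        return out;
    }
    for i in (window - 1)..data.len() {
        out[i] = data[i + 1 - window..=i].iter().sum::<f64>() / window as f64;
    }
    out
}

/// Batch path: shares one prefix-sum pass across all requested windows.
fn sma_batch(data: &[f64], windows: &[usize]) -> Vec<Vec<f64>> {
    let mut prefix = vec![0.0; data.len() + 1];
    for (i, x) in data.iter().enumerate() {
        prefix[i + 1] = prefix[i] + x;
    }
    windows
        .iter()
        .map(|&w| {
            let mut out = vec![f64::NAN; data.len()];
            if w >= 1 && w <= data.len() {
                for i in (w - 1)..data.len() {
                    out[i] = (prefix[i + 1] - prefix[i + 1 - w]) / w as f64;
                }
            }
            out
        })
        .collect()
}

/// Rolling population standard deviation (two-pass per window).
fn rolling_std(data: &[f64], window: usize) -> Vec<f64> {
    let means = sma(data, window);
    let mut out = vec![f64::NAN; data.len()];
    if window == 0 || window > data.len() {
        return out;
    }
    for i in (window - 1)..data.len() {
        let var = data[i + 1 - window..=i]
            .iter()
            .map(|x| (x - means[i]).powi(2))
            .sum::<f64>()
            / window as f64;
        out[i] = var.sqrt();
    }
    out
}

fn all_close(a: &[f64], b: &[f64], tol: f64) -> bool {
    a.len() == b.len()
        && a.iter().zip(b).all(|(x, y)| (x.is_nan() && y.is_nan()) || (x - y).abs() <= tol)
}

/// Runs `cases` generated inputs and reports the first violated invariant.
fn check_invariants(seed: u64, cases: usize) -> Result<(), String> {
    let mut rng = Lcg(seed);
    for case in 0..cases {
        let len = rng.range(1, 128);
        let window = rng.range(1, len);
        let data: Vec<f64> = (0..len).map(|_| rng.unit()).collect();

        // Invariant 1: constant input is preserved after warmup.
        let c = rng.unit() * 1000.0;
        let flat = sma(&vec![c; len], window);
        if flat[window - 1..].iter().any(|v| (v - c).abs() > 1e-9) {
            return Err(format!("case {}: constant input not preserved", case));
        }
        // Invariant 2: a volatility measure is never negative.
        if rolling_std(&data, window).iter().any(|v| *v < 0.0) {
            return Err(format!("case {}: negative volatility", case));
        }
        // Invariant 3: batch output matches repeated single-run evaluation.
        let windows = [window, 1, len];
        for (w, got) in windows.iter().zip(&sma_batch(&data, &windows)) {
            if !all_close(got, &sma(&data, *w), 1e-9) {
                return Err(format!("case {}: batch disagrees for window {}", case, w));
            }
        }
    }
    Ok(())
}
```

Returning the failing case index keeps a violation reproducible: rerunning with the same seed regenerates the exact input that broke the invariant.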
Integration tests should resemble the real path
Unit tests prove that a component behaves under controlled conditions. Integration tests are where you discover whether the components still agree once data flows through the actual pipeline. For this stack, that usually means feeding realistic OHLCV data through parsing, indicator calculation, strategy evaluation, and result aggregation without swapping in a softer contract halfway through.
Use integration tests to exercise the joins: data alignment, warmup propagation, execution timing, and result reduction. Those joins are where the quiet regressions usually show up.
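A compact sketch of such a test follows, running CSV text through parsing, an indicator, a strategy rule, and a reduction, then checking the joins listed above. Everything here is an assumption made for illustration: the `Bar`, `Run`, `parse_ohlcv`, `run_pipeline`, and `joins_hold` names, the `ts,open,high,low,close,volume` column order, the rule "long when close is above its moving average", and the open-to-close PnL, which deliberately ignores gaps between bars.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Bar {
    ts: u64,
    open: f64,
    high: f64,
    low: f64,
    close: f64,
    volume: f64,
}

/// Parses `ts,open,high,low,close,volume` rows that follow one header line.
fn parse_ohlcv(csv: &str) -> Result<Vec<Bar>, String> {
    csv.lines()
        .skip(1)
        .filter(|line| !line.trim().is_empty())
        .map(|line| {
            let f: Vec<&str> = line.split(',').map(str::trim).collect();
            if f.len() != 6 {
                return Err(format!("expected 6 fields, got {}: {}", f.len(), line));
            }
            let num = |s: &str| s.parse::<f64>().map_err(|e| format!("{}: {}", s, e));
            Ok(Bar {
                ts: f[0].parse::<u64>().map_err(|e| format!("{}: {}", f[0], e))?,
                open: num(f[1])?,
                high: num(f[2])?,
                low: num(f[3])?,
                close: num(f[4])?,
                volume: num(f[5])?,
            })
        })
        .collect()
}

struct Run {
    sma: Vec<f64>,
    signals: Vec<i8>,
    positions: Vec<i8>,
    pnl: f64,
}

fn run_pipeline(csv: &str, window: usize) -> Result<Run, String> {
    let bars = parse_ohlcv(csv)?;
    if window == 0 || window > bars.len() {
        return Err(format!("window {} invalid for {} bars", window, bars.len()));
    }
    let closes: Vec<f64> = bars.iter().map(|b| b.close).collect();

    // Indicator: NaN marks warmup.
    let mut sma = vec![f64::NAN; bars.len()];
    for i in (window - 1)..bars.len() {
        sma[i] = closes[i + 1 - window..=i].iter().sum::<f64>() / window as f64;
    }
    // Strategy: long when close is above the average. Comparisons with NaN
    // are false, so warmup bars stay flat.
    let signals: Vec<i8> = closes
        .iter()
        .zip(&sma)
        .map(|(c, m)| if c > m { 1 } else { 0 })
        .collect();
    // Execution timing: a signal on bar t becomes a position on bar t + 1.
    let mut positions = vec![0i8; bars.len()];
    for t in 1..bars.len() {
        positions[t] = signals[t - 1];
    }
    // Reduction: open-to-close PnL on held bars (a simplification).
    let pnl: f64 = bars
        .iter()
        .zip(&positions)
        .map(|(b, p)| *p as f64 * (b.close - b.open))
        .sum();
    Ok(Run { sma, signals, positions, pnl })
}

/// Checks alignment, warmup propagation, and the one-bar execution lag.
fn joins_hold(run: &Run) -> bool {
    let n = run.sma.len();
    if run.signals.len() != n || run.positions.len() != n {
        return false;
    }
    let warmup_flat = run.sma.iter().zip(&run.signals).all(|(m, s)| !m.is_nan() || *s == 0);
    let lagged = n == 0
        || (run.positions[0] == 0 && (1..n).all(|t| run.positions[t] == run.signals[t - 1]));
    warmup_flat && lagged
}
```

With six hand-built bars whose closes are 10, 11, 12, 13, 11, 10 and a window of 3, the expected signals, positions, and PnL can be worked out on paper first, which is what makes the end-to-end assertion meaningful.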
Backtest validation lives above unit testing
A backtest can pass all its local tests and still be methodologically wrong. Validation therefore has to include dataset checks, timing checks, cost assumptions, and a clear separation between optimization and holdout evaluation. The test suite keeps code honest. The validation process keeps the research claims honest. You need both.
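Part of that separation can be enforced mechanically. The sketch below assumes a plain chronological split, with earlier bars used for optimization and later bars reserved for holdout; `validate_timestamps`, `chrono_split`, and `split_slices` are hypothetical helpers, and other schemes such as walk-forward evaluation would need a different shape.

```rust
use std::ops::Range;

/// Dataset check: timestamps must be strictly increasing
/// (no duplicates, no reordering).
fn validate_timestamps(ts: &[u64]) -> Result<(), String> {
    for (i, pair) in ts.windows(2).enumerate() {
        if pair[1] <= pair[0] {
            return Err(format!("timestamp at index {} is not after its predecessor", i + 1));
        }
    }
    Ok(())
}

/// Splits `len` bars chronologically into (optimization, holdout) index ranges.
/// The ranges are adjacent, non-empty, and never overlap.
fn chrono_split(len: usize, holdout_fraction: f64) -> Result<(Range<usize>, Range<usize>), String> {
    // Written as a negated conjunction so that NaN is rejected too.
    if !(holdout_fraction > 0.0 && holdout_fraction < 1.0) {
        return Err(format!("holdout_fraction must be in (0, 1), got {}", holdout_fraction));
    }
    let holdout_len = (len as f64 * holdout_fraction).round() as usize;
    if holdout_len == 0 || holdout_len >= len {
        return Err(format!("splitting {} bars leaves one side empty", len));
    }
    let cut = len - holdout_len;
    Ok((0..cut, cut..len))
}

/// Hands out the two slices. A parameter search should only ever receive the first.
fn split_slices<'a>(
    closes: &'a [f64],
    split: &(Range<usize>, Range<usize>),
) -> (&'a [f64], &'a [f64]) {
    (&closes[split.0.clone()], &closes[split.1.clone()])
}
```

Passing only the first slice into the optimizer turns the separation into a property of the call graph: the search code cannot read holdout bars because it never receives them.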
If that distinction is fuzzy, start with Backtesting Fundamentals. Bad backtests usually fail for reasons deeper than a visible off-by-one error. They fail because the whole experiment leaked information or treated execution as a suggestion.
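One kind of information leak can be caught by a truncation check: a signal computed on the first `cut` bars must match the same bars of a signal computed on the full series. This is a hedged sketch of that idea, not a complete leak detector. `causal_signal`, `leaky_signal`, and `is_causal` are hypothetical names, and the leaky variant is deliberately wrong so the check has something to catch.

```rust
/// Causal rule: long when the close is above its trailing mean.
/// Uses only bars up to and including t.
fn causal_signal(closes: &[f64], window: usize) -> Vec<i8> {
    (0..closes.len())
        .map(|t| {
            if window == 0 || t + 1 < window {
                return 0; // warmup: stay flat
            }
            let mean = closes[t + 1 - window..=t].iter().sum::<f64>() / window as f64;
            if closes[t] > mean { 1 } else { 0 }
        })
        .collect()
}

/// Deliberately leaky rule: peeks at the next bar's close.
fn leaky_signal(closes: &[f64]) -> Vec<i8> {
    (0..closes.len())
        .map(|t| match closes.get(t + 1) {
            Some(next) if *next > closes[t] => 1,
            _ => 0,
        })
        .collect()
}

/// A signal function is causal on `closes` when truncating the input
/// never changes the outputs that were already computed.
fn is_causal<F: Fn(&[f64]) -> Vec<i8>>(signal_fn: F, closes: &[f64]) -> bool {
    let full = signal_fn(closes);
    (1..=closes.len()).all(|cut| signal_fn(&closes[..cut]).as_slice() == &full[..cut])
}
```

For example, `is_causal(|c| causal_signal(c, 3), &closes)` holds for any series, while `is_causal(leaky_signal, &closes)` fails as soon as the series contains a single up move. The check only covers leaks through future bars within one series; it says nothing about leakage through parameter selection or dataset construction.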
Performance tests should be isolated from correctness tests
Benchmarks are useful, but only if they stay narrow and repeatable. A performance test detects movement in throughput or latency for a defined workload. It complements correctness checks, and it should stay simple enough to preserve signal on a noisy machine.
The usual pattern is simple: correctness tests run broadly and often, while performance tests run on controlled workloads with recorded input sizes and hardware context. When a performance number changes, you want to know whether the code changed, the workload changed, or the machine changed.
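A minimal sketch of a performance test that records its own context, using only the standard library. `BenchRecord`, `bench_sma`, and `report` are hypothetical names, the hardware description is a free-text string supplied by the caller rather than detected, and a dedicated benchmarking harness would normally handle warmup runs and statistics more carefully than this does.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// One benchmark result together with the context needed to interpret it.
struct BenchRecord {
    workload: String,
    input_len: usize,
    window: usize,
    iterations: usize,
    median: Duration,
    hardware: String,
}

/// Workload under test: rolling-sum moving average, NaN during warmup.
fn sma(data: &[f64], window: usize) -> Vec<f64> {
    let mut out = vec![f64::NAN; data.len()];
    if window == 0 || window > data.len() {
        return out;
    }
    let mut sum = 0.0;
    for i in 0..data.len() {
        sum += data[i];
        if i >= window {
            sum -= data[i - window];
        }
        if i + 1 >= window {
            out[i] = sum / window as f64;
        }
    }
    out
}

/// Times `iterations` runs over a fixed, deterministic input and keeps the median.
fn bench_sma(input_len: usize, window: usize, iterations: usize, hardware: &str) -> BenchRecord {
    let data: Vec<f64> = (0..input_len).map(|i| 100.0 + (i as f64 * 0.37).sin()).collect();
    let iterations = iterations.max(1);
    let mut samples: Vec<Duration> = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        let start = Instant::now();
        // black_box keeps the optimizer from discarding the computation.
        black_box(sma(black_box(data.as_slice()), window));
        samples.push(start.elapsed());
    }
    samples.sort();
    BenchRecord {
        workload: "sma_rolling".to_string(),
        input_len,
        window,
        iterations,
        median: samples[samples.len() / 2],
        hardware: hardware.to_string(),
    }
}

/// One line per result, so changes in workload or machine are visible next to the number.
fn report(r: &BenchRecord) -> String {
    format!(
        "{} input_len={} window={} iterations={} median={:?} hardware={}",
        r.workload, r.input_len, r.window, r.iterations, r.median, r.hardware
    )
}
```

The sketch contains no pass or fail threshold on the timing itself. Whether a median counts as a regression depends on the machine, so that comparison belongs in a controlled environment rather than in the broad correctness suite.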
A minimum test matrix
- Reference scalar tests for every indicator and strategy primitive.
- Cross-path parity checks for scalar, SIMD, and GPU outputs where applicable.
- Edge-case tests for NaNs, short inputs, invalid parameters, and warmup handling.
- Integration tests over realistic market data slices and real pipeline joins.
- Validation checks that keep optimization and holdout evaluation separate.
- Performance tests with fixed workloads and explicit hardware context.
Next reads
If the immediate concern is execution correctness, continue with Backtesting Fundamentals. If the concern is cross-path numeric agreement, the relevant implementation context is in SIMD Optimization Explained and GPU Acceleration Setup.