Backtest
Calibrated alpha replay — what cumulative P&L would the system have generated if every realized signal had been sized by its current calibration weight? This page replays the entire historical signal ledger using today's trust weights to estimate the strategy's hypothetical NAV.
Methodology
The headline numbers below come from v2.0 (Honest Backtest). Each scored realistic trade's net P&L (post-slippage, post-concentration-cap, post-gross-cap) is booked on its logged_at firing date. A complete business-day calendar is then constructed from the first to the last date, so every Mon–Fri day without a trade contributes 0 to that day's P&L and still counts toward volatility. Annualized return is computed geometrically from the compounded NAV; Sharpe subtracts the 5% risk-free rate. The headline Sharpe is then bracketed by a 1,000-sample block bootstrap (5-day blocks) to give an honest [p5, p95] confidence interval. v1.1 (idealized, no friction) and v1.2 (with frictions, but counting only trade days in volatility) are kept below for comparison so that methodology drag and friction drag are both visible.
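The v2.0 steps can be sketched roughly as follows. This is an illustrative sketch only, not the production code: the function names, the 252-day annualization factor, the sample standard deviation, and the choice to divide each day's P&L by the prior day's NAV are all assumptions. Trades are assumed to carry a business-day logged_at date.

```python
from datetime import date, timedelta
import math
import statistics


def business_days(start, end):
    """All Mon-Fri dates from start to end, inclusive."""
    days, d = [], start
    while d <= end:
        if d.weekday() < 5:  # 0..4 = Mon..Fri
            days.append(d)
        d += timedelta(days=1)
    return days


def daily_returns(trades, start_nav=100_000.0):
    """trades: list of (logged_at date, net P&L in dollars).

    Books each trade's P&L on its firing date and fills every other
    business day with 0. Assumes logged_at is a business day; a
    weekend-dated trade would be dropped by this sketch.
    """
    pnl_by_day = {}
    for day, pnl in trades:
        pnl_by_day[day] = pnl_by_day.get(day, 0.0) + pnl
    nav, rets = start_nav, []
    for day in business_days(min(pnl_by_day), max(pnl_by_day)):
        pnl = pnl_by_day.get(day, 0.0)
        rets.append(pnl / nav)  # zero-trade days contribute 0.0
        nav += pnl
    return rets


def annualized_return(rets, periods=252):
    """Geometric annualization of the compounded NAV growth."""
    growth = math.prod(1 + r for r in rets)
    return growth ** (periods / len(rets)) - 1


def sharpe(rets, rf=0.05, periods=252):
    """(annualized return - risk-free) / annualized volatility."""
    vol = statistics.stdev(rets) * math.sqrt(periods)
    return (annualized_return(rets, periods) - rf) / vol


trades = [(date(2024, 1, 1), 1000.0), (date(2024, 1, 5), -500.0)]
rets = daily_returns(trades)
print(len(rets), sharpe(rets))
```

Because the three idle days sit in `rets` as zeros, they lengthen the annualization window and enter the volatility estimate, which is the difference from v1.2 that the text describes.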
Known limitation: calibration weights are still fitted in-sample. Walk-forward out-of-sample calibration is v2.1 (next ship). Auto-recomputes every 6 hours.
v2.0 Honest Backtest Headline
The number to publish. Lump-sum P&L on each trade's firing day, full business-day calendar (zero-trade days included in volatility), 5% risk-free rate subtracted, Sharpe reported with bootstrap [p5, p95] confidence interval.
| Metric | Value |
|---|---|
| Loading… | |
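The bootstrap bracket on the Sharpe could look something like the sketch below. It assumes a moving-block scheme (overlapping 5-day blocks drawn with replacement), a 252-day year, and a Sharpe of 0.0 for a resample with zero volatility. The page only states "1000-sample block bootstrap (5-day blocks)", so these details and all function names are assumptions.

```python
import math
import random
import statistics


def sharpe(rets, rf=0.05, periods=252):
    """Annualized Sharpe; returns 0.0 for a flat series (assumption)."""
    sd = statistics.stdev(rets)
    if sd == 0:
        return 0.0
    growth = math.prod(1 + r for r in rets)
    ann = growth ** (periods / len(rets)) - 1
    return (ann - rf) / (sd * math.sqrt(periods))


def block_bootstrap_sharpe(rets, n_samples=1000, block=5, seed=0):
    """Return (p5, p95) of Sharpe over block-resampled return series.

    Requires len(rets) >= block.
    """
    rng = random.Random(seed)  # seeded so the interval is reproducible
    n = len(rets)
    n_blocks = math.ceil(n / block)
    stats = []
    for _ in range(n_samples):
        sample = []
        for _ in range(n_blocks):
            start = rng.randrange(0, n - block + 1)
            sample.extend(rets[start:start + block])
        stats.append(sharpe(sample[:n]))  # trim to the original length
    stats.sort()
    p5 = stats[int(0.05 * (n_samples - 1))]
    p95 = stats[int(0.95 * (n_samples - 1))]
    return p5, p95


demo_rng = random.Random(1)
demo_rets = [demo_rng.gauss(0.001, 0.01) for _ in range(250)]
print(block_bootstrap_sharpe(demo_rets, n_samples=200))
```

Resampling in blocks rather than single days keeps short-run clustering of trades and idle days intact inside each block, which a plain day-by-day resample would destroy.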
v2.1 Walk-Forward Calibration (future headline)
Each trade's weight is resolved from the calibration snapshot in effect at its logged_at date, which eliminates in-sample look-ahead bias. Snapshots are written every Sunday at 12:00 UTC. The coverage stat indicates how trustworthy the v2.1 number is. When coverage is ≥50% and the window is ≥0.5 years, this becomes the headline number.
| Metric | Value |
|---|---|
| Loading… | |
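A minimal sketch of the walk-forward lookup, under stated assumptions: snapshots are held as a date-sorted list, "in effect" means the latest snapshot dated on or before logged_at, and coverage is the share of trades for which a snapshot weight could be resolved. The page does not define coverage precisely, so that definition and every name here are hypothetical.

```python
from bisect import bisect_right
from datetime import date


def resolve_weight(snapshots, signal, logged_at):
    """snapshots: list of (snapshot_date, {signal: weight}), sorted by date.

    Returns the weight from the latest snapshot dated on or before
    logged_at, or None if no such snapshot (or signal entry) exists.
    """
    dates = [d for d, _ in snapshots]
    i = bisect_right(dates, logged_at) - 1
    if i < 0:
        return None  # trade predates the first snapshot
    return snapshots[i][1].get(signal)


def coverage(trades, snapshots):
    """Share of (signal, logged_at) trades with a resolvable weight."""
    if not trades:
        return 0.0
    hits = sum(
        1 for sig, day in trades
        if resolve_weight(snapshots, sig, day) is not None
    )
    return hits / len(trades)


def is_headline(coverage_frac, window_years):
    """Promotion rule from the text: coverage >= 50% and window >= 0.5y."""
    return coverage_frac >= 0.5 and window_years >= 0.5


snaps = [
    (date(2024, 1, 7), {"sig_a": 0.5}),
    (date(2024, 1, 14), {"sig_a": 0.8}),
]
print(resolve_weight(snaps, "sig_a", date(2024, 1, 10)))
```

A stricter variant would use only snapshots dated before logged_at, which avoids using a weight fitted on the same day the trade fired.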
Cumulative NAV — calibrated alpha replay
Starting at $100,000. Each step compounds that day's net contribution at the calibrated weight. Drawdowns shaded.
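For reference, the compounding and the drawdown shading can be sketched as below. The inputs are assumed to be each day's net contribution expressed as a fraction of NAV; the function names are illustrative.

```python
def nav_curve(daily_contribs, start_nav=100_000.0):
    """Compound daily net contributions (fractions of NAV) into a NAV path."""
    nav = [start_nav]
    for r in daily_contribs:
        nav.append(nav[-1] * (1 + r))
    return nav


def drawdowns(nav):
    """Fractional drop from the running peak at each step (0.0 at a new high)."""
    peak, out = nav[0], []
    for v in nav:
        peak = max(peak, v)
        out.append(v / peak - 1)
    return out


path = nav_curve([0.10, -0.50, 0.0])
print(path)
print(drawdowns(path))
```

Any step with a negative drawdown value would fall inside a shaded region of the chart.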
Signal contribution rankings
Which signals drove the cumulative alpha? Top contributors deserve more weight; bottom contributors are candidates for inversion or removal.
↑ Top alpha contributors
↓ Negative contributors
Full signal table
Every signal with at least one scored outcome. Click column headers to sort.
| Signal | Weight | N | Win Rate | Avg Return | Avg Contrib | Total Contrib |
|---|---|---|---|---|---|---|
| Loading… | ||||||
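The table columns could be derived along these lines. This sketch assumes a trade's contribution is its calibration weight times its realized return and that a "win" is any return above zero; neither definition is stated on the page, and the names are hypothetical.

```python
from collections import defaultdict


def signal_table(outcomes, weights):
    """outcomes: list of (signal, realized_return); weights: {signal: weight}.

    Returns one row per signal, sorted by total contribution (best first).
    """
    grouped = defaultdict(list)
    for sig, ret in outcomes:
        grouped[sig].append(ret)
    rows = []
    for sig, rets in grouped.items():
        w = weights.get(sig, 0.0)
        contribs = [w * r for r in rets]  # assumed: weight x return
        rows.append({
            "signal": sig,
            "weight": w,
            "n": len(rets),
            "win_rate": sum(r > 0 for r in rets) / len(rets),
            "avg_return": sum(rets) / len(rets),
            "avg_contrib": sum(contribs) / len(contribs),
            "total_contrib": sum(contribs),
        })
    rows.sort(key=lambda row: row["total_contrib"], reverse=True)
    return rows


table = signal_table(
    [("sig_a", 0.02), ("sig_a", -0.01), ("sig_b", -0.03)],
    {"sig_a": 0.5, "sig_b": 1.0},
)
for row in table:
    print(row)
```

The head of the sorted list corresponds to the top alpha contributors and the tail to the negative contributors.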
Realistic backtest — friction applied (v1.2)
The v1.1 backtest is idealized: no slippage, no exposure caps, no concentration limits. v1.2 applies real-world frictions to the same trades to see how much survives: 5 bps slippage per leg (10 bps round-trip), a 40% NAV cap per signal type per day, and a 100% NAV gross exposure cap per day. The drag between the two is what the system would actually have paid in a live account.
| Metric | Idealized v1.1 | Realistic v1.2 | Drag |
|---|---|---|---|
| Loading… | |||
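One way the three frictions might be applied to a single day's trades is sketched below. The order of operations (per-type cap first, then gross cap), the proportional scaling when a cap binds, and the trade field names are assumptions; the page states only the three parameters.

```python
SLIPPAGE_ROUND_TRIP = 0.0010  # 5 bps per leg, two legs
TYPE_CAP = 0.40               # max NAV fraction per signal type per day
GROSS_CAP = 1.00              # max NAV fraction across all trades per day


def apply_frictions(day_trades):
    """day_trades: list of dicts with 'signal_type', 'size', 'ret'.

    'size' is the requested position as a fraction of NAV and 'ret' is the
    trade's signed gross return. Returns the day's net contribution as a
    fraction of NAV.
    """
    # Total requested exposure per signal type.
    by_type = {}
    for t in day_trades:
        key = t["signal_type"]
        by_type[key] = by_type.get(key, 0.0) + abs(t["size"])

    # Scale each type down proportionally if it exceeds the 40% cap.
    sized = []
    for t in day_trades:
        total = by_type[t["signal_type"]]
        scale = min(1.0, TYPE_CAP / total) if total > 0 else 1.0
        sized.append((abs(t["size"]) * scale, t["ret"]))

    # Scale everything down proportionally if gross exceeds 100% of NAV.
    gross = sum(size for size, _ in sized)
    gross_scale = min(1.0, GROSS_CAP / gross) if gross > 0 else 1.0

    # Slippage is charged on whatever size survives the caps.
    return sum(
        size * gross_scale * (ret - SLIPPAGE_ROUND_TRIP)
        for size, ret in sized
    )


print(apply_frictions([{"signal_type": "sig_a", "size": 0.8, "ret": 0.02}]))
```

In the usage line, the 80% request is cut to 40% by the per-type cap, and the 2% gross return is reduced by 10 bps before being applied to that size.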
Brief calls backtest — would following the brief have made money?
Each decisive call (EXIT_ALL_RISK..LEVER) maps to a SPY exposure level. The strategy holds that exposure until the next call. This compares what your NAV would have done if you had traded the brief's calls verbatim, vs 100% SPY buy-and-hold over the same window. Note: the ledger started 2026-05-04, so this metric is sparse until more calls accumulate.
| Call | Period | Days | Exposure | SPY Δ | Strategy Δ | Contribution |
|---|---|---|---|---|---|---|
| Loading… | ||||||
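The comparison against buy-and-hold can be sketched as follows. It assumes each period's strategy return is simply exposure times the SPY return over that period (no intra-period rebalancing, financing costs, or slippage), and it takes the exposure level as an input because the call-to-exposure mapping is not listed on this page. All names are hypothetical.

```python
def brief_vs_spy(periods):
    """periods: list of (exposure, spy_start_price, spy_end_price),
    one entry per call, held until the next call.

    Returns (strategy total return, SPY buy-and-hold total return, rows),
    where each row is (exposure, spy_delta, strategy_delta).
    """
    strat_growth, spy_growth, rows = 1.0, 1.0, []
    for exposure, p0, p1 in periods:
        spy_delta = p1 / p0 - 1
        strat_delta = exposure * spy_delta  # assumed: linear in exposure
        strat_growth *= 1 + strat_delta
        spy_growth *= 1 + spy_delta
        rows.append((exposure, spy_delta, strat_delta))
    return strat_growth - 1, spy_growth - 1, rows


# Illustrative prices only: flat in cash while SPY falls 10%, then fully
# invested while it recovers 10%.
strat, spy, rows = brief_vs_spy([(0.0, 100.0, 90.0), (1.0, 90.0, 99.0)])
print(strat, spy)
```

With only a handful of calls in the ledger, a single period can dominate the result, which is why the text flags this metric as sparse.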