Backtest

Calibrated alpha replay — what cumulative P&L would the system have generated if every realized signal had been sized by its current calibration weight? This page replays the entire historical signal ledger using today's trust weights to estimate the strategy's hypothetical NAV.

Methodology

The headline numbers below come from v2.0 (Honest Backtest). Each scored realistic trade's net P&L (post-slippage, post-concentration-cap, post-gross-cap) is booked on its logged_at firing date. A complete business-day calendar is then constructed from the first to the last date, and every Mon–Fri without a trade contributes 0 to that day's P&L. Annualized return is computed geometrically from the compounded NAV; Sharpe is computed on returns in excess of the 5% risk-free rate. The headline Sharpe is then bracketed by a 1,000-sample block bootstrap (5-day blocks) to give an honest [p5, p95] confidence interval. v1.1 (idealized, no friction) and v1.2 (with frictions, but counting only trade days in volatility) are kept below for comparison so that methodology drag and friction drag are both visible.
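The calendar and annualization steps can be sketched as follows. This is a minimal illustration, not the page's actual code: the function names, the 252-day year, and the choice to divide each day's P&L by the prior day's NAV are assumptions, and market holidays are not excluded because the methodology only mentions Mon–Fri.

```python
import math
from datetime import date, timedelta

DAYS_PER_YEAR = 252  # assumption: trading days used for annualization

def business_days(first, last):
    """Every Mon-Fri from first to last, inclusive (holidays are not removed)."""
    days, d = [], first
    while d <= last:
        if d.weekday() < 5:  # 0=Mon ... 4=Fri
            days.append(d)
        d += timedelta(days=1)
    return days

def daily_returns(trade_pnl, start_nav=100_000.0):
    """trade_pnl maps a firing date to the net dollar P&L booked on that date.

    Assumes firing dates fall on weekdays; a weekend date would be skipped here.
    """
    nav, rets = start_nav, []
    for d in business_days(min(trade_pnl), max(trade_pnl)):
        pnl = trade_pnl.get(d, 0.0)  # a day without a trade contributes 0
        rets.append(pnl / nav)       # return on the NAV entering the day
        nav += pnl
    return rets

def annualized_return(rets):
    """Geometric annualization of a daily return series."""
    growth = math.prod(1.0 + r for r in rets)
    return growth ** (DAYS_PER_YEAR / len(rets)) - 1.0
```

Including the zero-P&L days lengthens the series and lowers both the mean and the volatility of daily returns, which is why a trade-days-only calculation can report a different Sharpe from the same trades.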

Known limitation: calibration weights are still fitted in-sample. Walk-forward out-of-sample calibration arrives in v2.1 (the next release). The backtest is recomputed automatically every 6 hours.

v2.0 Honest Backtest Headline

The number to publish. Lump-sum P&L on each trade's firing day, full business-day calendar (zero-trade days included in volatility), 5% risk-free rate subtracted, Sharpe reported with bootstrap [p5, p95] confidence interval.

Metric Value

v2.1 Walk-Forward Calibration (future headline)

Each trade's weight is resolved from the calibration snapshot in effect at its logged_at date, which eliminates in-sample look-ahead bias. Snapshots are written every Sunday at 12:00 UTC. The coverage stat indicates how trustworthy the v2.1 number is. When coverage ≥ 50% and the window ≥ 0.5 years, this becomes the headline number.
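The point-in-time lookup can be sketched as below. The data shapes and function names are hypothetical, and coverage is assumed here to mean the share of trades for which a snapshot weight existed at logged_at; the page does not define it. Dates are compared at day granularity, so a trade logged on a Sunday before 12:00 UTC would need timestamps to be handled strictly.

```python
from bisect import bisect_right
from datetime import date

def weight_as_of(snapshots, signal, logged_at):
    """snapshots: list of (snapshot_date, {signal: weight}), sorted by date.

    Returns the weight from the latest snapshot dated on or before logged_at,
    or None when no snapshot existed yet (or it lacks this signal).
    """
    dates = [d for d, _ in snapshots]
    i = bisect_right(dates, logged_at)
    if i == 0:
        return None
    return snapshots[i - 1][1].get(signal)

def coverage(trades, snapshots):
    """Share of (signal, logged_at) trades with a point-in-time weight (assumed definition)."""
    if not trades:
        return 0.0
    hits = sum(1 for sig, d in trades if weight_as_of(snapshots, sig, d) is not None)
    return hits / len(trades)

def is_headline(cov, window_years):
    """Promotion rule stated above: coverage >= 50% and window >= 0.5 years."""
    return cov >= 0.50 and window_years >= 0.5
```

For example, with snapshots dated 2026-05-10 and 2026-05-17, a trade logged on 2026-05-12 takes the 2026-05-10 weight, and a trade logged on 2026-05-05 has no weight and lowers coverage.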

Metric Value

Cumulative NAV — calibrated alpha replay

Starting at $100,000. Each step compounds that day's net contribution at the calibrated weight. Drawdowns shaded.
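A minimal sketch of the compounding and the drawdown series behind the shading, assuming each day's net contribution is expressed as a fractional return on NAV (the function names are illustrative):

```python
def nav_curve(daily_contribs, start=100_000.0):
    """Compound each day's net contribution onto the running NAV."""
    nav, curve = start, []
    for r in daily_contribs:
        nav *= 1.0 + r
        curve.append(nav)
    return curve

def drawdowns(curve, start=100_000.0):
    """Fractional distance below the running peak (0.0 at a new high)."""
    peak, out = start, []
    for v in curve:
        peak = max(peak, v)
        out.append(v / peak - 1.0)
    return out
```

A day is in drawdown whenever its value in the second series is negative; the most negative value is the maximum drawdown.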

Signal contribution rankings

Which signals drove the cumulative alpha? Top contributors deserve more weight; bottom contributors are candidates for inversion or removal.
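One way to build such a ranking is sketched below. It assumes a trade's contribution is its calibration weight times its realized return, which the page does not state explicitly; the field names mirror the table columns but are otherwise hypothetical.

```python
from collections import defaultdict

def signal_table(outcomes, weights):
    """outcomes: iterable of (signal, realized_return); weights: {signal: weight}.

    Returns one row per signal, sorted by total contribution, best first.
    """
    by_signal = defaultdict(list)
    for sig, ret in outcomes:
        by_signal[sig].append(ret)

    rows = []
    for sig, rets in by_signal.items():
        w = weights.get(sig, 0.0)
        contribs = [w * r for r in rets]  # assumed: contribution = weight * return
        rows.append({
            "signal": sig,
            "weight": w,
            "n": len(rets),
            "win_rate": sum(r > 0 for r in rets) / len(rets),
            "avg_return": sum(rets) / len(rets),
            "avg_contrib": sum(contribs) / len(contribs),
            "total_contrib": sum(contribs),
        })
    rows.sort(key=lambda row: row["total_contrib"], reverse=True)
    return rows
```

The head of the sorted list gives the top contributors and the tail gives the negative ones.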

↑ Top alpha contributors

↓ Negative contributors

Full signal table

Every signal with at least one scored outcome. Click column headers to sort.

Signal Weight N Win Rate Avg Return Avg Contrib Total Contrib

Realistic backtest — friction applied (v1.2)

The v1.1 backtest is idealized: no slippage, no exposure caps, no concentration limits. v1.2 applies real-world frictions to the same trades to see how much survives: 5 bps slippage per leg (10 bps round-trip), a 40% NAV cap per signal type per day, and a 100% NAV gross exposure cap per day. The drag between the two estimates what the system would have paid in a live account.
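The three frictions can be sketched for a single day as follows. The constants come from the text; the pro-rata scaling used when a cap binds, the order of the two caps, and the input shape are assumptions, since the page does not say how oversized days are cut back.

```python
SLIPPAGE_ROUND_TRIP = 0.0010  # 5 bps per leg, two legs
TYPE_CAP = 0.40               # max NAV fraction per signal type per day
GROSS_CAP = 1.00              # max total NAV fraction per day

def net_day_return(day_trades):
    """day_trades: list of (signal_type, size, gross_return).

    size is a positive fraction of NAV and gross_return already reflects the
    trade's direction. Returns the day's net return on NAV.
    """
    # Cap 1: scale each signal type down pro rata to at most 40% of NAV.
    type_total = {}
    for sig_type, size, _ in day_trades:
        type_total[sig_type] = type_total.get(sig_type, 0.0) + size
    type_scale = {t: min(1.0, TYPE_CAP / tot) if tot > 0 else 1.0
                  for t, tot in type_total.items()}
    sized = [(size * type_scale[t], ret) for t, size, ret in day_trades]

    # Cap 2: scale the whole day down pro rata to at most 100% gross.
    gross = sum(size for size, _ in sized)
    gross_scale = min(1.0, GROSS_CAP / gross) if gross > 0 else 1.0

    # Slippage: charge the round-trip cost on every trade's return.
    return sum(size * gross_scale * (ret - SLIPPAGE_ROUND_TRIP)
               for size, ret in sized)
```

Running the same trades with all three constants neutralized (zero slippage, very large caps) reproduces the idealized figure, so the difference between the two results is the friction drag.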

Metric Idealized v1.1 Realistic v1.2 Drag

Brief calls backtest — would following the brief have made money?

Each decisive call (EXIT_ALL_RISK through LEVER) maps to a SPY exposure level, and the strategy holds that exposure until the next call. The table compares the NAV from trading the brief's calls verbatim against 100% SPY buy-and-hold over the same window. Note: the ledger started 2026-05-04, so this metric is sparse until more calls accumulate.
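The replay can be sketched as below. Only the two endpoint call names appear on this page, so the mapping lists just those, and both exposure values are placeholders rather than the system's real levels. The sketch also assumes the period return is simply exposure times the SPY return, ignoring financing costs and intra-period rebalancing.

```python
# Placeholder exposure levels; the real mapping and the intermediate
# call names are not given on this page.
EXPOSURE = {"EXIT_ALL_RISK": 0.0, "LEVER": 1.5}

def replay_calls(periods, start_nav=100_000.0):
    """periods: list of (call, spy_return) for the span each call was in force.

    Returns per-period rows plus the final strategy NAV and buy-and-hold NAV.
    """
    strat_nav, hold_nav, rows = start_nav, start_nav, []
    for call, spy_ret in periods:
        exposure = EXPOSURE[call]
        strat_ret = exposure * spy_ret  # assumed: linear in exposure
        strat_nav *= 1.0 + strat_ret
        hold_nav *= 1.0 + spy_ret
        rows.append({
            "call": call,
            "exposure": exposure,
            "spy_ret": spy_ret,
            "strategy_ret": strat_ret,
            "excess": strat_ret - spy_ret,  # strategy minus buy-and-hold
        })
    return rows, strat_nav, hold_nav
```

With these placeholder levels, an EXIT_ALL_RISK call ahead of a 10% SPY decline followed by a LEVER call ahead of a 10% rise ends at $115,000 against $99,000 for buy-and-hold.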

Call Period Days Exposure SPY Δ Strategy Δ Contribution