Backtest

Calibrated alpha replay — what cumulative P&L would the system have generated if every realized signal had been sized by its current calibration weight? This page replays the entire historical signal ledger using today's trust weights to estimate the strategy's hypothetical NAV.

Methodology

The headline numbers below come from v2.0 (Honest Backtest). Each scored realistic trade's net P&L (post-slippage, post-concentration-cap, post-gross-cap) is booked on its logged_at firing date. A complete business-day calendar is then constructed from the first to the last date, and every Mon–Fri without a trade contributes 0 to that day's P&L. Annualized return is computed geometrically from the compounded NAV; Sharpe subtracts the 5% risk-free rate. The headline Sharpe is then bracketed by a 1000-sample block bootstrap (5-day blocks) to give an honest [p5, p95] confidence interval, i.e. a 90% band. v1.1 (idealized, no friction) and v1.2 (with frictions, but counting only trade days in volatility) are kept below for comparison, so that methodology drag and friction drag are both visible.
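The v2.0 steps described above can be sketched roughly as follows. This is a minimal illustration, not the page's actual implementation: the function names, the 252-day annualization factor, the choice to express each trade's P&L as a fraction of NAV, and the exact Sharpe form (annualized geometric return minus 5%, divided by annualized daily volatility) are all assumptions.

```python
import math
import random
from datetime import date, timedelta

RISK_FREE = 0.05      # annual risk-free rate, from the methodology text
DAYS_PER_YEAR = 252   # assumption: the page does not state its annualization factor


def business_days(first, last):
    """Every Mon-Fri from first to last inclusive; market holidays are not removed."""
    n = (last - first).days + 1
    return [d for d in (first + timedelta(days=i) for i in range(n)) if d.weekday() < 5]


def daily_returns(trades):
    """trades: (firing_date, net_return_as_fraction_of_NAV) pairs.
    Assumes firing dates fall on business days. Trades are summed per
    firing date; business days with no trade contribute 0."""
    booked = {}
    for fired_on, net_ret in trades:
        booked[fired_on] = booked.get(fired_on, 0.0) + net_ret
    return [booked.get(d, 0.0) for d in business_days(min(booked), max(booked))]


def annualized_return(rets):
    """Geometric annualization of the compounded NAV (assumes every return > -1)."""
    return math.prod(1.0 + r for r in rets) ** (DAYS_PER_YEAR / len(rets)) - 1.0


def sharpe(rets):
    """(annualized return - risk-free) / annualized volatility; NaN if volatility is 0.
    Needs at least two observations."""
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    if var == 0.0:
        return float("nan")
    return (annualized_return(rets) - RISK_FREE) / math.sqrt(var * DAYS_PER_YEAR)


def bootstrap_sharpe(rets, n_samples=1000, block=5, seed=0):
    """Resample contiguous blocks with replacement; return (p5, p95) of the Sharpe.
    Requires len(rets) >= block and some variation in rets."""
    rng = random.Random(seed)
    last_start = len(rets) - block
    stats = []
    for _ in range(n_samples):
        sample = []
        while len(sample) < len(rets):
            start = rng.randint(0, last_start)
            sample.extend(rets[start:start + block])
        s = sharpe(sample[:len(rets)])
        if not math.isnan(s):
            stats.append(s)
    stats.sort()
    return stats[int(0.05 * (len(stats) - 1))], stats[int(0.95 * (len(stats) - 1))]
```

Including the zero-trade days matters because they lengthen the return series: the annualization exponent falls, so the compounded return is spread over the full calendar rather than only over the days that traded.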

Known limitation: calibration weights are still fitted in-sample. Walk-forward out-of-sample calibration is v2.1 (next ship). This page auto-recomputes every 6 hours.

v2.0 Honest Backtest Headline

The number to publish. Lump-sum P&L on each trade's firing day, full business-day calendar (zero-trade days included in volatility), 5% risk-free rate subtracted, Sharpe reported with bootstrap [p5, p95] confidence interval.

Metric	Value

v2.1 Walk-Forward Calibration (future headline)

Each trade's weight is resolved from the calibration snapshot in effect at its logged_at date, which eliminates in-sample look-ahead bias. Snapshots are written every Sunday at 12:00 UTC. The coverage stat indicates how trustworthy the v2.1 number is. When coverage ≥ 50% AND window ≥ 0.5y, this becomes the headline number.
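As a rough sketch of the snapshot lookup and the promotion rule: the data layout, the function names, and in particular the definition of coverage (the share of trades whose weight could be resolved from a snapshot dated on or before the trade) are assumptions made for illustration; the page does not define coverage.

```python
from bisect import bisect_right
from datetime import date


def weight_as_of(snapshots, signal, logged_at):
    """snapshots: (snapshot_date, {signal: weight}) pairs sorted by date.
    Returns the weight from the latest snapshot dated on or before logged_at,
    or None when no such snapshot exists or it lacks the signal."""
    dates = [d for d, _ in snapshots]
    i = bisect_right(dates, logged_at)
    if i == 0:
        return None
    return snapshots[i - 1][1].get(signal)


def coverage(trades, snapshots):
    """trades: (signal, logged_at) pairs. Assumed definition: fraction of
    trades with a resolvable walk-forward weight."""
    if not trades:
        return 0.0
    resolved = sum(weight_as_of(snapshots, s, d) is not None for s, d in trades)
    return resolved / len(trades)


def is_headline(cov, window_years):
    """Promotion rule from the text: coverage >= 50% AND window >= 0.5 years."""
    return cov >= 0.5 and window_years >= 0.5


# Two weekly snapshots (both dates are Sundays) with hypothetical weights.
snapshots = [
    (date(2026, 5, 10), {"signal_a": 0.8}),
    (date(2026, 5, 17), {"signal_a": 0.6}),
]
```

A trade logged before the first snapshot resolves to `None` here, so it lowers coverage instead of silently borrowing a weight fitted on later data.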

Metric	Value

Cumulative NAV — calibrated alpha replay

Starting at $100,000. Each step compounds that day's net contribution at the calibrated weight. Drawdowns shaded.
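A minimal sketch of the compounding and the drawdown series behind the shaded regions, assuming each day's net contribution is expressed as a fractional return on NAV (the function names are illustrative):

```python
START_NAV = 100_000.0  # starting NAV, from the text


def nav_path(daily_rets, start=START_NAV):
    """Compound each day's net contribution into a NAV series."""
    nav, path = start, []
    for r in daily_rets:
        nav *= 1.0 + r
        path.append(nav)
    return path


def drawdowns(path, start=START_NAV):
    """Fractional distance below the running peak (0.0 at a new high)."""
    peak, out = start, []
    for v in path:
        peak = max(peak, v)
        out.append(v / peak - 1.0)
    return out
```

For example, a +10% day followed by a -50% day takes NAV from $100,000 to about $110,000 and then about $55,000, which is a 50% drawdown from the peak.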

Signal contribution rankings

Which signals drove the cumulative alpha? Top contributors deserve more weight; bottom contributors are candidates for inversion or removal.

↑ Top alpha contributors

↓ Negative contributors

Full signal table

Every signal with at least one scored outcome. Click column headers to sort.
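The table columns could be derived along these lines. This sketch assumes a trade's contribution is its calibration weight times its realized return, which the page does not state explicitly; the function name and the input layout are also illustrative.

```python
from collections import defaultdict


def signal_table(outcomes, weights):
    """outcomes: (signal, realized_return) pairs, one per scored outcome.
    weights: {signal: calibration_weight}.
    Returns one row per signal, sorted by total contribution, best first."""
    by_signal = defaultdict(list)
    for signal, ret in outcomes:
        by_signal[signal].append(ret)

    rows = []
    for signal, rets in by_signal.items():
        w = weights.get(signal, 0.0)
        contribs = [w * r for r in rets]  # assumed: contribution = weight * return
        rows.append({
            "signal": signal,
            "weight": w,
            "n": len(rets),
            "win_rate": sum(r > 0 for r in rets) / len(rets),
            "avg_return": sum(rets) / len(rets),
            "avg_contrib": sum(contribs) / len(contribs),
            "total_contrib": sum(contribs),
        })
    rows.sort(key=lambda row: row["total_contrib"], reverse=True)
    return rows
```

With this ordering, the first rows correspond to the top alpha contributors and the last rows to the negative contributors.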

Signal	Weight	N	Win Rate	Avg Return	Avg Contrib	Total Contrib

Realistic backtest — friction applied (v1.2)

The v1.1 backtest is idealized: no slippage, no exposure caps, no concentration limits. v1.2 applies real-world frictions to the same trades to see how much survives: 5 bps slippage per leg (10 bps round-trip), a 40% NAV cap per signal type per day, and a 100% NAV gross exposure cap per day. The drag between the two estimates what those frictions would have cost in a live account.
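One way the three frictions might be applied to a single day's trades is sketched below. The pro-rata scaling when a cap binds, the order of the two caps, and the input layout are assumptions; only the 5 bps, 40% and 100% figures come from the text.

```python
from collections import defaultdict

SLIPPAGE_PER_LEG = 0.0005  # 5 bps per leg, 10 bps round-trip
SIGNAL_CAP = 0.40          # max NAV fraction per signal type per day
GROSS_CAP = 1.00           # max gross NAV exposure per day


def apply_frictions(day_trades):
    """day_trades: (signal_type, size_as_NAV_fraction, gross_return) for ONE day.
    Returns (signal_type, capped_size, net_contribution) tuples."""
    # 1) Concentration cap: scale each signal type down pro rata to 40% of NAV.
    per_type = defaultdict(float)
    for sig, size, _ in day_trades:
        per_type[sig] += abs(size)
    scaled = []
    for sig, size, ret in day_trades:
        k = min(1.0, SIGNAL_CAP / per_type[sig]) if per_type[sig] > 0 else 1.0
        scaled.append((sig, size * k, ret))

    # 2) Gross cap: scale the whole day down pro rata to 100% of NAV.
    gross = sum(abs(size) for _, size, _ in scaled)
    g = min(1.0, GROSS_CAP / gross) if gross > 0 else 1.0

    # 3) Slippage: charge both legs on the final (capped) size.
    out = []
    for sig, size, ret in scaled:
        size *= g
        net = size * ret - abs(size) * 2 * SLIPPAGE_PER_LEG
        out.append((sig, size, net))
    return out
```

Charging slippage after the caps means a trade that was scaled down pays friction only on the size actually taken.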

Metric	Idealized v1.1	Realistic v1.2	Drag

Brief calls backtest — would following the brief have made money?

Each decisive call (EXIT_ALL_RISK..LEVER) maps to a SPY exposure level. The strategy holds that exposure until the next call. This compares the NAV you would have had by trading the brief's calls verbatim against 100% SPY buy-and-hold over the same window. Note: the ledger started 2026-05-04, so this metric is sparse until more calls accumulate.
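The replay can be illustrated with a small sketch. The exposure values below are hypothetical: the page names only the two endpoint calls and gives no numeric levels, so the mapping covers just those two. Applying each exposure to the whole period's SPY return, without daily rebalancing, is a further simplification.

```python
# Hypothetical exposure levels for the two endpoint calls named on the page.
EXPOSURE = {"EXIT_ALL_RISK": 0.0, "LEVER": 1.5}


def replay_calls(calls, spy_closes):
    """calls: (day_index, call_name) pairs sorted by day_index.
    spy_closes: SPY closing prices indexed by day.
    Each call's exposure is held from its day until the next call,
    or until the last available close for the final call."""
    rows, strategy_nav = [], 1.0
    for i, (start, call) in enumerate(calls):
        end = calls[i + 1][0] if i + 1 < len(calls) else len(spy_closes) - 1
        spy_delta = spy_closes[end] / spy_closes[start] - 1.0
        strategy_delta = EXPOSURE[call] * spy_delta  # simplification: no daily rebalancing
        strategy_nav *= 1.0 + strategy_delta
        rows.append({
            "call": call,
            "days": end - start,
            "exposure": EXPOSURE[call],
            "spy_delta": spy_delta,
            "strategy_delta": strategy_delta,
        })
    buy_and_hold = spy_closes[-1] / spy_closes[calls[0][0]] - 1.0
    return rows, strategy_nav - 1.0, buy_and_hold
```

With closes of 100, 110 and 99, a LEVER call on day 0 followed by EXIT_ALL_RISK on day 1 gives the strategy roughly +15% against roughly -1% for buy-and-hold, under these made-up exposures.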

Call	Period	Days	Exposure	SPY Δ	Strategy Δ	Contribution
