research May 16, 2026

Four trade-aggression atoms — and three of them collapsed

A 1,000,000-event evidence pass on four new trade-aggression atoms. Three of them turned out to be the same axis as raw_trade_ofi. One survived as a WATCH.

#raw atoms #trade flow #evidence pass #stage-1

Locking the first three raw atoms settled the seed list: raw_trade_ofi, raw_microprice_dev, raw_ofi_l1. The question this pass asks is whether the trade event can be split into better raw trade sensors than signed size alone.

The question

Can trade events be decomposed into raw trade-aggression atoms that carry information not already captured by raw_trade_ofi?

Hypothesis

raw_trade_ofi is the strongest known raw atom, but it might mix several forms of aggression into one signed-size number. A trade that consumes a large share of the visible queue, trades through the touch, or sweeps multiple book levels could carry information the signed size alone doesn’t.

What’s being tested

Four new candidate atoms, all signed:

| atom | what it measures |
| --- | --- |
| raw_trade_size_norm_l1 | signed trade size ÷ opposite top queue size |
| raw_trade_through_ticks | signed ticks of price penetration through the previous best bid/ask |
| raw_trade_sweep_depth | signed count of previous book levels the trade crossed |
| raw_trade_consumed_l1_frac | capped signed fraction of the opposite top queue the trade consumed |
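
The four definitions above can be made concrete with a small sketch. It is Python written for illustration, not the code behind the `go run . raw` tool. The `PrevBook` layout, the +1/−1 aggressor-side convention, rounding penetration to whole ticks, capping the consumed fraction at 1.0, and counting the touch as level 1 in the sweep depth are all assumptions. The last choice is at least consistent with raw_trade_sweep_depth being nonzero on every trade event in the coverage figures.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PrevBook:
    """Book state just before the trade (hypothetical layout)."""
    bids: List[Tuple[float, float]]  # (price, size), best bid first
    asks: List[Tuple[float, float]]  # (price, size), best ask first


def trade_atoms(side: int, price: float, size: float, book: PrevBook,
                tick_size: float = 0.25) -> Dict[str, float]:
    """Compute the four candidate atoms for one trade.

    side is +1 for a buy aggressor and -1 for a sell aggressor (assumed).
    """
    opposite = book.asks if side > 0 else book.bids
    touch_price, touch_size = opposite[0]

    # Ticks of penetration beyond the previous touch; 0 at or inside the touch.
    through = max(0.0, side * (price - touch_price)) / tick_size

    # Previous opposite-side levels priced at or better than the trade price.
    crossed = sum(1 for level_price, _ in opposite
                  if side * (price - level_price) >= 0)

    ratio = size / touch_size if touch_size > 0 else 0.0
    return {
        "raw_trade_size_norm_l1": side * ratio,
        "raw_trade_through_ticks": side * round(through),
        "raw_trade_sweep_depth": side * crossed,
        "raw_trade_consumed_l1_frac": side * min(ratio, 1.0),  # cap assumed at 1.0
    }


book = PrevBook(bids=[(100.00, 20), (99.75, 10)],
                asks=[(100.25, 10), (100.50, 25), (100.75, 40)])
buy = trade_atoms(side=+1, price=100.50, size=30, book=book)
```

For this buy of 30 lots at 100.50 against a 10-lot best ask at 100.25, the sketch gives a size ratio of 3.0, one tick of penetration, a sweep depth of 2, and a consumed fraction capped at 1.0.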

Controls (the existing locked set):

raw_trade_ofi
raw_microprice_dev
raw_ofi_l1
raw_near_minus_deep_ofi

Sample contract

| field | value |
| --- | --- |
| Instrument id | 42001149 |
| Tick size | 0.25 |
| Files | First 10 sorted DBN files from Z:\MBP-10-NEW\data |
| Sample | 1,000,000 events across 10 day buckets |
| Horizons | h = 1..200 events |

What separates a LOCK from a DROP here

For this study, the decisive bar is uniqueness against raw_trade_ofi.

If the new atom’s residual rank IC after controlling for the existing locked set is near zero, then it isn’t a new sensor — it’s a different encoding of the same trade-flow axis. Only an atom that survives the residual test gets considered for LOCK.
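
One common way to compute a residual rank IC is sketched below. It is an assumption about the method, not a description of what the tool does: rank-transform the candidate and the controls, regress the candidate's ranks on the controls' ranks, and take the rank correlation of the residual with the forward target. All function names are hypothetical, and the controls are assumed not to be perfectly collinear with each other.

```python
from statistics import mean


def ranks(xs):
    """Average ranks: tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def rank_ic(signal, fwd_ret):
    """Spearman correlation between a signal and the forward target."""
    return pearson(ranks(signal), ranks(fwd_ret))


def residualize(y, controls):
    """OLS residual of y on an intercept plus the control columns."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in controls]
    k = len(cols)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(cols[i], y))] for i in range(k)]
    for i in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [x - f * z for x, z in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return [yi - sum(b * c[t] for b, c in zip(beta, cols))
            for t, yi in enumerate(y)]


def residual_rank_ic(candidate, controls, fwd_ret):
    resid = residualize(ranks(candidate), [ranks(c) for c in controls])
    if sum(e * e for e in resid) < 1e-9 * len(resid):
        return 0.0  # candidate ranks are fully explained by the controls
    return rank_ic(resid, fwd_ret)
```

Under this reading, a candidate that is a monotone re-encoding of a control leaves only numerical noise as its residual, which the guard maps to 0.0, while a candidate with independent information keeps a nonzero residual IC.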

What the data said

Scorecard:

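| atom | peak IC | peak h | IC(20) | IC(50) | IC(200) | corr to controls | residual IC | days won | verdict |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| raw_trade_ofi | 0.285 | 1 | 0.161 | 0.112 | 0.061 | 0.022 | 0.289 | 10/10 | LOCK |
| raw_trade_consumed_l1_frac | 0.288 | 1 | 0.161 | 0.112 | 0.061 | 1.000 | 0.120 | 10/10 | DROP |
| raw_trade_size_norm_l1 | 0.287 | 1 | 0.161 | 0.112 | 0.061 | 1.000 | 0.115 | 10/10 | DROP |
| raw_trade_sweep_depth | 0.285 | 1 | 0.161 | 0.112 | 0.061 | 1.000 | −0.014 | 10/10 | DROP |
| raw_trade_through_ticks | 0.141 | 1 | 0.098 | 0.070 | 0.038 | 0.437 | 0.018 | 10/10 | WATCH |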

Residual IC by horizon — what’s left after controlling for the locked set:

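| atom | h=1 | h=5 | h=20 | h=50 | h=100 | h=200 |
| --- | --- | --- | --- | --- | --- | --- |
| raw_trade_size_norm_l1 | 0.115 | 0.030 | 0.010 | 0.005 | 0.001 | −0.000 |
| raw_trade_consumed_l1_frac | 0.120 | 0.027 | 0.006 | 0.002 | −0.000 | −0.001 |
| raw_trade_sweep_depth | −0.014 | −0.021 | −0.015 | −0.009 | −0.007 | −0.004 |
| raw_trade_through_ticks | 0.018 | 0.032 | 0.031 | 0.023 | 0.016 | 0.013 |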

Trade-event coverage (how often the atom is nonzero):

| atom | nonzero rate |
| --- | --- |
| raw_trade_ofi | 5.62% |
| raw_trade_size_norm_l1 | 5.62% |
| raw_trade_sweep_depth | 5.62% |
| raw_trade_consumed_l1_frac | 5.62% |
| raw_trade_through_ticks | 1.06% |

What this means in plain language

Three of the four “new” atoms are duplicates. raw_trade_size_norm_l1, raw_trade_consumed_l1_frac, and raw_trade_sweep_depth all have IC curves nearly identical to raw_trade_ofi, and their Spearman correlation to raw_trade_ofi rounds to 1.000. The two queue-normalised atoms keep a residual IC of 0.115 to 0.120 at h = 1, but it falls to 0.030 or less by h = 5 and to roughly zero by h = 50. In this corpus, the rank ordering of trade events is dominated by the signed trade itself; normalising by queue size or counting levels swept doesn’t create a new axis. Different math, same measurement.
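
One way to see how signed atoms that fire on the same trade events can reach a Spearman correlation that rounds to 1.000: when roughly 94% of events are tied at zero, the rank ordering is set almost entirely by which events are trades and on which side. The sketch below uses purely synthetic trade sizes and queue depths. The uniform distributions, the seed, and the helper names are all assumptions and have nothing to do with the study's data. It compares the rank correlation over all events with the one over trade events only.

```python
import random


def avg_ranks(xs):
    """Average ranks: tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=xs.__getitem__)
    out, i = [0.0] * len(xs), 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(a, b):
    ra, rb = avg_ranks(a), avg_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5


def simulate(n=20000, trade_rate=0.0562, seed=0):
    """Synthetic event stream: mostly zeros, occasional signed trades."""
    rng = random.Random(seed)
    signed_size, size_norm = [], []
    for _ in range(n):
        if rng.random() < trade_rate:
            side = rng.choice((-1, 1))
            size = rng.randint(1, 50)     # synthetic trade size
            queue = rng.randint(1, 200)   # synthetic opposite top-of-book size
            signed_size.append(side * size)
            size_norm.append(side * size / queue)
        else:
            signed_size.append(0)
            size_norm.append(0.0)
    return signed_size, size_norm


signed_size, size_norm = simulate()
rho_all = spearman(signed_size, size_norm)

trades = [(s, q) for s, q in zip(signed_size, size_norm) if s != 0]
rho_trades = spearman([s for s, _ in trades], [q for _, q in trades])

print(f"all events: {rho_all:.3f}   trade events only: {rho_trades:.3f}")
```

In this toy setup the all-event correlation sits very close to 1 even though the queue sizes are independent of the trade sizes, and the trade-only correlation is lower. That is consistent with treating the residual IC, not the raw correlation, as the more discriminating number.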

raw_trade_sweep_depth is worse than a duplicate. Its residual IC is negative at every checked horizon. After controlling for raw_trade_ofi, what’s left of sweep-depth predicts the wrong direction. That’s a duplicate plus noise.

raw_trade_through_ticks is the interesting one. It fires on only ~1% of all events, when a trade actually penetrates the touch. Its peak IC of 0.141 is about half of raw_trade_ofi’s 0.285, but its correlation to raw_trade_ofi is only 0.44 (vs ~1.00 for the others), and its residual IC stays positive from h = 1 through h = 200. The residual is even stronger at h = 5 (0.032) than at h = 1 (0.018), which suggests through-touch aggression resolves a few events out, not instantly. It doesn’t beat raw_trade_ofi, but it isn’t raw_trade_ofi either.

Verdicts

| atom | verdict | reason |
| --- | --- | --- |
| raw_trade_ofi | LOCK | Still the locked anchor of the trade-flow axis. |
| raw_trade_size_norm_l1 | DROP | Spearman 1.00 to raw_trade_ofi. Duplicate encoding. |
| raw_trade_consumed_l1_frac | DROP | Spearman 1.00 to raw_trade_ofi. Duplicate encoding. |
| raw_trade_sweep_depth | DROP | Duplicate, plus negative residual IC. |
| raw_trade_through_ticks | WATCH | Sparse (1% of events), partly independent, stable. Promote to Stage 2. |

What this changes

The Stage 1 trade-aggression slot stays as just raw_trade_ofi. There is no second trade-aggression atom worth locking — yet.

raw_trade_through_ticks moves to Stage 2 as a sparse aggression state. The transforms most worth testing first:

  • agreement_state(raw_trade_through_ticks, raw_microprice_dev) — through-touch aggression when the touch fair value agrees with it.
  • agreement_state(raw_trade_through_ticks, raw_ofi_l1) — through-touch aggression plus book flow.
  • signed_persistence(raw_trade_through_ticks, L = 3..10) — sign streaks of through-touch trades.
  • EWMA(raw_trade_through_ticks, L = 3..20) — pressure persistence on the sparse axis.
  • through_ticks_when_spread_tight — conditional state.
  • through_ticks_when_microprice_disagrees — the contrarian read.
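
The post names these transforms but does not define them, so the sketch below is only one plausible reading of the first three. The sign-agreement rule, the streak semantics (zeros carry the current streak forward, capped at L), and the EWMA smoothing factor of 2 / (L + 1) seeded at zero are all assumptions.

```python
def sign(x):
    """Return -1, 0, or +1."""
    return (x > 0) - (x < 0)


def agreement_state(a, b):
    """sign(a) where a and b share a nonzero sign, else 0 (assumed rule)."""
    return [sign(x) if sign(x) != 0 and sign(x) == sign(y) else 0
            for x, y in zip(a, b)]


def signed_persistence(xs, L):
    """Signed length of the current same-sign streak, capped at +/-L.

    Zero events leave the streak unchanged (assumption); an opposite-sign
    event restarts the streak at +/-1.
    """
    out, streak = [], 0
    for x in xs:
        s = sign(x)
        if s != 0:
            streak = streak + s if sign(streak) == s else s
            streak = max(-L, min(L, streak))
        out.append(streak)
    return out


def ewma(xs, L):
    """Exponentially weighted mean with alpha = 2 / (L + 1), seeded at 0."""
    alpha = 2.0 / (L + 1)
    out, level = [], 0.0
    for x in xs:
        level += alpha * (x - level)
        out.append(level)
    return out


through_ticks = [1, 0, 2, -1, -3, -2, -1]
streaks = signed_persistence(through_ticks, L=3)
```

On a sparse series like raw_trade_through_ticks, whether zeros hold, decay, or reset the state is the main design choice, and it would need to be fixed before any Stage 2 comparison.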

The decisive test for Stage 2 is whether through-touch aggression plus microprice agreement produces a stronger, less-redundant version of the existing trade_microprice_agreement_3 composite.

Reproduce

$files = (Get-ChildItem Z:\MBP-10-NEW\data -Filter *.dbn |
  Sort-Object Name | Select-Object -First 10 -ExpandProperty FullName) -join ','

go run . raw `
  -files $files `
  -instrument-id 42001149 `
  -tick-size 0.25 `
  -max-events 0 `
  -max-events-per-file 100000 `
  -out out\trade_atom_research_20260517_10d_100k

Primary outputs:

raw_atom_scorecard.csv
raw_atom_ic_curve.csv
raw_atom_incremental_ic.csv
raw_atom_orthogonality.csv
raw_atom_health.csv
manifest.txt