Numinor Co-Movement Graph — Methodology

SKU: comovement-graph-v1 · Full treatment: Numinor Co-Movement Graph Whitepaper v1.3 · June 2026

This is the buyer-facing methodology extract. It describes how the feed is built, how it was validated, and how to read its fields. The whitepaper carries the full evidence and figures; every number below is reproduced from the shipped feed by the reference codebase.

1. Two layers: substrate and overlay

The product is built in two layers, and keeping them distinct is the whole idea.

The substrate is the news co-movement graph. Every scored pair is one the market is co-mentioning in news over the trailing 90 days. Co-mention is a revealed-attention signal: when the market discusses two companies together — same supply shock, same policy, same deal — it is treating them as related. The substrate is broad, timely, and noisy. It tells you which pairs are in play right now (~54,000 A-share pairs over ~5,450 names in a typical month).
The overlay is the structural grading. Each co-mentioned pair is annotated with four structural "lamps." The lamps explain which of those co-movements are grounded in a durable link. News activates the lamps; the lamps explain the news. The product is the structure of the overlap — news ∩ network = confirmed co-movers, news − network = the blacklist.

This is deliberately not a structural-relationship map. A network-only feed would enumerate every supply-chain or affiliate pair across the whole market — most dormant at any moment, and unable to say which observed correlations are spurious. The blacklist exists only because the news substrate gives an observed co-movement to certify.

2. The four lamps

Lamp	What it means	Source
deep product peer	the two firms make essentially the same specific product (deep overlap on the SAM product tree, depth ≥4)	SAM product ontology
SAM supply chain	one firm's product is an input to the other's (core inputs)	SAM product ontology — available continuously, not only when disclosed
disclosed customer–supplier	an ongoing relationship reported in financial statements, active within 2 years, bids excluded	mandatory disclosures
affiliate	common ownership / cross-holding	ownership tables

A pair with no lamp lit is dark — co-mentioned but structurally unexplained, the spurious-correlation candidate. In a typical vintage ~two-thirds of co-mentioned pairs are confirmed and one-third are dark.
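The confirmed/dark split reduces to a simple rule over the four lamps. A minimal sketch, assuming each co-mentioned pair arrives as a record with four boolean flags; the class and field names here are illustrative and are not the feed's actual column names:

```python
from dataclasses import dataclass, asdict

# Hypothetical record layout; the shipped feed's field names may differ.
@dataclass(frozen=True)
class PairLamps:
    deep_product_peer: bool
    sam_supply_chain: bool
    disclosed_customer_supplier: bool
    affiliate: bool

def lit_lamps(pair: PairLamps) -> list[str]:
    """Names of the lamps that are lit for a co-mentioned pair."""
    return [name for name, on in asdict(pair).items() if on]

def classify(pair: PairLamps) -> str:
    """'confirmed' if any lamp is lit (news ∩ network), otherwise 'dark' (news − network)."""
    return "confirmed" if lit_lamps(pair) else "dark"
```

Every input record is assumed to be a co-mentioned pair already, so the news substrate acts as the filter and the lamps only grade what passes it.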

3. Develop and holdout

All exploration was done on a develop window (2017 – mid-2022). A holdout (2023 onward, 42 monthly vintages) was reserved before testing, kept untouched during development, and used only to confirm develop-window findings. Every headline number is the holdout figure. The forecast model that powers expected_fwd_corr is fitted on develop only, so a holdout pair's forecast never sees holdout data. Where develop and holdout disagree, the effect is rejected. That is how the tender-bid layer was eliminated: it forecasts in-sample but flips sign out-of-sample.

4. The correlation measure

For each month-end T and pair, trailing 90-day and forward 60-day co-movement are measured with one normalized measure — a z-scored return inner product over the window, with a min-overlap denominator. Using the same measure for both makes trailing (shipped) and forward (validation only) directly comparable, so a lamp's forecast is a like-for-like statement about the next quarter. Correlations, not covariances, are the unit throughout.
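One possible reading of this measure, as a sketch rather than the reference implementation: z-score each return series over the days both names traded, take the inner product, and divide by the overlap count, refusing to score a pair with too few overlapping days. The function name, the `min_overlap` default, and the use of `None` for a missing day are assumptions; this extract does not spell out the exact normalization.

```python
import math
from typing import Optional, Sequence

def window_corr(x: Sequence[Optional[float]],
                y: Sequence[Optional[float]],
                min_overlap: int = 40) -> Optional[float]:
    """Z-scored inner product of two daily return series over one window.

    Returns None when fewer than `min_overlap` days have both returns,
    or when either series is constant over the overlap.
    """
    # Keep only the days on which both returns are observed.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < min_overlap:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a, _ in pairs) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for _, b in pairs) / n)
    if sd_x == 0.0 or sd_y == 0.0:
        return None
    # Inner product of the z-scores, normalized by the overlap count.
    return sum(((a - mean_x) / sd_x) * ((b - mean_y) / sd_y) for a, b in pairs) / n
```

Under these assumptions the same function would be called once on the trailing 90-day window and once on the forward 60-day window, which is what makes the two numbers comparable.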

5. Does a lamp forecast correlation? (Fama–MacBeth)

Each month, forward correlation is regressed cross-sectionally on trailing correlation, the four lamp indicators, and a dark indicator, against a random-pair baseline; coefficients are averaged across months. A lamp's coefficient is the additional forward correlation it predicts over and above trailing correlation and the other lamps.
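The two-pass procedure can be sketched as follows. This is a plain Fama–MacBeth pass with unadjusted standard errors; the function name and input layout are assumptions, and any refinements used in the reference codebase (for example autocorrelation-adjusted errors) are not shown.

```python
import numpy as np

def fama_macbeth(monthly_panels):
    """Average monthly cross-sectional OLS coefficients.

    monthly_panels: list of (X, y), one entry per month-end, where
      X is an (n_pairs, k) array of regressors (trailing correlation,
      lamp indicators, ...) and y is the (n_pairs,) forward correlation.
    Returns (mean_coef, t_stat), each of length k + 1, intercept first.
    """
    coefs = []
    for X, y in monthly_panels:
        design = np.column_stack([np.ones(len(y)), X])  # prepend intercept
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        coefs.append(beta)
    coefs = np.asarray(coefs)
    n_months = coefs.shape[0]
    mean_coef = coefs.mean(axis=0)
    # Standard error of the time-series mean of the monthly coefficients.
    std_err = coefs.std(axis=0, ddof=1) / np.sqrt(n_months)
    return mean_coef, mean_coef / std_err
```

Because the t-statistic comes from the month-to-month variation of each coefficient, a small effect can still be highly significant if it has the same sign in nearly every month.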

Holdout result (reproduced from the feed):

Lamp	Additional forward correlation	t-statistic
deep product peer (over a +0.049 shallow base)	+0.065	22
SAM supply chain	+0.028	9
disclosed customer–supplier (2-yr)	+0.013	2.6
affiliate	+0.005	2.2
dark (news-only)	−0.005	−1.2

A deep product peer adds ~+0.11 of persistent forward correlation in total (the +0.049 shallow base plus the +0.065 deep increment). The effects are reliable (large t-stats over tens of thousands of pairs) and modest in size, which is the realistic profile for a relationship dataset.

expected_fwd_corr in the feed applies the develop-fit version of this model (model_coefficients.json); the holdout numbers above are the out-of-sample confirmation.
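As a rough illustration of how a linear model of this kind turns a pair's fields into a forecast, the sketch below sums an intercept, a slope on trailing correlation, and the coefficient of each lit indicator. The key names, the separate `shallow_product_peer` term, and the dictionary layout are assumptions; the actual structure of model_coefficients.json may differ, and the develop-fit values must be read from that file.

```python
from typing import Mapping

# Hypothetical indicator names; the shipped coefficient file may use others.
INDICATORS = (
    "shallow_product_peer",
    "deep_product_peer",
    "sam_supply_chain",
    "disclosed_customer_supplier",
    "affiliate",
    "dark",
)

def expected_fwd_corr(trailing_corr: float,
                      flags: Mapping[str, bool],
                      coef: Mapping[str, float]) -> float:
    """Linear forecast: intercept + slope * trailing + sum of lit-indicator coefficients."""
    value = coef["intercept"] + coef["trailing_corr"] * trailing_corr
    for name in INDICATORS:
        if flags.get(name, False):
            value += coef[name]
    return value
```

With the holdout table's lamp values plugged in, a deep product peer (which under this assumed layout also lights the shallow term) would pick up 0.049 + 0.065 ≈ 0.11 from the product-peer terms, matching the total quoted above.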

6. Confirmed vs. unconfirmed correlations (the "discount list")

Matching pairs on their trailing correlation and looking forward, structurally confirmed pairs retain ~92% of their correlation a quarter later, versus ~75% for pairs selected on price history alone. This is the field a buyer cannot reconstruct from prices: which correlated pairs are correlated for a real reason and which are coincidence. Read confirmed / dark against your own price-correlation matrix: keep the confirmed correlations, discount the dark ones.
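A buyer replicating the retention comparison might proceed along these lines. The sketch assumes retention is defined as summed forward correlation over summed trailing correlation within a bucket of similar trailing correlation; the function name, the bucket width, and the group labels are illustrative, and the whitepaper's matching scheme may differ.

```python
import math
from collections import defaultdict

def retention_by_bucket(rows, bucket_width=0.1):
    """rows: iterable of (group, trailing_corr, forward_corr) tuples.

    Returns {(group, bucket_index): retention}, where bucket_index is
    floor(trailing_corr / bucket_width) and retention is
    sum(forward) / sum(trailing) within that cell.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # cell -> [sum trailing, sum forward]
    for group, trailing, forward in rows:
        cell = (group, math.floor(trailing / bucket_width))
        sums[cell][0] += trailing
        sums[cell][1] += forward
    # Cells with a non-positive trailing sum are skipped: the ratio is not meaningful there.
    return {cell: fwd / trail for cell, (trail, fwd) in sums.items() if trail > 0}
```

Comparing groups within the same bucket index keeps the comparison like-for-like, since both groups start from a similar trailing correlation.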

7. Hedging

For each CSI 300 member each month, the single name is hedged three ways — index, equal-weight same-industry basket, and its top-N graph peers — fitting the hedge ratio on trailing data and measuring forward residual volatility. A 10-name graph hedge cuts ~45% of forward residual variance and beats a ~100-name industry basket ~76% of the time, out-of-sample (11,262 stock-months). N is a cost/quality dial; ten is the sweet spot.
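The fit-on-trailing, measure-on-forward loop can be sketched for a single stock-month as below. It assumes the peers are combined into an equal-weight basket with one OLS hedge ratio, and that variance reduction is measured against the unhedged forward variance; both are simplifying assumptions, and the function names are illustrative rather than taken from the reference codebase.

```python
def equal_weight_basket(peer_returns):
    """Average the peers' returns day by day (all series must have equal length)."""
    return [sum(day) / len(day) for day in zip(*peer_returns)]

def variance(xs):
    """Population variance of a return series."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def hedge_ratio(target, hedge):
    """OLS slope of target returns on hedge returns (regression with intercept)."""
    mean_t = sum(target) / len(target)
    mean_h = sum(hedge) / len(hedge)
    cov = sum((t - mean_t) * (h - mean_h) for t, h in zip(target, hedge))
    var = sum((h - mean_h) ** 2 for h in hedge)
    return cov / var

def forward_variance_reduction(trail_target, trail_hedge, fwd_target, fwd_hedge):
    """Fit the ratio on trailing data, then score it on forward data only."""
    beta = hedge_ratio(trail_target, trail_hedge)
    residual = [t - beta * h for t, h in zip(fwd_target, fwd_hedge)]
    return 1.0 - variance(residual) / variance(fwd_target)
```

Running the same function with an index series, an industry basket, and a graph-peer basket as the hedge, then comparing the three reductions, gives a win rate of the kind quoted above.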

8. Scope (stated plainly)

Large-cap. The hedging edge fades from CSI 300 (the graph hedge beats the industry basket 76% of the time) to a tie at CSI 1000 (49%). This follows from the substrate: small caps are not co-mentioned enough to enter the graph with signal.
Not a global risk model. The pairwise edge washes out in a whole-universe optimizer; the value is targeted (single-name hedging, pairs, concentrated risk).
Correlation, not returns. Not alpha — return-spillover on the same graph is flat-to-negative out-of-sample.
Modest magnitudes, and a reused holdout. The 2023+ holdout has been queried several times across validation, so read the figures as confirmed out-of-sample, not as a single first-look test.

9. Why information content, not a Sharpe

We report correlation-forecast coefficients, retention, and variance reduction rather than a portfolio Sharpe. A Sharpe depends on universe, weighting, neutralization, turnover, and costs that differ for every buyer; a correlation forecast and a variance-reduction figure are construction-independent and portable. You apply the graph to your own risk model and hedging book and measure your own contribution — corr_delta shows exactly where the graph disagrees with the price screen. The decisive test is buyer-side replication on your stack.

Numinor Systems Limited · Full methodology + figures: Co-Movement Graph Whitepaper v1.3 · support@numinor.io