C2C Supply-Chain Construct Data — Methodology
Buyer-facing extract from the Numinor C2C Supply-Chain Whitepaper v3.0 (the full 34-page paper, with every robustness test, is the companion download).
The question the data answers
Chinese A-share companies are legally required to disclose their largest customers and suppliers, and public procurement produces a continuous stream of awarded contracts. Both observation channels name who actually transacts with whom, and for how much — but at the legal-entity level, scattered across filings and announcements. This product resolves those observations into a single point-in-time edge table between listed companies, so the relationships are usable quantitatively: as a graph, as exposures, or as the input to spillover signals.
Construction in four steps
- Observe. Two channels: disclosed — mandatory top-5 customer/supplier tables and related-party transaction schedules from periodic reports (reporting periods from 2015); bid — awarded procurement contracts (winning-bid announcements, from 2020).
- Resolve. Each named party maps to its listed-company parent through ChinaScope's structured affiliate-ownership history. Entities owned by no listco drop out; multi-parent entities yield one edge per parent. No language model is involved — the resolution is deterministic joins over structured identifier and ownership tables.
- Weight. Each edge carries its economic value in CNY, ownership-adjusted on both sides: a relationship flowing through a 60%-owned subsidiary counts at 60% to the parent (relation_value_cny = raw value × supplier_own_ratio × customer_own_ratio; the raw value ships alongside).
- Time-stamp. Every edge carries the full Date 1–4 audit trail: reporting-period date → public-availability date (source_publish_date: the filing's publication or the award announcement) → effective date (eff_date = source_publish_date + pit_buffer_days, default 30, dialable 0–120). A buyer can verify no-look-ahead from the data alone.
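The Resolve, Weight and Time-stamp steps above can be sketched in a few lines of Python. This is a minimal illustration, not the production pipeline. The `Edge` class, the `build_edges` helper, the dict-of-ratios input shape and the example tickers are hypothetical. It also assumes `pit_buffer_days` counts calendar days, which the text does not state.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Edge:
    supplier: str              # listed-company parent on the supplier side
    customer: str              # listed-company parent on the customer side
    raw_value_cny: float       # unadjusted value, shipped alongside
    relation_value_cny: float  # ownership-adjusted on both sides
    source_publish_date: date
    eff_date: date


def build_edges(raw_value_cny, supplier_parents, customer_parents,
                source_publish_date, pit_buffer_days=30):
    """Expand one observed relationship into listco-to-listco edges.

    supplier_parents / customer_parents map a listed parent to its ownership
    ratio in the named entity (hypothetical input shape). An empty mapping
    means no listco owns the entity, so no edge is produced.
    """
    if not 0 <= pit_buffer_days <= 120:
        raise ValueError("pit_buffer_days must be within 0-120")
    # Assumption: the buffer is applied in calendar days.
    eff_date = source_publish_date + timedelta(days=pit_buffer_days)
    edges = []
    for supplier, s_ratio in supplier_parents.items():
        for customer, c_ratio in customer_parents.items():
            # One edge per parent pair, weighted by both ownership ratios.
            value = raw_value_cny * s_ratio * c_ratio
            edges.append(Edge(supplier, customer, raw_value_cny, value,
                              source_publish_date, eff_date))
    return edges


# Illustrative values only: a 60%-owned supplier subsidiary selling to an
# entity with two listed parents (100% and 50%).
edges = build_edges(
    raw_value_cny=100_000_000.0,
    supplier_parents={"SUP.SH": 0.60},
    customer_parents={"CUS1.SZ": 1.00, "CUS2.SZ": 0.50},
    source_publish_date=date(2023, 4, 28),
)
for e in edges:
    print(e.supplier, e.customer, round(e.relation_value_cny), e.eff_date)
```

The multi-parent customer yields two edges, and an entity with no listed parent yields none, matching the drop-out rule in the Resolve step.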
Validation — the customer-momentum test
The whitepaper validates the graph's information content with the classic customer-momentum construction (Cohen–Frazzini 2008, adapted to disclosed Chinese data). Each seller's signal is the value-weighted trailing 21-day return of its customers over a 9-month relationship window. Bids are admitted only when material to that seller: at least the seller's median disclosed-customer size, calibrated per company. The two channels are z-scored per date and unioned, and the union is residualized against a 22-factor base and scored by Spearman IC against the 20-day forward return.
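The building blocks of that test can be sketched for a single date in plain Python. This is an illustrative outline, not the whitepaper's harness (the replication codebase is the reference). All function names and input shapes are hypothetical. Trailing returns are assumed to be precomputed, and edges are assumed to be already filtered to the relationship window. The union rule (average where both channels cover a seller) and the handling of sellers with no disclosed customers are assumptions. The residualization against the 22-factor base is omitted.

```python
from statistics import mean, median, pstdev


def material_bids(bid_edges, disclosed_edges):
    """Keep bids at least as large as the seller's median disclosed-customer size.

    Both inputs are lists of (customer, value_cny) for one seller.
    Assumption: with no disclosed customers there is no threshold, so no bid passes.
    """
    if not disclosed_edges:
        return []
    threshold = median(v for _, v in disclosed_edges)
    return [(c, v) for c, v in bid_edges if v >= threshold]


def value_weighted_return(edges, trailing_ret):
    """Value-weighted customer return for one seller; None if nothing is usable."""
    pairs = [(v, trailing_ret[c]) for c, v in edges if c in trailing_ret and v > 0]
    total = sum(v for v, _ in pairs)
    return sum(v * r for v, r in pairs) / total if total > 0 else None


def zscore(signal):
    """Cross-sectional z-score for one date; signal maps seller -> raw value."""
    if not signal:
        return {}
    mu, sd = mean(signal.values()), pstdev(signal.values())
    if sd == 0:
        return {k: 0.0 for k in signal}
    return {k: (v - mu) / sd for k, v in signal.items()}


def union(disclosed_z, bid_z):
    """Assumed union rule: average where both channels cover a seller."""
    out = {}
    for k in disclosed_z.keys() | bid_z.keys():
        vals = [z[k] for z in (disclosed_z, bid_z) if k in z]
        out[k] = sum(vals) / len(vals)
    return out


def _ranks(xs):
    """Average ranks, so tied values share the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_ic(signal, fwd_ret):
    """Spearman rank correlation between signal and forward return."""
    keys = sorted(signal.keys() & fwd_ret.keys())
    rx = _ranks([signal[k] for k in keys])
    ry = _ranks([fwd_ret[k] for k in keys])
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    norm = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return cov / norm if norm > 0 else 0.0
```

For example, a seller with customers worth 60 and 40 whose trailing returns are +10% and -5% gets a raw signal of (60 × 0.10 + 40 × (−0.05)) / 100 = 0.04 before z-scoring.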
| Result (whitepaper v3.0, frozen research vintage) | Value |
|---|---|
| Union orthogonal ICIR, full sample / out-of-sample | +0.470 / +0.394 (t = 4.0 / 2.8) |
| Coverage | ~4,860 unique sellers; ~2,480 per month-end |
| Rebalance phasings positive (§6.1) | 10 / 10 |
| Random 12-factor sub-books positive (§6.2) | 100 / 100 |
| Industry-neutral OOS (§6.5) | +0.484 — the signal strengthens |
| Union-vs-disclosed paired refinement (§6.6) | t = +2.07 full / +1.92 OOS |
| Unscreened bids (§6.8) | t ≈ 0 — the materiality band is load-bearing |
Every number reproduces bit-exactly from the frozen research vintage
(c2c-data-package) via the MIT replication codebase (Numinor-Systems/c2c-codebase,
notebooks 01–04 with per-cell asserts). The live feed's clean production construction
(vintage-deduplicated ownership mapping, real hold ratios, publish-basis PIT) re-runs the
identical harness at +0.465 / +0.381 — the finding is construction-robust.
What the product is, and is not
- It is the graph. The momentum signal is one construction over the edges; the same table supports supplier-side spillover, customer-concentration and counterparty-risk measures, network centrality, and shock-propagation studies.
- It is not a portfolio. The whitepaper reports information content (orthogonal ICIR, coverage, refinement tests) — not Sharpe, capacity, or net-of-cost performance, which are construction-dependent and the buyer's to measure. Section 6.4 discloses plainly that the signal is high-turnover.
- It is point-in-time by contract. Filter on eff_date, never on partition dates; recompute the buffer from source_publish_date if your workflow needs a different allowance.
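The point-in-time rule in the last bullet can be illustrated with a small filter. This is a sketch under stated assumptions. The `visible_edges` helper and the list-of-dicts layout are hypothetical, and the actual delivery format is described in the data dictionary. It also assumes that an edge becomes usable on its eff_date itself (inclusive comparison) and that the buffer is in calendar days.

```python
from datetime import date, timedelta


def visible_edges(edges, as_of, pit_buffer_days=None):
    """Return the edges a backtest may see on `as_of`.

    edges: list of dicts carrying 'source_publish_date' and 'eff_date'.
    pit_buffer_days: None keeps the shipped eff_date; an integer recomputes
    the effective date from source_publish_date with a different allowance.
    """
    out = []
    for e in edges:
        if pit_buffer_days is None:
            eff = e["eff_date"]
        else:
            eff = e["source_publish_date"] + timedelta(days=pit_buffer_days)
        # Assumption: inclusive comparison, the edge is usable on eff_date.
        if eff <= as_of:
            out.append(e)
    return out


# Illustrative row only: published 2023-04-28, shipped with the default 30-day buffer.
sample = [{
    "supplier": "SUP.SH",
    "customer": "CUS1.SZ",
    "source_publish_date": date(2023, 4, 28),
    "eff_date": date(2023, 5, 28),
}]

default_view = visible_edges(sample, as_of=date(2023, 5, 15))
tighter_view = visible_edges(sample, as_of=date(2023, 5, 15), pit_buffer_days=10)
print(len(default_view), len(tighter_view))
```

Under the shipped 30-day buffer the edge is not yet visible on 15 May 2023; with a 10-day allowance recomputed from source_publish_date it is.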
Reading further
- Data dictionary — every column, the PIT rule, delivery layout, quickstart.
- Whitepaper v3.0 (PDF) — the complete methodology and robustness battery.
- Replication codebase — Numinor-Systems/c2c-codebase (MIT).
Numinor Systems Limited · Gandalf (onsite) · support@numinor.io