Numinor SAM Product Momentum Construct Data v1.0
English: Numinor SAM Product Momentum Construct Data 中文: Numinor SAM 产品动量构建数据
1. Product Identity
- Product: SAM PM 构建数据 (Construct Data)
- Version: v1.0
- Methodology Reference: Numinor SAM Product Momentum Whitepaper v2.2 (May 2026)
- Reference Implementation: github.com/Numinor-Systems/sam-pm-construct-reference (MIT License)
What this product is
A pre-engineered stock-level product-momentum signal for A-share listcos, derived from ChinaScope's SAM product taxonomy and daily price data. Each trading day, for each A-share listco with SAM coverage, Numinor publishes three signal values:
- ne_composite_styled: the headline composite signal, suitable for direct use as a factor input or sortable rank
- biz_mom_styled: the product-momentum component (revenue-mix-weighted aggregation of product-level momentum)
- biz_resvol_styled: the product-residual-volatility component (revenue-mix-weighted aggregation of product-level residual volatility)
The buyer can use the composite directly, or combine the two components themselves with their preferred weights.
What this product is NOT
- Not a buy/sell signal. This is a continuous-valued factor signal; the buyer's portfolio construction is theirs to design.
- Not raw ChinaScope data. That is sold separately by ChinaScope. This product uses the raw SAM product mix and daily prices as input and reshapes them into a per-stock momentum signal.
- Not a pre-computed strategy P&L. No long-short basket construction is performed — that's the buyer's choice.
- Not residualized against Numinor's illustrative factors. The signal is factor-model agnostic; the buyer can apply their own factor neutralization on top.
Who it's for
Quant equity buyers operating in Chinese A-shares who want an orthogonal product-momentum factor with established validation (see WP v2.2). It is particularly useful as a complement to traditional price-momentum factors, because it captures momentum at the product level (where the company actually does business) rather than at the stock level (where prices are observed).
2. Schema
Primary schema (parquet and CSV — identical columns)
| Column | Type | Description |
|---|---|---|
| trade_date | date32 | The trading date this signal value applies to (Asia/Shanghai timezone). |
| ts_code | string | A-share stock ticker in format NNNNNN.XX (e.g., 000001.SZ, 600000.SH). |
| ne_composite_styled | float64 | Composite product-momentum signal. Z-scored cross-sectionally per trade_date (i.e., mean ≈ 0 and std ≈ 1 across the cross-section on any given day). |
| biz_mom_styled | float64 | Product-momentum component. Z-scored cross-sectionally. |
| biz_resvol_styled | float64 | Product-residual-volatility component. Z-scored cross-sectionally. |
| source_basis | string | Always "sam_product_mix" in v1.0. Reserved for future variants (e.g., supply-chain-routed momentum). |
| source_rpt_date | date32 | The SAM source data effective date used to compute the revenue-mix weights (PIT-correct). |
| source_publish_date | date32 | When the governing filing became public. (schema v1.1) |
| eff_date | date32 | When the filing became usable: source_publish_date + 30 days. Strict audit column: eff_date ≤ trade_date on every served row. (schema v1.1) |
Example rows
trade_date | ts_code | ne_composite_styled | biz_mom_styled | biz_resvol_styled | source_basis | source_rpt_date
2026-05-30 | 000001.SZ | +0.31 | +0.42 | -0.18 | sam_product_mix | 2025-12-31
2026-05-30 | 000002.SZ | -0.62 | -0.81 | +0.55 | sam_product_mix | 2025-12-31
2026-05-30 | 600519.SH | +1.85 | +2.10 | +0.32 | sam_product_mix | 2025-12-31
2026-05-30 | 300750.SZ | -0.04 | +0.18 | -0.41 | sam_product_mix | 2025-12-31
Type / value rules
- trade_date: ISO 8601 calendar date (YYYY-MM-DD). Trading-day-aligned (Shanghai/Shenzhen Stock Exchange calendar). Non-trading days are NOT published.
- ts_code: A-share listco tickers only. Format NNNNNN.SZ for Shenzhen, NNNNNN.SH for Shanghai.
- ne_composite_styled, biz_mom_styled, biz_resvol_styled: float64, z-scored cross-sectionally per trade_date. Typical range [-3, +3] with extremes occasionally beyond ±5. Cross-section mean is ~0, std is ~1.0 (small deviations from exactly 1.0 due to NaN handling).
- source_basis: String literal "sam_product_mix". New values may be added in future versions but existing values are stable.
- source_rpt_date: The SAM source data effective date used for revenue-mix weighting. Always ≤ trade_date - 30 calendar days (see §4 PIT discipline).
- source_publish_date: the filing's publication date (source_rpt_date ≤ source_publish_date).
- eff_date: source_publish_date + 30 calendar days, i.e. when the filing became usable. eff_date ≤ trade_date on every row (the strict no-look-ahead invariant, verifiable directly from the data).
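The rules above lend themselves to a mechanical check after loading a partition. The following is a minimal sketch, assuming pandas and a DataFrame carrying the primary schema columns; the function name `validate_partition` and the tolerance `tol` are illustrative choices, not part of the reference implementation. The schema v1.1 columns (source_publish_date, eff_date) are deliberately left out of the required list here.

```python
import pandas as pd

SIGNAL_COLS = ["ne_composite_styled", "biz_mom_styled", "biz_resvol_styled"]
REQUIRED_COLS = ["trade_date", "ts_code", *SIGNAL_COLS, "source_basis", "source_rpt_date"]
TICKER_PATTERN = r"\d{6}\.(?:SZ|SH)"  # NNNNNN.SZ or NNNNNN.SH


def validate_partition(df: pd.DataFrame, tol: float = 0.25) -> list:
    """Return a list of rule violations; an empty list means every check passed.

    `tol` is a hypothetical tolerance for "mean ~0, std ~1", not a documented value.
    """
    missing = [c for c in REQUIRED_COLS if c not in df.columns]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    if df.duplicated(["trade_date", "ts_code"]).any():
        problems.append("duplicate (trade_date, ts_code) rows")
    bad_ticker = ~df["ts_code"].astype(str).str.fullmatch(TICKER_PATTERN)
    if bad_ticker.any():
        problems.append(f"ts_code format violations: {int(bad_ticker.sum())}")
    if (df["source_basis"] != "sam_product_mix").any():
        problems.append("unexpected source_basis value")
    # Per-date cross-sectional mean and std of each signal column.
    stats = df.groupby("trade_date")[SIGNAL_COLS].agg(["mean", "std"])
    for col in SIGNAL_COLS:
        if (stats[(col, "mean")].abs() > tol).any():
            problems.append(f"{col}: cross-section mean not ~0")
        if ((stats[(col, "std")] - 1.0).abs() > tol).any():
            problems.append(f"{col}: cross-section std not ~1")
    return problems
```

Returning a list of messages rather than raising lets a nightly job log every violation in one pass.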
Sign convention
A higher ne_composite_styled predicts a higher 20-day forward return. Buyer convention should be:
- Long the top quantile
- Short the bottom quantile
- Or use as a positive-direction factor weight in a multi-factor model
If a buyer's factor model expects "low value = good" (some Barra-style conventions), they can simply negate the column.
What's NOT in the schema (intentional)
- No "raw" intermediate values (biz_mom_daily, biz_mom_neu, z_mom, etc.): these can be reconstructed from the source data + methodology doc + reference code, and aren't needed for normal use.
- No per-product breakdown (product-level momentum values aggregated to the stock): the aggregation is the product, and exposing per-product values would expose the SAM taxonomy details that are ChinaScope's IP, not Numinor's.
- No factor exposures (size, value, momentum betas, etc.) — buyer's own factor model handles this.
3. File Layout & Delivery
File layout — Hive-partitioned daily
The product is published as Hive-partitioned daily parquet (one partition per trading day), mirroring SAM Amplifier 构建数据:
s3://numinor-construct-data/sam-pm/parquet/
└── year=YYYY/month=MM/day=DD/data.parquet (one partition per trading day)
Each partition holds every covered stock's row for that trade_date (~3,000–4,600 rows). A
CSV mirror (data.csv.zip, ZIP-compressed, identical columns) is written alongside each
parquet partition. Parquet is snappy-compressed. The API layer (§7) mints signed-URL access
over this store — /historical returns the full range, /delta/{YYYYMMDD} a single day's
partition, /range a date span.
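The partition naming can be expressed as a small helper. This is a sketch only: it assumes zero-padded month and day (as the MM/DD placeholders suggest), and the function name `partition_uri` is hypothetical. Subscribers reach these objects through signed URLs minted by the API, so the helper is mainly useful for mirroring the same layout in a local store.

```python
from datetime import date

PARQUET_ROOT = "s3://numinor-construct-data/sam-pm/parquet"


def partition_uri(trade_date: date, fmt: str = "parquet") -> str:
    """Build the Hive-style partition path for one trading day.

    fmt="parquet" -> data.parquet, fmt="csv" -> data.csv.zip (the CSV mirror).
    """
    if fmt not in ("parquet", "csv"):
        raise ValueError(f"unsupported format: {fmt}")
    filename = "data.parquet" if fmt == "parquet" else "data.csv.zip"
    return (
        f"{PARQUET_ROOT}/year={trade_date:%Y}/month={trade_date:%m}"
        f"/day={trade_date:%d}/{filename}"
    )


print(partition_uri(date(2026, 4, 7)))
# s3://numinor-construct-data/sam-pm/parquet/year=2026/month=04/day=07/data.parquet
```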
Current published coverage: 2016-01-05 → 2026-04-07 (2,489 daily partitions); advances each
trading day via the daily refresh cron (scripts/cron/run_cron_a.py).
File sizes (approximate)
| File | Parquet | CSV.zip |
|---|---|---|
| Historical dump (2016-2025, ~10M rows: ~4000 stocks × 2500 trading days) | ~120 MB | ~400 MB |
| Daily delta (one trading day, ~3000-5000 rows) | ~0.5-1 MB | ~1-3 MB |
These files are an order of magnitude smaller than SAM Amplifier 构建数据 because SAM PM is one row per (stock, day) rather than many edges per (stock, day).
Update cadence & SLA
- Daily refresh: new sam_pm_delta_YYYYMMDD.parquet published by 06:00 Asia/Shanghai time, for trading day YYYYMMDD (T+1).
- No publication on Chinese A-share market holidays.
- Historical dump: issued once per subscriber at onboarding; rebuilt only when methodology changes (rare; documented in changelog).
- Methodology stability: Numinor commits not to change the methodology mid-version. Methodology updates trigger a version bump (e.g., v1.1) with 60-day advance notice.
Buyer-chosen rebalance cadence
The signal is published every trading day for every covered stock. The buyer is NOT restricted to any particular rebalance cadence:
- Monthly rebalancer? Pull signal values from each month-end.
- Weekly Wednesday rebalancer? Pull from each Wednesday.
- 20-trading-day rebalancer (as in WP v2.2)? Pull every 20th trading day from your chosen anchor.
- Daily rebalancer? Pull every day.
The Numinor pipeline delivers daily; the buyer's pipeline decides which dates to consume.
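The cadences listed above amount to picking a subset of the published trading dates. A minimal sketch using only the standard library follows; the helper names are illustrative, and the toy calendar treats every weekday as a trading day, which ignores exchange holidays (in practice the list of dates would come from the partitions actually published).

```python
from datetime import date, timedelta
from itertools import groupby


def month_end_dates(trading_days):
    """Last available trading day of each calendar month."""
    days = sorted(trading_days)
    return [max(group) for _, group in groupby(days, key=lambda d: (d.year, d.month))]


def weekday_dates(trading_days, weekday=2):
    """Trading days falling on a given weekday (Monday=0, so 2 = Wednesday).

    A week whose Wednesday is a holiday is simply skipped in this sketch.
    """
    return [d for d in sorted(trading_days) if d.weekday() == weekday]


def every_nth(trading_days, n=20, offset=0):
    """Every n-th trading day, starting from the chosen anchor offset."""
    return sorted(trading_days)[offset::n]


# Toy calendar: weekdays only, 2024-01-01 through 2024-03-31 (assumption: no holidays).
start = date(2024, 1, 1)
days = [start + timedelta(days=i) for i in range(91)]
days = [d for d in days if d.weekday() < 5]

monthly = month_end_dates(days)
weekly = weekday_dates(days)
every_20 = every_nth(days, n=20)
```

Shifting `offset` from 0 to 19 in `every_nth` gives the alternative anchors for a 20-trading-day rebalancer.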
4. PIT Discipline
The product is point-in-time correct: a signal value dated trade_date = D uses only ChinaScope source data that was already publicly available on day D.
Publish lag rule
For every source used in v1.0:
source_publish_date + 30 calendar days ≤ trade_date
This buffer models realistic vendor delivery (ChinaScope T+1) + institutional-buyer ingestion / recompute / deployment lag (typically 3-4 weeks combined) + a modest conservatism cushion. It matches the convention used in Numinor's SAM Amplifier construct, keeping the lag rule consistent across our catalog.
Measured availability floor (2026-06): across 5,675 filings spanning the FY2025 annual + Q1 reporting season, 99% of SAM records were delivered within 4 days of publish_date (99.9% within 30). The 30-day buffer is therefore ~4 days of measured availability + ~26 days of buyer-workflow allowance — conservative by construction.
This is mechanically enforced by the pipeline — the revenue-mix weights underlying the signal at trade_date = D are sourced from filings whose publish_date + 30 calendar days ≤ D.
Daily price data PIT
Daily price returns (used to compute the daily momentum component before aggregation) are PIT by construction: returns on day D are computed from the close of day D price relative to the close of day D-1. The signal value for trade_date = D uses returns through close of D.
What this means for the buyer
- The buyer never receives a signal value that "looked into the future" relative to its trade_date.
- Backtests using this data inherit the PIT discipline automatically.
- For audit / transparency, the buyer can verify the strict no-look-ahead rule directly from the served data: eff_date ≤ trade_date on every row, where eff_date = source_publish_date + 30 (all three dates ship as of schema v1.1). The report-date proxy (source_rpt_date + 30 ≤ trade_date) also holds but is the weaker check.
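The audit described in the last bullet can be run in a few lines. The sketch below assumes pandas and a frame that carries the schema v1.1 date columns; the function name `pit_violations` is illustrative. It returns the offending rows, so an empty result means the invariant holds.

```python
import pandas as pd

PIT_BUFFER_DAYS = 30  # the publish-lag buffer stated in this section


def pit_violations(df: pd.DataFrame) -> pd.DataFrame:
    """Rows that break either eff_date = source_publish_date + 30d or eff_date <= trade_date."""
    trade = pd.to_datetime(df["trade_date"])
    publish = pd.to_datetime(df["source_publish_date"])
    eff = pd.to_datetime(df["eff_date"])
    eff_mismatch = eff != publish + pd.Timedelta(days=PIT_BUFFER_DAYS)
    look_ahead = eff > trade
    return df[eff_mismatch | look_ahead]
```

Converting with `pd.to_datetime` first makes the comparison independent of whether the dates were loaded as strings, `datetime.date` objects, or timestamps.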
Relationship to the WP
The published SAM PM Whitepaper v2.2 used a 30-day lag throughout, and the v1.0 construct data matches this exactly. A buyer running the WP's methodology on the construct data should reproduce structurally similar results, subject to differences in the factor model used for evaluation: the WP uses Numinor's illustrative 22-factor base, which the buyer would not use directly.
Can the buyer change the lag?
- Tighter lag (<30 days): not available through this product. Requires licensing raw ChinaScope SAM + daily price data and running your own pipeline.
- Looser lag (>30 days, more conservative): easily applied buyer-side. Simply consume the signal at trade_date + extra_days in your pipeline.
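One way to apply a looser lag buyer-side is to map each date on which the signal is used back to the latest published trade_date that is at least extra_days calendar days old. The sketch below uses only the standard library; `signal_date_for` and the four-day calendar are illustrative. The total lag against the source filing then becomes 30 + extra_days calendar days.

```python
from bisect import bisect_right
from datetime import date, timedelta


def signal_date_for(use_date, trading_days, extra_days):
    """Latest trade_date <= use_date - extra_days, or None if no such date exists.

    `trading_days` must be sorted ascending.
    """
    cutoff = use_date - timedelta(days=extra_days)
    i = bisect_right(trading_days, cutoff)
    return trading_days[i - 1] if i > 0 else None


# Illustrative calendar, not an official exchange calendar.
calendar = [date(2026, 4, 1), date(2026, 4, 2), date(2026, 4, 3), date(2026, 4, 7)]
lagged = signal_date_for(date(2026, 4, 7), calendar, extra_days=5)  # date(2026, 4, 2)
```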
5. Signal Construction — Numinor's Engineering Value-Add
ChinaScope ships raw SAM data (per-company × per-product revenue mix tables) and daily stock prices. For stock-level quantitative analysis, the buyer needs these inputs assembled into a per-stock momentum signal with appropriate aggregation, PIT discipline, and z-scoring. That assembly is the engineering work Numinor performs.
Signal recipe (Construction R, the v1.0 canonical)
For each trade_date = D:
Step 1 — Daily product-level returns. Each SAM product node p is mapped to its constituent A-share listcos with revenue exposure. For each product on day D, compute a revenue-share-weighted return of its constituent stocks (with self-exclusion: each focal stock is excluded from its own products' aggregates when constructing the focal's signal).
Step 2 — Daily product-level residual return. Strip out cross-sectional mean from product returns per date to get the product's daily residual return. This isolates idiosyncratic product-level momentum from market-wide moves.
Step 3 — Rolling 20-trading-day momentum. For each product, compute the trailing-20-day sum of daily residual returns → biz_mom_daily[p, D]. Also compute the trailing-20-day standard deviation → biz_resvol_daily[p, D].
Step 4 — Project back to focal stock. For each focal stock i, aggregate biz_mom_daily[p, D] across the focal's products, weighted by the focal's revenue share on each product:
biz_mom[i, D] = Σ_p revenue_share[i, p] × biz_mom_daily[p, D]
Same aggregation for biz_resvol[i, D].
Step 5 — Cross-sectional z-score per date. Standardize each of biz_mom and biz_resvol to mean 0, std 1 across the A-share cross-section on day D. This gives biz_mom_styled and biz_resvol_styled.
Step 6 — Composite. Combine the two styled components:
ne_composite_styled = z_mom_weight × biz_mom_styled + z_resvol_weight × biz_resvol_styled
Weights are tuned in the methodology doc. Sign of ne_composite_styled is calibrated so higher = predicts higher 20-day forward return.
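Steps 2 through 6 can be illustrated on toy data. This is a simplified sketch under stated assumptions, not the reference implementation: it takes product-level returns as given (so Step 1 and its self-exclusion are omitted), uses random inputs, and uses placeholder composite weights of 0.5 each because the actual weights are defined in the methodology doc. The names `sketch_signal` and `zscore_by_date` are hypothetical.

```python
import numpy as np
import pandas as pd


def zscore_by_date(x: pd.DataFrame) -> pd.DataFrame:
    """Standardize each row (one date) to mean 0, std 1 across columns (stocks)."""
    return x.sub(x.mean(axis=1), axis=0).div(x.std(axis=1), axis=0)


def sketch_signal(product_returns, revenue_share, window=20, w_mom=0.5, w_resvol=0.5):
    """product_returns: dates x products; revenue_share: stocks x products.

    w_mom / w_resvol are placeholders, not the published composite weights.
    """
    # Step 2: residual return = product return minus the per-date mean across products.
    resid = product_returns.sub(product_returns.mean(axis=1), axis=0)
    # Step 3: trailing-window sum (momentum) and standard deviation (residual volatility).
    biz_mom_daily = resid.rolling(window).sum()
    biz_resvol_daily = resid.rolling(window).std()
    # Step 4: project to stocks, biz_mom[i, D] = sum_p revenue_share[i, p] * biz_mom_daily[p, D].
    biz_mom = biz_mom_daily @ revenue_share.T
    biz_resvol = biz_resvol_daily @ revenue_share.T
    # Step 5: cross-sectional z-score per date.
    mom_z = zscore_by_date(biz_mom)
    resvol_z = zscore_by_date(biz_resvol)
    # Step 6: weighted composite of the two styled components.
    return {
        "biz_mom_styled": mom_z,
        "biz_resvol_styled": resvol_z,
        "ne_composite_styled": w_mom * mom_z + w_resvol * resvol_z,
    }


# Toy inputs: 60 business days, 6 products, 8 stocks with random revenue shares.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2016-01-04", periods=60)
products = [f"p{i}" for i in range(6)]
rets = pd.DataFrame(rng.normal(0.0, 0.01, (60, 6)), index=dates, columns=products)
shares = pd.DataFrame(
    rng.dirichlet(np.ones(6), size=8), index=[f"s{i}" for i in range(8)], columns=products
)
out = sketch_signal(rets, shares)
```

In this toy setup the first 19 rows of each output are NaN, because the 20-day window has not yet filled.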
Universe filtering
After computing raw signals:
- Drop stocks with insufficient SAM coverage (no revenue mix data at trade_date - 30 days)
- Drop stocks suspended for extended periods (cross-section unstable)
- Drop stocks not in A-share SH/SZ/KC/CYB listco universe
- Drop stocks IPO'd after trade_date or delisted before trade_date
Daily refresh
At each new trade_date:
- Update daily product-level returns (one new day of data)
- Roll the 20-day window forward by one day
- If any new SAM source data has just crossed publish_date + 30 days, update affected revenue-mix weights
- Recompute z-scores per date
- Emit a delta file containing the new day's row for every covered stock (~3000-5000 rows)
What the buyer pays for vs. does themselves
| Step | Done by Numinor | Buyer would need to do |
|---|---|---|
| Read raw ChinaScope SAM + daily prices | ✓ | Schema knowledge, multi-table joins |
| Apply PIT discipline (revenue-mix as of date) | ✓ | Track each filing's publish_date |
| Compute product-level returns with self-exclusion | ✓ | Implement aggregation correctly |
| 20-day rolling momentum & residual volatility | ✓ | Maintain rolling-window state |
| Cross-sectional z-scoring per date | ✓ | Recompute z-scores daily |
| Composite weighting | ✓ | Choose weights, replicate methodology |
| Daily refresh pipeline | ✓ | Run own pipeline daily |
| Historical snapshots | ✓ | Build own historical store |
Doing this end-to-end from raw ChinaScope data is approximately 2-3 weeks of focused data engineering work for an experienced team, plus ongoing maintenance.
6. Universe Rules
Inclusion
- All A-share listcos with SAM coverage at trade_date, traded on Shanghai (SH), Shenzhen (SZ), STAR Board (KC), or ChiNext (CYB).
- Stocks under brief suspension on trade_date are included in the signal if their underlying product-mix data is still valid (the signal doesn't require trading on trade_date to be computable).
- Stocks delisted before trade_date are excluded from that date forward.
- Stocks IPO'd after trade_date are excluded prior to listing.
Exclusion (by design)
- Beijing Stock Exchange (BJ): excluded.
- Stocks with no SAM coverage (no product revenue mix in ChinaScope's SAM tables): excluded.
- Stocks with <90 trading days of price history (insufficient to compute residual volatility): excluded.
Coverage start
- The dataset covers 2016-01-04 → present. SAM data prior to 2016 has insufficient depth.
- Signal effective from ~2016-02-02. The momentum/residual-volatility features need a ~20-trading-day rolling window; with no price history before 2016-01-04 to seed it, the first ~19 trading days of 2016 (most of January) carry no computable signal (NaN). Every later date is fully warmed. (Partitions in this start-of-data window may be absent or NaN; treat as "no signal".)
Typical universe size
- 2016: ~3,000 stocks per day
- 2026: ~4,500 stocks per day
- Coverage grows roughly with the A-share listing universe over time
7. API Specification
All API access is via signed URL minting. The buyer authenticates once with their API key; the API returns a time-limited S3 URL the buyer downloads from directly.
Base URL
https://api.numinor.io/v1/constructs/sam-pm
Authentication
Authorization: Bearer <numinor_api_key>
- Default expiration: none. API keys do not expire automatically.
- Rotation: client-controlled via subscriber dashboard. Subscribers may rotate at any cadence; we recommend 90 days as security best practice.
- Revocation: immediate. Compromised keys can be invalidated instantly via the dashboard.
- Multiple keys per subscriber: supported, useful for separating dev / staging / production access.
Endpoints
Identical surface to SAM Amplifier 构建数据. The construct path is /v1/constructs/sam-pm instead of /v1/constructs/sam-amplifier.
| Endpoint | Method | Returns |
|---|---|---|
| /manifest | GET | Schema, available date range, total row count |
| /historical | GET | Signed S3 URL for historical dump file |
| /delta/{YYYYMMDD} | GET | Signed S3 URL for a specific date's delta file |
| /range | GET | List of signed S3 URLs for a date range |
| /query | POST | Filtered query results inline as JSON (small results only) |
/query body shape
{
"trade_date": "2026-05-30",
"ts_code": "000001.SZ" // optional; omit for all stocks on that date
}
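A request with this body could be assembled as follows. This is a sketch using only the Python standard library; the `Content-Type` header and the helper name `build_query_request` are assumptions, and the response shape is not specified here, so the example stops at building the request. The `//` annotation in the body above is explanatory only and should not be sent, since JSON does not allow comments.

```python
import json
import urllib.request

BASE_URL = "https://api.numinor.io/v1/constructs/sam-pm"


def build_query_request(api_key, trade_date, ts_code=None):
    """Build (but do not send) a POST /query request.

    ts_code is optional; omitting it asks for all stocks on that date.
    """
    body = {"trade_date": trade_date}
    if ts_code is not None:
        body["ts_code"] = ts_code
    return urllib.request.Request(
        f"{BASE_URL}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed; not stated in this spec
        },
        method="POST",
    )


req = build_query_request("YOUR_API_KEY", "2026-04-07", "000001.SZ")
```

Sending it would then be `urllib.request.urlopen(req, timeout=30)`, with the reply parsed via `json.load`.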
Rate limits
| Endpoint | Limit | Per-response cap |
|---|---|---|
| POST /query | 100 req/min per API key, burst 20 in 10 sec | 10,000 rows per response. Exceed → HTTP 413, use /range instead |
| GET /historical, /delta, /range | unlimited | n/a (signed S3 URL) |
| GET /manifest | 1000 req/min | small JSON |
Signed URL validity: 4 hours from issuance. Within the validity window, downloads from S3 are unlimited.
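A client that issues many /query calls may want to pace itself rather than rely on server-side rejection. The sketch below is one possible client-side throttle, assuming the limits mean "at most 100 requests in any 60-second window and at most 20 in any 10-second window"; that reading, and the class name `QueryThrottle`, are assumptions rather than documented behavior.

```python
import time
from collections import deque


class QueryThrottle:
    """Sliding-window throttle over one or more (max_calls, window_seconds) limits."""

    def __init__(self, limits=((100, 60.0), (20, 10.0)), clock=time.monotonic):
        self.limits = limits
        self.clock = clock          # injectable so the logic can be tested without sleeping
        self.sent = deque()         # timestamps of recorded requests, oldest first

    def wait_time(self):
        """Seconds to wait before the next request satisfies every limit."""
        now = self.clock()
        longest = max(window for _, window in self.limits)
        while self.sent and now - self.sent[0] >= longest:
            self.sent.popleft()     # drop timestamps no limit cares about any more
        delay = 0.0
        for max_calls, window in self.limits:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= max_calls:
                # Wait until enough old requests leave this window.
                delay = max(delay, recent[-max_calls] + window - now)
        return delay

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

Typical use is `time.sleep(throttle.wait_time())` before each POST, followed by `throttle.record()` once the request has been sent.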
Coming in v1.1: MCP
We will publish an MCP server exposing the same API as native LLM tools (get_sam_pm_signal(date, ts_code)) once MCP infrastructure matures across major model providers.
8. Quickstart: From Subscription to First Value in 5 Minutes
1. Get your API key
Issued via email at onboarding. Store as NUMINOR_API_KEY:
export NUMINOR_API_KEY="nm_live_..."
2. Pull today's delta
Python (pandas):
import requests, os, pandas as pd
key = os.environ["NUMINOR_API_KEY"]
date = "20260530"
resp = requests.get(
f"https://api.numinor.io/v1/constructs/sam-pm/delta/{date}",
headers={"Authorization": f"Bearer {key}"}
).json()
df = pd.read_parquet(resp["url"])
print(df.head())
# Output:
# trade_date | ts_code | ne_composite_styled | biz_mom_styled | biz_resvol_styled | ...
Python (polars):
import polars as pl
df = pl.read_parquet(resp["url"])
R:
library(arrow)
library(httr)
resp <- httr::GET("https://api.numinor.io/v1/constructs/sam-pm/delta/20260530",
httr::add_headers(Authorization=paste("Bearer", Sys.getenv("NUMINOR_API_KEY"))))
url <- jsonlite::fromJSON(httr::content(resp, "text"))$url
df <- arrow::read_parquet(url)
DuckDB:
INSTALL httpfs;
LOAD httpfs;
SELECT ts_code, ne_composite_styled FROM read_parquet('<signed_url>') ORDER BY ne_composite_styled DESC LIMIT 100;
3. Pull the historical dump (one-time)
resp = requests.get(
"https://api.numinor.io/v1/constructs/sam-pm/historical",
headers={"Authorization": f"Bearer {key}"}
).json()
df_history = pd.read_parquet(resp["url"])
4. Use the signal in your strategy
Simplest use — sort and trade:
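# Latest cross-section. trade_date is stored as date32, so take the value
# from the column itself rather than comparing against a string literal
# (a string comparison can silently match nothing).
latest = df_history["trade_date"].max()
today = df_history[df_history["trade_date"] == latest]
# Long top quintile, short bottom quintile
n = int(len(today) * 0.20)
long_basket = today.nlargest(n, "ne_composite_styled")["ts_code"].tolist()
short_basket = today.nsmallest(n, "ne_composite_styled")["ts_code"].tolist()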
Or use as a feature in your multi-factor model:
# Merge with your factor file
my_factors = pd.read_parquet("my_factor_panel.parquet")
combined = my_factors.merge(
df_history[["trade_date", "ts_code", "ne_composite_styled"]],
on=["trade_date", "ts_code"], how="left"
)
# combined now has your factors + ne_composite_styled as a new column
# Use in your usual factor-combination pipeline
5. Ask Gandalf for help
Stuck on integration? Open Gandalf (your onsite AI assistant), ask:
"How do I use ne_composite_styled in a vol-scaled portfolio?" "What's the difference between biz_mom_styled and biz_resvol_styled?" "How do I run the WP v2.2 multi-offset robustness test on this data?"
Gandalf has context on the data dictionary, the reference code, and methodology.
9. Onboarding Checklist
When a new subscriber comes online:
| Step | Owner | Time |
|---|---|---|
| 1. Subscription contract signed | Sales | — |
| 2. API key generated, emailed to subscriber's technical lead | Numinor ops | < 1 hour |
| 3. Subscriber tests /manifest endpoint to confirm access | Subscriber | 5 min |
| 4. Subscriber downloads historical dump | Subscriber | 1-2 min (~120 MB) |
| 5. Subscriber validates schema against this data dictionary | Subscriber | 10 min |
| 6. Subscriber runs reference code (sort + long/short example) | Subscriber | 10-20 min |
| 7. Subscriber's first signal-driven backtest produced | Subscriber | end of day 1 |
| 8. Optional: live integration call with Numinor team | Joint | 1 hour |
Total time from contract to first usable signal: < 1 business day.
10. Versioning & Changelog
| Version | Date | Notes |
|---|---|---|
| v1.0 | 2026-05-28 | Initial release. Mirrors SAM PM WP v2.2 (May 2026) Construction R signal at offset=0, daily refresh. |
| v1.0 (published) | 2026-06-05 | Historical dump published as Hive daily partitions (sam-pm/parquet/, 2016-01-05 → 2026-04-07, 2,489 partitions). Realm registry live (sam-pm/realm/: catalog.json, dials.json, applicability matrix). Pipeline numinor_sam_pm/construct.py (Construction R) + daily refresh cron scripts/cron/run_cron_a.py. Defaults reproduce WP v2.2 multi-offset orthogonal ICIR +0.3497 full / +0.3523 OOS (vs 22-factor base), 100% positive offsets. |
Roadmap
- v1.1 (planned Q3 2026): MCP server for LLM-native data access.
- v1.2 (planned Q4 2026): Optional multi-offset variants for buyers wanting to replicate WP §6 multi-offset robustness internally.
- v1.3 (planned Q4 2026): Construction S variant (source-residualized daily returns) for buyers who already factor-neutralize at the daily-returns layer.
- v2.0 (planned 2027): SAM Supply Chain v5 incorporation, plus optional supply-chain-routed momentum (combining SAM PM with SAM Amplifier methodology).
Subscribers receive 60-day advance notice of any breaking changes.
11. FAQ
Q: How is this different from buying ChinaScope's raw SAM data directly?
A: ChinaScope sells the raw sam_product_calc and daily price tables. To compute the product-momentum signal yourself, you'd need to (a) implement the revenue-share aggregation with self-exclusion, (b) build the rolling 20-day momentum and residual volatility pipeline, (c) handle PIT discipline correctly, (d) maintain daily refreshes. Numinor does all this for you with a published, peer-reviewable methodology (WP v2.2) and a reference implementation. You pay for the engineering + ongoing maintenance, not the data.
Q: Why three columns instead of just ne_composite_styled?
A: Most buyers will use only ne_composite_styled (the headline composite). The two components are included for buyers who want to:
- Combine the components with their own weights (rather than our default composite weighting)
- Use only momentum or only residual-volatility separately
- Validate that the composite is calculable from the components
If you only want one column, you can drop the other two on read.
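Combining the two components with buyer-chosen weights might look like the sketch below. It assumes pandas; the function name `custom_composite`, the example weights, and the optional re-standardization step are illustrative choices and do not describe how the published composite is built.

```python
import pandas as pd


def custom_composite(df, w_mom, w_resvol, restandardize=True):
    """Weighted blend of the two styled components, optionally re-z-scored per trade_date."""
    raw = w_mom * df["biz_mom_styled"] + w_resvol * df["biz_resvol_styled"]
    if not restandardize:
        return raw
    # A weighted sum of two z-scores no longer has unit std, so rescale per date.
    grouped = raw.groupby(df["trade_date"])
    return (raw - grouped.transform("mean")) / grouped.transform("std")


# Toy frame with two dates and three stocks per date.
toy = pd.DataFrame({
    "trade_date": ["d1"] * 3 + ["d2"] * 3,
    "biz_mom_styled": [1.0, 0.0, -1.0, 0.5, -0.5, 0.0],
    "biz_resvol_styled": [0.2, -0.1, -0.1, 1.0, 0.0, -1.0],
})
toy["my_composite"] = custom_composite(toy, w_mom=0.7, w_resvol=0.3)
```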
Q: Is the WP v2.2 ICIR of +0.42 / +0.35 what I should expect?
A: Those numbers were computed on Numinor's illustrative 22-factor base for orthogonalization. Your numbers depend on YOUR factor model. As a directional benchmark: the signal has consistent positive cross-sectional information beyond standard size/value/momentum factors. Magnitude depends on what's already in your stack.
Q: Does the signal work better at certain forward horizons?
A: Per WP §6.4, the signal is most predictive at 60-day forward horizon (orth-ICIR +0.46/+0.48 raw, +0.27/+0.28 de-overlapped). At the canonical 20-day horizon, it's +0.42/+0.35 (raw). The signal is "slow alpha" by construction — product-spillover effects play out over weeks, not days. Buyers running daily-rebal strategies should account for this; weekly-or-monthly-rebal strategies are well-aligned.
Q: What if a stock has no signal value on a given date?
A: It's simply absent from the delta file for that date. The buyer's pipeline should treat missing ne_composite_styled as "no signal available" — fall back to whatever default behavior makes sense (no position, average factor value, etc.). The reference code demonstrates the fallback pattern.
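As an illustration of one such fallback (not necessarily the pattern used in the reference code), the sketch below left-joins the signal onto the buyer's own universe, flags which rows actually had a value, and optionally fills the gaps. The name `attach_signal` is hypothetical; filling with 0.0 corresponds to the cross-sectional average of a z-scored column.

```python
import pandas as pd


def attach_signal(universe, signal, fill_value=None):
    """Left-join ne_composite_styled onto the buyer's (trade_date, ts_code) universe.

    fill_value=None keeps NaN (e.g. to skip the position); 0.0 substitutes the
    cross-sectional average of a z-scored signal.
    """
    merged = universe.merge(
        signal[["trade_date", "ts_code", "ne_composite_styled"]],
        on=["trade_date", "ts_code"],
        how="left",
    )
    merged["has_signal"] = merged["ne_composite_styled"].notna()
    if fill_value is not None:
        merged["ne_composite_styled"] = merged["ne_composite_styled"].fillna(fill_value)
    return merged
```

Keeping the `has_signal` flag preserves the distinction between a genuine zero and a filled gap.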
Q: Does the signal account for industry / sector effects?
A: The cross-sectional z-scoring per date provides one layer of normalization (the signal value tells you "how does this stock's product-momentum rank against the whole A-share cross-section today?"). For industry-relative or sector-neutral use, buyers typically apply their own industry/sector neutralization on top. The signal is delivered "raw cross-sectional" so the buyer can choose their preferred neutralization.
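A simple buyer-side industry neutralization is to subtract the per-date industry mean. The sketch below assumes pandas and a buyer-supplied industry label per row; the column name `industry` and the function name are hypothetical, since industry classifications are not part of this product's schema.

```python
import pandas as pd


def industry_neutralize(df, industry_col="industry"):
    """Signal minus its mean within each (trade_date, industry) group."""
    group_mean = (
        df.groupby(["trade_date", industry_col])["ne_composite_styled"].transform("mean")
    )
    return df["ne_composite_styled"] - group_mean


# Toy cross-section with buyer-assigned industry labels.
toy_ind = pd.DataFrame({
    "trade_date": ["2026-04-07"] * 4,
    "ts_code": ["000001.SZ", "600000.SH", "600519.SH", "000858.SZ"],
    "industry": ["bank", "bank", "liquor", "liquor"],
    "ne_composite_styled": [0.3, -0.5, 1.9, 1.1],
})
toy_ind["signal_ind_neutral"] = industry_neutralize(toy_ind)
```

Demeaning is the lightest form of neutralization; a regression on industry dummies alongside other factors is a common heavier alternative.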
Q: Why does the composite use biz_mom + biz_resvol instead of just biz_mom?
A: Per WP §3, including the residual-volatility component improves OOS ICIR by ~+0.05-0.10. Empirically, product-residual-volatility carries unique predictive content (likely reflecting product-level information dispersion and idiosyncratic risk pricing) that complements pure momentum. Both are included.
Q: Can I run my own backtest to verify the WP's claims before purchasing?
A: Yes. Request a 30-day evaluation license (contact sales). Evaluation includes full historical dump + 30 days of daily deltas, with the same API access. The buyer can replicate WP §4-§6 in their own infrastructure.
12. Contact & Support
| Channel | Use case |
|---|---|
| Gandalf (in-product) | First-line technical questions, code examples, methodology clarifications |
| support@numinor.io | Everything else: production issues, subscriptions, data quality, methodology |
Appendix A: Schema Reference Card (Printable)
Numinor SAM Product Momentum 构建数据 v1.0
Format: parquet (.parquet) or CSV/ZIP (.csv.zip)
trade_date date32 YYYY-MM-DD, Shanghai trading days
ts_code string NNNNNN.SZ or NNNNNN.SH
ne_composite_styled float64 z-scored composite signal (higher = predicts higher fwd ret)
biz_mom_styled float64 z-scored momentum component
biz_resvol_styled float64 z-scored residual-vol component
source_basis string "sam_product_mix" (v1.0)
source_rpt_date date32 YYYY-MM-DD, fiscal period of the governing filing
source_publish_date date32 YYYY-MM-DD, when that filing became public
eff_date date32 YYYY-MM-DD, publish + 30d; STRICT: eff_date ≤ trade_date
Universe: A-shares (SH + SZ + KC + CYB); not BJ
Cadence: daily, one row per stock per trading day
Coverage: 2016-01-04 → present
PIT discipline: eff_date = source_publish_date + 30 days ≤ trade_date (source_rpt_date + 30 days ≤ trade_date also holds, as the weaker check)
Forward horizon: signal calibrated to 20-day forward returns (per WP v2.2 §3)
Sign: higher ne_composite_styled = predicts higher 20-day forward return
Layout: Hive year=YYYY/month=MM/day=DD/data.parquet (+ data.csv.zip mirror)
Delivery: s3://numinor-construct-data/sam-pm/parquet/ (Hive daily partitions)
via API signed-URL minting at https://api.numinor.io/v1/
Realm registry: s3://numinor-construct-data/sam-pm/realm/ (catalog.json, dials.json,
ANALYSIS_DIAL_APPLICABILITY.md)
Build dials: data tier (pit_buffer_days, sam_level, revenue_threshold, min_peer_count,
mom_window) + analysis tier — see realm/dials.json
Appendix B: How SAM PM 构建数据 Relates to SAM Amplifier 构建数据
Both products derive from ChinaScope's SAM data, but they capture different aspects of the network:
| | SAM Amplifier 构建数据 | SAM PM 构建数据 |
|---|---|---|
| What it is | Stock-to-stock relationship graph (peer / upstream / downstream edges) | Per-stock momentum signal |
| Schema | Edge-list: (focal, counterparty, weight) | Stock-day: (date, ts_code, signal value) |
| Buyer usage | Apply as aggregation operator on buyer's own factors | Use directly as a factor input or sort signal |
| Updates | Daily delta of changed edges | Daily delta of new day's signals |
| Subscriber asks Gandalf | "Which stocks are similar to 600519.SH today?" | "What's the momentum signal for 600519.SH today?" |
Subscribers can purchase one or both. They are complementary, not substitutes — Amplifier captures who is connected to whom; PM captures which stocks have product-level momentum. Bundling discounts available.
End of SAM PM 构建数据 v1.0 specification. Methodology reference: Numinor SAM PM Whitepaper v2.2. Reference implementation: github.com/Numinor-Systems/sam-pm-construct-reference (MIT License). © 2026 Numinor Systems. All rights reserved on product definitions; reference code MIT-licensed.