February 15, 2026

Beyond Industry Labels: Mining Alpha from Company Relationship Networks

Traditional industry classifications miss the complex web of business relationships. Discover how combining product value chains, supply chain linkages, and news co-mentions captures alpha that conventional models overlook.

Datasets Used
SAM Value Chain, SmarTag News, Customer & Supplier

The Question

When two companies are classified in different sectors, should their stock movements be independent? Traditional finance says yes. Market reality says no.

A semiconductor manufacturer and an automotive OEM sit in separate industry buckets—"Technology" versus "Consumer Discretionary." But when the OEM announces a new electric vehicle platform, the chip supplier's stock moves. When a trade policy shift hits the supply chain, news articles mention both companies together. When quarterly reports reveal customer concentration, the revenue dependencies become clear.

Industry classifications are single-label systems imposed on multi-dimensional relationships. They tell you what a company is, not who it's connected to. And in modern equity markets, connections predict co-movement better than labels.

The question for quantitative investors: can you systematically extract alpha from relationship networks that conventional models ignore?

The Approach

We construct three types of relationship graphs using Numinor's datasets, then combine them in a unified framework.

Graph 1: Product Value Chains (SAM)
Every company in SAM is tagged with granular product classifications—not just "Technology," but specific nodes like "automotive semiconductors" or "OLED display components." Companies sharing product tags sit in the same value chain, even if their official industry codes differ. This captures business model similarity at scale.
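
As a rough illustration of how shared product tags can become graph edges, the sketch below links any two companies with at least one tag in common and weights the edge by Jaccard similarity. The company names, the tags, and the choice of Jaccard weighting are assumptions made for this example; they are not SAM's actual schema or our production weighting.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical product tags per company; real SAM fields may differ.
product_tags = {
    "ChipCo": {"automotive semiconductors", "power management ICs"},
    "MotorCo": {"electric motors", "automotive semiconductors"},
    "PanelCo": {"OLED display components"},
    "PhoneCo": {"smartphones", "OLED display components"},
}

def build_tag_graph(tags_by_company, min_jaccard=0.0):
    """Return {(a, b): weight} for company pairs sharing at least one tag."""
    # Invert the mapping so only companies that share a tag are compared.
    companies_by_tag = defaultdict(set)
    for company, tags in tags_by_company.items():
        for tag in tags:
            companies_by_tag[tag].add(company)

    edges = {}
    for members in companies_by_tag.values():
        for a, b in combinations(sorted(members), 2):
            shared = tags_by_company[a] & tags_by_company[b]
            union = tags_by_company[a] | tags_by_company[b]
            weight = len(shared) / len(union)  # Jaccard similarity
            if weight > min_jaccard:
                edges[(a, b)] = weight
    return edges

tag_edges = build_tag_graph(product_tags)
```

Here ChipCo and MotorCo end up connected even though one might be filed under Technology and the other under Industrials, which is the cross-classification effect the graph is meant to capture.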

Graph 2: Supply Chain Networks (Customer & Supplier)
Direct commercial relationships: who buys from whom, weighted by reported transaction volumes. This captures capital flow dependencies—when your customer's revenue drops, your stock should react, regardless of sector labels.
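
A minimal sketch of a volume-weighted supply chain graph follows, assuming the data arrives as (supplier, customer, volume) records. The record layout, the company names, and the `customer_exposure` helper are hypothetical and only illustrate how a customer's revenue shock could be mapped back to a supplier.

```python
from collections import defaultdict

# Hypothetical (supplier, customer, reported volume) records.
relationships = [
    ("ChipCo", "AutoOEM", 60.0),
    ("ChipCo", "PhoneCo", 40.0),
    ("GlassCo", "PhoneCo", 100.0),
]

def build_supply_graph(records):
    """Return supplier -> {customer: share of the supplier's reported volume}."""
    volume = defaultdict(dict)
    for supplier, customer, amount in records:
        volume[supplier][customer] = volume[supplier].get(customer, 0.0) + amount
    graph = {}
    for supplier, customers in volume.items():
        total = sum(customers.values())
        graph[supplier] = {c: v / total for c, v in customers.items()}
    return graph

def customer_exposure(graph, supplier, customer_shocks):
    """Volume-weighted sum of customer revenue shocks for one supplier."""
    return sum(share * customer_shocks.get(customer, 0.0)
               for customer, share in graph.get(supplier, {}).items())

supply_graph = build_supply_graph(relationships)
# A 10% revenue drop at AutoOEM, seen from ChipCo's side.
chipco_exposure = customer_exposure(supply_graph, "ChipCo", {"AutoOEM": -0.10})
```

Normalizing by each supplier's total reported volume keeps the edge weights comparable across large and small suppliers; other normalizations are equally plausible.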

Graph 3: News Co-Occurrence Networks (SmarTag)
When financial media discusses two companies in the same article, it signals a perceived relationship—competitive dynamics, shared exposure to macro themes, or analyst-identified pairs. This captures information flow and market attention, the substrate on which correlations propagate.
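
One simple way to turn article-level company tags into a co-occurrence graph is to count how often each pair appears together and keep pairs above a threshold. The sketch below assumes each article is already reduced to a set of tagged companies; the sample articles and the `min_count` threshold are illustrative, not SmarTag's actual output format.

```python
from collections import Counter
from itertools import combinations

# Hypothetical article-level company tags; SmarTag's actual schema may differ.
articles = [
    {"ChipCo", "AutoOEM"},
    {"ChipCo", "AutoOEM", "PhoneCo"},
    {"PhoneCo"},
]

def co_mention_counts(tagged_articles):
    """Count how many articles mention each unordered company pair."""
    counts = Counter()
    for companies in tagged_articles:
        for pair in combinations(sorted(companies), 2):
            counts[pair] += 1
    return counts

def filter_edges(counts, min_count=2):
    """Keep only pairs mentioned together at least min_count times."""
    return {pair: n for pair, n in counts.items() if n >= min_count}

news_counts = co_mention_counts(articles)
news_edges = filter_edges(news_counts, min_count=2)
```

Thresholding on raw counts is the crudest possible filter; in practice one would likely also normalize by how often each company is mentioned overall.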

Each graph undergoes community detection (Leiden algorithm) to identify clusters. These clusters become predefined concepts fed into a HIST (Hidden Information for Stock Trend) model—a neural architecture that decomposes stock behavior into three components: predefined concept information, hidden concept information, and individual stock information.
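
To make the decomposition idea concrete, here is a deliberately simplified sketch. It assumes community detection has already produced the clusters, and it uses raw one-day returns and plain averages where HIST uses learned embeddings and attention. It covers only the first split (shared concept information versus what is left over) and omits the hidden-concept stage entirely, so it should be read as an analogy rather than as the model itself. All names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical output of community detection: cluster id -> member stocks.
clusters = {
    "supply_0": ["ChipCo", "AutoOEM"],
    "news_0": ["ChipCo", "AutoOEM", "PhoneCo"],
}
# Hypothetical one-day returns standing in for learned stock embeddings.
returns = {"ChipCo": 0.03, "AutoOEM": 0.01, "PhoneCo": -0.01}

def concept_memberships(clusters):
    """Invert cluster -> stocks into stock -> list of cluster ids."""
    membership = defaultdict(list)
    for cluster_id, members in clusters.items():
        for stock in members:
            membership[stock].append(cluster_id)
    return dict(membership)

def decompose(returns, clusters):
    """Split each return into a shared concept part and a residual."""
    # Each concept is summarized by the average return of its members.
    concept_value = {cid: mean(returns[s] for s in members)
                     for cid, members in clusters.items()}
    membership = concept_memberships(clusters)
    parts = {}
    for stock, value in returns.items():
        concepts = membership.get(stock, [])
        shared = mean(concept_value[c] for c in concepts) if concepts else 0.0
        parts[stock] = {"concept": shared, "residual": value - shared}
    return parts

parts = decompose(returns, clusters)
```

In the full model the residual would be passed on to a further stage that looks for hidden concepts before the remaining stock-specific component is isolated.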

The model learns: How much of this stock's movement is explained by its relationship clusters? How much is idiosyncratic? And critically: Which relationships matter most for prediction?

The Finding

Using all three graphs simultaneously as predefined concepts, the HIST model achieved a 14.10% annualized excess return over the CSI 300 benchmark, with an information ratio of 2.06 and a maximum drawdown of just 5.08% over a backtest period spanning multiple market cycles.
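
For readers who want to compute comparable statistics on their own backtests, the sketch below shows one common set of definitions. It assumes daily active returns (portfolio return minus benchmark return), 252 trading days per year, and geometric compounding; these conventions are assumptions for the example and are not a statement of how the figures above were calculated.

```python
import math
from statistics import mean, stdev

def annualized_excess_return(active_returns, periods_per_year=252):
    """Geometric annualization of per-period active returns."""
    growth = math.prod(1.0 + r for r in active_returns)
    return growth ** (periods_per_year / len(active_returns)) - 1.0

def information_ratio(active_returns, periods_per_year=252):
    """Annualized mean active return divided by annualized tracking error."""
    return mean(active_returns) / stdev(active_returns) * math.sqrt(periods_per_year)

def max_drawdown(active_returns):
    """Largest peak-to-trough decline of the cumulative active-return curve."""
    level, peak, worst = 1.0, 1.0, 0.0
    for r in active_returns:
        level *= 1.0 + r
        peak = max(peak, level)
        worst = max(worst, 1.0 - level / peak)
    return worst

# Hypothetical daily active returns, for illustration only.
active = [0.01, -0.02, 0.015, 0.005]
example_drawdown = max_drawdown(active)
example_ir = information_ratio(active)
```

Note that `information_ratio` needs at least two observations with non-zero dispersion, since it divides by the sample standard deviation.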

The alpha comes from cross-classification dynamics. Traditional factor models assume stocks in the same industry move together. But a company making automotive chips might correlate more strongly with its automotive customers (captured via supply chain + news graphs) than with other semiconductor firms. Single-sector classification forces it into the wrong peer group.

Cluster attributes from relationship networks provided incremental information over direct company features. In other words, knowing which relationship communities a stock belongs to improves return prediction beyond what the company's own attributes, such as the products it makes, can provide. The graph structure (who is connected to whom) carries signal that the node attributes alone do not.

Performance varied by graph type:

  • Supply chain networks performed best for predicting medium-term returns (1-3 months), capturing fundamental revenue dependencies
  • News co-occurrence networks excelled at short-term prediction (days to weeks), reflecting how information cascades through attention networks
  • Product value chains added long-term stability, anchoring the model in economic fundamentals rather than transient correlations

The model's attention mechanism revealed that indirect relationships matter. A stock's second-degree neighbors in the supply chain graph—suppliers' suppliers, customers' customers—contributed predictive power. Markets are slow to recognize these extended dependencies, creating exploitable inefficiencies.
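
Second-degree neighbors are easy to enumerate once the supply chain is stored as a directed graph. The sketch below assumes a supplier-to-customers adjacency map and walks it twice in each direction; the graph and company names are hypothetical.

```python
from collections import defaultdict

# Hypothetical directed edges: supplier -> set of customers.
customers_of = {
    "WaferCo": {"ChipCo"},
    "ChipCo": {"AutoOEM"},
    "AutoOEM": {"DealerCo"},
}

def reverse(graph):
    """Flip edge direction: customer -> set of suppliers."""
    reversed_graph = defaultdict(set)
    for source, targets in graph.items():
        for target in targets:
            reversed_graph[target].add(source)
    return dict(reversed_graph)

def two_hops(graph, stock):
    """Nodes reached in exactly two steps, excluding direct neighbors and the stock itself."""
    first = graph.get(stock, set())
    second = set()
    for neighbor in first:
        second |= graph.get(neighbor, set())
    return second - first - {stock}

suppliers_of = reverse(customers_of)
customers_customers = two_hops(customers_of, "WaferCo")  # downstream, two steps
suppliers_suppliers = two_hops(suppliers_of, "AutoOEM")  # upstream, two steps
```

Keeping the two directions separate preserves the distinction between customers' customers and suppliers' suppliers, which an undirected graph would blur.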

Try It Yourself

This strategy requires integrating three distinct datasets and implementing graph neural network infrastructure—a non-trivial engineering lift. But the payoff is a robust, multi-dimensional view of market structure that evolves with the economy.

Institutional investors can apply this approach to:

  • Portfolio construction: Build sector-neutral portfolios that respect actual co-movement patterns, not just industry labels
  • Risk management: Identify hidden common exposures by mapping relationship clusters across your holdings
  • Pair trading: Discover statistically related pairs that fundamental similarity misses
  • Event-driven strategies: When news hits one node in a cluster, predict which connected stocks will react next

Want to explore relationship-based alpha in your research environment? Book a call and we'll walk through the graph construction pipeline, model architecture, and backtest framework.
