The Question
Why do seemingly unrelated stocks move together? Sometimes it's sector exposure. Sometimes it's factor correlation. But often, it's something subtler: shared narratives.
When an analyst report discusses both companies. When news articles cover them in the same breath. When market commentary treats them as proxies for the same theme—AI beneficiaries, supply chain reshoring, green energy transition.
The hypothesis is that media co-mentions create correlation by shaping investor mental models. If two companies are discussed together repeatedly, investors begin to think of them as related and trade them as a pair, whether or not their fundamentals are actually linked.
Can you systematically map these narrative connections and exploit the correlation structure they create?
The Approach
Using SmarTag News, we construct co-occurrence networks where:
- Nodes = all companies (listed and unlisted) mentioned in financial media
- Edges = two companies appearing in the same article, weighted by frequency
An article mentioning Company A and Company B creates an edge. Repeated co-mentions strengthen the edge weight. Over a rolling window (e.g., 6 months), we build a dynamic graph that captures which companies are narratively linked in the market's collective attention.
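The graph construction described above can be sketched in a few lines. This is a minimal illustration, not the production pipeline: the `build_cooccurrence` helper, the `(date, companies)` article format, and the toy articles are all assumptions made for the example, and the 182-day window simply stands in for "6 months".

```python
from collections import Counter
from datetime import date, timedelta
from itertools import combinations

def build_cooccurrence(articles, as_of, window_days=182):
    """Count co-mentions per company pair for articles inside the rolling window."""
    start = as_of - timedelta(days=window_days)
    weights = Counter()
    for published, companies in articles:
        if not (start < published <= as_of):
            continue  # article is outside the rolling window
        # Deduplicate and sort so (A, B) and (B, A) map to the same edge.
        for a, b in combinations(sorted(set(companies)), 2):
            weights[(a, b)] += 1
    return weights

# Hypothetical toy input: (publication date, companies tagged in the article).
articles = [
    (date(2024, 1, 10), ["A", "B"]),
    (date(2024, 3, 5), ["B", "A", "C"]),
    (date(2024, 5, 20), ["A", "B"]),
    (date(2023, 6, 1), ["A", "C"]),  # too old, dropped by the window
]
edges = build_cooccurrence(articles, as_of=date(2024, 6, 30))
print(edges[("A", "B")])  # 3
```

Re-running the function with a later `as_of` date drops old articles and picks up new ones, which is what makes the graph dynamic.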
We then apply graph neural networks (GNNs) using this co-occurrence matrix as the adjacency structure. The GNN learns: Given what's happening to stocks frequently mentioned alongside this one, what should we predict for its returns?
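To make the role of the adjacency structure concrete, here is a sketch of a single neighbor-aggregation step, the building block that graph convolution layers are based on. It is not the model used in this study: a real GNN would use vector features, learned weight matrices, and nonlinearities, and the `neighbor_aggregate` helper and toy numbers below are assumptions for illustration only.

```python
def neighbor_aggregate(weights, features):
    """One row-normalized message-passing step: each node's new value is a
    weighted average of its own feature and its neighbors' features, with
    weights given by co-mention counts."""
    nodes = sorted(features)
    # Symmetric adjacency with a self-loop of weight 1 on every node.
    adj = {n: {n: 1.0} for n in nodes}
    for (a, b), w in weights.items():
        adj[a][b] = adj[a].get(b, 0.0) + w
        adj[b][a] = adj[b].get(a, 0.0) + w
    out = {}
    for n in nodes:
        total = sum(adj[n].values())
        out[n] = sum(w * features[m] for m, w in adj[n].items()) / total
    return out

# Hypothetical co-mention weights and one scalar feature per stock
# (for example, the latest daily return).
weights = {("A", "B"): 3, ("B", "C"): 1}
features = {"A": 0.02, "B": -0.01, "C": 0.04}
smoothed = neighbor_aggregate(weights, features)
```

Stock A is co-mentioned heavily with B, so its aggregated value is pulled toward B's feature. Stacking steps like this, with learned parameters in between, is how information from co-mentioned stocks reaches the prediction for a given stock.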
We also run community detection (Leiden algorithm) to identify clusters of frequently co-mentioned companies—these are implicit thematic groupings that traditional industry classifications miss.
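Leiden works by searching for a partition that scores well on a quality function, commonly modularity. The sketch below does not implement Leiden itself; it only evaluates weighted modularity for a candidate partition, to show what "a good community" means on a co-mention graph. The `modularity` helper and the toy graph (two triangles joined by one bridge edge) are assumptions for the example.

```python
def modularity(weights, community_of):
    """Weighted modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ], where L_c is the
    edge weight inside community c, d_c its total weighted degree, and m the
    total edge weight of the graph."""
    m = sum(weights.values())
    intra, degree = {}, {}
    for (a, b), w in weights.items():
        ca, cb = community_of[a], community_of[b]
        degree[ca] = degree.get(ca, 0.0) + w
        degree[cb] = degree.get(cb, 0.0) + w
        if ca == cb:
            intra[ca] = intra.get(ca, 0.0) + w
    return sum(intra.get(c, 0.0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

# Hypothetical graph: two tightly co-mentioned triangles joined by one bridge.
weights = {
    ("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 1,
    ("D", "E"): 1, ("D", "F"): 1, ("E", "F"): 1,
    ("C", "D"): 1,
}
two_groups = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_group = {n: 0 for n in "ABCDEF"}

q_split = modularity(weights, two_groups)   # about 0.357
q_merged = modularity(weights, one_group)   # 0.0
```

Splitting the graph along the bridge scores higher than lumping everything together, which is the kind of partition a community detection algorithm is looking for. In practice you would hand the weighted graph to a dedicated Leiden implementation rather than score partitions by hand.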
The Finding
Stocks linked by co-occurrence showed significantly stronger return correlations than stock pairs with no co-mention link. The effect was robust across time periods, market cap segments, and industries, which suggests a narrative effect in its own right rather than just a proxy for sector or factor exposure.
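A simple way to check this kind of claim on your own universe is to compare the average pairwise return correlation of co-mentioned pairs against all other pairs. The sketch below assumes you already have a return series per stock and a set of linked pairs; the helper names and the return numbers are made up for illustration and are not the data behind the result above.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def mean_corr_by_link(returns, linked_pairs):
    """Average correlation for co-mentioned pairs vs. all remaining pairs."""
    linked, unlinked = [], []
    for a, b in combinations(sorted(returns), 2):
        rho = pearson(returns[a], returns[b])
        (linked if (a, b) in linked_pairs else unlinked).append(rho)
    return sum(linked) / len(linked), sum(unlinked) / len(unlinked)

# Hypothetical returns; pair keys are sorted tuples, matching the edge format.
returns = {
    "A": [0.01, -0.02, 0.03, 0.00],
    "B": [0.02, -0.01, 0.02, 0.01],
    "C": [-0.01, 0.02, 0.00, 0.01],
}
linked_mean, unlinked_mean = mean_corr_by_link(returns, {("A", "B")})
```

To separate the narrative effect from sector exposure, the same comparison can be repeated within each sector or on returns with factor exposures removed.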
Within detected communities (clusters of frequently co-mentioned stocks), correlations were even higher. For example, a community of EV supply chain companies—battery manufacturers, cathode material producers, lithium miners—emerged from co-mention patterns, and these stocks moved together tightly despite spanning multiple traditional industries (materials, industrials, technology).
The GNN approach using co-occurrence matrices generated an 11% annualized excess return versus the benchmark, outperforming a baseline self-attention model without explicit graph structure by 4 percentage points. The explicit relationships matter: the model isn't just learning correlations from price data; it's exploiting the structure of how information flows through media networks.
News co-occurrence communities provided incremental classification information beyond traditional industry groupings. Overlap between co-mention communities and industry sectors was low (~5-10%), meaning the narrative graph reveals a fundamentally different dimension of market segmentation.
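One way to quantify this kind of overlap is the share of same-community stock pairs that also sit in the same sector. The text above does not spell out how the 5-10% figure was computed, so the definition used here is an assumption, as are the `same_sector_share` helper and the toy labels.

```python
from itertools import combinations

def same_sector_share(community_of, sector_of):
    """Fraction of same-community pairs whose two stocks share a sector."""
    same_comm = same_both = 0
    for a, b in combinations(sorted(community_of), 2):
        if community_of[a] == community_of[b]:
            same_comm += 1
            if sector_of[a] == sector_of[b]:
                same_both += 1
    return same_both / same_comm if same_comm else 0.0

# Hypothetical labels: detected communities vs. traditional sectors.
community_of = {"A": "ev", "B": "ev", "C": "ev", "D": "ai", "E": "ai"}
sector_of = {"A": "materials", "B": "industrials", "C": "materials",
             "D": "technology", "E": "technology"}
share = same_sector_share(community_of, sector_of)
print(share)  # 0.5
```

A low share means the communities cut across sector lines, which is the situation described above.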
Try It Yourself
Media co-occurrence networks are dynamic—they evolve as themes gain and lose attention, making them particularly useful for identifying emerging correlations before they're fully priced in.
Practical applications:
- Thematic portfolio construction: Build baskets of stocks around narrative themes (AI infrastructure, carbon neutrality) by detecting co-mention communities rather than relying on static sector classifications
- Pairs trading: Identify stocks that co-occur frequently but have diverged in price—signaling potential mean reversion if the narrative linkage is strong
- Risk management: Monitor co-mention networks to detect rising correlation risks from thematic attention (e.g., if your portfolio is overweight multiple stocks in an emerging "trade war casualties" narrative)
- Event-driven strategies: When news hits one node in a co-mention cluster, predict which connected stocks will follow—and position ahead of the herd
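As one illustration of the pairs trading idea in the list above, the sketch below measures how far the latest log price spread between two co-mentioned stocks sits from its own history, expressed as a z-score. The `spread_zscore` helper, the price series, and the threshold of 2 are assumptions for the example, not a tested trading rule.

```python
from math import log
from statistics import mean, stdev

def spread_zscore(prices_a, prices_b):
    """Z-score of the latest log price spread relative to the earlier spreads."""
    spread = [log(a) - log(b) for a, b in zip(prices_a, prices_b)]
    history, latest = spread[:-1], spread[-1]
    return (latest - mean(history)) / stdev(history)

# Hypothetical prices for two frequently co-mentioned stocks; A jumps on the last day.
prices_a = [100, 101, 100, 102, 101, 110]
prices_b = [50, 51, 50, 50.5, 51, 50.5]

z = spread_zscore(prices_a, prices_b)
diverged = abs(z) > 2  # assumed threshold for flagging a divergence
```

Combining a divergence flag like this with the co-mention edge weight restricts the candidate list to pairs where the narrative linkage is strong.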
This strategy requires two things: real-time news data with entity extraction, and graph infrastructure for community detection and GNN training. The payoff is a continuously updated map of the market's narrative structure.
Interested in building co-mention networks for your universe? Book a call to discuss data pipelines, graph construction, and model implementation.