I Tracked How a Topic Goes From ArXiv Paper to Reddit Hype to Market Mover — in One API Call Per Source
Information Has a Lifecycle. Track It.
Every market-moving narrative follows the same path: an academic paper gets published, niche YouTube creators pick it up, Reddit threads start forming, mainstream news publishes articles, and finally Google Trends spikes. By the time your Bloomberg terminal lights up, the signal has been visible for months — if you knew where to look.
I built a tool that tracks this cascade across six sources through Trawl's unified API. For any topic, it tells you exactly where in the lifecycle that narrative sits right now.
The Cascade Model
Information flows through five predictable stages:
| Stage | Name | Signal |
|---|---|---|
| 1 | Academic | Topic exists only in papers. Very early, no public awareness. |
| 2 | Early Adopter | Niche YouTube creators and podcasters are covering it. |
| 3 | Social Amplification | Reddit threads are forming. Retail awareness growing. |
| 4 | Mainstream | News outlets are publishing articles. Broad awareness. |
| 5 | Peak / Saturation | Google Trends spike and Wikipedia pageview surge. Likely priced in. |
The alpha is in the gap between stages. A topic at Stage 2 (niche YouTube coverage) that's headed to Stage 4 (mainstream news) represents an information advantage. A topic already at Stage 5 is priced in — you're late.
Step 1: Search Academic Papers
Start with the source of truth. Has this topic been published in academic literature?
```shell
curl "https://api.gettrawl.com/api/papers/search?q=GLP-1+weight+loss"
```

Step 2: Check Social Amplification
Is Reddit talking about it? When social platforms start amplifying, you're entering Stage 3.
```shell
curl "https://api.gettrawl.com/api/reddit/search?query=GLP-1+weight+loss"
```

Step 3: Mainstream Coverage
Has news media picked it up? If yes, you're at Stage 4 or beyond.
```shell
curl "https://api.gettrawl.com/api/news/search?q=GLP-1+weight+loss"
```

Step 4: Check Saturation
Wikipedia pageviews are a proxy for mass public awareness. A spike here means Stage 5 — the narrative is saturated.
```shell
curl "https://api.gettrawl.com/api/pageviews/GLP-1?article=GLP-1"
```

The Lifecycle Detection Algorithm
The stage is determined by which sources have results:
```
Stage 5: News + (Google Trends spike OR Wikipedia surge) -> Saturated
Stage 4: News coverage detected                          -> Mainstream
Stage 3: Reddit threads forming                          -> Social amplification
Stage 2: YouTube/podcast coverage + papers               -> Early adopter
Stage 1: Papers only                                     -> Academic
```
The tool also computes a trend direction for each source. It splits the date range in half and compares item density in each period. If the second half has 30%+ more items, the trend is "increasing." If 30%+ fewer, "decreasing." This tells you whether a narrative is gaining or losing momentum within each stage.
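The half-split comparison described above can be sketched as a standalone function. The name `trend_direction` and the item shape (a list of `datetime` values pulled from each source's results) are illustrative assumptions; the 30% thresholds come straight from the description.

```python
from datetime import datetime

def trend_direction(dates: list[datetime], start: datetime, end: datetime) -> str:
    """Split [start, end] in half and compare item density in each period.

    Returns "increasing" if the second half has 30%+ more items,
    "decreasing" if 30%+ fewer, and "flat" otherwise.
    """
    midpoint = start + (end - start) / 2
    first_half = sum(1 for d in dates if d < midpoint)
    second_half = sum(1 for d in dates if d >= midpoint)
    if first_half == 0:
        # Nothing in the early period: any later activity counts as momentum
        return "increasing" if second_half > 0 else "flat"
    ratio = second_half / first_half
    if ratio >= 1.3:   # 30%+ more items in the later half
        return "increasing"
    if ratio <= 0.7:   # 30%+ fewer items in the later half
        return "decreasing"
    return "flat"
```

Run per source, this turns each stage check into a direction as well: a Stage 3 topic with "increasing" Reddit density is heading toward Stage 4, while "decreasing" suggests the narrative stalled.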
The Full Pipeline
```python
from trawl import TrawlClient

client = TrawlClient()

def analyze_cascade(topic: str) -> dict:
    """Track a narrative across all sources and determine lifecycle stage."""
    # Search all sources
    papers = client.papers.search(q=topic)
    reddit = client.reddit.search(query=topic)
    news = client.news.search(q=topic)
    youtube = client.search(q=topic)  # general search covers YouTube content

    # Check saturation signals: if the pageviews endpoint returns data,
    # treat that as the saturation proxy
    try:
        pageviews = client.pageviews.get(topic)
        has_pageview_spike = True
    except Exception:
        has_pageview_spike = False

    # Determine stage
    has_papers = len(papers.results) > 0
    has_youtube = len(youtube.results) > 0
    has_reddit = len(reddit.results) > 0
    has_news = len(news.results) > 0

    if has_news and has_pageview_spike:
        stage = 5  # Saturated
    elif has_news:
        stage = 4  # Mainstream
    elif has_reddit:
        stage = 3  # Social amplification
    elif has_youtube and has_papers:
        stage = 2  # Early adopter
    else:
        stage = 1  # Academic

    # Build cascade timeline (earliest mention per source)
    timeline = []
    sources = {
        "papers": papers.results,
        "youtube": youtube.results,
        "reddit": reddit.results,
        "news": news.results,
    }
    for name, results in sources.items():
        if results:
            dates = [r.date for r in results if hasattr(r, "date") and r.date]
            if dates:
                timeline.append({
                    "source": name,
                    "earliest": min(dates),
                    "count": len(results),
                })
    timeline.sort(key=lambda x: x["earliest"])

    stage_names = {
        1: "Academic",
        2: "Early Adopter",
        3: "Social Amplification",
        4: "Mainstream",
        5: "Peak / Saturation",
    }
    return {
        "topic": topic,
        "stage": stage,
        "stage_name": stage_names[stage],
        "timeline": timeline,
        "source_counts": {name: len(r) for name, r in sources.items()},
    }

# Run it
result = analyze_cascade("GLP-1 weight loss")
print(f"Topic: {result['topic']}")
print(f"Stage: {result['stage']} - {result['stage_name']}")
print()
print("Cascade Timeline:")
for entry in result["timeline"]:
    print(f"  {entry['source']:10s} {entry['earliest']} ({entry['count']} results)")
```
Why This Matters
Every narrative that moved markets in the last decade — AI, crypto, GLP-1 drugs, meme stocks — followed this exact cascade. The information was available at Stage 1 or 2 if you were looking in the right places. By Stage 5, the trade is crowded.
The key insight is that Trawl's unified API makes cross-source temporal analysis trivial. What would normally require six different APIs, six different auth flows, and six different response formats is just six GET requests to the same base URL. No API keys. No authentication. One base URL.
Non-Code Options
Trawl's MCP server works inside Claude Desktop — describe what you're tracking and it handles the API calls:
"Track the narrative lifecycle of 'GLP-1 weight loss' across academic papers, YouTube, Reddit, and news. What stage is it in? Is it accelerating or decelerating?"
Claude searches all sources, compares earliest mention dates, and gives you a lifecycle assessment. You get the analysis without writing a line of code.
Source Code
The full implementation with parallel fetching, trend detection, and Rich terminal output: