Explore Alternative Data for Quant Trading with One API
Try it now (requires Python 3.9+, no API key needed):

```bash
git clone https://github.com/trawlhq/trawl-examples.git && cd trawl-examples
pip install marimo requests && marimo run demos/quant-trading.py
```
Every systematic fund runs on data. The edge isn't in having more data — it's in having the right data, faster and cheaper than the other side. Alternative data (alt data) — earnings call transcripts, SEC filings, social media discussions, news velocity — has become table stakes for quantitative strategies.
The problem? Each source has its own API, auth system, schema, and rate limits. Integrating five sources means maintaining five pipelines.
Trawl fixes the ingestion layer. One API, one SDK, one schema. Pull structured text from 10+ internet sources and feed it into your own models. I'll walk through what's available — and include an interactive demo you can run right now.
The Data Sources
Our demo pulls from five sources, each with a different reliability tier:
| Tier | Source | What It Provides | Trawl Endpoint |
|---|---|---|---|
| 1 — Regulatory | SEC Filings | Parsed insider trades (Form 4 XML), 10-K, 10-Q, 8-K | /api/filings/form4/{ticker} |
| 2 — Corporate | Earnings Calls | Speaker-segmented transcripts via Finnhub | /api/earnings/search |
| 3 — News | Financial News | Multi-source aggregated articles | /api/news/company |
| 4 — Social | Reddit | Posts with scores, comments, subreddit | /api/reddit/search |
| 5 — Academic | Papers | ArXiv + Semantic Scholar | /api/papers/search |
Why tiers matter: A Form 4 SEC filing is a legally mandated disclosure. A Reddit post is an anonymous opinion. Your model should weight them accordingly.
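How you weight the tiers is strategy-specific, but a minimal sketch (purely illustrative; the source names and weights below are assumptions, not Trawl fields) might discount each signal by its tier before aggregating:

```python
# Hypothetical tier weights: tune these for your own strategy.
TIER_WEIGHTS = {
    "sec_filing": 1.0,     # Tier 1: legally mandated disclosure
    "earnings_call": 0.8,  # Tier 2: corporate, but management-framed
    "news": 0.5,           # Tier 3: edited, but secondhand
    "reddit": 0.2,         # Tier 4: anonymous opinion
}

def tier_weighted_score(signals):
    """Aggregate per-source sentiment scores (-1..+1) into one number,
    discounting each signal by its source tier."""
    total = sum(TIER_WEIGHTS.get(s["source"], 0.1) * s["score"] for s in signals)
    weight = sum(TIER_WEIGHTS.get(s["source"], 0.1) for s in signals)
    return total / weight if weight else 0.0

signals = [
    {"source": "sec_filing", "score": 0.9},  # insider buying
    {"source": "reddit", "score": -0.6},     # bearish chatter
]
print(f"{tier_weighted_score(signals):+.2f}")  # strong Tier-1 signal dominates
```

The Reddit signal barely dents the SEC signal here, which is the point: the crowd gets a vote, not a veto.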
Quick Start: 5 Sources in 10 Lines
```python
from trawl import TrawlClient

# No key needed for preview endpoints (what the demo uses)
client = TrawlClient()

earnings = client.earnings.search("AAPL", from_date="2024-01-01")
filings = client.filings.form4("AAPL", date_after="2024-01-01")
news = client.news.company_news("AAPL")
reddit = client.reddit.search("$AAPL", from_date="2024-01-01")
papers = client.papers.search("Apple stock sentiment NLP")
sentiment = client.ai.preview_sentiment("Revenue beat expectations by 15%...")
```
For production (higher rate limits, watches, webhooks), pass your API key:
```python
client = TrawlClient(api_key="your_key")  # Get a free key at gettrawl.com/register
```
Five sources, ten lines, zero per-source integration work. Every response includes timestamps, titles, source attribution, and URLs to original content.
Note: The interactive demo uses raw HTTP `requests` calls so you can see the API directly. In production, use the SDK shown above — it handles retries, rate limits, and typed responses.
Or query all sources at once with unified search — this is Trawl's core differentiator:
```python
# Single call — all sources, one schema
results = client.search.unified("AAPL earnings guidance")
for item in results:
    print(f"[{item.source}] {item.title} — {item.published_date}")
```
Build vs. Buy: Why Use Trawl?
| | Build In-House | Use Trawl |
|---|---|---|
| Setup time | ~80 engineering hours | pip install trawl-sdk |
| Ongoing maintenance | ~10 hrs/month (API changes, rate limits) | Zero |
| Auth systems | 5 different (Finnhub key, EDGAR User-Agent, Reddit OAuth, etc.) | 1 API key |
| Schemas | 5 different response formats | 1 unified schema |
| Monthly cost | ~$1,500/mo in eng time + API fees | $0-29/mo |
Working with SEC Filings
Form 4 filings (insider transactions) are a well-studied signal in academic literature. Trawl parses the Form 4 XML directly — you get structured buy/sell data, not just metadata:
```bash
curl "https://api.gettrawl.com/api/filings/form4/AAPL?max_results=5"
```

```python
# Parsed insider transactions — buy/sell direction, shares, price
form4 = client.filings.form4("AAPL", date_after="2024-01-01", max_results=10)
for filing in form4["filings"]:
    for owner in filing["owners"]:
        print(f"Insider: {owner['name']} ({owner['officer_title']})")
    for tx in filing["transactions"]:
        print(f"  {tx['transaction_type']}: {tx['shares']} shares @ ${tx['price_per_share']}")
        print(f"  Total value: ${tx['total_value']:,.2f}")
```
No XML parsing on your end. Trawl handles the EDGAR rate limits, namespace quirks, and derivative/non-derivative transaction variants. Every transaction includes: security title, date, code (purchase/sale/gift/exercise), shares, price, ownership type, and whether it's a derivative instrument.
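As a sketch of what you can do with the parsed output, here's a tiny net-dollar-flow calculation over the structure shown above. The `"P"`/`"S"` values are the standard SEC transaction codes for open-market purchases and sales; treat the exact field values as assumptions and check them against your own responses.

```python
# Net insider dollar flow from parsed Form 4 data: positive means
# insiders were net buyers over the window, negative means net sellers.
def net_insider_flow(filings):
    net = 0.0
    for filing in filings:
        for tx in filing["transactions"]:
            value = tx["shares"] * tx["price_per_share"]
            if tx["transaction_type"] == "P":    # open-market purchase
                net += value
            elif tx["transaction_type"] == "S":  # open-market sale
                net -= value
            # other codes (gifts, exercises) are ignored in this sketch
    return net

sample = [{"transactions": [
    {"transaction_type": "P", "shares": 1000, "price_per_share": 150.0},
    {"transaction_type": "S", "shares": 200, "price_per_share": 155.0},
]}]
print(f"Net flow: ${net_insider_flow(sample):,.2f}")  # $119,000.00
```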
Reddit Data for Quant Research
Reddit discussions (r/wallstreetbets, r/stocks, r/investing) provide a window into retail sentiment. Trawl's Reddit API returns full post metadata:
```bash
curl "https://api.gettrawl.com/api/reddit/search?query=%24AAPL&limit=5&sort=relevance"
```

```python
# Date filtering for backtesting — pull historical sentiment windows
posts = client.reddit.search("$AAPL", from_date="2024-01-01", to_date="2024-06-30")
for post in posts:
    print(f"r/{post.subreddit}: {post.title}")
    print(f"  Score: {post.score}, Comments: {post.num_comments}")
```
For deeper analysis, fetch the full comment tree as RAG-ready segments:
```python
# Get the post ID from search results, then fetch the full thread
post_id = posts[0].id  # e.g., "1s8zm2z"
thread = client.reddit.get_thread(post_id)
for segment in thread.segments:
    print(f"[{segment.speaker}] {segment.text}")
```
Reddit Analytics: Mentions, Trending, and Sentiment
Beyond search, Trawl offers dedicated analytics endpoints for quantitative Reddit analysis:
Ticker mention velocity — how often is a ticker being discussed across finance subreddits?
```bash
curl "https://api.gettrawl.com/api/reddit/mentions/AAPL?period=24h"
```

```python
# Mention velocity across r/wallstreetbets, r/stocks, r/investing, etc.
mentions = client.reddit.mentions("AAPL", period="24h")
print(f"Mentions in last 24h: {mentions['total_mentions']}")
for sub in mentions["subreddits"]:
    print(f"  r/{sub['name']}: {sub['count']} mentions")
```
Trending tickers — which tickers are gaining momentum on Reddit right now?
```bash
curl "https://api.gettrawl.com/api/reddit/trending?period=24h&limit=10"
```

```python
# Trending tickers with noise filtering (excludes common false positives)
trending = client.reddit.trending(period="24h", limit=25)
for ticker in trending["tickers"]:
    print(f"${ticker['symbol']}: {ticker['mentions']} mentions, score {ticker['score']}")
```
Aggregated Reddit sentiment — AI-scored sentiment per ticker across finance subreddits:
```bash
curl "https://api.gettrawl.com/api/reddit/sentiment/AAPL?period=24h&max_posts=10"
```

```python
# Aggregated AI sentiment from finance subreddits
sentiment = client.reddit.sentiment("AAPL", period="24h", max_posts=10)
print(f"Overall: {sentiment['overall_sentiment']['label']} ({sentiment['overall_sentiment']['score']:+.2f})")
for sub, data in sentiment["subreddit_scores"].items():
    print(f"  r/{sub}: {data['label']} ({data['score']:+.2f})")
```
These three endpoints give you a complete picture of retail sentiment: how much attention a ticker is getting (mentions), what's gaining momentum (trending), and whether the crowd is bullish or bearish (sentiment).
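How you combine the three is up to your model. One illustrative (and entirely hypothetical, not part of the Trawl API) composite multiplies log-scaled attention by crowd direction, so a few bullish posts score lower than a bullish flood:

```python
import math

# Hypothetical composite: attention (log-scaled mention count)
# times crowd direction (sentiment score in -1..+1).
def retail_signal(total_mentions, sentiment_score):
    attention = math.log1p(total_mentions)  # dampen raw counts
    return attention * sentiment_score

# e.g. 500 mentions in 24h with a mildly bullish crowd
print(f"{retail_signal(500, 0.35):+.2f}")
```

The log scaling matters: going from 50 to 500 mentions roughly doubles attention rather than multiplying it by ten, which keeps one viral thread from dominating the signal.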
Financial Sentiment Analysis (Built-In)
Trawl includes a server-side financial sentiment API — no need to bring your own NLP model:
```bash
curl -X POST "https://api.gettrawl.com/api/ai/preview/sentiment" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Revenue beat expectations by 15%. Management raised full-year guidance and announced a $10B buyback."
  }'
```

```python
# Score any financial text — earnings transcripts, news, Reddit posts
sentiment = client.ai.preview_sentiment(
    "Revenue beat expectations by 15%. Management raised full-year guidance."
)
print(f"Score: {sentiment['sentiment_score']:+.2f}")   # -1.0 to +1.0
print(f"Label: {sentiment['sentiment_label']}")        # bullish / bearish / neutral
print(f"Confidence: {sentiment['confidence']:.0%}")
for phrase in sentiment["key_phrases"]:
    print(f"  {phrase['impact']}: {phrase['phrase']}")
```
The sentiment model is tuned for financial language — it understands earnings beats, guidance changes, and forward-looking statements. For a full production pipeline:
- Ingest via Trawl (earnings + news + Reddit + filings)
- Chunk using Trawl's built-in chunker (`/api/earnings/{ticker}/{year}/{quarter}/chunks`)
- Score with Trawl's `/api/ai/preview/sentiment` or your own FinBERT/GPT-4
- Embed into a vector store for semantic retrieval
- Aggregate scores by source tier, recency, and confidence
For deeper analysis, bring your own model — Trawl handles the data pipeline so you can focus on alpha.
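For the final aggregation step, one common pattern is confidence-weighted exponential recency decay: a high-confidence score from yesterday counts far more than a low-confidence score from two weeks ago. This sketch uses made-up numbers and a hypothetical half-life; it is not a Trawl feature.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 7.0  # assumed half-life; tune to your signal's decay

def decayed_score(items, now=None):
    """Weighted average of scores, where each weight is the item's
    confidence times an exponential decay on its age."""
    now = now or datetime.now(timezone.utc)
    num = den = 0.0
    for item in items:
        age_days = (now - item["published"]).total_seconds() / 86400
        w = item["confidence"] * 0.5 ** (age_days / HALF_LIFE_DAYS)
        num += w * item["score"]
        den += w
    return num / den if den else 0.0

now = datetime(2024, 6, 30, tzinfo=timezone.utc)
items = [
    {"score": 0.8, "confidence": 0.9, "published": datetime(2024, 6, 29, tzinfo=timezone.utc)},
    {"score": -0.4, "confidence": 0.6, "published": datetime(2024, 6, 16, tzinfo=timezone.utc)},
]
print(f"{decayed_score(items, now=now):+.2f}")  # fresh bullish item dominates
```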
Intelligence Report: All Sources in One Call
For the ultimate shortcut, Trawl's intelligence report endpoint aggregates every available source for a ticker into a single structured response — Reddit sentiment, insider trades, earnings, SEC filings, news, congressional trading, and Wikipedia pageviews:
```bash
curl "https://api.gettrawl.com/api/intelligence/AAPL"
```

```python
# One call — complete cross-source intelligence
import requests

report = requests.get("https://api.gettrawl.com/api/intelligence/AAPL").json()
print(f"Summary: {report['summary']}")
for section_name, section_data in report["sections"].items():
    print(f"\n--- {section_name} ---")
    print(f"  {len(section_data) if isinstance(section_data, list) else 'Available'}")
for signal in report.get("signals", []):
    print(f"Signal: {signal['type']} — {signal['description']} (confidence: {signal['confidence']:.0%})")
```
This is the single most powerful endpoint for quant research — it replaces a dozen individual API calls with one request. Use it for daily briefings, screening new tickers, or as the first step in a deeper analysis pipeline.
Try the Interactive Demo
We built a Marimo notebook (a reactive Python notebook, like Jupyter but interactive) that pulls live data from all 5 sources:
```bash
git clone https://github.com/trawlhq/trawl-examples.git
cd trawl-examples
pip install marimo requests
marimo run demos/quant-trading.py
```
The demo includes:
- Live API calls to 5 Trawl sources with human-readable error messages
- Unified search — one call that queries all sources in a single request
- Financial sentiment scoring — Trawl's server-side bullish/bearish/neutral NLP on live content
- Form 4 insider transactions — parsed buy/sell data from SEC EDGAR XML
- Reddit sentiment — AI-aggregated sentiment across finance subreddits
- Intelligence report — full cross-source analysis in one call
- Date-range backtesting — filter earnings and Reddit by custom date windows
- AI summarization — Trawl's built-in NLP on real financial content (no setup)
- News volume timeline showing publication frequency by hour
- Reddit community breakdown by subreddit engagement
No API key needed. No sign-up required.
From Demo to Production
- Start with the intelligence report (`/api/intelligence/{ticker}`) for a complete cross-source overview in one call
- Score sentiment with Trawl's `/api/ai/preview/sentiment` (server-side NLP) or bring your own FinBERT/GPT-4
- Track insider trades via `/api/filings/form4/{ticker}` — parsed XML with buy/sell direction, shares, and price
- Monitor Reddit momentum with `/api/reddit/mentions/{ticker}`, `/trending`, and `/sentiment/{ticker}`
- Backtest with date ranges — `from_date`/`to_date` on earnings, Reddit, and filings
- Use the chunks endpoint for RAG-ready, speaker-segmented earnings data
- Set up watches to monitor tickers automatically (Pro tier, $29/mo)
- Scale to 100+ tickers with concurrent requests or the unified search endpoint
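For the concurrency piece, a small fan-out helper is usually enough. The sketch below uses a stand-in fetch function so it runs offline; in practice you'd pass in an SDK call (e.g. a per-ticker mentions lookup), keeping `max_workers` below your rate limit.

```python
from concurrent.futures import ThreadPoolExecutor

# Generic fan-out: apply any per-ticker fetch function across a
# universe concurrently, returning {ticker: result} in input order.
def fetch_all(tickers, fetch_fn, max_workers=10):
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(tickers, pool.map(fetch_fn, tickers)))

# Stand-in fetch for illustration; swap in a real call such as
# lambda t: client.reddit.mentions(t, period="24h") in your pipeline.
results = fetch_all(["AAPL", "MSFT", "NVDA"], lambda t: {"ticker": t})
print(sorted(results))  # ['AAPL', 'MSFT', 'NVDA']
```

`pool.map` preserves input order, so zipping the results back to the ticker list is safe even though the requests complete out of order.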
Plug Into Your Stack
Trawl's API is just HTTP — it works with any framework. Here's how to wire it into the most common AI and automation stacks for quant workflows.
CrewAI — Multi-Agent Research
Spin up a team of agents that divide and conquer across data sources. One analyst pulls earnings, another tracks insider trades, a third monitors news sentiment — then they synthesize a brief.
```python
from crewai import Agent, Task, Crew
from trawl_crewai import TrawlEarningsTool, TrawlNewsTool, TrawlIntelligenceTool

analyst = Agent(
    role="Equity Research Analyst",
    goal="Analyze a company using earnings calls, news, and insider trading data",
    tools=[TrawlEarningsTool(), TrawlNewsTool(), TrawlIntelligenceTool()],
)
task = Task(
    description="Build a research brief on NVDA: recent earnings call highlights, insider trading patterns, and news sentiment",
    agent=analyst,
)
crew = Crew(agents=[analyst], tasks=[task])
result = crew.kickoff()
```
LangChain — RAG Over Earnings
Load earnings call transcripts as LangChain Documents, then index them for semantic question-answering. Useful for screening dozens of tickers by asking natural-language questions about guidance, margins, or capex.
```python
from langchain_trawl import TrawlLoader

# Load earnings call transcripts as LangChain Documents
loader = TrawlLoader(urls=[
    "https://api.gettrawl.com/api/earnings/AAPL/2026/1",
])
docs = loader.load()

# Index for question-answering
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()
```
Slack Bot
Get intelligence delivered to your trading channel — no context-switching required:
```
/trawl-intelligence NVDA
/trawl-earnings AAPL
/trawl-sentiment TSLA
```
MCP (No Code)
If you use Claude Desktop with the Trawl MCP server, you can skip code entirely. Ask Claude directly:
"Pull up NVDA's latest earnings call, check for any Form 4 insider transactions this month, and compare with congressional trading activity."
Claude will call the right Trawl tools behind the scenes — earnings, Form 4 filings, and congressional trades — and return a synthesized answer. No API keys to manage, no scripts to maintain.
Get Started
```bash
pip install trawl-sdk
```
Want to monitor 100 tickers across 5 sources? The free tier covers 1,000 requests/month. Sign up in 30 seconds — no credit card required. Pro includes a 7-day free trial.