Quant Trader · Intermediate
Tags: quant trading, alternative data, sentiment analysis, SEC filings, earnings calls, fintech

Explore Alternative Data for Quant Trading with One API

Trawl Team


Try it now (requires Python 3.9+, no API key needed):

git clone https://github.com/trawlhq/trawl-examples.git && cd trawl-examples
pip install marimo requests && marimo run demos/quant-trading.py

Every systematic fund runs on data. The edge isn't in having more data — it's in having the right data, faster and cheaper than the other side. Alternative data (alt data) — earnings call transcripts, SEC filings, social media discussions, news velocity — has become table stakes for quantitative strategies.

The problem? Each source has its own API, auth system, schema, and rate limits. Integrating five sources means maintaining five pipelines.

Trawl fixes the ingestion layer. One API, one SDK, one schema. Pull structured text from 10+ internet sources and feed it into your own models. I'll walk through what's available — and include an interactive demo you can run right now.

The Data Sources

Our demo pulls from five sources, each with a different reliability tier:

| Tier | Source | What It Provides | Trawl Endpoint |
|---|---|---|---|
| 1 — Regulatory | SEC Filings | Parsed insider trades (Form 4 XML), 10-K, 10-Q, 8-K | /api/filings/form4/{ticker} |
| 2 — Corporate | Earnings Calls | Speaker-segmented transcripts via Finnhub | /api/earnings/search |
| 3 — News | Financial News | Multi-source aggregated articles | /api/news/company |
| 4 — Social | Reddit | Posts with scores, comments, subreddit | /api/reddit/search |
| 5 — Academic | Papers | ArXiv + Semantic Scholar | /api/papers/search |

Why tiers matter: A Form 4 SEC filing is a legally mandated disclosure. A Reddit post is an anonymous opinion. Your model should weight them accordingly.
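To put that into practice, a tier-weight map applied at aggregation time is a common starting point. The sketch below is illustrative, not part of the Trawl API: the weights, source labels, and item shape are all assumptions you should tune for your own strategy.

```python
# Illustrative tier weights — tune these for your own strategy
TIER_WEIGHTS = {
    "sec_filings": 1.0,  # legally mandated disclosures
    "earnings": 0.8,     # corporate statements, on the record
    "news": 0.5,         # journalistic aggregation
    "reddit": 0.2,       # anonymous retail opinion
    "papers": 0.6,       # peer-reviewed, but slow-moving
}

def weighted_sentiment(items):
    """Combine per-item sentiment scores, weighted by source tier."""
    total = weight_sum = 0.0
    for item in items:
        w = TIER_WEIGHTS.get(item["source"], 0.1)  # unknown sources get a floor weight
        total += w * item["sentiment"]
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

signals = [
    {"source": "sec_filings", "sentiment": 0.6},   # insider buying
    {"source": "reddit", "sentiment": -0.9},       # bearish retail chatter
]
print(weighted_sentiment(signals))  # dominated by the regulatory tier
```

With these weights, one strongly bearish Reddit item barely dents a moderately bullish filing signal, which is exactly the asymmetry the tiers are meant to encode.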

Quick Start: 5 Sources in 10 Lines

from trawl import TrawlClient

# No key needed for preview endpoints (what the demo uses)
client = TrawlClient()

earnings = client.earnings.search("AAPL", from_date="2024-01-01")
filings = client.filings.form4("AAPL", date_after="2024-01-01")
news = client.news.company_news("AAPL")
reddit = client.reddit.search("$AAPL", from_date="2024-01-01")
papers = client.papers.search("Apple stock sentiment NLP")
sentiment = client.ai.preview_sentiment("Revenue beat expectations by 15%...")

For production (higher rate limits, watches, webhooks), pass your API key:

client = TrawlClient(api_key="your_key")  # Get a free key at gettrawl.com/register

Five sources, ten lines, zero per-source integration work. Every response includes timestamps, titles, source attribution, and URLs to original content.

Note: The interactive demo uses raw HTTP requests so you can see the API directly. In production, use the SDK shown above — it handles retries, rate limits, and typed responses.

Or query all sources at once with unified search — this is Trawl's core differentiator:

# Single call — all sources, one schema
results = client.search.unified("AAPL earnings guidance")
for item in results:
    print(f"[{item.source}] {item.title} ({item.published_date})")

Build vs. Buy: Why Use Trawl?

| | Build In-House | Use Trawl |
|---|---|---|
| Setup time | ~80 engineering hours | pip install trawl-sdk |
| Ongoing maintenance | ~10 hrs/month (API changes, rate limits) | Zero |
| Auth systems | 5 different (Finnhub key, EDGAR User-Agent, Reddit OAuth, etc.) | 1 API key |
| Schemas | 5 different response formats | 1 unified schema |
| Monthly cost | ~$1,500/mo in eng time + API fees | $0-29/mo |

Working with SEC Filings

Form 4 filings (insider transactions) are a well-studied signal in academic literature. Trawl parses the Form 4 XML directly — you get structured buy/sell data, not just metadata:

Get parsed Form 4 insider transactions (curl):

curl "https://api.gettrawl.com/api/filings/form4/AAPL?max_results=5"

Or via the SDK:

# Parsed insider transactions — buy/sell direction, shares, price
form4 = client.filings.form4("AAPL", date_after="2024-01-01", max_results=10)
for filing in form4["filings"]:
    for owner in filing["owners"]:
        print(f"Insider: {owner['name']} ({owner['officer_title']})")
    for tx in filing["transactions"]:
        print(f"  {tx['transaction_type']}: {tx['shares']} shares @ ${tx['price_per_share']}")
        print(f"  Total value: ${tx['total_value']:,.2f}")

No XML parsing on your end. Trawl handles the EDGAR rate limits, namespace quirks, and derivative/non-derivative transaction variants. Every transaction includes: security title, date, code (purchase/sale/gift/exercise), shares, price, ownership type, and whether it's a derivative instrument.
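As a downstream example, one classic transform over this data is a dollar-weighted net insider flow ratio. This is a minimal sketch, assuming filings shaped like the snippet above; the field names mirror that example but are illustrative, not a schema guarantee.

```python
def net_insider_flow(filings):
    """Dollar-weighted net insider flow in [-1, +1]: +1 = all buying, -1 = all selling."""
    buys = sells = 0.0
    for filing in filings:
        for tx in filing["transactions"]:
            value = tx["shares"] * tx["price_per_share"]
            if tx["transaction_type"] == "purchase":
                buys += value
            elif tx["transaction_type"] == "sale":
                sells += value
            # gifts, exercises, etc. are ignored in this simple version
    total = buys + sells
    return (buys - sells) / total if total else 0.0

sample = [{"transactions": [
    {"transaction_type": "purchase", "shares": 1000, "price_per_share": 150.0},
    {"transaction_type": "sale", "shares": 200, "price_per_share": 150.0},
]}]
print(net_insider_flow(sample))  # net buying, so positive
```

In a real strategy you would compute this over a rolling window per ticker and treat derivative transactions separately.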

Reddit Data for Quant Research

Reddit discussions (r/wallstreetbets, r/stocks, r/investing) provide a window into retail sentiment. Trawl's Reddit API returns full post metadata:

# Date filtering for backtesting — pull historical sentiment windows
posts = client.reddit.search("$AAPL", from_date="2024-01-01", to_date="2024-06-30")
for post in posts:
    print(f"r/{post.subreddit}: {post.title}")
    print(f"  Score: {post.score}, Comments: {post.num_comments}")
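For backtesting, the raw post list usually gets bucketed into a time series first. Here is a hypothetical sketch using plain dicts with a created_at ISO timestamp; that field name is our assumption, so check it against the actual response schema.

```python
from collections import Counter
from datetime import datetime

def weekly_volume(posts):
    """Bucket post timestamps into ISO-week counts for a backtest series."""
    counts = Counter()
    for post in posts:
        dt = datetime.fromisoformat(post["created_at"])
        counts[dt.strftime("%G-W%V")] += 1  # e.g. "2024-W01"
    return dict(sorted(counts.items()))

sample = [
    {"created_at": "2024-01-02T10:00:00"},
    {"created_at": "2024-01-03T15:30:00"},
    {"created_at": "2024-01-10T09:00:00"},
]
print(weekly_volume(sample))  # {'2024-W01': 2, '2024-W02': 1}
```

Swap the strftime format for daily or hourly buckets depending on your strategy's holding period.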

For deeper analysis, fetch the full comment tree as RAG-ready segments:

# Get the post ID from search results, then fetch the full thread
post_id = posts[0].id  # e.g., "1s8zm2z"
thread = client.reddit.get_thread(post_id)
for segment in thread.segments:
    print(f"[{segment.speaker}] {segment.text}")

Reddit Analytics: Mentions, Trending, and Sentiment

Beyond search, Trawl offers dedicated analytics endpoints for quantitative Reddit analysis:

Ticker mention velocity — how often is a ticker being discussed across finance subreddits?

Get ticker mention velocity (curl):

curl "https://api.gettrawl.com/api/reddit/mentions/AAPL?period=24h"

Or via the SDK:

# Mention velocity across r/wallstreetbets, r/stocks, r/investing, etc.
mentions = client.reddit.mentions("AAPL", period="24h")
print(f"Mentions in last 24h: {mentions['total_mentions']}")
for sub in mentions["subreddits"]:
    print(f"  r/{sub['name']}: {sub['count']} mentions")

Trending tickers — which tickers are gaining momentum on Reddit right now?

# Trending tickers with noise filtering (excludes common false positives)
trending = client.reddit.trending(period="24h", limit=25)
for ticker in trending["tickers"]:
    print(f"${ticker['symbol']}: {ticker['mentions']} mentions, score {ticker['score']}")

Aggregated Reddit sentiment — AI-scored sentiment per ticker across finance subreddits:

Get AI-aggregated Reddit sentiment (curl):

curl "https://api.gettrawl.com/api/reddit/sentiment/AAPL?period=24h&max_posts=10"

Or via the SDK:

# Aggregated AI sentiment from finance subreddits
sentiment = client.reddit.sentiment("AAPL", period="24h", max_posts=10)
print(f"Overall: {sentiment['overall_sentiment']['label']} ({sentiment['overall_sentiment']['score']:+.2f})")
for sub, data in sentiment["subreddit_scores"].items():
    print(f"  r/{sub}: {data['label']} ({data['score']:+.2f})")

These three endpoints give you a complete picture of retail sentiment: how much attention a ticker is getting (mentions), what's gaining momentum (trending), and whether the crowd is bullish or bearish (sentiment).
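One way to fuse these into a single number is attention times direction: a z-scored mention count signed by the aggregated sentiment score. The formula below is our own illustration, not a Trawl endpoint.

```python
import statistics

def attention_signal(mention_history, current_mentions, sentiment_score):
    """Z-scored mention velocity, signed by aggregated sentiment in [-1, +1]."""
    mean = statistics.mean(mention_history)
    stdev = statistics.stdev(mention_history)
    z = (current_mentions - mean) / stdev if stdev else 0.0
    return z * sentiment_score

# 7-day baseline of daily mentions, today's spike, and today's sentiment
history = [120, 95, 110, 130, 105, 98, 115]
print(attention_signal(history, 240, 0.4))   # big spike + bullish -> strongly positive
print(attention_signal(history, 240, -0.4))  # big spike + bearish -> strongly negative
```

A mention spike with neutral sentiment (score near zero) correctly washes out, which is the point of multiplying rather than adding the two components.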

Financial Sentiment Analysis (Built-In)

Trawl includes a server-side financial sentiment API — no need to bring your own NLP model:

Score financial sentiment (curl):

curl -X POST "https://api.gettrawl.com/api/ai/preview/sentiment" \
  -H "Content-Type: application/json" \
  -d '{
  "text": "Revenue beat expectations by 15%. Management raised full-year guidance and announced a $10B buyback."
}'

Or via the SDK:

# Score any financial text — earnings transcripts, news, Reddit posts
sentiment = client.ai.preview_sentiment(
    "Revenue beat expectations by 15%. Management raised full-year guidance."
)
print(f"Score: {sentiment['sentiment_score']:+.2f}")  # -1.0 to +1.0
print(f"Label: {sentiment['sentiment_label']}")         # bullish / bearish / neutral
print(f"Confidence: {sentiment['confidence']:.0%}")
for phrase in sentiment["key_phrases"]:
    print(f"  {phrase['impact']}: {phrase['phrase']}")

The sentiment model is tuned for financial language — it understands earnings beats, guidance changes, and forward-looking statements. For a full production pipeline:

  1. Ingest via Trawl (earnings + news + Reddit + filings)
  2. Chunk using Trawl's built-in chunker (/api/earnings/{ticker}/{year}/{quarter}/chunks)
  3. Score with Trawl's /api/ai/preview/sentiment or your own FinBERT/GPT-4
  4. Embed into a vector store for semantic retrieval
  5. Aggregate scores by source tier, recency, and confidence

For deeper analysis, bring your own model — Trawl handles the data pipeline so you can focus on alpha.
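Step 5 of the pipeline deserves a concrete shape. Here is a minimal sketch of confidence-scaled aggregation with exponential recency decay; the half-life, field names, and item shape are our assumptions, not Trawl output.

```python
import math

def aggregate(items, half_life_days=3.0):
    """Sentiment aggregate: exponential recency decay times model confidence."""
    decay = math.log(2) / half_life_days  # per-day decay rate for the chosen half-life
    total = weight_sum = 0.0
    for item in items:
        w = item["confidence"] * math.exp(-decay * item["age_days"])
        total += w * item["sentiment"]
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0

items = [
    {"sentiment": 0.8, "confidence": 0.9, "age_days": 0},   # fresh, high confidence
    {"sentiment": -0.5, "confidence": 0.6, "age_days": 6},  # two half-lives old: weight / 4
]
print(aggregate(items))  # pulled toward the fresh bullish item
```

Layering the tier weights from earlier on top of this (multiply the three factors) gives you the full "tier, recency, confidence" weighting the pipeline describes.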

Intelligence Report: All Sources in One Call

For the ultimate shortcut, Trawl's intelligence report endpoint aggregates every available source for a ticker into a single structured response — Reddit sentiment, insider trades, earnings, SEC filings, news, congressional trading, and Wikipedia pageviews:

Get full intelligence report (curl):

curl "https://api.gettrawl.com/api/intelligence/AAPL"

Or with plain Python:

# One call — complete cross-source intelligence
import requests

report = requests.get("https://api.gettrawl.com/api/intelligence/AAPL").json()

print(f"Summary: {report['summary']}")
for section_name, section_data in report["sections"].items():
    print(f"\n--- {section_name} ---")
    if isinstance(section_data, list):
        print(f"  {len(section_data)} items")
    else:
        print("  available")

for signal in report.get("signals", []):
    print(f"Signal: {signal['type']}: {signal['description']} (confidence: {signal['confidence']:.0%})")

This is the single most powerful endpoint for quant research — it replaces a dozen individual API calls with one request. Use it for daily briefings, screening new tickers, or as the first step in a deeper analysis pipeline.

Try the Interactive Demo

We built a Marimo notebook (a reactive Python notebook, like Jupyter but interactive) that pulls live data from all 5 sources:

git clone https://github.com/trawlhq/trawl-examples.git
cd trawl-examples
pip install marimo requests
marimo run demos/quant-trading.py

The demo includes:

  • Live API calls to 5 Trawl sources with human-readable error messages
  • Unified search — one call that queries all sources in a single request
  • Financial sentiment scoring — Trawl's server-side bullish/bearish/neutral NLP on live content
  • Form 4 insider transactions — parsed buy/sell data from SEC EDGAR XML
  • Reddit sentiment — AI-aggregated sentiment across finance subreddits
  • Intelligence report — full cross-source analysis in one call
  • Date-range backtesting — filter earnings and Reddit by custom date windows
  • AI summarization — Trawl's built-in NLP on real financial content (no setup)
  • News volume timeline showing publication frequency by hour
  • Reddit community breakdown by subreddit engagement

No API key needed. No sign-up required.

From Demo to Production

  1. Start with the intelligence report (/api/intelligence/{ticker}) for a complete cross-source overview in one call
  2. Score sentiment with Trawl's /api/ai/preview/sentiment (server-side NLP) or bring your own FinBERT/GPT-4
  3. Track insider trades via /api/filings/form4/{ticker} — parsed XML with buy/sell direction, shares, and price
  4. Monitor Reddit momentum with /api/reddit/mentions/{ticker}, /trending, and /sentiment/{ticker}
  5. Backtest with date ranges using from_date/to_date on earnings, Reddit, and filings
  6. Use the chunks endpoint for RAG-ready, speaker-segmented earnings data
  7. Set up watches to monitor tickers automatically (Pro tier, $29/mo)
  8. Scale to 100+ tickers with concurrent requests or the unified search endpoint
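The fan-out in step 8 can be sketched with a thread pool. The fetch function below is stubbed so the example runs offline; in production it would be a requests.get against the intelligence endpoint shown earlier.

```python
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "https://api.gettrawl.com/api/intelligence"

def fetch_report(ticker):
    """Stand-in for requests.get(f"{BASE_URL}/{ticker}", timeout=10).json()."""
    # Stubbed so this sketch runs without network access.
    return {"ticker": ticker, "summary": f"stub report for {ticker}"}

def fetch_all(tickers, max_workers=8):
    """Fan out one intelligence-report request per ticker, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(tickers, pool.map(fetch_report, tickers)))

reports = fetch_all(["AAPL", "NVDA", "TSLA"])
print(len(reports))  # 3
```

Keep max_workers modest to stay inside your tier's rate limits; the SDK's built-in retry handling covers transient failures if you swap the stub for real calls.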

Plug Into Your Stack

Trawl's API is just HTTP — it works with any framework. Here's how to wire it into the most common AI and automation stacks for quant workflows.

CrewAI — Multi-Agent Research

Spin up a team of agents that divide and conquer across data sources. One analyst pulls earnings, another tracks insider trades, a third monitors news sentiment — then they synthesize a brief.

from crewai import Agent, Task, Crew
from trawl_crewai import TrawlEarningsTool, TrawlNewsTool, TrawlIntelligenceTool

analyst = Agent(
    role="Equity Research Analyst",
    goal="Analyze a company using earnings calls, news, and insider trading data",
    tools=[TrawlEarningsTool(), TrawlNewsTool(), TrawlIntelligenceTool()],
)

task = Task(
    description="Build a research brief on NVDA: recent earnings call highlights, insider trading patterns, and news sentiment",
    agent=analyst,
)

crew = Crew(agents=[analyst], tasks=[task])
result = crew.kickoff()

LangChain — RAG Over Earnings

Load earnings call transcripts as LangChain Documents, then index them for semantic question-answering. Useful for screening dozens of tickers by asking natural-language questions about guidance, margins, or capex.

from langchain_trawl import TrawlLoader

# Load earnings call transcripts as LangChain Documents
loader = TrawlLoader(urls=[
    "https://api.gettrawl.com/api/earnings/AAPL/2026/1",
])
docs = loader.load()

# Index for question-answering
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

Slack Bot

Get intelligence delivered to your trading channel — no context-switching required:

/trawl-intelligence NVDA
/trawl-earnings AAPL
/trawl-sentiment TSLA

Set up the Slack bot →

MCP (No Code)

If you use Claude Desktop with the Trawl MCP server, you can skip code entirely. Ask Claude directly:

"Pull up NVDA's latest earnings call, check for any Form 4 insider transactions this month, and compare with congressional trading activity."

Claude will call the right Trawl tools behind the scenes — earnings, Form 4 filings, and congressional trades — and return a synthesized answer. No API keys to manage, no scripts to maintain.

Set up MCP →

Get Started

pip install trawl-sdk

Want to monitor 100 tickers across 5 sources? The free tier covers 1,000 requests/month. Sign up in 30 seconds — no credit card required. Pro includes a 7-day free trial.