API Documentation
Complete reference for 15+ content sources including government and regulatory data.
Quickstart
Extract a transcript
No signup needed. Paste any YouTube URL and get the full transcript with timestamps.
import requests
response = requests.post(
"https://api.gettrawl.com/api/transcripts/preview",
json={"url": "https://youtube.com/watch?v=dQw4w9WgXcQ"}
)
data = response.json()
for seg in data["segments"]:
    print(f"[{seg['start']:.1f}s] {seg['text']}")
Get an API key
Create an account and generate an API key to store transcripts and access advanced features.
from trawl import TrawlClient
# Use your API key
client = TrawlClient(api_key="trawl_your_key")
Extract and download
Store the transcript permanently and download in any format: TXT, JSON, SRT, VTT, or JSONL.
transcript = client.transcripts.extract("https://youtube.com/watch?v=dQw4w9WgXcQ")
print(f"Stored as ID {transcript.id} with {len(transcript.segments)} segments")
# Download
content = client.transcripts.download(transcript.id, format="srt")
SDKs & Libraries
Python
pip install trawl-sdk
httpx · sync + async
JavaScript
npm i trawl-sdk
TypeScript · native fetch
LangChain
TrawlLoader
BaseLoader · lazy_load()
Zapier
7,000+ app connectors
Triggers · Searches · Private
Obsidian
Community Plugin
Intelligence reports → vault
n8n
n8n-nodes-trawl
Community node · 4 resources
Python SDK
from trawl import TrawlClient
client = TrawlClient(api_key="trawl_xxx")
# Search
results = client.search.youtube("machine learning", max_results=5)
# Extract
transcript = client.transcripts.extract("https://youtube.com/watch?v=...")
# Podcasts
podcasts = client.search.podcasts("Lex Fridman")
episodes = client.podcasts.episodes(podcasts.results[0].id)
# Bulk download as JSONL
zip_bytes = client.bulk.download(["vid1", "vid2"], format="jsonl")
LangChain Document Loader
Load YouTube and podcast transcripts directly into your RAG pipeline.
from langchain_trawl import TrawlLoader
loader = TrawlLoader(
    urls=["https://youtube.com/watch?v=abc", "https://youtube.com/watch?v=def"],
    api_key="trawl_xxx",
)
docs = loader.load()
# docs[0].page_content = "Full transcript..."
# docs[0].metadata = {"video_id": "abc", "title": "...", "source": "..."}
Open-Source Examples
14 ready-to-run pipelines: sentiment analysis, insider trading tracker, RAG, LangChain agent, and more.
github.com/trawlhq/trawl-examples →
Authentication
API Key
Backends & scripts
X-API-Key: trawl_xxx
JWT Bearer
Frontends & SPAs
30min access + 7day refresh
OAuth
One-click signup
Google & GitHub redirect flow
API Key authentication
Generate a key from the dashboard, then pass it in every request.
client = TrawlClient(api_key="trawl_your_key")
transcript = client.transcripts.extract("https://youtube.com/watch?v=...")
YouTube Transcripts
Search YouTube
Search via innertube (no API quota). Filter by date.
results = client.search.youtube("AI tutorial", max_results=10)
for r in results.results:
    print(f"{r.title} — {r.channel}")
Extract and download
Store the transcript and download in TXT, JSON, SRT, or VTT.
t = client.transcripts.extract("https://youtube.com/watch?v=abc123")
srt = client.transcripts.download(t.id, format="srt")
Podcast Transcription
Search and browse episodes
Search 4M+ podcasts. Browse episodes. Free to browse — sign up to transcribe.
podcasts = client.search.podcasts("Lex Fridman")
episodes = client.podcasts.episodes(podcasts.results[0].id)
for ep in episodes:
    status = "✅ Free" if ep.has_transcript else "🎙️ AI Transcription"
    print(f"{status} {ep.title}")
Submit transcription job
Episodes with RSS transcripts are instant and free. Others use AI transcription (~$0.006/min).
job = client.podcasts.transcribe("https://example.com/ep.mp3")
print(f"Job {job.id}: {job.status}")
TikTok
Public
Extract TikTok captions
Auto-generated caption extraction from any TikTok video URL. No auth needed.
data = requests.post("https://api.gettrawl.com/api/tiktok/preview",
    json={"url": "https://tiktok.com/@user/video/7234567890123456789"}).json()
Search TikTok videos
Search TikTok by keyword, hashtag, or username. Powered by EnsembleData. Builder+ tier.
# Keyword search
results = client.tiktok_search.search("machine learning")
# User's videos
videos = client.tiktok_search.user_videos("username")
# Video comments
comments = client.tiktok_search.video_comments("7234567890123456789")
Extract Instagram Reel captions
Extract embedded captions from Instagram Reels. Falls back to AI transcription. No auth needed.
data = requests.post("https://api.gettrawl.com/api/instagram/preview",
    json={"url": "https://instagram.com/reel/ABC123/"}).json()
Search Instagram
Search Instagram by keyword, username, or user posts/reels. Powered by EnsembleData. Builder+ tier.
# Search
results = client.instagram_search.search("data science")
# User info
user = client.instagram_search.user_info("openai")
# User's reels
reels = client.instagram_search.user_reels(12345)
# Post comments
comments = client.instagram_search.post_comments("ABC123")
X Spaces
Public
Transcribe a Space
Downloads the audio recording and transcribes with AI speech-to-text. May take 1-2 minutes.
data = requests.post("https://api.gettrawl.com/api/twitter/preview",
    json={"url": "https://x.com/i/spaces/1eaKbrPAqbwKX"}).json()
for seg in data["segments"]:
    print(f"[{int(seg['start']//60):02d}:{int(seg['start']%60):02d}] {seg['text']}")
Earnings Calls
Public
Search earnings by ticker
Find earnings call transcripts by company ticker symbol. Requires Finnhub API key for full functionality.
FINNHUB_API_KEY env var for full transcripts. Search works without it.
results = client.earnings.search("AAPL", from_date="2024-01-01", to_date="2025-01-01")
for call in results["results"]:
    print(f"{call['company_name']} Q{call['quarter']} {call['year']}")
Get full earnings transcript
Retrieve the complete earnings call transcript with speaker-segmented sections (prepared remarks, Q&A) and participant roles.
transcript = client.earnings.get("AAPL", year=2024, quarter=4)
for section in transcript["sections"]:
    print(f"[{section['speaker']}] ({section['role']})")
    print(f" {section['text'][:100]}...")
Get JSONL chunks for RAG
Token-bounded JSONL chunks optimized for vector databases. Never splits across speaker segments. Each chunk includes speaker, role, and section metadata.
chunks = client.earnings.chunks("AAPL", year=2024, quarter=4)
for chunk in chunks:
    print(f"[chunk {chunk['chunk_index']}] {chunk['speaker']}: {chunk['text'][:80]}...")
SEC Filings
Public
Search SEC EDGAR filings
Free access to 18M+ SEC filings. No auth required. Supports 10-K, 10-Q, 8-K, and all form types.
results = client.filings.search("AAPL", form_type="10-K")
for filing in results["results"]:
    print(f"{filing['company_name']} {filing['form_type']} filed {filing['filed_date']}")
Parse Form 4 insider transactions
Get structured buy/sell data from SEC Form 4 filings. Returns insider identity, transaction direction, shares, price, and ownership type.
data = client.filings.form4("AAPL", date_after="2024-01-01")
for filing in data["filings"]:
    for owner in filing["owners"]:
        print(f"{owner['name']} ({owner['officer_title']})")
    for txn in filing["transactions"]:
        print(f" {txn['transaction_type'].upper()}: {txn['shares']:,.0f} shares @ ${txn['price_per_share']}")
News
Public
Search global news (GDELT)
Search across 65+ languages. Free, no auth. Updates every 15 minutes. Filter by language and country.
GET /api/news/article?url=...
results = client.news.search("artificial intelligence")
for article in results["articles"]:
    print(f"[{article['source_domain']}] {article['title']}")
Get market news
Real-time financial market news from Finnhub. Filter by category: general, forex, crypto, or merger.
results = client.news.market_news(category="general")
for article in results["articles"]:
    print(f"[{article['source']}] {article['title']}")
Get company news
Company-specific news by ticker symbol. Optionally filter by date range.
results = client.news.company_news("AAPL")
for article in results["articles"]:
    print(f"[{article['source']}] {article['title']} ({article['published_date'][:10]})")
Congressional Trading
Public
Search stock trades by politicians
Search House and Senate member stock trades by ticker, politician name, chamber, and date range. All tiers — no auth required.
results = client.congress_trading.search(ticker="AAPL", from_date="2024-01-01")
for trade in results["trades"]:
    print(f"{trade['politician']} ({trade['chamber']}) — {trade['type']} ${trade['amount']}")
List active politicians
Get politicians ranked by trading activity. Filter by chamber (house/senate) and minimum trade count.
politicians = client.congress_trading.politicians(chamber="senate", min_trades=5)
for p in politicians:
    print(f"{p['name']} — {p['trade_count']} trades")
USPTO Patents
Public
Search 12M+ patents
Search by keyword, assignee, inventor, or CPC code. Filter by date range. No auth required.
results = client.patents.search("machine learning", assignee="Google")
for patent in results["patents"]:
    print(f"{patent['patent_number']}: {patent['title']}")
Get patent detail
Full patent metadata including claims, abstract, description, and citations.
patent = client.patents.get("US11526754B2")
print(f"Title: {patent['title']}")
print(f"Claims: {len(patent['claims'])}")
Lobbying Disclosures
Public
Search Senate LDA lobbying filings
Search by client name, registrant, or issue area. Returns filing details, amounts, and lobbyists. No auth required.
results = client.lobbying.search(client_name="Meta", issue_area="technology")
for filing in results["filings"]:
    print(f"{filing['registrant']} → {filing['client_name']}: ${filing['amount']}")
Wikipedia Pageviews
Public
Get article pageview data
Daily or monthly pageview counts for any Wikipedia article. Great for tracking public attention on companies, people, and events.
data = client.pageviews.get("Bitcoin", from_date="2024-01-01", to_date="2024-12-31", granularity="monthly")
for point in data["data"]:
    print(f"{point['date']}: {point['views']:,} views")
Compare multiple articles
Compare pageviews across up to 10 articles. Useful for tracking relative attention on competing companies or topics.
data = client.pageviews.compare(["Bitcoin", "Ethereum", "Solana"], from_date="2024-01-01")
for article, series in data["articles"].items():
    total = sum(p["views"] for p in series)
    print(f"{article}: {total:,} total views")
OpenFIGI
Public
Financial instrument identifier resolution
Resolve tickers, ISINs, CUSIPs, or SEDOLs to FIGI identifiers. Batch resolve multiple instruments in one call.
result = client.figi.lookup("AAPL", id_type="TICKER")
for instrument in result["data"]:
    print(f"{instrument['name']} — FIGI: {instrument['figi']}, Exchange: {instrument['exchCode']}")
Alt Data Pulse
Public
Cross-reference Google Trends with financial signals
The Alt Data Pulse fetches Google Trends, searches news for each trending topic, extracts ticker mentions, then checks for insider trading and congressional activity. Returns ranked alt data signals where public trends intersect market activity. No auth required.
pulse = client.pulse.get(geo="US", max_trends=5)
for item in pulse["pulse"]:
    print(f"[{item['alt_data_score']:.2f}] {item['trend']}")
    print(f" Tickers: {', '.join(item['related_tickers'][:5])}")
    for signal in item["signals"]:
        print(f" Signal: {signal['type']} ({signal['severity']})")
Deep-dive: topic to ticker correlations
Find which tickers are most exposed to any trending topic, with financial signal detection on each. Useful for understanding how a specific event (tariffs, AI regulation, oil prices) maps to market activity.
topic = client.pulse.topic("artificial intelligence chips")
for t in topic["tickers"]:
    print(f"{t['ticker']}: score {t['alt_data_score']:.2f}, {t['mention_count']} mentions")
    for s in t["signals"]:
        print(f" {s['type']}: {s['description']}")
Ticker Intelligence
Public
Cross-source intelligence report
Aggregated intelligence for any ticker — earnings, insider trades, congressional trading, SEC filings, and news in one call. Includes automated signal detection (e.g., heavy insider selling, unusual congressional activity). No auth required. Free tier: 1,000 req/mo.
report = client.intelligence.get("AAPL", from_date="2024-01-01", sections="earnings,form4,congress")
print(f"Signals detected: {len(report['signals'])}")
for signal in report["signals"]:
    print(f" [{signal['severity']}] {signal['description']}")
Unified Timeline
Public
Chronological cross-source event feed
Every earnings call, insider trade, congressional trade, SEC filing, and news article for a ticker in one sorted feed. Filter by source, date range, and sort order. No auth required. Free tier: 1,000 req/mo.
timeline = client.timeline.get("AAPL", from_date="2024-06-01", sources="form4,congress", limit=20)
for event in timeline["events"]:
    print(f"[{event['timestamp'][:10]}] {event['event_type']}: {event['title']}")
Watchlist Scan
Public
Batch intelligence across multiple tickers
Scan up to 20 tickers in one call. Returns ranked cross-source signals with severity filtering and per-ticker summaries. No auth required. Free tier: 1,000 req/mo.
scan = client.scan.run(
    tickers=["AAPL", "MSFT", "NVDA"],
    from_date="2024-06-01",
    severity_filter="medium"
)
print(f"Signals found: {scan['signals_total']}")
for signal in scan["ranked_signals"]:
    print(f" [{signal['severity']}] {signal['ticker']}: {signal['description']}")
Anomaly Detection
Public
Statistical anomaly detection vs baseline
Compares current period activity against a baseline period using ratio-based thresholds. Detects unusual spikes in insider trading, congressional activity, SEC filings, and more. No auth required. Free tier: 1,000 req/mo.
anomalies = client.anomalies.get("AAPL", current_days=30, baseline_days=30)
print(f"Anomalies found: {anomalies['anomaly_count']}")
for a in anomalies["anomalies"]:
    if a["is_anomaly"]:
        print(f" [{a['severity']}] {a['interpretation']}")
Academic Papers
Public
Search 200M+ papers
Search across ArXiv and Semantic Scholar. Get metadata, abstracts, citation counts, and PDF links.
results = client.papers.search("transformer architecture")
for paper in results["results"]:
    print(f"[{paper['source']}] {paper['title']} ({paper['citation_count']} citations)")
Extract full paper text
Download PDF, extract text with section detection (Abstract, Methods, Results, etc.). Section-aware chunking for RAG.
extraction = client.papers.extract(arxiv_id="1706.03762")
print(f"Sections: {len(extraction['sections'])}")
for s in extraction["sections"]:
    print(f" [{s['name']}] {len(s['text'])} chars")
Reddit
Search Reddit posts
Search posts across all subreddits or within a specific one. Returns titles, scores, comment counts, and URLs.
results = client.reddit.search("machine learning", limit=5, from_date="2024-01-01", to_date="2025-01-01")
for post in results["posts"]:
    print(f"r/{post['subreddit']} | {post['title']} ({post['score']} pts)")
Get full thread (RAG-optimized)
Post + all comments as unified TranscriptSegment array. Perfect for RAG pipelines — each segment has author, depth, and score metadata.
thread = client.reddit.get_thread("1abc123", sort="best")
print(f"Total segments: {thread['total_segments']}")
for seg in thread["segments"][:5]:
    print(f" [{seg['metadata']['author']}] {seg['text'][:80]}")
Reddit Analytics: Ticker Mentions
Track how often a ticker is mentioned across finance subreddits like r/wallstreetbets, r/stocks, and r/investing. Returns mention counts, velocity, and top posts.
mentions = client.reddit.mentions("NVDA", period="24h")
print(f"Mentions: {mentions['total_mentions']} in {mentions['period']}")
for sub in mentions["subreddits"]:
    print(f" r/{sub['name']}: {sub['count']} mentions")
Reddit Analytics: Trending Tickers
Discover which tickers are trending across Reddit finance communities. Noise filtering removes common words and low-signal mentions.
trending = client.reddit.trending(period="24h", limit=10)
for t in trending["tickers"]:
    print(f"{t['ticker']}: {t['mentions']} mentions, {t['sentiment']}")
Reddit Analytics: Ticker Sentiment
AI-aggregated sentiment analysis for a ticker across Reddit posts. Analyzes post titles, body text, and top comments to determine bullish/bearish/neutral sentiment.
sentiment = client.reddit.sentiment("TSLA", period="24h", max_posts=5)
print(f"Overall: {sentiment['overall_sentiment']} ({sentiment['confidence']})")
for post in sentiment["posts_analyzed"]:
    print(f" [{post['sentiment']}] {post['title'][:60]}")
Browse subreddit posts
Get recent or top posts from any subreddit. Top posts support time filters: hour, day, week, month, year, all.
# Recent posts
posts = client.reddit.subreddit("machinelearning")
for post in posts["posts"]:
    print(f"{post['title']} ({post['score']} pts)")
# Top posts (last week)
top = client.reddit.subreddit_top("machinelearning", time_filter="week", limit=10)
for post in top["posts"]:
    print(f"{post['title']} ({post['score']} pts)")
GitHub
Public
Search repositories
Search GitHub repositories by keyword. Sort by stars, forks, or updated date. Returns repo metadata including description, language, and star count.
results = client.github.search_repos("machine learning", sort="stars")
for repo in results["items"]:
    print(f"{repo['full_name']} ⭐ {repo['stargazers_count']}")
Get repository details
Retrieve detailed stats for a specific repository including top contributors, languages, and recent activity.
repo = client.github.get_repo("facebook", "react")
print(f"{repo['full_name']}: {repo['stargazers_count']} stars, {repo['forks_count']} forks")
for c in repo["top_contributors"][:3]:
    print(f" {c['login']}: {c['contributions']} commits")
Google Trends
Public
Get trending searches
Retrieve daily trending searches from Google Trends for a specific country. Returns trending topics with search volume and related articles.
trending = client.trends.get_trending(geo="US")
for topic in trending["searches"]:
    print(f"{topic['title']} - {topic['traffic']}")
Cryptocurrency
Public
Get cryptocurrency prices
Fetch current prices for one or more cryptocurrencies. Powered by CoinGecko. Supports any vs_currency (usd, eur, btc, etc.).
prices = client.crypto.get_prices(ids="bitcoin,ethereum")
for coin, data in prices.items():
    print(f"{coin}: {data['usd']:,.2f} USD")
Search coins
Search for cryptocurrencies by name or symbol. Returns matching coins with their IDs, symbols, and market cap rank.
results = client.crypto.search("solana")
for coin in results["coins"]:
    print(f"{coin['name']} ({coin['symbol']}) - Rank #{coin['market_cap_rank']}")
Trending cryptocurrencies
Get the most searched cryptocurrencies in the last 24 hours. Useful for tracking market sentiment and emerging interest.
trending = client.crypto.get_trending()
for coin in trending["coins"]:
    print(f"{coin['name']} ({coin['symbol']}) - Score: {coin['score']}")
Weather
Public
Get weather forecast
7-day forecast for any US location using the National Weather Service API. Provide latitude and longitude coordinates.
forecast = client.weather.get_forecast(lat=37.77, lon=-122.42)
for period in forecast["periods"]:
    print(f"{period['name']}: {period['temperature']}°{period['temperatureUnit']} - {period['shortForecast']}")
Get weather alerts
Active severe weather alerts from the National Weather Service. Filter by state/area and severity level.
alerts = client.weather.get_alerts(area="CA")
for alert in alerts["alerts"]:
    print(f"[{alert['severity']}] {alert['event']}: {alert['headline']}")
Unified Search
Public
Search all sources at once
One query, all 15+ content sources including government and regulatory data. Graceful degradation — if one source fails, others still return.
import requests
results = requests.get("https://api.gettrawl.com/api/search/unified",
    params={"q": "machine learning", "sources": "all", "max_per_source": 3}).json()
for r in results["results"]:
    print(f"[{r['source_type']}] {r['title']}")
print(f"Queried: {', '.join(results['sources_queried'])}")
Bulk Download
Public
Download up to 30 transcripts as ZIP
No auth. Three formats: txt, json, jsonl. JSONL auto-chunks for vector DBs.
zip_bytes = client.bulk.download(
    ["dQw4w9WgXcQ", "9bZkp7q19f0"],
    format="jsonl"
)
with open("transcripts.zip", "wb") as f:
    f.write(zip_bytes)
AI Features
Summarize
GPT-4o-mini generates structured summaries with topics. No auth for preview.
# Preview (no auth)
summary = requests.post("https://api.gettrawl.com/api/ai/preview/summarize",
    json={"text": "Full transcript..."}).json()
# Stored transcript (auth)
summary = requests.post("https://api.gettrawl.com/api/ai/1/summarize",
    headers={"X-API-Key": "trawl_xxx"}).json()
Entity extraction
Extract people, teams, odds, injuries, events. Optimized for sports and finance.
curl -X POST https://api.gettrawl.com/api/ai/1/entities \
  -H "X-API-Key: trawl_your_key"
Financial sentiment analysis
Server-side NLP sentiment scoring tuned for financial text. Returns bullish/bearish/neutral with confidence score and key phrase extraction. No auth for preview.
# Preview (no auth)
result = client.ai.preview_sentiment("Revenue beat expectations by 15%...")
print(f"{result['sentiment_label']}: {result['sentiment_score']}")
# Stored transcript (auth)
result = client.ai.sentiment(transcript_id=42)
Webhooks
Pro+
Create a webhook
Register a URL to receive HMAC-signed event payloads. The response includes a secret for signature verification. Delivery is retried 3 times with exponential backoff, and a webhook is auto-disabled after 5 consecutive failures.
webhook = client.webhooks.create(
    url="https://your-app.com/webhooks/trawl",
    events=["job.completed", "job.failed"]
)
print(f"Webhook ID: {webhook['id']}")
print(f"Secret: {webhook['secret']} # Store this securely!")
Test your webhook
Send a test payload to verify your endpoint is receiving and processing events correctly.
client.webhooks.test("wh_abc123")
Verify HMAC signatures
Every webhook request includes an X-Trawl-Signature header. Verify it using your webhook secret to ensure payloads are authentic.
import hmac, hashlib
def verify_webhook(payload: bytes, signature: str, secret: str) -> bool:
    expected = "sha256=" + hmac.new(
        secret.encode(), payload, hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, signature)
# In your Flask/FastAPI handler:
# signature = request.headers["X-Trawl-Signature"]
# is_valid = verify_webhook(request.body, signature, WEBHOOK_SECRET)
Watches
Pro+
Create a watch
Monitor a source for new content. Watches poll automatically via Celery Beat every 60 seconds. Content is deduplicated — you only get notified once per item. Pro tier: 10 watches, Scale: unlimited.
watch = client.watches.create(source_type="earnings_ticker", target="AAPL")
print(f"Watch ID: {watch['id']}")
print(f"Monitoring: {watch['source_type']} / {watch['target']}")
List & manage watches
List all watches, pause/resume monitoring, trigger an immediate poll, or update/delete watches.
# List watches
watches = client.watches.list()
for w in watches:
    print(f"[{w['status']}] {w['source_type']}: {w['target']}")
# Pause/resume
client.watches.pause(watch["id"])
client.watches.resume(watch["id"])
# Trigger immediate poll
client.watches.trigger(watch["id"])
# Update filters
client.watches.update(watch["id"], filters={"keywords": ["revenue"]})
# Delete
client.watches.delete(watch["id"])
Attach webhooks to watches
Connect a webhook to a watch so you receive real-time notifications when new content is detected. A watch can have multiple webhooks.
# Attach webhook to watch
client.watches.attach_webhook(watch["id"], webhook_id="wh_abc123")
# Detach webhook
client.watches.detach_webhook(watch["id"], webhook_id="wh_abc123")
MCP Server
Connect to Claude Desktop
70 tools for YouTube, podcasts, TikTok, Instagram, X Spaces, Reddit, earnings, SEC, news, papers, AI, congressional trading, patents, lobbying, intelligence, and more.
get_transcript · search_videos · bulk_extract · search_podcasts · get_podcast_episodes · get_tiktok_transcript · get_instagram_transcript · get_twitter_spaces_transcript · search_earnings · get_earnings_transcript · search_filings · get_form4_filings · search_news · get_article_text · search_papers · extract_paper · search_reddit · get_reddit_thread · search_congress_trading · search_patents · search_lobbying · get_pageviews · lookup_figi · get_ticker_intelligence · get_ticker_timeline · scan_watchlist · get_anomalies · summarize · extract_entities · preview_sentiment · list_history · get_transcript_by_id
# Add to claude_desktop_config.json:
{
  "mcpServers": {
    "trawl": {
      "command": "python",
      "args": ["/path/to/mcp-server/server.py"],
      "env": {
        "BACKEND_URL": "https://api.gettrawl.com",
        "TRAWL_API_KEY": "trawl_your_key"
      }
    }
  }
}
n8n Integration
Install community node
Settings → Community Nodes → n8n-nodes-trawl. 7 operations: YouTube, Podcasts, TikTok, AI.
All Endpoints
Auth
/api/auth/register (Public)
/api/auth/login (Public)
/api/auth/refresh (Public)
/api/auth/me (Auth)
/api/auth/api-keys (Auth)
Transcripts
/api/transcripts (Auth)
/api/transcripts/preview (Public)
/api/transcripts/{id}/download (Auth)
Search & Bulk
/api/search (Public)
/api/bulk-download (Public)
Podcasts
/api/podcasts/search (Public)
/api/podcasts/{id}/episodes (Public)
/api/podcasts/transcribe (Auth)
Platforms
/api/tiktok/preview (Public)
/api/tiktok/search (Public)
/api/tiktok/hashtag (Public)
/api/tiktok/user/{username}/videos (Public)
/api/instagram/preview (Public)
/api/instagram/search (Public)
/api/instagram/user/{user_id}/reels (Public)
/api/instagram/post/{shortcode}/comments (Public)
/api/twitter/preview (Public)
AI
/api/ai/preview/summarize (Public)
/api/ai/preview/sentiment (Public)
/api/ai/preview/entities (Public)
/api/ai/{id}/summarize (Auth)
/api/ai/{id}/sentiment (Auth)
/api/ai/{id}/topics (Auth)
/api/ai/{id}/entities (Auth)
Webhooks
/api/webhooks (Pro+)
/api/webhooks (Pro+)
/api/webhooks/{id} (Pro+)
/api/webhooks/{id} (Pro+)
/api/webhooks/{id}/test (Pro+)
Watches
/api/watches (Pro+)
/api/watches (Pro+)
/api/watches/{id} (Pro+)
/api/watches/{id} (Pro+)
/api/watches/{id} (Pro+)
/api/watches/{id}/pause (Pro+)
/api/watches/{id}/resume (Pro+)
/api/watches/{id}/trigger (Pro+)
/api/watches/{id}/webhooks (Pro+)
/api/watches/{id}/webhooks/{wh_id} (Pro+)
Papers
/api/papers/search (Public)
/api/papers/extract (Public)
/api/reddit/search (Public)
/api/reddit/thread/{id} (Public)
/api/reddit/subreddit/{name} (Public)
/api/reddit/subreddit/{name}/top (Public)
/api/reddit/post/{id} (Public)
/api/reddit/post/{id}/comments (Public)
/api/reddit/mentions/{ticker} (Public)
/api/reddit/trending (Public)
/api/reddit/sentiment/{ticker} (Public)
Finance
/api/earnings/search (Public)
/api/earnings/{ticker}/{year}/{q} (Public)
/api/earnings/{ticker}/{year}/{q}/chunks (Public)
/api/filings/search (Public)
/api/filings/form4/{ticker} (Public)
News
/api/news/search (Public)
/api/news/article (Public)
/api/news/market (Public)
/api/news/company (Public)
Unified
/api/search/unified (Public)
Alternative Data
/api/congress-trading/search (Public)
/api/congress-trading/politicians (Public)
/api/patents/search (Public)
/api/patents/{number} (Public)
/api/lobbying/search (Public)
/api/pageviews/{article} (Public)
/api/pageviews/compare (Public)
/api/figi/lookup (Public)
/api/figi/resolve (Public)
Intelligence
/api/intelligence/{ticker} (Public)
/api/timeline/{ticker} (Public)
/api/scan (Public)
/api/anomalies/{ticker} (Public)
Subscriptions
/api/subscriptions (Auth)
/api/subscriptions (Auth)
/api/subscriptions/{id} (Auth)
Output Formats
| Format | Best For | Via |
|---|---|---|
| txt | LLM context, RAG | download, bulk |
| json | Programmatic access | download, bulk |
| jsonl | Vector DBs (Pinecone, ChromaDB) | bulk only |
| srt | Subtitle players | download |
| vtt | HTML5 video | download |
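The jsonl format writes one JSON chunk per line, so it can be split back into dicts with no extra tooling. A minimal sketch (the sample content and field values here are illustrative; the RAG use case below lists the actual chunk fields):

```python
import json

def parse_jsonl_chunks(raw: str) -> list:
    """Split JSONL bulk output (one chunk per line) into a list of dicts."""
    return [json.loads(line) for line in raw.strip().splitlines() if line]

# Hypothetical two-chunk file for illustration:
sample = (
    '{"video_id": "abc", "text": "intro...", "token_count": 120}\n'
    '{"video_id": "abc", "text": "part two...", "token_count": 98}\n'
)
chunks = parse_jsonl_chunks(sample)
```

Each parsed dict can then be embedded and upserted directly, with the non-text fields used as vector-store metadata.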
Rate Limits
| Scope | Limit | Window |
|---|---|---|
| Authenticated | 30 req | per min per user |
| Preview | 10 req | per min per IP |
| Bulk | 30 videos | per request |
| AI | tier-based | Free: 5/mo · Pro: 1000/mo |
Rate limiting uses a Redis sliding window and fails open. On a 429 response, check the Retry-After header before retrying.
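A small client-side helper that honors Retry-After on 429 responses; a sketch, with the fallback backoff policy being our own choice rather than anything the API mandates:

```python
import time

import requests

def get_with_retry(url, params=None, headers=None, max_attempts=3):
    """GET with rate-limit handling: sleep per Retry-After, then retry."""
    for attempt in range(max_attempts):
        resp = requests.get(url, params=params, headers=headers)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Fall back to exponential backoff if the header is missing
        time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts")
```

Drop this in wherever you would otherwise call requests.get directly against the API.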
Monthly Request Limits by Tier
| Tier | Price | Requests/mo | Sources | Key Features |
|---|---|---|---|---|
| Free | $0 | 1,000 | YouTube, Podcasts, News, Earnings, SEC, Reddit, Papers, Alt Data, Intelligence | 3 formats, MCP |
| Builder | $14/mo | 5,000 | + TikTok, Instagram search | Bulk download, 500 podcast transcriptions |
| Pro | $29/mo | 25,000 | All sources | Full AI, webhooks, watches, async jobs |
| Scale | $99/mo | 100,000 | All sources | Diarization, 500 URL batch, SLA, dedicated support |
Intelligence endpoints (intelligence, timeline, scan, anomalies) are available on all tiers — no auth required. Annual billing saves ~25%.
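Because the intelligence endpoints are public, they can also be hit with plain HTTP rather than the SDK. A minimal sketch; the query parameter names mirror the SDK examples above and should be treated as assumptions:

```python
import requests

BASE_URL = "https://api.gettrawl.com"

def ticker_intelligence(ticker: str, **params) -> dict:
    """No-key call to the public intelligence report endpoint."""
    resp = requests.get(f"{BASE_URL}/api/intelligence/{ticker}", params=params)
    resp.raise_for_status()
    return resp.json()

# report = ticker_intelligence("AAPL", from_date="2024-01-01")
```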
Use Cases
RAG Pipeline → Pinecone
Search → Bulk JSONL → Embed → Upsert. End-to-end in 10 lines.
import requests, zipfile, io, json
# 1. Search
results = requests.get("https://api.gettrawl.com/api/search",
    params={"q": "machine learning", "max_results": 10}).json()["results"]
# 2. Bulk download as JSONL
resp = requests.post("https://api.gettrawl.com/api/bulk-download",
    json={"video_ids": [r["video_id"] for r in results], "format": "jsonl"})
# 3. Parse chunks
zf = zipfile.ZipFile(io.BytesIO(resp.content))
chunks = []
for name in zf.namelist():
    if name.endswith(".jsonl"):
        for line in zf.read(name).decode().strip().split("\n"):
            chunks.append(json.loads(line))
# 4. Each chunk has: video_id, title, channel, source_url, text, token_count
# → Generate embeddings → Upsert to Pinecone with metadata filters
Trading — Earnings Call Analysis
results = requests.get("https://api.gettrawl.com/api/search",
    params={"q": "NVIDIA earnings call Q4", "max_results": 5}).json()
for video in results["results"]:
    transcript = requests.post("https://api.gettrawl.com/api/transcripts/preview",
        json={"url": f"https://youtube.com/watch?v={video['video_id']}"}).json()
    text = " ".join(s["text"] for s in transcript["segments"])
    # → Feed to your sentiment model
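The joined text can also go straight to Trawl's own no-auth sentiment preview instead of an external model. A sketch, assuming the sentiment preview accepts the same {"text": ...} body as the summarize preview shown earlier:

```python
import requests

def preview_sentiment(text: str) -> dict:
    """Score raw transcript text via the no-auth sentiment preview."""
    resp = requests.post(
        "https://api.gettrawl.com/api/ai/preview/sentiment",
        json={"text": text},
    )
    resp.raise_for_status()
    return resp.json()

# result = preview_sentiment(text)
# print(result["sentiment_label"], result["sentiment_score"])
```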