Answer up front: Here are the 10 most useful free datasets for retail quants in 2025—covering macro, rates, filings, positions, factors, and equities—plus when to use each, how to access them programmatically, and pitfalls to avoid. All links point to primary sources and are current for 2024–2025.
Disclosure: This article contains no affiliate links.
Risk disclaimer: Markets involve risk. The tools below can sharpen analysis but do not guarantee profits or prevent losses. Backtest thoughtfully, size positions conservatively, and follow U.S. regulations.
Top 10 Free Datasets Every Retail Quant Should Bookmark in 2025
Quant research is a compounding game: better data → cleaner signals → tighter risk. In 2025, you can build a serious, auditable research stack with government and academic sources, no credit card required. Below I lay out 10 datasets you should bookmark and actually use. I’ll show where each dataset shines, the quickest path to an API call or bulk download, caveats that trip up new quants, and a mini-framework (“MAPS”) for vetting data: Methodology, Access, Periodicity, Scope. We’ll also walk through a small case study that turns a few of these feeds into a practical signal you can backtest.
1) FRED (Federal Reserve Economic Data) — macro & revisions
Why it matters: FRED is the U.S. macro backbone: growth, inflation, jobs, credit, financial conditions—plus ALFRED for vintage (as-published) values to avoid look-ahead bias.
• Start here: FRED API overview and docs, plus the general API page. (St. Louis Fed, 2025)
• Freshness note: FRED actively adds and retires series (e.g., 2024 Wilshire change). (St. Louis Fed News, 2024)
• Use cases: Macro regimes, equity risk premium inputs, inflation trend filters, yield curve slope.
• MAPS:
• Methodology: Mirrors source agencies; ALFRED preserves publication vintages.
• Access: REST API + CSV/JSON.
• Periodicity: Daily → annual, depending on series.
• Scope: 800k+ series from 100+ providers.
Pro tip: When backtesting macro signals (e.g., using CPI or payrolls), pull ALFRED vintages to match what traders knew then, not later revisions.
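A minimal sketch of the vintage idea, assuming the FRED observations endpoint with its `realtime_start` / `realtime_end` parameters and a free API key. The helper names and the sample values are hypothetical; the network call itself is left out so the snippet runs offline.

```python
from datetime import date
from urllib.parse import urlencode

FRED_OBS_URL = "https://api.stlouisfed.org/fred/series/observations"


def alfred_observations_url(series_id: str, api_key: str, as_of: date) -> str:
    """Build a request for a series as it was published on `as_of`."""
    params = {
        "series_id": series_id,
        "api_key": api_key,  # free key from your FRED account
        "file_type": "json",
        "realtime_start": as_of.isoformat(),
        "realtime_end": as_of.isoformat(),
    }
    return f"{FRED_OBS_URL}?{urlencode(params)}"


def value_known_on(vintages, obs_date: str, as_of: str):
    """Pick the latest value for `obs_date` published on or before `as_of`.

    `vintages` holds (obs_date, published_on, value) tuples with ISO dates,
    so plain string comparison orders them correctly.
    """
    known = [(pub, val) for d, pub, val in vintages
             if d == obs_date and pub <= as_of]
    return max(known, key=lambda row: row[0])[1] if known else None


# Made-up vintages for one observation month: first print, then a revision.
sample = [
    ("2024-01-01", "2024-02-13", 308.4),
    ("2024-01-01", "2024-03-12", 308.9),
]
print(value_known_on(sample, "2024-01-01", "2024-02-20"))  # 308.4
```

A backtest dated Feb 20 sees only the first print; the revision becomes visible once the as-of date passes its publication date.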
2) BLS Public Data API — CPI, jobs, wages
Why it matters: Inflation and labor data drive rates → valuations → factor spreads. BLS is primary and timely.
• Start here: BLS Public Data API and developer docs (BLS, 2024–2025); the API explainer page was updated Nov 27, 2024. (BLS, 2024)
• Use cases: CPI core vs headline filters, real wage trends, employment momentum, seasonality checks.
• MAPS:
• Methodology: Official surveys (CPI, CES, etc.).
• Access: REST (JSON/XLSX); series IDs required.
• Periodicity: Monthly (CPI, payrolls), more for subcomponents.
• Scope: All BLS programs.
Pro tip: Use CPI component series (e.g., shelter) as separate regressors; they often lead or lag broad CPI.
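As a rough sketch of the series-ID workflow, assuming the v2 endpoint accepts a JSON POST body and returns rows under `Results.series[].data`. The series IDs shown (all items and shelter, seasonally adjusted) should be confirmed in the BLS series finder, the helper names are hypothetical, and the sample response is made up.

```python
import json

BLS_V2_URL = "https://api.bls.gov/publicAPI/v2/timeseries/data/"


def bls_payload(series_ids, start_year, end_year, registration_key=None) -> str:
    """JSON body to POST to BLS_V2_URL (Content-Type: application/json)."""
    payload = {
        "seriesid": list(series_ids),
        "startyear": str(start_year),
        "endyear": str(end_year),
    }
    if registration_key:  # optional key from BLS registration
        payload["registrationkey"] = registration_key
    return json.dumps(payload)


def parse_monthly(response: dict) -> dict:
    """Return {series_id: [("YYYY-MM", value), ...]}, oldest first."""
    out = {}
    for series in response["Results"]["series"]:
        rows = []
        for row in series["data"]:
            period = row["period"]
            # Keep M01..M12; skip M13 (annual average) and missing values.
            if not period.startswith("M") or period == "M13":
                continue
            if row["value"] in ("", "-"):
                continue
            rows.append((f'{row["year"]}-{period[1:]}', float(row["value"])))
        out[series["seriesID"]] = sorted(rows)
    return out


# Made-up response fragment in the shape described above.
sample_response = {"Results": {"series": [{
    "seriesID": "CUSR0000SAH1",
    "data": [
        {"year": "2024", "period": "M02", "value": "100.5"},
        {"year": "2024", "period": "M01", "value": "100.0"},
    ],
}]}}
body = bls_payload(["CUSR0000SA0", "CUSR0000SAH1"], 2023, 2024)
shelter = parse_monthly(sample_response)["CUSR0000SAH1"]
```

Keeping each component as its own list makes it straightforward to feed shelter and headline CPI into a model as separate regressors.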
3) BEA API — GDP, PCE, corporate profits, NIPA tables
Why it matters: For growth-sensitive strategies, BEA is the source of truth on GDP, PCE, and profits.
• Start here: BEA API User Guide (PDF, Nov 12, 2024). (BEA, 2024)
• Freshness note: 2025 annual updates revise 2020–2025 sub-national GDP and related series. (BEA, 2025)
• Use cases: Nowcasting growth, detecting profit cycle shifts, decomposing PCE goods vs services.
Pro tip: Profit share of GDP (NIPA) can inform equity risk premia and value factor robustness.
4) U.S. Treasury “Fiscal Data” & Rate Feeds — yield curves, bills, real yields
Why it matters: Rates are the discount rate for everything. Treasury publishes daily par yield curves (nominal & real), bill rates, and more.
• Start here: Fiscal Data API docs (U.S. Treasury, 2025); the daily par yield curve and bill rate pages, including the 2025 table view. (Treasury, 2025)
• Bonus datasets: Daily Treasury Statement (cash flows), Debt to the Penny, MTS. (Fiscal Data, 2025; Debt to the Penny; Monthly Treasury Statement)
• MAPS:
• Methodology: Monotone convex spline curve; indicative bids (see methodology notes).
• Access: JSON/CSV via API; XML feeds available.
• Periodicity: Daily.
• Scope: Nominal & real curves, bills, long-term extrapolation factors.
Pro tip: Use real par yields to separate growth vs inflation in equity risk premia.
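A hedged sketch of a Fiscal Data request, assuming the API's `fields`, `filter`, `sort`, and `page[size]` query parameters. The example targets Debt to the Penny because it is one of the bonus datasets named above; the endpoint path and field names are my understanding of that dataset and should be checked against the API docs. The breakeven helper is a hypothetical illustration of splitting a nominal yield into real and inflation parts.

```python
from urllib.parse import urlencode

BASE = "https://api.fiscaldata.treasury.gov/services/api/fiscal_service"


def fiscal_data_url(endpoint: str, fields, start_date: str, page_size: int = 100) -> str:
    """Build a query for rows on or after `start_date`, newest first."""
    params = {
        "fields": ",".join(fields),
        "filter": f"record_date:gte:{start_date}",
        "sort": "-record_date",
        "page[size]": page_size,
        "format": "json",
    }
    # Leave the API's own punctuation unescaped so the URL stays readable.
    return f"{BASE}/{endpoint}?{urlencode(params, safe=':,[]')}"


def breakeven_inflation(nominal_yield: float, real_yield: float) -> float:
    """Nominal par yield minus real par yield, same maturity, in percent."""
    return nominal_yield - real_yield


url = fiscal_data_url(
    "v2/accounting/od/debt_to_penny",  # assumed endpoint path
    ["record_date", "tot_pub_debt_out_amt"],  # assumed field names
    "2025-01-01",
)
print(breakeven_inflation(4.5, 2.0))  # 2.5
```

With a 10Y nominal yield of 4.5% and a 10Y real yield of 2.0%, the implied inflation compensation is 2.5 points; tracking the two pieces separately shows whether a rate move came from the real or the inflation component.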
5) U.S. Census — Monthly Retail Trade (MARTS), e-commerce & schedules
Why it matters: Consumption is ~70% of U.S. GDP. Retail sales trends power sector rotation and macro timing.
• Start here: Monthly Retail Trade portal & latest sales (Aug 2025 release). (Census, 2025)
• Planning: 2024/2025 release schedule for advance and full reports. (Census, 2025)
• MAPS:
• Methodology: Survey-based; advance vs revised.
• Access: CSV/XLS; API endpoints for time series.
• Periodicity: Monthly (advance at 8:30 a.m. ET).
• Scope: Total, core, categories; state retail (experimental).
Pro tip: Use core retail sales (ex-auto, gas, building materials, food services) as a cleaner growth proxy in equity sector models.
6) SEC EDGAR APIs — filings & fundamentals at the source
Why it matters: 10-K/Q, 8-K, and XBRL fundamentals feed factor models (quality, accruals, leverage), event studies, and anomaly screens.
• Start here: SEC EDGAR API hub for submissions & XBRL (REST/JSON) (SEC, Jun 25, 2024) and the API page (SEC, Jun 6, 2024).
• 2025 operational context: EDGAR Next enrollment timeline changes filer processes (relevant to data timing). (SEC Press Release, Oct 3, 2024; SEC How-To, Feb 6, 2025)
• MAPS:
• Methodology: Official issuer filings and tagged XBRL.
• Access: REST/JSON; rate limits apply.
• Periodicity: Event-driven (filing timestamps).
• Scope: All U.S. registrants; deep fundamentals via XBRL tags.
Pro tip: Build an as-filed point-in-time fundamentals store; don’t rely on post-restatement aggregates.
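One way to sketch an as-filed lookup, assuming the `data.sec.gov` submissions and company-facts URL patterns with a zero-padded 10-digit CIK, and fact rows that carry `end`, `filed`, and `val` keys. The helper names, the User-Agent placeholder, and the sample facts are hypothetical.

```python
def edgar_urls(cik: int) -> dict:
    """URLs for a registrant's filing index and XBRL company facts."""
    cik10 = f"{cik:010d}"  # EDGAR JSON endpoints use a 10-digit CIK
    return {
        "submissions": f"https://data.sec.gov/submissions/CIK{cik10}.json",
        "companyfacts": f"https://data.sec.gov/api/xbrl/companyfacts/CIK{cik10}.json",
    }


# Send a descriptive User-Agent with every request; this value is a placeholder.
HEADERS = {"User-Agent": "Your Name your.email@example.com"}


def as_filed_value(facts, period_end: str, as_of: str):
    """Latest value for `period_end` that was filed on or before `as_of`.

    Dates are ISO strings, so string comparison matches date order.
    """
    known = [f for f in facts if f["end"] == period_end and f["filed"] <= as_of]
    return max(known, key=lambda f: f["filed"])["val"] if known else None


# Made-up facts: an original 10-K value and a later restatement.
facts = [
    {"end": "2023-12-31", "filed": "2024-02-15", "val": 1000, "form": "10-K"},
    {"end": "2023-12-31", "filed": "2024-08-01", "val": 950, "form": "10-K/A"},
]
print(as_filed_value(facts, "2023-12-31", "2024-03-01"))  # 1000
```

Storing every row with its `filed` date, rather than overwriting on restatement, is what lets a backtest replay the number the market actually saw.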
7) CFTC Commitments of Traders (COT) — positions by trader type
Why it matters: COT shows positions across futures (financials & commodities) by trader class—useful for contrarian and positioning-aware signals.
• Start here: Current COT dashboards & CSV (weekly; includes Sept 16, 2025). (CFTC, 2025)
• Bulk: Historical compressed files (yearly bundles incl. 2025). (CFTC, 2025)
• MAPS:
• Methodology: Reportable positions aggregated; different report types (Disaggregated, TFF).
• Access: CSV; fixed release schedule.
• Periodicity: Weekly (Fri, covering prior Tuesday).
• Scope: Major futures and options.
Pro tip: Normalize by open interest; test spreads (e.g., Commercials vs Managed Money) as z-scores.
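The normalization in the pro tip can be sketched as follows. The function names, the 52-week window, and the use of population standard deviation are illustrative choices, not a standard.

```python
from statistics import mean, pstdev


def net_pct_oi(long_pos: float, short_pos: float, open_interest: float) -> float:
    """Net position as a fraction of open interest."""
    return (long_pos - short_pos) / open_interest


def positioning_spread(comm_net: float, mm_net: float) -> float:
    """Commercials net minus Managed Money net, both as fractions of OI."""
    return comm_net - mm_net


def zscore_latest(series, window: int = 52) -> float:
    """Z-score of the latest value against the trailing `window` weeks."""
    hist = list(series)[-window:]
    sd = pstdev(hist)
    return 0.0 if sd == 0 else (hist[-1] - mean(hist)) / sd


# Toy weekly spread history; the last reading sits well above the rest.
spreads = [0.0, 0.0, 0.0, 1.0]
z = zscore_latest(spreads, window=4)
```

Dividing by open interest keeps contracts of different sizes comparable, and the z-score turns the spread into a unit-free measure of how stretched positioning is relative to its own history.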
8) FINRA Short Interest & Short-Sale Volume — sentiment & liquidity stress
Why it matters: Short interest and daily short-sale volumes help flag squeeze risk, borrow pressure, and sentiment extremes.
• Start here: Equity Short Interest data & files (rolling year + archives). (FINRA, 2025; FINRA files)
• Daily short-sale volume: Aggregates by security & facility. (FINRA, 2025)
• MAPS:
• Methodology: Broker-dealer submissions under FINRA Rule 4560.
• Access: Bulk text files / downloads; APIs/guidelines for OTC data. (FINRA Short Interest Reporting Dates page, 2025)
• Periodicity: Twice monthly (short interest), daily (short-sale volume).
• Scope: OTC equities and exchange-listed aggregates.
Pro tip: Use days-to-cover = short interest / ADV; cap outliers and test alongside borrow fee changes when available.
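A small sketch of the days-to-cover formula with an outlier cap. The 20-day volume window and the cap of 30 are arbitrary illustrative defaults, and the function names are hypothetical.

```python
def average_daily_volume(volumes, window: int = 20) -> float:
    """Mean of the most recent `window` daily share volumes."""
    recent = list(volumes)[-window:]
    return sum(recent) / len(recent)


def days_to_cover(short_interest: float, adv: float, cap: float = 30.0):
    """Short interest divided by ADV, capped to limit outlier influence.

    Returns None when ADV is zero or negative (ratio undefined).
    """
    if adv <= 0:
        return None
    return min(short_interest / adv, cap)


adv = average_daily_volume([200_000, 300_000])  # 250000.0
print(days_to_cover(1_000_000, adv))  # 4.0
```

Capping keeps a handful of illiquid names from dominating cross-sectional ranks or regressions.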
9) Kenneth R. French Data Library — factors & anomalies
Why it matters: Free gold standard for factor research (Fama-French 3/5, momentum, industry portfolios), updated through 2025.
• Start here: Fama/French factors description and latest coverage (daily through Jun 30, 2025). (French Data Library, Jun 30, 2025)
• MAPS:
• Methodology: Transparent construction; size/value/momentum, etc.
• Access: Zip/CSV downloads.
• Periodicity: Daily/Monthly updates by table.
• Scope: U.S. and international sets.
Pro tip: Use factors as risk controls in your backtests (neutralize exposures) before attributing alpha to your signals.
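To illustrate neutralizing an exposure, here is a single-factor sketch in plain Python: regress strategy returns on one factor (for example, the market excess return) and look at what remains. A fuller version would regress on several factors at once; the function names and the toy returns are hypothetical.

```python
def ols_alpha_beta(y, x):
    """Ordinary least squares fit of y = alpha + beta * x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return mean_y - beta * mean_x, beta


def neutralize(y, x):
    """Strategy returns with the fitted factor exposure removed."""
    _, beta = ols_alpha_beta(y, x)
    return [yi - beta * xi for yi, xi in zip(y, x)]


# Toy data: a strategy that is 1.5x the factor plus 10 bps per period.
factor = [0.01, -0.02, 0.03, 0.00]
strategy = [0.001 + 1.5 * f for f in factor]
alpha, beta = ols_alpha_beta(strategy, factor)
residual = neutralize(strategy, factor)
```

Here the fit recovers a beta of about 1.5 and an alpha of about 0.001, so most of the raw return was factor exposure rather than signal.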
10) Stooq — free global OHLCV (EOD & intraday slices)
Why it matters: If you need free equities/ETF OHLCV for prototyping, Stooq provides broad coverage with daily and some intraday (hourly/5-minute) files.
• Start here: Free historical market data pages (Stooq, Sep 2025) and the overview directories (Stooq DB).
• Example listings: S&P 500 (^SPX), Nasdaq-100 (^NDX), and U.S. tickers. (Stooq SPX, 2025; Stooq NDX, 2025)
• Caveats: No official API; occasional symbol changes; validate survivorship/adjustments. An independent write-up confirms that Stooq offers CSV downloads but no formal API. (QuantStart, 2025 reference context)
• MAPS:
• Methodology: Exchange-sourced aggregates; adjustments vary.
• Access: Direct CSV downloads.
• Periodicity: Daily; some intraday.
• Scope: Equities, ETFs, FX, indices, some crypto.
Pro tip: Use Stooq for prototype/backtest only; for production, confirm with primary or licensed feeds.
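A sketch of loading a Stooq CSV and running a crude adjustment check. Because there is no official API, the download URL pattern below is an unofficial convention and may change; the column names, helper names, and sample rows are assumptions for illustration.

```python
import csv
import io
from urllib.parse import quote


def stooq_csv_url(symbol: str, interval: str = "d") -> str:
    """Unofficial CSV download URL (assumed pattern, not a documented API)."""
    return f"https://stooq.com/q/d/l/?s={quote(symbol.lower())}&i={interval}"


def parse_ohlcv(csv_text: str):
    """Parse rows with Date, Open, High, Low, Close and optional Volume."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "date": rec["Date"],
            "open": float(rec["Open"]),
            "high": float(rec["High"]),
            "low": float(rec["Low"]),
            "close": float(rec["Close"]),
            "volume": float(rec["Volume"]) if rec.get("Volume") else None,
        })
    return rows


def suspicious_gaps(rows, threshold: float = 0.4):
    """Dates whose close-to-close move exceeds `threshold` in absolute terms.

    A jump this large may be an unadjusted split or a symbol change.
    """
    flagged = []
    for prev, cur in zip(rows, rows[1:]):
        if abs(cur["close"] / prev["close"] - 1) > threshold:
            flagged.append(cur["date"])
    return flagged


# Made-up rows; the last one halves overnight, like an unadjusted 2:1 split.
sample_csv = """Date,Open,High,Low,Close,Volume
2025-01-02,100,102,99,100,1000
2025-01-03,100,103,100,101,1200
2025-01-06,50.5,51,50,50.5,2500
"""
print(suspicious_gaps(parse_ohlcv(sample_csv)))  # ['2025-01-06']
```

Flagged dates are prompts for manual review against a second source, not automatic corrections.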
Bonus utility: Nasdaq Trader symbol directories
Why it matters: Clean symbology is 50% of the battle. Nasdaq’s official symbol files help you map listings, ETPs, MPIDs.
• Start here: Symbol Lookup & dynamic files (e.g., nasdaqlisted.txt); the live nasdaqlisted.txt shows the current roster. (Nasdaq Trader, 2025)
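A minimal parser sketch for the symbol file, assuming it is pipe-delimited with a header row, a `Test Issue` flag column, and a trailing `File Creation Time` line. The column names in the sample are from memory and the rows are made up, so check them against the live file.

```python
def parse_nasdaq_listed(text: str):
    """Return one dict per real listing, skipping test issues and the footer."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split("|")
    records = []
    for line in lines[1:]:
        if line.startswith("File Creation Time"):
            continue  # trailing timestamp line, not a security
        rec = dict(zip(header, line.split("|")))
        if rec.get("Test Issue") == "Y":
            continue
        records.append(rec)
    return records


# Made-up sample in the assumed layout.
sample_file = """Symbol|Security Name|Market Category|Test Issue|Financial Status|Round Lot Size|ETF|NextShares
AAAA|Example Corp - Common Stock|Q|N|N|100|N|N
ZZZT|Example Test Issue|G|Y|N|100|N|N
BBBB|Example Index ETF|G|N|N|100|Y|N
File Creation Time: 0101202509:00|||||||
"""
listings = parse_nasdaq_listed(sample_file)
etfs = [r["Symbol"] for r in listings if r["ETF"] == "Y"]
```

Snapshotting the parsed file on a schedule gives you a dated symbol master, which helps when tickers are reused or renamed.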
One-table snapshot (bookmark this)
# | Dataset | Best for | Fastest access | Typical latency | Key gotcha |
---|---|---|---|---|---|
1 | FRED/ALFRED | Macro regimes & vintage data | REST JSON (series/observations) | Daily→Monthly | Use ALFRED to avoid look-ahead bias. |
2 | BLS | CPI, jobs, wages | Public API (series IDs) | Monthly (fixed times) | Core vs headline matters; seasonality. |
3 | BEA | GDP, PCE, profits | API tables (NIPA) | Monthly/Quarterly | Annual revisions (2025 updates). |
4 | Treasury (Fiscal Data) | Yield curves (nominal/real), bills | JSON/CSV endpoints | Daily (3:30 p.m. ET quotes) | Par curves are indicative, not trades. |
5 | Census MARTS | Retail sales & e-commerce | CSV/XLS + API | Monthly (8:30 a.m. ET) | Advance data revised later. |
6 | SEC EDGAR | Filings & XBRL | REST/JSON | Event-driven | Parse as-filed timestamps. |
7 | CFTC COT | Futures positioning | CSV weekly | Weekly (Tue→Fri) | Use OI normalization. |
8 | FINRA Short Data | Short interest & short-sale volume | Bulk text/CSV | Twice monthly / Daily | Exchange vs OTC coverage differs. |
9 | French Library | Factors & anomalies | CSV ZIPs | Daily/Monthly | Factor spec ≠ investable product. |
10 | Stooq | Free OHLCV for prototypes | CSV downloads | Daily / some intraday | No official API; check adjustments. |
Takeaway: Mix official macro/rates (FRED, BLS, BEA, Treasury, Census) with market microstructure & filings (SEC, CFTC, FINRA) and research factors (French), then prototype equities on Stooq before graduating to a paid feed.
Step-by-step: build a “macro-aware risk premium” signal in 30 minutes
1) Pull rates (real & nominal): Use Treasury par yields for 10Y nominal and 10Y real (TIPS). (Treasury, 2025)
2) Get earnings yield proxy: Pull S&P 500 EOD prices from Stooq (^SPX) and compute E/P from a trailing EPS series if you have one, or use a simpler dividend + buyback yield proxy. For a free demo, subtract the 10Y real yield from the E/P proxy to estimate a rough ex-ante real ERP. (Stooq, 2025)
3) Macro guardrails: Add CPI trend from BLS (YoY and 3-month annualized) to toggle regime (stable vs accelerating). (BLS, 2024)
4) Position filter: If ERP is above your historical median and CPI is stable/falling, allow higher equity beta; if ERP is below median or CPI accelerating, cut risk.
5) Compliance sanity: Keep leverage caps and drawdown stops; log all data vintages (ALFRED where applicable). (St. Louis Fed, 2025)
Mini math (toy example):
• Suppose 10Y real = 2.0% (Treasury) and your E/P proxy = 5.0%. Estimated ERP ≈ 5.0% − 2.0% = 3.0%.
• If ERP percentile < 30th or CPI acceleration > 0 (3-mo ann. – YoY > 0), reduce equity weight by 50%.
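The toy rule above can be sketched in a few functions. The percentile definition (share of history at or below the current value), the function names, and the synthetic CPI paths are illustrative assumptions; the 30th-percentile cutoff and the 50% cut come from the example.

```python
def yoy(levels) -> float:
    """Year-over-year change from monthly index levels (needs 13+ points)."""
    return levels[-1] / levels[-13] - 1


def three_month_annualized(levels) -> float:
    """Change over the last three months, compounded to an annual rate."""
    return (levels[-1] / levels[-4]) ** 4 - 1


def percentile_rank(history, x: float) -> float:
    """Share of historical values at or below x, between 0 and 1."""
    return sum(1 for h in history if h <= x) / len(history)


def equity_weight(erp, erp_history, cpi_levels, base_weight: float = 1.0) -> float:
    """Halve the weight when ERP is cheap-ranked low or CPI is accelerating."""
    cpi_accel = three_month_annualized(cpi_levels) - yoy(cpi_levels)
    if percentile_rank(erp_history, erp) < 0.30 or cpi_accel > 0:
        return base_weight * 0.5
    return base_weight


def build_levels(monthly_rates, start: float = 100.0):
    """Turn a list of monthly growth rates into index levels."""
    levels = [start]
    for r in monthly_rates:
        levels.append(levels[-1] * (1 + r))
    return levels


erp = 0.05 - 0.02  # E/P proxy minus 10Y real yield, about 3%
erp_history = [0.01, 0.02, 0.025, 0.035, 0.04]  # made-up history
cooling = build_levels([0.004] * 9 + [0.001] * 3)  # inflation slowing
heating = build_levels([0.001] * 9 + [0.004] * 3)  # inflation speeding up
print(equity_weight(erp, erp_history, cooling))  # 1.0
print(equity_weight(erp, erp_history, heating))  # 0.5
```

With the same 3% ERP, the weight stays at 1.0 while CPI is decelerating and drops to 0.5 once the three-month annualized rate runs above the year-over-year rate.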
Pros, cons & risk management
Pros
• Auditability: Gov/academic sources make backtests reproducible and compliant. (SEC, BLS, BEA, Treasury, CFTC 2024–2025.)
• Breadth: Macro + filings + positioning + factors gives orthogonal signals.
• Cost: $0 to start; allocate budget to compute/storage first.
Cons
• Latency: Free datasets can lag (e.g., short interest is published only twice a month; BEA figures are revised after release).
• Coverage gaps: Free equities OHLCV lacks depth-of-book or corporate actions nuance.
• Non-investable factors: French factors are research artifacts, not tradable baskets.
Mitigations
• Use ALFRED and as-filed timestamps for point-in-time accuracy. (St. Louis Fed; SEC, 2024–2025.)
• Combine macro dailies (Treasury yields) with monthly series (BLS/Census) via regime states, not tight timing.
• Prototype on Stooq, then verify on a paid primary feed before deployment.
Common mistakes (and expert fixes)
• Using revised data in backtests. Fix: Pull ALFRED vintages (macro) and EDGAR timestamps (fundamentals). (St. Louis Fed, 2025; SEC, 2024)
• Ignoring methodology changes. Fix: Read Treasury yield-curve notes (spline methodology, series breaks). (Treasury, 2025)
• Mixing OTC vs exchange short data. Fix: FINRA pages differentiate coverage and files—don’t over-interpret. (FINRA, 2025)
• Assuming factor returns are investable as is. Fix: Treat French factors as controls/benchmarks, not trade lists. (French Library, 2025)
Practical example: a filings-plus-positioning alert
• Event: Company files an unexpected 8-K after the close (EDGAR timestamp). (SEC, 2024)
• Add context: Managed Money futures tilt from COT turns extreme for the firm’s key input commodity (e.g., WTI). (CFTC, 2025)
• Risk lens: Short interest days-to-cover spikes (FINRA). (FINRA, 2025)
• Action: Flag as high-dispersion event; tighten stops or cut size if you’re long into the print.
Compliance & U.S. regulators to know
• SEC – issuers’ filings, XBRL data, market structure rules. (SEC, 2024–2025)
• CFTC/NFA – futures/derivatives positioning and oversight; COT reports. (CFTC, 2025)
• Federal statistical system – BEA, BLS, Census, Treasury: authoritative macro and rates. (BEA, 2024–2025; BLS, 2024; Census, 2025; Treasury, 2025)
Next steps (action plan)
1) Bookmark & script the 10 sources above; document MAPS for each in your repo.
2) Backtest with vintages (ALFRED + as-filed EDGAR) and add guardrails: max drawdown, position sizing, and kill switches.
3) Prototype quickly on Stooq; validate on a paid, licensed feed before going live.
4) Expand coverage with sector-specific BEA tables, BLS micro-series, FINRA daily short-sale volume, and CFTC positioning spreads.
5) Keep a release calendar (Census/BLS/BEA/Treasury) to time rebalances and avoid “information lag” trades.