Latency Wars: Which U.S. Brokers Offer True Sub-Millisecond Execution in 2025?

Answer up front: For U.S. retail traders, no mainstream broker publishes or demonstrates true sub-millisecond, end-to-end execution in 2025. Published averages from leading firms land in the tens of milliseconds (e.g., ~0.04–0.05 seconds). Sub-millisecond fills are realistically limited to institutional-style DMA with exchange colocation—not typical retail accounts. (Fidelity, Q1-2025: ~0.04 s; Schwab, Q2-2025: ~0.05 s.)

Affiliate disclosure: If this article includes affiliate links to brokers or tools, assume I may receive a commission at no additional cost to you. This never affects my editorial opinions.


What “sub-millisecond execution” really means (and why it’s rare)

Plain-English definitions:

Latency: Delay from action to response. Here: from order receipt at a broker/venue to execution. Under amended SEC Rule 605, reporting entities must measure time-to-execution in increments of a millisecond or finer.
Matching engine: The exchange component that pairs buys with sells. Exchange engines can process messages in microseconds—far faster than most end-to-end retail paths.
DMA (Direct Market Access): Institutional connectivity that bypasses retail layers, typically via FIX gateways, pre-trade risk checks, and colocation racks inside or adjacent to the exchange’s data center.

Key point: The SEC’s 2024 Rule 605 amendments force reporting of speed in ms because that’s the meaningful scale for retail execution quality comparisons. If sub-millisecond were normal for retail, you’d expect to see 0.x-ms figures in broker disclosures—but you do not.


The Latency Ladder™: Where your milliseconds actually go

Use this original, practical framework to locate bottlenecks and set realistic expectations:

1) You & your device → UI, OS scheduling, drivers.
2) Your uplink → ISP hops to your broker.
3) Broker front door → API/TWS/web, auth, rate limits.
4) Broker OMS / smart router → validations, risk checks, venue selection.
5) Network to venue → fiber/microwave path to the exchange or wholesaler.
6) Venue gateway → risk controls, throttle queues.
7) Matching engine → microsecond logic.
8) Acknowledgment path → confirmations back to you.

Even if #7 is measured in microseconds, #2–#6 often add tens of milliseconds for retail flows. That’s why leading retail brokers publish ~40–60 ms averages—not <1 ms. (Fidelity Q1-2025: 0.04 s; Schwab Q2-2025: 0.05 s.)
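To make the ladder concrete, here is a minimal sketch of a per-stage latency budget. Every number in it is an illustrative assumption chosen only to land in the tens-of-milliseconds range discussed above; none of it is a measurement from any broker or venue, and the names `LADDER_US`, `total_ms`, and `share` are hypothetical.

```python
# Hypothetical per-stage latency budget, in microseconds.
# All values are illustrative assumptions, not measurements.
LADDER_US = {
    "1 device/UI": 2_000,
    "2 uplink (ISP)": 15_000,
    "3 broker front door": 5_000,
    "4 broker OMS/router": 8_000,
    "5 network to venue": 3_000,
    "6 venue gateway": 500,
    "7 matching engine": 50,
    "8 acknowledgment path": 15_000,
}


def total_ms(ladder):
    """Sum every rung of the ladder and convert microseconds to milliseconds."""
    return sum(ladder.values()) / 1_000


def share(ladder, stage):
    """Fraction of the total budget consumed by one stage."""
    return ladder[stage] / sum(ladder.values())


if __name__ == "__main__":
    print(f"End-to-end: {total_ms(LADDER_US):.2f} ms")
    print(f"Matching engine share: {share(LADDER_US, '7 matching engine'):.2%}")
```

Under these assumed values the matching engine is a rounding error in the total, which is the point of the ladder: shaving the engine does nothing if rungs 2 through 6 dominate.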


What the 2025 data says (and how to read it)

• Rule 605 was modernized (amendments effective June 2024): execution speeds must be measured in milliseconds or finer and summary reports are required, enabling apples-to-apples comparisons.
• Retail brokers’ published figures:
  • Fidelity shows ~0.04 seconds average execution speed (Q1-2025).
  • Charles Schwab highlights ~0.05 seconds (Q2-2025).
  • These are excellent for retail, but not sub-ms.
• Wholesalers & venues: Updated 605 data focuses on price improvement, spreads, and median and 99th-percentile times, again in milliseconds. Research and industry commentary in 2025 treat ms as the relevant unit for retail execution comparisons.

Bottom line: If a broker markets “lightning-fast” retail execution, the numbers still settle in ms, not μs.


So, which U.S. brokers offer true sub-millisecond execution to retail in 2025?

None, based on published data and mandated disclosures.
The best-in-class retail averages are ~40–60 ms. If your bar is <1 ms end-to-end, you’re looking at institutional DMA with colocation rather than a conventional retail account.

A realistic way to compare brokers on speed in 2025

Thanks to the 2024 amendments, you can now compare millisecond-granular metrics across market centers and broker-dealers via Rule 605 reports and broker execution pages. Look specifically for:

• Average / median time-to-execution (ms)
• 99th percentile time-to-execution (ms)
• Fill rates & price improvement (speed vs. price trade-offs matter)


One helpful table: What leading retail brokers publicly show

Broker (source)                          | Published average execution speed (period)
Fidelity (Capital Markets statistics)    | 0.04 seconds (Q1-2025)
Charles Schwab (Execution Quality page)  | 0.05 seconds (Q2-2025)

Takeaway: Leading U.S. retail brokers cluster in tens of milliseconds—decisively not sub-millisecond.
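The unit conversion behind that takeaway is simple arithmetic, sketched here in Python. The two figures are the published averages quoted in the table; the 1 ms threshold and the helper names `to_ms` and `times_over_threshold` are assumptions introduced for illustration.

```python
SUB_MS_THRESHOLD_MS = 1.0  # the "sub-millisecond" bar, expressed in milliseconds

# Published averages in seconds, as quoted in the table above.
published_seconds = {
    "Fidelity (Q1-2025)": 0.04,
    "Charles Schwab (Q2-2025)": 0.05,
}


def to_ms(seconds):
    """Convert seconds to milliseconds."""
    return seconds * 1_000


def times_over_threshold(seconds, threshold_ms=SUB_MS_THRESHOLD_MS):
    """How many multiples of the threshold a published figure represents."""
    return to_ms(seconds) / threshold_ms


if __name__ == "__main__":
    for broker, secs in published_seconds.items():
        print(f"{broker}: {to_ms(secs):.0f} ms, "
              f"about {times_over_threshold(secs):.0f}x the 1 ms bar")
```

In other words, the published averages sit roughly 40 to 50 times above the sub-millisecond bar.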


Step-by-step: How to get as close as possible to sub-ms in the U.S.

1) Locate your strategy: If it’s sensitive to <10 ms, retail paths won’t suffice. Consider DMA with a professional broker that offers colocation in Mahwah (NYSE), Carteret (Nasdaq), Secaucus (Cboe/MEMX/IEX), or Aurora/NY5 (CME/EBS/BrokerTec).
2) Pick the right venue proximity: Your alpha depends on the dominant venue for your symbols; colocate near that venue. Latency mismatches across data centers will erase microsecond gains.
3) Use a low-latency gateway: On futures, CME iLink 3 is now the standard; ensure your broker supports it and that you’re on an optimized risk path.
4) Minimize software jitter: Prefer native APIs/FIX over retail GUIs, trim OS interruptions, pin CPU cores, and avoid consumer Wi-Fi for order entry. (See IBKR’s quant posts for useful architectural context.)
5) Measure properly: Use exchange timestamps (when available), compare broker order-receipt vs execution times, and track p50/p95/p99, not just averages. Rule 605’s millisecond granularity helps you benchmark venues.
6) Balance speed vs. price: Regulators emphasize best execution (price, speed, likelihood of fill). Faster isn’t always “better” if you sacrifice price improvement. FINRA Rule 5310 requires “reasonable diligence” in best-ex reviews.
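Step 5 can be sketched as follows: take order-receipt and execution timestamps, turn them into latencies, and report p50/p95/p99 alongside the mean. This is a minimal sketch that assumes you can export ISO-8601 timestamps from your own order log; the sample rows are made up, and the nearest-rank percentile used here is one common definition, not necessarily the one a broker or a Rule 605 report uses.

```python
import math
from datetime import datetime


def latency_ms(received_iso, executed_iso):
    """Milliseconds between order receipt and execution (ISO-8601 strings)."""
    received = datetime.fromisoformat(received_iso)
    executed = datetime.fromisoformat(executed_iso)
    return (executed - received).total_seconds() * 1_000


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies):
    """Mean plus the tail percentiles that averages tend to hide."""
    summary = {f"p{p}": percentile(latencies, p) for p in (50, 95, 99)}
    summary["mean"] = sum(latencies) / len(latencies)
    return summary


if __name__ == "__main__":
    # Hypothetical order log: (received, executed) pairs, not real fills.
    log = [
        ("2025-04-01T14:30:00.000000", "2025-04-01T14:30:00.038000"),
        ("2025-04-01T14:30:05.000000", "2025-04-01T14:30:05.044000"),
        ("2025-04-01T14:30:09.000000", "2025-04-01T14:30:09.052000"),
        ("2025-04-01T14:31:00.000000", "2025-04-01T14:31:00.120000"),
    ]
    print(summarize([latency_ms(r, e) for r, e in log]))
```

A single slow fill barely moves the median but dominates p99, which is why the tail percentiles deserve their own line in any log.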


Pros, cons & risk management

Pros of chasing latency:
• Better queue position for resting limit orders, and earlier access to displayed liquidity for market and marketable limit orders.
• Lower slippage around news and microstructure transitions.

Cons:
• Cost: Colocation racks, cross-connects, and premium feeds are expensive.
• Complexity: Engineering for μs requires specialized staff and tooling.
• Diminishing returns: Many retail strategies see more benefit from routing logic and price improvement than shaving 5 ms.

Mitigations:
• Start with broker/venue comparisons using ms-granular 605 data before considering colocation spend.
• Use IOC marketable limits to control worst-case fills.
• Backtest with realistic latency injection (e.g., 30–80 ms for retail U.S. equities).
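The last two mitigations can be combined in one small backtest sketch: an IOC marketable limit buy is priced off the quote at signal time, a random 30–80 ms delay is injected, and the order either fills at the ask prevailing on arrival or cancels. This is a deliberately simplified model under stated assumptions: it fills against the ask only, ignores size, fees, and queueing, draws latency from a uniform distribution, and uses made-up quotes. The function names are hypothetical.

```python
import bisect
import random


def price_at(ticks, t_ms):
    """Last quoted ask at or before t_ms; ticks is a time-sorted list of (t_ms, ask)."""
    times = [t for t, _ in ticks]
    i = bisect.bisect_right(times, t_ms) - 1
    if i < 0:
        raise ValueError("no quote at or before the requested time")
    return ticks[i][1]


def simulate_ioc_buy(ticks, signal_ms, limit_price, latency_ms):
    """Fill at the ask seen on arrival, or return None if it exceeds the limit."""
    arrival_ask = price_at(ticks, signal_ms + latency_ms)
    return arrival_ask if arrival_ask <= limit_price else None


def backtest(ticks, signals, tolerance, rng, lo_ms=30, hi_ms=80):
    """Run each signal with a limit of (ask at signal + tolerance) and injected latency."""
    fills = []
    for signal_ms in signals:
        limit = price_at(ticks, signal_ms) + tolerance
        latency = rng.uniform(lo_ms, hi_ms)  # assumed uniform retail latency
        fills.append(simulate_ioc_buy(ticks, signal_ms, limit, latency))
    return fills


if __name__ == "__main__":
    # Made-up quotes: (milliseconds since start, ask price).
    TICKS = [(0, 100.00), (20, 100.02), (50, 100.10), (200, 100.03)]
    print(backtest(TICKS, signals=[0, 20, 200], tolerance=0.05, rng=random.Random(42)))
```

A `None` in the result is an IOC order that cancelled because the market moved past the limit while the order was in flight, which is exactly the worst-case fill the limit is there to prevent.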


Practical mini case study: Can a retail trader hit <1 ms?

Scenario: You trade S&P 500 constituents from your home in Texas, routing to a retail broker in New Jersey that sends orders to wholesalers/exchanges.

• Propagation alone over more than a thousand miles of fiber costs several milliseconds each way, even on a good route.
• Broker risk checks and router hops add more.
• Published broker stats: 40–60 ms averages.

Conclusion: You won’t see <1 ms end-to-end from a home connection. Getting well below the tens-of-milliseconds range takes DMA, colocation, and a tuned stack, and any sub-millisecond figure then describes the colocated path from your server to the venue (gateway→engine), not a round trip that starts at your desk in Texas.
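The propagation floor in this scenario can be estimated with a back-of-the-envelope calculation. The sketch below assumes light in fiber travels at about two thirds of its vacuum speed and that the Texas-to-northern-New-Jersey path is roughly 2,200 km in a straight line; both are approximations, and real fiber routes are longer than the straight-line distance.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_VELOCITY_FACTOR = 0.67  # assumption: light in glass moves at roughly 2/3 of c
TEXAS_TO_NJ_KM = 2_200        # assumption: approximate straight-line distance


def one_way_fiber_ms(distance_km, velocity_factor=FIBER_VELOCITY_FACTOR):
    """Minimum one-way propagation delay over fiber, in milliseconds."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor) * 1_000


def round_trip_fiber_ms(distance_km, velocity_factor=FIBER_VELOCITY_FACTOR):
    """Minimum there-and-back propagation delay, in milliseconds."""
    return 2 * one_way_fiber_ms(distance_km, velocity_factor)


if __name__ == "__main__":
    print(f"One way:    {one_way_fiber_ms(TEXAS_TO_NJ_KM):.1f} ms")
    print(f"Round trip: {round_trip_fiber_ms(TEXAS_TO_NJ_KM):.1f} ms")
```

Under these assumptions the one-way floor alone is on the order of 10 ms, before any router hop, broker risk check, or venue gateway is counted, so physics rules out <1 ms from this location regardless of the broker.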


Common mistakes & expert tips

Mistakes:
• Confusing engine speed with end-to-end speed. Exchange engines are fast; your path isn’t.
• Ignoring best-ex obligations. Speed at the expense of price can fail internal reviews.
• Taking marketing at face value. Verify with Rule 605 and broker transparency pages updated for 2024–2025.

Tips:
• Monitor median and 99th percentile times; tail latency kills P&L during volatility.
• Align venue proximity with your symbol universe; don’t colocate at NYSE if you mostly trade Nasdaq names.
• For futures, ensure your stack and broker fully support CME iLink 3 and planned CME private cloud/colo transitions.


Compliance & the U.S. regulators you should know

SEC (Reg NMS, Rule 605/606, proposed Reg Best Ex): Execution-quality disclosures in ms, routing transparency, and best-ex expectations.
FINRA (Rule 5310): Member firms must exercise reasonable diligence to achieve best execution; firms must regularly and rigorously review execution quality.
CFTC/NFA (for futures/FX): Supervisory and disclosure obligations for FCMs/RFEDs/CTAs/CPOs; retail FX falls under NFA Rules 2-36 et al. (execution-speed marketing must remain truthful and not misleading).

Plain-English risk disclaimer:
All trading involves risk, including potential loss of principal. Execution speed does not guarantee favorable pricing or profitability. Past performance and historical latency metrics are not indicative of future results.


FAQ

If exchanges work in microseconds, why don’t I get microsecond fills?
Because your order traverses the public internet, broker risk checks, routers, and venue gateways before it hits the engine. Those hops add tens of milliseconds in retail. Exchanges are fast; the path isn’t.
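Is any U.S. retail broker sub-ms in 2025?
No, based on published data and mandated disclosures. Leading retail brokers report averages in the tens of milliseconds (Fidelity ~0.04 s, Q1-2025; Schwab ~0.05 s, Q2-2025).
Would a “pro” platform like DMA actually help?
It can, if latency is truly your edge. DMA with colocation near the relevant venue removes the public internet and the retail layers from the path, but it adds cost and complexity, and many retail strategies gain more from routing logic and price improvement.
How do I verify a broker’s speed claims?
Check the broker’s Rule 605 reports and execution-quality page for average or median and 99th-percentile time-to-execution, then compare those figures with your own logged p50/p95/p99 latencies.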

Conclusion: What you should do next

1) Right-size the goal: If your strategy is not microstructure-arbitrage, tens of ms with strong price improvement is usually optimal. Start by comparing Rule-605-style summaries across your broker options.
2) Measure, don’t assume: Log your p50/p95/p99 latencies and slippage per symbol and venue.
3) Iterate routing tactics: Prefer marketable limit over pure market orders; test IOC/pegged variants where supported.
4) If latency is truly the edge: Explore DMA + colocation pilots near the relevant venue (Mahwah/Carteret/Secaucus/Aurora), ensure iLink 3 (futures) support, and budget for the ongoing costs.
5) Keep it compliant: Align with SEC/FINRA best-ex expectations and NFA/CFTC rules if you trade futures/FX.



About the author (Carlos Martinez): Certified Market Technician, ex-prop trader and Python algo coder. I fuse technical analysis, backtesting and automation to craft high-probability Forex, CFD and crypto strategies. Follow for code snippets, VWAP pullbacks, grid-bot guides and trade-management hacks that help U.S. traders scale with confidence.