Performance

Benchmark proof, not benchmark promises.

Measured from Singapore against global routers. This page is your source of truth for TTFT, latency, and reliability.

Benchmark Snapshot

27ms-class APAC routing profile

Representative benchmark view used in customer evaluations. Full scenario breakdown follows below.

Representative sample
| Metric | Global routers | Brightnode Router | Delta |
|---|---|---|---|
| End-to-end latency p50 | 733ms | 199ms | 3.7x faster |
| TTFT p50 (streaming) | 840ms | 94ms | 8.9x faster |
| End-to-end latency p95 | 1,643ms | 318ms | 5.2x faster |
| TTFT p95 (streaming) | 1,393ms | 142ms | 9.8x faster |

All rows measured from Singapore.

Source region: Singapore. Window: repeated sample runs. Output profile: 200-token completion.

TTFT p50: 94ms (8.9x faster than global routers)

End-to-end latency p50: 199ms (3.7x faster at 200-token output)

Error rate: 0% (global routers: 3.3%)

Concurrency: 178ms p50 at 5 concurrent requests (consistent under parallel load)

Full benchmark data

Detailed performance comparison

Measured from Singapore against global routers. All numbers are real. No synthetic optimisations.

End-to-end latency (non-streaming)

Time from request to complete response, by output length

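| Max tokens | Brightnode p50 | Brightnode p95 | Global p50 | Global p95 | Speedup |
|---|---|---|---|---|---|
| 5 tokens | 132ms | 139ms | 602ms | 844ms | 4.6x |
| 50 tokens | 195ms | 309ms | 818ms | 4,460ms | 4.2x |
| 200 tokens | 199ms | 318ms | 733ms | 1,643ms | 3.7x |
| 500 tokens | 198ms | 200ms | 808ms | 1,859ms | 4.1x |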
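
The Speedup figures above appear to be the global p50 divided by the Brightnode p50, rounded to one decimal place. A minimal sketch of that arithmetic follows; the `speedup` helper is a hypothetical name introduced here for illustration, not part of any Brightnode tooling.

```shell
#!/bin/sh
# speedup GLOBAL_MS BRIGHTNODE_MS
# Prints the latency ratio rounded to one decimal place.
# Assumption: the published Speedup column is global p50 / Brightnode p50.
speedup() {
  awk -v g="$1" -v b="$2" 'BEGIN { printf "%.1fx\n", g / b }'
}

# 200-token row: global p50 733ms vs Brightnode p50 199ms
speedup 733 199   # prints 3.7x
```

Applying the same ratio to your own measurements gives a like-for-like number to compare against the published column.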

Streaming performance

TTFT, inter-token latency (TPOT), and total streaming time

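| Metric | Brightnode | Global routers | Speedup |
|---|---|---|---|
| TTFT p50 | 94ms | 840ms | 8.9x |
| TTFT p95 | 142ms | 1,393ms | 9.8x |
| TTFT p99 | 224ms | 1,799ms | 8.0x |
| TTFT min | 86ms | 207ms | 2.4x |
| TPOT p50 | 12ms | 16ms | 1.3x |
| TPOT p95 | 13ms | 70ms | 5.4x |
| TPOT p99 | 15ms | 82ms | 5.5x |
| Total p50 | 193ms | 987ms | 5.1x |
| Total p95 | 303ms | 1,695ms | 5.6x |
| Error rate | 0% | 3.3% | n/a |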
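
To make the three streaming metrics concrete, the sketch below derives them from a list of token arrival times. It uses a common definition (TTFT is the first arrival, TPOT is the mean gap between later tokens, total is the last arrival); the benchmark harness is not published here, so this may differ from how the figures above were computed. The `stream_metrics` name and the sample timings are hypothetical.

```shell
#!/bin/sh
# stream_metrics: read token arrival times from stdin (ms since the request
# was sent, one per line, in arrival order) and print TTFT, mean TPOT, total.
stream_metrics() {
  awk '
    NR == 1 { first = $1 }
    { last = $1 }
    END {
      if (NR == 0) exit 1
      # Mean gap between consecutive tokens after the first one.
      tpot = (NR > 1) ? (last - first) / (NR - 1) : 0
      printf "TTFT=%dms TPOT=%.1fms Total=%dms\n", first, tpot, last
    }'
}

# Hypothetical arrival times for a 4-token stream (not benchmark data).
printf '%s\n' 100 112 124 136 | stream_metrics
# prints TTFT=100ms TPOT=12.0ms Total=136ms
```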

Concurrency sweep

5-token payload under parallel load

| Concurrency | Brightnode p50 | Brightnode p95 | Global p50 | Global p95 | Speedup |
|---|---|---|---|---|---|
| 1 concurrent | 134ms | 167ms | 757ms | 2,695ms | 5.6x |
| 5 concurrent | 178ms | 401ms | 728ms | 2,614ms | 4.1x |

Source region: Singapore. Repeated sample runs. TPOT = time per output token. TTFT = time to first token.
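
A rough way to approximate the concurrency sweep from your own host is sketched below. It assumes "5 concurrent" means five requests in flight at the same time and that the 5-token payload corresponds to `max_tokens` set to 5; neither detail is confirmed on this page. `run_parallel` and `one_request` are hypothetical helper names.

```shell
#!/bin/sh
# run_parallel N CMD [ARGS...]: start N copies of CMD at once, wait for all.
run_parallel() {
  n="$1"
  shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" &
    i=$((i + 1))
  done
  wait
}

# one_request: send a single request and print its total time in ms.
# Assumption: max_tokens=5 stands in for the "5-token payload".
one_request() {
  curl -o /dev/null -s -w '%{time_total}\n' \
    -X POST https://api.brightnode.cloud/v1/chat/completions \
    -H "Authorization: Bearer $BRIGHTNODE_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model":"meta-llama/Llama-3.3-70B-Instruct","messages":[{"role":"user","content":"Hello APAC"}],"max_tokens":5}' |
    awk '{ printf "%.0f\n", $1 * 1000 }'
}

# Usage (makes 5 real API calls):
#   run_parallel 5 one_request
```

Each background request prints one line, so a single invocation yields five timings. Repeating the invocation several times gives enough samples for a p50 and p95.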

Verify it yourself

Run a quick reproducibility check from your APAC host

Use your own API key and machine to compare TTFT and total response time against published results.

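# Run from any APAC server.
# time_starttransfer is curl's time to first byte, used here as a proxy for TTFT.
# Assumption: the endpoint accepts the OpenAI-style "stream": true flag, so the
# response can start before the full completion has been generated.
curl -N -o /dev/null -s -w "TTFT: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  -X POST https://api.brightnode.cloud/v1/chat/completions \
  -H "Authorization: Bearer $BRIGHTNODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"meta-llama/Llama-3.3-70B-Instruct","messages":[{"role":"user","content":"Hello APAC"}],"max_tokens":128,"stream":true}'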
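
A single request gives one data point, while the published figures are percentiles over repeated runs. The sketch below repeats the request and reports nearest-rank percentiles. It is an illustration under stated assumptions, not the harness behind the published numbers: `percentile` and `sample_ttfb` are hypothetical helper names, the `"stream": true` flag is assumed to be supported in the OpenAI style, and the sample count of 30 in the usage lines is arbitrary.

```shell
#!/bin/sh
# percentile P: nearest-rank P-th percentile (integer P) of the numbers on
# stdin, one per line.
percentile() {
  sort -n | awk -v p="$1" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)  # integer ceil(p * NR / 100)
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# sample_ttfb N: send the request N times in sequence and print the time to
# first byte in ms for each HTTP 200 response, one value per line.
# Non-200 responses and connection failures are skipped.
sample_ttfb() {
  i=0
  while [ "$i" -lt "$1" ]; do
    curl -N -o /dev/null -s -w '%{http_code} %{time_starttransfer}\n' \
      -X POST https://api.brightnode.cloud/v1/chat/completions \
      -H "Authorization: Bearer $BRIGHTNODE_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model":"meta-llama/Llama-3.3-70B-Instruct","messages":[{"role":"user","content":"Hello APAC"}],"max_tokens":128,"stream":true}' |
      awk '$1 == 200 { printf "%.0f\n", $2 * 1000 }'
    i=$((i + 1))
  done
}

# Usage (makes 30 real API calls):
#   sample_ttfb 30 > ttfb_ms.txt
#   echo "p50: $(percentile 50 < ttfb_ms.txt)ms  p95: $(percentile 95 < ttfb_ms.txt)ms"
```

Comparing the number of lines in `ttfb_ms.txt` with the number of requests sent also gives a rough error rate, since failed requests produce no line.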