Performance

Benchmark proof, not benchmark promises.

Measured from Singapore against global routers. This page is your source of truth for time to first token (TTFT), end-to-end latency, and reliability.

Benchmark Snapshot

27ms-class APAC routing profile

Representative benchmark view used in customer evaluations. Full scenario breakdown follows below.

Representative sample

Metric (measured from Singapore)	Global routers	Brightnode Router	Delta
End-to-end latency p50	733ms	199ms	3.7x faster
TTFT p50 (streaming)	840ms	94ms	8.9x faster
End-to-end latency p95	1,643ms	318ms	5.2x faster
TTFT p95 (streaming)	1,393ms	142ms	9.8x faster

Source region: Singapore. Window: repeated sample runs. Output profile: 200-token completion.
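The Delta column appears to be the global-router figure divided by the Brightnode figure, rounded to one decimal place. A minimal Python sketch under that assumption, using the values from the snapshot table (the `speedup` helper is illustrative and not part of any Brightnode tooling):

```python
def speedup(global_ms: float, brightnode_ms: float) -> str:
    """Return the ratio global/brightnode formatted like the Delta column."""
    if brightnode_ms <= 0:
        raise ValueError("brightnode_ms must be positive")
    return f"{global_ms / brightnode_ms:.1f}x"


# (metric, global routers ms, Brightnode ms) copied from the snapshot table
SNAPSHOT = [
    ("End-to-end latency p50", 733, 199),
    ("TTFT p50 (streaming)", 840, 94),
    ("End-to-end latency p95", 1643, 318),
    ("TTFT p95 (streaming)", 1393, 142),
]

for metric, global_ms, brightnode_ms in SNAPSHOT:
    print(f"{metric}: {speedup(global_ms, brightnode_ms)} faster")
```

The same helper can be applied to your own measurements to see how they compare with the published ratios.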

TTFT p50

94ms

8.9x faster than global routers

End-to-end latency p50

199ms

3.7x faster at 200-token output

Error rate

0%

Global routers: 3.3% error rate

Concurrency stable

178ms p50 at 5 concurrent requests

Consistent under parallel load

Full benchmark data

Detailed performance comparison

Measured from Singapore against global routers. All figures are measured values from repeated sample runs, with no synthetic optimisations.

End-to-end latency (non-streaming)

Time from request to complete response, by output length

Max tokens	Brightnode p50	Brightnode p95	Global p50	Global p95	Speedup
5 tokens	132ms	139ms	602ms	844ms	4.6x
50 tokens	195ms	309ms	818ms	4,460ms	4.2x
200 tokens	199ms	318ms	733ms	1,643ms	3.7x
500 tokens	198ms	200ms	808ms	1,859ms	4.1x
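The p50 and p95 columns summarise a set of per-request latencies. The page does not state which percentile method was used, so the sketch below assumes the nearest-rank definition. The sample values are made up for illustration and are not benchmark data.

```python
import math


def percentile(samples_ms: list[float], p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds (illustrative only)
samples = [132, 130, 139, 135, 131]
print("p50:", percentile(samples, 50), "ms")
print("p95:", percentile(samples, 95), "ms")
```

With small sample counts the p95 is effectively the slowest request, so tail figures only become meaningful once the run is repeated many times.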

Streaming performance

TTFT, inter-token latency (TPOT), and total streaming time

Metric	Brightnode	Global routers	Speedup
TTFT p50	94ms	840ms	8.9x
TTFT p95	142ms	1,393ms	9.8x
TTFT p99	224ms	1,799ms	8.0x
TTFT min	86ms	207ms	2.4x
TPOT p50	12ms	16ms	1.3x
TPOT p95	13ms	70ms	5.4x
TPOT p99	15ms	82ms	5.5x
Total p50	193ms	987ms	5.1x
Total p95	303ms	1,695ms	5.6x
Error rate	0%	3.3%	n/a
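One common way to derive these streaming metrics from a single response is to timestamp the request start and each token arrival. The sketch below assumes TTFT is the gap from request start to the first token, TPOT is the mean gap between subsequent tokens, and total is the gap to the last token. The page does not spell out its exact definitions, and the timestamps here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StreamTiming:
    ttft_ms: float   # request start -> first token
    tpot_ms: float   # mean gap between consecutive tokens after the first
    total_ms: float  # request start -> last token


def stream_timing(start_ms: float, arrivals_ms: list[float]) -> StreamTiming:
    """Compute TTFT, TPOT, and total time from token arrival timestamps."""
    if not arrivals_ms:
        raise ValueError("no tokens received")
    first, last = arrivals_ms[0], arrivals_ms[-1]
    gaps = len(arrivals_ms) - 1
    tpot = (last - first) / gaps if gaps else 0.0
    return StreamTiming(first - start_ms, tpot, last - start_ms)


# Hypothetical arrival times in ms relative to the request start (illustrative only)
timing = stream_timing(0, [94, 106, 118, 130])
print(timing)
```

Collecting one `StreamTiming` per request and then taking percentiles across requests would give p50, p95, and p99 rows of the kind shown in the table.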

Concurrency sweep

5-token payload under parallel load

Concurrency	Brightnode p50	Brightnode p95	Global p50	Global p95	Speedup
1 concurrent	134ms	167ms	757ms	2,695ms	5.6x
5 concurrent	178ms	401ms	728ms	2,614ms	4.1x

Source region: Singapore. Repeated sample runs. TPOT = time per output token. TTFT = time to first token.
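A concurrency sweep of this kind can be approximated by issuing the same request from several worker threads at once and recording each request's wall-clock latency. In this sketch `send_request` is a hypothetical stand-in that only sleeps; swap in a real API call to measure anything meaningful. The harness actually used for the published figures is not described on this page.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def send_request() -> None:
    """Hypothetical placeholder for one API call; replace with a real request."""
    time.sleep(0.01)


def timed_call(fn) -> float:
    """Run fn once and return its latency in milliseconds."""
    start = time.perf_counter()
    fn()
    return (time.perf_counter() - start) * 1000


def sweep(fn, concurrency: int, requests_per_worker: int = 5) -> list[float]:
    """Issue requests from `concurrency` parallel workers; return all latencies in ms."""
    def worker(_):
        return [timed_call(fn) for _ in range(requests_per_worker)]

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        batches = list(pool.map(worker, range(concurrency)))
    return [latency for batch in batches for latency in batch]


for level in (1, 5):
    latencies = sweep(send_request, level)
    print(f"{level} concurrent: p50 {median(latencies):.0f}ms over {len(latencies)} requests")
```

Threads are used here because the work is I/O-bound; each worker sends its requests back to back, so the number of in-flight requests stays at the chosen concurrency level.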

Verify it yourself

Run a quick reproducibility check from your APAC host

Use your own API key and machine to compare TTFT and total response time against published results.

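# Run from any APAC server.
# %{time_starttransfer} is curl's time to the first response byte, so on this
# non-streaming request the "TTFT" line is only a rough proxy for streaming TTFT.
# The published snapshot uses a 200-token output profile; adjust max_tokens to match it.
curl -o /dev/null -s -w "TTFT: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  -X POST https://api.brightnode.cloud/v1/chat/completions \
  -H "Authorization: Bearer $BRIGHTNODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"meta-llama/Llama-3.3-70B-Instruct","messages":[{"role":"user","content":"Hello APAC"}],"max_tokens":128}'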
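A single run says little about percentiles, so a fair comparison with the published figures means repeating the command and aggregating the results. As a sketch, the helper below parses the two lines printed by the `-w` format string in the curl command into milliseconds. The function name and the sample text are illustrative; real values will depend on your host and network.

```python
import re


def parse_curl_timing(output: str) -> dict[str, float]:
    """Parse 'TTFT: <seconds>s' and 'Total: <seconds>s' lines into milliseconds."""
    timings = {}
    pattern = r"^(TTFT|Total):\s*([0-9.]+)s\s*$"
    for label, seconds in re.findall(pattern, output, re.MULTILINE):
        timings[label] = round(float(seconds) * 1000, 3)
    missing = {"TTFT", "Total"} - set(timings)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return timings


# Made-up sample output in the shape the curl command prints (not a measured result)
sample = "TTFT: 0.101234s\nTotal: 0.215678s\n"
print(parse_curl_timing(sample))
```

Feeding the parsed values from many runs into a percentile calculation gives p50 and p95 figures that can be set against the tables on this page.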