APAC model router

The router for teams who are done paying the APAC latency tax.

Global routers are built around network realities that remain US-first, and APAC teams pay for it on every request. Brightnode runs open-weight models in Singapore today and is building direct routes to proprietary model providers in Sydney and Singapore, giving regional production apps a faster path to useful output without the APAC latency tax.

3.7–8.9x faster than global routers, measured from Singapore
94ms TTFT p50 — time to first token consistently under 100ms
0% error rate vs 3.3% on global routers
Stable under concurrency: 178ms p50 at 5x parallel load
Operator view

Global routers solved model access. They did not solve APAC geography.

Brightnode routes traffic into APAC-native inference paths instead of defaulting to US-shaped infrastructure.

That is why the product surface starts with benchmark proof, then drills into routing logic and model-region coverage.

OpenAI-compatible endpoint
curl https://api.brightnode.cloud/v1/chat/completions \
  -H "Authorization: Bearer $BRIGHTNODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-sonnet-4", "messages": [...]}'
Benchmark note
Performance data lives on /performance

The router page keeps routing context and model-region coverage. Full TTFT and latency methodology, with complete tables, is centralized on the performance page.

See full methodology on /performance
Routes
Sydney + Singapore first

Proprietary routes in APAC plus Brightnode-hosted open models. More regions expand from a local-first base.

Positioning
Global router vs APAC router

Same category. Different geography. Different benchmark profile. Different production outcome.

What teams get
One API for proprietary and Brightnode-hosted open models.
Performance snapshot shows substantial TTFT and latency improvements on APAC routes.
A migration path from API routing into hosted inference and custom workloads.
See full methodology on /performance
TTFT p50
94ms

8.9x faster than global routers

End-to-end latency p50
199ms

3.7x faster at 200-token output

Error rate
0%

Global routers: 3.3% error rate

Concurrency stable
178ms at 5x

Consistent under parallel load

How the router actually works

Brightnode is not just passing requests through. It is deciding where inference should happen and making sure APAC teams do not carry that operational burden in app code.

1. Request enters one API

Your app hits a single OpenAI-compatible endpoint. No provider-specific auth, no region-specific SDK logic, no switching costs in application code.

2. Brightnode applies routing policy

We resolve the target model, choose the best APAC route, and account for benchmark profile, latency sensitivity, and regional preference.

3. Inference runs close to the user

Proprietary model traffic routes to Sydney or Singapore. Open-weight models can run on Brightnode-hosted inference in-region. Same platform, same account.

4. Fallbacks preserve uptime

If a route degrades, Brightnode can shift traffic across supported providers and regions without asking your app to know the difference.
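The four steps above can be sketched, purely for illustration, as a routing function. The names here (`choose_route`, the `ROUTES` table, the region labels) are hypothetical and not Brightnode internals; the real policy also weighs benchmark profile and latency sensitivity.

```python
# Illustrative sketch only — not Brightnode's actual routing policy.
ROUTES = {
    # model prefix -> ordered candidate regions (primary first, fallback next)
    "anthropic/": ["sydney", "singapore"],
    "meta-llama/": ["singapore"],  # Brightnode-hosted open weights in-region
}

def choose_route(model: str, degraded: set = frozenset()) -> str:
    """Pick the first healthy APAC region for a model, falling back as needed."""
    for prefix, regions in ROUTES.items():
        if model.startswith(prefix):
            for region in regions:
                if region not in degraded:
                    return region
            break
    raise RuntimeError(f"no healthy route for {model}")
```

The key property is the last step: if the primary region is degraded, the fallback happens inside the router, and the calling app never sees a different endpoint.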

What you can build with the Router

One API for chat, code, RAG, and agents. Same interface everywhere; we handle region and provider.

Chat & assistants

Build customer support bots, internal copilots, and conversational apps. Route to Claude or open models with one API, no provider lock-in.

Claude, open chat models

Code & agents

Agent frameworks, coding assistants, and tool-calling backends. Low latency in APAC means faster tool calls and smoother multi-step workflows.

Claude, GLM, agent-optimized models

RAG & search

Retrieval-augmented generation and semantic search. Keep embeddings and inference in-region for compliance and speed.

Embeddings + chat in same region

Multimodal

Image-in, image-out, and vision models. Route to proprietary model endpoints or our GPU stack without crossing oceans for each request.

Vision, image generation

Agentic workflows

Multi-step reasoning, tool use, and long-horizon tasks. Same OpenAI-compatible API your agent framework already uses.

Tool-calling, long context

Use the same /v1/chat/completions and /v1/embeddings endpoints across all use cases.
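As a sketch of that shared surface, both endpoints take the same base URL and key. This uses only the standard library; the hosted embedding model id `BAAI/bge-m3` is an assumption — check the model list in the docs for exact ids.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.brightnode.cloud/v1"

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated request against the OpenAI-compatible API."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('BRIGHTNODE_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

# Chat and embeddings differ only in path and payload:
chat_req = build_request("/chat/completions", {
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello from APAC."}],
})
embed_req = build_request("/embeddings", {
    "model": "BAAI/bge-m3",  # hosted model id shown as an assumption
    "input": ["in-region retrieval text"],
})
# urllib.request.urlopen(chat_req) would perform the actual call.
```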

Models & regions

Proprietary and Brightnode-hosted models are available in APAC today, with additional regions and model families rolling out.

Sydney

Proprietary models

  • Amazon Nova Lite
  • Amazon Nova Micro
  • Amazon Nova Pro
  • Amazon Titan Embed Image v1
  • Amazon Titan Embed Text v2

plus 51 more models in this region

Best for: Production chat, agents, long context

Singapore

Proprietary models

  • Amazon Nova 2 Lite
  • Amazon Nova Lite
  • Amazon Nova Micro
  • Amazon Nova Pro
  • Claude 3 Haiku

plus 29 more models in this region

Best for: Southeast Asia latency, data residency

Singapore

Brightnode-hosted

  • BGE Base English v1.5
  • BGE Large English v1.5
  • BGE-M3 (Multilingual)
  • Gemma 3 12B Instruct
  • Gemma 3 27B Instruct

plus 32 more models in this region

Best for: Cost-effective inference, full control, same API

Specify model in your request; we route to the right region and provider automatically.

Why an APAC-first router?

Global routers are great, but many are US-focused. Here’s how we’re different for APAC.

Latency

Global routers

US/EU endpoints add 600–800ms+ round-trip for APAC users. p95 latency tops 1.6s.

Brightnode

Singapore-first. Sub-200ms end-to-end, sub-100ms TTFT. Up to 9.8x faster at p95.

Data residency

Global routers

Data may traverse US or EU. Compliance and sovereignty concerns in APAC.

Brightnode

Singapore infrastructure today. Data stays in-region. Additional proprietary APAC routes coming soon.

Inference + routing

Global routers

Router only; you still need separate GPU capacity or inference elsewhere.

Brightnode

We run open-weight models and route to proprietary providers. One platform, one bill.

Built for APAC

Global routers

Optimized for US/EU. APAC is an afterthought.

Brightnode

APAC-first, built for startups and enterprises across the region.

Get started in minutes

OpenAI-compatible API. Swap the base URL and use your existing code.

1. Get your API key

Sign up at the console, create an API key, and add credits. No long-term contract.

Console → API keys
2. Point your client to Brightnode
# Python (OpenAI SDK)
from openai import OpenAI
client = OpenAI(
  base_url="https://api.brightnode.cloud/v1",
  api_key="YOUR_BRIGHTNODE_API_KEY",
)
response = client.chat.completions.create(
  model="meta-llama/Llama-3.3-70B-Instruct",
  messages=[{"role": "user", "content": "Hello from APAC."}],
)

Same for Node, curl, or any OpenAI-compatible client. We support streaming and embeddings.
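Streaming uses the OpenAI SDK's standard interface. A minimal sketch, with the helper name `stream_completion` being ours rather than part of any SDK:

```python
def stream_completion(client, model: str, prompt: str) -> str:
    """Stream a chat completion via an OpenAI-compatible client,
    printing tokens as they arrive and returning the full text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # tokens arrive as incremental chunks
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. role markers) carry no text
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```

With the client from step 2, `stream_completion(client, "meta-llama/Llama-3.3-70B-Instruct", "Hello from APAC.")` prints the reply as it is generated instead of waiting for the full response.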

3. Docs & limits

Full API reference, model list, and rate limits are in our docs. For agent frameworks, just change the base URL.

Documentation