APAC model router

The router for teams who are done paying the APAC latency tax.

Global routers are built around network realities that remain US-first, and APAC teams pay for it on every request. Brightnode runs open-weight models in Singapore today and is building direct routes to proprietary model providers in Sydney and Singapore, giving regional production apps a faster path to useful output without the APAC latency tax.

3.7–8.9x faster than global routers, measured from Singapore
94ms TTFT p50 — time to first token consistently under 100ms
0% error rate vs 3.3% on global routers
Stable under concurrency: 178ms p50 at 5x parallel load
Operator view

Global routers solved model access. They did not solve APAC geography.

Brightnode routes traffic into APAC-native inference paths instead of defaulting to US-shaped infrastructure.

That is why the product surface starts with benchmark proof, then drills into routing logic and model-region coverage.

OpenAI-compatible endpoint
curl https://api.brightnode.cloud/v1/chat/completions \
  -H "Authorization: Bearer $BRIGHTNODE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-sonnet-4", "messages": [...]}'
Benchmark note
Performance data lives on /performance

The router page keeps routing context and model-region coverage. Full TTFT and latency methodology, with complete tables, is centralized on the performance page.

See full methodology on /performance
Routes
Sydney + Singapore first

Proprietary routes in APAC plus Brightnode-hosted open models. More regions expand from a local-first base.

Positioning
Global router vs APAC router

Same category. Different geography. Different benchmark profile. Different production outcome.

What teams get
One API for proprietary and Brightnode-hosted open models.
Performance snapshot shows substantial TTFT and latency improvements on APAC routes.
A migration path from API routing into hosted inference and custom workloads.
See full methodology on /performance
TTFT p50
94ms

8.9x faster than global routers

End-to-end latency p50
199ms

3.7x faster at 200-token output

Error rate
0%

Global routers: 3.3% error rate

Concurrency stable
178ms at 5x

Consistent under parallel load

How the router actually works

Brightnode is not just passing requests through. It is deciding where inference should happen and making sure APAC teams do not carry that operational burden in app code.

1. Request enters one API

Your app hits a single OpenAI-compatible endpoint. No provider-specific auth, no region-specific SDK logic, no switching costs in application code.

2. Brightnode applies routing policy

We resolve the target model, choose the best APAC route, and account for benchmark profile, latency sensitivity, and regional preference.

3. Inference runs close to the user

Proprietary model traffic routes to Sydney or Singapore. Open-weight models can run on Brightnode-hosted inference in-region. Same platform, same account.

4. Fallbacks preserve uptime

If a route degrades, Brightnode can shift traffic across supported providers and regions without asking your app to know the difference.
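The four steps above can be sketched, purely for illustration, as a routing function. The names here (`choose_route`, the `ROUTES` table, the region labels) are hypothetical and not Brightnode internals; the real policy also weighs benchmark profile and latency sensitivity.

```python
# Illustrative sketch only — not Brightnode's actual routing policy.
ROUTES = {
    # model prefix -> ordered candidate regions (primary first, fallback next)
    "anthropic/": ["sydney", "singapore"],
    "meta-llama/": ["singapore"],  # Brightnode-hosted open weights in-region
}

def choose_route(model: str, degraded: set = frozenset()) -> str:
    """Pick the first healthy APAC region for a model, falling back as needed."""
    for prefix, regions in ROUTES.items():
        if model.startswith(prefix):
            for region in regions:
                if region not in degraded:
                    return region
            break
    raise RuntimeError(f"no healthy route for {model}")
```

The key property is the last step: if the primary region is degraded, the fallback happens inside the router, and the calling app never sees a different endpoint.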

What you can build with the Router

One API for chat, code, RAG, and agents. Same interface everywhere; we handle region and provider.

Chat & assistants

Build customer support bots, internal copilots, and conversational apps. Route to Claude or open models with one API, no provider lock-in.

Claude, open chat models

Code & agents

Agent frameworks, coding assistants, and tool-calling backends. Low latency in APAC means faster tool calls and smoother multi-step workflows.

Claude, GLM, agent-optimized models

RAG & search

Retrieval-augmented generation and semantic search. Keep embeddings and inference in-region for compliance and speed.

Embeddings + chat in same region

Multimodal

Image-in, image-out, and vision models. Route to proprietary model endpoints or our GPU stack without crossing oceans for each request.

Vision, image generation

Agentic workflows

Multi-step reasoning, tool use, and long-horizon tasks. Same OpenAI-compatible API your agent framework already uses.

Tool-calling, long context

Use the same /v1/chat/completions and /v1/embeddings endpoints across all use cases.
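As a sketch of that shared surface, both endpoints take the same base URL and key. This uses only the standard library; the hosted embedding model id `BAAI/bge-m3` is an assumption — check the model list in the docs for exact ids.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.brightnode.cloud/v1"

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated request against the OpenAI-compatible API."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('BRIGHTNODE_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

# Chat and embeddings differ only in path and payload:
chat_req = build_request("/chat/completions", {
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Hello from APAC."}],
})
embed_req = build_request("/embeddings", {
    "model": "BAAI/bge-m3",  # hosted model id shown as an assumption
    "input": ["in-region retrieval text"],
})
# urllib.request.urlopen(chat_req) would perform the actual call.
```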

Models & regions

Proprietary and Brightnode-hosted models are available in APAC today, with additional regions and model families rolling out.

Sydney

Proprietary models

  • Amazon Nova Lite
  • Amazon Nova Micro
  • Amazon Nova Pro
  • Amazon Titan Embed Image v1
  • Amazon Titan Embed Text v2

plus 51 more models in this region

Best for: Production chat, agents, long context

Singapore

Proprietary models

  • Amazon Nova 2 Lite
  • Amazon Nova Lite
  • Amazon Nova Micro
  • Amazon Nova Pro
  • Claude 3 Haiku

plus 29 more models in this region

Best for: Southeast Asia latency, data residency

Singapore

Brightnode-hosted

  • BGE Base English v1.5
  • BGE Large English v1.5
  • BGE-M3 (Multilingual)
  • Gemma 3 12B Instruct
  • Gemma 3 27B Instruct

plus 32 more models in this region

Best for: Cost-effective inference, full control, same API

Specify model in your request; we route to the right region and provider automatically.

Why an APAC-first router?

Global routers are great, but many are US-focused. Here’s how we’re different for APAC.

Latency

Global routers

US/EU endpoints add 600–800ms+ round-trip for APAC users. p95 latency tops 1.6s.

Brightnode

Singapore-first. Sub-200ms end-to-end, sub-100ms TTFT. Up to 9.8x faster at p95.

Data residency

Global routers

Data may traverse US or EU. Compliance and sovereignty concerns in APAC.

Brightnode

Singapore infrastructure today. Data stays in-region. Additional proprietary APAC routes coming soon.

Inference + routing

Global routers

Router only; you still need separate GPU capacity or inference elsewhere.

Brightnode

We run open-weight models and route to proprietary providers. One platform, one bill.

Built for APAC

Global routers

Optimized for US/EU. APAC is an afterthought.

Brightnode

APAC-first, built for startups and enterprises across the region.

Get started in minutes

OpenAI-compatible API. Swap the base URL and use your existing code.

1. Get your API key

Sign up at the console, create an API key, and add credits. No long-term contract.

Console → API keys
2. Point your client to Brightnode
# Python (OpenAI SDK)
from openai import OpenAI
client = OpenAI(
  base_url="https://api.brightnode.cloud/v1",
  api_key="YOUR_BRIGHTNODE_API_KEY",
)
response = client.chat.completions.create(
  model="meta-llama/Llama-3.3-70B-Instruct",
  messages=[{"role": "user", "content": "Hello from APAC."}],
)

Same for Node, curl, or any OpenAI-compatible client. We support streaming and embeddings.
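Streaming uses the OpenAI SDK's standard interface. A minimal sketch, with the helper name `stream_completion` being ours rather than part of any SDK:

```python
def stream_completion(client, model: str, prompt: str) -> str:
    """Stream a chat completion via an OpenAI-compatible client,
    printing tokens as they arrive and returning the full text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # tokens arrive as incremental chunks
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks (e.g. role markers) carry no text
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)
```

With the client from step 2, `stream_completion(client, "meta-llama/Llama-3.3-70B-Instruct", "Hello from APAC.")` prints the reply as it is generated instead of waiting for the full response.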

3. Docs & limits

Full API reference, model list, and rate limits are in our docs. For agent frameworks, just change the base URL.

Documentation