APAC Inference API

The fastest inference API for APAC.

50+ models. Sub-100ms TTFT from Singapore. Data stays in-region.

TTFT p50: 94ms
Speedup: 8.9x
Error rate: 0%

Built with teams shipping AI in APAC

Performance-focused workflows
app.py
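# Before: US-routed, 300ms+ latency from APAC
import os

from openai import OpenAI

# After: Change one line (base_url). Sub-100ms from Singapore.
client = OpenAI(
  base_url="https://api.brightnode.cloud/v1",
  # Read the key from the environment; "$BRIGHTNODE_API_KEY" inside a Python string is not expanded.
  api_key=os.environ["BRIGHTNODE_API_KEY"],
)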

response = client.chat.completions.create(
  model="anthropic/claude-sonnet-4",
  messages=[{"role": "user", "content": "Hello"}],
)
OpenAI-compatible API · Region: Singapore
Benchmark proof

Compact proof from Singapore

TTFT and latency are shown side-by-side with reliability metrics. Full reproducible methodology lives on `/performance`.

Metric | Global | Brightnode
End-to-end latency p50 | 733ms | 199ms
TTFT p50 (streaming) | 840ms | 94ms
End-to-end latency p95 | 1,643ms | 318ms
TTFT p95 (streaming) | 1,393ms | 142ms
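
The table reports p50 and p95 time to first token (TTFT) for streaming requests. The authoritative methodology is the one on `/performance`; the following is only a minimal sketch of how such numbers could be collected, assuming TTFT means the time from issuing the request to the first non-empty streamed text chunk and that percentiles use the nearest-rank method. The names `time_to_first_token`, `percentile`, and `open_stream` are hypothetical and do not come from the Brightnode API.

```python
import math
import time
from typing import Callable, Iterable, Optional, Sequence


def time_to_first_token(
    open_stream: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[float]:
    """Milliseconds from issuing the request to the first non-empty text chunk.

    `open_stream` is a hypothetical callable that sends the request and
    returns an iterable of text pieces. The clock starts before it is
    called so that connection and routing time are included.
    Returns None if the stream ends without producing any text.
    """
    start = clock()
    for text in open_stream():
        if text:
            return (clock() - start) * 1000.0
    return None


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (an assumption; other definitions interpolate)."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Example with made-up sample values, not benchmark data.
ttft_ms = [10.0, 20.0, 30.0, 40.0]
print(percentile(ttft_ms, 50))  # 20.0
print(percentile(ttft_ms, 95))  # 40.0
```

With the OpenAI client, `open_stream` could be a small generator that calls `client.chat.completions.create(..., stream=True)` and yields each chunk's `delta.content`. Running it many times per endpoint and passing the results to `percentile` gives p50 and p95 figures comparable in spirit to the table.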
Product impact (Singapore)

8.9x faster first token where users feel it.

These benchmark numbers translate into user-facing impact: responses that feel faster, smaller p95 spikes, and stronger reliability under real traffic patterns.

First token wait reduced: 746ms saved/request

TTFT p50 from 840ms to 94ms (8.9x faster)

Tail latency reduced: 1,251ms saved/request

TTFT p95 from 1,393ms to 142ms (9.8x faster)

Reliability in benchmark run: 0% observed errors

Global baseline observed 3.3% errors

Benchmarked from Singapore against global router baselines. Full reproducible methodology is published on the performance page.
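
The "saved per request" and "x faster" figures above follow directly from the TTFT values in the benchmark table. As a quick check of that arithmetic, here is a small sketch; the function name `improvement` is illustrative only.

```python
def improvement(baseline_ms: float, new_ms: float) -> tuple:
    """Return (milliseconds saved per request, speedup factor rounded to 1 decimal)."""
    saved = baseline_ms - new_ms
    speedup = round(baseline_ms / new_ms, 1)
    return saved, speedup


# TTFT p50: global 840ms vs. Brightnode 94ms
print(improvement(840, 94))    # (746, 8.9)
# TTFT p95: global 1,393ms vs. Brightnode 142ms
print(improvement(1393, 142))  # (1251, 9.8)
```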

Model catalog

Popular production models across APAC. One API. Data stays in-region.

Claude Sonnet, Llama, Qwen, DeepSeek, and more with current per-1M pricing, context, region, and latency in one view.

Model | Provider | Input / 1M | Output / 1M | Context | APAC regions | Latency (SG/TYO/SYD) | Residency
Claude Sonnet 4 | Anthropic | $3 | $15 | 200,000 | Singapore, Sydney, Tokyo, Thailand, Malaysia, Jakarta, New Zealand, Seoul, Taiwan, Mumbai | 83ms / 98ms / 74ms | in-region
Claude Haiku 4.5 | Anthropic | $1 | $5 | 200,000 | Singapore, Jakarta, Malaysia, Thailand, Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 59ms / 71ms / 61ms | in-region
Llama 3.3 70B Instruct | Meta | $0.22 | $0.50 | 131,072 | Singapore | 27ms / 41ms / 34ms | in-region
DeepSeek V3 | DeepSeek | $0.60 | $1.74 | 163,840 | Jakarta, Singapore, Malaysia, Thailand, Tokyo, Seoul, Taiwan, Mumbai, Sydney, New Zealand | 40ms / 58ms / 51ms | in-region
Qwen3 32B | Qwen | $0.10 | $1.20 | 131,072 | Singapore | 31ms / 45ms / 39ms | in-region
Mistral Nemo | Mistral | $0.15 | $0.15 | 131,072 | Singapore | 38ms / 54ms / 47ms | in-region
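
Because the catalog lists prices per 1M tokens, the cost of a single request can be estimated from its token counts. The sketch below copies the input and output prices from the catalog above and assumes billing is linear in tokens with no minimums, caching discounts, or other adjustments; the dictionary and function names are illustrative, not part of the Brightnode API.

```python
# (input USD per 1M tokens, output USD per 1M tokens), copied from the catalog.
PRICES_PER_1M = {
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Llama 3.3 70B Instruct": (0.22, 0.50),
    "DeepSeek V3": (0.60, 1.74),
    "Qwen3 32B": (0.10, 1.20),
    "Mistral Nemo": (0.15, 0.15),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming simple linear per-token billing."""
    input_price, output_price = PRICES_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
for name in PRICES_PER_1M:
    print(f"{name}: ${request_cost_usd(name, 2_000, 500):.5f}")
```

Token counts for a real request are available in the `usage` field of an OpenAI-compatible chat completion response.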
Try now

Embedded playground preview

Demo mode works instantly. Live beta mode can run a real request when you provide an API key.

Output stream

Run a prompt to preview streamed output.

Demo mode uses representative text. Live beta mode performs a direct request via the public API proxy.

Get started in 3 lines
Full docs
Python
from openai import OpenAI
client = OpenAI(
  base_url="https://api.brightnode.cloud/v1",
  api_key="YOUR_BRIGHTNODE_API_KEY",
)
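stream = client.chat.completions.create(
  model="meta-llama/Llama-3.3-70B-Instruct",
  stream=True,
  messages=[{"role": "user", "content": "Hello APAC"}],
)
# Print each streamed text delta as it arrives.
for chunk in stream:
  if chunk.choices and chunk.choices[0].delta.content:
    print(chunk.choices[0].delta.content, end="", flush=True)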
Node
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.brightnode.cloud/v1",
  apiKey: process.env.BRIGHTNODE_API_KEY,
});
const completion = await client.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct",
  messages: [{ role: "user", content: "Hello APAC" }],
});
Platform Architecture

Choose the path that matches your stage.

Router for serving, managed models for fast launch, and Workspaces for pre-production development.

Inference Router

Route API traffic to APAC-first inference paths with OpenAI-compatible requests and clear model controls.

Dedicated Endpoints

Deploy reserved APAC inference capacity for custom checkpoints, LoRA workflows, and enterprise traffic.

GPU Workspaces

Fine-tune and evaluate in APAC, then deploy to Brightnode Inference through the same product workflow.

Join the First 100 Startup Teams

Early access for approved startups. We're onboarding teams across Singapore, Jakarta, and Bangkok.

Priority Support: Direct Discord access with founders
Enhanced Credits: $200 initial credit (instead of $100), then $50 credit per month ongoing
Founding Member Pricing: Lock in early rates forever
Input on Roadmap: Shape what we build next

Spots remaining: 85
Last onboarding: Feb 13th
Join Early Access