GPU Infrastructure Partnership Programme

Your idle GPUs.
Our inference workloads.

BrightNode brings the developers. You provide the compute. Revenue flows when GPUs are utilised, with zero commitment required to start.

Representative economics

Example at Tier 1: an 8x A100 80GB cluster at 60% utilisation (roughly 3,500 GPU-hours/month) can generate roughly USD $6,300-$9,100/month at the indicative Tier 1 rates below, before power, space, and hardware costs. Actual realised rates vary by workload mix, region, and latency profile.
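
As a back-of-envelope sketch, the arithmetic at the indicative Tier 1 A100 rates works out as follows. The 730 hours/month figure and a direct per-GPU-hour payout are assumptions; real partner economics depend on revenue share, workload mix, region, and latency profile.

```python
# Back-of-envelope partner revenue sketch (illustrative only).
# Assumes 730 hours/month and the indicative Tier 1 on-demand rates
# quoted in this document.

HOURS_PER_MONTH = 730  # assumed average (8,760 hours / 12 months)

def monthly_revenue(gpus: int, utilisation: float, rate_usd_per_gpu_hr: float) -> float:
    """Gross cluster revenue before power, space, and hardware costs."""
    return gpus * HOURS_PER_MONTH * utilisation * rate_usd_per_gpu_hr

# 8x A100 80GB at 60% utilisation, indicative Tier 1 A100 band:
low = monthly_revenue(8, 0.60, 1.80)   # ~USD 6,307
high = monthly_revenue(8, 0.60, 2.60)  # ~USD 9,110
print(f"A100 cluster: USD {low:,.0f} - {high:,.0f} per month")
```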

5 priority markets

APAC regions

Zero

Commitment to start

Tier 1 option

Revenue share model

~8 weeks

Time to production

The opportunity

APAC inference demand is
accelerating fast

Enterprise and developer adoption of LLMs, embedding models, and multimodal AI is creating sustained demand for GPU compute, particularly NVIDIA A100, H100, L4, and T4 accelerators.

Much of the GPU capacity deployed in the region sits underutilised outside peak training windows. BrightNode channels real-time inference workloads to partner infrastructure, monetising capacity that would otherwise generate zero return.

BrightNode brings the workloads

We manage all customer relationships, billing, and metering.

You provide the GPUs

Power, cooling, and hardware: that's all we need from you.

Revenue flows on utilisation

No upfront cost to partners. You earn when your GPUs serve inference.

Who is BrightNode

Operator-led, APAC focused

BrightNode is a Singapore-based AI infrastructure company founded by a repeat Southeast Asia technology operator who previously built and sold Aquient Pte Ltd. We run inference infrastructure on Google Cloud today and are expanding our APAC backbone through data centre partnerships.

Current traction

Live workloads, not a slide deck

BrightNode already serves production inference traffic from Singapore across chat, embeddings, transcription, and multimodal use cases. Our platform includes a 100+ model catalog and benchmarked live serving paths, giving partners immediate workload categories to absorb.

Partnership tiers

Start on-demand, scale with confidence

Designed for minimal risk on both sides. Most partnerships begin at Tier 1 and transition to Tier 2 within 3–6 months based on utilisation data.

Indicative rates shown for modeling. Final pricing reflects market conditions, geography, latency requirements, and workload profile.

Tier 1

On-Demand

Start immediately with zero commitment. BrightNode sends workloads when your GPUs are available.

Pricing

Indicative Tier 1: A100 80GB at USD $1.80-$2.60/GPU-hr, H100 80GB at USD $3.20-$4.90/GPU-hr

Commitment

None, month-to-month, pause any time

Most Common

Tier 2

Reserved Capacity

BrightNode commits to a minimum monthly GPU-hour volume. You guarantee capacity availability.

Pricing

15-25% below Tier 1 effective rates (monthly commit)

Commitment

3–6 month terms, 500–2,000 GPU-hrs/month minimum

Tier 3

Dedicated Allocation

Specific GPU nodes allocated exclusively to BrightNode workloads. Always on, with the highest predictability.

Pricing

30-45% below Tier 1 effective rates (fixed dedicated capacity)

Commitment

6–12 month terms, fixed node count
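
To make the tier discounts concrete, here is a sketch deriving effective A100 rate bands from the indicative Tier 1 band; the widest-band treatment of the discount ranges is an assumption, and final pricing is negotiated per partnership.

```python
# Effective GPU-hour rates by tier, derived from the indicative Tier 1
# A100 80GB band (USD $1.80-$2.60/GPU-hr). Discount ranges come from
# this document; final pricing is negotiated per partnership.

def effective_rate(tier1_rate: float, discount: float) -> float:
    """Apply a tier discount to an indicative Tier 1 rate (USD/GPU-hr)."""
    return round(tier1_rate * (1 - discount), 2)

# Tier 2: 15-25% below Tier 1. Widest band: max discount on the low end,
# min discount on the high end.
tier2 = (effective_rate(1.80, 0.25), effective_rate(2.60, 0.15))
# Tier 3: 30-45% below Tier 1.
tier3 = (effective_rate(1.80, 0.45), effective_rate(2.60, 0.30))

print(f"Tier 2 A100: USD {tier2[0]}-{tier2[1]}/GPU-hr")  # USD 1.35-2.21/GPU-hr
print(f"Tier 3 A100: USD {tier3[0]}-{tier3[1]}/GPU-hr")  # USD 0.99-1.82/GPU-hr
```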

Why partner

Built for both sides to win

Monetise idle GPUs

Turn unused GPU capacity into a revenue stream with zero sales effort. BrightNode brings the customers and workloads.

Zero commitment to start

Begin with on-demand access at Tier 1. No minimum volumes, no contracts. Scale only when the data justifies it.

Predictable demand growth

AI inference is a sustained, growing workload, not a one-off burst. As our developer base compounds, so does your utilisation.

No operational overhead

BrightNode manages all software, model deployment, scaling, monitoring, and billing. You provide power, cooling, and hardware.

APAC-first positioning

Ride the wave of AI adoption in Southeast Asia, India, Japan, and Australia. BrightNode is building the inference backbone for the region.

Transparent metering

Full visibility into GPU utilisation, workload types, and revenue via a partner dashboard. Usage events are signed and retained in immutable audit logs for partner reconciliation.
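
The exact signing scheme is not specified here; one plausible shape is an HMAC-SHA256 signature over a canonical JSON serialisation of each usage event, which a partner can recompute independently during reconciliation. The field names, key handling, and algorithm choice below are all illustrative assumptions.

```python
import hashlib
import hmac
import json

# Illustrative usage-event signing sketch. The document only states that
# usage events are signed and retained in immutable audit logs; the scheme
# below (HMAC-SHA256 over canonical JSON) is one plausible implementation.

def sign_event(event: dict, key: bytes) -> str:
    """Sign a usage event over its canonical JSON serialisation."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, canonical.encode(), hashlib.sha256).hexdigest()

def verify_event(event: dict, signature: str, key: bytes) -> bool:
    """Partner-side reconciliation: recompute and compare in constant time."""
    return hmac.compare_digest(sign_event(event, key), signature)

event = {
    "partner_id": "dc-sgp-01",      # illustrative identifiers
    "gpu": "A100-80GB",
    "gpu_hours": 3.25,
    "window_end": "2026-01-31T00:00:00Z",
}
key = b"shared-secret"              # in practice, a managed per-partner key
sig = sign_event(event, key)
assert verify_event(event, sig, key)
```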

Technical requirements

Minimal friction, maximum compatibility

Our platform is designed to integrate with partner infrastructure quickly. The core requirements below cover hardware, connectivity, and operations.

Supported GPU hardware

GPU                 | Primary use case                                 | Min VRAM    | Priority
NVIDIA H100         | Large language models, high-throughput inference | 80 GB HBM3  | High
NVIDIA A100 (80GB)  | Large language models (70B+), training runs      | 80 GB HBM2e | High
NVIDIA L4           | Mid-size models (7–32B), embeddings, Whisper     | 24 GB GDDR6 | Medium
NVIDIA T4           | Embeddings, small models, audio transcription    | 16 GB GDDR6 | Medium
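
One way a partner might pre-check inventory against this matrix is by parsing the output of `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`. The sample output and the name-matching logic below are illustrative:

```python
# Map a partner's GPU inventory onto the supported-hardware matrix.
# The sample output is illustrative; on a real host it would be captured
# via:  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader

PRIORITY = {        # from the supported-hardware table
    "H100": "High",
    "A100": "High",
    "L4": "Medium",
    "T4": "Medium",
}

sample = """NVIDIA A100 80GB PCIe, 81920 MiB
NVIDIA L4, 23034 MiB"""

def classify(nvidia_smi_csv: str) -> list:
    """Return (name, memory, priority) for each reported GPU."""
    rows = []
    for line in nvidia_smi_csv.strip().splitlines():
        name, mem = (part.strip() for part in line.split(","))
        prio = next((p for model, p in PRIORITY.items() if model in name),
                    "Unsupported")
        rows.append((name, mem, prio))
    return rows

for name, mem, prio in classify(sample):
    print(f"{name}: {mem} -> {prio}")
```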

Connectivity

  • Secure network link to BrightNode control plane (VPN, private interconnect, or direct peering)
  • Kubernetes-compatible orchestration (we can deploy our own K8s or integrate with yours)
  • Low-latency path to APAC end users (Singapore, Mumbai, Tokyo, Sydney preferred)
  • NVMe or high-speed SSD for model weight caching (2 GB – 150 GB per model)
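
The 2 GB – 150 GB caching range roughly tracks parameter count times bytes per parameter. A sketch of that estimate, with illustrative model sizes and quantisation levels:

```python
# Rough model-weight cache sizing: parameters x bytes-per-parameter.
# The example model sizes and dtype choices are illustrative; the estimate
# ignores tokenizer, config, and safetensors overhead.

BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

def weight_gb(params_billions: float, dtype: str = "fp16") -> float:
    """Approximate on-disk weight footprint in GB."""
    return round(params_billions * BYTES_PER_PARAM[dtype], 1)

print(weight_gb(7))           # 7B fp16  -> ~14.0 GB
print(weight_gb(70))          # 70B fp16 -> ~140.0 GB, near the top of the range
print(weight_gb(70, "int4"))  # 70B int4 -> ~35.0 GB
```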

Operational requirements

  • 24/7 availability for Tier 2 and Tier 3 commitments (Tier 1 can be interruptible)
  • 4-hour hardware fault response SLA target for Tier 3
  • Power and cooling to spec (6–10 kW per node for A100/H100)
  • Physical security and access controls consistent with enterprise data centre standards
Workload types

What runs on your hardware

BrightNode routes these workload categories to partner infrastructure based on GPU type and availability.

LLM Inference

Primary

Real-time chat completions and text generation using open-weight models (Llama, Qwen, Mistral, DeepSeek). Served via vLLM. Highest volume category.
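
vLLM serves an OpenAI-compatible HTTP API, so routed chat traffic takes the familiar /v1/chat/completions shape. The model name and endpoint in this sketch are placeholders, not actual BrightNode addresses:

```python
import json

# Sketch of an OpenAI-style chat completion request body, as vLLM's
# OpenAI-compatible server expects it. The model name below is a
# placeholder; deployed models come from the platform's model catalog.

def chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build a /v1/chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

body = json.dumps(chat_request("meta-llama/Llama-3.1-8B-Instruct", "Hello!"))
print(body)

# Against a live vLLM endpoint, this body would be POSTed to
# http://<host>:8000/v1/chat/completions with Content-Type: application/json.
```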

Embedding Generation

Vector embedding computation for RAG pipelines and semantic search. Lower GPU intensity, very high request volume.

Audio Transcription

Speech-to-text using Faster Whisper models. Moderate GPU requirement, bursty demand pattern.

Training & Fine-Tuning

Coming

Longer-running, higher-value workloads as the partnership matures. Benefits from Tier 3 dedicated capacity.

Expansion map

Priority regions for 2026 capacity

If you operate in one of these metros, you are in our active partner pipeline for 2026 deployment.

Singapore

Expanding existing capacity

Tokyo

Highest priority new region

Sydney

Active demand coverage

Mumbai

Low-latency India route

Jakarta

Southeast Asia growth

Getting started

From conversation to live traffic

Designed to be fast and low-friction. Most partners are live with production inference traffic within eight weeks.

01

Initial conversation

Week 1

Align on GPU types, capacity, and location. Agree on Tier 1 on-demand terms.

02

Technical assessment

Weeks 2–3

Validate network connectivity, GPU readiness, and storage. BrightNode engineering works with your ops team.

03

Pilot deployment

Week 4

BrightNode deploys a lightweight Kubernetes-based orchestration layer (vLLM serving, model weight caching, health monitoring) and routes initial test workloads to your GPUs. We manage software lifecycle; you manage hardware and connectivity.

04

Production ramp

Weeks 5–8

Begin routing live inference traffic. Monitor utilisation, performance, and revenue together.

05

Commitment review

Month 3

Evaluate utilisation data and discuss Tier 2 transition if volumes warrant it.

Next steps

Have GPU capacity?
Let's put it to work.

Whether you have a handful of idle A100s or an entire GPU cluster, BrightNode can route workloads to your infrastructure. Start with zero commitment at Tier 1.

Technical deep-dives available on request