Qwen · Live

Qwen3 Embedding 8B

Provider: Brightnode-hosted
Context: 32,768 tokens
Pricing: $0.05 / 1M input · $0.05 / 1M output
APAC regions: Singapore
Residency: in-region
Task: Embedding
APAC performance

Latency profile

                   Singapore   Tokyo   Sydney
TTFT p50           18          29      24
TTFT p95           29          45      38
E2E latency p50    34          47      41
E2E latency p95    55          71      63
Pricing

Input: $0.05 per 1M tokens

Output: $0.05 per 1M tokens

Billing: Per-token, charged against wallet balance

Dedicated endpoint option
  • A100 80GB: $4.01/hr (Singapore)
  • H100 80GB: $14.29/hr (Singapore)
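
As a rough illustration of the arithmetic behind these prices, the sketch below estimates the per-token cost of a given volume and the hourly token volume at which a dedicated endpoint costs the same as per-token billing. The function names are hypothetical. The comparison assumes that an embedding request is billed on input tokens only, and it ignores the real throughput of a dedicated GPU, which this page does not state.

```python
# Prices copied from this page; the function names are illustrative only.
INPUT_PRICE_PER_M = 0.05   # USD per 1M input tokens
A100_HOURLY = 4.01         # USD per hour, Singapore
H100_HOURLY = 14.29        # USD per hour, Singapore


def per_token_cost(tokens: int, price_per_million: float = INPUT_PRICE_PER_M) -> float:
    """Cost in USD of embedding `tokens` tokens under per-token billing."""
    return tokens / 1_000_000 * price_per_million


def break_even_tokens_per_hour(
    hourly_rate: float, price_per_million: float = INPUT_PRICE_PER_M
) -> float:
    """Tokens per hour at which per-token billing equals the hourly rate.

    Assumption: only input tokens are billed for embedding requests.
    """
    return hourly_rate / price_per_million * 1_000_000


if __name__ == "__main__":
    print(f"10M tokens: ${per_token_cost(10_000_000):.2f}")
    print(f"A100 break-even: {break_even_tokens_per_hour(A100_HOURLY):,.0f} tokens/hr")
    print(f"H100 break-even: {break_even_tokens_per_hour(H100_HOURLY):,.0f} tokens/hr")
```

Under these assumptions, per-token billing stays cheaper than the A100 endpoint until sustained volume reaches roughly 80 million tokens per hour.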
Capabilities

Text embedding model optimized for retrieval and semantic similarity, with embeddings generated in-region in APAC.

Best for: RAG pipelines, semantic search, document clustering, recommendation systems
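
Semantic search with an embedding model typically comes down to comparing a query vector against document vectors. The sketch below ranks documents by cosine similarity, which is a common choice but not one this page prescribes. The three-dimensional vectors are toy stand-ins for real embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: list[float], doc_vecs: list[list[float]]
) -> list[tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors; in practice these would come from the embeddings endpoint.
    query = [1.0, 0.0, 0.0]
    docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
    for idx, score in rank_documents(query, docs):
        print(idx, round(score, 3))
```

For large collections a vector index would usually replace the linear scan; the scoring idea stays the same.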

Quickstart code snippets
Python
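import os

from openai import OpenAI

# Point the OpenAI SDK at Brightnode's endpoint via base_url.
client = OpenAI(
    base_url="https://api.brightnode.cloud/v1",
    api_key=os.environ["BRIGHTNODE_API_KEY"],  # same variable the Node snippet reads
)

response = client.embeddings.create(
    model="Qwen/Qwen3-Embedding-8B",
    input="Your text to embed",
)

# The embedding for the single input is a list of floats.
embedding = response.data[0].embedding
print(len(embedding))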
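
The Python quickstart embeds a single string. For a RAG or search corpus it is usually more practical to send texts in batches. The sketch below assumes the Brightnode endpoint accepts a list for `input` and returns items carrying `index` and `embedding`, as the OpenAI embeddings API does; this page does not confirm that. The helper names and the batch size of 16 are hypothetical.

```python
from typing import Any, Iterator, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(
    client: Any,
    texts: Sequence[str],
    model: str = "Qwen/Qwen3-Embedding-8B",
    batch_size: int = 16,  # arbitrary illustrative value, not a documented limit
) -> list[list[float]]:
    """Embed every text in order, one request per batch.

    `client` is expected to be an OpenAI SDK client configured as in the
    quickstart. Assumption: the endpoint accepts a list of strings as `input`.
    """
    vectors: list[list[float]] = []
    for batch in batched(texts, batch_size):
        response = client.embeddings.create(model=model, input=list(batch))
        # Sort by index so output order matches input order within the batch.
        ordered = sorted(response.data, key=lambda item: item.index)
        vectors.extend(item.embedding for item in ordered)
    return vectors
```

Calling `embed_all(client, ["first text", "second text"])` with the quickstart client would return one vector per input, in input order.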
Node
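import OpenAI from "openai";

// Point the OpenAI SDK at Brightnode's endpoint via baseURL.
const client = new OpenAI({
  baseURL: "https://api.brightnode.cloud/v1",
  apiKey: process.env.BRIGHTNODE_API_KEY,
});

// Top-level await requires an ES module (for example, a .mjs file).
const response = await client.embeddings.create({
  model: "Qwen/Qwen3-Embedding-8B",
  input: "Your text to embed",
});

// The embedding for the single input is an array of numbers.
const embedding = response.data[0].embedding;
console.log(embedding.length);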