Uzbek LLM Lab's 8B-parameter instruction-tuned model, optimized for the Uzbek language. Built on the Llama architecture with a custom tokenizer that averages 1.7 tokens per Uzbek word versus 3.5 for the original Llama tokenizer, which roughly halves the token count for the same text and enables about 2x faster inference. Trained on 3.6B tokens with a 4096-token context length.
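The tokenizer figures above can be checked with a little arithmetic. The following is a minimal sketch, not part of the model's tooling: the function names are hypothetical, and the "about 2x" speedup only follows if generation time scales roughly linearly with token count, which is an assumption.

```typescript
// Figures quoted in the model description.
const TOKENS_PER_WORD_CUSTOM = 1.7; // custom Uzbek tokenizer
const TOKENS_PER_WORD_LLAMA = 3.5;  // original Llama tokenizer
const CONTEXT_TOKENS = 4096;

// How many times more tokens the baseline tokenizer needs for the same text.
// (Hypothetical helper, for illustration only.)
function tokenRatio(baseline: number, custom: number): number {
  return baseline / custom;
}

// Approximate number of Uzbek words that fit in a context window.
// (Hypothetical helper, for illustration only.)
function wordsInContext(contextTokens: number, tokensPerWord: number): number {
  return Math.floor(contextTokens / tokensPerWord);
}

console.log(tokenRatio(TOKENS_PER_WORD_LLAMA, TOKENS_PER_WORD_CUSTOM).toFixed(2)); // 2.06
console.log(wordsInContext(CONTEXT_TOKENS, TOKENS_PER_WORD_CUSTOM)); // 2409
console.log(wordsInContext(CONTEXT_TOKENS, TOKENS_PER_WORD_LLAMA));  // 1170
```

Under these averages, the same 4096-token window holds roughly twice as many Uzbek words as it would with the original tokenizer.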
Context: 4K tokens
Providers: 1 available
Cheapest: Hugging Face Inference, $0.20/1M tokens
Sorted by total cost (input + output per 1M tokens).
| Provider | Pricing (per 1M) | Rate Limits | Regions | Health | Latency |
|---|---|---|---|---|---|
| Hugging Face Inference | In: $0.10 / Out: $0.10 | 60 RPM / 200K TPM | us-east-1 | Healthy | 0ms |
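To make the pricing and rate-limit columns concrete, here is a rough sketch that estimates the cost of a request and the sustained request rate the limits allow. The helper names are hypothetical, and it assumes the TPM limit counts input and output tokens together, which the table does not state.

```typescript
// Prices and limits copied from the provider table.
const PRICE_IN_PER_M = 0.10;  // USD per 1M input tokens
const PRICE_OUT_PER_M = 0.10; // USD per 1M output tokens
const LIMIT_RPM = 60;         // requests per minute
const LIMIT_TPM = 200_000;    // tokens per minute (assumed: input + output combined)

// Estimated cost in USD for a given number of input and output tokens.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (inputTokens * PRICE_IN_PER_M + outputTokens * PRICE_OUT_PER_M) / 1_000_000;
}

// Sustained requests per minute: whichever of the two limits binds first.
function maxRequestsPerMinute(tokensPerRequest: number): number {
  return Math.min(LIMIT_RPM, Math.floor(LIMIT_TPM / tokensPerRequest));
}

console.log(estimateCostUsd(1_000_000, 1_000_000)); // 0.2, matching the $0.20 total
console.log(maxRequestsPerMinute(4096)); // 48: the TPM limit binds for full-context requests
console.log(maxRequestsPerMinute(1000)); // 60: the RPM limit binds for short requests
```

For requests that use the full 4096-token context, the token limit rather than the request limit becomes the bottleneck.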
Use this model via Hugging Face Inference with an OpenAI-compatible SDK.
import OpenAI from "openai";

// Point the OpenAI SDK at the provider's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://api.hugging-face.com/v1",
  apiKey: process.env.HUGGING_FACE_API_KEY,
});

// Send a single-turn chat request to the model.
const response = await client.chat.completions.create({
  model: "uzlm/alloma-8B-Instruct",
  messages: [
    { role: "user", content: "Hello!" }
  ],
});

// Print the assistant's reply.
console.log(response.choices[0].message.content);