Uzbek LLM Lab

Alloma 8B Instruct

Released May 2026

Uzbek LLM Lab's 8B-parameter instruction-tuned model, optimized for the Uzbek language. It is built on the Llama architecture with a custom tokenizer that averages 1.7 tokens per Uzbek word, versus 3.5 with the original Llama tokenizer. Uzbek text therefore needs roughly half as many tokens, which enables about 2x faster inference. Trained on 3.6B tokens with a context length of 4096 tokens.
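The tokenizer figures above can be turned into a quick back-of-the-envelope calculation. The following is a minimal sketch, not the model's actual tokenizer: `estimateTokens` is a hypothetical helper that multiplies a whitespace word count by the quoted averages, and the link between token count and inference speed assumes that generation time scales roughly with the number of tokens.

```typescript
// Averages and context length as quoted in the model description.
const TOKENS_PER_WORD_ALLOMA = 1.7;
const TOKENS_PER_WORD_LLAMA = 3.5;
const CONTEXT_LENGTH = 4096;

// Hypothetical helper: rough token estimate from a whitespace word count.
// A real tokenizer will give different counts for any individual text.
function estimateTokens(text: string, tokensPerWord: number): number {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  return Math.ceil(words.length * tokensPerWord);
}

// Ratio of the two averages: about 2.06, the basis of the "2x" figure.
const tokenRatio = TOKENS_PER_WORD_LLAMA / TOKENS_PER_WORD_ALLOMA;

// Approximate number of Uzbek words that fit in the 4096-token context.
const maxWordsAlloma = Math.floor(CONTEXT_LENGTH / TOKENS_PER_WORD_ALLOMA);
const maxWordsLlama = Math.floor(CONTEXT_LENGTH / TOKENS_PER_WORD_LLAMA);

console.log(`token ratio: ${tokenRatio.toFixed(2)}`);
console.log(`words per context (custom tokenizer): ${maxWordsAlloma}`);
console.log(`words per context (original Llama tokenizer): ${maxWordsLlama}`);
```

Under these averages, the same 4096-token window holds roughly twice as many Uzbek words with the custom tokenizer as with the original one.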

Capabilities

chat, completion

Modalities

text

Context Window

4K tokens

Providers

Available from 1 provider

Cheapest: Hugging Face Inference, $0.20/1M tokens (input + output)

Providers (1)

Sorted by total cost (input + output per 1M tokens).

Provider	Pricing (per 1M tokens)	Rate Limits	Regions	Health	Latency
Hugging Face Inference	In: $0.10 / Out: $0.10	60 RPM / 200K TPM	us-east-1	Healthy	0ms
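The pricing and rate limits in the table can be combined into a rough budget estimate. This is a sketch under stated assumptions: `estimateCostUsd` and `maxRequestsPerMinute` are hypothetical helpers, and the TPM limit is assumed to count input and output tokens together, which the listing does not confirm.

```typescript
// Values taken from the provider table above.
const INPUT_PRICE_PER_M = 0.10;   // USD per 1M input tokens
const OUTPUT_PRICE_PER_M = 0.10;  // USD per 1M output tokens
const RPM_LIMIT = 60;             // requests per minute
const TPM_LIMIT = 200000;         // tokens per minute

// Hypothetical helper: cost of one workload in USD.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1e6) * INPUT_PRICE_PER_M +
    (outputTokens / 1e6) * OUTPUT_PRICE_PER_M
  );
}

// Hypothetical helper: sustainable requests per minute when every request
// uses `tokensPerRequest` tokens in total. Assumes TPM counts input + output.
function maxRequestsPerMinute(tokensPerRequest: number): number {
  if (tokensPerRequest <= 0) return RPM_LIMIT;
  return Math.min(RPM_LIMIT, Math.floor(TPM_LIMIT / tokensPerRequest));
}

console.log(estimateCostUsd(1e6, 1e6).toFixed(2));
console.log(maxRequestsPerMinute(4096));
console.log(maxRequestsPerMinute(1000));
```

With small requests the 60 RPM limit is the binding constraint; once requests approach the full 4096-token context, the token limit becomes the bottleneck instead.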

Quick Start

Use this model via Hugging Face Inference with an OpenAI-compatible SDK.

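import OpenAI from "openai";

// Base URL and model ID are as listed on this page; confirm both against
// the provider's documentation before use.
const client = new OpenAI({
  baseURL: "https://api.hugging-face.com/v1",
  apiKey: process.env.HUGGING_FACE_API_KEY, // read the API key from the environment
});

// Send a single-turn chat request (top-level await requires an ES module).
const response = await client.chat.completions.create({
  model: "uzlm/alloma-8B-Instruct",
  messages: [
    { role: "user", content: "Hello!" }
  ],
});

// Print the text of the first returned choice.
console.log(response.choices[0].message.content);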

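Because the context window is only 4096 tokens, it can help to cap the completion length before sending a request. The following is a minimal sketch, assuming the 1.7 tokens-per-word average from the model description is a usable proxy for prompt size. `estimatePromptTokens` and `buildChatParams` are hypothetical helpers, not part of the OpenAI SDK; `max_tokens` is the standard Chat Completions parameter, and whether this provider honors it is an assumption.

```typescript
const CONTEXT_WINDOW = 4096;  // from the listing above
const TOKENS_PER_WORD = 1.7;  // average for Uzbek text, from the model description

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: approximate prompt size from word counts.
// Counting with the real tokenizer would be more accurate.
function estimatePromptTokens(messages: ChatMessage[]): number {
  let words = 0;
  for (const m of messages) {
    words += m.content.trim().split(/\s+/).filter((w) => w.length > 0).length;
  }
  return Math.ceil(words * TOKENS_PER_WORD);
}

// Hypothetical helper: build request parameters with max_tokens capped so
// that the estimated prompt plus the completion fits in the context window.
function buildChatParams(messages: ChatMessage[], desiredMaxTokens: number) {
  const remaining = CONTEXT_WINDOW - estimatePromptTokens(messages);
  if (remaining <= 0) {
    throw new Error("Prompt is estimated to exceed the 4096-token context window");
  }
  return {
    model: "uzlm/alloma-8B-Instruct",
    messages,
    max_tokens: Math.min(desiredMaxTokens, remaining),
  };
}
```

The returned object is intended to be passed to `client.chat.completions.create(...)`. Since the estimate is only an average, leaving some headroom below the computed cap is a reasonable precaution.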