# AI Inference

Access leading AI models through a unified API. Generate text, embeddings, and more with Google Gemini, OpenAI, Anthropic Claude, and open-source models.

## Supported Models

| Provider | Models | Capabilities |
| --- | --- | --- |
| Google | `gemini-2.5-flash`, `gemini-3-pro-preview`, `gemini-embedding-001` | Text, Vision, Embeddings |
| OpenAI | `gpt-5`, `gpt-5.1`, `gpt-5-mini`, `gpt-5.1-codex` | Text, Vision, Code |
| Anthropic | `claude-sonnet-4-5`, `claude-opus-4-1`, `claude-haiku-4-5` | Text, Vision, Code |

## Quick Start

```typescript
import { Tenzro } from 'tenzro';

const tenzro = new Tenzro();

// Text generation
const response = await tenzro.ai.generate({
  model: 'gemini-2.5-flash',
  prompt: 'Explain quantum computing in simple terms',
  maxTokens: 500,
});

console.log(response.text);
```

## Chat Completions

```typescript
const response = await tenzro.ai.chat({
  model: 'gemini-3-pro-preview',
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'Write a TypeScript function to validate email addresses' },
  ],
  temperature: 0.7,
  maxTokens: 1000,
});

console.log(response.message.content);
```

## Streaming Responses

```typescript
const stream = await tenzro.ai.stream({
  model: 'gemini-2.5-flash',
  messages: [
    { role: 'user', content: 'Write a short story about AI' },
  ],
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}

// Get the final response with usage stats
const finalResponse = await stream.finalResponse();
console.log('Tokens used:', finalResponse.usage);
```

## Function Calling

```typescript
const response = await tenzro.ai.chat({
  model: 'gemini-3-pro-preview',
  messages: [
    { role: 'user', content: 'What is the weather in San Francisco?' },
  ],
  tools: [
    {
      name: 'get_weather',
      description: 'Get current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string', description: 'City name' },
          unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
        },
        required: ['location'],
      },
    },
  ],
  toolChoice: 'auto',
});

if (response.toolCalls) {
  for (const call of response.toolCalls) {
    console.log('Function:', call.name);
    console.log('Arguments:', call.arguments);
  }
}
```
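The example above only prints the model's tool calls; in practice you execute each one locally and feed the result back. Below is a minimal, self-contained sketch of that dispatch step. The `ToolCall` shape and the `handlers` map are assumptions for illustration (adapt them to the actual objects your SDK version returns), and the weather handler returns a canned string rather than calling a real weather API.

```typescript
// Assumed shape of a parsed tool call (illustrative, not the SDK's own type).
type ToolCall = { name: string; arguments: Record<string, unknown> };

// Local implementations keyed by tool name.
const handlers: Record<string, (args: Record<string, unknown>) => string> = {
  // Hypothetical handler: a real one would query a weather service.
  get_weather: (args) =>
    `Sunny, 18°C in ${String(args.location)} (${String(args.unit ?? 'celsius')})`,
};

function runToolCalls(toolCalls: ToolCall[]): string[] {
  return toolCalls.map((call) => {
    const handler = handlers[call.name];
    if (!handler) throw new Error(`Unknown tool: ${call.name}`);
    return handler(call.arguments);
  });
}

// Demo with a mocked tool call, as the model might return it:
const results = runToolCalls([
  { name: 'get_weather', arguments: { location: 'San Francisco' } },
]);
console.log(results[0]); // Sunny, 18°C in San Francisco (celsius)
```

The results would then typically be appended to `messages` and sent back to the model so it can produce a final natural-language answer.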

## Generate Embeddings

```typescript
// Single embedding
const embedding = await tenzro.ai.embed({
  model: 'gemini-embedding-001',
  input: 'Machine learning is fascinating',
});

console.log(embedding.vector); // [0.123, -0.456, ...]
console.log(embedding.dimensions); // e.g. 3072

// Batch embeddings
const embeddings = await tenzro.ai.embedBatch({
  model: 'gemini-embedding-001',
  inputs: [
    'First document',
    'Second document',
    'Third document',
  ],
});

console.log(embeddings.vectors.length); // 3
```
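Embedding vectors are usually compared with cosine similarity, e.g. to rank documents against a query embedding. A self-contained helper (the vectors in the demo are toy values, not real embeddings):

```typescript
// Cosine similarity between two vectors: 1 = same direction, 0 = orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (real embeddings have hundreds or thousands of dimensions):
console.log(cosineSimilarity([1, 0, 0], [1, 0, 0])); // 1
console.log(cosineSimilarity([1, 0, 0], [0, 1, 0])); // 0
```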

## Vision (Multimodal)

```typescript
// Analyze an image
const response = await tenzro.ai.chat({
  model: 'gemini-3-pro-preview',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What do you see in this image?' },
        { type: 'image_url', image_url: { url: 'https://example.com/image.jpg' } },
      ],
    },
  ],
});

// Or with base64
const base64Response = await tenzro.ai.chat({
  model: 'gpt-5',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe this diagram' },
        { type: 'image_url', image_url: { url: `data:image/png;base64,${base64Image}` } },
      ],
    },
  ],
});
```
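The `base64Image` value in the example above has to come from somewhere; in Node.js you can build the data URL from raw file bytes (e.g. from `fs.readFileSync`). A minimal helper, with a hypothetical file name in the usage comment:

```typescript
// Turn raw image bytes into a data URL suitable for the image_url field above.
function toDataUrl(bytes: Buffer, mimeType: string): string {
  return `data:${mimeType};base64,${bytes.toString('base64')}`;
}

// Usage (Node.js):
//   import { readFileSync } from 'node:fs';
//   const url = toDataUrl(readFileSync('diagram.png'), 'image/png');
console.log(toDataUrl(Buffer.from('hi'), 'text/plain')); // data:text/plain;base64,aGk=
```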

## Configuration Options

| Parameter | Type | Description |
| --- | --- | --- |
| `model` | string | Model identifier |
| `temperature` | number | Randomness (0-2, default 1) |
| `maxTokens` | number | Maximum output tokens |
| `topP` | number | Nucleus sampling (0-1) |
| `frequencyPenalty` | number | Reduce repetition (-2 to 2) |
| `presencePenalty` | number | Encourage new topics (-2 to 2) |
| `stop` | string[] | Stop sequences |

## Error Handling

```typescript
import { TenzroError, RateLimitError, ModelError } from 'tenzro';

try {
  const response = await tenzro.ai.generate({
    model: 'gemini-2.5-flash',
    prompt: 'Hello',
  });
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log('Rate limited, retry after:', error.retryAfter);
  } else if (error instanceof ModelError) {
    console.log('Model error:', error.message);
  } else {
    throw error;
  }
}
```
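Rate-limit errors are usually transient, so a common pattern is to retry with exponential backoff. A self-contained sketch: in real code you would retry only on `RateLimitError` and honor `error.retryAfter`, but here the retry predicate and delays are simplified (and the demo uses a mock call) so the example stands alone.

```typescript
// Retry an async call with exponential backoff: baseDelayMs, 2x, 4x, ...
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt: 100ms, 200ms, 400ms, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}

// Demo: a mock call that fails twice, then succeeds on the third attempt.
let attempts = 0;
const result = await withRetry(async () => {
  attempts++;
  if (attempts < 3) throw new Error('rate limited');
  return 'ok';
});
console.log(result, attempts); // ok 3
```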

## Usage & Pricing

AI inference is billed per token:

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
| --- | --- | --- |
| `gemini-2.5-flash` | $0.075 | $0.30 |
| `gemini-3-pro-preview` | $1.25 | $5.00 |
| `gpt-5-mini` | $0.15 | $0.60 |
| `gpt-5` | $2.50 | $10.00 |
| `claude-sonnet-4-5` | $3.00 | $15.00 |
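To estimate what a request will cost, multiply each side of the token count by the table rates. A small helper hard-coding the table above (`estimateCost` is a local sketch, not part of the SDK):

```typescript
// USD per 1M tokens, from the pricing table above.
const pricing: Record<string, { input: number; output: number }> = {
  'gemini-2.5-flash': { input: 0.075, output: 0.3 },
  'gemini-3-pro-preview': { input: 1.25, output: 5.0 },
  'gpt-5-mini': { input: 0.15, output: 0.6 },
  'gpt-5': { input: 2.5, output: 10.0 },
  'claude-sonnet-4-5': { input: 3.0, output: 15.0 },
};

function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricing[model];
  if (!p) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// 10k prompt tokens + 1k completion tokens on gemini-2.5-flash:
console.log(estimateCost('gemini-2.5-flash', 10_000, 1_000)); // ≈ $0.00105
```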

## Related