AI Inference
Access leading AI models through a unified API. Generate text, embeddings, and more with Google Gemini, OpenAI, Anthropic Claude, and open-source models.
Supported Models
| Provider | Models | Capabilities |
|---|---|---|
| Google | gemini-2.5-flash, gemini-2.5-pro, gemini-3-pro-preview | Text, Vision, Function Calling |
| OpenAI | gpt-4o | Text, Vision, Function Calling |
| Anthropic | claude-sonnet-4 | Text, Vision, Function Calling |
Quick Start
```typescript
import { Tenzro } from '@tenzro/cloud';

const client = new Tenzro({ apiKey: 'your-api-key' });

// Simple chat - automatically uses gemini-2.5-flash
const response = await client.ai.chat('Explain quantum computing in simple terms');
console.log(response.text);
```
Chat with Messages
```typescript
// Chat with full message history
const response = await client.ai.chat({
  messages: [
    { role: 'system', content: 'You are a helpful coding assistant.' },
    { role: 'user', content: 'Write a TypeScript function to validate email addresses' },
  ],
  model: 'gemini-2.5-pro',
  temperature: 0.7,
  maxTokens: 1000,
});

console.log(response.text);
console.log('Usage:', response.usage);
```
Streaming Responses
```typescript
// Stream responses for real-time output
const stream = await client.ai.chatStream('Write a short story about AI');

for await (const chunk of stream) {
  process.stdout.write(chunk.text);
}
```
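If you also need the complete text after streaming finishes (for example, to persist the response), accumulate the chunks as they arrive. A minimal sketch using the same chatStream call:

```typescript
// Accumulate streamed chunks into the full response text
const stream = await client.ai.chatStream('Write a short story about AI');

let fullText = '';
for await (const chunk of stream) {
  process.stdout.write(chunk.text); // render incrementally
  fullText += chunk.text;           // keep the complete output
}

console.log('\nTotal length:', fullText.length);
```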
Direct Inference
```typescript
// Direct model inference with full control
const response = await client.ai.infer({
  model: 'gemini-2.5-pro',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of France?' },
  ],
  temperature: 0.7,
  maxTokens: 500,
});

console.log(response.text);
```
Generate Embeddings
```typescript
// Generate embeddings for semantic search
const embedding = await client.ai.embed({
  text: 'Machine learning is fascinating',
  model: 'gemini-2.5-flash',
  taskType: 'RETRIEVAL_DOCUMENT',
  dimensionality: 768,
});

console.log(embedding.embedding); // [0.123, -0.456, ...]

// For queries (semantic search)
const queryEmbedding = await client.ai.embed({
  text: 'What is machine learning?',
  taskType: 'RETRIEVAL_QUERY',
});
```
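To rank documents against a query, compare the two embedding vectors, typically with cosine similarity. A minimal sketch; the cosineSimilarity helper is illustrative, not part of the SDK:

```typescript
// Cosine similarity between two embedding vectors (illustrative helper)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Higher score = closer semantic match
// Note: both vectors must have the same dimensionality
const score = cosineSimilarity(embedding.embedding, queryEmbedding.embedding);
console.log('Similarity:', score.toFixed(3));
```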
AI Endpoints
```typescript
// Create a custom AI endpoint
const endpoint = await client.ai.createEndpoint({
  projectId: 'project-id',
  endpointName: 'my-chatbot',
  model: 'gemini-2.5-flash',
  systemPrompt: 'You are a helpful customer service assistant.',
  temperature: 0.7,
});

console.log('Endpoint:', endpoint.endpoint_url);

// Use the endpoint for inference
const response = await client.ai.inferWithEndpoint({
  endpointId: endpoint.endpoint_id,
  message: 'I need help with my order',
});

console.log(response.text);
```
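Unlike direct inference, an endpoint bakes the model, system prompt, and temperature into a reusable configuration, so callers only send the user message.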
Configuration Options
| Parameter | Type | Description |
|---|---|---|
| model | string | Model identifier |
| temperature | number | Randomness (0-2, default 1) |
| maxTokens | number | Maximum output tokens |
| topP | number | Nucleus sampling (0-1) |
| frequencyPenalty | number | Reduce repetition (-2 to 2) |
| presencePenalty | number | Encourage new topics (-2 to 2) |
| stop | string[] | Stop sequences |
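These options can be combined on a single request. A sketch assuming the remaining table options follow the same request shape as the model, temperature, and maxTokens fields shown in the chat examples above:

```typescript
// Combine sampling controls on one request
const response = await client.ai.chat({
  messages: [{ role: 'user', content: 'List five startup name ideas' }],
  model: 'gemini-2.5-flash',
  temperature: 1.2,      // more creative output
  maxTokens: 200,        // cap the response length
  topP: 0.9,             // nucleus sampling
  frequencyPenalty: 0.5, // discourage repeated phrasing
  stop: ['\n\n'],        // stop at the first blank line
});
```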
Error Handling
```typescript
try {
  const response = await client.ai.chat('Hello');
  console.log(response.text);
} catch (error) {
  console.error('AI inference error:', error.message);
  // Handle specific error types as needed
}
```
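Transient failures such as rate limits or timeouts are often worth retrying with backoff. A minimal sketch; the SDK's specific error classes aren't documented here, so this retries on any thrown error:

```typescript
// Retry with exponential backoff (illustrative; retries on any error)
async function chatWithRetry(prompt: string, maxAttempts = 3): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await client.ai.chat(prompt);
      return response.text;
    } catch (error) {
      if (attempt === maxAttempts) throw error; // out of retries
      const delayMs = 500 * 2 ** attempt;       // 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error('unreachable');
}
```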
Model Selection
Choose the right model for your use case:
| Model | Provider | Best For |
|---|---|---|
| gemini-2.5-flash | Google | Fast responses, high throughput, cost-effective |
| gemini-2.5-pro | Google | Complex reasoning, better accuracy |
| gemini-3-pro-preview | Google | Latest capabilities, experimental features |
| gpt-4o | OpenAI | Multimodal tasks, vision + text |
| claude-sonnet-4 | Anthropic | Long context, analysis, coding |
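In practice this often becomes a small routing function: default to the fast model and escalate to a stronger one for harder tasks. An illustrative sketch (the task categories are hypothetical, not part of the SDK):

```typescript
// Route requests to a model based on task type (illustrative)
type Task = 'chat' | 'reasoning' | 'vision' | 'coding';

function pickModel(task: Task): string {
  switch (task) {
    case 'reasoning': return 'gemini-2.5-pro';
    case 'vision':    return 'gpt-4o';
    case 'coding':    return 'claude-sonnet-4';
    default:          return 'gemini-2.5-flash'; // fast, cost-effective default
  }
}

const response = await client.ai.chat({
  messages: [{ role: 'user', content: 'Summarize this ticket' }],
  model: pickModel('chat'),
});
```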
Related
- AI Agents - Orchestrate multi-step AI workflows
- MCP Servers - Deploy tool-enabled AI services
- Workflows - Visual AI workflow builder