Model Hub

Host, share, and deploy AI models. Access community models or upload your own for browser-based inference with edge caching.

Overview

The Tenzro Model Hub provides:

  • Model Hosting: Upload and serve models globally
  • Edge Caching: CDN-distributed model files
  • Browser Inference: WebGPU/WebGL/WASM support
  • Version Control: Track model versions
  • Access Control: Public or private models

Supported Formats

Format        Runtime            Use Case
ONNX          ONNX Runtime Web   General inference
GGUF          llama.cpp (WASM)   LLM inference
SafeTensors   Transformers.js    HuggingFace models
MLC           WebLLM             Optimized LLMs
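
In the browser, WebGPU is the fastest backend where available, WebGL is a widely supported fallback, and WASM runs everywhere. A minimal sketch of that fallback order (`pickRuntime` is illustrative, not an SDK function):

```typescript
type Runtime = 'webgpu' | 'webgl' | 'wasm';

// Pick the best runtime given detected browser capabilities,
// falling back to WASM, which works in every modern browser.
function pickRuntime(caps: { webgpu: boolean; webgl: boolean }): Runtime {
  if (caps.webgpu) return 'webgpu';
  if (caps.webgl) return 'webgl';
  return 'wasm';
}

// In a browser, capabilities could be detected roughly like this:
// const caps = { webgpu: 'gpu' in navigator,
//                webgl: !!document.createElement('canvas').getContext('webgl') };
console.log(pickRuntime({ webgpu: false, webgl: true })); // → webgl
```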

Using Hub Models

import { Tenzro } from 'tenzro';

const tenzro = new Tenzro();

// List available models
const models = await tenzro.hub.list({
  category: 'text-generation',
  runtime: 'onnx',
});

// Get model details
const model = await tenzro.hub.get('tenzro/llama-3.2-1b-instruct-onnx');
console.log(model.name, model.size, model.downloads);

// Download model
const downloaded = await tenzro.hub.download('tenzro/llama-3.2-1b-instruct-onnx', {
  progress: (percent) => console.log(`${percent}% downloaded`),
});
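
The `model.size` field is assumed here to be a byte count; a small display helper (an illustration, not part of the SDK) for rendering it:

```typescript
// Format a byte count (e.g. model.size, assumed to be in bytes)
// as a human-readable string for UI display.
function formatBytes(bytes: number): string {
  const units = ['B', 'KB', 'MB', 'GB'];
  let n = bytes;
  let i = 0;
  while (n >= 1024 && i < units.length - 1) {
    n /= 1024;
    i++;
  }
  return `${n.toFixed(1)} ${units[i]}`;
}

console.log(formatBytes(1_288_490_189)); // → "1.2 GB"
```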

Browser Inference

import { TenzroEdge } from '@tenzro/edge';

const edge = new TenzroEdge({
  clientId: 'client_xxx',
});

// Load model for browser inference
const model = await edge.hub.load('tenzro/phi-3-mini-onnx', {
  runtime: 'webgpu', // or 'webgl', 'wasm'
  cache: true, // Cache in IndexedDB
});

// Run inference
const result = await model.generate({
  prompt: 'Explain quantum computing',
  maxTokens: 100,
});
console.log(result.text);

Uploading Models

// Upload a model
const upload = await tenzro.hub.upload({
  name: 'my-custom-model',
  description: 'Fine-tuned model for customer support',
  format: 'onnx',
  files: [
    { path: 'model.onnx', content: modelBuffer },
    { path: 'tokenizer.json', content: tokenizerBuffer },
  ],
  metadata: {
    baseModel: 'llama-3.2-1b',
    task: 'text-generation',
    quantization: 'int8',
  },
  visibility: 'private', // or 'public'
});
console.log('Model uploaded:', upload.id);

Model Versioning

// Create a new version
await tenzro.hub.createVersion('my-org/my-model', {
  version: '1.1.0',
  files: updatedFiles,
  changelog: 'Improved accuracy on edge cases',
});

// List versions
const versions = await tenzro.hub.listVersions('my-org/my-model');

// Load a specific version
const model = await edge.hub.load('my-org/my-model@1.0.0');
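
A `name@version` reference can be split with a small helper (an illustration, not an SDK export); when no `@version` suffix is present, the latest version is presumably implied:

```typescript
// Split an "org/model@version" reference into name and optional version.
// lastIndexOf handles names that could themselves contain '@' later on.
function parseModelRef(ref: string): { name: string; version?: string } {
  const at = ref.lastIndexOf('@');
  if (at <= 0) return { name: ref }; // no version suffix: latest implied
  return { name: ref.slice(0, at), version: ref.slice(at + 1) };
}

console.log(parseModelRef('my-org/my-model@1.0.0'));
// → { name: 'my-org/my-model', version: '1.0.0' }
```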

Access Control

// Set model visibility
await tenzro.hub.updateVisibility('my-org/my-model', {
  visibility: 'private',
});

// Grant access to specific users/organizations
await tenzro.hub.grantAccess('my-org/my-model', {
  users: ['user-id-1', 'user-id-2'],
  organizations: ['partner-org'],
});

// Revoke access
await tenzro.hub.revokeAccess('my-org/my-model', {
  users: ['user-id-1'],
});
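
Grant and revoke presumably behave as set union and set difference on the model's access list; a pure-function sketch of that assumed semantics:

```typescript
// Assumed semantics: granting adds IDs without duplicates (set union).
function grant(current: string[], added: string[]): string[] {
  return [...new Set([...current, ...added])];
}

// Assumed semantics: revoking removes IDs (set difference).
function revoke(current: string[], removed: string[]): string[] {
  const drop = new Set(removed);
  return current.filter((id) => !drop.has(id));
}

console.log(grant(['user-id-1'], ['user-id-1', 'user-id-2'])); // no duplicate entries
console.log(revoke(['user-id-1', 'user-id-2'], ['user-id-1']));
```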

Chunked Loading

Large models are automatically split for efficient loading:

const model = await edge.hub.load('tenzro/llama-3.2-3b-onnx', {
  // Load only the chunks that are needed (streaming inference)
  streaming: true,
  // Progress callback
  onProgress: ({ loaded, total, chunk }) => {
    console.log(`Loading chunk ${chunk}: ${loaded}/${total}`);
  },
});
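
Under the hood, chunked loading amounts to fetching parts and concatenating them. The sketch below is an assumption about that mechanism, not the SDK's actual implementation; the transport is injected as `fetchChunk` so it is not tied to any particular CDN layout:

```typescript
// Fetch chunks in order, report progress, and assemble one contiguous buffer.
async function assembleChunks(
  chunkIds: string[],
  fetchChunk: (id: string) => Promise<Uint8Array>,
  onProgress?: (loaded: number, total: number) => void,
): Promise<Uint8Array> {
  const parts: Uint8Array[] = [];
  for (let i = 0; i < chunkIds.length; i++) {
    parts.push(await fetchChunk(chunkIds[i]));
    onProgress?.(i + 1, chunkIds.length);
  }

  // Concatenate the chunks into a single buffer.
  const size = parts.reduce((n, p) => n + p.byteLength, 0);
  const out = new Uint8Array(size);
  let offset = 0;
  for (const p of parts) {
    out.set(p, offset);
    offset += p.byteLength;
  }
  return out;
}
```

In a real loader, `fetchChunk` would wrap an HTTP range or CDN request and consult the IndexedDB cache first.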

Model Cards

Document your models with model cards:

await tenzro.hub.updateModelCard('my-org/my-model', {
  description: 'A fine-tuned model for customer support',
  intendedUse: 'Customer service chatbots',
  limitations: 'May not handle technical queries well',
  trainingData: 'Customer support transcripts (anonymized)',
  evaluation: {
    accuracy: 0.92,
    f1Score: 0.89,
  },
  license: 'Apache-2.0',
});

Popular Models

Model                       Size     Task
tenzro/phi-3-mini-onnx      2.4 GB   Text Generation
tenzro/llama-3.2-1b-onnx    1.2 GB   Text Generation
tenzro/whisper-small-onnx   460 MB   Speech Recognition
tenzro/all-minilm-l6-v2     90 MB    Embeddings
tenzro/vit-base-patch16     350 MB   Image Classification

Pricing

Feature          Free           Pro
Public models    Unlimited      Unlimited
Private models   3              Unlimited
Storage          10 GB          1 TB
Bandwidth        100 GB/month   10 TB/month