# Model Hub
Host, share, and deploy AI models. Access community models or upload your own for browser-based inference with edge caching.
## Overview
The Tenzro Model Hub provides:
- Model Hosting: Upload and serve models globally
- Edge Caching: CDN-distributed model files
- Browser Inference: WebGPU/WebGL/WASM support
- Version Control: Track model versions
- Access Control: Public or private models
## Supported Formats
| Format | Runtime | Use Case |
|---|---|---|
| ONNX | ONNX Runtime Web | General inference |
| GGUF | llama.cpp (WASM) | LLM inference |
| SafeTensors | Transformers.js | HuggingFace models |
| MLC | WebLLM | Optimized LLMs |
## Using Hub Models

```typescript
import { Tenzro } from 'tenzro';

const tenzro = new Tenzro();

// List available models
const models = await tenzro.hub.list({
  category: 'text-generation',
  runtime: 'onnx',
});

// Get model details
const model = await tenzro.hub.get('tenzro/llama-3.2-1b-instruct-onnx');
console.log(model.name, model.size, model.downloads);

// Download model
const downloaded = await tenzro.hub.download('tenzro/llama-3.2-1b-instruct-onnx', {
  progress: (percent) => console.log(`${percent}% downloaded`),
});
```
## Browser Inference

```typescript
import { TenzroEdge } from '@tenzro/edge';

const edge = new TenzroEdge({
  clientId: 'client_xxx',
});

// Load model for browser inference
const model = await edge.hub.load('tenzro/phi-3-mini-onnx', {
  runtime: 'webgpu', // or 'webgl', 'wasm'
  cache: true,       // Cache in IndexedDB
});

// Run inference
const result = await model.generate({
  prompt: 'Explain quantum computing',
  maxTokens: 100,
});

console.log(result.text);
```
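WebGPU gives the best throughput but isn't available in every browser. Here is a minimal feature-detection sketch for choosing the `runtime` value, using only standard browser APIs (the fallback order is an assumption, not something the SDK mandates):

```typescript
// Pick the most capable runtime available, falling back from
// WebGPU to WebGL2 to plain WASM (assumed order, not SDK-defined).
function pickRuntime(): 'webgpu' | 'webgl' | 'wasm' {
  if ('gpu' in navigator) return 'webgpu'; // WebGPU supported
  if (document.createElement('canvas').getContext('webgl2')) return 'webgl';
  return 'wasm'; // universally available baseline
}

const fallbackModel = await edge.hub.load('tenzro/phi-3-mini-onnx', {
  runtime: pickRuntime(),
  cache: true,
});
```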
## Uploading Models

```typescript
// Upload a model
const upload = await tenzro.hub.upload({
  name: 'my-custom-model',
  description: 'Fine-tuned model for customer support',
  format: 'onnx',
  files: [
    { path: 'model.onnx', content: modelBuffer },
    { path: 'tokenizer.json', content: tokenizerBuffer },
  ],
  metadata: {
    baseModel: 'llama-3.2-1b',
    task: 'text-generation',
    quantization: 'int8',
  },
  visibility: 'private', // or 'public'
});

console.log('Model uploaded:', upload.id);
```
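The `modelBuffer` and `tokenizerBuffer` values above are just raw file bytes; the upload call doesn't care where they come from. In Node.js they might be read like this (a sketch; the paths are illustrative):

```typescript
import { readFile } from 'node:fs/promises';

// Read the model artifacts from disk as Buffers
const modelBuffer = await readFile('./artifacts/model.onnx');
const tokenizerBuffer = await readFile('./artifacts/tokenizer.json');
```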
## Model Versioning

```typescript
// Create a new version
await tenzro.hub.createVersion('my-org/my-model', {
  version: '1.1.0',
  files: updatedFiles,
  changelog: 'Improved accuracy on edge cases',
});

// List versions
const versions = await tenzro.hub.listVersions('my-org/my-model');

// Load specific version
const model = await edge.hub.load('my-org/my-model@1.0.0');
```
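The `@` suffix pins a version so browser deployments stay reproducible. A short sketch contrasting the two forms (the unpinned resolution behavior is an assumption):

```typescript
// Pinned: clients always fetch exactly this version
const pinned = await edge.hub.load('my-org/my-model@1.0.0');

// Unpinned: presumably resolves to the latest published version
const latest = await edge.hub.load('my-org/my-model');
```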
## Access Control

```typescript
// Set model visibility
await tenzro.hub.updateVisibility('my-org/my-model', {
  visibility: 'private',
});

// Grant access to specific users/organizations
await tenzro.hub.grantAccess('my-org/my-model', {
  users: ['user-id-1', 'user-id-2'],
  organizations: ['partner-org'],
});

// Revoke access
await tenzro.hub.revokeAccess('my-org/my-model', {
  users: ['user-id-1'],
});
```
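Visibility and grants presumably compose: a private model is readable only by its owner plus anyone listed via `grantAccess`, while public models need no grants at all.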
## Chunked Loading
Large models are automatically split into chunks for efficient loading:
```typescript
const model = await edge.hub.load('tenzro/llama-3.2-3b-onnx', {
  // Load only needed chunks (streaming inference)
  streaming: true,
  // Progress callback
  onProgress: ({ loaded, total, chunk }) => {
    console.log(`Loading chunk ${chunk}: ${loaded}/${total}`);
  },
});
```
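Streaming trades total load time for a faster start: inference can presumably begin once the first chunks arrive, which matters most for multi-gigabyte models on slow connections.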
## Model Cards
Document your models with model cards:
```typescript
await tenzro.hub.updateModelCard('my-org/my-model', {
  description: 'A fine-tuned model for customer support',
  intendedUse: 'Customer service chatbots',
  limitations: 'May not handle technical queries well',
  trainingData: 'Customer support transcripts (anonymized)',
  evaluation: {
    accuracy: 0.92,
    f1Score: 0.89,
  },
  license: 'Apache-2.0',
});
```
## Popular Models
| Model | Size | Task |
|---|---|---|
| tenzro/phi-3-mini-onnx | 2.4 GB | Text Generation |
| tenzro/llama-3.2-1b-onnx | 1.2 GB | Text Generation |
| tenzro/whisper-small-onnx | 460 MB | Speech Recognition |
| tenzro/all-minilm-l6-v2 | 90 MB | Embeddings |
| tenzro/vit-base-patch16 | 350 MB | Image Classification |
## Pricing
| Feature | Free | Pro |
|---|---|---|
| Public models | Unlimited | Unlimited |
| Private models | 3 | Unlimited |
| Storage | 10 GB | 1 TB |
| Bandwidth | 100 GB/month | 10 TB/month |