Announcing our Chat Interface 🚀 - A central interface to chat with and compare LLM endpoints 🤖

The Best LLM on Every Prompt

Not Sure Which Model to Use? Automatically Use The Best Model for Your Task on Every Prompt ->
trusted by engineers at DeepMind, Amazon, Tesla, X, Salesforce, EZDubs, Oxford, MIT, Stanford, Imperial College, and Cambridge
It Starts with Your Query
api

All Models, All Providers, One API

Use our single API to query any model across all providers that support it.

import requests

url = "https://api.unify.ai/v0/inference"
headers = {
    "Authorization": "Bearer YOUR_UNIFY_KEY",
}

payload = {
    "model": "mixtral-8x7b-instruct-v0.1",
    "provider": "anyscale",
    "arguments": {
        "messages": [{
            "role": "user",
            "content": "YOUR_MESSAGE"
        }],
        # Sampling temperature is a small value, typically between 0 and 2.
        "temperature": 0.5,
        "max_tokens": 500,
        "stream": True,
    }
}

response = requests.post(
    url, json=payload, headers=headers, stream=True
)
Code snippets are available in Python, Node.js, C, PHP, Ruby, and more. Make your first request ->
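Since the request sets stream to True, tokens arrive incrementally. One way to consume the stream with requests (a minimal sketch: it prints raw chunks and assumes the endpoint emits newline-delimited data, which may differ from the actual wire format):

# Minimal sketch: iterate over the streamed response line by line.
# The exact chunk format is an assumption; adapt parsing to the real payloads.
for chunk in response.iter_lines():
    if chunk:
        print(chunk.decode("utf-8"), flush=True)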
router

Get the Most from Models with Expert Routing

Automatically send your queries to the most appropriate model and get the best output at the lowest cost.
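As a hedged illustration of what this could look like with the same request shape as above (the "router" model value here is an assumption for illustration, not a documented Unify identifier):

# Hypothetical: let the router pick the model and provider instead of
# pinning them. "router" is illustrative, not a confirmed API value.
payload = {
    "model": "router",
    "arguments": {
        "messages": [{"role": "user", "content": "YOUR_MESSAGE"}],
        "max_tokens": 500,
    },
}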

Unify Router Preview
performance

Constantly Achieve Peak Performance

Our dynamic router always sends your queries to the best-performing provider at any given time, for the metrics you care about.

Dynamic router performance improvements on various metrics, averaged across time and benchmarking regimes:

Model                        Tokens / Sec   TTFT       E2E Latency   ITL
Mixtral 8x7B Instruct v0.1   +102.56%       +406.66%   +138.37%      +206.85%
LLaMa2 70B Chat              +83.97%        +262.96%   +95.02%       +155.66%

Providers benchmarked: anyscale, replicate, together.ai, octoai, mistral-ai, and the Unify router.
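The idea behind dynamic routing is simple to sketch (illustrative only, not Unify's actual router: the provider names and readings below are made up):

# Illustrative sketch: given live benchmark readings per provider, send the
# query to whichever provider currently scores best on your chosen metric.
latest_metrics = {
    "anyscale":    {"tokens_per_sec": 90.0,  "ttft": 0.45},
    "together.ai": {"tokens_per_sec": 120.0, "ttft": 0.30},
    "octoai":      {"tokens_per_sec": 105.0, "ttft": 0.60},
}

def best_provider(metrics, key="tokens_per_sec"):
    # Higher is better for throughput; for latency metrics like TTFT,
    # you would take the minimum instead.
    return max(metrics, key=lambda provider: metrics[provider][key])

print(best_provider(latest_metrics))  # -> together.ai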
modular

Your Query, Your Needs, Custom Routing

Set up your own cost, latency, and output speed constraints. Define a custom quality metric. Personalize your router for your requirements.
Throughput Scatter Graph
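One way to think about constraint-based routing (a minimal sketch under assumed field names and thresholds; this is not Unify's API, just the concept):

# Hypothetical constraint config and quality metric for custom routing.
# All field names and numbers here are illustrative assumptions.
constraints = {
    "max_cost_per_token": 0.000001,  # USD
    "max_ttft_seconds": 0.5,
    "min_tokens_per_sec": 50,
}

def quality(candidate):
    # A custom quality metric: throughput per dollar (define your own here).
    return candidate["tokens_per_sec"] / candidate["cost_per_token"]

def route(candidates, constraints):
    # Keep only providers that satisfy every constraint, then pick the one
    # that maximizes the user-defined quality metric.
    feasible = [
        c for c in candidates
        if c["cost_per_token"] <= constraints["max_cost_per_token"]
        and c["ttft"] <= constraints["max_ttft_seconds"]
        and c["tokens_per_sec"] >= constraints["min_tokens_per_sec"]
    ]
    return max(feasible, key=quality) if feasible else None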

Frequently Asked Questions

Do I need to create an account with each provider?
Do you charge anything on top of the upstream providers?
How do you determine what the best model is?