AI/ML API Inference Pricing

AI/ML API Tokens offer the flexibility to precisely allocate resources where they're most needed, enhancing performance and cost efficiency across your AI applications.
Provider | Model | Context | Input/1k | Output/1k | Per call | Total
Meta | Llama 3.2 3B Instruct Turbo | 131K | $0.000000063 | $0.000000063 | $0.04324 | $0.04324
Meta | Llama 3.2 90B Vision Instruct Turbo | 131K | $0.00000126 | $0.00000126 | $0.04324 | $0.04324
Meta | Llama 3.2 11B Vision Instruct Turbo | 131K | $0.00000019 | $0.00000019 | $0.04324 | $0.04324
Alibaba Cloud | Qwen2 7B Instruct | 8K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Meta | Llama 3 70B Instruct Reference | 8K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Meta | Llama 3 8B Instruct Reference | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Snowflake | Snowflake Arctic Instruct | 4K | $0.00000252 | $0.00000252 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 2 1.5B Instruct | 128K | $0.000000525 | $0.000000525 | $0.04324 | $0.04324
OpenAI | OpenAI o1-mini | 128K | $0.00000315 | $0.0000126 | $0.04324 | $0.04324
OpenAI | OpenAI o1-preview | 128K | $0.0000157 | $0.0000157 | $0.04324 | $0.04324
OpenAI | GPT-4o-2024-05-13 | 128K | $0.00000525 | $0.00000525 | $0.04324 | $0.04324
OpenAI | GPT-4o-2024-08-06 | 128K | $0.000002625 | $0.0000105 | $0.04324 | $0.04324
Google | Gemini 1.0 Pro | 32K | $0.000000131 | $0.000000394 | $0.04324 | $0.04324
Google | Gemini 1.5 Pro | 2M | $0.000002625 | $0.000007875 | $0.04324 | $0.04324
Google | Gemini 1.5 Flash | 1M | $0.000000039 | $0.000000157 | $0.04324 | $0.04324
NousResearch | Hermes 2 Theta Llama-3 70B | 8K | $0.000000525 | $0.000000525 | $0.04324 | $0.04324
Gradient | Llama-3 70B Gradient Instruct 1048k | 1048K | $0.000000525 | $0.000000525 | $0.04324 | $0.04324
Meta | Llama 3 70B Instruct Lite | 8K | $0.000000567 | $0.000000567 | $0.04324 | $0.04324
Meta | Llama 3 8B Instruct Lite | 8K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Meta | Llama 3.1 70B Instruct Turbo | 128K | $0.000000924 | $0.000000924 | $0.04324 | $0.04324
Meta | Llama 3.1 8B Instruct Turbo | 128K | $0.000000189 | $0.000000189 | $0.04324 | $0.04324
Meta | Llama 3.1 (405B) Instruct Turbo | 4K | $0.00000525 | $0.00000525 | $0.04324 | $0.04324
OpenAI | Chat GPT 4o mini | 128K | $0.000000157 | $0.00000063 | $0.04324 | $0.04324
Together | Pythia-Chat-Base (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
University of Washington NLP | Guanaco (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
University of Washington NLP | Guanaco (65B) | 2K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
University of Washington NLP | Guanaco (33B) | 2K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
University of Washington NLP | Guanaco (13B) | 2K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Alibaba Cloud | Qwen Chat (14B) | 8K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
NousResearch | Nous Hermes LLaMA-2 (70B) | 4K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Mosaic ML | MPT-Chat (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Mosaic ML | MPT-Chat (30B) | 8K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
Google | Gemma 2 (9B) | 8K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
lmsys | Vicuna FastChat T5 (3B) | 512 | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
lmsys | Vicuna v1.5 16k (13B) | 16K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
BAIR | Koala (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
BAIR | Koala (13B) | 2K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Databricks | DBRX Instruct | 32K | $0.00000126 | $0.00000126 | $0.04324 | $0.04324
OpenAI | Chat GPT-3.5 Turbo 0125 | 16K | $0.000000525 | $0.0000015750 | $0.04324 | $0.04324
OpenAI | Chat GPT-3.5 Turbo 1106 | 16K | $0.00000105 | $0.0000021 | $0.04324 | $0.04324
OpenAI | Chat GPT-3.5 Turbo Instruct | 4K | $0.000001575 | $0.0000021 | $0.04324 | $0.04324
OpenAssistant | Open-Assistant StableLM SFT-7 (7B) | 4K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
OpenAssistant | Open-Assistant Pythia SFT-4 (12B) | 2K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Hugging Face | StarCoderChat Alpha (16B) | 8K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Disco Research | DiscoLM Mixtral 8x7b (46.7B) | 32K | $0.00000063 | $0.00000063 | $0.04324 | $0.04324
Databricks | Dolly v2 (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Databricks | Dolly v2 (3B) | 2K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Allen Institute for AI | OLMO TWIN-2T (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
lmsys | Vicuna v1.5 (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Anthropic | Claude 3.5 Sonnet | 200K | $0.00000315 | $0.00001575 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 2 Instruct (72B) | 32K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Mistral AI | Mistral (7B) Instruct v0.3 | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
OpenAI | Chat GPT-4o | 128K | $0.00000525 | $0.00001575 | $0.04324 | $0.04324
Anthropic | Claude 3 Haiku | 200K | $0.000000263 | $0.000001313 | $0.04324 | $0.04324
Anthropic | Claude 3 Sonnet | 200K | $0.00000315 | $0.00001575 | $0.04324 | $0.04324
Anthropic | Claude 3 Opus | 200K | $0.00001575 | $0.00007875 | $0.04324 | $0.04324
Mistral AI | Mixtral 8x22B Instruct | 64K | $0.00000126 | $0.00000126 | $0.04324 | $0.04324
Meta | LLama-3 Chat (8B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Meta | Llama-3 Chat (70B) | 8K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Allen Institute for AI | OLMo-7B-Instruct | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | GPT-NeoXT-Chat-Base-20B | 2K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Hugging Face | Zephyr 7B | 32K | $0.000000525 | $0.000000525 | $0.04324 | $0.04324
Databricks | Dolly v2 (12B) | 2K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Undi95 | Toppy M (7B) | 4K | $0.0000021 | $0.0000021 | $0.04324 | $0.04324
Undi95 | ReMM-SLERP-L2-13B | 4K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Garage-bAInd | Platypus2-70B-Instruct | 4K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
DeepSeek | Deepseek-LLM-67b-Chat | 4K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Cognitive Computations | Dolphin-2.5-Mixtral-8x7b | 32K | $0.00000063 | $0.00000063 | $0.04324 | $0.04324
Allen Institute for AI | OLMo-7B | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
OpenAI | Chat GPT 4 32k | 32K | $0.000063 | $0.000126 | $0.04324 | $0.04324
OpenAI | Chat GPT 4 Turbo | 128K | $0.0000105 | $0.0000315 | $0.04324 | $0.04324
OpenAI | Chat GPT 4 | 8K | $0.0000315 | $0.000063 | $0.04324 | $0.04324
OpenAI | Chat GPT 3.5 Turbo | 16K | $0.00000315 | $0.0000042 | $0.04324 | $0.04324
Gryphe | MythoMax-L2 (13B) | 4K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
TII | Falcon Instruct (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
TII | Falcon Instruct (40B) | 2K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
Stanford University | Alpaca (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | RedPajama-INCITE Chat (3B) | 2K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Together | RedPajama-INCITE Chat (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Alibaba Cloud | Qwen Chat (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
lmsys | Vicuna v1.5 (13B) | 4K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
WizardLM | WizardLM v1.2 (13B) | 4K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
OpenOrca | OpenOrca Mistral (7B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Teknium | OpenHermes-2-Mistral (7B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Upstage | Upstage SOLAR Instruct v1 (11B) | 4K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Teknium | OpenHermes-2.5-Mistral (7B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
NousResearch | Nous Capybara v1.9 (7B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Nexusflow | NexusRaven (13B) | 16K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
OpenChat | OpenChat 3.5 (7B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Meta | LLaMA-2 Chat (7B) | 4K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
NousResearch | Nous Hermes-2 Yi (34B) | 4K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
Meta | LLaMA-2 Chat (70B) | 4K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Meta | LLaMA-2 Chat (13B) | 4K | $0.000000231 | $0.000000231 | $0.04324 | $0.04324
NousResearch | Nous Hermes LLaMA-2 (7B) | 4K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | LLaMA-2-7B-32K-Instruct (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
NousResearch | Nous Hermes Llama-2 (13B) | 4K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Mistral AI | Mistral (7B) Instruct v0.1 | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Snorkel AI | Snorkel Mistral PairRM DPO (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
NousResearch | Nous Hermes 2 - Mixtral 8x7B-DPO | 32K | $0.00000063 | $0.00000063 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 1.5 Chat (0.5B) | 32K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
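
As a rough illustration of how the input and output rates above could combine into the cost of one request, here is a minimal Python sketch. It is not the platform's billing logic. The function name `estimate_cost` and the parameter `tokens_per_unit` are hypothetical; because the column headers read Input/1k and Output/1k, the sketch treats each rate as a price per 1,000 tokens by default, and the unit can be changed if the listed figures turn out to be quoted differently. The separate "Per call" figure is not included.

```python
from decimal import Decimal


def estimate_cost(input_tokens, output_tokens, input_rate, output_rate,
                  tokens_per_unit=1000):
    """Estimate the token cost of one request in dollars.

    Rates are passed as strings so Decimal keeps the small values exact.
    tokens_per_unit is an assumption: 1000 matches the "/1k" column headers.
    """
    unit = Decimal(tokens_per_unit)
    input_cost = Decimal(input_tokens) / unit * Decimal(input_rate)
    output_cost = Decimal(output_tokens) / unit * Decimal(output_rate)
    return input_cost + output_cost


# Rates copied from the GPT-4o-2024-08-06 row; the token counts are made up.
cost = estimate_cost(2000, 500, "0.000002625", "0.0000105")
print(f"Estimated token cost: ${cost}")
```

Passing `tokens_per_unit=1` gives the same estimate under the alternative reading that the rates are per token.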

Provider | Model | Context | Input/1k | Output/1k | Per call | Total
Replit | Replit-Code-v1 (3B) | 2K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Salesforce | CodeGen2 (16B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Salesforce | CodeGen2 (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
BigCode | StarCoder (16B) | 8K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Defog AI | SQLCoder (15B) | 8K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Phind | Phind Code LLaMA v2 (34B) | 16K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
WizardLM | WizardCoder Python v1.0 (34B) | 8K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
Meta | Code Llama Instruct (7B) | 16K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Meta | Code Llama Instruct (34B) | 16K | $0.000000815 | $0.000000815 | $0.04324 | $0.04324
Meta | Code Llama Python (34B) | 16K | $0.000000815 | $0.000000815 | $0.04324 | $0.04324
Meta | Code Llama Python (7B) | 16K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Meta | Code Llama Python (13B) | 16K | $0.000000231 | $0.000000231 | $0.04324 | $0.04324
Meta | Code Llama Instruct (13B) | 16K | $0.000000231 | $0.000000231 | $0.04324 | $0.04324
DeepSeek | Deepseek Coder Instruct (33B) | 16K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
Meta | Code Llama (70B) | 16K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Meta | Code Llama Instruct (70B) | 4K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Meta | Code Llama Python (70B) | 4K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324

Provider | Model | Context | 256x256 | 512x512 | 1024x1024
Black Forest Labs | FLUX 1.1 [pro] | 4K | - | - | $0.042
OpenAI | OpenAI DALL·E 2 | 4K | $0.0168 | $0.0189 | $0.021
Stability AI | Stable Diffusion 3 | 77 | - | - | $0.03675
Black Forest Labs | FLUX Realism LoRA | 4K | - | - | $0.03675
Black Forest Labs | FLUX.1 [schnell] | 4K | - | - | $0.00315
Black Forest Labs | FLUX.1 [dev] | 4K | - | - | $0.02625
Black Forest Labs | FLUX.1 [pro] | 4K | - | - | $0.0525
OpenAI | OpenAI DALL·E 3 | 4K | - | - | $0.042
Wavymulder | Analog Diffusion | 77 | - | $0.00105 | $0.0105
PromptHero | Openjourney v4 | 77 | - | $0.00105 | $0.0105
Together | Realistic Vision 3.0 | 77 | - | $0.00105 | $0.0105
Stability AI | Stable Diffusion 2.1 | 77 | - | $0.00105 | $0.0105
Stability AI | Stable Diffusion XL 1.0 | 77 | - | $0.00105 | $0.0105
Stability AI | Stable Diffusion 1.5 | 77 | - | $0.00105 | $0.0105
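
To show how per-image prices could be turned into a batch estimate, here is a small Python sketch. It is an illustration only: `IMAGE_PRICES` and `image_batch_cost` are hypothetical names, and the mapping of each price to a resolution is an assumption based on reading the listing above left to right (a model with a single price is taken to offer 1024x1024 only). Only three of the listed models are included.

```python
from decimal import Decimal

# Per-image prices in dollars, keyed by model and resolution.
# Assumption: prices are assigned to resolution columns in listing order.
IMAGE_PRICES = {
    "OpenAI DALL·E 2": {
        "256x256": Decimal("0.0168"),
        "512x512": Decimal("0.0189"),
        "1024x1024": Decimal("0.021"),
    },
    "Stable Diffusion XL 1.0": {
        "512x512": Decimal("0.00105"),
        "1024x1024": Decimal("0.0105"),
    },
    "FLUX.1 [schnell]": {
        "1024x1024": Decimal("0.00315"),
    },
}


def image_batch_cost(model, resolution, count):
    """Return the estimated cost in dollars of `count` images."""
    prices = IMAGE_PRICES[model]
    if resolution not in prices:
        # No price is listed for this model at this resolution.
        raise ValueError(f"{model} has no listed price for {resolution}")
    return prices[resolution] * count


print(image_batch_cost("FLUX.1 [schnell]", "1024x1024", 100))
```

Raising an error for an unlisted resolution, rather than falling back to another size, keeps the estimate from silently using a price the listing does not show.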

Provider | Model | Per / min | Total
Deepgram | Aura | $0.01575 | $0.04324
Deepgram | Deepgram Nova-2 | $0.006195 | $0.04324
OpenAI | Whisper | $0.003675 | $0.04324
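
The per-minute rates above can be applied to an audio duration as in the following Python sketch. It assumes billing is proportional to duration with no rounding up to whole minutes, which the listing does not confirm; `PER_MINUTE_RATES` and `audio_cost` are hypothetical names used only for this example.

```python
from decimal import Decimal

# Per-minute rates in dollars, copied from the listing above.
PER_MINUTE_RATES = {
    "Aura": Decimal("0.01575"),
    "Deepgram Nova-2": Decimal("0.006195"),
    "Whisper": Decimal("0.003675"),
}


def audio_cost(model, seconds):
    """Estimate the cost in dollars of processing `seconds` of audio.

    Assumption: cost scales linearly with duration (no minimum charge,
    no rounding to whole minutes).
    """
    minutes = Decimal(seconds) / Decimal(60)
    return PER_MINUTE_RATES[model] * minutes


# A ten-minute recording transcribed with Whisper.
print(audio_cost("Whisper", 600))
```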

Provider | Model | Context | Per generation | Total
Luma AI | Luma | 512 | $0.2625 | $0.04324

Provider | Model | Per generation | Total
Suno AI | Suno AI | $0.07875 | $0.04324

Provider | Model | Context | Input/1k | Per call | Total
Google | Textembedding-gecko@001 | 3K | $0.000000026 | $0.04324 | $0.04324
Google | Textembedding-gecko@003 | 2K | $0.000000026 | $0.04324 | $0.04324
Google | Textembedding-gecko-multilingual@001 | 2K | $0.000000026 | $0.04324 | $0.04324
Google | Text-multilingual-embedding-002 | 2K | $0.000000026 | $0.04324 | $0.04324
Voyage AI | Voyage Large 2 Instruct | 16K | $0.000000126 | $0.04324 | $0.04324
OpenAI | Text-embedding-ada-002 | 8K | $0.000000105 | $0.04324 | $0.04324
OpenAI | Text-embedding-3-large | 8K | $0.000000136 | $0.04324 | $0.04324
OpenAI | Text-embedding-3-small | 8K | $0.000000021 | $0.04324 | $0.04324
Google | Bert Base Uncased | 512 | $0.000000011 | $0.04324 | $0.04324
Sentence Transformers | Sentence-BERT | 512 | $0.000000011 | $0.04324 | $0.04324
BAAI | BAAI-Bge-Base-1p5 | 512 | $0.000000011 | $0.04324 | $0.04324
Together | M2-BERT-Retrieval-2K | 2K | $0.000000011 | $0.04324 | $0.04324
BAAI | BAAI-Bge-Large-1p5 | 512 | $0.000000011 | $0.04324 | $0.04324
WhereIsAI | UAE-Large-V1 | 512 | $0.000000011 | $0.04324 | $0.04324
Together | M2-BERT-Retrieval-32k | 32K | $0.000000011 | $0.04324 | $0.04324
Together | M2-BERT-Retrieval-8k | 8K | $0.000000011 | $0.04324 | $0.04324

Provider | Model | Context | Input/1k | Output/1k | Per call | Total
Meta | Llama-3 (8B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Meta | LLama-3 (70B) | 8K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Stability AI | StableLM Base Alpha 3B | 4K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Google | FLAN T5 XL (3B) | 512 | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Eleuther AI | GPT Neox 20B | 2K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Mistral AI | Mixtral 8x22B | 64K | $0.00000126 | $0.00000126 | $0.04324 | $0.04324
TII | Falcon (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | GPT-JT-Moderation (6B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
TII | Falcon (40B) | 2K | $0.00000084 | $0.00000084 | $0.04324 | $0.04324
Together | RedPajama-INCITE (3B) | 2K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Together | RedPajama-INCITE Instruct (3B) | 2K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Alibaba Cloud | Qwen (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | RedPajama-INCITE (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | RedPajama-INCITE Instruct (7B) | 2K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
01.AI | 01-ai Yi Base (6B) | 4K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Meta | LLaMA-2 (13B) | 4K | $0.000000231 | $0.000000231 | $0.04324 | $0.04324
Meta | LLaMA-2 (70B) | 4K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Together | LLaMA-2-32K (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Mistral AI | Mistral (7B) v0.1 | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | StripedHyena Hessian (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 1.5 (0.5B) | 32K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 1.5 (1.8B) | 32K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 1.5 (4B) | 32K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 1.5 (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 1.5 (14B) | 32K | $0.000000315 | $0.000000315 | $0.04324 | $0.04324
Alibaba Cloud | Qwen 1.5 (72B) | 32K | $0.000000945 | $0.000000945 | $0.04324 | $0.04324
Microsoft | Microsoft Phi-2 | 2K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Mistral AI | Mixtral-8x7B v0.1 | 32K | $0.00000063 | $0.00000063 | $0.04324 | $0.04324
Google | Gemma (2B) | 8K | $0.000000105 | $0.000000105 | $0.04324 | $0.04324
Google | Gemma (7B) | 8K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Together | StripedHyena Nous (7B) | 32K | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Mistral AI | Mixtral 7B | - | $0.00000021 | $0.00000021 | $0.04324 | $0.04324
Meta | LLaMA-2 (7B) | 4K | - | - | $0.04324 | $0.04324

Provider | Model | Per generation | Total
Stability AI | Stable TripoSR 3D | $0.05 | $0.04324

Frequently asked questions

What is a token?

Does using the playground deduct from my token allocation?

Which AI model should I use?

How to manage my subscription

How to add my model to the API?

How to upgrade or downgrade my plan

The Best Growth Choice for Enterprise
