cerebras

cerebras/llama-3.3-70b

Llama 3.3 70B model optimized for fast inference on Cerebras hardware. Supports a context length of up to 128,000 tokens.

Provider: cerebras

Model type: chat

Location: US

Context window: 128,000 tokens


Pricing

Input: $0.85 per million tokens

Output: $1.20 per million tokens
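As a quick sanity check on the rates above, a short Python sketch (the helper name `cost_usd` is ours, not part of any SDK) estimates the cost of a single request from its token counts:

```python
def cost_usd(input_tokens: int, output_tokens: int,
             input_rate: float = 0.85, output_rate: float = 1.20) -> float:
    """Estimate request cost from the per-million-token rates listed above."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# e.g. 1M input tokens and 500k output tokens:
print(f"${cost_usd(1_000_000, 500_000):.2f}")  # → $1.45
```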

Features

Tool Calling: Supported

JSON Mode: Supported
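Since the listing advertises JSON mode, a chat request would typically include a `response_format` field alongside the messages. The sketch below only builds a request body in the common OpenAI-style chat-completions shape; the exact field names and endpoint are assumptions here, so check the provider's API reference before relying on them:

```python
import json

def build_chat_request(prompt: str, json_mode: bool = True) -> dict:
    # Request body in the OpenAI-style chat-completions shape
    # (an assumption; verify against the provider's API docs).
    body = {
        "model": "llama-3.3-70b",
        "messages": [{"role": "user", "content": prompt}],
    }
    if json_mode:
        # JSON mode: ask the server to constrain output to valid JSON.
        body["response_format"] = {"type": "json_object"}
    return body

req = build_chat_request("List three facts about llamas as JSON.")
print(json.dumps(req, indent=2))
```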

Create an account and start building today.