
cerebras/llama-3.3-70b
Llama 3.3 70B model optimized for fast inference on Cerebras hardware. Supports a context length of up to 128,000 tokens.
Provider: cerebras
Model type: chat
Location: US
Context window: 128,000 tokens
Pricing
Input: $0.85 per million tokens
Output: $1.20 per million tokens
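The listed rates make per-request cost easy to estimate. A minimal sketch, assuming the rates above apply linearly per token (the function name and the example token counts are illustrative, not from this listing):

```python
# Hypothetical cost estimator for cerebras/llama-3.3-70b, using the listed
# rates: $0.85 per million input tokens, $1.20 per million output tokens.

INPUT_RATE = 0.85   # USD per 1M input tokens
OUTPUT_RATE = 1.20  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion.
print(f"${estimate_cost(2000, 500):.6f}")  # → $0.002300
```

At these rates a full 128,000-token prompt costs about $0.11 in input tokens, so long-context use stays inexpensive.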
Features
Tool Calling: supported
JSON Mode: supported
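The two features can be combined in a single chat request. A sketch of the request body, assuming an OpenAI-compatible chat-completions schema (the schema, the `get_weather` tool, and the message content are assumptions for illustration, not taken from this listing):

```python
import json

# Hypothetical chat request for cerebras/llama-3.3-70b exercising both
# listed features: tool calling and JSON mode.
request_body = {
    "model": "cerebras/llama-3.3-70b",
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"}
    ],
    # Tool calling: declare a function the model is allowed to invoke.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool name
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # JSON mode: constrain the reply to valid JSON.
    "response_format": {"type": "json_object"},
}

print(json.dumps(request_body, indent=2))
```

When JSON mode is enabled, the prompt itself should also instruct the model to answer in JSON; the `response_format` field only enforces syntactic validity.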
