
cerebras/llama3.1-8b
Llama 3.1 8B model optimized for fast inference on Cerebras hardware. Supports a context length of up to 8,192 tokens.
Provider: cerebras
Model type: chat
Location: US
Context Window: 32000
Pricing
Input tokens: $0.1 per million
Output tokens: $0.1 per million
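As a rough illustration of how the per-million-token prices listed above translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost` and the constant names are hypothetical and not part of any provider API. The sketch assumes simple linear billing on input and output token counts, with no minimums, rounding, or discounts.

```python
# Prices in USD per one million tokens, taken from the Pricing entries above.
INPUT_PRICE_PER_MILLION = 0.1
OUTPUT_PRICE_PER_MILLION = 0.1


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with token counts; actual billing may differ.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens would cost roughly $0.00025.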
Features
Tool Calling: Supported
JSON Mode: Supported
