cerebras

cerebras/llama3.1-8b

Llama 3.1 8B model optimized for fast inference on Cerebras hardware. Supports up to 8,192 tokens context length.

Learn more

Provider:

cerebras

Model type:

chat

Location:

US

Context Window

32000

Intelligence Rating

Speed Rating

Cost Efficiency Rating

Pricing

$

0.1

Input tokens per million

$

0.1

Output tokens per million

Features

Tool Calling

Supported

JSON Mode

Supported

Create an account and start building today.

Book a demo

Explore docs

Create an account and start building today.

Book a demo

Explore docs