Documentation Index

Fetch the complete documentation index at: https://docs.profclaw.ai/llms.txt

Use this file to discover all available pages before exploring further.

Groq’s Language Processing Units (LPUs) deliver some of the fastest inference available. Llama 3.3 70B runs at hundreds of tokens per second, making it well suited to real-time chat and low-latency agentic workflows.

Supported Models

| Model | ID | Context | Max Output | Tools | Input $/1M | Output $/1M |
|---|---|---|---|---|---|---|
| Llama 3.3 70B | llama-3.3-70b-versatile | 128K | 32K | Yes | $0.59 | $0.79 |
| Llama 3.1 8B Instant | llama-3.1-8b-instant | 128K | 8K | Yes | $0.05 | $0.08 |
| Mixtral 8x7B | mixtral-8x7b-32768 | 32K | 8K | Yes | $0.24 | $0.24 |
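The per-token prices in the table translate to request costs as follows. This is a minimal sketch: the prices are copied from the table above and may change, and the helper name is illustrative.

```python
# Pricing from the table above, as (input_price, output_price) in USD per 1M tokens.
# Prices may change; check the docs before relying on them.
PRICES = {
    "llama-3.3-70b-versatile": (0.59, 0.79),
    "llama-3.1-8b-instant": (0.05, 0.08),
    "mixtral-8x7b-32768": (0.24, 0.24),
}

def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request for the given token counts."""
    in_price, out_price = PRICES[model_id]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# e.g. 10K input + 1K output tokens on Llama 3.3 70B:
print(round(estimate_cost("llama-3.3-70b-versatile", 10_000, 1_000), 6))  # 0.00669
```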

Setup

1. Get an API key
   Sign up at console.groq.com. Free tier available.

2. Set the environment variable

   export GROQ_API_KEY=gsk_...

3. Verify

   profclaw doctor --provider groq

Environment Variables

GROQ_API_KEY (string, required)
Your Groq API key. Format: gsk_...

Configuration Example

GROQ_API_KEY=gsk_...

Model Aliases

| Alias | Model |
|---|---|
| groq | llama-3.3-70b-versatile |
| groq-fast | llama-3.1-8b-instant |
| groq-mixtral | mixtral-8x7b-32768 |
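The alias table amounts to a simple lookup. A sketch of that mapping follows; note the pass-through behavior for unknown names is an assumption for illustration, not documented profclaw behavior.

```python
# Alias table from the docs above.
ALIASES = {
    "groq": "llama-3.3-70b-versatile",
    "groq-fast": "llama-3.1-8b-instant",
    "groq-mixtral": "mixtral-8x7b-32768",
}

def resolve_model(name: str) -> str:
    """Map an alias to its model ID."""
    # Assumption: names that are not aliases pass through as literal model IDs.
    return ALIASES.get(name, name)

print(resolve_model("groq-fast"))  # llama-3.1-8b-instant
```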

Usage Examples

# Fast general purpose
profclaw chat --model groq "Explain this error message"

# Fastest (8B model)
profclaw chat --model groq-fast "One-line summary of this PR"

Notes

  • Groq is ranked 5th in auto-selection priority, after Anthropic, OpenAI, Azure, and Google.
  • llama-3.1-8b-instant is one of the cheapest available models at $0.05/1M input tokens.
  • Groq offers a free tier with daily rate limits.
  • The API is OpenAI-compatible; endpoint: https://api.groq.com/openai/v1
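Because the endpoint speaks the OpenAI chat-completions protocol, a request body can be assembled as shown below. This is a minimal sketch that only constructs the JSON payload rather than sending it; the prompt is illustrative, and the model ID and base URL come from the tables above.

```python
import json

# Groq's OpenAI-compatible base URL, from the note above.
GROQ_BASE_URL = "https://api.groq.com/openai/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build the JSON body for POST {GROQ_BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("llama-3.3-70b-versatile", "Explain this error message")
print(json.dumps(payload, indent=2))
```

Any OpenAI-compatible client can target this endpoint by pointing its base URL at `GROQ_BASE_URL` and authenticating with `GROQ_API_KEY`.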