Groq vs OpenRouter Free Tier

You want to practise calling a language model without spending money, and two names keep coming up: Groq and OpenRouter. Both have a free tier, both work from Python, and both claim to be beginner-friendly. But they solve different problems, and picking the wrong one for your task means slow experiments or running into a rate limit you did not expect. This guide compares them head to head and shows you the exact Python to call each, so you can choose in minutes and start building.

The good news for a beginner: you barely need to learn anything new. If you have already worked through Understanding LLM APIs, you know the openai SDK pattern. Both Groq and OpenRouter are OpenAI-compatible, meaning they accept the same request shape OpenAI uses, so the same SDK talks to either one. You change three things (the base URL, the key, and the model name) and the rest of your code stays as it is.

This is one guide inside the Understanding LLM APIs section, written for creators, marketers, founders, and students who can run a Python file but have never compared API providers before.

What each service is, in one line

Groq is a single provider that runs a curated set of open-weight models (like Llama and Mixtral) on custom hardware built for one thing: speed. You get a small menu of fast models and very low latency.

OpenRouter is a marketplace. One key and one web address give you access to hundreds of models from many providers — including a rotating set of genuinely free models marked with a :free suffix. You trade a little speed for enormous choice and the ability to switch models by changing one string.

The matrix below sums up the trade-offs you actually care about as a beginner.

Dimension          Groq                  OpenRouter
Speed / latency    Very fast             Depends on the model
Model choice       Small, curated menu   Hundreds of models
Free-tier limits   Daily + minute caps   Free models capped per day
Setup effort       One key, no card      One key, no card

Groq wins on raw speed and a simple menu; OpenRouter wins on the sheer number of models behind one key. Both start free with just an account.

Prerequisites

You only need a working Python 3.10 or newer setup and the openai SDK. If you have not installed it yet, the parent Understanding LLM APIs guide covers it; here is the short version:

python --version                       # must print 3.10 or higher
python -m venv .venv
source .venv/bin/activate               # Windows: .venv\Scripts\activate
pip install "openai>=1.40" "python-dotenv>=1.0"

Now sign up for both services and grab a key from each dashboard. Neither asks for a credit card to begin. Put both keys in a single .env file (a plain text file that holds your secrets so they never sit inside your code):

GROQ_API_KEY=gsk_your_real_groq_key_here
OPENROUTER_API_KEY=sk-or-your_real_openrouter_key_here

Immediately add .env to your .gitignore so a key never gets committed and shared:

echo ".env" >> .gitignore

A key pushed to a public repository can be found and abused within minutes, and the usage lands on your account — so this one line matters more than any code below.
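
Before making any request, it can help to confirm that both keys actually loaded. The sketch below is one way to do that, not a required step. The helper names missing_keys and mask_key are illustrative (they are not part of any SDK), and the code assumes you have already called load_dotenv() or exported the variables, so the keys are in the environment.

```python
import os

# Names of the two variables defined in the .env file above.
REQUIRED_KEYS = ("GROQ_API_KEY", "OPENROUTER_API_KEY")


def missing_keys(env=None) -> list[str]:
    """Return the names of required keys that are absent or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def mask_key(key: str) -> str:
    """Show only the first six characters so a full key never hits the terminal."""
    return key[:6] + "..." if len(key) > 6 else "***"


# Hypothetical usage: in a real script, call load_dotenv() before this point.
for name in missing_keys():
    print(f"Missing {name}: check the spelling in your .env file")
```

Printing only a masked key keeps the secret out of terminal history and screenshots while still letting you see which key is in use.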

The key idea: one SDK, two base URLs

Every OpenAI-compatible service has a base URL — the web address the SDK sends requests to. By default the openai SDK points at OpenAI's own address. To talk to Groq or OpenRouter, you override base_url when you create the client and pass that service's key. The model name changes too, because each service uses its own catalogue. Nothing else moves.

Here is the same first request written for each. Notice how little differs.

Calling Groq from Python

import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # loads GROQ_API_KEY from .env (which is in .gitignore)

client = OpenAI(
    api_key=os.getenv("GROQ_API_KEY"),
    base_url="https://api.groq.com/openai/v1",   # point the SDK at Groq
)

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",             # a fast model on Groq's menu
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain what a free API tier is in one sentence."},
    ],
)

print(response.choices[0].message.content)
print("Tokens used:", response.usage.total_tokens)

Run this and the reply usually arrives almost instantly — Groq's whole reason to exist is low latency. The messages list, the way you read response.choices[0].message.content, and the usage counts are identical to plain OpenAI. Only base_url, the key, and model are Groq-specific.

Calling OpenRouter from Python

import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # loads OPENROUTER_API_KEY from .env (which is in .gitignore)

client = OpenAI(
    api_key=os.getenv("OPENROUTER_API_KEY"),
    base_url="https://openrouter.ai/api/v1",     # point the SDK at OpenRouter
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct:free",  # the :free suffix = no charge
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain what an API marketplace is in one sentence."},
    ],
    extra_headers={                              # optional, used for OpenRouter rankings
        "HTTP-Referer": "https://your-site.example",
        "X-Title": "My Beginner App",
    },
)

print(response.choices[0].message.content)
print("Tokens used:", response.usage.total_tokens)

The only structural differences are the :free model name and the optional extra_headers, which OpenRouter uses to attribute traffic on its public rankings (you can omit them entirely). To switch to a different model, say mistralai/mistral-7b-instruct:free, you change one string. That is the marketplace's superpower.
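
To make the one-string switch concrete, here is a minimal sketch that sends the same prompt to several free models in a loop. The names FREE_MODELS, build_request, and compare_models are hypothetical helpers for illustration. The two model ids are the ones mentioned in this guide; because the free catalogue rotates, they may not be available when you run this. The client argument is assumed to be an OpenAI client created with OpenRouter's base_url, as in the example above.

```python
# Model ids taken from this guide; the free catalogue rotates, so treat them as examples.
FREE_MODELS = (
    "meta-llama/llama-3.3-70b-instruct:free",
    "mistralai/mistral-7b-instruct:free",
)


def build_request(model: str, prompt: str) -> dict:
    """Build the keyword arguments for one chat completion request."""
    return {
        "model": model,  # the only value that changes between runs
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": 100,  # keep replies short to stay inside free limits
    }


def compare_models(client, prompt: str, models=FREE_MODELS) -> dict:
    """Send the same prompt to each model and collect the replies by model id."""
    replies = {}
    for model in models:
        response = client.chat.completions.create(**build_request(model, prompt))
        replies[model] = response.choices[0].message.content
    return replies
```

Calling compare_models(client, "Define latency in one sentence.") would return a dictionary with one reply per model id, which makes side-by-side comparison easy.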

One function that talks to either service

Because the calling code is nearly identical, you can wrap both behind a single helper and flip between them with one argument. This is the pattern to keep once your experiments grow.

import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # both keys come from .env (which is in .gitignore)

# A small registry of where each provider lives and which model to use.
PROVIDERS = {
    "groq": {
        "base_url": "https://api.groq.com/openai/v1",
        "key_env": "GROQ_API_KEY",
        "model": "llama-3.3-70b-versatile",
    },
    "openrouter": {
        "base_url": "https://openrouter.ai/api/v1",
        "key_env": "OPENROUTER_API_KEY",
        "model": "meta-llama/llama-3.3-70b-instruct:free",
    },
}


def ask(prompt: str, provider: str = "groq") -> str:
    """Send one prompt to the chosen provider and return the reply text."""
    config = PROVIDERS[provider]
    client = OpenAI(
        api_key=os.getenv(config["key_env"]),
        base_url=config["base_url"],
    )
    response = client.chat.completions.create(
        model=config["model"],
        messages=[
            {"role": "system", "content": "You are a concise, helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.3,   # low = consistent answers
        max_tokens=200,    # cap the reply to stay inside free limits
    )
    usage = response.usage
    print(f"[{provider}] tokens: {usage.total_tokens}")  # keep usage visible
    return response.choices[0].message.content


if __name__ == "__main__":
    question = "Summarise the difference between Groq and OpenRouter in two sentences."
    print("--- Groq ---")
    print(ask(question, provider="groq"))
    print("\n--- OpenRouter ---")
    print(ask(question, provider="openrouter"))

Save the script as compare.py and run it with python compare.py. You get the same question answered by both services, with a token count for each. Switching providers is now a one-word change, which is exactly the flexibility the OpenAI-compatible format buys you.
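
Once switching is a one-word change, one possible next step is to fall back to the second provider when the first fails. The sketch below is an assumption about how you might structure that, not something either service prescribes. The name ask_with_fallback is hypothetical, and ask_fn stands for any function with the same signature as the ask helper above.

```python
def ask_with_fallback(prompt, ask_fn, providers=("groq", "openrouter"), retry_on=Exception):
    """Try each provider in order and return (provider, reply) from the first that works.

    ask_fn must accept (prompt, provider=...) like the ask() helper in this guide.
    retry_on is the exception type (or tuple of types) that triggers the fallback.
    """
    last_error = None
    for provider in providers:
        try:
            return provider, ask_fn(prompt, provider=provider)
        except retry_on as error:
            last_error = error  # remember why this provider failed, then move on
    raise RuntimeError(f"All providers failed: {providers}") from last_error
```

With the ask helper in scope, ask_with_fallback("Hello", ask) would try Groq first and OpenRouter second. Catching every Exception is deliberately broad for a sketch; in practice you would likely narrow retry_on to rate-limit and connection errors.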

Key configuration quick-reference

These are the only values that differ between the two providers. Everything else in your request stays the same.

Setting    Groq                              OpenRouter
base_url   https://api.groq.com/openai/v1    https://openrouter.ai/api/v1
api_key    GROQ_API_KEY (starts gsk_)        OPENROUTER_API_KEY (starts sk-or-)
model      e.g. llama-3.3-70b-versatile      e.g. meta-llama/llama-3.3-70b-instruct:free

How the free tiers actually limit you

Free does not mean unlimited. Knowing each cap stops a confusing failure mid-project.

  • Groq limits you by requests and tokens per minute and per day. Hit the per-minute cap and you get a 429 error; wait a moment and it clears. The daily cap resets each day. Limits vary by model, and faster models tend to have tighter ones.
  • OpenRouter caps its free models per day across your account, with a low per-minute ceiling. Adding a small credit balance can lift the daily ceiling on free models even though you are not paying for the calls themselves. Paid models bill per token with no such daily cap.

Either way, the symptom of going over is the same 429 response. The full recovery pattern — waiting and retrying with an increasing delay — lives in Fix the 429 Rate-Limit Error in Python, and it works unchanged against both services.
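
As a rough illustration of waiting and retrying with an increasing delay, here is a small generic helper. It is a sketch, not the full pattern from the linked guide. The name call_with_backoff and its parameters are hypothetical, and the delay simply doubles on each attempt.

```python
import time


def call_with_backoff(fn, retry_on, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay, then 2x, then 4x, and try again.

    retry_on is the exception type (or tuple of types) that should trigger a retry.
    sleep is injectable so the waiting can be replaced in tests.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s with the defaults
```

With the openai SDK you would pass its rate-limit exception, for example call_with_backoff(lambda: ask("Hello"), retry_on=openai.RateLimitError). The SDK client also retries some failures on its own through its max_retries setting, so treat this as an extra layer rather than a replacement.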

Troubleshooting

  1. AuthenticationError: Error code: 401 — The wrong key is reaching the wrong service. A Groq key (gsk_...) sent to OpenRouter's base_url, or vice versa, fails here. Confirm each client uses the matching key and address. The complete fix is in Fix the 401 Unauthorized Error in OpenAI Python.
  2. NotFoundError: Error code: 404 - model not found — The model name does not exist on that provider. Groq and OpenRouter use different catalogues, and on OpenRouter free models need the exact :free suffix. Copy the model id straight from the provider's model list.
  3. RateLimitError: Error code: 429 — You exceeded the free per-minute or daily cap. Slow down, retry after a short pause, or switch to the other provider for that run. See Fix the 429 Rate-Limit Error in Python.
  4. Empty or None reply (AttributeError: 'NoneType' object has no attribute 'content') — A free model occasionally returns nothing under heavy load. Check response.choices[0].finish_reason before reading the text, and retry once if it is empty rather than assuming a string is always there.
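
The defensive read described in item 4 can be sketched as a small helper. The name read_reply is hypothetical. It assumes only the response shape used throughout this guide (choices, message, content, finish_reason) and returns None for the text instead of raising when any of those is missing.

```python
def read_reply(response):
    """Return (text, finish_reason); text is None when the reply is empty or missing."""
    choices = getattr(response, "choices", None) or []
    if not choices:
        return None, None  # nothing came back at all
    choice = choices[0]
    message = getattr(choice, "message", None)
    # Treat both a missing message and an empty string as "no text".
    text = getattr(message, "content", None) or None
    return text, getattr(choice, "finish_reason", None)
```

A caller can then write text, reason = read_reply(response) and retry once when text is None, rather than assuming a string is always there.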

When to use which

  • Use Groq when latency matters. Live demos, a chatbot that should feel instant, or any loop where you call the model many times in a row — Groq's speed is the deciding factor.
  • Use OpenRouter when you want choice. Trying a model Groq does not host, comparing several models side by side, or keeping one key and one bill across many providers — that breadth is the whole point of a marketplace.
  • Use Groq to learn the basics fast, then OpenRouter to explore. Start on Groq because the menu is small and the replies are quick, then move to OpenRouter once you want to test models Groq lacks.
  • Use either as a free practice ground before paying. Both let you master the LLM API workflow at zero cost; once you need top-tier paid models, weigh them in OpenAI vs Anthropic API for Beginners.

For a wider tour of cost-free options beyond these two, see Best Free AI APIs for Beginners.
