Python Script to Automate Email Sorting

This guide shows you how to read your inbox over IMAP, classify every unread message with an LLM (a large language model, the kind of AI that reads and writes text), and move or label each email by category, all in a single Python script you can run by hand or schedule. Manual inbox triage quietly eats hours from creators, marketers, and founders; by the end of this page you will have a script that handles it for you, and setup takes under fifteen minutes.

Plain keyword rules are a fine starting point, but they break the moment an email is phrased in a way you did not predict. A bill that says "your statement is ready" never contains the word "invoice", so a filter that only looks for "invoice" sails right past it. You can keep adding words to the list, but you will never cover every phrasing. An LLM reads the meaning of a message, so it can route an unexpected wording to your Invoices folder anyway. This guide keeps a fast keyword shortcut for the obvious cases and falls back to the model for everything else.

Prerequisites

This guide assumes you already have Python 3.10 or newer and a virtual environment ready. If not, start with Create a Python Virtual Environment for AI and come back. Beyond that, you need three things specific to this task:

  1. IMAP access turned on. IMAP (Internet Message Access Protocol) is the standard way programs read a mailbox. Enable it in your provider's settings — in Gmail it lives under Settings → Forwarding and POP/IMAP.
  2. An app password. With two-factor authentication on, most providers block your normal password from third-party apps. Generate a dedicated 16-character app password instead (Gmail: Google Account → Security → App passwords). It can be revoked any time without touching your main login.
  3. An LLM API key. This guide uses the openai SDK. If you are new to keys and pricing, read Understanding LLM APIs and pick a cheap, fast model.

The imaplib and email modules ship with Python, so the only packages to install are the OpenAI SDK and a helper for reading .env files:

pip install openai python-dotenv

Create a .env file next to your script to hold every secret:

IMAP_SERVER=imap.gmail.com
EMAIL_ADDRESS=you@gmail.com
APP_PASSWORD=your_16_char_app_password
OPENAI_API_KEY=sk-your-key-here

Add .env to your .gitignore immediately so these credentials never land in version control:

echo ".env" >> .gitignore
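If an entry is missing from .env, the script otherwise stops later with a bare KeyError from os.environ. As a minimal sketch, a small check can list what is absent before anything connects. The helper name missing_settings and the REQUIRED tuple are illustrative and not part of the script in this guide; the names inside REQUIRED are taken from the example .env file above.

```python
import os

# Names this guide expects in .env (taken from the example file above).
REQUIRED = ("IMAP_SERVER", "EMAIL_ADDRESS", "APP_PASSWORD", "OPENAI_API_KEY")


def missing_settings(env=None):
    """Return the required names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


# Demo with an explicit dict so the example does not depend on your real .env.
example = {"IMAP_SERVER": "imap.gmail.com", "EMAIL_ADDRESS": "you@gmail.com"}
print(missing_settings(example))  # ['APP_PASSWORD', 'OPENAI_API_KEY']
```

In the real script you would call missing_settings() with no argument right after load_dotenv() and exit with a readable message if the list is not empty.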

Step 1: Connect to your inbox over IMAP

First, load your secrets from the environment and open a secure connection. IMAP4_SSL encrypts the whole session, and select("inbox") tells the server which mailbox to work in. Searching for UNSEEN returns only the unread messages, so you never reprocess mail you have already sorted.

import imaplib
import os

from dotenv import load_dotenv

load_dotenv()

IMAP_SERVER = os.environ["IMAP_SERVER"]
EMAIL_ADDRESS = os.environ["EMAIL_ADDRESS"]
APP_PASSWORD = os.environ["APP_PASSWORD"]


def connect():
    mail = imaplib.IMAP4_SSL(IMAP_SERVER)
    mail.login(EMAIL_ADDRESS, APP_PASSWORD)
    mail.select("inbox")
    return mail


def unread_ids(mail):
    _status, messages = mail.search(None, "UNSEEN")
    return messages[0].split()

Each entry from unread_ids is a small numeric identifier the server uses to refer to one message. You will hand these IDs back to the server in the next steps to fetch, copy, and flag each email.
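To make the return shape concrete, here is a small sketch of what mail.search hands back and how split() turns it into individual IDs. The sample responses are made up for illustration, and parse_search_response is a hypothetical name that simply wraps the same expression unread_ids uses.

```python
def parse_search_response(messages):
    """Split the space-separated ID list that an IMAP SEARCH returns."""
    # imaplib returns a list whose first item is a bytes string such as b"3 7 12".
    # When nothing matches, that item is b"", which splits to an empty list.
    return messages[0].split()


sample = [b"3 7 12"]                     # hypothetical: three unread messages
print(parse_search_response(sample))     # [b'3', b'7', b'12']
print(parse_search_response([b""]))      # []
```

These IDs are message sequence numbers: they are only valid for the current session, and the server renumbers them whenever messages are expunged. That is why the full script expunges once, after the loop has finished, instead of after each email.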

Step 2: Pull the parts an LLM needs to read

You do not need the entire raw email to classify it — the subject and the first slice of the body are plenty, and keeping the snippet short keeps the API call cheap and fast. The email module parses the raw bytes, and decode_header handles subjects that arrive MIME-encoded (the format used for non-English characters), so accented or non-ASCII subjects come through cleanly.

import email
from email.header import decode_header


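def decode_subject(msg):
    # A subject can arrive as several encoded fragments, each with its own
    # charset; decode every fragment and join them.
    pieces = []
    for raw, charset in decode_header(msg.get("Subject", "")):
        if isinstance(raw, bytes):
            try:
                raw = raw.decode(charset or "utf-8", errors="ignore")
            except LookupError:  # charset name Python does not know
                raw = raw.decode("utf-8", errors="ignore")
        pieces.append(raw)
    return "".join(pieces)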


def body_snippet(msg, limit=600):
    if msg.is_multipart():
        for part in msg.walk():
            if part.get_content_type() == "text/plain":
                payload = part.get_payload(decode=True) or b""
                return payload.decode("utf-8", errors="ignore")[:limit]
        return ""
    payload = msg.get_payload(decode=True) or b""
    return payload.decode("utf-8", errors="ignore")[:limit]


def fetch_message(mail, eid):
    _status, msg_data = mail.fetch(eid, "(RFC822)")
    return email.message_from_bytes(msg_data[0][1])

is_multipart() matters because many emails carry both a plain-text and an HTML copy; walking the parts and grabbing the text/plain version gives the model clean text instead of a wall of HTML tags.
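You can see both behaviors without touching a mailbox. The sketch below builds a throwaway message with a non-ASCII subject and two body versions, serializes it to bytes the way a server would deliver it, and parses it back. It uses make_header from the standard library to join the decoded subject fragments, which is an alternative to handling the fragments by hand; the sample subject and body text are invented for the demo.

```python
from email import message_from_bytes
from email.header import decode_header, make_header
from email.message import EmailMessage

# Build a throwaway message: non-ASCII subject, plain-text and HTML bodies.
draft = EmailMessage()
draft["Subject"] = "Caf\u00e9 receipt"
draft.set_content("Thanks for your order.")
draft.add_alternative("<p>Thanks for your order.</p>", subtype="html")

raw = draft.as_bytes()             # stands in for the bytes mail.fetch returns
msg = message_from_bytes(raw)

# The Subject header travels as an RFC 2047 encoded word; decode it back.
subject = str(make_header(decode_header(msg["Subject"])))

# Walk the parts and keep only the text/plain version.
plain = ""
for part in msg.walk():
    if part.get_content_type() == "text/plain":
        charset = part.get_content_charset() or "utf-8"
        plain = part.get_payload(decode=True).decode(charset, errors="ignore")
        break

print(msg.is_multipart())
print(subject)
print(plain.strip())
```

Reading the charset from the part, as this sketch does, is a small refinement over always assuming UTF-8: older mail is sometimes sent in other encodings.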

Step 3: Classify each email with an LLM

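Now the interesting part. You give the model your list of category names and the email's subject plus snippet, and ask it to reply with exactly one category. The system message sets the rules, and temperature=0 keeps randomness to a minimum so the same email lands in the same folder on nearly every run (the API does not guarantee identical output). max_tokens=4 caps the length of the reply, which should be enough for the short names used here; raise it if you add longer category names, because a truncated answer will not match your list. If the model returns anything outside your list, you fall back to Other so an email is never lost.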

This guide leans on plain instruction following; for a deeper look at shaping model output, see Write System Prompts that Control Output Format.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CATEGORIES = ["Invoices", "Newsletters", "Clients", "Other"]

# Cheap keyword shortcut: skip the API call when the subject is obvious.
KEYWORD_RULES = {
    "Invoices": ["invoice", "receipt", "payment", "statement"],
    "Newsletters": ["newsletter", "digest", "unsubscribe"],
    "Clients": ["project", "contract", "meeting"],
}


def keyword_guess(subject):
    low = subject.lower()
    for category, words in KEYWORD_RULES.items():
        if any(word in low for word in words):
            return category
    return None


def classify(subject, snippet):
    shortcut = keyword_guess(subject)
    if shortcut:
        return shortcut

    options = ", ".join(CATEGORIES)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        max_tokens=4,
        messages=[
            {
                "role": "system",
                "content": (
                    "You sort emails into folders. Reply with exactly one "
                    f"of these category names and nothing else: {options}."
                ),
            },
            {
                "role": "user",
                "content": f"Subject: {subject}\n\nBody:\n{snippet}",
            },
        ],
    )
    answer = response.choices[0].message.content.strip()
    return answer if answer in CATEGORIES else "Other"

The keyword shortcut is worth keeping: it costs nothing, runs instantly, and handles the bulk of routine mail. The LLM only sees the messages that keywords cannot confidently place, which keeps your bill low while still catching the tricky ones.
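The exact-match check at the end of classify sends a reply like "invoices" or "Invoices." to Other even though the model clearly meant a valid category. One possible refinement, sketched here with a hypothetical helper named normalize_answer, is to forgive case and stray punctuation before falling back. This is an optional addition, not part of the script as written.

```python
CATEGORIES = ["Invoices", "Newsletters", "Clients", "Other"]


def normalize_answer(answer, categories=CATEGORIES, fallback="Other"):
    """Map a raw model reply onto a known category, ignoring case and punctuation."""
    # Trim whitespace first, then common wrapping characters such as quotes or a period.
    cleaned = (answer or "").strip().strip(".\"'` ").lower()
    for category in categories:
        if cleaned == category.lower():
            return category
    return fallback


print(normalize_answer("invoices."))     # Invoices
print(normalize_answer(" Clients\n"))    # Clients
print(normalize_answer("Spam"))          # Other
```

To use it, the last line of classify would return normalize_answer(answer) instead of comparing the raw string.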

Step 4: Move or label each message by category

With a category in hand, copy the email into the matching folder and mark it read so it is excluded from the next run. Core IMAP has no move command (the MOVE extension defined in RFC 6851 is optional and not every server offers it), so the portable pattern is copy, flag the original as deleted, then expunge to remove it from the inbox. Creating the folders up front means a brand-new category never triggers a "folder does not exist" error.

def ensure_folders(mail):
    for category in CATEGORIES:
        mail.create(category)  # harmless if the folder already exists


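def route(mail, eid, category):
    # imaplib returns a "NO" status on a failed copy instead of raising,
    # so check it before flagging the original for deletion.
    status, _data = mail.copy(eid, category)
    if status != "OK":
        return False  # leave the original in the inbox untouched
    mail.store(eid, "+FLAGS", "\\Seen")
    mail.store(eid, "+FLAGS", "\\Deleted")  # remove from inbox after copy
    return True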

If you would rather keep everything in the inbox and only tag it (Gmail treats labels as folders, so a copied message simply gains a label), drop the \\Deleted line and skip the expunge call below. That leaves the original in place with the new label attached.
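The label-only variant described above could look like the following sketch. label_only is a hypothetical name, and FakeMailbox is a stand-in used only so the example runs without a server; with a real connection you would pass the object returned by connect(). The sketch also checks the status that copy returns, since imaplib reports a failed copy as a NO status rather than raising an exception.

```python
def label_only(mail, eid, category):
    """Copy a message into `category` and mark it read, leaving it in the inbox."""
    status, _data = mail.copy(eid, category)
    if status != "OK":
        return False                      # copy failed: do not touch the original
    mail.store(eid, "+FLAGS", "\\Seen")   # no \Deleted flag, so nothing is removed
    return True


class FakeMailbox:
    """Stand-in for an imaplib connection that records the calls it receives."""

    def __init__(self, folders):
        self.folders = set(folders)
        self.calls = []

    def copy(self, eid, folder):
        self.calls.append(("copy", eid, folder))
        if folder in self.folders:
            return ("OK", [b"done"])
        return ("NO", [b"no such folder"])

    def store(self, eid, command, flags):
        self.calls.append(("store", eid, command, flags))
        return ("OK", [b"done"])


box = FakeMailbox(folders=["Invoices"])
print(label_only(box, b"3", "Invoices"))   # True
print(label_only(box, b"3", "Clients"))    # False
```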

Step 5: Schedule the script to run on its own

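Wire the pieces together into a main() function, run it once by hand to confirm folders fill correctly, then hand it to your operating system's scheduler. This is the same approach covered across Automating Repetitive Tasks with Python: do the work once, then let a schedule repeat it. On macOS or Linux, run crontab -e and add a line like the following, which runs the script every four hours and appends its output and errors to a log file: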

0 */4 * * * /path/to/.venv/bin/python /path/to/email_sorter.py >> /tmp/sorter.log 2>&1

On Windows, open Task Scheduler, create a basic task, point the program to your virtual environment's python.exe, and pass the full path to the script as the argument. Either way, log the output so you can see what the script did between runs.
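Redirected print output works, but timestamps make a log from a scheduled job much easier to read. A minimal sketch using the standard logging module follows. The logger name email_sorter and the helper build_logger are assumptions for illustration; the demo writes to an in-memory buffer so it can show the result, while a real run would leave stream unset.

```python
import io
import logging


def build_logger(stream=None):
    """Return a logger that prefixes each line with a timestamp and level."""
    logger = logging.getLogger("email_sorter")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()                    # avoid duplicate lines on re-use
    handler = logging.StreamHandler(stream)    # stream=None means stderr
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False                   # keep output out of the root logger
    return logger


buffer = io.StringIO()                         # capture output for the demo
log = build_logger(buffer)
log.info("Sorted into %s: %s", "Invoices", "Your statement is ready")
print(buffer.getvalue(), end="")
```

With stream left unset the handler writes to stderr, which the cron line above already captures through 2>&1.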

Key parameters

Parameter | Type | Default | Effect
model | str | gpt-4o-mini | The classifier model. Smaller models are cheaper and fast enough for one-word category answers.
temperature | float | 0 | Controls randomness. Keep at 0 so the same email is sorted the same way on nearly every run.
limit (argument of body_snippet) | int | 600 | How many characters of the body the model sees. Larger means more context but a slightly higher cost.

Troubleshooting

  1. imaplib.error: Authentication failed. Your provider rejected the login. With two-factor authentication on, your normal password will not work; generate an app password and put it in .env as APP_PASSWORD.
  2. NONEXISTENT or TRYCREATE response from mail.copy(). The target folder does not exist yet. imaplib reports this as a NO status in the value copy() returns rather than raising an exception, so it is easy to miss. Call ensure_folders(mail) once before the sorting loop so every category in CATEGORIES is created first.
  3. openai.AuthenticationError / 401. The API key is missing or wrong. Confirm OPENAI_API_KEY is set in .env and that load_dotenv() runs before you create the client. See Fix the 401 Unauthorized Error in OpenAI Python for the full checklist.
  4. openai.RateLimitError / 429. You are sending requests faster than your tier allows. Add a short time.sleep() between emails or lean harder on the keyword shortcut. The fixes in Fix the 429 Rate-Limit Error in Python apply directly here.
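A fixed time.sleep() between emails works, but waiting only when a call actually fails keeps normal runs fast. The helper below is a generic sketch of retrying with exponential backoff: call_with_backoff is a hypothetical name, retry_on is whichever exception class you want to retry (for this guide that would be openai.RateLimitError), and the sleep function is injectable so the demo finishes instantly. FakeRateLimit and flaky exist only for the demonstration.

```python
import time


def call_with_backoff(func, retry_on, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on `retry_on`, wait base_delay * 2**attempt and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise                          # out of retries: surface the error
            sleep(base_delay * 2 ** attempt)   # 1s, 2s, 4s, ...


# Demo: a fake call that fails twice with a fake error, then succeeds.
class FakeRateLimit(Exception):
    pass


state = {"calls": 0}
waits = []                                     # records delays instead of sleeping


def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise FakeRateLimit()
    return "Invoices"


result = call_with_backoff(flaky, FakeRateLimit, sleep=waits.append)
print(result, waits)   # Invoices [1.0, 2.0]
```

In the sorting loop this could wrap the classifier, for example call_with_backoff(lambda: classify(subject, snippet), openai.RateLimitError).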

Full worked example

This is every piece assembled into one runnable file. Save it as email_sorter.py, fill in your .env, and run python email_sorter.py.

import email
import imaplib
import os
from email.header import decode_header

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI()

IMAP_SERVER = os.environ["IMAP_SERVER"]
EMAIL_ADDRESS = os.environ["EMAIL_ADDRESS"]
APP_PASSWORD = os.environ["APP_PASSWORD"]

CATEGORIES = ["Invoices", "Newsletters", "Clients", "Other"]
KEYWORD_RULES = {
    "Invoices": ["invoice", "receipt", "payment", "statement"],
    "Newsletters": ["newsletter", "digest", "unsubscribe"],
    "Clients": ["project", "contract", "meeting"],
}


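def decode_subject(msg):
    # Decode every encoded fragment with its own charset, then join them.
    pieces = []
    for raw, charset in decode_header(msg.get("Subject", "")):
        if isinstance(raw, bytes):
            try:
                raw = raw.decode(charset or "utf-8", "ignore")
            except LookupError:  # charset name Python does not know
                raw = raw.decode("utf-8", "ignore")
        pieces.append(raw)
    return "".join(pieces)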


def body_snippet(msg, limit=600):
    if msg.is_multipart():
        for part in msg.walk():
            if part.get_content_type() == "text/plain":
                data = part.get_payload(decode=True) or b""
                return data.decode("utf-8", "ignore")[:limit]
        return ""
    data = msg.get_payload(decode=True) or b""
    return data.decode("utf-8", "ignore")[:limit]


def classify(subject, snippet):
    low = subject.lower()
    for category, words in KEYWORD_RULES.items():
        if any(word in low for word in words):
            return category  # fast, free keyword match

    options = ", ".join(CATEGORIES)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        max_tokens=4,
        messages=[
            {"role": "system", "content": (
                "You sort emails into folders. Reply with exactly one of "
                f"these category names and nothing else: {options}.")},
            {"role": "user", "content": f"Subject: {subject}\n\nBody:\n{snippet}"},
        ],
    )
    answer = response.choices[0].message.content.strip()
    return answer if answer in CATEGORIES else "Other"


def main():
    mail = imaplib.IMAP4_SSL(IMAP_SERVER)
    mail.login(EMAIL_ADDRESS, APP_PASSWORD)
    mail.select("inbox")
    for category in CATEGORIES:
        mail.create(category)  # safe even if it exists

    _status, messages = mail.search(None, "UNSEEN")
    for eid in messages[0].split():
        _status, data = mail.fetch(eid, "(RFC822)")
        msg = email.message_from_bytes(data[0][1])
        category = classify(decode_subject(msg), body_snippet(msg))
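        status, _ = mail.copy(eid, category)
        if status != "OK":
            # imaplib returns "NO" instead of raising; never delete an uncopied message.
            print(f"Copy to {category} failed, left in inbox: {decode_subject(msg)[:60]}")
            continue
        mail.store(eid, "+FLAGS", "\\Seen")
        mail.store(eid, "+FLAGS", "\\Deleted")
        print(f"Sorted into {category}: {decode_subject(msg)[:60]}")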

    mail.expunge()
    mail.logout()


if __name__ == "__main__":
    main()

When to use this vs. alternatives

  • Use this LLM script when your mail is varied and hard to capture with fixed words — freelance client threads, receipts from dozens of vendors, mixed-language newsletters. The model's reading of intent is what earns its keep.
  • Stick with pure keyword filters when your categories are simple and predictable (one billing address, a handful of known senders). The built-in rules in Gmail or Outlook are free, instant, and need no script at all.
  • Reach for a managed tool when you want a no-code interface and do not mind a subscription. Services like Zapier or your provider's native rules can route mail without Python, though they usually add a recurring cost, and purely rule-based routing does not read meaning the way an LLM does.
