How many product descriptions can I rewrite at once?

There is no hard limit in the script. The batching loop processes rows one at a time and saves progress periodically, so a CSV with thousands of rows works fine. Your real limits are your OpenAI usage budget and the rate limit on your account tier.

Which OpenAI model is best for rewriting product copy?

gpt-4o-mini is the best starting point because it is cheap and fast enough for bulk work while still writing fluent copy. Move up to gpt-4o only if you find the smaller model misses brand nuance or struggles with technical products.

Will the rewrites be the same every time I run the script?

Not exactly. Language models add small variations between runs. Set temperature low (around 0.3 to 0.5) for more consistent, predictable copy, or raise it toward 0.8 if you want more creative variety across products.

How do I keep my OpenAI API key safe in this script?

Store the key in a .env file and load it with python-dotenv; never paste it directly into your code. Add .env to your .gitignore so the key is never committed to version control or shared by accident.

What happens if the script crashes halfway through a large file?

The script writes a checkpoint CSV every few rows, so finished rewrites are saved to disk as it goes. When you re-run it, rows that already have a rewrite are skipped, so you only pay for and process the remaining descriptions.

Bulk-Rewrite Product Descriptions with Python

This guide shows you how to take a spreadsheet of product descriptions, rewrite every one of them with AI to match a target tone, length, and SEO style, and save the results back to a CSV — all in under fifteen minutes. If you run an online store with hundreds of products, you already know the pain: descriptions copied from a supplier, written by three different people over two years, or simply too thin to rank in search. Rewriting them by hand is a week of work. A short Python script can do the whole catalogue while you make coffee.

A CSV (short for "comma-separated values") is just a plain-text spreadsheet — the format every store platform, from Shopify to WooCommerce, can export and import. We will read one, send each row's description to a language model, and write the polished version into a new column.

Prerequisites

You only need a few things beyond a working Python setup. If Python itself is new to you, start with Create a Python Virtual Environment for AI so the packages below stay isolated from the rest of your machine. You will also need an OpenAI account with a funded API key; the broader Understanding LLM APIs section explains how keys and billing work if this is your first time.

Install the three packages this script uses. pandas handles the spreadsheet, openai talks to the model, and python-dotenv keeps your key out of your code.

pip install pandas openai python-dotenv

Create a file named .env in the same folder as your script and put your key inside it:

OPENAI_API_KEY=sk-your-real-key-here

Now add .env to your .gitignore file so the key never gets committed to version control or shared by accident — this single line saves a lot of regret:

echo ".env" >> .gitignore

Finally, you need an input file. The script assumes a CSV with at least a product_name column and a description column. A tiny example, products.csv, looks like this:

product_name,description
Stainless Travel Mug,keeps drinks hot. 16oz. dishwasher safe.
Bamboo Cutting Board,wood board for kitchen. medium size.
Wool Hiking Socks,warm socks for hiking trips good quality.
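The sample rows above contain no commas inside the descriptions, but real product copy often does. A minimal sketch, assuming you build or edit the file with pandas rather than by hand, of how such fields are quoted so they stay in one column. The product row here is made up for illustration.

```python
import io
import pandas as pd

# Hypothetical row whose description contains commas.
sample = pd.DataFrame(
    {
        "product_name": ["Ceramic Pour-Over Set"],
        "description": ["dripper, carafe, and two cups. hand wash only."],
    }
)

# to_csv wraps any field that contains a comma in double quotes.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Reading it back yields the same two columns and the full description.
round_trip = pd.read_csv(io.StringIO(csv_text))
print(list(round_trip.columns))
```

If you edit the CSV in a text editor instead, wrap any description that contains a comma in double quotes yourself.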

Step 1: Load your API key and the OpenAI client

Start a file called rewrite_descriptions.py. The first job is to load your key and create the OpenAI client — the object that sends requests to the model. Loading the key from .env means the key lives outside your code, exactly where it belongs.

import os
import time
import pandas as pd
from dotenv import load_dotenv
from openai import OpenAI
from openai import APIError, RateLimitError, APITimeoutError

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

MODEL = "gpt-4o-mini"

gpt-4o-mini is the workhorse here: cheap, fast, and more than capable of polishing product copy. We import three error types so the retry logic in Step 3 can react to the failures that actually happen in bulk jobs. RateLimitError and APITimeoutError are the common transient ones, and APIError is the base class they both inherit from, so it also catches any other API failure.
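A missing or misnamed .env file is a common first-run stumble. The sketch below is one optional way to check for the key explicitly and fail with a message that points at the likely cause. The helper name require_api_key is hypothetical and not part of the script or of any library.

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key from the environment or raise a readable error."""
    key = os.getenv(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Check that .env sits next to the script "
            "and that load_dotenv() ran before this call."
        )
    return key
```

With this in place, the client line could become client = OpenAI(api_key=require_api_key()).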

Step 2: Load and inspect your product CSV

Read the spreadsheet into a pandas DataFrame — think of a DataFrame as the spreadsheet held in memory, with rows and named columns you can loop over. Before processing anything, confirm the columns are what you expect. A two-second check here prevents rewriting the wrong field.

INPUT_FILE = "products.csv"
DESCRIPTION_COLUMN = "description"

df = pd.read_csv(INPUT_FILE)
print(f"Loaded {len(df)} products")
print("Columns found:", list(df.columns))

if DESCRIPTION_COLUMN not in df.columns:
    raise ValueError(
        f"Column '{DESCRIPTION_COLUMN}' not found. "
        f"Available columns: {list(df.columns)}"
    )

# Create the output column up front, filled with empty strings.
if "rewritten" not in df.columns:
    df["rewritten"] = ""

Creating the rewritten column now matters for one reason: it lets us skip rows that already have a result if the script is re-run after a crash. That is the foundation of the resumable batching you will add in Step 4. If your raw data is messy — missing values, stray HTML, odd encodings — clean it first with the techniques in Cleaning CSV Data with Pandas for AI.
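One detail matters whenever this column is written to CSV and read back: pandas loads empty cells as NaN, and str(NaN) is the non-empty text "nan", so a plain truthiness test would treat an unfinished row as done. A minimal sketch with made-up rows, showing the behaviour and one possible normalisation with fillna:

```python
import io
import pandas as pd

# Two made-up rows: one already rewritten, one still empty.
df_demo = pd.DataFrame(
    {
        "description": ["warm socks", "wood board"],
        "rewritten": ["Cozy wool socks.", ""],
    }
)

# Round-trip through CSV, as a checkpoint file would.
buffer = io.StringIO()
df_demo.to_csv(buffer, index=False)
reloaded = pd.read_csv(io.StringIO(buffer.getvalue()))

# The empty cell is now NaN, and its string form is not empty.
print(reloaded["rewritten"].isna().tolist())
print(repr(str(reloaded.loc[1, "rewritten"])))

# Normalise before testing which rows still need work.
reloaded["rewritten"] = reloaded["rewritten"].fillna("").astype(str)
pending = reloaded[reloaded["rewritten"].str.strip() == ""]
print(len(pending))
```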

Step 3: Write the rewrite function with retries

This is the core of the script. The function takes one product name and one rough description and returns a polished rewrite. The instructions live in the system prompt — the standing brief that tells the model who it is and how to behave. Spelling out tone, length, and SEO rules here is what turns a generic paraphrase into on-brand copy. If you want to go deeper on shaping output, see Write System Prompts that Control Output Format.

Bulk jobs hit transient errors, such as a brief rate limit or a timed-out request, that usually succeed on a second try. We wrap the call in a retry loop with exponential backoff, meaning each failed attempt waits longer than the last (2 seconds, then 4, then 8) before trying again. That spacing gives the API room to recover instead of hammering it.

SYSTEM_PROMPT = (
    "You are an expert e-commerce copywriter. Rewrite the product "
    "description to be persuasive, scannable, and SEO-friendly. "
    "Use a confident, friendly tone. Keep it between 40 and 70 words. "
    "Lead with the main benefit, then list two concrete features. "
    "Naturally include the product name once. Return only the rewritten "
    "description as plain text, with no labels, quotes, or headings."
)


def rewrite_one(product_name: str, description: str, max_retries: int = 4) -> str:
    """Rewrite one description, retrying failed API calls with backoff."""
    user_prompt = (
        f"Product name: {product_name}\n"
        f"Original description: {description}"
    )

    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=MODEL,
                messages=[
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": user_prompt},
                ],
                temperature=0.4,
                max_tokens=200,
            )
            # content can be None; fall back to an empty string.
            return (response.choices[0].message.content or "").strip()

        except (RateLimitError, APITimeoutError, APIError) as err:
            # Do not sleep after the final attempt; give up straight away.
            if attempt == max_retries - 1:
                break
            wait = 2 ** (attempt + 1)  # 2s, 4s, 8s
            print(f"  Attempt {attempt + 1} failed ({type(err).__name__}). "
                  f"Retrying in {wait}s...")
            time.sleep(wait)

    raise RuntimeError(f"Gave up rewriting '{product_name}' after {max_retries} tries")

The temperature=0.4 keeps rewrites consistent across your catalogue, and max_tokens=200 caps the length and cost of each reply. Both settings are described in the parameter quick reference below.
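The retry idea can also be looked at in isolation. The sketch below is a generic illustration, not the tutorial's function: call_with_backoff and its parameters are hypothetical names, and the sleep function is injectable so the demo runs instantly against a fake failing call instead of the real API.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 4,
    base_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, doubling the wait after each failure; re-raise on the last try."""
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of attempts, surface the original error
            sleep(base_seconds * 2 ** attempt)
    raise RuntimeError("max_retries must be at least 1")


# Demo: a fake call that fails twice, then succeeds.
calls = {"count": 0}
waits = []


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


# Record the waits in a list instead of actually sleeping.
result = call_with_backoff(flaky, retry_on=(TimeoutError,), sleep=waits.append)
print(result, waits)
```

Passing the sleep function in as a parameter is a design choice for testability; in a real run you would leave it at the default time.sleep.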

Step 4: Batch through every row and save checkpoints

Now loop over the DataFrame. Two ideas make this safe for large files. First, skip rows that already have a rewrite, so a re-run picks up exactly where a crash left off instead of paying to redo finished work. Second, save a checkpoint every few rows, so progress is written to disk as you go rather than only at the very end.

A short pause between requests helps keep you under the rate limit; how long it needs to be depends on your account tier. If you do hit limits often, the fix is the same approach taught in Fix the 429 Rate-Limit Error in Python.

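CHECKPOINT_FILE = "products_checkpoint.csv"
CHECKPOINT_EVERY = 10   # save to disk after this many rewrites
PAUSE_SECONDS = 0.5     # gentle spacing between requests

# Resume from the checkpoint if an earlier run left one behind.
# Delete the checkpoint file to start again from the input CSV.
if os.path.exists(CHECKPOINT_FILE):
    df = pd.read_csv(CHECKPOINT_FILE)
    print(f"Resuming from {CHECKPOINT_FILE}")

# Empty cells load from CSV as NaN; turn them back into empty strings
# so the skip test below does not mistake "nan" for a finished rewrite.
df["rewritten"] = df["rewritten"].fillna("").astype(str)

processed = 0
try:
    for index, row in df.iterrows():
        # Skip rows that already have a rewrite (resume after a crash).
        if row["rewritten"].strip():
            continue

        name = str(row.get("product_name", "this product"))
        original = str(row[DESCRIPTION_COLUMN])

        print(f"[{index + 1}/{len(df)}] Rewriting: {name}")
        df.at[index, "rewritten"] = rewrite_one(name, original)
        processed += 1

        if processed % CHECKPOINT_EVERY == 0:
            df.to_csv(CHECKPOINT_FILE, index=False)
            print(f"  Checkpoint saved ({processed} rewritten so far)")

        time.sleep(PAUSE_SECONDS)
finally:
    # Save whatever finished, even after Ctrl+C or a row that gave up.
    df.to_csv(CHECKPOINT_FILE, index=False)

print(f"Done. Rewrote {processed} descriptions this run.")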

Because finished rewrites are saved into the rewritten column and checkpointed to disk, you can stop the script at any time with Ctrl+C and run it again later — it will resume from the first unwritten row.
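If the script is interrupted while a checkpoint is being written, the file on disk could be left half-written. One possible hardening, not part of the script above, is to write to a temporary file and then swap it into place. The helper name save_checkpoint_atomic is hypothetical; the sketch assumes the temporary file and the checkpoint live in the same folder.

```python
import os
import tempfile
import pandas as pd


def save_checkpoint_atomic(df: pd.DataFrame, path: str) -> None:
    """Write df to a temp file in the same folder, then swap it into place."""
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder, suffix=".tmp")
    os.close(fd)  # pandas reopens the path itself
    try:
        df.to_csv(tmp_path, index=False)
        os.replace(tmp_path, path)  # single-step rename on the same filesystem
    except BaseException:
        # Clean up the partial temp file, then let the error propagate.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A call such as save_checkpoint_atomic(df, "products_checkpoint.csv") could stand in for a plain df.to_csv checkpoint write.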

Step 5: Write the finished results back to CSV

When the loop finishes, save the complete DataFrame to a clean output file. Keeping the original description column alongside the new rewritten column lets you compare the two before you import anything into your store.

OUTPUT_FILE = "products_rewritten.csv"
df.to_csv(OUTPUT_FILE, index=False)
print(f"Saved {len(df)} products to {OUTPUT_FILE}")

Open products_rewritten.csv in any spreadsheet app and spot-check a dozen rows. When you are happy, map the rewritten column to your platform's description field and import. That is the whole pipeline: read, rewrite, save.
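Part of the spot-check can be automated. The sketch below is a hypothetical helper (flag_rewrites is not part of the script) that flags rows whose rewrite falls outside the 40 to 70 word range from the system prompt or never mentions the product name. It assumes the product_name and rewritten columns used throughout this guide.

```python
import pandas as pd


def flag_rewrites(
    df: pd.DataFrame, min_words: int = 40, max_words: int = 70
) -> pd.DataFrame:
    """Return rows whose rewrite breaks the word limits or omits the product name."""
    checked = df.copy()
    text = checked["rewritten"].fillna("").astype(str)

    # Count whitespace-separated words in each rewrite.
    checked["word_count"] = text.str.split().str.len()

    # Case-insensitive check that the product name appears in the rewrite.
    checked["has_name"] = [
        str(name).lower() in body.lower()
        for name, body in zip(checked["product_name"], text)
    ]

    out_of_range = ~checked["word_count"].between(min_words, max_words)
    return checked[out_of_range | ~checked["has_name"]]
```

Running flag_rewrites(pd.read_csv("products_rewritten.csv")) gives a shortlist to review by hand; it does not judge whether the copy is any good.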

Parameter quick reference

These are the knobs you will actually turn. Everything else can stay at its default.

Parameter	Type	Default	Effect
`MODEL`	string	`gpt-4o-mini`	Which model rewrites the copy. Use `gpt-4o` for higher-stakes or technical products.
`temperature`	float	`0.4`	Creativity. Lower is more consistent and on-brand; higher adds variety between products.
`max_tokens`	int	`200`	Caps reply length and cost. A 70-word description is roughly 90 to 100 tokens, so the default leaves comfortable headroom.
`CHECKPOINT_EVERY`	int	`10`	How many rewrites between disk saves. Lower it for very long runs to lose less on a crash.
`PAUSE_SECONDS`	float	`0.5`	Delay between requests. Raise it if you hit rate limits on a low usage tier.
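The max_tokens row relies on a words-to-tokens conversion. A rough sketch, assuming the common rule of thumb of about 0.75 English words per token. This is an approximation, not an exact tokenizer count, and estimate_tokens is a hypothetical helper.

```python
import math


def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate for English text; real counts vary by tokenizer."""
    return math.ceil(word_count / words_per_token)


# Upper end of the 40 to 70 word target from the system prompt.
upper = estimate_tokens(70)
print(upper)
```

If you change the target length in the system prompt, rerun the estimate and keep max_tokens comfortably above it.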

Troubleshooting

1. ValueError: "Column 'description' not found" on startup. The script cannot find your description column. Check the list of available columns in the error message and update the DESCRIPTION_COLUMN variable to match the exact name in your file. Watch for capital letters or trailing spaces, which CSV exports love to add.

2. Every rewrite comes back wrapped in quotes or with a "Rewritten:" label. The model is being too helpful. Tighten the last line of SYSTEM_PROMPT ("Return only the rewritten description as plain text") and lower temperature to 0.3. As a backstop, add .strip('"') to the returned text.

3. The script keeps retrying and finally raises RuntimeError. Persistent failure usually means an authentication problem, not a transient one. The error name printed on each attempt tells you which: AuthenticationError points to the key. Confirm your .env key is correct and funded; the steps in Fix the 401 Unauthorized Error in OpenAI Python walk through the exact checks.

4. Rewrites are far too long or get cut off mid-sentence. A cut-off reply means max_tokens is too low for your target length; raise it to 250. If they are simply long, the model is ignoring the word count — restate the limit as a hard rule ("Never exceed 70 words") near the start of the system prompt, where it carries more weight.
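The backstops mentioned in items 1 and 2 can be sketched as two small helpers. Both names (clean_reply, normalize_columns) are hypothetical, and the label pattern is an assumption about what the model might prepend; adjust it to the prefixes you actually see.

```python
import re
import pandas as pd

# Assumed label pattern; extend it with whatever prefixes show up in your output.
LABEL_PATTERN = re.compile(r"^(rewritten|description)\s*:\s*", re.IGNORECASE)
QUOTES = '"\u201c\u201d'  # straight and curly double quotes


def clean_reply(text: str) -> str:
    """Strip whitespace, wrapping quotes, and a leading label from a reply."""
    cleaned = text.strip().strip(QUOTES).strip()
    cleaned = LABEL_PATTERN.sub("", cleaned)
    return cleaned.strip().strip(QUOTES).strip()


def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Trim stray spaces from header names (capitalisation is left alone)."""
    return df.rename(columns=lambda name: str(name).strip())
```

Note that clean_reply also removes a genuine trailing quote mark, such as an inch symbol at the very end of a description, so check a few rows before relying on it.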

When to use this vs. alternatives

Use this script when you own a real catalogue. Once you have more than a few dozen products, batching plus resumable checkpoints saves both hours and the cost of redoing work after a crash. It is the right tool for a one-time cleanup or a quarterly refresh of existing copy.
Reach for a chat UI instead for one or two products. If you only need to polish a handful of descriptions, pasting them into ChatGPT by hand is faster than wiring up a CSV. The script earns its keep through volume.
Build a generation flow, not a rewrite flow, for net-new copy. If you are writing descriptions from scratch rather than improving existing ones, the longer-form approach in Generate Blog Posts with the OpenAI API is a closer fit. Rewriting shines when there is an "original" to improve on.

Once your descriptions are sharp, the same read-loop-write pattern powers the rest of your marketing: tune the prompt and you can turn the same product data into newsletters, ad copy, or meta tags.

AI Copywriting Workflows — the main guide tying these copy-automation tasks together.
Generate Blog Posts with the OpenAI API — produce long-form articles from a single keyword.
Generate Email Newsletters with Python and AI — turn product and content data into ready-to-send emails.
Cleaning CSV Data with Pandas for AI — prep messy spreadsheets before you feed them to a model.
