With the best intentions in the world, commit messages are sometimes skipped or entered in haste: a brief short-term gain for long-term pain, with future-you (or worse, someone else) feeling rightly peeved with you. For personal repositories the temptation to quickly type “notes”, “comments” and so on can be even stronger, and once the commits are made you probably won’t think about them again until you need to (enter peeved future self).

This is not necessarily a simple thing to fix. Because a git commit hash is a hash of the entire commit object, changing any detail (including the commit message) changes the commit hash, and with it the hash of every descendant commit. So unlike some version control systems (e.g. Subversion), modifying history is destructive, and if a git repository is shared or distributed the following process is likely to be very complex and risky (as every clone/copy of the repository will need to be reset). This is similar to the problem of deleting branches that you have already pushed to a remote - but oh so much worse.
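To make the hashing point concrete, here is a minimal sketch of how git derives an object id: a SHA-1 over a short header plus the object body. The commit body below is illustrative only (the well-known empty tree, a made-up author and timestamp, no parent); it is not taken from the demo repository.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    # git hashes "<type> <size in bytes>" + NUL + body
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(message: str) -> bytes:
    # Illustrative commit object: empty tree, made-up identity, no parent.
    return (
        "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
        "author Example <example@example.com> 1700000000 +0000\n"
        "committer Example <example@example.com> 1700000000 +0000\n"
        "\n"
        f"{message}\n"
    ).encode()


before = git_object_id("commit", commit_body("wip"))
after = git_object_id("commit", commit_body("Add README with project title"))
print(before != after)  # True: only the message differs, yet the id changes
```

Because each child commit records its parent's id in a `parent` line, a changed id at one commit propagates to every commit that follows it.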

But for a personal or small-project repository this may still be a useful tool! So pushing on…

TL;DR

I’ve written a Python script that examines every commit in a git repository and uses the Claude Agent SDK to analyse and enhance commit messages based on the actual code changes. The tool includes safety features like a dry-run mode, explicit confirmation requirements, and the ability to resume interrupted sessions. Critically, because the final stage is destructive, keeping a backup and carefully considering the implications are imperative!

How It Works

The script operates in two phases:

  1. Analysis Phase: Examines each commit, determines if the message is useful, and generates improved messages
  2. Application Phase: Uses git-filter-repo to safely rewrite the commit history

Here’s the workflow:

# Phase 1: Generate improved messages
python regenerate-commit-messages.py

# Review / modify the proposed changes
editor commit-regeneration-log-TIMESTAMP.jsonl

# Phase 2: Preview what will be applied
python regenerate-commit-messages.py --apply commit-regeneration-log-TIMESTAMP.jsonl --dry-run

# CRITICAL: Create a filesystem backup before applying
tar -czf ../repo-backup-$(date +%Y%m%d_%H%M%S).tar.gz .

# Apply the changes
python regenerate-commit-messages.py --apply commit-regeneration-log-TIMESTAMP.jsonl

A Worked Example

Let’s walk through a real example using a sample repository with mixed commit quality.

Initial Repository State

Our demo repository has five commits with varying message quality:

$ git log --oneline
71132c9 temp
8e25b7b fix
6c18b86 Add JSON configuration file for application settings and version tracking
93e9b0c updates
f7413e3 wip

Notice that we have:

  • Three poor messages: “temp”, “fix”, “wip”
  • One generic message: “updates”
  • One good message: “Add JSON configuration file for application settings and version tracking”

Running the Analysis

First, let’s analyze the repository using test mode (which processes only the first N commits):

$ python regenerate-commit-messages.py --test 5
Git Commit Message Regenerator
============================================================
Output logged to: commit-regeneration-output-20251226_101914.log
============================================================

Analysis Mode: Generating improved commit messages...

Fetching commit history...
🧪 Test mode: Processing first 5 commits only
Found 5 commits to analyze

Log file: commit-regeneration-log-20251226_101914.jsonl

Processing 5 remaining commits...

Processing commit 1/5: f7413e32...
  → Generating new message...
  New: Add README with project title...

Processing commit 2/5: 93e9b0c3...
  → Generating new message...
  New: Add greeting function with main entry point...

Processing commit 3/5: 6c18b864...
  ✓ Message already useful, adding notes...
  New: Add JSON configuration file for application settings and ver...

Processing commit 4/5: 8e25b7bd...
  → Generating new message...
  New: Add pytest dependency to requirements.txt...

Processing commit 5/5: 71132c9b...
  → Generating new message...
  New: Add pytest test for greet function...


============================================================
Analysis Complete!
============================================================


Change Summary:
============================================================
Total commits: 5
  Regenerated: 4
  Enhanced: 1

Examining the Proposed Changes

The script creates a JSONL (JSON Lines) file with all proposed changes:

$ cat commit-regeneration-log-20251226_101914.jsonl
{"commit": "f7413e3291ec...", "original": "wip", "new": "Add README with project title", "action": "regenerated"}
{"commit": "93e9b0c3e512...", "original": "updates", "new": "Add greeting function with main entry point", "action": "regenerated"}
{"commit": "6c18b864aff3...", "original": "Add JSON configuration file for application settings and version tracking", "new": "Add JSON configuration file for application settings and version tracking\n\nClaude Notes: Add config.json with app name, version 1.0.0, and debug settings", "action": "enhanced"}
{"commit": "8e25b7bd788d...", "original": "fix", "new": "Add pytest dependency to requirements.txt", "action": "regenerated"}
{"commit": "71132c9befd7...", "original": "temp", "new": "Add pytest test for greet function", "action": "regenerated"}

Notice how the good commit message was “enhanced” with additional context in a “Claude Notes” section, while poor messages were completely “regenerated”.
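Since the log is plain JSONL, it is easy to inspect with a few lines of Python before applying anything. The sketch below is not part of the script; it assumes only the four fields shown above (`commit`, `original`, `new`, `action`), and the sample lines are shortened copies of the log entries.

```python
import json
from collections import Counter


def summarise_log(lines):
    """Count proposed changes per action in a JSONL log."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        counts[json.loads(line)["action"]] += 1
    return counts


sample = [
    '{"commit": "f7413e3291ec...", "original": "wip", '
    '"new": "Add README with project title", "action": "regenerated"}',
    '{"commit": "8e25b7bd788d...", "original": "fix", '
    '"new": "Add pytest dependency to requirements.txt", "action": "regenerated"}',
]
summary = summarise_log(sample)
print(dict(summary))
```

For a real log, pass an open file handle instead of the `sample` list.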

Previewing the Changes

Before applying changes, use dry-run mode to preview what will happen:

$ python regenerate-commit-messages.py --apply commit-regeneration-log-20251226_101914.jsonl --dry-run

🔍 DRY RUN MODE - No changes will be applied

Change Summary:
============================================================
Total commits: 5
  Regenerated: 4
  Enhanced: 1

Applying the Changes

Once you’re satisfied, apply the changes. The script will require explicit confirmation:

$ python regenerate-commit-messages.py --apply commit-regeneration-log-20251226_101914.jsonl

Performing safety checks...
============================================================
✓ git-filter-repo is installed
✓ Repository is clean (ignoring script-generated files)
ℹ️  No upstream branch configured
============================================================

============================================================
⚠️  CRITICAL WARNING: DESTRUCTIVE OPERATION
============================================================

This will PERMANENTLY REWRITE git history!
Built-in git backups DO NOT WORK - git-filter-repo rewrites ALL refs.

Before proceeding, you MUST create a filesystem backup:
  tar -czf ../repo-backup-$(date +%Y%m%d_%H%M%S).tar.gz .
  OR
  cp -r . ../repo-backup

Without a filesystem backup, recovery is IMPOSSIBLE.
============================================================

Type 'yes' to confirm you have backed up your repository: yes

Saving git remote and branch tracking configuration...
✓ Saved 1 remote(s): origin
✓ Saved tracking info for 1 branch(es): main

Applying changes...
This may take a while for large repositories...

Restoring git remote and branch tracking configuration...
✓ Restored 1 remote(s)
✓ Restored tracking info for 1 branch(es)

✅ Commit messages updated successfully!

⚠️  Note: Remote configuration restored. To push changes:
  git fetch origin                     # Restore remote tracking refs
  git push --force-with-lease          # Safe force-push

✓ Branch tracking restored - 'git pull' and 'git push' should work normally

To restore from your backup if needed:
  tar -xzf ../repo-backup-TIMESTAMP.tar.gz -C ../repo-restored

The Result

Now our git history looks much better - but note that the commit hashes have all changed. The previous hashes are now lost.

$ git log --oneline
75e5625 Add pytest test for greet function
e6539d4 Add pytest dependency to requirements.txt
a3c220c Add JSON configuration file for application settings and version tracking
09c3691 Add greeting function with main entry point
c17610e Add README with project title

And the enhanced commit includes the additional notes:

$ git log -1 a3c220c --format="%B"
Add JSON configuration file for application settings and version tracking

Claude Notes: Add config.json with app name, version 1.0.0, and debug settings

Safety Features

The tool includes several safety mechanisms to prevent accidental data loss:

Critical: Create Filesystem Backup First

Important: Git branches do NOT work as a backup. git-filter-repo rewrites ALL refs in the repository, including any backup branches you create, making them point to the new history instead of preserving the old one.

Before applying changes, create a manual backup:

# Option 1: Create a tarball backup
tar -czf ../repo-backup-$(date +%Y%m%d_%H%M%S).tar.gz .

# Option 2: Copy the entire directory
cp -r . ../repo-backup

Without a filesystem backup, recovery from mistakes is impossible. The script enforces this with an explicit confirmation requirement - you must type exactly “yes” (not just “y”) to proceed.
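If you would rather script the backup than type the tar command, the same idea can be sketched with the standard library. `backup_repo` is a hypothetical helper, not something the script provides; it refuses to write the archive inside the repository so the tarball cannot end up archiving itself.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def backup_repo(repo_dir: Path, dest_dir: Path) -> Path:
    """Write repo_dir (including .git) to a timestamped tarball in dest_dir."""
    repo_dir, dest_dir = repo_dir.resolve(), dest_dir.resolve()
    if dest_dir == repo_dir or repo_dir in dest_dir.parents:
        raise ValueError("backup destination must be outside the repository")
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    archive = dest_dir / f"repo-backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to the repository root
        tar.add(str(repo_dir), arcname=".")
    return archive
```

Calling `backup_repo(Path("."), Path(".."))` from the repository root roughly mirrors Option 1 above.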

Dry-Run Mode

Preview exactly what will change before committing to the rewrite:

python regenerate-commit-messages.py --apply logfile.jsonl --dry-run

This shows you the proposed changes without actually modifying your repository.

Automatic Remote Configuration Preservation

The script automatically saves and restores your git remote configuration and branch tracking settings. Before running git-filter-repo (which removes all remote configuration), the script:

  1. Saves all configured remotes and their URLs
  2. Saves branch tracking information (which branches track which remotes)
  3. Restores everything after the rewrite completes

This means you don’t need to manually reconfigure git remote or set up branch tracking after the operation. You only need to run git fetch origin to restore the remote tracking refs before pushing.
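The save/restore step can be sketched as follows. This is an assumption about one way to do it, not the script's actual code: it parses the output of `git remote -v` and re-adds each remote with `git remote add`. The function names are hypothetical, and branch tracking (the `branch.<name>.remote` and `branch.<name>.merge` config keys) is left out for brevity.

```python
import subprocess


def parse_remotes(remote_v_output: str) -> dict[str, str]:
    """Map remote name -> fetch URL from `git remote -v` output."""
    remotes = {}
    for line in remote_v_output.splitlines():
        if "\t" not in line or " (fetch)" not in line:
            continue  # skip push lines and anything unexpected
        name, rest = line.split("\t", 1)
        remotes[name] = rest.split(" (fetch)", 1)[0]
    return remotes


def save_remotes(repo: str = ".") -> dict[str, str]:
    out = subprocess.run(
        ["git", "-C", repo, "remote", "-v"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_remotes(out)


def restore_remotes(remotes: dict[str, str], repo: str = ".") -> None:
    for name, url in remotes.items():
        subprocess.run(["git", "-C", repo, "remote", "add", name, url], check=True)
```

Call `save_remotes()` before the rewrite and `restore_remotes()` afterwards; the remote tracking refs still need a `git fetch` to come back.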

Repository State Validation

The script checks for:

  • Uncommitted changes (warns if present)
  • Unpushed commits (confirms before proceeding)
  • git-filter-repo installation (required dependency)
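Two of these checks can be sketched with `git status --porcelain` and `shutil.which`. This is illustrative rather than the script's implementation: the ignored filename prefixes are inferred from the log file names shown earlier, and `dirty_paths` and `check_repository` are hypothetical names. The unpushed-commit check is omitted.

```python
import shutil
import subprocess

# Assumed prefixes of the files the script itself generates.
GENERATED_PREFIXES = ("commit-regeneration-log-", "commit-regeneration-output-")


def dirty_paths(porcelain: str, ignore_prefixes=GENERATED_PREFIXES) -> list[str]:
    """Paths reported by `git status --porcelain`, minus generated files."""
    paths = []
    for line in porcelain.splitlines():
        path = line[3:]  # porcelain v1: two status characters, a space, the path
        if not path.startswith(ignore_prefixes):
            paths.append(path)
    return paths


def check_repository(repo: str = ".") -> list[str]:
    if shutil.which("git-filter-repo") is None:
        raise RuntimeError("git-filter-repo not found on PATH")
    status = subprocess.run(
        ["git", "-C", repo, "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    return dirty_paths(status)  # an empty list means the repository is clean
```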

Incremental Logging

Both the analysis log (JSONL) and output log are written incrementally. If the process is interrupted (quota limits, network issues, etc.), you can resume where you left off:

python regenerate-commit-messages.py --continue commit-regeneration-log-TIMESTAMP.jsonl

Test Mode

Try the script on a subset of commits first:

python regenerate-commit-messages.py --test 10

This processes only the first 10 commits, letting you verify the results before running on your entire history.

Installation and Requirements

The script requires two dependencies:

  1. git-filter-repo: tool for rewriting git history

    pip install git-filter-repo
    
  2. claude_agent_sdk: Connects to your local Claude Code session

    pip install claude-agent-sdk
    

You also need an active Claude Code session running, as the script uses it to generate commit messages.

Advanced Features

Handling Rate Limits

If you hit quota or rate limits during processing, the --wait-and-retry flag automatically retries with exponential backoff:

python regenerate-commit-messages.py --wait-and-retry

Without this flag, the script aborts on quota errors (you can then resume with --continue).
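Exponential backoff itself is a small pattern. The sketch below shows the general shape with asyncio; `QuotaError` is a hypothetical stand-in for whatever error signals a rate limit, and the attempt count and delays are illustrative defaults rather than the script's real values.

```python
import asyncio


class QuotaError(Exception):
    """Hypothetical stand-in for a quota or rate-limit error."""


async def retry_with_backoff(call, max_attempts=5, base_delay=1.0,
                             sleep=asyncio.sleep):
    """Await call(), doubling the wait after each QuotaError."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except QuotaError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller decide what to do
            await sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

The `sleep` parameter exists only so the waiting can be swapped out in tests.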

Resuming Interrupted Sessions

If the script is interrupted, resume from where it left off:

python regenerate-commit-messages.py --continue commit-regeneration-log-TIMESTAMP.jsonl

The script reads the log file to determine which commits have already been processed and skips them.
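A minimal sketch of that resume logic, assuming the JSONL layout shown earlier (one object per line with a `commit` field). The function names are hypothetical. Appending one line per commit is what makes resuming cheap, and a half-written final line from an interrupted run is simply skipped.

```python
import json
from pathlib import Path


def append_entry(log_path: Path, entry: dict) -> None:
    # One JSON object per line, written as soon as a commit is processed.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def processed_commits(log_path: Path) -> set[str]:
    """Hashes already present in the log (empty if there is no log yet)."""
    done = set()
    if not log_path.exists():
        return done
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                done.add(json.loads(line)["commit"])
            except json.JSONDecodeError:
                continue  # truncated line from an interrupted write
    return done


def remaining(all_commits: list[str], log_path: Path) -> list[str]:
    done = processed_commits(log_path)
    return [c for c in all_commits if c not in done]
```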

Force Mode

Skip interactive confirmations (useful for automation):

python regenerate-commit-messages.py --apply logfile.jsonl --force

Use with caution, as this bypasses safety prompts.

After Rewriting History

Once you’ve rewritten commit messages, you’ll need to update your remote repository:

For Personal Repositories

Important: While the script automatically saves and restores your git remote configuration and branch tracking settings, git-filter-repo still removes remote tracking refs as a safety feature. You must restore them before pushing:

git fetch origin                # Restore remote tracking refs
git push --force-with-lease     # Safe force-push

After running git fetch origin, normal git operations like git pull and git push will work as expected. The --force-with-lease flag refuses to overwrite the remote branch if it has moved since your last fetch, which protects work that other contributors pushed in the meantime.

For Shared Repositories

If others have cloned the repository:

  1. Coordinate with collaborators before force-pushing
  2. After pushing, they’ll need to update their clones:
    git fetch origin && git reset --hard origin/main
    
  3. Any unpushed work will need to be rebased onto the new history

Recovery

If you need to undo the changes, restore from your filesystem backup:

# Option 1: Extract tarball to a new location
mkdir -p ../repo-restored    # tar -C requires the target directory to exist
tar -xzf ../repo-backup-TIMESTAMP.tar.gz -C ../repo-restored
cd ../repo-restored
git fetch origin             # Restore remote tracking refs
git push --force-with-lease  # Restore remote

# Option 2: Copy .git directory from backup
rm -rf .git
cp -r ../repo-backup/.git .
git fetch origin             # Restore remote tracking refs
git push --force-with-lease  # Restore remote

How Messages Are Classified

The script uses heuristics to determine if a commit message is “useful”:

Generic messages (always regenerated):

  • “wip”, “work in progress”
  • “fix”, “fixes”
  • “update”, “updates”
  • “changes”
  • “temp”
  • “misc”

Useful messages (enhanced with notes):

  • More than 20 characters
  • Contains spaces (multiple words)
  • Doesn’t match generic patterns

This classification ensures that thoughtful commit messages are preserved while unhelpful ones are improved.
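The heuristics above translate into a few lines of Python. This is a sketch of the rules as listed, not the script's code: it assumes generic messages are matched exactly and case-insensitively, and that anything neither generic nor "useful" (a short message such as "add tests") is also regenerated.

```python
GENERIC_MESSAGES = {
    "wip", "work in progress", "fix", "fixes",
    "update", "updates", "changes", "temp", "misc",
}


def is_useful(message: str) -> bool:
    """True if the message should be kept and enhanced rather than regenerated."""
    subject = message.strip()
    if subject.lower() in GENERIC_MESSAGES:
        return False
    # Longer than 20 characters and more than one word
    return len(subject) > 20 and " " in subject


print(is_useful("wip"))  # False
print(is_useful("Add JSON configuration file for application settings and version tracking"))  # True
```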

Implementation Details

The script follows a few deliberate design choices:

  • Uses asyncio for asynchronous API calls to Claude
  • Implements exponential backoff for rate limit handling
  • Validates messages to ensure they actually changed (catches API errors)
  • Logs all output to files for debugging and auditing
  • Uses git-filter-repo’s Python API for safe history rewriting
  • Automatically preserves git remote configuration and branch tracking (saves before, restores after)
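The core of the application phase is a lookup from original commit hash to new message. The sketch below shows only that lookup, under stated assumptions: it works in bytes because git-filter-repo's callbacks deal in bytestrings, the hash in the usage example is made up, and the wiring into git-filter-repo itself is deliberately not shown.

```python
import json


def load_message_map(lines) -> dict[bytes, bytes]:
    """Map original commit hash -> new message from JSONL log lines."""
    mapping = {}
    for line in lines:
        if line.strip():
            entry = json.loads(line)
            mapping[entry["commit"].encode()] = entry["new"].encode()
    return mapping


def rewrite_message(original_id: bytes, message: bytes,
                    mapping: dict[bytes, bytes]) -> bytes:
    """Return the replacement message, or the original if none is logged."""
    new = mapping.get(original_id)
    if new is None:
        return message
    # Commit messages conventionally end with a newline
    return new if new.endswith(b"\n") else new + b"\n"


fake_hash = "0123456789abcdef0123456789abcdef01234567"  # made-up example hash
log_line = json.dumps({"commit": fake_hash, "original": "wip",
                       "new": "Add README with project title",
                       "action": "regenerated"})
mapping = load_message_map([log_line])
print(rewrite_message(fake_hash.encode(), b"wip\n", mapping))
```

Commits that do not appear in the log keep their original message unchanged.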

The message generation prompt is designed to produce concise, imperative-mood commit messages:

prompt = f"""Analyze this git diff and generate a concise commit message (1-2 sentences max).

Original commit message: {original_message}

Diff:
{diff_preview}

Generate a clear, specific commit message that describes what changed and why (if apparent).
Use imperative mood (e.g., "Add feature" not "Added feature").
Be concise but informative. Do not include commit hash or metadata.
Return ONLY the commit message text, nothing else."""

Source Code

The script is released under the MIT License and is available at: /examples/regenerate-commit-messages.py

When to Use This Tool

This tool is particularly useful for:

  • Personal projects where you’ve been using quick commit messages during development
  • Before going public with a repository that has messy commit history
  • After a long coding session where you made many rapid commits
  • Cleaning up feature branches before merging to main

When NOT to Use This Tool

Avoid using this tool for:

  • Already-pushed shared history without coordinating with collaborators
  • Repositories with many contributors (unless everyone agrees)
  • Production repositories where history immutability is important
  • Signed commits (the signatures will be invalidated)

Reflections

Building this tool highlighted the tension between rapid development (where quick commits are useful) and maintainable history (where descriptive messages are essential). The ability to have both - work quickly in the moment, then clean up history later - feels like a useful middle ground.

The integration with Claude Code through the Agent SDK makes the tool practical for everyday use. There are no API costs to worry about, and the quality of generated messages is consistently good because Claude can see the actual code changes in context.

An important lesson learned: the initial version included automatic git backup branch creation, but this was fundamentally broken. git-filter-repo rewrites ALL refs in the repository, including backup branches, making them point to the new history instead of preserving the old one. The fix was to remove the broken git backup mechanism entirely and require explicit filesystem backups (tarballs or directory copies). This made the tool more honest about its requirements - if you’re rewriting history destructively, you need a real backup outside the git repository itself.

Closing Thoughts

Good commit messages are documentation that lives with your code. They help future maintainers (including future you) understand not just what changed, but why it changed. This tool makes it practical to have that documentation even when you’ve been moving too fast to write it in the moment.

If you’re interested in the implementation details or want to adapt this for your own workflow, check out the source code. The script is designed to be readable and modifiable - it’s just a few hundred lines of Python with extensive comments and safety checks.

Reference: regenerate-commit-messages.py - Git Commit Message Regenerator