Auto-Organize Downloads by Content Type and Hash in Python

Manually sorting downloaded files is tedious and error-prone, especially when filenames don’t tell the full story. With a few lines of Python, you can sort files into folders by MIME type and prefix each name with a hash of the file's contents. The result is an archive that is easier to search, where later changes to a file are detectable and duplicates are easy to spot.

The Problem: Manual File Sorting Is Inefficient

The traditional approach involves:

  • Guessing file types based on extension (which can be missing or wrong)
  • Manually renaming or moving files
  • Risking overwrites or leaving duplicate files scattered across folders
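To make the first point concrete, here is a small sketch of how extension-based guessing behaves in Python's standard library. The helper name guessed_type is hypothetical and used only for illustration; mimetypes.guess_type itself looks at the filename and never opens the file.

```python
import mimetypes


def guessed_type(name: str) -> str:
    """Return the MIME type guessed from a filename, or "unknown" if none."""
    # guess_type returns a (type, encoding) tuple; only the type is used here.
    return mimetypes.guess_type(name)[0] or "unknown"


# A correct extension gives a sensible answer.
print(guessed_type("report.pdf"))
# A missing extension gives no answer at all.
print(guessed_type("report"))
# A wrong extension gives a confident but wrong answer:
# an image saved as "photo.txt" is still reported as text.
print(guessed_type("photo.txt"))
```

If files regularly arrive with missing or misleading extensions, inspecting the first bytes of each file would be more reliable, though that goes beyond what the standard mimetypes module does.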

The Pattern: Type and Hash-Based Auto-Organization

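By combining pathlib.Path.glob, shutil.move, hashlib.md5, and mimetypes.guess_type, you can scan a directory, classify each file by the MIME type guessed from its name, and move it into a type folder under a new name prefixed with a hash of its contents. Note that mimetypes.guess_type inspects only the filename, not the file's bytes, and that MD5 is adequate for spotting accidental duplicates but not for security-sensitive integrity checks.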

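from pathlib import Path
import shutil, hashlib, mimetypes

for f in Path("downloads").glob("*"):
    if not f.is_file():
        continue  # skip subdirectories, which cannot be opened for hashing
    # Hash the file contents; the first 8 hex digits become a name prefix
    with open(f, "rb") as b:
        h = hashlib.md5(b.read()).hexdigest()
    # guess_type looks only at the filename; fall back to "unknown"
    kind = mimetypes.guess_type(f.name)[0] or "unknown"
    # Group by major type, e.g. "image" from "image/png"
    ext_folder = Path("organized") / kind.split('/')[0]
    ext_folder.mkdir(parents=True, exist_ok=True)
    shutil.move(str(f), ext_folder / f"{h[:8]}-{f.name}")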
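The snippet above reads each file fully into memory and does not act on duplicates. As one possible extension, the sketch below hashes files in chunks and leaves byte-identical copies in place instead of moving them again. The function names file_digest and organize, the chunk size, and the choice to skip rather than delete duplicates are all assumptions made for illustration, not part of the original pattern.

```python
from pathlib import Path
import hashlib
import mimetypes
import shutil


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    h = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def organize(src: Path, dest: Path) -> dict[str, Path]:
    """Move files from src into dest/<major type>/, skipping content duplicates.

    Returns a mapping of full digest -> new path for every file that was moved.
    """
    seen: dict[str, Path] = {}
    for f in sorted(src.iterdir()):
        if not f.is_file():
            continue  # leave subdirectories alone
        digest = file_digest(f)
        if digest in seen:
            continue  # same bytes already moved; leave the duplicate in place
        # Assumption carried over from the original: type is guessed from the name.
        kind = mimetypes.guess_type(f.name)[0] or "unknown"
        folder = dest / kind.split("/")[0]
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / f"{digest[:8]}-{f.name}"
        shutil.move(str(f), str(target))
        seen[digest] = target
    return seen
```

Calling organize(Path("downloads"), Path("organized")) would reproduce the folder layout of the short version, and any files still left in downloads afterwards are either subdirectories or duplicates of something already moved.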

How It Works

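  • Path("downloads").glob("*") yields every entry directly inside the downloads folder, including subfolders, without descending into them.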
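Because glob("*") returns subfolders as well as files, it can help to filter the results before hashing anything. The sketch below shows one way to do that; the helper name top_level_files is hypothetical and introduced only for this example.

```python
from pathlib import Path


def top_level_files(folder: Path) -> list[Path]:
    """Return the regular files directly inside folder, sorted by name."""
    # glob("*") is not recursive: files inside subfolders are not returned.
    return sorted(p for p in folder.glob("*") if p.is_file())
```

To include files in nested folders as well, Path.rglob("*") could be used in place of glob("*").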
