
Batch-Organizing Files with Python pathlib: Archiving by Extension and Renaming on Conflict

Source: 17golang (original)

Published: 2026-06-13 02:56:38

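Downloads folders, exported-report folders, and screenshot folders tend to turn into a pile over time: PDFs, images, archives, Excel files, and temporary files all mixed together. Sorting them by hand is not hard, but doing it repeatedly is tedious, and it is easy to drag a file to the wrong place.

Python's pathlib is a good fit for this kind of small tool. It treats paths as objects, which is clearer than concatenating strings. This article uses a "sort the downloads folder by extension" scenario to show how to scan files, create directories, handle name conflicts, move files, and write a log.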

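Summary

Starting from the problem of a messy directory, this article uses pathlib.Path to list files, maps each suffix to a category folder, picks a safe new name when a file would otherwise overwrite an existing one, and records every move in a log so it can be traced later.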

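Who this is for

Beginner to intermediate developers who want to write a local file-organizing script in Python.
Office-automation users who regularly deal with downloads, report, or screenshot folders.
Readers who already know basic Python syntax and want to get comfortable with pathlib.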

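1. Define the sorting rules first

Before the script touches any file, write the rules down. Common files in a downloads folder could be classified like this:

.png, .webp, and .gif go to images.
.pdf, .docx, and .xlsx go to documents.
.zip, .tar, and .gz go to archives.
Unknown suffixes go to others.

Without rules defined up front, the script easily turns into "handle whatever shows up", which becomes messy to maintain.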

Figure: logic diagram showing how mixed files in a downloads folder make manual sorting error-prone

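2. Scan the directory with pathlib

Path.iterdir() lists the direct children of a directory; it does not recurse into subdirectories. We only process files and skip directories.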

from pathlib import Path

source_dir = Path.home() / "Downloads"

for item in source_dir.iterdir():
    if item.is_file():
        print(item.name, item.suffix.lower())

suffix returns the file's final extension, including the leading dot, for example .pdf. It is lower-cased here so that .PNG and .png follow the same rule.
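A few edge cases are worth checking before relying on suffix. The sketch below uses PurePosixPath so it never touches the disk; the sample names and the describe helper are made up for illustration and are not part of the article's script.

```python
from pathlib import PurePosixPath

# Hypothetical sample names, chosen only to show edge cases.
SAMPLE_NAMES = ["Photo.PNG", "backup.tar.gz", ".bashrc", "README"]


def describe(name: str) -> tuple[str, str, list[str]]:
    """Return (stem, lower-cased last suffix, all suffixes) for a file name."""
    path = PurePosixPath(name)
    return path.stem, path.suffix.lower(), path.suffixes


for name in SAMPLE_NAMES:
    print(name, describe(name))
```

Only the last suffix is returned, so backup.tar.gz is classified by .gz. A name with no extension, or a dotfile such as .bashrc, has an empty suffix and would fall into the others category.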

3. Map extensions to target folders

Writing the classification rules as a dictionary makes the script easier to change:

from pathlib import Path

CATEGORY_MAP = {
    ".png": "images",
    ".webp": "images",
    ".gif": "images",
    ".pdf": "documents",
    ".docx": "documents",
    ".xlsx": "documents",
    ".zip": "archives",
    ".tar": "archives",
    ".gz": "archives",
}

def target_folder(file_path: Path) -> str:
    suffix = file_path.suffix.lower()
    return CATEGORY_MAP.get(suffix, "others")

To add .csv or .pptx later, you only need to change the dictionary, not the move logic.
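Because the mapping is plain data, it can also drive a read-only preview before any file is moved. The following is a minimal sketch, assuming a trimmed copy of the mapping with .csv and .pptx added; the preview function is a hypothetical helper, and the demo runs in a temporary directory rather than a real downloads folder.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

# Trimmed copy of the mapping, with .csv and .pptx added as an example.
CATEGORY_MAP = {
    ".png": "images",
    ".pdf": "documents",
    ".csv": "documents",
    ".pptx": "documents",
    ".zip": "archives",
}


def target_folder(file_path: Path) -> str:
    return CATEGORY_MAP.get(file_path.suffix.lower(), "others")


def preview(source_dir: Path) -> dict[str, list[str]]:
    """Group the file names in source_dir by category without moving anything."""
    groups: dict[str, list[str]] = defaultdict(list)
    for item in sorted(source_dir.iterdir()):
        if item.is_file():
            groups[target_folder(item)].append(item.name)
    return dict(groups)


# Try it on throwaway files instead of a real folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["a.PNG", "sales.csv", "notes.txt"]:
        (root / name).touch()
    print(preview(root))
```

Running a preview like this first makes it easy to spot a suffix that lands in others by mistake, before the real move happens.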

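4. Handle file name conflicts

Name conflicts must be handled before a file is moved. For example, if report.pdf already exists in the target directory, a new report.pdf must not simply overwrite it.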

from pathlib import Path

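def unique_path(target: Path) -> Path:
    """Return target if it is free, otherwise the first free "stem-N.suffix" name."""
    if not target.exists():
        return target

    # Only the last suffix is split off, so "a.tar.gz" becomes "a.tar-1.gz".
    stem = target.stem
    suffix = target.suffix
    parent = target.parent

    index = 1
    while True:
        candidate = parent / f"{stem}-{index}{suffix}"
        if not candidate.exists():
            return candidate
        index += 1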

This function renames conflicting files to report-1.pdf, report-2.pdf, and so on, so earlier files are never overwritten.
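The naming sequence can be checked safely in a temporary directory. This sketch repeats the same logic in compact form (using with_name instead of joining parent and name by hand) and asks for report.pdf three times; the demo function is a hypothetical helper for illustration only.

```python
import tempfile
from pathlib import Path


def unique_path(target: Path) -> Path:
    """Append -1, -2, ... to the stem until the name is free."""
    if not target.exists():
        return target
    index = 1
    while True:
        candidate = target.with_name(f"{target.stem}-{index}{target.suffix}")
        if not candidate.exists():
            return candidate
        index += 1


def demo() -> list[str]:
    """Ask for report.pdf three times, creating each returned path."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        picked = []
        for _ in range(3):
            free = unique_path(folder / "report.pdf")
            free.touch()  # occupy the name so the next call must skip it
            picked.append(free.name)
        return picked


print(demo())
```

Note that the check and the later move are two separate steps, so another process could in principle claim the name in between. For a single-user cleanup script that is usually an acceptable risk.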

Figure: flowchart of scanning files with pathlib, choosing a category folder, renaming on conflict, and writing the log

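5. Move files and write the log

The code below ties scanning, classification, renaming, and logging together. It relies on target_folder and unique_path from the previous sections: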

from pathlib import Path
import shutil
from datetime import datetime

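def organize_files(source_dir: Path) -> None:
    log_path = source_dir / "organize.log"

    with log_path.open("a", encoding="utf-8") as log:
        # Snapshot the listing first: the loop moves files and creates
        # subfolders, and iterdir() does not define what it yields when the
        # directory changes during iteration.
        for item in list(source_dir.iterdir()):
            if not item.is_file():
                continue
            # Never move the log file we are writing to.
            if item.name == log_path.name:
                continue

            folder_name = target_folder(item)
            target_dir = source_dir / folder_name
            target_dir.mkdir(exist_ok=True)

            # Pick a free name so an existing file is never overwritten.
            target = unique_path(target_dir / item.name)
            shutil.move(str(item), str(target))

            now = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{now} | {item.name} -> {target.relative_to(source_dir)}\n")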

if __name__ == "__main__":
    organize_files(Path.home() / "Downloads")

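The log will look roughly like this. The second report.pdf entry comes from a later run, since one directory cannot hold two files with the same name at once; on Windows the relative path is written with backslashes:

2026-06-13 10:20:01 | report.pdf -> documents/report.pdf
2026-06-13 10:20:02 | demo.png -> images/demo.png
2026-06-13 10:35:47 | report.pdf -> documents/report-1.pdf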

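With the log in place, when a move turns out differently from what you expected, you can at least quickly find out where each file went.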
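The log format is also simple enough to read back by script. The sketch below assumes the "timestamp | name -> relative path" layout shown in the sample log; parse_log_line and undo_plan are hypothetical helpers, not part of the original script, and they only build a list of proposed moves without touching any file.

```python
from pathlib import Path


def parse_log_line(line: str) -> tuple[str, str, str]:
    """Split 'timestamp | name -> relative/target' into its three parts."""
    timestamp, _, rest = line.rstrip("\n").partition(" | ")
    # rpartition: the target path is whatever follows the last " -> ".
    original_name, _, moved_to = rest.rpartition(" -> ")
    return timestamp, original_name, moved_to


def undo_plan(log_lines: list[str], source_dir: Path) -> list[tuple[Path, Path]]:
    """Return (current location, original location) pairs, newest move first."""
    plan = []
    for line in reversed(log_lines):
        if not line.strip():
            continue
        _, original_name, moved_to = parse_log_line(line)
        plan.append((source_dir / moved_to, source_dir / original_name))
    return plan


sample = [
    "2026-06-13 10:20:01 | report.pdf -> documents/report.pdf\n",
    "2026-06-13 10:20:02 | demo.png -> images/demo.png\n",
]
for current, original in undo_plan(sample, Path("Downloads")):
    print(current, "->", original)
```

A file name that itself contains " -> " would be split incorrectly, and two renamed copies such as report.pdf and report-1.pdf both map back to the same original name, so an actual restore step would need its own conflict handling.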