AI Programming Fundamentals, Python Primer: Chapter 10, File Reading and Writing

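Preface

This chapter covers file reading and writing in Python. File I/O is the basic way a program interacts with external storage, and Python handles it with the built-in open() function together with context managers.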

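1. Basic File Operations

1.1 Opening a file: `open()`

# Basic syntax
file = open(filename, mode, encoding=None)

# Common modes
# 'r'  - read only (default)
# 'w'  - write (truncates any existing content)
# 'a'  - append (writes go to the end of the file)
# 'x'  - exclusive creation (fails if the file already exists)
# 'b'  - binary mode (e.g. 'rb', 'wb')
# 't'  - text mode (default, e.g. 'rt', 'wt')
# '+'  - read and write (e.g. 'r+', 'w+')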
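The difference between 'x', 'a', and 'w' is easiest to see by running them against the same file. The following is a minimal sketch under stated assumptions: the file name `modes_demo.txt` and the variable names are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes_demo.txt"  # hypothetical file name

    # 'x' creates the file and fails if it already exists.
    with open(path, "x", encoding="utf-8") as f:
        f.write("first\n")
    try:
        with open(path, "x", encoding="utf-8"):
            pass
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # 'a' keeps the existing content and adds to the end.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")
    after_append = path.read_text(encoding="utf-8")

    # 'w' truncates the file before writing.
    with open(path, "w", encoding="utf-8") as f:
        f.write("third\n")
    after_write = path.read_text(encoding="utf-8")

print(second_create_failed)
print(repr(after_append))
print(repr(after_write))
```

In practice 'x' is a reasonable choice when overwriting an existing file would be a mistake, since 'w' discards old content without warning.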

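1.2 Recommended: the `with` statement (context manager)

# The file is closed automatically, even if an exception is raised
with open('example.txt', 'r', encoding='utf-8') as f:
    content = f.read()
    print(content)
# The file is closed at this point

✅ Prefer the with statement: it prevents the resource leaks caused by forgetting to close a file.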
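To make the guarantee concrete, the sketch below compares a manual try/finally with the with statement, and checks that the file is closed even when the body raises. The file name and the simulated ValueError are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"  # hypothetical sample file
    path.write_text("hello\n", encoding="utf-8")

    # Manual version: close() has to sit in finally so it also runs on errors.
    f = open(path, "r", encoding="utf-8")
    try:
        manual_content = f.read()
    finally:
        f.close()
    manual_closed = f.closed

    # with version: the file is closed even though the body raises.
    try:
        with open(path, "r", encoding="utf-8") as g:
            with_content = g.read()
            raise ValueError("simulated failure")
    except ValueError:
        pass
    with_closed = g.closed

print(manual_closed, with_closed)
```

Both versions end with a closed file; the with form simply removes the chance of forgetting the finally clause.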

2. Reading Files

2.1 Reading the whole file

# Read the entire content into one string
with open('file.txt', 'r', encoding='utf-8') as f:
    content = f.read()
    print(content)

2.2 Reading line by line

# Method 1: readline() reads one line per call
with open('file.txt', 'r', encoding='utf-8') as f:
    line = f.readline()
    while line:  # readline() returns '' at end of file
        print(line.strip())  # strip() removes surrounding whitespace, including the newline
        line = f.readline()

# Method 2: readlines() reads every line into a list
with open('file.txt', 'r', encoding='utf-8') as f:
    lines = f.readlines()
    for line in lines:
        print(line.strip())

# Method 3: iterate over the file object directly (most memory-efficient)
with open('file.txt', 'r', encoding='utf-8') as f:
    for line in f:
        print(line.strip())

2.3 Reading a fixed number of characters

with open('file.txt', 'r', encoding='utf-8') as f:
    chunk = f.read(100)  # read the first 100 characters
    print(chunk)

3. Writing Files

3.1 Writing strings

# Overwrite
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write('Hello, World!\n')
    f.write('This is the second line\n')

# Append
with open('output.txt', 'a', encoding='utf-8') as f:
    f.write('This line was appended\n')

3.2 Writing multiple lines

lines = ['Line 1\n', 'Line 2\n', 'Line 3\n']

# Method 1: write in a loop
with open('output.txt', 'w', encoding='utf-8') as f:
    for line in lines:
        f.write(line)

# Method 2: writelines()
with open('output.txt', 'w', encoding='utf-8') as f:
    f.writelines(lines)

⚠️ Note: writelines() does not add newline characters; each string must already end with one.
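The following sketch shows what happens when the strings lack trailing newlines, and one common way to add them on the fly. The sample list `names` and the file name are hypothetical.

```python
import tempfile
from pathlib import Path

names = ["alpha", "beta", "gamma"]  # hypothetical data without trailing newlines

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "output.txt"

    # Passing the raw strings glues everything onto a single line.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(names)
    glued = path.read_text(encoding="utf-8")

    # A generator expression appends the newline to each item as it is written.
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(f"{name}\n" for name in names)
    separated = path.read_text(encoding="utf-8")

print(repr(glued))
print(repr(separated))
```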

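4. File Pointer Operations

4.1 Getting and setting the file position

with open('file.txt', 'r', encoding='utf-8') as f:
    print(f.tell())  # current position; 0 at the start of the file

    content = f.read(10)  # reads 10 characters, not 10 bytes
    print(f.tell())  # in text mode this is an opaque number, not a character count

    f.seek(0)        # back to the start of the file
    print(f.read(5)) # re-read the first 5 characters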

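4.2 Common seek arguments

f.seek(0)      # start of the file
f.seek(0, 2)   # end of the file (offset 0, measured from the end)
f.seek(-10, 2) # 10 bytes before the end; binary mode ('rb') only, text mode raises io.UnsupportedOperation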
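Seeking relative to the end or to the current position is only fully supported in binary mode. The sketch below, using a hypothetical 20-byte file, shows the binary-mode calls and the error that text mode raises for a non-zero end-relative seek. The `os.SEEK_*` constants are the named equivalents of the whence values 0, 1, and 2.

```python
import io
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "seek_demo.bin"  # hypothetical file name
    path.write_bytes(b"0123456789abcdefghij")  # 20 bytes

    # Binary mode: offsets are plain byte counts and every whence value works.
    with open(path, "rb") as f:
        f.seek(-10, os.SEEK_END)   # 10 bytes before the end
        tail = f.read()
        f.seek(5)                  # absolute position 5
        f.seek(3, os.SEEK_CUR)     # 3 bytes forward from the current position
        position = f.tell()

    # Text mode: a non-zero offset relative to the end is rejected.
    with open(path, "r", encoding="utf-8") as f:
        try:
            f.seek(-10, os.SEEK_END)
            text_seek_failed = False
        except io.UnsupportedOperation:
            text_seek_failed = True

print(tail, position, text_seek_failed)
```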

5. Binary File Operations

5.1 Reading and writing binary files

# Read a binary file (such as an image or audio file)
with open('image.jpg', 'rb') as f:
    data = f.read()
    print(f"File size: {len(data)} bytes")

# Write binary data
with open('copy.jpg', 'wb') as f:
    f.write(data)

5.2 Working with bytes

# Write bytes
with open('binary.dat', 'wb') as f:
    f.write(b'Hello World')
    f.write(bytes([0, 1, 2, 3, 4]))

# Read bytes
with open('binary.dat', 'rb') as f:
    data = f.read()
    print(data)        # b'Hello World\x00\x01\x02\x03\x04'
    print(data[0])     # 72 (the ASCII code of 'H')

6. Common File Handling Scenarios

6.1 Reading and writing configuration files

# Read a configuration file made of "key = value" lines
def read_config(filename):
    config = {}
    with open(filename, 'r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without '='
            if line and not line.startswith('#') and '=' in line:
                key, value = line.split('=', 1)
                config[key.strip()] = value.strip()
    return config

# Write a configuration file
def write_config(filename, config):
    with open(filename, 'w', encoding='utf-8') as f:
        for key, value in config.items():
            f.write(f"{key} = {value}\n")

6.2 CSV files (the csv module is recommended)

import csv

# Write CSV
with open('data.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['Name', 'Age', 'City'])
    writer.writerow(['張三', 25, '北京'])
    writer.writerow(['李四', 30, '上海'])

# Read CSV (newline='' lets the csv module handle line endings itself)
with open('data.csv', 'r', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
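When a CSV file has a header row, csv.DictWriter and csv.DictReader map each row to a dictionary keyed by column name. This is a minimal sketch rather than part of the original walkthrough; the sample rows and column names are hypothetical. Note that every value read back from CSV is a string, so numbers need an explicit conversion.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample records.
rows = [
    {"name": "Zhang San", "age": 25, "city": "Beijing"},
    {"name": "Li Si", "age": 30, "city": "Shanghai"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.csv"

    # DictWriter writes the header once, then one line per dictionary.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "age", "city"])
        writer.writeheader()
        writer.writerows(rows)

    # DictReader uses the first line as the keys of each row.
    with open(path, "r", newline="", encoding="utf-8") as f:
        loaded = list(csv.DictReader(f))

# CSV stores text only, so convert numeric columns explicitly.
ages = [int(row["age"]) for row in loaded]
print(loaded[0], ages)
```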

6.3 JSON files

import json

# Write JSON
data = {'name': 'Alice', 'age': 25, 'hobbies': ['reading', 'swimming']}
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False, indent=2)

# Read JSON
with open('data.json', 'r', encoding='utf-8') as f:
    loaded_data = json.load(f)
    print(loaded_data)

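7. File and Directory Operations (os and pathlib)

7.1 Using the `os` module

import os

# Check whether a file exists
if os.path.exists('file.txt'):
    print("File exists")

# Get the file size in bytes (raises FileNotFoundError if the file is missing)
size = os.path.getsize('file.txt')

# List the contents of a directory
files = os.listdir('.')

# Create a directory, including any missing parents
os.makedirs('new_dir', exist_ok=True)

# Delete a file (raises FileNotFoundError if the file is missing)
os.remove('old_file.txt')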

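7.2 Using `pathlib` (recommended, Python 3.4+)

from pathlib import Path

# Build a Path object
file_path = Path('data') / 'input.txt'

# Create the parent directory first, otherwise the write below fails
file_path.parent.mkdir(parents=True, exist_ok=True)

# Write a file (read_text/write_text require Python 3.5+)
file_path.write_text('Hello World', encoding='utf-8')

# Check whether the file exists
if file_path.exists():
    print("File exists")

# Read the file
content = file_path.read_text(encoding='utf-8')

# Iterate over the .py files in a directory
for file in Path('.').glob('*.py'):
    print(file)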

8. Exception Handling

8.1 Handling file operation exceptions

try:
    with open('nonexistent.txt', 'r', encoding='utf-8') as f:
        content = f.read()
except FileNotFoundError:
    print("File not found")
except PermissionError:
    print("No permission to access the file")
except UnicodeDecodeError:
    print("The file is not valid in the given encoding")
except Exception as e:
    print(f"Other error: {e}")
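Of these exceptions, UnicodeDecodeError is the one most often caused by an encoding mismatch rather than a missing file. The sketch below reproduces it with a hypothetical file written in Latin-1 and read as UTF-8, then shows two ways to respond: reading with the correct encoding, or passing `errors="replace"` to accept a lossy result.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "legacy.txt"  # hypothetical file in a legacy encoding
    path.write_text("café\n", encoding="latin-1")

    # Reading with the wrong encoding raises UnicodeDecodeError.
    try:
        path.read_text(encoding="utf-8")
        decode_failed = False
    except UnicodeDecodeError:
        decode_failed = True

    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    lossy = path.read_text(encoding="utf-8", errors="replace")

    # Reading with the encoding the file was written in recovers the text.
    correct = path.read_text(encoding="latin-1")

print(decode_failed, repr(lossy), repr(correct))
```

Which response is appropriate depends on the data: replacement hides corruption, so it is usually better suited to logs and previews than to data that will be written back.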

8.2 A safe file-reading function

def safe_read_file(filename, encoding='utf-8'):
    """Read a file and return its content, or None on failure."""
    try:
        with open(filename, 'r', encoding=encoding) as f:
            return f.read()
    except FileNotFoundError:
        print(f"File {filename} does not exist")
        return None
    except Exception as e:
        print(f"Error while reading file {filename}: {e}")
        return None

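9. Performance Tips

9.1 Handling large files

# Read a large file chunk by chunk (avoids running out of memory)
def process_large_file(filename, chunk_size=8192):
    with open(filename, 'r', encoding='utf-8') as f:
        while True:
            chunk = f.read(chunk_size)  # in text mode this counts characters
            if not chunk:
                break
            # Process the chunk
            process_chunk(chunk)

def process_chunk(chunk):
    # Processing logic goes here
    print(f"Processed {len(chunk)} characters")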
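The same chunked pattern applies in binary mode, where `read(chunk_size)` counts bytes. As one illustrative use, the sketch below computes a SHA-256 digest without loading the whole file into memory. The helper name `sha256_of_file` and the choice of hashing as the workload are assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of_file(path, chunk_size=8192):
    """Hash a file in fixed-size binary chunks (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "large.bin"
    payload = b"x" * 100_000  # stand-in for a much larger file
    path.write_bytes(payload)

    chunked = sha256_of_file(path)
    whole = hashlib.sha256(payload).hexdigest()

print(chunked == whole)
```

Memory use stays bounded by `chunk_size` regardless of the file size, which is the point of processing in chunks.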

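9.2 Buffer settings

# Custom buffer size
with open('large_file.txt', 'r', encoding='utf-8', buffering=8192) as f:
    # The default is derived from the device block size or io.DEFAULT_BUFFER_SIZE, often 4096 or 8192 bytes
    content = f.read()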

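10. Best Practices Summary

✅ Recommended

Always use the with statement when working with files
Specify the encoding explicitly (usually utf-8)
Use pathlib instead of string concatenation to build paths
Handle exceptions, especially FileNotFoundError
Process large files line by line or in chunks to avoid memory problems
Use dedicated modules for specific formats (such as csv, json, xml)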

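❌ Practices to avoid

# Mistake 1: forgetting to close the file
f = open('file.txt', 'r')
content = f.read()
# f.close() was forgotten!

# Mistake 2: not specifying the encoding
with open('file.txt', 'r') as f:  # falls back to the platform default encoding, which may not be UTF-8
    content = f.read()

# Mistake 3: building paths by string concatenation
filename = 'data/' + 'file.txt'  # fragile and harder to keep portable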

✅ The right way

from pathlib import Path

# Use pathlib to build paths
data_dir = Path('data')
filename = data_dir / 'file.txt'

# Safe file access
try:
    with open(filename, 'r', encoding='utf-8') as f:
        content = f.read()
except FileNotFoundError:
    print(f"File does not exist: {filename}")

11. Utility Functions

from pathlib import Path
import json

def read_lines(filename):
    """Read all lines of a file into a list, without newline characters."""
    try:
        return Path(filename).read_text(encoding='utf-8').splitlines()
    except Exception as e:
        print(f"Failed to read file: {e}")
        return []

def write_lines(filename, lines):
    """Write a list of lines to a file."""
    try:
        Path(filename).write_text('\n'.join(lines) + '\n', encoding='utf-8')
        return True
    except Exception as e:
        print(f"Failed to write file: {e}")
        return False

def backup_file(filename):
    """Create a backup copy of a file."""
    src = Path(filename)
    if src.exists():
        backup = src.with_suffix(src.suffix + '.bak')
        backup.write_bytes(src.read_bytes())
        return backup
    return None
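A backup that reads the whole file with read_bytes() holds the entire content in memory, which sits uneasily with the large-file advice in section 9. One possible alternative, sketched here under the assumption that preserving timestamps is also desirable, is the standard library's shutil.copy2, which copies the file together with its metadata. The function name `backup_file_streaming` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

def backup_file_streaming(filename):
    """Copy a file to '<name>.bak' without loading it into memory (hypothetical helper)."""
    src = Path(filename)
    if not src.is_file():
        return None
    backup = src.with_name(src.name + ".bak")
    shutil.copy2(src, backup)  # copies content plus metadata such as timestamps
    return backup

with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "notes.txt"
    original.write_text("important\n", encoding="utf-8")

    created = backup_file_streaming(original)
    backup_name = created.name
    backup_text = created.read_text(encoding="utf-8")
    missing = backup_file_streaming(Path(tmp) / "absent.txt")

print(backup_name, repr(backup_text), missing)
```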

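Summary

This article covered file reading and writing in Python and concludes the Python primer. Higher-level topics such as data structures, web scraping, and object-oriented programming still lie ahead, along with the relevant algorithm libraries.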

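Reflections

A tough economic climate should not stop us from pushing forward. Whether or not artificial intelligence is currently in a bubble, it resembles the early internet: it is the direction of future development and cannot be stopped. The only option is to embrace it, on the one hand strengthening our own foundational knowledge so that over-reliance on AI does not erode human creativity, and on the other hand using AI to better meet market demand.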

Resources

Related materials are available through the WeChat official account 咚咚王.
