Linux File I/O: A Panoramic Guide from open to Redirection

Open the file → read / write it → close it

File → file descriptor → system calls → wrapper interfaces → redirection and engineering practice

open → read / write → close

#include <fcntl.h>
#include <unistd.h>
int open(const char *pathname, int flags, mode_t mode); // mode is only used when a file is created

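O_RDONLY // read-only
O_WRONLY // write-only
O_RDWR // read-write

O_CREAT // create the file if it does not exist
O_TRUNC // truncate the file to length 0 on open
O_APPEND // every write is appended at the end of the file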

int fd = open("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0644);

rw- r-- r--

ssize_t read(int fd, void *buf, size_t count);

ssize_t write(int fd, const void *buf, size_t count);

int close(int fd);

char buf[4096];
ssize_t n;
int fd = open("data.txt", O_RDONLY);
while ((n = read(fd, buf, sizeof(buf))) > 0) {
    // process the n bytes now in buf
}
close(fd);

int fd = open(pathname, flags, mode);

O_WRONLY | O_CREAT | O_TRUNC

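O_RDONLY // read-only
O_WRONLY // write-only
O_RDWR // read-write

open("a.txt", O_RDONLY | O_WRONLY); // ❌ wrong: access modes are not bit flags; on Linux O_RDONLY is 0, so this is just O_WRONLY, not O_RDWR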

O_CREAT

open("file", O_WRONLY | O_CREAT, 0644);

open("file", O_WRONLY, 0644); // mode is ignored without O_CREAT

O_TRUNC

open("a.txt", O_RDONLY | O_TRUNC); // ❌ result unspecified by POSIX; on Linux the file may still be truncated

int fd = open("data.txt", O_WRONLY | O_TRUNC);

O_APPEND

lseek(fd, 0, SEEK_END);
write(fd, buf, len);

int fd = open("log.txt", O_WRONLY | O_APPEND);
write(fd, buf, len);

O_EXCL

open("lock", O_WRONLY | O_CREAT | O_EXCL, 0644);
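
The `O_CREAT | O_EXCL` pair is what makes a lock file usable: the existence check and the creation happen as one atomic step inside the kernel, so two processes cannot both "win". A minimal sketch, assuming a hypothetical helper name `try_lock` and a caller-chosen path (neither appears in the original article):

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if this call created the lock file,
 * 0 if the file already existed, -1 on any other error. */
int try_lock(const char *path) {
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd >= 0) {
        close(fd);                     /* the file's existence is the lock */
        return 1;
    }
    return (errno == EEXIST) ? 0 : -1; /* EEXIST: someone else holds it */
}
```

Releasing the lock in this sketch is simply `unlink(path)`. A real lock would also need a policy for stale files left behind by crashed processes, which is out of scope here.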

int fd = open(
    "server.log",
    O_WRONLY | O_CREAT | O_APPEND,
    0644
);

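Mistake	Actual consequence
Forgetting `O_CREAT`	`open` fails if the file does not exist
Misusing `O_TRUNC`	The file is emptied
Access mode does not match the operation	`read`/`write` fails immediately
Concurrent writers without `O_APPEND`	Log content gets interleaved or overwritten
Assuming `mode` decides the final permissions	`umask` was overlooked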
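
The last row of the table is easy to verify: the permissions a new file actually gets are `mode & ~umask`. The sketch below is one way to observe that, assuming a hypothetical helper `created_mode` that temporarily sets the process umask (the helper and its parameters are illustrative, not from the original):

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` asking for `requested` permissions
 * while `mask` is the process umask. Returns the permission bits the
 * file really received, or -1 on error. */
int created_mode(const char *path, mode_t requested, mode_t mask) {
    assert(path != NULL);
    mode_t old = umask(mask);           /* install the mask for this test */
    /* O_EXCL guarantees we create the file, so `requested` is applied. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, requested);
    umask(old);                         /* restore the caller's umask */
    if (fd < 0) return -1;

    struct stat st;
    int rc = fstat(fd, &st);
    close(fd);
    if (rc < 0) return -1;
    return (int)(st.st_mode & 0777);    /* keep only the rwx bits */
}
```

With a umask of `022`, asking for `0666` yields `0644`; with `077` it yields `0600`. On filesystems that apply default ACLs the result can differ, so treat this as a demonstration rather than a guarantee.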

process
└── file descriptor table (fd table)
    └── struct file (open file)
        └── inode (the actual file)

fd	Meaning
0	standard input (stdin)
1	standard output (stdout)
2	standard error (stderr)

printf("hello\n"); // ultimately a write to fd 1
scanf("%d", &x); // ultimately a read from fd 0

int fd = open("a.txt", O_RDONLY);

close(3);
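
Closing a descriptor frees its slot, and the next `open` returns the lowest-numbered free slot, so the same number comes back. A small sketch of that rule, assuming a single-threaded process and a hypothetical helper name `fd_number_reused`:

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path`, close it, open it again, and report
 * whether both opens returned the same descriptor number.
 * Returns 1 if the number was reused, 0 if not, -1 on error. */
int fd_number_reused(const char *path) {
    assert(path != NULL);
    int first = open(path, O_RDONLY);
    if (first < 0) return -1;
    close(first);                       /* the slot in the fd table is free again */

    int second = open(path, O_RDONLY);  /* lowest free slot is handed out */
    if (second < 0) return -1;
    close(second);
    return first == second;
}
```

This is also why using a stale fd after `close` is dangerous: the number may already refer to a different file.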

int fd1 = open("a.txt", O_RDONLY);
int fd2 = dup(fd1);

fork();

dup2(fd, 1);
execvp("ls", argv);

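Misconception	Reality
An fd is the file itself	An fd is only an index
fds are globally unique	They are private to each process
Every fd has its own offset	The offset may be shared
exec resets I/O	fds are preserved across exec (unless marked close-on-exec)
Redirection is a shell feature	It is fundamentally an fd operation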
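
The "offset may be shared" row can be checked directly: descriptors created by `dup` share one open file description (and therefore one offset), while a second `open` of the same path gets its own. The helper below is a hypothetical illustration; it only moves the offset through one descriptor and reads it back through another:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set the offset to `target` through `mover`, then
 * report the offset as seen through `observer`. Returns -1 on error. */
off_t offset_seen_by(int mover, int observer, off_t target) {
    assert(target >= 0);
    if (lseek(mover, target, SEEK_SET) < 0) return -1;
    return lseek(observer, 0, SEEK_CUR);   /* query without moving */
}
```

If `b = dup(a)`, then `offset_seen_by(a, b, 5)` reports 5 because both numbers refer to the same open file description. If `c` comes from an independent `open` of the same regular file and has not been used yet, `offset_seen_by(a, c, 7)` still reports 0.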

fd	Name	Role
0	stdin	data input
1	stdout	normal output
2	stderr	error output

my_program > result.txt

Interface	Default fd
stdin	0
stdout	1
stderr	2

printf("hello\n"); // → write(1, ...)
fprintf(stderr, "error\n"); // → write(2, ...)

Stream object	fd
cin	0
cout	1
cerr	2

ls > out.txt

int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
dup2(fd, 1);
close(fd);
printf("hello\n");

int fd = open("err.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
dup2(fd, 2);
close(fd);
fprintf(stderr, "error\n");

ps aux | grep root

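Mistake	Consequence
Writing logs to stdout	Pollutes the data stream
Mixing errors into normal output	Output cannot be processed through a pipe
Hard-coding file paths	The program is not composable
Not understanding fd inheritance	Redirection does not take effect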

FILE *fp = fopen("a.txt", "r");
fgets(buf, sizeof(buf), fp);
fclose(fp);

struct _IO_FILE {
    int _fileno; // the underlying fd
    char *_IO_read_ptr; // current read position in the buffer
    char *_IO_write_ptr;// current write position in the buffer
    ...
};

FILE *fp;

FILE *fp = fopen("a.txt", "r");

read(fd, buf, 1024);

Capability	fd	FILE*
System calls	direct	indirect
Buffering	none	yes
Formatted I/O	none	yes
Portability	poor	good

extern FILE *stdin;
extern FILE *stdout;
extern FILE *stderr;

FILE*	fd
stdin	0
stdout	1
stderr	2

printf scanf fprintf

write(fd, ...)

int fd = fileno(fp);

FILE *fp = fdopen(fd, "r");

Interface	What it does
fclose	flushes the buffer + closes the fd
close	only closes the fd

FILE *fp = fopen("a.txt", "w");
close(fileno(fp)); // ❌

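Misconception	Reality
FILE* is the file	It points to a struct
FILE* = fd	FILE* wraps an fd
printf is faster than write	Depends on buffering
close can replace fclose	Wrong
stderr is inherently special	It is simply unbuffered by default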
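
Buffering is the reason `FILE*` and raw fd calls must not be mixed carelessly. The sketch below writes "A" through stdio and "B" through `write` on the same file; without a flush in between, "A" is still sitting in the stdio buffer when "B" reaches the kernel, so the file ends up as "BA". The helper name `mixed_write` and its parameters are hypothetical, and the result assumes the usual default of full buffering for regular files:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write "A" via stdio and "B" via write(2) into
 * `path`, optionally flushing in between, then copy the resulting file
 * content into `out`. Returns 0 on success, -1 on error. */
int mixed_write(const char *path, int flush_between, char *out, size_t cap) {
    assert(path != NULL && out != NULL && cap > 0);
    FILE *fp = fopen(path, "w");
    if (!fp) return -1;

    fputs("A", fp);                        /* lands in the stdio buffer */
    if (flush_between) fflush(fp);         /* push "A" to the fd right now */
    if (write(fileno(fp), "B", 1) != 1) {  /* bypasses the stdio buffer */
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0) return -1;        /* flushes whatever is still buffered */

    fp = fopen(path, "r");
    if (!fp) return -1;
    size_t n = fread(out, 1, cap - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return 0;
}
```

Calling it with `flush_between = 1` gives "AB". The general rule is to call `fflush` before touching the underlying fd.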

#include <fstream>
std::ofstream out("a.txt");
out << "hello" << std::endl;
out.close();

ios_base
└── ios
    ├── istream
    │   ├── ifstream
    │   └── iostream (also derives from ostream)
    │       └── fstream
    └── ostream
        └── ofstream

std::ofstream out("a.txt");

out << "hello";

operator<< → ostream → streambuf → write(fd, ...)

std::ios::sync_with_stdio(true);

std::ios::sync_with_stdio(false);

std::cout << "hello" << std::endl;

std::cout << "hello\n";

Stream	fd
cin	0
cout	1
cerr	2

./a.out > out.txt

std::cout << "data\n";

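Dimension	iostream	FILE*	fd
Type safety	✔	✖	✖
Abstraction level	high	medium	low
Performance control	medium	high	highest
Complexity	high	medium	low
Engineering flexibility	high	medium	high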

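Misconception	Truth
iostream does not use fds	There is always one underneath
endl = newline	It also flushes
cout is slower than printf	Depends on configuration
It cannot be mixed with system I/O	It can, but you must understand the buffering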

C++ iostream
    ↓
C standard library FILE*
    ↓
Linux system calls on fd (read / write)
    ↓
Kernel VFS / devices

Interface layer	What it mainly solves
System I/O	precise control, performance, composability
C FILE*	ease of use + buffering
C++ iostream	type safety + abstract expression

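Dimension	System I/O	FILE*	iostream
Abstraction level	low	medium	high
Buffered	no	yes	yes
Type safety	no	no	yes
Performance control	highest	high	medium
Ease of use	low	medium	high
Composability	very strong	strong	strong
Engineering complexity	high	medium	high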

Current file offset

process
└── fd
    └── open file description
        ├── offset
        ├── flags
        └── inode

int fd = open("a.txt", O_RDONLY);
read(fd, buf1, 10);
read(fd, buf2, 10);

int fd = open("a.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
write(fd, "hello", 5);
write(fd, "world", 5);

helloworld

off_t lseek(int fd, off_t offset, int whence);

lseek(fd, 0, SEEK_SET); // back to the start of the file
lseek(fd, 0, SEEK_END); // jump to the end of the file

off_t pos = lseek(fd, 0, SEEK_CUR);

open("a.txt", O_WRONLY | O_APPEND);

int fd = open("a.txt", O_WRONLY);
fork();
write(fd, "X", 1);

./a.out > out.txt

./a.out >> out.txt

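Misconception	Truth
The offset belongs to the file	It belongs to the 'open instance' (the open file description)
lseek modifies file content	It only changes the position
Offsets are independent after fork	They are shared
O_APPEND = manual lseek	Not equivalent
Sequential reads and writes are always safe	Not under concurrency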
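
The "shared after fork" row can be demonstrated with two one-byte writes. In the sketch below the child writes "C", the parent waits and then writes "P"; because both processes use the same open file description, the parent's write lands at offset 1 and the file reads "CP". If the offsets were independent, the parent would overwrite offset 0 and the file would contain only "P". The helper name `fork_shared_offset` is hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: child writes "C", parent then writes "P" through
 * the inherited fd; the file content is copied into `out`.
 * Returns 0 on success, -1 on error. */
int fork_shared_offset(const char *path, char *out, size_t cap) {
    assert(path != NULL && out != NULL && cap > 0);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fd); return -1; }
    if (pid == 0) {
        /* Child: advances the shared offset from 0 to 1. */
        _exit(write(fd, "C", 1) == 1 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }
    /* Parent: the offset is already 1, so "P" follows "C". */
    if (write(fd, "P", 1) != 1) { close(fd); return -1; }

    if (lseek(fd, 0, SEEK_SET) < 0) { close(fd); return -1; }
    ssize_t n = read(fd, out, cap - 1);
    close(fd);
    if (n < 0) return -1;
    out[n] = '\0';
    return 0;
}
```

The `waitpid` call is what makes the order deterministic here. Without it, the two writes race, which is exactly the interleaving problem `O_APPEND` is meant to address for logs.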

./a.out > out.txt

printf("hello\n");

fd	Meaning
0	stdin
1	stdout
2	stderr

./a.out > out.txt

int dup(int oldfd);
int dup2(int oldfd, int newfd);

./a.out < in.txt

int fd = open("in.txt", O_RDONLY);
dup2(fd, 0);
close(fd);
execl("./a.out", "./a.out", (char *)NULL);

./a.out 2> err.txt

./a.out > out.txt 2>&1

ls | grep txt

int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
dup2(fd, STDOUT_FILENO);
close(fd);
execl("./a.out", "./a.out", (char *)NULL);

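Misconception	Truth
The program decides where output goes	The shell decides
printf is special	It just writes to fd 1
Redirection is string substitution	It is fd binding
Only stdout can be redirected	Any fd can be
C++ does not support redirection	Fully supported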
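
Putting the pieces together, this is roughly what a shell does for `cmd > file`: fork, rebind fd 1 in the child, then exec. The sketch assumes a hypothetical helper `run_redirected`; a real shell handles many more cases (multiple redirections, append mode, error reporting):

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout sent to `out_path`.
 * Returns the child's exit status, or -1 on failure. */
int run_redirected(char *const argv[], const char *out_path) {
    assert(argv != NULL && argv[0] != NULL && out_path != NULL);
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child: rebind fd 1 BEFORE exec; the new program inherits it. */
        int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) _exit(126);
        if (fd != STDOUT_FILENO) close(fd);  /* fd 1 keeps the file open */
        execvp(argv[0], argv);
        _exit(127);                          /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

The executed program needs no knowledge of the file: it keeps writing to fd 1, which is why the table says the shell, not the program, decides where output goes.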

printf("log: start\n");

./server > server.log

printf("normal info\n");
fprintf(stderr, "error!\n");

./app > out.log 2> err.log

ps aux | grep root | wc -l
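
A pipeline is the same fd rebinding applied twice: the left command's fd 1 becomes the write end of a pipe, and the right command's fd 0 becomes the read end. Below is a two-stage sketch, assuming a hypothetical helper `run_pipeline` that returns the right-hand command's exit status:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `left | right`.
 * Returns the exit status of `right`, or -1 on failure. */
int run_pipeline(char *const left[], char *const right[]) {
    assert(left != NULL && left[0] != NULL && right != NULL && right[0] != NULL);
    int p[2];                       /* p[0] = read end, p[1] = write end */
    if (pipe(p) < 0) return -1;

    pid_t l = fork();
    if (l < 0) { close(p[0]); close(p[1]); return -1; }
    if (l == 0) {
        if (dup2(p[1], STDOUT_FILENO) < 0) _exit(126);  /* stdout -> pipe */
        close(p[0]); close(p[1]);
        execvp(left[0], left);
        _exit(127);
    }

    pid_t r = fork();
    if (r < 0) { close(p[0]); close(p[1]); waitpid(l, NULL, 0); return -1; }
    if (r == 0) {
        if (dup2(p[0], STDIN_FILENO) < 0) _exit(126);   /* stdin <- pipe */
        close(p[0]); close(p[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* Parent must close both ends, or the reader never sees EOF. */
    close(p[0]); close(p[1]);
    int status;
    waitpid(l, NULL, 0);
    if (waitpid(r, &status, 0) < 0 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Forgetting to close the unused pipe ends is the classic bug here: as long as any process holds the write end open, the reader blocks waiting for more data.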

int fd = open("app.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
dup2(fd, STDOUT_FILENO);
dup2(fd, STDERR_FILENO);
close(fd);

close(0);
close(1);
close(2);
open("/dev/null", O_RDONLY);
open("/dev/null", O_WRONLY);
open("/dev/null", O_WRONLY);

./parser < test.txt > out.txt

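#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
    if (argc > 1) {
        int fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        dup2(fd, STDOUT_FILENO); // stdout now refers to the file
        close(fd);
    }
    printf("Hello, world\n");
    return 0;
}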

./a.out
./a.out out.txt

Scenario	Problem
Forgetting close	fd leak
exec before dup2	no effect
Mixing buffering layers	lost data
Multiple processes sharing an fd	garbled output

int fd = open("a.txt", O_WRONLY); // return value is never checked
write(fd, "data", 4); // fails with EBADF if open returned -1

FILE* fp = fopen("a.txt", "w");
int fd = fileno(fp);
write(fd, "hello", 5);
fclose(fp);

write(fd, buf, len);

read(fd, buf, 1024);

close(fd);

fork();
write(fd, "X", 1);

lseek(fd, 0, SEEK_END);
write(fd, buf, len);

while (...) {
    std::cout << data << std::endl;
}

printf("error\n");

exec(...); // on success, exec never returns
dup2(fd, 1); // so this line is never reached

./a.out > out.txt

dup2(fd, 1); // forgot close(fd)

# copy from a file to a file
./mycp a.txt b.txt
# copy from stdin to a file
cat a.txt | ./mycp - b.txt
# copy from a file to stdout
./mycp a.txt -
# work entirely through redirection
./mycp < a.txt > b.txt

mycp [src] [dst]

read(fd_in) -> buffer -> write(fd_out)

#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#define BUF_SIZE 4096

int main(int argc, char *argv[]) {
    if (argc > 3) {
        fprintf(stderr, "usage: %s [src|-] [dst|-]\n", argv[0]);
        return 1;
    }

    int fd_in = STDIN_FILENO;
    int fd_out = STDOUT_FILENO;

    // Open the source unless it is omitted or "-" (meaning stdin)
    if (argc > 1 && (argv[1][0] != '-' || argv[1][1] != '\0')) {
        fd_in = open(argv[1], O_RDONLY);
        if (fd_in < 0) {
            perror("open src");
            return 1;
        }
    }

    // Open the destination unless it is omitted or "-" (meaning stdout)
    if (argc > 2 && (argv[2][0] != '-' || argv[2][1] != '\0')) {
        fd_out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd_out < 0) {
            perror("open dst");
            return 1;
        }
    }

    char buf[BUF_SIZE];
    ssize_t n;
    while ((n = read(fd_in, buf, sizeof(buf))) > 0) {
        ssize_t written = 0;
        while (written < n) {
            ssize_t w = write(fd_out, buf + written, n - written);
            if (w < 0) {
                perror("write");
                return 1;
            }
            written += w;
        }
    }

    if (n < 0) {
        perror("read");
        return 1;
    }

    if (fd_in != STDIN_FILENO) close(fd_in);
    if (fd_out != STDOUT_FILENO) close(fd_out);

    return 0;
}

echo "hello" > a.txt
./mycp a.txt b.txt
cat b.txt
./mycp a.txt -
cat a.txt | ./mycp - b.txt
./mycp < a.txt > b.txt

fd	Meaning	Default target
0	stdin	terminal input
1	stdout	terminal output
2	stderr	terminal (error output)

C++ iostream
    ↓
C standard library FILE*
    ↓
System calls on fd
    ↓
Linux kernel

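Abstract

Preface: Why 'file I/O' is the first hard nut to crack in Linux programming

1. What is file I/O? First, agree on what a 'file' is

1.1 A 'file' in Linux is not the file you imagine

1.2 What is I/O? What exactly is being 'exchanged'?

1.3 File I/O = sequential operations on a 'byte stream'

1.4 Three layers of file I/O (the big picture first)

1.4.1 System layer (System I/O)

1.4.2 C standard library layer (C FILE* I/O)

1.4.3 C++ stream I/O

1.5 Why we must 'agree on what a file is' first

1.6 Summary: building the right 'view of files'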

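2. System-level file I/O: open / read / write / close

2.1 Why learn 'system-level I/O' directly

2.2 open: turning a 'path' into an 'operable object'

2.2.1 What is a file descriptor?

2.2.2 flags: the real control center

2.2.3 mode: the permission template used at creation

2.3 read: taking bytes out of a file

2.4 write: sending bytes into a file

2.5 close: ending an I/O relationship

2.6 A complete system I/O life cycle

2.7 Summary: you now stand at the 'kernel I/O viewpoint'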

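3. A deep look at open's flags (key chapter)

3.1 The overall design of flags: bitmap + combined semantics

3.2 Access modes: what you are 'allowed' to do with the file

3.2.1 The three access modes

3.2.2 Access modes affect more than permissions

3.3 O_CREAT: the 'conditional trigger' for file creation

3.3.1 O_CREAT only concerns 'whether the file exists'

3.3.2 mode only takes effect with O_CREAT

3.3.3 umask trims the permissions once more

3.4 O_TRUNC: the most dangerous and most misused flag

3.4.1 O_TRUNC only takes effect on a 'writable open'

3.4.2 O_TRUNC takes effect 'immediately'

3.5 O_APPEND: a 'kernel-level guarantee' for writes

3.5.1 O_APPEND vs lseek + write

3.5.2 Why almost every log file uses O_APPEND

3.6 O_EXCL: 'tightly bound' to O_CREAT

3.7 A real-world example of a complete flags combination

3.8 Summary of common beginner pitfalls

3.9 Summary: flags decide the 'fate of the file'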

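4. File descriptors (fd): the core abstraction of Linux I/O

4.1 Why Linux does not use 'file pointers' directly

4.2 What exactly is a file descriptor?

4.3 The true identity of an fd: what does it point to?

4.4 Why do fds start at 0?

4.5 How does open allocate an fd?

4.6 fd 'reusability': closing releases it

4.7 Who owns the file offset?

4.8 fd and fork: why child processes inherit I/O

4.9 fd and exec: why redirection 'takes effect'

4.10 fd is the foundation of every I/O technique

4.11 Summary of common beginner misconceptions

4.12 Summary: fd is the 'master switch' of Linux I/O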

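5. Standard file descriptors: the engineering meaning of 0 / 1 / 2

5.1 0 / 1 / 2 are an 'interface convention', not arbitrary numbers

5.2 Why separate stdout and stderr?

5.3 What happens behind printf / cout / cerr?

5.3.1 The C standard library view

5.3.2 The C++ stream view

5.4 'Replaceability' of the standard fds: the core of the engineering design

5.5 The engineering essence of redirection (a preview)

5.6 Why is stderr unbuffered by default?

5.7 Engineering practice: redirecting stdout / stderr by hand

5.7.1 Redirecting stdout

5.7.2 Redirecting stderr

5.8 What 0 / 1 / 2 mean in a pipeline

5.9 Mistakes beginners make most often

5.10 Summary: 0 / 1 / 2 are an engineering-grade 'interface standard'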

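6. The C file interface: what is FILE*?

6.1 FILE* is neither a file nor an fd

6.2 The true identity of FILE: a struct

6.3 What does fopen actually do?

6.4 Why have the FILE* layer at all?

6.4.1 Problems with using fd directly

6.4.2 What FILE* brings

6.5 Three buffering modes (must understand)

6.5.1 Fully buffered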
