Linux File Interfaces and File Descriptors

Posted 2022-02-09 WoLannnnn


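System File I/O

Besides the C library interface (C++ and other languages have their own as well), we can also access files through the system call interface.

Writing a file: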

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main()
{
    umask(0);
    int fd = open("myfile", O_WRONLY|O_CREAT, 0644);
    if(fd < 0)
    {
        perror("open");
        return 1;
    }

    int count = 5;
    const char *msg = "hello bit!\n";

    int len = strlen(msg);
    while(count--)
    {
        // fd: explained later; msg: start address of the buffer; len: number of bytes we want this call to write.
        // Return value: number of bytes actually written.
        write(fd, msg, len);
    }

    close(fd);
    return 0;
}

Reading a file:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
int main()
{
    int fd = open("myfile", O_RDONLY);
    if(fd < 0)
    {
        perror("open");
        return 1;
    }

    const char *msg = "hello bit!\n";
    char buf[1024];
    while(1)
    {
        ssize_t s = read(fd, buf, strlen(msg)); // analogous to write
        if(s > 0)
        {
            buf[s] = '\0'; // read does not terminate the data, so do it before printing with %s
            printf("%s", buf);
        }
        else
        {
            break;
        }
    }

    close(fd);
    return 0;
}

Interface overview

open (see man open)

open is a system interface.

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

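pathname: the file to open or create.

flags: options for opening the file. One or more of the constants below are combined with bitwise OR to form flags.
Options:

O_RDONLY: open read-only
O_WRONLY: open write-only
O_RDWR  : open for reading and writing
Exactly one of these three constants must be specified.
O_CREAT : create the file if it does not exist. The mode argument is then required and gives the permissions of the new file.
O_APPEND: append on every write

Return value:
Success: the new file descriptor
Failure: -1

mode_t: the man page explains it better than anything else.
Which form of open to use depends on the situation. If the target file may not exist and open has to create it, pass the third argument, which gives the permissions of the new file (further masked by the process umask); otherwise use the two-argument form.

write, read, close, lseek and so on are also system interfaces; they parallel the C file functions.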

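Using the interfaces

open

The focus here is the second argument, flags.

Besides O_RDONLY, O_WRONLY and O_RDWR, the second argument accepts O_CREAT (create the file if it does not exist; mode then gives the permissions of the new file), O_TRUNC (truncate the file to length 0 when opening it) and O_APPEND (append on every write).

The first three are easy to understand, so we will not go through them.

About O_CREAT: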

#include<stdio.h>
#include<string.h>
#include <sys/types.h> 
#include <sys/stat.h>
#include <fcntl.h>
#include<unistd.h>

int main()
{
  int fd = open("log.txt", O_WRONLY|O_CREAT, 0644);
  if (fd < 0)
  {
    perror("open file");
    return 1;
  }

  const char *buf = "hello world\n";
  write(fd, buf, strlen(buf)); // write hello world into the newly created log.txt

  close(fd);
  return 0;
}

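Result: log.txt contains one line, hello world.

If we run the program again, log.txt still contains a single hello world. Without O_APPEND, writing starts at offset 0, so the second run overwrites the existing bytes with identical ones.

Note that this is not a truncation: if the new data were shorter than the old content, the old tail would remain. To discard the old content first, add O_TRUNC, which truncates the file to length 0 when it is opened.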

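To continue writing after the existing content of log.txt, use O_APPEND, which makes every write go to the end of the file.

Change the open flags in the code above slightly: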

#include<stdio.h>
#include<string.h>
#include <sys/types.h> 
#include <sys/stat.h>
#include <fcntl.h>
#include<unistd.h>

int main()
{
  int fd = open("log.txt", O_WRONLY|O_CREAT|O_APPEND, 0644);
  if (fd < 0)
  {
    perror("open file");
    return 1;
  }

  const char *buf = "hello world\n";
  write(fd, buf, strlen(buf)); // append hello world to log.txt

  close(fd);
  return 0;
}

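Result of running the program several times: log.txt gains one more hello world line on each run.

O_RDONLY, O_WRONLY and the other options are macros that expand to integers. The option flags such as O_CREAT, O_TRUNC and O_APPEND each have a single bit set in binary, and that bit switches one feature on. To request several features we combine them with the bitwise OR operator |, which sets all the corresponding bits, so the function can tell which features were requested. The access mode is the exception: on Linux O_RDONLY is 0, O_WRONLY is 1 and O_RDWR is 2, which is why exactly one of them must be chosen.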

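write

Function prototype

ssize_t write(int fd, const void *buf, size_t count);

fd is the descriptor of the file to write to, buf points to the data to be written, count is the number of bytes we ask to have written, and the return value is the number of bytes actually written (-1 on error).

read and close can be understood by analogy with write and open.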

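The return value of open

Before looking at the return value, two concepts are needed: system calls and library functions.

fopen, fclose, fread and fwrite are functions from the C standard library; we call them library functions (libc).

open, close, read, write and lseek, on the other hand, are interfaces provided by the system; we call them system calls.

Recall the layered diagram we drew when introducing operating system concepts (the image is missing from this copy): library functions sit above the system call interface.

That makes the relationship between the two clear.
So the f* family of functions can be regarded as wrappers around the system calls, provided to make further development easier.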

The return value of open:

#include<stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
  int fd = open("log.txt", O_WRONLY);

  // create the file if it does not exist (O_CREAT requires the mode argument)
  int fd1 = open("log1.txt", O_RDONLY|O_CREAT, 0644);
  int fd2 = open("log2.txt", O_RDONLY|O_CREAT, 0644);
  int fd3 = open("log3.txt", O_RDONLY|O_CREAT, 0644);

  // open files that do not exist
  int fd4 = open("log4.txt", O_RDONLY);
  int fd5 = open("log5.txt", O_RDONLY);

  printf("%d\n", fd);
  printf("%d\n", fd1);
  printf("%d\n", fd2);
  printf("%d\n", fd3);
  printf("%d\n", fd4);
  printf("%d\n", fd5);

  close(fd);
  close(fd1);
  close(fd2);
  close(fd3);
  // if fd4 and fd5 are -1, these two calls simply fail
  close(fd4);
  close(fd5);

  return 0;
}

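As we can see, at the OS level the return value of open is simply an integer.

File descriptor fd

From studying open we know that a file descriptor is just a small integer.

But why do the integers returned for the files we open start at 3 rather than at 0 or 1?

Because 0, 1 and 2 are already taken by the three files that are open by default when a program starts: standard input is 0, standard output is 1 and standard error is 2.

0 & 1 & 2

The physical devices behind 0, 1 and 2 are usually the keyboard, the display and the display.
So input and output can also be done like this: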

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

int main()
{
    char buf[1024];
    ssize_t s = read(0, buf, sizeof(buf) - 1); // read from the keyboard into buf, leaving room for '\0'

    if(s > 0) // some data was read
    {
        buf[s] = 0; // terminate the data with '\0'
        write(1, buf, strlen(buf)); // write to standard output
        write(2, buf, strlen(buf)); // write to standard error
    }

    return 0;
}

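Look closely at these integers starting from 0: don't they look like array indices?

That is exactly what they are: indices into an array that organizes the open files.

A process can open many files, and all of those open files have to be managed. Management always means: describe first, then organize.

Open files are managed with an array of struct pointers, file* fd_array[], whose elements point to the files that are already open. These files were all opened by one process, so they have to be tied to that process: the process holds a pointer to a files_struct structure, and that structure contains the pointer array fd_array. This is how a process is connected to its open files.

The relationship, in outline: process -> files_struct -> fd_array[fd] -> file.

To summarize: a file descriptor is a small integer starting from 0. When we open a file, the operating system creates a data structure in memory to describe it, the file structure, which represents one open file object. Because it is a process that executes the open system call, the process and the file must be associated. Every process has a pointer, *files, to a table, files_struct, and the most important part of that table is an array of pointers, each pointing to an open file. In essence, a file descriptor is an index into that array, so holding a file descriptor is enough to find the corresponding file.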

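A first look at why everything is a file on Linux

The file structure reads and writes the underlying hardware through function pointers, and the lower layer supplies a matching set of functions for each kind of hardware. From the operating system's point of view, every object is therefore described by a file structure. This acts as a virtual layer that hides the differences between devices, which is why we say that everything is a file on Linux.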

File descriptor allocation rule

Straight to the code:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main()
{
    int fd = open("myfile", O_RDONLY);
    if(fd < 0)
    {
        perror("open");
        return 1;
    }

    printf("fd: %d\n", fd);

    close(fd);
    return 0;
}

The output is fd: 3
Now close 0 or 2 and look again:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main()
{
    // close 0
    close(0);
    // close 2
    //close(2);
    int fd = open("myfile", O_RDONLY);
    if(fd < 0)
    {
        perror("open");
        return 1;
    }

    printf("fd: %d\n", fd);

    close(fd);
    return 0;
}

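The result is fd: 0 or fd: 2, depending on which descriptor was closed.

So the allocation rule is: find the smallest index in the fd_array of files_struct that is not currently in use, and use it as the new file descriptor.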

Redirection

What if we close 1? Look at the code:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
int main()
{
    close(1); // close the display, freeing slot 1

    // myfile is placed in slot 1
    int fd = open("myfile", O_WRONLY|O_CREAT, 00644);
    if(fd < 0)
    {
        perror("open");
        return 1;
    }

    printf("fd: %d\n", fd);

    fflush(stdout);
    close(fd);
    exit(0);
}

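Now the text that should have appeared on the display ends up in the file myfile, and fd is 1. This is called output redirection. The common redirection operators are >, >> and <.

So what is redirection, really?

Take stdout. stdout is a file pointer, FILE*, pointing to a FILE structure, and that structure holds a file descriptor (the _fileno field in glibc). For stdout that descriptor is 1 by default. After we close the display, myfile takes slot 1. printf still writes to stdout, that is, to whatever file is in slot 1, but underneath we have replaced the display in slot 1 with myfile. The stdout pointer at the C level knows nothing about this, so stdout now effectively refers to myfile, and printf writes into myfile rather than to the display that stdout assumes.

When the shell (bash, to be specific) sees a redirection operator (take ">" as an example), it arranges, in the child process and before the command is executed, for descriptor 1 to refer to the target file instead of the display. This happens at the file descriptor level. stdout is unaware of it and keeps sending printf output to whatever is in slot 1, which is now the file we redirected to.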

Buffers

Look at this code:

#include<stdio.h>
#include<string.h>
#include <sys/types.h> 
#include <sys/stat.h>
#include <fcntl.h>
#include<unistd.h>


int main()
{
  // close the display
  close(1);

  // log.txt gets file descriptor 1
  int fd = open("log.txt", O_WRONLY|O_CREAT|O_APPEND, 0644);
  if (fd < 0)
  {
    perror("open error");
    return 1;
  }

  const char *buf1 = "hello write\n";
  const char *buf2 = "hello printf\n";
  const char *buf3 = "hello fprintf\n";

  write(1, buf1, strlen(buf1)); // write to the file with descriptor 1, i.e. log.txt

  // print buf2 to stdout; it actually goes to log.txt
  printf("%s", buf2);
  // same as printf
  fprintf(stdout, "%s", buf3);

  fork();
  fflush(stdout);

  return 0;
}

Now change the code so that file 1, the display, is not closed:

#include<stdio.h>
#include<string.h>
#include <sys/types.h> 
#include <sys/stat.h>
#include <fcntl.h>
#include<unistd.h>


int main()
{
  //close(1);
  int fd = open("log.txt", O_WRONLY|O_CREAT|O_APPEND, 0644);
  if (fd < 0)
  {
    perror("open error");
    return 1;
  }

  const char *buf1 = "hello write\n";
  const char *buf2 = "hello printf\n";
  const char *buf3 = "hello fprintf\n";

  write(1, buf1, strlen(buf1));
  printf("%s", buf2);
  fprintf(stdout, "%s", buf3);

  fork();
  fflush(stdout);

  return 0;
}

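Why do the two programs behave differently? In the first one, log.txt ends up with hello write once but hello printf and hello fprintf twice; in the second one, every line appears on the display exactly once.

First, the display is line buffered, as discussed earlier: the buffer is flushed whenever a newline is written. A regular file is fully buffered: the buffer is flushed only when it fills up or when a flush is forced.

With close(1), the data goes to log.txt. Because a file is fully buffered, printf and fprintf only place their data in the buffer; the buffer is not full and has not been flushed, so log.txt does not contain these two lines yet. When fork() creates the child, the child gets a copy of the parent's data, including the buffer, so the child's buffer holds the same two lines. Together with the parent's buffer that makes 4 lines, and when fflush runs in both processes, all 4 lines are written to log.txt.

Without close(1), the data goes to the display. Because the display is line buffered and the strings end in \n, each printf and fprintf call flushes the buffer immediately and the data appears on the display before fork(). write also sends its data to the display.

With the display closed, write did not produce two copies in log.txt the way printf and fprintf did. write is a system call, whereas printf and fprintf are C library functions, so the buffer involved here is provided by the C library, not by the system. Data written with write never enters that buffer.

The buffer we usually talk about is this user-level buffer provided by the C library, and fflush pushes the user-level buffer down to the system.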

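Using the dup2 system call

Redirecting by closing the display with close is clumsy, and what if the file is already open when we want to redirect? dup and dup2 solve this.

dup is simple, so we go straight to dup2.

The prototype is: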

#include <unistd.h>
int dup2(int oldfd, int newfd);

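dup2 makes newfd refer to the same open file as oldfd: the pointer stored in slot oldfd of fd_array is copied into slot newfd (the pointer, not the file contents). If newfd was already open, it is closed first, so whatever newfd referred to before is replaced.

Example code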

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main()
{
    int fd = open("log.txt", O_CREAT | O_RDWR, 0644);
    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    // make descriptor 1 refer to log.txt instead of the display
    dup2(fd, 1);
    for (;;)
    {
        char buf[1024] = {0};
        ssize_t read_size = read(0, buf, sizeof(buf) - 1); // read keyboard input into buf

        if (read_size < 0)
        {
            perror("read");
            break;
        }
        if (read_size == 0) // end of input (Ctrl-D)
        {
            break;
        }

        printf("%s", buf); // print buf to stdout, which now means log.txt
        fflush(stdout);    // flush the stdio buffer so the data reaches the file
    }

    return 0;
}

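Result: everything typed on the keyboard is written to log.txt instead of appearing on the display.

If the descriptor we pass to dup2 refers to the network (a socket) instead of a regular file, the same output is written to the network, which is communication.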