File I/O

Posted by fireway


File I/O

Introduction

    We’ll start our discussion of the UNIX System by describing the functions available for file I/O: open a file, read a file, write a file, and so on. Most file I/O on a UNIX system can be performed using only five functions: open, read, write, lseek, and close. We then examine the effect of various buffer sizes on the read and write functions.
    The functions described in this chapter are often referred to as unbuffered I/O, in contrast to the standard I/O routines, which we describe in Chapter 5. The term unbuffered means that each read or write invokes a system call in the kernel. These unbuffered I/O functions are not part of ISO C, but are part of POSIX.1 and the Single UNIX Specification.
    Whenever we describe the sharing of resources among multiple processes, the concept of an atomic operation becomes important. We examine this concept with regard to file I/O and the arguments to the open function. This leads to a discussion of how files are shared among multiple processes and which kernel data structures are involved. After describing these features, we describe the dup, fcntl, sync, fsync, and ioctl functions.

File Descriptors

    To the kernel, all open files are referred to by file descriptors. A file descriptor is a non-negative integer. When we open an existing file or create a new file, the kernel returns a file descriptor to the process. When we want to read or write a file, we identify the file with the file descriptor that was returned by open or creat as an argument to either read or write.
    By convention, UNIX System shells associate file descriptor 0 with the standard input of a process, file descriptor 1 with the standard output, and file descriptor 2 with the standard error. This convention is used by the shells and many applications; it is not a feature of the UNIX kernel. Nevertheless, many applications would break if these associations weren’t followed.
    Although their values are standardized by POSIX.1, the magic numbers 0, 1, and 2 should be replaced in POSIX-compliant applications with the symbolic constants STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO to improve readability. These constants are defined in the <unistd.h> header.
    File descriptors range from 0 through OPEN_MAX−1. (Recall Figure 2.11.) Early historical implementations of the UNIX System had an upper limit of 19, allowing a maximum of 20 open files per process, but many systems subsequently increased this limit to 63.
    With FreeBSD 8.0, Linux 3.2.0, Mac OS X 10.6.8, and Solaris 10, the limit is essentially infinite, bounded by the amount of memory on the system, the size of an integer, and any hard and soft limits configured by the system administrator.

open and openat Functions

    A file is opened or created by calling either the open function or the openat function.
#include <fcntl.h> 

int open(const char *path, int oflag, ... /* mode_t mode */ );

int openat(int fd, const char *path, int oflag, ... /* mode_t mode */ );

                Both return: file descriptor if OK, −1 on error
    We show the last argument as ..., which is the ISO C way to specify that the number and types of the remaining arguments may vary. For these functions, the last argument is used only when a new file is being created, as we describe later. We show this argument as a comment in the prototype.
    The path parameter is the name of the file to open or create. This function has a multitude of options, which are specified by the oflag argument. This argument is formed by ORing together one or more of the following constants from the <fcntl.h> header:
O_RDONLY Open for reading only.
O_WRONLY Open for writing only.
O_RDWR Open for reading and writing.
    Most implementations define O_RDONLY as 0, O_WRONLY as 1, and O_RDWR as 2, for compatibility with older programs.
O_EXEC Open for execute only.
O_SEARCH Open for search only (applies to directories).
    The purpose of the O_SEARCH constant is to evaluate search permissions at the time a directory is opened. Further operations using the directory’s file descriptor will not reevaluate permission to search the directory. None of the versions of the operating systems covered in this book support O_SEARCH yet.
    One and only one of the previous five constants must be specified. The following constants are optional:
O_APPEND Append to the end of file on each write. We describe this option in detail in Section 3.11.
O_CLOEXEC Set the FD_CLOEXEC file descriptor flag. We discuss file descriptor flags in Section 3.14.
O_CREAT Create the file if it doesn’t exist. This option requires a third argument to the open function (a fourth argument to the openat function) — the mode, which specifies the access permission bits of the new file. (When we describe a file’s access permission bits in Section 4.5, we’ll see how to specify the mode and how it can be modified by the umask value of a process.)
O_DIRECTORY Generate an error if path doesn’t refer to a directory.
O_EXCL Generate an error if O_CREAT is also specified and the file already exists. This test for whether the file already exists and the creation of the file if it doesn’t exist is an atomic operation. We describe atomic operations in more detail in Section 3.11.
O_NOCTTY If path refers to a terminal device, do not allocate the device as the controlling terminal for this process. We talk about controlling terminals in Section 9.6.
O_NOFOLLOW Generate an error if path refers to a symbolic link. We discuss symbolic links in Section 4.17.
O_NONBLOCK If path refers to a FIFO, a block special file, or a character special file, this option sets the nonblocking mode for both the opening of the file and subsequent I/O. We describe this mode in Section 14.2.
    In earlier releases of System V, the O_NDELAY (no delay) flag was introduced. This option is similar to the O_NONBLOCK (nonblocking) option, but an ambiguity was introduced in the return value from a read operation. The no-delay option causes a read operation to return 0 if there is no data to be read from a pipe, FIFO, or device, but this conflicts with a return value of 0, indicating an end of file. SVR4-based systems still support the no-delay option, with the old semantics, but new applications should use the nonblocking option instead.
O_SYNC Have each write wait for physical I/O to complete, including I/O necessary to update file attributes modified as a result of the write. We use this option in Section 3.14.
O_TRUNC If the file exists and if it is successfully opened for either write-only or read–write, truncate its length to 0.
O_TTY_INIT When opening a terminal device that is not already open, set the nonstandard termios parameters to values that result in behavior that conforms to the Single UNIX Specification. We discuss the termios structure when we discuss terminal I/O in Chapter 18.
    The following two flags are also optional. They are part of the synchronized input and output option of the Single UNIX Specification (and thus POSIX.1).
O_DSYNC Have each write wait for physical I/O to complete, but don’t wait for file attributes to be updated if they don’t affect the ability to read the data just written.
    The O_DSYNC and O_SYNC flags are similar, but subtly different. The O_DSYNC flag affects a file’s attributes only when they need to be updated to reflect a change in the file’s data (for example, update the file’s size to reflect more data). With the O_SYNC flag, data and attributes are always updated synchronously. When overwriting an existing part of a file opened with the O_DSYNC flag, the file times wouldn’t be updated synchronously. In contrast, if we had opened the file with the O_SYNC flag, every write to the file would update the file’s times before the write returns, regardless of whether we were writing over existing bytes or appending to the file.
O_RSYNC Have each read operation on the file descriptor wait until any pending writes for the same portion of the file are complete.
    Solaris 10 supports all three synchronization flags. Historically, FreeBSD (and thus Mac OS X) have used the O_FSYNC flag, which has the same behavior as O_SYNC. Because the two flags are equivalent, they define the flags to have the same value. FreeBSD 8.0 doesn’t support the O_DSYNC or O_RSYNC flags. Mac OS X doesn’t support the O_RSYNC flag, but defines the O_DSYNC flag, treating it the same as the O_SYNC flag. Linux 3.2.0 supports the O_DSYNC flag, but treats the O_RSYNC flag the same as O_SYNC.
    The file descriptor returned by open and openat is guaranteed to be the lowest numbered unused descriptor. This fact is used by some applications to open a new file on standard input, standard output, or standard error. For example, an application might close standard output — normally, file descriptor 1—and then open another file, knowing that it will be opened on file descriptor 1. We’ll see a better way to guarantee that a file is open on a given descriptor in Section 3.12, when we explore the dup2 function.
    The fd parameter distinguishes the openat function from the open function. There are three possibilities:
    1) The path parameter specifies an absolute pathname. In this case, the fd parameter is ignored and the openat function behaves like the open function.
    2) The path parameter specifies a relative pathname and the fd parameter is a file descriptor that specifies the starting location in the file system where the relative pathname is to be evaluated. The fd parameter is obtained by opening the directory where the relative pathname is to be evaluated.
    3) The path parameter specifies a relative pathname and the fd parameter has the special value AT_FDCWD. In this case, the pathname is evaluated starting in the current working directory and the openat function behaves like the open function.
    The openat function is one of a class of functions added to the latest version of POSIX.1 to address two problems. First, it gives threads a way to use relative pathnames to open files in directories other than the current working directory. As we’ll see in Chapter 11, all threads in the same process share the same current working directory, so this makes it difficult for multiple threads in the same process to work in different directories at the same time. Second, it provides a way to avoid time-of-check-to-time-of-use (TOCTTOU) errors.
    The basic idea behind TOCTTOU errors is that a program is vulnerable if it makes two file-based function calls where the second call depends on the results of the first call. Because the two calls are not atomic, the file can change between the two calls, thereby invalidating the results of the first call, leading to a program error. TOCTTOU errors in the file system namespace generally deal with attempts to subvert file system permissions by tricking a privileged program into either reducing permissions on a privileged file or modifying a privileged file to open up a security hole. Wei and Pu [2005] discuss TOCTTOU weaknesses in the UNIX file system interface.

Filename and Pathname Truncation

    What happens if NAME_MAX is 14 and we try to create a new file in the current directory with a filename containing 15 characters? Traditionally, early releases of System V, such as SVR2, allowed this to happen, silently truncating the filename beyond the 14th character. BSD-derived systems, in contrast, returned an error status, with errno set to ENAMETOOLONG. Silently truncating the filename presents a problem that affects more than simply the creation of new files. If NAME_MAX is 14 and a file exists whose name is exactly 14 characters, any function that accepts a pathname argument, such as open or stat, has no way to determine what the original name of the file was, as the original name might have been truncated.
    With POSIX.1, the constant _POSIX_NO_TRUNC determines whether long filenames and long components of pathnames are truncated or an error is returned. As we saw in Chapter 2, this value can vary based on the type of the file system, and we can use fpathconf or pathconf to query a directory to see which behavior is supported.
    Whether an error is returned is largely historical. For example, SVR4-based systems do not generate an error for the traditional System V file system, S5. For the BSD-style file system (known as UFS), however, SVR4-based systems do generate an error. Figure 2.20 illustrates another example: Solaris will return an error for UFS, but not for PCFS, the DOS-compatible file system, as DOS silently truncates filenames that don’t fit in an 8.3 format. BSD-derived systems and Linux always return an error.
    If _POSIX_NO_TRUNC is in effect, errno is set to ENAMETOOLONG, and an error status is returned if any filename component of the pathname exceeds NAME_MAX.
    Most modern file systems support a maximum of 255 characters for filenames. Because filenames are usually shorter than this limit, this constraint tends to not present problems for most applications.

creat Function

    A new file can also be created by calling the creat function.
#include <fcntl.h> 

int creat(const char *path, mode_t mode);

                Returns: file descriptor opened for write-only if OK, −1 on error
    Note that this function is equivalent to
open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
    Historically, in early versions of the UNIX System, the second argument to open could be only 0, 1, or 2. There was no way to open a file that didn’t already exist. Therefore, a separate system call, creat, was needed to create new files. With the O_CREAT and O_TRUNC options now provided by open, a separate creat function is no longer needed.
    We’ll show how to specify mode in Section 4.5 when we describe a file’s access permissions in detail.
    One deficiency with creat is that the file is opened only for writing. Before the new version of open was provided, if we were creating a temporary file that we wanted to write and then read back, we had to call creat, close, and then open. A better way is to use the open function, as in
open(path, O_RDWR | O_CREAT | O_TRUNC, mode);

close Function

    An open file is closed by calling the close function.
#include <unistd.h> 

int close(int fd);

                Returns: 0 if OK, −1 on error
    Closing a file also releases any record locks that the process may have on the file. We’ll discuss this point further in Section 14.3.
    When a process terminates, all of its open files are closed automatically by the kernel. Many programs take advantage of this fact and don’t explicitly close open files. See the program in Figure 1.4, for example.

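A Command-Line Introduction to Redirection and Pipes

Redirection

    Suppose you want a list of all files in the images directory whose names end in .png:
$ ls images/*.png 1>file_list
    This redirects the standard output (1) of the command to (>) the file file_list. The > operator is the output redirection operator. If the target file does not exist, it is created; if it already exists, its previous contents are overwritten.
    The default descriptor for this operator is standard output, so it need not be named on the command line. The command above can therefore be shortened to
$ ls images/*.png >file_list
The result is the same. You can then inspect the file with a text viewer such as less.
    Now suppose you want to know how many such files there are:
$ wc -l 0<file_list
The < operator is the input redirection operator, and its default descriptor is standard input (0). So you only need
$ wc -l <file_list
    Suppose you also want to strip the "extension" from every file name and save the result in another file. Redirect the standard input of sed from file_list and its output to the result file the_list:
$ sed -e 's/\.png$//g' <file_list >the_list
    Redirecting standard error is also useful. For example, you may want to know which directories under /shared you cannot access. One approach is to list the directory recursively, redirect the error output to a file, and discard standard output:
$ ls -R /shared >/dev/null 2>errors
Here standard output is redirected to (>) /dev/null, and standard error (2) is redirected to (>) the file errors.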

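Pipes

    A pipe is, in a sense, a combination of standard output and standard input redirection. It works much like a physical pipe: one process sends data into one end, and another process reads that data from the other end. As Figure 3.0 shows, the standard output of cmd1 and cmd2 that passes through a pipe is not displayed on the screen.
    The pipe operator is |.
Figure 3.0 Pipes
    Let’s return to the file list example. If you want to count the matching files directly, without first saving the list in a temporary file, you can run
$ ls images/*.png | wc -l
This connects the standard output of ls (the file list) to the standard input of wc, so you get the result directly.
Note:
1) A pipe carries only the standard output of the preceding command; it does not carry standard error.
2) The command on the right side of a pipe must be one that reads from standard input.
    You can also produce the list of file names without extensions with
$ ls images/*.png | sed -e 's/\.png$//g' >the_list
Or, if you want to view the result directly rather than save it to a file:
$ ls images/*.png | sed -e 's/\.png$//g' | less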

lseek Function

    Every open file has an associated ‘‘current file offset,’’ normally a non-negative integer that measures the number of bytes from the beginning of the file. (We describe some exceptions to the ‘‘non-negative’’ qualifier later in this section.) Read and write operations normally start at the current file offset and cause the offset to be incremented by the number of bytes read or written. By default, this offset is initialized to 0 when a file is opened, unless the O_APPEND option is specified.
    An open file’s offset can be set explicitly by calling lseek.
#include <unistd.h>

off_t lseek(int fd, off_t offset, int whence);

                Returns: new file offset if OK, −1 on error 
    
    The interpretation of the offset depends on the value of the whence argument.
  • If whence is SEEK_SET, the file’s offset is set to offset bytes from the beginning of the file.
  • If whence is SEEK_CUR, the file’s offset is set to its current value plus the offset. The offset can be positive or negative.
  • If whence is SEEK_END, the file’s offset is set to the size of the file plus the offset. The offset can be positive or negative.
    Because a successful call to lseek returns the new file offset, we can seek zero bytes from the current position to determine the current offset:
off_t currpos;
currpos = lseek(fd, 0, SEEK_CUR);
    This technique can also be used to determine if a file is capable of seeking. If the file descriptor refers to a pipe, FIFO, or socket, lseek sets errno to ESPIPE and returns −1.
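    Some common special cases:
  • To move the offset to the beginning of the file: lseek(fd, 0, SEEK_SET);
  • To move the offset to the end of the file: lseek(fd, 0, SEEK_END);
  • To obtain the current offset: lseek(fd, 0, SEEK_CUR);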
    The three symbolic constants—SEEK_SET, SEEK_CUR, and SEEK_END—were introduced with System V. Prior to this, whence was specified as 0 (absolute), 1 (relative to the current offset), or 2 (relative to the end of file). Much software still exists with these numbers hard coded.
    The character l in the name lseek means ‘‘long integer.’’ Before the introduction of the off_t data type, the offset argument and the return value were long integers. lseek was introduced with Version 7 when long integers were added to C. (Similar functionality was provided in Version 6 by the functions seek and tell.)

Example

    The program in Figure 3.1 tests its standard input to see whether it is capable of seeking.
/*
 * File: fileio/seek.c
 * Purpose: test whether standard input is capable of seeking
 * Date: Tue Aug 23 16:03:00 CST 2016
 * Author: firewaywei
 */
#include "apue.h"

int
main(void)
{
    if (lseek(STDIN_FILENO, 0, SEEK_CUR) == -1)
    {
        printf("cannot seek\n");
    }
    else
    {
        printf("seek OK\n");
    }
    exit(0);
}
 Figure 3.1 Test whether standard input is capable of seeking
    If we invoke this program interactively, we get
$ ./a.out < /etc/passwd
seek OK
$ cat < /etc/passwd | ./a.out
cannot seek
$ ./a.out < /var/spool/cron/FIFO
cannot seek
    Normally, a file’s current offset must be a non-negative integer. It is possible, however, that certain devices could allow negative offsets. But for regular files, the offset must be non-negative. Because negative offsets are possible, we should be careful to compare the return value from lseek as being equal to or not equal to −1, rather than testing whether it is less than 0.
    The /dev/kmem device on FreeBSD for the Intel x86 processor supports negative offsets. Because the offset (off_t) is a signed data type (Figure 2.21), we lose a factor of 2 in the maximum file size. If off_t is a 32-bit integer, the maximum file size is 2^31−1 bytes.
    lseek only records the current file offset within the kernel—it does not cause any I/O to take place. This offset is then used by the next read or write operation.
    The file’s offset can be greater than the file’s current size, in which case the next write to the file will extend the file. This is referred to as creating a hole in a file and is allowed. Any bytes in a file that have not been written are read back as 0.
    A hole in a file isn’t required to have storage backing it on disk. Depending on the file system implementation, when you write after seeking past the end of a file, new disk blocks might be allocated to store the data, but there is no need to allocate disk blocks for the data between the old end of file and the location where you start writing.

Example

    The program shown in Figure 3.2 creates a file with a hole in it.
/*
 * File: fileio/hole.c
 * Purpose: create a file with a hole in it
 * Date: Tue Aug 23 16:03:00 CST 2016
 * Author: firewaywei
 */
#include "apue.h"
#include <fcntl.h>

char buf1[] = "abcdefghij";
char buf2[] = "ABCDEFGHIJ";

int
main(void)
{
    int fd;

    if ((fd = creat("file.hole", FILE_MODE)) < 0)
    {
        err_sys("creat error");
    }
    if (write(fd, buf1, 10) != 10)
    {
        err_sys("buf1 write error");
    }
    /* offset now = 10 */
    if (lseek(fd, 16384, SEEK_SET) == -1)
    {
        err_sys("lseek error");
    }
    /* offset now = 16384 */
    if (write(fd, buf2, 10) != 10)
    {
        err_sys("buf2 write error");
    }
    /* offset now = 16394 */
    exit(0);
}
Figure 3.2 Create a file with a hole in it
    Running this program gives us
$ ./hole
$ ls -l file.hole
-rw-r--r-- 1 fireway fireway 16394 Aug 23 16:18 file.hole
$ od -c file.hole
0000000   a   b   c   d   e   f   g   h   i   j  \0  \0  \0  \0  \0  \0
0000020  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0040000   A   B   C   D   E   F   G   H   I   J
0040012
    We use the od(1) command to look at the contents of the file. The -c flag tells it to print the contents as characters. We can see that the unwritten bytes in the middle are read back as zero. The seven-digit number at the beginning of each line is the byte offset in octal.
    To prove that there is really a hole in the file, let’s compare the file we just created with a file of the same size, but without holes:
$ ls -ls file.hole file.nohole        # compare sizes
 8 -rw-r--r-- 1 fireway fireway 16394 Aug 23 16:18 file.hole
20 -rw-r--r-- 1 fireway fireway 16394 Aug 23 16:37 file.nohole
    Although both files are the same size, the file without holes consumes 20 disk blocks, whereas the file with holes consumes only 8 blocks.
    In this example, we call the write function (Section 3.8). We’ll have more to say about files with holes in Section 4.12.
    Because the offset address that lseek uses is represented by an off_t, implementations are allowed to support whatever size is appropriate on their particular platform. Most platforms today provide two sets of interfaces to manipulate file offsets: one set that uses 32-bit file offsets and another set that uses 64-bit file offsets.
    The Single UNIX Specification provides a way for applications to determine which environments are supported through the sysconf function (Section 2.5.4). Figure 3.3 summarizes the sysconf constants that are defined.
Name of option         | Description                                                                          | name argument
_POSIX_V7_ILP32_OFF32  | int, long, pointer, and off_t types are 32 bits.                                     | _SC_V7_ILP32_OFF32
_POSIX_V7_ILP32_OFFBIG | int, long, and pointer types are 32 bits; off_t types are at least 64 bits.          | _SC_V7_ILP32_OFFBIG
_POSIX_V7_LP64_OFF64   | int types are 32 bits; long, pointer, and off_t types are 64 bits.                   | _SC_V7_LP64_OFF64
_POSIX_V7_LP64_OFFBIG  | int types are at least 32 bits; long, pointer, and off_t types are at least 64 bits. | _SC_V7_LP64_OFFBIG
Figure 3.3 Data size options and name arguments to sysconf
    The c99 compiler requires that we use the getconf(1) command to map the desired data size model to the flags necessary to compile and link our programs. Different flags and libraries might be needed, depending on the environments supported by each platform.
    Unfortunately, this is one area in which implementations haven’t caught up to the standards. If your system does not match the latest version of the standard, the system might support the option names from the previous version of the Single UNIX Specification: _POSIX_V6_ILP32_OFF32, _POSIX_V6_ILP32_OFFBIG, _POSIX_V6_LP64_OFF64, and _POSIX_V6_LP64_OFFBIG.
    To get around this, applications can set the _FILE_OFFSET_BITS constant to 64 to enable 64-bit offsets. Doing so changes the definition of off_t to be a 64-bit signed integer. Setting _FILE_OFFSET_BITS to 32 enables 32-bit file offsets. Be aware, however, that although all four platforms discussed in this text support both 32-bit and 64-bit file offsets, setting _FILE_OFFSET_BITS is not guaranteed to be portable and might not have the desired effect.
    Figure 3.4 summarizes the size in bytes of the off_t data type for the platforms covered in this book when an application doesn’t define _FILE_OFFSET_BITS, as well as the size when an application defines _FILE_OFFSET_BITS to have a value of either 32 or 64.
Operating system    CPU architecture    _FILE_OFFSET_BITS value
                                        Undefined    32    64
FreeBSD 8.0         x86 32-bit          8            8     8
Linux 3.2.0         x86 64-bit          8            8     8
Mac OS X 10.6.8     x86 64-bit          8            8     8
Solaris 10          SPARC 64-bit        8            4     8
Figure 3.4 Size in bytes of off_t for different platforms
    Note that even though you might enable 64-bit file offsets, your ability to create a file larger than 2 GB (2^31−1 bytes) depends on the underlying file system type.

read Function

    Data is read from an open file with the read function.
#include <unistd.h> 

ssize_t read(int fd, void *buf, size_t nbytes);

                    Returns: number of bytes read, 0 if end of file, −1 on error
    If the read is successful, the number of bytes read is returned. If the end of file is encountered, 0 is returned.
    There are several cases in which the number of bytes actually read is less than the amount requested:
  • When reading from a regular file, if the end of file is reached before the requested number of bytes has been read. For example, if 30 bytes remain until the end of file and we try to read 100 bytes, read returns 30. The next time we call read, it will return 0 (end of file).
  • When reading from a terminal device. Normally, up to one line is read at a time. (We’ll see how to change this default in Chapter 18.)
  • When reading from a network. Buffering within the network may cause less than the requested amount to be returned.
  • When reading from a pipe or FIFO. If the pipe contains fewer bytes than requested, read will return only what is available.
  • When reading from a record-oriented device. Some record-oriented devices, such as magnetic tape, can return up to a single record at a time.
  • When interrupted by a signal and a partial amount of data has already been read. We discuss this further in Section 10.5.
    The read operation starts at the file’s current offset. Before a successful return, the offset is incremented by the number of bytes actually read.
    POSIX.1 changed the prototype for this function in several ways. The classic definition is
int read(int fd, char *buf, unsigned nbytes);
  • First, the second argument was changed from char * to void * to be consistent with ISO C: the type void * is used for generic pointers.
  • Next, the return value was required to be a signed integer (ssize_t) to return a positive byte count, 0 (for end of file), or −1 (for an error).
  • Finally, the third argument historically has been an unsigned integer, to allow a 16-bit implementation to read or write up to 65,534 bytes at a time. With the 1990 POSIX.1 standard, the primitive system data type ssize_t was introduced to provide the signed return value, and the unsigned size_t was used for the third argument. (Recall the SSIZE_MAX constant from Section 2.5.2.)

write Function

    Data is written to an open file with the write function.
#include <unistd.h> 

ssize_t write(int fd, const void *buf, size_t nbytes);

                Returns: number of bytes written if OK, −1 on error
    The return value is usually equal to the nbytes argument; otherwise, an error has occurred. A common cause for a write error is either filling up a disk or exceeding the file size limit for a given process (Section 7.11 and Exercise 10.11).
    For a regular file, the write operation starts at the file’s current offset. If the O_APPEND option was specified when the file was opened, the file’s offset is set to the current end of file before each write operation. After a successful write, the file’s offset is incremented by the number of bytes actually written.

I/O Efficiency

    The program in Figure 3.5 copies a file, using only the read and write functions.
/*
 * File: fileio/mycat.c
 * Purpose: copy a file using only the read and write functions
 * Date: Thu Aug 25 11:00:18 CST 2016
 * Author: firewaywei
 * Usage:
 *     ./a.out < infile > outfile
 */
#include "apue.h"

#define BUFFSIZE 4096

int
main(void)
{
    int n;
    char buf[BUFFSIZE];
    /* ... the listing is truncated at this point in the source ... */
