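Traditional UNIX implementations keep a buffer cache or page cache in the kernel, and most disk I/O passes through it. When data is written to a file, the kernel normally copies it into one of these buffers first. If the buffer is not yet full, it is not queued for output; the kernel waits until the buffer fills up, or until it needs to reuse the buffer for other disk block data, and only then queues it for output. The actual I/O takes place when the buffer reaches the head of the queue. This is called delayed write (Bach [1986], Chapter 3, discusses the buffer cache in detail).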
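Delayed write reduces the number of disk reads and writes, but it also slows down how quickly file contents are updated on disk: data destined for a file may not reach the disk for some time. If the system fails during that window, the pending updates can be lost.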
To keep the file system on disk consistent with the contents of the buffer cache, UNIX systems provide three functions: sync, fsync, and fdatasync.
sync and syncfs
#include <unistd.h>
void sync(void);
int syncfs(int fd);
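According to the commonly cited description, sync merely places all modified block buffers on the device write queue and returns immediately, leaving the actual flush to disk entirely to kernel daemon threads. Under that description, the data may still be in memory after sync returns and can be lost if the system loses power.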
That is how most online material describes it, but man 2 sync on Linux says something different:
sync() causes all pending modifications to filesystem metadata and cached file data to be written to the underlying filesystems.
syncfs() is like sync(), but synchronizes just the filesystem containing file referred to by the open file descriptor fd.
According to the standard specification (e.g., POSIX.1-2001), sync() schedules the writes, but may return before the actual writing is done. However Linux waits for I/O completions, and thus sync() or syncfs() provide the same guarantees as fsync() called on every file in the system or filesystem respectively.
Before version 1.3.20 Linux did not wait for I/O to complete before returning.
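So, going by the man page, sync on Linux (since 1.3.20) also waits for the I/O to complete before returning, even though POSIX only requires it to schedule the writes. Code that must be portable should not rely on this behavior.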
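The difference between the two calls can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the helper names flush_filesystem_of and flush_everything are made up for this example, and syncfs is Linux-specific, which is why the _GNU_SOURCE feature-test macro is defined before any header is included.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* syncfs() is a Linux extension; glibc declares it only with this macro */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: flush only the filesystem that contains `path`.
 * Returns 0 on success, -1 on failure with errno set. */
int flush_filesystem_of(const char *path)
{
    assert(path != NULL);

    /* Any open descriptor on the target filesystem will do. */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = syncfs(fd);
    int saved_errno = errno;   /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}

/* Hypothetical helper: flush every mounted filesystem.
 * sync() returns void, so it has no way to report an error. */
void flush_everything(void)
{
    sync();
}
```

Calling flush_filesystem_of("/var/lib/myapp") would, under these assumptions, flush only the filesystem holding that (illustrative) directory, which is usually much cheaper than flush_everything() on a machine with several busy filesystems.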
fsync
#include <unistd.h>
int fsync(int fd);
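fsync ensures that all modified content of the file referred to by fd has been synchronized to disk. The call blocks until the device reports that the I/O has completed.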
fsync() transfers ("flushes") all modified in-core data of (i.e., modified buffer cache pages for) the file referred to by the file descriptor fd to the disk device (or other permanent storage device) so that all changed information can be retrieved even after the system crashed or was rebooted.
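A typical use of fsync is to make a freshly written file durable before reporting success. The sketch below assumes a helper named write_durably (a hypothetical name for this example) that writes a buffer to a file and returns only after fsync has succeeded. It handles short writes and EINTR, but it does not sync the containing directory, which a newly created file would also need for its directory entry to be durable.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `len` bytes to `path` and return only after
 * fsync() reports that the data has reached the storage device.
 * Returns 0 on success, -1 on failure with errno set. */
int write_durably(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && (buf != NULL || len == 0));

    int saved_errno;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may transfer fewer bytes than requested, so loop. */
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            goto fail;
        }
        p += n;
        len -= (size_t)n;
    }

    /* At this point the data may exist only in the page cache.
     * fsync() blocks until the device reports the transfer complete. */
    if (fsync(fd) < 0)
        goto fail;

    return close(fd);

fail:
    saved_errno = errno;           /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return -1;
}
```

The important design point is that the return value of fsync is checked: a write that appeared to succeed can still fail when the kernel actually pushes it to the device, and fsync is where that error surfaces.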
fdatasync
#include <unistd.h>
int fdatasync(int fd);
fdatasync is similar to fsync, but it flushes metadata only when necessary, so that the caller does not wait on disk writes it does not need.
fdatasync() is similar to fsync(), but does not flush modified metadata unless that metadata is needed in order to allow a subsequent data retrieval to be correctly handled. For example, changes to st_atime or st_mtime do not require flushing because they are not necessary for a subsequent data read to be handled correctly. On the other hand, a change to the file size would require a metadata flush.
The aim of fdatasync() is to reduce disk activity for applications that do not require all metadata to be synchronized with the disk.
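An append-only log is one case where fdatasync may be a reasonable choice: each append changes the file size, which fdatasync does flush, while the timestamp update it skips is not needed to read the record back. The following is a minimal sketch under that assumption; append_record is a hypothetical helper name, and opening the file on every call is done only to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: append one record to the log at `path` and wait
 * until the data (and the new file size) are on stable storage.
 * Returns 0 on success, -1 on failure with errno set. */
int append_record(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && (buf != NULL || len == 0));

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0)
        return -1;

    int rc = 0;
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            rc = -1;
            break;
        }
        p += n;
        len -= (size_t)n;
    }

    /* Flush the data plus only the metadata needed to read it back
     * (such as the file size), but not timestamps like st_mtime. */
    if (rc == 0 && fdatasync(fd) < 0)
        rc = -1;

    int saved_errno = errno;       /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

A real logger would more likely keep the descriptor open across appends and might batch several records per fdatasync call, trading a small window of possible loss for fewer waits on the disk.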