Message ID | 524f69650902201721l1c7f0d34p4708980a90d7fc8f@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, On Fri, 20 Feb 2009, Steve French wrote: > I modified your patch slightly to not lose the writeback rc in one > case, and to change camel case pTcon to tcon and remove one > unnecessary local variable. Thanks. > If you have any performance numbers before and after (with e.g. dbench, > iozone, bonnie etc. or perhaps something which calls fsync more often > - that would be helpful in determining whether we need a mount option > to optionally disable it - as the samba server does) I tested the patch using a simple test driver that transfers data in 1MB chunks and calls fsync after each chunk. The results are as follows: Test run Throughput (MB/s) 1 client (MB/s) 10 clients 20 clients 40 clients cifs - vanilla 64 116 117 111 cifs - fsync.patch 54 90 95 95 As you can see, fsync does make a noticeable difference here. The problem with benchmarks in this respect is that the difference solely depends on the underlying device. In this scenario, the storage device featured a battery-backed disk cache. Without it, the overhead would have been much more significant. In any case, it's probably a good idea to introduce a mount option. The new behavior should be enabled by default though, to provide maximum data consistency. While I can't think of a legitimate reason to disable fsync if data consistency is a concern, an option might be useful in the exceptional case where 1) data consistency is not a strict requirement and 2) SMB_COM_FLUSH cannot be ignored on the server side. Thanks, Horst.
On Sun, Feb 22, 2009 at 2:50 PM, Horst Reiterer <horst.reiterer@gmail.com> wrote: > In any case, it's probably a good idea to introduce a mount option. The > new behavior should be enabled by default though, to provide maximum data > consistency. Suggestions on what to call such a new mount option? How about "nostrictfsync" ?
On Sun, 22 Feb 2009, Steve French wrote: > Suggestions on what to call such a new mount option? How about > "nostrictfsync" ? Sound good, should be self-explanatory for Samba users and those familiar with the fsync concept. Horst.
On Sun, 22 Feb 2009 22:12:11 +0100 (CET) Horst Reiterer <horst.reiterer@gmail.com> wrote: > On Sun, 22 Feb 2009, Steve French wrote: > > Suggestions on what to call such a new mount option? How about > > "nostrictfsync" ? > > Sound good, should be self-explanatory for Samba users and those familiar > with the fsync concept. > > Horst. My suggestion would be to not add a new option until someone requests it/complains about it. We already have a lot of unneeded/unused mount options, and I think this will just be adding one to the pile. My $.02...
On Sun, 22 Feb 2009 17:34:46 -0800 Jeff Layton <jlayton@redhat.com> wrote: > On Sun, 22 Feb 2009 22:12:11 +0100 (CET) > Horst Reiterer <horst.reiterer@gmail.com> wrote: > > > On Sun, 22 Feb 2009, Steve French wrote: > > > Suggestions on what to call such a new mount option? How about > > > "nostrictfsync" ? > > > > Sound good, should be self-explanatory for Samba users and those familiar > > with the fsync concept. > > > > Horst. > > My suggestion would be to not add a new option until someone requests > it/complains about it. We already have a lot of unneeded/unused mount > options, and I think this will just be adding one to the pile. > > My $.02... > ...and to lend further strength to this argument, we're not doing this at close(), but rather at fsync(). posix is pretty clear about what's supposed to happen at fsync: "The fsync() function shall request that all data for the open file descriptor named by fildes is to be transferred to the storage device associated with the file described by fildes." ...if you don't want to take the performance penalty then don't use fsync(). If you're using fsync, then you care about data integrity and adding a mount option to disable it is rather pointless.
On Sun, Feb 22, 2009 at 8:10 PM, Jeff Layton <jlayton@redhat.com> wrote: > On Sun, 22 Feb 2009 17:34:46 -0800 > Jeff Layton <jlayton@redhat.com> wrote: > >> On Sun, 22 Feb 2009 22:12:11 +0100 (CET) >> Horst Reiterer <horst.reiterer@gmail.com> wrote: >> >> > On Sun, 22 Feb 2009, Steve French wrote: >> > > Suggestions on what to call such a new mount option? How about >> > > "nostrictfsync" ? >> > >> > Sound good, should be self-explanatory for Samba users and those familiar >> > with the fsync concept. >> > >> > Horst. >> >> My suggestion would be to not add a new option until someone requests >> it/complains about it. We already have a lot of unneeded/unused mount >> options, and I think this will just be adding one to the pile. >> >> My $.02... >> > > ...and to lend further strength to this argument, we're not doing this > at close(), but rather at fsync(). posix is pretty clear about what's > supposed to happen at fsync: > > "The fsync() function shall request that all data for the open file > descriptor named by fildes is to be transferred to the storage device > associated with the file described by fildes." > > ...if you don't want to take the performance penalty then don't use > fsync(). If you're using fsync, then you care about data integrity > and adding a mount option to disable it is rather pointless. There is a lot of history here (especially in the man page of smb.conf) on dumb things apps do in this area. I don't object to allowing users to turn strict sync behavior off. There are some apps who presumably consider getting data to the server (which we already do in fsync) "stable enough" storage (clients are not battery backed up, with redundant, highly available hardware as some servers are). I agree that the default should be to write the data to the metal on the server (even though this can be very, very expensive) for fsync
diff --git a/fs/cifs/CHANGES b/fs/cifs/CHANGES index 851388f..d43e0fe 100644 --- a/fs/cifs/CHANGES +++ b/fs/cifs/CHANGES @@ -6,7 +6,9 @@ the server to treat subsequent connections, especially those that are authenticated as guest, as reconnections, invalidating the earlier user's smb session. This fix allows cifs to mount multiple times to the same server with different userids without risking invalidating earlier -established security contexts. +established security contexts. fsync now sends SMB Flush operation +to better ensure that we wait for server to write all of the data to +server disk (not just write it over the network). Version 1.56 ------------ diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c index 490e34b..877e4d9 100644 --- a/fs/cifs/cifs_debug.c +++ b/fs/cifs/cifs_debug.c @@ -340,6 +340,8 @@ static int cifs_stats_proc_show(struct seq_file *m, void *v) seq_printf(m, "\nWrites: %d Bytes: %lld", atomic_read(&tcon->num_writes), (long long)(tcon->bytes_written)); + seq_printf(m, "\nFlushes: %d", + atomic_read(&tcon->num_flushes)); seq_printf(m, "\nLocks: %d HardLinks: %d " "Symlinks: %d", atomic_read(&tcon->num_locks), diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index e004f6d..44ff94d 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -254,6 +254,7 @@ struct cifsTconInfo { atomic_t num_smbs_sent; atomic_t num_writes; atomic_t num_reads; + atomic_t num_flushes; atomic_t num_oplock_brks; atomic_t num_opens; atomic_t num_closes; diff --git a/fs/cifs/cifspdu.h b/fs/cifs/cifspdu.h index b4e2e9f..eda6e51 100644 --- a/fs/cifs/cifspdu.h +++ b/fs/cifs/cifspdu.h @@ -43,6 +43,7 @@ #define SMB_COM_CREATE_DIRECTORY 0x00 /* trivial response */ #define SMB_COM_DELETE_DIRECTORY 0x01 /* trivial response */ #define SMB_COM_CLOSE 0x04 /* triv req/rsp, timestamp ignored */ +#define SMB_COM_FLUSH 0x05 /* triv req/rsp */ #define SMB_COM_DELETE 0x06 /* trivial response */ #define SMB_COM_RENAME 0x07 /* trivial response */ #define SMB_COM_QUERY_INFORMATION 0x08 /* aka getattr */ @@ -790,6 +791,12 @@ typedef struct smb_com_close_rsp { __u16 ByteCount; /* bct = 0 */ } __attribute__((packed)) CLOSE_RSP; +typedef struct smb_com_flush_req { + struct smb_hdr hdr; /* wct = 1 */ + __u16 FileID; + __u16 ByteCount; /* 0 */ +} __attribute__((packed)) FLUSH_REQ; + typedef struct smb_com_findclose_req { struct smb_hdr hdr; /* wct = 1 */ __u16 FileID; diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index 083dfc5..596fc86 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -281,6 +281,9 @@ extern int CIFSPOSIXCreate(const int xid, struct cifsTconInfo *tcon, extern int CIFSSMBClose(const int xid, struct cifsTconInfo *tcon, const int smb_file_id); +extern int CIFSSMBFlush(const int xid, struct cifsTconInfo *tcon, + const int smb_file_id); + extern int CIFSSMBRead(const int xid, struct cifsTconInfo *tcon, const int netfid, unsigned int count, const __u64 lseek, unsigned int *nbytes, char **buf, diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 939e2f7..4c344fe 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -1934,6 +1934,27 @@ CIFSSMBClose(const int xid, struct cifsTconInfo *tcon, int smb_file_id) } int +CIFSSMBFlush(const int xid, struct cifsTconInfo *tcon, int smb_file_id) +{ + int rc = 0; + FLUSH_REQ *pSMB = NULL; + cFYI(1, ("In CIFSSMBFlush")); + + rc = small_smb_init(SMB_COM_FLUSH, 1, tcon, (void **) &pSMB); + if (rc) + return rc; + + pSMB->FileID = (__u16) smb_file_id; + pSMB->ByteCount = 0; + rc = SendReceiveNoRsp(xid, tcon->ses, (struct smb_hdr *) pSMB, 0); + cifs_stats_inc(&tcon->num_flushes); + if (rc) + cERROR(1, ("Send error in Flush = %d", rc)); + + return rc; +} + +int CIFSSMBRename(const int xid, struct cifsTconInfo *tcon, const char *fromName, const char *toName, const struct nls_table *nls_codepage, int remap) diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 12bb656..83b4741 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -1523,6 +1523,9 @@ int cifs_fsync(struct file *file, struct dentry *dentry, int datasync) { int xid; int rc = 0; + struct cifsTconInfo *tcon; + struct cifsFileInfo *smbfile = + (struct cifsFileInfo *)file->private_data; struct inode *inode = file->f_path.dentry->d_inode; xid = GetXid(); @@ -1534,7 +1537,11 @@ int cifs_fsync(struct file *file, struct dentry *dentry, int datasync) if (rc == 0) { rc = CIFS_I(inode)->write_behind_rc; CIFS_I(inode)->write_behind_rc = 0; + tcon = CIFS_SB(inode->i_sb)->tcon; + if (!rc && tcon && smbfile) + rc = CIFSSMBFlush(xid, tcon, smbfile->netfid); } + FreeXid(xid); return rc; }