NFS fixes - Patchwork

Message ID

OF4D8DDA37.ED72F2E9-ON88257FD1.00031E93-88257FD1.0003298D@notes.na.collabserv.com (mailing list archive)

State

New, archived

Headers

In-Reply-To: <OFAE63B935.50A753D1-ON88257FD0.00812E5D-88257FD0.0081788A@LocalDomain>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: linux-nfs@vger.kernel.org, "Srikanth Srinivasan" <ssrikanth@in.ibm.com>,
	"Trond Myklebust" <trond.myklebust@primarydata.com>,
	"Venkateswara R Puvvada" <vpuvvada@in.ibm.com>
Subject: Re: NFS fixes
From: "Marc Eshel" <eshel@us.ibm.com>
Date: Sun, 12 Jun 2016 17:34:32 -0700
References: <OF9C78122D.D4F282BF-ON88257FD0.00686038-88257FD0.006E09F0@LocalDomain>
	<OFAE63B935.50A753D1-ON88257FD0.00812E5D-88257FD0.0081788A@LocalDomain>
MIME-Version: 1.0
Content-Transfer-Encoding: 8BIT
Content-Type: text/plain; charset="US-ASCII"
Message-Id: <OF4D8DDA37.ED72F2E9-ON88257FD1.00031E93-88257FD1.0003298D@notes.na.collabserv.com>
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk

Commit Message

Marc Eshel June 13, 2016, 12:34 a.m. UTC

We are seeing data corruption when putting a very high load on the NFSv3
client, reading multi-gigabyte files in parallel. The checksums on the
files show the corruption, and looking at the data we see data in one
block that belongs in another block, but it is not the full block.
The test was run on multiple sets of hardware using different types of
servers, including kNFS and Ganesha servers with EXT3 or GPFS file systems.
The only common part in all tests is the NFSv3 client on RHEL 7.0, 7.1, 7.2.
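
One way to characterise this kind of corruption is to compare a corrupted
copy against a known-good copy and then look for where each differing run
of bytes really came from. The sketch below is only an illustration under
assumptions: it supposes both copies fit in memory, and the names
bad_range, find_bad_ranges and find_source_offset are hypothetical helpers,
not part of any NFS tooling or of the test described here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One contiguous run of bytes that differs between the two buffers. */
struct bad_range {
    size_t start;  /* offset of the first differing byte */
    size_t len;    /* number of consecutive differing bytes */
};

/*
 * Scan two equal-length buffers and record up to max_out differing runs.
 * Returns the total number of runs found, which may exceed max_out.
 */
static size_t find_bad_ranges(const unsigned char *want,
                              const unsigned char *got, size_t len,
                              struct bad_range *out, size_t max_out)
{
    size_t n = 0, i = 0;

    assert(want != NULL && got != NULL);
    while (i < len) {
        if (want[i] == got[i]) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < len && want[i] != got[i])
            i++;
        if (n < max_out) {
            out[n].start = start;
            out[n].len = i - start;
        }
        n++;
    }
    return n;
}

/*
 * Search the known-good data for another offset that holds the bytes
 * found in the corrupted run. Returns that offset, or (size_t)-1 if the
 * bytes do not occur anywhere else in the good copy.
 */
static size_t find_source_offset(const unsigned char *want, size_t len,
                                 const unsigned char *got,
                                 struct bad_range r)
{
    for (size_t off = 0; off + r.len <= len; off++) {
        if (off != r.start &&
            memcmp(want + off, got + r.start, r.len) == 0)
            return off;
    }
    return (size_t)-1;
}
```

If a differing run turns up at another offset of the good copy, that
supports the "data from another block" reading, and comparing its start
and length with the page size or the mount's rsize may hint at which
layer moved it. For very short runs a match can be coincidental.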

The question is whether there is anything upstream that might fix data
corruption by the NFSv3 client, or whether this problem has already been
reported by other users.

The only fix that I see that might be related is attached. Can this
explain a data corruption?

Thanks, Marc. 


Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Mon Aug 17 12:57:07 2015 -0500

    NFS: nfs_set_pgio_error sometimes misses errors
 
    We should ensure that we always set the pgio_header's error field
    if a READ or WRITE RPC call returns an error. The current code depends
    on 'hdr->good_bytes' always being initialised to a large value, which
    is not always done correctly by callers.
    When this happens, applications may end up missing important errors.
 
    Cc: stable@vger.kernel.org
    Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
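
To make the commit message concrete, here is a small userspace model of
the condition before and after the patch. It is a sketch under
assumptions, not kernel code: struct pgio_hdr_model and both functions are
hypothetical stand-ins, locking is omitted, a plain bool models the
NFS_IOHDR_ERROR bit, and the model assumes the branch also stores the
error code, which the truncated hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the header fields the patch touches. */
struct pgio_hdr_model {
    int64_t io_start;    /* file offset where this I/O begins */
    int64_t good_bytes;  /* bytes believed to be valid so far */
    bool    error_flag;  /* models the NFS_IOHDR_ERROR bit */
    int     error;       /* recorded errno-style value (assumption) */
};

/* Old condition: the error is recorded only when pos falls inside the
 * range currently considered good. */
static void set_error_old(struct pgio_hdr_model *h, int error, int64_t pos)
{
    assert(h != NULL);
    if (pos < h->io_start + h->good_bytes) {
        h->error_flag = true;
        h->good_bytes = pos - h->io_start;
        h->error = error;
    }
}

/* Patched condition: the first error is always recorded; a later error
 * replaces it only if it occurs at a lower offset. */
static void set_error_new(struct pgio_hdr_model *h, int error, int64_t pos)
{
    assert(h != NULL);
    bool was_set = h->error_flag;

    h->error_flag = true;  /* models test_and_set_bit() */
    if (!was_set || pos < h->io_start + h->good_bytes) {
        h->good_bytes = pos - h->io_start;
        h->error = error;
    }
}
```

With good_bytes left at 0 by a caller, set_error_old() never takes the
branch, so the flag stays clear and the error is lost; set_error_new()
records it on the first call regardless of good_bytes.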


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

J. Bruce Fields June 13, 2016, 4:01 p.m. UTC | #1

On Sun, Jun 12, 2016 at 05:34:32PM -0700, Marc Eshel wrote:
> We are seeing data corruption when putting a very high load on the NFSv3
> client, reading multi-gigabyte files in parallel. The checksums on the
> files show the corruption, and looking at the data we see data in one
> block that belongs in another block, but it is not the full block.
> The test was run on multiple sets of hardware using different types of
> servers, including kNFS and Ganesha servers with EXT3 or GPFS file systems.
> The only common part in all tests is the NFSv3 client on RHEL 7.0, 7.1, 7.2.
>
> The question is whether there is anything upstream that might fix data
> corruption by the NFSv3 client, or whether this problem has already been
> reported by other users.
>
> The only fix that I see that might be related is attached. Can this
> explain a data corruption?

It should be pretty easy to check whether there've been any READ/WRITE
errors, and rule this out if not.

Is the data being read completely static?  (So you can rule out e.g.
some subtle violation of close-to-open.)
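
A rough way to test the "completely static" assumption is to snapshot a
file's size and modification time before and after a read pass and
compare them. The sketch below assumes POSIX stat(2); file_snapshot,
take_snapshot and snapshot_changed are hypothetical names, and it only
uses the one-second st_mtime field. On an NFS client the attributes may
come from the attribute cache, so running it on the server is likely to
be more reliable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Attributes compared before and after a read pass. */
struct file_snapshot {
    off_t  size;   /* st_size at snapshot time */
    time_t mtime;  /* st_mtime at snapshot time, 1 s resolution */
};

/* Fill *snap from stat(2). Returns 0 on success, -1 on failure. */
static int take_snapshot(const char *path, struct file_snapshot *snap)
{
    struct stat st;

    assert(path != NULL && snap != NULL);
    if (stat(path, &st) != 0)
        return -1;
    snap->size = st.st_size;
    snap->mtime = st.st_mtime;
    return 0;
}

/* True if either the size or the modification time differs. */
static bool snapshot_changed(const struct file_snapshot *before,
                             const struct file_snapshot *after)
{
    assert(before != NULL && after != NULL);
    return before->size != after->size || before->mtime != after->mtime;
}
```

Take one snapshot before the parallel readers start and one after they
finish; an unchanged snapshot does not prove the data was static, but a
changed one would point away from the client read path.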

Sorry, no special knowledge here.

--b.

Marc Eshel June 13, 2016, 5:01 p.m. UTC | #2

There are no errors from vfs_read, just the data corruption. We tried it
with direct I/O (DIO) and still see the problem, so it might not be the
NFS client. We are looking again at the memory management of the
application.
Thanks, Marc.

diff --git a/fs/nfs/pagelist.c b/fs/nfs/pagelist.c
index 4984bbe..7c5718b 100644
--- a/fs/nfs/pagelist.c
+++ b/fs/nfs/pagelist.c
@@ -77,8 +77,8 @@  EXPORT_SYMBOL_GPL(nfs_pgheader_init);
 void nfs_set_pgio_error(struct nfs_pgio_header *hdr, int error, loff_t pos)
 {
        spin_lock(&hdr->lock);
-       if (pos < hdr->io_start + hdr->good_bytes) {
-               set_bit(NFS_IOHDR_ERROR, &hdr->flags);
+       if (!test_and_set_bit(NFS_IOHDR_ERROR, &hdr->flags)
+           || pos < hdr->io_start + hdr->good_bytes) {
                clear_bit(NFS_IOHDR_EOF, &hdr->flags);
                hdr->good_bytes = pos - hdr->io_start;