[v2] file-posix: Skip effectiveless OFD lock operations

Message ID	20180813023958.5352-1-famz@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> From: Fam Zheng <famz@redhat.com> To: qemu-devel@nongnu.org Date: Mon, 13 Aug 2018 10:39:58 +0800 Message-Id: <20180813023958.5352-1-famz@redhat.com> Subject: [Qemu-devel] [PATCH v2] file-posix: Skip effectiveless OFD lock operations Precedence: list Cc: Kevin Wolf <kwolf@redhat.com>, qemu-block@nongnu.org, Max Reitz <mreitz@redhat.com> Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Series	[v2] file-posix: Skip effectiveless OFD lock operations \| expand [v2] file-posix: Skip effectiveless OFD lock operations

Message ID

20180813023958.5352-1-famz@redhat.com (mailing list archive)

State

New, archived

Headers

From: Fam Zheng <famz@redhat.com>
To: qemu-devel@nongnu.org
Date: Mon, 13 Aug 2018 10:39:58 +0800
Message-Id: <20180813023958.5352-1-famz@redhat.com>
Subject: [Qemu-devel] [PATCH v2] file-posix: Skip effectiveless OFD lock
 operations
Precedence: list
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-block@nongnu.org,
	Max Reitz <mreitz@redhat.com>
Errors-To: 
 qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Sender: "Qemu-devel"
	<qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>

Series

[v2] file-posix: Skip effectiveless OFD lock operations | expand

Commit Message

Fam Zheng Aug. 13, 2018, 2:39 a.m. UTC

If we know we've already locked the bytes, don't do it again; similarly
don't unlock a byte if we haven't locked it. This doesn't change the
behavior, but fixes a corner case explained below.

Libvirt had an error handling bug that an image can get its (ownership,
file mode, SELinux) permissions changed (RHBZ 1584982) by mistake behind
QEMU. Specifically, an image in use by Libvirt VM has:

    $ ls -lhZ b.img
    -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c600,c690 b.img

Trying to attach it a second time won't work because of image locking.
And after the error, it becomes:

    $ ls -lhZ b.img
    -rw-r--r--. root root system_u:object_r:virt_image_t:s0 b.img

Then, we won't be able to do OFD lock operations with the existing fd.
In other words, the code such as in blk_detach_dev:

    blk_set_perm(blk, 0, BLK_PERM_ALL, &error_abort);

can abort() QEMU, out of environmental changes.

This patch is an easy fix to this and the change is regardlessly
reasonable, so do it.

Signed-off-by: Fam Zheng <famz@redhat.com>

---

v2: For s == NULL, unlock all bits. [Kevin]
---
 block/file-posix.c | 41 +++++++++++++++++++++++++++++++----------
 1 file changed, 31 insertions(+), 10 deletions(-)

Comments

Kevin Wolf Aug. 13, 2018, 1:42 p.m. UTC | #1

Am 13.08.2018 um 04:39 hat Fam Zheng geschrieben:
> If we know we've already locked the bytes, don't do it again; similarly
> don't unlock a byte if we haven't locked it. This doesn't change the
> behavior, but fixes a corner case explained below.
> 
> Libvirt had an error handling bug that an image can get its (ownership,
> file mode, SELinux) permissions changed (RHBZ 1584982) by mistake behind
> QEMU. Specifically, an image in use by Libvirt VM has:
> 
>     $ ls -lhZ b.img
>     -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c600,c690 b.img
> 
> Trying to attach it a second time won't work because of image locking.
> And after the error, it becomes:
> 
>     $ ls -lhZ b.img
>     -rw-r--r--. root root system_u:object_r:virt_image_t:s0 b.img
> 
> Then, we won't be able to do OFD lock operations with the existing fd.
> In other words, the code such as in blk_detach_dev:
> 
>     blk_set_perm(blk, 0, BLK_PERM_ALL, &error_abort);
> 
> can abort() QEMU, out of environmental changes.
> 
> This patch is an easy fix to this and the change is regardlessly
> reasonable, so do it.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

Thanks, applied to the block branch.

Kevin

Fam Zheng Aug. 14, 2018, 8:12 a.m. UTC | #2

On Mon, 08/13 15:42, Kevin Wolf wrote:
> Am 13.08.2018 um 04:39 hat Fam Zheng geschrieben:
> > If we know we've already locked the bytes, don't do it again; similarly
> > don't unlock a byte if we haven't locked it. This doesn't change the
> > behavior, but fixes a corner case explained below.
> > 
> > Libvirt had an error handling bug that an image can get its (ownership,
> > file mode, SELinux) permissions changed (RHBZ 1584982) by mistake behind
> > QEMU. Specifically, an image in use by Libvirt VM has:
> > 
> >     $ ls -lhZ b.img
> >     -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c600,c690 b.img
> > 
> > Trying to attach it a second time won't work because of image locking.
> > And after the error, it becomes:
> > 
> >     $ ls -lhZ b.img
> >     -rw-r--r--. root root system_u:object_r:virt_image_t:s0 b.img
> > 
> > Then, we won't be able to do OFD lock operations with the existing fd.
> > In other words, the code such as in blk_detach_dev:
> > 
> >     blk_set_perm(blk, 0, BLK_PERM_ALL, &error_abort);
> > 
> > can abort() QEMU, out of environmental changes.
> > 
> > This patch is an easy fix to this and the change is regardlessly
> > reasonable, so do it.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> 
> Thanks, applied to the block branch.

Self-NACK. This breaks raw_abort_perm_update(). The extra bytes locked by
raw_check_perm() are not tracked by s->perm (it is only updated in
raw_set_perm()), thus will not get released. This patch is "misusing" s->perm
and s->shared_perm.

I'll revise the implementation and send another version together with dropping
s->lock_fd.

Fam

Kevin Wolf Aug. 14, 2018, 8:22 a.m. UTC | #3

Am 14.08.2018 um 10:12 hat Fam Zheng geschrieben:
> On Mon, 08/13 15:42, Kevin Wolf wrote:
> > Am 13.08.2018 um 04:39 hat Fam Zheng geschrieben:
> > > If we know we've already locked the bytes, don't do it again; similarly
> > > don't unlock a byte if we haven't locked it. This doesn't change the
> > > behavior, but fixes a corner case explained below.
> > > 
> > > Libvirt had an error handling bug that an image can get its (ownership,
> > > file mode, SELinux) permissions changed (RHBZ 1584982) by mistake behind
> > > QEMU. Specifically, an image in use by Libvirt VM has:
> > > 
> > >     $ ls -lhZ b.img
> > >     -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c600,c690 b.img
> > > 
> > > Trying to attach it a second time won't work because of image locking.
> > > And after the error, it becomes:
> > > 
> > >     $ ls -lhZ b.img
> > >     -rw-r--r--. root root system_u:object_r:virt_image_t:s0 b.img
> > > 
> > > Then, we won't be able to do OFD lock operations with the existing fd.
> > > In other words, the code such as in blk_detach_dev:
> > > 
> > >     blk_set_perm(blk, 0, BLK_PERM_ALL, &error_abort);
> > > 
> > > can abort() QEMU, out of environmental changes.
> > > 
> > > This patch is an easy fix to this and the change is regardlessly
> > > reasonable, so do it.
> > > 
> > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > 
> > Thanks, applied to the block branch.
> 
> Self-NACK. This breaks raw_abort_perm_update(). The extra bytes locked by
> raw_check_perm() are not tracked by s->perm (it is only updated in
> raw_set_perm()), thus will not get released. This patch is "misusing" s->perm
> and s->shared_perm.
> 
> I'll revise the implementation and send another version together with dropping
> s->lock_fd.

Oops! I'm dequeuing the patch for now. Also, getting rid of s->lock_fd
sounds good!

I wonder if we can give this some test coverage, too, so that we'll
notice the breakage earlier next time. Maybe we can check from Python
which bits are locked?

Kevin

Fam Zheng Aug. 14, 2018, 8:31 a.m. UTC | #4

On Tue, 08/14 10:22, Kevin Wolf wrote:
> Am 14.08.2018 um 10:12 hat Fam Zheng geschrieben:
> > On Mon, 08/13 15:42, Kevin Wolf wrote:
> > > Am 13.08.2018 um 04:39 hat Fam Zheng geschrieben:
> > > > If we know we've already locked the bytes, don't do it again; similarly
> > > > don't unlock a byte if we haven't locked it. This doesn't change the
> > > > behavior, but fixes a corner case explained below.
> > > > 
> > > > Libvirt had an error handling bug that an image can get its (ownership,
> > > > file mode, SELinux) permissions changed (RHBZ 1584982) by mistake behind
> > > > QEMU. Specifically, an image in use by Libvirt VM has:
> > > > 
> > > >     $ ls -lhZ b.img
> > > >     -rw-r--r--. qemu qemu system_u:object_r:svirt_image_t:s0:c600,c690 b.img
> > > > 
> > > > Trying to attach it a second time won't work because of image locking.
> > > > And after the error, it becomes:
> > > > 
> > > >     $ ls -lhZ b.img
> > > >     -rw-r--r--. root root system_u:object_r:virt_image_t:s0 b.img
> > > > 
> > > > Then, we won't be able to do OFD lock operations with the existing fd.
> > > > In other words, the code such as in blk_detach_dev:
> > > > 
> > > >     blk_set_perm(blk, 0, BLK_PERM_ALL, &error_abort);
> > > > 
> > > > can abort() QEMU, out of environmental changes.
> > > > 
> > > > This patch is an easy fix to this and the change is regardlessly
> > > > reasonable, so do it.
> > > > 
> > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > 
> > > Thanks, applied to the block branch.
> > 
> > Self-NACK. This breaks raw_abort_perm_update(). The extra bytes locked by
> > raw_check_perm() are not tracked by s->perm (it is only updated in
> > raw_set_perm()), thus will not get released. This patch is "misusing" s->perm
> > and s->shared_perm.
> > 
> > I'll revise the implementation and send another version together with dropping
> > s->lock_fd.
> 
> Oops! I'm dequeuing the patch for now. Also, getting rid of s->lock_fd
> sounds good!
> 
> I wonder if we can give this some test coverage, too, so that we'll
> notice the breakage earlier next time. Maybe we can check from Python
> which bits are locked?

I can write a unit test around open/close/reopen in C, where it is convenient to
check the lock status with F_OFD_GETLK/F_OFD_GETLK before/after the operations.

Fam

diff --git a/block/file-posix.c b/block/file-posix.c
index fe83cbf0eb..73ae00c8c5 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -680,23 +680,42 @@  typedef enum {
  * file; if @unlock == true, also unlock the unneeded bytes.
  * @shared_perm_lock_bits is the mask of all permissions that are NOT shared.
  */
-static int raw_apply_lock_bytes(int fd,
+static int raw_apply_lock_bytes(BDRVRawState *s, int fd,
                                 uint64_t perm_lock_bits,
                                 uint64_t shared_perm_lock_bits,
                                 bool unlock, Error **errp)
 {
     int ret;
     int i;
+    uint64_t locked_perm, locked_shared_perm;
+
+    if (s) {
+        locked_perm = s->perm;
+        locked_shared_perm = ~s->shared_perm & BLK_PERM_ALL;
+    } else {
+        /*
+         * We don't have the previous bits, just lock/unlock for each of the
+         * requested bits.
+         */
+        if (unlock) {
+            locked_perm = BLK_PERM_ALL;
+            locked_shared_perm = BLK_PERM_ALL;
+        } else {
+            locked_perm = 0;
+            locked_shared_perm = 0;
+        }
+    }
 
     PERM_FOREACH(i) {
         int off = RAW_LOCK_PERM_BASE + i;
-        if (perm_lock_bits & (1ULL << i)) {
+        uint64_t bit = (1ULL << i);
+        if ((perm_lock_bits & bit) && !(locked_perm & bit)) {
             ret = qemu_lock_fd(fd, off, 1, false);
             if (ret) {
                 error_setg(errp, "Failed to lock byte %d", off);
                 return ret;
             }
-        } else if (unlock) {
+        } else if (unlock && (locked_perm & bit) && !(perm_lock_bits & bit)) {
             ret = qemu_unlock_fd(fd, off, 1);
             if (ret) {
                 error_setg(errp, "Failed to unlock byte %d", off);
@@ -706,13 +725,15 @@  static int raw_apply_lock_bytes(int fd,
     }
     PERM_FOREACH(i) {
         int off = RAW_LOCK_SHARED_BASE + i;
-        if (shared_perm_lock_bits & (1ULL << i)) {
+        uint64_t bit = (1ULL << i);
+        if ((shared_perm_lock_bits & bit) && !(locked_shared_perm & bit)) {
             ret = qemu_lock_fd(fd, off, 1, false);
             if (ret) {
                 error_setg(errp, "Failed to lock byte %d", off);
                 return ret;
             }
-        } else if (unlock) {
+        } else if (unlock && (locked_shared_perm & bit) &&
+                   !(shared_perm_lock_bits & bit)) {
             ret = qemu_unlock_fd(fd, off, 1);
             if (ret) {
                 error_setg(errp, "Failed to unlock byte %d", off);
@@ -788,7 +809,7 @@  static int raw_handle_perm_lock(BlockDriverState *bs,
 
     switch (op) {
     case RAW_PL_PREPARE:
-        ret = raw_apply_lock_bytes(s->lock_fd, s->perm | new_perm,
+        ret = raw_apply_lock_bytes(s, s->lock_fd, s->perm | new_perm,
                                    ~s->shared_perm | ~new_shared,
                                    false, errp);
         if (!ret) {
@@ -800,7 +821,7 @@  static int raw_handle_perm_lock(BlockDriverState *bs,
         op = RAW_PL_ABORT;
         /* fall through to unlock bytes. */
     case RAW_PL_ABORT:
-        raw_apply_lock_bytes(s->lock_fd, s->perm, ~s->shared_perm,
+        raw_apply_lock_bytes(s, s->lock_fd, s->perm, ~s->shared_perm,
                              true, &local_err);
         if (local_err) {
             /* Theoretically the above call only unlocks bytes and it cannot
@@ -810,7 +831,7 @@  static int raw_handle_perm_lock(BlockDriverState *bs,
         }
         break;
     case RAW_PL_COMMIT:
-        raw_apply_lock_bytes(s->lock_fd, new_perm, ~new_shared,
+        raw_apply_lock_bytes(s, s->lock_fd, new_perm, ~new_shared,
                              true, &local_err);
         if (local_err) {
             /* Theoretically the above call only unlocks bytes and it cannot
@@ -2209,7 +2230,7 @@  raw_co_create(BlockdevCreateOptions *options, Error **errp)
     shared = BLK_PERM_ALL & ~BLK_PERM_RESIZE;
 
     /* Step one: Take locks */
-    result = raw_apply_lock_bytes(fd, perm, ~shared, false, errp);
+    result = raw_apply_lock_bytes(NULL, fd, perm, ~shared, false, errp);
     if (result < 0) {
         goto out_close;
     }
@@ -2250,7 +2271,7 @@  raw_co_create(BlockdevCreateOptions *options, Error **errp)
     }
 
 out_unlock:
-    raw_apply_lock_bytes(fd, 0, 0, true, &local_err);
+    raw_apply_lock_bytes(NULL, fd, 0, 0, true, &local_err);
     if (local_err) {
         /* The above call should not fail, and if it does, that does
          * not mean the whole creation operation has failed.  So

[v2] file-posix: Skip effectiveless OFD lock operations

Commit Message

Comments

Patch