From patchwork Thu Sep 27 18:17:33 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 10618385
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, David Sterba, linux-fsdevel@vger.kernel.org
Subject: [PATCH v9 1/6] mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS
Date: Thu, 27 Sep 2018 11:17:33 -0700
Message-Id: <0f83d16b8f1fe8452a84886b4e206c3146711fe8.1538072009.git.osandov@fb.com>
X-Mailer: git-send-email 2.19.0
In-Reply-To:
References:
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned:
 ClamAV using ClamSMTP

From: Omar Sandoval

The SWP_FILE flag serves two purposes: to make swap_{read,write}page() go
through the filesystem, and to make swapoff() call ->swap_deactivate().
For Btrfs, we want the latter but not the former, so split this flag
into two. This makes us always call ->swap_deactivate() if
->swap_activate() succeeded, not just if it didn't add any swap extents
itself.

This also resolves the issue of the very misleading name of SWP_FILE,
which is only used for swap files over NFS.

Reviewed-by: Nikolay Borisov
Acked-by: Johannes Weiner
Signed-off-by: Omar Sandoval
---
 include/linux/swap.h | 13 +++++++------
 mm/page_io.c         |  6 +++---
 mm/swapfile.c        | 13 ++++++++-----
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 8e2c11e692ba..0fda0aa743f0 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -167,13 +167,14 @@ enum {
 	SWP_SOLIDSTATE	= (1 << 4),	/* blkdev seeks are cheap */
 	SWP_CONTINUED	= (1 << 5),	/* swap_map has count continuation */
 	SWP_BLKDEV	= (1 << 6),	/* its a block device */
-	SWP_FILE	= (1 << 7),	/* set after swap_activate success */
-	SWP_AREA_DISCARD = (1 << 8),	/* single-time swap area discards */
-	SWP_PAGE_DISCARD = (1 << 9),	/* freed swap page-cluster discards */
-	SWP_STABLE_WRITES = (1 << 10),	/* no overwrite PG_writeback pages */
-	SWP_SYNCHRONOUS_IO = (1 << 11),	/* synchronous IO is efficient */
+	SWP_ACTIVATED	= (1 << 7),	/* set after swap_activate success */
+	SWP_FS		= (1 << 8),	/* swap file goes through fs */
+	SWP_AREA_DISCARD = (1 << 9),	/* single-time swap area discards */
+	SWP_PAGE_DISCARD = (1 << 10),	/* freed swap page-cluster discards */
+	SWP_STABLE_WRITES = (1 << 11),	/* no overwrite PG_writeback pages */
+	SWP_SYNCHRONOUS_IO = (1 << 12),	/* synchronous IO is efficient */
 					/* add others here before... */
-	SWP_SCANNING	= (1 << 12),	/* refcount in scan_swap_map */
+	SWP_SCANNING	= (1 << 13),	/* refcount in scan_swap_map */
 };
 
 #define SWAP_CLUSTER_MAX 32UL
diff --git a/mm/page_io.c b/mm/page_io.c
index aafd19ec1db4..e8653c368069 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -283,7 +283,7 @@ int __swap_writepage(struct page *page, struct writeback_control *wbc,
 	struct swap_info_struct *sis = page_swap_info(page);
 
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
-	if (sis->flags & SWP_FILE) {
+	if (sis->flags & SWP_FS) {
 		struct kiocb kiocb;
 		struct file *swap_file = sis->swap_file;
 		struct address_space *mapping = swap_file->f_mapping;
@@ -365,7 +365,7 @@ int swap_readpage(struct page *page, bool synchronous)
 		goto out;
 	}
 
-	if (sis->flags & SWP_FILE) {
+	if (sis->flags & SWP_FS) {
 		struct file *swap_file = sis->swap_file;
 		struct address_space *mapping = swap_file->f_mapping;
 
@@ -423,7 +423,7 @@ int swap_set_page_dirty(struct page *page)
 {
 	struct swap_info_struct *sis = page_swap_info(page);
 
-	if (sis->flags & SWP_FILE) {
+	if (sis->flags & SWP_FS) {
 		struct address_space *mapping = sis->swap_file->f_mapping;
 
 		VM_BUG_ON_PAGE(!PageSwapCache(page), page);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index d954b71c4f9c..d3f95833d12e 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -989,7 +989,7 @@ int get_swap_pages(int n_goal, swp_entry_t swp_entries[], int entry_size)
 			goto nextsi;
 		}
 		if (size == SWAPFILE_CLUSTER) {
-			if (!(si->flags & SWP_FILE))
+			if (!(si->flags & SWP_FS))
 				n_ret = swap_alloc_cluster(si, swp_entries);
 		} else
 			n_ret = scan_swap_map_slots(si, SWAP_HAS_CACHE,
@@ -2310,12 +2310,13 @@ static void destroy_swap_extents(struct swap_info_struct *sis)
 		kfree(se);
 	}
 
-	if (sis->flags & SWP_FILE) {
+	if (sis->flags & SWP_ACTIVATED) {
 		struct file *swap_file = sis->swap_file;
 		struct address_space *mapping = swap_file->f_mapping;
 
-		sis->flags &= ~SWP_FILE;
-		mapping->a_ops->swap_deactivate(swap_file);
+		sis->flags &= ~SWP_ACTIVATED;
+		if (mapping->a_ops->swap_deactivate)
+			mapping->a_ops->swap_deactivate(swap_file);
 	}
 }
 
@@ -2411,8 +2412,10 @@ static int setup_swap_extents(struct swap_info_struct *sis, sector_t *span)
 
 	if (mapping->a_ops->swap_activate) {
 		ret = mapping->a_ops->swap_activate(sis, swap_file, span);
+		if (ret >= 0)
+			sis->flags |= SWP_ACTIVATED;
 		if (!ret) {
-			sis->flags |= SWP_FILE;
+			sis->flags |= SWP_FS;
 			ret = add_swap_extent(sis, 0, sis->max, 0);
 			*span = sis->pages;
 		}

From patchwork Thu Sep 27 18:17:34 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 10618391
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, David Sterba, linux-fsdevel@vger.kernel.org
Subject: [PATCH v9 2/6] mm: export add_swap_extent()
Date: Thu, 27 Sep 2018 11:17:34 -0700
Message-Id: <56d08bc182becd35eaa4605c970936caca312cae.1538072009.git.osandov@fb.com>
X-Mailer: git-send-email 2.19.0
In-Reply-To:
References:
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Omar Sandoval

Btrfs currently does not support swap files because swap's use of bmap
does not work with copy-on-write and multiple devices. See commit
35054394c4b3 ("Btrfs: stop providing a bmap operation to avoid swapfile
corruptions").

However, the swap code has a mechanism for the filesystem to manually
add swap extents using add_swap_extent() from the ->swap_activate() aop.
iomap has done this since commit 67482129cdab ("iomap: add a swapfile
activation function"). Btrfs will do the same in a later patch, so
export add_swap_extent().

Acked-by: Johannes Weiner
Signed-off-by: Omar Sandoval
---
 mm/swapfile.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index d3f95833d12e..51cb30de17bc 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2365,6 +2365,7 @@ add_swap_extent(struct swap_info_struct *sis, unsigned long start_page,
 	list_add_tail(&new_se->list, &sis->first_swap_extent.list);
 	return 1;
 }
+EXPORT_SYMBOL_GPL(add_swap_extent);
 
 /*
  * A `swap extent' is a simple thing which maps a contiguous range of pages

From patchwork Thu Sep 27 18:17:35 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 10618397
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, David Sterba, linux-fsdevel@vger.kernel.org, Jonathan Corbet, Al Viro
Subject: [PATCH v9 3/6] vfs: update swap_{,de}activate documentation
Date: Thu, 27 Sep 2018 11:17:35 -0700
Message-Id: <79f71cb15a9008c000ae3eb77118d32cb88948ba.1538072009.git.osandov@fb.com>
X-Mailer: git-send-email 2.19.0
In-Reply-To:
References:
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Omar Sandoval

The documentation for these functions is wrong in several ways:

- swap_activate() is called with the inode locked
- swap_activate() takes a swap_info_struct * and a sector_t *
- swap_activate() can also return a positive number of extents it added
  itself
- swap_deactivate() does not return anything

Cc: Jonathan Corbet
Cc: Al Viro
Reviewed-by: Nikolay Borisov
Signed-off-by: Omar Sandoval
---
Hi, Jon, Al, could I get an ack on this patch? Thanks!

 Documentation/filesystems/Locking | 17 +++++++----------
 Documentation/filesystems/vfs.txt | 12 ++++++++----
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index efea228ccd8a..b970c8c2ee22 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -210,8 +210,9 @@ prototypes:
 	int (*launder_page)(struct page *);
 	int (*is_partially_uptodate)(struct page *, unsigned long, unsigned long);
 	int (*error_remove_page)(struct address_space *, struct page *);
-	int (*swap_activate)(struct file *);
-	int (*swap_deactivate)(struct file *);
+	int (*swap_activate)(struct swap_info_struct *, struct file *,
+			     sector_t *);
+	void (*swap_deactivate)(struct file *);
 
 locking rules:
 	All except set_page_dirty and freepage may block
@@ -235,8 +236,8 @@ putback_page:		yes
 launder_page:		yes
 is_partially_uptodate:	yes
 error_remove_page:	yes
-swap_activate:		no
-swap_deactivate:	no
+swap_activate:		yes
+swap_deactivate:	no
 
 ->write_begin(), ->write_end() and ->readpage() may be called from
 the request handler (/dev/loop).
@@ -333,14 +334,10 @@ cleaned, or an error value if not. Note that in order to prevent the page
 getting mapped back in and redirtied, it needs to be kept locked
 across the entire operation.
 
-	->swap_activate will be called with a non-zero argument on
-files backing (non block device backed) swapfiles. A return value
-of zero indicates success, in which case this file can be used for
-backing swapspace. The swapspace operations will be proxied to the
-address space operations.
+	->swap_activate is called from sys_swapon() with the inode locked.
 
 	->swap_deactivate() will be called in the sys_swapoff()
-path after ->swap_activate() returned success.
+path after ->swap_activate() returned success. The inode is not locked.
 
 ----------------------- file_lock_operations ------------------------------
 prototypes:
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index a6c6a8af48a2..6e14db053eaa 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -652,8 +652,9 @@ struct address_space_operations {
 					unsigned long);
 	void (*is_dirty_writeback) (struct page *, bool *, bool *);
 	int (*error_remove_page) (struct mapping *mapping, struct page *page);
-	int (*swap_activate)(struct file *);
-	int (*swap_deactivate)(struct file *);
+	int (*swap_activate)(struct swap_info_struct *, struct file *,
+			     sector_t *);
+	void (*swap_deactivate)(struct file *);
 };
 
   writepage: called by the VM to write a dirty page to backing store.
@@ -830,8 +831,11 @@ struct address_space_operations {
 
   swap_activate: Called when swapon is used on a file to allocate
 	space if necessary and pin the block lookup information in
-	memory. A return value of zero indicates success,
-	in which case this file can be used to back swapspace.
+	memory. If this returns zero, the swap system will call the address
+	space operations ->readpage() and ->direct_IO(). Alternatively, this
+	may call add_swap_extent() and return the number of extents added, in
+	which case the swap system will use the provided blocks directly
+	instead of going through the filesystem.
 
   swap_deactivate: Called during swapoff on files where swap_activate
 	was successful.
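
Illustration only, not part of the series: the return-value contract
documented above for ->swap_activate(), together with the flag split from
patch 1/6, can be modelled in a small user-space C sketch. The flag values
and the "ret >= 0" / "ret == 0" checks are taken from the diffs; the helper
names flags_after_activate(), needs_deactivate() and io_goes_through_fs()
are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as they appear in the patched include/linux/swap.h. */
enum {
	SWP_ACTIVATED	= (1 << 7),	/* set after swap_activate success */
	SWP_FS		= (1 << 8),	/* swap file goes through fs */
};

/*
 * Hypothetical model of the flag handling in setup_swap_extents().
 * activate_ret stands in for the return value of ->swap_activate():
 *   <  0  error, no flags set
 *   == 0  success, swap I/O is routed through the filesystem
 *   >  0  success, the filesystem added that many extents itself
 */
static inline unsigned int flags_after_activate(int activate_ret)
{
	unsigned int flags = 0;

	if (activate_ret >= 0)
		flags |= SWP_ACTIVATED;
	if (activate_ret == 0)
		flags |= SWP_FS;

	/* In this model SWP_FS is never set without SWP_ACTIVATED. */
	assert(!(flags & SWP_FS) || (flags & SWP_ACTIVATED));
	return flags;
}

/* Models the check in destroy_swap_extents(): call ->swap_deactivate()? */
static inline bool needs_deactivate(unsigned int flags)
{
	return (flags & SWP_ACTIVATED) != 0;
}

/* Models the check in swap_readpage()/__swap_writepage(). */
static inline bool io_goes_through_fs(unsigned int flags)
{
	return (flags & SWP_FS) != 0;
}
```

In this model a return of 0 sets both flags, a positive return sets only
SWP_ACTIVATED (so swapoff still reaches ->swap_deactivate() while reads and
writes bypass the filesystem), and a negative return sets neither.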
From patchwork Thu Sep 27 18:17:36 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 10618405
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, David Sterba, linux-fsdevel@vger.kernel.org
Subject: [PATCH v9 4/6] Btrfs: prevent ioctls from interfering with a swap file
Date: Thu, 27 Sep 2018 11:17:36 -0700
Message-Id:
X-Mailer: git-send-email 2.19.0
In-Reply-To:
References:
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Omar Sandoval

A later patch will implement swap file support for Btrfs, but before we
do that, we need to make sure that the various Btrfs ioctls cannot
change a swap file.

When a swap file is active, we must make sure that the extents of the
file are not moved and that they don't become shared. That means that
the following are not safe:

- chattr +c (enable compression)
- reflink
- dedupe
- snapshot
- defrag

Don't allow those to happen on an active swap file.

Additionally, balance, resize, device remove, and device replace are
also unsafe if they affect an active swapfile. Add a red-black tree of
block groups and devices which contain an active swapfile. Relocation
checks each block group against this tree and skips it or errors out for
balance or resize, respectively. Device remove and device replace check
the tree for the device they will operate on.

Note that we don't have to worry about chattr -C (disable nocow), which
we ignore for non-empty files, because an active swapfile must be
non-empty and can't be truncated. We also don't have to worry about
autodefrag because it's only done on COW files. Truncate and fallocate
are already taken care of by the generic code. Device add doesn't do
relocation so it's not an issue, either.

Signed-off-by: Omar Sandoval
---
 fs/btrfs/ctree.h       | 29 +++++++++++++++++++++++
 fs/btrfs/dev-replace.c |  8 +++++++
 fs/btrfs/disk-io.c     |  4 ++++
 fs/btrfs/ioctl.c       | 31 +++++++++++++++++++++---
 fs/btrfs/relocation.c  | 18 ++++++++++----
 fs/btrfs/volumes.c     | 53 ++++++++++++++++++++++++++++++++++++++----
 6 files changed, 131 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2cddfe7806a4..08df61b8fc87 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -716,6 +716,28 @@ struct btrfs_fs_devices;
 struct btrfs_balance_control;
 struct btrfs_delayed_root;
 
+/*
+ * Block group or device which contains an active swapfile. Used for preventing
+ * unsafe operations while a swapfile is active.
+ *
+ * These are sorted on (ptr, inode) (note that a block group or device can
+ * contain more than one swapfile). We compare the pointer values because we
+ * don't actually care what the object is, we just need a quick check whether
+ * the object exists in the rbtree.
+ */
+struct btrfs_swapfile_pin {
+	struct rb_node node;
+	void *ptr;
+	struct inode *inode;
+	/*
+	 * If true, ptr points to a struct btrfs_block_group_cache. Otherwise,
+	 * ptr points to a struct btrfs_device.
+	 */
+	bool is_block_group;
+};
+
+bool btrfs_pinned_by_swapfile(struct btrfs_fs_info *fs_info, void *ptr);
+
 #define BTRFS_FS_BARRIER			1
 #define BTRFS_FS_CLOSING_START	2
 #define BTRFS_FS_CLOSING_DONE	3
@@ -1121,6 +1143,10 @@ struct btrfs_fs_info {
 	u32 sectorsize;
 	u32 stripesize;
 
+	/* Block groups and devices containing active swapfiles. */
+	spinlock_t swapfile_pins_lock;
+	struct rb_root swapfile_pins;
+
 #ifdef CONFIG_BTRFS_FS_REF_VERIFY
 	spinlock_t ref_verify_lock;
 	struct rb_root block_tree;
@@ -1286,6 +1312,9 @@ struct btrfs_root {
 	spinlock_t qgroup_meta_rsv_lock;
 	u64 qgroup_meta_rsv_pertrans;
 	u64 qgroup_meta_rsv_prealloc;
+
+	/* Number of active swapfiles */
+	atomic_t nr_swapfiles;
 };
 
 struct btrfs_file_private {
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index dec01970d8c5..781006b6fca3 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -414,6 +414,14 @@ int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info,
 	if (ret)
 		return ret;
 
+	if (btrfs_pinned_by_swapfile(fs_info, src_device)) {
+		btrfs_warn_in_rcu(fs_info,
+	  "cannot replace device %s (devid %llu) due to active swapfile",
+				  btrfs_dev_name(src_device),
+				  src_device->devid);
+		return -ETXTBSY;
+	}
+
 	ret = btrfs_init_dev_replace_tgtdev(fs_info, tgtdev_name,
 					    src_device, &tgt_device);
 	if (ret)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 05dc3c17cb62..2428a73067d2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1188,6 +1188,7 @@ static void __setup_root(struct btrfs_root *root, struct btrfs_fs_info *fs_info,
 	refcount_set(&root->refs, 1);
 	atomic_set(&root->will_be_snapshotted, 0);
 	atomic_set(&root->snapshot_force_cow, 0);
+	atomic_set(&root->nr_swapfiles, 0);
 	root->log_transid = 0;
 	root->log_transid_committed = -1;
 	root->last_log_commit = 0;
@@ -2782,6 +2783,9 @@ int open_ctree(struct super_block *sb,
 	fs_info->sectorsize = 4096;
 	fs_info->stripesize = 4096;
 
+	spin_lock_init(&fs_info->swapfile_pins_lock);
+	fs_info->swapfile_pins = RB_ROOT;
+
 	ret = btrfs_alloc_stripe_hash_table(fs_info);
 	if (ret) {
 		err = ret;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index d60b6caf09e8..72f56c1bbc74 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -290,6 +290,11 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 	} else if (fsflags & FS_COMPR_FL) {
 		const char *comp;
 
+		if (IS_SWAPFILE(inode)) {
+			ret = -ETXTBSY;
+			goto out_unlock;
+		}
+
 		binode->flags |= BTRFS_INODE_COMPRESS;
 		binode->flags &= ~BTRFS_INODE_NOCOMPRESS;
 
@@ -752,6 +757,12 @@ static int create_snapshot(struct btrfs_root *root, struct inode *dir,
 	if (!test_bit(BTRFS_ROOT_REF_COWS, &root->state))
 		return -EINVAL;
 
+	if (atomic_read(&root->nr_swapfiles)) {
+		btrfs_warn(fs_info,
+			   "cannot snapshot subvolume with active swapfile");
+		return -ETXTBSY;
+	}
+
 	pending_snapshot = kzalloc(sizeof(*pending_snapshot), GFP_KERNEL);
 	if (!pending_snapshot)
 		return -ENOMEM;
@@ -1503,9 +1514,13 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
 		}
 
 		inode_lock(inode);
-		if (do_compress)
-			BTRFS_I(inode)->defrag_compress = compress_type;
-		ret = cluster_pages_for_defrag(inode, pages, i, cluster);
+		if (IS_SWAPFILE(inode)) {
+			ret = -ETXTBSY;
+		} else {
+			if (do_compress)
+				BTRFS_I(inode)->defrag_compress = compress_type;
+			ret = cluster_pages_for_defrag(inode, pages, i, cluster);
+		}
 		if (ret < 0) {
 			inode_unlock(inode);
 			goto out_ra;
@@ -3573,6 +3588,11 @@ static int btrfs_extent_same(struct inode *src, u64 loff, u64 olen,
 		goto out_unlock;
 	}
 
+	if (IS_SWAPFILE(src) || IS_SWAPFILE(dst)) {
+		ret = -ETXTBSY;
+		goto out_unlock;
+	}
+
 	tail_len = olen % BTRFS_MAX_DEDUPE_LEN;
 	chunk_count = div_u64(olen, BTRFS_MAX_DEDUPE_LEN);
 	if (chunk_count == 0)
@@ -4269,6 +4289,11 @@ static noinline int btrfs_clone_files(struct file *file, struct file *file_src,
 		goto out_unlock;
 	}
 
+	if (IS_SWAPFILE(src) || IS_SWAPFILE(inode)) {
+		ret = -ETXTBSY;
+		goto out_unlock;
+	}
+
 	/* determine range to clone */
 	ret = -EINVAL;
 	if (off + len > src->i_size || off + len < off)
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 8783a1776540..7468a0f55cd2 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -4226,6 +4226,7 @@ static void describe_relocation(struct btrfs_fs_info *fs_info,
  */
 int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start)
 {
+	struct btrfs_block_group_cache *bg;
 	struct btrfs_root *extent_root = fs_info->extent_root;
 	struct reloc_control *rc;
 	struct inode *inode;
@@ -4234,14 +4235,23 @@ int btrfs_relocate_block_group(struct btrfs_fs_info *fs_info, u64 group_start)
 	int rw = 0;
 	int err = 0;
 
+	bg = btrfs_lookup_block_group(fs_info, group_start);
+	if (!bg)
+		return -ENOENT;
+
+	if (btrfs_pinned_by_swapfile(fs_info, bg)) {
+		btrfs_put_block_group(bg);
+		return -ETXTBSY;
+	}
+
 	rc = alloc_reloc_control();
-	if (!rc)
+	if (!rc) {
+		btrfs_put_block_group(bg);
 		return -ENOMEM;
+	}
 
 	rc->extent_root = extent_root;
-
-	rc->block_group = btrfs_lookup_block_group(fs_info, group_start);
-	BUG_ON(!rc->block_group);
+	rc->block_group = bg;
 
 	ret = btrfs_inc_block_group_ro(rc->block_group);
 	if (ret) {
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f4405e430da6..aa37ae30bf62 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1882,6 +1882,14 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 	if (ret)
 		goto out;
 
+	if (btrfs_pinned_by_swapfile(fs_info, device)) {
+		btrfs_warn_in_rcu(fs_info,
+		  "cannot remove device %s (devid %llu) due to active swapfile",
+				  rcu_str_deref(device->name), device->devid);
+		ret = -ETXTBSY;
+		goto out;
+	}
+
 	if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
 		ret = BTRFS_ERROR_DEV_TGT_REPLACE;
 		goto out;
@@ -3626,10 +3634,15 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
 
 		ret = btrfs_relocate_chunk(fs_info, found_key.offset);
 		mutex_unlock(&fs_info->delete_unused_bgs_mutex);
-		if (ret && ret != -ENOSPC)
-			goto error;
 		if (ret == -ENOSPC) {
 			enospc_errors++;
+		} else if (ret == -ETXTBSY) {
+			btrfs_info(fs_info,
+	   "skipping relocation of block group %llu due to active swapfile",
+				   found_key.offset);
+			ret = 0;
+		} else if (ret) {
+			goto error;
 		} else {
 			spin_lock(&fs_info->balance_lock);
 			bctl->stat.completed++;
@@ -4426,10 +4439,16 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 
 		ret = btrfs_relocate_chunk(fs_info, chunk_offset);
 		mutex_unlock(&fs_info->delete_unused_bgs_mutex);
-		if (ret && ret != -ENOSPC)
-			goto done;
-		if (ret == -ENOSPC)
+		if (ret == -ENOSPC) {
 			failed++;
+		} else if (ret) {
+			if (ret == -ETXTBSY) {
+				btrfs_warn(fs_info,
+		   "could not shrink block group %llu due to active swapfile",
+					   chunk_offset);
+			}
+			goto done;
+		}
 	} while (key.offset-- > 0);
 
 	if (failed && !retried) {
@@ -7530,3 +7549,27 @@ int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info)
 	btrfs_free_path(path);
 	return ret;
 }
+
+/*
+ * Check whether the given block group or device is pinned by any inode being
+ * used as a swapfile.
+ */ +bool btrfs_pinned_by_swapfile(struct btrfs_fs_info *fs_info, void *ptr) +{ + struct btrfs_swapfile_pin *sp; + struct rb_node *node; + + spin_lock(&fs_info->swapfile_pins_lock); + node = fs_info->swapfile_pins.rb_node; + while (node) { + sp = rb_entry(node, struct btrfs_swapfile_pin, node); + if (ptr < sp->ptr) + node = node->rb_left; + else if (ptr > sp->ptr) + node = node->rb_right; + else + break; + } + spin_unlock(&fs_info->swapfile_pins_lock); + return node != NULL; +}

From patchwork Thu Sep 27 18:17:37 2018
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 10618403
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, David Sterba, linux-fsdevel@vger.kernel.org
Subject: [PATCH v9 5/6] Btrfs: rename get_chunk_map() and make it non-static
Date: Thu, 27 Sep 2018 11:17:37 -0700

The Btrfs swap code is going to need it, so give it a btrfs_ prefix and make it non-static.

Reviewed-by: Nikolay Borisov Signed-off-by: Omar Sandoval --- fs/btrfs/volumes.c | 29 ++++++++++++++++++----------- fs/btrfs/volumes.h | 2 ++ 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index aa37ae30bf62..20c26afdd330 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2714,8 +2714,15 @@ static int btrfs_del_sys_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset) return ret; } -static struct extent_map *get_chunk_map(struct btrfs_fs_info *fs_info, - u64 logical, u64 length) +/* + * btrfs_get_chunk_map() - Find the mapping containing the given logical extent. + * @logical: Logical block offset in bytes. + * @length: Length of extent in bytes. + * + * Return: Chunk mapping or ERR_PTR.
+ */ +struct extent_map *btrfs_get_chunk_map(struct btrfs_fs_info *fs_info, + u64 logical, u64 length) { struct extent_map_tree *em_tree; struct extent_map *em; @@ -2752,7 +2759,7 @@ int btrfs_remove_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset) int i, ret = 0; struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; - em = get_chunk_map(fs_info, chunk_offset, 1); + em = btrfs_get_chunk_map(fs_info, chunk_offset, 1); if (IS_ERR(em)) { /* * This is a logic error, but we don't want to just rely on the @@ -4902,7 +4909,7 @@ int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans, int i = 0; int ret = 0; - em = get_chunk_map(fs_info, chunk_offset, chunk_size); + em = btrfs_get_chunk_map(fs_info, chunk_offset, chunk_size); if (IS_ERR(em)) return PTR_ERR(em); @@ -5044,7 +5051,7 @@ int btrfs_chunk_readonly(struct btrfs_fs_info *fs_info, u64 chunk_offset) int miss_ndevs = 0; int i; - em = get_chunk_map(fs_info, chunk_offset, 1); + em = btrfs_get_chunk_map(fs_info, chunk_offset, 1); if (IS_ERR(em)) return 1; @@ -5104,7 +5111,7 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len) struct map_lookup *map; int ret; - em = get_chunk_map(fs_info, logical, len); + em = btrfs_get_chunk_map(fs_info, logical, len); if (IS_ERR(em)) /* * We could return errors for these cases, but that could get @@ -5150,7 +5157,7 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info, struct map_lookup *map; unsigned long len = fs_info->sectorsize; - em = get_chunk_map(fs_info, logical, len); + em = btrfs_get_chunk_map(fs_info, logical, len); if (!WARN_ON(IS_ERR(em))) { map = em->map_lookup; @@ -5167,7 +5174,7 @@ int btrfs_is_parity_mirror(struct btrfs_fs_info *fs_info, u64 logical, u64 len) struct map_lookup *map; int ret = 0; - em = get_chunk_map(fs_info, logical, len); + em = btrfs_get_chunk_map(fs_info, logical, len); if(!WARN_ON(IS_ERR(em))) { map = em->map_lookup; @@ -5326,7 +5333,7 @@ static int __btrfs_map_block_for_discard(struct 
btrfs_fs_info *fs_info, /* discard always return a bbio */ ASSERT(bbio_ret); - em = get_chunk_map(fs_info, logical, length); + em = btrfs_get_chunk_map(fs_info, logical, length); if (IS_ERR(em)) return PTR_ERR(em); @@ -5652,7 +5659,7 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, return __btrfs_map_block_for_discard(fs_info, logical, *length, bbio_ret); - em = get_chunk_map(fs_info, logical, *length); + em = btrfs_get_chunk_map(fs_info, logical, *length); if (IS_ERR(em)) return PTR_ERR(em); @@ -5951,7 +5958,7 @@ int btrfs_rmap_block(struct btrfs_fs_info *fs_info, u64 chunk_start, u64 rmap_len; int i, j, nr = 0; - em = get_chunk_map(fs_info, chunk_start, 1); + em = btrfs_get_chunk_map(fs_info, chunk_start, 1); if (IS_ERR(em)) return -EIO; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 23e9285d88de..f4c190c2ab84 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -465,6 +465,8 @@ unsigned long btrfs_full_stripe_len(struct btrfs_fs_info *fs_info, int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans, u64 chunk_offset, u64 chunk_size); int btrfs_remove_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset); +struct extent_map *btrfs_get_chunk_map(struct btrfs_fs_info *fs_info, + u64 logical, u64 length); static inline void btrfs_dev_stat_inc(struct btrfs_device *dev, int index)

From patchwork Thu Sep 27 18:17:38 2018
X-Patchwork-Submitter: Omar Sandoval
X-Patchwork-Id: 10618407
From: Omar Sandoval
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, David Sterba, linux-fsdevel@vger.kernel.org
Subject: [PATCH v9 6/6] Btrfs: support swap files
Date: Thu, 27 Sep 2018 11:17:38 -0700
Message-Id: <08c9e240cc1de1e861c0e3781b7405a8c3f0a120.1538072009.git.osandov@fb.com>

Btrfs has not allowed swap files since commit 35054394c4b3 ("Btrfs: stop providing a bmap operation to avoid swapfile corruptions"). However, now that the proper restrictions are in place, Btrfs can support swap files through the swap file a_ops, similar to iomap in commit 67482129cdab ("iomap: add a swapfile activation function").

For Btrfs, activation needs to make sure that the file can be used as a swap file, which currently means that it must be fully allocated as nocow with no compression on one device. It must also do the proper tracking so that ioctls will not interfere with the swap file. Deactivation clears this tracking.
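
As a rough illustration of the requirements above, the following userspace sketch models the per-extent checks that activation has to make. It is not the kernel code: struct extent_desc, its fields, and swapfile_extents_ok() are hypothetical names invented for this example, and the real checks work on extent maps and chunk maps rather than on a flat array.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one file extent (not a kernel type). */
struct extent_desc {
	bool is_hole;        /* unallocated range */
	bool is_inline;      /* data stored inline in metadata */
	bool is_compressed;  /* extent is compressed */
	bool needs_cow;      /* a write would have to be copied elsewhere */
	bool single_profile; /* chunk uses the single data profile */
	int device_id;       /* hypothetical id of the backing device */
};

/*
 * Return 0 if every extent could back a swap file under the rules in the
 * commit message, or -EINVAL at the first extent that breaks one of them.
 */
int swapfile_extents_ok(const struct extent_desc *ext, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (ext[i].is_hole || ext[i].is_inline)
			return -EINVAL; /* holes and inline extents are rejected */
		if (ext[i].is_compressed)
			return -EINVAL; /* must not be compressed */
		if (ext[i].needs_cow)
			return -EINVAL; /* must be nocow */
		if (!ext[i].single_profile)
			return -EINVAL; /* must use the single data profile */
		if (ext[i].device_id != ext[0].device_id)
			return -EINVAL; /* must stay on one device */
	}
	return 0;
}
```

The sketch only captures the accept/reject decision; it leaves out the locking and the block group and device pinning that make the decision stay valid while the swap file is active.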
Signed-off-by: Omar Sandoval --- fs/btrfs/inode.c | 338 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 338 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 3ea5339603cf..8f8b7079e1ba 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include "ctree.h" #include "disk-io.h" @@ -10488,6 +10489,341 @@ void btrfs_set_range_writeback(struct extent_io_tree *tree, u64 start, u64 end) } } +#ifdef CONFIG_SWAP +/* + * Add an entry indicating a block group or device which is pinned by a + * swapfile. Returns 0 on success, 1 if there is already an entry for it, or a + * negative errno on failure. + */ +static int btrfs_add_swapfile_pin(struct inode *inode, void *ptr, + bool is_block_group) +{ + struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; + struct btrfs_swapfile_pin *sp, *entry; + struct rb_node **p; + struct rb_node *parent = NULL; + + sp = kmalloc(sizeof(*sp), GFP_NOFS); + if (!sp) + return -ENOMEM; + sp->ptr = ptr; + sp->inode = inode; + sp->is_block_group = is_block_group; + + spin_lock(&fs_info->swapfile_pins_lock); + p = &fs_info->swapfile_pins.rb_node; + while (*p) { + parent = *p; + entry = rb_entry(parent, struct btrfs_swapfile_pin, node); + if (sp->ptr < entry->ptr || + (sp->ptr == entry->ptr && sp->inode < entry->inode)) { + p = &(*p)->rb_left; + } else if (sp->ptr > entry->ptr || + (sp->ptr == entry->ptr && sp->inode > entry->inode)) { + p = &(*p)->rb_right; + } else { + spin_unlock(&fs_info->swapfile_pins_lock); + kfree(sp); + return 1; + } + } + rb_link_node(&sp->node, parent, p); + rb_insert_color(&sp->node, &fs_info->swapfile_pins); + spin_unlock(&fs_info->swapfile_pins_lock); + return 0; +} + +/* Free all of the entries pinned by this swapfile. 
*/ +static void btrfs_free_swapfile_pins(struct inode *inode) +{ + struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; + struct btrfs_swapfile_pin *sp; + struct rb_node *node, *next; + + spin_lock(&fs_info->swapfile_pins_lock); + node = rb_first(&fs_info->swapfile_pins); + while (node) { + next = rb_next(node); + sp = rb_entry(node, struct btrfs_swapfile_pin, node); + if (sp->inode == inode) { + rb_erase(&sp->node, &fs_info->swapfile_pins); + if (sp->is_block_group) + btrfs_put_block_group(sp->ptr); + kfree(sp); + } + node = next; + } + spin_unlock(&fs_info->swapfile_pins_lock); +} + +struct btrfs_swap_info { + u64 start; + u64 block_start; + u64 block_len; + u64 lowest_ppage; + u64 highest_ppage; + unsigned long nr_pages; + int nr_extents; +}; + +static int btrfs_add_swap_extent(struct swap_info_struct *sis, + struct btrfs_swap_info *bsi) +{ + unsigned long nr_pages; + u64 first_ppage, first_ppage_reported, next_ppage; + int ret; + + first_ppage = ALIGN(bsi->block_start, PAGE_SIZE) >> PAGE_SHIFT; + next_ppage = ALIGN_DOWN(bsi->block_start + bsi->block_len, + PAGE_SIZE) >> PAGE_SHIFT; + + if (first_ppage >= next_ppage) + return 0; + nr_pages = next_ppage - first_ppage; + + first_ppage_reported = first_ppage; + if (bsi->start == 0) + first_ppage_reported++; + if (bsi->lowest_ppage > first_ppage_reported) + bsi->lowest_ppage = first_ppage_reported; + if (bsi->highest_ppage < (next_ppage - 1)) + bsi->highest_ppage = next_ppage - 1; + + ret = add_swap_extent(sis, bsi->nr_pages, nr_pages, first_ppage); + if (ret < 0) + return ret; + bsi->nr_extents += ret; + bsi->nr_pages += nr_pages; + return 0; +} + +static void btrfs_swap_deactivate(struct file *file) +{ + struct inode *inode = file_inode(file); + + btrfs_free_swapfile_pins(inode); + atomic_dec(&BTRFS_I(inode)->root->nr_swapfiles); +} + +static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file, + sector_t *span) +{ + struct inode *inode = file_inode(file); + struct btrfs_fs_info 
*fs_info = BTRFS_I(inode)->root->fs_info; + struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; + struct extent_state *cached_state = NULL; + struct extent_map *em = NULL; + struct btrfs_device *device = NULL; + struct btrfs_swap_info bsi = { + .lowest_ppage = (sector_t)-1ULL, + }; + int ret = 0; + u64 isize = inode->i_size; + u64 start; + + /* + * If the swap file was just created, make sure delalloc is done. If the + * file changes again after this, the user is doing something stupid and + * we don't really care. + */ + ret = btrfs_wait_ordered_range(inode, 0, (u64)-1); + if (ret) + return ret; + + /* + * The inode is locked, so these flags won't change after we check them. + */ + if (BTRFS_I(inode)->flags & BTRFS_INODE_COMPRESS) { + btrfs_warn(fs_info, "swapfile must not be compressed"); + return -EINVAL; + } + if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW)) { + btrfs_warn(fs_info, "swapfile must not be copy-on-write"); + return -EINVAL; + } + if (!(BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM)) { + btrfs_warn(fs_info, "swapfile must not be checksummed"); + return -EINVAL; + } + + /* + * Balance or device remove/replace/resize can move stuff around from + * under us. The EXCL_OP flag makes sure they aren't running/won't run + * concurrently while we are mapping the swap extents, and + * fs_info->swapfile_pins prevents them from running while the swap file + * is active and moving the extents. Note that this also prevents a + * concurrent device add which isn't actually necessary, but it's not + * really worth the trouble to allow it. + */ + if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags)) { + btrfs_warn(fs_info, + "cannot activate swapfile while exclusive operation is running"); + return -EBUSY; + } + /* + * Snapshots can create extents which require COW even if NODATACOW is + * set. We use this counter to prevent snapshots. 
We must increment it + * before walking the extents because we don't want a concurrent + * snapshot to run after we've already checked the extents. + */ + atomic_inc(&BTRFS_I(inode)->root->nr_swapfiles); + + lock_extent_bits(io_tree, 0, isize - 1, &cached_state); + start = 0; + while (start < isize) { + u64 logical_block_start, physical_block_start; + struct btrfs_block_group_cache *bg; + u64 len = isize - start; + + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, len, 0); + if (IS_ERR(em)) { + ret = PTR_ERR(em); + goto out; + } + + if (em->block_start == EXTENT_MAP_HOLE) { + btrfs_warn(fs_info, "swapfile must not have holes"); + ret = -EINVAL; + goto out; + } + if (em->block_start == EXTENT_MAP_INLINE) { + /* + * It's unlikely we'll ever actually find ourselves + * here, as a file small enough to fit inline won't be + * big enough to store more than the swap header, but in + * case something changes in the future, let's catch it + * here rather than later. + */ + btrfs_warn(fs_info, "swapfile must not be inline"); + ret = -EINVAL; + goto out; + } + if (test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) { + btrfs_warn(fs_info, "swapfile must not be compressed"); + ret = -EINVAL; + goto out; + } + + logical_block_start = em->block_start + (start - em->start); + len = min(len, em->len - (start - em->start)); + free_extent_map(em); + em = NULL; + + ret = can_nocow_extent(inode, start, &len, NULL, NULL, NULL); + if (ret < 0) { + goto out; + } else if (ret) { + ret = 0; + } else { + btrfs_warn(fs_info, + "swapfile must not be copy-on-write"); + ret = -EINVAL; + goto out; + } + + em = btrfs_get_chunk_map(fs_info, logical_block_start, len); + if (IS_ERR(em)) { + ret = PTR_ERR(em); + goto out; + } + + if (em->map_lookup->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) { + btrfs_warn(fs_info, + "swapfile must have single data profile"); + ret = -EINVAL; + goto out; + } + + if (device == NULL) { + device = em->map_lookup->stripes[0].dev; + ret = btrfs_add_swapfile_pin(inode, device, 
false); + if (ret == 1) + ret = 0; + else if (ret) + goto out; + } else if (device != em->map_lookup->stripes[0].dev) { + btrfs_warn(fs_info, "swapfile must be on one device"); + ret = -EINVAL; + goto out; + } + + physical_block_start = (em->map_lookup->stripes[0].physical + + (logical_block_start - em->start)); + len = min(len, em->len - (logical_block_start - em->start)); + free_extent_map(em); + em = NULL; + + bg = btrfs_lookup_block_group(fs_info, logical_block_start); + if (!bg) { + btrfs_warn(fs_info, + "could not find block group containing swapfile"); + ret = -EINVAL; + goto out; + } + + ret = btrfs_add_swapfile_pin(inode, bg, true); + if (ret) { + btrfs_put_block_group(bg); + if (ret == 1) + ret = 0; + else + goto out; + } + + if (bsi.block_len && + bsi.block_start + bsi.block_len == physical_block_start) { + bsi.block_len += len; + } else { + if (bsi.block_len) { + ret = btrfs_add_swap_extent(sis, &bsi); + if (ret) + goto out; + } + bsi.start = start; + bsi.block_start = physical_block_start; + bsi.block_len = len; + } + + start += len; + } + + if (bsi.block_len) + ret = btrfs_add_swap_extent(sis, &bsi); + +out: + if (!IS_ERR_OR_NULL(em)) + free_extent_map(em); + + unlock_extent_cached(io_tree, 0, isize - 1, &cached_state); + + if (ret) + btrfs_swap_deactivate(file); + + clear_bit(BTRFS_FS_EXCL_OP, &fs_info->flags); + + if (ret) + return ret; + + if (device) + sis->bdev = device->bdev; + *span = bsi.highest_ppage - bsi.lowest_ppage + 1; + sis->max = bsi.nr_pages; + sis->pages = bsi.nr_pages - 1; + sis->highest_bit = bsi.nr_pages - 1; + return bsi.nr_extents; +} +#else +static void btrfs_swap_deactivate(struct file *file) +{ +} + +static int btrfs_swap_activate(struct swap_info_struct *sis, struct file *file, + sector_t *span) +{ + return -EOPNOTSUPP; +} +#endif + static const struct inode_operations btrfs_dir_inode_operations = { .getattr = btrfs_getattr, .lookup = btrfs_lookup, @@ -10565,6 +10901,8 @@ static const struct address_space_operations 
btrfs_aops = { .releasepage = btrfs_releasepage, .set_page_dirty = btrfs_set_page_dirty, .error_remove_page = generic_error_remove_page, + .swap_activate = btrfs_swap_activate, + .swap_deactivate = btrfs_swap_deactivate, }; static const struct address_space_operations btrfs_symlink_aops = {
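
The page arithmetic in btrfs_add_swap_extent() above can be sketched in userspace as follows. This is only an illustration under stated assumptions: it assumes 4 KiB pages, and struct page_range, whole_pages(), page_count() and the SKETCH_ macros are hypothetical names that do not exist in the kernel.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical result type: a half-open range of physical page numbers. */
struct page_range {
	uint64_t first_ppage; /* first whole page inside the extent */
	uint64_t next_ppage;  /* one past the last whole page */
};

/*
 * Trim the byte range [block_start, block_start + block_len) to whole pages:
 * round the start up and the end down, as the patch does with ALIGN() and
 * ALIGN_DOWN(). If no whole page fits, first_ppage >= next_ppage.
 */
struct page_range whole_pages(uint64_t block_start, uint64_t block_len)
{
	struct page_range r;
	uint64_t end = block_start + block_len;

	r.first_ppage = (block_start + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
	r.next_ppage = end >> SKETCH_PAGE_SHIFT;
	return r;
}

/* Number of whole pages in the range, or 0 if the range is empty. */
uint64_t page_count(struct page_range r)
{
	return r.next_ppage > r.first_ppage ? r.next_ppage - r.first_ppage : 0;
}
```

An extent that does not cover a full page yields an empty range, which corresponds to the early return 0 in the patch when first_ppage >= next_ppage.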