[04/14] xfs: document the user interface for online fsck

Message ID	167243825217.682859.14201039734624895373.stgit@magnolia (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-fsdevel-owner@vger.kernel.org> Subject: [PATCH 04/14] xfs: document the user interface for online fsck From: "Darrick J. Wong" <djwong@kernel.org> To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org, willy@infradead.org, chandan.babu@oracle.com, allison.henderson@oracle.com, linux-fsdevel@vger.kernel.org, hch@infradead.org, catherine.hoang@oracle.com, david@fromorbit.com Date: Fri, 30 Dec 2022 14:10:52 -0800 Message-ID: <167243825217.682859.14201039734624895373.stgit@magnolia> In-Reply-To: <167243825144.682859.12802259329489258661.stgit@magnolia> References: <167243825144.682859.12802259329489258661.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Precedence: bulk
Series	xfs: design documentation for online fsck \| expand [PATCHSET,v24.0,00/14] xfs: design documentation for online fsck [01/14] xfs: document the motivation for online fsck design [02/14] xfs: document the general theory underlying online fsck design [03/14] xfs: document the testing plan for online fsck [04/14] xfs: document the user interface for online fsck [05/14] xfs: document the filesystem metadata checking strategy [06/14] xfs: document how online fsck deals with eventual consistency [07/14] xfs: document pageable kernel memory [08/14] xfs: document btree bulk loading [09/14] xfs: document online file metadata repair code [10/14] xfs: document full filesystem scans for online fsck [11/14] xfs: document metadata file repair [12/14] xfs: document directory tree repairs [13/14] xfs: document the userspace fsck driver program [14/14] xfs: document future directions of online fsck

Message ID

167243825217.682859.14201039734624895373.stgit@magnolia (mailing list archive)

State

New, archived

Headers

Subject: [PATCH 04/14] xfs: document the user interface for online fsck
From: "Darrick J. Wong" <djwong@kernel.org>
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, willy@infradead.org,
        chandan.babu@oracle.com, allison.henderson@oracle.com,
        linux-fsdevel@vger.kernel.org, hch@infradead.org,
        catherine.hoang@oracle.com, david@fromorbit.com
Date: Fri, 30 Dec 2022 14:10:52 -0800
Message-ID: <167243825217.682859.14201039734624895373.stgit@magnolia>
In-Reply-To: <167243825144.682859.12802259329489258661.stgit@magnolia>
References: <167243825144.682859.12802259329489258661.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Precedence: bulk

Series

xfs: design documentation for online fsck | expand

Commit Message

Darrick J. Wong Dec. 30, 2022, 10:10 p.m. UTC

From: Darrick J. Wong <djwong@kernel.org>

Start the fourth chapter of the online fsck design documentation, which
discusses the user interface and the background scrubbing service.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 .../filesystems/xfs-online-fsck-design.rst         |  114 ++++++++++++++++++++
 1 file changed, 114 insertions(+)

Comments

Allison Henderson Jan. 18, 2023, 12:03 a.m. UTC | #1

On Fri, 2022-12-30 at 14:10 -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Start the fourth chapter of the online fsck design documentation,
> which
> discusses the user interface and the background scrubbing service.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  .../filesystems/xfs-online-fsck-design.rst         |  114
> ++++++++++++++++++++
>  1 file changed, 114 insertions(+)
> 
> 
> diff --git a/Documentation/filesystems/xfs-online-fsck-design.rst
> b/Documentation/filesystems/xfs-online-fsck-design.rst
> index d630b6bdbe4a..42e82971e036 100644
> --- a/Documentation/filesystems/xfs-online-fsck-design.rst
> +++ b/Documentation/filesystems/xfs-online-fsck-design.rst
> @@ -750,3 +750,117 @@ Proposed patchsets include `general stress
> testing
>  <
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.g
> it/log/?h=race-scrub-and-mount-state-changes>`_
>  and the `evolution of existing per-function stress testing
>  <
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.g
> it/log/?h=refactor-scrub-stress>`_.
> +
> +4. User Interface
> +=================
> +
> +The primary user of online fsck is the system administrator, just
> like offline
> +repair.
> +Online fsck presents two modes of operation to administrators:
> +A foreground CLI process for online fsck on demand, and a background
> service
> +that performs autonomous checking and repair.
> +
> +Checking on Demand
> +------------------
> +
> +For administrators who want the absolute freshest information about
> the
> +metadata in a filesystem, ``xfs_scrub`` can be run as a foreground
> process on
> +a command line.
> +The program checks every piece of metadata in the filesystem while
> the
> +administrator waits for the results to be reported, just like the
> existing
> +``xfs_repair`` tool.
> +Both tools share a ``-n`` option to perform a read-only scan, and a
> ``-v``
> +option to increase the verbosity of the information reported.
> +
> +A new feature of ``xfs_scrub`` is the ``-x`` option, which employs
> the error
> +correction capabilities of the hardware to check data file contents.
> +The media scan is not enabled by default because it may dramatically
> increase
> +program runtime and consume a lot of bandwidth on older storage
> hardware.
> +
> +The output of a foreground invocation is captured in the system log.
> +
> +The ``xfs_scrub_all`` program walks the list of mounted filesystems
> and
> +initiates ``xfs_scrub`` for each of them in parallel.
> +It serializes scans for any filesystems that resolve to the same top
> level
> +kernel block device to prevent resource overconsumption.
> +
> +Background Service
> +------------------
> +
I'm assuming the below systemd services are configurable right?
> +To reduce the workload of system administrators, the ``xfs_scrub``
> package
> +provides a suite of `systemd <https://systemd.io/>`_ timers and
> services that
> +run online fsck automatically on weekends.
by default.

> +The background service configures scrub to run with as little
> privilege as
> +possible, the lowest CPU and IO priority, and in a CPU-constrained
> single
> +threaded mode.
"This can be tuned at anytime to best suit the needs of the customer
workload."

Then I think you can drop the below line...
> +It is hoped that this minimizes the amount of load generated on the
> system and
> +avoids starving regular workloads.
> +
> +The output of the background service is also captured in the system
> log.
> +If desired, reports of failures (either due to inconsistencies or
> mere runtime
> +errors) can be emailed automatically by setting the ``EMAIL_ADDR``
> environment
> +variable in the following service files:
> +
> +* ``xfs_scrub_fail@.service``
> +* ``xfs_scrub_media_fail@.service``
> +* ``xfs_scrub_all_fail.service``
> +
> +The decision to enable the background scan is left to the system
> administrator.
> +This can be done by enabling either of the following services:
> +
> +* ``xfs_scrub_all.timer`` on systemd systems
> +* ``xfs_scrub_all.cron`` on non-systemd systems
> +
> +This automatic weekly scan is configured out of the box to perform
> an
> +additional media scan of all file data once per month.
> +This is less foolproof than, say, storing file data block checksums,
> but much
> +more performant if application software provides its own integrity
> checking,
> +redundancy can be provided elsewhere above the filesystem, or the
> storage
> +device's integrity guarantees are deemed sufficient.
> +
> +The systemd unit file definitions have been subjected to a security
> audit
> +(as of systemd 249) to ensure that the xfs_scrub processes have as
> little
> +access to the rest of the system as possible.
> +This was performed via ``systemd-analyze security``, after which
> privileges
> +were restricted to the minimum required, sandboxing was set up to
> the maximal
> +extent possible with sandboxing and system call filtering; and
> access to the
> +filesystem tree was restricted to the minimum needed to start the
> program and
> +access the filesystem being scanned.
> +The service definition files restrict CPU usage to 80% of one CPU
> core, and
> +apply as nice of a priority to IO and CPU scheduling as possible.
> +This measure was taken to minimize delays in the rest of the
> filesystem.
> +No such hardening has been performed for the cron job.
> +
> +Proposed patchset:
> +`Enabling the xfs_scrub background service
> +<
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.g
> it/log/?h=scrub-media-scan-service>`_.
> +
> +Health Reporting
> +----------------
> +
> +XFS caches a summary of each filesystem's health status in memory.
> +The information is updated whenever ``xfs_scrub`` is run, or
> whenever
> +inconsistencies are detected in the filesystem metadata during
> regular
> +operations.
> +System administrators should use the ``health`` command of
> ``xfs_spaceman`` to
> +download this information into a human-readable format.
> +If problems have been observed, the administrator can schedule a
> reduced
> +service window to run the online repair tool to correct the problem.
> +Failing that, the administrator can decide to schedule a maintenance
> window to
> +run the traditional offline repair tool to correct the problem.
> +
> +**Question**: Should the health reporting integrate with the new
> inotify fs
> +error notification system?
> +
> +**Question**: Would it be helpful for sysadmins to have a daemon to
> listen for
> +corruption notifications and initiate a repair?
> +
> +*Answer*: These questions remain unanswered, but should be a part of
> the
> +conversation with early adopters and potential downstream users of
> XFS.
I think if there's been no commentary at this point then likely they
can't be answered at this time.  Perhaps for now it is reasonable to
just let the be a potential improvement in the future if the demand for
it arises. In any case, I think we should probably clean out the Q&A
discussion prompts.

Rest looks good tho
Allison

> +
> +Proposed patchsets include
> +`wiring up health reports to correction returns
> +<
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> log/?h=corruption-health-reports>`_
> +and
> +`preservation of sickness info during memory reclaim
> +<
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> log/?h=indirect-health-reporting>`_.
>

Darrick J. Wong Jan. 18, 2023, 2:42 a.m. UTC | #2

On Wed, Jan 18, 2023 at 12:03:29AM +0000, Allison Henderson wrote:
> On Fri, 2022-12-30 at 14:10 -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Start the fourth chapter of the online fsck design documentation,
> > which
> > discusses the user interface and the background scrubbing service.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  .../filesystems/xfs-online-fsck-design.rst         |  114
> > ++++++++++++++++++++
> >  1 file changed, 114 insertions(+)
> > 
> > 
> > diff --git a/Documentation/filesystems/xfs-online-fsck-design.rst
> > b/Documentation/filesystems/xfs-online-fsck-design.rst
> > index d630b6bdbe4a..42e82971e036 100644
> > --- a/Documentation/filesystems/xfs-online-fsck-design.rst
> > +++ b/Documentation/filesystems/xfs-online-fsck-design.rst
> > @@ -750,3 +750,117 @@ Proposed patchsets include `general stress
> > testing
> >  <
> > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.g
> > it/log/?h=race-scrub-and-mount-state-changes>`_
> >  and the `evolution of existing per-function stress testing
> >  <
> > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.g
> > it/log/?h=refactor-scrub-stress>`_.
> > +
> > +4. User Interface
> > +=================
> > +
> > +The primary user of online fsck is the system administrator, just
> > like offline
> > +repair.
> > +Online fsck presents two modes of operation to administrators:
> > +A foreground CLI process for online fsck on demand, and a background
> > service
> > +that performs autonomous checking and repair.
> > +
> > +Checking on Demand
> > +------------------
> > +
> > +For administrators who want the absolute freshest information about
> > the
> > +metadata in a filesystem, ``xfs_scrub`` can be run as a foreground
> > process on
> > +a command line.
> > +The program checks every piece of metadata in the filesystem while
> > the
> > +administrator waits for the results to be reported, just like the
> > existing
> > +``xfs_repair`` tool.
> > +Both tools share a ``-n`` option to perform a read-only scan, and a
> > ``-v``
> > +option to increase the verbosity of the information reported.
> > +
> > +A new feature of ``xfs_scrub`` is the ``-x`` option, which employs
> > the error
> > +correction capabilities of the hardware to check data file contents.
> > +The media scan is not enabled by default because it may dramatically
> > increase
> > +program runtime and consume a lot of bandwidth on older storage
> > hardware.
> > +
> > +The output of a foreground invocation is captured in the system log.
> > +
> > +The ``xfs_scrub_all`` program walks the list of mounted filesystems
> > and
> > +initiates ``xfs_scrub`` for each of them in parallel.
> > +It serializes scans for any filesystems that resolve to the same top
> > level
> > +kernel block device to prevent resource overconsumption.
> > +
> > +Background Service
> > +------------------
> > +
> I'm assuming the below systemd services are configurable right?

Yes, through the standard systemd overriddes.

> > +To reduce the workload of system administrators, the ``xfs_scrub``
> > package
> > +provides a suite of `systemd <https://systemd.io/>`_ timers and
> > services that
> > +run online fsck automatically on weekends.
> by default.

Fixed.

> > +The background service configures scrub to run with as little
> > privilege as
> > +possible, the lowest CPU and IO priority, and in a CPU-constrained
> > single
> > +threaded mode.
> "This can be tuned at anytime to best suit the needs of the customer
> workload."

Fixed.

> Then I think you can drop the below line...
> > +It is hoped that this minimizes the amount of load generated on the
> > system and
> > +avoids starving regular workloads.

Done.

> > +The output of the background service is also captured in the system
> > log.
> > +If desired, reports of failures (either due to inconsistencies or
> > mere runtime
> > +errors) can be emailed automatically by setting the ``EMAIL_ADDR``
> > environment
> > +variable in the following service files:
> > +
> > +* ``xfs_scrub_fail@.service``
> > +* ``xfs_scrub_media_fail@.service``
> > +* ``xfs_scrub_all_fail.service``
> > +
> > +The decision to enable the background scan is left to the system
> > administrator.
> > +This can be done by enabling either of the following services:
> > +
> > +* ``xfs_scrub_all.timer`` on systemd systems
> > +* ``xfs_scrub_all.cron`` on non-systemd systems
> > +
> > +This automatic weekly scan is configured out of the box to perform
> > an
> > +additional media scan of all file data once per month.
> > +This is less foolproof than, say, storing file data block checksums,
> > but much
> > +more performant if application software provides its own integrity
> > checking,
> > +redundancy can be provided elsewhere above the filesystem, or the
> > storage
> > +device's integrity guarantees are deemed sufficient.
> > +
> > +The systemd unit file definitions have been subjected to a security
> > audit
> > +(as of systemd 249) to ensure that the xfs_scrub processes have as
> > little
> > +access to the rest of the system as possible.
> > +This was performed via ``systemd-analyze security``, after which
> > privileges
> > +were restricted to the minimum required, sandboxing was set up to
> > the maximal
> > +extent possible with sandboxing and system call filtering; and
> > access to the
> > +filesystem tree was restricted to the minimum needed to start the
> > program and
> > +access the filesystem being scanned.
> > +The service definition files restrict CPU usage to 80% of one CPU
> > core, and
> > +apply as nice of a priority to IO and CPU scheduling as possible.
> > +This measure was taken to minimize delays in the rest of the
> > filesystem.
> > +No such hardening has been performed for the cron job.
> > +
> > +Proposed patchset:
> > +`Enabling the xfs_scrub background service
> > +<
> > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.g
> > it/log/?h=scrub-media-scan-service>`_.
> > +
> > +Health Reporting
> > +----------------
> > +
> > +XFS caches a summary of each filesystem's health status in memory.
> > +The information is updated whenever ``xfs_scrub`` is run, or
> > whenever
> > +inconsistencies are detected in the filesystem metadata during
> > regular
> > +operations.
> > +System administrators should use the ``health`` command of
> > ``xfs_spaceman`` to
> > +download this information into a human-readable format.
> > +If problems have been observed, the administrator can schedule a
> > reduced
> > +service window to run the online repair tool to correct the problem.
> > +Failing that, the administrator can decide to schedule a maintenance
> > window to
> > +run the traditional offline repair tool to correct the problem.
> > +
> > +**Question**: Should the health reporting integrate with the new
> > inotify fs
> > +error notification system?
> > +
> > +**Question**: Would it be helpful for sysadmins to have a daemon to
> > listen for
> > +corruption notifications and initiate a repair?
> > +
> > +*Answer*: These questions remain unanswered, but should be a part of
> > the
> > +conversation with early adopters and potential downstream users of
> > XFS.
> I think if there's been no commentary at this point then likely they
> can't be answered at this time.  Perhaps for now it is reasonable to
> just let the be a potential improvement in the future if the demand for
> it arises. In any case, I think we should probably clean out the Q&A
> discussion prompts.

I'll change them to "future work Q's" so I don't forget to pursue them
after part 1 is merged.

> Rest looks good tho

:-D  Thanks!

--D

> Allison
> 
> > +
> > +Proposed patchsets include
> > +`wiring up health reports to correction returns
> > +<
> > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > log/?h=corruption-health-reports>`_
> > +and
> > +`preservation of sickness info during memory reclaim
> > +<
> > https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/
> > log/?h=indirect-health-reporting>`_.
> > 
>

diff --git a/Documentation/filesystems/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs-online-fsck-design.rst
index d630b6bdbe4a..42e82971e036 100644
--- a/Documentation/filesystems/xfs-online-fsck-design.rst
+++ b/Documentation/filesystems/xfs-online-fsck-design.rst
@@ -750,3 +750,117 @@  Proposed patchsets include `general stress testing
 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=race-scrub-and-mount-state-changes>`_
 and the `evolution of existing per-function stress testing
 <https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/log/?h=refactor-scrub-stress>`_.
+
+4. User Interface
+=================
+
+The primary user of online fsck is the system administrator, just like offline
+repair.
+Online fsck presents two modes of operation to administrators:
+A foreground CLI process for online fsck on demand, and a background service
+that performs autonomous checking and repair.
+
+Checking on Demand
+------------------
+
+For administrators who want the absolute freshest information about the
+metadata in a filesystem, ``xfs_scrub`` can be run as a foreground process on
+a command line.
+The program checks every piece of metadata in the filesystem while the
+administrator waits for the results to be reported, just like the existing
+``xfs_repair`` tool.
+Both tools share a ``-n`` option to perform a read-only scan, and a ``-v``
+option to increase the verbosity of the information reported.
+
+A new feature of ``xfs_scrub`` is the ``-x`` option, which employs the error
+correction capabilities of the hardware to check data file contents.
+The media scan is not enabled by default because it may dramatically increase
+program runtime and consume a lot of bandwidth on older storage hardware.
+
+The output of a foreground invocation is captured in the system log.
+
+The ``xfs_scrub_all`` program walks the list of mounted filesystems and
+initiates ``xfs_scrub`` for each of them in parallel.
+It serializes scans for any filesystems that resolve to the same top level
+kernel block device to prevent resource overconsumption.
+
+Background Service
+------------------
+
+To reduce the workload of system administrators, the ``xfs_scrub`` package
+provides a suite of `systemd <https://systemd.io/>`_ timers and services that
+run online fsck automatically on weekends.
+The background service configures scrub to run with as little privilege as
+possible, the lowest CPU and IO priority, and in a CPU-constrained single
+threaded mode.
+It is hoped that this minimizes the amount of load generated on the system and
+avoids starving regular workloads.
+
+The output of the background service is also captured in the system log.
+If desired, reports of failures (either due to inconsistencies or mere runtime
+errors) can be emailed automatically by setting the ``EMAIL_ADDR`` environment
+variable in the following service files:
+
+* ``xfs_scrub_fail@.service``
+* ``xfs_scrub_media_fail@.service``
+* ``xfs_scrub_all_fail.service``
+
+The decision to enable the background scan is left to the system administrator.
+This can be done by enabling either of the following services:
+
+* ``xfs_scrub_all.timer`` on systemd systems
+* ``xfs_scrub_all.cron`` on non-systemd systems
+
+This automatic weekly scan is configured out of the box to perform an
+additional media scan of all file data once per month.
+This is less foolproof than, say, storing file data block checksums, but much
+more performant if application software provides its own integrity checking,
+redundancy can be provided elsewhere above the filesystem, or the storage
+device's integrity guarantees are deemed sufficient.
+
+The systemd unit file definitions have been subjected to a security audit
+(as of systemd 249) to ensure that the xfs_scrub processes have as little
+access to the rest of the system as possible.
+This was performed via ``systemd-analyze security``, after which privileges
+were restricted to the minimum required, sandboxing was set up to the maximal
+extent possible with sandboxing and system call filtering; and access to the
+filesystem tree was restricted to the minimum needed to start the program and
+access the filesystem being scanned.
+The service definition files restrict CPU usage to 80% of one CPU core, and
+apply as nice of a priority to IO and CPU scheduling as possible.
+This measure was taken to minimize delays in the rest of the filesystem.
+No such hardening has been performed for the cron job.
+
+Proposed patchset:
+`Enabling the xfs_scrub background service
+<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfsprogs-dev.git/log/?h=scrub-media-scan-service>`_.
+
+Health Reporting
+----------------
+
+XFS caches a summary of each filesystem's health status in memory.
+The information is updated whenever ``xfs_scrub`` is run, or whenever
+inconsistencies are detected in the filesystem metadata during regular
+operations.
+System administrators should use the ``health`` command of ``xfs_spaceman`` to
+download this information into a human-readable format.
+If problems have been observed, the administrator can schedule a reduced
+service window to run the online repair tool to correct the problem.
+Failing that, the administrator can decide to schedule a maintenance window to
+run the traditional offline repair tool to correct the problem.
+
+**Question**: Should the health reporting integrate with the new inotify fs
+error notification system?
+
+**Question**: Would it be helpful for sysadmins to have a daemon to listen for
+corruption notifications and initiate a repair?
+
+*Answer*: These questions remain unanswered, but should be a part of the
+conversation with early adopters and potential downstream users of XFS.
+
+Proposed patchsets include
+`wiring up health reports to correction returns
+<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=corruption-health-reports>`_
+and
+`preservation of sickness info during memory reclaim
+<https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=indirect-health-reporting>`_.

[04/14] xfs: document the user interface for online fsck

Commit Message

Comments

Patch