diff mbox series

[v5,08/24] fsverity: add per-sb workqueue for post read processing

Message ID 20240304191046.157464-10-aalbersh@redhat.com (mailing list archive)
State Superseded, archived
Headers show
Series fs-verity support for XFS | expand

Commit Message

Andrey Albershteyn March 4, 2024, 7:10 p.m. UTC
For XFS, fsverity's global workqueue is not really suitable due to:

1. High priority workqueues are used within XFS to ensure that data
   IO completion cannot stall processing of journal IO completions.
   Hence using a WQ_HIGHPRI workqueue directly in the user data IO
   path is a potential filesystem livelock/deadlock vector.

2. The fsverity workqueue is global - it creates a cross-filesystem
   contention point.

This patch adds per-filesystem, per-cpu workqueue for fsverity
work. This allows iomap to add verification work in the read path on
BIO completion.

Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
---
 fs/super.c               |  7 +++++++
 include/linux/fs.h       |  2 ++
 include/linux/fsverity.h | 22 ++++++++++++++++++++++
 3 files changed, 31 insertions(+)

Comments

Eric Biggers March 5, 2024, 1:08 a.m. UTC | #1
On Mon, Mar 04, 2024 at 08:10:31PM +0100, Andrey Albershteyn wrote:
> For XFS, fsverity's global workqueue is not really suitable due to:
> 
> 1. High priority workqueues are used within XFS to ensure that data
>    IO completion cannot stall processing of journal IO completions.
>    Hence using a WQ_HIGHPRI workqueue directly in the user data IO
>    path is a potential filesystem livelock/deadlock vector.
> 
> 2. The fsverity workqueue is global - it creates a cross-filesystem
>    contention point.
> 
> This patch adds per-filesystem, per-cpu workqueue for fsverity
> work. This allows iomap to add verification work in the read path on
> BIO completion.
> 
> Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>

Should ext4 and f2fs switch over to this by converting
fsverity_enqueue_verify_work() to use it?  I'd prefer not to have to maintain
two separate workqueue strategies as part of the fs/verity/ infrastructure.

> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 1fbc72c5f112..5863519ffd51 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1223,6 +1223,8 @@ struct super_block {
>  #endif
>  #ifdef CONFIG_FS_VERITY
>  	const struct fsverity_operations *s_vop;
> +	/* Completion queue for post read verification */
> +	struct workqueue_struct *s_read_done_wq;
>  #endif

Maybe s_verity_wq?  Or do you anticipate this being used for other purposes too,
such as fscrypt?  Note that it's behind CONFIG_FS_VERITY and is allocated by an
fsverity_* function, so at least at the moment it doesn't feel very generic.

> diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> index 0973b521ac5a..45b7c613148a 100644
> --- a/include/linux/fsverity.h
> +++ b/include/linux/fsverity.h
> @@ -241,6 +241,22 @@ void fsverity_enqueue_verify_work(struct work_struct *work);
>  void fsverity_invalidate_block(struct inode *inode,
>  		struct fsverity_blockbuf *block);
>  
> +static inline int fsverity_set_ops(struct super_block *sb,
> +				   const struct fsverity_operations *ops)

This doesn't just set the ops, but also allocates a workqueue too.  A better
name for this function might be fsverity_init_sb.

Also this shouldn't really be an inline function.

> +{
> +	sb->s_vop = ops;
> +
> +	/* Create per-sb workqueue for post read bio verification */
> +	struct workqueue_struct *wq = alloc_workqueue(
> +		"pread/%s", (WQ_FREEZABLE | WQ_MEM_RECLAIM), 0, sb->s_id);

"pread" is short for "post read", I guess?  Should it really be this generic?

> +static inline int fsverity_set_ops(struct super_block *sb,
> +				   const struct fsverity_operations *ops)
> +{
> +	return -EOPNOTSUPP;
> +}

I think it would be better to just not have a !CONFIG_FS_VERITY stub for this.

You *could* make it work like fscrypt_set_ops(), which the ubifs folks added,
where it can be called unconditionally if the filesystem has a declaration for
the operations (but not necessarily a definition).  In that case it would need
to return 0, rather than an error.  But I think I prefer just omitting the stub
and having filesystems guard the call to this by CONFIG_FS_VERITY, as you've
already done in XFS.

- Eric
Darrick J. Wong March 7, 2024, 9:58 p.m. UTC | #2
On Mon, Mar 04, 2024 at 05:08:05PM -0800, Eric Biggers wrote:
> On Mon, Mar 04, 2024 at 08:10:31PM +0100, Andrey Albershteyn wrote:
> > For XFS, fsverity's global workqueue is not really suitable due to:
> > 
> > 1. High priority workqueues are used within XFS to ensure that data
> >    IO completion cannot stall processing of journal IO completions.
> >    Hence using a WQ_HIGHPRI workqueue directly in the user data IO
> >    path is a potential filesystem livelock/deadlock vector.
> > 
> > 2. The fsverity workqueue is global - it creates a cross-filesystem
> >    contention point.
> > 
> > This patch adds per-filesystem, per-cpu workqueue for fsverity
> > work. This allows iomap to add verification work in the read path on
> > BIO completion.
> > 
> > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> 
> Should ext4 and f2fs switch over to this by converting
> fsverity_enqueue_verify_work() to use it?  I'd prefer not to have to maintain
> two separate workqueue strategies as part of the fs/verity/ infrastructure.

(Agreed.)

> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 1fbc72c5f112..5863519ffd51 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -1223,6 +1223,8 @@ struct super_block {
> >  #endif
> >  #ifdef CONFIG_FS_VERITY
> >  	const struct fsverity_operations *s_vop;
> > +	/* Completion queue for post read verification */
> > +	struct workqueue_struct *s_read_done_wq;
> >  #endif
> 
> Maybe s_verity_wq?  Or do you anticipate this being used for other purposes too,
> such as fscrypt?  Note that it's behind CONFIG_FS_VERITY and is allocated by an
> fsverity_* function, so at least at the moment it doesn't feel very generic.

Doesn't fscrypt already create its own workqueues?

> > diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
> > index 0973b521ac5a..45b7c613148a 100644
> > --- a/include/linux/fsverity.h
> > +++ b/include/linux/fsverity.h
> > @@ -241,6 +241,22 @@ void fsverity_enqueue_verify_work(struct work_struct *work);
> >  void fsverity_invalidate_block(struct inode *inode,
> >  		struct fsverity_blockbuf *block);
> >  
> > +static inline int fsverity_set_ops(struct super_block *sb,
> > +				   const struct fsverity_operations *ops)
> 
> This doesn't just set the ops, but also allocates a workqueue too.  A better
> name for this function might be fsverity_init_sb.
> 
> Also this shouldn't really be an inline function.

Yeah.

> > +{
> > +	sb->s_vop = ops;
> > +
> > +	/* Create per-sb workqueue for post read bio verification */
> > +	struct workqueue_struct *wq = alloc_workqueue(
> > +		"pread/%s", (WQ_FREEZABLE | WQ_MEM_RECLAIM), 0, sb->s_id);
> 
> "pread" is short for "post read", I guess?  Should it really be this generic?

I think it shouldn't use a term that already means "positioned read" to
userspace.

> > +static inline int fsverity_set_ops(struct super_block *sb,
> > +				   const struct fsverity_operations *ops)
> > +{
> > +	return -EOPNOTSUPP;
> > +}
> 
> I think it would be better to just not have a !CONFIG_FS_VERITY stub for this.
> 
> You *could* make it work like fscrypt_set_ops(), which the ubifs folks added,
> where it can be called unconditionally if the filesystem has a declaration for
> the operations (but not necessarily a definition).  In that case it would need
> to return 0, rather than an error.  But I think I prefer just omitting the stub
> and having filesystems guard the call to this by CONFIG_FS_VERITY, as you've
> already done in XFS.

Aha, I was going to ask why XFS had its own #ifdef guards when this
already exists. :)

--D

> - Eric
>
Eric Biggers March 7, 2024, 10:26 p.m. UTC | #3
On Thu, Mar 07, 2024 at 01:58:57PM -0800, Darrick J. Wong wrote:
> > Maybe s_verity_wq?  Or do you anticipate this being used for other purposes too,
> > such as fscrypt?  Note that it's behind CONFIG_FS_VERITY and is allocated by an
> > fsverity_* function, so at least at the moment it doesn't feel very generic.
> 
> Doesn't fscrypt already create its own workqueues?

There's a global workqueue in fs/crypto/ too.  IIRC, it has to be separate from
the fs/verity/ workqueue to avoid deadlocks when a file has both encryption and
verity enabled.  This is because the verity step can involve reading and
decrypting Merkle tree blocks.

The main thing I was wondering was whether there was some plan to use this new
per-sb workqueue as a generic post-read processing queue, as seemed to be
implied by the chosen naming.  If there's no clear plan, it should instead just
be named after what it's actually used for, fsverity.

- Eric
Dave Chinner March 7, 2024, 10:55 p.m. UTC | #4
On Thu, Mar 07, 2024 at 01:58:57PM -0800, Darrick J. Wong wrote:
> On Mon, Mar 04, 2024 at 05:08:05PM -0800, Eric Biggers wrote:
> > On Mon, Mar 04, 2024 at 08:10:31PM +0100, Andrey Albershteyn wrote:
> > > For XFS, fsverity's global workqueue is not really suitable due to:
> > > 
> > > 1. High priority workqueues are used within XFS to ensure that data
> > >    IO completion cannot stall processing of journal IO completions.
> > >    Hence using a WQ_HIGHPRI workqueue directly in the user data IO
> > >    path is a potential filesystem livelock/deadlock vector.
> > > 
> > > 2. The fsverity workqueue is global - it creates a cross-filesystem
> > >    contention point.
> > > 
> > > This patch adds per-filesystem, per-cpu workqueue for fsverity
> > > work. This allows iomap to add verification work in the read path on
> > > BIO completion.
> > > 
> > > Signed-off-by: Andrey Albershteyn <aalbersh@redhat.com>
> > 
> > Should ext4 and f2fs switch over to this by converting
> > fsverity_enqueue_verify_work() to use it?  I'd prefer not to have to maintain
> > two separate workqueue strategies as part of the fs/verity/ infrastructure.
> 
> (Agreed.)
> 
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index 1fbc72c5f112..5863519ffd51 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -1223,6 +1223,8 @@ struct super_block {
> > >  #endif
> > >  #ifdef CONFIG_FS_VERITY
> > >  	const struct fsverity_operations *s_vop;
> > > +	/* Completion queue for post read verification */
> > > +	struct workqueue_struct *s_read_done_wq;
> > >  #endif
> > 
> > Maybe s_verity_wq?  Or do you anticipate this being used for other purposes too,
> > such as fscrypt?  Note that it's behind CONFIG_FS_VERITY and is allocated by an
> > fsverity_* function, so at least at the moment it doesn't feel very generic.
> 
> Doesn't fscrypt already create its own workqueues?

Yes, but iomap really needs it to be a generic sb workqueue like the
sb->s_dio_done_wq.

i.e. the completion processing in task context is not isolated to
fsverity - it will likely also be needed for fscrypt completion and
decompression on read completion, when they get supported by the
generic iomap IO infrastruture. i.e. Any read IO that needs a task
context for completion side data processing will need this.

The reality is that the major filesytsems are already using either
generic or internal per-fs workqueues for DIO completion and
buffered write completions. Some already use buried workqueues on
read IO completion, too, so moving this to being supported by the
generic IO infrastructure rather than reimplmenting it just a little
bit differently in every filesystem makes a lot of sense.

-Dave.
Darrick J. Wong March 8, 2024, 3:53 a.m. UTC | #5
On Thu, Mar 07, 2024 at 02:26:22PM -0800, Eric Biggers wrote:
> On Thu, Mar 07, 2024 at 01:58:57PM -0800, Darrick J. Wong wrote:
> > > Maybe s_verity_wq?  Or do you anticipate this being used for other purposes too,
> > > such as fscrypt?  Note that it's behind CONFIG_FS_VERITY and is allocated by an
> > > fsverity_* function, so at least at the moment it doesn't feel very generic.
> > 
> > Doesn't fscrypt already create its own workqueues?
> 
> There's a global workqueue in fs/crypto/ too.  IIRC, it has to be separate from
> the fs/verity/ workqueue to avoid deadlocks when a file has both encryption and
> verity enabled.  This is because the verity step can involve reading and
> decrypting Merkle tree blocks.
> 
> The main thing I was wondering was whether there was some plan to use this new
> per-sb workqueue as a generic post-read processing queue, as seemed to be
> implied by the chosen naming.  If there's no clear plan, it should instead just
> be named after what it's actually used for, fsverity.

My guess is that there's a lot less complexity if each subsystem (crypt,
compression, verity, etc) gets its own workqueues instead of sharing so
that there aren't starvation issues.  It's too bad that's sort of
wasteful, when what we really want is to submit a chain of dependent
work_structs and have the workqueue(s) run that chain on the same cpu if
possible.

--D

> - Eric
>
diff mbox series

Patch

diff --git a/fs/super.c b/fs/super.c
index d6efeba0d0ce..03795ee4d9b9 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -637,6 +637,13 @@  void generic_shutdown_super(struct super_block *sb)
 			sb->s_dio_done_wq = NULL;
 		}
 
+#ifdef CONFIG_FS_VERITY
+		if (sb->s_read_done_wq) {
+			destroy_workqueue(sb->s_read_done_wq);
+			sb->s_read_done_wq = NULL;
+		}
+#endif
+
 		if (sop->put_super)
 			sop->put_super(sb);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1fbc72c5f112..5863519ffd51 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1223,6 +1223,8 @@  struct super_block {
 #endif
 #ifdef CONFIG_FS_VERITY
 	const struct fsverity_operations *s_vop;
+	/* Completion queue for post read verification */
+	struct workqueue_struct *s_read_done_wq;
 #endif
 #if IS_ENABLED(CONFIG_UNICODE)
 	struct unicode_map *s_encoding;
diff --git a/include/linux/fsverity.h b/include/linux/fsverity.h
index 0973b521ac5a..45b7c613148a 100644
--- a/include/linux/fsverity.h
+++ b/include/linux/fsverity.h
@@ -241,6 +241,22 @@  void fsverity_enqueue_verify_work(struct work_struct *work);
 void fsverity_invalidate_block(struct inode *inode,
 		struct fsverity_blockbuf *block);
 
+static inline int fsverity_set_ops(struct super_block *sb,
+				   const struct fsverity_operations *ops)
+{
+	sb->s_vop = ops;
+
+	/* Create per-sb workqueue for post read bio verification */
+	struct workqueue_struct *wq = alloc_workqueue(
+		"pread/%s", (WQ_FREEZABLE | WQ_MEM_RECLAIM), 0, sb->s_id);
+	if (!wq)
+		return -ENOMEM;
+
+	sb->s_read_done_wq = wq;
+
+	return 0;
+}
+
 #else /* !CONFIG_FS_VERITY */
 
 static inline struct fsverity_info *fsverity_get_info(const struct inode *inode)
@@ -318,6 +334,12 @@  static inline void fsverity_enqueue_verify_work(struct work_struct *work)
 	WARN_ON_ONCE(1);
 }
 
+static inline int fsverity_set_ops(struct super_block *sb,
+				   const struct fsverity_operations *ops)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif	/* !CONFIG_FS_VERITY */
 
 static inline bool fsverity_verify_folio(struct folio *folio)