Message ID | 20210806051140.301127-3-damien.lemoal@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | IO priority fixes and improvements | expand |
On 8/6/21 7:11 AM, Damien Le Moal wrote: > An iocb aio_reqprio field is 16-bits (u16) but often handled as an int > in the block layer. E.g. ioprio_check_cap() takes an int as argument. > With such implicit int casting function calls, the upper 16-bits of the > int argument may be left uninitialized by the compiler, resulting in > invalid values for the IOPRIO_PRIO_CLASS() macro (garbage upper bits) > and in an error return for functions such as ioprio_check_cap(). > > Fix this by masking the result of the shift by IOPRIO_CLASS_SHIFT bits > in the IOPRIO_PRIO_CLASS() macro. The new macro IOPRIO_CLASS_MASK > defines the 3-bits mask for the priority class. > > While at it, cleanup the following: > * Apply the mask IOPRIO_PRIO_MASK to the data argument of the > IOPRIO_PRIO_VALUE() macro to ignore upper bits of the data value. > * Remove unnecessary parenthesis around fixed values in the macro > definitions in include/uapi/linux/ioprio.h. > * Update the outdated mention of CFQ in the comment describing priority > classes and instead mention BFQ and mq-deadline. > * Change the argument name of the IOPRIO_PRIO_CLASS() and > IOPRIO_PRIO_DATA() macros from "mask" to "ioprio" to reflect the fact > that an IO priority value should be passed rather than a mask. > * Change the ioprio_valid() macro into an inline function, adding a > check on the maximum value of the class of a priority value as > defined by the IOPRIO_CLASS_MAX enum value. Move this function to > the kernel side in include/linux/ioprio.h. > * Remove the unnecessary "else" after the return statements in > task_nice_ioclass(). > > Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> > --- > include/linux/ioprio.h | 15 ++++++++++++--- > include/uapi/linux/ioprio.h | 19 +++++++++++-------- > 2 files changed, 23 insertions(+), 11 deletions(-) > > diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h > index ef9ad4fb245f..9b3a6d8172b4 100644 > --- a/include/linux/ioprio.h > +++ b/include/linux/ioprio.h > @@ -8,6 +8,16 @@ > > #include <uapi/linux/ioprio.h> > > +/* > + * Check that a priority value has a valid class. > + */ > +static inline bool ioprio_valid(unsigned short ioprio) Wouldn't it be better to use 'u16' here as type, as we're relying on the number of bits? > +{ > + unsigned short class = IOPRIO_PRIO_CLASS(ioprio); > + > + return class > IOPRIO_CLASS_NONE && class < IOPRIO_CLASS_MAX; > +} > + > /* > * if process has set io priority explicitly, use that. if not, convert > * the cpu scheduler nice value to an io priority > @@ -25,10 +35,9 @@ static inline int task_nice_ioclass(struct task_struct *task) > { > if (task->policy == SCHED_IDLE) > return IOPRIO_CLASS_IDLE; > - else if (task_is_realtime(task)) > + if (task_is_realtime(task)) > return IOPRIO_CLASS_RT; > - else > - return IOPRIO_CLASS_BE; > + return IOPRIO_CLASS_BE; > } > > /* > diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h > index 77b17e08b0da..abc40965aa96 100644 > --- a/include/uapi/linux/ioprio.h > +++ b/include/uapi/linux/ioprio.h > @@ -5,12 +5,15 @@ > /* > * Gives us 8 prio classes with 13-bits of data for each class > */ > -#define IOPRIO_CLASS_SHIFT (13) > +#define IOPRIO_CLASS_SHIFT 13 > +#define IOPRIO_CLASS_MASK 0x07 > #define IOPRIO_PRIO_MASK ((1UL << IOPRIO_CLASS_SHIFT) - 1) > > -#define IOPRIO_PRIO_CLASS(mask) ((mask) >> IOPRIO_CLASS_SHIFT) > -#define IOPRIO_PRIO_DATA(mask) ((mask) & IOPRIO_PRIO_MASK) > -#define IOPRIO_PRIO_VALUE(class, data) (((class) << IOPRIO_CLASS_SHIFT) | data) > +#define IOPRIO_PRIO_CLASS(ioprio) \ > + (((ioprio) >> IOPRIO_CLASS_SHIFT) & IOPRIO_CLASS_MASK) > +#define IOPRIO_PRIO_DATA(ioprio) ((ioprio) & IOPRIO_PRIO_MASK) > +#define IOPRIO_PRIO_VALUE(class, data) \ > + (((class) << IOPRIO_CLASS_SHIFT) | ((data) & IOPRIO_PRIO_MASK)) > > /* > * These are the io priority groups as implemented by CFQ. RT is the realtime > @@ -23,14 +26,14 @@ enum { > IOPRIO_CLASS_RT, > IOPRIO_CLASS_BE, > IOPRIO_CLASS_IDLE, > -}; > > -#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE) > + IOPRIO_CLASS_MAX, > +}; > > /* > * 8 best effort priority levels are supported > */ > -#define IOPRIO_BE_NR (8) > +#define IOPRIO_BE_NR 8 > > enum { > IOPRIO_WHO_PROCESS = 1, > @@ -41,6 +44,6 @@ enum { > /* > * Fallback BE prioritye@su > */ > -#define IOPRIO_NORM (4) > +#define IOPRIO_NORM 4 > > #endif /* _UAPI_LINUX_IOPRIO_H */ > Other than that: Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes
On 2021/08/06 15:35, Hannes Reinecke wrote: > On 8/6/21 7:11 AM, Damien Le Moal wrote: >> An iocb aio_reqprio field is 16-bits (u16) but often handled as an int >> in the block layer. E.g. ioprio_check_cap() takes an int as argument. >> With such implicit int casting function calls, the upper 16-bits of the >> int argument may be left uninitialized by the compiler, resulting in >> invalid values for the IOPRIO_PRIO_CLASS() macro (garbage upper bits) >> and in an error return for functions such as ioprio_check_cap(). >> >> Fix this by masking the result of the shift by IOPRIO_CLASS_SHIFT bits >> in the IOPRIO_PRIO_CLASS() macro. The new macro IOPRIO_CLASS_MASK >> defines the 3-bits mask for the priority class. >> >> While at it, cleanup the following: >> * Apply the mask IOPRIO_PRIO_MASK to the data argument of the >> IOPRIO_PRIO_VALUE() macro to ignore upper bits of the data value. >> * Remove unnecessary parenthesis around fixed values in the macro >> definitions in include/uapi/linux/ioprio.h. >> * Update the outdated mention of CFQ in the comment describing priority >> classes and instead mention BFQ and mq-deadline. >> * Change the argument name of the IOPRIO_PRIO_CLASS() and >> IOPRIO_PRIO_DATA() macros from "mask" to "ioprio" to reflect the fact >> that an IO priority value should be passed rather than a mask. >> * Change the ioprio_valid() macro into an inline function, adding a >> check on the maximum value of the class of a priority value as >> defined by the IOPRIO_CLASS_MAX enum value. Move this function to >> the kernel side in include/linux/ioprio.h. >> * Remove the unnecessary "else" after the return statements in >> task_nice_ioclass(). >> >> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> >> --- >> include/linux/ioprio.h | 15 ++++++++++++--- >> include/uapi/linux/ioprio.h | 19 +++++++++++-------- >> 2 files changed, 23 insertions(+), 11 deletions(-) >> >> diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h >> index ef9ad4fb245f..9b3a6d8172b4 100644 >> --- a/include/linux/ioprio.h >> +++ b/include/linux/ioprio.h >> @@ -8,6 +8,16 @@ >> >> #include <uapi/linux/ioprio.h> >> >> +/* >> + * Check that a priority value has a valid class. >> + */ >> +static inline bool ioprio_valid(unsigned short ioprio) > > Wouldn't it be better to use 'u16' here as type, as we're relying on the > number of bits? Other functions in block/ioprio.c and in include/linux/ioprio.h use "unsigned short", so I followed. But many functions, if not most, use "int". This is all a bit of a mess. I think we need a "typedef ioprio_t u16;" to clean things up. But there are a lot of places to fix. I can add such patch... Worth it ? > >> +{ >> + unsigned short class = IOPRIO_PRIO_CLASS(ioprio); >> + >> + return class > IOPRIO_CLASS_NONE && class < IOPRIO_CLASS_MAX; >> +} >> + >> /* >> * if process has set io priority explicitly, use that. if not, convert >> * the cpu scheduler nice value to an io priority >> @@ -25,10 +35,9 @@ static inline int task_nice_ioclass(struct task_struct *task) >> { >> if (task->policy == SCHED_IDLE) >> return IOPRIO_CLASS_IDLE; >> - else if (task_is_realtime(task)) >> + if (task_is_realtime(task)) >> return IOPRIO_CLASS_RT; >> - else >> - return IOPRIO_CLASS_BE; >> + return IOPRIO_CLASS_BE; >> } >> >> /* >> diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h >> index 77b17e08b0da..abc40965aa96 100644 >> --- a/include/uapi/linux/ioprio.h >> +++ b/include/uapi/linux/ioprio.h >> @@ -5,12 +5,15 @@ >> /* >> * Gives us 8 prio classes with 13-bits of data for each class >> */ >> -#define IOPRIO_CLASS_SHIFT (13) >> +#define IOPRIO_CLASS_SHIFT 13 >> +#define IOPRIO_CLASS_MASK 0x07 >> #define IOPRIO_PRIO_MASK ((1UL << IOPRIO_CLASS_SHIFT) - 1) >> >> -#define IOPRIO_PRIO_CLASS(mask) ((mask) >> IOPRIO_CLASS_SHIFT) >> -#define IOPRIO_PRIO_DATA(mask) ((mask) & IOPRIO_PRIO_MASK) >> -#define IOPRIO_PRIO_VALUE(class, data) (((class) << IOPRIO_CLASS_SHIFT) | data) >> +#define IOPRIO_PRIO_CLASS(ioprio) \ >> + (((ioprio) >> IOPRIO_CLASS_SHIFT) & IOPRIO_CLASS_MASK) >> +#define IOPRIO_PRIO_DATA(ioprio) ((ioprio) & IOPRIO_PRIO_MASK) >> +#define IOPRIO_PRIO_VALUE(class, data) \ >> + (((class) << IOPRIO_CLASS_SHIFT) | ((data) & IOPRIO_PRIO_MASK)) >> >> /* >> * These are the io priority groups as implemented by CFQ. RT is the realtime >> @@ -23,14 +26,14 @@ enum { >> IOPRIO_CLASS_RT, >> IOPRIO_CLASS_BE, >> IOPRIO_CLASS_IDLE, >> -}; >> >> -#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE) >> + IOPRIO_CLASS_MAX, >> +}; >> >> /* >> * 8 best effort priority levels are supported >> */ >> -#define IOPRIO_BE_NR (8) >> +#define IOPRIO_BE_NR 8 >> >> enum { >> IOPRIO_WHO_PROCESS = 1, >> @@ -41,6 +44,6 @@ enum { >> /* >> * Fallback BE prioritye@su >> */ >> -#define IOPRIO_NORM (4) >> +#define IOPRIO_NORM 4 >> >> #endif /* _UAPI_LINUX_IOPRIO_H */ >> > Other than that: > > Reviewed-by: Hannes Reinecke <hare@suse.de> > > Cheers, > > Hannes >
On 8/6/21 8:57 AM, Damien Le Moal wrote: > On 2021/08/06 15:35, Hannes Reinecke wrote: >> On 8/6/21 7:11 AM, Damien Le Moal wrote: >>> An iocb aio_reqprio field is 16-bits (u16) but often handled as an int >>> in the block layer. E.g. ioprio_check_cap() takes an int as argument. >>> With such implicit int casting function calls, the upper 16-bits of the >>> int argument may be left uninitialized by the compiler, resulting in >>> invalid values for the IOPRIO_PRIO_CLASS() macro (garbage upper bits) >>> and in an error return for functions such as ioprio_check_cap(). >>> >>> Fix this by masking the result of the shift by IOPRIO_CLASS_SHIFT bits >>> in the IOPRIO_PRIO_CLASS() macro. The new macro IOPRIO_CLASS_MASK >>> defines the 3-bits mask for the priority class. >>> >>> While at it, cleanup the following: >>> * Apply the mask IOPRIO_PRIO_MASK to the data argument of the >>> IOPRIO_PRIO_VALUE() macro to ignore upper bits of the data value. >>> * Remove unnecessary parenthesis around fixed values in the macro >>> definitions in include/uapi/linux/ioprio.h. >>> * Update the outdated mention of CFQ in the comment describing priority >>> classes and instead mention BFQ and mq-deadline. >>> * Change the argument name of the IOPRIO_PRIO_CLASS() and >>> IOPRIO_PRIO_DATA() macros from "mask" to "ioprio" to reflect the fact >>> that an IO priority value should be passed rather than a mask. >>> * Change the ioprio_valid() macro into an inline function, adding a >>> check on the maximum value of the class of a priority value as >>> defined by the IOPRIO_CLASS_MAX enum value. Move this function to >>> the kernel side in include/linux/ioprio.h. >>> * Remove the unnecessary "else" after the return statements in >>> task_nice_ioclass(). >>> >>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> >>> --- >>> include/linux/ioprio.h | 15 ++++++++++++--- >>> include/uapi/linux/ioprio.h | 19 +++++++++++-------- >>> 2 files changed, 23 insertions(+), 11 deletions(-) >>> >>> diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h >>> index ef9ad4fb245f..9b3a6d8172b4 100644 >>> --- a/include/linux/ioprio.h >>> +++ b/include/linux/ioprio.h >>> @@ -8,6 +8,16 @@ >>> >>> #include <uapi/linux/ioprio.h> >>> >>> +/* >>> + * Check that a priority value has a valid class. >>> + */ >>> +static inline bool ioprio_valid(unsigned short ioprio) >> >> Wouldn't it be better to use 'u16' here as type, as we're relying on the >> number of bits? > > Other functions in block/ioprio.c and in include/linux/ioprio.h use "unsigned > short", so I followed. But many functions, if not most, use "int". This is all a > bit of a mess. I think we need a "typedef ioprio_t u16;" to clean things up. But > there are a lot of places to fix. I can add such patch... Worth it ? > Possibly not. Consider my comment retracted :-) Cheers, Hannes
diff --git a/include/linux/ioprio.h b/include/linux/ioprio.h index ef9ad4fb245f..9b3a6d8172b4 100644 --- a/include/linux/ioprio.h +++ b/include/linux/ioprio.h @@ -8,6 +8,16 @@ #include <uapi/linux/ioprio.h> +/* + * Check that a priority value has a valid class. + */ +static inline bool ioprio_valid(unsigned short ioprio) +{ + unsigned short class = IOPRIO_PRIO_CLASS(ioprio); + + return class > IOPRIO_CLASS_NONE && class < IOPRIO_CLASS_MAX; +} + /* * if process has set io priority explicitly, use that. if not, convert * the cpu scheduler nice value to an io priority @@ -25,10 +35,9 @@ static inline int task_nice_ioclass(struct task_struct *task) { if (task->policy == SCHED_IDLE) return IOPRIO_CLASS_IDLE; - else if (task_is_realtime(task)) + if (task_is_realtime(task)) return IOPRIO_CLASS_RT; - else - return IOPRIO_CLASS_BE; + return IOPRIO_CLASS_BE; } /* diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h index 77b17e08b0da..abc40965aa96 100644 --- a/include/uapi/linux/ioprio.h +++ b/include/uapi/linux/ioprio.h @@ -5,12 +5,15 @@ /* * Gives us 8 prio classes with 13-bits of data for each class */ -#define IOPRIO_CLASS_SHIFT (13) +#define IOPRIO_CLASS_SHIFT 13 +#define IOPRIO_CLASS_MASK 0x07 #define IOPRIO_PRIO_MASK ((1UL << IOPRIO_CLASS_SHIFT) - 1) -#define IOPRIO_PRIO_CLASS(mask) ((mask) >> IOPRIO_CLASS_SHIFT) -#define IOPRIO_PRIO_DATA(mask) ((mask) & IOPRIO_PRIO_MASK) -#define IOPRIO_PRIO_VALUE(class, data) (((class) << IOPRIO_CLASS_SHIFT) | data) +#define IOPRIO_PRIO_CLASS(ioprio) \ + (((ioprio) >> IOPRIO_CLASS_SHIFT) & IOPRIO_CLASS_MASK) +#define IOPRIO_PRIO_DATA(ioprio) ((ioprio) & IOPRIO_PRIO_MASK) +#define IOPRIO_PRIO_VALUE(class, data) \ + (((class) << IOPRIO_CLASS_SHIFT) | ((data) & IOPRIO_PRIO_MASK)) /* * These are the io priority groups as implemented by CFQ. RT is the realtime @@ -23,14 +26,14 @@ enum { IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE, -}; -#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE) + IOPRIO_CLASS_MAX, +}; /* * 8 best effort priority levels are supported */ -#define IOPRIO_BE_NR (8) +#define IOPRIO_BE_NR 8 enum { IOPRIO_WHO_PROCESS = 1, @@ -41,6 +44,6 @@ enum { /* * Fallback BE priority */ -#define IOPRIO_NORM (4) +#define IOPRIO_NORM 4 #endif /* _UAPI_LINUX_IOPRIO_H */
An iocb aio_reqprio field is 16-bits (u16) but often handled as an int in the block layer. E.g. ioprio_check_cap() takes an int as argument. With such implicit int casting function calls, the upper 16-bits of the int argument may be left uninitialized by the compiler, resulting in invalid values for the IOPRIO_PRIO_CLASS() macro (garbage upper bits) and in an error return for functions such as ioprio_check_cap(). Fix this by masking the result of the shift by IOPRIO_CLASS_SHIFT bits in the IOPRIO_PRIO_CLASS() macro. The new macro IOPRIO_CLASS_MASK defines the 3-bits mask for the priority class. While at it, cleanup the following: * Apply the mask IOPRIO_PRIO_MASK to the data argument of the IOPRIO_PRIO_VALUE() macro to ignore upper bits of the data value. * Remove unnecessary parenthesis around fixed values in the macro definitions in include/uapi/linux/ioprio.h. * Update the outdated mention of CFQ in the comment describing priority classes and instead mention BFQ and mq-deadline. * Change the argument name of the IOPRIO_PRIO_CLASS() and IOPRIO_PRIO_DATA() macros from "mask" to "ioprio" to reflect the fact that an IO priority value should be passed rather than a mask. * Change the ioprio_valid() macro into an inline function, adding a check on the maximum value of the class of a priority value as defined by the IOPRIO_CLASS_MAX enum value. Move this function to the kernel side in include/linux/ioprio.h. * Remove the unnecessary "else" after the return statements in task_nice_ioclass(). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> --- include/linux/ioprio.h | 15 ++++++++++++--- include/uapi/linux/ioprio.h | 19 +++++++++++-------- 2 files changed, 23 insertions(+), 11 deletions(-)