[1/7] mkfs: Save raw user input field to the opts struct

Message ID	20170720092932.32580-2-jtulak@redhat.com (mailing list archive)
State	Superseded, archived
Headers
Return-Path: <linux-xfs-owner@kernel.org>
DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 94A337CE0D
From: Jan Tulak <jtulak@redhat.com>
To: linux-xfs@vger.kernel.org
Cc: Jan Tulak <jtulak@redhat.com>
Subject: [PATCH 1/7] mkfs: Save raw user input field to the opts struct
Date: Thu, 20 Jul 2017 11:29:26 +0200
Message-Id: <20170720092932.32580-2-jtulak@redhat.com>
In-Reply-To: <20170720092932.32580-1-jtulak@redhat.com>
References: <20170720092932.32580-1-jtulak@redhat.com>
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk

Jan Tulak July 20, 2017, 9:29 a.m. UTC

Save exactly what the user gave us for every option.  This way, we never
lose the information if we need to print it back when reporting an issue.
(This patch only adds the infrastructure; the next patches make use of it.)

Signed-off-by: Jan Tulak <jtulak@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 mkfs/xfs_mkfs.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Luis Chamberlain July 27, 2017, 4:27 p.m. UTC | #1

On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index a69190b9..4b030101 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -107,6 +107,11 @@ unsigned int		sectorsize;
>   *     sets what is used with simple specifying the subopt (-d file).
>   *     A special SUBOPT_NEEDS_VAL can be used to require a user-given
>   *     value in any case.
> + *
> + *   raw_input INTERNAL
> + *     Filled raw string from the user, so we never lose that information e.g.
> + *     to print it back in case of an issue.
> + *
>   */
>  struct opt_params {
>  	const char	name;
> @@ -122,6 +127,7 @@ struct opt_params {
>  		long long	minval;
>  		long long	maxval;
>  		long long	defaultval;
> +		const char	*raw_input;
>  	}		subopt_params[MAX_SUBOPTS];
>  };
>  
> @@ -729,6 +735,18 @@ struct opt_params mopts = {
>   */
>  #define WHACK_SIZE (128 * 1024)
>  
> +static inline void
> +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
> +{
> +	opt->subopt_params[subopt].raw_input = value;
> +}

There is no bounds check on the array here. I think set_conf_raw()
should return int so that callers can check the return value. It could
return -EINVAL if the subopt is invalid, for instance.
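
A minimal sketch of the bounds-checked setter suggested here. The
MAX_SUBOPTS value and the trimmed-down struct are assumptions made so the
example stands alone; they are not the definitions from xfs_mkfs.c.

```c
#include <errno.h>

/* Assumed value for illustration; the real constant lives in xfs_mkfs.c. */
#define MAX_SUBOPTS 16

/* Trimmed-down stand-in for the struct touched by the patch. */
struct opt_params {
	char name;
	struct {
		const char *raw_input;
	} subopt_params[MAX_SUBOPTS];
};

/* Return 0 on success, -EINVAL if subopt is outside the array. */
static int
set_conf_raw(struct opt_params *opt, int subopt, const char *value)
{
	if (subopt < 0 || subopt >= MAX_SUBOPTS)
		return -EINVAL;
	opt->subopt_params[subopt].raw_input = value;
	return 0;
}
```

A caller would then test the result, e.g. "if (set_conf_raw(&dopts, idx, arg))
goto fail;", where dopts, idx and arg are placeholder names.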

> +
> +static inline const char *
> +get_conf_raw(const struct opt_params *opt, int subopt)
> +{
> +	return opt->subopt_params[subopt].raw_input;
> +}
> +
>  /*
>   * Convert lsu to lsunit for 512 bytes blocks and check validity of the values.

These strings are stored by pointer, not copied.

The use of set_conf_raw() and get_conf_raw() therefore has strict
constraints, and they can only be used within certain contexts:

  o Since they store and return pointers, what these functions hand back
    is only valid for the lifetime of the pointers passed in.
  o Since they are *currently* only used in main() this is fine, but it
    limits their use. In the future, if we want to defer access to these
    pointers outside of main(), or if main() uses a library which
    parses some string and frees it, we'd have to make yet another
    change.

Even if it's *OK* today, if some helpers are added later which, for instance,
call set_conf_raw() and then free the passed pointer right away, we are
screwed: get_conf_raw() would hand back a dangling pointer, potentially
yielding random values.  An alternative to limiting the use of these routines
would be to have set_conf_raw() use strdup() and return an int, so that it can
report -ENOMEM.
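
A rough sketch of the strdup() alternative, under a few assumptions: the
struct is a simplified stand-in, MAX_SUBOPTS has a made-up value, and
raw_input is changed to a non-const "char *" because the struct now owns the
copy. None of this is taken from the actual patch series.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() under strict -std= modes */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Assumed value for illustration only. */
#define MAX_SUBOPTS 16

struct opt_params {
	char name;
	struct {
		char *raw_input;	/* owned copy, released with free() */
	} subopt_params[MAX_SUBOPTS];
};

/*
 * Store a private copy of value. Return 0 on success, -EINVAL for a bad
 * index or a NULL value, -ENOMEM if the copy cannot be allocated.
 */
static int
set_conf_raw(struct opt_params *opt, int subopt, const char *value)
{
	char *copy;

	if (subopt < 0 || subopt >= MAX_SUBOPTS || value == NULL)
		return -EINVAL;
	copy = strdup(value);
	if (copy == NULL)
		return -ENOMEM;
	free(opt->subopt_params[subopt].raw_input);	/* drop any earlier copy */
	opt->subopt_params[subopt].raw_input = copy;
	return 0;
}
```

With the copy in place, the caller may free or overwrite its own buffer right
after the call without affecting what get_conf_raw() later returns.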

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jan Tulak July 28, 2017, 2:45 p.m. UTC | #2

On 27/07/2017 18:27, Luis R. Rodriguez wrote:
> On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
>> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
>> index a69190b9..4b030101 100644
>> --- a/mkfs/xfs_mkfs.c
>> +++ b/mkfs/xfs_mkfs.c
>> @@ -107,6 +107,11 @@ unsigned int		sectorsize;
>>    *     sets what is used with simple specifying the subopt (-d file).
>>    *     A special SUBOPT_NEEDS_VAL can be used to require a user-given
>>    *     value in any case.
>> + *
>> + *   raw_input INTERNAL
>> + *     Filled raw string from the user, so we never lose that information e.g.
>> + *     to print it back in case of an issue.
>> + *
>>    */
>>   struct opt_params {
>>   	const char	name;
>> @@ -122,6 +127,7 @@ struct opt_params {
>>   		long long	minval;
>>   		long long	maxval;
>>   		long long	defaultval;
>> +		const char	*raw_input;
>>   	}		subopt_params[MAX_SUBOPTS];
>>   };
>>   
>> @@ -729,6 +735,18 @@ struct opt_params mopts = {
>>    */
>>   #define WHACK_SIZE (128 * 1024)
>>   
>> +static inline void
>> +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
>> +{
>> +	opt->subopt_params[subopt].raw_input = value;
>> +}
> There are no bounds check on the array here, I think set_conf_raw()
> should return int and we would check the return value. It could
> return -EINVAL if the subopt is invalid for instance.
Good idea. The only problem is the return code: it gets in the way
when we are also returning values - I wanted the values to be
turned into uint64. But do we need to return an error? I don't see a
use case for it, other than detecting a bug. So an assert
might be a better solution - then it can't happen that a wrong index is
used and the result goes untested.
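
For comparison, a sketch of the assert-based variant, which keeps the
original void/pointer signatures. The struct and the MAX_SUBOPTS value are
simplified assumptions, not the real definitions.

```c
#include <assert.h>

/* Assumed value for illustration only. */
#define MAX_SUBOPTS 16

struct opt_params {
	char name;
	struct {
		const char *raw_input;
	} subopt_params[MAX_SUBOPTS];
};

static inline void
set_conf_raw(struct opt_params *opt, int subopt, const char *value)
{
	/* A bad index is a programming error: abort instead of returning. */
	assert(subopt >= 0 && subopt < MAX_SUBOPTS);
	opt->subopt_params[subopt].raw_input = value;
}

static inline const char *
get_conf_raw(const struct opt_params *opt, int subopt)
{
	assert(subopt >= 0 && subopt < MAX_SUBOPTS);
	return opt->subopt_params[subopt].raw_input;
}
```

One caveat with this choice: assert() compiles to nothing when NDEBUG is
defined, so the check disappears from such builds.
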
>> +
>> +static inline const char *
>> +get_conf_raw(const struct opt_params *opt, int subopt)
>> +{
>> +	return opt->subopt_params[subopt].raw_input;
>> +}
>> +
>>   /*
>>    * Convert lsu to lsunit for 512 bytes blocks and check validity of the values.
> These are not pass by value.
>
> The usage of set_conf_raw() and get_conf_raw() therefore have strict
> constraints and can be only used within certain contexts:
>
>    o Since they are pointers the lifetime usage of these functions
>      are limited to the lifetime of the pointers
>    o Since they are *currently* used on main() this is fine but this would
>      limit its use. In the future if we want to defer access to these
>      pointers outside of main() or if main() uses a library which would
>      parse some string and free it we'd have to make another change
>      yet again.
>
> Even if its *OK* today, if some helpers are used later which for instance call
> set_conf_raw() and then free the passed pointer right away we are screwed,
> leading to potentially using random values.  An alternative to limiting the use
> of these routines would be to instead have set_conf_raw() to use strdup() and
> have it return an int in case of -ENOMEM.
>
>    Luis

Sounds reasonable.

Jan

Luis Chamberlain July 29, 2017, 5:12 p.m. UTC | #3

On Fri, Jul 28, 2017 at 04:45:58PM +0200, Jan Tulak wrote:
> 
> 
> On 27/07/2017 18:27, Luis R. Rodriguez wrote:
> > On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
> > > diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> > > index a69190b9..4b030101 100644
> > > --- a/mkfs/xfs_mkfs.c
> > > +++ b/mkfs/xfs_mkfs.c
> > > @@ -107,6 +107,11 @@ unsigned int		sectorsize;
> > >    *     sets what is used with simple specifying the subopt (-d file).
> > >    *     A special SUBOPT_NEEDS_VAL can be used to require a user-given
> > >    *     value in any case.
> > > + *
> > > + *   raw_input INTERNAL
> > > + *     Filled raw string from the user, so we never lose that information e.g.
> > > + *     to print it back in case of an issue.
> > > + *
> > >    */
> > >   struct opt_params {
> > >   	const char	name;
> > > @@ -122,6 +127,7 @@ struct opt_params {
> > >   		long long	minval;
> > >   		long long	maxval;
> > >   		long long	defaultval;
> > > +		const char	*raw_input;
> > >   	}		subopt_params[MAX_SUBOPTS];
> > >   };
> > > @@ -729,6 +735,18 @@ struct opt_params mopts = {
> > >    */
> > >   #define WHACK_SIZE (128 * 1024)
> > > +static inline void
> > > +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
> > > +{
> > > +	opt->subopt_params[subopt].raw_input = value;
> > > +}
> > There are no bounds check on the array here, I think set_conf_raw()
> > should return int and we would check the return value. It could
> > return -EINVAL if the subopt is invalid for instance.
> Good idea. The only issue is with the return code, that causes some issues
> when we are also returning values - I wanted the values to be turned into
> uint64. But do we need to return an error? I don't see what usecase there
> would be for it, other than detecting a bug. So an assert might be a better
> solution - then it can't happen that a wrong index is used and result not
> tested.

The value can be passed back through an extra pointer argument: if the
pointer is set, the value is assigned through it; otherwise it is left alone.
The function would return 0 on success, otherwise a standard error code
indicating the cause of the error.

I don't think we need separate too-small and too-big errors; a simple range
error should suffice, and we have -ERANGE for that.
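
A sketch of how that convention could look for the value accessors. The
names get_conf_val/set_conf_val come from this thread, but the signatures,
the "value" field and the MAX_SUBOPTS value are assumptions made for the
example.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed value for illustration only. */
#define MAX_SUBOPTS 16

struct opt_params {
	char name;
	struct {
		long long minval;
		long long maxval;
		long long value;	/* hypothetical field for the parsed value */
	} subopt_params[MAX_SUBOPTS];
};

/* Return 0 on success, -EINVAL for a bad index, -ERANGE if val is out of bounds. */
static int
set_conf_val(struct opt_params *opt, int subopt, long long val)
{
	if (subopt < 0 || subopt >= MAX_SUBOPTS)
		return -EINVAL;
	if (val < opt->subopt_params[subopt].minval ||
	    val > opt->subopt_params[subopt].maxval)
		return -ERANGE;
	opt->subopt_params[subopt].value = val;
	return 0;
}

/* On success store the value in *out (if out is non-NULL) and return 0. */
static int
get_conf_val(const struct opt_params *opt, int subopt, long long *out)
{
	if (subopt < 0 || subopt >= MAX_SUBOPTS)
		return -EINVAL;
	if (out != NULL)
		*out = opt->subopt_params[subopt].value;
	return 0;
}
```

On failure *out is left untouched, so a caller can keep a default in the
variable before the call.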

  Luis

Jan Tulak Aug. 2, 2017, 2:30 p.m. UTC | #4

On 29/07/2017 19:12, Luis R. Rodriguez wrote:
> On Fri, Jul 28, 2017 at 04:45:58PM +0200, Jan Tulak wrote:
>>
>> On 27/07/2017 18:27, Luis R. Rodriguez wrote:
>>> On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
>>>> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
>>>> index a69190b9..4b030101 100644
>>>> --- a/mkfs/xfs_mkfs.c
>>>> +++ b/mkfs/xfs_mkfs.c
>>>> @@ -107,6 +107,11 @@ unsigned int		sectorsize;
>>>>     *     sets what is used with simple specifying the subopt (-d file).
>>>>     *     A special SUBOPT_NEEDS_VAL can be used to require a user-given
>>>>     *     value in any case.
>>>> + *
>>>> + *   raw_input INTERNAL
>>>> + *     Filled raw string from the user, so we never lose that information e.g.
>>>> + *     to print it back in case of an issue.
>>>> + *
>>>>     */
>>>>    struct opt_params {
>>>>    	const char	name;
>>>> @@ -122,6 +127,7 @@ struct opt_params {
>>>>    		long long	minval;
>>>>    		long long	maxval;
>>>>    		long long	defaultval;
>>>> +		const char	*raw_input;
>>>>    	}		subopt_params[MAX_SUBOPTS];
>>>>    };
>>>> @@ -729,6 +735,18 @@ struct opt_params mopts = {
>>>>     */
>>>>    #define WHACK_SIZE (128 * 1024)
>>>> +static inline void
>>>> +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
>>>> +{
>>>> +	opt->subopt_params[subopt].raw_input = value;
>>>> +}
>>> There are no bounds check on the array here, I think set_conf_raw()
>>> should return int and we would check the return value. It could
>>> return -EINVAL if the subopt is invalid for instance.
>> Good idea. The only issue is with the return code, that causes some issues
>> when we are also returning values - I wanted the values to be turned into
>> uint64. But do we need to return an error? I don't see what usecase there
>> would be for it, other than detecting a bug. So an assert might be a better
>> solution - then it can't happen that a wrong index is used and result not
>> tested.
> The setting of the value can be done by using an extra argument pointer. Then
> if its set it be assigned. Otherwise it would be left alone. The return value
> would return 0 on success, otherwise a standard return value indicating the
> cause of the error.
I strongly prefer to return the value, not an error code. We can do it the
other way around and put the error code into an argument to get roughly the
same result, while constructions like set_conf_raw(FOO, BAR, baz *
get_conf_raw(FOO, BAR)) will continue to work without the need for
intermediate variables.

The *_raw functions are used in only a few places, so it would be only a
small issue there, but for consistency (get|set)_conf_val should have
the same behavior, and an intermediate variable for every use of those
would be really annoying. So, how about this?

static inline void
set_conf_raw(struct opt_params *opt, int subopt, const char *value, int *err)
{
     if (subopt < 0 || subopt >= MAX_SUBOPTS) {
         if (err != NULL) *err = EINVAL;
         return;
     }
     opt->subopt_params[subopt].raw_input = value;
}

> I don't think we need the too small or too big, a simple range issue should
> suffice and we have -ERANGE.
>
At the moment we tell the user whether the value is too small or too big,
but since there is no standard error code for each case, ERANGE has to suffice.

Cheers,
Jan

Jan Tulak Aug. 2, 2017, 3:51 p.m. UTC | #5

An addendum to the previous email.

On 02/08/2017 16:30, Jan Tulak wrote:
> On 29/07/2017 19:12, Luis R. Rodriguez wrote:
>> On Fri, Jul 28, 2017 at 04:45:58PM +0200, Jan Tulak wrote:
>>>
>>> On 27/07/2017 18:27, Luis R. Rodriguez wrote:
>>>> On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
>>>>> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
>>>>> index a69190b9..4b030101 100644
>>>>> --- a/mkfs/xfs_mkfs.c
>>>>> +++ b/mkfs/xfs_mkfs.c
>>>>> @@ -107,6 +107,11 @@ unsigned int        sectorsize;
>>>>>     *     sets what is used with simple specifying the subopt (-d 
>>>>> file).
>>>>>     *     A special SUBOPT_NEEDS_VAL can be used to require a 
>>>>> user-given
>>>>>     *     value in any case.
>>>>> + *
>>>>> + *   raw_input INTERNAL
>>>>> + *     Filled raw string from the user, so we never lose that 
>>>>> information e.g.
>>>>> + *     to print it back in case of an issue.
>>>>> + *
>>>>>     */
>>>>>    struct opt_params {
>>>>>        const char    name;
>>>>> @@ -122,6 +127,7 @@ struct opt_params {
>>>>>            long long    minval;
>>>>>            long long    maxval;
>>>>>            long long    defaultval;
>>>>> +        const char    *raw_input;
>>>>>        }        subopt_params[MAX_SUBOPTS];
>>>>>    };
>>>>> @@ -729,6 +735,18 @@ struct opt_params mopts = {
>>>>>     */
>>>>>    #define WHACK_SIZE (128 * 1024)
>>>>> +static inline void
>>>>> +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
>>>>> +{
>>>>> +    opt->subopt_params[subopt].raw_input = value;
>>>>> +}
>>>> There are no bounds check on the array here, I think set_conf_raw()
>>>> should return int and we would check the return value. It could
>>>> return -EINVAL if the subopt is invalid for instance.
>>> Good idea. The only issue is with the return code, that causes some 
>>> issues
>>> when we are also returning values - I wanted the values to be turned 
>>> into
>>> uint64. But do we need to return an error? I don't see what usecase 
>>> there
>>> would be for it, other than detecting a bug. So an assert might be a 
>>> better
>>> solution - then it can't happen that a wrong index is used and 
>>> result not
>>> tested.
>> The setting of the value can be done by using an extra argument 
>> pointer. Then
>> if its set it be assigned. Otherwise it would be left alone. The 
>> return value
>> would return 0 on success, otherwise a standard return value 
>> indicating the
>> cause of the error.
> I strongly prefer to return the value, not an error code. We can do 
> the other way around, put the error code into an argument to get 
> roughly the same result, while constructions like set_conf_raw(FOO, 
> BAR, baz * get_conf_raw(FOO, BAR)) will continue to work without the 
> need for intermediate variables.
>
> The *_raw functions are used on few places only, so it would be only a 
> small issue there, but for consistency, (get|set)_conf_val should have 
> the same behavior and an intermediate variable for every use of those 
> would be really annoying. So, how about this?
>
> static inline void
> set_conf_raw(struct opt_params *opt, int subopt, const char *value, 
> int *err)
> {
>     if (subopt < 0 || subopt >= MAX_SUBOPTS) {
>         if (err != NULL) *err = EINVAL;
>         return;
>     }
>     opt->subopt_params[subopt].raw_input = value;
> }
>
I just realized that there is probably no reason for set_conf_raw to
expect an invalid subopt - that's clearly a bug and we should just print a
message and die, because who knows what happened... But for errors that
can arise from user input, the style presented above is still valid.

Jan
>> I don't think we need the too small or too big, a simple range issue 
>> should
>> suffice and we have -ERANGE.
>>
> At this moment, we are telling if it is too small or too big, but when 
> there is no standard error code for that, ERANGE has to suffice.
>
> Cheers,
> Jan


Luis Chamberlain Aug. 2, 2017, 7:19 p.m. UTC | #6

On Wed, Aug 02, 2017 at 04:30:09PM +0200, Jan Tulak wrote:
> On 29/07/2017 19:12, Luis R. Rodriguez wrote:
> > On Fri, Jul 28, 2017 at 04:45:58PM +0200, Jan Tulak wrote:
> > > 
> > > On 27/07/2017 18:27, Luis R. Rodriguez wrote:
> > > > On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
> > > > > diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> > > > > index a69190b9..4b030101 100644
> > > > > --- a/mkfs/xfs_mkfs.c
> > > > > +++ b/mkfs/xfs_mkfs.c
> > > > > @@ -107,6 +107,11 @@ unsigned int		sectorsize;
> > > > >     *     sets what is used with simple specifying the subopt (-d file).
> > > > >     *     A special SUBOPT_NEEDS_VAL can be used to require a user-given
> > > > >     *     value in any case.
> > > > > + *
> > > > > + *   raw_input INTERNAL
> > > > > + *     Filled raw string from the user, so we never lose that information e.g.
> > > > > + *     to print it back in case of an issue.
> > > > > + *
> > > > >     */
> > > > >    struct opt_params {
> > > > >    	const char	name;
> > > > > @@ -122,6 +127,7 @@ struct opt_params {
> > > > >    		long long	minval;
> > > > >    		long long	maxval;
> > > > >    		long long	defaultval;
> > > > > +		const char	*raw_input;
> > > > >    	}		subopt_params[MAX_SUBOPTS];
> > > > >    };
> > > > > @@ -729,6 +735,18 @@ struct opt_params mopts = {
> > > > >     */
> > > > >    #define WHACK_SIZE (128 * 1024)
> > > > > +static inline void
> > > > > +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
> > > > > +{
> > > > > +	opt->subopt_params[subopt].raw_input = value;
> > > > > +}
> > > > There are no bounds check on the array here, I think set_conf_raw()
> > > > should return int and we would check the return value. It could
> > > > return -EINVAL if the subopt is invalid for instance.
> > > Good idea. The only issue is with the return code, that causes some issues
> > > when we are also returning values - I wanted the values to be turned into
> > > uint64. But do we need to return an error? I don't see what usecase there
> > > would be for it, other than detecting a bug. So an assert might be a better
> > > solution - then it can't happen that a wrong index is used and result not
> > > tested.
> > The setting of the value can be done by using an extra argument pointer. Then
> > if its set it be assigned. Otherwise it would be left alone. The return value
> > would return 0 on success, otherwise a standard return value indicating the
> > cause of the error.
> I strongly prefer to return the value, not an error code. We can do the
> other way around, put the error code into an argument to get roughly the
> same result, while constructions like set_conf_raw(FOO, BAR, baz *
> get_conf_raw(FOO, BAR)) will continue to work without the need for
> intermediate variables.
> 
> The *_raw functions are used on few places only, so it would be only a small
> issue there, but for consistency, (get|set)_conf_val should have the same
> behavior and an intermediate variable for every use of those would be really
> annoying. So, how about this?

It would not be intermediate; the main error variable declared at the start
of each function could be used, as is typical in many properly written C
programs.

> static inline void
> set_conf_raw(struct opt_params *opt, int subopt, const char *value, int
> *err)
> {
>     if (subopt < 0 || subopt >= MAX_SUBOPTS) {
>         if (err != NULL) *err = EINVAL;
>         return;
>     }
>     opt->subopt_params[subopt].raw_input = value;
> }

If you go with the strdup() approach to avoid limiting the contexts in which
the pointer can be used, then you'll still have to return an error or abort,
and I think returning an error is best.

> > I don't think we need the too small or too big, a simple range issue should
> > suffice and we have -ERANGE.
> > 
> At this moment, we are telling if it is too small or too big, but when there
> is no standard error code for that, ERANGE has to suffice.

Sure. My point was that we have special values for too big or too small, and
I consider that hacky. We could just *say* whether it was too big or too
small, but use ERANGE, since it's standard and non-hacky.
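
A small sketch of that idea: the message states the direction, while the
return code is the single standard -ERANGE. The function name check_range
and the message wording are made up for this example.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Return 0 if val lies in [minval, maxval]. Otherwise print which bound
 * was violated, quoting the raw user input, and return -ERANGE.
 */
static int
check_range(const char *raw, long long val, long long minval, long long maxval)
{
	if (val < minval) {
		fprintf(stderr, "value %s is too small (minimum %lld)\n",
			raw, minval);
		return -ERANGE;
	}
	if (val > maxval) {
		fprintf(stderr, "value %s is too large (maximum %lld)\n",
			raw, maxval);
		return -ERANGE;
	}
	return 0;
}
```

The caller only has to distinguish success from -ERANGE; the human-readable
detail never needs its own error number.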

  Luis

Luis Chamberlain Aug. 2, 2017, 7:41 p.m. UTC | #7

On Wed, Aug 02, 2017 at 05:51:47PM +0200, Jan Tulak wrote:
> An addendum to the previous email.
> 
> On 02/08/2017 16:30, Jan Tulak wrote:
> > On 29/07/2017 19:12, Luis R. Rodriguez wrote:
> > > On Fri, Jul 28, 2017 at 04:45:58PM +0200, Jan Tulak wrote:
> > > > 
> > > > On 27/07/2017 18:27, Luis R. Rodriguez wrote:
> > > > > On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
> > > > > > diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> > > > > > index a69190b9..4b030101 100644
> > > > > > --- a/mkfs/xfs_mkfs.c
> > > > > > +++ b/mkfs/xfs_mkfs.c
> > > > > > @@ -107,6 +107,11 @@ unsigned int        sectorsize;
> > > > > >     *     sets what is used with simple specifying the
> > > > > > subopt (-d file).
> > > > > >     *     A special SUBOPT_NEEDS_VAL can be used to
> > > > > > require a user-given
> > > > > >     *     value in any case.
> > > > > > + *
> > > > > > + *   raw_input INTERNAL
> > > > > > + *     Filled raw string from the user, so we never
> > > > > > lose that information e.g.
> > > > > > + *     to print it back in case of an issue.
> > > > > > + *
> > > > > >     */
> > > > > >    struct opt_params {
> > > > > >        const char    name;
> > > > > > @@ -122,6 +127,7 @@ struct opt_params {
> > > > > >            long long    minval;
> > > > > >            long long    maxval;
> > > > > >            long long    defaultval;
> > > > > > +        const char    *raw_input;
> > > > > >        }        subopt_params[MAX_SUBOPTS];
> > > > > >    };
> > > > > > @@ -729,6 +735,18 @@ struct opt_params mopts = {
> > > > > >     */
> > > > > >    #define WHACK_SIZE (128 * 1024)
> > > > > > +static inline void
> > > > > > +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
> > > > > > +{
> > > > > > +    opt->subopt_params[subopt].raw_input = value;
> > > > > > +}
> > > > > There are no bounds check on the array here, I think set_conf_raw()
> > > > > should return int and we would check the return value. It could
> > > > > return -EINVAL if the subopt is invalid for instance.
> > > > Good idea. The only issue is with the return code, that causes
> > > > some issues
> > > > when we are also returning values - I wanted the values to be
> > > > turned into
> > > > uint64. But do we need to return an error? I don't see what
> > > > usecase there
> > > > would be for it, other than detecting a bug. So an assert might
> > > > be a better
> > > > solution - then it can't happen that a wrong index is used and
> > > > result not
> > > > tested.
> > > The setting of the value can be done by using an extra argument
> > > pointer. Then
> > > if its set it be assigned. Otherwise it would be left alone. The
> > > return value
> > > would return 0 on success, otherwise a standard return value
> > > indicating the
> > > cause of the error.
> > I strongly prefer to return the value, not an error code. We can do the
> > other way around, put the error code into an argument to get roughly the
> > same result, while constructions like set_conf_raw(FOO, BAR, baz *
> > get_conf_raw(FOO, BAR)) will continue to work without the need for
> > intermediate variables.
> > 
> > The *_raw functions are used on few places only, so it would be only a
> > small issue there, but for consistency, (get|set)_conf_val should have
> > the same behavior and an intermediate variable for every use of those
> > would be really annoying. So, how about this?
> > 
> > static inline void
> > set_conf_raw(struct opt_params *opt, int subopt, const char *value, int
> > *err)
> > {
> >     if (subopt < 0 || subopt >= MAX_SUBOPTS) {
> >         if (err != NULL) *err = EINVAL;
> >         return;
> >     }
> >     opt->subopt_params[subopt].raw_input = value;
> > }
> > 
> I just realized that there is probably no reason for set_conf_raw to expect
> invalid subopt - that's clearly a bug and we should just print a message and
> die, because who knows what happened... But for errors that can arose from
> user input, the style presented above is still valid.

True, however the issue of limiting the contexts in which the pointer can be
used is still present, and if you strdup() you have to check for ENOMEM. If
this is done in a helper then it's done only once, especially if a description
for the subopt is placed into the subopt structure.

  Luis

Jan Tulak Aug. 3, 2017, 1:07 p.m. UTC | #8

On 02/08/2017 21:19, Luis R. Rodriguez wrote:
> On Wed, Aug 02, 2017 at 04:30:09PM +0200, Jan Tulak wrote:
>> On 29/07/2017 19:12, Luis R. Rodriguez wrote:
>>> On Fri, Jul 28, 2017 at 04:45:58PM +0200, Jan Tulak wrote:
>>>> On 27/07/2017 18:27, Luis R. Rodriguez wrote:
>>>>> On Thu, Jul 20, 2017 at 11:29:26AM +0200, Jan Tulak wrote:
>>>>>> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
>>>>>> index a69190b9..4b030101 100644
>>>>>> --- a/mkfs/xfs_mkfs.c
>>>>>> +++ b/mkfs/xfs_mkfs.c
>>>>>> @@ -107,6 +107,11 @@ unsigned int		sectorsize;
>>>>>>      *     sets what is used with simple specifying the subopt (-d file).
>>>>>>      *     A special SUBOPT_NEEDS_VAL can be used to require a user-given
>>>>>>      *     value in any case.
>>>>>> + *
>>>>>> + *   raw_input INTERNAL
>>>>>> + *     Filled raw string from the user, so we never lose that information e.g.
>>>>>> + *     to print it back in case of an issue.
>>>>>> + *
>>>>>>      */
>>>>>>     struct opt_params {
>>>>>>     	const char	name;
>>>>>> @@ -122,6 +127,7 @@ struct opt_params {
>>>>>>     		long long	minval;
>>>>>>     		long long	maxval;
>>>>>>     		long long	defaultval;
>>>>>> +		const char	*raw_input;
>>>>>>     	}		subopt_params[MAX_SUBOPTS];
>>>>>>     };
>>>>>> @@ -729,6 +735,18 @@ struct opt_params mopts = {
>>>>>>      */
>>>>>>     #define WHACK_SIZE (128 * 1024)
>>>>>> +static inline void
>>>>>> +set_conf_raw(struct opt_params *opt, int subopt, const char *value)
>>>>>> +{
>>>>>> +	opt->subopt_params[subopt].raw_input = value;
>>>>>> +}
>>>>> There are no bounds check on the array here, I think set_conf_raw()
>>>>> should return int and we would check the return value. It could
>>>>> return -EINVAL if the subopt is invalid for instance.
>>>> Good idea. The only issue is with the return code, that causes some issues
>>>> when we are also returning values - I wanted the values to be turned into
>>>> uint64. But do we need to return an error? I don't see what usecase there
>>>> would be for it, other than detecting a bug. So an assert might be a better
>>>> solution - then it can't happen that a wrong index is used and result not
>>>> tested.
>>> The setting of the value can be done by using an extra argument pointer. Then
>>> if its set it be assigned. Otherwise it would be left alone. The return value
>>> would return 0 on success, otherwise a standard return value indicating the
>>> cause of the error.
>> I strongly prefer to return the value, not an error code. We can do the
>> other way around, put the error code into an argument to get roughly the
>> same result, while constructions like set_conf_raw(FOO, BAR, baz *
>> get_conf_raw(FOO, BAR)) will continue to work without the need for
>> intermediate variables.
>>
>> The *_raw functions are used on few places only, so it would be only a small
>> issue there, but for consistency, (get|set)_conf_val should have the same
>> behavior and an intermediate variable for every use of those would be really
>> annoying. So, how about this?
> It would not be intermediate, the main error variable from the start of
> each function could be used, as is typical in many properly written C
> programs.
I meant value-carrying variables, not the error one:

int temp; // a variable useful only on the next two lines
err = foo(&temp);
bar(temp);

versus:
bar(foo(&err));

Composing functions like this would not be usable all the time; it
depends on what the return value would be in case of an error and how
the outer function would deal with it. But from what I saw when I checked
the code, I think it could work in a lot of places.
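
To make the trade-off concrete, here is a sketch of the "value in the return,
error in an argument" style. The signatures, the "value" field, the helper
double_conf_val() and the MAX_SUBOPTS value are all assumptions for the sake
of the example.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed value for illustration only. */
#define MAX_SUBOPTS 16

struct opt_params {
	char name;
	struct {
		long long value;	/* hypothetical field for the parsed value */
	} subopt_params[MAX_SUBOPTS];
};

/* Return the stored value; on a bad index set *err to EINVAL and return 0. */
static long long
get_conf_val(const struct opt_params *opt, int subopt, int *err)
{
	if (subopt < 0 || subopt >= MAX_SUBOPTS) {
		if (err != NULL)
			*err = EINVAL;
		return 0;
	}
	return opt->subopt_params[subopt].value;
}

static void
set_conf_val(struct opt_params *opt, int subopt, long long val, int *err)
{
	if (subopt < 0 || subopt >= MAX_SUBOPTS) {
		if (err != NULL)
			*err = EINVAL;
		return;
	}
	opt->subopt_params[subopt].value = val;
}

/* Double a suboption in place; the calls nest without a value temporary. */
static int
double_conf_val(struct opt_params *opt, int subopt)
{
	int err = 0;

	set_conf_val(opt, subopt, 2 * get_conf_val(opt, subopt, &err), &err);
	return -err;
}
```

Note that when the inner call fails, the outer call still runs with the
fallback value 0. Here that is harmless because the setter rejects the same
index, but in general the outer function has to tolerate that value.
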
>> static inline void
>> set_conf_raw(struct opt_params *opt, int subopt, const char *value, int
>> *err)
>> {
>>      if (subopt < 0 || subopt >= MAX_SUBOPTS) {
>>          if (err != NULL) *err = EINVAL;
>>          return;
>>      }
>>      opt->subopt_params[subopt].raw_input = value;
>> }
> If you go with the strdup thing to avoid limiting the context of the use of
> the pointer then you'll still have to return an error or abort, and I think
> returning an error is best.
OK, I'm willing to return errors for the _raw functions. These are used
in only a few places, so it is not a big issue. Especially if I add a
wrapper for the get_conf_raw function - right now, these are used only
as fprintf() arguments to print an error. So the wrapper makes it easy
to use in this case (with the old die-on-error behavior), but if you
want to use it for something else, you can call get_conf_raw directly and
get an error as a return code. Does this look good?

+/*
+ * Return 0 on success, -ENOMEM if it could not allocate enough memory for
+ * the string to be saved into the out pointer.
+ */
+static int
+get_conf_raw(const struct opt_params *opt, const int subopt, char **out)
+{
+       if (subopt < 0 || subopt >= MAX_SUBOPTS) {
+               fprintf(stderr,
+               "This is a bug: get_conf_raw called with invalid opt/subopt: %c/%d\n",
+               opt->name, subopt);
+               exit(1);
+       }
+       *out = strdup(opt->subopt_params[subopt].raw_input);
+       if (*out == NULL)
+               return -ENOMEM;
+       return 0;
+
+}
+
+/*
+ * Same as get_conf_raw(), except it returns the string through return
+ * and dies on any error.
+ */
+static char *
+get_conf_raw_safe(const struct opt_params *opt, const int subopt)
+{
+       char *str;
+       if (get_conf_raw(opt, subopt, &str) == -ENOMEM) {
+               fprintf(stderr, "Out of memory!");
+               exit(1);
+       }
+       return str;
+}


>
>>> I don't think we need the too small or too big, a simple range issue should
>>> suffice and we have -ERANGE.
>>>
>> At this moment, we are telling if it is too small or too big, but when there
>> is no standard error code for that, ERANGE has to suffice.
> Sure, my point was that we have special values for too big or too small, and
> I consider that hacky, we could just *say* if it was too big or too small
> but just use ERANGE as its standard and non-hacky.
We don't have special values; we just print a message and die. But yes, if 
we pass the information on anywhere, it is better to use ERANGE 
rather than some custom error number.
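
To make the ERANGE point concrete, here is a minimal sketch of a range check that still *says* whether the value was too small or too large, but returns only the standard -ERANGE. The helper name check_range() and its parameters are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper, not taken from the patch: validate a value against
 * [minval, maxval]. The message tells the user which bound was violated,
 * while the return code is the standard -ERANGE in both cases.
 */
int
check_range(const char *name, long long value, long long minval,
	    long long maxval)
{
	/* An inverted range is a caller bug, not a user input problem. */
	assert(minval <= maxval);

	if (value < minval) {
		fprintf(stderr, "%s: value %lld is too small (minimum %lld)\n",
			name, value, minval);
		return -ERANGE;
	}
	if (value > maxval) {
		fprintf(stderr, "%s: value %lld is too large (maximum %lld)\n",
			name, value, maxval);
		return -ERANGE;
	}
	return 0;
}
```

A caller only has to compare the result against 0 or -ERANGE; the direction of the violation lives in the message rather than in a custom error number.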

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Luis Chamberlain Aug. 3, 2017, 10:25 p.m. UTC | #9

On Thu, Aug 03, 2017 at 03:07:20PM +0200, Jan Tulak wrote:
> OK, I'm willing to return errors for the _raw functions. These are used only
> on few places, so it is not a big issue. Especially if I add a wrapper for
> the get_conf_raw function - right now, these are used only as fprintf()
> arguments to print an error. So the wrapper makes it easy to use in this
> case (with the old die-on-error behavior), but if you want to use it for
> something else, you can use it directly and get an error as a return code.
> Does this looks good?
> 
> +/*
> + * Return 0 on success, -ENOMEM if it could not allocate enough memory for
> + * the string to be saved into the out pointer.
> + */
> +static int
> +get_conf_raw(const struct opt_params *opt, const int subopt, char **out)
> +{
> +       if (subopt < 0 || subopt >= MAX_SUBOPTS) {
> +               fprintf(stderr,
> +               "This is a bug: get_conf_raw called with invalid opt/subopt:
> %c/%d\n",
> +               opt->name, subopt);
> +               exit(1);

Why not return -EINVAL?

> +       }
> +       *out = strdup(opt->subopt_params[subopt].raw_input);
> +       if (*out == NULL)
> +               return -ENOMEM;
> +       return 0;
> +
> +}
> +
> +/*
> + * Same as get_conf_raw(), except it returns the string through return
> + * and dies on any error.
> + */
> +static char *
> +get_conf_raw_safe(const struct opt_params *opt, const int subopt)
> +{
> +       char *str;
> +       if (get_conf_raw(opt, subopt, &str) == -ENOMEM) {
> +               fprintf(stderr, "Out of memory!");
> +               exit(1);

I'd say no, just return NULL; these aborts drive me personally nuts.

> +       }
> +       return str;
> +}
> 
> 
> > 
> > > > I don't think we need the too small or too big, a simple range issue should
> > > > suffice and we have -ERANGE.
> > > > 
> > > At this moment, we are telling if it is too small or too big, but when there
> > > is no standard error code for that, ERANGE has to suffice.
> > Sure, my point was that we have special values for too big or too small, and
> > I consider that hacky, we could just *say* if it was too big or too small
> > but just use ERANGE as its standard and non-hacky.
> We don't have special values, we just print it out and die. But yes, if we
> will pass the information anywhere, then it is better to use ERANGE rather
> than some custom error number.

Great.

  Luis

Jan Tulak Aug. 4, 2017, 1:50 p.m. UTC | #10

On 04/08/2017 00:25, Luis R. Rodriguez wrote:
> On Thu, Aug 03, 2017 at 03:07:20PM +0200, Jan Tulak wrote:
>> OK, I'm willing to return errors for the _raw functions. These are used only
>> on few places, so it is not a big issue. Especially if I add a wrapper for
>> the get_conf_raw function - right now, these are used only as fprintf()
>> arguments to print an error. So the wrapper makes it easy to use in this
>> case (with the old die-on-error behavior), but if you want to use it for
>> something else, you can use it directly and get an error as a return code.
>> Does this looks good?
>>
>> +/*
>> + * Return 0 on success, -ENOMEM if it could not allocate enough memory for
>> + * the string to be saved into the out pointer.
>> + */
>> +static int
>> +get_conf_raw(const struct opt_params *opt, const int subopt, char **out)
>> +{
>> +       if (subopt < 0 || subopt >= MAX_SUBOPTS) {
>> +               fprintf(stderr,
>> +               "This is a bug: get_conf_raw called with invalid opt/subopt:
>> %c/%d\n",
>> +               opt->name, subopt);
>> +               exit(1);
> Why not return -EINVAL?

If we know we hit a bug, we should terminate as soon as possible. We are 
in an indeterminate state and we shouldn't risk writing anything. C does 
not have exceptions, so I think that here we really should just exit. 
The memory issue can have a solution, but a bug? Time to end ASAP.

And set/get_conf_val is yet another issue. I really don't want to return 
errors there, because then we can't do things like:

if (get_conf_val(OPT_D, D_AGCOUNT) > XFS_MAX_AGNUMBER + 1)

There are over 350 uses of get_conf_val similar to this, and if every 
use had to be changed to something like:

test_error(get_conf_val(OPT_D, D_AGCOUNT, &tmp_x));
if (tmp_x > XFS_MAX_AGNUMBER + 1)

then this whole thing with temporary variables would make the situation 
worse than it is now.
We are not speaking about handling issues that can arise from user 
input (that *should* be handled with returns) but about bugs and 
severe situations such as "the system can't allocate even a few more bytes."

I really don't see how to avoid the aborts entirely here, while at 
the same time:
1) being able to detect that something happened and abort immediately
2) and having a simple usage that does not inflate every access to a 
multi-line mess.

>
>> +       }
>> +       *out = strdup(opt->subopt_params[subopt].raw_input);
>> +       if (*out == NULL)
>> +               return -ENOMEM;
>> +       return 0;
>> +
>> +}
>> +
>> +/*
>> + * Same as get_conf_raw(), except it returns the string through return
>> + * and dies on any error.
>> + */
>> +static char *
>> +get_conf_raw_safe(const struct opt_params *opt, const int subopt)
>> +{
>> +       char *str;
>> +       if (get_conf_raw(opt, subopt, &str) == -ENOMEM) {
>> +               fprintf(stderr, "Out of memory!");
>> +               exit(1);
> I'd say no, just return NULL; these aborts drive me personally nuts.
OK, NULL works here.
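
A rough sketch of what the NULL-returning wrapper could look like follows. The struct layout and the MAX_SUBOPTS value are reduced stand-ins assumed for illustration, and the -EINVAL returns in get_conf_raw() follow the reviewer's suggestion rather than the posted code (which exits on an invalid subopt). Treating an unset raw_input as -EINVAL is likewise an assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SUBOPTS 16		/* assumed value for this sketch */

/* Reduced stand-ins for the real structures in mkfs/xfs_mkfs.c. */
struct subopt_param {
	const char	*raw_input;
};
struct opt_params {
	char			name;
	struct subopt_param	subopt_params[MAX_SUBOPTS];
};

/* Return 0 and a freshly allocated copy in *out, or a negative errno. */
int
get_conf_raw(const struct opt_params *opt, int subopt, char **out)
{
	assert(opt != NULL && out != NULL);

	*out = NULL;
	if (subopt < 0 || subopt >= MAX_SUBOPTS)
		return -EINVAL;
	/* Assumption: nothing recorded yet is also reported as -EINVAL. */
	if (opt->subopt_params[subopt].raw_input == NULL)
		return -EINVAL;
	*out = strdup(opt->subopt_params[subopt].raw_input);
	if (*out == NULL)
		return -ENOMEM;
	return 0;
}

/* Wrapper that hides the error code: NULL on any failure, no exit(). */
char *
get_conf_raw_safe(const struct opt_params *opt, int subopt)
{
	char	*str;

	if (get_conf_raw(opt, subopt, &str) != 0)
		return NULL;
	return str;	/* caller owns the copy and must free() it */
}
```

A caller that only wants the string for an error message then has to handle NULL itself, for example by printing a placeholder instead.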

Jan

Luis Chamberlain Aug. 7, 2017, 5:26 p.m. UTC | #11

On Fri, Aug 04, 2017 at 03:50:19PM +0200, Jan Tulak wrote:
> 
> 
> On 04/08/2017 00:25, Luis R. Rodriguez wrote:
> > On Thu, Aug 03, 2017 at 03:07:20PM +0200, Jan Tulak wrote:
> > > OK, I'm willing to return errors for the _raw functions. These are used only
> > > on few places, so it is not a big issue. Especially if I add a wrapper for
> > > the get_conf_raw function - right now, these are used only as fprintf()
> > > arguments to print an error. So the wrapper makes it easy to use in this
> > > case (with the old die-on-error behavior), but if you want to use it for
> > > something else, you can use it directly and get an error as a return code.
> > > Does this looks good?
> > > 
> > > +/*
> > > + * Return 0 on success, -ENOMEM if it could not allocate enough memory for
> > > + * the string to be saved into the out pointer.
> > > + */
> > > +static int
> > > +get_conf_raw(const struct opt_params *opt, const int subopt, char **out)
> > > +{
> > > +       if (subopt < 0 || subopt >= MAX_SUBOPTS) {
> > > +               fprintf(stderr,
> > > +               "This is a bug: get_conf_raw called with invalid opt/subopt:
> > > %c/%d\n",
> > > +               opt->name, subopt);
> > > +               exit(1);
> > Why not return -EINVAL?
> 
> If we know we hit a bug, we should terminate as soon as possible. We are in
> an indeterminable state and we shouldn't risk that we will write anything. C
> does not have exceptions, so I think that here we really should just exit.
> The memory issue can have a solution, but a bug? Time to end ASAP.
> 
> And set/get_conf_val is yet another issue. I really don't want to return
> errors there, because then we can't do things like:
> 
> if (get_conf_val(OPT_D, D_AGCOUNT) > XFS_MAX_AGNUMBER + 1)
> 
> There is over 350 uses of get_conf_val similar to this and if every usage
> should be changed to something like:
> 
> test_error(get_conf_val(OPT_D, D_AGCOUNT, &tmp_x));
> if(tmp_x > XFS_MAX_AGNUMBER + 1)
> 
> Then this whole thing with temporary variables would make the situation
> worse than it is now.

Then one can keep the behaviour for get_conf_val() and it would use __get_conf_val()
which in turn *does* do the return. This way if I need to capture and handle the return
differently later this can be done and the code for existing callers does not need
to change, and the same paranoid behaviour can be kept?

  Luis

Jan Tulak Aug. 7, 2017, 5:36 p.m. UTC | #12

On Mon, Aug 7, 2017 at 7:26 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Fri, Aug 04, 2017 at 03:50:19PM +0200, Jan Tulak wrote:
>>
>>
>> On 04/08/2017 00:25, Luis R. Rodriguez wrote:
>> > On Thu, Aug 03, 2017 at 03:07:20PM +0200, Jan Tulak wrote:
>> > > OK, I'm willing to return errors for the _raw functions. These are used only
>> > > on few places, so it is not a big issue. Especially if I add a wrapper for
>> > > the get_conf_raw function - right now, these are used only as fprintf()
>> > > arguments to print an error. So the wrapper makes it easy to use in this
>> > > case (with the old die-on-error behavior), but if you want to use it for
>> > > something else, you can use it directly and get an error as a return code.
>> > > Does this looks good?
>> > >
>> > > +/*
>> > > + * Return 0 on success, -ENOMEM if it could not allocate enough memory for
>> > > + * the string to be saved into the out pointer.
>> > > + */
>> > > +static int
>> > > +get_conf_raw(const struct opt_params *opt, const int subopt, char **out)
>> > > +{
>> > > +       if (subopt < 0 || subopt >= MAX_SUBOPTS) {
>> > > +               fprintf(stderr,
>> > > +               "This is a bug: get_conf_raw called with invalid opt/subopt:
>> > > %c/%d\n",
>> > > +               opt->name, subopt);
>> > > +               exit(1);
>> > Why not return -EINVAL?
>>
>> If we know we hit a bug, we should terminate as soon as possible. We are in
>> an indeterminable state and we shouldn't risk that we will write anything. C
>> does not have exceptions, so I think that here we really should just exit.
>> The memory issue can have a solution, but a bug? Time to end ASAP.
>>
>> And set/get_conf_val is yet another issue. I really don't want to return
>> errors there, because then we can't do things like:
>>
>> if (get_conf_val(OPT_D, D_AGCOUNT) > XFS_MAX_AGNUMBER + 1)
>>
>> There is over 350 uses of get_conf_val similar to this and if every usage
>> should be changed to something like:
>>
>> test_error(get_conf_val(OPT_D, D_AGCOUNT, &tmp_x));
>> if(tmp_x > XFS_MAX_AGNUMBER + 1)
>>
>> Then this whole thing with temporary variables would make the situation
>> worse than it is now.
>
> Then one can keep the behaviour for get_conf_val() and it would use __get_conf_val()
> which in turn *does* do the return. This way if I need to capture and handle the return
> differently later this can be done and the code for existing callers does not need
> to change, and the same paranoid behaviour can be kept?
>

Yes, I'm OK with two versions, one safe and one
unsafe-you-have-to-test-for-errors, if you have a use for it.
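
For illustration, the two-version arrangement might be sketched as below. This is only a sketch under assumptions: the struct is a reduced stand-in, MAX_SUBOPTS has an assumed value, and the accessors take a struct pointer like get_conf_raw() does, whereas the real get_conf_val() calls in the thread take an option index such as OPT_D.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_SUBOPTS 16		/* assumed value for this sketch */

/* Reduced stand-ins for the real structures in mkfs/xfs_mkfs.c. */
struct subopt_param {
	long long	value;
};
struct opt_params {
	char			name;
	struct subopt_param	subopt_params[MAX_SUBOPTS];
};

/*
 * Checked variant (name as suggested in the thread): reports a bad
 * index to the caller instead of terminating.
 */
int
__get_conf_val(const struct opt_params *opt, int subopt, long long *out)
{
	assert(opt != NULL && out != NULL);

	if (subopt < 0 || subopt >= MAX_SUBOPTS)
		return -EINVAL;
	*out = opt->subopt_params[subopt].value;
	return 0;
}

/*
 * Convenience variant: keeps the existing one-line call style and the
 * paranoid die-on-bug behaviour for the existing callers.
 */
long long
get_conf_val(const struct opt_params *opt, int subopt)
{
	long long	val;

	if (__get_conf_val(opt, subopt, &val) != 0) {
		fprintf(stderr,
	"This is a bug: get_conf_val called with invalid opt/subopt: %c/%d\n",
			opt->name, subopt);
		exit(1);
	}
	return val;
}
```

Existing comparisons of the form get_conf_val(...) > limit stay single expressions, while new code that wants to recover from a bad index can call __get_conf_val() and inspect the return value.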

Jan
