Message ID | 20230613172927.19019-1-jonathantanmy@google.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | CodingGuidelines: use octal escapes, not hex | expand |
On Tue, Jun 13, 2023 at 1:44 PM Jonathan Tan <jonathantanmy@google.com> wrote: > Hexadecimal escapes in shell scripts are not portable across shells (in > particular, "dash" does not support them). Write in the CodingGuidelines > document that we should be using octal escapes instead. > > Signed-off-by: Jonathan Tan <jonathantanmy@google.com> > --- > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > @@ -188,6 +188,9 @@ For shell scripts specifically (not exhaustive): > + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. > + "\xc2\xa2"), as the latter is not portable. The shell itself doesn't interpret these sequences, so this description feels too generic. Perhaps it would make more sense to cite specific tools for which octal sequences are needed for portability reasons, such as `printf`, `sed`, `tr`, etc.
Eric Sunshine <sunshine@sunshineco.com> writes: > On Tue, Jun 13, 2023 at 1:44 PM Jonathan Tan <jonathantanmy@google.com> wrote: > > Hexadecimal escapes in shell scripts are not portable across shells (in > > particular, "dash" does not support them). Write in the CodingGuidelines > > document that we should be using octal escapes instead. > > > > Signed-off-by: Jonathan Tan <jonathantanmy@google.com> > > --- > > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > > @@ -188,6 +188,9 @@ For shell scripts specifically (not exhaustive): > > + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. > > + "\xc2\xa2"), as the latter is not portable. > > The shell itself doesn't interpret these sequences, so this > description feels too generic. Perhaps it would make more sense to > cite specific tools for which octal sequences are needed for > portability reasons, such as `printf`, `sed`, `tr`, etc. Ah...good point. I checked with "echo" in "dash" and assumed that it was "dash" that was interpreting the escapes, but indeed it is the "echo" (and "printf") builtins in "dash" that are actually interpreting them. What do you think of the following in the commit message: Hexadecimal escapes in shell scripts are not portable across shell builtins (in particular, the "printf" of "dash" does not support them). Write in the CodingGuidelines document that we should be using octal escapes instead. and in the CodingGuidelines doc: + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. + "\xc2\xa2"), as the latter is not portable across some shell builtins like printf.
On Tue, Jun 13, 2023 at 2:43 PM Jonathan Tan <jonathantanmy@google.com> wrote: > Eric Sunshine <sunshine@sunshineco.com> writes: > > On Tue, Jun 13, 2023 at 1:44 PM Jonathan Tan <jonathantanmy@google.com> wrote: > > > + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. > > > + "\xc2\xa2"), as the latter is not portable. > > > > The shell itself doesn't interpret these sequences, so this > > description feels too generic. Perhaps it would make more sense to > > cite specific tools for which octal sequences are needed for > > portability reasons, such as `printf`, `sed`, `tr`, etc. > > Ah...good point. I checked with "echo" in "dash" and assumed that it > was "dash" that was interpreting the escapes, but indeed it is the > "echo" (and "printf") builtins in "dash" that are actually interpreting > them. What do you think of the following in the commit message: > > Hexadecimal escapes in shell scripts are not portable across shell builtins (in > particular, the "printf" of "dash" does not support them). Write in the CodingGuidelines > document that we should be using octal escapes instead. > > and in the CodingGuidelines doc: > > + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. > + "\xc2\xa2"), as the latter is not portable across some shell builtins like printf. The portability concern is not specific to a certain shell or whether a command is builtin, so it feels misleading to single out "dash" and builtins. The same portability problems can crop up, as well, with older (non-"dash") shells, and with commands which may or may not be builtins (such as "echo" which, historically, was not always a builtin), and non-builtins commands, such as "sed" and "tr". So, for the commit message, perhaps simply: Extend the shell-scripting section of CodingGuidelines to suggest octal escape sequences (e.g. "\302\242") over hexadecimal (e.g. "\xc2\xa2") since the latter can be a source of portability problems. As for the change to CodingGuidelines, this would probably be sufficient: Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. "\xc2\xa2"), since the latter is not portable across some commands, such as `printf`, `sed`, `tr`, etc.
Eric Sunshine <sunshine@sunshineco.com> writes: > So, for the commit message, perhaps simply: > > Extend the shell-scripting section of CodingGuidelines to suggest > octal escape sequences (e.g. "\302\242") over hexadecimal > (e.g. "\xc2\xa2") since the latter can be a source of portability > problems. > > As for the change to CodingGuidelines, this would probably be sufficient: > > Use octal escape sequences (e.g. "\302\242"), not hexadecimal > (e.g. "\xc2\xa2"), since the latter is not portable across some > commands, such as `printf`, `sed`, `tr`, etc. I'd prefer singling out `printf`, actually, and not talking about "across some commands". As I said in a separate message, we certainly do *not* want to rely on `echo` interpreting bs-escaped octal sequences without '-e', even though it may be expected on a POSIX systems, because it is not portable across systems our users commonly encounter. And `printf` has been what we chose to turn bs-escaped octal sequence into binary. I'd prefer not having to even worry about `sed`, `tr`, etc. behaving differently and not allowing to expect these other commands to be usable for turning bs-escaped octal sequence into binary would be one way to achieve that goal. Thanks.
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 003393ed16..1c54abd7c5 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -188,6 +188,9 @@ For shell scripts specifically (not exhaustive): hopefully nobody starts using "local" before they are reimplemented in C ;-) + - Use octal escape sequences (e.g. "\302\242"), not hexadecimal (e.g. + "\xc2\xa2"), as the latter is not portable. + For C programs:
Hexadecimal escapes in shell scripts are not portable across shells (in particular, "dash" does not support them). Write in the CodingGuidelines document that we should be using octal escapes instead. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> --- Documentation/CodingGuidelines | 3 +++ 1 file changed, 3 insertions(+)