diff mbox series

[1/6] exec: Don't execute binary files if execve() returned ENOEXEC.

Message ID 20180907083414.14673-2-andrew.shadura@collabora.co.uk (mailing list archive)
State Rejected
Delegated to: Herbert Xu
Headers show
Series [1/6] exec: Don't execute binary files if execve() returned ENOEXEC. | expand

Commit Message

Andrej Shadura Sept. 7, 2018, 8:34 a.m. UTC
From: Adam Borowski <kilobyte@angband.pl>

Both "dash -c foo" and "./foo" are supposed to be able to run hashbang-less
scripts, but attempts to execute common binary files tend to be nasty:
especially both ELF and PE tend to make dash create a bunch of files with
unprintable names, that in turn confuse some tools up to causing data loss.

Thus, let's read the first line and see if it looks like text.  This is a
variant of the approach used by bash and zsh; mksh instead checks for
signatures of a bunch of common file types.

POSIX says: "If the executable file is not a text file, the shell may bypass
this command execution."

Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
---
 src/exec.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

Comments

Jilles Tjoelker Sept. 11, 2018, 9:06 p.m. UTC | #1
On Fri, Sep 07, 2018 at 10:34:09AM +0200, Andrej Shadura wrote:
> From: Adam Borowski <kilobyte@angband.pl>

> Both "dash -c foo" and "./foo" are supposed to be able to run hashbang-less
> scripts, but attempts to execute common binary files tend to be nasty:
> especially both ELF and PE tend to make dash create a bunch of files with
> unprintable names, that in turn confuse some tools up to causing data loss.

> Thus, let's read the first line and see if it looks like text.  This is a
> variant of the approach used by bash and zsh; mksh instead checks for
> signatures of a bunch of common file types.

> POSIX says: "If the executable file is not a text file, the shell may bypass
> this command execution."

I think this is worth the extra code, even though POSIX does not require
it.

Neither the file_is_binary function nor its caller take care of
preserving errno, which might lead to confusing error messages.

On another note, this is not fully POSIX-compliant because it rejects
some files that comply to POSIX's definition of a text file. Per POSIX's
definition, only files containing '\0' can be rejected (or containing
too long lines or not ending with a newline, both of which seem like a
bad idea to check); this probably requires checking more than the first
line for decent detection. However, I suppose the benefits and
disadvantages were weighed here and non-compliance was decided.

> Signed-off-by: Adam Borowski <kilobyte@angband.pl>
> Signed-off-by: Andrej Shadura <andrew.shadura@collabora.co.uk>
> ---
>  src/exec.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/src/exec.c b/src/exec.c
> index 9d0215a..6300001 100644
> --- a/src/exec.c
> +++ b/src/exec.c
> @@ -148,6 +148,36 @@ shellexec(char **argv, const char *path, int idx)
>  }
>  
>  
> +/*
> + * Check if an executable that just failed with ENOEXEC shouldn't be
> + * considered a script (wrong-arch ELF/PE, junk accidentally set +x, etc).
> + * We check only the first line to allow binaries encapsulated in a shell
> + * script without proper quoting.  The first line, if not a hashbang, is
> + * likely to contain comments; even ancient encodings, at least popular
> + * ones, don't use 0x7f nor values below 0x1f other than whitespace (\t,
> + * \n, \v, \f, \r), ISO/IEC 2022 can have SI, SO and \e.
> + */
> +STATIC int file_is_binary(const char *cmd)
> +{
> +	char buf[128];
> +	int fd = open(cmd, O_RDONLY|O_NOCTTY);
> +	if (fd == -1)
> +		return 1;
> +	int len = read(fd, buf, sizeof(buf));
> +	close(fd);
> +	for (int i = 0; i < len; ++i) {
> +		char c = buf[i];
> +		if (c >= 0 && c <= 8 ||
> +		    c >= 16 && c <= 31 && c != 27 ||
> +		    c == 0x7f)
> +			return 1;
> +		if (c == '\n')
> +			return 0;
> +	}
> +	return 0;
> +}
> +
> +
>  STATIC void
>  tryexec(char *cmd, char **argv, char **envp)
>  {
> @@ -162,6 +192,8 @@ repeat:
>  	execve(cmd, argv, envp);
>  #endif
>  	if (cmd != path_bshell && errno == ENOEXEC) {
> +		if (file_is_binary(cmd))
> +			return;
>  		*argv-- = cmd;
>  		*argv = cmd = path_bshell;
>  		goto repeat;
diff mbox series

Patch

diff --git a/src/exec.c b/src/exec.c
index 9d0215a..6300001 100644
--- a/src/exec.c
+++ b/src/exec.c
@@ -148,6 +148,36 @@  shellexec(char **argv, const char *path, int idx)
 }
 
 
+/*
+ * Check if an executable that just failed with ENOEXEC shouldn't be
+ * considered a script (wrong-arch ELF/PE, junk accidentally set +x, etc).
+ * We check only the first line to allow binaries encapsulated in a shell
+ * script without proper quoting.  The first line, if not a hashbang, is
+ * likely to contain comments; even ancient encodings, at least popular
+ * ones, don't use 0x7f nor values below 0x1f other than whitespace (\t,
+ * \n, \v, \f, \r), ISO/IEC 2022 can have SI, SO and \e.
+ */
+STATIC int file_is_binary(const char *cmd)
+{
+	char buf[128];
+	int fd = open(cmd, O_RDONLY|O_NOCTTY);
+	if (fd == -1)
+		return 1;
+	int len = read(fd, buf, sizeof(buf));
+	close(fd);
+	for (int i = 0; i < len; ++i) {
+		char c = buf[i];
+		if (c >= 0 && c <= 8 ||
+		    c >= 16 && c <= 31 && c != 27 ||
+		    c == 0x7f)
+			return 1;
+		if (c == '\n')
+			return 0;
+	}
+	return 0;
+}
+
+
 STATIC void
 tryexec(char *cmd, char **argv, char **envp)
 {
@@ -162,6 +192,8 @@  repeat:
 	execve(cmd, argv, envp);
 #endif
 	if (cmd != path_bshell && errno == ENOEXEC) {
+		if (file_is_binary(cmd))
+			return;
 		*argv-- = cmd;
 		*argv = cmd = path_bshell;
 		goto repeat;