diff mbox series

[6.10.0-rc2] kernel/module: avoid panic on loading broken module

Message ID 230772fc-1076-4afb-8f7a-e7c402548c3b@gmail.com (mailing list archive)
State Handled Elsewhere
Headers show
Series [6.10.0-rc2] kernel/module: avoid panic on loading broken module | expand

Commit Message

Daniel von Kirschten June 6, 2024, 1:31 p.m. UTC
If a module is being loaded, and the .gnu.linkonce.this_module section
in the module's ELF file does not have the WRITE flag, the kernel will
map the finished module struct of that module as read-only.
This causes a kernel panic when the struct is written to the first time
after it has been marked read-only. Currently this happens in
complete_formation in kernel/module/main.c:2765 when the module's state is
set to MODULE_STATE_COMING, just after setting up the memory protections.

Down the line, this seems to lead to unpredictable freezes when trying to
load other modules - I guess this is due to some structures not being
cleaned up properly, but I didn't investigate this further.

A check already exists which verifies that .gnu.linkonce.this_module
is ALLOC. This patch simply adds an analogous check for WRITE.

Signed-off-by: Daniel Kirschten <danielkirschten@gmail.com>
---
  kernel/module/main.c | 6 ++++++
  1 file changed, 6 insertions(+)

Comments

Luis Chamberlain June 18, 2024, 7:58 p.m. UTC | #1
On Thu, Jun 06, 2024 at 03:31:49PM +0200, Daniel v. Kirschten wrote:
> If a module is being loaded, and the .gnu.linkonce.this_module section
> in the module's ELF file does not have the WRITE flag, the kernel will
> map the finished module struct of that module as read-only.
> This causes a kernel panic when the struct is written to the first time
> after it has been marked read-only. Currently this happens in
> complete_formation in kernel/module/main.c:2765 when the module's state is
> set to MODULE_STATE_COMING, just after setting up the memory protections.

How did you find this issue?

> Down the line, this seems to lead to unpredictable freezes when trying to
> load other modules - I guess this is due to some structures not being
> cleaned up properly, but I didn't investigate this further.
> 
> A check already exists which verifies that .gnu.linkonce.this_module
> is ALLOC. This patch simply adds an analogous check for WRITE.

Can you check to ensure our modules generated have a respective check to
ensure this check exists at build time? That would proactively inform
userspace when a built module is not built correctly, and the tool
responsible can be identified.

  Luis
Daniel von Kirschten June 21, 2024, 2:05 p.m. UTC | #2
Am 18.06.2024 um 21:58 schrieb Luis Chamberlain:
> On Thu, Jun 06, 2024 at 03:31:49PM +0200, Daniel v. Kirschten wrote:
>> If a module is being loaded, and the .gnu.linkonce.this_module section
>> in the module's ELF file does not have the WRITE flag, the kernel will
>> map the finished module struct of that module as read-only.
>> This causes a kernel panic when the struct is written to the first time
>> after it has been marked read-only. Currently this happens in
>> complete_formation in kernel/module/main.c:2765 when the module's state is
>> set to MODULE_STATE_COMING, just after setting up the memory protections.
> 
> How did you find this issue?

In a university course I got the assignment to manually craft a loadable 
.ko file, given only a regular object file, without using Kbuild. During 
testing my module files, most of them were simply (correctly) rejected 
by the kernel with an appropriate error message, but at some point I ran 
into this exact kernel panic, and investigated it to understand why my 
module file was invalid.

> 
>> Down the line, this seems to lead to unpredictable freezes when trying to
>> load other modules - I guess this is due to some structures not being
>> cleaned up properly, but I didn't investigate this further.
>>
>> A check already exists which verifies that .gnu.linkonce.this_module
>> is ALLOC. This patch simply adds an analogous check for WRITE.
> 
> Can you check to ensure our modules generated have a respective check to
> ensure this check exists at build time? That would proactively inform
> userspace when a built module is not built correctly, and the tool
> responsible can be identified.

See above - I don't think it's possible to create such a broken module 
file with any of "official" tools. I haven't looked too deeply into how 
Kbuild actually builds modules, but as far as I know, the user doesn't 
even come into contact with this_module when using the regular 
toolchain, because Kbuild is responsible for creating the .this_module 
section. And Kbuild of course creates it with the correct flags. So if I 
understand correctly, this problem can only occur when the module was 
built by some external tooling (or manually, in my case).

   Daniel
Luis Chamberlain June 28, 2024, 5:25 p.m. UTC | #3
On Fri, Jun 21, 2024 at 04:05:27PM +0200, Daniel von Kirschten wrote:
> Am 18.06.2024 um 21:58 schrieb Luis Chamberlain:
> > On Thu, Jun 06, 2024 at 03:31:49PM +0200, Daniel v. Kirschten wrote:
> > > If a module is being loaded, and the .gnu.linkonce.this_module section
> > > in the module's ELF file does not have the WRITE flag, the kernel will
> > > map the finished module struct of that module as read-only.
> > > This causes a kernel panic when the struct is written to the first time
> > > after it has been marked read-only. Currently this happens in
> > > complete_formation in kernel/module/main.c:2765 when the module's state is
> > > set to MODULE_STATE_COMING, just after setting up the memory protections.
> > 
> > How did you find this issue?
> 
> In a university course I got the assignment to manually craft a loadable .ko
> file, given only a regular object file, without using Kbuild. During testing
> my module files, most of them were simply (correctly) rejected by the kernel
> with an appropriate error message, but at some point I ran into this exact
> kernel panic, and investigated it to understand why my module file was
> invalid.

OK, then the commit log should describe that this doesn't fix any known
real world issue, but rather a custom crafted module without the regular
module build system.

> > > Down the line, this seems to lead to unpredictable freezes when trying to
> > > load other modules - I guess this is due to some structures not being
> > > cleaned up properly, but I didn't investigate this further.
> > > 
> > > A check already exists which verifies that .gnu.linkonce.this_module
> > > is ALLOC. This patch simply adds an analogous check for WRITE.
> > 
> > Can you check to ensure our modules generated have a respective check to
> > ensure this check exists at build time? That would proactively inform
> > userspace when a built module is not built correctly, and the tool
> > responsible can be identified.
> 
> See above - I don't think it's possible to create such a broken module file
> with any of "official" tools.

That should be clearly stated on the commit log.

> I haven't looked too deeply into how Kbuild
> actually builds modules, but as far as I know, the user doesn't even come
> into contact with this_module w

Consider that a next level university assignment and is more useful to the world
than this debug message. Because above you suggest "I don't think", go
out and now be sure.

> hen using the regular toolchain, because
> Kbuild is responsible for creating the .this_module section. And Kbuild of
> course creates it with the correct flags. So if I understand correctly,

...

> this
> problem can only occur when the module was built by some external tooling
> (or manually, in my case).

Who would create custom modules without the Linux kernel module build
system, and what uses does that provide? It seems you are proving why
this would be terribly silly thing to do.

Now, the *value* your change has is it can prevent a crash in case of a
corrupted module, which *can* occur, consider an odd filesystem
live corruption, at least this would be caught at module load attempt
and not crash. That's worth committing for this reason but your commit
log really needs much more clarity. Why? Because stupid bots want to
assign stupid CVEs for anything that seems like a security issue and
this could escalate to such type of things. Providing clarity helps
system integrators decide if they want to backport this sort of patch.
Providing clarify on the chances of this happening and how we think it
can happen helps a lot.

If you want to be more proactive, try to enhance userspace kmod modprobe
so that this is also verified.

  Luis
diff mbox series

Patch

diff --git a/kernel/module/main.c b/kernel/module/main.c
index d18a94b973e1..abba097551a2 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -1886,6 +1886,12 @@  static int elf_validity_cache_copy(struct load_info *info, int flags)
                 goto no_exec;
         }
  
+       if (!(shdr->sh_flags & SHF_WRITE)) {
+               pr_err("module %s: .gnu.linkonce.this_module must be writable\n",
+                      info->name ?: "(missing .modinfo section or name field)");
+               goto no_exec;
+       }
+
         if (shdr->sh_size != sizeof(struct module)) {
                 pr_err("module %s: .gnu.linkonce.this_module section size must match the kernel's built struct module size at run time\n",
                        info->name ?: "(missing .modinfo section or name field)");