mbox series

[v3,00/11] sysctl: treewide: constify ctl_table argument of sysctl handlers

Message ID 20240423-sysctl-const-handler-v3-0-e0beccb836e2@weissschuh.net (mailing list archive)
Headers show
Series sysctl: treewide: constify ctl_table argument of sysctl handlers | expand

Message

Thomas Weißschuh April 23, 2024, 7:54 a.m. UTC
* Patch 1 is a bugfix for the stack_erasing sysctl handler
* Patches 2-10 change various helper functions throughout the kernel to
  be able to handle 'const ctl_table'.
* Patch 11 changes the signatures of all proc handlers through the tree.
  Some other signatures are also adapted, for details see the commit
  message.

Only patch 1 changes any code at all.

The series was compile-tested on top of next-20230423 for
i386, x86_64, arm, arm64, riscv, loongarch, s390 and m68k.

The series was split from my larger series sysctl-const series [0].
It only focusses on the proc_handlers but is an important step to be
able to move all static definitions of ctl_table into .rodata.

[0] https://lore.kernel.org/lkml/20231204-const-sysctl-v2-0-7a5060b11447@weissschuh.net/

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
---
Changes in v3:
- Rebase on current -next
- Cc affected mailing lists again to gather feedback
- Link to v2: https://lore.kernel.org/r/20240323-sysctl-const-handler-v2-0-e80b178f1727@weissschuh.net

Changes in v2:
- Reduce recipient list
- Fix source formatting
- Rebase onto next-20240322
- Link to v1: https://lore.kernel.org/r/20240315-sysctl-const-handler-v1-0-1322ac7cb03d@weissschuh.net

---
Thomas Weißschuh (11):
      stackleak: don't modify ctl_table argument
      cgroup: bpf: constify ctl_table arguments and fields
      hugetlb: constify ctl_table arguments of utility functions
      utsname: constify ctl_table arguments of utility function
      neighbour: constify ctl_table arguments of utility function
      ipv4/sysctl: constify ctl_table arguments of utility functions
      ipv6/addrconf: constify ctl_table arguments of utility functions
      ipv6/ndisc: constify ctl_table arguments of utility function
      ipvs: constify ctl_table arguments of utility functions
      sysctl: constify ctl_table arguments of utility function
      sysctl: treewide: constify the ctl_table argument of handlers

 arch/arm64/kernel/armv8_deprecated.c      |  2 +-
 arch/arm64/kernel/fpsimd.c                |  2 +-
 arch/s390/appldata/appldata_base.c        | 10 ++--
 arch/s390/kernel/debug.c                  |  2 +-
 arch/s390/kernel/topology.c               |  2 +-
 arch/s390/mm/cmm.c                        |  6 +--
 arch/x86/kernel/itmt.c                    |  2 +-
 drivers/cdrom/cdrom.c                     |  4 +-
 drivers/char/random.c                     |  6 +--
 drivers/macintosh/mac_hid.c               |  2 +-
 drivers/net/vrf.c                         |  2 +-
 drivers/parport/procfs.c                  | 12 ++---
 drivers/perf/arm_pmuv3.c                  |  4 +-
 drivers/perf/riscv_pmu_sbi.c              |  2 +-
 fs/coredump.c                             |  2 +-
 fs/dcache.c                               |  4 +-
 fs/drop_caches.c                          |  2 +-
 fs/exec.c                                 |  4 +-
 fs/file_table.c                           |  4 +-
 fs/fs-writeback.c                         |  2 +-
 fs/inode.c                                |  4 +-
 fs/pipe.c                                 |  2 +-
 fs/quota/dquot.c                          |  2 +-
 fs/xfs/xfs_sysctl.c                       |  6 +--
 include/linux/filter.h                    |  2 +-
 include/linux/ftrace.h                    |  4 +-
 include/linux/mm.h                        |  8 +--
 include/linux/perf_event.h                |  6 +--
 include/linux/security.h                  |  2 +-
 include/linux/sysctl.h                    | 36 ++++++-------
 include/linux/vmstat.h                    |  6 +--
 include/linux/writeback.h                 |  2 +-
 include/net/ndisc.h                       |  2 +-
 include/net/neighbour.h                   |  6 +--
 include/net/netfilter/nf_hooks_lwtunnel.h |  2 +-
 ipc/ipc_sysctl.c                          |  8 +--
 kernel/bpf/syscall.c                      |  4 +-
 kernel/delayacct.c                        |  4 +-
 kernel/events/callchain.c                 |  2 +-
 kernel/events/core.c                      |  4 +-
 kernel/fork.c                             |  2 +-
 kernel/hung_task.c                        |  6 +--
 kernel/kexec_core.c                       |  2 +-
 kernel/kprobes.c                          |  2 +-
 kernel/latencytop.c                       |  4 +-
 kernel/pid_namespace.c                    |  2 +-
 kernel/pid_sysctl.h                       |  2 +-
 kernel/printk/internal.h                  |  2 +-
 kernel/printk/printk.c                    |  2 +-
 kernel/printk/sysctl.c                    |  5 +-
 kernel/sched/core.c                       |  8 +--
 kernel/sched/rt.c                         | 16 +++---
 kernel/sched/topology.c                   |  2 +-
 kernel/seccomp.c                          | 10 ++--
 kernel/stackleak.c                        |  9 ++--
 kernel/sysctl.c                           | 89 ++++++++++++++++---------------
 kernel/time/timer.c                       |  2 +-
 kernel/trace/ftrace.c                     |  2 +-
 kernel/trace/trace.c                      |  2 +-
 kernel/trace/trace_events_user.c          |  2 +-
 kernel/trace/trace_stack.c                |  2 +-
 kernel/umh.c                              |  2 +-
 kernel/utsname_sysctl.c                   |  4 +-
 kernel/watchdog.c                         | 12 ++---
 mm/compaction.c                           |  6 +--
 mm/hugetlb.c                              | 12 ++---
 mm/page-writeback.c                       | 18 +++----
 mm/page_alloc.c                           | 14 ++---
 mm/util.c                                 | 12 ++---
 mm/vmstat.c                               |  4 +-
 net/bridge/br_netfilter_hooks.c           |  2 +-
 net/core/neighbour.c                      | 20 +++----
 net/core/sysctl_net_core.c                | 20 +++----
 net/ipv4/devinet.c                        |  6 +--
 net/ipv4/route.c                          |  2 +-
 net/ipv4/sysctl_net_ipv4.c                | 38 ++++++-------
 net/ipv6/addrconf.c                       | 27 +++++-----
 net/ipv6/ndisc.c                          |  4 +-
 net/ipv6/route.c                          |  2 +-
 net/ipv6/sysctl_net_ipv6.c                |  4 +-
 net/mpls/af_mpls.c                        |  4 +-
 net/netfilter/ipvs/ip_vs_ctl.c            | 19 +++----
 net/netfilter/nf_conntrack_standalone.c   |  2 +-
 net/netfilter/nf_hooks_lwtunnel.c         |  2 +-
 net/netfilter/nf_log.c                    |  2 +-
 net/phonet/sysctl.c                       |  2 +-
 net/rds/tcp.c                             |  4 +-
 net/sctp/sysctl.c                         | 32 +++++------
 net/sunrpc/sysctl.c                       |  6 +--
 net/sunrpc/xprtrdma/svc_rdma.c            |  2 +-
 security/apparmor/lsm.c                   |  2 +-
 security/min_addr.c                       |  2 +-
 security/yama/yama_lsm.c                  |  2 +-
 93 files changed, 329 insertions(+), 322 deletions(-)
---
base-commit: a59668a9397e7245b26e9be85d23f242ff757ae8
change-id: 20231226-sysctl-const-handler-883b5eba7e80

Best regards,

Comments

Luis R. Rodriguez April 23, 2024, 6:31 p.m. UTC | #1
On Tue, Apr 23, 2024 at 09:54:35AM +0200, Thomas Weißschuh wrote:
> * Patch 1 is a bugfix for the stack_erasing sysctl handler
> * Patches 2-10 change various helper functions throughout the kernel to
>   be able to handle 'const ctl_table'.
> * Patch 11 changes the signatures of all proc handlers through the tree.
>   Some other signatures are also adapted, for details see the commit
>   message.
> 
> Only patch 1 changes any code at all.
> 
> The series was compile-tested on top of next-20230423 for
> i386, x86_64, arm, arm64, riscv, loongarch, s390 and m68k.
> 
> The series was split from my larger series sysctl-const series [0].
> It only focusses on the proc_handlers but is an important step to be
> able to move all static definitions of ctl_table into .rodata.
> 
> [0] https://lore.kernel.org/lkml/20231204-const-sysctl-v2-0-7a5060b11447@weissschuh.net/
> 
> Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>

Cover letters don't need SOBS we only use them for patches.

But anyway:

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>

  Luis
Jakub Kicinski April 25, 2024, 3:12 a.m. UTC | #2
On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> The series was split from my larger series sysctl-const series [0].
> It only focusses on the proc_handlers but is an important step to be
> able to move all static definitions of ctl_table into .rodata.

Split this per subsystem, please.
Thomas Weißschuh April 25, 2024, 7:10 a.m. UTC | #3
On 2024-04-24 20:12:34+0000, Jakub Kicinski wrote:
> On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > The series was split from my larger series sysctl-const series [0].
> > It only focusses on the proc_handlers but is an important step to be
> > able to move all static definitions of ctl_table into .rodata.
> 
> Split this per subsystem, please.

Unfortunately this would introduce an enormous amount of code churn.

The function prototypes for each callback have to stay consistent.
So a another callback member ("proc_handler_new") is needed and users
would be migrated to it gradually.

But then *all* definitions of "struct ctl_table" throughout the tree need to
be touched.
In contrast, the proposed series only needs to change the handler
implementations, not their usage sites.

There are many, many more usage sites than handler implementations.

Especially, as the majority of sysctl tables use the standard handlers
(proc_dostring, proc_dobool, ...) and are not affected by the proposed
aproach at all.

And then we would have introduced a new handler name "proc_handler_new"
and maybe have to do the whole thing again to rename it back to
the original and well-known "proc_handler".


Of course if somebody has a better aproach, I'm all ears.


Thomas
Joel Granados April 25, 2024, 11:04 a.m. UTC | #4
On Wed, Apr 24, 2024 at 08:12:34PM -0700, Jakub Kicinski wrote:
> On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > The series was split from my larger series sysctl-const series [0].
> > It only focusses on the proc_handlers but is an important step to be
> > able to move all static definitions of ctl_table into .rodata.
> 
> Split this per subsystem, please.
It is tricky to do that because it changes the first argument (ctl*) to
const in the proc_handler function type defined in sysclt.h:
"
-typedef int proc_handler(struct ctl_table *ctl, int write, void *buffer,
+typedef int proc_handler(const struct ctl_table *ctl, int write, void *buffer,
                size_t *lenp, loff_t *ppos);
"
This means that all the proc_handlers need to change at the same time.

However, there is an alternative way to do this that allows chunking. We
first define the proc_handler as a void pointer (casting it where it is
being used) [1]. Then we could do the constification by subsystem (like
Jakub proposes). Finally we can "revert the void pointer change so we
don't have one size fit all pointer as our proc_handler [2].

Here are some comments about the alternative:
1. We would need to make the first argument const in all the derived
   proc_handlers [3] 
2. There would be no undefined behavior for two reasons:
   2.1. There is no case where we change the first argument. We know
        this because there are no compile errors after we make it const.
   2.2. We would always go from non-const to const. This is the case
        because all the stuff that is unchanged in non-const.
3. If the idea sticks, it should go into mainline as one patchset. I
   would not like to have a void* proc_handler in a kernel release.
4. I think this is a "win/win" solution were the constification goes
   through and it is divided in such a way that it is reviewable.

I would really like to hear what ppl think about this "heretic"
alternative. @Thomas, @Luis, @Kees @Jakub?

Best

[1] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/commit/?h=jag/constfy_treewide_alternative&id=4a383503b1ea650d4e12c1f5838974e879f5aa6f
[2] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/commit/?h=jag/constfy_treewide_alternative&id=a3be65973d27ec2933b9e81e1bec60be3a9b460d
[3] proc_dostring, proc_dobool, proc_dointvec....

--

Joel Granados
Thomas Weißschuh April 25, 2024, 8:34 p.m. UTC | #5
Hi Joel,

On 2024-04-25 13:04:12+0000, Joel Granados wrote:
> On Wed, Apr 24, 2024 at 08:12:34PM -0700, Jakub Kicinski wrote:
> > On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > > The series was split from my larger series sysctl-const series [0].
> > > It only focusses on the proc_handlers but is an important step to be
> > > able to move all static definitions of ctl_table into .rodata.
> > 
> > Split this per subsystem, please.
> It is tricky to do that because it changes the first argument (ctl*) to
> const in the proc_handler function type defined in sysclt.h:
> "
> -typedef int proc_handler(struct ctl_table *ctl, int write, void *buffer,
> +typedef int proc_handler(const struct ctl_table *ctl, int write, void *buffer,
>                 size_t *lenp, loff_t *ppos);
> "
> This means that all the proc_handlers need to change at the same time.
> 
> However, there is an alternative way to do this that allows chunking. We
> first define the proc_handler as a void pointer (casting it where it is
> being used) [1]. Then we could do the constification by subsystem (like
> Jakub proposes). Finally we can "revert the void pointer change so we
> don't have one size fit all pointer as our proc_handler [2].
> 
> Here are some comments about the alternative:
> 1. We would need to make the first argument const in all the derived
>    proc_handlers [3] 
> 2. There would be no undefined behavior for two reasons:
>    2.1. There is no case where we change the first argument. We know
>         this because there are no compile errors after we make it const.
>    2.2. We would always go from non-const to const. This is the case
>         because all the stuff that is unchanged in non-const.
> 3. If the idea sticks, it should go into mainline as one patchset. I
>    would not like to have a void* proc_handler in a kernel release.
> 4. I think this is a "win/win" solution were the constification goes
>    through and it is divided in such a way that it is reviewable.
> 
> I would really like to hear what ppl think about this "heretic"
> alternative. @Thomas, @Luis, @Kees @Jakub?

Thanks for that alternative, I'm not a big fan though.

Besides the wonky syntax, Control Flow Integrity should trap on
this construct. Functions are called through different pointers than
their actual types which is exactly what CFI is meant to prevent.

Maybe people find it easier to review when using
"--word-diff" and/or "-U0" with git diff/show.
There is really nothing going an besides adding a few "const"s.

But if the consensus prefers this solution, I'll be happy to adopt it.

> [1] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/commit/?h=jag/constfy_treewide_alternative&id=4a383503b1ea650d4e12c1f5838974e879f5aa6f
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/commit/?h=jag/constfy_treewide_alternative&id=a3be65973d27ec2933b9e81e1bec60be3a9b460d
> [3] proc_dostring, proc_dobool, proc_dointvec....


Thomas
Thomas Weißschuh April 27, 2024, 7:40 a.m. UTC | #6
On 2024-04-25 09:10:27+0000, Thomas Weißschuh wrote:
> On 2024-04-24 20:12:34+0000, Jakub Kicinski wrote:
> > On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > > The series was split from my larger series sysctl-const series [0].
> > > It only focusses on the proc_handlers but is an important step to be
> > > able to move all static definitions of ctl_table into .rodata.
> > 
> > Split this per subsystem, please.
> 
> Unfortunately this would introduce an enormous amount of code churn.
> 
> The function prototypes for each callback have to stay consistent.
> So a another callback member ("proc_handler_new") is needed and users
> would be migrated to it gradually.
> 
> But then *all* definitions of "struct ctl_table" throughout the tree need to
> be touched.
> In contrast, the proposed series only needs to change the handler
> implementations, not their usage sites.
> 
> There are many, many more usage sites than handler implementations.
> 
> Especially, as the majority of sysctl tables use the standard handlers
> (proc_dostring, proc_dobool, ...) and are not affected by the proposed
> aproach at all.
> 
> And then we would have introduced a new handler name "proc_handler_new"
> and maybe have to do the whole thing again to rename it back to
> the original and well-known "proc_handler".

This aproach could be optimized by only migrating the usages of the
custom handler implementations to "proc_handler_new".
After this we could move over the core handlers and "proc_handler" in
one small patch that does not need to touch the usages sites.

Afterwards all non-core usages would be migrated back from
"proc_handler_new" to "proc_handler" and the _new variant could be
dropped again.

It would still be more than twice the churn of my current patch.
And these patches would be more complex than the current
"just add a bunch of consts, nothing else".

Personally I still prefer the original aproach.


Thomas
Joel Granados May 3, 2024, 9:03 a.m. UTC | #7
Hey Thomas

Here is my feedback for your outstanding constification patches [1] and [2].

# You need to split the patch
The answer that you got from Jakub in the network subsystem is very clear and
baring a change of heart from the network folks, this will go in as but as a
split patchset. Please split it considering the following:
1. Create a different patchset for drivers/,  fs/, kernel/, net, and a
   miscellaneous that includes whatever does not fit into the others.
2. Consider that this might take several releases.
3. Consider the following sufix for the interim function name "_const". Like in
   kfree_const. Please not "_new".
4. Please publish the final result somewhere. This is important so someone can
   take over in case you need to stop.
5. Consistently mention the motivation in your cover letters. I specify more
   further down in "#Motivation".
6. Also mention that this is part of a bigger effort (like you did in your
   original cover letters). I would include [3,4,5,6]
7. Include a way to show what made it into .rodata. I specify more further down
   in "#Show the move".

# Motivation
As I read it, the motivation for these constification efforts are:
1. It provides increased safety: Having things in .rodata section reduces the
   attack surface. This is especially relevant for structures that have function
   pointers (like ctl_table); having these in .rodata means that these pointers
   always point to the "intended" function and cannot be changed.
2. Compiler optimizations: This was just a comment in the patchsets that I have
   mentioned ([3,4,5]). Do you know what optimizations specifically? Does it
   have to do with enhancing locality for the data in .rodata? Do you have other
   specific optimizations in mind?
3. Readability: because it is easier to know up-front that data is not supposed
   to change or its obvious that a function is re-entrant. Actually a lot of the
   readability reasons is about knowing things "up-front".
As we move forward with the constification in sysctl, please include a more
detailed motivation in all your cover letters. This helps maintainers (that
don't have the context) understand what you are trying to do. It does not need
to be my three points, but it should be more than just "put things into
.rodata". Please tell me if I have missed anything in the motivation.

# Show the move
I created [8] because there is no easy way to validate which objects made it
into .rodata. I ran [8] for your Dec 2nd patcheset [7] and there are less in
.rodata than I expected (the results are in [9]) Why is that? Is it something
that has not been posted to the lists yet? 

Best

[1] https://lore.kernel.org/all/20240423-sysctl-const-handler-v3-0-e0beccb836e2@weissschuh.net/
[2] https://lore.kernel.org/all/20240418-sysctl-const-table-arg-v2-1-4012abc31311@weissschuh.net
[3] [PATCH v2 00/14] ASoC: Constify local snd_sof_dsp_ops
    https://lore.kernel.org/all/20240426-n-const-ops-var-v2-0-e553fe67ae82@kernel.org
[4] [PATCH v2 00/19] backlight: Constify lcd_ops
    https://lore.kernel.org/all/20240424-video-backlight-lcd-ops-v2-0-1aaa82b07bc6@kernel.org
[5] [PATCH 1/4] iommu: constify pointer to bus_type
    https://lore.kernel.org/all/20240216144027.185959-1-krzysztof.kozlowski@linaro.org
[6] [PATCH 00/29] const xattr tables
    https://lore.kernel.org/all/20230930050033.41174-1-wedsonaf@gmail.com
[7] https://lore.kernel.org/all/20231204-const-sysctl-v2-0-7a5060b11447@weissschuh.net/

[8]
    #!/usr/bin/python3

    import subprocess
    import re

    def exec_cmd( cmd ):
        try:
            result = subprocess.run(cmd, shell=True, text=True, check=True, capture_output=True)
            output_lines = result.stdout.splitlines()
            return output_lines
        except Exception as e:
            print(f"An error occurred: {e}")
            return []

    def remove_tokens_re(lines, regex_patterns, uniq = True):
        filtered_lines = []
        seen_lines = set()
        regexes = [re.compile(pattern) for pattern in regex_patterns]

        for line in lines:
            for regex in regexes:
                line = regex.sub('', line)  # Replace matches with empty string

            if uniq:
                if line not in seen_lines:
                    seen_lines.add(line)
                    filtered_lines.append(line)
            else:
                    filtered_lines.append(line)

        return filtered_lines

    def filter_in_lines(lines, regex_patterns):
        filtered_lines = []
        regexes = [re.compile(pattern) for pattern in regex_patterns]

        for line in lines:
            if any(regex.search(line) for regex in regexes):
                filtered_lines.append(line)

        return filtered_lines

    cmd = "git grep 'static \(const \)\?struct ctl_table '"
    regex_patterns = ['[\}]*;$', ' = \{', '\[.*\]', '.*\.(c|h):[ \t]*static (const )?struct ctl_table ']
    ctl_table_structs = remove_tokens_re(exec_cmd( cmd ), regex_patterns)

    cmd = "readelf -X -Ws build/vmlinux"
    regex_patterns = ['.*OBJECT.*']
    output_lines = filter_in_lines(exec_cmd( cmd ), regex_patterns);

    regex_patterns = ['^.*OBJECT', '[ \t]+[A-Z]+[ \t]+[A-Z]+.*\(.*\)[ \t]+']
    obj_elems = remove_tokens_re( output_lines, regex_patterns, uniq=False)

    regex_patterns = ['^.*\(', '\)[ ]+.*$']
    sec_names = remove_tokens_re( output_lines, regex_patterns, uniq=False)

    for i in range(len(sec_names)):
        obj_name = obj_elems[i]
        if obj_name in ctl_table_structs:
            print ("section: {}\t\tobj_name : {}". format(sec_names[i], obj_name))

[9]
    section: .rodata                obj_name : kern_table
    section: .rodata                obj_name : sysctl_mount_point
    section: .rodata                obj_name : addrconf_sysctl
    section: .rodata                obj_name : ax25_param_table
    section: .rodata                obj_name : mpls_table
    section: .rodata                obj_name : mpls_dev_table
    section: .data          obj_name : sld_sysctls
    section: .data          obj_name : kern_panic_table
    section: .data          obj_name : kern_exit_table
    section: .data          obj_name : vm_table
    section: .data          obj_name : signal_debug_table
    section: .data          obj_name : usermodehelper_table
    section: .data          obj_name : kern_reboot_table
    section: .data          obj_name : user_table
    section: .bss           obj_name : sched_core_sysctls
    section: .data          obj_name : sched_fair_sysctls
    section: .data          obj_name : sched_rt_sysctls
    section: .data          obj_name : sched_dl_sysctls
    section: .data          obj_name : printk_sysctls
    section: .data          obj_name : pid_ns_ctl_table_vm
    section: .data          obj_name : seccomp_sysctl_table
    section: .data          obj_name : uts_kern_table
    section: .data          obj_name : vm_oom_kill_table
    section: .data          obj_name : vm_page_writeback_sysctls
    section: .data          obj_name : page_alloc_sysctl_table
    section: .data          obj_name : hugetlb_table
    section: .data          obj_name : fs_stat_sysctls
    section: .data          obj_name : fs_exec_sysctls
    section: .data          obj_name : fs_pipe_sysctls
    section: .data          obj_name : namei_sysctls
    section: .data          obj_name : fs_dcache_sysctls
    section: .data          obj_name : inodes_sysctls
    section: .data          obj_name : fs_namespace_sysctls
    section: .data          obj_name : dnotify_sysctls
    section: .data          obj_name : inotify_table
    section: .data          obj_name : epoll_table
    section: .data          obj_name : aio_sysctls
    section: .data          obj_name : locks_sysctls
    section: .data          obj_name : coredump_sysctls
    section: .data          obj_name : fs_shared_sysctls
    section: .data          obj_name : fs_dqstats_table
    section: .data          obj_name : root_table
    section: .data          obj_name : pty_table
    section: .data          obj_name : xfs_table
    section: .data          obj_name : ipc_sysctls
    section: .data          obj_name : key_sysctls
    section: .data          obj_name : kernel_io_uring_disabled_table
    section: .data          obj_name : tty_table
    section: .data          obj_name : random_table
    section: .data          obj_name : scsi_table
    section: .data          obj_name : iwcm_ctl_table
    section: .data          obj_name : net_core_table
    section: .data          obj_name : netns_core_table
    section: .bss           obj_name : nf_log_sysctl_table
    section: .data          obj_name : nf_log_sysctl_ftable
    section: .data          obj_name : vs_vars
    section: .data          obj_name : vs_vars_table
    section: .data          obj_name : ipv4_route_netns_table
    section: .data          obj_name : ipv4_route_table
    section: .data          obj_name : ip4_frags_ns_ctl_table
    section: .data          obj_name : ip4_frags_ctl_table
    section: .data          obj_name : ctl_forward_entry
    section: .data          obj_name : ipv4_table
    section: .data          obj_name : ipv4_net_table
    section: .data          obj_name : unix_table
    section: .data          obj_name : ipv6_route_table_template
    section: .data          obj_name : ipv6_icmp_table_template
    section: .data          obj_name : ip6_frags_ns_ctl_table
    section: .data          obj_name : ip6_frags_ctl_table
    section: .data          obj_name : ipv6_table_template
    section: .data          obj_name : ipv6_rotable
    section: .data          obj_name : sctp_net_table
    section: .data          obj_name : sctp_table
    section: .data          obj_name : smc_table
    section: .data          obj_name : lowpan_frags_ns_ctl_table
    section: .data          obj_name : lowpan_frags_ctl_table

--

Joel Granados
Thomas Weißschuh May 3, 2024, 2:09 p.m. UTC | #8
Hey Joel,

On 2024-05-03 11:03:32+0000, Joel Granados wrote:
> Here is my feedback for your outstanding constification patches [1] and [2].

Thanks!

> # You need to split the patch
> The answer that you got from Jakub in the network subsystem is very clear and
> baring a change of heart from the network folks, this will go in as but as a
> split patchset. Please split it considering the following:
> 1. Create a different patchset for drivers/,  fs/, kernel/, net, and a
>    miscellaneous that includes whatever does not fit into the others.
> 2. Consider that this might take several releases.
> 3. Consider the following sufix for the interim function name "_const". Like in
>    kfree_const. Please not "_new".

Ack. "_new" was an intentionally unacceptable placeholder.

> 4. Please publish the final result somewhere. This is important so someone can
>    take over in case you need to stop.

Will do. Both for each single series and a combination of all of them.

> 5. Consistently mention the motivation in your cover letters. I specify more
>    further down in "#Motivation".
> 6. Also mention that this is part of a bigger effort (like you did in your
>    original cover letters). I would include [3,4,5,6]
> 7. Include a way to show what made it into .rodata. I specify more further down
>    in "#Show the move".
> 
> # Motivation
> As I read it, the motivation for these constification efforts are:
> 1. It provides increased safety: Having things in .rodata section reduces the
>    attack surface. This is especially relevant for structures that have function
>    pointers (like ctl_table); having these in .rodata means that these pointers
>    always point to the "intended" function and cannot be changed.
> 2. Compiler optimizations: This was just a comment in the patchsets that I have
>    mentioned ([3,4,5]). Do you know what optimizations specifically? Does it
>    have to do with enhancing locality for the data in .rodata? Do you have other
>    specific optimizations in mind?

I don't know about anything that would make it faster.
It's more about safety and transmission of intent to API users,
especially callback implementers.

> 3. Readability: because it is easier to know up-front that data is not supposed
>    to change or its obvious that a function is re-entrant. Actually a lot of the
>    readability reasons is about knowing things "up-front".
> As we move forward with the constification in sysctl, please include a more
> detailed motivation in all your cover letters. This helps maintainers (that
> don't have the context) understand what you are trying to do. It does not need
> to be my three points, but it should be more than just "put things into
> .rodata". Please tell me if I have missed anything in the motivation.

Will do.

> # Show the move
> I created [8] because there is no easy way to validate which objects made it
> into .rodata. I ran [8] for your Dec 2nd patcheset [7] and there are less in
> .rodata than I expected (the results are in [9]) Why is that? Is it something
> that has not been posted to the lists yet? 

Constifying the APIs only *allows* the actual table to be constified
themselves.
Then each table definition will have to be touched and "const" added.

See patches 17 and 18 in [7] for two examples.

Some tables in net/ are already "const" as the static definitions are
never registered themselves but only their copies are.

This seems to explain your findings.

> Best

Thanks!

> [1] https://lore.kernel.org/all/20240423-sysctl-const-handler-v3-0-e0beccb836e2@weissschuh.net/
> [2] https://lore.kernel.org/all/20240418-sysctl-const-table-arg-v2-1-4012abc31311@weissschuh.net
> [3] [PATCH v2 00/14] ASoC: Constify local snd_sof_dsp_ops
>     https://lore.kernel.org/all/20240426-n-const-ops-var-v2-0-e553fe67ae82@kernel.org
> [4] [PATCH v2 00/19] backlight: Constify lcd_ops
>     https://lore.kernel.org/all/20240424-video-backlight-lcd-ops-v2-0-1aaa82b07bc6@kernel.org
> [5] [PATCH 1/4] iommu: constify pointer to bus_type
>     https://lore.kernel.org/all/20240216144027.185959-1-krzysztof.kozlowski@linaro.org
> [6] [PATCH 00/29] const xattr tables
>     https://lore.kernel.org/all/20230930050033.41174-1-wedsonaf@gmail.com
> [7] https://lore.kernel.org/all/20231204-const-sysctl-v2-0-7a5060b11447@weissschuh.net/
> 
> [8]

[snip]

> [9]
>     section: .rodata                obj_name : kern_table
>     section: .rodata                obj_name : sysctl_mount_point
>     section: .rodata                obj_name : addrconf_sysctl
>     section: .rodata                obj_name : ax25_param_table
>     section: .rodata                obj_name : mpls_table
>     section: .rodata                obj_name : mpls_dev_table
>     section: .data          obj_name : sld_sysctls
>     section: .data          obj_name : kern_panic_table
>     section: .data          obj_name : kern_exit_table
>     section: .data          obj_name : vm_table
>     section: .data          obj_name : signal_debug_table
>     section: .data          obj_name : usermodehelper_table
>     section: .data          obj_name : kern_reboot_table
>     section: .data          obj_name : user_table
>     section: .bss           obj_name : sched_core_sysctls
>     section: .data          obj_name : sched_fair_sysctls
>     section: .data          obj_name : sched_rt_sysctls
>     section: .data          obj_name : sched_dl_sysctls
>     section: .data          obj_name : printk_sysctls
>     section: .data          obj_name : pid_ns_ctl_table_vm
>     section: .data          obj_name : seccomp_sysctl_table
>     section: .data          obj_name : uts_kern_table
>     section: .data          obj_name : vm_oom_kill_table
>     section: .data          obj_name : vm_page_writeback_sysctls
>     section: .data          obj_name : page_alloc_sysctl_table
>     section: .data          obj_name : hugetlb_table
>     section: .data          obj_name : fs_stat_sysctls
>     section: .data          obj_name : fs_exec_sysctls
>     section: .data          obj_name : fs_pipe_sysctls
>     section: .data          obj_name : namei_sysctls
>     section: .data          obj_name : fs_dcache_sysctls
>     section: .data          obj_name : inodes_sysctls
>     section: .data          obj_name : fs_namespace_sysctls
>     section: .data          obj_name : dnotify_sysctls
>     section: .data          obj_name : inotify_table
>     section: .data          obj_name : epoll_table
>     section: .data          obj_name : aio_sysctls
>     section: .data          obj_name : locks_sysctls
>     section: .data          obj_name : coredump_sysctls
>     section: .data          obj_name : fs_shared_sysctls
>     section: .data          obj_name : fs_dqstats_table
>     section: .data          obj_name : root_table
>     section: .data          obj_name : pty_table
>     section: .data          obj_name : xfs_table
>     section: .data          obj_name : ipc_sysctls
>     section: .data          obj_name : key_sysctls
>     section: .data          obj_name : kernel_io_uring_disabled_table
>     section: .data          obj_name : tty_table
>     section: .data          obj_name : random_table
>     section: .data          obj_name : scsi_table
>     section: .data          obj_name : iwcm_ctl_table
>     section: .data          obj_name : net_core_table
>     section: .data          obj_name : netns_core_table
>     section: .bss           obj_name : nf_log_sysctl_table
>     section: .data          obj_name : nf_log_sysctl_ftable
>     section: .data          obj_name : vs_vars
>     section: .data          obj_name : vs_vars_table
>     section: .data          obj_name : ipv4_route_netns_table
>     section: .data          obj_name : ipv4_route_table
>     section: .data          obj_name : ip4_frags_ns_ctl_table
>     section: .data          obj_name : ip4_frags_ctl_table
>     section: .data          obj_name : ctl_forward_entry
>     section: .data          obj_name : ipv4_table
>     section: .data          obj_name : ipv4_net_table
>     section: .data          obj_name : unix_table
>     section: .data          obj_name : ipv6_route_table_template
>     section: .data          obj_name : ipv6_icmp_table_template
>     section: .data          obj_name : ip6_frags_ns_ctl_table
>     section: .data          obj_name : ip6_frags_ctl_table
>     section: .data          obj_name : ipv6_table_template
>     section: .data          obj_name : ipv6_rotable
>     section: .data          obj_name : sctp_net_table
>     section: .data          obj_name : sctp_table
>     section: .data          obj_name : smc_table
>     section: .data          obj_name : lowpan_frags_ns_ctl_table
>     section: .data          obj_name : lowpan_frags_ctl_table
Joel Granados May 8, 2024, 11:37 a.m. UTC | #9
Kees

Could you comment on the feasibility of this alternative from the
Control Flow Integrity perspective. My proposal is to change the
proc_handler to void* and back in the same release. So there would not
be a kernel released with a void* proc_handler.

> > However, there is an alternative way to do this that allows chunking. We
> > first define the proc_handler as a void pointer (casting it where it is
> > being used) [1]. Then we could do the constification by subsystem (like
> > Jakub proposes). Finally we can "revert the void pointer change so we
> > don't have one size fit all pointer as our proc_handler [2].
> > 
> > Here are some comments about the alternative:
> > 1. We would need to make the first argument const in all the derived
> >    proc_handlers [3] 
> > 2. There would be no undefined behavior for two reasons:
> >    2.1. There is no case where we change the first argument. We know
> >         this because there are no compile errors after we make it const.
> >    2.2. We would always go from non-const to const. This is the case
> >         because all the stuff that is unchanged in non-const.
> > 3. If the idea sticks, it should go into mainline as one patchset. I
> >    would not like to have a void* proc_handler in a kernel release.
> > 4. I think this is a "win/win" solution were the constification goes
> >    through and it is divided in such a way that it is reviewable.
> > 
> > I would really like to hear what ppl think about this "heretic"
> > alternative. @Thomas, @Luis, @Kees @Jakub?
> 
> Thanks for that alternative, I'm not a big fan though.
> 
> Besides the wonky syntax, Control Flow Integrity should trap on
> this construct. Functions are called through different pointers than
> their actual types which is exactly what CFI is meant to prevent.
> 
> Maybe people find it easier to review when using
> "--word-diff" and/or "-U0" with git diff/show.
> There is really nothing going an besides adding a few "const"s.
> 
> But if the consensus prefers this solution, I'll be happy to adopt it.
> 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/commit/?h=jag/constfy_treewide_alternative&id=4a383503b1ea650d4e12c1f5838974e879f5aa6f
> > [2] https://git.kernel.org/pub/scm/linux/kernel/git/joel.granados/linux.git/commit/?h=jag/constfy_treewide_alternative&id=a3be65973d27ec2933b9e81e1bec60be3a9b460d
> > [3] proc_dostring, proc_dobool, proc_dointvec....
> 
> 
> Thomas

Best
Joel Granados May 8, 2024, 11:40 a.m. UTC | #10
On Fri, May 03, 2024 at 04:09:40PM +0200, Thomas Weißschuh wrote:
> Hey Joel,
> 
...
> > # Motivation
> > As I read it, the motivation for these constification efforts are:
> > 1. It provides increased safety: Having things in .rodata section reduces the
> >    attack surface. This is especially relevant for structures that have function
> >    pointers (like ctl_table); having these in .rodata means that these pointers
> >    always point to the "intended" function and cannot be changed.
> > 2. Compiler optimizations: This was just a comment in the patchsets that I have
> >    mentioned ([3,4,5]). Do you know what optimizations specifically? Does it
> >    have to do with enhancing locality for the data in .rodata? Do you have other
> >    specific optimizations in mind?
> 
> I don't know about anything that would make it faster.
> It's more about safety and transmission of intent to API users,
> especially callback implementers.
Noted.

...
> > # Show the move
> > I created [8] because there is no easy way to validate which objects made it
> > into .rodata. I ran [8] for your Dec 2nd patcheset [7] and there are less in
> > .rodata than I expected (the results are in [9]) Why is that? Is it something
> > that has not been posted to the lists yet? 
> 
> Constifying the APIs only *allows* the actual table to be constified
> themselves.
> Then each table definition will have to be touched and "const" added.
That is what I thought. thx for clarifying.

> 
> See patches 17 and 18 in [7] for two examples.
> 
> Some tables in net/ are already "const" as the static definitions are
> never registered themselves but only their copies are.
> 
...

best
Kees Cook May 8, 2024, 5:11 p.m. UTC | #11
On Wed, Apr 24, 2024 at 08:12:34PM -0700, Jakub Kicinski wrote:
> On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > The series was split from my larger series sysctl-const series [0].
> > It only focusses on the proc_handlers but is an important step to be
> > able to move all static definitions of ctl_table into .rodata.
> 
> Split this per subsystem, please.

I've done a few painful API transitions before, and I don't think the
complexity of these changes needs a per-subsystem constification pass. I
think this series is the right approach, but that patch 11 will need
coordination with Linus. We regularly do system-wide prototype changes
like this right at the end of the merge window before -rc1 comes out.

The requirements are pretty simple: it needs to be a obvious changes
(this certainly is) and as close to 100% mechanical as possible. I think
patch 11 easily qualifies. Linus should be able to run the same Coccinelle
script and get nearly the same results, etc. And all the other changes
need to have landed. This change also has no "silent failure" conditions:
anything mismatched will immediately stand out.

So, have patches 1-10 go via their respective subsystems, and once all
of those are in Linus's tree, send patch 11 as a stand-alone PR.

(From patch 11, it looks like the seccomp read/write function changes
could be split out? I'll do that now...)

-Kees
Jakub Kicinski May 9, 2024, 1 a.m. UTC | #12
On Wed, 8 May 2024 10:11:35 -0700 Kees Cook wrote:
> > Split this per subsystem, please.  
> 
> I've done a few painful API transitions before, and I don't think the
> complexity of these changes needs a per-subsystem constification pass. I
> think this series is the right approach, but that patch 11 will need
> coordination with Linus. We regularly do system-wide prototype changes
> like this right at the end of the merge window before -rc1 comes out.

Right. I didn't read the code closely enough before responding.
Chalk my response up to being annoyed by the constant stream of
cross-tree changes in procfs without proper cover letter explaining 
how they will be merged :|
Thomas Weißschuh May 11, 2024, 9:51 a.m. UTC | #13
Hi Kees,

On 2024-05-08 10:11:35+0000, Kees Cook wrote:
> On Wed, Apr 24, 2024 at 08:12:34PM -0700, Jakub Kicinski wrote:
> > On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > > The series was split from my larger series sysctl-const series [0].
> > > It only focusses on the proc_handlers but is an important step to be
> > > able to move all static definitions of ctl_table into .rodata.
> > 
> > Split this per subsystem, please.
> 
> I've done a few painful API transitions before, and I don't think the
> complexity of these changes needs a per-subsystem constification pass. I
> think this series is the right approach, but that patch 11 will need
> coordination with Linus. We regularly do system-wide prototype changes
> like this right at the end of the merge window before -rc1 comes out.

That sounds good.

> The requirements are pretty simple: it needs to be a obvious changes
> (this certainly is) and as close to 100% mechanical as possible. I think
> patch 11 easily qualifies. Linus should be able to run the same Coccinelle
> script and get nearly the same results, etc. And all the other changes
> need to have landed. This change also has no "silent failure" conditions:
> anything mismatched will immediately stand out.

Unfortunately coccinelle alone is not sufficient, as some helpers with
different prototypes are called by handlers and themselves are calling
handler and therefore need to change in the same commit.
But if I add a diff for those on top of the coccinelle script to the
changelog it should be obvious.

> So, have patches 1-10 go via their respective subsystems, and once all
> of those are in Linus's tree, send patch 11 as a stand-alone PR.

Ack, I'll do that with the cover letter information requested by Joel.

> (From patch 11, it looks like the seccomp read/write function changes
> could be split out? I'll do that now...)

Thanks!

Thomas
Joel Granados May 12, 2024, 7:32 p.m. UTC | #14
On Sat, May 11, 2024 at 11:51:18AM +0200, Thomas Weißschuh wrote:
> Hi Kees,
> 
> On 2024-05-08 10:11:35+0000, Kees Cook wrote:
> > On Wed, Apr 24, 2024 at 08:12:34PM -0700, Jakub Kicinski wrote:
> > > On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > > > The series was split from my larger series sysctl-const series [0].
> > > > It only focusses on the proc_handlers but is an important step to be
> > > > able to move all static definitions of ctl_table into .rodata.
> > > 
> > > Split this per subsystem, please.
> > 
> > I've done a few painful API transitions before, and I don't think the
> > complexity of these changes needs a per-subsystem constification pass. I
> > think this series is the right approach, but that patch 11 will need
> > coordination with Linus. We regularly do system-wide prototype changes
> > like this right at the end of the merge window before -rc1 comes out.
> 
> That sounds good.
> 
> > The requirements are pretty simple: it needs to be a obvious changes
> > (this certainly is) and as close to 100% mechanical as possible. I think
> > patch 11 easily qualifies. Linus should be able to run the same Coccinelle
> > script and get nearly the same results, etc. And all the other changes
> > need to have landed. This change also has no "silent failure" conditions:
> > anything mismatched will immediately stand out.
> 
> Unfortunately coccinelle alone is not sufficient, as some helpers with
> different prototypes are called by handlers and themselves are calling
> handler and therefore need to change in the same commit.
> But if I add a diff for those on top of the coccinelle script to the
> changelog it should be obvious.
Judging by Kees' comment on "100% mechanical", it might be better just
having the diff and have Linus apply than rather than two step process?
Have not these types of PRs, so am interested in what folks think.

> 
> > So, have patches 1-10 go via their respective subsystems, and once all
> > of those are in Linus's tree, send patch 11 as a stand-alone PR.
> 
> Ack, I'll do that with the cover letter information requested by Joel.
> 
> > (From patch 11, it looks like the seccomp read/write function changes
> > could be split out? I'll do that now...)
> 
> Thanks!
> 
> Thomas
Kees Cook May 13, 2024, 2:57 a.m. UTC | #15
On Sun, May 12, 2024 at 09:32:40PM +0200, Joel Granados wrote:
> On Sat, May 11, 2024 at 11:51:18AM +0200, Thomas Weißschuh wrote:
> > Hi Kees,
> > 
> > On 2024-05-08 10:11:35+0000, Kees Cook wrote:
> > > On Wed, Apr 24, 2024 at 08:12:34PM -0700, Jakub Kicinski wrote:
> > > > On Tue, 23 Apr 2024 09:54:35 +0200 Thomas Weißschuh wrote:
> > > > > The series was split from my larger series sysctl-const series [0].
> > > > > It only focusses on the proc_handlers but is an important step to be
> > > > > able to move all static definitions of ctl_table into .rodata.
> > > > 
> > > > Split this per subsystem, please.
> > > 
> > > I've done a few painful API transitions before, and I don't think the
> > > complexity of these changes needs a per-subsystem constification pass. I
> > > think this series is the right approach, but that patch 11 will need
> > > coordination with Linus. We regularly do system-wide prototype changes
> > > like this right at the end of the merge window before -rc1 comes out.
> > 
> > That sounds good.
> > 
> > > The requirements are pretty simple: it needs to be a obvious changes
> > > (this certainly is) and as close to 100% mechanical as possible. I think
> > > patch 11 easily qualifies. Linus should be able to run the same Coccinelle
> > > script and get nearly the same results, etc. And all the other changes
> > > need to have landed. This change also has no "silent failure" conditions:
> > > anything mismatched will immediately stand out.
> > 
> > Unfortunately coccinelle alone is not sufficient, as some helpers with
> > different prototypes are called by handlers and themselves are calling
> > handler and therefore need to change in the same commit.
> > But if I add a diff for those on top of the coccinelle script to the
> > changelog it should be obvious.
> Judging by Kees' comment on "100% mechanical", it might be better just
> having the diff and have Linus apply than rather than two step process?
> Have not these types of PRs, so am interested in what folks think.

I tried to soften it a little with my "*close* to 100%" modifier, and
I think that patch basically matched that requirement, and where it had
manual changes it was detailed in the commit log. I only split out the
seccomp part because it could actually stand alone.

So yeah, let's get the last of the subsystem specific stuff landed after
-rc1, and it should be possible to finish it all up for 6.11. Yay! :)

-Kees