Message ID | 8c8e16ae-87a2-44bf-a87b-7422eb04fec2@ramsayjones.plus.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [01/12] meson.build: remove -DCURL_DISABLE_TYPECHECK | expand |
On 2025-03-15 at 02:49:18, Ramsay Jones wrote: > > Commit 05cd988dce ("wrapper: add a helper to generate numbers from a > CSPRNG", 2022-01-17) added a csprng_bytes() function which used one > of several interfaces to provide a source of cryptographically secure > pseudorandom numbers. The CSPRNG_METHOD make variable was provided to > determine the choice of available 'backends' for the source of random > bytes. > > Commit 05cd988dce did not set CSPRNG_METHOD in the Linux section of > the config.mak.uname file, so it defaults to using '/dev/urandom' as > the source of random bytes. The 'backend' values which could be used > on Linux, in order of preference, are 'arc4random', 'getrandom' or > 'getentropy' ('openssl' is an option, but seems to be discouraged). > > The arc4random routines (ar4random_buf() is the one actually used) were > added to glibc in version 2.36, while both getrandom() and getentropy() > were included in 2.25. So, some of the more up-to-date distributions of > Linux (eg Debian 12, Ubuntu 24.04) would be able to use the preferred > 'arc4random' setting. > > If the meson build system is used on a newer platform, then they will be > configured to use 'arc4random', whereas the make build will currently > default to using '/dev/urandom'. Add a note to the config.mak.uname file, > in the Linux section, to prompt make users to override CSPRNG_METHOD in > the config.mak file, if appropriate. arc4random operates differently on Linux than it does on the BSDs, and the right choice on Linux is `getrandom`. The reason is that on the BSDs, a userspace ChaCha20 which is seeded from the kernel is used, along with an integer representing whether it's inititalize, and this state is stored in a page that is zeroed on fork, so that it automatically becomes uninitialized then (and is hence reseeded). Because it is in userspace, it avoids the overhead of a syscall, and is thus usually faster. arc4random has also been around longer than getrandom or getentropy on the BSDs and is widely supported there, and so it's generally the right choice (and hence, the default). When arc4random was added to glibc, the Linux kernel CSPRNG maintainer argued that it was not a secure approach (I disagree), and convinced the glibc maintainers to just make it a wrapper around the Linux kernel CSPRNG, which it now is. So there's no actual benefit to calling arc4random versus getrandom, and since it's newer and less commonly available than getrandom, as well as slightly slower (because of an extra function call), getrandom should be preferred. All Linux distros within our current support window have glibc 2.25 or newer (RHEL 8 being the oldest one), so we may want to just default to getrandom on Linux.
"brian m. carlson" <sandals@crustytoothpaste.net> writes: > When arc4random was added to glibc, the Linux kernel CSPRNG maintainer > argued that it was not a secure approach (I disagree), and convinced the > glibc maintainers to just make it a wrapper around the Linux kernel > CSPRNG, which it now is. So there's no actual benefit to calling > arc4random versus getrandom, and since it's newer and less commonly > available than getrandom, as well as slightly slower (because of an > extra function call), getrandom should be preferred. This https://www.phoronix.com/news/GNU-Glibc-arc4random-Functions was the first hit of my search in the area, but I think you are referring to https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=eaad4f9 that happened 5 days after the thing got in and the code there tells me that your summary of the situation is quite accurate. So I agree that dropping this patch makes sense, but do we want to do a bit more to improve the situation? Here is an attempt to improve what we have in Makefile (and possibly the Linux section in config.mak.uname, but that is improving what we do not have) to tell folks that arc4random in glibc is only for compatibility and they should pick getrandom() until the situation changes. --- >8 --- Subject: config/Makefile: a note on CSPRNG_METHOD choice for Linux arc4random() was added to glibc in July 2022, but quickly replaced by a stub implementation that wraps around getrandom(). Hence there is no actual benefit to calling arc4random() over getrandom() on glibc based systems, at least for now. To avoid enticing Linux users to choose arc4random(), leave a note that their arc4random() in glibc is not the same as what their friends use on other platforms, and guide them to use getrandom() instead in the meantime. Helped-by: "brian m. carlson" <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> --- Makefile | 5 +++-- config.mak.uname | 2 ++ 2 files changed, 5 insertions(+), 2 deletions(-) diff --git c/Makefile w/Makefile index 7315507381..7214936295 100644 --- c/Makefile +++ w/Makefile @@ -155,8 +155,9 @@ include shared.mak # Define NO_TRUSTABLE_FILEMODE if your filesystem may claim to support # the executable mode bit, but doesn't really do so. # -# Define CSPRNG_METHOD to "arc4random" if your system has arc4random and -# arc4random_buf, "libbsd" if your system has those functions from libbsd, +# Define CSPRNG_METHOD to "arc4random" if your system has true arc4random and +# arc4random_buf (not wrappers around "getrandom" shipped with glibc), +# "libbsd" if your system has those functions from libbsd, # "getrandom" if your system has getrandom, "getentropy" if your system has # getentropy, "rtlgenrandom" for RtlGenRandom (Windows only), or "openssl" if # you'd want to use the OpenSSL CSPRNG. You may set multiple options with diff --git c/config.mak.uname w/config.mak.uname index b12d4e168a..6bf511f24b 100644 --- c/config.mak.uname +++ w/config.mak.uname @@ -58,6 +58,8 @@ ifeq ($(uname_S),Linux) NEEDS_LIBRT = YesPlease HAVE_SYNC_FILE_RANGE = YesPlease HAVE_GETDELIM = YesPlease + # note: don't choose arc4random on glibc systems + CSPRNG_METHOD = FREAD_READS_DIRECTORIES = UnfortunatelyYes BASIC_CFLAGS += -DHAVE_SYSINFO PROCFS_EXECUTABLE_PATH = /proc/self/exe
Hi Brian, On 16/03/2025 00:28, brian m. carlson wrote: > On 2025-03-15 at 02:49:18, Ramsay Jones wrote: [snip] >> If the meson build system is used on a newer platform, then they will be >> configured to use 'arc4random', whereas the make build will currently >> default to using '/dev/urandom'. Add a note to the config.mak.uname file, >> in the Linux section, to prompt make users to override CSPRNG_METHOD in >> the config.mak file, if appropriate. > > arc4random operates differently on Linux than it does on the BSDs, and > the right choice on Linux is `getrandom`. > > The reason is that on the BSDs, a userspace ChaCha20 which is seeded > from the kernel is used, along with an integer representing whether it's > inititalize, and this state is stored in a page that is zeroed on fork, > so that it automatically becomes uninitialized then (and is hence > reseeded). Because it is in userspace, it avoids the overhead of a > syscall, and is thus usually faster. arc4random has also been around > longer than getrandom or getentropy on the BSDs and is widely supported > there, and so it's generally the right choice (and hence, the default). > > When arc4random was added to glibc, the Linux kernel CSPRNG maintainer > argued that it was not a secure approach (I disagree), and convinced the > glibc maintainers to just make it a wrapper around the Linux kernel > CSPRNG, which it now is. So there's no actual benefit to calling > arc4random versus getrandom, and since it's newer and less commonly > available than getrandom, as well as slightly slower (because of an > extra function call), getrandom should be preferred. Ah, OK, thanks! While researching this I was only concerned about when the functions were available. I didn't give quality/performance a minutes thought! ;) [I am aware, of course, that newer doesn't automatically mean better. It seems I forgot about that here - my bad! :( ] However, this afternoon, while I was waiting for the 'meson test' on cygwin to finish, I had a quick look at the implementation(s) on glibc and cygwin. On cygwin, the arc4random_buf() implementation seems to have been imported from OpenBSD, and uses a chacha_encrypt_bytes() function call during the process of creating the random bytes (see newlib/libc/stdlib/arc4random.c in the cygwin repo [0]). Also, the getrandom() and getentropy() functions are simple wrappers around an RtlGenRandom() call (see winsup/cygwin/libc/\ getentropy.cc in [0]). The glibc implementation of arc4random_buf() (see [1]), as you say, is just a simple wrapper around the Linux 'getrandom syscall'. In addition, we can also confirm that getrandom() (see [2]) and getentropy() (see [3]) are also simple wrappers around the 'getrandom syscall'. However, I don't see the 'extra function call' you refer to above. (Yes, all the layers of macros does obscure things somewhat, but I don't see that extra function call). What am I missing? [I haven't actually done any tests for quality/performance, so I am quite prepared to take your word for it! :) ] As you say, arc4random() is less available on Linux, so getrandom() makes for a better default. Anyway, I guess that means the meson build needs to be modified, since it currently selects arc4random() on Linux (this is OK on cygwin, see above). [I was writing an autoconf test for this, which will also need to change!] [0] git://cygwin.com/git/newlib-cygwin.git [1] https://codebrowser.dev/glibc/glibc/stdlib/arc4random.c.html [2] https://codebrowser.dev/glibc/glibc/sysdeps/unix/sysv/linux/getrandom.c.html [3] https://codebrowser.dev/glibc/glibc/sysdeps/unix/sysv/linux/getentropy.c.html > > All Linux distros within our current support window have glibc 2.25 or > newer (RHEL 8 being the oldest one), so we may want to just default to > getrandom on Linux. So, Yes, this patch needs to be dropped. Thanks! ATB, Ramsay Jones
On 16/03/2025 20:41, Junio C Hamano wrote: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > >> When arc4random was added to glibc, the Linux kernel CSPRNG maintainer >> argued that it was not a secure approach (I disagree), and convinced the >> glibc maintainers to just make it a wrapper around the Linux kernel >> CSPRNG, which it now is. So there's no actual benefit to calling >> arc4random versus getrandom, and since it's newer and less commonly >> available than getrandom, as well as slightly slower (because of an >> extra function call), getrandom should be preferred. > > This > > https://www.phoronix.com/news/GNU-Glibc-arc4random-Functions > > was the first hit of my search in the area, but I think you are > referring to > > https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=eaad4f9 > > that happened 5 days after the thing got in and the code there tells > me that your summary of the situation is quite accurate. > > So I agree that dropping this patch makes sense, but do we want to > do a bit more to improve the situation? > Yes, this patch should be dropped. (See my reply to Brian; our emails crossed!). ATB, Ramsay Jones
On 2025-03-16 at 21:51:45, Ramsay Jones wrote: > Ah, OK, thanks! While researching this I was only concerned about when > On cygwin, the arc4random_buf() implementation seems to have been imported > from OpenBSD, and uses a chacha_encrypt_bytes() function call during the > process of creating the random bytes (see newlib/libc/stdlib/arc4random.c > in the cygwin repo [0]). Also, the getrandom() and getentropy() functions > are simple wrappers around an RtlGenRandom() call (see winsup/cygwin/libc/\ > getentropy.cc in [0]). Yeah, that's similar to what I'd expect to see on the BSDs, except that `getrandom` or `getentropy` is the actual system call. > The glibc implementation of arc4random_buf() (see [1]), as you say, is just > a simple wrapper around the Linux 'getrandom syscall'. In addition, we can > also confirm that getrandom() (see [2]) and getentropy() (see [3]) are also > simple wrappers around the 'getrandom syscall'. However, I don't see the > 'extra function call' you refer to above. (Yes, all the layers of macros does > obscure things somewhat, but I don't see that extra function call). There's a call to `__getrandom_nocancel` in a loop, which I believe is the function wrapping the system call. It's probably inlined, so it's not that big a deal, though. > As you say, arc4random() is less available on Linux, so getrandom() makes > for a better default. > > Anyway, I guess that means the meson build needs to be modified, since it > currently selects arc4random() on Linux (this is OK on cygwin, see above). It's not the end of the world if the meson build system automatically prefers `arc4random` over `getrandom`. That's a sensible default in the general case, and I believe that the use case is generating temporary files, which will always incur the cost of a context switch and a disk access, so in general the performance difference would be negligible anyway. It's not a hot path, since nobody expects us to write millions of loose objects, pack files, or other temporary files in the typical case. I _do_ think that if we're specifying a default in `config.mak` on Linux, `getrandom` is the right choice, and I think we can probably just hard-code that now and let people override it if they're using a system older than our supported environments that needs `/dev/urandom`. POSIX 1003.1-2024 specifies `getentropy`, so my guess is that we'll see that on more and more systems as we go on, and the other choices will be less and less necessary, but we're not there yet.
On Sun, Mar 16, 2025 at 01:41:40PM -0700, Junio C Hamano wrote: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > > > When arc4random was added to glibc, the Linux kernel CSPRNG maintainer > > argued that it was not a secure approach (I disagree), and convinced the > > glibc maintainers to just make it a wrapper around the Linux kernel > > CSPRNG, which it now is. So there's no actual benefit to calling > > arc4random versus getrandom, and since it's newer and less commonly > > available than getrandom, as well as slightly slower (because of an > > extra function call), getrandom should be preferred. > > This > > https://www.phoronix.com/news/GNU-Glibc-arc4random-Functions > > was the first hit of my search in the area, but I think you are > referring to > > https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=eaad4f9 > > that happened 5 days after the thing got in and the code there tells > me that your summary of the situation is quite accurate. > > So I agree that dropping this patch makes sense, but do we want to > do a bit more to improve the situation? > > Here is an attempt to improve what we have in Makefile (and possibly > the Linux section in config.mak.uname, but that is improving what we > do not have) to tell folks that arc4random in glibc is only for > compatibility and they should pick getrandom() until the situation > changes. > > --- >8 --- > Subject: config/Makefile: a note on CSPRNG_METHOD choice for Linux > > arc4random() was added to glibc in July 2022, but quickly replaced > by a stub implementation that wraps around getrandom(). Hence there > is no actual benefit to calling arc4random() over getrandom() on > glibc based systems, at least for now. > > To avoid enticing Linux users to choose arc4random(), leave a note > that their arc4random() in glibc is not the same as what their > friends use on other platforms, and guide them to use getrandom() > instead in the meantime. Makes me wonder whether we should also change the order in which Meson auto-detects functions. That is, do we want the following patch that favors getrandom over arc4random? Patrick diff --git a/meson.build b/meson.build index d6e27b236fa..501b2becabb 100644 --- a/meson.build +++ b/meson.build @@ -1481,15 +1481,15 @@ endif # Backends are ordered to reflect our preference for more secure and faster # ones over the ones that are less so. -if csprng_backend in ['auto', 'arc4random'] and compiler.has_header_symbol('stdlib.h', 'arc4random_buf', required: csprng_backend == 'arc4random') +if csprng_backend in ['auto', 'getrandom'] and compiler.has_header_symbol('sys/random.h', 'getrandom', required: csprng_backend == 'getrandom') + libgit_c_args += '-DHAVE_GETRANDOM' + csprng_backend = 'getrandom' +elif csprng_backend in ['auto', 'arc4random'] and compiler.has_header_symbol('stdlib.h', 'arc4random_buf', required: csprng_backend == 'arc4random') libgit_c_args += '-DHAVE_ARC4RANDOM' csprng_backend = 'arc4random' elif csprng_backend in ['auto', 'arc4random_bsd'] and compiler.has_header_symbol('bsd/stdlib.h', 'arc4random_buf', required: csprng_backend == 'arc4random_bsd') libgit_c_args += '-DHAVE_ARC4RANDOM_BSD' csprng_backend = 'arc4random_bsd' -elif csprng_backend in ['auto', 'getrandom'] and compiler.has_header_symbol('sys/random.h', 'getrandom', required: csprng_backend == 'getrandom') - libgit_c_args += '-DHAVE_GETRANDOM' - csprng_backend = 'getrandom' elif csprng_backend in ['auto', 'getentropy'] and compiler.has_header_symbol('unistd.h', 'getentropy', required: csprng_backend == 'getentropy') libgit_c_args += '-DHAVE_GETENTROPY' csprng_backend = 'getentropy'
On 19/03/2025 13:30, Patrick Steinhardt wrote: > On Sun, Mar 16, 2025 at 01:41:40PM -0700, Junio C Hamano wrote: >> "brian m. carlson" <sandals@crustytoothpaste.net> writes: >> >>> When arc4random was added to glibc, the Linux kernel CSPRNG maintainer >>> argued that it was not a secure approach (I disagree), and convinced the >>> glibc maintainers to just make it a wrapper around the Linux kernel >>> CSPRNG, which it now is. So there's no actual benefit to calling >>> arc4random versus getrandom, and since it's newer and less commonly >>> available than getrandom, as well as slightly slower (because of an >>> extra function call), getrandom should be preferred. >> >> This >> >> https://www.phoronix.com/news/GNU-Glibc-arc4random-Functions >> >> was the first hit of my search in the area, but I think you are >> referring to >> >> https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=eaad4f9 >> >> that happened 5 days after the thing got in and the code there tells >> me that your summary of the situation is quite accurate. >> >> So I agree that dropping this patch makes sense, but do we want to >> do a bit more to improve the situation? >> >> Here is an attempt to improve what we have in Makefile (and possibly >> the Linux section in config.mak.uname, but that is improving what we >> do not have) to tell folks that arc4random in glibc is only for >> compatibility and they should pick getrandom() until the situation >> changes. >> >> --- >8 --- >> Subject: config/Makefile: a note on CSPRNG_METHOD choice for Linux >> >> arc4random() was added to glibc in July 2022, but quickly replaced >> by a stub implementation that wraps around getrandom(). Hence there >> is no actual benefit to calling arc4random() over getrandom() on >> glibc based systems, at least for now. >> >> To avoid enticing Linux users to choose arc4random(), leave a note >> that their arc4random() in glibc is not the same as what their >> friends use on other platforms, and guide them to use getrandom() >> instead in the meantime. > > Makes me wonder whether we should also change the order in which Meson > auto-detects functions. That is, do we want the following patch that > favors getrandom over arc4random? > That was my immediate thought also. :) ATB, Ramsay Jones
On Thu, Mar 20, 2025 at 01:28:31AM +0000, Ramsay Jones wrote: > > > On 19/03/2025 13:30, Patrick Steinhardt wrote: > > On Sun, Mar 16, 2025 at 01:41:40PM -0700, Junio C Hamano wrote: > >> "brian m. carlson" <sandals@crustytoothpaste.net> writes: > >> > >>> When arc4random was added to glibc, the Linux kernel CSPRNG maintainer > >>> argued that it was not a secure approach (I disagree), and convinced the > >>> glibc maintainers to just make it a wrapper around the Linux kernel > >>> CSPRNG, which it now is. So there's no actual benefit to calling > >>> arc4random versus getrandom, and since it's newer and less commonly > >>> available than getrandom, as well as slightly slower (because of an > >>> extra function call), getrandom should be preferred. > >> > >> This > >> > >> https://www.phoronix.com/news/GNU-Glibc-arc4random-Functions > >> > >> was the first hit of my search in the area, but I think you are > >> referring to > >> > >> https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=eaad4f9 > >> > >> that happened 5 days after the thing got in and the code there tells > >> me that your summary of the situation is quite accurate. > >> > >> So I agree that dropping this patch makes sense, but do we want to > >> do a bit more to improve the situation? > >> > >> Here is an attempt to improve what we have in Makefile (and possibly > >> the Linux section in config.mak.uname, but that is improving what we > >> do not have) to tell folks that arc4random in glibc is only for > >> compatibility and they should pick getrandom() until the situation > >> changes. > >> > >> --- >8 --- > >> Subject: config/Makefile: a note on CSPRNG_METHOD choice for Linux > >> > >> arc4random() was added to glibc in July 2022, but quickly replaced > >> by a stub implementation that wraps around getrandom(). Hence there > >> is no actual benefit to calling arc4random() over getrandom() on > >> glibc based systems, at least for now. > >> > >> To avoid enticing Linux users to choose arc4random(), leave a note > >> that their arc4random() in glibc is not the same as what their > >> friends use on other platforms, and guide them to use getrandom() > >> instead in the meantime. > > > > Makes me wonder whether we should also change the order in which Meson > > auto-detects functions. That is, do we want the following patch that > > favors getrandom over arc4random? > > > > That was my immediate thought also. :) Okay. Will you pick it up in v2 of this patch series? Thanks! Patrick
On 20/03/2025 05:20, Patrick Steinhardt wrote: [snip] >> That was my immediate thought also. :) > > Okay. Will you pick it up in v2 of this patch series? Heh, well, that was my immediate thought, but having thought some more, and considering Brian's earlier email response, I have thought again. ;) Without a wholesale change to the logic (in order to make it platform siloed), it is probably best to just set the Linux default to 'getrandom'. (I can override in config.mak for my tests, or just ignore that difference now I know :) ). The issue is that each platform has a different priority order for the 'backends' (even multiple orders depending on the criteria, eg. availability, performance), so the current order is probably fine on newer Linux systems. However, to cater to older systems, it would be best to default to getrandom. [Note cygwin and *BSD systems should prefer arc4random, on Linux all of the three main options (arc4random, getrandom, getentropy) are pretty much the same from a performance perspective, but getrandom is much more 'available']. So, yes, a v2 of the series, but a different final patch. ATB, Ramsay Jones
diff --git a/config.mak.uname b/config.mak.uname index 4f6770a5f4..1a897bd022 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -50,6 +50,8 @@ ifeq ($(uname_S),Linux) HAVE_ALLOCA_H = YesPlease # override in config.mak if you have glibc >= 2.38 NO_STRLCPY = YesPlease + # set to arc4random (in config.mak) if you have glibc >= 2.36 + CSPRNG_METHOD = HAVE_PATHS_H = YesPlease LIBC_CONTAINS_LIBINTL = YesPlease HAVE_DEV_TTY = YesPlease
Commit 05cd988dce ("wrapper: add a helper to generate numbers from a CSPRNG", 2022-01-17) added a csprng_bytes() function which used one of several interfaces to provide a source of cryptographically secure pseudorandom numbers. The CSPRNG_METHOD make variable was provided to determine the choice of available 'backends' for the source of random bytes. Commit 05cd988dce did not set CSPRNG_METHOD in the Linux section of the config.mak.uname file, so it defaults to using '/dev/urandom' as the source of random bytes. The 'backend' values which could be used on Linux, in order of preference, are 'arc4random', 'getrandom' or 'getentropy' ('openssl' is an option, but seems to be discouraged). The arc4random routines (ar4random_buf() is the one actually used) were added to glibc in version 2.36, while both getrandom() and getentropy() were included in 2.25. So, some of the more up-to-date distributions of Linux (eg Debian 12, Ubuntu 24.04) would be able to use the preferred 'arc4random' setting. If the meson build system is used on a newer platform, then they will be configured to use 'arc4random', whereas the make build will currently default to using '/dev/urandom'. Add a note to the config.mak.uname file, in the Linux section, to prompt make users to override CSPRNG_METHOD in the config.mak file, if appropriate. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> --- config.mak.uname | 2 ++ 1 file changed, 2 insertions(+)