[v2,0/2] Prevent re-reading 4 GiB files on every status

Message ID: 20231012160930.330618-1-sandals@crustytoothpaste.net

brian m. carlson Oct. 12, 2023, 4:09 p.m. UTC
Several people have noticed that Git continually re-reads and re-hashes
files that are an exact multiple of 4 GiB every time the index is
refreshed (for example, when "git status" is called).  This is slow and
expensive, especially with SHA-1 DC, and it also causes performance
problems when these files are used with Git LFS (because the same issue
occurs, just with Git LFS hashing the data instead of Git).
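
As a rough illustration of why an exact multiple of 4 GiB is the special
case: assuming the cached file size is kept in a 32-bit field, such a size
truncates to zero and can no longer be told apart from an empty or unknown
size.  The sketch below is not the code from the patch; the function names
are hypothetical, and the sentinel value is an assumption about one way to
guard against the collision.

```c
#include <stdint.h>

/*
 * Hypothetical illustration, not the code from this series: store a
 * 64-bit file size in a 32-bit field, as a cached size might be.
 */
static uint32_t cached_size_naive(uint64_t st_size)
{
	return (uint32_t)st_size;	/* keeps only the low 32 bits */
}

/*
 * One possible guard (an assumption about the approach): if a nonzero
 * size truncates to zero, substitute a nonzero sentinel so the cached
 * value cannot be mistaken for "empty" or "unknown".
 */
static uint32_t cached_size_guarded(uint64_t st_size)
{
	uint32_t truncated = (uint32_t)st_size;

	if (!truncated && st_size)
		return UINT32_C(0x80000000);
	return truncated;
}
```

With the naive version, a 4 GiB or 8 GiB file ends up with a cached size of
zero; with the guard, only a genuinely empty file does.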

Jason Hatton previously sent a patch to fix this, but it lacked tests
and didn't get picked up.  I've adopted their patch, made some minor
changes to the commit message, and added tests along with a test helper
that makes them possible.  All credit should be directed to Jason, and
I'll accept all the responsibility for any problems.

I don't anticipate this being in any way controversial, so I'm not
expecting a huge number of rerolls, but of course one or two might be
necessary.

Jason Hatton (1):
  Prevent git from rehashing 4GiB files

brian m. carlson (1):
  t: add a test helper to truncate files

 Makefile                 |  1 +
 statinfo.c               | 20 ++++++++++++++++++--
 t/helper/test-tool.c     |  1 +
 t/helper/test-tool.h     |  1 +
 t/helper/test-truncate.c | 27 +++++++++++++++++++++++++++
 t/t7508-status.sh        | 16 ++++++++++++++++
 6 files changed, 64 insertions(+), 2 deletions(-)
 create mode 100644 t/helper/test-truncate.c