[0/9] bound upload-pack memory allocations

Message ID 20240228223700.GA1157826@coredump.intra.peff.net (mailing list archive)

Message

Jeff King Feb. 28, 2024, 10:37 p.m. UTC
Benjamin Flesch reported to the security list that a client can convince
upload-pack to allocate arbitrary amounts of memory by sending a few
repeated nonsense directives. Patches 1-6 here eliminate those spots,
which roughly bounds the amount of memory that upload-pack will allocate
to scale with the number of refs and objects in the repository. I also
found a few spots where upload-pack is a little too eager to hold object
contents in memory (so still bounded, but something like "all the trees"
can get pretty big). That's patches 7-9.
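
To illustrate why moving from an append-only list to a set bounds memory,
here is a minimal sketch. It is not the code from the series: `id_set`,
`id_set_init()`, and `id_set_insert()` are hypothetical stand-ins for Git's
oidset, and the 20-byte id length is an assumption for illustration. The
point is only that a repeated id costs no additional memory, so the
allocation scales with the number of distinct ids rather than with how
many lines the client sends.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ID_LEN 20 /* assumed id size, for illustration only */

/* Hypothetical open-addressing set of fixed-size ids. */
struct id_set {
	unsigned char (*slots)[ID_LEN];
	unsigned char *used;
	size_t cap;   /* always a power of two */
	size_t count; /* number of distinct ids stored */
};

/* cap must be a power of two and at least 2. */
void id_set_init(struct id_set *s, size_t cap)
{
	s->slots = calloc(cap, ID_LEN);
	s->used = calloc(cap, 1);
	assert(s->slots && s->used);
	s->cap = cap;
	s->count = 0;
}

/* Ids are already hashes, so their leading bytes serve as the hash. */
size_t id_hash(const unsigned char *id, size_t cap)
{
	size_t h;
	memcpy(&h, id, sizeof(h));
	return h & (cap - 1);
}

int id_set_insert(struct id_set *s, const unsigned char *id);

/* Double the table and re-insert every stored id. */
void id_set_grow(struct id_set *s)
{
	struct id_set bigger;
	size_t i;

	id_set_init(&bigger, s->cap * 2);
	for (i = 0; i < s->cap; i++)
		if (s->used[i])
			id_set_insert(&bigger, s->slots[i]);
	free(s->slots);
	free(s->used);
	*s = bigger;
}

/* Returns 1 if id was newly added, 0 if it was already present. */
int id_set_insert(struct id_set *s, const unsigned char *id)
{
	size_t i;

	/* Keep the load factor at or below 1/2 so probing always terminates. */
	if ((s->count + 1) * 2 > s->cap)
		id_set_grow(s);
	for (i = id_hash(id, s->cap); s->used[i]; i = (i + 1) & (s->cap - 1))
		if (!memcmp(s->slots[i], id, ID_LEN))
			return 0; /* duplicate: nothing is stored */
	memcpy(s->slots[i], id, ID_LEN);
	s->used[i] = 1;
	s->count++;
	return 1;
}
```

With a plain array, every repeated line from the client would append
another entry; with the set above, sending the same id any number of times
leaves `count` at 1.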

We're not doing a coordinated disclosure or special release for this.
Even after these patches, it's possible to get upload-pack to allocate
quite a bit of memory, especially for a large repository. Not to mention
that pack-objects may also allocate quite a bit to serve the pack
itself. So while this is low-hanging fruit, a public-facing Git site
probably still needs to have some kind of external tooling to kill
hungry processes (even if it's just OOM-killing them so they don't hurt
other clients).
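
As one possible shape for that kind of external guard, here is a rough
sketch that caps a process's address space with POSIX setrlimit(2) before
it would go on to exec the server process. The helper name
`cap_address_space()` and the choice of RLIMIT_AS are assumptions for
illustration, not something Git provides; with such a limit in place a
runaway allocation fails inside that one process rather than pressuring
the whole machine.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: lower the soft address-space limit of the
 * calling process to at most `bytes`. Returns 0 on success, -1 on error.
 */
int cap_address_space(rlim_t bytes)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_AS, &rl) < 0)
		return -1;
	/* The soft limit may not exceed the hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
		bytes = rl.rlim_max;
	rl.rlim_cur = bytes;
	return setrlimit(RLIMIT_AS, &rl);
}
```

A wrapper would call this after fork() and before exec, so the limit is
inherited by the child. A site might equally rely on cgroups or a
monitoring daemon; this is just the smallest self-contained variant.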

There are a few unbounded parts of receive-pack (e.g., you can send an
infinite number of refs to push). That may be something we want to put
some configurable boundaries on, but I punted on it for this series.
IMHO it is a lot less interesting since you'd usually have to
authenticate to invoke receive-pack in the first place.

  [1/9]: upload-pack: drop separate v2 "haves" array
  [2/9]: upload-pack: switch deepen-not list to an oid_array
  [3/9]: upload-pack: use oidset for deepen_not list
  [4/9]: upload-pack: use a strmap for want-ref lines
  [5/9]: upload-pack: accept only a single packfile-uri line
  [6/9]: upload-pack: disallow object-info capability by default
  [7/9]: upload-pack: always turn off save_commit_buffer
  [8/9]: upload-pack: use PARSE_OBJECT_SKIP_HASH_CHECK in more places
  [9/9]: upload-pack: free tree buffers after parsing

 Documentation/config/transfer.txt |   4 +
 Documentation/gitprotocol-v2.txt  |   6 +-
 builtin/upload-pack.c             |   2 +
 object.c                          |  14 ++++
 object.h                          |   1 +
 revision.c                        |   3 +-
 serve.c                           |  14 +++-
 t/t5555-http-smart-common.sh      |   1 -
 t/t5701-git-serve.sh              |  24 +++++-
 upload-pack.c                     | 117 +++++++++++++-----------------
 10 files changed, 113 insertions(+), 73 deletions(-)

-Peff

Comments

Junio C Hamano Feb. 28, 2024, 10:47 p.m. UTC | #1
Jeff King <peff@peff.net> writes:

> We're not doing a coordinated disclosure or special release for this.
> Even after these patches, it's possible to get upload-pack to allocate
> quite a bit of memory, especially for a large repository. Not to mention
> that pack-objects may also allocate quite a bit to serve the pack
> itself. So while this is low-hanging fruit, a public-facing Git site
> probably still needs to have some kind of external tooling to kill
> hungry processes (even if it's just OOM-killing them so they don't hurt
> other clients).
>
>   [1/9]: upload-pack: drop separate v2 "haves" array
>   [2/9]: upload-pack: switch deepen-not list to an oid_array
>   [3/9]: upload-pack: use oidset for deepen_not list
>   [4/9]: upload-pack: use a strmap for want-ref lines
>   [5/9]: upload-pack: accept only a single packfile-uri line
>   [6/9]: upload-pack: disallow object-info capability by default
>   [7/9]: upload-pack: always turn off save_commit_buffer
>   [8/9]: upload-pack: use PARSE_OBJECT_SKIP_HASH_CHECK in more places
>   [9/9]: upload-pack: free tree buffers after parsing

I saw them when they were posted to the security list and they
looked good already.  Will queue.  Thanks.