Message ID | e811455d55cdb222a85d880f3cf3d5e28a8d4c91.1597406877.git.martin.agren@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | more SHA-256 documentation |
On 8/14/2020 8:21 AM, Martin Ågren wrote:
> Similar to a recent commit, document that in SHA-1 repositories, we use
> SHA-1 and in SHA-256 repositories, we use SHA-256, then replace all
> other uses of "SHA-1" with something more neutral.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  Documentation/technical/index-format.txt | 27 +++++++++++++-----------
>  1 file changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
> index faa25c5c52..827ece2ed1 100644
> --- a/Documentation/technical/index-format.txt
> +++ b/Documentation/technical/index-format.txt
> @@ -3,8 +3,11 @@ Git index format
>
>  == The Git index file has the following format
>
> - All binary numbers are in network byte order. Version 2 is described
> - here unless stated otherwise.
> + All binary numbers are in network byte order.
> + In a repository using the traditional SHA-1, checksums and object IDs
> + (object names) mentioned below are all computed using SHA-1. Similarly,
> + in SHA-256 repositories, these values are computed using SHA-256.
> + Version 2 is described here unless stated otherwise.
>
>   - A 12-byte header consisting of
>
> @@ -32,7 +35,7 @@ Git index format
>
>     Extension data
>
> - - 160-bit SHA-1 over the content of the index file before this
> + - 160-bit hash checksum over the content of the index file before this
>     checksum.

If this hash is flexible, then "160-bit" is not correct anymore, right?

>  == Index entry
> @@ -80,7 +83,7 @@ Git index format
>   32-bit file size
>     This is the on-disk size from stat(2), truncated to 32-bit.
>
> - 160-bit SHA-1 for the represented object
> + 160-bit object name for the represented object

Same here. The later instances of "160-bit" were dropped.

>   A 16-bit 'flags' field split into (high to low bits)
>
> @@ -211,8 +214,8 @@ Git index format
>
>   The extension consists of:
>
> - - 160-bit SHA-1 of the shared index file. The shared index file path
> -   is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the
> + - Hash of the shared index file. The shared index file path
> +   is $GIT_DIR/sharedindex.<hash>. If all bits are zero, the
>     index does not require a shared index file.
>
>   - An ewah-encoded delete bitmap, each bit represents an entry in the
> @@ -253,10 +256,10 @@ Git index format
>
>   - 32-bit dir_flags (see struct dir_struct)
>
> - - 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file
> + - Hash of $GIT_DIR/info/exclude. A null hash means the file
>     does not exist.
>
> - - 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does
> + - Hash of core.excludesfile. A null hash means the file does
>     not exist.
>
>   - NUL-terminated string of per-dir exclude file name. This usually
> @@ -285,13 +288,13 @@ The remaining data of each directory block is grouped by type:
>   - An ewah bitmap, the n-th bit records "check-only" bit of
>     read_directory_recursive() for the n-th directory.
>
> - - An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data
> + - An ewah bitmap, the n-th bit indicates whether hash and stat data
>     is valid for the n-th directory and exists in the next data.
>
>   - An array of stat data. The n-th data corresponds with the n-th
>     "one" bit in the previous ewah bitmap.
>
> - - An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit
> + - An array of hashes. The n-th hash corresponds with the n-th "one" bit
>     in the previous ewah bitmap.
>
>   - One NUL.
> @@ -330,12 +333,12 @@ The remaining data of each directory block is grouped by type:
>
>   - 32-bit offset to the end of the index entries
>
> - - 160-bit SHA-1 over the extension types and their sizes (but not
> + - Hash over the extension types and their sizes (but not
>     their contents). E.g. if we have "TREE" extension that is N-bytes
>     long, "REUC" extension that is M-bytes long, followed by "EOIE",
>     then the hash would be:
>
> - SHA-1("TREE" + <binary representation of N> +
> + Hash("TREE" + <binary representation of N> +
>     "REUC" + <binary representation of M>)
>
>  == Index Entry Offset Table
>

Thanks,
-Stolee
On Fri, 14 Aug 2020 at 14:28, Derrick Stolee <stolee@gmail.com> wrote:
>
> On 8/14/2020 8:21 AM, Martin Ågren wrote:
> > - - 160-bit SHA-1 over the content of the index file before this
> > + - 160-bit hash checksum over the content of the index file before this
> >     checksum.
>
> If this hash is flexible, then "160-bit" is not correct anymore, right?
>
> > - 160-bit SHA-1 for the represented object
> > + 160-bit object name for the represented object
>
> Same here. The later instances of "160-bit" were dropped.

Thanks for pointing out these errors.

Martin
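
To illustrate the point raised in this exchange: the width of the trailing index checksum follows the repository's hash algorithm, 160 bits (20 bytes) for SHA-1 and 256 bits (32 bytes) for SHA-256. The Python sketch below is not part of the patch. The names `HASH_BYTES`, `make_minimal_index` and `trailer_is_valid` are hypothetical, and the sample data is a synthetic version 2 index with zero entries and no extensions, not a file written by Git.

```python
import hashlib

# Checksum width in bytes for each hash algorithm a repository may use.
HASH_BYTES = {"sha1": 20, "sha256": 32}


def make_minimal_index(algo):
    """Build a synthetic index: 12-byte header plus trailing checksum.

    The header is the signature "DIRC", a 32-bit version number and a
    32-bit entry count, all in network (big-endian) byte order.
    """
    header = b"DIRC" + (2).to_bytes(4, "big") + (0).to_bytes(4, "big")
    return header + hashlib.new(algo, header).digest()


def trailer_is_valid(index_bytes, algo):
    """Check the hash over the index content that precedes the checksum."""
    n = HASH_BYTES[algo]
    body, trailer = index_bytes[:-n], index_bytes[-n:]
    return hashlib.new(algo, body).digest() == trailer


if __name__ == "__main__":
    for algo in HASH_BYTES:
        data = make_minimal_index(algo)
        print(algo, len(data), trailer_is_valid(data, algo))
```

A reader that hard-codes 20 bytes would split a SHA-256 index in the wrong place, which is why a fixed "160-bit" in the format description no longer fits once the hash is flexible.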
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index faa25c5c52..827ece2ed1 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -3,8 +3,11 @@ Git index format

 == The Git index file has the following format

- All binary numbers are in network byte order. Version 2 is described
- here unless stated otherwise.
+ All binary numbers are in network byte order.
+ In a repository using the traditional SHA-1, checksums and object IDs
+ (object names) mentioned below are all computed using SHA-1. Similarly,
+ in SHA-256 repositories, these values are computed using SHA-256.
+ Version 2 is described here unless stated otherwise.

  - A 12-byte header consisting of

@@ -32,7 +35,7 @@ Git index format

    Extension data

- - 160-bit SHA-1 over the content of the index file before this
+ - 160-bit hash checksum over the content of the index file before this
    checksum.

 == Index entry
@@ -80,7 +83,7 @@ Git index format
  32-bit file size
    This is the on-disk size from stat(2), truncated to 32-bit.

- 160-bit SHA-1 for the represented object
+ 160-bit object name for the represented object

  A 16-bit 'flags' field split into (high to low bits)

@@ -211,8 +214,8 @@ Git index format

  The extension consists of:

- - 160-bit SHA-1 of the shared index file. The shared index file path
-   is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the
+ - Hash of the shared index file. The shared index file path
+   is $GIT_DIR/sharedindex.<hash>. If all bits are zero, the
    index does not require a shared index file.

  - An ewah-encoded delete bitmap, each bit represents an entry in the
@@ -253,10 +256,10 @@ Git index format

  - 32-bit dir_flags (see struct dir_struct)

- - 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file
+ - Hash of $GIT_DIR/info/exclude. A null hash means the file
    does not exist.

- - 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does
+ - Hash of core.excludesfile. A null hash means the file does
    not exist.

  - NUL-terminated string of per-dir exclude file name. This usually
@@ -285,13 +288,13 @@ The remaining data of each directory block is grouped by type:
  - An ewah bitmap, the n-th bit records "check-only" bit of
    read_directory_recursive() for the n-th directory.

- - An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data
+ - An ewah bitmap, the n-th bit indicates whether hash and stat data
    is valid for the n-th directory and exists in the next data.

  - An array of stat data. The n-th data corresponds with the n-th
    "one" bit in the previous ewah bitmap.

- - An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit
+ - An array of hashes. The n-th hash corresponds with the n-th "one" bit
    in the previous ewah bitmap.

  - One NUL.
@@ -330,12 +333,12 @@ The remaining data of each directory block is grouped by type:

  - 32-bit offset to the end of the index entries

- - 160-bit SHA-1 over the extension types and their sizes (but not
+ - Hash over the extension types and their sizes (but not
    their contents). E.g. if we have "TREE" extension that is N-bytes
    long, "REUC" extension that is M-bytes long, followed by "EOIE",
    then the hash would be:

- SHA-1("TREE" + <binary representation of N> +
+ Hash("TREE" + <binary representation of N> +
    "REUC" + <binary representation of M>)

 == Index Entry Offset Table
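
The hash over extension types and sizes described in the last hunk can be sketched in Python as follows. This is an illustration only, not code from Git. The function name `eoie_hash` is hypothetical, and encoding each size as a 32-bit big-endian integer is an assumption based on the document's statement that all binary numbers are in network byte order and on the 32-bit extension size field.

```python
import hashlib
import struct


def eoie_hash(extensions, algo="sha1"):
    """Hash extension signatures and sizes, but not extension contents.

    `extensions` is a list of (4-byte signature, size in bytes) pairs in
    the order the extensions appear before "EOIE". `algo` selects the
    repository's hash: "sha1" or "sha256".
    """
    h = hashlib.new(algo)
    for signature, size in extensions:
        h.update(signature)
        # Assumption: the size is a 32-bit integer in network byte order.
        h.update(struct.pack(">I", size))
    return h.digest()


if __name__ == "__main__":
    # N and M below are arbitrary example sizes.
    exts = [(b"TREE", 100), (b"REUC", 7)]
    print(eoie_hash(exts, "sha1").hex())
    print(eoie_hash(exts, "sha256").hex())
```

Passing the algorithm as a parameter mirrors the patch's wording: the same structure is hashed either way, and only the digest length changes (20 bytes for SHA-1, 32 bytes for SHA-256).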
Similar to a recent commit, document that in SHA-1 repositories, we use
SHA-1 and in SHA-256 repositories, we use SHA-256, then replace all
other uses of "SHA-1" with something more neutral.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 Documentation/technical/index-format.txt | 27 +++++++++++++-----------
 1 file changed, 15 insertions(+), 12 deletions(-)