Message ID | pull.1005.git.1630359290.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Series | Upstreaming the Scalar command |
On 8/30/21 5:34 PM, Johannes Schindelin via GitGitGadget wrote:
> tl;dr: This series contributes the Scalar command to the Git project. This
> command provides an opinionated way to create and configure repositories
> with a focus on very large repositories.

I want to give Johannes a big thanks for organizing this RFC. As you
can see from the authorship of the patches, this was an amazingly
collaborative effort, but Johannes led the way by creating a base that
the rest of us could work with, then finally he brought in all of the
gritty details to finish the effort.

> Background
> ==========
...
> The Scalar project
> was created to make that separation, refine the key concepts, and then
> extract those features into the new Scalar command.

When people have asked me how Scalar fits with the core Git client, I
point them to our "Philosophy of Scalar" document [1]. The most concise
summary of our goals since starting Scalar has been that Scalar aligns
with features already within Git that enable scale. I've said several
times that we are constantly making Scalar do less by making Git do more.

[1] https://github.com/microsoft/git/blob/HEAD/contrib/scalar/docs/philosophy.md

Here is an example: when our large, internal customer told us that they
required Linux support for Scalar, we looked at what it would take. We
could have done the necessary platform-specific things to convince .NET
Core to create a long-running process that launched Git maintenance tasks
at different intervals, creating a similar mechanism to the Windows and
macOS services that did those operations. But we also knew that the
existing system was stuck with architectural decisions from VFS for Git
that were not actually in service of how Scalar worked. Instead, we
decided to build background maintenance into Git itself and had our Linux
port of Scalar run "git maintenance start".

Once the Linux port was proven out with Git's background maintenance, we
realized that the window where a user actually interacts with Scalar instead
of Git is extremely narrow: users run "scalar clone" or "scalar register"
and otherwise only run Git commands. The Scalar process does not need to
exist outside of that. (There are some other helpers that can be used in
a pinch to diagnose and fix problems, but they are rarely used. These
commands, such as 'scalar diagnose', can be contributed separately.)

It became clear that for our own needs it would be easier to ship one
installer that included the microsoft/git fork and the Scalar CLI, and
it would be simple to rewrite the Scalar CLI with all of the Git helper
APIs. We organized the code in a way that we thought would be amenable
to an upstream contribution (by placing it in contrib/ and using Git code
style).

The thing about these commands is that they are _opinionated_. We rely
on these opinions for important internal users, but we realize that they
are not necessarily optimal for all users. Hence, we did not think it
wise to push those opinions onto the 'git' executable. Having 'scalar'
continue to live as a separate executable made sense to us.

I believe that by contributing Scalar to the full community, we
create opportunities for Git in the future. For one, users and Git
distributors can opt into compiling Scalar so it is more available
to users who are interested. Another hopeful idea is that maybe this
reinvigorates ideas of how to streamline Git clones for large repos
without users needing to learn each and every knob to twist to get
things working. Since the Scalar CLI is contributed under the full
license of the Git project, pieces of it can be adapted into Git
proper as needed.

I look forward to hearing your thoughts.

Thanks,
-Stolee

On Mon, Aug 30, 2021 at 5:52 PM Derrick Stolee <stolee@gmail.com> wrote:
> [...]
> I look forward to hearing your thoughts.
>
> Thanks,
> -Stolee

Looks like exciting stuff, you two. I'm behind on review as it is; I
still need to get back to Stolee's sparse-index add/rm/mv series, but
I'll try to circle back and take a look.