Message ID | 20240709225042.2005233-1-emilyshaffer@google.com
---|---
State | Superseded
Series | Documentation: add platform support policy
On 2024-07-09 at 22:50:42, Emily Shaffer wrote: > Right now, this doc talks about "guarantees." I used that phrasing based on > what I've observed to be an implicit expectation that we guarantee support; it > could be this isn't actually a guarantee that the community is willing to make, > so I am hoping we can discuss it and come up with the right term. I think it might be helpful to look at what some other projects do. Rust has a concept of tiered support, and it requires platforms to have maintainers who will commit to support an OS. I don't think we necessarily need to be so formal, but if nobody's stepping up to monitor an OS or architecture, it may break at any time and we won't be able to consider it when deciding on features we require from the platform (such as Rust, C versions, or POSIX versions). I think it's also worth discussing what we require from a platform we're willing to support. For example, we might require that the platform pass the entire testsuite (ignoring irrelevant tests or tests for things that platform doesn't use, such as Perl) or be actively pursuing an attempt to do so. We may also want to require that an OS be actively receiving security support so that we don't have people asking us to carry patches for actively obsolete OSes, such as CentOS 6. Finally, some sort of time limit may be helpful, since some Linux vendors are now offering 15 years of support, and we really may not want to target really ancient versions of things like libcurl. At the same time, we do have people actively building Git on a variety of platforms and a huge number of architectures, including most Linux distros and the BSDs, and we will want to be cognizant that we should avoid breaking those environments when possible, even though, say, the porters for some of those OSes or architectures may not actively follow the list (due to limited porters and lots of porting work). I imagine we might say that released architectures on certain distros (Debian comes to mind as a very portable option) might be implicitly supported. > +Compatible on `next` > +-------------------- > + > +To guarantee that `next` will work for your platform, avoiding reactive > +debugging and fixing: > + > +* You should add a runner for your platform to the GitHub Actions CI suite. > +This suite is run when any Git developer proposes a new patch, and having a > +runner for your platform/configuration means every developer will know if they > +break you, immediately. I think this is a particularly helpful approach. I understand the Linux runners support nested virtualization, so it's possible to run tests in a VM on a Linux runner on OSes that Actions doesn't natively support. I do this for several of my Rust projects[0] on FreeBSD and NetBSD, for example, and it should work on platforms that support Vagrant and run on x86-64. That won't catch things like alignment problems which don't affect x86-64, but it does catch a lot of general portability problems that are OS-related. I'm in agreement with all of your suggestions, by the way, and I appreciate you opening this discussion. [0] An example for the curious is muter: https://github.com/bk2204/muter.
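To make the nested-virtualization idea above concrete, here is a minimal sketch of a driver script a platform maintainer might run on a Linux CI host to exercise Git's build and test suite inside a guest VM. It assumes a Vagrantfile that boots the target OS is already present in the working tree; the use of gmake, the synced-folder path, and the overall job shape are assumptions for illustration, not anything this thread prescribes.

#!/bin/sh
# Sketch: run Git's build and tests inside a Vagrant-managed guest from a
# Linux runner that permits nested virtualization.
set -e

# Boot the guest described by the Vagrantfile in this directory
# (e.g. one whose config.vm.box points at a FreeBSD or NetBSD image).
vagrant up

# Build and test the checked-out sources inside the guest; /vagrant is
# Vagrant's default synced copy of the working tree, and GNU make is
# usually installed as "gmake" on the BSDs.
status=0
vagrant ssh -c 'cd /vagrant && gmake -j4 && gmake test' || status=$?

# Always tear the guest down, then report the test result to CI.
vagrant destroy -f
exit "$status"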
Emily Shaffer <emilyshaffer@google.com> writes: > +Platform Support Policy > +======================= > + > +Git has a history of providing broad "support" for exotic platforms and older > +platforms, without an explicit commitment. This support becomes easier to > +maintain (and possible to commit to) when Git developers are providing with "providing"? "provided"? > +adequate tooling to test for compatibility. Variouis levels of tooling will "Variouis"? > +allow us to make more solid commitments around Git's compatibility with your > +platform. > + > +Compatible by vN+1 release > +-------------------------- I couldn't quite tell what you meant by vN+1 on the title. If Git v2.45.X were working fine on an un(der)maintained platform, and some changes went into Git v2.46.0 were incompatible with it, then vN would obviously be v2.46.0 but what is vN+1? v2.47.0 or v2.46.1? howto/maintain-git.txt calls v2.47.0 "the next feature release" after v2.46.0, while v2.46.1 is "the first maintenance release". > +To increase probability that compatibility issues introduced in a point release > +will be fixed by the next point release: So you meant "by v2.46.1 (or if you fail to notice breakage then it might slip until v2.46.2)". Is the procedure for the platform folks any different if they target the next feature release? I think what they need to do would not change all that much between these two cases, so I'd suggest dropping a mention of "point release". I.e, "introduced in an earlier release will be fixed by a future release". A point release cannot introduce compatibility issues or any breakages, but mistakes happen ;-) But for a receiver of a new bug, it does not matter an iota if a point release or a major release introduced an issue. To recap, my suggestions for the above part are: - retitle to "Compatible by the next release" - "introduced in an earlier release will be fixed by a future release" without mentioning the nature of releases like point, feature, and maintenance. > +* You should send a bug report as soon as you notice the breakage on your > +platform. The sooner you notice, the better; it's better for you to watch `seen` > +than to watch `master`. Let's clarify what goal they want to achieve by "watching". ... for you to watch `seen` to prevent changes that break your platform from getting merged into `next`, than to watch `master`. > See linkgit:gitworkflows[7] under "Graduation" for an > +overview of which branches are used in git.git, and how. Or "The Policy" section of howto/maintain-git.txt where the use of each branch makes it a bit more clear what 'next' is for, and why 'seen' may be worth looking at by these people. > +Compatible on `master` and point releases > +----------------------------------------- > + > +To guarantee that `master` and all point releases work for your platform the > +first time: OK, as most of the changes go to `master` before getting merged down to `maint` to become part of the next maintenance release, actively protecting `master` from bugs is worthwhile. What about changes that do not come via the `master` branch? Should they also join the security list and have an early access to the cabal material? > +* You should run nightly tests against the `next` branch and publish breakage > +reports to the mailing list immediately when they happen. > +* It may make sense to automate these; if you do, make sure they are not noisy > +(you don't need to send a report when everything works, only when something > +breaks). 
> +* Breakage reports should be actionable - include clear error messages that can > +help developers who may not have access to test directly on your platform. > +* You should use git-bisect and determine which commit introduced the breakage; > +if you can't do this with automation, you should do this yourself manually as > +soon as you notice a breakage report was sent. All of the above are actually applicable to any active contributors on any platforms. If your group feeds custom builds of Git out of "master" to your $CORP customers, you want to ensure you catch badness while it is still in "next" (or better yet, before it hits "next"). If your internal builds are based on "next", you'd want to ensure that "next" stays clean, which means you'd need to watch "seen" (or better yet, patches floating on the list before they hit "seen"). Your group may build with unusual toolchain internal to your $CORP and may link with specialized libraries, etc., in which case maintaining such a build is almost like maintaining an exotic platform. > +* You should either: > +** Provide VM access on-demand to a trusted developer working to fix the issue, > +so they can test their fix, OR > +** Work closely with the developer fixing the issue - testing turnaround to > +check whether the fix works for your platform should not be longer than a > +business day. These are very specific, especially for minority platform folks. I agree with the direction, but "not be longer than" might be too strong. Longer turnaround time will certainly make the issue resolution slower, but if the platform maintainer can stand it, that is their choice. Finding some volunteers among our developers who are familiar with the code to help their problem with more patience and can wait for more than a business day is also up to them. > +Compatible on `next` > +-------------------- > + > +To guarantee that `next` will work for your platform, avoiding reactive > +debugging and fixing: > + > +* You should add a runner for your platform to the GitHub Actions CI suite. > +This suite is run when any Git developer proposes a new patch, and having a > +runner for your platform/configuration means every developer will know if they > +break you, immediately. This would be nice even if the platform maintainer do not care about `next` occasionally breaking (i.e. keep `master` working, in the previous section, or even find breakages on `master` before the next feature release, in the section before that). > +* If you rely on Git avoiding a specific pattern that doesn't work well with > +your platform (like a certain malloc pattern), if possible, add a coccicheck > +rule to ensure that pattern is not used. Sorry, but I do not quite follow you here. In general, it is a bad idea to promise that we are willing to tie our hands with coccicheck to satisfy needs by exotic platforms, without first having a chance to see and evaluate such needs. "if possible, add" -> "sometimes it may turn out to be a good idea to add", perhaps? > +* If you rely on some configuration or behavior, add a test for it. You may > +find it easier to add a unit test ensuring the behavior you need than to add an > +integration test; either one works. Untested behavior is subject to breakage at > +any time. 
A unit test may be easier to add than an end-to-end test, but given that end-users and platform maintainers want to see Git work as a whole (e.g., if you prepare two repositories and do "git push there :refs/heads/foo" then it removes the 'foo' branch), an end-to-end test would probably be a more useful and robust way to ensure that a feature you care about will keep working. In any case, I am not sure the sentence that ends with "either one works" is worth keeping in this document. The two important points to stress here are (1) add tests to protect what you care about and (2) otherwise you can keep both halves.
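As an illustration of the kind of end-to-end protection being discussed, a test in Git's own t/ framework covering the branch-deletion example might look roughly like the sketch below. The test number, title, and scenario are hypothetical; only the framework helpers (test_expect_success, test_commit, test_must_fail) are the real ones from t/test-lib.sh.

#!/bin/sh
# Hypothetical t/t9999-platform-foo.sh: an end-to-end check written in
# Git's own integration-test framework.

test_description='pushing a delete removes the branch on the remote'
. ./test-lib.sh

test_expect_success 'setup two repositories' '
	git init --bare there.git &&
	test_commit first &&
	git push there.git HEAD:refs/heads/foo
'

test_expect_success 'push of an empty source deletes the remote branch' '
	git push there.git :refs/heads/foo &&
	test_must_fail git -C there.git rev-parse --verify refs/heads/foo
'

test_done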
On Tue, Jul 9, 2024 at 5:57 PM Junio C Hamano <gitster@pobox.com> wrote: > > Emily Shaffer <emilyshaffer@google.com> writes: > > > +Platform Support Policy > > +======================= > > + > > +Git has a history of providing broad "support" for exotic platforms and older > > +platforms, without an explicit commitment. This support becomes easier to > > +maintain (and possible to commit to) when Git developers are providing with > > "providing"? "provided"? > > > +adequate tooling to test for compatibility. Variouis levels of tooling will > > "Variouis"? Thanks, fixed both for next time there's a reroll ready. > > > +allow us to make more solid commitments around Git's compatibility with your > > +platform. > > + > > +Compatible by vN+1 release > > +-------------------------- > > I couldn't quite tell what you meant by vN+1 on the title. If Git > v2.45.X were working fine on an un(der)maintained platform, and some > changes went into Git v2.46.0 were incompatible with it, then vN > would obviously be v2.46.0 but what is vN+1? v2.47.0 or v2.46.1? > > howto/maintain-git.txt calls v2.47.0 "the next feature release" > after v2.46.0, while v2.46.1 is "the first maintenance release". > > > +To increase probability that compatibility issues introduced in a point release > > +will be fixed by the next point release: > > So you meant "by v2.46.1 (or if you fail to notice breakage then it > might slip until v2.46.2)". Is the procedure for the platform folks > any different if they target the next feature release? > > I think what they need to do would not change all that much between > these two cases, so I'd suggest dropping a mention of "point > release". I.e, "introduced in an earlier release will be fixed by a > future release". > > A point release cannot introduce compatibility issues or any > breakages, but mistakes happen ;-) But for a receiver of a new bug, > it does not matter an iota if a point release or a major release > introduced an issue. > > To recap, my suggestions for the above part are: > > - retitle to "Compatible by the next release" > > - "introduced in an earlier release will be fixed by a future > release" without mentioning the nature of releases like point, > feature, and maintenance. Thanks for the thorough clarification, I hadn't thought in as much detail as this :) Will make those edits. > > > +* You should send a bug report as soon as you notice the breakage on your > > +platform. The sooner you notice, the better; it's better for you to watch `seen` > > +than to watch `master`. > > Let's clarify what goal they want to achieve by "watching". Fixed for next reroll. > > ... for you to watch `seen` to prevent changes that break your > platform from getting merged into `next`, than to watch `master`. > > > See linkgit:gitworkflows[7] under "Graduation" for an > > +overview of which branches are used in git.git, and how. > > Or "The Policy" section of howto/maintain-git.txt where the use of > each branch makes it a bit more clear what 'next' is for, and why > 'seen' may be worth looking at by these people. Thanks, yeah, changed the link to point there instead. > > > > +Compatible on `master` and point releases > > +----------------------------------------- > > + > > +To guarantee that `master` and all point releases work for your platform the > > +first time: > > OK, as most of the changes go to `master` before getting merged down > to `maint` to become part of the next maintenance release, actively > protecting `master` from bugs is worthwhile. 
What about changes > that do not come via the `master` branch? Should they also join the > security list and have an early access to the cabal material? Good question, I actually am not sure of the answer. Does that make it too easy for anybody to claim they maintain some random platform and therefore they'd like to see all the RCE howtos weeks before they are fixed? I guess that we already have distro packagers in security list/cabal, so it may not be worse exposure than that. > > > +* You should run nightly tests against the `next` branch and publish breakage > > +reports to the mailing list immediately when they happen. > > +* It may make sense to automate these; if you do, make sure they are not noisy > > +(you don't need to send a report when everything works, only when something > > +breaks). > > +* Breakage reports should be actionable - include clear error messages that can > > +help developers who may not have access to test directly on your platform. > > +* You should use git-bisect and determine which commit introduced the breakage; > > +if you can't do this with automation, you should do this yourself manually as > > +soon as you notice a breakage report was sent. > > All of the above are actually applicable to any active contributors > on any platforms. If your group feeds custom builds of Git out of > "master" to your $CORP customers, you want to ensure you catch > badness while it is still in "next" (or better yet, before it hits > "next"). If your internal builds are based on "next", you'd want to > ensure that "next" stays clean, which means you'd need to watch > "seen" (or better yet, patches floating on the list before they hit > "seen"). Your group may build with unusual toolchain internal to > your $CORP and may link with specialized libraries, etc., in which > case maintaining such a build is almost like maintaining an exotic > platform. Hits close to home ;) Does this mean that this part of the document should go somewhere else and we should just use a pointer here? Is there a guide handy for "how to soft-fork Git"? > > > +* You should either: > > +** Provide VM access on-demand to a trusted developer working to fix the issue, > > +so they can test their fix, OR > > +** Work closely with the developer fixing the issue - testing turnaround to > > +check whether the fix works for your platform should not be longer than a > > +business day. > > These are very specific, especially for minority platform folks. I > agree with the direction, but "not be longer than" might be too > strong. Longer turnaround time will certainly make the issue > resolution slower, but if the platform maintainer can stand it, that > is their choice. Finding some volunteers among our developers who > are familiar with the code to help their problem with more patience > and can wait for more than a business day is also up to them. Maybe something like this is better? "Work closely with the developer fixing the issue; the turnaround to check that a proposed fix works for your platform should be fast enough that it doesn't hinder the developer working on that fix. If the turnaround is too slow, fixing the issue may miss the next release or the developer may lose interest in working on the fix at all." This last bit seems harsh but might be a good reminder - in this situation, the developer is a volunteer, and if that volunteer work is artificially annoying, they can decide to stop doing it. Open to rephrasing. 
> > > +Compatible on `next` > > +-------------------- > > + > > +To guarantee that `next` will work for your platform, avoiding reactive > > +debugging and fixing: > > + > > +* You should add a runner for your platform to the GitHub Actions CI suite. > > +This suite is run when any Git developer proposes a new patch, and having a > > +runner for your platform/configuration means every developer will know if they > > +break you, immediately. > > This would be nice even if the platform maintainer do not care about > `next` occasionally breaking (i.e. keep `master` working, in the > previous section, or even find breakages on `master` before the next > feature release, in the section before that). I agree that it would be nice for any scenario :) I was trying to link lower quality of service with lower investment; it would be nice to have things from the "higher investment" tier, but it's not really necessary for Git to be providing that worse quality of service. Would it be worth mentioning at the very beginning of the doc that "it's OK if you pick and choose between different tiers, and we appreciate anything that takes a higher investment, but these lists should give you an impression of what you'll get from the level of effort you want to provide yourself"? Probably not with that exact phrasing, but to try and get that concept across, at least. > > > +* If you rely on Git avoiding a specific pattern that doesn't work well with > > +your platform (like a certain malloc pattern), if possible, add a coccicheck > > +rule to ensure that pattern is not used. > > Sorry, but I do not quite follow you here. > > In general, it is a bad idea to promise that we are willing to tie > our hands with coccicheck to satisfy needs by exotic platforms, > without first having a chance to see and evaluate such needs. > > "if possible, add" -> "sometimes it may turn out to be a good idea > to add", perhaps? Maybe it is better to ask them to discuss it with us on-list, and that the result of that discussion may be that they should add some such test? Or, do we want to firmly say, no coccicheck restrictions based on platform, give us a CI runner or bust? I don't feel super strongly either way - writing this section I was trying to come up with any way to get on-demand ~instant (<1hr) feedback to any contributor, and this seemed like one someone could do. That doesn't mean we have to let them, if we don't like this way. > > > +* If you rely on some configuration or behavior, add a test for it. You may > > +find it easier to add a unit test ensuring the behavior you need than to add an > > +integration test; either one works. Untested behavior is subject to breakage at > > +any time. > > A unit test may be easier to add than an end-to-end test, but given > that end-users and platform maintainers want to see Git work as a > whole (e.g., if you prepare two repositories and do "git push there > :refs/heads/foo" then it removes the 'foo' branch), an end-to-end > test would probably be more useful and robust way to ensure that a > feature you care about will keep working. > > In any case, I am not sure the sentence that ends with "either one > works" is worth existing here in this document. Two important points > to stress here are (1) add test to protect what you care about and (2) > otherwise you can keep both halves. > Thanks, will do. 
I've got a couple of changes locally but I'll hold off on the reroll til the open phrasing questions get resolved, and I get through brian's review as well, and maybe another day to see if Randall makes a response, since I cc'd him too. Will aim to send out v2 Friday latest (barring any combustion events at $dayjob). Thanks much for the quick and thorough reply. - Emily
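For what it's worth, the nightly-testing and bisection expectations discussed above can be wired together in a fairly small script; the sketch below is one possible shape, staying quiet on success and bisecting on failure. The remote name, the known-good marker file, and the reporting step are placeholders, not conventions the document requires.

#!/bin/sh
# Sketch of a platform maintainer's nightly job against 'next': report
# nothing on success, bisect and prepare a report on failure.  Assumes a
# previous successful run has already recorded a known-good commit.
git fetch origin next || exit 1
git checkout --detach origin/next || exit 1

if make -j4 && make test; then
	# Quiet on success; just remember the commit that passed.
	git rev-parse HEAD >"$HOME/.git-next-last-good"
	exit 0
fi

# Something broke: let git-bisect find the first bad commit.  'git bisect
# run' reruns the build-and-test command for each candidate commit.
git bisect start origin/next "$(cat "$HOME/.git-next-last-good")"
git bisect run sh -c 'make -j4 && make test'
git bisect log >bisect-report.txt
git bisect reset

# Mail bisect-report.txt plus the build/test output to the list here;
# the delivery mechanism is deliberately left out of this sketch.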
On Tue, Jul 9, 2024 at 3:50 PM Emily Shaffer <emilyshaffer@google.com> wrote: > +* If you rely on some configuration or behavior, add a test for it. You may > +find it easier to add a unit test ensuring the behavior you need than to add an > +integration test; either one works. Untested behavior is subject to breakage at > +any time. Should we state that we reserve the right to reject these tests if they would put what we feel is an excessive burden on the git developers? i.e. a requirement to use C89 would be rejected (obviously). a requirement to support 16-bit platforms would also be rejected. I don't know that we need to list examples for what we'd reject, they could be implied that we're likely to accept anything else. > +** Clearly label these tests as necessary for platform compatibility. Add them > +to an isolated compatibility-related test suite, like a new t* file or unit test > +suite, so that they're easy to remove when compatibility is no longer required. > +If the specific compatibility need is gated behind an issue with another > +project, link to documentation of that issue (like a bug or email thread) to > +make it easier to tell when that compatibility need goes away. I think that we likely won't have the ability to investigate whether it's _truly_ gone away ourselves, and there's no guarantee that the person that added these tests will be able to vet it either (maybe they've switched jobs, for example). I think we should take a stance that may be considered hostile, but I can't really think of a better one: - there needs to be a regular (6 month? 12 month? no longer than 12 month surely...) reaffirmation by the interested party that this is still a requirement for them. This comes in the form of updating a date in the codebase, not just a message on the list. If this reaffirmation does not happen, we are allowed to assume that this is not needed anymore and remove the test that's binding us to supporting that. We should probably go a step further and intentionally violate the test condition, so that any builds done by the interested parties break immediately (which should be caught by the other processes documented here; if they don't break, then it was correct to remove the restriction). - _most_ of these restrictions should probably have a limited number of reaffirmations? I feel like this needs to be handled on a case-by-case basis, but I want us to be clear that just because we accepted these restrictions in the past doesn't mean we will accept them indefinitely. - Just because there's a reaffirmation doesn't mean we're guaranteeing we won't delete the test before the affirmation "expires". If there's an urgent security need to do something, and it can't be done without breaking this, we'll choose to break this test. If there's general consensus to do something (adopt a new language standard version, for example), there's no guarantee we'll wait until all the existing affirmations expire. The thinking here is that this test is imposing a restriction on the git developers that we've agreed to take on as a favor: we are going to restrict ourselves from doing X _for the time being_ not necessarily because it'll break you, but because it's a bad experience for the git developers to create a patch series that lands and then gets backed out when the breakage on $non_standard_platform is detected on seen/next/master.
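To sketch what the "date in the codebase" reaffirmation above could look like in practice (purely hypothetical: the platform, file name, date, and date format are all invented), a compatibility test could carry its own expiry so that it starts failing, and can be removed, once the window lapses without a renewal on the list:

#!/bin/sh
# Hypothetical compatibility guard with a built-in reaffirmation expiry.

test_description='compatibility guard for the hypothetical FooOS port'
. ./test-lib.sh

# Last reaffirmed by the FooOS maintainer; see <link to list discussion>.
# Once this date passes without a renewal, the guard fails and the
# restriction it protects may be dropped.
FOOOS_REAFFIRMED_UNTIL=20250701	# YYYYMMDD

test_expect_success 'reaffirmation window has not lapsed' '
	test "$(date +%Y%m%d)" -lt "$FOOOS_REAFFIRMED_UNTIL"
'

test_done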
Emily Shaffer <nasamuffin@google.com> writes: >> > +Compatible on `master` and point releases >> > +----------------------------------------- >> > + >> > +To guarantee that `master` and all point releases work for your platform the >> > +first time: >> >> OK, as most of the changes go to `master` before getting merged down >> to `maint` to become part of the next maintenance release, actively >> protecting `master` from bugs is worthwhile. What about changes >> that do not come via the `master` branch? Should they also join the >> security list and have an early access to the cabal material? > > Good question, I actually am not sure of the answer. Does that make it > too easy for anybody to claim they maintain some random platform and > therefore they'd like to see all the RCE howtos weeks before they are > fixed? I guess that we already have distro packagers in security > list/cabal, so it may not be worse exposure than that. Stopping at saying "You may want to ask to join the security list" and then leave the vetting process out of the guidelines for the contributor (i.e. out of this document) may strike a good balance. We will obviously be careful about whom to add to the security list, but that does not change where people hear about the list and apply to join. >> All of the above are actually applicable to any active contributors >> on any platforms. >> ... > > Hits close to home ;) > > Does this mean that this part of the document should go somewhere else > and we should just use a pointer here? Is there a guide handy for "how > to soft-fork Git"? Once we have a contributor guidelines this is a good material to migrate there, but that would probably wait after the dust from this document settles. > Maybe something like this is better? > > "Work closely with the developer fixing the issue; the turnaround to > check that a proposed fix works for your platform should be fast > enough that it doesn't hinder the developer working on that fix. If > the turnaround is too slow, fixing the issue may miss the next release > or the developer may lose interest in working on the fix at all." I think that is a good approach to take. "We will not promise to wait for you if you are slow, and that is not limited to those who are working on minority/niche platforms" is a good point to make. >> > +* If you rely on Git avoiding a specific pattern that doesn't work well with >> > +your platform (like a certain malloc pattern), if possible, add a coccicheck >> > +rule to ensure that pattern is not used. >> >> Sorry, but I do not quite follow you here. >> >> In general, it is a bad idea to promise that we are willing to tie >> our hands with coccicheck to satisfy needs by exotic platforms, >> without first having a chance to see and evaluate such needs. >> >> "if possible, add" -> "sometimes it may turn out to be a good idea >> to add", perhaps? > > Maybe it is better to ask them to discuss it with us on-list, and that > the result of that discussion may be that they should add some such > test? Or, do we want to firmly say, no coccicheck restrictions based > on platform, give us a CI runner or bust? I don't feel super strongly > either way - writing this section I was trying to come up with any way > to get on-demand ~instant (<1hr) feedback to any contributor, and this > seemed like one someone could do. That doesn't mean we have to let > them, if we don't like this way. Yes. 
If you want to add additional constraints on how the codebase does things, discuss it on list first and work with us to come up with a way to do so without forcing too many unnecessary constraints on other platforms. It may result in keeping the generic codebase pristine and free from #ifdef and having platform-specific code somewhere in compat/, but such details do not have to be spelled out---they will differ case by case and we will hopefully devise new and improved ways to deal with them.
On Wednesday, July 10, 2024 2:56 PM, Emily Shaffer wrote: >On Tue, Jul 9, 2024 at 5:57 PM Junio C Hamano <gitster@pobox.com> wrote: >> >> Emily Shaffer <emilyshaffer@google.com> writes: >> >> > +Platform Support Policy >> > +======================= >> > + >> > +Git has a history of providing broad "support" for exotic platforms >> > +and older platforms, without an explicit commitment. This support >> > +becomes easier to maintain (and possible to commit to) when Git >> > +developers are providing with >> >> "providing"? "provided"? >> >> > +adequate tooling to test for compatibility. Variouis levels of >> > +tooling will >> >> "Variouis"? > >Thanks, fixed both for next time there's a reroll ready. > >> >> > +allow us to make more solid commitments around Git's compatibility >> > +with your platform. >> > + >> > +Compatible by vN+1 release >> > +-------------------------- >> >> I couldn't quite tell what you meant by vN+1 on the title. If Git >> v2.45.X were working fine on an un(der)maintained platform, and some >> changes went into Git v2.46.0 were incompatible with it, then vN would >> obviously be v2.46.0 but what is vN+1? v2.47.0 or v2.46.1? >> >> howto/maintain-git.txt calls v2.47.0 "the next feature release" >> after v2.46.0, while v2.46.1 is "the first maintenance release". >> >> > +To increase probability that compatibility issues introduced in a >> > +point release will be fixed by the next point release: >> >> So you meant "by v2.46.1 (or if you fail to notice breakage then it >> might slip until v2.46.2)". Is the procedure for the platform folks >> any different if they target the next feature release? >> >> I think what they need to do would not change all that much between >> these two cases, so I'd suggest dropping a mention of "point release". >> I.e, "introduced in an earlier release will be fixed by a future >> release". >> >> A point release cannot introduce compatibility issues or any >> breakages, but mistakes happen ;-) But for a receiver of a new bug, it >> does not matter an iota if a point release or a major release >> introduced an issue. >> >> To recap, my suggestions for the above part are: >> >> - retitle to "Compatible by the next release" >> >> - "introduced in an earlier release will be fixed by a future >> release" without mentioning the nature of releases like point, >> feature, and maintenance. > >Thanks for the thorough clarification, I hadn't thought in as much detail as this :) >Will make those edits. > >> >> > +* You should send a bug report as soon as you notice the breakage >> > +on your platform. The sooner you notice, the better; it's better >> > +for you to watch `seen` than to watch `master`. >> >> Let's clarify what goal they want to achieve by "watching". > >Fixed for next reroll. > >> >> ... for you to watch `seen` to prevent changes that break your >> platform from getting merged into `next`, than to watch `master`. >> >> > See linkgit:gitworkflows[7] under "Graduation" for an >> > +overview of which branches are used in git.git, and how. >> >> Or "The Policy" section of howto/maintain-git.txt where the use of >> each branch makes it a bit more clear what 'next' is for, and why >> 'seen' may be worth looking at by these people. > >Thanks, yeah, changed the link to point there instead. 
> >> >> >> > +Compatible on `master` and point releases >> > +----------------------------------------- >> > + >> > +To guarantee that `master` and all point releases work for your >> > +platform the first time: >> >> OK, as most of the changes go to `master` before getting merged down >> to `maint` to become part of the next maintenance release, actively >> protecting `master` from bugs is worthwhile. What about changes that >> do not come via the `master` branch? Should they also join the >> security list and have an early access to the cabal material? > >Good question, I actually am not sure of the answer. Does that make it too easy for >anybody to claim they maintain some random platform and therefore they'd like to >see all the RCE howtos weeks before they are fixed? I guess that we already have >distro packagers in security list/cabal, so it may not be worse exposure than that. > >> >> > +* You should run nightly tests against the `next` branch and >> > +publish breakage reports to the mailing list immediately when they happen. >> > +* It may make sense to automate these; if you do, make sure they >> > +are not noisy (you don't need to send a report when everything >> > +works, only when something breaks). >> > +* Breakage reports should be actionable - include clear error >> > +messages that can help developers who may not have access to test directly on >your platform. >> > +* You should use git-bisect and determine which commit introduced >> > +the breakage; if you can't do this with automation, you should do >> > +this yourself manually as soon as you notice a breakage report was sent. >> >> All of the above are actually applicable to any active contributors on >> any platforms. If your group feeds custom builds of Git out of >> "master" to your $CORP customers, you want to ensure you catch badness >> while it is still in "next" (or better yet, before it hits "next"). >> If your internal builds are based on "next", you'd want to ensure that >> "next" stays clean, which means you'd need to watch "seen" (or better >> yet, patches floating on the list before they hit "seen"). Your group >> may build with unusual toolchain internal to your $CORP and may link >> with specialized libraries, etc., in which case maintaining such a >> build is almost like maintaining an exotic platform. > >Hits close to home ;) I hear that. Sometimes having an exotic platform and specialized libraries are overlapping. I am still stuck with 32-bit git because some of the available DLLs on NonStop are still only 32-bit - I'm working hard on changing that but it's not under my budget control. On that subject, I think it is important to have known or designated platform maintainers for the exotics. The downside is that some people expect miracles from us - I just had one request to permanently preserve timestamps of files as they were at commit time. We're into weeks of explanations on why this is a bad idea. Nonetheless, there is a certain amount of responsibility that comes with maintaining a platform, and knowing whom to ask when there are issues. The platform maintainers also can provide needed (preemptive) feedback on dependency changes. I'm not sure how to encode that in a compatible policy, however. >Does this mean that this part of the document should go somewhere else and we >should just use a pointer here? Is there a guide handy for "how to soft-fork Git"? 
> >> >> > +* You should either: >> > +** Provide VM access on-demand to a trusted developer working to >> > +fix the issue, so they can test their fix, OR >> > +** Work closely with the developer fixing the issue - testing >> > +turnaround to check whether the fix works for your platform should >> > +not be longer than a business day. >> >> These are very specific, especially for minority platform folks. I >> agree with the direction, but "not be longer than" might be too >> strong. Longer turnaround time will certainly make the issue >> resolution slower, but if the platform maintainer can stand it, that >> is their choice. Finding some volunteers among our developers who are >> familiar with the code to help their problem with more patience and >> can wait for more than a business day is also up to them. > >Maybe something like this is better? > >"Work closely with the developer fixing the issue; the turnaround to check that a >proposed fix works for your platform should be fast enough that it doesn't hinder >the developer working on that fix. If the turnaround is too slow, fixing the issue may >miss the next release or the developer may lose interest in working on the fix at all." > >This last bit seems harsh but might be a good reminder - in this situation, the >developer is a volunteer, and if that volunteer work is artificially annoying, they can >decide to stop doing it. Open to rephrasing. > >> >> > +Compatible on `next` >> > +-------------------- >> > + >> > +To guarantee that `next` will work for your platform, avoiding >> > +reactive debugging and fixing: >> > + >> > +* You should add a runner for your platform to the GitHub Actions CI suite. >> > +This suite is run when any Git developer proposes a new patch, and >> > +having a runner for your platform/configuration means every >> > +developer will know if they break you, immediately. >> >> This would be nice even if the platform maintainer do not care about >> `next` occasionally breaking (i.e. keep `master` working, in the >> previous section, or even find breakages on `master` before the next >> feature release, in the section before that). > >I agree that it would be nice for any scenario :) > >I was trying to link lower quality of service with lower investment; it would be nice >to have things from the "higher investment" tier, but it's not really necessary for Git >to be providing that worse quality of service. > >Would it be worth mentioning at the very beginning of the doc that "it's OK if you >pick and choose between different tiers, and we appreciate anything that takes a >higher investment, but these lists should give you an impression of what you'll get >from the level of effort you want to provide yourself"? Probably not with that exact >phrasing, but to try and get that concept across, at least. > >> >> > +* If you rely on Git avoiding a specific pattern that doesn't work >> > +well with your platform (like a certain malloc pattern), if >> > +possible, add a coccicheck rule to ensure that pattern is not used. >> >> Sorry, but I do not quite follow you here. >> >> In general, it is a bad idea to promise that we are willing to tie our >> hands with coccicheck to satisfy needs by exotic platforms, without >> first having a chance to see and evaluate such needs. >> >> "if possible, add" -> "sometimes it may turn out to be a good idea to >> add", perhaps? > >Maybe it is better to ask them to discuss it with us on-list, and that the result of that >discussion may be that they should add some such test? 
Or, do we want to firmly >say, no coccicheck restrictions based on platform, give us a CI runner or bust? I >don't feel super strongly either way - writing this section I was trying to come up >with any way to get on-demand ~instant (<1hr) feedback to any contributor, and >this seemed like one someone could do. That doesn't mean we have to let them, if >we don't like this way. > >> >> > +* If you rely on some configuration or behavior, add a test for it. >> > +You may find it easier to add a unit test ensuring the behavior you >> > +need than to add an integration test; either one works. Untested >> > +behavior is subject to breakage at any time. >> >> A unit test may be easier to add than an end-to-end test, but given >> that end-users and platform maintainers want to see Git work as a >> whole (e.g., if you prepare two repositories and do "git push there >> :refs/heads/foo" then it removes the 'foo' branch), an end-to-end test >> would probably be more useful and robust way to ensure that a feature >> you care about will keep working. >> >> In any case, I am not sure the sentence that ends with "either one >> works" is worth existing here in this document. Two important points >> to stress here are (1) add test to protect what you care about and (2) >> otherwise you can keep both halves. >> > >Thanks, will do. > > >I've got a couple of changes locally but I'll hold off on the reroll til the open phrasing >questions get resolved, and I get through brian's review as well, and maybe another >day to see if Randall makes a response, since I cc'd him too. Will aim to send out v2 >Friday latest (barring any combustion events at $dayjob). Thanks much for the >quick and thorough reply. --Randall
On Tue, Jul 9, 2024 at 4:16 PM brian m. carlson <sandals@crustytoothpaste.net> wrote: > > On 2024-07-09 at 22:50:42, Emily Shaffer wrote: > > Right now, this doc talks about "guarantees." I used that phrasing based on > > what I've observed to be an implicit expectation that we guarantee support; it > > could be this isn't actually a guarantee that the community is willing to make, > > so I am hoping we can discuss it and come up with the right term. > > I think it might be helpful to look at what some other projects do. Rust > has a concept of tiered support, and it requires platforms to have > maintainers who will commit to support an OS. I don't think we > necessarily need to be so formal, but if nobody's stepping up to monitor > an OS or architecture, it may break at any time and we won't be able to > consider it when deciding on features we require from the platform (such > as Rust, C versions, or POSIX versions). It took me a little time to find it, so here's the link for others: https://doc.rust-lang.org/nightly/rustc/platform-support.html I do think it's interesting that while Rust splits their tiers based on how much of it will definitely work (definitely builds+tests, definitely builds, probably builds maybe), which is different from how I sliced it by time (works on release, works on stable, works on unstable). This kind of lines up with what you mentioned next about the tests (or some subset) working, which I hadn't considered, either. > > I think it's also worth discussing what we require from a platform we're > willing to support. For example, we might require that the platform > pass the entire testsuite (ignoring irrelevant tests or tests for things > that platform doesn't use, such as Perl) or be actively pursuing an > attempt to do so. We may also want to require that an OS be actively > receiving security support so that we don't have people asking us to > carry patches for actively obsolete OSes, such as CentOS 6. Finally, > some sort of time limit may be helpful, since some Linux vendors are now > offering 15 years of support, and we really may not want to target > really ancient versions of things like libcurl. I sort of wonder how much of this is taken care of by expressing "fully supported" as "can run in GitHub Actions". Even if an LTS distro is 12 years old and using ancient curl, will GitHub still be able to run it in a VM/container? Maybe there's no such guarantee, since you can hook up self-hosted runners (which sounds more appealing if someone's got something weird enough it doesn't run well in a container). I'm not sure which of these requirements we'd want to enumerate - but does it make sense to tack it onto the end of this doc? Something like: """ Minimum Requirements ------ Even if tests or CI runners are added to guarantee support, supported platforms must: * Be compatible with C99 * Use curl vX.Y.Z or later * Use zlib vA.B.C or later ... """ > > At the same time, we do have people actively building Git on a variety > of platforms and a huge number of architectures, including most Linux > distros and the BSDs, and we will want to be cognizant that we should > avoid breaking those environments when possible, even though, say, the > porters for some of those OSes or architectures may not actively follow > the list (due to limited porters and lots of porting work). I imagine > we might say that released architectures on certain distros (Debian > comes to mind as a very portable option) might be implicitly supported. 
Are they implicitly supported, or are they explicitly supported via the GH runners? Or indirectly supported? For example, the Actions suite tests on Ubuntu; at least once upon a time Ubuntu was derived from Debian (is it still? I don't play with distros much anymore); so would that mean that running tests in Ubuntu also implies they will pass in Debian? (By the way, I think we should probably just add a BSD test runner to Actions config; we test on MacOS but that's not that closely related. It seems like it might be a pretty easy lift to do that.) > > > +Compatible on `next` > > +-------------------- > > + > > +To guarantee that `next` will work for your platform, avoiding reactive > > +debugging and fixing: > > + > > +* You should add a runner for your platform to the GitHub Actions CI suite. > > +This suite is run when any Git developer proposes a new patch, and having a > > +runner for your platform/configuration means every developer will know if they > > +break you, immediately. > > I think this is a particularly helpful approach. I understand the Linux > runners support nested virtualization, so it's possible to run tests in > a VM on a Linux runner on OSes that Actions doesn't natively support. I > do this for several of my Rust projects[0] on FreeBSD and NetBSD, for > example, and it should work on platforms that support Vagrant and run on > x86-64. > > That won't catch things like alignment problems which don't affect > x86-64, but it does catch a lot of general portability problems that are > OS-related. > > I'm in agreement with all of your suggestions, by the way, and I > appreciate you opening this discussion. > > [0] An example for the curious is muter: https://github.com/bk2204/muter. Neat :) > -- > brian m. carlson (they/them or he/him) > Toronto, Ontario, CA
> >> > +* You should run nightly tests against the `next` branch and > >> > +publish breakage reports to the mailing list immediately when they happen. > >> > +* It may make sense to automate these; if you do, make sure they > >> > +are not noisy (you don't need to send a report when everything > >> > +works, only when something breaks). > >> > +* Breakage reports should be actionable - include clear error > >> > +messages that can help developers who may not have access to test directly on > >your platform. > >> > +* You should use git-bisect and determine which commit introduced > >> > +the breakage; if you can't do this with automation, you should do > >> > +this yourself manually as soon as you notice a breakage report was sent. > >> > >> All of the above are actually applicable to any active contributors on > >> any platforms. If your group feeds custom builds of Git out of > >> "master" to your $CORP customers, you want to ensure you catch badness > >> while it is still in "next" (or better yet, before it hits "next"). > >> If your internal builds are based on "next", you'd want to ensure that > >> "next" stays clean, which means you'd need to watch "seen" (or better > >> yet, patches floating on the list before they hit "seen"). Your group > >> may build with unusual toolchain internal to your $CORP and may link > >> with specialized libraries, etc., in which case maintaining such a > >> build is almost like maintaining an exotic platform. > > > >Hits close to home ;) > > I hear that. Sometimes having an exotic platform and specialized libraries are overlapping. I am still stuck with 32-bit git because some of the available DLLs on NonStop are still only 32-bit - I'm working hard on changing that but it's not under my budget control. > > On that subject, I think it is important to have known or designated platform maintainers for the exotics. The downside is that some people expect miracles from us - I just had one request to permanently preserve timestamps of files as they were at commit time. We're into weeks of explanations on why this is a bad idea. Nonetheless, there is a certain amount of responsibility that comes with maintaining a platform, and knowing whom to ask when there are issues. The platform maintainers also can provide needed (preemptive) feedback on dependency changes. I'm not sure how to encode that in a compatible policy, however. I think it's a pretty good idea to have a contact list written down somewhere, yeah. Maybe something similarly-formatted to a MAINTAINERS file. I don't feel bad if it's just appended to the bottom of this doc til we find a better place to put it... or maybe we can put such a contact list in compat/, since someone lost trying to figure out a compatibility thing might be looking there anyway? Who else would we put on there? I can think of you for NonStop from the top of my head; that AIX breakage I dug up was reported by AEvar, but it's also a few years old; and I could imagine putting Johannes down for Windows. Maybe that's enough to start with. By the way, Randall, should I be waiting for a more complete review of this patch from you before I reroll? - Emily
On Wed, Jul 10, 2024 at 1:13 PM Junio C Hamano <gitster@pobox.com> wrote: > > Emily Shaffer <nasamuffin@google.com> writes: > > >> > +Compatible on `master` and point releases > >> > +----------------------------------------- > >> > + > >> > +To guarantee that `master` and all point releases work for your platform the > >> > +first time: > >> > >> OK, as most of the changes go to `master` before getting merged down > >> to `maint` to become part of the next maintenance release, actively > >> protecting `master` from bugs is worthwhile. What about changes > >> that do not come via the `master` branch? Should they also join the > >> security list and have an early access to the cabal material? > > > > Good question, I actually am not sure of the answer. Does that make it > > too easy for anybody to claim they maintain some random platform and > > therefore they'd like to see all the RCE howtos weeks before they are > > fixed? I guess that we already have distro packagers in security > > list/cabal, so it may not be worse exposure than that. > > Stopping at saying "You may want to ask to join the security list" > and then leave the vetting process out of the guidelines for the > contributor (i.e. out of this document) may strike a good balance. > > We will obviously be careful about whom to add to the security list, > but that does not change where people hear about the list and apply > to join. Thanks, done. > >> All of the above are actually applicable to any active contributors > >> on any platforms. > >> ... > > > > Hits close to home ;) > > > > Does this mean that this part of the document should go somewhere else > > and we should just use a pointer here? Is there a guide handy for "how > > to soft-fork Git"? > > Once we have a contributor guidelines this is a good material to > migrate there, but that would probably wait after the dust from this > document settles. Ok. For now I'll leave it as-is. > > > Maybe something like this is better? > > > > "Work closely with the developer fixing the issue; the turnaround to > > check that a proposed fix works for your platform should be fast > > enough that it doesn't hinder the developer working on that fix. If > > the turnaround is too slow, fixing the issue may miss the next release > > or the developer may lose interest in working on the fix at all." > > I think that is a good approach to take. "We will not promise to > wait for you if you are slow, and that is not limited to those who > are working on minority/niche platforms" is a good point to make. > > >> > +* If you rely on Git avoiding a specific pattern that doesn't work well with > >> > +your platform (like a certain malloc pattern), if possible, add a coccicheck > >> > +rule to ensure that pattern is not used. > >> > >> Sorry, but I do not quite follow you here. > >> > >> In general, it is a bad idea to promise that we are willing to tie > >> our hands with coccicheck to satisfy needs by exotic platforms, > >> without first having a chance to see and evaluate such needs. > >> > >> "if possible, add" -> "sometimes it may turn out to be a good idea > >> to add", perhaps? > > > > Maybe it is better to ask them to discuss it with us on-list, and that > > the result of that discussion may be that they should add some such > > test? Or, do we want to firmly say, no coccicheck restrictions based > > on platform, give us a CI runner or bust? 
I don't feel super strongly > > either way - writing this section I was trying to come up with any way > > to get on-demand ~instant (<1hr) feedback to any contributor, and this > > seemed like one someone could do. That doesn't mean we have to let > > them, if we don't like this way. > > Yes. If you want to add additional constraints on how the codebase > does things, discuss it on list first and work with us to come up > with a way to without forcing too many unnecessary constraints on > other platforms. It may result in keeping the generic codebase > pristine and free from #ifdef and having platform specific code > somewhere in compat/ but such details do not have to be spelled > out---they will be different case-by-case and we will hopefully > devise new and improved ways to deal with them. Ok. I've got a rephrasing for the next reroll; still thinking I can get that sent by tomorrow latest. Thanks! - Emily
On Wed, Jul 10, 2024 at 12:11 PM Kyle Lippincott <spectral@google.com> wrote: > > On Tue, Jul 9, 2024 at 3:50 PM Emily Shaffer <emilyshaffer@google.com> wrote: > > +* If you rely on some configuration or behavior, add a test for it. You may > > +find it easier to add a unit test ensuring the behavior you need than to add an > > +integration test; either one works. Untested behavior is subject to breakage at > > +any time. > > Should we state that we reserve the right to reject these tests if > they would put what we feel is an excessive burden on the git > developers? i.e. a requirement to use C89 would be rejected > (obviously). a requirement to support 16-bit platforms would also be > rejected. I don't know that we need to list examples for what we'd > reject, they could be implied that we're likely to accept anything > else. brian mentioned something similar in their review, to which I proposed a minimum requirements field[1]; I think this would work for that too? Thoughts? > > +** Clearly label these tests as necessary for platform compatibility. Add them > > +to an isolated compatibility-related test suite, like a new t* file or unit test > > +suite, so that they're easy to remove when compatibility is no longer required. > > +If the specific compatibility need is gated behind an issue with another > > +project, link to documentation of that issue (like a bug or email thread) to > > +make it easier to tell when that compatibility need goes away. > > I think that we likely won't have the ability to investigate whether > it's _truly_ gone away ourselves, and there's no guarantee that the > person that added these tests will be able to vet it either (maybe > they've switched jobs, for example). > > I think we should take a stance that may be considered hostile, but I > can't really think of a better one: > - there needs to be a regular (6 month? 12 month? no longer than 12 > month surely...) reaffirmation by the interested party that this is > still a requirement for them. This comes in the form of updating a > date in the codebase, not just a message on the list. If this > reaffirmation does not happen, we are allowed to assume that this is > not needed anymore and remove the test that's binding us to supporting > that. I like the idea of the date in code, because that's "polled" rather than "pushed" - when I break the test, I can go look at it, and see a condition like "Until July 1st, 2024" and say "well it's July 11th, so I don't care, deleted!" - or, more likely, I can see that date expired a year and a half ago :) > We should probably go a step further and intentionally violate > the test condition, so that any builds done by the interested parties > break immediately (which should be caught by the other processes > documented here; if they don't break, then it was correct to remove > the restriction). ...but this seems harder to keep track of. Where are we remembering these "due dates" and remembering to break them on purpose? I'm not sure that there's a good way to enforce this. > - _most_ of these restrictions should probably have a limited number > of reaffirmations? I feel like this needs to be handled on a > case-by-case basis, but I want us to be clear that just because we > accepted these restrictions in the past doesn't mean we will accept > them indefinitely. > - Just because there's a reaffirmation doesn't mean we're guaranteeing > we won't delete the test before the affirmation "expires". 
If there's > an urgent security need to do something, and it can't be done without > breaking this, we'll choose to break this test. If there's general > consensus to do something (adopt a new language standard version, for > example), there's no guarantee we'll wait until all the existing > affirmations expire. I honestly like this point more than the previous one. Limiting by a number, even one that changes, feels less flexible than allowing ourselves to say "enough of this nonsense, it's the Century of the Fruitbat, we really want to use <C11 feature> so you can get maintenance updates instead now". > > The thinking here is that this test is imposing a restriction on the > git developers that we've agreed to take on as a favor: we are going > to restrict ourselves from doing X _for the time being_ not > necessarily because it'll break you, but because it's a bad experience > for the git developers to create a patch series that lands and then > gets backed out when the breakage on $non_standard_platform is > detected on seen/next/master. 1: https://lore.kernel.org/git/CAO_smVjZ7DSPdL+KYCm2mQ=q55XbEH7Vu_jLxkAa5WTcD9rq8A@mail.gmail.com/T/#m9d7656905be500a97ee6f86b030e94c91a79507a
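[Editor's note: to make the "date in code" idea above concrete, here is a minimal sketch of what such a polled expiry date could look like as a Git-style integration test. The file name, the SUPPORT-UNTIL annotation, the placeholder lore link, and the asserted behavior are all invented for illustration; only test_description, test_expect_success, and test_done are real test-lib constructs.]

#!/bin/sh
# t/t9999-platform-nonstop-compat.sh (hypothetical file name)

test_description='platform compatibility requirements (illustrative)

SUPPORT-UNTIL: 2025-07-01
Context: https://lore.kernel.org/git/<message-id>/ (placeholder)

If this breaks after the date above and nobody has reaffirmed the
requirement on the list, deleting this file together with the
restriction it encodes is fair game.
'

. ./test-lib.sh

# Stand-in assertion: replace with whatever behavior the platform
# actually relies on (a locale assumption, an on-disk detail, ...).
test_expect_success 'behavior this platform relies on still works' '
    git version >out &&
    test -s out
'

test_done

[A developer who breaks such a test can then "poll" the date in the file, as suggested above, and either ping the platform maintainer or remove the file if the date has long passed.]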
On Thursday, July 11, 2024 2:20 PM, Emily Shaffer wrote: >> >> > +* You should run nightly tests against the `next` branch and >> >> > +publish breakage reports to the mailing list immediately when they happen. >> >> > +* It may make sense to automate these; if you do, make sure they >> >> > +are not noisy (you don't need to send a report when everything >> >> > +works, only when something breaks). >> >> > +* Breakage reports should be actionable - include clear error >> >> > +messages that can help developers who may not have access to >> >> > +test directly on >> >your platform. >> >> > +* You should use git-bisect and determine which commit >> >> > +introduced the breakage; if you can't do this with automation, >> >> > +you should do this yourself manually as soon as you notice a breakage >report was sent. >> >> >> >> All of the above are actually applicable to any active contributors >> >> on any platforms. If your group feeds custom builds of Git out of >> >> "master" to your $CORP customers, you want to ensure you catch >> >> badness while it is still in "next" (or better yet, before it hits "next"). >> >> If your internal builds are based on "next", you'd want to ensure >> >> that "next" stays clean, which means you'd need to watch "seen" (or >> >> better yet, patches floating on the list before they hit "seen"). >> >> Your group may build with unusual toolchain internal to your $CORP >> >> and may link with specialized libraries, etc., in which case >> >> maintaining such a build is almost like maintaining an exotic platform. >> > >> >Hits close to home ;) >> >> I hear that. Sometimes having an exotic platform and specialized libraries are >overlapping. I am still stuck with 32-bit git because some of the available DLLs on >NonStop are still only 32-bit - I'm working hard on changing that but it's not under >my budget control. >> >> On that subject, I think it is important to have known or designated platform >maintainers for the exotics. The downside is that some people expect miracles from >us - I just had one request to permanently preserve timestamps of files as they >were at commit time. We're into weeks of explanations on why this is a bad idea. >Nonetheless, there is a certain amount of responsibility that comes with >maintaining a platform, and knowing whom to ask when there are issues. The >platform maintainers also can provide needed (preemptive) feedback on >dependency changes. I'm not sure how to encode that in a compatible policy, >however. > >I think it's a pretty good idea to have a contact list written down somewhere, yeah. >Maybe something similarly-formatted to a MAINTAINERS file. I don't feel bad if it's >just appended to the bottom of this doc til we find a better place to put it... or >maybe we can put such a contact list in compat/, since someone lost trying to figure >out a compatibility thing might be looking there anyway? > >Who else would we put on there? I can think of you for NonStop from the top of >my head; that AIX breakage I dug up was reported by AEvar, but it's also a few years >old; and I could imagine putting Johannes down for Windows. Maybe that's enough >to start with. > >By the way, Randall, should I be waiting for a more complete review of this patch >from you before I reroll? Please reroll. --Randall
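[Editor's note: since the quoted bullets above describe a workflow (nightly builds of `next`, quiet-on-success reports, bisecting breakage), here is a rough shell sketch of what such a job could look like for a platform builder. Paths, the report address, and the use of mail(1) are placeholders; real automation would also have to cope with rewinds of `next`, flaky tests, and platforms without nproc.]

#!/bin/sh
# Hypothetical nightly job for a platform builder: build and test
# origin/next, stay silent on success, bisect and report on breakage.
# Error handling and rewinds of 'next' are deliberately glossed over.

REPO=/path/to/git                    # placeholder
REPORT_TO=platform-ci@example.com    # placeholder
STATE=$HOME/.git-nightly
mkdir -p "$STATE"
rm -f "$STATE"/*.log

cd "$REPO" || exit 1
git fetch origin next
tip=$(git rev-parse origin/next)
git checkout -q "$tip"

build_and_test () {
    make -j"$(nproc)" >"$STATE/build.log" 2>&1 &&
    make test >"$STATE/test.log" 2>&1
}

if build_and_test
then
    echo "$tip" >"$STATE/last-good"   # remember, and send no mail
    exit 0
fi

# Bisect between the last known-good tip and the broken one, if any.
if last_good=$(cat "$STATE/last-good" 2>/dev/null) && test -n "$last_good"
then
    git bisect start "$tip" "$last_good" &&
    git bisect run sh -c 'make -j"$(nproc)" && make test'
    git bisect log >"$STATE/bisect.log"
    git bisect reset
fi

{
    echo "Nightly build of 'next' broke on <platform> at $tip"
    echo
    for log in build.log test.log bisect.log
    do
        if test -f "$STATE/$log"
        then
            echo "== $log =="
            tail -n 100 "$STATE/$log"
            echo
        fi
    done
} | mail -s "git 'next' breakage on <platform>" "$REPORT_TO"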
Emily Shaffer <nasamuffin@google.com> writes: > ...but this seems harder to keep track of. Where are we remembering > these "due dates" and remembering to break them on purpose? I'm not > sure that there's a good way to enforce this. If you come up with a good way, let me know. We have a few "test balloons" in the code base and a due date that allows us to say "now it is 18 months since this test balloon was raised, and nobody complained, so we can safely assume that all compilers that matter to users of this project support this language construct just fine" would be a wonderful thing to have.
Junio C Hamano <gitster@pobox.com> writes: > Emily Shaffer <nasamuffin@google.com> writes: > >> ...but this seems harder to keep track of. Where are we remembering >> these "due dates" and remembering to break them on purpose? I'm not >> sure that there's a good way to enforce this. > > If you come up with a good way, let me know. We have a few "test > balloons" in the code base and a due date that allows us to say "now > it is 18 months since this test balloon was raised, and nobody > complained, so we can safely assume that all compilers that matter > to users of this project support this language construct just fine" > would be a wonderful thing to have. Or, just set up a calendar reminder or an entry in an issue tracker that are accessible publicly, only to ease the "keeping track of" part? As an open source public project, we would not want to depend too heavily on a single vendor's offering for anything that is essential to run the project, but as long as we make sure that: - the authoritative due date is in the commit log message that introduces something, e.g., "we add this limitation, which we will revisit and reconsider in 6 months", but - if any hosting, cloud, or any other service providers, be they commercial or run by volunteers, offer a way for us to make it easier to keep track of that authoritative due date established above, without additional strings attached, we may freely take it (in other words, we are not free-software-socialists against commercial services). then the "reminder" part would not be something essential to run the project, so ...
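[Editor's note: one low-tech way to connect the authoritative due date in the code with the "reminder" part would be an in-tree annotation plus a small script that anyone, or a scheduled CI job, can run. The COMPAT-UNTIL spelling below is invented for illustration and is not an existing Git convention.]

#!/bin/sh
# List (hypothetical) COMPAT-UNTIL annotations whose date has passed,
# e.g. a comment such as
#     /* COMPAT-UNTIL: 2025-07-01  <lore link explaining why> */
# next to the test or #ifdef that encodes a platform restriction.

today=$(date +%Y%m%d)

git grep -n "COMPAT-UNTIL:" -- '*.c' '*.h' '*.sh' |
while IFS= read -r hit
do
    due=$(printf '%s\n' "$hit" |
        sed -n 's/.*COMPAT-UNTIL: *\([0-9]\{4\}\)-\([0-9]\{2\}\)-\([0-9]\{2\}\).*/\1\2\3/p')
    test -n "$due" || continue
    if test "$due" -le "$today"
    then
        printf 'expired: %s\n' "$hit"
    fi
done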
On Thu, Jul 11, 2024 at 11:15 AM Emily Shaffer <nasamuffin@google.com> wrote: > > On Tue, Jul 9, 2024 at 4:16 PM brian m. carlson > <sandals@crustytoothpaste.net> wrote: > > > > On 2024-07-09 at 22:50:42, Emily Shaffer wrote: > > > Right now, this doc talks about "guarantees." I used that phrasing based on > > > what I've observed to be an implicit expectation that we guarantee support; it > > > could be this isn't actually a guarantee that the community is willing to make, > > > so I am hoping we can discuss it and come up with the right term. > > > > I think it might be helpful to look at what some other projects do. Rust > > has a concept of tiered support, and it requires platforms to have > > maintainers who will commit to support an OS. I don't think we > > necessarily need to be so formal, but if nobody's stepping up to monitor > > an OS or architecture, it may break at any time and we won't be able to > > consider it when deciding on features we require from the platform (such > > as Rust, C versions, or POSIX versions). > > It took me a little time to find it, so here's the link for others: > https://doc.rust-lang.org/nightly/rustc/platform-support.html > > I do think it's interesting that while Rust splits their tiers based > on how much of it will definitely work (definitely builds+tests, > definitely builds, probably builds maybe), which is different from how > I sliced it by time (works on release, works on stable, works on > unstable). This kind of lines up with what you mentioned next about > the tests (or some subset) working, which I hadn't considered, either. > > > > > I think it's also worth discussing what we require from a platform we're > > willing to support. For example, we might require that the platform > > pass the entire testsuite (ignoring irrelevant tests or tests for things > > that platform doesn't use, such as Perl) or be actively pursuing an > > attempt to do so. We may also want to require that an OS be actively > > receiving security support so that we don't have people asking us to > > carry patches for actively obsolete OSes, such as CentOS 6. Finally, > > some sort of time limit may be helpful, since some Linux vendors are now > > offering 15 years of support, and we really may not want to target > > really ancient versions of things like libcurl. > > I sort of wonder how much of this is taken care of by expressing > "fully supported" as "can run in GitHub Actions". Even if an LTS > distro is 12 years old and using ancient curl, will GitHub still be > able to run it in a VM/container? Maybe there's no such guarantee, > since you can hook up self-hosted runners (which sounds more appealing > if someone's got something weird enough it doesn't run well in a > container). > > I'm not sure which of these requirements we'd want to enumerate - but > does it make sense to tack it onto the end of this doc? Something > like: > > """ > Minimum Requirements > ------ > > Even if tests or CI runners are added to guarantee support, supported > platforms must: > > * Be compatible with C99 > * Use curl vX.Y.Z or later > * Use zlib vA.B.C or later > ... > """ My concern with actually listing what the minimum requirements are is that we then need a process for raising the minimum requirements. For C specification, I can see that rightfully being an involved conversation and achieving consensus that this is the right time to do it. 
For things like library versions, I'm less comfortable with it because if we have to raise the minimum bar for some urgent reason, there's the potential for additional friction with these platforms claiming that we stated we'd support them (ex: we say you need to be able to use libfoo v3.x.x (v4.x.x had some breaking changes, but coexists with v3, so we just stuck with v3), and some security fix that we need to receive only exists on the v4 version, so now we need to port to using v4 so that we get the security fix). I think it's probably fine to list minimum requirements, as long as we have something conveying "and possibly other criteria". I don't want this interpreted as a "do this, and we will try hard to not break you", it should be interpreted as "if you can't do at least this, we won't even look at patches/tests/CI to unbreak you/keep you unbroken" > > > > > At the same time, we do have people actively building Git on a variety > > of platforms and a huge number of architectures, including most Linux > > distros and the BSDs, and we will want to be cognizant that we should > > avoid breaking those environments when possible, even though, say, the > > porters for some of those OSes or architectures may not actively follow > > the list (due to limited porters and lots of porting work). I imagine > > we might say that released architectures on certain distros (Debian > > comes to mind as a very portable option) might be implicitly supported. > > Are they implicitly supported, or are they explicitly supported via > the GH runners? Or indirectly supported? For example, the Actions > suite tests on Ubuntu; at least once upon a time Ubuntu was derived > from Debian (is it still? I don't play with distros much anymore); so > would that mean that running tests in Ubuntu also implies they will > pass in Debian? > > (By the way, I think we should probably just add a BSD test runner to > Actions config; we test on MacOS but that's not that closely related. > It seems like it might be a pretty easy lift to do that.) > > > > > > +Compatible on `next` > > > +-------------------- > > > + > > > +To guarantee that `next` will work for your platform, avoiding reactive > > > +debugging and fixing: > > > + > > > +* You should add a runner for your platform to the GitHub Actions CI suite. > > > +This suite is run when any Git developer proposes a new patch, and having a > > > +runner for your platform/configuration means every developer will know if they > > > +break you, immediately. > > > > I think this is a particularly helpful approach. I understand the Linux > > runners support nested virtualization, so it's possible to run tests in > > a VM on a Linux runner on OSes that Actions doesn't natively support. I > > do this for several of my Rust projects[0] on FreeBSD and NetBSD, for > > example, and it should work on platforms that support Vagrant and run on > > x86-64. > > > > That won't catch things like alignment problems which don't affect > > x86-64, but it does catch a lot of general portability problems that are > > OS-related. > > > > I'm in agreement with all of your suggestions, by the way, and I > > appreciate you opening this discussion. > > > > [0] An example for the curious is muter: https://github.com/bk2204/muter. > > Neat :) > > > -- > > brian m. carlson (they/them or he/him) > > Toronto, Ontario, CA >
On Thursday, July 11, 2024 4:13 PM, Kyle Lippincott wrote: >On Thu, Jul 11, 2024 at 11:15 AM Emily Shaffer <nasamuffin@google.com> wrote: >> >> On Tue, Jul 9, 2024 at 4:16 PM brian m. carlson >> <sandals@crustytoothpaste.net> wrote: >> > >> > On 2024-07-09 at 22:50:42, Emily Shaffer wrote: >> > > Right now, this doc talks about "guarantees." I used that phrasing >> > > based on what I've observed to be an implicit expectation that we >> > > guarantee support; it could be this isn't actually a guarantee >> > > that the community is willing to make, so I am hoping we can discuss it and >come up with the right term. >> > >> > I think it might be helpful to look at what some other projects do. >> > Rust has a concept of tiered support, and it requires platforms to >> > have maintainers who will commit to support an OS. I don't think we >> > necessarily need to be so formal, but if nobody's stepping up to >> > monitor an OS or architecture, it may break at any time and we won't >> > be able to consider it when deciding on features we require from the >> > platform (such as Rust, C versions, or POSIX versions). >> >> It took me a little time to find it, so here's the link for others: >> https://doc.rust-lang.org/nightly/rustc/platform-support.html >> >> I do think it's interesting that while Rust splits their tiers based >> on how much of it will definitely work (definitely builds+tests, >> definitely builds, probably builds maybe), which is different from how >> I sliced it by time (works on release, works on stable, works on >> unstable). This kind of lines up with what you mentioned next about >> the tests (or some subset) working, which I hadn't considered, either. >> >> > >> > I think it's also worth discussing what we require from a platform >> > we're willing to support. For example, we might require that the >> > platform pass the entire testsuite (ignoring irrelevant tests or >> > tests for things that platform doesn't use, such as Perl) or be >> > actively pursuing an attempt to do so. We may also want to require >> > that an OS be actively receiving security support so that we don't >> > have people asking us to carry patches for actively obsolete OSes, >> > such as CentOS 6. Finally, some sort of time limit may be helpful, >> > since some Linux vendors are now offering 15 years of support, and >> > we really may not want to target really ancient versions of things like libcurl. >> >> I sort of wonder how much of this is taken care of by expressing >> "fully supported" as "can run in GitHub Actions". Even if an LTS >> distro is 12 years old and using ancient curl, will GitHub still be >> able to run it in a VM/container? Maybe there's no such guarantee, >> since you can hook up self-hosted runners (which sounds more appealing >> if someone's got something weird enough it doesn't run well in a >> container). >> >> I'm not sure which of these requirements we'd want to enumerate - but >> does it make sense to tack it onto the end of this doc? Something >> like: >> >> """ >> Minimum Requirements >> ------ >> >> Even if tests or CI runners are added to guarantee support, supported >> platforms must: >> >> * Be compatible with C99 >> * Use curl vX.Y.Z or later >> * Use zlib vA.B.C or later >> ... >> """ > >My concern with actually listing what the minimum requirements are is that we >then need a process for raising the minimum requirements. For C specification, I can >see that rightfully being an involved conversation and achieving consensus that this >is the right time to do it. 
For things like library versions, I'm less comfortable with it >because if we have to raise the minimum bar for some urgent reason, there's the >potential for additional friction with these platforms claiming that we stated we'd >support them (ex: we say you need to be able to use libfoo v3.x.x (v4.x.x had some >breaking changes, but coexists with v3, so we just stuck with v3), and some security >fix that we need to receive only exists on the v4 version, so now we need to port to >using v4 so that we get the security fix). > >I think it's probably fine to list minimum requirements, as long as we have >something conveying "and possibly other criteria". I don't want this interpreted as a >"do this, and we will try hard to not break you", it should be interpreted as "if you >can't do at least this, we won't even look at patches/tests/CI to unbreak you/keep >you unbroken" I have similar concerns both in terms of compilers and libraries. I am maintaining two exotic platforms, NonStop x86 and ia64. The ia64 is going off support "soon". In x86, we have more recent tooling, but not everything. It is at an older POSIX level. However, getting libraries built is either going hat in hand to HPE asking for updates on libraries or porting them myself. This is hard because gcc is not available, nor is likely to ever be - so anything with gcc in the mandatory toolchain is out of the question. For ia64, it doesn't matter much, and freezing on 2.49, or slightly later than 2.50 is not a horrible decision. I do have to keep x86 alive for the foreseeable future. My only real time limit is getting a 64-bit time_t (not available yet) in a 64-bit git build (also not possible yet) - which really depends on 64-bit versions of dependencies from the vendor, which is not easy to make happen. So, time limits are a definite concern and policies on those are important to quantify. Fortunately, I am monitoring and building git frequently looking for trouble. I think that part is important for any exotic maintainer and probably should be officially quantified/sanctioned. > >> >> > >> > At the same time, we do have people actively building Git on a >> > variety of platforms and a huge number of architectures, including >> > most Linux distros and the BSDs, and we will want to be cognizant >> > that we should avoid breaking those environments when possible, even >> > though, say, the porters for some of those OSes or architectures may >> > not actively follow the list (due to limited porters and lots of >> > porting work). I imagine we might say that released architectures >> > on certain distros (Debian comes to mind as a very portable option) might be >implicitly supported. >> >> Are they implicitly supported, or are they explicitly supported via >> the GH runners? Or indirectly supported? For example, the Actions >> suite tests on Ubuntu; at least once upon a time Ubuntu was derived >> from Debian (is it still? I don't play with distros much anymore); so >> would that mean that running tests in Ubuntu also implies they will >> pass in Debian? >> >> (By the way, I think we should probably just add a BSD test runner to >> Actions config; we test on MacOS but that's not that closely related. >> It seems like it might be a pretty easy lift to do that.) >> >> > >> > > +Compatible on `next` >> > > +-------------------- >> > > + >> > > +To guarantee that `next` will work for your platform, avoiding >> > > +reactive debugging and fixing: >> > > + >> > > +* You should add a runner for your platform to the GitHub Actions CI suite. 
>> > > +This suite is run when any Git developer proposes a new patch, >> > > +and having a runner for your platform/configuration means every >> > > +developer will know if they break you, immediately. >> > >> > I think this is a particularly helpful approach. I understand the >> > Linux runners support nested virtualization, so it's possible to run >> > tests in a VM on a Linux runner on OSes that Actions doesn't >> > natively support. I do this for several of my Rust projects[0] on >> > FreeBSD and NetBSD, for example, and it should work on platforms >> > that support Vagrant and run on x86-64. >> > >> > That won't catch things like alignment problems which don't affect >> > x86-64, but it does catch a lot of general portability problems that >> > are OS-related. >> > >> > I'm in agreement with all of your suggestions, by the way, and I >> > appreciate you opening this discussion. >> > >> > [0] An example for the curious is muter: https://github.com/bk2204/muter. >> >> Neat :) >> >> > -- >> > brian m. carlson (they/them or he/him) Toronto, Ontario, CA >>
On Thu, Jul 11, 2024 at 11:37 AM Emily Shaffer <nasamuffin@google.com> wrote: > > On Wed, Jul 10, 2024 at 12:11 PM Kyle Lippincott <spectral@google.com> wrote: > > > > On Tue, Jul 9, 2024 at 3:50 PM Emily Shaffer <emilyshaffer@google.com> wrote: > > > +* If you rely on some configuration or behavior, add a test for it. You may > > > +find it easier to add a unit test ensuring the behavior you need than to add an > > > +integration test; either one works. Untested behavior is subject to breakage at > > > +any time. > > > > Should we state that we reserve the right to reject these tests if > > they would put what we feel is an excessive burden on the git > > developers? i.e. a requirement to use C89 would be rejected > > (obviously). a requirement to support 16-bit platforms would also be > > rejected. I don't know that we need to list examples for what we'd > > reject, they could be implied that we're likely to accept anything > > else. > > brian mentioned something similar in their review, to which I proposed > a minimum requirements field[1]; I think this would work for that too? > Thoughts? > > > > +** Clearly label these tests as necessary for platform compatibility. Add them > > > +to an isolated compatibility-related test suite, like a new t* file or unit test > > > +suite, so that they're easy to remove when compatibility is no longer required. > > > +If the specific compatibility need is gated behind an issue with another > > > +project, link to documentation of that issue (like a bug or email thread) to > > > +make it easier to tell when that compatibility need goes away. > > > > I think that we likely won't have the ability to investigate whether > > it's _truly_ gone away ourselves, and there's no guarantee that the > > person that added these tests will be able to vet it either (maybe > > they've switched jobs, for example). > > > > I think we should take a stance that may be considered hostile, but I > > can't really think of a better one: > > - there needs to be a regular (6 month? 12 month? no longer than 12 > > month surely...) reaffirmation by the interested party that this is > > still a requirement for them. This comes in the form of updating a > > date in the codebase, not just a message on the list. If this > > reaffirmation does not happen, we are allowed to assume that this is > > not needed anymore and remove the test that's binding us to supporting > > that. > > I like the idea of the date in code, because that's "polled" rather > than "pushed" - when I break the test, I can go look at it, and see a > condition like "Until July 1st, 2024" and say "well it's July 11th, so > I don't care, deleted!" - or, more likely, I can see that date expired > a year and a half ago :) > > > We should probably go a step further and intentionally violate > > the test condition, so that any builds done by the interested parties > > break immediately (which should be caught by the other processes > > documented here; if they don't break, then it was correct to remove > > the restriction). > > ...but this seems harder to keep track of. Where are we remembering > these "due dates" and remembering to break them on purpose? I'm not > sure that there's a good way to enforce this. Ah, I was unclear - I was saying when we remove the test, we should intentionally add a use of the thing the test was preventing us from using. So if there's a test saying "you can't use printf()", when removing that test, we should add a use of printf in the same series. 
I was intentionally being vague about "when" we should remove these tests, and how long we should consider the newly added printf provisional (and possibly going to be rolled back), like in our current test balloons. Just: if you're removing an expired platform-support-policy test, add something to the codebase (possibly just another test) that exercises the previously forbidden aspect, if possible. This is likely to happen organically: most people aren't going to notice an expired platform-support-policy test and remove it just because. They'll trip over it while attempting to work on a feature, notice it's expired, and remove it as part of the series that makes use of it. But if someone is trying to be helpful and clean up expired platform-support-policy tests without having the goal of immediately using it, they should be encouraged to invert the test so that we don't have potentially months of time difference between "implicitly allowed again [since there's no test preventing its use]" and the actual usage being added. > > > - _most_ of these restrictions should probably have a limited number > > of reaffirmations? I feel like this needs to be handled on a > > case-by-case basis, but I want us to be clear that just because we > > accepted these restrictions in the past doesn't mean we will accept > > them indefinitely. > > - Just because there's a reaffirmation doesn't mean we're guaranteeing > > we won't delete the test before the affirmation "expires". If there's > > an urgent security need to do something, and it can't be done without > > breaking this, we'll choose to break this test. If there's general > > consensus to do something (adopt a new language standard version, for > > example), there's no guarantee we'll wait until all the existing > > affirmations expire. > > I honestly like this point more than the previous one. Limiting by a > number, even one that changes, feels less flexible than allowing > ourselves to say "enough of this nonsense, it's the Century of the > Fruitbat, we really want to use <C11 feature> so you can get > maintenance updates instead now". Yeah, that's fair. Let's withdraw my suggestion of a limited number of reaffirmations. :) That works well if the goal is to have a temporary restriction while work is actively happening to remove its necessity, but that's less likely to be the case here. If we can't move to C42 because it's not supported on Linux Distro 2050™ yet, users of Linux Distro 2050™ aren't able to promise eventual support of C42 or accelerate that happening. > > > > > The thinking here is that this test is imposing a restriction on the > > git developers that we've agreed to take on as a favor: we are going > > to restrict ourselves from doing X _for the time being_ not > > necessarily because it'll break you, but because it's a bad experience > > for the git developers to create a patch series that lands and then > > gets backed out when the breakage on $non_standard_platform is > > detected on seen/next/master. > > 1: https://lore.kernel.org/git/CAO_smVjZ7DSPdL+KYCm2mQ=q55XbEH7Vu_jLxkAa5WTcD9rq8A@mail.gmail.com/T/#m9d7656905be500a97ee6f86b030e94c91a79507a
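[Editor's note: tying those two points together, retiring an expired compatibility test could look roughly like this, reusing the hypothetical file name and annotation from earlier in the thread.]

# Once the (hypothetical) annotation has expired and nobody reaffirmed
# it, retire the test...
git rm t/t9999-platform-nonstop-compat.sh
git commit -m "t9999: retire expired platform compatibility test"

# ...and, in the same series, add a patch that actually exercises the
# previously forbidden construct (or a test asserting it is now fine),
# so affected platforms notice right away instead of months later.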
On Thu, Jul 11, 2024 at 1:12 PM Kyle Lippincott <spectral@google.com> wrote: > > On Thu, Jul 11, 2024 at 11:15 AM Emily Shaffer <nasamuffin@google.com> wrote: > > > > On Tue, Jul 9, 2024 at 4:16 PM brian m. carlson > > > I think it's also worth discussing what we require from a platform we're > > > willing to support. For example, we might require that the platform > > > pass the entire testsuite (ignoring irrelevant tests or tests for things > > > that platform doesn't use, such as Perl) or be actively pursuing an > > > attempt to do so. We may also want to require that an OS be actively > > > receiving security support so that we don't have people asking us to > > > carry patches for actively obsolete OSes, such as CentOS 6. Finally, > > > some sort of time limit may be helpful, since some Linux vendors are now > > > offering 15 years of support, and we really may not want to target > > > really ancient versions of things like libcurl. > > > > I sort of wonder how much of this is taken care of by expressing > > "fully supported" as "can run in GitHub Actions". Even if an LTS > > distro is 12 years old and using ancient curl, will GitHub still be > > able to run it in a VM/container? Maybe there's no such guarantee, > > since you can hook up self-hosted runners (which sounds more appealing > > if someone's got something weird enough it doesn't run well in a > > container). > > > > I'm not sure which of these requirements we'd want to enumerate - but > > does it make sense to tack it onto the end of this doc? Something > > like: > > > > """ > > Minimum Requirements > > ------ > > > > Even if tests or CI runners are added to guarantee support, supported > > platforms must: > > > > * Be compatible with C99 > > * Use curl vX.Y.Z or later > > * Use zlib vA.B.C or later > > ... > > """ > > My concern with actually listing what the minimum requirements are is > that we then need a process for raising the minimum requirements. For > C specification, I can see that rightfully being an involved > conversation and achieving consensus that this is the right time to do > it. For things like library versions, I'm less comfortable with it > because if we have to raise the minimum bar for some urgent reason, > there's the potential for additional friction with these platforms > claiming that we stated we'd support them (ex: we say you need to be > able to use libfoo v3.x.x (v4.x.x had some breaking changes, but > coexists with v3, so we just stuck with v3), and some security fix > that we need to receive only exists on the v4 version, so now we need > to port to using v4 so that we get the security fix). > > I think it's probably fine to list minimum requirements, as long as we > have something conveying "and possibly other criteria". I don't want > this interpreted as a "do this, and we will try hard to not break > you", it should be interpreted as "if you can't do at least this, we > won't even look at patches/tests/CI to unbreak you/keep you unbroken" Yeah, I agree that I'm somewhat uninterested in enumerating a set of deps/criteria that will then magically assure it works - that does sound like a hassle to keep up-to-date, and in fact I think would set us backwards from the goal of this doc. We'd feel tied to that criteria without any ability to move forward, which would probably be our current status quo or worse. So, I'll make it clear to reflect the second thing you're saying in the reroll. - Emily
On 2024-07-11 at 18:14:36, Emily Shaffer wrote: > On Tue, Jul 9, 2024 at 4:16 PM brian m. carlson > <sandals@crustytoothpaste.net> wrote: > > > > I think it's also worth discussing what we require from a platform we're > > willing to support. For example, we might require that the platform > > pass the entire testsuite (ignoring irrelevant tests or tests for things > > that platform doesn't use, such as Perl) or be actively pursuing an > > attempt to do so. We may also want to require that an OS be actively > > receiving security support so that we don't have people asking us to > > carry patches for actively obsolete OSes, such as CentOS 6. Finally, > > some sort of time limit may be helpful, since some Linux vendors are now > > offering 15 years of support, and we really may not want to target > > really ancient versions of things like libcurl. > > I sort of wonder how much of this is taken care of by expressing > "fully supported" as "can run in GitHub Actions". Even if an LTS > distro is 12 years old and using ancient curl, will GitHub still be > able to run it in a VM/container? Maybe there's no such guarantee, > since you can hook up self-hosted runners (which sounds more appealing > if someone's got something weird enough it doesn't run well in a > container). Some older OSes require kernel features that aren't compiled in by default, so containers are out. For example, CentOS 6 won't run on a modern kernel because it lacks whatever the predecessor to the vDSO was (which can be recompiled into the kernel, but nobody does that). We also don't really want to be on the hook for trying to support OSes that get no security support. We don't want to be running an OS connected to the Internet that has known root security holes, even in a CI environment, so I think _someone_ should be providing security support for it to their customers or the public free of charge. I also want to let us use new features from time to time that may not be able to be conditionally compiled in (such as new Perl features), so I think maybe setting a hard limit on a supported age of software might be a good idea. If we want to have good support for LTS Linux distros, we could set it at 10 years initially. Also, if we do decide to adopt Rust in the future, we'll need to consider a different lifetime policy for that. I try to support an old Debian release for a year after the new one comes out, which is about three years for a compiler version, but anything older that that becomes a real bear to support because most dependencies won't support older versions. Since we're not using Rust now, we don't have to consider that at the moment, but it's a thing to think about when we're discussing policy since different language communities have different expectations. We might find that we need different policies for, say, Perl than we do for C. > """ > Minimum Requirements > ------ > > Even if tests or CI runners are added to guarantee support, supported > platforms must: > > * Be compatible with C99 > * Use curl vX.Y.Z or later > * Use zlib vA.B.C or later > ... > """ I think to start off, we could say that it has to have C99 or C11, that dependencies must have been released upstream in the past 10 years, and that the platform and dependencies must have security support. I mention C99 or C11 because Windows _does_ have C11 but not C99; they are mostly compatible, but not entirely. FreeBSD also _requires_ C11 for its header files and won't compile with C99. 
I think the difference is small enough that we can paper over it ourselves with little difficulty, though. > > At the same time, we do have people actively building Git on a variety > > of platforms and a huge number of architectures, including most Linux > > distros and the BSDs, and we will want to be cognizant that we should > > avoid breaking those environments when possible, even though, say, the > > porters for some of those OSes or architectures may not actively follow > > the list (due to limited porters and lots of porting work). I imagine > > we might say that released architectures on certain distros (Debian > > comes to mind as a very portable option) might be implicitly supported. > > Are they implicitly supported, or are they explicitly supported via > the GH runners? Or indirectly supported? For example, the Actions > suite tests on Ubuntu; at least once upon a time Ubuntu was derived > from Debian (is it still? I don't play with distros much anymore); so > would that mean that running tests in Ubuntu also implies they will > pass in Debian? Ubuntu is still derived from Debian. It is likely that things which work in one will work in another, but not guaranteed. I mention Debian is because it has a large variety of supported architectures. I absolutely don't want to say, "Oh, you have MIPS hardware, too bad if Git doesn't work for you." (I assure you the distro maintainers will be displeased if we break Git on less common architectures, as will I.) In fact, MIPS is an architecture that requires natural alignment and can be big-endian, so it's very useful in helping us find places we wrote invalid or unportable C. The reason I'm very hesitant to require that we run everything in GitHub Actions because it only supports two architectures. ARM64 and RISC-V are really popular, and I can tell you that running even a Linux container in emulation is really quite slow. I do it for my projects, but Git LFS only builds one set of non-x86 packages (the latest Debian) because emulation is far too slow to build the normal suite of five or six packages. And it's actually much easier to run Linux binaries in emulation in a container because QEMU supports that natively. Most other OSes must run a full OS in emulation, which is slower. There also aren't usually pre-made images for that; I tend to use Vagrant, which has pre-built images for x86-64, but I don't really want to be bootstrapping OS images other architectures in CI. So I'd say that we might want to consider supporting all release architectures on specific OSes. I think Debian is one of the more portable Linux distros (more portable than Ubuntu). Debian has also abandoned some architectures that you can't buy anymore, like Itanium, so that might be a reasonable limit on how far we're willing to go. > (By the way, I think we should probably just add a BSD test runner to > Actions config; we test on MacOS but that's not that closely related. > It seems like it might be a pretty easy lift to do that.) FreeBSD and NetBSD are pretty easy. I won't 100% commit to it, but if I find time I'll try to send a patch.
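[Editor's note: for the "add a BSD runner" idea, a rough sketch of the nested-virtualization approach described above might look like the following, run from a Linux CI machine. The Vagrant box name, the provider, and the FreeBSD package list are assumptions rather than a tested recipe, and the embedded Vagrantfile is only a minimal example.]

#!/bin/sh
# Rough sketch: run Git's test suite inside a FreeBSD VM from a Linux
# CI machine with nested virtualization.
set -e
cd /path/to/git    # worktree to test (placeholder)

# Minimal Vagrantfile: boot a FreeBSD box and rsync the worktree in.
cat >Vagrantfile <<'EOF'
Vagrant.configure("2") do |config|
  config.vm.box = "generic/freebsd14"
  config.vm.synced_folder ".", "/vagrant", type: "rsync"
end
EOF

vagrant up --provider=libvirt    # or virtualbox, depending on the host
trap 'vagrant destroy -f' EXIT

# Approximate build prerequisites inside the guest.
vagrant ssh -c 'sudo pkg install -y gmake perl5 curl gettext'

# Build and test the synced copy with GNU make.
vagrant ssh -c 'cd /vagrant && gmake -j2 && gmake test'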
On Thu, Jul 11, 2024 at 3:24 PM brian m. carlson <sandals@crustytoothpaste.net> wrote: > > On 2024-07-11 at 18:14:36, Emily Shaffer wrote: > > On Tue, Jul 9, 2024 at 4:16 PM brian m. carlson > > <sandals@crustytoothpaste.net> wrote: > > > > > > I think it's also worth discussing what we require from a platform we're > > > willing to support. For example, we might require that the platform > > > pass the entire testsuite (ignoring irrelevant tests or tests for things > > > that platform doesn't use, such as Perl) or be actively pursuing an > > > attempt to do so. We may also want to require that an OS be actively > > > receiving security support so that we don't have people asking us to > > > carry patches for actively obsolete OSes, such as CentOS 6. Finally, > > > some sort of time limit may be helpful, since some Linux vendors are now > > > offering 15 years of support, and we really may not want to target > > > really ancient versions of things like libcurl. > > > > I sort of wonder how much of this is taken care of by expressing > > "fully supported" as "can run in GitHub Actions". Even if an LTS > > distro is 12 years old and using ancient curl, will GitHub still be > > able to run it in a VM/container? Maybe there's no such guarantee, > > since you can hook up self-hosted runners (which sounds more appealing > > if someone's got something weird enough it doesn't run well in a > > container). > > Some older OSes require kernel features that aren't compiled in by > default, so containers are out. For example, CentOS 6 won't run on a > modern kernel because it lacks whatever the predecessor to the vDSO was > (which can be recompiled into the kernel, but nobody does that). Is this hinting that we should mention a minimum kernel version for Linux-kernel-based OSes? > > We also don't really want to be on the hook for trying to support OSes > that get no security support. We don't want to be running an OS > connected to the Internet that has known root security holes, even in a > CI environment, so I think _someone_ should be providing security > support for it to their customers or the public free of charge. Agreed - I added this to the minimum requirement section like you suggest. > > I also want to let us use new features from time to time that may not be > able to be conditionally compiled in (such as new Perl features), so I > think maybe setting a hard limit on a supported age of software might be > a good idea. If we want to have good support for LTS Linux distros, we > could set it at 10 years initially. Sure, I added this to the minimum requirements for now. > > Also, if we do decide to adopt Rust in the future, we'll need to > consider a different lifetime policy for that. I try to support an old > Debian release for a year after the new one comes out, which is about > three years for a compiler version, but anything older that that becomes > a real bear to support because most dependencies won't support older > versions. Since we're not using Rust now, we don't have to consider > that at the moment, but it's a thing to think about when we're > discussing policy since different language communities have different > expectations. We might find that we need different policies for, say, > Perl than we do for C. Yes, definitely, the Rust bits will be a whole different conversation. I do think it's gated behind this - is it ok for us to even add Rust, given our current support story? - so I'm happy to punt on it for now, frankly. 
> > > """ > > Minimum Requirements > > ------ > > > > Even if tests or CI runners are added to guarantee support, supported > > platforms must: > > > > * Be compatible with C99 > > * Use curl vX.Y.Z or later > > * Use zlib vA.B.C or later > > ... > > """ > > I think to start off, we could say that it has to have C99 or C11, that > dependencies must have been released upstream in the past 10 years, and > that the platform and dependencies must have security support. > > I mention C99 or C11 because Windows _does_ have C11 but not C99; they > are mostly compatible, but not entirely. FreeBSD also _requires_ C11 > for its header files and won't compile with C99. I think the difference > is small enough that we can paper over it ourselves with little > difficulty, though. Added both for reroll. Thanks for the thoughtful suggestions :) > > > > At the same time, we do have people actively building Git on a variety > > > of platforms and a huge number of architectures, including most Linux > > > distros and the BSDs, and we will want to be cognizant that we should > > > avoid breaking those environments when possible, even though, say, the > > > porters for some of those OSes or architectures may not actively follow > > > the list (due to limited porters and lots of porting work). I imagine > > > we might say that released architectures on certain distros (Debian > > > comes to mind as a very portable option) might be implicitly supported. > > > > Are they implicitly supported, or are they explicitly supported via > > the GH runners? Or indirectly supported? For example, the Actions > > suite tests on Ubuntu; at least once upon a time Ubuntu was derived > > from Debian (is it still? I don't play with distros much anymore); so > > would that mean that running tests in Ubuntu also implies they will > > pass in Debian? > > Ubuntu is still derived from Debian. It is likely that things which > work in one will work in another, but not guaranteed. > > I mention Debian is because it has a large variety of supported > architectures. I absolutely don't want to say, "Oh, you have MIPS > hardware, too bad if Git doesn't work for you." (I assure you the > distro maintainers will be displeased if we break Git on less common > architectures, as will I.) In fact, MIPS is an architecture that > requires natural alignment and can be big-endian, so it's very useful in > helping us find places we wrote invalid or unportable C. > > The reason I'm very hesitant to require that we run everything in GitHub > Actions because it only supports two architectures. ARM64 and RISC-V > are really popular, and I can tell you that running even a Linux > container in emulation is really quite slow. I do it for my projects, > but Git LFS only builds one set of non-x86 packages (the latest Debian) > because emulation is far too slow to build the normal suite of five or > six packages. Does that restriction apply to just GitHub-hosted runners, though? Would it be possible for an interested party to set up self-hosted runners (configured via GH Actions) that are using AMD or POWER or whatever? (For example, I think it would be quite feasible for Google to donate some compute for this, though no promises). The appeal is not "because GitHub Actions are great!" for me - the appeal is "because most Git developers run the CI this way if they remember to or use GGG, and Junio runs it on `seen` all the time". 
If there's some other recommendation for self-service CI runs that don't need some careful reminding or secret knowledge to run, I'm happy with that too. (For example, if someone wants to set up some bot that looks for new [PATCH]-shaped emails, applies, builds, runs tests, and mails test results to the author after run, that would fit into the spirit of this point, although that sounds like a lot of setup to me.) > > And it's actually much easier to run Linux binaries in emulation in a > container because QEMU supports that natively. Most other OSes must run > a full OS in emulation, which is slower. There also aren't usually > pre-made images for that; I tend to use Vagrant, which has pre-built > images for x86-64, but I don't really want to be bootstrapping OS images > other architectures in CI. > > So I'd say that we might want to consider supporting all release > architectures on specific OSes. I think Debian is one of the more > portable Linux distros (more portable than Ubuntu). Debian has also > abandoned some architectures that you can't buy anymore, like Itanium, > so that might be a reasonable limit on how far we're willing to go. Sounds neat - I wonder if this is something I should start asking around about within Google and find out if we can contribute. It might be feasible. Based on this mail, I'll try to reword the bit about "add a GitHub runner" to make it a little less GitHub-prescriptive. Should have a reroll in the next 30min, it was ready to go and then I got this mail :) Thanks! - Emily
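[Editor's note: for the "[PATCH]-shaped emails" bot idea, the core apply-build-test step could look something like the sketch below, leaving discovery of new series (list subscription, lore.kernel.org feeds, and so on) out of scope. It assumes b4(1) and mail(1) are available; addresses, URLs, and paths are placeholders, and the choice of report recipient is cruder than a real bot would want.]

#!/bin/sh
# Hypothetical apply-build-test step for a patch bot; takes a message-id.

msgid=$1
workdir=$(mktemp -d)
git clone --quiet https://git.kernel.org/pub/scm/git/git.git "$workdir/git"
cd "$workdir/git" || exit 1

# b4 shazam fetches the thread from lore and applies it to the checkout.
if ! b4 shazam "$msgid" >apply.log 2>&1
then
    status="failed to apply"
elif make -j"$(nproc)" >build.log 2>&1 && make test >test.log 2>&1
then
    status="build and tests passed"
else
    status="build or tests FAILED"
fi

# Crude recipient choice: author of the topmost applied (or base) commit.
author=$(git log -1 --format='%ae')

{
    echo "Automated <platform> result for $msgid: $status"
    echo
    tail -n 50 test.log build.log apply.log 2>/dev/null
} | mail -s "[auto] <platform> result: $status" "$author"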
On 2024-07-11 at 23:15:35, Emily Shaffer wrote: > On Thu, Jul 11, 2024 at 3:24 PM brian m. carlson > <sandals@crustytoothpaste.net> wrote: > > Some older OSes require kernel features that aren't compiled in by > > default, so containers are out. For example, CentOS 6 won't run on a > > modern kernel because it lacks whatever the predecessor to the vDSO was > > (which can be recompiled into the kernel, but nobody does that). > > Is this hinting that we should mention a minimum kernel version for > Linux-kernel-based OSes? This is actually a feature that still exists in the kernel and could be enabled for newer kernels, but because distros don't use it (they use the vDSO instead), they don't compile it in. I'm not sure a minimum kernel version is helpful, because most of the LTS distro kernels backport features, like Red Hat backported getrandom for example. In the interests of getting to a useful agreement, I think for now we should just punt on this and having a 10 year lifespan will probably do the trick, and we can determine in the future if we need to apply more stringent measures. > > We also don't really want to be on the hook for trying to support OSes > > Ubuntu is still derived from Debian. It is likely that things which > > work in one will work in another, but not guaranteed. > > > > I mention Debian is because it has a large variety of supported > > architectures. I absolutely don't want to say, "Oh, you have MIPS > > hardware, too bad if Git doesn't work for you." (I assure you the > > distro maintainers will be displeased if we break Git on less common > > architectures, as will I.) In fact, MIPS is an architecture that > > requires natural alignment and can be big-endian, so it's very useful in > > helping us find places we wrote invalid or unportable C. > > > > The reason I'm very hesitant to require that we run everything in GitHub > > Actions because it only supports two architectures. ARM64 and RISC-V > > are really popular, and I can tell you that running even a Linux > > container in emulation is really quite slow. I do it for my projects, > > but Git LFS only builds one set of non-x86 packages (the latest Debian) > > because emulation is far too slow to build the normal suite of five or > > six packages. > > Does that restriction apply to just GitHub-hosted runners, though? > Would it be possible for an interested party to set up self-hosted > runners (configured via GH Actions) that are using AMD or POWER or > whatever? (For example, I think it would be quite feasible for Google > to donate some compute for this, though no promises). Self-hosted runners on public code are very hard to secure. You're basically letting arbitrary people on the Internet run code on those machines and make outgoing network connections (due to the fact that you can push whatever into a PR branch), with all of the potential for abuse that that involves (and as my colleagues can tell you, there's a whole lot of it). GitHub has taken extensive pains to secure GitHub Actions runners in the cloud, and while we use self-hosted runners for some internal projects, they are absolutely not allowed for any public project for that reason. I would be delighted if Google were willing to do that, but I think you're going to need help from teams like Google Cloud who are going to be used to dealing with abuse at scale, like cryptocurrency miners and such. Unfortunately, there are many people who act in a less than lovely way and will exploit whatever they can to make a buck. 
I will also note that the official Actions runner is in C# and only runs on a handful of platforms due to the lack of portability of C#. (It might theoretically run on Mono, which would increase its portability, but I must profess my complete ignorance on anything about that code.) I also know of an unofficial one in Go[0], which I'm for obvious reasons unable to endorse, encourage, or speak about authoritatively in any way, but that would still exclude some platforms and architectures which don't support Go. > The appeal is not "because GitHub Actions are great!" for me - the > appeal is "because most Git developers run the CI this way if they > remember to or use GGG, and Junio runs it on `seen` all the time". If > there's some other recommendation for self-service CI runs that don't > need some careful reminding or secret knowledge to run, I'm happy with > that too. (For example, if someone wants to set up some bot that looks > for new [PATCH]-shaped emails, applies, builds, runs tests, and mails > test results to the author after run, that would fit into the spirit > of this point, although that sounds like a lot of setup to me.) Yeah, I understand what you're going for. If there were some super easy way to get everything running in an automatic CI, I'm all for it. I think CI is the easiest way to make sure we don't break anything. I think it's worth trying to get CI set up for whatever we can, and if CI is a possibility somewhere, it becomes a lot easier to say yes. > Should have a reroll in the next 30min, it was ready to go and then I > got this mail :) Sounds good. I don't think anything in this email should affect that reroll. [0] https://github.com/ChristopherHX/github-act-runner
On Friday, July 12, 2024 3:34 PM, brian m. carlson wrote: >Subject: Re: [PATCH] Documentation: add platform support policy > >On 2024-07-11 at 23:15:35, Emily Shaffer wrote: >> On Thu, Jul 11, 2024 at 3:24 PM brian m. carlson >> <sandals@crustytoothpaste.net> wrote: >> > Some older OSes require kernel features that aren't compiled in by >> > default, so containers are out. For example, CentOS 6 won't run on >> > a modern kernel because it lacks whatever the predecessor to the >> > vDSO was (which can be recompiled into the kernel, but nobody does that). >> >> Is this hinting that we should mention a minimum kernel version for >> Linux-kernel-based OSes? > >This is actually a feature that still exists in the kernel and could be enabled for newer >kernels, but because distros don't use it (they use the vDSO instead), they don't >compile it in. > >I'm not sure a minimum kernel version is helpful, because most of the LTS distro >kernels backport features, like Red Hat backported getrandom for example. In the >interests of getting to a useful agreement, I think for now we should just punt on >this and having a 10 year lifespan will probably do the trick, and we can determine >in the future if we need to apply more stringent measures. When looking at the "exotics", many have their own kernel lifespans. When worrying about NonStop, for example, the Kernel version stays around about 5 years under full support, then goes another few until retired. I think it is important for people to know the compatible versions of the kernel builds have. I do track those for the platform. >> > We also don't really want to be on the hook for trying to support >> > OSes Ubuntu is still derived from Debian. It is likely that things >> > which work in one will work in another, but not guaranteed. >> > >> > I mention Debian is because it has a large variety of supported >> > architectures. I absolutely don't want to say, "Oh, you have MIPS >> > hardware, too bad if Git doesn't work for you." (I assure you the >> > distro maintainers will be displeased if we break Git on less common >> > architectures, as will I.) In fact, MIPS is an architecture that >> > requires natural alignment and can be big-endian, so it's very >> > useful in helping us find places we wrote invalid or unportable C. >> > >> > The reason I'm very hesitant to require that we run everything in >> > GitHub Actions because it only supports two architectures. ARM64 >> > and RISC-V are really popular, and I can tell you that running even >> > a Linux container in emulation is really quite slow. I do it for my >> > projects, but Git LFS only builds one set of non-x86 packages (the >> > latest Debian) because emulation is far too slow to build the normal >> > suite of five or six packages. >> >> Does that restriction apply to just GitHub-hosted runners, though? >> Would it be possible for an interested party to set up self-hosted >> runners (configured via GH Actions) that are using AMD or POWER or >> whatever? (For example, I think it would be quite feasible for Google >> to donate some compute for this, though no promises). > >Self-hosted runners on public code are very hard to secure. You're basically letting >arbitrary people on the Internet run code on those machines and make outgoing >network connections (due to the fact that you can push whatever into a PR >branch), with all of the potential for abuse that that involves (and as my colleagues >can tell you, there's a whole lot of it). 
GitHub has taken extensive pains to secure >GitHub Actions runners in the cloud, and while we use self-hosted runners for some >internal projects, they are absolutely not allowed for any public project for that >reason. I'm not sure this applies to some of the exotics. NonStop cannot run the Google CI code. While we could, in theory, connect to via SSH to run builds, my system is behind a VPN/Firewall, which would make the builds impossible. I do build using Jenkins based on an SCM poll. It's not perfect and some tests do not run correctly in Jenkins but do outside. I would like to provide the feedback to the git team, somehow, on what built successfully or not, outside of the mailing list. >I would be delighted if Google were willing to do that, but I think you're going to >need help from teams like Google Cloud who are going to be used to dealing with >abuse at scale, like cryptocurrency miners and such. Unfortunately, there are many >people who act in a less than lovely way and will exploit whatever they can to make >a buck. > >I will also note that the official Actions runner is in C# and only runs on a handful of >platforms due to the lack of portability of C#. (It might theoretically run on Mono, >which would increase its portability, but I must profess my complete ignorance on >anything about that code.) I also know of an unofficial one in Go[0], which I'm for >obvious reasons unable to endorse, encourage, or speak about authoritatively in >any way, but that would still exclude some platforms and architectures which don't >support Go. We don't have C# or Go, nor are likely to any time soon, so that is a problem. > >> The appeal is not "because GitHub Actions are great!" for me - the >> appeal is "because most Git developers run the CI this way if they >> remember to or use GGG, and Junio runs it on `seen` all the time". If >> there's some other recommendation for self-service CI runs that don't >> need some careful reminding or secret knowledge to run, I'm happy with >> that too. (For example, if someone wants to set up some bot that looks >> for new [PATCH]-shaped emails, applies, builds, runs tests, and mails >> test results to the author after run, that would fit into the spirit >> of this point, although that sounds like a lot of setup to me.) > >Yeah, I understand what you're going for. If there were some super easy way to get >everything running in an automatic CI, I'm all for it. I think CI is the easiest way to >make sure we don't break anything. > >I think it's worth trying to get CI set up for whatever we can, and if CI is a possibility >somewhere, it becomes a lot easier to say yes. > >> Should have a reroll in the next 30min, it was ready to go and then I >> got this mail :) > >Sounds good. I don't think anything in this email should affect that reroll. > >[0] https://github.com/ChristopherHX/github-act-runner >-- >brian m. carlson (they/them or he/him) >Toronto, Ontario, CA --Randall
On Fri, Jul 12, 2024 at 12:33 PM brian m. carlson <sandals@crustytoothpaste.net> wrote: > > On 2024-07-11 at 23:15:35, Emily Shaffer wrote: > > On Thu, Jul 11, 2024 at 3:24 PM brian m. carlson > > <sandals@crustytoothpaste.net> wrote: > > > Some older OSes require kernel features that aren't compiled in by > > > default, so containers are out. For example, CentOS 6 won't run on a > > > modern kernel because it lacks whatever the predecessor to the vDSO was > > > (which can be recompiled into the kernel, but nobody does that). > > > > Is this hinting that we should mention a minimum kernel version for > > Linux-kernel-based OSes? > > This is actually a feature that still exists in the kernel and could be > enabled for newer kernels, but because distros don't use it (they use > the vDSO instead), they don't compile it in. > > I'm not sure a minimum kernel version is helpful, because most of the > LTS distro kernels backport features, like Red Hat backported getrandom > for example. In the interests of getting to a useful agreement, I think > for now we should just punt on this and having a 10 year lifespan will > probably do the trick, and we can determine in the future if we need to > apply more stringent measures. Sounds good, thanks! > > > > We also don't really want to be on the hook for trying to support OSes > > > Ubuntu is still derived from Debian. It is likely that things which > > > work in one will work in another, but not guaranteed. > > > > > > I mention Debian is because it has a large variety of supported > > > architectures. I absolutely don't want to say, "Oh, you have MIPS > > > hardware, too bad if Git doesn't work for you." (I assure you the > > > distro maintainers will be displeased if we break Git on less common > > > architectures, as will I.) In fact, MIPS is an architecture that > > > requires natural alignment and can be big-endian, so it's very useful in > > > helping us find places we wrote invalid or unportable C. > > > > > > The reason I'm very hesitant to require that we run everything in GitHub > > > Actions because it only supports two architectures. ARM64 and RISC-V > > > are really popular, and I can tell you that running even a Linux > > > container in emulation is really quite slow. I do it for my projects, > > > but Git LFS only builds one set of non-x86 packages (the latest Debian) > > > because emulation is far too slow to build the normal suite of five or > > > six packages. > > > > Does that restriction apply to just GitHub-hosted runners, though? > > Would it be possible for an interested party to set up self-hosted > > runners (configured via GH Actions) that are using AMD or POWER or > > whatever? (For example, I think it would be quite feasible for Google > > to donate some compute for this, though no promises). > > Self-hosted runners on public code are very hard to secure. You're > basically letting arbitrary people on the Internet run code on those > machines and make outgoing network connections (due to the fact that you > can push whatever into a PR branch), with all of the potential for abuse > that that involves (and as my colleagues can tell you, there's a whole > lot of it). GitHub has taken extensive pains to secure GitHub Actions > runners in the cloud, and while we use self-hosted runners for some > internal projects, they are absolutely not allowed for any public > project for that reason. 
> > I would be delighted if Google were willing to do that, but I think > you're going to need help from teams like Google Cloud who are going to > be used to dealing with abuse at scale, like cryptocurrency miners and > such. Unfortunately, there are many people who act in a less than > lovely way and will exploit whatever they can to make a buck. > > I will also note that the official Actions runner is in C# and only runs > on a handful of platforms due to the lack of portability of C#. (It > might theoretically run on Mono, which would increase its portability, > but I must profess my complete ignorance on anything about that code.) I > also know of an unofficial one in Go[0], which I'm for obvious reasons > unable to endorse, encourage, or speak about authoritatively in any way, > but that would still exclude some platforms and architectures which > don't support Go. Do you think it's worth us linking out to some helpful doc (like the official one, and someone who doesn't work at GH adding a link to some unofficial Golang thing)? I sort of feel like since "you can also DIY a scraper that looks at the mailing list if you want" is included, the gist makes it across, so I'm tempted not to go into a bunch of prescriptive detail here. > > > The appeal is not "because GitHub Actions are great!" for me - the > > appeal is "because most Git developers run the CI this way if they > > remember to or use GGG, and Junio runs it on `seen` all the time". If > > there's some other recommendation for self-service CI runs that don't > > need some careful reminding or secret knowledge to run, I'm happy with > > that too. (For example, if someone wants to set up some bot that looks > > for new [PATCH]-shaped emails, applies, builds, runs tests, and mails > > test results to the author after run, that would fit into the spirit > > of this point, although that sounds like a lot of setup to me.) > > Yeah, I understand what you're going for. If there were some super easy > way to get everything running in an automatic CI, I'm all for it. I > think CI is the easiest way to make sure we don't break anything. > > I think it's worth trying to get CI set up for whatever we can, and if > CI is a possibility somewhere, it becomes a lot easier to say yes. > > > Should have a reroll in the next 30min, it was ready to go and then I > > got this mail :) > > Sounds good. I don't think anything in this email should affect that > reroll. > > [0] https://github.com/ChristopherHX/github-act-runner > -- > brian m. carlson (they/them or he/him) > Toronto, Ontario, CA
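For what it's worth, the "bot that looks for [PATCH]-shaped emails" idea doesn't have to start from scratch: public-inbox archives such as lore.kernel.org serve each message raw, which `git am` can usually consume directly. A minimal sketch of such a bot follows; the branch names, make targets, and use of mail(1) are placeholder choices, not a description of any existing service:

    #!/bin/sh
    # Sketch of a patch-testing bot: fetch one patch mail from a lore
    # permalink ($1), apply it to a throwaway branch, build, test, and
    # mail the author the result.
    msg_url=$1
    git fetch origin &&
    git checkout -B patch-test origin/master &&
    curl -sSf "$msg_url/raw" | git am || exit 1
    author=$(git log -1 --format='%ae')
    if make -j4 all && make -C t prove >test.log 2>&1
    then
        echo "Built and tested OK on $(uname -srvm)" |
            mail -s "patch test passed" "$author"
    else
        tail -n 200 test.log |
            mail -s "patch test FAILED on $(uname -srvm)" "$author"
    fi

Mailing the author directly rather than the list keeps this within the spirit of the suggestion above without adding list noise.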
On Fri, Jul 12, 2024 at 12:46 PM <rsbecker@nexbridge.com> wrote: > > On Friday, July 12, 2024 3:34 PM, brian m. carlson wrote: > >Subject: Re: [PATCH] Documentation: add platform support policy > > > >On 2024-07-11 at 23:15:35, Emily Shaffer wrote: > >> On Thu, Jul 11, 2024 at 3:24 PM brian m. carlson > >> <sandals@crustytoothpaste.net> wrote: > >> > Some older OSes require kernel features that aren't compiled in by > >> > default, so containers are out. For example, CentOS 6 won't run on > >> > a modern kernel because it lacks whatever the predecessor to the > >> > vDSO was (which can be recompiled into the kernel, but nobody does that). > >> > >> Is this hinting that we should mention a minimum kernel version for > >> Linux-kernel-based OSes? > > > >This is actually a feature that still exists in the kernel and could be enabled for newer > >kernels, but because distros don't use it (they use the vDSO instead), they don't > >compile it in. > > > >I'm not sure a minimum kernel version is helpful, because most of the LTS distro > >kernels backport features, like Red Hat backported getrandom for example. In the > >interests of getting to a useful agreement, I think for now we should just punt on > >this and having a 10 year lifespan will probably do the trick, and we can determine > >in the future if we need to apply more stringent measures. > > When looking at the "exotics", many have their own kernel lifespans. When worrying about NonStop, for example, the Kernel version stays around about 5 years under full support, then goes another few until retired. I think it is important for people to know the compatible versions of the kernel builds have. I do track those for the platform. > > >> > We also don't really want to be on the hook for trying to support > >> > OSes Ubuntu is still derived from Debian. It is likely that things > >> > which work in one will work in another, but not guaranteed. > >> > > >> > I mention Debian is because it has a large variety of supported > >> > architectures. I absolutely don't want to say, "Oh, you have MIPS > >> > hardware, too bad if Git doesn't work for you." (I assure you the > >> > distro maintainers will be displeased if we break Git on less common > >> > architectures, as will I.) In fact, MIPS is an architecture that > >> > requires natural alignment and can be big-endian, so it's very > >> > useful in helping us find places we wrote invalid or unportable C. > >> > > >> > The reason I'm very hesitant to require that we run everything in > >> > GitHub Actions because it only supports two architectures. ARM64 > >> > and RISC-V are really popular, and I can tell you that running even > >> > a Linux container in emulation is really quite slow. I do it for my > >> > projects, but Git LFS only builds one set of non-x86 packages (the > >> > latest Debian) because emulation is far too slow to build the normal > >> > suite of five or six packages. > >> > >> Does that restriction apply to just GitHub-hosted runners, though? > >> Would it be possible for an interested party to set up self-hosted > >> runners (configured via GH Actions) that are using AMD or POWER or > >> whatever? (For example, I think it would be quite feasible for Google > >> to donate some compute for this, though no promises). > > > >Self-hosted runners on public code are very hard to secure. 
You're basically letting > >arbitrary people on the Internet run code on those machines and make outgoing > >network connections (due to the fact that you can push whatever into a PR > >branch), with all of the potential for abuse that that involves (and as my colleagues > >can tell you, there's a whole lot of it). GitHub has taken extensive pains to secure > >GitHub Actions runners in the cloud, and while we use self-hosted runners for some > >internal projects, they are absolutely not allowed for any public project for that > >reason. > > I'm not sure this applies to some of the exotics. NonStop cannot run the Google CI code. I assume you meant the GitHub CI code? > While we could, in theory, connect to via SSH to run builds, my system is behind a VPN/Firewall, which would make the builds impossible. I do build using Jenkins based on an SCM poll. It's not perfect and some tests do not run correctly in Jenkins but do outside. I would like to provide the feedback to the git team, somehow, on what built successfully or not, outside of the mailing list. I hope that the overall intent - you need to give us *something* that we can self-serve issue reproduction on, or else you can deal with lower quality of service - is clear and understood. It sounds like this puts NonStop squarely into this second category, running against `next` or `seen` on a nightly cadence and publishing breakage reports, then working closely with developers to make fixes as appropriate (or making them yourself). (By the way, is this doc something you can point your manager chain to to explain why Git issues appear and how to stop them from doing so? Maybe that would make your life easier, even if the answer to that conversation is "we can't provide what Git is asking for, so we'll be OK with lower quality of support"...) > > >I would be delighted if Google were willing to do that, but I think you're going to > >need help from teams like Google Cloud who are going to be used to dealing with > >abuse at scale, like cryptocurrency miners and such. Unfortunately, there are many > >people who act in a less than lovely way and will exploit whatever they can to make > >a buck. > > > >I will also note that the official Actions runner is in C# and only runs on a handful of > >platforms due to the lack of portability of C#. (It might theoretically run on Mono, > >which would increase its portability, but I must profess my complete ignorance on > >anything about that code.) I also know of an unofficial one in Go[0], which I'm for > >obvious reasons unable to endorse, encourage, or speak about authoritatively in > >any way, but that would still exclude some platforms and architectures which don't > >support Go. > > We don't have C# or Go, nor are likely to any time soon, so that is a problem. > > > > >> The appeal is not "because GitHub Actions are great!" for me - the > >> appeal is "because most Git developers run the CI this way if they > >> remember to or use GGG, and Junio runs it on `seen` all the time". If > >> there's some other recommendation for self-service CI runs that don't > >> need some careful reminding or secret knowledge to run, I'm happy with > >> that too. (For example, if someone wants to set up some bot that looks > >> for new [PATCH]-shaped emails, applies, builds, runs tests, and mails > >> test results to the author after run, that would fit into the spirit > >> of this point, although that sounds like a lot of setup to me.) > > > >Yeah, I understand what you're going for. 
If there were some super easy way to get > >everything running in an automatic CI, I'm all for it. I think CI is the easiest way to > >make sure we don't break anything. > > > >I think it's worth trying to get CI set up for whatever we can, and if CI is a possibility > >somewhere, it becomes a lot easier to say yes. > > > >> Should have a reroll in the next 30min, it was ready to go and then I > >> got this mail :) > > > >Sounds good. I don't think anything in this email should affect that reroll. > > > >[0] https://github.com/ChristopherHX/github-act-runner > >-- > >brian m. carlson (they/them or he/him) > >Toronto, Ontario, CA > > --Randall >
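On making those breakage reports actionable for developers who cannot log into the platform themselves: something along these lines could be attached to each report. It is only a sketch; t9999-placeholder.sh stands in for whichever test actually failed, and the compiler probe assumes `cc` is the compiler in use:

    #!/bin/sh
    # Collect the details an actionable breakage report needs: platform,
    # toolchain, the exact commit tested, and the failing test's trace.
    {
        echo "platform: $(uname -srvm)"
        echo "compiler: $(cc --version 2>&1 | head -n 1)"
        echo "commit:   $(git rev-parse HEAD) (from 'next')"
        (cd t && sh t9999-placeholder.sh -v -x 2>&1 | tail -n 200)
    } >breakage-report.txt

The `-x` trace in particular gives a developer without access to the platform something concrete to reason about.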
On Monday, July 15, 2024 6:28 PM, Emily Shaffer wrote: >On Fri, Jul 12, 2024 at 12:46 PM <rsbecker@nexbridge.com> wrote: >> >> On Friday, July 12, 2024 3:34 PM, brian m. carlson wrote: >> >Subject: Re: [PATCH] Documentation: add platform support policy >> > >> >On 2024-07-11 at 23:15:35, Emily Shaffer wrote: >> >> On Thu, Jul 11, 2024 at 3:24 PM brian m. carlson >> >> <sandals@crustytoothpaste.net> wrote: >> >> > Some older OSes require kernel features that aren't compiled in >> >> > by default, so containers are out. For example, CentOS 6 won't >> >> > run on a modern kernel because it lacks whatever the predecessor >> >> > to the vDSO was (which can be recompiled into the kernel, but nobody does >that). >> >> >> >> Is this hinting that we should mention a minimum kernel version for >> >> Linux-kernel-based OSes? >> > >> >This is actually a feature that still exists in the kernel and could >> >be enabled for newer kernels, but because distros don't use it (they >> >use the vDSO instead), they don't compile it in. >> > >> >I'm not sure a minimum kernel version is helpful, because most of the >> >LTS distro kernels backport features, like Red Hat backported >> >getrandom for example. In the interests of getting to a useful >> >agreement, I think for now we should just punt on this and having a >> >10 year lifespan will probably do the trick, and we can determine in the future if >we need to apply more stringent measures. >> >> When looking at the "exotics", many have their own kernel lifespans. When >worrying about NonStop, for example, the Kernel version stays around about 5 >years under full support, then goes another few until retired. I think it is important >for people to know the compatible versions of the kernel builds have. I do track >those for the platform. >> >> >> > We also don't really want to be on the hook for trying to support >> >> > OSes Ubuntu is still derived from Debian. It is likely that >> >> > things which work in one will work in another, but not guaranteed. >> >> > >> >> > I mention Debian is because it has a large variety of supported >> >> > architectures. I absolutely don't want to say, "Oh, you have >> >> > MIPS hardware, too bad if Git doesn't work for you." (I assure >> >> > you the distro maintainers will be displeased if we break Git on >> >> > less common architectures, as will I.) In fact, MIPS is an >> >> > architecture that requires natural alignment and can be >> >> > big-endian, so it's very useful in helping us find places we wrote invalid or >unportable C. >> >> > >> >> > The reason I'm very hesitant to require that we run everything in >> >> > GitHub Actions because it only supports two architectures. ARM64 >> >> > and RISC-V are really popular, and I can tell you that running >> >> > even a Linux container in emulation is really quite slow. I do >> >> > it for my projects, but Git LFS only builds one set of non-x86 >> >> > packages (the latest Debian) because emulation is far too slow to >> >> > build the normal suite of five or six packages. >> >> >> >> Does that restriction apply to just GitHub-hosted runners, though? >> >> Would it be possible for an interested party to set up self-hosted >> >> runners (configured via GH Actions) that are using AMD or POWER or >> >> whatever? (For example, I think it would be quite feasible for >> >> Google to donate some compute for this, though no promises). >> > >> >Self-hosted runners on public code are very hard to secure. 
You're >> >basically letting arbitrary people on the Internet run code on those >> >machines and make outgoing network connections (due to the fact that >> >you can push whatever into a PR branch), with all of the potential >> >for abuse that that involves (and as my colleagues can tell you, >> >there's a whole lot of it). GitHub has taken extensive pains to >> >secure GitHub Actions runners in the cloud, and while we use >> >self-hosted runners for some internal projects, they are absolutely not allowed >for any public project for that reason. >> >> I'm not sure this applies to some of the exotics. NonStop cannot run the Google CI >code. > > >I assume you meant the GitHub CI code? Yes, sorry. >> While we could, in theory, connect to via SSH to run builds, my system is behind a >VPN/Firewall, which would make the builds impossible. I do build using Jenkins >based on an SCM poll. It's not perfect and some tests do not run correctly in Jenkins >but do outside. I would like to provide the feedback to the git team, somehow, on >what built successfully or not, outside of the mailing list. > >I hope that the overall intent - you need to give us *something* that we can self- >serve issue reproduction on, or else you can deal with lower quality of service - is >clear and understood. It sounds like this puts NonStop squarely into this second >category, running against `next` or `seen` on a nightly cadence and publishing >breakage reports, then working closely with developers to make fixes as appropriate >(or making them yourself). (By the way, is this doc something you can point your >manager chain to to explain why Git issues appear and how to stop them from >doing so? Maybe that would make your life easier, even if the answer to that >conversation is "we can't provide what Git is asking for, so we'll be OK with lower >quality of support"...) That is my expectation also. >> >I would be delighted if Google were willing to do that, but I think >> >you're going to need help from teams like Google Cloud who are going >> >to be used to dealing with abuse at scale, like cryptocurrency miners >> >and such. Unfortunately, there are many people who act in a less >> >than lovely way and will exploit whatever they can to make a buck. >> > >> >I will also note that the official Actions runner is in C# and only >> >runs on a handful of platforms due to the lack of portability of C#. >> >(It might theoretically run on Mono, which would increase its >> >portability, but I must profess my complete ignorance on anything >> >about that code.) I also know of an unofficial one in Go[0], which >> >I'm for obvious reasons unable to endorse, encourage, or speak about >> >authoritatively in any way, but that would still exclude some platforms and >architectures which don't support Go. >> >> We don't have C# or Go, nor are likely to any time soon, so that is a problem. >> >> > >> >> The appeal is not "because GitHub Actions are great!" for me - the >> >> appeal is "because most Git developers run the CI this way if they >> >> remember to or use GGG, and Junio runs it on `seen` all the time". >> >> If there's some other recommendation for self-service CI runs that >> >> don't need some careful reminding or secret knowledge to run, I'm >> >> happy with that too. 
(For example, if someone wants to set up some >> >> bot that looks for new [PATCH]-shaped emails, applies, builds, runs >> >> tests, and mails test results to the author after run, that would >> >> fit into the spirit of this point, although that sounds like a lot >> >> of setup to me.) >> > >> >Yeah, I understand what you're going for. If there were some super >> >easy way to get everything running in an automatic CI, I'm all for >> >it. I think CI is the easiest way to make sure we don't break anything. >> > >> >I think it's worth trying to get CI set up for whatever we can, and >> >if CI is a possibility somewhere, it becomes a lot easier to say yes. >> > >> >> Should have a reroll in the next 30min, it was ready to go and then >> >> I got this mail :) >> > >> >Sounds good. I don't think anything in this email should affect that reroll. >> > >> >[0] https://github.com/ChristopherHX/github-act-runner >> >-- >> >brian m. carlson (they/them or he/him) Toronto, Ontario, CA >> >> --Randall >>
diff --git a/Documentation/Makefile b/Documentation/Makefile index dc65759cb1..462af0311f 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -118,6 +118,7 @@ TECH_DOCS += technical/multi-pack-index TECH_DOCS += technical/pack-heuristics TECH_DOCS += technical/parallel-checkout TECH_DOCS += technical/partial-clone +TECH_DOCS += technical/platform-support TECH_DOCS += technical/racy-git TECH_DOCS += technical/reftable TECH_DOCS += technical/scalar diff --git a/Documentation/technical/platform-support.txt b/Documentation/technical/platform-support.txt new file mode 100644 index 0000000000..23ab708144 --- /dev/null +++ b/Documentation/technical/platform-support.txt @@ -0,0 +1,102 @@ +Platform Support Policy +======================= + +Git has a history of providing broad "support" for exotic platforms and older +platforms, without an explicit commitment. This support becomes easier to +maintain (and possible to commit to) when Git developers are providing with +adequate tooling to test for compatibility. Variouis levels of tooling will +allow us to make more solid commitments around Git's compatibility with your +platform. + +Compatible by vN+1 release +-------------------------- + +To increase probability that compatibility issues introduced in a point release +will be fixed by the next point release: + +* You should send a bug report as soon as you notice the breakage on your +platform. The sooner you notice, the better; it's better for you to watch `seen` +than to watch `master`. See linkgit:gitworkflows[7] under "Graduation" for an +overview of which branches are used in git.git, and how. +* The bug report should include information about what platform you are using. +* You should also use linkgit:git-bisect[1] and determine which commit +introduced the breakage. +* Please include any information you have about the nature of the breakage: is +it a memory alignment issue? Is an underlying library missing or broken for +your platform? Is there some quirk about your platform which means typical +practices (like malloc) behave strangely? +* Once we begin to fix the issue, please work closely with the contributor +working on it to test the proposed fix against your platform. + +Example: NonStop +https://lore.kernel.org/git/01bd01da681a$b8d70a70$2a851f50$@nexbridge.com/[reports +problems] when they're noticed. + +Compatible on `master` and point releases +----------------------------------------- + +To guarantee that `master` and all point releases work for your platform the +first time: + +* You should run nightly tests against the `next` branch and publish breakage +reports to the mailing list immediately when they happen. +* It may make sense to automate these; if you do, make sure they are not noisy +(you don't need to send a report when everything works, only when something +breaks). +* Breakage reports should be actionable - include clear error messages that can +help developers who may not have access to test directly on your platform. +* You should use git-bisect and determine which commit introduced the breakage; +if you can't do this with automation, you should do this yourself manually as +soon as you notice a breakage report was sent. +* You should either: +** Provide VM access on-demand to a trusted developer working to fix the issue, +so they can test their fix, OR +** Work closely with the developer fixing the issue - testing turnaround to +check whether the fix works for your platform should not be longer than a +business day. 
+ +Example: +https://lore.kernel.org/git/CAHd-oW6X4cwD_yLNFONPnXXUAFPxgDoccv2SOdpeLrqmHCJB4Q@mail.gmail.com/[AIX] +provides a build farm and runs tests against release candidates. + +Compatible on `next` +-------------------- + +To guarantee that `next` will work for your platform, avoiding reactive +debugging and fixing: + +* You should add a runner for your platform to the GitHub Actions CI suite. +This suite is run when any Git developer proposes a new patch, and having a +runner for your platform/configuration means every developer will know if they +break you, immediately. +* If you rely on Git avoiding a specific pattern that doesn't work well with +your platform (like a certain malloc pattern), if possible, add a coccicheck +rule to ensure that pattern is not used. +* If you rely on some configuration or behavior, add a test for it. You may +find it easier to add a unit test ensuring the behavior you need than to add an +integration test; either one works. Untested behavior is subject to breakage at +any time. +** Clearly label these tests as necessary for platform compatibility. Add them +to an isolated compatibility-related test suite, like a new t* file or unit test +suite, so that they're easy to remove when compatibility is no longer required. +If the specific compatibility need is gated behind an issue with another +project, link to documentation of that issue (like a bug or email thread) to +make it easier to tell when that compatibility need goes away. + +Example: We run our +https://git.kernel.org/pub/scm/git/git.git/tree/.github/workflows/main.yml[CI +suite] on Windows, Ubuntu, Mac, and others. + +Getting help writing platform support patches +--------------------------------------------- + +In general, when sending patches to fix platform support problems, follow +these guidelines to make sure the patch is reviewed with the appropriate level +of urgency: + +* Clearly state in the commit message that you are fixing a platform breakage, +and for which platform. +* Use the CI and test suite to ensure that the fix for your platform doesn't +break other platforms. +* If possible, add a test ensuring this regression doesn't happen again. If +it's not possible to add a test, explain why in the commit message.
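To illustrate the bisect step the document above asks platform maintainers for, the usual shape is something like the following; the known-good release tag and the failing test script are stand-ins for whatever actually applies on a given platform:

    $ git bisect start next v2.45.0
    $ git bisect run sh -c 'make -j4 all && (cd t && sh t9999-placeholder.sh)'
    $ git bisect log    # paste this into the breakage report

`git bisect run` repeats the build-and-test cycle automatically, so even a platform without CI can usually narrow a breakage down to a single commit overnight.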
Supporting many platforms is only easy when we have the right tools to ensure that support. Teach platform maintainers how they can help us to help them, by explaining what kind of tooling support we would like to have, and what level of support becomes available as a result. Provide examples so that platform maintainers can see what we're asking for in practice. With this policy in place, we can make changes with stronger assurance that we are not breaking anybody we promised not to. Instead, we can feel confident that our existing testing and integration practices protect those who care from breakage. Signed-off-by: Emily Shaffer <nasamuffin@google.com> --- Hi folks, Some months ago, I branched the Rust discussion to talk about platform support[1]. That discussion didn't go far, but with libification efforts touching the Makefile and worrying about compatibility, it seems like a worthy discussion to have. My goal with this document is that if we claim to support a given platform, that platform should be easy for contributors to test against. And the corollary: if we can't test against a given platform easily, we shouldn't be assuring users on that platform that it will always work. I'm thinking of platforms which rely on recent Git binaries but which we don't test against by default, like NonStop[2] or AIX[3]. Right now, this doc talks about "guarantees." I used that phrasing based on what I've observed to be an implicit expectation that we guarantee support; it could be this isn't actually a guarantee that the community is willing to make, so I am hoping we can discuss it and come up with the right term. I'm hoping that this policy (or one like it) can clarify a "help us help you" relationship with platform maintainers. If we can clearly define the resources we need in order to guarantee that we're supporting a platform, then we can make changes more confidently; if, instead, we make a serious effort to fix any and all reported breakages, then all patches are at risk of breaking a platform we didn't know exists but care about providing first-class support for, and of being rolled back. I think there is some space in between those two extremes, too, that might be suitable for some less-used platforms, or platforms which follow new releases less closely. Looking forward to discussing. - Emily 1: https://lore.kernel.org/git/CAJoAoZnHGTFhfR6e6r=GMSfVbSNgLoHF-opaWYLbHppiuzi+Rg@mail.gmail.com/ 2: https://lore.kernel.org/git/01bd01da681a$b8d70a70$2a851f50$@nexbridge.com/ 3: https://lore.kernel.org/git/CAHd-oW6X4cwD_yLNFONPnXXUAFPxgDoccv2SOdpeLrqmHCJB4Q@mail.gmail.com/ --- Documentation/Makefile | 1 + Documentation/technical/platform-support.txt | 102 +++++++++++++++++++ 2 files changed, 103 insertions(+) create mode 100644 Documentation/technical/platform-support.txt