Message ID | 20190524184809.25121-1-dgilbert@interlog.com (mailing list archive) |
---|---|
Headers | show |
Series | sg: v4 interface, rq sharing + multiple rqs | expand |
On 5/24/19 11:47 AM, Douglas Gilbert wrote: > This patchset restores some functionality that was in the v2 driver > version and has been broken by some external patch in the interim > period (20 years). What is an "external patch"? Thanks, Bart.
On 2019-05-26 11:49 a.m., Bart Van Assche wrote: > On 5/24/19 11:47 AM, Douglas Gilbert wrote: >> This patchset restores some functionality that was in the v2 driver >> version and has been broken by some external patch in the interim >> period (20 years). > > What is an "external patch"? It's a patch made by someone with a first name like Christoph, Jens or James (picking some names at random) that is applied whether or not the maintainer of said driver approves ("ack's) or not. IMO The maintainer should be able to restore features removed in this fashion (I'm talking specifically about the neutering of SG_FLAG_NO_DXFER) without review as it is in the documented interface. Plus I know of at least one power user who was peeved to find out that it was quietly broken. Doug Gilbert P.S the use case is mirrored disks: reading one disk for the actual data, and the other disk to check if there is an IO error. So the second disk doesn't need its data transferred to the user space (thus saving time).
On 6/2/19 1:23 PM, Douglas Gilbert wrote: > On 2019-05-26 11:49 a.m., Bart Van Assche wrote: >> On 5/24/19 11:47 AM, Douglas Gilbert wrote: >>> This patchset restores some functionality that was in the v2 driver >>> version and has been broken by some external patch in the interim >>> period (20 years). >> >> What is an "external patch"? > > It's a patch made by someone with a first name like Christoph, Jens or > James (picking some names at random) that is applied whether or not > the maintainer of said driver approves ("ack's) or not. > > IMO The maintainer should be able to restore features removed in this > fashion (I'm talking specifically about the neutering of > SG_FLAG_NO_DXFER) without review as it is in the documented interface. > Plus I know of at least one power user who was peeved to find out that > it was quietly broken. That's unfortunate that that regression happened. As you probably know the traditional approach to address a regression is to post a fix that has the "Fixes:" and "Cc: stable" tags. Bart.
On 5/24/19 11:47 AM, Douglas Gilbert wrote: > This patchset is big and can be regarded as a driver rewrite. The > number of lines increases from around 3000 to over 6000. Every SCSI reviewer I know is too busy to review a patch series that involves so many changes. You may have to split this series into a number of smaller series if you want to encourage reviewers to have a look. Bart.
On 2019-06-03 12:19 p.m., Bart Van Assche wrote: > On 5/24/19 11:47 AM, Douglas Gilbert wrote: >> This patchset is big and can be regarded as a driver rewrite. The >> number of lines increases from around 3000 to over 6000. > > Every SCSI reviewer I know is too busy to review a patch series that involves so > many changes. You may have to split this series into a number of smaller series > if you want to encourage reviewers to have a look. Cutting a patchset that touches around 1500 lines of a 3000 line driver, then adds new functionality amounting to an extra 3000 lines of code (and comments), according to the "one change per patch" rule would result in a patchset with hundreds of patches. Further if they are to be bisectable then they must not only compile and build, but run properly. Of course that is impossible for new functionality as there is little to test the new functionality against. Then there is the case of a driver maintainer who wants to clean up code that has "rusted" over time, including being weakened by hundreds of well-meaning but silly janitor type patches. Should a maintainer be discouraged from doing that? Looking around at current patchsets, LWN keeps a helpful list at the bottom of this page (from last Thursday): https://lwn.net/Articles/789234/ [non-subscribers need to wait until next Thursday to view that]. I looked at the "Device Drivers" section. Most patchsets there had between 10 and 20 patches, one had 33. One, the SIW Infiniband driver, is over 10,000 lines long, and contains 'only' 12 patches. Reviewers are obviously a scarce resource, but making their live's easier shouldn't be a goal in itself. Further, the utility of bottom up reviews by humans reduces with time as the automatic tools to do that get better. Of the responses to my "version 1" patchset 10 days ago, I would regard only one of your responses "useful", and all but one of the kbuild robot's responses useful. By "useful" I mean that they lead to better code. If new functionality is being proposed, surely it is better to check that it is documented and that test code exists. Then the design and high level details of the implementation should be assessed. And if the code doesn't involve special hardware, shouldn't the test code be run as part of the review process? Code reviews have their place, but over the last 10 years, bugs in the sg and scsi_debug drivers have been found by _using_ those drivers. Also with those drivers I and others have found bugs in the block layer and the SCSI mid-level. To date I have had no feedback about design document describing this patchset: http://sg.danny.cz/sg/sg_v40.html even though I refer to sections of it in about half of the patches in this patchset. Plus I have written extensive test code and made it available; again no feedback on that. In the article: "A ring buffer for epoll": https://lwn.net/Articles/789603/ [May 30, 2019] Jonathan Corbet discusses these issues in the final section titled: "Some closing grumbles". Doug Gilbert
On 6/4/19 2:31 PM, Douglas Gilbert wrote: > Then there is the case of a driver maintainer > who wants to clean up code that has "rusted" over time, including > being weakened by hundreds of well-meaning but silly janitor type > patches. Should a maintainer be discouraged from doing that? I think in this case the answer to that question is "yes, definitely". Multiple companies, including my employer, use the SG I/O interface as the primary interface for controlling SMR drives. Any regression in the SG I/O code will have severe consequences for users of SMR drives. As you know a rewrite can introduce regressions. Since the current implementation works fine I think it is up to you to find a way to motivate existing SG I/O users to adopt your new implementation. The cloud companies I know employ kernel developers and test new kernel versions before deploying these internally. How are you going to motivate these companies to adopt your rewrite instead of combining e.g. the Linux kernel v5.1 SG I/O implementation with the latest version of the Linux kernel? > To date I have had no feedback about design document describing this > patchset: > http://sg.danny.cz/sg/sg_v40.html > even though I refer to sections of it in about half of the patches > in this patchset. Plus I have written extensive test code and made > it available; again no feedback on that. It is great that you took the time to write a design document. I can't speak for other kernel developers. But if I see a URL myself in a cover letter of a patch series or in a commit message I ignore it. If you want feedback on that design document please convert it into a documentation patch and include it in this patch series. Thanks, Bart.
Doug, > Cutting a patchset that touches around 1500 lines of a 3000 line > driver, then adds new functionality amounting to an extra 3000 lines > of code (and comments), according to the "one change per patch" rule > would result in a patchset with hundreds of patches. The problem here is that you think of it as a single patch set transitioning the driver from major version X to version Y. Linux kernel development has moved away from that model. The kernel release cadence is more or less fixed at 10 weeks. The release process is not controlled by features or component versions or anything of that nature. It is set by passing of time. The notion that a driver or kernel component may have a version number applied to it is orthogonal to that process. A version is something you as a driver maintainer may decide to use to describe a certain set of commits. But it has no meaning wrt. the Linux development process. If you want to transition sg from what you call v3 to v4, then the process is that you submit a handful or two of small, easily digestible patches at a time. It may take a few kernel releases to get to where you want to be. But that is the process that everybody else is following to get their changes merged. Nobody says you can only make one submission per submission window. It's perfectly fine to submit patches 11-20 as soon as patches 1-10 have been reviewed and merged. As an example of this, the HBA vendors usually send several driver updates each release cycle. > Further if they are to be bisectable then they must not only > compile and build, but run properly. Absolutely. That's a requirement. > Of course that is impossible for new functionality as there is little > to test the new functionality against. The burden is on you to submit patches in an order in which they make logical sense and in which they can be reviewed, bisected, and tested. > I looked at the "Device Drivers" section. Most patchsets there > had between 10 and 20 patches, one had 33. The "number of patches in a driver submission" as a measure is a red herring. sg is not a new driver, it has users. > One, the SIW Infiniband driver, is over 10,000 lines long, and > contains 'only' 12 patches. But it was presumably developed out of tree in a separate repo. And the thousands of commits that resulted in this driver have been collapsed into 12 patches for initial submission. The number of lines is also a red herring. If a patch changes 10,000 lines to make one logical change that's perfectly fine. What's not fine is a patch changing 10,000 lines and also making an entirely different logical change. > Reviewers are obviously a scarce resource, but making their live's > easier shouldn't be a goal in itself. Couldn't disagree more. If you want your code merged, you will have to present it in a way that caters to the reviewers. Because without reviews, your code won't get merged. > If new functionality is being proposed, surely it is better to check > that it is documented and that test code exists. Then the design and > high level details of the implementation should be assessed. Testing and documentation are absolutely important. But so is documenting what compelled a code change. Reviews aid in verifying that the thought process outlined in the patch description matches the code changes performed. This is where the whole "one logical change" comes from. And when things subsequently break, it is then easy to identify which assumptions the patch author made that turned out not to be valid. > To date I have had no feedback about design document describing this > patchset: http://sg.danny.cz/sg/sg_v40.html This suffers the same problem as your patch series in that it encapsulates 26 years of thought in a single blob. Presumably that document developed over time and didn't go from nothing to 22K words in an instant. You need to document that process. Also, having a design document that describes a wealth of changes after the fact is not terribly helpful. I understand appreciate that you are focused on the end product. But to get there you need to slowly and iteratively submit patches against v3 that can be independently reviewed and merged. Please pick one feature at a time. Carve that into a few patches that each logically only do one thing. And then submit that as a patch set with an intro mail that describes the design of the feature, which assumptions are made, what the benefits are, who needs this capability, etc.
On 2019-06-06 10:34 a.m., Martin K. Petersen wrote: > > Doug, > >> Cutting a patchset that touches around 1500 lines of a 3000 line >> driver, then adds new functionality amounting to an extra 3000 lines >> of code (and comments), according to the "one change per patch" rule >> would result in a patchset with hundreds of patches. > > The problem here is that you think of it as a single patch set > transitioning the driver from major version X to version Y. > > Linux kernel development has moved away from that model. The kernel > release cadence is more or less fixed at 10 weeks. The release process > is not controlled by features or component versions or anything of that > nature. It is set by passing of time. > > The notion that a driver or kernel component may have a version number > applied to it is orthogonal to that process. A version is something you > as a driver maintainer may decide to use to describe a certain set of > commits. But it has no meaning wrt. the Linux development process. > > If you want to transition sg from what you call v3 to v4, then the > process is that you submit a handful or two of small, easily digestible > patches at a time. > > It may take a few kernel releases to get to where you want to be. But > that is the process that everybody else is following to get their > changes merged. > > Nobody says you can only make one submission per submission window. It's > perfectly fine to submit patches 11-20 as soon as patches 1-10 have been > reviewed and merged. As an example of this, the HBA vendors usually send > several driver updates each release cycle. > >> Further if they are to be bisectable then they must not only >> compile and build, but run properly. > > Absolutely. That's a requirement. > >> Of course that is impossible for new functionality as there is little >> to test the new functionality against. > > The burden is on you to submit patches in an order in which they make > logical sense and in which they can be reviewed, bisected, and tested. > >> I looked at the "Device Drivers" section. Most patchsets there >> had between 10 and 20 patches, one had 33. > > The "number of patches in a driver submission" as a measure is a red > herring. sg is not a new driver, it has users. > >> One, the SIW Infiniband driver, is over 10,000 lines long, and >> contains 'only' 12 patches. > > But it was presumably developed out of tree in a separate repo. And the > thousands of commits that resulted in this driver have been collapsed > into 12 patches for initial submission. > > The number of lines is also a red herring. If a patch changes 10,000 > lines to make one logical change that's perfectly fine. What's not fine > is a patch changing 10,000 lines and also making an entirely different > logical change. > >> Reviewers are obviously a scarce resource, but making their live's >> easier shouldn't be a goal in itself. > > Couldn't disagree more. If you want your code merged, you will have to > present it in a way that caters to the reviewers. Because without > reviews, your code won't get merged. > >> If new functionality is being proposed, surely it is better to check >> that it is documented and that test code exists. Then the design and >> high level details of the implementation should be assessed. > > Testing and documentation are absolutely important. But so is > documenting what compelled a code change. > > Reviews aid in verifying that the thought process outlined in the patch > description matches the code changes performed. This is where the whole > "one logical change" comes from. And when things subsequently break, it > is then easy to identify which assumptions the patch author made that > turned out not to be valid. > >> To date I have had no feedback about design document describing this >> patchset: http://sg.danny.cz/sg/sg_v40.html > > This suffers the same problem as your patch series in that it > encapsulates 26 years of thought in a single blob. > > Presumably that document developed over time and didn't go from nothing > to 22K words in an instant. You need to document that process. Also, > having a design document that describes a wealth of changes after the > fact is not terribly helpful. So tl;dr ? And if I hid the code, would my "blob" then qualify as a design document? Also if my sg v2/v3 documentation targetted the kernel submission processes between 1998 and 2002, would it be much use today? I think not. The way C++ standards are being developed these days is interesting. A proposer comes with an idea, code, test cases, rationale; then clang and gcc folks implement it and then the committee looks at it seriously. Some proposals get shunted through TS groups for a longer look. I'm old enough to have been through the Ada debacle. So I know about design documentation that went from the floor to the ceiling, several times. Ada wasn't (isn't) a bad language but it was killed by its associated documentation workflow. The Ada prophets claimed it was a self-evident truth that almost all documentation (and test cases) must be written before any code was cut. Ada is still used in air traffic control, railway signalling, etc; not sure if that is re-assuring. BTW Not everything in that "blob" is new. Only a small part of the large ioctl table is new; plus the v1, v2, v3 and v4 interfaces have all been in place for over 10 years. The object tree of the driver remains the same. > I understand appreciate that you are focused on the end product. But to > get there you need to slowly and iteratively submit patches against v3 > that can be independently reviewed and merged. > > Please pick one feature at a time. Carve that into a few patches that > each logically only do one thing. And then submit that as a patch set > with an intro mail that describes the design of the feature, which > assumptions are made, what the benefits are, who needs this capability, > etc. Thank you for repeating the party line. I expected none other. As a bonus, you took the "scarce resource" bait. And I'm reminded of the well reasoned process that the cabal in which Christoph, you and (I suspect) James conspired, to remove bidirectional SCSI command support from the kernel. You dismissed it masterfully as "old cruft" while Christoph continued to misquote Boaz after he had been told by same to stop or properly qualify the quote in question. Cache lines also got an honourable mention as a justification. But where was the design document (aka a "blob") justifying the change/removal, complete with a fallback strategy if the removal "blew up"? And where are the design documents for the sd driver and its ongoing evolutionary changes? Ever seen anything written about the sr or ses driver? ... in short: don't do as I do, do as I say ... I will probably produce a first half patchset, when I have properly adjusted my "cadence". Doug Gilbert P.S. For historical accuracy it is 21 years not 26. I have never spoken to, or received any correspondence (or design documents) from, Lawrence Foard. In 1998 he had not responded to emails for several years (I was told) and Alan Cox was looking for someone, anyone to take over maintenance of the sg driver. IMO Alan considered that ongoing maintenance and evolutionary change of the existing driver was more important than a replacement driver that was on offer. So now I'm looking at a bit more evolution.
Doug, > So tl;dr ? I actually tried. I also spent some time reading the patches two weeks ago. The new series is an improvement over the initial sg v4 posting from last winter. But it still needs work. > Also if my sg v2/v3 documentation targetted the kernel submission > processes between 1998 and 2002, would it be much use today? I think > not. No, but a diff, change bars, or something similar would have been very helpful to clearly identify what's changed between v3 and v4. Just like it's done in standards documents. You have your section 3 bullet list. But the rest of the document is huge and it is not immediately obvious what the v4-specific bits are. > Thank you for repeating the party line. I expected none other. Don't shoot the messenger. This is all documented in detail in: Documentation/process/submitting-patches.rst Please adjust your patch submissions accordingly. Especially the first few paragraphs of section 2 in that document are worth paying attention to. Your patch descriptions typically don't have a problem statement or justification. Why are you making all these changes? Why do we need a v4 in the first place? Why don't the existing ioctls suffice? Why aren't 16 outstanding requests per fd enough? Etc. > And where are the design documents for the sd driver and its ongoing > evolutionary changes? Ever seen anything written about the sr or ses > driver? I think you are missing my point. We don't have gnarly design documents because every change is either a self-contained commit or small series where each patch describes why a change was made. The git log *is* the design document. Each of your bullet features in section 3 ("Changes to sg driver between version 3.5.36 and 4.0") would ideally be submitted as a separate patch series whose individual merits could be discussed on the list. This is in sharp contrast to the "This patchset is big and can be regarded as a driver rewrite" approach that signals a preconceived commitment to a particular set of features and design choices. Why comment on something you can't influence? Who has time to read 22K words on a website and try to find out how those words relate to the posted patches? The only person on linux-scsi that *has* to look at your patches is me. Everybody else is a volunteer you need to entice to invest time and effort in reviewing your work. Quite possibly in their spare time. That's why I suggested that the burden is on you. You need to present your work in a way that engages people, is manageable in terms of time commitment, and makes them feel their investment is worthwhile and appreciated. Please take that into consideration.