Message ID | 20210623105858.6978-2-mcroce@linux.microsoft.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | block: add a sequence number to disks | expand |
On Wed, Jun 23, 2021 at 12:58:53PM +0200, Matteo Croce wrote: > +void inc_diskseq(struct gendisk *disk) > +{ > + static atomic64_t diskseq; Please don't hide file scope variables in functions. Can you explain a little more why we need a global sequence count vs a per-disk one here?
On Wed, Jun 23, 2021 at 1:49 PM Christoph Hellwig <hch@infradead.org> wrote: > > On Wed, Jun 23, 2021 at 12:58:53PM +0200, Matteo Croce wrote: > > +void inc_diskseq(struct gendisk *disk) > > +{ > > + static atomic64_t diskseq; > > Please don't hide file scope variables in functions. > I just didn't want to clobber that file namespace, as that is the only point where it's used. > Can you explain a little more why we need a global sequence count vs > a per-disk one here? The point of the whole series is to have an unique sequence number for all the disks. Events can arrive to the userspace delayed or out-of-order, so this helps to correlate events to the disk. It might seem strange, but there isn't a way to do this yet, so I come up with a global, monotonically incrementing number.
On Mi, 23.06.21 15:10, Matteo Croce (mcroce@linux.microsoft.com) wrote: > On Wed, Jun 23, 2021 at 1:49 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Wed, Jun 23, 2021 at 12:58:53PM +0200, Matteo Croce wrote: > > > +void inc_diskseq(struct gendisk *disk) > > > +{ > > > + static atomic64_t diskseq; > > > > Please don't hide file scope variables in functions. > > > > I just didn't want to clobber that file namespace, as that is the only > point where it's used. > > > Can you explain a little more why we need a global sequence count vs > > a per-disk one here? > > The point of the whole series is to have an unique sequence number for > all the disks. > Events can arrive to the userspace delayed or out-of-order, so this > helps to correlate events to the disk. > It might seem strange, but there isn't a way to do this yet, so I come > up with a global, monotonically incrementing number. To extend on this and given an example why the *global* sequence number matters: Consider you plug in a USB storage key, and it gets named /dev/sda. You unplug it, the kernel structures for that device all disappear. Then you plug in a different USB storage key, and since it's the only one it will too be called /dev/sda. With the global sequence number we can still distinguish these two devices even though otherwise they can look pretty much identical. If we had per-device counters then this would fall flat because the counter would be flushed out when the device disappears and when a device reappears under the same generic name we couldn't assign it a different sequence number than before. Thus: a global instead of local sequence number counter is absolutely *key* for the problem this is supposed to solve Lennart -- Lennart Poettering, Berlin
On 6/23/21 3:51 PM, Lennart Poettering wrote: > On Mi, 23.06.21 15:10, Matteo Croce (mcroce@linux.microsoft.com) wrote: > >> On Wed, Jun 23, 2021 at 1:49 PM Christoph Hellwig <hch@infradead.org> wrote: >>> >>> On Wed, Jun 23, 2021 at 12:58:53PM +0200, Matteo Croce wrote: >>>> +void inc_diskseq(struct gendisk *disk) >>>> +{ >>>> + static atomic64_t diskseq; >>> >>> Please don't hide file scope variables in functions. >>> >> >> I just didn't want to clobber that file namespace, as that is the only >> point where it's used. >> >>> Can you explain a little more why we need a global sequence count vs >>> a per-disk one here? >> >> The point of the whole series is to have an unique sequence number for >> all the disks. >> Events can arrive to the userspace delayed or out-of-order, so this >> helps to correlate events to the disk. >> It might seem strange, but there isn't a way to do this yet, so I come >> up with a global, monotonically incrementing number. > > To extend on this and given an example why the *global* sequence number > matters: > > Consider you plug in a USB storage key, and it gets named > /dev/sda. You unplug it, the kernel structures for that device all > disappear. Then you plug in a different USB storage key, and since > it's the only one it will too be called /dev/sda. > > With the global sequence number we can still distinguish these two > devices even though otherwise they can look pretty much identical. If > we had per-device counters then this would fall flat because the > counter would be flushed out when the device disappears and when a device > reappears under the same generic name we couldn't assign it a > different sequence number than before. > > Thus: a global instead of local sequence number counter is absolutely > *key* for the problem this is supposed to solve > Well ... except that you'll need to keep track of the numbers (otherwise you wouldn't know if the numbers changed, right?). And if you keep track of the numbers you probably will have to implement an uevent listener to get the events in time. But if you have an uevent listener you will also get the add/remove events for these devices. And if you get add and remove events you can as well implement sequence numbers in your application, seeing that you have all information allowing you to do so. So why burden the kernel with it? Cheers, Hannes
On Wed, 2021-06-23 at 16:01 +0200, Hannes Reinecke wrote: > On 6/23/21 3:51 PM, Lennart Poettering wrote: > > On Mi, 23.06.21 15:10, Matteo Croce (mcroce@linux.microsoft.com) wrote: > > > > > On Wed, Jun 23, 2021 at 1:49 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > On Wed, Jun 23, 2021 at 12:58:53PM +0200, Matteo Croce wrote: > > > > > +void inc_diskseq(struct gendisk *disk) > > > > > +{ > > > > > + static atomic64_t diskseq; > > > > > > > > Please don't hide file scope variables in functions. > > > > > > > > > > I just didn't want to clobber that file namespace, as that is the only > > > point where it's used. > > > > > > > Can you explain a little more why we need a global sequence count vs > > > > a per-disk one here? > > > > > > The point of the whole series is to have an unique sequence number for > > > all the disks. > > > Events can arrive to the userspace delayed or out-of-order, so this > > > helps to correlate events to the disk. > > > It might seem strange, but there isn't a way to do this yet, so I come > > > up with a global, monotonically incrementing number. > > > > To extend on this and given an example why the *global* sequence number > > matters: > > > > Consider you plug in a USB storage key, and it gets named > > /dev/sda. You unplug it, the kernel structures for that device all > > disappear. Then you plug in a different USB storage key, and since > > it's the only one it will too be called /dev/sda. > > > > With the global sequence number we can still distinguish these two > > devices even though otherwise they can look pretty much identical. If > > we had per-device counters then this would fall flat because the > > counter would be flushed out when the device disappears and when a device > > reappears under the same generic name we couldn't assign it a > > different sequence number than before. > > > > Thus: a global instead of local sequence number counter is absolutely > > *key* for the problem this is supposed to solve > > > Well ... except that you'll need to keep track of the numbers (otherwise > you wouldn't know if the numbers changed, right?). > And if you keep track of the numbers you probably will have to implement > an uevent listener to get the events in time. > But if you have an uevent listener you will also get the add/remove > events for these devices. > And if you get add and remove events you can as well implement sequence > numbers in your application, seeing that you have all information > allowing you to do so. > So why burden the kernel with it? > > Cheers, > > Hannes Hi, We need this so that we can reliably correlate events to instances of a device. Events alone cannot solve this problem, because events _are_ the problem.
On Mi, 23.06.21 16:01, Hannes Reinecke (hare@suse.de) wrote: > > Thus: a global instead of local sequence number counter is absolutely > > *key* for the problem this is supposed to solve > > > Well ... except that you'll need to keep track of the numbers (otherwise you > wouldn't know if the numbers changed, right?). > And if you keep track of the numbers you probably will have to implement an > uevent listener to get the events in time. Hmm? This is backwards. The goal here is to be able to safely match up uevents to specific uses of a block device, given that block device names are agressively recycled. you imply it was easy to know which device use a uevent belongs to. But that's the problem: it is not possible to do so safely. if i see a uevent for a block device "loop0" I cannot tell if it was from my own use of the device or for some previous user of it. And that's what we'd like to see fixed: i.e. we query the block device for the seqeno now used and then we can use that to filter the uevents and ignore the ones that do not carry the same sequence number as we got assigned for our user. Lennart -- Lennart Poettering, Berlin
On 6/23/21 4:07 PM, Luca Boccassi wrote: > On Wed, 2021-06-23 at 16:01 +0200, Hannes Reinecke wrote: >> On 6/23/21 3:51 PM, Lennart Poettering wrote: >>> On Mi, 23.06.21 15:10, Matteo Croce (mcroce@linux.microsoft.com) wrote: >>> >>>> On Wed, Jun 23, 2021 at 1:49 PM Christoph Hellwig <hch@infradead.org> wrote: >>>>> On Wed, Jun 23, 2021 at 12:58:53PM +0200, Matteo Croce wrote: >>>>>> +void inc_diskseq(struct gendisk *disk) >>>>>> +{ >>>>>> + static atomic64_t diskseq; >>>>> >>>>> Please don't hide file scope variables in functions. >>>>> >>>> >>>> I just didn't want to clobber that file namespace, as that is the only >>>> point where it's used. >>>> >>>>> Can you explain a little more why we need a global sequence count vs >>>>> a per-disk one here? >>>> >>>> The point of the whole series is to have an unique sequence number for >>>> all the disks. >>>> Events can arrive to the userspace delayed or out-of-order, so this >>>> helps to correlate events to the disk. >>>> It might seem strange, but there isn't a way to do this yet, so I come >>>> up with a global, monotonically incrementing number. >>> >>> To extend on this and given an example why the *global* sequence number >>> matters: >>> >>> Consider you plug in a USB storage key, and it gets named >>> /dev/sda. You unplug it, the kernel structures for that device all >>> disappear. Then you plug in a different USB storage key, and since >>> it's the only one it will too be called /dev/sda. >>> >>> With the global sequence number we can still distinguish these two >>> devices even though otherwise they can look pretty much identical. If >>> we had per-device counters then this would fall flat because the >>> counter would be flushed out when the device disappears and when a device >>> reappears under the same generic name we couldn't assign it a >>> different sequence number than before. >>> >>> Thus: a global instead of local sequence number counter is absolutely >>> *key* for the problem this is supposed to solve >>> >> Well ... except that you'll need to keep track of the numbers (otherwise >> you wouldn't know if the numbers changed, right?). >> And if you keep track of the numbers you probably will have to implement >> an uevent listener to get the events in time. >> But if you have an uevent listener you will also get the add/remove >> events for these devices. >> And if you get add and remove events you can as well implement sequence >> numbers in your application, seeing that you have all information >> allowing you to do so. >> So why burden the kernel with it? >> >> Cheers, >> >> Hannes > > Hi, > > We need this so that we can reliably correlate events to instances of a > device. Events alone cannot solve this problem, because events _are_ > the problem. > In which sense? Yes, events can be delayed (if you list to uevents), but if you listen to kernel events there shouldn't be a delay, right? Cheers, Hannes
On Wed, Jun 23, 2021 at 03:10:21PM +0200, Matteo Croce wrote: > I just didn't want to clobber that file namespace, as that is the only > point where it's used. It doesn't really clobber the file namespace. Declaring it normally at the top of the file makes it very clear we have global state. Hiding it in a function just causes obsfucation. > > Can you explain a little more why we need a global sequence count vs > > a per-disk one here? > > The point of the whole series is to have an unique sequence number for > all the disks. > Events can arrive to the userspace delayed or out-of-order, so this > helps to correlate events to the disk. > It might seem strange, but there isn't a way to do this yet, so I come > up with a global, monotonically incrementing number. Maybe add a comment to explain that?
On Wed, 2021-06-23 at 16:21 +0200, Hannes Reinecke wrote: > On 6/23/21 4:07 PM, Luca Boccassi wrote: > > On Wed, 2021-06-23 at 16:01 +0200, Hannes Reinecke wrote: > > > On 6/23/21 3:51 PM, Lennart Poettering wrote: > > > > On Mi, 23.06.21 15:10, Matteo Croce (mcroce@linux.microsoft.com) wrote: > > > > > > > > > On Wed, Jun 23, 2021 at 1:49 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > > > On Wed, Jun 23, 2021 at 12:58:53PM +0200, Matteo Croce wrote: > > > > > > > +void inc_diskseq(struct gendisk *disk) > > > > > > > +{ > > > > > > > + static atomic64_t diskseq; > > > > > > > > > > > > Please don't hide file scope variables in functions. > > > > > > > > > > > > > > > > I just didn't want to clobber that file namespace, as that is the only > > > > > point where it's used. > > > > > > > > > > > Can you explain a little more why we need a global sequence count vs > > > > > > a per-disk one here? > > > > > > > > > > The point of the whole series is to have an unique sequence number for > > > > > all the disks. > > > > > Events can arrive to the userspace delayed or out-of-order, so this > > > > > helps to correlate events to the disk. > > > > > It might seem strange, but there isn't a way to do this yet, so I come > > > > > up with a global, monotonically incrementing number. > > > > > > > > To extend on this and given an example why the *global* sequence number > > > > matters: > > > > > > > > Consider you plug in a USB storage key, and it gets named > > > > /dev/sda. You unplug it, the kernel structures for that device all > > > > disappear. Then you plug in a different USB storage key, and since > > > > it's the only one it will too be called /dev/sda. > > > > > > > > With the global sequence number we can still distinguish these two > > > > devices even though otherwise they can look pretty much identical. If > > > > we had per-device counters then this would fall flat because the > > > > counter would be flushed out when the device disappears and when a device > > > > reappears under the same generic name we couldn't assign it a > > > > different sequence number than before. > > > > > > > > Thus: a global instead of local sequence number counter is absolutely > > > > *key* for the problem this is supposed to solve > > > > > > > Well ... except that you'll need to keep track of the numbers (otherwise > > > you wouldn't know if the numbers changed, right?). > > > And if you keep track of the numbers you probably will have to implement > > > an uevent listener to get the events in time. > > > But if you have an uevent listener you will also get the add/remove > > > events for these devices. > > > And if you get add and remove events you can as well implement sequence > > > numbers in your application, seeing that you have all information > > > allowing you to do so. > > > So why burden the kernel with it? > > > > > > Cheers, > > > > > > Hannes > > > > Hi, > > > > We need this so that we can reliably correlate events to instances of a > > device. Events alone cannot solve this problem, because events _are_ > > the problem. > > > In which sense? > Yes, events can be delayed (if you list to uevents), but if you listen > to kernel events there shouldn't be a delay, right? > > Cheers, > > Hannes Hi, Userspace programs don't have exclusive usage rights on loopdev, so you can't reliably know if an uevent correlates to the loop0 you just added, or to the loop0 someone else added some time earlier. This series lets us do just that, reliably, without races, and fix long- standing bugs. Please see Lennart's reply, it has much more details. Thanks!
On Mi, 23.06.21 16:21, Hannes Reinecke (hare@suse.de) wrote: > > We need this so that we can reliably correlate events to instances of a > > device. Events alone cannot solve this problem, because events _are_ > > the problem. > > > In which sense? > Yes, events can be delayed (if you list to uevents), but if you listen to > kernel events there shouldn't be a delay, right? uevents are delivered to userpace via an AF_NETLINK socket. The AF_NETLINK socket is basically an asynchronous buffer. I mean, consider what you are saying: you establish the AF_NETLINK uevent watching socket, then you allocate /dev/loop0. Since you cannot do that atomically, you'll first have to do one, and then the other. But if you do that in two steps, then in the middle some other process might get scheduled that quickly allocates /dev/loop0 and releases it again, before your code gets to run. So now you have in your AF_NETLINK socket buffer the uevents for that other process' use of the device, and you cannot sanely distinguish them from your own. of course you could do it the other way: allocate the device first, and only then allocate the AF_NETLINK uevent socket. But then you might or might not lose the "add" event for the device you just allocated. And you don't know if you should wait for it or not. This isn't even a constructed issue, this is the common case if you have multiple processes all simultaneously trying to acquire a loopback block device, because they all will end up eying /dev/loop0 at the same time. But it gets worse IRL because of various factors. For example, partition probing is asynchronous, so if you use LO_FLAGS_PARTSCAN and want to watch for some partition device associated to your loopback block device to show up, this can take *really* long, so the race window is large. Or you actually use udev (like most userspace probably should) because you want the metainfo it collects about the device, in which case it will take even longer for the uevent to reach you, i.e. the time window where a previous user's uevents and your own for the same loopback device "overlap" can be quite large and you cannot determine if they are yours or the previous user's uevents — unless we have these new sequence numbers. Lennart -- Lennart Poettering, Berlin
On 6/23/21 4:12 PM, Lennart Poettering wrote: > On Mi, 23.06.21 16:01, Hannes Reinecke (hare@suse.de) wrote: > >>> Thus: a global instead of local sequence number counter is absolutely >>> *key* for the problem this is supposed to solve >>> >> Well ... except that you'll need to keep track of the numbers (otherwise you >> wouldn't know if the numbers changed, right?). >> And if you keep track of the numbers you probably will have to implement an >> uevent listener to get the events in time. > > Hmm? This is backwards. The goal here is to be able to safely match up > uevents to specific uses of a block device, given that block device > names are agressively recycled. > > you imply it was easy to know which device use a uevent belongs > to. But that's the problem: it is not possible to do so safely. if i > see a uevent for a block device "loop0" I cannot tell if it was from > my own use of the device or for some previous user of it. > > And that's what we'd like to see fixed: i.e. we query the block device > for the seqeno now used and then we can use that to filter the uevents > and ignore the ones that do not carry the same sequence number as we > got assigned for our user. > It is notoriously tricky to monitor the intended use-case for kernel devices, precisely because we do _not_ attach any additional information to it. I have send a proposal for LSF to implement block-namespaces, the prime use-case of which is indeed attaching cgroup/namespace information to block devices such that we _can_ match (block) devices to specific contexts. Which I rather prefer than adding sequence numbers to block devices; incidentally you could solve the same problem by _not_ reusing numbers aggressively but rather allocate the next free one after the most recently allocated one. Will give you much the same thing without having to burden others with it. The better alternative here would be to extend the loop ioctl to pass in an UUID when allocating the device. That way you can easily figure out whether the loop device has been modified. But in the end, it's the loop driver and I'm not particular bothered with it. I am, though, if you need to touch all drivers just to support one particular use-case in one particular device driver. Incidentally, we don't have this problem in SCSI as we _can_ identify devices here. So in the end we couldn't care less on which /dev/sdX device it ends up. And I guess that's what we should attempt for loop devices, too. Cheers, Hannes
On Wed, 2021-06-23 at 17:02 +0200, Hannes Reinecke wrote: > On 6/23/21 4:12 PM, Lennart Poettering wrote: > > On Mi, 23.06.21 16:01, Hannes Reinecke (hare@suse.de) wrote: > > > > > > Thus: a global instead of local sequence number counter is absolutely > > > > *key* for the problem this is supposed to solve > > > > > > > Well ... except that you'll need to keep track of the numbers (otherwise you > > > wouldn't know if the numbers changed, right?). > > > And if you keep track of the numbers you probably will have to implement an > > > uevent listener to get the events in time. > > > > Hmm? This is backwards. The goal here is to be able to safely match up > > uevents to specific uses of a block device, given that block device > > names are agressively recycled. > > > > you imply it was easy to know which device use a uevent belongs > > to. But that's the problem: it is not possible to do so safely. if i > > see a uevent for a block device "loop0" I cannot tell if it was from > > my own use of the device or for some previous user of it. > > > > And that's what we'd like to see fixed: i.e. we query the block device > > for the seqeno now used and then we can use that to filter the uevents > > and ignore the ones that do not carry the same sequence number as we > > got assigned for our user. > > > > It is notoriously tricky to monitor the intended use-case for kernel > devices, precisely because we do _not_ attach any additional information > to it. > I have send a proposal for LSF to implement block-namespaces, the prime > use-case of which is indeed attaching cgroup/namespace information to > block devices such that we _can_ match (block) devices to specific contexts. Having namespaces for block devices would be an awesome feature, very much looking forward to have that, as it solves a number of other issues we have. And while it could maybe be used in some instances of this particular problem, unfortunately I don't think it can solve all of them - in some real cases, we have to work in the root namespace as we are setting things up for it. > Which I rather prefer than adding sequence numbers to block devices; > incidentally you could solve the same problem by _not_ reusing numbers > aggressively but rather allocate the next free one after the most > recently allocated one. > Will give you much the same thing without having to burden others with it. If I understood this right, you are proposing to move the monothonically increasing sequence id to the device name, rather than as internal metadata? So that, eg, loop0 gets used exactly once and never again, and so on? Wouldn't that be a much more visible and disruptive change, potentially backward incompatible and breaking userspace left and right? > The better alternative here would be to extend the loop ioctl to pass in > an UUID when allocating the device. > That way you can easily figure out whether the loop device has been > modified. A UUID solves the problem we are currently talking about. But a monothonically increasing sequence number has additional great properties compared to a UUID, that we very much want to make use of (again these are all _real_ use cases), and were described in detail here: https://lore.kernel.org/linux-fsdevel/20210315201331.GA2577561@casper.infradead.org/t/#m3b1ffdffcc70a7c3ef4d7f13d0c2d5b9e4dde35a > But in the end, it's the loop driver and I'm not particular bothered > with it. I am, though, if you need to touch all drivers just to support > one particular use-case in one particular device driver. > > Incidentally, we don't have this problem in SCSI as we _can_ identify > devices here. So in the end we couldn't care less on which /dev/sdX > device it ends up. > And I guess that's what we should attempt for loop devices, too. Sorry, I'm not sure what you mean here by "touch all drivers"? This series changes only drivers/block/loop.c, no other device drivers code is touched?
On Mi, 23.06.21 17:02, Hannes Reinecke (hare@suse.de) wrote: > > you imply it was easy to know which device use a uevent belongs > > to. But that's the problem: it is not possible to do so safely. if i > > see a uevent for a block device "loop0" I cannot tell if it was from > > my own use of the device or for some previous user of it. > > > > And that's what we'd like to see fixed: i.e. we query the block device > > for the seqeno now used and then we can use that to filter the uevents > > and ignore the ones that do not carry the same sequence number as we > > got assigned for our user. > > It is notoriously tricky to monitor the intended use-case for kernel > devices, precisely because we do _not_ attach any additional information to > it. > I have send a proposal for LSF to implement block-namespaces, the prime > use-case of which is indeed attaching cgroup/namespace information to block > devices such that we _can_ match (block) devices to specific > contexts. The goal of the patchset is to make loopback block devices (and similar) safely and robustly concurrently allocatable from the main OS namespace, without any cgroup/containerization logic. In systemd we want to be able to allocate loopback block devices from any context, and concurrently without having to set up a cgroup/namespace first for each user for it. Any approach that binds two distinct subsystems like this together (e.g. "you need to set up cgroups to safely allocate loopback block devices") is really problematic for us, since we manage both, but independently and always with minimal privileges. > Which I rather prefer than adding sequence numbers to block devices; > incidentally you could solve the same problem by _not_ reusing numbers > aggressively but rather allocate the next free one after the most recently > allocated one. You are suggesting that instead of allocating loopback block devices always from the "bottom", i.e. always handing out from "loop0" on, with the lowest preferred we'd just always hand out "loop1", "loop2", … with strictly monotonically increasing numbres and never reuse "loop0" anymore and other names we already handed out? That would certainly work, but this would require quite some kernel rework, since the loopbck allocation API is really not designed to work like that right now. Moreover, the proposed sequence number stuff also covers floppies/cdroms and other stuff nicely, i.e. where drives stick around but their media changes. Also, USB sticks are currently effectively always called /dev/sda. It would be great to be able to distinguish each plug/replug too. Of course, you could argue that there too /dev/sda should never be reused, but strictly monotonically increasing /dev/sdb, /dev/sdc, … and so on, and I'd sympathize with that, but that makes it a major kernel rework, because basically every block subsystem would have to be reworked to never reuse block device names anymore. Also, i doubt people would be happy if they then regularly would have to deal with device names such as /dev/loop84763874658743 or /dev/sdzbghz just because their system has been running for a while. > The better alternative here would be to extend the loop ioctl to pass in an > UUID when allocating the device. > That way you can easily figure out whether the loop device has been > modified. UUIDs instead of sequence numbers would mostly solve our probelms too. i.e. chaotic, randomized assignment of identifiers instead of linearly progressing assignment of idenitifers. However I prefer sequence numbers as discussed in this thread before: they allow us to derive ordering from things: thus if you see an uevent with a seqnum smaller than the one you are interested in you know its worth waiting for the ones you are looking for to appear. But if you see a uevent with a seqnum greater than the one you are interested in then you know it's pointless to wait, the device has already been acquired by someone else. With randomized UUIDs you can't know that, since uses by other participants are only recognizable as distinct from your own but don't tell you if they are earlier or later than your own. After all the AF_NETLINK/uevent socket is lossy, so you must be prepared for dropped messages, hence it's reat if we can easily resync when your own messages get dropped. Lennart -- Lennart Poettering, Berlin
diff --git a/block/genhd.c b/block/genhd.c index 9f8cb7beaad1..c96b667136ee 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -1129,8 +1129,17 @@ static void disk_release(struct device *dev) blk_put_queue(disk->queue); kfree(disk); } + +static int block_uevent(struct device *dev, struct kobj_uevent_env *env) +{ + struct gendisk *disk = dev_to_disk(dev); + + return add_uevent_var(env, "DISKSEQ=%llu", disk->diskseq); +} + struct class block_class = { .name = "block", + .dev_uevent = block_uevent, }; static char *block_devnode(struct device *dev, umode_t *mode, @@ -1304,6 +1313,8 @@ struct gendisk *__alloc_disk_node(int minors, int node_id) disk_to_dev(disk)->class = &block_class; disk_to_dev(disk)->type = &disk_type; device_initialize(disk_to_dev(disk)); + inc_diskseq(disk); + return disk; out_destroy_part_tbl: @@ -1854,3 +1865,11 @@ static void disk_release_events(struct gendisk *disk) WARN_ON_ONCE(disk->ev && disk->ev->block != 1); kfree(disk->ev); } + +void inc_diskseq(struct gendisk *disk) +{ + static atomic64_t diskseq; + + disk->diskseq = atomic64_inc_return(&diskseq); +} +EXPORT_SYMBOL_GPL(inc_diskseq); diff --git a/include/linux/genhd.h b/include/linux/genhd.h index 6fc26f7bdf71..a0d04250a2db 100644 --- a/include/linux/genhd.h +++ b/include/linux/genhd.h @@ -167,6 +167,7 @@ struct gendisk { int node_id; struct badblocks *bb; struct lockdep_map lockdep_map; + u64 diskseq; }; /* @@ -306,6 +307,7 @@ static inline void bd_unlink_disk_holder(struct block_device *bdev, } #endif /* CONFIG_SYSFS */ +void inc_diskseq(struct gendisk *disk); dev_t blk_lookup_devt(const char *name, int partno); void blk_request_module(dev_t devt); #ifdef CONFIG_BLOCK