Message ID | 20230307-rust-drm-v1-0-917ff5bc80a8@asahilina.net (mailing list archive) |
---|---|
Headers | show |
Series | Rust DRM subsystem abstractions (& preview AGX driver) | expand |
That was supposed to have Markdown-style section headings, but I forgot that b4 considers a leading # as a comment... sorry for the abrupt topic changes... The intended headings are below. On 07/03/2023 23.25, Asahi Lina wrote: > Hi everyone! > > This is my first take on the Rust abstractions for the DRM > subsystem. It includes the abstractions themselves, some minor > prerequisite changes to the C side, as well as the drm-asahi GPU driver > (for reference on how the abstractions are used, but not necessarily > intended to land together). > > These patches apply on top of the tree at [1], which is based on > 6.3-rc1 with a large number of Rust abstraction/support commits added on > top. Most of these are not prerequisites for the DRM abstractions > themselves, but rather only of the driver. > > * #1-12 introduce the abstractions, module by module, with minor C > changes before the dependent abstraction. > * Patch 10 is a little addition to drm_sched that I ended up needing, > but I can pull it out of the abstraction into its own patch if > needed. > * #13-14 add a minor feature to drm/gem and its abstraction used > by the driver. > * #15-16 introduce the (unstable) asahi UAPI. This is obviously not > ready for merge yet, but comments are welcome! > * #17 adds a Rust helper macro to handle GPU core/firmware differences. > This probably belongs in the driver at this point, but right now it > has to live in rust/macros since there is no mechanism for per-driver > proc macros. > * #18 adds the driver proper, in one big commit, for reference purposes. ## Background > I've been working since mid last year on an Apple AGX GPU driver for > Linux, using the (at the time) out-of-tree Rust support. As part of this > effort, I've been writing safe Rust abstractions for portions of the DRM > subsystem. > > Now that Rust itself is upstream, I'd like to get all the abstractions > upstreamed so we can eventually get the driver upstreamed! > > These abstractions have been used by the driver since our release in > December [2], in a simpler synchronous-submission form: > > * drm::ioctl > * drm::device > * drm::drv > * drm::file > * drm::{gem, gem::shmem} > * drm::mm > > This series adds these too, which are used by the explicit sync refactor > of the driver (the version in this series): > > * drm::syncobj > * drm::sched > * dma_fence > > The major dependencies for the DRM abstractions themselves are: > > * [3] rust: error: Add missing wrappers to convert to/from kernel error codes > * [4] rust: Miscellaneous macro improvements > * [5] rust: Add a Sealed trait > * [6] rust: device: Add a minimal RawDevice trait > * [7] rust: Enable the new_uninit feature for kernel and driver crates > * [8] rust: ioctl: Add ioctl number manipulation functions > * [9] rust: sync: Arc: Any downcasting and assume_init() > * rust: Add `container_of` and `offset_of` macros > * kernel::sync::mutex and dependencies > > Most of these (the ones with links) have already been submitted, and I > expect all of them to land for 6.4 (the mutex one will likely be last, > since there is some refactoring that will happen over the current state > to make it more ergonomic to use). The mutex dep is only necessary for > drm::mm and dma_fence, and transitively drm::syncobj and drm::sched. ## State > Things work! We've had most of the abstractions in production edge > kernels with the driver, and the new explicit sync stuff has passed > quite a few torture tests (this is how we found the drm_sched issue, > patch 11). > > The abstractions are intended to be safe (safety review very welcome!). > While writing them, I tried to avoid making any changes to the C side > unless absolutely necessary. I understand that it will probably make > sense to adjust the C side to make some things easier, but I wanted to > start from this as a baseline. > > Known issues: > > - The existing Rust integration does not currently allow building > abstractions as modules, so the Rust abstractions are only available > for DRM components that are built in. I added some extra Kconfig > symbols to deal with this, so a driver built as a module can depende > on having those built in. This should go away in the future (but may > not be ready in time for submission... I understand this probably > shouldn't be a blocker though?). > > - DRM relies heavily on the "subclassing" pattern for driver objects, > and this doesn't map well to Rust. I tried several approaches for > various bits, so we can see how they work out. In particular, whether > wrapper types should pretend to be smart pointers and Deref to their > inner driver-specific types, and whether they should be marked as > method receivers (Yuck, internal rustc implementation hacks! But > Arc<T> already does the same thing and it makes usage in > driver-implemented callbacks as `self` possible) are things I'd love > to discuss ^^. > > - Only what I need for my driver is implemented (plus a small amount of > obvious extras where better API completeness makes sense). I think the > general idea with Rust abstractions is that we add things as they > become necessary. > > - The plain GEM vs. GEM-shmem duality ended up with quite a hairy type > hierarchy. I'd love to figure out how to make this simpler... > > - drm::mm ends up requiring a built-in mutex in the abstraction, instead > of delegating that to the user with the usual Rust mutability rules. > This is because nodes can be dropped at any time, and those operations > need to be synchronized. We could try to avoid forbidding those drops > or mark the node type !Send, but that would make it a lot less > ergonomic to use... > > I'm looking for feedback on the abstractions of all kinds, so we can > move towards an upstreamable version. Optimistically, I'd love to get > this upstream for 6.5, and the driver for 6.6. > > Please feel free to ask any questions about the Rust bits, since I know > a lot of this is new to many of the C folks! ## About the drm-asahi driver > This is a fairly complete driver for Apple AGX G13 and G14 series GPUs. > > The driver today supports the Apple M1, M1 Pro, M1 Max, M1 Ultra, and M2 > SoCs, across two firmware revisions each. It has an explicit sync UAPI > heavily inspired by the upcoming Intel Xe UAPI, designed with Vulkan > support in mind. On the Mesa side we currently have a Gallium driver > that is mostly already upstream (missing the UAPI bits mostly) and > passes the dEQP GLES2/EGL tests, with most of GLES3.0 passing in > downstream work-in-progress branches. This is a reverse engineered > community driver (we have no hardware documentation of any kind, other > than some hints from aspects shared with PowerVR). > > While developing the driver, I tried to make use of Rust's safety and > lifetime features to provide not just CPU-side safety, but also > partial firmware-ABI safety. Thanks to this, it has turned out to be > a very stable driver even though GPU firmware crashes are fatal (no > restart capability, need to reboot!) and the FW/driver interface is a > huge mess of unsafe shared memory structures with complex pointer > chains. There are over 70 ABI types and 3000+ lines of firmware ABI type > definitions that vary between firmware builds and GPU cores... > > In a simpler blocking-submission form, it has been shipping in Asahi > Linux edge kernels since December [2], with lots of users and zero (!) > reported oopses (and only a couple reports of GPU firmware crashes, > though that issue should now be fixed). It has survived OOM scenarios > (Rust makes error cleanup easy!), UAPI-level fuzzing, countless broken > Mesa builds, uptimes of 40+ days, and more. > > The explicit sync refactor significantly increases performance (and > potential problems), but this version has survived a lot of torture > with dEQP/piglit tests and some manual corner case testing. > > In other words, Rust works! ^^ > > There are some design notes on the driver and further links at [10]. ## Links > [1] https://github.com/AsahiLinux/linux.git drm-rfc-base-20230307 > [2] https://asahilinux.org/2022/12/gpu-drivers-now-in-asahi-linux/ > [3] https://lore.kernel.org/rust-for-linux/20230224-rust-error-v1-0-f8f9a9a87303@asahilina.net/T/ > [4] https://lore.kernel.org/rust-for-linux/20230224-rust-macros-v1-0-b39fae46e102@asahilina.net/T/ > [5] https://lore.kernel.org/rust-for-linux/20230224-rust-iopt-rtkit-v1-0-49ced3391295@asahilina.net/T/#m515bad2cff7f5a46f55897e6b73c6c2f1fb2c638 > [6] https://lore.kernel.org/rust-for-linux/20230224-rust-iopt-rtkit-v1-0-49ced3391295@asahilina.net/T/#m4c64e390c43b3ff1b8470fc8b37eaf87f6e12c94 > [7] https://lore.kernel.org/rust-for-linux/CQV7ZNT6LMXI.1XG4YXSH8I7JK@vincent-arch/T/ > [8] https://lore.kernel.org/rust-for-linux/61f734d6-1497-755f-3632-3f261b890846@asahilina.net/T/ > [9] https://lore.kernel.org/rust-for-linux/20230224-rust-arc-v1-0-568eea613a41@asahilina.net/T/ > [10] https://github.com/AsahiLinux/docs/wiki/SW:AGX-driver-notes ~~ Lina
On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: > +/// Look up a GEM object handle for a `File` and return an `ObjectRef` for it. > +pub(crate) fn lookup_handle(file: &DrmFile, handle: u32) -> Result<ObjectRef> { > + Ok(ObjectRef::new(shmem::Object::lookup_handle(file, handle)?)) > +} So maybe my expectations for rust typing is a bit too much, but I kinda expected this to be fully generic: - trait Driver (drm_driver) knows the driver's object type - a generic create_handle function could ensure that for drm_file (which is always for a specific drm_device and hence Driver) can ensure at the type level that you only put the right objects into the drm_file - a generic lookup_handle function on the drm_file knows the Driver trait and so can give you back the right type right away. Why the wrapping, what do I miss? -Daniel
On 05/04/2023 23.37, Daniel Vetter wrote: > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: >> +/// A generic monotonically incrementing ID used to uniquely identify object instances within the >> +/// driver. >> +pub(crate) struct ID(AtomicU64); >> + >> +impl ID { >> + /// Create a new ID counter with a given value. >> + fn new(val: u64) -> ID { >> + ID(AtomicU64::new(val)) >> + } >> + >> + /// Fetch the next unique ID. >> + pub(crate) fn next(&self) -> u64 { >> + self.0.fetch_add(1, Ordering::Relaxed) >> + } >> +} > > Continuing the theme of me commenting on individual things, I stumbled > over this because I noticed that there's a lot of id based lookups where I > don't expect them, and started chasing. > > - For ids use xarray, not atomic counters. Yes I know dma_fence timelines > gets this wrong, this goes back to an innocent time where we didn't > allocate more than one timeline per engine, and no one fixed it since > then. Yes u64 should be big enough for everyone :-/ > > - Attaching ID spaces to drm_device is also not great. drm is full of > these mistakes. Much better if their per drm_file and so private to each > client. > > - They shouldn't be used for anything else than uapi id -> kernel object > lookup at the beginning of ioctl code, and nowhere else. At least from > skimming it seems like these are used all over the driver codebase, > which does freak me out. At least on the C side that's a clear indicator > for a refcount/lockin/data structure model that's not thought out at > all. > > What's going on here, what do I miss? These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use xarray and are per-File). Most of them are just for debugging, so that when I enable full debug spam I have some way to correlate different things that are happening together (this subset of interleaved log lines relate to the same submission). Basically just object names that are easier to read (and less of a security leak) than pointers and guaranteed not to repeat. You could get rid of most of them and it wouldn't affect the driver design, it just makes it very hard to see what's going on with debug logs ^^; There are only two that are ever used for non-debugging purposes: the VM ID, and the File ID. Both are per-device global IDs attached to the VMs (not the UAPI VM objects, but rather the underlyng MMU address space managers they represent, including the kernel-internal ones) and to Files themselves. They are used for destroying GEM objects: since the objects are also device-global across multiple clients, I need a way to do things like "clean up all mappings for this File" or "clean up all mappings for this VM". There's an annoying circular reference between GEM objects and their mappings, which is why this is explicitly coded out in destroy paths instead of naturally happening via Drop semantics (without that cleanup code, the circular reference leaks it). So e.g. when a File does a GEM close or explicitly asks for all mappings of an object to be removed, it goes out to the (possibly shared) GEM object and tells it to drop all mappings marked as owned by that unique File ID. When an explicit "unmap all in VM" op happens, it asks the GEM object to drop all mappings for that underlying VM ID. Similarly, when a UAPI VM object is dropped (in the Drop impl, so both explicitly and when the whole File/xarray is dropped and such), that does an explicit unmap of a special dummy object it owns which would otherwise leak since it is not tracked as a GEM object owned by that File and therefore not handled by GEM closing. And again along the same lines, the allocators in alloc.rs explicitly destroy the mappings for their backing GEM objects on Drop. All this is due to that annoying circular reference between VMs and GEM objects that I'm not sure how to fix. Note that if I *don't* do this (or forget to do it somewhere) the consequence is just that we leak memory, and if you try to destroy the wrong IDs somehow the worst that can happen is you unmap things you shouldn't and fault the GPU (or, in the kernel or kernel-managed user VM cases, potentially the firmware). Rust safety guarantees still keep things from going entirely off the rails within the kernel, since everything that matters is reference counted (which is why these reference cycles are possible at all). This all started when I was looking at the panfrost driver for reference. It does the same thing except it uses actual pointers to the owning entities instead of IDs, and pointer comparison (see panfrost_gem_close). Of course you could try do that in Rust too (literally storing and comparing raw pointers that aren't owned references), but then you're introducing a Pin<> requirement on those objects to make their addresses stable and it feels way more icky and error-prone than unique IDs (since addresses can be reused). panfrost only has a single mmu (what I call the raw VM) per File while I have an arbitrary number, which is why I end up with the extra distinction/complexity of both File and VM IDs, but the concept is the same. Some of this is going to be refactored when I implement arbitrary VM range mapping/unmapping, which would be a good time to improve this... but is there something particularly wrong/broken about the way I'm doing it now that I missed? I figured unique u64 IDs would be a pretty safe way to identify entities and cleanup the mappings when needed. ~~ Lina
On 05/04/2023 23.44, Daniel Vetter wrote: > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: >> +/// Look up a GEM object handle for a `File` and return an `ObjectRef` for it. >> +pub(crate) fn lookup_handle(file: &DrmFile, handle: u32) -> Result<ObjectRef> { >> + Ok(ObjectRef::new(shmem::Object::lookup_handle(file, handle)?)) >> +} > > So maybe my expectations for rust typing is a bit too much, but I kinda > expected this to be fully generic: > > - trait Driver (drm_driver) knows the driver's object type > - a generic create_handle function could ensure that for drm_file (which > is always for a specific drm_device and hence Driver) can ensure at the > type level that you only put the right objects into the drm_file > - a generic lookup_handle function on the drm_file knows the Driver trait > and so can give you back the right type right away. > > Why the wrapping, what do I miss? Sigh, so this is one of the many ways I'm trying to work around the "Rust doesn't do subclasses" problem (so we can figure out what the best one is ^^). The generic shmem::Object::lookup_handle() call *is* fully generic and will get you back a driver-specific object. But since Rust doesn't do subclassing, what you get back isn't a driver-specific type T, but rather a (reference to a) shmem::Object<T>. T represents the inner driver-specific data/functionality (only), and the outer shmem::Object<T> includes the actual drm_gem_shmem_object plus a T. This is backwards from C, where you expect the opposite situation where T contains a shmem object, but that just doesn't work with Rust because there's no way to build a safe API around that model as far as I know. Now the problem is from the higher layers I want object operations that interact with the shmem::Object<T> (that is, they call generic GEM functions on the object). Options so far: 1. Add an outer wrapper and put that functionality there. 2. Just have the functions on T as helpers, so you need to call T::foo(obj) instead of obj.foo(). 3. Use the undocumented method receiver trait thing to make shmem::Object<T> a valid `self` type, plus add auto-Deref to shmem::Object. Then obj.foo() works. #1 is what I use here. #2 is how the driver-specific File ioctl callbacks are implemented, and also sched::Job<T>. #3 is used for fence callbacks (FenceObject<T>). None of them are great, and I'd love to hear what people think of the various options... There are other unexplored options, like in this GEM case it could be covered with a driver-internal auxiliary trait impl'd on shmem::Object<T> buuut that doesn't work when you actually need callbacks on T itself to circle back to shmem::Object<T>, as is the case with File/Job/FenceObject. ~~ Lina
Argh. This (and my other reply) was supposed to go to Daniel, but Thunderbird... just dropped that recipient? And then my silly brain saw all the Cc:s go to To: and figured it was some weird consolidation and so I moved everything to Cc: except the only name that started with "Da" and... yeah, that wasn't the same person. Sorry for the confusion... I have no idea why Thunderbird hates Daniel... On 06/04/2023 13.44, Asahi Lina wrote: > On 05/04/2023 23.37, Daniel Vetter wrote: >> On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: >>> +/// A generic monotonically incrementing ID used to uniquely identify object instances within the >>> +/// driver. >>> +pub(crate) struct ID(AtomicU64); >>> + >>> +impl ID { >>> + /// Create a new ID counter with a given value. >>> + fn new(val: u64) -> ID { >>> + ID(AtomicU64::new(val)) >>> + } >>> + >>> + /// Fetch the next unique ID. >>> + pub(crate) fn next(&self) -> u64 { >>> + self.0.fetch_add(1, Ordering::Relaxed) >>> + } >>> +} >> >> Continuing the theme of me commenting on individual things, I stumbled >> over this because I noticed that there's a lot of id based lookups where I >> don't expect them, and started chasing. >> >> - For ids use xarray, not atomic counters. Yes I know dma_fence timelines >> gets this wrong, this goes back to an innocent time where we didn't >> allocate more than one timeline per engine, and no one fixed it since >> then. Yes u64 should be big enough for everyone :-/ >> >> - Attaching ID spaces to drm_device is also not great. drm is full of >> these mistakes. Much better if their per drm_file and so private to each >> client. >> >> - They shouldn't be used for anything else than uapi id -> kernel object >> lookup at the beginning of ioctl code, and nowhere else. At least from >> skimming it seems like these are used all over the driver codebase, >> which does freak me out. At least on the C side that's a clear indicator >> for a refcount/lockin/data structure model that's not thought out at >> all. >> >> What's going on here, what do I miss? > > These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use > xarray and are per-File). Most of them are just for debugging, so that > when I enable full debug spam I have some way to correlate different > things that are happening together (this subset of interleaved log lines > relate to the same submission). Basically just object names that are > easier to read (and less of a security leak) than pointers and > guaranteed not to repeat. You could get rid of most of them and it > wouldn't affect the driver design, it just makes it very hard to see > what's going on with debug logs ^^; > > There are only two that are ever used for non-debugging purposes: the VM > ID, and the File ID. Both are per-device global IDs attached to the VMs > (not the UAPI VM objects, but rather the underlyng MMU address space > managers they represent, including the kernel-internal ones) and to > Files themselves. They are used for destroying GEM objects: since the > objects are also device-global across multiple clients, I need a way to > do things like "clean up all mappings for this File" or "clean up all > mappings for this VM". There's an annoying circular reference between > GEM objects and their mappings, which is why this is explicitly coded > out in destroy paths instead of naturally happening via Drop semantics > (without that cleanup code, the circular reference leaks it). > > So e.g. when a File does a GEM close or explicitly asks for all mappings > of an object to be removed, it goes out to the (possibly shared) GEM > object and tells it to drop all mappings marked as owned by that unique > File ID. When an explicit "unmap all in VM" op happens, it asks the GEM > object to drop all mappings for that underlying VM ID. Similarly, when a > UAPI VM object is dropped (in the Drop impl, so both explicitly and when > the whole File/xarray is dropped and such), that does an explicit unmap > of a special dummy object it owns which would otherwise leak since it is > not tracked as a GEM object owned by that File and therefore not handled > by GEM closing. And again along the same lines, the allocators in > alloc.rs explicitly destroy the mappings for their backing GEM objects > on Drop. All this is due to that annoying circular reference between VMs > and GEM objects that I'm not sure how to fix. > > Note that if I *don't* do this (or forget to do it somewhere) the > consequence is just that we leak memory, and if you try to destroy the > wrong IDs somehow the worst that can happen is you unmap things you > shouldn't and fault the GPU (or, in the kernel or kernel-managed user VM > cases, potentially the firmware). Rust safety guarantees still keep > things from going entirely off the rails within the kernel, since > everything that matters is reference counted (which is why these > reference cycles are possible at all). > > This all started when I was looking at the panfrost driver for > reference. It does the same thing except it uses actual pointers to the > owning entities instead of IDs, and pointer comparison (see > panfrost_gem_close). Of course you could try do that in Rust too > (literally storing and comparing raw pointers that aren't owned > references), but then you're introducing a Pin<> requirement on those > objects to make their addresses stable and it feels way more icky and > error-prone than unique IDs (since addresses can be reused). panfrost > only has a single mmu (what I call the raw VM) per File while I have an > arbitrary number, which is why I end up with the extra > distinction/complexity of both File and VM IDs, but the concept is the same. > > Some of this is going to be refactored when I implement arbitrary VM > range mapping/unmapping, which would be a good time to improve this... > but is there something particularly wrong/broken about the way I'm doing > it now that I missed? I figured unique u64 IDs would be a pretty safe > way to identify entities and cleanup the mappings when needed. > > ~~ Lina > ~~ Lina
Same as the prior email, this was supposed to go to Daniel... On 06/04/2023 14.02, Asahi Lina wrote: > On 05/04/2023 23.44, Daniel Vetter wrote: >> On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: >>> +/// Look up a GEM object handle for a `File` and return an `ObjectRef` for it. >>> +pub(crate) fn lookup_handle(file: &DrmFile, handle: u32) -> Result<ObjectRef> { >>> + Ok(ObjectRef::new(shmem::Object::lookup_handle(file, handle)?)) >>> +} >> >> So maybe my expectations for rust typing is a bit too much, but I kinda >> expected this to be fully generic: >> >> - trait Driver (drm_driver) knows the driver's object type >> - a generic create_handle function could ensure that for drm_file (which >> is always for a specific drm_device and hence Driver) can ensure at the >> type level that you only put the right objects into the drm_file >> - a generic lookup_handle function on the drm_file knows the Driver trait >> and so can give you back the right type right away. >> >> Why the wrapping, what do I miss? > > Sigh, so this is one of the many ways I'm trying to work around the > "Rust doesn't do subclasses" problem (so we can figure out what the best > one is ^^). > > The generic shmem::Object::lookup_handle() call *is* fully generic and > will get you back a driver-specific object. But since Rust doesn't do > subclassing, what you get back isn't a driver-specific type T, but > rather a (reference to a) shmem::Object<T>. T represents the inner > driver-specific data/functionality (only), and the outer > shmem::Object<T> includes the actual drm_gem_shmem_object plus a T. This > is backwards from C, where you expect the opposite situation where T > contains a shmem object, but that just doesn't work with Rust because > there's no way to build a safe API around that model as far as I know. > > Now the problem is from the higher layers I want object operations that > interact with the shmem::Object<T> (that is, they call generic GEM > functions on the object). Options so far: > > 1. Add an outer wrapper and put that functionality there. > 2. Just have the functions on T as helpers, so you need to call > T::foo(obj) instead of obj.foo(). > 3. Use the undocumented method receiver trait thing to make > shmem::Object<T> a valid `self` type, plus add auto-Deref to > shmem::Object. Then obj.foo() works. > > #1 is what I use here. #2 is how the driver-specific File ioctl > callbacks are implemented, and also sched::Job<T>. #3 is used for fence > callbacks (FenceObject<T>). None of them are great, and I'd love to hear > what people think of the various options... > > There are other unexplored options, like in this GEM case it could be > covered with a driver-internal auxiliary trait impl'd on > shmem::Object<T> buuut that doesn't work when you actually need > callbacks on T itself to circle back to shmem::Object<T>, as is the case > with File/Job/FenceObject. > > ~~ Lina > ~~ Lina
On Thu, Apr 06, 2023 at 01:44:22PM +0900, Asahi Lina wrote: > On 05/04/2023 23.37, Daniel Vetter wrote: > > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: > > > +/// A generic monotonically incrementing ID used to uniquely identify object instances within the > > > +/// driver. > > > +pub(crate) struct ID(AtomicU64); > > > + > > > +impl ID { > > > + /// Create a new ID counter with a given value. > > > + fn new(val: u64) -> ID { > > > + ID(AtomicU64::new(val)) > > > + } > > > + > > > + /// Fetch the next unique ID. > > > + pub(crate) fn next(&self) -> u64 { > > > + self.0.fetch_add(1, Ordering::Relaxed) > > > + } > > > +} > > > > Continuing the theme of me commenting on individual things, I stumbled > > over this because I noticed that there's a lot of id based lookups where I > > don't expect them, and started chasing. > > > > - For ids use xarray, not atomic counters. Yes I know dma_fence timelines > > gets this wrong, this goes back to an innocent time where we didn't > > allocate more than one timeline per engine, and no one fixed it since > > then. Yes u64 should be big enough for everyone :-/ > > > > - Attaching ID spaces to drm_device is also not great. drm is full of > > these mistakes. Much better if their per drm_file and so private to each > > client. > > > > - They shouldn't be used for anything else than uapi id -> kernel object > > lookup at the beginning of ioctl code, and nowhere else. At least from > > skimming it seems like these are used all over the driver codebase, > > which does freak me out. At least on the C side that's a clear indicator > > for a refcount/lockin/data structure model that's not thought out at > > all. > > > > What's going on here, what do I miss? > > These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use > xarray and are per-File). Most of them are just for debugging, so that when > I enable full debug spam I have some way to correlate different things that > are happening together (this subset of interleaved log lines relate to the > same submission). Basically just object names that are easier to read (and > less of a security leak) than pointers and guaranteed not to repeat. You > could get rid of most of them and it wouldn't affect the driver design, it > just makes it very hard to see what's going on with debug logs ^^; Hm generally we just print the kernel addresses with the right printk modifiers. Those filter/hash addresses if you have the right paranoia settings enabled. I guess throwing in a debug id doesn't hurt, but would be good to make that a lot more clearer. I haven't read the full driver yet because I'm still too much lost, that's why I guess I missed the xarray stuff on the file. I'll try and go understand that. For the big topic below I need to think more. -Daniel > There are only two that are ever used for non-debugging purposes: the VM ID, > and the File ID. Both are per-device global IDs attached to the VMs (not the > UAPI VM objects, but rather the underlyng MMU address space managers they > represent, including the kernel-internal ones) and to Files themselves. They > are used for destroying GEM objects: since the objects are also > device-global across multiple clients, I need a way to do things like "clean > up all mappings for this File" or "clean up all mappings for this VM". > There's an annoying circular reference between GEM objects and their > mappings, which is why this is explicitly coded out in destroy paths instead > of naturally happening via Drop semantics (without that cleanup code, the > circular reference leaks it). > > So e.g. when a File does a GEM close or explicitly asks for all mappings of > an object to be removed, it goes out to the (possibly shared) GEM object and > tells it to drop all mappings marked as owned by that unique File ID. When > an explicit "unmap all in VM" op happens, it asks the GEM object to drop all > mappings for that underlying VM ID. Similarly, when a UAPI VM object is > dropped (in the Drop impl, so both explicitly and when the whole File/xarray > is dropped and such), that does an explicit unmap of a special dummy object > it owns which would otherwise leak since it is not tracked as a GEM object > owned by that File and therefore not handled by GEM closing. And again along > the same lines, the allocators in alloc.rs explicitly destroy the mappings > for their backing GEM objects on Drop. All this is due to that annoying > circular reference between VMs and GEM objects that I'm not sure how to fix. > > Note that if I *don't* do this (or forget to do it somewhere) the > consequence is just that we leak memory, and if you try to destroy the wrong > IDs somehow the worst that can happen is you unmap things you shouldn't and > fault the GPU (or, in the kernel or kernel-managed user VM cases, > potentially the firmware). Rust safety guarantees still keep things from > going entirely off the rails within the kernel, since everything that > matters is reference counted (which is why these reference cycles are > possible at all). > > This all started when I was looking at the panfrost driver for reference. It > does the same thing except it uses actual pointers to the owning entities > instead of IDs, and pointer comparison (see panfrost_gem_close). Of course > you could try do that in Rust too (literally storing and comparing raw > pointers that aren't owned references), but then you're introducing a Pin<> > requirement on those objects to make their addresses stable and it feels way > more icky and error-prone than unique IDs (since addresses can be reused). > panfrost only has a single mmu (what I call the raw VM) per File while I > have an arbitrary number, which is why I end up with the extra > distinction/complexity of both File and VM IDs, but the concept is the same. > > Some of this is going to be refactored when I implement arbitrary VM range > mapping/unmapping, which would be a good time to improve this... but is > there something particularly wrong/broken about the way I'm doing it now > that I missed? I figured unique u64 IDs would be a pretty safe way to > identify entities and cleanup the mappings when needed. > > ~~ Lina > > _______________________________________________ > Linaro-mm-sig mailing list -- linaro-mm-sig@lists.linaro.org > To unsubscribe send an email to linaro-mm-sig-leave@lists.linaro.org
On Thu, Apr 06, 2023 at 02:02:55PM +0900, Asahi Lina wrote: > On 05/04/2023 23.44, Daniel Vetter wrote: > > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: > > > +/// Look up a GEM object handle for a `File` and return an `ObjectRef` for it. > > > +pub(crate) fn lookup_handle(file: &DrmFile, handle: u32) -> Result<ObjectRef> { > > > + Ok(ObjectRef::new(shmem::Object::lookup_handle(file, handle)?)) > > > +} > > > > So maybe my expectations for rust typing is a bit too much, but I kinda > > expected this to be fully generic: > > > > - trait Driver (drm_driver) knows the driver's object type > > - a generic create_handle function could ensure that for drm_file (which > > is always for a specific drm_device and hence Driver) can ensure at the > > type level that you only put the right objects into the drm_file > > - a generic lookup_handle function on the drm_file knows the Driver trait > > and so can give you back the right type right away. > > > > Why the wrapping, what do I miss? > > Sigh, so this is one of the many ways I'm trying to work around the "Rust > doesn't do subclasses" problem (so we can figure out what the best one is > ^^). > > The generic shmem::Object::lookup_handle() call *is* fully generic and will > get you back a driver-specific object. But since Rust doesn't do > subclassing, what you get back isn't a driver-specific type T, but rather a > (reference to a) shmem::Object<T>. T represents the inner driver-specific > data/functionality (only), and the outer shmem::Object<T> includes the > actual drm_gem_shmem_object plus a T. This is backwards from C, where you > expect the opposite situation where T contains a shmem object, but that just > doesn't work with Rust because there's no way to build a safe API around > that model as far as I know. Ah I think I just got confused. I did untangle (I think at least) the Object<T> trick, I guess the only thing that confused me here is why this is in the shmem module? Or is that the rust problem again? I'd kinda have expected that we'd have a gem::Object<T> here that the lookup_handle function returns. So for the shmem case I guess that would then be gem::Object<shmem::Object<T>> for the driver type T with driver specific stuff? I guess not very pretty ... > Now the problem is from the higher layers I want object operations that > interact with the shmem::Object<T> (that is, they call generic GEM functions > on the object). Options so far: > > 1. Add an outer wrapper and put that functionality there. > 2. Just have the functions on T as helpers, so you need to call T::foo(obj) > instead of obj.foo(). > 3. Use the undocumented method receiver trait thing to make shmem::Object<T> > a valid `self` type, plus add auto-Deref to shmem::Object. Then obj.foo() > works. > > #1 is what I use here. #2 is how the driver-specific File ioctl callbacks > are implemented, and also sched::Job<T>. #3 is used for fence callbacks > (FenceObject<T>). None of them are great, and I'd love to hear what people > think of the various options... > > There are other unexplored options, like in this GEM case it could be > covered with a driver-internal auxiliary trait impl'd on shmem::Object<T> > buuut that doesn't work when you actually need callbacks on T itself to > circle back to shmem::Object<T>, as is the case with File/Job/FenceObject. Ok I think I'm completely lost here. But I also havent' looked at how this is all really used in the driver, it's really just the shmem:: module in the lookup_handle function which looked strange to me. -Daniel
On Thu, Apr 06, 2023 at 02:09:21PM +0900, Asahi Lina wrote: > Argh. This (and my other reply) was supposed to go to Daniel, but > Thunderbird... just dropped that recipient? And then my silly brain saw all > the Cc:s go to To: and figured it was some weird consolidation and so I > moved everything to Cc: except the only name that started with "Da" and... > yeah, that wasn't the same person. > > Sorry for the confusion... I have no idea why Thunderbird hates Daniel... Don't worry, I get cc'ed on so much stuff that whether I'm cc'ed or not has zero impact on whether I'll read a mail or not. It just kinda disappears into the big lable:cc bucket ... -Daniel > > On 06/04/2023 13.44, Asahi Lina wrote: > > On 05/04/2023 23.37, Daniel Vetter wrote: > > > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: > > > > +/// A generic monotonically incrementing ID used to uniquely identify object instances within the > > > > +/// driver. > > > > +pub(crate) struct ID(AtomicU64); > > > > + > > > > +impl ID { > > > > + /// Create a new ID counter with a given value. > > > > + fn new(val: u64) -> ID { > > > > + ID(AtomicU64::new(val)) > > > > + } > > > > + > > > > + /// Fetch the next unique ID. > > > > + pub(crate) fn next(&self) -> u64 { > > > > + self.0.fetch_add(1, Ordering::Relaxed) > > > > + } > > > > +} > > > > > > Continuing the theme of me commenting on individual things, I stumbled > > > over this because I noticed that there's a lot of id based lookups where I > > > don't expect them, and started chasing. > > > > > > - For ids use xarray, not atomic counters. Yes I know dma_fence timelines > > > gets this wrong, this goes back to an innocent time where we didn't > > > allocate more than one timeline per engine, and no one fixed it since > > > then. Yes u64 should be big enough for everyone :-/ > > > > > > - Attaching ID spaces to drm_device is also not great. drm is full of > > > these mistakes. Much better if their per drm_file and so private to each > > > client. > > > > > > - They shouldn't be used for anything else than uapi id -> kernel object > > > lookup at the beginning of ioctl code, and nowhere else. At least from > > > skimming it seems like these are used all over the driver codebase, > > > which does freak me out. At least on the C side that's a clear indicator > > > for a refcount/lockin/data structure model that's not thought out at > > > all. > > > > > > What's going on here, what do I miss? > > > > These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use > > xarray and are per-File). Most of them are just for debugging, so that > > when I enable full debug spam I have some way to correlate different > > things that are happening together (this subset of interleaved log lines > > relate to the same submission). Basically just object names that are > > easier to read (and less of a security leak) than pointers and > > guaranteed not to repeat. You could get rid of most of them and it > > wouldn't affect the driver design, it just makes it very hard to see > > what's going on with debug logs ^^; > > > > There are only two that are ever used for non-debugging purposes: the VM > > ID, and the File ID. Both are per-device global IDs attached to the VMs > > (not the UAPI VM objects, but rather the underlyng MMU address space > > managers they represent, including the kernel-internal ones) and to > > Files themselves. They are used for destroying GEM objects: since the > > objects are also device-global across multiple clients, I need a way to > > do things like "clean up all mappings for this File" or "clean up all > > mappings for this VM". There's an annoying circular reference between > > GEM objects and their mappings, which is why this is explicitly coded > > out in destroy paths instead of naturally happening via Drop semantics > > (without that cleanup code, the circular reference leaks it). > > > > So e.g. when a File does a GEM close or explicitly asks for all mappings > > of an object to be removed, it goes out to the (possibly shared) GEM > > object and tells it to drop all mappings marked as owned by that unique > > File ID. When an explicit "unmap all in VM" op happens, it asks the GEM > > object to drop all mappings for that underlying VM ID. Similarly, when a > > UAPI VM object is dropped (in the Drop impl, so both explicitly and when > > the whole File/xarray is dropped and such), that does an explicit unmap > > of a special dummy object it owns which would otherwise leak since it is > > not tracked as a GEM object owned by that File and therefore not handled > > by GEM closing. And again along the same lines, the allocators in > > alloc.rs explicitly destroy the mappings for their backing GEM objects > > on Drop. All this is due to that annoying circular reference between VMs > > and GEM objects that I'm not sure how to fix. > > > > Note that if I *don't* do this (or forget to do it somewhere) the > > consequence is just that we leak memory, and if you try to destroy the > > wrong IDs somehow the worst that can happen is you unmap things you > > shouldn't and fault the GPU (or, in the kernel or kernel-managed user VM > > cases, potentially the firmware). Rust safety guarantees still keep > > things from going entirely off the rails within the kernel, since > > everything that matters is reference counted (which is why these > > reference cycles are possible at all). > > > > This all started when I was looking at the panfrost driver for > > reference. It does the same thing except it uses actual pointers to the > > owning entities instead of IDs, and pointer comparison (see > > panfrost_gem_close). Of course you could try do that in Rust too > > (literally storing and comparing raw pointers that aren't owned > > references), but then you're introducing a Pin<> requirement on those > > objects to make their addresses stable and it feels way more icky and > > error-prone than unique IDs (since addresses can be reused). panfrost > > only has a single mmu (what I call the raw VM) per File while I have an > > arbitrary number, which is why I end up with the extra > > distinction/complexity of both File and VM IDs, but the concept is the same. > > > > Some of this is going to be refactored when I implement arbitrary VM > > range mapping/unmapping, which would be a good time to improve this... > > but is there something particularly wrong/broken about the way I'm doing > > it now that I missed? I figured unique u64 IDs would be a pretty safe > > way to identify entities and cleanup the mappings when needed. > > > > ~~ Lina > > > > ~~ Lina >
On Thu, Apr 06, 2023 at 01:44:22PM +0900, Asahi Lina wrote: > On 05/04/2023 23.37, Daniel Vetter wrote: > > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: > > > +/// A generic monotonically incrementing ID used to uniquely identify object instances within the > > > +/// driver. > > > +pub(crate) struct ID(AtomicU64); > > > + > > > +impl ID { > > > + /// Create a new ID counter with a given value. > > > + fn new(val: u64) -> ID { > > > + ID(AtomicU64::new(val)) > > > + } > > > + > > > + /// Fetch the next unique ID. > > > + pub(crate) fn next(&self) -> u64 { > > > + self.0.fetch_add(1, Ordering::Relaxed) > > > + } > > > +} > > > > Continuing the theme of me commenting on individual things, I stumbled > > over this because I noticed that there's a lot of id based lookups where I > > don't expect them, and started chasing. > > > > - For ids use xarray, not atomic counters. Yes I know dma_fence timelines > > gets this wrong, this goes back to an innocent time where we didn't > > allocate more than one timeline per engine, and no one fixed it since > > then. Yes u64 should be big enough for everyone :-/ > > > > - Attaching ID spaces to drm_device is also not great. drm is full of > > these mistakes. Much better if their per drm_file and so private to each > > client. > > > > - They shouldn't be used for anything else than uapi id -> kernel object > > lookup at the beginning of ioctl code, and nowhere else. At least from > > skimming it seems like these are used all over the driver codebase, > > which does freak me out. At least on the C side that's a clear indicator > > for a refcount/lockin/data structure model that's not thought out at > > all. > > > > What's going on here, what do I miss? > > These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use > xarray and are per-File). Most of them are just for debugging, so that when > I enable full debug spam I have some way to correlate different things that > are happening together (this subset of interleaved log lines relate to the > same submission). Basically just object names that are easier to read (and > less of a security leak) than pointers and guaranteed not to repeat. You > could get rid of most of them and it wouldn't affect the driver design, it > just makes it very hard to see what's going on with debug logs ^^; > > There are only two that are ever used for non-debugging purposes: the VM ID, > and the File ID. Both are per-device global IDs attached to the VMs (not the > UAPI VM objects, but rather the underlyng MMU address space managers they > represent, including the kernel-internal ones) and to Files themselves. They > are used for destroying GEM objects: since the objects are also > device-global across multiple clients, I need a way to do things like "clean > up all mappings for this File" or "clean up all mappings for this VM". > There's an annoying circular reference between GEM objects and their > mappings, which is why this is explicitly coded out in destroy paths instead > of naturally happening via Drop semantics (without that cleanup code, the > circular reference leaks it). > > So e.g. when a File does a GEM close or explicitly asks for all mappings of > an object to be removed, it goes out to the (possibly shared) GEM object and > tells it to drop all mappings marked as owned by that unique File ID. When > an explicit "unmap all in VM" op happens, it asks the GEM object to drop all > mappings for that underlying VM ID. Similarly, when a UAPI VM object is > dropped (in the Drop impl, so both explicitly and when the whole File/xarray > is dropped and such), that does an explicit unmap of a special dummy object > it owns which would otherwise leak since it is not tracked as a GEM object > owned by that File and therefore not handled by GEM closing. And again along > the same lines, the allocators in alloc.rs explicitly destroy the mappings > for their backing GEM objects on Drop. All this is due to that annoying > circular reference between VMs and GEM objects that I'm not sure how to fix. > > Note that if I *don't* do this (or forget to do it somewhere) the > consequence is just that we leak memory, and if you try to destroy the wrong > IDs somehow the worst that can happen is you unmap things you shouldn't and > fault the GPU (or, in the kernel or kernel-managed user VM cases, > potentially the firmware). Rust safety guarantees still keep things from > going entirely off the rails within the kernel, since everything that > matters is reference counted (which is why these reference cycles are > possible at all). > > This all started when I was looking at the panfrost driver for reference. It > does the same thing except it uses actual pointers to the owning entities > instead of IDs, and pointer comparison (see panfrost_gem_close). Of course > you could try do that in Rust too (literally storing and comparing raw > pointers that aren't owned references), but then you're introducing a Pin<> > requirement on those objects to make their addresses stable and it feels way > more icky and error-prone than unique IDs (since addresses can be reused). > panfrost only has a single mmu (what I call the raw VM) per File while I > have an arbitrary number, which is why I end up with the extra > distinction/complexity of both File and VM IDs, but the concept is the same. > > Some of this is going to be refactored when I implement arbitrary VM range > mapping/unmapping, which would be a good time to improve this... but is > there something particularly wrong/broken about the way I'm doing it now > that I missed? I figured unique u64 IDs would be a pretty safe way to > identify entities and cleanup the mappings when needed. Ok, some attempt at going through the vm_id/file_id stuff. Extremely high-level purely informed by having read too many drivers: First on the drm_file/struct file/file_id. This is the uapi interface object, and it's refcounted in the vfs, but that's entirely the vfs' business and none of the driver (or even subsystem). Once userspace has done the final close() the file is gone, there's no way to ever get anything meaningfully out of it because userspace dropped it. So if the driver has any kind of backpointer to that's a design bug, because in all the place you might want to care (ioctl, fdinfo for schedu stats, any other file_operations callback) the vfs ensures it stays alive during the callback and you essentially have a borrowed reference. I've seen a lot of drivers try to make clever backpointings to stuff that's essentially tied to the drm_file, and I've not found a single case that made sense. iow, file_id as a lookup thingie needs to go. In principle it's the same argument I've made already for the syncobj rust wrappers. For specific uses I guess I need some rust reading help, but from your description it sounds like the vm_id is much more the core piece. So for that we have the gpu ctx -> vm -> gem_bos chain of reference. Now on the C side if you have a modern driver that uses the vm_bind/unbind/gpuva manager approach, the reference counts go in that single direction only, anything else is essentially borrowed references under protection of a mutex/lock or similar thing (for e.g. going from the bo to the vm for eviction). In addition to the above chain the xarray in the drm_file also holds references to each of these. So far so good, in the drm_file ->postclose callback you just walk the xarrays and drop all the references, and everything gets cleaned up, at least in the C world. Aside: I'm ignoring the entire sched/job/gpu-ctx side because that's a separate can of worms and big other threads floating around already. But if either due to the uabi being a bit more legacy, or Rust requiring that the backpointers are reference-counted from the gem_bo->vma->vm and can't follow borrow semantics (afaiui the usual linux list_head pattern of walking the list under a lock giving you a borrowed reference for each element doesn't work too well in rust?) then that's not a problem, you can still all clean it out: - The key bit is that your vm struct needs both a refcount like kref and a separate open count. Each gpu ctx and the xarray for vm objects in drm_file hold _both_ the kref and the open refcount (in rust the open refcount implies the Arc or things go sideways). - the other key bit is that drm_file ->postclose does _not_ have simple Drop semantics, it's more explicit. - in the drm_file lastclose you first walk all the gpu ctx. The simplest semantics is that close() synchronously tears down all leftover gpu ctx, i.e. you unload them from the gpu. Details are under a lot of discussion in the various scheduler threads, but essentially this should ensure that the gpu ctx destruction completely removes all references to the ctx. If instead you have the legacy problem of apps expecting that rendering continues even if they called exit() before it finishes, then it gets more messy. I have no idea whether that's still a problem for new drivers or can be avoided. - Next up you do the same thing for the vm xarray (which drops both the kref an open refcounts). - At this point there might still be a ton of vm objects around with elevated kref. Except not, because at this point the open refcount of each vm should have dropped to zero. When that happens the vm object itself is still alive, plus even better for rust, you are in the vm_close(vm) function call so you have a full borrowed reference to that. Which means you can walk the entire address space and unmap everything explicit. Which should get rid of any gem_bo->vma->vm backpointers you have lying around. - At that point all your vm objects are gone too, because the kref managed backpointers are gone. - You walk the xarray of gem_bo (well the drm subsystem does that for you), which cleans out the reamining references to gem_bo. Only the gem_bo which are shared with other process or have a dma_buf will survive, like they should. No leak, no funky driver-internal vm_id based lookup, and with rust we should even be able to guarantee you never mix up Arc<Vm> with OpenRef<Vm> (or however that exactly works in rust types, I have not much real clue). If you have any other functional needs for vm_id then I guess I need to go through them, but they should be all fixable. -Daniel
On 06/04/2023 20.55, Daniel Vetter wrote: > On Thu, Apr 06, 2023 at 01:44:22PM +0900, Asahi Lina wrote: >> On 05/04/2023 23.37, Daniel Vetter wrote: >>> On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: >>>> +/// A generic monotonically incrementing ID used to uniquely identify object instances within the >>>> +/// driver. >>>> +pub(crate) struct ID(AtomicU64); >>>> + >>>> +impl ID { >>>> + /// Create a new ID counter with a given value. >>>> + fn new(val: u64) -> ID { >>>> + ID(AtomicU64::new(val)) >>>> + } >>>> + >>>> + /// Fetch the next unique ID. >>>> + pub(crate) fn next(&self) -> u64 { >>>> + self.0.fetch_add(1, Ordering::Relaxed) >>>> + } >>>> +} >>> >>> Continuing the theme of me commenting on individual things, I stumbled >>> over this because I noticed that there's a lot of id based lookups where I >>> don't expect them, and started chasing. >>> >>> - For ids use xarray, not atomic counters. Yes I know dma_fence timelines >>> gets this wrong, this goes back to an innocent time where we didn't >>> allocate more than one timeline per engine, and no one fixed it since >>> then. Yes u64 should be big enough for everyone :-/ >>> >>> - Attaching ID spaces to drm_device is also not great. drm is full of >>> these mistakes. Much better if their per drm_file and so private to each >>> client. >>> >>> - They shouldn't be used for anything else than uapi id -> kernel object >>> lookup at the beginning of ioctl code, and nowhere else. At least from >>> skimming it seems like these are used all over the driver codebase, >>> which does freak me out. At least on the C side that's a clear indicator >>> for a refcount/lockin/data structure model that's not thought out at >>> all. >>> >>> What's going on here, what do I miss? >> >> These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use >> xarray and are per-File). Most of them are just for debugging, so that when >> I enable full debug spam I have some way to correlate different things that >> are happening together (this subset of interleaved log lines relate to the >> same submission). Basically just object names that are easier to read (and >> less of a security leak) than pointers and guaranteed not to repeat. You >> could get rid of most of them and it wouldn't affect the driver design, it >> just makes it very hard to see what's going on with debug logs ^^; >> >> There are only two that are ever used for non-debugging purposes: the VM ID, >> and the File ID. Both are per-device global IDs attached to the VMs (not the >> UAPI VM objects, but rather the underlyng MMU address space managers they >> represent, including the kernel-internal ones) and to Files themselves. They >> are used for destroying GEM objects: since the objects are also >> device-global across multiple clients, I need a way to do things like "clean >> up all mappings for this File" or "clean up all mappings for this VM". >> There's an annoying circular reference between GEM objects and their >> mappings, which is why this is explicitly coded out in destroy paths instead >> of naturally happening via Drop semantics (without that cleanup code, the >> circular reference leaks it). >> >> So e.g. when a File does a GEM close or explicitly asks for all mappings of >> an object to be removed, it goes out to the (possibly shared) GEM object and >> tells it to drop all mappings marked as owned by that unique File ID. When >> an explicit "unmap all in VM" op happens, it asks the GEM object to drop all >> mappings for that underlying VM ID. Similarly, when a UAPI VM object is >> dropped (in the Drop impl, so both explicitly and when the whole File/xarray >> is dropped and such), that does an explicit unmap of a special dummy object >> it owns which would otherwise leak since it is not tracked as a GEM object >> owned by that File and therefore not handled by GEM closing. And again along >> the same lines, the allocators in alloc.rs explicitly destroy the mappings >> for their backing GEM objects on Drop. All this is due to that annoying >> circular reference between VMs and GEM objects that I'm not sure how to fix. >> >> Note that if I *don't* do this (or forget to do it somewhere) the >> consequence is just that we leak memory, and if you try to destroy the wrong >> IDs somehow the worst that can happen is you unmap things you shouldn't and >> fault the GPU (or, in the kernel or kernel-managed user VM cases, >> potentially the firmware). Rust safety guarantees still keep things from >> going entirely off the rails within the kernel, since everything that >> matters is reference counted (which is why these reference cycles are >> possible at all). >> >> This all started when I was looking at the panfrost driver for reference. It >> does the same thing except it uses actual pointers to the owning entities >> instead of IDs, and pointer comparison (see panfrost_gem_close). Of course >> you could try do that in Rust too (literally storing and comparing raw >> pointers that aren't owned references), but then you're introducing a Pin<> >> requirement on those objects to make their addresses stable and it feels way >> more icky and error-prone than unique IDs (since addresses can be reused). >> panfrost only has a single mmu (what I call the raw VM) per File while I >> have an arbitrary number, which is why I end up with the extra >> distinction/complexity of both File and VM IDs, but the concept is the same. >> >> Some of this is going to be refactored when I implement arbitrary VM range >> mapping/unmapping, which would be a good time to improve this... but is >> there something particularly wrong/broken about the way I'm doing it now >> that I missed? I figured unique u64 IDs would be a pretty safe way to >> identify entities and cleanup the mappings when needed. > > Ok, some attempt at going through the vm_id/file_id stuff. Extremely > high-level purely informed by having read too many drivers: > > First on the drm_file/struct file/file_id. This is the uapi interface > object, and it's refcounted in the vfs, but that's entirely the vfs' > business and none of the driver (or even subsystem). Once userspace has > done the final close() the file is gone, there's no way to ever get > anything meaningfully out of it because userspace dropped it. So if the > driver has any kind of backpointer to that's a design bug, because in all > the place you might want to care (ioctl, fdinfo for schedu stats, any > other file_operations callback) the vfs ensures it stays alive during the > callback and you essentially have a borrowed reference. Right, there's none of that for the File, and it is not refcounted itself. Certainly there are no direct references, and as for the IDs: the IDs of relevant Files live in GEM objects that hold mappings owned by that file. As part of File close all the GEM objects get closed, which removes those mappings. So by the time the File goes away there should be no references to its ID anywhere (other than if I stashed some away for debugging, I forget whether I did in some child object). If this process breaks for some reason (say, stray mappings remain indexed to a File ID that is gone), that means we leak the mappings, which leaks the GEM objects themselves and the VM they are mapped to. Not great but not fireworks either. As long as the DRM core properly calls the GEM close callback on everything before calling the File close callback though, that shouldn't happen. > I've seen a lot of drivers try to make clever backpointings to stuff > that's essentially tied to the drm_file, and I've not found a single case > that made sense. iow, file_id as a lookup thingie needs to go. In > principle it's the same argument I've made already for the syncobj rust > wrappers. For specific uses I guess I need some rust reading help, but > from your description it sounds like the vm_id is much more the core > piece. The file ID is simply how GEM mappings are identified as belonging to an active file within the mapping list of an object. GEM object close is literally the only place this ID is ever used for anything other than passing around: /// Callback to drop all mappings for a GEM object owned by a given `File` fn close(obj: &Object, file: &DrmFile) { mod_pr_debug!("DriverObject::close vm_id={:?} id={}\n", obj.vm_id, obj.id); obj.drop_file_mappings(file.inner().file_id()); } I could also just iterate through the VM XArray for the File and drop mappings one VM at a time instead of doing all of them in one go, it's just slightly more cumbersome (though potentially less code because I could get rid of all the forwarding the file_id I do now). On the other hand, once we implement arbitrary VM maps, I suspect this is going to go away anyway with the new design, so I'm not really very inclined to fix it until that happens... ^^ > So for that we have the gpu ctx -> vm -> gem_bos chain of reference. Now > on the C side if you have a modern driver that uses the > vm_bind/unbind/gpuva manager approach, the reference counts go in that > single direction only, anything else is essentially borrowed references > under protection of a mutex/lock or similar thing (for e.g. going from the > bo to the vm for eviction). Right, so that is what is going to change with the pending refactor. What I have right now is a design that used to be the old driver-managed VM design (and still retains part of that for kernel-managed objects) for the old synchronous demo UAPI, that I then shoehorned into the redesigned vm_bind UAPI by just not supporting the interesting cases (partial maps/unmaps/remaps, etc.). This is all temporary, it's just to get us by for now since OpenGL doesn't need it and there is no usable Vulkan driver that cares yet... I wanted to focus on the explicit sync and general sched/queuing part of the new UAPI before I got to the VM bind stuff, since I figured that would be more interesting (and pulls in all the new abstractions, plus major perf benefit). So the UAPI itself has vm_bind but only the "easy" subset of cases are supported by the driver (whole object maps/unmaps) and the refcounting is still backwards. As I said this originally came from the Panfrost design that doesn't have vm_bind but instead keeps a list of mappings with pointer equality checks in BOs... so that's why ^^ Thanks for explaining the design approach though, it's roughly what I had in mind but it's good to hear I'm on the right track! I'd love to go into more detail about how to implement vm_bind if you have time though (maybe a meeting?). In particular things like using the mm allocator to keep track of mapping ranges and supporting splitting and all that. > In addition to the above chain the xarray in the drm_file also holds > references to each of these. So far so good, in the drm_file ->postclose > callback you just walk the xarrays and drop all the references, and > everything gets cleaned up, at least in the C world. In the Rust world you just do nothing since the XArray abstraction knows how to drop all of its contained objects! > But if either due to the uabi being a bit more legacy, or Rust requiring > that the backpointers are reference-counted from the gem_bo->vma->vm and > can't follow borrow semantics (afaiui the usual linux list_head pattern of > walking the list under a lock giving you a borrowed reference for each > element doesn't work too well in rust?) then that's not a problem, you can > still all clean it out: > > - The key bit is that your vm struct needs both a refcount like kref and > a separate open count. Each gpu ctx and the xarray for vm objects in > drm_file hold _both_ the kref and the open refcount (in rust the open > refcount implies the Arc or things go sideways). > > - the other key bit is that drm_file ->postclose does _not_ have simple > Drop semantics, it's more explicit. > > - in the drm_file lastclose you first walk all the gpu ctx. The simplest > semantics is that close() synchronously tears down all leftover gpu ctx, > i.e. you unload them from the gpu. Details are under a lot of discussion > in the various scheduler threads, but essentially this should ensure > that the gpu ctx destruction completely removes all references to the > ctx. If instead you have the legacy problem of apps expecting that > rendering continues even if they called exit() before it finishes, then > it gets more messy. I have no idea whether that's still a problem for > new drivers or can be avoided. > > - Next up you do the same thing for the vm xarray (which drops both the > kref an open refcounts). > > - At this point there might still be a ton of vm objects around with > elevated kref. Except not, because at this point the open refcount of > each vm should have dropped to zero. When that happens the vm object > itself is still alive, plus even better for rust, you are in the > vm_close(vm) function call so you have a full borrowed reference to > that. Which means you can walk the entire address space and unmap > everything explicit. Which should get rid of any gem_bo->vma->vm > backpointers you have lying around. > > - At that point all your vm objects are gone too, because the kref managed > backpointers are gone. > > - You walk the xarray of gem_bo (well the drm subsystem does that for > you), which cleans out the reamining references to gem_bo. Only the > gem_bo which are shared with other process or have a dma_buf will > survive, like they should. > > No leak, no funky driver-internal vm_id based lookup, and with rust we > should even be able to guarantee you never mix up Arc<Vm> with OpenRef<Vm> > (or however that exactly works in rust types, I have not much real clue). That would totally work, and actually I already use somewhat analogous mechanisms in other places like firmware queues! If this all weren't getting turned on its head for the new VM management I'd implement it, but hopefully we can agree there's not much point right now... I'd rather focus on the DRM abstraction design and work on improving the driver in parallel right now, and then about one kernel cycle or so from now it should definitely be in a better place for review. Honestly, there are bigger design problems with the driver right now than these IDs (that I already know about)... so I want to focus more on the abstractions and their usage right now than the internal driver design which I *know* has problems ^^ Rust is really good at getting you to come up with a *safe* design as far as memory and ownership, but that doesn't mean it's perfectly clean code and more importantly it does nothing for deadlocks and allocating in the wrong paths and getting resource allocation semantics right etc etc. The GPU FW queue stuff is at the very least due for another major refactor/cleanup to defer resource allocation and actual queuing to job prepare/run time (right now there's some horrible hacks to do it upfront at submit because I don't have a mechanism to back-patch job structures with those resource IDs later at exec time, but I want to add that), and along the way I can also fix the using job fences to block on pending job count thing that Christian really wants me to do instead of the can_run_job thing, and then getting all this resource stuff truly right is also going to mean eventually using fences to handle blocking on resource exhaustion too (though maybe I can get away with implementing that a bit later)... The driver works stupidly well for how quickly I wrote it, but it still has all these rough edges that definitely need fixing before it's something I could say I'm happy with... I'm sure if you start hammering it with evil workloads you will hit some of its current problems (like I did yesterday with the deadlocks on GpuContext inval). I also need to learn more about the subtleties of fence signaling and all that, especially once a shrinker comes into play... ~~ Lina
On 06/04/2023 20.25, Daniel Vetter wrote: > On Thu, Apr 06, 2023 at 02:02:55PM +0900, Asahi Lina wrote: >> On 05/04/2023 23.44, Daniel Vetter wrote: >>> On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: >>>> +/// Look up a GEM object handle for a `File` and return an `ObjectRef` for it. >>>> +pub(crate) fn lookup_handle(file: &DrmFile, handle: u32) -> Result<ObjectRef> { >>>> + Ok(ObjectRef::new(shmem::Object::lookup_handle(file, handle)?)) >>>> +} >>> >>> So maybe my expectations for rust typing is a bit too much, but I kinda >>> expected this to be fully generic: >>> >>> - trait Driver (drm_driver) knows the driver's object type >>> - a generic create_handle function could ensure that for drm_file (which >>> is always for a specific drm_device and hence Driver) can ensure at the >>> type level that you only put the right objects into the drm_file >>> - a generic lookup_handle function on the drm_file knows the Driver trait >>> and so can give you back the right type right away. >>> >>> Why the wrapping, what do I miss? >> >> Sigh, so this is one of the many ways I'm trying to work around the "Rust >> doesn't do subclasses" problem (so we can figure out what the best one is >> ^^). >> >> The generic shmem::Object::lookup_handle() call *is* fully generic and will >> get you back a driver-specific object. But since Rust doesn't do >> subclassing, what you get back isn't a driver-specific type T, but rather a >> (reference to a) shmem::Object<T>. T represents the inner driver-specific >> data/functionality (only), and the outer shmem::Object<T> includes the >> actual drm_gem_shmem_object plus a T. This is backwards from C, where you >> expect the opposite situation where T contains a shmem object, but that just >> doesn't work with Rust because there's no way to build a safe API around >> that model as far as I know. > > Ah I think I just got confused. I did untangle (I think at least) the > Object<T> trick, I guess the only thing that confused me here is why this > is in the shmem module? Or is that the rust problem again? > > I'd kinda have expected that we'd have a gem::Object<T> here that the > lookup_handle function returns. So for the shmem case I guess that would > then be gem::Object<shmem::Object<T>> for the driver type T with driver > specific stuff? I guess not very pretty ... Ahh, uh... Yeah, so shmem objects are allocated their own way (the shmem core expects to kfree them in drm_gem_shmem_free) and bindings::drm_gem_shmem_object already contains a bindings::drm_gem_object. Since the composition is already done in the C side, we can't just do it again in Rust cleanly. That's why I have this weird setup with both a common trait for common GEM functionality and separate actual types that both implement it. Honestly the whole GEM codepath is untested other than the bits inherited by shmem. I'm not sure I'll be able to verify that this all makes sense until another Rust driver comes along that needs something other than shmem. I just felt I had to do *something* for GEM since the hierarchy is there and I needed shmem... This whole gem stuff is IMO the messiest part of the abstractions though, so I'm happy to turn it on its head if it makes it better and someone has an idea of how to do that ^^ >> Now the problem is from the higher layers I want object operations that >> interact with the shmem::Object<T> (that is, they call generic GEM functions >> on the object). Options so far: >> >> 1. Add an outer wrapper and put that functionality there. >> 2. Just have the functions on T as helpers, so you need to call T::foo(obj) >> instead of obj.foo(). >> 3. Use the undocumented method receiver trait thing to make shmem::Object<T> >> a valid `self` type, plus add auto-Deref to shmem::Object. Then obj.foo() >> works. >> >> #1 is what I use here. #2 is how the driver-specific File ioctl callbacks >> are implemented, and also sched::Job<T>. #3 is used for fence callbacks >> (FenceObject<T>). None of them are great, and I'd love to hear what people >> think of the various options... >> >> There are other unexplored options, like in this GEM case it could be >> covered with a driver-internal auxiliary trait impl'd on shmem::Object<T> >> buuut that doesn't work when you actually need callbacks on T itself to >> circle back to shmem::Object<T>, as is the case with File/Job/FenceObject. > > Ok I think I'm completely lost here. But I also havent' looked at how this > is all really used in the driver, it's really just the shmem:: module in > the lookup_handle function which looked strange to me. Ah, sorry, I misunderstood what you were talking about in my previous email then. That's just a default trait function. It comes from common functionality in the gem module, but shmem::Object implements the trait so it ends up offering it too (lookup_handle() is not duplicated, it only lives in gem, shmem only has to implement going to/from the drm_gem_object pointer so the rest of the methods can use it). That's part of why the type/trait hierarchy is kind of messy here, it's so I can share functionality between both types even though they are pre-composed on the C side. In the end the object types are specialized for any given driver, so you're always getting your own unique kind of object anyway. It's just that drivers based on shmem will go through it to reach the common code and work with a shmem::Object<T>, and drivers using raw gem will use gem::Object<T> instead. ~~ Lina
On Thu, Apr 06, 2023 at 10:15:56PM +0900, Asahi Lina wrote: > On 06/04/2023 20.55, Daniel Vetter wrote: > > On Thu, Apr 06, 2023 at 01:44:22PM +0900, Asahi Lina wrote: > > > On 05/04/2023 23.37, Daniel Vetter wrote: > > > > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: > > > > > +/// A generic monotonically incrementing ID used to uniquely identify object instances within the > > > > > +/// driver. > > > > > +pub(crate) struct ID(AtomicU64); > > > > > + > > > > > +impl ID { > > > > > + /// Create a new ID counter with a given value. > > > > > + fn new(val: u64) -> ID { > > > > > + ID(AtomicU64::new(val)) > > > > > + } > > > > > + > > > > > + /// Fetch the next unique ID. > > > > > + pub(crate) fn next(&self) -> u64 { > > > > > + self.0.fetch_add(1, Ordering::Relaxed) > > > > > + } > > > > > +} > > > > > > > > Continuing the theme of me commenting on individual things, I stumbled > > > > over this because I noticed that there's a lot of id based lookups where I > > > > don't expect them, and started chasing. > > > > > > > > - For ids use xarray, not atomic counters. Yes I know dma_fence timelines > > > > gets this wrong, this goes back to an innocent time where we didn't > > > > allocate more than one timeline per engine, and no one fixed it since > > > > then. Yes u64 should be big enough for everyone :-/ > > > > > > > > - Attaching ID spaces to drm_device is also not great. drm is full of > > > > these mistakes. Much better if their per drm_file and so private to each > > > > client. > > > > > > > > - They shouldn't be used for anything else than uapi id -> kernel object > > > > lookup at the beginning of ioctl code, and nowhere else. At least from > > > > skimming it seems like these are used all over the driver codebase, > > > > which does freak me out. At least on the C side that's a clear indicator > > > > for a refcount/lockin/data structure model that's not thought out at > > > > all. > > > > > > > > What's going on here, what do I miss? > > > > > > These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use > > > xarray and are per-File). Most of them are just for debugging, so that when > > > I enable full debug spam I have some way to correlate different things that > > > are happening together (this subset of interleaved log lines relate to the > > > same submission). Basically just object names that are easier to read (and > > > less of a security leak) than pointers and guaranteed not to repeat. You > > > could get rid of most of them and it wouldn't affect the driver design, it > > > just makes it very hard to see what's going on with debug logs ^^; > > > > > > There are only two that are ever used for non-debugging purposes: the VM ID, > > > and the File ID. Both are per-device global IDs attached to the VMs (not the > > > UAPI VM objects, but rather the underlyng MMU address space managers they > > > represent, including the kernel-internal ones) and to Files themselves. They > > > are used for destroying GEM objects: since the objects are also > > > device-global across multiple clients, I need a way to do things like "clean > > > up all mappings for this File" or "clean up all mappings for this VM". > > > There's an annoying circular reference between GEM objects and their > > > mappings, which is why this is explicitly coded out in destroy paths instead > > > of naturally happening via Drop semantics (without that cleanup code, the > > > circular reference leaks it). > > > > > > So e.g. when a File does a GEM close or explicitly asks for all mappings of > > > an object to be removed, it goes out to the (possibly shared) GEM object and > > > tells it to drop all mappings marked as owned by that unique File ID. When > > > an explicit "unmap all in VM" op happens, it asks the GEM object to drop all > > > mappings for that underlying VM ID. Similarly, when a UAPI VM object is > > > dropped (in the Drop impl, so both explicitly and when the whole File/xarray > > > is dropped and such), that does an explicit unmap of a special dummy object > > > it owns which would otherwise leak since it is not tracked as a GEM object > > > owned by that File and therefore not handled by GEM closing. And again along > > > the same lines, the allocators in alloc.rs explicitly destroy the mappings > > > for their backing GEM objects on Drop. All this is due to that annoying > > > circular reference between VMs and GEM objects that I'm not sure how to fix. > > > > > > Note that if I *don't* do this (or forget to do it somewhere) the > > > consequence is just that we leak memory, and if you try to destroy the wrong > > > IDs somehow the worst that can happen is you unmap things you shouldn't and > > > fault the GPU (or, in the kernel or kernel-managed user VM cases, > > > potentially the firmware). Rust safety guarantees still keep things from > > > going entirely off the rails within the kernel, since everything that > > > matters is reference counted (which is why these reference cycles are > > > possible at all). > > > > > > This all started when I was looking at the panfrost driver for reference. It > > > does the same thing except it uses actual pointers to the owning entities > > > instead of IDs, and pointer comparison (see panfrost_gem_close). Of course > > > you could try do that in Rust too (literally storing and comparing raw > > > pointers that aren't owned references), but then you're introducing a Pin<> > > > requirement on those objects to make their addresses stable and it feels way > > > more icky and error-prone than unique IDs (since addresses can be reused). > > > panfrost only has a single mmu (what I call the raw VM) per File while I > > > have an arbitrary number, which is why I end up with the extra > > > distinction/complexity of both File and VM IDs, but the concept is the same. > > > > > > Some of this is going to be refactored when I implement arbitrary VM range > > > mapping/unmapping, which would be a good time to improve this... but is > > > there something particularly wrong/broken about the way I'm doing it now > > > that I missed? I figured unique u64 IDs would be a pretty safe way to > > > identify entities and cleanup the mappings when needed. > > > > Ok, some attempt at going through the vm_id/file_id stuff. Extremely > > high-level purely informed by having read too many drivers: > > > > First on the drm_file/struct file/file_id. This is the uapi interface > > object, and it's refcounted in the vfs, but that's entirely the vfs' > > business and none of the driver (or even subsystem). Once userspace has > > done the final close() the file is gone, there's no way to ever get > > anything meaningfully out of it because userspace dropped it. So if the > > driver has any kind of backpointer to that's a design bug, because in all > > the place you might want to care (ioctl, fdinfo for schedu stats, any > > other file_operations callback) the vfs ensures it stays alive during the > > callback and you essentially have a borrowed reference. > > Right, there's none of that for the File, and it is not refcounted itself. > Certainly there are no direct references, and as for the IDs: the IDs of > relevant Files live in GEM objects that hold mappings owned by that file. As > part of File close all the GEM objects get closed, which removes those > mappings. So by the time the File goes away there should be no references to > its ID anywhere (other than if I stashed some away for debugging, I forget > whether I did in some child object). > > If this process breaks for some reason (say, stray mappings remain indexed > to a File ID that is gone), that means we leak the mappings, which leaks the > GEM objects themselves and the VM they are mapped to. Not great but not > fireworks either. As long as the DRM core properly calls the GEM close > callback on everything before calling the File close callback though, that > shouldn't happen. > > > I've seen a lot of drivers try to make clever backpointings to stuff > > that's essentially tied to the drm_file, and I've not found a single case > > that made sense. iow, file_id as a lookup thingie needs to go. In > > principle it's the same argument I've made already for the syncobj rust > > wrappers. For specific uses I guess I need some rust reading help, but > > from your description it sounds like the vm_id is much more the core > > piece. > > The file ID is simply how GEM mappings are identified as belonging to an > active file within the mapping list of an object. GEM object close is > literally the only place this ID is ever used for anything other than > passing around: > > /// Callback to drop all mappings for a GEM object owned by a given `File` > fn close(obj: &Object, file: &DrmFile) { > mod_pr_debug!("DriverObject::close vm_id={:?} id={}\n", obj.vm_id, > obj.id); > obj.drop_file_mappings(file.inner().file_id()); > } > > I could also just iterate through the VM XArray for the File and drop > mappings one VM at a time instead of doing all of them in one go, it's just > slightly more cumbersome (though potentially less code because I could get > rid of all the forwarding the file_id I do now). > > On the other hand, once we implement arbitrary VM maps, I suspect this is > going to go away anyway with the new design, so I'm not really very inclined > to fix it until that happens... ^^ Yeah the driver-managed vm needs a bunch more reference loops and gets awkward fast. the gpuva library might need to keep support for that, but I really hope it's not needed. > > So for that we have the gpu ctx -> vm -> gem_bos chain of reference. Now > > on the C side if you have a modern driver that uses the > > vm_bind/unbind/gpuva manager approach, the reference counts go in that > > single direction only, anything else is essentially borrowed references > > under protection of a mutex/lock or similar thing (for e.g. going from the > > bo to the vm for eviction). > > Right, so that is what is going to change with the pending refactor. What I > have right now is a design that used to be the old driver-managed VM design > (and still retains part of that for kernel-managed objects) for the old > synchronous demo UAPI, that I then shoehorned into the redesigned vm_bind > UAPI by just not supporting the interesting cases (partial > maps/unmaps/remaps, etc.). This is all temporary, it's just to get us by for > now since OpenGL doesn't need it and there is no usable Vulkan driver that > cares yet... I wanted to focus on the explicit sync and general > sched/queuing part of the new UAPI before I got to the VM bind stuff, since > I figured that would be more interesting (and pulls in all the new > abstractions, plus major perf benefit). So the UAPI itself has vm_bind but > only the "easy" subset of cases are supported by the driver (whole object > maps/unmaps) and the refcounting is still backwards. > > As I said this originally came from the Panfrost design that doesn't have > vm_bind but instead keeps a list of mappings with pointer equality checks in > BOs... so that's why ^^ > > Thanks for explaining the design approach though, it's roughly what I had in > mind but it's good to hear I'm on the right track! I'd love to go into more > detail about how to implement vm_bind if you have time though (maybe a > meeting?). In particular things like using the mm allocator to keep track of > mapping ranges and supporting splitting and all that. Yeah vm_bind sounds like a good topic to discuss. I don't think we'll get all the pieces aligned to land that before asahi, but the driver internals should at least match wrt semantics with that so that the refactoring isn't total pain. > > In addition to the above chain the xarray in the drm_file also holds > > references to each of these. So far so good, in the drm_file ->postclose > > callback you just walk the xarrays and drop all the references, and > > everything gets cleaned up, at least in the C world. > > In the Rust world you just do nothing since the XArray abstraction knows how > to drop all of its contained objects! Yeah xarray should work with Drop, but I guess you need a special uapi/open-reference object that knows that it needs to perform additional cleanup (like quiescent the gpu ctx or unamp everything for the vm). > > But if either due to the uabi being a bit more legacy, or Rust requiring > > that the backpointers are reference-counted from the gem_bo->vma->vm and > > can't follow borrow semantics (afaiui the usual linux list_head pattern of > > walking the list under a lock giving you a borrowed reference for each > > element doesn't work too well in rust?) then that's not a problem, you can > > still all clean it out: > > > > - The key bit is that your vm struct needs both a refcount like kref and > > a separate open count. Each gpu ctx and the xarray for vm objects in > > drm_file hold _both_ the kref and the open refcount (in rust the open > > refcount implies the Arc or things go sideways). > > > > - the other key bit is that drm_file ->postclose does _not_ have simple > > Drop semantics, it's more explicit. > > > > - in the drm_file lastclose you first walk all the gpu ctx. The simplest > > semantics is that close() synchronously tears down all leftover gpu ctx, > > i.e. you unload them from the gpu. Details are under a lot of discussion > > in the various scheduler threads, but essentially this should ensure > > that the gpu ctx destruction completely removes all references to the > > ctx. If instead you have the legacy problem of apps expecting that > > rendering continues even if they called exit() before it finishes, then > > it gets more messy. I have no idea whether that's still a problem for > > new drivers or can be avoided. > > > > - Next up you do the same thing for the vm xarray (which drops both the > > kref an open refcounts). > > > > - At this point there might still be a ton of vm objects around with > > elevated kref. Except not, because at this point the open refcount of > > each vm should have dropped to zero. When that happens the vm object > > itself is still alive, plus even better for rust, you are in the > > vm_close(vm) function call so you have a full borrowed reference to > > that. Which means you can walk the entire address space and unmap > > everything explicit. Which should get rid of any gem_bo->vma->vm > > backpointers you have lying around. > > > > - At that point all your vm objects are gone too, because the kref managed > > backpointers are gone. > > > > - You walk the xarray of gem_bo (well the drm subsystem does that for > > you), which cleans out the reamining references to gem_bo. Only the > > gem_bo which are shared with other process or have a dma_buf will > > survive, like they should. > > > > No leak, no funky driver-internal vm_id based lookup, and with rust we > > should even be able to guarantee you never mix up Arc<Vm> with OpenRef<Vm> > > (or however that exactly works in rust types, I have not much real clue). > > That would totally work, and actually I already use somewhat analogous > mechanisms in other places like firmware queues! > > If this all weren't getting turned on its head for the new VM management I'd > implement it, but hopefully we can agree there's not much point right now... > I'd rather focus on the DRM abstraction design and work on improving the > driver in parallel right now, and then about one kernel cycle or so from now > it should definitely be in a better place for review. Honestly, there are > bigger design problems with the driver right now than these IDs (that I > already know about)... so I want to focus more on the abstractions and their > usage right now than the internal driver design which I *know* has problems > ^^ Yeah I think the only fundamental issue you have is that (if I get this all right) you're trying to clean up mappings from the gem_bo, not from the vm. The gem_bo (unlike the vm) is freely shareable (at least in general), so tying anything else to the lifetime of a gem_bo in any way is a design flaw. This is similar to dma_fence that can end up absolutely everywhere, and why drm/sched has this decoupling between hw_fence and drm_job fences with wider visibility. i915-gem/i915-scheduler and a lot of the really old drivers all get this wrong, and you end up with either terrible explicit cleanup code that tries to go around looking for all the references that it needs to drop. Or you just leak. All these things need to be sorted out at design time so that they're impossible. > Rust is really good at getting you to come up with a *safe* design as far as > memory and ownership, but that doesn't mean it's perfectly clean code and > more importantly it does nothing for deadlocks and allocating in the wrong > paths and getting resource allocation semantics right etc etc. The GPU FW > queue stuff is at the very least due for another major refactor/cleanup to > defer resource allocation and actual queuing to job prepare/run time (right > now there's some horrible hacks to do it upfront at submit because I don't > have a mechanism to back-patch job structures with those resource IDs later > at exec time, but I want to add that), and along the way I can also fix the > using job fences to block on pending job count thing that Christian really > wants me to do instead of the can_run_job thing, and then getting all this > resource stuff truly right is also going to mean eventually using fences to > handle blocking on resource exhaustion too (though maybe I can get away with > implementing that a bit later)... > > The driver works stupidly well for how quickly I wrote it, but it still has > all these rough edges that definitely need fixing before it's something I > could say I'm happy with... I'm sure if you start hammering it with evil > workloads you will hit some of its current problems (like I did yesterday > with the deadlocks on GpuContext inval). I also need to learn more about the > subtleties of fence signaling and all that, especially once a shrinker comes > into play... Yeah I think rust is impressive at creating working code. The real challenge, and really where I see all the short term value at least, is in clarifying the semantics. Because that'll help us to clarify the semantics on the C side too, which gives immediate benefits for everyone. Not just new drivers in rust. But it's also the part that's really, really hard work. -Daniel
On Thu, Apr 06, 2023 at 10:32:29PM +0900, Asahi Lina wrote: > On 06/04/2023 20.25, Daniel Vetter wrote: > > On Thu, Apr 06, 2023 at 02:02:55PM +0900, Asahi Lina wrote: > > > On 05/04/2023 23.44, Daniel Vetter wrote: > > > > On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: > > > > > +/// Look up a GEM object handle for a `File` and return an `ObjectRef` for it. > > > > > +pub(crate) fn lookup_handle(file: &DrmFile, handle: u32) -> Result<ObjectRef> { > > > > > + Ok(ObjectRef::new(shmem::Object::lookup_handle(file, handle)?)) > > > > > +} > > > > > > > > So maybe my expectations for rust typing is a bit too much, but I kinda > > > > expected this to be fully generic: > > > > > > > > - trait Driver (drm_driver) knows the driver's object type > > > > - a generic create_handle function could ensure that for drm_file (which > > > > is always for a specific drm_device and hence Driver) can ensure at the > > > > type level that you only put the right objects into the drm_file > > > > - a generic lookup_handle function on the drm_file knows the Driver trait > > > > and so can give you back the right type right away. > > > > > > > > Why the wrapping, what do I miss? > > > > > > Sigh, so this is one of the many ways I'm trying to work around the "Rust > > > doesn't do subclasses" problem (so we can figure out what the best one is > > > ^^). > > > > > > The generic shmem::Object::lookup_handle() call *is* fully generic and will > > > get you back a driver-specific object. But since Rust doesn't do > > > subclassing, what you get back isn't a driver-specific type T, but rather a > > > (reference to a) shmem::Object<T>. T represents the inner driver-specific > > > data/functionality (only), and the outer shmem::Object<T> includes the > > > actual drm_gem_shmem_object plus a T. This is backwards from C, where you > > > expect the opposite situation where T contains a shmem object, but that just > > > doesn't work with Rust because there's no way to build a safe API around > > > that model as far as I know. > > > > Ah I think I just got confused. I did untangle (I think at least) the > > Object<T> trick, I guess the only thing that confused me here is why this > > is in the shmem module? Or is that the rust problem again? > > > > I'd kinda have expected that we'd have a gem::Object<T> here that the > > lookup_handle function returns. So for the shmem case I guess that would > > then be gem::Object<shmem::Object<T>> for the driver type T with driver > > specific stuff? I guess not very pretty ... > > Ahh, uh... Yeah, so shmem objects are allocated their own way (the shmem > core expects to kfree them in drm_gem_shmem_free) and > bindings::drm_gem_shmem_object already contains a bindings::drm_gem_object. > Since the composition is already done in the C side, we can't just do it > again in Rust cleanly. That's why I have this weird setup with both a common > trait for common GEM functionality and separate actual types that both > implement it. Hm this is annoying. For a single driver it doesn't matter, but I do expect that once we have more, and especially once we have more libraries wrapped (ttm, gpuva, execbuf submit helpers, ...) then the common glue really becomes the gem_bo for many of these things. Could we have a GemObject trait which covers this? sole function is an unsafe one that gives you the raw C pointer :-) It still means that every gem memory manager library needs to impl that trait, but all the manager-agnostic bits in the wrappers would be generic? trait would then also have the right dependent type to ensure type safety in all this. Maybe something to discuss in the next meeting with the rust folks. > Honestly the whole GEM codepath is untested other than the bits inherited by > shmem. I'm not sure I'll be able to verify that this all makes sense until > another Rust driver comes along that needs something other than shmem. I > just felt I had to do *something* for GEM since the hierarchy is there and I > needed shmem... > > This whole gem stuff is IMO the messiest part of the abstractions though, so > I'm happy to turn it on its head if it makes it better and someone has an > idea of how to do that ^^ Yeah I still haven't worked up enough courage to type up my gem abstraction review :-/ > > > Now the problem is from the higher layers I want object operations that > > > interact with the shmem::Object<T> (that is, they call generic GEM functions > > > on the object). Options so far: > > > > > > 1. Add an outer wrapper and put that functionality there. > > > 2. Just have the functions on T as helpers, so you need to call T::foo(obj) > > > instead of obj.foo(). > > > 3. Use the undocumented method receiver trait thing to make shmem::Object<T> > > > a valid `self` type, plus add auto-Deref to shmem::Object. Then obj.foo() > > > works. > > > > > > #1 is what I use here. #2 is how the driver-specific File ioctl callbacks > > > are implemented, and also sched::Job<T>. #3 is used for fence callbacks > > > (FenceObject<T>). None of them are great, and I'd love to hear what people > > > think of the various options... > > > > > > There are other unexplored options, like in this GEM case it could be > > > covered with a driver-internal auxiliary trait impl'd on shmem::Object<T> > > > buuut that doesn't work when you actually need callbacks on T itself to > > > circle back to shmem::Object<T>, as is the case with File/Job/FenceObject. > > > > Ok I think I'm completely lost here. But I also havent' looked at how this > > is all really used in the driver, it's really just the shmem:: module in > > the lookup_handle function which looked strange to me. > > Ah, sorry, I misunderstood what you were talking about in my previous email > then. That's just a default trait function. It comes from common > functionality in the gem module, but shmem::Object implements the trait so > it ends up offering it too (lookup_handle() is not duplicated, it only lives > in gem, shmem only has to implement going to/from the drm_gem_object pointer > so the rest of the methods can use it). That's part of why the type/trait > hierarchy is kind of messy here, it's so I can share functionality between > both types even though they are pre-composed on the C side. Ok, so it's all already what I expect and I'm just confused with rust syntax. > In the end the object types are specialized for any given driver, so you're > always getting your own unique kind of object anyway. It's just that drivers > based on shmem will go through it to reach the common code and work with a > shmem::Object<T>, and drivers using raw gem will use gem::Object<T> instead. Ok, sounds all good. -Daniel
On 06/04/2023 22.48, Daniel Vetter wrote: > On Thu, Apr 06, 2023 at 10:15:56PM +0900, Asahi Lina wrote: >> On 06/04/2023 20.55, Daniel Vetter wrote: >>> On Thu, Apr 06, 2023 at 01:44:22PM +0900, Asahi Lina wrote: >>>> On 05/04/2023 23.37, Daniel Vetter wrote: >>>>> On Tue, Mar 07, 2023 at 11:25:43PM +0900, Asahi Lina wrote: >>>>>> +/// A generic monotonically incrementing ID used to uniquely identify object instances within the >>>>>> +/// driver. >>>>>> +pub(crate) struct ID(AtomicU64); >>>>>> + >>>>>> +impl ID { >>>>>> + /// Create a new ID counter with a given value. >>>>>> + fn new(val: u64) -> ID { >>>>>> + ID(AtomicU64::new(val)) >>>>>> + } >>>>>> + >>>>>> + /// Fetch the next unique ID. >>>>>> + pub(crate) fn next(&self) -> u64 { >>>>>> + self.0.fetch_add(1, Ordering::Relaxed) >>>>>> + } >>>>>> +} >>>>> >>>>> Continuing the theme of me commenting on individual things, I stumbled >>>>> over this because I noticed that there's a lot of id based lookups where I >>>>> don't expect them, and started chasing. >>>>> >>>>> - For ids use xarray, not atomic counters. Yes I know dma_fence timelines >>>>> gets this wrong, this goes back to an innocent time where we didn't >>>>> allocate more than one timeline per engine, and no one fixed it since >>>>> then. Yes u64 should be big enough for everyone :-/ >>>>> >>>>> - Attaching ID spaces to drm_device is also not great. drm is full of >>>>> these mistakes. Much better if their per drm_file and so private to each >>>>> client. >>>>> >>>>> - They shouldn't be used for anything else than uapi id -> kernel object >>>>> lookup at the beginning of ioctl code, and nowhere else. At least from >>>>> skimming it seems like these are used all over the driver codebase, >>>>> which does freak me out. At least on the C side that's a clear indicator >>>>> for a refcount/lockin/data structure model that's not thought out at >>>>> all. >>>>> >>>>> What's going on here, what do I miss? >>>> >>>> These aren't UAPI IDs, they are driver-internal IDs (the UAPI IDs do use >>>> xarray and are per-File). Most of them are just for debugging, so that when >>>> I enable full debug spam I have some way to correlate different things that >>>> are happening together (this subset of interleaved log lines relate to the >>>> same submission). Basically just object names that are easier to read (and >>>> less of a security leak) than pointers and guaranteed not to repeat. You >>>> could get rid of most of them and it wouldn't affect the driver design, it >>>> just makes it very hard to see what's going on with debug logs ^^; >>>> >>>> There are only two that are ever used for non-debugging purposes: the VM ID, >>>> and the File ID. Both are per-device global IDs attached to the VMs (not the >>>> UAPI VM objects, but rather the underlyng MMU address space managers they >>>> represent, including the kernel-internal ones) and to Files themselves. They >>>> are used for destroying GEM objects: since the objects are also >>>> device-global across multiple clients, I need a way to do things like "clean >>>> up all mappings for this File" or "clean up all mappings for this VM". >>>> There's an annoying circular reference between GEM objects and their >>>> mappings, which is why this is explicitly coded out in destroy paths instead >>>> of naturally happening via Drop semantics (without that cleanup code, the >>>> circular reference leaks it). >>>> >>>> So e.g. when a File does a GEM close or explicitly asks for all mappings of >>>> an object to be removed, it goes out to the (possibly shared) GEM object and >>>> tells it to drop all mappings marked as owned by that unique File ID. When >>>> an explicit "unmap all in VM" op happens, it asks the GEM object to drop all >>>> mappings for that underlying VM ID. Similarly, when a UAPI VM object is >>>> dropped (in the Drop impl, so both explicitly and when the whole File/xarray >>>> is dropped and such), that does an explicit unmap of a special dummy object >>>> it owns which would otherwise leak since it is not tracked as a GEM object >>>> owned by that File and therefore not handled by GEM closing. And again along >>>> the same lines, the allocators in alloc.rs explicitly destroy the mappings >>>> for their backing GEM objects on Drop. All this is due to that annoying >>>> circular reference between VMs and GEM objects that I'm not sure how to fix. >>>> >>>> Note that if I *don't* do this (or forget to do it somewhere) the >>>> consequence is just that we leak memory, and if you try to destroy the wrong >>>> IDs somehow the worst that can happen is you unmap things you shouldn't and >>>> fault the GPU (or, in the kernel or kernel-managed user VM cases, >>>> potentially the firmware). Rust safety guarantees still keep things from >>>> going entirely off the rails within the kernel, since everything that >>>> matters is reference counted (which is why these reference cycles are >>>> possible at all). >>>> >>>> This all started when I was looking at the panfrost driver for reference. It >>>> does the same thing except it uses actual pointers to the owning entities >>>> instead of IDs, and pointer comparison (see panfrost_gem_close). Of course >>>> you could try do that in Rust too (literally storing and comparing raw >>>> pointers that aren't owned references), but then you're introducing a Pin<> >>>> requirement on those objects to make their addresses stable and it feels way >>>> more icky and error-prone than unique IDs (since addresses can be reused). >>>> panfrost only has a single mmu (what I call the raw VM) per File while I >>>> have an arbitrary number, which is why I end up with the extra >>>> distinction/complexity of both File and VM IDs, but the concept is the same. >>>> >>>> Some of this is going to be refactored when I implement arbitrary VM range >>>> mapping/unmapping, which would be a good time to improve this... but is >>>> there something particularly wrong/broken about the way I'm doing it now >>>> that I missed? I figured unique u64 IDs would be a pretty safe way to >>>> identify entities and cleanup the mappings when needed. >>> >>> Ok, some attempt at going through the vm_id/file_id stuff. Extremely >>> high-level purely informed by having read too many drivers: >>> >>> First on the drm_file/struct file/file_id. This is the uapi interface >>> object, and it's refcounted in the vfs, but that's entirely the vfs' >>> business and none of the driver (or even subsystem). Once userspace has >>> done the final close() the file is gone, there's no way to ever get >>> anything meaningfully out of it because userspace dropped it. So if the >>> driver has any kind of backpointer to that's a design bug, because in all >>> the place you might want to care (ioctl, fdinfo for schedu stats, any >>> other file_operations callback) the vfs ensures it stays alive during the >>> callback and you essentially have a borrowed reference. >> >> Right, there's none of that for the File, and it is not refcounted itself. >> Certainly there are no direct references, and as for the IDs: the IDs of >> relevant Files live in GEM objects that hold mappings owned by that file. As >> part of File close all the GEM objects get closed, which removes those >> mappings. So by the time the File goes away there should be no references to >> its ID anywhere (other than if I stashed some away for debugging, I forget >> whether I did in some child object). >> >> If this process breaks for some reason (say, stray mappings remain indexed >> to a File ID that is gone), that means we leak the mappings, which leaks the >> GEM objects themselves and the VM they are mapped to. Not great but not >> fireworks either. As long as the DRM core properly calls the GEM close >> callback on everything before calling the File close callback though, that >> shouldn't happen. >> >>> I've seen a lot of drivers try to make clever backpointings to stuff >>> that's essentially tied to the drm_file, and I've not found a single case >>> that made sense. iow, file_id as a lookup thingie needs to go. In >>> principle it's the same argument I've made already for the syncobj rust >>> wrappers. For specific uses I guess I need some rust reading help, but >>> from your description it sounds like the vm_id is much more the core >>> piece. >> >> The file ID is simply how GEM mappings are identified as belonging to an >> active file within the mapping list of an object. GEM object close is >> literally the only place this ID is ever used for anything other than >> passing around: >> >> /// Callback to drop all mappings for a GEM object owned by a given `File` >> fn close(obj: &Object, file: &DrmFile) { >> mod_pr_debug!("DriverObject::close vm_id={:?} id={}\n", obj.vm_id, >> obj.id); >> obj.drop_file_mappings(file.inner().file_id()); >> } >> >> I could also just iterate through the VM XArray for the File and drop >> mappings one VM at a time instead of doing all of them in one go, it's just >> slightly more cumbersome (though potentially less code because I could get >> rid of all the forwarding the file_id I do now). >> >> On the other hand, once we implement arbitrary VM maps, I suspect this is >> going to go away anyway with the new design, so I'm not really very inclined >> to fix it until that happens... ^^ > > Yeah the driver-managed vm needs a bunch more reference loops and gets > awkward fast. the gpuva library might need to keep support for that, but I > really hope it's not needed. > >>> So for that we have the gpu ctx -> vm -> gem_bos chain of reference. Now >>> on the C side if you have a modern driver that uses the >>> vm_bind/unbind/gpuva manager approach, the reference counts go in that >>> single direction only, anything else is essentially borrowed references >>> under protection of a mutex/lock or similar thing (for e.g. going from the >>> bo to the vm for eviction). >> >> Right, so that is what is going to change with the pending refactor. What I >> have right now is a design that used to be the old driver-managed VM design >> (and still retains part of that for kernel-managed objects) for the old >> synchronous demo UAPI, that I then shoehorned into the redesigned vm_bind >> UAPI by just not supporting the interesting cases (partial >> maps/unmaps/remaps, etc.). This is all temporary, it's just to get us by for >> now since OpenGL doesn't need it and there is no usable Vulkan driver that >> cares yet... I wanted to focus on the explicit sync and general >> sched/queuing part of the new UAPI before I got to the VM bind stuff, since >> I figured that would be more interesting (and pulls in all the new >> abstractions, plus major perf benefit). So the UAPI itself has vm_bind but >> only the "easy" subset of cases are supported by the driver (whole object >> maps/unmaps) and the refcounting is still backwards. >> >> As I said this originally came from the Panfrost design that doesn't have >> vm_bind but instead keeps a list of mappings with pointer equality checks in >> BOs... so that's why ^^ >> >> Thanks for explaining the design approach though, it's roughly what I had in >> mind but it's good to hear I'm on the right track! I'd love to go into more >> detail about how to implement vm_bind if you have time though (maybe a >> meeting?). In particular things like using the mm allocator to keep track of >> mapping ranges and supporting splitting and all that. > > Yeah vm_bind sounds like a good topic to discuss. I don't think we'll get > all the pieces aligned to land that before asahi, but the driver internals > should at least match wrt semantics with that so that the refactoring > isn't total pain. > >>> In addition to the above chain the xarray in the drm_file also holds >>> references to each of these. So far so good, in the drm_file ->postclose >>> callback you just walk the xarrays and drop all the references, and >>> everything gets cleaned up, at least in the C world. >> >> In the Rust world you just do nothing since the XArray abstraction knows how >> to drop all of its contained objects! > > Yeah xarray should work with Drop, but I guess you need a special > uapi/open-reference object that knows that it needs to perform additional > cleanup (like quiescent the gpu ctx or unamp everything for the vm). Yeah, I already have that for VMs. Since I have a layer between UAPI VM objects and the underlying MMU VM objects, the UAPI VM object Drop impl can take care of explicitly unmapping whatever it needs to, or however that ends up working out with the new design. I prefer that to explicit cleanup code since it means you can't forget to do it. Rust is pretty nice for throwing around tiny objects, 1:1 wrappers, or even zero-sized types that just do one thing + Drop in order to make some semantic ergonomic to use. That's how the XArray reservation stuff works: you get back a trivial object that just references the queue (yay lifetimes, no refcounting here) and holds the reservation open, and then you either fill it (which consumes the reservation guard) or drop it (which cleans up the reservation). There's lots of that kind of pattern in kernel Rust and I think we should use it often, it just makes things a lot less error-prone (ScopeGuard is another nice one!) >>> But if either due to the uabi being a bit more legacy, or Rust requiring >>> that the backpointers are reference-counted from the gem_bo->vma->vm and >>> can't follow borrow semantics (afaiui the usual linux list_head pattern of >>> walking the list under a lock giving you a borrowed reference for each >>> element doesn't work too well in rust?) then that's not a problem, you can >>> still all clean it out: >>> >>> - The key bit is that your vm struct needs both a refcount like kref and >>> a separate open count. Each gpu ctx and the xarray for vm objects in >>> drm_file hold _both_ the kref and the open refcount (in rust the open >>> refcount implies the Arc or things go sideways). >>> >>> - the other key bit is that drm_file ->postclose does _not_ have simple >>> Drop semantics, it's more explicit. >>> >>> - in the drm_file lastclose you first walk all the gpu ctx. The simplest >>> semantics is that close() synchronously tears down all leftover gpu ctx, >>> i.e. you unload them from the gpu. Details are under a lot of discussion >>> in the various scheduler threads, but essentially this should ensure >>> that the gpu ctx destruction completely removes all references to the >>> ctx. If instead you have the legacy problem of apps expecting that >>> rendering continues even if they called exit() before it finishes, then >>> it gets more messy. I have no idea whether that's still a problem for >>> new drivers or can be avoided. >>> >>> - Next up you do the same thing for the vm xarray (which drops both the >>> kref an open refcounts). >>> >>> - At this point there might still be a ton of vm objects around with >>> elevated kref. Except not, because at this point the open refcount of >>> each vm should have dropped to zero. When that happens the vm object >>> itself is still alive, plus even better for rust, you are in the >>> vm_close(vm) function call so you have a full borrowed reference to >>> that. Which means you can walk the entire address space and unmap >>> everything explicit. Which should get rid of any gem_bo->vma->vm >>> backpointers you have lying around. >>> >>> - At that point all your vm objects are gone too, because the kref managed >>> backpointers are gone. >>> >>> - You walk the xarray of gem_bo (well the drm subsystem does that for >>> you), which cleans out the reamining references to gem_bo. Only the >>> gem_bo which are shared with other process or have a dma_buf will >>> survive, like they should. >>> >>> No leak, no funky driver-internal vm_id based lookup, and with rust we >>> should even be able to guarantee you never mix up Arc<Vm> with OpenRef<Vm> >>> (or however that exactly works in rust types, I have not much real clue). >> >> That would totally work, and actually I already use somewhat analogous >> mechanisms in other places like firmware queues! >> >> If this all weren't getting turned on its head for the new VM management I'd >> implement it, but hopefully we can agree there's not much point right now... >> I'd rather focus on the DRM abstraction design and work on improving the >> driver in parallel right now, and then about one kernel cycle or so from now >> it should definitely be in a better place for review. Honestly, there are >> bigger design problems with the driver right now than these IDs (that I >> already know about)... so I want to focus more on the abstractions and their >> usage right now than the internal driver design which I *know* has problems >> ^^ > > Yeah I think the only fundamental issue you have is that (if I get this > all right) you're trying to clean up mappings from the gem_bo, not from > the vm. The gem_bo (unlike the vm) is freely shareable (at least in > general), so tying anything else to the lifetime of a gem_bo in any way is > a design flaw. Yeah, it wasn't nice from the start. Actually the first bit of code I wrote is the MMU code, and originally it was even literally C code based on the panfrost MMU code as-is... I quickly realized that the C wasn't going to be that useful when I started diving into the GEM abstractions, so it got rewritten in Rust early on... So right now it works (and I have no reason to believe it has actual leak bugs lurking today) but it's not a nice design and it's going to get a major refactor/redesign once I switch to proper vm_bind tracking. > This is similar to dma_fence that can end up absolutely everywhere, and > why drm/sched has this decoupling between hw_fence and drm_job fences with > wider visibility. i915-gem/i915-scheduler and a lot of the really old > drivers all get this wrong, and you end up with either terrible explicit > cleanup code that tries to go around looking for all the references that > it needs to drop. Or you just leak. I think for fences my general approach is going to be to just try to keep to what I'm doing now and minimize the references fences hold, and treat them as a signaling mechanism that ideally doesn't have to hold a reference to anything other than the module. After all, the real king of what needs to be alive is the firmware, and its mechanisms don't map well to fences directly, so I need to do bespoke resource management there anyway (and then just plug it into fences so it can feed into drm_sched and the rest of the world). I don't know if that makes sense, but it feels like it does? I still need to spend a bunch of time thinking about this though... > All these things need to be sorted out at design time so that they're > impossible. That's the other nice thing about Rust, it makes refactoring a lot faster too! The compiler is really good at annoying you and refusing to compile things until you've fixed all the really dumb mistakes you introduced, and then there's a pretty good chance it'll run and the remaining bugs will be really obvious after that. As much as you learn to hate the compiler, it's so much better than trying to debug things at runtime... ^^ I'm not sure what your opinion is on this, but personally if you/others were okay with it I wouldn't be too worried about hypothetically merging the driver in the state it's in today, with the expectation to hack major parts of it to bits and pieces over the next few months. I've done it a few times already... it usually doesn't take more than a day or two to make some major refactor to a component and get it back up and running. (I do expect to do a bunch of that cleanup over the next few months before it's even possible to merge anyway, just a hypothetical). >> Rust is really good at getting you to come up with a *safe* design as far as >> memory and ownership, but that doesn't mean it's perfectly clean code and >> more importantly it does nothing for deadlocks and allocating in the wrong >> paths and getting resource allocation semantics right etc etc. The GPU FW >> queue stuff is at the very least due for another major refactor/cleanup to >> defer resource allocation and actual queuing to job prepare/run time (right >> now there's some horrible hacks to do it upfront at submit because I don't >> have a mechanism to back-patch job structures with those resource IDs later >> at exec time, but I want to add that), and along the way I can also fix the >> using job fences to block on pending job count thing that Christian really >> wants me to do instead of the can_run_job thing, and then getting all this >> resource stuff truly right is also going to mean eventually using fences to >> handle blocking on resource exhaustion too (though maybe I can get away with >> implementing that a bit later)... >> >> The driver works stupidly well for how quickly I wrote it, but it still has >> all these rough edges that definitely need fixing before it's something I >> could say I'm happy with... I'm sure if you start hammering it with evil >> workloads you will hit some of its current problems (like I did yesterday >> with the deadlocks on GpuContext inval). I also need to learn more about the >> subtleties of fence signaling and all that, especially once a shrinker comes >> into play... > > Yeah I think rust is impressive at creating working code. The real > challenge, and really where I see all the short term value at least, is in > clarifying the semantics. Because that'll help us to clarify the semantics > on the C side too, which gives immediate benefits for everyone. Not just > new drivers in rust. > > But it's also the part that's really, really hard work. Yup! ~~ Lina
Hi everyone! This is my first take on the Rust abstractions for the DRM subsystem. It includes the abstractions themselves, some minor prerequisite changes to the C side, as well as the drm-asahi GPU driver (for reference on how the abstractions are used, but not necessarily intended to land together). These patches apply on top of the tree at [1], which is based on 6.3-rc1 with a large number of Rust abstraction/support commits added on top. Most of these are not prerequisites for the DRM abstractions themselves, but rather only of the driver. * #1-12 introduce the abstractions, module by module, with minor C changes before the dependent abstraction. * Patch 10 is a little addition to drm_sched that I ended up needing, but I can pull it out of the abstraction into its own patch if needed. * #13-14 add a minor feature to drm/gem and its abstraction used by the driver. * #15-16 introduce the (unstable) asahi UAPI. This is obviously not ready for merge yet, but comments are welcome! * #17 adds a Rust helper macro to handle GPU core/firmware differences. This probably belongs in the driver at this point, but right now it has to live in rust/macros since there is no mechanism for per-driver proc macros. * #18 adds the driver proper, in one big commit, for reference purposes. I've been working since mid last year on an Apple AGX GPU driver for Linux, using the (at the time) out-of-tree Rust support. As part of this effort, I've been writing safe Rust abstractions for portions of the DRM subsystem. Now that Rust itself is upstream, I'd like to get all the abstractions upstreamed so we can eventually get the driver upstreamed! These abstractions have been used by the driver since our release in December [2], in a simpler synchronous-submission form: * drm::ioctl * drm::device * drm::drv * drm::file * drm::{gem, gem::shmem} * drm::mm This series adds these too, which are used by the explicit sync refactor of the driver (the version in this series): * drm::syncobj * drm::sched * dma_fence The major dependencies for the DRM abstractions themselves are: * [3] rust: error: Add missing wrappers to convert to/from kernel error codes * [4] rust: Miscellaneous macro improvements * [5] rust: Add a Sealed trait * [6] rust: device: Add a minimal RawDevice trait * [7] rust: Enable the new_uninit feature for kernel and driver crates * [8] rust: ioctl: Add ioctl number manipulation functions * [9] rust: sync: Arc: Any downcasting and assume_init() * rust: Add `container_of` and `offset_of` macros * kernel::sync::mutex and dependencies Most of these (the ones with links) have already been submitted, and I expect all of them to land for 6.4 (the mutex one will likely be last, since there is some refactoring that will happen over the current state to make it more ergonomic to use). The mutex dep is only necessary for drm::mm and dma_fence, and transitively drm::syncobj and drm::sched. Things work! We've had most of the abstractions in production edge kernels with the driver, and the new explicit sync stuff has passed quite a few torture tests (this is how we found the drm_sched issue, patch 11). The abstractions are intended to be safe (safety review very welcome!). While writing them, I tried to avoid making any changes to the C side unless absolutely necessary. I understand that it will probably make sense to adjust the C side to make some things easier, but I wanted to start from this as a baseline. Known issues: - The existing Rust integration does not currently allow building abstractions as modules, so the Rust abstractions are only available for DRM components that are built in. I added some extra Kconfig symbols to deal with this, so a driver built as a module can depende on having those built in. This should go away in the future (but may not be ready in time for submission... I understand this probably shouldn't be a blocker though?). - DRM relies heavily on the "subclassing" pattern for driver objects, and this doesn't map well to Rust. I tried several approaches for various bits, so we can see how they work out. In particular, whether wrapper types should pretend to be smart pointers and Deref to their inner driver-specific types, and whether they should be marked as method receivers (Yuck, internal rustc implementation hacks! But Arc<T> already does the same thing and it makes usage in driver-implemented callbacks as `self` possible) are things I'd love to discuss ^^. - Only what I need for my driver is implemented (plus a small amount of obvious extras where better API completeness makes sense). I think the general idea with Rust abstractions is that we add things as they become necessary. - The plain GEM vs. GEM-shmem duality ended up with quite a hairy type hierarchy. I'd love to figure out how to make this simpler... - drm::mm ends up requiring a built-in mutex in the abstraction, instead of delegating that to the user with the usual Rust mutability rules. This is because nodes can be dropped at any time, and those operations need to be synchronized. We could try to avoid forbidding those drops or mark the node type !Send, but that would make it a lot less ergonomic to use... I'm looking for feedback on the abstractions of all kinds, so we can move towards an upstreamable version. Optimistically, I'd love to get this upstream for 6.5, and the driver for 6.6. Please feel free to ask any questions about the Rust bits, since I know a lot of this is new to many of the C folks! This is a fairly complete driver for Apple AGX G13 and G14 series GPUs. The driver today supports the Apple M1, M1 Pro, M1 Max, M1 Ultra, and M2 SoCs, across two firmware revisions each. It has an explicit sync UAPI heavily inspired by the upcoming Intel Xe UAPI, designed with Vulkan support in mind. On the Mesa side we currently have a Gallium driver that is mostly already upstream (missing the UAPI bits mostly) and passes the dEQP GLES2/EGL tests, with most of GLES3.0 passing in downstream work-in-progress branches. This is a reverse engineered community driver (we have no hardware documentation of any kind, other than some hints from aspects shared with PowerVR). While developing the driver, I tried to make use of Rust's safety and lifetime features to provide not just CPU-side safety, but also partial firmware-ABI safety. Thanks to this, it has turned out to be a very stable driver even though GPU firmware crashes are fatal (no restart capability, need to reboot!) and the FW/driver interface is a huge mess of unsafe shared memory structures with complex pointer chains. There are over 70 ABI types and 3000+ lines of firmware ABI type definitions that vary between firmware builds and GPU cores... In a simpler blocking-submission form, it has been shipping in Asahi Linux edge kernels since December [2], with lots of users and zero (!) reported oopses (and only a couple reports of GPU firmware crashes, though that issue should now be fixed). It has survived OOM scenarios (Rust makes error cleanup easy!), UAPI-level fuzzing, countless broken Mesa builds, uptimes of 40+ days, and more. The explicit sync refactor significantly increases performance (and potential problems), but this version has survived a lot of torture with dEQP/piglit tests and some manual corner case testing. In other words, Rust works! ^^ There are some design notes on the driver and further links at [10]. [1] https://github.com/AsahiLinux/linux.git drm-rfc-base-20230307 [2] https://asahilinux.org/2022/12/gpu-drivers-now-in-asahi-linux/ [3] https://lore.kernel.org/rust-for-linux/20230224-rust-error-v1-0-f8f9a9a87303@asahilina.net/T/ [4] https://lore.kernel.org/rust-for-linux/20230224-rust-macros-v1-0-b39fae46e102@asahilina.net/T/ [5] https://lore.kernel.org/rust-for-linux/20230224-rust-iopt-rtkit-v1-0-49ced3391295@asahilina.net/T/#m515bad2cff7f5a46f55897e6b73c6c2f1fb2c638 [6] https://lore.kernel.org/rust-for-linux/20230224-rust-iopt-rtkit-v1-0-49ced3391295@asahilina.net/T/#m4c64e390c43b3ff1b8470fc8b37eaf87f6e12c94 [7] https://lore.kernel.org/rust-for-linux/CQV7ZNT6LMXI.1XG4YXSH8I7JK@vincent-arch/T/ [8] https://lore.kernel.org/rust-for-linux/61f734d6-1497-755f-3632-3f261b890846@asahilina.net/T/ [9] https://lore.kernel.org/rust-for-linux/20230224-rust-arc-v1-0-568eea613a41@asahilina.net/T/ [10] https://github.com/AsahiLinux/docs/wiki/SW:AGX-driver-notes Signed-off-by: Asahi Lina <lina@asahilina.net> --- Asahi Lina (18): rust: drm: ioctl: Add DRM ioctl abstraction rust: drm: Add Device and Driver abstractions rust: drm: file: Add File abstraction rust: drm: gem: Add GEM object abstraction drm/gem-shmem: Export VM ops functions rust: drm: gem: shmem: Add DRM shmem helper abstraction rust: drm: mm: Add DRM MM Range Allocator abstraction rust: dma_fence: Add DMA Fence abstraction rust: drm: syncobj: Add DRM Sync Object abstraction drm/scheduler: Add can_run_job callback drm/scheduler: Clean up jobs when the scheduler is torn down rust: drm: sched: Add GPU scheduler abstraction drm/gem: Add a flag to control whether objects can be exported rust: drm: gem: Add set_exportable() method drm/asahi: Add the Asahi driver UAPI [DO NOT MERGE] rust: bindings: Bind the Asahi DRM UAPI rust: macros: Add versions macro drm/asahi: Add the Asahi driver for Apple AGX GPUs drivers/gpu/drm/Kconfig | 19 + drivers/gpu/drm/Makefile | 1 + drivers/gpu/drm/asahi/Kconfig | 35 + drivers/gpu/drm/asahi/Makefile | 3 + drivers/gpu/drm/asahi/alloc.rs | 1046 ++++++++++++++++++++++++++ drivers/gpu/drm/asahi/asahi.rs | 53 ++ drivers/gpu/drm/asahi/buffer.rs | 694 ++++++++++++++++++ drivers/gpu/drm/asahi/channel.rs | 542 ++++++++++++++ drivers/gpu/drm/asahi/debug.rs | 129 ++++ drivers/gpu/drm/asahi/driver.rs | 166 +++++ drivers/gpu/drm/asahi/event.rs | 229 ++++++ drivers/gpu/drm/asahi/file.rs | 718 ++++++++++++++++++ drivers/gpu/drm/asahi/float.rs | 381 ++++++++++ drivers/gpu/drm/asahi/fw/buffer.rs | 170 +++++ drivers/gpu/drm/asahi/fw/channels.rs | 385 ++++++++++ drivers/gpu/drm/asahi/fw/compute.rs | 107 +++ drivers/gpu/drm/asahi/fw/event.rs | 100 +++ drivers/gpu/drm/asahi/fw/fragment.rs | 276 +++++++ drivers/gpu/drm/asahi/fw/initdata.rs | 1264 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/asahi/fw/job.rs | 56 ++ drivers/gpu/drm/asahi/fw/microseq.rs | 384 ++++++++++ drivers/gpu/drm/asahi/fw/mod.rs | 15 + drivers/gpu/drm/asahi/fw/types.rs | 233 ++++++ drivers/gpu/drm/asahi/fw/vertex.rs | 177 +++++ drivers/gpu/drm/asahi/fw/workqueue.rs | 168 +++++ drivers/gpu/drm/asahi/gem.rs | 301 ++++++++ drivers/gpu/drm/asahi/gpu.rs | 1088 +++++++++++++++++++++++++++ drivers/gpu/drm/asahi/hw/mod.rs | 522 +++++++++++++ drivers/gpu/drm/asahi/hw/t600x.rs | 140 ++++ drivers/gpu/drm/asahi/hw/t8103.rs | 80 ++ drivers/gpu/drm/asahi/hw/t8112.rs | 82 +++ drivers/gpu/drm/asahi/initdata.rs | 777 ++++++++++++++++++++ drivers/gpu/drm/asahi/mem.rs | 133 ++++ drivers/gpu/drm/asahi/microseq.rs | 61 ++ drivers/gpu/drm/asahi/mmu.rs | 1249 +++++++++++++++++++++++++++++++ drivers/gpu/drm/asahi/object.rs | 704 ++++++++++++++++++ drivers/gpu/drm/asahi/place.rs | 343 +++++++++ drivers/gpu/drm/asahi/queue/common.rs | 52 ++ drivers/gpu/drm/asahi/queue/compute.rs | 371 ++++++++++ drivers/gpu/drm/asahi/queue/mod.rs | 725 ++++++++++++++++++ drivers/gpu/drm/asahi/queue/render.rs | 1173 +++++++++++++++++++++++++++++ drivers/gpu/drm/asahi/regs.rs | 387 ++++++++++ drivers/gpu/drm/asahi/slotalloc.rs | 292 ++++++++ drivers/gpu/drm/asahi/util.rs | 44 ++ drivers/gpu/drm/asahi/workqueue.rs | 880 ++++++++++++++++++++++ drivers/gpu/drm/drm_gem.c | 1 + drivers/gpu/drm/drm_gem_shmem_helper.c | 9 +- drivers/gpu/drm/drm_prime.c | 5 + drivers/gpu/drm/scheduler/sched_main.c | 37 +- include/drm/drm_gem.h | 8 + include/drm/drm_gem_shmem_helper.h | 3 + include/drm/gpu_scheduler.h | 8 + include/uapi/drm/asahi_drm.h | 556 ++++++++++++++ rust/bindings/bindings_helper.h | 14 + rust/helpers.c | 168 +++++ rust/kernel/dma_fence.rs | 532 ++++++++++++++ rust/kernel/drm/device.rs | 76 ++ rust/kernel/drm/drv.rs | 342 +++++++++ rust/kernel/drm/file.rs | 113 +++ rust/kernel/drm/gem/mod.rs | 384 ++++++++++ rust/kernel/drm/gem/shmem.rs | 381 ++++++++++ rust/kernel/drm/ioctl.rs | 147 ++++ rust/kernel/drm/mm.rs | 309 ++++++++ rust/kernel/drm/mod.rs | 13 + rust/kernel/drm/sched.rs | 358 +++++++++ rust/kernel/drm/syncobj.rs | 77 ++ rust/kernel/lib.rs | 4 + rust/macros/lib.rs | 7 + rust/macros/versions.rs | 267 +++++++ 69 files changed, 20569 insertions(+), 5 deletions(-) --- base-commit: c9eb15274c9861026682a6b3e645891fccf88e07 change-id: 20230307-rust-drm-b5af3c2a9e55 Thank you, ~~ Lina