Message ID | 20241119112408.779243-1-abdiel.janulgue@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | rust: page: Add support for existing struct page mappings | expand |
On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: > This series aims to add support for pages that are not constructed by an > instance of the rust Page abstraction, for example those returned by > vmalloc_to_page() or virt_to_page(). > > Changes sinve v3: > - Use the struct page's reference count to decide when to free the > allocation (Alice Ryhl, Boqun Feng). Bleh, this is going to be "exciting". We're in the middle of a multi-year project to remove refcounts from struct page. The lifetime of a page will be controlled by the memdesc that it belongs to. Some of those memdescs will have refcounts, but others will not. We don't have a fully formed destination yet, so I can't give you a definite answer to a lot of questions. Obviously I don't want to hold up the Rust project in any way, but I need to know that what we're trying to do will be expressible in Rust. Can we avoid referring to a page's refcount?
On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: > > This series aims to add support for pages that are not constructed by an > > instance of the rust Page abstraction, for example those returned by > > vmalloc_to_page() or virt_to_page(). > > > > Changes sinve v3: > > - Use the struct page's reference count to decide when to free the > > allocation (Alice Ryhl, Boqun Feng). > > Bleh, this is going to be "exciting". We're in the middle of a multi-year > project to remove refcounts from struct page. The lifetime of a page > will be controlled by the memdesc that it belongs to. Some of those > memdescs will have refcounts, but others will not. > > We don't have a fully formed destination yet, so I can't give you a > definite answer to a lot of questions. Obviously I don't want to hold > up the Rust project in any way, but I need to know that what we're trying > to do will be expressible in Rust. > > Can we avoid referring to a page's refcount? I don't think this patch needs the refcount at all, and the previous version did not expose it. This came out of the advice to use put_page over free_page. Does this mean that we should switch to put_page but not use get_page? Alice
On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: > > > This series aims to add support for pages that are not constructed by an > > > instance of the rust Page abstraction, for example those returned by > > > vmalloc_to_page() or virt_to_page(). > > > > > > Changes sinve v3: > > > - Use the struct page's reference count to decide when to free the > > > allocation (Alice Ryhl, Boqun Feng). > > > > Bleh, this is going to be "exciting". We're in the middle of a multi-year > > project to remove refcounts from struct page. The lifetime of a page > > will be controlled by the memdesc that it belongs to. Some of those > > memdescs will have refcounts, but others will not. > > One question: will the page that doesn't have refcounts has an exclusive owner? I.e. there is one owner that's responsible to free the page and make sure other references to the page get properly invalidated (maybe via RCU?) > > We don't have a fully formed destination yet, so I can't give you a > > definite answer to a lot of questions. Obviously I don't want to hold > > up the Rust project in any way, but I need to know that what we're trying > > to do will be expressible in Rust. > > > > Can we avoid referring to a page's refcount? > > I don't think this patch needs the refcount at all, and the previous > version did not expose it. This came out of the advice to use put_page > over free_page. Does this mean that we should switch to put_page but > not use get_page? > I think the point is finding the exact lifetime model for pages, if it's not a simple refcounting, then what it is? Besides, we can still represent refcounting pages with `struct Page` and other pages with a different type name. So as far as I can see, this patch is OK for now. Regards, Boqun > Alice
On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: > > > > This series aims to add support for pages that are not constructed by an > > > > instance of the rust Page abstraction, for example those returned by > > > > vmalloc_to_page() or virt_to_page(). > > > > > > > > Changes sinve v3: > > > > - Use the struct page's reference count to decide when to free the > > > > allocation (Alice Ryhl, Boqun Feng). > > > > > > Bleh, this is going to be "exciting". We're in the middle of a multi-year > > > project to remove refcounts from struct page. The lifetime of a page > > > will be controlled by the memdesc that it belongs to. Some of those > > > memdescs will have refcounts, but others will not. > > > > > One question: will the page that doesn't have refcounts has an exclusive > owner? I.e. there is one owner that's responsible to free the page and > make sure other references to the page get properly invalidated (maybe > via RCU?) It's up to the owner of the page how they want to manage freeing it. They can use a refcount (folios will still have a refcount, for example), or they can know when there are no more users of the page (eg slab knows when all objects in a slab are freed). RCU is a possibility, but would be quite unusual I would think. The model I'm looking for here is that 'page' is too low-level an object to have its own lifecycle; it's always defined by a higher level object. > > > We don't have a fully formed destination yet, so I can't give you a > > > definite answer to a lot of questions. Obviously I don't want to hold > > > up the Rust project in any way, but I need to know that what we're trying > > > to do will be expressible in Rust. > > > > > > Can we avoid referring to a page's refcount? > > > > I don't think this patch needs the refcount at all, and the previous > > version did not expose it. This came out of the advice to use put_page > > over free_page. Does this mean that we should switch to put_page but > > not use get_page? Did I advise using put_page() over free_page()? I hope I didn't say that. I don't see a reason why binder needs to refcount its pages (nor use a mapcount on them), but I don't fully understand binder so maybe it does need a refcount. > I think the point is finding the exact lifetime model for pages, if it's > not a simple refcounting, then what it is? Besides, we can still > represent refcounting pages with `struct Page` and other pages with a > different type name. So as far as I can see, this patch is OK for now. I don't want Page to have a refcount. If you need something with a refcount, it needs to be called something else.
On Wed, Nov 20, 2024 at 05:02:14PM +0000, Matthew Wilcox wrote: > On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: > > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: > > > > > This series aims to add support for pages that are not constructed by an > > > > > instance of the rust Page abstraction, for example those returned by > > > > > vmalloc_to_page() or virt_to_page(). > > > > > > > > > > Changes sinve v3: > > > > > - Use the struct page's reference count to decide when to free the > > > > > allocation (Alice Ryhl, Boqun Feng). > > > > > > > > Bleh, this is going to be "exciting". We're in the middle of a multi-year > > > > project to remove refcounts from struct page. The lifetime of a page > > > > will be controlled by the memdesc that it belongs to. Some of those > > > > memdescs will have refcounts, but others will not. > > > > > > > > One question: will the page that doesn't have refcounts has an exclusive > > owner? I.e. there is one owner that's responsible to free the page and > > make sure other references to the page get properly invalidated (maybe > > via RCU?) > > It's up to the owner of the page how they want to manage freeing it. > They can use a refcount (folios will still have a refcount, for example), > or they can know when there are no more users of the page (eg slab knows > when all objects in a slab are freed). RCU is a possibility, but would > be quite unusual I would think. The model I'm looking for here is that > 'page' is too low-level an object to have its own lifecycle; it's always > defined by a higher level object. > Ok, that makes sense. That's actually aligned with the direction we are heading in this patch: make `struct Page` itself independent on how the lifetime is maintained. Conceptually, say we can define folio in pure Rust, it could be: struct Folio { head: Page, /* or a union of page */ ... } and we can `impl AlwaysRefcounted for Folio`, which implies there is a refcount inside. And we can also have a `Foo` being: struct Foo { inner: Page, } which doesn't implement `AlwaysRefcounted`, and that suggests a different way the page lifetime will be maintained. > > > > We don't have a fully formed destination yet, so I can't give you a > > > > definite answer to a lot of questions. Obviously I don't want to hold > > > > up the Rust project in any way, but I need to know that what we're trying > > > > to do will be expressible in Rust. > > > > > > > > Can we avoid referring to a page's refcount? > > > > > > I don't think this patch needs the refcount at all, and the previous > > > version did not expose it. This came out of the advice to use put_page > > > over free_page. Does this mean that we should switch to put_page but > > > not use get_page? > > Did I advise using put_page() over free_page()? I hope I didn't say We have some off-list discussion about free_page() doesn't always free the page if you could remember. > that. I don't see a reason why binder needs to refcount its pages (nor > use a mapcount on them), but I don't fully understand binder so maybe > it does need a refcount. I don't think binder needs it either, but I think Abdiel here has a different usage than binder. > > > I think the point is finding the exact lifetime model for pages, if it's > > not a simple refcounting, then what it is? Besides, we can still > > represent refcounting pages with `struct Page` and other pages with a > > different type name. So as far as I can see, this patch is OK for now. > > I don't want Page to have a refcount. If you need something with a > refcount, it needs to be called something else. So if I understand correctly, what Abdiel needs here is a way to convert a virtual address to the corresponding page, would it make sense to just use folio in this case? Abdiel, what's the operation you are going to call on the page you get? Regards, Boqun
On 20/11/2024 19:25, Boqun Feng wrote: > On Wed, Nov 20, 2024 at 05:02:14PM +0000, Matthew Wilcox wrote: >> On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: >>> On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: >>>> On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: >>>>> >>>>> On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: >>>>>> This series aims to add support for pages that are not constructed by an >>>>>> instance of the rust Page abstraction, for example those returned by >>>>>> vmalloc_to_page() or virt_to_page(). >>>>>> >>>>>> Changes sinve v3: >>>>>> - Use the struct page's reference count to decide when to free the >>>>>> allocation (Alice Ryhl, Boqun Feng). >>>>> >>>>> Bleh, this is going to be "exciting". We're in the middle of a multi-year >>>>> project to remove refcounts from struct page. The lifetime of a page >>>>> will be controlled by the memdesc that it belongs to. Some of those >>>>> memdescs will have refcounts, but others will not. >>>>> >>> >>> One question: will the page that doesn't have refcounts has an exclusive >>> owner? I.e. there is one owner that's responsible to free the page and >>> make sure other references to the page get properly invalidated (maybe >>> via RCU?) >> >> It's up to the owner of the page how they want to manage freeing it. >> They can use a refcount (folios will still have a refcount, for example), >> or they can know when there are no more users of the page (eg slab knows >> when all objects in a slab are freed). RCU is a possibility, but would >> be quite unusual I would think. The model I'm looking for here is that >> 'page' is too low-level an object to have its own lifecycle; it's always >> defined by a higher level object. >> > > Ok, that makes sense. That's actually aligned with the direction we are > heading in this patch: make `struct Page` itself independent on how the > lifetime is maintained. Conceptually, say we can define folio in pure > Rust, it could be: > > struct Folio { > head: Page, /* or a union of page */ > ... > } > > and we can `impl AlwaysRefcounted for Folio`, which implies there is a > refcount inside. And we can also have a `Foo` being: > > struct Foo { > inner: Page, > } > > which doesn't implement `AlwaysRefcounted`, and that suggests a > different way the page lifetime will be maintained. > >>>>> We don't have a fully formed destination yet, so I can't give you a >>>>> definite answer to a lot of questions. Obviously I don't want to hold >>>>> up the Rust project in any way, but I need to know that what we're trying >>>>> to do will be expressible in Rust. >>>>> >>>>> Can we avoid referring to a page's refcount? >>>> >>>> I don't think this patch needs the refcount at all, and the previous >>>> version did not expose it. This came out of the advice to use put_page >>>> over free_page. Does this mean that we should switch to put_page but >>>> not use get_page? >> >> Did I advise using put_page() over free_page()? I hope I didn't say > > We have some off-list discussion about free_page() doesn't always free > the page if you could remember. > >> that. I don't see a reason why binder needs to refcount its pages (nor >> use a mapcount on them), but I don't fully understand binder so maybe >> it does need a refcount. > > I don't think binder needs it either, but I think Abdiel here has a > different usage than binder. > >> >>> I think the point is finding the exact lifetime model for pages, if it's >>> not a simple refcounting, then what it is? Besides, we can still >>> represent refcounting pages with `struct Page` and other pages with a >>> different type name. So as far as I can see, this patch is OK for now. >> >> I don't want Page to have a refcount. If you need something with a >> refcount, it needs to be called something else. > > So if I understand correctly, what Abdiel needs here is a way to convert > a virtual address to the corresponding page, would it make sense to just > use folio in this case? Abdiel, what's the operation you are going to > call on the page you get? Yes that's basically it. The goal here is represent those existing struct page within this rust Page abstraction but at the same time to avoid taking over its ownership. Boqun, Alice, should we reconsider Ownable and Owned trait again? :) Regards, Abdiel
On Thu, Nov 21, 2024 at 12:56:38AM +0200, Abdiel Janulgue wrote: > On 20/11/2024 19:25, Boqun Feng wrote: > > On Wed, Nov 20, 2024 at 05:02:14PM +0000, Matthew Wilcox wrote: > > > On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: > > > > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > > > > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > > > > > On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: > > > > > > > This series aims to add support for pages that are not constructed by an > > > > > > > instance of the rust Page abstraction, for example those returned by > > > > > > > vmalloc_to_page() or virt_to_page(). > > > > > > > > > > > > > > Changes sinve v3: > > > > > > > - Use the struct page's reference count to decide when to free the > > > > > > > allocation (Alice Ryhl, Boqun Feng). > > > > > > > > > > > > Bleh, this is going to be "exciting". We're in the middle of a multi-year > > > > > > project to remove refcounts from struct page. The lifetime of a page > > > > > > will be controlled by the memdesc that it belongs to. Some of those > > > > > > memdescs will have refcounts, but others will not. > > > > > > > > > > > > > > One question: will the page that doesn't have refcounts has an exclusive > > > > owner? I.e. there is one owner that's responsible to free the page and > > > > make sure other references to the page get properly invalidated (maybe > > > > via RCU?) > > > > > > It's up to the owner of the page how they want to manage freeing it. > > > They can use a refcount (folios will still have a refcount, for example), > > > or they can know when there are no more users of the page (eg slab knows > > > when all objects in a slab are freed). RCU is a possibility, but would > > > be quite unusual I would think. The model I'm looking for here is that > > > 'page' is too low-level an object to have its own lifecycle; it's always > > > defined by a higher level object. > > > > > > > Ok, that makes sense. That's actually aligned with the direction we are > > heading in this patch: make `struct Page` itself independent on how the > > lifetime is maintained. Conceptually, say we can define folio in pure > > Rust, it could be: > > > > struct Folio { > > head: Page, /* or a union of page */ > > ... > > } > > > > and we can `impl AlwaysRefcounted for Folio`, which implies there is a > > refcount inside. And we can also have a `Foo` being: > > > > struct Foo { > > inner: Page, > > } > > > > which doesn't implement `AlwaysRefcounted`, and that suggests a > > different way the page lifetime will be maintained. > > > > > > > > We don't have a fully formed destination yet, so I can't give you a > > > > > > definite answer to a lot of questions. Obviously I don't want to hold > > > > > > up the Rust project in any way, but I need to know that what we're trying > > > > > > to do will be expressible in Rust. > > > > > > > > > > > > Can we avoid referring to a page's refcount? > > > > > > > > > > I don't think this patch needs the refcount at all, and the previous > > > > > version did not expose it. This came out of the advice to use put_page > > > > > over free_page. Does this mean that we should switch to put_page but > > > > > not use get_page? > > > > > > Did I advise using put_page() over free_page()? I hope I didn't say > > > > We have some off-list discussion about free_page() doesn't always free > > the page if you could remember. > > > > > that. I don't see a reason why binder needs to refcount its pages (nor > > > use a mapcount on them), but I don't fully understand binder so maybe > > > it does need a refcount. > > > > I don't think binder needs it either, but I think Abdiel here has a > > different usage than binder. > > > > > > > > > I think the point is finding the exact lifetime model for pages, if it's > > > > not a simple refcounting, then what it is? Besides, we can still > > > > represent refcounting pages with `struct Page` and other pages with a > > > > different type name. So as far as I can see, this patch is OK for now. > > > > > > I don't want Page to have a refcount. If you need something with a > > > refcount, it needs to be called something else. > > > > So if I understand correctly, what Abdiel needs here is a way to convert > > a virtual address to the corresponding page, would it make sense to just > > use folio in this case? Abdiel, what's the operation you are going to > > call on the page you get? > > Yes that's basically it. The goal here is represent those existing struct > page within this rust Page abstraction but at the same time to avoid taking > over its ownership. > > Boqun, Alice, should we reconsider Ownable and Owned trait again? :) > Could you use folio in your case? If so, we can provide a simple binding for folio which should be `AlwaysRefcounted`, and re-investigate how page should be wrapped. Regards, Boqun > Regards, > Abdiel
On Thu, Nov 21, 2024 at 1:24 AM Boqun Feng <boqun.feng@gmail.com> wrote: > > On Thu, Nov 21, 2024 at 12:56:38AM +0200, Abdiel Janulgue wrote: > > On 20/11/2024 19:25, Boqun Feng wrote: > > > On Wed, Nov 20, 2024 at 05:02:14PM +0000, Matthew Wilcox wrote: > > > > On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: > > > > > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > > > > > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > > > > > > > > > > On Tue, Nov 19, 2024 at 01:24:01PM +0200, Abdiel Janulgue wrote: > > > > > > > > This series aims to add support for pages that are not constructed by an > > > > > > > > instance of the rust Page abstraction, for example those returned by > > > > > > > > vmalloc_to_page() or virt_to_page(). > > > > > > > > > > > > > > > > Changes sinve v3: > > > > > > > > - Use the struct page's reference count to decide when to free the > > > > > > > > allocation (Alice Ryhl, Boqun Feng). > > > > > > > > > > > > > > Bleh, this is going to be "exciting". We're in the middle of a multi-year > > > > > > > project to remove refcounts from struct page. The lifetime of a page > > > > > > > will be controlled by the memdesc that it belongs to. Some of those > > > > > > > memdescs will have refcounts, but others will not. > > > > > > > > > > > > > > > > > One question: will the page that doesn't have refcounts has an exclusive > > > > > owner? I.e. there is one owner that's responsible to free the page and > > > > > make sure other references to the page get properly invalidated (maybe > > > > > via RCU?) > > > > > > > > It's up to the owner of the page how they want to manage freeing it. > > > > They can use a refcount (folios will still have a refcount, for example), > > > > or they can know when there are no more users of the page (eg slab knows > > > > when all objects in a slab are freed). RCU is a possibility, but would > > > > be quite unusual I would think. The model I'm looking for here is that > > > > 'page' is too low-level an object to have its own lifecycle; it's always > > > > defined by a higher level object. > > > > > > > > > > Ok, that makes sense. That's actually aligned with the direction we are > > > heading in this patch: make `struct Page` itself independent on how the > > > lifetime is maintained. Conceptually, say we can define folio in pure > > > Rust, it could be: > > > > > > struct Folio { > > > head: Page, /* or a union of page */ > > > ... > > > } > > > > > > and we can `impl AlwaysRefcounted for Folio`, which implies there is a > > > refcount inside. And we can also have a `Foo` being: > > > > > > struct Foo { > > > inner: Page, > > > } > > > > > > which doesn't implement `AlwaysRefcounted`, and that suggests a > > > different way the page lifetime will be maintained. > > > > > > > > > > We don't have a fully formed destination yet, so I can't give you a > > > > > > > definite answer to a lot of questions. Obviously I don't want to hold > > > > > > > up the Rust project in any way, but I need to know that what we're trying > > > > > > > to do will be expressible in Rust. > > > > > > > > > > > > > > Can we avoid referring to a page's refcount? > > > > > > > > > > > > I don't think this patch needs the refcount at all, and the previous > > > > > > version did not expose it. This came out of the advice to use put_page > > > > > > over free_page. Does this mean that we should switch to put_page but > > > > > > not use get_page? > > > > > > > > Did I advise using put_page() over free_page()? I hope I didn't say > > > > > > We have some off-list discussion about free_page() doesn't always free > > > the page if you could remember. > > > > > > > that. I don't see a reason why binder needs to refcount its pages (nor > > > > use a mapcount on them), but I don't fully understand binder so maybe > > > > it does need a refcount. > > > > > > I don't think binder needs it either, but I think Abdiel here has a > > > different usage than binder. > > > > > > > > > > > > I think the point is finding the exact lifetime model for pages, if it's > > > > > not a simple refcounting, then what it is? Besides, we can still > > > > > represent refcounting pages with `struct Page` and other pages with a > > > > > different type name. So as far as I can see, this patch is OK for now. > > > > > > > > I don't want Page to have a refcount. If you need something with a > > > > refcount, it needs to be called something else. > > > > > > So if I understand correctly, what Abdiel needs here is a way to convert > > > a virtual address to the corresponding page, would it make sense to just > > > use folio in this case? Abdiel, what's the operation you are going to > > > call on the page you get? > > > > Yes that's basically it. The goal here is represent those existing struct > > page within this rust Page abstraction but at the same time to avoid taking > > over its ownership. > > > > Boqun, Alice, should we reconsider Ownable and Owned trait again? :) > > > > Could you use folio in your case? If so, we can provide a simple binding > for folio which should be `AlwaysRefcounted`, and re-investigate how > page should be wrapped. Well, regardless of that, I do think it sounds like Owned / Ownable is the right way forward for Page. Alice
Hi Boqun, Matthew: On 21/11/2024 02:24, Boqun Feng wrote: >>> So if I understand correctly, what Abdiel needs here is a way to convert >>> a virtual address to the corresponding page, would it make sense to just >>> use folio in this case? Abdiel, what's the operation you are going to >>> call on the page you get? >> >> Yes that's basically it. The goal here is represent those existing struct >> page within this rust Page abstraction but at the same time to avoid taking >> over its ownership. >> >> Boqun, Alice, should we reconsider Ownable and Owned trait again? :) >> > > Could you use folio in your case? If so, we can provide a simple binding > for folio which should be `AlwaysRefcounted`, and re-investigate how > page should be wrapped. > I'm not sure. Is there a way to get the struct folio from a vmalloc'd address, e.g vmalloc_to_folio()? Regards, Abdiel
[Cc Kairui in case he's interested] On Thu, Nov 21, 2024 at 11:30:13AM +0200, Abdiel Janulgue wrote: > Hi Boqun, Matthew: > > On 21/11/2024 02:24, Boqun Feng wrote: > > > > So if I understand correctly, what Abdiel needs here is a way to convert > > > > a virtual address to the corresponding page, would it make sense to just > > > > use folio in this case? Abdiel, what's the operation you are going to > > > > call on the page you get? > > > > > > Yes that's basically it. The goal here is represent those existing struct > > > page within this rust Page abstraction but at the same time to avoid taking > > > over its ownership. > > > > > > Boqun, Alice, should we reconsider Ownable and Owned trait again? :) > > > > > > > Could you use folio in your case? If so, we can provide a simple binding > > for folio which should be `AlwaysRefcounted`, and re-investigate how > > page should be wrapped. > > > > I'm not sure. Is there a way to get the struct folio from a vmalloc'd > address, e.g vmalloc_to_folio()? > I think you can use page_folio(vmalloc_to_page(..)) to get the folio, but one thing to notice is that folio is guaranteed to be a non-tail page, so if you want to do something later for the particular page (if it's a tail page), you will need to know the offset of the that page in folio. You can do something like below: pub fn page_slice_to_folio<'a>(page: &PageSlice) -> Result<(&'a Folio, usize)> { ... let page = vmalloc_to_page(ptr); let folio = page_folio(page); let offset = folio_page_idx(folio, page); Ok((folio, offset)) } And you have a folio -> page function like: pub struct Folio(Opaque<bindings::folio>); impl Folio { pub fn nth_page(&self, n: usize) -> &Page { &*(nth_page(self.0.get(), n)) } } Of course, this is me acting as I know MM ;-) but I feel this is the way to go. And if binder can use folio as well (I don't see a reason why not, but it's extra work, so defer to Alice), then we would only need the `pub struct Page { inner: Opaque<bindings::page> }` part in your patch #1, and can avoid doing `Ownable` or `AlwaysRefcounted` for `Page`. Thoughts? Regards, Boqun > Regards, > Abdiel
On Thu, Nov 21, 2024 at 11:10:45AM -0800, Boqun Feng wrote: > [Cc Kairui in case he's interested] > (forgot to cc...) > On Thu, Nov 21, 2024 at 11:30:13AM +0200, Abdiel Janulgue wrote: > > Hi Boqun, Matthew: > > > > On 21/11/2024 02:24, Boqun Feng wrote: > > > > > So if I understand correctly, what Abdiel needs here is a way to convert > > > > > a virtual address to the corresponding page, would it make sense to just > > > > > use folio in this case? Abdiel, what's the operation you are going to > > > > > call on the page you get? > > > > > > > > Yes that's basically it. The goal here is represent those existing struct > > > > page within this rust Page abstraction but at the same time to avoid taking > > > > over its ownership. > > > > > > > > Boqun, Alice, should we reconsider Ownable and Owned trait again? :) > > > > > > > > > > Could you use folio in your case? If so, we can provide a simple binding > > > for folio which should be `AlwaysRefcounted`, and re-investigate how > > > page should be wrapped. > > > > > > > I'm not sure. Is there a way to get the struct folio from a vmalloc'd > > address, e.g vmalloc_to_folio()? > > > > I think you can use page_folio(vmalloc_to_page(..)) to get the folio, > but one thing to notice is that folio is guaranteed to be a non-tail > page, so if you want to do something later for the particular page (if > it's a tail page), you will need to know the offset of the that page in > folio. You can do something like below: > > pub fn page_slice_to_folio<'a>(page: &PageSlice) -> Result<(&'a Folio, usize)> { > ... > let page = vmalloc_to_page(ptr); > > let folio = page_folio(page); > let offset = folio_page_idx(folio, page); > > Ok((folio, offset)) > } > > And you have a folio -> page function like: > > pub struct Folio(Opaque<bindings::folio>); > > impl Folio { > pub fn nth_page(&self, n: usize) -> &Page { > &*(nth_page(self.0.get(), n)) > } > } > > Of course, this is me acting as I know MM ;-) but I feel this is the way > to go. And if binder can use folio as well (I don't see a reason why > not, but it's extra work, so defer to Alice), then we would only need > the `pub struct Page { inner: Opaque<bindings::page> }` part in your > patch #1, and can avoid doing `Ownable` or `AlwaysRefcounted` for > `Page`. > > Thoughts? > > Regards, > Boqun > > > Regards, > > Abdiel
On Thu, Nov 21, 2024 at 11:12:30AM -0800, Boqun Feng wrote: > On Thu, Nov 21, 2024 at 11:30:13AM +0200, Abdiel Janulgue wrote: > > Hi Boqun, Matthew: > > > > On 21/11/2024 02:24, Boqun Feng wrote: > > > > > So if I understand correctly, what Abdiel needs here is a way to convert > > > > > a virtual address to the corresponding page, would it make sense to just > > > > > use folio in this case? Abdiel, what's the operation you are going to > > > > > call on the page you get? > > > > > > > > Yes that's basically it. The goal here is represent those existing struct > > > > page within this rust Page abstraction but at the same time to avoid taking > > > > over its ownership. > > > > > > > > Boqun, Alice, should we reconsider Ownable and Owned trait again? :) > > > > > > > > > > Could you use folio in your case? If so, we can provide a simple binding > > > for folio which should be `AlwaysRefcounted`, and re-investigate how > > > page should be wrapped. > > > > > > > I'm not sure. Is there a way to get the struct folio from a vmalloc'd > > address, e.g vmalloc_to_folio()? > > > > I think you can use page_folio(vmalloc_to_page(..)) to get the folio, > but one thing to notice is that folio is guaranteed to be a non-tail > page, so if you want to do something later for the particular page (if > it's a tail page), you will need to know the offset of the that page in > folio. You can do something like below: This is one of those things which will work today, but will stop working in the future, and anyway will only appear to work for some users. For example, both vmalloc and slab allocations do not use the refcount on the struct page for anything. eg this will be a UAF (please excuse me writing in C): char *a = kmalloc(256, GFP_KERNEL); struct page *page = get_page(virt_to_page(a)); char *b = page_address(page) + offset_in_page(a); // a and b will now have the same bit pattern kfree(a); *b = 1; Once you've called kfree(), slab is entitled to hand that memory out to any other user of kmalloc(). This might actually work to protect vmalloc() memory from going away under you, but I intend to change vmalloc so that it won't work (nothing to do with this patch series, rather an approach to make vmalloc more efficient). One reason you're confused today is that we have a temporary ambiguity around what "folio" actually means. The original definition (ie mine) was simply that it was a non-tail page. We're moving towards the definition Johannes wanted, which is that it's only the memdesc for anonymous & file-backed memory [1]. So while vmalloc_to_folio() makes sense under the original definition, it's an absurdity under the new definition. So, Abdiel, why are you trying to add this? What are you actually trying to accomplish in terms of "I am writing a device driver for XXX and I need to ..."? You've been very evasive up to now. [1] Actually Johannes wants to split them apart even further so that anon & file memory have different types, and we may yet get there. One step at a time.
On 22/11/2024 00:01, Matthew Wilcox wrote: > On Thu, Nov 21, 2024 at 11:12:30AM -0800, Boqun Feng wrote: >> On Thu, Nov 21, 2024 at 11:30:13AM +0200, Abdiel Janulgue wrote: >>> Hi Boqun, Matthew: >>> >>> On 21/11/2024 02:24, Boqun Feng wrote: >>>>>> So if I understand correctly, what Abdiel needs here is a way to convert >>>>>> a virtual address to the corresponding page, would it make sense to just >>>>>> use folio in this case? Abdiel, what's the operation you are going to >>>>>> call on the page you get? >>>>> >>>>> Yes that's basically it. The goal here is represent those existing struct >>>>> page within this rust Page abstraction but at the same time to avoid taking >>>>> over its ownership. >>>>> >>>>> Boqun, Alice, should we reconsider Ownable and Owned trait again? :) >>>>> >>>> >>>> Could you use folio in your case? If so, we can provide a simple binding >>>> for folio which should be `AlwaysRefcounted`, and re-investigate how >>>> page should be wrapped. >>>> >>> >>> I'm not sure. Is there a way to get the struct folio from a vmalloc'd >>> address, e.g vmalloc_to_folio()? >>> >> >> I think you can use page_folio(vmalloc_to_page(..)) to get the folio, >> but one thing to notice is that folio is guaranteed to be a non-tail >> page, so if you want to do something later for the particular page (if >> it's a tail page), you will need to know the offset of the that page in >> folio. You can do something like below: > > This is one of those things which will work today, but will stop > working in the future, and anyway will only appear to work for some > users. > > For example, both vmalloc and slab allocations do not use the refcount > on the struct page for anything. eg this will be a UAF (please excuse > me writing in C): > > char *a = kmalloc(256, GFP_KERNEL); > struct page *page = get_page(virt_to_page(a)); > char *b = page_address(page) + offset_in_page(a); > // a and b will now have the same bit pattern > kfree(a); > *b = 1; > > Once you've called kfree(), slab is entitled to hand that memory out > to any other user of kmalloc(). This might actually work to protect > vmalloc() memory from going away under you, but I intend to change > vmalloc so that it won't work (nothing to do with this patch series, > rather an approach to make vmalloc more efficient). > > One reason you're confused today is that we have a temporary ambiguity > around what "folio" actually means. The original definition (ie mine) was > simply that it was a non-tail page. We're moving towards the definition > Johannes wanted, which is that it's only the memdesc for anonymous & > file-backed memory [1]. So while vmalloc_to_folio() makes sense under > the original definition, it's an absurdity under the new definition. > > So, Abdiel, why are you trying to add this? What are you actually > trying to accomplish in terms of "I am writing a device driver for XXX > and I need to ..."? You've been very evasive up to now. Background behind this is that we need this for the nova rust driver [0]. We need an abstraction of struct page to construct a scatterlist which is needed for an internal firmware structure. Now most of pages needed there come from vmalloc_to_page() which, unlike the current rust Page abstraction, not allocated on demand but is an existing mapping. Hope that clears things up! Regards, Abdiel [0] https://rust-for-linux.com/nova-gpu-driver
On Fri, Nov 22, 2024 at 01:18:28AM +0200, Abdiel Janulgue wrote: > We need an abstraction of struct page to construct a scatterlist which is > needed for an internal firmware structure. Now most of pages needed there > come from vmalloc_to_page() which, unlike the current rust Page abstraction, > not allocated on demand but is an existing mapping. > > Hope that clears things up! That's very helpful! So the lifetime of the scatterllist must not outlive the lifetime of the vmalloc allocation. That means you can call kmap_local_page() on the page in the scatterlist without worrying about the refcount of the struct page. BTW, you can't call page_address() on vmalloc memory because vmalloc can allocate pages from HIGHMEM. Unless you're willing to disable support for 32-bit systems with highmem ...
On Fri, Nov 22, 2024 at 11:24 AM Matthew Wilcox <willy@infradead.org> wrote: > > On Fri, Nov 22, 2024 at 01:18:28AM +0200, Abdiel Janulgue wrote: > > We need an abstraction of struct page to construct a scatterlist which is > > needed for an internal firmware structure. Now most of pages needed there > > come from vmalloc_to_page() which, unlike the current rust Page abstraction, > > not allocated on demand but is an existing mapping. > > > > Hope that clears things up! > > That's very helpful! So the lifetime of the scatterllist must not > outlive the lifetime of the vmalloc allocation. That means you can call > kmap_local_page() on the page in the scatterlist without worrying about > the refcount of the struct page. BTW, you can't call page_address() on > vmalloc memory because vmalloc can allocate pages from HIGHMEM. Unless > you're willing to disable support for 32-bit systems with highmem ... > https://elixir.bootlin.com/linux/v6.11.5/source/drivers/gpu/drm/nouveau/nvkm/core/firmware.c#L266 This is the C code we want to rustify. Dave.
On 11/22/24 07:58, David Airlie wrote: > On Fri, Nov 22, 2024 at 11:24 AM Matthew Wilcox <willy@infradead.org> wrote: >> >> On Fri, Nov 22, 2024 at 01:18:28AM +0200, Abdiel Janulgue wrote: >>> We need an abstraction of struct page to construct a scatterlist which is >>> needed for an internal firmware structure. Now most of pages needed there >>> come from vmalloc_to_page() which, unlike the current rust Page abstraction, >>> not allocated on demand but is an existing mapping. >>> >>> Hope that clears things up! >> >> That's very helpful! So the lifetime of the scatterllist must not >> outlive the lifetime of the vmalloc allocation. That means you can call >> kmap_local_page() on the page in the scatterlist without worrying about >> the refcount of the struct page. BTW, you can't call page_address() on >> vmalloc memory because vmalloc can allocate pages from HIGHMEM. Unless >> you're willing to disable support for 32-bit systems with highmem ... >> > > https://elixir.bootlin.com/linux/v6.11.5/source/drivers/gpu/drm/nouveau/nvkm/core/firmware.c#L266 > > This is the C code we want to rustify. I don't think you want to increase/decrease the refcount there. Instead you tie the lifetime of the returned page to the lifetime of the thing that provides the page, which would be some kind of NvkmFirmware struct. pub enum NvkmFirmwareData { Ram(KBox<[PageSlice]>, Dma(CoherentAllocation<PageSlice>, Sgt(VBox<[PageSlice]>, } pub struct NvkmFirmware { ..., img: NvkmFirmwareData, } pub struct NvkmFirmwarePages<'a> { fw: &'a NvkmFirmware, sgt: SgTable, } impl NvkmFirmware { fn get_sgl(&self) -> NvkmFirmwarePages { ... } } Perhaps a trait that is implemented by both {K,V,KV}Vec<PageSlice> and {K,V,KV}Box<[PageSlice]>, like trait ToComponentPage { fn to_component_page(&self, i: usize) -> &Page; } impl ToComponentPage for KVec<PageSlice> { // same for KBox<[PageSlice]> fn to_component_page(&self, i: usize) -> &Page { let base = &self[i << PAGE_SHIFT..]; bindings::virt_to_page(base.as_ptr()) } } impl ToComponentPage for VVec<PageSlice> { // same for VBox<[PageSlice]> fn to_component_page(&self, i: usize) -> &Page { let base = &self[i << PAGE_SHIFT..]; bindings::vmalloc_to_page(base.as_ptr()) } } ? And possibly also trait ToComponentPageMut { fn to_component_page_mut(&mut self, i: usize) -> &Page; } which would be implemented by the Box types, but not by the Vec types because their data is not pinned. Paolo
On Wed, Nov 20, 2024 at 6:02 PM Matthew Wilcox <willy@infradead.org> wrote: > On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: > > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > We don't have a fully formed destination yet, so I can't give you a > > > > definite answer to a lot of questions. Obviously I don't want to hold > > > > up the Rust project in any way, but I need to know that what we're trying > > > > to do will be expressible in Rust. > > > > > > > > Can we avoid referring to a page's refcount? > > > > > > I don't think this patch needs the refcount at all, and the previous > > > version did not expose it. This came out of the advice to use put_page > > > over free_page. Does this mean that we should switch to put_page but > > > not use get_page? > > Did I advise using put_page() over free_page()? I hope I didn't say > that. I don't see a reason why binder needs to refcount its pages (nor > use a mapcount on them), but I don't fully understand binder so maybe > it does need a refcount. I think that was me, at <https://lore.kernel.org/all/CAG48ez32zWt4mcfA+y2FnzzNmFe-0ns9XQgp=QYeFpRsdiCAnw@mail.gmail.com/>. Looking at the C binder version, binder_install_single_page() installs pages into userspace page tables in a VM_MIXEDMAP mapping using vm_insert_page(), and when you do that with pages from the page allocator, userspace can grab references to them through GUP-fast (and I think also through GUP). (See how vm_insert_page() and vm_get_page_prot() don't use pte_mkspecial(), which is pretty much the only thing that can stop GUP-fast on most architectures.) My understanding is that the combination VM_IO|VM_MIXEDMAP would stop normal GUP, but currently the only way to block GUP-fast is to use VM_PFNMAP. (Which, as far as I understand, is also why GPU drivers use VM_PFNMAP so much.) Maybe we should change that, so that VM_IO and/or VM_MIXEDMAP blocks GUP in the region and causes installed PTEs to be marked with pte_mkspecial()? I am not entirely sure about this stuff, but I was recently looking at net/packet/af_packet.c, and I tested that vmsplice() can grab references to the high-order compound pages that alloc_one_pg_vec_page() allocates with __get_free_pages(GFP_KERNEL | __GFP_COMP | __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY, order), packet_mmap() inserts with vm_insert_page(), and free_pg_vec() drops with free_pages(). (But that all happens to actually work fine, free_pages() actually handles refcounted compound pages properly.)
On Tue, Nov 26, 2024 at 9:31 PM Jann Horn <jannh@google.com> wrote: > On Wed, Nov 20, 2024 at 6:02 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Wed, Nov 20, 2024 at 08:20:16AM -0800, Boqun Feng wrote: > > > On Wed, Nov 20, 2024 at 10:10:44AM +0100, Alice Ryhl wrote: > > > > On Wed, Nov 20, 2024 at 5:57 AM Matthew Wilcox <willy@infradead.org> wrote: > > > > > We don't have a fully formed destination yet, so I can't give you a > > > > > definite answer to a lot of questions. Obviously I don't want to hold > > > > > up the Rust project in any way, but I need to know that what we're trying > > > > > to do will be expressible in Rust. > > > > > > > > > > Can we avoid referring to a page's refcount? > > > > > > > > I don't think this patch needs the refcount at all, and the previous > > > > version did not expose it. This came out of the advice to use put_page > > > > over free_page. Does this mean that we should switch to put_page but > > > > not use get_page? > > > > Did I advise using put_page() over free_page()? I hope I didn't say > > that. I don't see a reason why binder needs to refcount its pages (nor > > use a mapcount on them), but I don't fully understand binder so maybe > > it does need a refcount. > > I think that was me, at > <https://lore.kernel.org/all/CAG48ez32zWt4mcfA+y2FnzzNmFe-0ns9XQgp=QYeFpRsdiCAnw@mail.gmail.com/>. > Looking at the C binder version, binder_install_single_page() installs > pages into userspace page tables in a VM_MIXEDMAP mapping using > vm_insert_page(), and when you do that with pages from the page > allocator, userspace can grab references to them through GUP-fast (and > I think also through GUP). (See how vm_insert_page() and > vm_get_page_prot() don't use pte_mkspecial(), which is pretty much the > only thing that can stop GUP-fast on most architectures.) > > My understanding is that the combination VM_IO|VM_MIXEDMAP would stop > normal GUP, but currently the only way to block GUP-fast is to use > VM_PFNMAP. (Which, as far as I understand, is also why GPU drivers use > VM_PFNMAP so much.) Maybe we should change that, so that VM_IO and/or > VM_MIXEDMAP blocks GUP in the region and causes installed PTEs to be > marked with pte_mkspecial()? > > I am not entirely sure about this stuff, but I was recently looking at > net/packet/af_packet.c, and I tested that vmsplice() can grab > references to the high-order compound pages that > alloc_one_pg_vec_page() allocates with __get_free_pages(GFP_KERNEL | > __GFP_COMP | __GFP_ZERO | __GFP_NOWARN | __GFP_NORETRY, order), > packet_mmap() inserts with vm_insert_page(), and free_pg_vec() drops > with free_pages(). (But that all happens to actually work fine, > free_pages() actually handles refcounted compound pages properly.) And also, the C binder driver wants to free pages in its shrinker callback, but those pages might still be mapped into userspace. Binder tries to zap such userspace mappings, but it does that by absolute virtual address instead of going through the rmap (see binder_alloc_free_page()), so it will miss page mappings in VMAs that have been mremap()'d (though legitimate userspace never does that with binder VMAs) or are concurrently being torn down by munmap(); so currently the thing that keeps this from falling apart is that if page mappings are left over somewhere, the page refcount ensures that this userspace-mapped page doesn't get freed. (I think the C binder code does its job, but is not exactly a great model for how to write a clean driver that integrates nicely with the rest of the kernel.)