Message ID | 20210216102839.1801667-1-george.dunlap@citrix.com (mailing list archive)
---|---
State | New, archived
Series | [DO NOT APPLY] docs: Document allocator properties and the rubric for using them
Hi George, On 16/02/2021 10:28, George Dunlap wrote: > Document the properties of the various allocators and lay out a clear > rubric for when to use each. > > Signed-off-by: George Dunlap <george.dunlap@citrix.com> [...] > +TLDR guidelines > +--------------- > + > +* By default, ``xvmalloc`` (or its helper cognates) should be used > + unless you know you have specific properties that need to be met. > + > +* If you need memory which needs to be physically contiguous, and may > + be larger than ``PAGE_SIZE``... > + > + - ...and is order 2, use ``alloc_xenheap_pages``. > + > + - ...and is not order 2, use ``xmalloc`` (or its helper cognates). > + > +* If you don't need memory to be physically contiguous, and know the > + allocation will always be larger than ``PAGE_SIZE``, you may use > + ``vmalloc`` (or one of its helper cognates). > + > +* If you know that allocation will always be less than ``PAGE_SIZE``, > + you may use ``xmalloc``. AFAICT, the determining factor is PAGE_SIZE. This is a single value on x86 (4KB), but on other architectures it may take multiple values. For instance, on Arm, it could be 4KB, 16KB, or 64KB (note that only the first is so far supported in Xen). For Arm and common code, it feels to me we can't make a clear decision based on PAGE_SIZE. Instead, I continue to think that the decision should only be based on physically vs virtually contiguous. We can then add further rules for x86-specific code if the maintainers want. Cheers,
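To make the page-size variability concrete, consider this minimal C sketch (the structure and its size are invented for illustration; ``xmalloc`` is Xen's typed-allocation macro)::

    /* Hypothetical structure of ~6000 bytes: larger than a 4KB page,
     * but smaller than a 16KB or 64KB one. */
    struct example_state {
        unsigned long words[750];   /* 6000 bytes on a 64-bit build */
    };

    struct example_state *alloc_example(void)
    {
        /*
         * Common code cannot classify this allocation by PAGE_SIZE:
         * with 4KB pages it is a multi-page allocation, while with
         * 16KB or 64KB pages it fits in a single page.  Only the
         * requirement for physical vs virtual contiguity is invariant
         * across configurations.
         */
        return xmalloc(struct example_state);
    }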
> On Feb 16, 2021, at 10:28 AM, George Dunlap <george.dunlap@citrix.com> wrote: > > Document the properties of the various allocators and lay out a clear > rubric for when to use each. [...] > +``xvmalloc`` is like ``xmalloc``, except that for allocations > > +``PAGE_SIZE``, it calls ``vmalloc`` instead. Thus ``xvmalloc`` should > +always be preferred unless: > + > +1. You need physically contiguous memory, and your size may end up > + greater than ``PAGE_SIZE``; in which case you should use > + ``xmalloc`` or ``alloc_xenheap_pages`` as appropriate > + > +2. You are positive that ``xvmalloc`` will choose one specific > + underlying implementation; in which case you should simply call > + that implementation directly. Basically, the more I look at this whole thing — particularly the fact that xmalloc already has an `if ( size > PAGE_SIZE )` check inside it — the more I think this last point is just a waste of everyone’s time. I’m inclined to go with Julien’s suggestion: use xmalloc when we need physically contiguous memory (with a comment), and xvmalloc everywhere else.
We can implement xvmalloc such that it’s no slower than xmalloc is currently (i.e., it directly calls `xmem_pool_alloc` when size < PAGE_SIZE, rather than calling xmalloc and having xmalloc do the comparison again). -George
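As a minimal sketch, assuming the current internal names (``xmem_pool_alloc()`` and its ``xenpool`` pool) and glossing over the alignment and block-header handling the real allocator performs, the dispatch George describes could look like this; it is an illustration, not the actual xvmalloc implementation::

    void *xvmalloc_bytes(size_t size)
    {
        /* Sub-page: go straight to the TLSF pool allocator, skipping
         * the size comparison xmalloc would otherwise repeat. */
        if ( size < PAGE_SIZE )
            return xmem_pool_alloc(size, xenpool);

        /* Above a page: physical contiguity is not promised, so map
         * individually allocated pages instead. */
        return vmalloc(size);
    }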
> On Feb 16, 2021, at 10:55 AM, Julien Grall <julien@xen.org> wrote: > > Hi George, > > On 16/02/2021 10:28, George Dunlap wrote: [...] > > AFAICT, the determining factor is PAGE_SIZE. This is a single value on x86 (4KB), but on other architectures it may take multiple values. > > For instance, on Arm, it could be 4KB, 16KB, or 64KB (note that only the first is so far supported in Xen). > > For Arm and common code, it feels to me we can't make a clear decision based on PAGE_SIZE. Instead, I continue to think that the decision should only be based on physically vs virtually contiguous. > > We can then add further rules for x86-specific code if the maintainers want. Sorry my second mail was somewhat delayed — my intent was: 1) post the document I’d agreed to write, 2) say why I think the proposal is a bad idea. :-) Re page size — the vast majority of the time we’re talking about “knowing” that the size is less than 4k. If we’re confident that no architecture will ever have a page size less than 4k, then we know that all allocations less than 4k will always be less than PAGE_SIZE. Obviously larger page sizes then become an issue. But in any case — unless we have BUG_ON(size > PAGE_SIZE), we’re going to have to have a fallback, which is going to cost one precious conditional, making the whole exercise pointless. -George
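For contrast, the no-fallback alternative being argued against would amount to something like the following (the helper name is hypothetical); the BUG_ON turns a wrong size assumption into a crash instead of a slower allocation, which is why a fallback path is hard to avoid::

    void *xmalloc_subpage(size_t size)
    {
        /* Contract, not a fallback: the caller must guarantee the
         * bound for every possible PAGE_SIZE. */
        BUG_ON(size > PAGE_SIZE);
        return xmem_pool_alloc(size, xenpool);
    }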
> On Feb 16, 2021, at 11:16 AM, George Dunlap <george.dunlap@citrix.com> wrote: > >> On Feb 16, 2021, at 10:55 AM, Julien Grall <julien@xen.org> wrote: [...] >> For Arm and common code, it feels to me we can't make a clear decision based on PAGE_SIZE. Instead, I continue to think that the decision should only be based on physically vs virtually contiguous. > > Re page size — the vast majority of the time we’re talking about “knowing” that the size is less than 4k. If we’re confident that no architecture will ever have a page size less than 4k, then we know that all allocations less than 4k will always be less than PAGE_SIZE. Obviously larger page sizes then become an issue. > > But in any case — unless we have BUG_ON(size > PAGE_SIZE), we’re going to have to have a fallback, which is going to cost one precious conditional, making the whole exercise pointless. Er, just in case it wasn’t clear — I agree with this: >> I continue to think that the decision should only be based on physically vs virtually contiguous. -George
On 16.02.2021 11:28, George Dunlap wrote: > --- /dev/null > +++ b/docs/hypervisor-guide/memory-allocation-functions.rst > @@ -0,0 +1,118 @@ > +.. SPDX-License-Identifier: CC-BY-4.0 > + > +Xenheap memory allocation functions > +=================================== > + > +In general Xen contains two pools (or "heaps") of memory: the *xen > +heap* and the *dom heap*. Please see the comment at the top of > +``xen/common/page_alloc.c`` for the canonical explanation. > + > +This document describes the various functions available to allocate > +memory from the xen heap: their properties and rules for when they should be > +used. Irrespective of your subsequent indication of you disliking the proposal (which I understand only affects the guidelines further down anyway) I'd like to point out that vmalloc() does not allocate from the Xen heap. Therefore a benefit of always recommending use of xvmalloc() would be that the function could fall back to vmalloc() (and hence the larger domain heap) when xmalloc() failed. > +TLDR guidelines > +--------------- > + > +* By default, ``xvmalloc`` (or its helper cognates) should be used > + unless you know you have specific properties that need to be met. > + > +* If you need memory which needs to be physically contiguous, and may > + be larger than ``PAGE_SIZE``... > + > + - ...and is order 2, use ``alloc_xenheap_pages``. > + > + - ...and is not order 2, use ``xmalloc`` (or its helper cognates).. ITYM "an exact power of 2 number of pages"? > +* If you don't need memory to be physically contiguous, and know the > + allocation will always be larger than ``PAGE_SIZE``, you may use > + ``vmalloc`` (or one of its helper cognates). > + > +* If you know that allocation will always be less than ``PAGE_SIZE``, > + you may use ``xmalloc``. As per Julien's and your own replies, this wants to be "minimum possible page size", which of course depends on where in the tree the piece of code is to live. (It would be "maximum possible page size" in the earlier paragraph.) > +Properties of various allocation functions > +------------------------------------------ > + > +Ultimately, the underlying allocator for all of these functions is > +``alloc_xenheap_pages``. They differ on several different properties: > + > +1. What underlying allocation sizes are. This in turn has an effect > + on: > + > + - How much memory is wasted when requested size doesn't match > + > + - How such allocations are affected by memory fragmentation > + > + - How such allocations affect memory fragmentation > + > +2. Whether the underlying pages are physically contiguous > + > +3. Whether allocation and deallocation require the cost of mapping and > + unmapping > + > +``alloc_xenheap_pages`` will allocate a physically contiguous set of > +pages on orders of 2. No mapping or unmapping is done. That's the case today, but meant to change rather sooner than later (when the 1:1 map disappears). Jan
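The fallback Jan describes could be sketched as follows (the wrapper name and shape are invented here; the real series may differ, and the matching free routine would have to work out which path was taken)::

    static void *alloc_with_fallback(size_t size)
    {
        /* Try the Xen heap first: no mapping cost, physically
         * contiguous result. */
        void *p = xmalloc_bytes(size);

        /* On failure, retry via vmalloc(), whose pages come from the
         * (larger) domain heap and need only be virtually contiguous. */
        return p ?: vmalloc(size);
    }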
Hi George, On 16/02/2021 11:17, George Dunlap wrote: >> On Feb 16, 2021, at 11:16 AM, George Dunlap <george.dunlap@citrix.com> wrote: >>> On Feb 16, 2021, at 10:55 AM, Julien Grall <julien@xen.org> wrote: [...] >>> For Arm and common code, it feels to me we can't make a clear decision based on PAGE_SIZE. Instead, I continue to think that the decision should only be based on physically vs virtually contiguous. >>> We can then add further rules for x86-specific code if the maintainers want. >> Sorry my second mail was somewhat delayed — my intent was: 1) post the document I’d agreed to write, 2) say why I think the proposal is a bad idea. :-) No worries, I jumped into the discussion too quickly :). >> Re page size — the vast majority of the time we’re talking about “knowing” that the size is less than 4k. If we’re confident that no architecture will ever have a page size less than 4k, then we know that all allocations less than 4k will always be less than PAGE_SIZE. Obviously larger page sizes then become an issue. >> But in any case — unless we have BUG_ON(size > PAGE_SIZE), we’re going to have to have a fallback, which is going to cost one precious conditional, making the whole exercise pointless. > Er, just in case it wasn’t clear — I agree with this: >>> I continue to think that the decision should only be based on physically vs virtually contiguous. We have two opposing proposals with no clear way to reconcile them. Should we request a vote on the two proposals? Cheers,
> On Mar 6, 2021, at 8:03 PM, Julien Grall <julien@xen.org> wrote: > > Hi George, > > On 16/02/2021 11:17, George Dunlap wrote: [...] >>> But in any case — unless we have BUG_ON(size > PAGE_SIZE), we’re going to have to have a fallback, which is going to cost one precious conditional, making the whole exercise pointless. >> Er, just in case it wasn’t clear — I agree with this: >>>> I continue to think that the decision should only be based on physically vs virtually contiguous. > > We have two opposing proposals with no clear way to reconcile them. Should we request a vote on the two proposals? Let me write up an alternate proposal with Jan’s feedback; then if he still thinks his way is better we can vote. -George
> On Feb 16, 2021, at 3:29 PM, Jan Beulich <JBeulich@suse.com> wrote: > > On 16.02.2021 11:28, George Dunlap wrote: >> --- /dev/null >> +++ b/docs/hypervisor-guide/memory-allocation-functions.rst [...] >> +This document describes the various functions available to allocate >> +memory from the xen heap: their properties and rules for when they should be >> +used. > > Irrespective of your subsequent indication of you disliking the > proposal (which I understand only affects the guidelines further > down anyway) I'd like to point out that vmalloc() does not > allocate from the Xen heap. Therefore a benefit of always > recommending use of xvmalloc() would be that the function could > fall back to vmalloc() (and hence the larger domain heap) when > xmalloc() failed. OK, that’s good to know. So just trying to think this through: address space is the limiting factor for how big the xenheap can be, right? Presumably “vmap” space is also limited, and will be much smaller? So in a sense the “fallback” is less about getting more memory, and more about using up that extra little bit of virtual address space? Another question this raises: are there times when it’s advantageous to specify which heap to allocate from? If there are good reasons for allocations to be in the xenheap or in the domheap / vmap area, then the guidelines should probably say that as well. And, of course, will the whole concept of the xenheap / domheap split go away if we ever get rid of the 1:1 map? >> +TLDR guidelines >> +--------------- [...] >> + - ...and is order 2, use ``alloc_xenheap_pages``. >> + >> + - ...and is not order 2, use ``xmalloc`` (or its helper cognates). > > ITYM "an exact power of 2 number of pages"? Yes, I’ll fix that. >> +* If you don't need memory to be physically contiguous, and know the >> + allocation will always be larger than ``PAGE_SIZE``, you may use >> + ``vmalloc`` (or one of its helper cognates). >> + >> +* If you know that allocation will always be less than ``PAGE_SIZE``, >> + you may use ``xmalloc``. > > As per Julien's and your own replies, this wants to be "minimum > possible page size", which of course depends on where in the > tree the piece of code is to live. (It would be "maximum > possible page size" in the earlier paragraph.) I’ll see if I can clarify this. [...] >> +``alloc_xenheap_pages`` will allocate a physically contiguous set of >> +pages on orders of 2. No mapping or unmapping is done. > > That's the case today, but meant to change rather sooner than later > (when the 1:1 map disappears). Is that the kind of thing we want to add into this document? I suppose it would be good to make the guidelines now such that they produce code which is as easy as possible to adapt to the new way of doing things. -George
> On Mar 12, 2021, at 2:32 PM, George Dunlap <george.dunlap@citrix.com> wrote: > >> On Feb 16, 2021, at 3:29 PM, Jan Beulich <JBeulich@suse.com> wrote: [...] >> As per Julien's and your own replies, this wants to be "minimum >> possible page size", which of course depends on where in the >> tree the piece of code is to live. (It would be "maximum >> possible page size" in the earlier paragraph.) > > I’ll see if I can clarify this. I think the only way to actually make this clear would be to set specific values for “minimum possible PAGE_SIZE” and “maximum possible PAGE_SIZE” — values past which the maintainers of the architecture are happy to do some sort of audit if PAGE_SIZE ever exceeds them. -George
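As a sketch of that idea (the constant names and values are invented; the real bounds would be whatever the maintainers agree to audit), the promise could be encoded at build time::

    /* Hypothetical audited bounds: any port exceeding them fails to
     * build until the size-based allocation rules are re-audited. */
    #define MIN_POSSIBLE_PAGE_SIZE 4096
    #define MAX_POSSIBLE_PAGE_SIZE (64 << 10)

    static inline void check_page_size_bounds(void)
    {
        BUILD_BUG_ON(PAGE_SIZE < MIN_POSSIBLE_PAGE_SIZE);
        BUILD_BUG_ON(PAGE_SIZE > MAX_POSSIBLE_PAGE_SIZE);
    }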
On 12.03.2021 15:32, George Dunlap wrote: >> On Feb 16, 2021, at 3:29 PM, Jan Beulich <JBeulich@suse.com> wrote: >> On 16.02.2021 11:28, George Dunlap wrote: [...] >> Irrespective of your subsequent indication of you disliking the >> proposal (which I understand only affects the guidelines further >> down anyway) I'd like to point out that vmalloc() does not >> allocate from the Xen heap. Therefore a benefit of always >> recommending use of xvmalloc() would be that the function could >> fall back to vmalloc() (and hence the larger domain heap) when >> xmalloc() failed. > > OK, that’s good to know. > > So just trying to think this through: address space is the limiting factor for how big the xenheap can be, right? Yes, with the current direct-map model only memory which has a permanent mapping can be "Xen heap". Obviously, for the mapping to be permanent, its VA range needs to be set up front (at build time in reality). FAOD the distinction (at least on x86) matters only on systems with a lot of memory. > Presumably “vmap” space is also limited, and will be much smaller? Yes and yes, albeit for the 2nd one I'd like to add "currently", because once we do away with the direct map, I'd envision using all the VA space for such on-demand mapping purposes. > So in a sense the “fallback” is less about getting more memory, and more about using up that extra little bit of virtual address space? Not really, no. If no memory is left on the Xen heap, there may still be some left on the domain heap. Falling back could also be the other way around, yes - if we've run out of vmalloc() address space, we may still have a chance to find the requested space in the Xen heap. > Another question this raises: are there times when it’s advantageous to specify which heap to allocate from? If there are good reasons for allocations to be in the xenheap or in the domheap / vmap area, then the guidelines should probably say that as well. I can't think of such reasons (beyond ones already named, like e.g. wanting to avoid mapping overhead), but I agree that if there are any, mentioning them would be desirable. > And, of course, will the whole concept of the xenheap / domheap split go away if we ever get rid of the 1:1 map? I expect so, yes. [...] >>> +``alloc_xenheap_pages`` will allocate a physically contiguous set of >>> +pages on orders of 2. No mapping or unmapping is done. >> That's the case today, but meant to change rather sooner than later >> (when the 1:1 map disappears). > Is that the kind of thing we want to add into this document? Not sure what to answer here - my intention with raising the point was ... > I suppose it would be good to make the guidelines now such > that they produce code which is as easy as possible to adapt > to the new way of doing things. ... precisely this. Jan
diff --git a/docs/hypervisor-guide/memory-allocation-functions.rst b/docs/hypervisor-guide/memory-allocation-functions.rst
new file mode 100644
index 0000000000..15aa2a1a65
--- /dev/null
+++ b/docs/hypervisor-guide/memory-allocation-functions.rst
@@ -0,0 +1,118 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Xenheap memory allocation functions
+===================================
+
+In general Xen contains two pools (or "heaps") of memory: the *xen
+heap* and the *dom heap*. Please see the comment at the top of
+``xen/common/page_alloc.c`` for the canonical explanation.
+
+This document describes the various functions available to allocate
+memory from the xen heap: their properties and rules for when they should be
+used.
+
+
+TLDR guidelines
+---------------
+
+* By default, ``xvmalloc`` (or its helper cognates) should be used
+  unless you know you have specific properties that need to be met.
+
+* If you need memory which needs to be physically contiguous, and may
+  be larger than ``PAGE_SIZE``...
+
+  - ...and is order 2, use ``alloc_xenheap_pages``.
+
+  - ...and is not order 2, use ``xmalloc`` (or its helper cognates).
+
+* If you don't need memory to be physically contiguous, and know the
+  allocation will always be larger than ``PAGE_SIZE``, you may use
+  ``vmalloc`` (or one of its helper cognates).
+
+* If you know that allocation will always be less than ``PAGE_SIZE``,
+  you may use ``xmalloc``.
+
+Properties of various allocation functions
+------------------------------------------
+
+Ultimately, the underlying allocator for all of these functions is
+``alloc_xenheap_pages``. They differ on several different properties:
+
+1. What underlying allocation sizes are. This in turn has an effect
+   on:
+
+   - How much memory is wasted when requested size doesn't match
+
+   - How such allocations are affected by memory fragmentation
+
+   - How such allocations affect memory fragmentation
+
+2. Whether the underlying pages are physically contiguous
+
+3. Whether allocation and deallocation require the cost of mapping and
+   unmapping
+
+``alloc_xenheap_pages`` will allocate a physically contiguous set of
+pages on orders of 2. No mapping or unmapping is done. However, if
+this is used for sizes not close to ``PAGE_SIZE * (1 << n)``, a lot of
+space will be wasted. Such allocations may fail if the memory becomes
+very fragmented; but such allocations do not tend to contribute to
+that memory fragmentation much.
+
+As such, ``alloc_xenheap_pages`` should be used when you need a size
+of exactly ``PAGE_SIZE * (1 << n)`` physically contiguous pages.
+
+``xmalloc`` is actually two separate allocators. Allocations of <
+``PAGE_SIZE`` are handled using ``xmem_pool_alloc()``, and allocations >=
+``PAGE_SIZE`` are handled using ``xmalloc_whole_pages()``.
+
+``xmem_pool_alloc()`` is a pool allocator which allocates xenheap
+pages on demand as needed. This is ideal for small, quick
+allocations: no pages are mapped or unmapped; sub-page allocations are
+expected, and so a minimum of space is wasted; and because xenheap
+pages are allocated one-by-one, 1) they are unlikely to fail unless
+Xen is genuinely out of memory, and 2) it doesn't have a major effect
+on fragmentation of memory.
+
+Allocations of > ``PAGE_SIZE`` are not possible with the pool
+allocator, so for such sizes, ``xmalloc`` calls
+``xmalloc_whole_pages()``, which in turn calls ``alloc_xenheap_pages``
+with an order large enough to satisfy the request, and then frees all
+the pages which aren't used.
+
+Like the other allocator, this incurs no mapping or unmapping
+overhead. Allocations will be physically contiguous (like
+``alloc_xenheap_pages``), but not as much is wasted as a plain
+``alloc_xenheap_pages`` allocation. However, such an allocation may
+fail if memory is fragmented to the point that a contiguous allocation
+of the appropriate size cannot be found; such allocations also tend to
+fragment memory more.
+
+As such, ``xmalloc`` may be called in cases where you know the
+allocation will be less than ``PAGE_SIZE``; or when you need a
+physically contiguous allocation which may be more than
+``PAGE_SIZE``.
+
+``vmalloc`` will allocate pages one-by-one and map them into a virtual
+memory area designated for the purpose, separated by a guard page.
+Only full pages are allocated, so using it for less than
+``PAGE_SIZE`` allocations is wasteful. The underlying memory will not
+be physically contiguous. As such, it is not adversely affected by
+excessive system fragmentation, nor does it contribute to it.
+However, allocating and freeing requires a map and unmap operation
+respectively, both of which adversely affect system performance.
+
+Therefore, ``vmalloc`` should be used for allocations larger than a
+page which don't need to be physically contiguous.
+
+``xvmalloc`` is like ``xmalloc``, except that for allocations >
+``PAGE_SIZE``, it calls ``vmalloc`` instead. Thus ``xvmalloc`` should
+always be preferred unless:
+
+1. You need physically contiguous memory, and your size may end up
+   greater than ``PAGE_SIZE``; in which case you should use
+   ``xmalloc`` or ``alloc_xenheap_pages`` as appropriate
+
+2. You are positive that ``xvmalloc`` will choose one specific
+   underlying implementation; in which case you should simply call
+   that implementation directly.
Document the properties of the various allocators and lay out a clear
rubric for when to use each.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---

This doc is my understanding of the properties of the current
allocators (alloc_xenheap_pages, xmalloc, and vmalloc), and of Jan's
proposed new wrapper, xvmalloc.

xmalloc, vmalloc, and xvmalloc were designed more or less to mirror
similar functions in Linux (kmalloc, vmalloc, and kvmalloc
respectively).

CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Roger Pau Monne <roger.pau@citrix.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Julien Grall <julien@xen.org>
---
 .../memory-allocation-functions.rst | 118 ++++++++++++++++++
 1 file changed, 118 insertions(+)
 create mode 100644 docs/hypervisor-guide/memory-allocation-functions.rst
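For reference, a hedged sketch of the TLDR rubric above at hypothetical call sites (the structure and sizes are invented; ``xvzalloc`` is assumed from the proposed xvmalloc series; error handling and the matching free_xenheap_pages()/xfree()/vfree() calls are omitted for brevity)::

    struct vcpu_stats {
        unsigned long counters[32];   /* invented example payload */
    };

    void rubric_examples(void)
    {
        /* Default: no special requirements, so the xvmalloc family
         * (typed, zeroing variant assumed from the proposed series). */
        struct vcpu_stats *stats = xvzalloc(struct vcpu_stats);

        /* Exactly 2^n physically contiguous pages: order 2 == 4 pages;
         * the second argument is memflags. */
        void *ring = alloc_xenheap_pages(2, 0);

        /* Physically contiguous, possibly larger than PAGE_SIZE, not a
         * power-of-two number of pages: xmalloc (array helper shown). */
        uint8_t *buf = xmalloc_array(uint8_t, 3 * PAGE_SIZE / 2);

        /* Never needs physical contiguity and is always larger than
         * PAGE_SIZE: vmalloc. */
        void *table = vmalloc(8 * PAGE_SIZE);
    }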