Message ID | 20240726152206.28411-5-roger.pau@citrix.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | x86: adventures in Address Space Isolation | expand |
On 26.07.2024 17:21, Roger Pau Monne wrote: > The idle_pg_table L4 is cloned to create all the other L4 Xen uses, and hence > it shouldn't be modified once further L4 are created. Yes, but the window between moving into SYS_STATE_smp_boot and Dom0 having its initial page tables created is quite large. If the justification was relative to AP bringup, that may be all fine. But when related to cloning, I think that would then truly want keying to there being any non-system domain(s). > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -5023,6 +5023,12 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v) > mfn_t l3mfn; > l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn); > > + /* > + * dom0 is build at smp_boot, at which point we already create new L4s > + * based on idle_pg_table. > + */ > + BUG_ON(system_state >= SYS_STATE_smp_boot); Which effectively means most of this function could become __init (e.g. by moving into a helper). We'd then hit the BUG_ON() prior to init_done() destroying the .init.* mappings, and we'd simply #PF afterwards. That's not so much for the space savings in .text, but to document the limited lifetime of the (helper) function directly in its function head. I further wonder whether in such a case the enclosing if() wouldn't want to gain unlikely() at the same time. Jan
On Tue, Aug 13, 2024 at 05:54:54PM +0200, Jan Beulich wrote: > On 26.07.2024 17:21, Roger Pau Monne wrote: > > The idle_pg_table L4 is cloned to create all the other L4 Xen uses, and hence > > it shouldn't be modified once further L4 are created. > > Yes, but the window between moving into SYS_STATE_smp_boot and Dom0 having > its initial page tables created is quite large. If the justification was > relative to AP bringup, that may be all fine. But when related to cloning, > I think that would then truly want keying to there being any non-system > domain(s). Further changes in this series will add a per-CPU idle page table, and hence we need to ensure that by the time APs are started the BSP L4 idle page directory is not changed, as otherwise the copies in the APs would get out of sync. The idle system domain is indeed tied to the idle page talbes, but the idle vCPU0 (the BSP) directly uses idle_pg_table (no copying), and hence it's fine to allow modifications of the L4 idle page table directory up to when APs are started (those will indeed make copies of the idle L4. > > --- a/xen/arch/x86/mm.c > > +++ b/xen/arch/x86/mm.c > > @@ -5023,6 +5023,12 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v) > > mfn_t l3mfn; > > l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn); > > > > + /* > > + * dom0 is build at smp_boot, at which point we already create new L4s > > + * based on idle_pg_table. > > + */ > > + BUG_ON(system_state >= SYS_STATE_smp_boot); > > Which effectively means most of this function could become __init (e.g. by > moving into a helper). We'd then hit the BUG_ON() prior to init_done() > destroying the .init.* mappings, and we'd simply #PF afterwards. That's > not so much for the space savings in .text, but to document the limited > lifetime of the (helper) function directly in its function head. IMO the BUG_ON() is clearer to debug, but I won't mind splitting the logic inside the if body into a separate helper. > I further wonder whether in such a case the enclosing if() wouldn't want > to gain unlikely() at the same time. Yes, I can certainly add that. Thanks, Roger.
On 10.09.2024 10:54, Roger Pau Monné wrote: > On Tue, Aug 13, 2024 at 05:54:54PM +0200, Jan Beulich wrote: >> On 26.07.2024 17:21, Roger Pau Monne wrote: >>> The idle_pg_table L4 is cloned to create all the other L4 Xen uses, and hence >>> it shouldn't be modified once further L4 are created. >> >> Yes, but the window between moving into SYS_STATE_smp_boot and Dom0 having >> its initial page tables created is quite large. If the justification was >> relative to AP bringup, that may be all fine. But when related to cloning, >> I think that would then truly want keying to there being any non-system >> domain(s). > > Further changes in this series will add a per-CPU idle page table, and > hence we need to ensure that by the time APs are started the BSP L4 idle > page directory is not changed, as otherwise the copies in the APs > would get out of sync. > > The idle system domain is indeed tied to the idle page talbes, but the > idle vCPU0 (the BSP) directly uses idle_pg_table (no copying), and > hence it's fine to allow modifications of the L4 idle page table > directory up to when APs are started (those will indeed make copies of > the idle L4. Which may want at least mentioning in the description then. I take it that ... >>> --- a/xen/arch/x86/mm.c >>> +++ b/xen/arch/x86/mm.c >>> @@ -5023,6 +5023,12 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v) >>> mfn_t l3mfn; >>> l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn); >>> >>> + /* >>> + * dom0 is build at smp_boot, at which point we already create new L4s >>> + * based on idle_pg_table. >>> + */ ... this comment is then refined by the later patches you refer to? >>> + BUG_ON(system_state >= SYS_STATE_smp_boot); >> >> Which effectively means most of this function could become __init (e.g. by >> moving into a helper). We'd then hit the BUG_ON() prior to init_done() >> destroying the .init.* mappings, and we'd simply #PF afterwards. That's >> not so much for the space savings in .text, but to document the limited >> lifetime of the (helper) function directly in its function head. > > IMO the BUG_ON() is clearer to debug, Fair point - it's indeed a balance between two possible goals. I guess ... > but I won't mind splitting the > logic inside the if body into a separate helper. ... simply keep it as you have it. Jan
On Tue, Sep 10, 2024 at 11:00:27AM +0200, Jan Beulich wrote: > On 10.09.2024 10:54, Roger Pau Monné wrote: > > On Tue, Aug 13, 2024 at 05:54:54PM +0200, Jan Beulich wrote: > >> On 26.07.2024 17:21, Roger Pau Monne wrote: > >>> The idle_pg_table L4 is cloned to create all the other L4 Xen uses, and hence > >>> it shouldn't be modified once further L4 are created. > >> > >> Yes, but the window between moving into SYS_STATE_smp_boot and Dom0 having > >> its initial page tables created is quite large. If the justification was > >> relative to AP bringup, that may be all fine. But when related to cloning, > >> I think that would then truly want keying to there being any non-system > >> domain(s). > > > > Further changes in this series will add a per-CPU idle page table, and > > hence we need to ensure that by the time APs are started the BSP L4 idle > > page directory is not changed, as otherwise the copies in the APs > > would get out of sync. > > > > The idle system domain is indeed tied to the idle page talbes, but the > > idle vCPU0 (the BSP) directly uses idle_pg_table (no copying), and > > hence it's fine to allow modifications of the L4 idle page table > > directory up to when APs are started (those will indeed make copies of > > the idle L4. > > Which may want at least mentioning in the description then. I take it > that ... > > >>> --- a/xen/arch/x86/mm.c > >>> +++ b/xen/arch/x86/mm.c > >>> @@ -5023,6 +5023,12 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v) > >>> mfn_t l3mfn; > >>> l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn); > >>> > >>> + /* > >>> + * dom0 is build at smp_boot, at which point we already create new L4s > >>> + * based on idle_pg_table. > >>> + */ > > ... this comment is then refined by the later patches you refer to? Hm, I would have to double check, not sure I've updated it once the idle_pg_table is cloned for AP bringup. Will expand commit message and update the comment here if not done already by later patches. Thanks, Roger.
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index 6ffacab341ad..01380fd82c9d 100644 --- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -5023,6 +5023,12 @@ static l3_pgentry_t *virt_to_xen_l3e(unsigned long v) mfn_t l3mfn; l3_pgentry_t *l3t = alloc_mapped_pagetable(&l3mfn); + /* + * dom0 is build at smp_boot, at which point we already create new L4s + * based on idle_pg_table. + */ + BUG_ON(system_state >= SYS_STATE_smp_boot); + if ( !l3t ) return NULL; UNMAP_DOMAIN_PAGE(l3t);
The idle_pg_table L4 is cloned to create all the other L4 Xen uses, and hence it shouldn't be modified once further L4 are created. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- xen/arch/x86/mm.c | 6 ++++++ 1 file changed, 6 insertions(+)