Message ID | d91b5315-a5bb-a6ee-c9bb-58974c733a4e@suse.com (mailing list archive)
---|---
State | Superseded
Series | x86/shadow: misc tidying
On 11.01.2023 14:52, Jan Beulich wrote:
> Rather than doing a separate hash walk (and then even using the vCPU
> variant, which is to go away), do the up-pointer-clearing right in
> sh_unpin(), as an alternative to the (now further limited) enlisting on
> a "free floating" list fragment. This utilizes the fact that such list
> fragments are traversed only for multi-page shadows (in shadow_free()).

I've added mention of sh_next_page() here as well. Not sure how I managed
to miss that, but this doesn't change the reasoning.

Jan
On Wed, Jan 11, 2023 at 1:52 PM Jan Beulich <jbeulich@suse.com> wrote:
> Rather than doing a separate hash walk (and then even using the vCPU
> variant, which is to go away), do the up-pointer-clearing right in
> sh_unpin(), as an alternative to the (now further limited) enlisting on
> a "free floating" list fragment. This utilizes the fact that such list
> fragments are traversed only for multi-page shadows (in shadow_free()).
> Furthermore sh_terminate_list() is a safeguard only anyway, which isn't
> in use in the common case (it actually does anything only for BIGMEM
> configurations).

One thing that seems strange about this patch is that you're essentially
adding a field to the domain shadow struct in lieu of adding another
argument to sh_unpin() (unless the bit is referenced elsewhere in
subsequent patches, which I haven't reviewed, in part because about half
of them don't apply cleanly to the current tree).

 -George
On 20.01.2023 18:02, George Dunlap wrote:
> On Wed, Jan 11, 2023 at 1:52 PM Jan Beulich <jbeulich@suse.com> wrote:
>
>> Rather than doing a separate hash walk (and then even using the vCPU
>> variant, which is to go away), do the up-pointer-clearing right in
>> sh_unpin(), as an alternative to the (now further limited) enlisting on
>> a "free floating" list fragment. This utilizes the fact that such list
>> fragments are traversed only for multi-page shadows (in shadow_free()).
>> Furthermore sh_terminate_list() is a safeguard only anyway, which isn't
>> in use in the common case (it actually does anything only for BIGMEM
>> configurations).
>
> One thing that seems strange about this patch is that you're essentially
> adding a field to the domain shadow struct in lieu of adding another
> argument to sh_unpin() (unless the bit is referenced elsewhere in
> subsequent patches, which I haven't reviewed, in part because about half
> of them don't apply cleanly to the current tree).

Well, to me adding another parameter to sh_unpin() would have looked odd;
the new field looks slightly cleaner to me. But changing that is merely a
matter of taste, so if you and e.g. Andrew think that approach was better,
I could switch to that. And no, I don't foresee further uses of the field.

As to half of the patches not applying: Some were already applied out of
order, and others therefore need re-basing slightly. Till now I saw no
reason to re-send the remaining patches just for that.

Jan
On Mon, Jan 23, 2023 at 8:41 AM Jan Beulich <jbeulich@suse.com> wrote:
> On 20.01.2023 18:02, George Dunlap wrote:
> > On Wed, Jan 11, 2023 at 1:52 PM Jan Beulich <jbeulich@suse.com> wrote:
> >
> >> Rather than doing a separate hash walk (and then even using the vCPU
> >> variant, which is to go away), do the up-pointer-clearing right in
> >> sh_unpin(), as an alternative to the (now further limited) enlisting on
> >> a "free floating" list fragment. This utilizes the fact that such list
> >> fragments are traversed only for multi-page shadows (in shadow_free()).
> >> Furthermore sh_terminate_list() is a safeguard only anyway, which isn't
> >> in use in the common case (it actually does anything only for BIGMEM
> >> configurations).
> >
> > One thing that seems strange about this patch is that you're essentially
> > adding a field to the domain shadow struct in lieu of adding another
> > argument to sh_unpin() (unless the bit is referenced elsewhere in
> > subsequent patches, which I haven't reviewed, in part because about half
> > of them don't apply cleanly to the current tree).
>
> Well, to me adding another parameter to sh_unpin() would have looked odd;
> the new field looks slightly cleaner to me. But changing that is merely a
> matter of taste, so if you and e.g. Andrew think that approach was better,
> I could switch to that. And no, I don't foresee further uses of the field.

You're about to call sh_unpin(), and you want to tell that function to
change its behavior. What's so odd about adding an argument to the
function to indicate the behavior? Instead you're adding a bit of global
state which is carried around 100% of the time, even when that function
isn't being called. That's not what people normally expect; it makes the
code harder to reason about.

It would certainly be ugly to have to add "false" to every other instance
of sh_unpin; but the normal way you get around that is to redefine
sh_unpin() as a wrapper which calls the other function with the 'false'
argument set.

You asked me to review this for a second opinion on the safety of clearing
the up-pointer this way, not because you need an ack; so I don't really
want to block the patch for non-functional reasons. But I think this is
one of the "death by a thousand cuts" that makes the shadow code more
fragile and difficult for new people to approach and understand.

Re the original question: I've stared at the code for a bit now, and I
can't see anything obviously wrong or dangerous about it.

But it does make me ask, why do we need the "unpinning_l3" pseudo-argument
at all? Is there any reason not to unconditionally zero out sp->up when we
find a head_type of SH_type_l3_64_shadow? As far as I can tell, sp->list
doesn't require any special state. Why do we make the effort to leave it
alone when we're not unpinning all l3s?

In fact, is there a way to unpin an l3 shadow *other* than when we're
unpinning all l3's? If so, then this patch, as written, is broken -- the
original code clears the up-pointer for *all* L3_64 shadows, regardless of
whether they're on the pinned list; the new patch will only clear the ones
on the pinned list. But unconditionally clearing sp->up could actually fix
that. Thoughts?

> As to half of the patches not applying: Some were already applied out of
> order, and others therefore need re-basing slightly. Till now I saw no
> reason to re-send the remaining patches just for that.

Sorry if that sounded like complaining; I was only being preemptively
defensive against the potential accusation that the answer would have been
obvious if I'd just continued reviewing the series. :-). (And indeed if
the whole series had applied I would have checked that the final result
didn't have any other references to it.)

 -George
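For what it's worth, here is a minimal standalone sketch of the wrapper
pattern George describes above. The helper name sh_unpin_flags and the
simplified stand-in types are hypothetical, not the real Xen definitions;
this only illustrates how existing one-argument call sites stay unchanged:

#include <stdbool.h>
#include <stdio.h>

struct domain;                    /* stand-in for Xen's struct domain */
typedef unsigned long mfn_t;      /* stand-in for Xen's mfn_t */

/* Hypothetical helper carrying the behaviour flag explicitly. */
static void sh_unpin_flags(struct domain *d, mfn_t smfn, bool clear_up)
{
    (void)d;
    printf("unpin mfn %lu%s\n", smfn,
           clear_up ? " (clearing up-pointer)" : "");
}

/* Existing callers keep the one-argument form and get the default. */
static inline void sh_unpin(struct domain *d, mfn_t smfn)
{
    sh_unpin_flags(d, smfn, false);
}

int main(void)
{
    sh_unpin(NULL, 42);               /* the common path, flag defaulted */
    sh_unpin_flags(NULL, 42, true);   /* the "unpinning all l3s" path */
    return 0;
}

The behaviour change is then visible at the one call site that wants it,
without threading a "false" through every other caller or keeping a flag
live in the domain structure between calls.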
On 23.01.2023 16:20, George Dunlap wrote:
> Re the original question: I've stared at the code for a bit now, and I
> can't see anything obviously wrong or dangerous about it.
>
> But it does make me ask, why do we need the "unpinning_l3" pseudo-argument
> at all? Is there any reason not to unconditionally zero out sp->up when we
> find a head_type of SH_type_l3_64_shadow? As far as I can tell, sp->list
> doesn't require any special state. Why do we make the effort to leave it
> alone when we're not unpinning all l3s?

This was an attempt to retain original behavior as much as possible, but
I'm afraid that, ...

> In fact, is there a way to unpin an l3 shadow *other* than when we're
> unpinning all l3's?

... since the answer here is of course "yes", ...

> If so, then this patch, as written, is broken -- the
> original code clears the up-pointer for *all* L3_64 shadows, regardless of
> whether they're on the pinned list; the new patch will only clear the ones
> on the pinned list. But unconditionally clearing sp->up could actually fix
> that.

... you're right, and I failed (went too far) with that attempt. Plus
it'll naturally resolve the parameter-vs-state aspect.

Jan
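A rough standalone model of the direction agreed here: for single-page
shadows, clear the up-pointer unconditionally whenever the head type is an
l3-64 shadow, with no unpinning_l3 flag consulted at all. Types, constants,
and the helper below are simplified stand-ins, not Xen's definitions; the
real change would live in sh_unpin()'s loop shown in the diff below:

#include <stdio.h>
#include <stdint.h>

enum { SH_type_l2_64_shadow = 1, SH_type_l3_64_shadow = 2 };

struct page_info {
    unsigned int type;
    uint64_t up;      /* in Xen this overlays the pinned-list linkage */
};

static void unpin_one(struct page_info *sp, unsigned int sz)
{
    if ( sz > 1 )
        return;   /* multi-page shadow: stitched onto a tmp_list instead */
    if ( sp->type == SH_type_l3_64_shadow )
        sp->up = 0;   /* unconditional for l3 shadows: no flag consulted */
}

int main(void)
{
    struct page_info l3 = { .type = SH_type_l3_64_shadow, .up = 0xdeadbeef };

    unpin_one(&l3, 1);
    printf("l3 up-pointer after unpin: %#llx\n", (unsigned long long)l3.up);
    return 0;
}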
--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -116,6 +116,9 @@ struct shadow_domain {
     /* OOS */
     bool_t oos_active;
 
+    /* Domain is in the process of leaving SHOPT_LINUX_L3_TOPLEVEL mode. */
+    bool unpinning_l3;
+
 #ifdef CONFIG_HVM
     /* Has this domain ever used HVMOP_pagetable_dying? */
     bool_t pagetable_dying_op;
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2302,29 +2302,6 @@ void shadow_prepare_page_type_change(str
 
 /**************************************************************************/
 
-/* Reset the up-pointers of every L3 shadow to 0.
- * This is called when l3 shadows stop being pinnable, to clear out all
- * the list-head bits so the up-pointer field is properly inititalised. */
-static int cf_check sh_clear_up_pointer(
-    struct vcpu *v, mfn_t smfn, mfn_t unused)
-{
-    mfn_to_page(smfn)->up = 0;
-    return 0;
-}
-
-void sh_reset_l3_up_pointers(struct vcpu *v)
-{
-    static const hash_vcpu_callback_t callbacks[SH_type_unused] = {
-        [SH_type_l3_64_shadow] = sh_clear_up_pointer,
-    };
-
-    HASH_CALLBACKS_CHECK(SHF_L3_64);
-    hash_vcpu_foreach(v, SHF_L3_64, callbacks, INVALID_MFN);
-}
-
-
-/**************************************************************************/
-
 static void sh_update_paging_modes(struct vcpu *v)
 {
     struct domain *d = v->domain;
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -960,6 +960,8 @@ sh_make_shadow(struct vcpu *v, mfn_t gmf
         }
         if ( l4count > 2 * d->max_vcpus )
         {
+            d->arch.paging.shadow.unpinning_l3 = true;
+
             /* Unpin all the pinned l3 tables, and don't pin any more. */
             page_list_for_each_safe(sp, t, &d->arch.paging.shadow.pinned_shadows)
             {
@@ -967,7 +969,8 @@ sh_make_shadow(struct vcpu *v, mfn_t gmf
                     sh_unpin(d, page_to_mfn(sp));
             }
             d->arch.paging.shadow.opt_flags &= ~SHOPT_LINUX_L3_TOPLEVEL;
-            sh_reset_l3_up_pointers(v);
+
+            d->arch.paging.shadow.unpinning_l3 = false;
         }
     }
 #endif
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -497,11 +497,6 @@ void shadow_blow_tables(struct domain *d
  */
 int sh_remove_all_mappings(struct domain *d, mfn_t gmfn, gfn_t gfn);
 
-/* Reset the up-pointers of every L3 shadow to 0.
- * This is called when l3 shadows stop being pinnable, to clear out all
- * the list-head bits so the up-pointer field is properly inititalised. */
-void sh_reset_l3_up_pointers(struct vcpu *v);
-
 /******************************************************************************
  * Flags used in the return value of the shadow_set_lXe() functions...
  */
@@ -721,7 +716,7 @@ static inline void sh_unpin(struct domai
 {
     struct page_list_head tmp_list, *pin_list;
     struct page_info *sp, *next;
-    unsigned int i, head_type;
+    unsigned int i, head_type, sz;
 
     ASSERT(mfn_valid(smfn));
     sp = mfn_to_page(smfn);
@@ -733,20 +728,30 @@ static inline void sh_unpin(struct domai
         return;
     sp->u.sh.pinned = 0;
 
-    /* Cut the sub-list out of the list of pinned shadows,
-     * stitching it back into a list fragment of its own. */
+    sz = shadow_size(head_type);
+
+    /*
+     * Cut the sub-list out of the list of pinned shadows, stitching
+     * multi-page shadows back into a list fragment of their own.
+     */
     pin_list = &d->arch.paging.shadow.pinned_shadows;
     INIT_PAGE_LIST_HEAD(&tmp_list);
-    for ( i = 0; i < shadow_size(head_type); i++ )
+    for ( i = 0; i < sz; i++ )
     {
         ASSERT(sp->u.sh.type == head_type);
         ASSERT(!i || !sp->u.sh.head);
         next = page_list_next(sp, pin_list);
         page_list_del(sp, pin_list);
-        page_list_add_tail(sp, &tmp_list);
+        if ( sz > 1 )
+            page_list_add_tail(sp, &tmp_list);
+        else if ( head_type == SH_type_l3_64_shadow &&
+                  d->arch.paging.shadow.unpinning_l3 )
+            sp->up = 0;
         sp = next;
     }
-    sh_terminate_list(&tmp_list);
+
+    if ( sz > 1 )
+        sh_terminate_list(&tmp_list);
 
     sh_put_ref(d, smfn, 0);
 }
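For context on why the up-pointer needs clearing at all: in Xen's struct
page_info the pinned-list linkage and the up-pointer occupy the same
storage, so a page leaving the pinned list would otherwise present stale
list bits as a bogus up-pointer. A simplified, standalone illustration of
that aliasing (the layout below is a deliberately reduced stand-in, not
the real page_info):

#include <stdio.h>
#include <stdint.h>

struct page_list_entry { uint32_t next, prev; };  /* simplified stand-in */

struct page_info {
    union {
        struct page_list_entry list;   /* while on the pinned list */
        uint64_t up;                   /* once reused as an up-pointer */
    };
};

int main(void)
{
    /* List-termination bits left over from pinned-list membership... */
    struct page_info pg = { .list = { .next = ~0u, .prev = 0x1234 } };

    /* ...read back as garbage when the field is interpreted as "up". */
    printf("stale up value: %#llx\n", (unsigned long long)pg.up);

    pg.up = 0;   /* what sh_unpin() now does for unpinned l3 shadows */
    printf("after clearing: %#llx\n", (unsigned long long)pg.up);
    return 0;
}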