Message ID | 20220819221731.480795-11-gpiccoli@igalia.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | The panic notifiers refactor - fixes/clean-ups (V3) | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch, async |
From: Guilherme G. Piccoli <gpiccoli@igalia.com> Sent: Friday, August 19, 2022 3:18 PM > > Currently Hyper-V guests are among the most relevant users of the panic > infrastructure, like panic notifiers, kmsg dumpers, etc. The reasons rely > both in cleaning-up procedures (closing hypervisor <-> guest connection, > disabling some paravirtualized timer) as well as to data collection > (sending panic information to the hypervisor) and framebuffer management. > > The thing is: some notifiers are related to others, ordering matters, some > functionalities are duplicated and there are lots of conditionals behind > sending panic information to the hypervisor. As part of an effort to > clean-up the panic notifiers mechanism and better document things, we > hereby address some of the issues/complexities of Hyper-V panic handling > through the following changes: > > (a) We have die and panic notifiers on vmbus_drv.c and both have goals of > sending panic information to the hypervisor, though the panic notifier is > also responsible for a cleaning-up procedure. > > This commit clears the code by splitting the panic notifier in two, one > for closing the vmbus connection whereas the other is only for sending > panic info to hypervisor. With that, it was possible to merge the die and > panic notifiers in a single/well-documented function, and clear some > conditional complexities on sending such information to the hypervisor. > > (b) There is a Hyper-V framebuffer panic notifier, which relies in doing > a vmbus operation that demands a valid connection. So, we must order this > notifier with the panic notifier from vmbus_drv.c, to guarantee that the > framebuffer code executes before the vmbus connection is unloaded. > > Also, this commit removes a useless header. > > Although there is code rework and re-ordering, we expect that this change > has no functional regressions but instead optimize the path and increase > panic reliability on Hyper-V. This was tested on Hyper-V with success. > > Cc: Andrea Parri (Microsoft) <parri.andrea@gmail.com> > Cc: Dexuan Cui <decui@microsoft.com> > Cc: Haiyang Zhang <haiyangz@microsoft.com> > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > Cc: Petr Mladek <pmladek@suse.com> > Cc: Stephen Hemminger <sthemmin@microsoft.com> > Cc: Tianyu Lan <Tianyu.Lan@microsoft.com> > Cc: Wei Liu <wei.liu@kernel.org> > Reviewed-by: Michael Kelley <mikelley@microsoft.com> > Tested-by: Fabio A M Martins <fabiomirmar@gmail.com> > Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> > > --- > > V3: > - Added Michael's review tag - thanks! > > V2: > - Unfortunately we cannot rely in the crash shutdown (custom) handler > to perform the vmbus unload - arm64 architecture doesn't have this > "feature" [0]. So, in V2 we kept the notifier behavior and always > unload the vmbus connection, no matter what - thanks Michael for > pointing that; > > - Removed the Fixes tags as per Michael suggestion; > > - As per Petr suggestion, we abandoned the idea of distinguish among > notifiers using an id - so, in V2 we rely in the old and good address > comparison for that. Thanks Petr for the enriching discussion! > > [0] https://lore.kernel.org/lkml/427a8277-49f0-4317-d6c3-4a15d7070e55@igalia.com/ > > > drivers/hv/vmbus_drv.c | 109 +++++++++++++++++++------------- > drivers/video/fbdev/hyperv_fb.c | 8 +++ > 2 files changed, 74 insertions(+), 43 deletions(-) > Tested this patch in combination with Patch 9 in this series. Verified that both the panic and die paths work correctly with notification to Hyper-V via hyperv_report_panic() or via hv_kmsg_dump(). Hyper-V framebuffer is updated as expected, though I did not reproduce a case where the ring buffer lock is held. vmbus_initiate_unload() runs as expected. Tested-by: Michael Kelley <mikelley@microsoft.com> > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c > index 23c680d1a0f5..0a77e2bb0b70 100644 > --- a/drivers/hv/vmbus_drv.c > +++ b/drivers/hv/vmbus_drv.c > @@ -25,7 +25,6 @@ > #include <linux/sched/task_stack.h> > > #include <linux/delay.h> > -#include <linux/notifier.h> > #include <linux/panic_notifier.h> > #include <linux/ptrace.h> > #include <linux/screen_info.h> > @@ -69,53 +68,74 @@ static int hyperv_report_reg(void) > return !sysctl_record_panic_msg || !hv_panic_page; > } > > -static int hyperv_panic_event(struct notifier_block *nb, unsigned long val, > +/* > + * The panic notifier below is responsible solely for unloading the > + * vmbus connection, which is necessary in a panic event. > + * > + * Notice an intrincate relation of this notifier with Hyper-V > + * framebuffer panic notifier exists - we need vmbus connection alive > + * there in order to succeed, so we need to order both with each other > + * [see hvfb_on_panic()] - this is done using notifiers' priorities. > + */ > +static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val, > void *args) > +{ > + vmbus_initiate_unload(true); > + return NOTIFY_DONE; > +} > +static struct notifier_block hyperv_panic_vmbus_unload_block = { > + .notifier_call = hv_panic_vmbus_unload, > + .priority = INT_MIN + 1, /* almost the latest one to execute */ > +}; > + > +static int hv_die_panic_notify_crash(struct notifier_block *self, > + unsigned long val, void *args); > + > +static struct notifier_block hyperv_die_report_block = { > + .notifier_call = hv_die_panic_notify_crash, > +}; > +static struct notifier_block hyperv_panic_report_block = { > + .notifier_call = hv_die_panic_notify_crash, > +}; > + > +/* > + * The following callback works both as die and panic notifier; its > + * goal is to provide panic information to the hypervisor unless the > + * kmsg dumper is used [see hv_kmsg_dump()], which provides more > + * information but isn't always available. > + * > + * Notice that both the panic/die report notifiers are registered only > + * if we have the capability HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE set. > + */ > +static int hv_die_panic_notify_crash(struct notifier_block *self, > + unsigned long val, void *args) > { > struct pt_regs *regs; > + bool is_die; > > - vmbus_initiate_unload(true); > - > - /* > - * Hyper-V should be notified only once about a panic. If we will be > - * doing hv_kmsg_dump() with kmsg data later, don't do the notification > - * here. > - */ > - if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE > - && hyperv_report_reg()) { > + /* Don't notify Hyper-V unless we have a die oops event or panic. */ > + if (self == &hyperv_panic_report_block) { > + is_die = false; > regs = current_pt_regs(); > - hyperv_report_panic(regs, val, false); > + } else { /* die event */ > + if (val != DIE_OOPS) > + return NOTIFY_DONE; > + > + is_die = true; > + regs = ((struct die_args *)args)->regs; > } > - return NOTIFY_DONE; > -} > - > -static int hyperv_die_event(struct notifier_block *nb, unsigned long val, > - void *args) > -{ > - struct die_args *die = args; > - struct pt_regs *regs = die->regs; > - > - /* Don't notify Hyper-V if the die event is other than oops */ > - if (val != DIE_OOPS) > - return NOTIFY_DONE; > > /* > - * Hyper-V should be notified only once about a panic. If we will be > - * doing hv_kmsg_dump() with kmsg data later, don't do the notification > - * here. > + * Hyper-V should be notified only once about a panic/die. If we will > + * be calling hv_kmsg_dump() later with kmsg data, don't do the > + * notification here. > */ > if (hyperv_report_reg()) > - hyperv_report_panic(regs, val, true); > + hyperv_report_panic(regs, val, is_die); > + > return NOTIFY_DONE; > } > > -static struct notifier_block hyperv_die_block = { > - .notifier_call = hyperv_die_event, > -}; > -static struct notifier_block hyperv_panic_block = { > - .notifier_call = hyperv_panic_event, > -}; > - > static const char *fb_mmio_name = "fb_range"; > static struct resource *fb_mmio; > static struct resource *hyperv_mmio; > @@ -1538,16 +1558,17 @@ static int vmbus_bus_init(void) > if (hyperv_crash_ctl & HV_CRASH_CTL_CRASH_NOTIFY_MSG) > hv_kmsg_dump_register(); > > - register_die_notifier(&hyperv_die_block); > + register_die_notifier(&hyperv_die_report_block); > + atomic_notifier_chain_register(&panic_notifier_list, > + &hyperv_panic_report_block); > } > > /* > - * Always register the panic notifier because we need to unload > - * the VMbus channel connection to prevent any VMbus > - * activity after the VM panics. > + * Always register the vmbus unload panic notifier because we > + * need to shut the VMbus channel connection on panic. > */ > atomic_notifier_chain_register(&panic_notifier_list, > - &hyperv_panic_block); > + &hyperv_panic_vmbus_unload_block); > > vmbus_request_offers(); > > @@ -2776,15 +2797,17 @@ static void __exit vmbus_exit(void) > > if (ms_hyperv.misc_features & > HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { > kmsg_dump_unregister(&hv_kmsg_dumper); > - unregister_die_notifier(&hyperv_die_block); > + unregister_die_notifier(&hyperv_die_report_block); > + atomic_notifier_chain_unregister(&panic_notifier_list, > + &hyperv_panic_report_block); > } > > /* > - * The panic notifier is always registered, hence we should > + * The vmbus panic notifier is always registered, hence we should > * also unconditionally unregister it here as well. > */ > atomic_notifier_chain_unregister(&panic_notifier_list, > - &hyperv_panic_block); > + &hyperv_panic_vmbus_unload_block); > > free_page((unsigned long)hv_panic_page); > unregister_sysctl_table(hv_ctl_table_hdr); > diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c > index e1b65a01fb96..9234d31d4305 100644 > --- a/drivers/video/fbdev/hyperv_fb.c > +++ b/drivers/video/fbdev/hyperv_fb.c > @@ -1216,7 +1216,15 @@ static int hvfb_probe(struct hv_device *hdev, > par->fb_ready = true; > > par->synchronous_fb = false; > + > + /* > + * We need to be sure this panic notifier runs _before_ the > + * vmbus disconnect, so order it by priority. It must execute > + * before the function hv_panic_vmbus_unload() [drivers/hv/vmbus_drv.c], > + * which is almost at the end of list, with priority = INT_MIN + 1. > + */ > par->hvfb_panic_nb.notifier_call = hvfb_on_panic; > + par->hvfb_panic_nb.priority = INT_MIN + 10, > atomic_notifier_chain_register(&panic_notifier_list, > &par->hvfb_panic_nb); > > -- > 2.37.2
On 04/10/2022 13:24, Michael Kelley (LINUX) wrote: > [...] > > Tested this patch in combination with Patch 9 in this series. Verified > that both the panic and die paths work correctly with notification to > Hyper-V via hyperv_report_panic() or via hv_kmsg_dump(). Hyper-V > framebuffer is updated as expected, though I did not reproduce > a case where the ring buffer lock is held. vmbus_initiate_unload() runs > as expected. > > Tested-by: Michael Kelley <mikelley@microsoft.com> > Thanks a lot for the tests/review Michael! Do you think Hyper-V folks could add both patches in hv tree? If you prefer, I can re-send them individually. Cheers, Guilherme
From: Guilherme G. Piccoli <gpiccoli@igalia.com> Sent: Tuesday, October 4, 2022 10:20 AM > > On 04/10/2022 13:24, Michael Kelley (LINUX) wrote: > > [...] > > > > Tested this patch in combination with Patch 9 in this series. Verified > > that both the panic and die paths work correctly with notification to > > Hyper-V via hyperv_report_panic() or via hv_kmsg_dump(). Hyper-V > > framebuffer is updated as expected, though I did not reproduce > > a case where the ring buffer lock is held. vmbus_initiate_unload() runs > > as expected. > > > > Tested-by: Michael Kelley <mikelley@microsoft.com> > > > > Thanks a lot for the tests/review Michael! > > Do you think Hyper-V folks could add both patches in hv tree? If you > prefer, I can re-send them individually. > Wei Liu: Could you pick up Patch 9 and Patch 10 from this series in the hyperv-next tree? Michael
[Trimming long CC list] Hi Wei / Michael, just out of curiosity, did this (and patch 9) ended-up being queued in hyperv-next? Thanks in advance, Guilherme On 17/10/2022 12:26, Michael Kelley (LINUX) wrote: > From: Guilherme G. Piccoli <gpiccoli@igalia.com> Sent: Tuesday, October 4, 2022 10:20 AM >> >> On 04/10/2022 13:24, Michael Kelley (LINUX) wrote: >>> [...] >>> >>> Tested this patch in combination with Patch 9 in this series. Verified >>> that both the panic and die paths work correctly with notification to >>> Hyper-V via hyperv_report_panic() or via hv_kmsg_dump(). Hyper-V >>> framebuffer is updated as expected, though I did not reproduce >>> a case where the ring buffer lock is held. vmbus_initiate_unload() runs >>> as expected. >>> >>> Tested-by: Michael Kelley <mikelley@microsoft.com> >>> >> >> Thanks a lot for the tests/review Michael! >> >> Do you think Hyper-V folks could add both patches in hv tree? If you >> prefer, I can re-send them individually. >> > > Wei Liu: Could you pick up Patch 9 and Patch 10 from this series in the > hyperv-next tree? > > Michael >
On Thu, Nov 10, 2022 at 06:32:54PM -0300, Guilherme G. Piccoli wrote: > [Trimming long CC list] > > Hi Wei / Michael, just out of curiosity, did this (and patch 9) ended-up > being queued in hyperv-next? > No. They are not queued. I didn't notice Michael's email and thought they would be picked up by someone else while dealing with the whole series. I will pick them up later. Thanks, Wei. > Thanks in advance, > > > Guilherme > > On 17/10/2022 12:26, Michael Kelley (LINUX) wrote: > > From: Guilherme G. Piccoli <gpiccoli@igalia.com> Sent: Tuesday, October 4, 2022 10:20 AM > >> > >> On 04/10/2022 13:24, Michael Kelley (LINUX) wrote: > >>> [...] > >>> > >>> Tested this patch in combination with Patch 9 in this series. Verified > >>> that both the panic and die paths work correctly with notification to > >>> Hyper-V via hyperv_report_panic() or via hv_kmsg_dump(). Hyper-V > >>> framebuffer is updated as expected, though I did not reproduce > >>> a case where the ring buffer lock is held. vmbus_initiate_unload() runs > >>> as expected. > >>> > >>> Tested-by: Michael Kelley <mikelley@microsoft.com> > >>> > >> > >> Thanks a lot for the tests/review Michael! > >> > >> Do you think Hyper-V folks could add both patches in hv tree? If you > >> prefer, I can re-send them individually. > >> > > > > Wei Liu: Could you pick up Patch 9 and Patch 10 from this series in the > > hyperv-next tree? > > > > Michael > >
On Fri, Nov 11, 2022 at 10:47:17PM +0000, Wei Liu wrote: > On Thu, Nov 10, 2022 at 06:32:54PM -0300, Guilherme G. Piccoli wrote: > > [Trimming long CC list] > > > > Hi Wei / Michael, just out of curiosity, did this (and patch 9) ended-up > > being queued in hyperv-next? > > > > No. They are not queued. > > I didn't notice Michael's email and thought they would be picked up by > someone else while dealing with the whole series. > > I will pick them up later. They are now queued to hyperv-next. Thanks.
On 11/11/2022 20:16, Wei Liu wrote: > On Fri, Nov 11, 2022 at 10:47:17PM +0000, Wei Liu wrote: >> On Thu, Nov 10, 2022 at 06:32:54PM -0300, Guilherme G. Piccoli wrote: >>> [Trimming long CC list] >>> >>> Hi Wei / Michael, just out of curiosity, did this (and patch 9) ended-up >>> being queued in hyperv-next? >>> >> >> No. They are not queued. >> >> I didn't notice Michael's email and thought they would be picked up by >> someone else while dealing with the whole series. >> >> I will pick them up later. > > They are now queued to hyperv-next. Thanks. Thanks a lot Wei and Michael, much appreciated =)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 23c680d1a0f5..0a77e2bb0b70 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -25,7 +25,6 @@ #include <linux/sched/task_stack.h> #include <linux/delay.h> -#include <linux/notifier.h> #include <linux/panic_notifier.h> #include <linux/ptrace.h> #include <linux/screen_info.h> @@ -69,53 +68,74 @@ static int hyperv_report_reg(void) return !sysctl_record_panic_msg || !hv_panic_page; } -static int hyperv_panic_event(struct notifier_block *nb, unsigned long val, +/* + * The panic notifier below is responsible solely for unloading the + * vmbus connection, which is necessary in a panic event. + * + * Notice an intrincate relation of this notifier with Hyper-V + * framebuffer panic notifier exists - we need vmbus connection alive + * there in order to succeed, so we need to order both with each other + * [see hvfb_on_panic()] - this is done using notifiers' priorities. + */ +static int hv_panic_vmbus_unload(struct notifier_block *nb, unsigned long val, void *args) +{ + vmbus_initiate_unload(true); + return NOTIFY_DONE; +} +static struct notifier_block hyperv_panic_vmbus_unload_block = { + .notifier_call = hv_panic_vmbus_unload, + .priority = INT_MIN + 1, /* almost the latest one to execute */ +}; + +static int hv_die_panic_notify_crash(struct notifier_block *self, + unsigned long val, void *args); + +static struct notifier_block hyperv_die_report_block = { + .notifier_call = hv_die_panic_notify_crash, +}; +static struct notifier_block hyperv_panic_report_block = { + .notifier_call = hv_die_panic_notify_crash, +}; + +/* + * The following callback works both as die and panic notifier; its + * goal is to provide panic information to the hypervisor unless the + * kmsg dumper is used [see hv_kmsg_dump()], which provides more + * information but isn't always available. + * + * Notice that both the panic/die report notifiers are registered only + * if we have the capability HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE set. + */ +static int hv_die_panic_notify_crash(struct notifier_block *self, + unsigned long val, void *args) { struct pt_regs *regs; + bool is_die; - vmbus_initiate_unload(true); - - /* - * Hyper-V should be notified only once about a panic. If we will be - * doing hv_kmsg_dump() with kmsg data later, don't do the notification - * here. - */ - if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE - && hyperv_report_reg()) { + /* Don't notify Hyper-V unless we have a die oops event or panic. */ + if (self == &hyperv_panic_report_block) { + is_die = false; regs = current_pt_regs(); - hyperv_report_panic(regs, val, false); + } else { /* die event */ + if (val != DIE_OOPS) + return NOTIFY_DONE; + + is_die = true; + regs = ((struct die_args *)args)->regs; } - return NOTIFY_DONE; -} - -static int hyperv_die_event(struct notifier_block *nb, unsigned long val, - void *args) -{ - struct die_args *die = args; - struct pt_regs *regs = die->regs; - - /* Don't notify Hyper-V if the die event is other than oops */ - if (val != DIE_OOPS) - return NOTIFY_DONE; /* - * Hyper-V should be notified only once about a panic. If we will be - * doing hv_kmsg_dump() with kmsg data later, don't do the notification - * here. + * Hyper-V should be notified only once about a panic/die. If we will + * be calling hv_kmsg_dump() later with kmsg data, don't do the + * notification here. */ if (hyperv_report_reg()) - hyperv_report_panic(regs, val, true); + hyperv_report_panic(regs, val, is_die); + return NOTIFY_DONE; } -static struct notifier_block hyperv_die_block = { - .notifier_call = hyperv_die_event, -}; -static struct notifier_block hyperv_panic_block = { - .notifier_call = hyperv_panic_event, -}; - static const char *fb_mmio_name = "fb_range"; static struct resource *fb_mmio; static struct resource *hyperv_mmio; @@ -1538,16 +1558,17 @@ static int vmbus_bus_init(void) if (hyperv_crash_ctl & HV_CRASH_CTL_CRASH_NOTIFY_MSG) hv_kmsg_dump_register(); - register_die_notifier(&hyperv_die_block); + register_die_notifier(&hyperv_die_report_block); + atomic_notifier_chain_register(&panic_notifier_list, + &hyperv_panic_report_block); } /* - * Always register the panic notifier because we need to unload - * the VMbus channel connection to prevent any VMbus - * activity after the VM panics. + * Always register the vmbus unload panic notifier because we + * need to shut the VMbus channel connection on panic. */ atomic_notifier_chain_register(&panic_notifier_list, - &hyperv_panic_block); + &hyperv_panic_vmbus_unload_block); vmbus_request_offers(); @@ -2776,15 +2797,17 @@ static void __exit vmbus_exit(void) if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { kmsg_dump_unregister(&hv_kmsg_dumper); - unregister_die_notifier(&hyperv_die_block); + unregister_die_notifier(&hyperv_die_report_block); + atomic_notifier_chain_unregister(&panic_notifier_list, + &hyperv_panic_report_block); } /* - * The panic notifier is always registered, hence we should + * The vmbus panic notifier is always registered, hence we should * also unconditionally unregister it here as well. */ atomic_notifier_chain_unregister(&panic_notifier_list, - &hyperv_panic_block); + &hyperv_panic_vmbus_unload_block); free_page((unsigned long)hv_panic_page); unregister_sysctl_table(hv_ctl_table_hdr); diff --git a/drivers/video/fbdev/hyperv_fb.c b/drivers/video/fbdev/hyperv_fb.c index e1b65a01fb96..9234d31d4305 100644 --- a/drivers/video/fbdev/hyperv_fb.c +++ b/drivers/video/fbdev/hyperv_fb.c @@ -1216,7 +1216,15 @@ static int hvfb_probe(struct hv_device *hdev, par->fb_ready = true; par->synchronous_fb = false; + + /* + * We need to be sure this panic notifier runs _before_ the + * vmbus disconnect, so order it by priority. It must execute + * before the function hv_panic_vmbus_unload() [drivers/hv/vmbus_drv.c], + * which is almost at the end of list, with priority = INT_MIN + 1. + */ par->hvfb_panic_nb.notifier_call = hvfb_on_panic; + par->hvfb_panic_nb.priority = INT_MIN + 10, atomic_notifier_chain_register(&panic_notifier_list, &par->hvfb_panic_nb);