Message ID | s5hio705epm.wl-tiwai@suse.de (mailing list archive) |
---|---|
State | New, archived |
On Thu, Sep 24, 2015 at 11:49:57AM +0200, Takashi Iwai wrote:
> On Wed, 23 Sep 2015 11:03:44 +0200,
> Takashi Iwai wrote:
> >
> > On Thu, 17 Sep 2015 12:00:03 +0200,
> > Thierry Reding wrote:
> > >
> > > From: Thierry Reding <treding@nvidia.com>
> > >
> > > The Tegra HDA controller driver committed in v3.16 causes deadlocks when
> > > loaded as a module. The reason is that the driver core will lock the HDA
> > > controller device upon calling its probe callback and the probe callback
> > > then goes on to create child devices for detected codecs and loads their
> > > modules via a request_module() call. This is problematic because the new
> > > driver will immediately be bound to the device, which will in turn cause
> > > the parent of the codec device (the HDA controller device) to be locked
> > > again, causing a deadlock.
> > >
> > > This problem seems to have been present since the modularization of the
> > > HD-audio driver in commit 1289e9e8b42f ("ALSA: hda - Modularize HD-audio
> > > driver"). On Intel platforms this has been worked around by splitting up
> > > the probe sequence into a synchronous and an asynchronous part where the
> > > request_module() calls are asynchronous and hence avoid the deadlock.
> > >
> > > An alternative proposal is provided in this series of patches. Rather
> > > than relying on explicit request_module() calls to load kernel modules
> > > for HDA codec drivers, this implements a uevent callback for the HDA bus
> > > to advertises the MODALIAS information to the userspace helper.
> > >
> > > Effectively this results in the same modules being loaded, but it uses
> > > the more canonical infrastructure to perform this. Deferring the module
> > > loading to userspace removes the need for the explicit request_module()
> > > calls and works around the recursive locking issue because both drivers
> > > will be bound from separate contexts.
> >
> > While this looks definitely like the right direction to go, I'm afraid
> > that this will give a few major regressions. First off, there is no
> > way to bind with the generic codec driver. There are two generic
> > drivers, one for HDMI/DP and one for normal audio. Binding to them is
> > judged by parsing the codec widgets whether they are digital-only.
> > So, either user-space or kernel needs to parse the codec widgets
> > beforehand. If we rip off all binding magic as in your patch, this
> > has to be done by udev. With the sysfs stuff, now it should be
> > possible, but this would break the existing system.
> >
> > Another possible regression is the matching with the vendor-only
> > alias. Maybe the current wildcard works, but we need to double
> > check.
> >
> > So, unless these are addressed, I think we need another quick band-aid
> > over snd-hda-tegra just doing the async probe like snd-hda-intel.
>
> Does the patch below work? I only did a quick compile test.
>
>
> thanks,
>
> Takashi
>
> -- 8< --
> From: Takashi Iwai <tiwai@suse.de>
> Subject: [PATCH] ALSA: hda/tegra - async probe for avoiding module loading
>  deadlock
>
> The Tegra HD-audio controller driver causes deadlocks when loaded as a
> module since the driver invokes request_module() at binding with the
> codec driver. This patch works around it by deferring the probe in a
> work like Intel HD-audio controller driver does. Although hovering
> the codec probe stuff into udev would be a better solution, it may
> cause other regressions, so let's try this band-aid fix until the more
> proper solution gets landed.
>
> Reported-by: Thierry Reding <treding@nvidia.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> ---
>  sound/pci/hda/hda_tegra.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)

Yes, that fixes the hang that I was seeing:

Tested-by: Thierry Reding <treding@nvidia.com>

As a matter of fact this resembles a patch that Jon had worked on to
solve this. I'm slightly concerned that merging a band-aid like this
is going to remove any incentive to fix this properly, though.

Thierry

> diff --git a/sound/pci/hda/hda_tegra.c b/sound/pci/hda/hda_tegra.c
> index 477742cb70a2..58c0aad37284 100644
> --- a/sound/pci/hda/hda_tegra.c
> +++ b/sound/pci/hda/hda_tegra.c
> @@ -73,6 +73,7 @@ struct hda_tegra {
>  	struct clk *hda2codec_2x_clk;
>  	struct clk *hda2hdmi_clk;
>  	void __iomem *regs;
> +	struct work_struct probe_work;
>  };
>
>  #ifdef CONFIG_PM
> @@ -294,7 +295,9 @@ static int hda_tegra_dev_disconnect(struct snd_device *device)
>  static int hda_tegra_dev_free(struct snd_device *device)
>  {
>  	struct azx *chip = device->device_data;
> +	struct hda_tegra *hda = container_of(chip, struct hda_tegra, chip);
>
> +	cancel_work_sync(&hda->probe_work);
>  	if (azx_bus(chip)->chip_init) {
>  		azx_stop_all_streams(chip);
>  		azx_stop_chip(chip);
> @@ -426,6 +429,9 @@ static int hda_tegra_first_init(struct azx *chip, struct platform_device *pdev)
>  /*
>   * constructor
>   */
> +
> +static void hda_tegra_probe_work(struct work_struct *work);
> +
>  static int hda_tegra_create(struct snd_card *card,
>  			    unsigned int driver_caps,
>  			    struct hda_tegra *hda)
> @@ -452,6 +458,8 @@ static int hda_tegra_create(struct snd_card *card,
>  	chip->single_cmd = false;
>  	chip->snoop = true;
>
> +	INIT_WORK(&hda->probe_work, hda_tegra_probe_work);
> +
>  	err = azx_bus_init(chip, NULL, &hda_tegra_io_ops);
>  	if (err < 0)
>  		return err;
> @@ -499,6 +507,21 @@ static int hda_tegra_probe(struct platform_device *pdev)
>  	card->private_data = chip;
>
>  	dev_set_drvdata(&pdev->dev, card);
> +	schedule_work(&hda->probe_work);
> +
> +	return 0;
> +
> +out_free:
> +	snd_card_free(card);
> +	return err;
> +}
> +
> +static void hda_tegra_probe_work(struct work_struct *work)
> +{
> +	struct hda_tegra *hda = container_of(work, struct hda_tegra, probe_work);
> +	struct azx *chip = &hda->chip;
> +	struct platform_device *pdev = to_platform_device(hda->dev);
> +	int err;
>
>  	err = hda_tegra_first_init(chip, pdev);
>  	if (err < 0)
> @@ -520,11 +543,8 @@ static int hda_tegra_probe(struct platform_device *pdev)
>  	chip->running = 1;
>  	snd_hda_set_power_save(&chip->bus, power_save * 1000);
>
> -	return 0;
> -
> -out_free:
> -	snd_card_free(card);
> -	return err;
> + out_free:
> +	return; /* no error return from async probe */
>  }
>
>  static int hda_tegra_remove(struct platform_device *pdev)
> --
> 2.5.1
>

On Thu, 24 Sep 2015 12:50:10 +0200,
Thierry Reding wrote:
>
> Yes, that fixes the hang that I was seeing:
>
> Tested-by: Thierry Reding <treding@nvidia.com>

Thanks! I'll queue this for the next pull request.

> As a matter of fact this resembles a patch that Jon had worked on to
> solve this. I'm slightly concerned that merging a band-aid like this
> is going to remove any incentive to fix this properly, though.

Yeah, it's neither elegant nor cleaner solution but it's certainly
safer.


Takashi

diff --git a/sound/pci/hda/hda_tegra.c b/sound/pci/hda/hda_tegra.c
index 477742cb70a2..58c0aad37284 100644
--- a/sound/pci/hda/hda_tegra.c
+++ b/sound/pci/hda/hda_tegra.c
@@ -73,6 +73,7 @@ struct hda_tegra {
 	struct clk *hda2codec_2x_clk;
 	struct clk *hda2hdmi_clk;
 	void __iomem *regs;
+	struct work_struct probe_work;
 };
 
 #ifdef CONFIG_PM
@@ -294,7 +295,9 @@ static int hda_tegra_dev_disconnect(struct snd_device *device)
 static int hda_tegra_dev_free(struct snd_device *device)
 {
 	struct azx *chip = device->device_data;
+	struct hda_tegra *hda = container_of(chip, struct hda_tegra, chip);
 
+	cancel_work_sync(&hda->probe_work);
 	if (azx_bus(chip)->chip_init) {
 		azx_stop_all_streams(chip);
 		azx_stop_chip(chip);
@@ -426,6 +429,9 @@ static int hda_tegra_first_init(struct azx *chip, struct platform_device *pdev)
 /*
  * constructor
  */
+
+static void hda_tegra_probe_work(struct work_struct *work);
+
 static int hda_tegra_create(struct snd_card *card,
 			    unsigned int driver_caps,
 			    struct hda_tegra *hda)
@@ -452,6 +458,8 @@ static int hda_tegra_create(struct snd_card *card,
 	chip->single_cmd = false;
 	chip->snoop = true;
 
+	INIT_WORK(&hda->probe_work, hda_tegra_probe_work);
+
 	err = azx_bus_init(chip, NULL, &hda_tegra_io_ops);
 	if (err < 0)
 		return err;
@@ -499,6 +507,21 @@ static int hda_tegra_probe(struct platform_device *pdev)
 	card->private_data = chip;
 
 	dev_set_drvdata(&pdev->dev, card);
+	schedule_work(&hda->probe_work);
+
+	return 0;
+
+out_free:
+	snd_card_free(card);
+	return err;
+}
+
+static void hda_tegra_probe_work(struct work_struct *work)
+{
+	struct hda_tegra *hda = container_of(work, struct hda_tegra, probe_work);
+	struct azx *chip = &hda->chip;
+	struct platform_device *pdev = to_platform_device(hda->dev);
+	int err;
 
 	err = hda_tegra_first_init(chip, pdev);
 	if (err < 0)
@@ -520,11 +543,8 @@ static int hda_tegra_probe(struct platform_device *pdev)
 	chip->running = 1;
 	snd_hda_set_power_save(&chip->bus, power_save * 1000);
 
-	return 0;
-
-out_free:
-	snd_card_free(card);
-	return err;
+ out_free:
+	return; /* no error return from async probe */
 }
 
 static int hda_tegra_remove(struct platform_device *pdev)
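
For readers who want to see why deferring the second half of the probe avoids the deadlock, here is a rough userspace model in plain C. It is only a sketch and not kernel code: every toy_* name is hypothetical, the device lock is reduced to a flag that reports -EDEADLK instead of hanging, and the workqueue is reduced to a single pending item that the caller runs by hand after the probe has returned.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with a non-recursive lock. */
struct toy_device {
	bool locked;
	struct toy_device *parent;
};

/* Hypothetical stand-in for a work item; one pending item is enough here. */
struct toy_work {
	int (*fn)(struct toy_device *dev);
	struct toy_device *dev;
	bool pending;
};

static struct toy_work pending_work;

/* Report -EDEADLK instead of hanging when the lock is already held. */
static int toy_lock(struct toy_device *dev)
{
	if (dev->locked)
		return -EDEADLK;
	dev->locked = true;
	return 0;
}

static void toy_unlock(struct toy_device *dev)
{
	dev->locked = false;
}

/* Binding a codec driver needs the lock of the parent (the controller). */
static int toy_bind_codec(struct toy_device *codec)
{
	int err = toy_lock(codec->parent);

	if (err < 0)
		return err;
	toy_unlock(codec->parent);
	return 0;
}

/* Second half of the probe: create a codec device and bind its driver. */
static int toy_probe_codecs(struct toy_device *ctrl)
{
	struct toy_device codec = { .locked = false, .parent = ctrl };

	return toy_bind_codec(&codec);
}

/* Synchronous probe: codec binding runs while the controller is locked. */
int toy_probe_sync(struct toy_device *ctrl)
{
	return toy_probe_codecs(ctrl);
}

/* Asynchronous probe: queue the second half and return right away. */
int toy_probe_async(struct toy_device *ctrl)
{
	pending_work.fn = toy_probe_codecs;
	pending_work.dev = ctrl;
	pending_work.pending = true;
	return 0;
}

/* Model of the driver core: lock the device, call probe, unlock. */
int toy_core_probe(struct toy_device *ctrl,
		   int (*probe)(struct toy_device *dev))
{
	int err = toy_lock(ctrl);

	if (err < 0)
		return err;
	err = probe(ctrl);
	toy_unlock(ctrl);
	return err;
}

/* Model of the workqueue: run the pending item outside of the probe. */
int toy_run_pending(void)
{
	if (!pending_work.pending)
		return 0;
	pending_work.pending = false;
	return pending_work.fn(pending_work.dev);
}
```

Calling toy_core_probe() with toy_probe_sync fails with -EDEADLK because the codec bind tries to take the controller lock that the core still holds. With toy_probe_async the probe returns first, the lock is dropped, and a later toy_run_pending() binds the codec without contention. As in the patch, an error from the deferred half has no probe return value to travel through, which is part of why this is a band-aid.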