diff mbox series

[1/9] ALSA: hda: Do not unset preset when cleaning up codec

Message ID 20220706120230.427296-2-cezary.rojewski@intel.com (mailing list archive)
State New, archived
Headers show
Series ALSA: hda: Codec-reload bug fixes and cleanups | expand

Commit Message

Cezary Rojewski July 6, 2022, 12:02 p.m. UTC
snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with
module unloading and triggers null-ptr-deref. Preset is assigned only
once, during device/driver matching whereas module reload and unload
follow completely different path and may occur several times during
runtime.

Fixes: 9a6246ff78ac ("ALSA: hda - Implement unbind more safely")
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
---
 sound/pci/hda/hda_codec.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Takashi Iwai July 9, 2022, 4:34 p.m. UTC | #1
On Wed, 06 Jul 2022 14:02:22 +0200,
Cezary Rojewski wrote:
> 
> snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with
> module unloading and triggers null-ptr-deref. Preset is assigned only
> once, during device/driver matching whereas module reload and unload
> follow completely different path and may occur several times during
> runtime.

Hm, the driver reload/unload does unbind.  Keeping this field mean to
leave the pointer to the possibly freed object, no?

And if it's not cleared, where is this field cleared instead?


thanks,

Takashi
Cezary Rojewski July 11, 2022, 8:25 a.m. UTC | #2
On 2022-07-09 6:34 PM, Takashi Iwai wrote:
> On Wed, 06 Jul 2022 14:02:22 +0200,
> Cezary Rojewski wrote:
>>
>> snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with
>> module unloading and triggers null-ptr-deref. Preset is assigned only
>> once, during device/driver matching whereas module reload and unload
>> follow completely different path and may occur several times during
>> runtime.
> 
> Hm, the driver reload/unload does unbind.  Keeping this field mean to
> leave the pointer to the possibly freed object, no?
> 
> And if it's not cleared, where is this field cleared instead?


avs-driver i.e. the bus driver takes responsibility for the codec device 
only. There is no real probe(), just the device creation and 
initialization of its fields. The rest is handled by the component 
driver (sound/soc/hda.c). If this field is cleared and the test is 
limited to reloading HDAudio codec module alone, we get a panic. 
Something similar to the stack found below my message.

In regard to the other question - are presets freed at all? It seems all 
of them are part of the static device-driver matching list. If so, the 
pointer is always valid.


[  136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec]
[  136.827929] Code: ff 85 c0 0f 88 5b 0b 00 00 4d 8d bc 24 d0 03 00 00 
4c 89 ff e8 e5 a2 9e ca 49 8b 9c 24 d0 03 00 00 48 8d 7b 10 e8 d4 a2 9e 
ca <48> 8b 73 10 4c 89 e7 e8 e8 7d fb ff 85 c0 0f 88 43 0b 00 00 4c 89
[  136.828028] RSP: 0018:ffff888101af74d0 EFLAGS: 00010286
[  136.828079] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 
ffffffff8b4f1b1a
[  136.828128] RDX: 0000000000000001 RSI: 0000000000000008 RDI: 
ffffffff8e323d20
[  136.828175] RBP: ffff888101af7540 R08: 1ffffffff1c647a4 R09: 
fffffbfff1c647a5
[  136.828224] R10: ffffffff8e323d27 R11: fffffbfff1c647a4 R12: 
ffff888102920000
[  136.828272] R13: ffff88810812e428 R14: ffff888102925028 R15: 
ffff8881029203d0
[  136.828323] FS:  00007f9049dd8540(0000) GS:ffff888227100000(0000) 
knlGS:0000000000000000
[  136.828380] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  136.828425] CR2: 0000000000000010 CR3: 000000010f086001 CR4: 
00000000003706e0
[  136.828474] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[  136.828520] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400
[  136.828568] Call Trace:
[  136.828593]  <TASK>
[  136.828628]  snd_soc_component_probe+0x3a/0x60 [snd_soc_core]
[  136.828981]  soc_probe_component+0x276/0x4a0 [snd_soc_core]
[  136.829274]  snd_soc_bind_card+0x819/0x13d0 [snd_soc_core]
[  136.829560]  ? __kasan_slab_alloc+0x32/0x90
[  136.829614]  snd_soc_register_card+0x24e/0x260 [snd_soc_core]
[  136.829900]  devm_snd_soc_register_card+0x48/0x90 [snd_soc_core]
[  136.830204]  avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio]
[  136.830269]  platform_probe+0x67/0x100
[  136.830313]  really_probe+0x1ff/0x500
[  136.830354]  __driver_probe_device+0xeb/0x240
[  136.830397]  driver_probe_device+0x4e/0xf0
[  136.830438]  __driver_attach+0xfd/0x210
[  136.830478]  ? __device_attach_driver+0x170/0x170
[  136.830520]  bus_for_each_dev+0xf9/0x150
[  136.830557]  ? subsys_dev_iter_exit+0x10/0x10
[  136.830597]  ? preempt_count_sub+0x18/0xc0
[  136.830643]  driver_attach+0x2d/0x40
[  136.830679]  bus_add_driver+0x28e/0x320
[  136.830722]  driver_register+0xdc/0x170
[  136.830763]  ? 0xffffffffc0428000
[  136.830796]  __platform_driver_register+0x39/0x40
[  136.830842]  avs_hdaudio_driver_init+0x1c/0x1000 [snd_soc_avs_hdaudio]
[  136.830902]  do_one_initcall+0xa0/0x2e0
[  136.830939]  ? initcall_blacklisted+0x170/0x170
[  136.830981]  ? __kasan_kmalloc+0x88/0xa0
[  136.831020]  ? kasan_poison+0x3c/0x50
[  136.831059]  ? kasan_unpoison+0x28/0x50
[  136.831100]  ? kasan_poison+0x3c/0x50
[  136.831139]  ? __asan_register_globals+0x5e/0x70
[  136.831187]  do_init_module+0xf6/0x350
[  136.831228]  load_module+0x2bf5/0x2e30
(...)
Takashi Iwai July 11, 2022, 2:12 p.m. UTC | #3
On Mon, 11 Jul 2022 10:25:17 +0200,
Cezary Rojewski wrote:
> 
> On 2022-07-09 6:34 PM, Takashi Iwai wrote:
> > On Wed, 06 Jul 2022 14:02:22 +0200,
> > Cezary Rojewski wrote:
> >> 
> >> snd_hda_codec_cleanup_for_unbind() unsets preset what interferes with
> >> module unloading and triggers null-ptr-deref. Preset is assigned only
> >> once, during device/driver matching whereas module reload and unload
> >> follow completely different path and may occur several times during
> >> runtime.
> > 
> > Hm, the driver reload/unload does unbind.  Keeping this field mean to
> > leave the pointer to the possibly freed object, no?
> > 
> > And if it's not cleared, where is this field cleared instead?
> 
> 
> avs-driver i.e. the bus driver takes responsibility for the codec
> device only. There is no real probe(), just the device creation and
> initialization of its fields. The rest is handled by the component
> driver (sound/soc/hda.c). If this field is cleared and the test is
> limited to reloading HDAudio codec module alone, we get a
> panic. Something similar to the stack found below my message.
> 
> In regard to the other question - are presets freed at all? It seems
> all of them are part of the static device-driver matching list. If so,
> the pointer is always valid.

When the codec driver is unbound and the module is unloaded, the whole
objects and symbols are gone.


> [  136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec]
> [  136.827929] Code: ff 85 c0 0f 88 5b 0b 00 00 4d 8d bc 24 d0 03 00
> 00 4c 89 ff e8 e5 a2 9e ca 49 8b 9c 24 d0 03 00 00 48 8d 7b 10 e8 d4
> a2 9e ca <48> 8b 73 10 4c 89 e7 e8 e8 7d fb ff 85 c0 0f 88 43 0b 00 00
> 4c 89
> [  136.828028] RSP: 0018:ffff888101af74d0 EFLAGS: 00010286
> [  136.828079] RAX: 0000000000000001 RBX: 0000000000000000 RCX:
> ffffffff8b4f1b1a
> [  136.828128] RDX: 0000000000000001 RSI: 0000000000000008 RDI:
> ffffffff8e323d20
> [  136.828175] RBP: ffff888101af7540 R08: 1ffffffff1c647a4 R09:
> fffffbfff1c647a5
> [  136.828224] R10: ffffffff8e323d27 R11: fffffbfff1c647a4 R12:
> ffff888102920000
> [  136.828272] R13: ffff88810812e428 R14: ffff888102925028 R15:
> ffff8881029203d0
> [  136.828323] FS:  00007f9049dd8540(0000) GS:ffff888227100000(0000)
> knlGS:0000000000000000
> [  136.828380] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  136.828425] CR2: 0000000000000010 CR3: 000000010f086001 CR4:
> 00000000003706e0
> [  136.828474] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  136.828520] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [  136.828568] Call Trace:
> [  136.828593]  <TASK>
> [  136.828628]  snd_soc_component_probe+0x3a/0x60 [snd_soc_core]
> [  136.828981]  soc_probe_component+0x276/0x4a0 [snd_soc_core]
> [  136.829274]  snd_soc_bind_card+0x819/0x13d0 [snd_soc_core]
> [  136.829560]  ? __kasan_slab_alloc+0x32/0x90
> [  136.829614]  snd_soc_register_card+0x24e/0x260 [snd_soc_core]
> [  136.829900]  devm_snd_soc_register_card+0x48/0x90 [snd_soc_core]
> [  136.830204]  avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio]
> [  136.830269]  platform_probe+0x67/0x100
> [  136.830313]  really_probe+0x1ff/0x500
> [  136.830354]  __driver_probe_device+0xeb/0x240
> [  136.830397]  driver_probe_device+0x4e/0xf0
> [  136.830438]  __driver_attach+0xfd/0x210
> [  136.830478]  ? __device_attach_driver+0x170/0x170
> [  136.830520]  bus_for_each_dev+0xf9/0x150
> [  136.830557]  ? subsys_dev_iter_exit+0x10/0x10
> [  136.830597]  ? preempt_count_sub+0x18/0xc0
> [  136.830643]  driver_attach+0x2d/0x40
> [  136.830679]  bus_add_driver+0x28e/0x320
> [  136.830722]  driver_register+0xdc/0x170
> [  136.830763]  ? 0xffffffffc0428000
> [  136.830796]  __platform_driver_register+0x39/0x40
> [  136.830842]  avs_hdaudio_driver_init+0x1c/0x1000 [snd_soc_avs_hdaudio]
> [  136.830902]  do_one_initcall+0xa0/0x2e0
> [  136.830939]  ? initcall_blacklisted+0x170/0x170
> [  136.830981]  ? __kasan_kmalloc+0x88/0xa0
> [  136.831020]  ? kasan_poison+0x3c/0x50
> [  136.831059]  ? kasan_unpoison+0x28/0x50
> [  136.831100]  ? kasan_poison+0x3c/0x50
> [  136.831139]  ? __asan_register_globals+0x5e/0x70
> [  136.831187]  do_init_module+0xf6/0x350
> [  136.831228]  load_module+0x2bf5/0x2e30
> (...)

Hmm, in the Oops above, at which moment,
snd_hda_codec_cleanup_for_unbind() is called via which function?
Is it the unload of HD-audio codec driver during the probe of AVS
HD-audio?

The preset is assigned to the given HD-audio device object for the
attached codec driver.  Once after the codec driver gets unbound, you
must not access to this codec driver's methods any longer, hence we
clear the preset field.

So I wonder how the access to the codec->preset happens after the
codec unbind.


thanks,

Takashi
Cezary Rojewski July 12, 2022, 9:42 a.m. UTC | #4
On 2022-07-11 4:12 PM, Takashi Iwai wrote:
> On Mon, 11 Jul 2022 10:25:17 +0200,
> Cezary Rojewski wrote:

...

>> avs-driver i.e. the bus driver takes responsibility for the codec
>> device only. There is no real probe(), just the device creation and
>> initialization of its fields. The rest is handled by the component
>> driver (sound/soc/hda.c). If this field is cleared and the test is
>> limited to reloading HDAudio codec module alone, we get a
>> panic. Something similar to the stack found below my message.
>>
>> In regard to the other question - are presets freed at all? It seems
>> all of them are part of the static device-driver matching list. If so,
>> the pointer is always valid.
> 
> When the codec driver is unbound and the module is unloaded, the whole
> objects and symbols are gone.


hda_codec_driver_remove() won't get even called when soc-card is being 
unbound so everything is still here.

>> [  136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec]

>> [  136.828568] Call Trace:
>> [  136.828593]  <TASK>
>> [  136.828628]  snd_soc_component_probe+0x3a/0x60 [snd_soc_core]
>> [  136.828981]  soc_probe_component+0x276/0x4a0 [snd_soc_core]
>> [  136.829274]  snd_soc_bind_card+0x819/0x13d0 [snd_soc_core]
>> [  136.829560]  ? __kasan_slab_alloc+0x32/0x90
>> [  136.829614]  snd_soc_register_card+0x24e/0x260 [snd_soc_core]
>> [  136.829900]  devm_snd_soc_register_card+0x48/0x90 [snd_soc_core]
>> [  136.830204]  avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio]

>> (...)
> 
> Hmm, in the Oops above, at which moment,
> snd_hda_codec_cleanup_for_unbind() is called via which function?
> Is it the unload of HD-audio codec driver during the probe of AVS
> HD-audio?
> 
> The preset is assigned to the given HD-audio device object for the
> attached codec driver.  Once after the codec driver gets unbound, you
> must not access to this codec driver's methods any longer, hence we
> clear the preset field.
> 
> So I wonder how the access to the codec->preset happens after the
> codec unbind.


Test scenario:
- enumerate avs-driver stack on machine with HDAudio codec present
- rmmod snd_soc_avs_hdaudio // just the machine board driver i.e. 
soc-card driver
- modprobe snd_soc_avs_hdaudio
 >>> panic <<<

snd_hda_codec_cleanup_for_unbind() is called in more places than just 
HDAudio codec driver's probe() and remove(). It's also called whenever 
HDAudio codec soc-component is being removed. Relevant part of the stack 
showing when does the cleanup function get called during rmmod:

[  220.549349]  snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec]
[  220.549536]  ? dump_stack_lvl+0x45/0x49
[  220.549568]  hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec]
[  220.549609]  snd_soc_component_remove+0x34/0x40 [snd_soc_core]
[  220.549942]  soc_remove_component+0x113/0x120 [snd_soc_core]
[  220.550249]  soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core]
[  220.550561]  snd_soc_unbind_card+0x9e/0x190 [snd_soc_core]
[  220.550885]  snd_soc_unregister_card+0x28/0x80 [snd_soc_core]
[  220.551193]  devm_card_release+0x1d/0x20 [snd_soc_core]
[  220.551527]  release_nodes+0x73/0x170
[  220.551549]  ? preempt_count_sub+0x18/0xc0
[  220.551576]  devres_release_all+0x10a/0x150
[  220.551600]  ? devres_remove_group+0x260/0x260
[  220.551630]  device_unbind_cleanup+0x14/0xd0
[  220.551656]  device_release_driver_internal+0x146/0x1d0
[  220.551688]  driver_detach+0x81/0xf0
[  220.551716]  bus_remove_driver+0xae/0x170
[  220.551743]  driver_unregister+0x4d/0x70
[  220.551770]  platform_driver_unregister+0x12/0x20
[  220.551799]  avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]


Regards,
Czarek
Takashi Iwai July 12, 2022, 10:46 a.m. UTC | #5
On Tue, 12 Jul 2022 11:42:56 +0200,
Cezary Rojewski wrote:
> 
> On 2022-07-11 4:12 PM, Takashi Iwai wrote:
> > On Mon, 11 Jul 2022 10:25:17 +0200,
> > Cezary Rojewski wrote:
> 
> ...
> 
> >> avs-driver i.e. the bus driver takes responsibility for the codec
> >> device only. There is no real probe(), just the device creation and
> >> initialization of its fields. The rest is handled by the component
> >> driver (sound/soc/hda.c). If this field is cleared and the test is
> >> limited to reloading HDAudio codec module alone, we get a
> >> panic. Something similar to the stack found below my message.
> >> 
> >> In regard to the other question - are presets freed at all? It seems
> >> all of them are part of the static device-driver matching list. If so,
> >> the pointer is always valid.
> > 
> > When the codec driver is unbound and the module is unloaded, the whole
> > objects and symbols are gone.
> 
> 
> hda_codec_driver_remove() won't get even called when soc-card is being
> unbound so everything is still here.
> 
> >> [  136.827856] RIP: 0010:hda_codec_probe+0x16c/0x560 [snd_soc_hda_codec]
> 
> >> [  136.828568] Call Trace:
> >> [  136.828593]  <TASK>
> >> [  136.828628]  snd_soc_component_probe+0x3a/0x60 [snd_soc_core]
> >> [  136.828981]  soc_probe_component+0x276/0x4a0 [snd_soc_core]
> >> [  136.829274]  snd_soc_bind_card+0x819/0x13d0 [snd_soc_core]
> >> [  136.829560]  ? __kasan_slab_alloc+0x32/0x90
> >> [  136.829614]  snd_soc_register_card+0x24e/0x260 [snd_soc_core]
> >> [  136.829900]  devm_snd_soc_register_card+0x48/0x90 [snd_soc_core]
> >> [  136.830204]  avs_hdaudio_probe+0x298/0x2c0 [snd_soc_avs_hdaudio]
> 
> >> (...)
> > 
> > Hmm, in the Oops above, at which moment,
> > snd_hda_codec_cleanup_for_unbind() is called via which function?
> > Is it the unload of HD-audio codec driver during the probe of AVS
> > HD-audio?
> > 
> > The preset is assigned to the given HD-audio device object for the
> > attached codec driver.  Once after the codec driver gets unbound, you
> > must not access to this codec driver's methods any longer, hence we
> > clear the preset field.
> > 
> > So I wonder how the access to the codec->preset happens after the
> > codec unbind.
> 
> 
> Test scenario:
> - enumerate avs-driver stack on machine with HDAudio codec present
> - rmmod snd_soc_avs_hdaudio // just the machine board driver
> i.e. soc-card driver
> - modprobe snd_soc_avs_hdaudio
> >>> panic <<<
> 
> snd_hda_codec_cleanup_for_unbind() is called in more places than just
> HDAudio codec driver's probe() and remove(). It's also called whenever
> HDAudio codec soc-component is being removed. Relevant part of the
> stack showing when does the cleanup function get called during rmmod:
> 
> [  220.549349]  snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec]
> [  220.549536]  ? dump_stack_lvl+0x45/0x49
> [  220.549568]  hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec]
> [  220.549609]  snd_soc_component_remove+0x34/0x40 [snd_soc_core]
> [  220.549942]  soc_remove_component+0x113/0x120 [snd_soc_core]
> [  220.550249]  soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core]
> [  220.550561]  snd_soc_unbind_card+0x9e/0x190 [snd_soc_core]
> [  220.550885]  snd_soc_unregister_card+0x28/0x80 [snd_soc_core]
> [  220.551193]  devm_card_release+0x1d/0x20 [snd_soc_core]
> [  220.551527]  release_nodes+0x73/0x170
> [  220.551549]  ? preempt_count_sub+0x18/0xc0
> [  220.551576]  devres_release_all+0x10a/0x150
> [  220.551600]  ? devres_remove_group+0x260/0x260
> [  220.551630]  device_unbind_cleanup+0x14/0xd0
> [  220.551656]  device_release_driver_internal+0x146/0x1d0
> [  220.551688]  driver_detach+0x81/0xf0
> [  220.551716]  bus_remove_driver+0xae/0x170
> [  220.551743]  driver_unregister+0x4d/0x70
> [  220.551770]  platform_driver_unregister+0x12/0x20
> [  220.551799]  avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]

So, IMO,  you're scratching a wrong surface.  The problem is rather
that snd_hda_codec_cleanup_for_unbind() is called even if it's not for
unbinding the codec.


Takashi
Cezary Rojewski July 12, 2022, 10:58 a.m. UTC | #6
On 2022-07-12 12:46 PM, Takashi Iwai wrote:
> On Tue, 12 Jul 2022 11:42:56 +0200,
> Cezary Rojewski wrote:

...

>> snd_hda_codec_cleanup_for_unbind() is called in more places than just
>> HDAudio codec driver's probe() and remove(). It's also called whenever
>> HDAudio codec soc-component is being removed. Relevant part of the
>> stack showing when does the cleanup function get called during rmmod:
>>
>> [  220.549349]  snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec]
>> [  220.549536]  ? dump_stack_lvl+0x45/0x49
>> [  220.549568]  hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec]
>> [  220.549609]  snd_soc_component_remove+0x34/0x40 [snd_soc_core]
>> [  220.549942]  soc_remove_component+0x113/0x120 [snd_soc_core]
>> [  220.550249]  soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core]
>> [  220.550561]  snd_soc_unbind_card+0x9e/0x190 [snd_soc_core]
>> [  220.550885]  snd_soc_unregister_card+0x28/0x80 [snd_soc_core]
>> [  220.551193]  devm_card_release+0x1d/0x20 [snd_soc_core]
>> [  220.551527]  release_nodes+0x73/0x170
>> [  220.551549]  ? preempt_count_sub+0x18/0xc0
>> [  220.551576]  devres_release_all+0x10a/0x150
>> [  220.551600]  ? devres_remove_group+0x260/0x260
>> [  220.551630]  device_unbind_cleanup+0x14/0xd0
>> [  220.551656]  device_release_driver_internal+0x146/0x1d0
>> [  220.551688]  driver_detach+0x81/0xf0
>> [  220.551716]  bus_remove_driver+0xae/0x170
>> [  220.551743]  driver_unregister+0x4d/0x70
>> [  220.551770]  platform_driver_unregister+0x12/0x20
>> [  220.551799]  avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]
> 
> So, IMO,  you're scratching a wrong surface.  The problem is rather
> that snd_hda_codec_cleanup_for_unbind() is called even if it's not for
> unbinding the codec.


This is how ASoC HDAudio codec component was behaving for years, see 
sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call 
snd_hda_codec_cleanup_for_unbind() then the teardown procedure will 
probably need a little update. Actually.. I'm afraid the init one would 
need an update to. Given how the implementation of HDAudio codec 
component's probe() and remove() looks like, there is no dropping the 
cleanup function without changing the upward path accordingly.

Well, to be honest the init/free procedures of HDAudio codec are a 
little hairy, perhaps it's time to address this.


Regards,
Czarek
Takashi Iwai July 15, 2022, 2:55 p.m. UTC | #7
On Tue, 12 Jul 2022 12:58:09 +0200,
Cezary Rojewski wrote:
> 
> On 2022-07-12 12:46 PM, Takashi Iwai wrote:
> > On Tue, 12 Jul 2022 11:42:56 +0200,
> > Cezary Rojewski wrote:
> 
> ...
> 
> >> snd_hda_codec_cleanup_for_unbind() is called in more places than just
> >> HDAudio codec driver's probe() and remove(). It's also called whenever
> >> HDAudio codec soc-component is being removed. Relevant part of the
> >> stack showing when does the cleanup function get called during rmmod:
> >> 
> >> [  220.549349]  snd_hda_codec_cleanup_for_unbind+0x25/0x451 [snd_hda_codec]
> >> [  220.549536]  ? dump_stack_lvl+0x45/0x49
> >> [  220.549568]  hda_codec_remove.cold+0x14/0x138 [snd_soc_hda_codec]
> >> [  220.549609]  snd_soc_component_remove+0x34/0x40 [snd_soc_core]
> >> [  220.549942]  soc_remove_component+0x113/0x120 [snd_soc_core]
> >> [  220.550249]  soc_cleanup_card_resources+0x1a7/0x4a0 [snd_soc_core]
> >> [  220.550561]  snd_soc_unbind_card+0x9e/0x190 [snd_soc_core]
> >> [  220.550885]  snd_soc_unregister_card+0x28/0x80 [snd_soc_core]
> >> [  220.551193]  devm_card_release+0x1d/0x20 [snd_soc_core]
> >> [  220.551527]  release_nodes+0x73/0x170
> >> [  220.551549]  ? preempt_count_sub+0x18/0xc0
> >> [  220.551576]  devres_release_all+0x10a/0x150
> >> [  220.551600]  ? devres_remove_group+0x260/0x260
> >> [  220.551630]  device_unbind_cleanup+0x14/0xd0
> >> [  220.551656]  device_release_driver_internal+0x146/0x1d0
> >> [  220.551688]  driver_detach+0x81/0xf0
> >> [  220.551716]  bus_remove_driver+0xae/0x170
> >> [  220.551743]  driver_unregister+0x4d/0x70
> >> [  220.551770]  platform_driver_unregister+0x12/0x20
> >> [  220.551799]  avs_hdaudio_driver_exit+0x10/0x12 [snd_soc_avs_hdaudio]
> > 
> > So, IMO,  you're scratching a wrong surface.  The problem is rather
> > that snd_hda_codec_cleanup_for_unbind() is called even if it's not for
> > unbinding the codec.
> 
> 
> This is how ASoC HDAudio codec component was behaving for years, see
> sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call
> snd_hda_codec_cleanup_for_unbind() then the teardown procedure will
> probably need a little update.

Do we see a similar crash with the hdac-hda stuff, too?

And, after avs_hdaudio_driver_exit() is called, why the codec object
still remains bound with the HD-audio (Realtek or whatever) codec
driver?

> Actually.. I'm afraid the init one
> would need an update to. Given how the implementation of HDAudio codec
> component's probe() and remove() looks like, there is no dropping the
> cleanup function without changing the upward path accordingly.
> 
> Well, to be honest the init/free procedures of HDAudio codec are a
> little hairy, perhaps it's time to address this.

Admittedly, the plumbing work for ASoC HD-audio was somewhat messy,
and it's fine if we can clean things up.

snd_hda_codec_cleanup_for_unbind() is certainly written for unbinding
the codec driver, and if a part of that function code is needed for
different purposes, it should be factored out properly, at least.


thanks,

Takashi
Cezary Rojewski Jan. 17, 2023, 2:45 p.m. UTC | #8
On 2022-07-15 4:55 PM, Takashi Iwai wrote:
> On Tue, 12 Jul 2022 12:58:09 +0200, Cezary Rojewski wrote:

>> This is how ASoC HDAudio codec component was behaving for years, see
>> sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call
>> snd_hda_codec_cleanup_for_unbind() then the teardown procedure will
>> probably need a little update.
> 
> Do we see a similar crash with the hdac-hda stuff, too?
> 
> And, after avs_hdaudio_driver_exit() is called, why the codec object
> still remains bound with the HD-audio (Realtek or whatever) codec
> driver?


Hello Takashi,

Your reply was somehow missed by me and shows as a review for patch 5/9 
in my email-client. Sorry for the delay.

In regard to the hdac_hda.c question, we did test reloading for the 
skylake-driver and there are several places where the driver can cause 
panics, that is, it may not even get to hdac_hda failing - some other 
panic will pop up faster.

But yes, the exact same problem exists there as both implementations 
handle hdev_attach/detach() and component's probe/remove() is similar 
fashion.

>> Actually.. I'm afraid the init one
>> would need an update to. Given how the implementation of HDAudio codec
>> component's probe() and remove() looks like, there is no dropping the
>> cleanup function without changing the upward path accordingly.
>>
>> Well, to be honest the init/free procedures of HDAudio codec are a
>> little hairy, perhaps it's time to address this.
> 
> Admittedly, the plumbing work for ASoC HD-audio was somewhat messy,
> and it's fine if we can clean things up.
> 
> snd_hda_codec_cleanup_for_unbind() is certainly written for unbinding
> the codec driver, and if a part of that function code is needed for
> different purposes, it should be factored out properly, at least.

On ASoC side, component->probe() and component->remove() basically mimic 
the behavior of hda_codec_driver_probe/remove() found in 
sound/pci/hda/hda_bind.c. As ASoC sound card may be unbound without 
codec device being actually removed from the system, relying solely on 
driver's (not component's) probe/remove() may not be an option.

So, the discussion does not circle around just 
snd_hda_codec_cleanup_for_unbind() but basically any function that takes 
part in driver's probe() and remove() routines.

Right now, we are in a situation where user can generate a panic with a 
single rmmod. Also, our tests show no regression with modprobe/rmmod on 
snd_hda_intel side with this patch applied.


Regards,
Czarek
Takashi Iwai Jan. 17, 2023, 2:51 p.m. UTC | #9
On Tue, 17 Jan 2023 15:45:17 +0100,
Cezary Rojewski wrote:
> 
> On 2022-07-15 4:55 PM, Takashi Iwai wrote:
> > On Tue, 12 Jul 2022 12:58:09 +0200, Cezary Rojewski wrote:
> 
> >> This is how ASoC HDAudio codec component was behaving for years, see
> >> sound/soc/codecs/hdac_hda.c. If the intention is _not_ do call
> >> snd_hda_codec_cleanup_for_unbind() then the teardown procedure will
> >> probably need a little update.
> > 
> > Do we see a similar crash with the hdac-hda stuff, too?
> > 
> > And, after avs_hdaudio_driver_exit() is called, why the codec object
> > still remains bound with the HD-audio (Realtek or whatever) codec
> > driver?
> 
> 
> Hello Takashi,
> 
> Your reply was somehow missed by me and shows as a review for patch
> 5/9 in my email-client. Sorry for the delay.
> 
> In regard to the hdac_hda.c question, we did test reloading for the
> skylake-driver and there are several places where the driver can cause
> panics, that is, it may not even get to hdac_hda failing - some other
> panic will pop up faster.
> 
> But yes, the exact same problem exists there as both implementations
> handle hdev_attach/detach() and component's probe/remove() is similar
> fashion.
> 
> >> Actually.. I'm afraid the init one
> >> would need an update to. Given how the implementation of HDAudio codec
> >> component's probe() and remove() looks like, there is no dropping the
> >> cleanup function without changing the upward path accordingly.
> >> 
> >> Well, to be honest the init/free procedures of HDAudio codec are a
> >> little hairy, perhaps it's time to address this.
> > 
> > Admittedly, the plumbing work for ASoC HD-audio was somewhat messy,
> > and it's fine if we can clean things up.
> > 
> > snd_hda_codec_cleanup_for_unbind() is certainly written for unbinding
> > the codec driver, and if a part of that function code is needed for
> > different purposes, it should be factored out properly, at least.
> 
> On ASoC side, component->probe() and component->remove() basically
> mimic the behavior of hda_codec_driver_probe/remove() found in
> sound/pci/hda/hda_bind.c. As ASoC sound card may be unbound without
> codec device being actually removed from the system, relying solely on
> driver's (not component's) probe/remove() may not be an option.
> 
> So, the discussion does not circle around just
> snd_hda_codec_cleanup_for_unbind() but basically any function that
> takes part in driver's probe() and remove() routines.
> 
> Right now, we are in a situation where user can generate a panic with
> a single rmmod. Also, our tests show no regression with modprobe/rmmod
> on snd_hda_intel side with this patch applied.

Let's focus on that bug, then.  Can we restart the thread with the
minimal change?  Otherwise it's hard to review and discuss further.


thanks,

Takashi
diff mbox series

Patch

diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c
index 7579a6982f47..9ceb642ac819 100644
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -795,7 +795,6 @@  void snd_hda_codec_cleanup_for_unbind(struct hda_codec *codec)
 	snd_array_free(&codec->cvt_setups);
 	snd_array_free(&codec->spdif_out);
 	snd_array_free(&codec->verbs);
-	codec->preset = NULL;
 	codec->follower_dig_outs = NULL;
 	codec->spdif_status_reset = 0;
 	snd_array_free(&codec->mixers);