Message ID | 20230731165018.8233-1-mirsad.todorovac@alu.unizg.hr (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v1,1/1] test_firmware: prevent race conditions by a correct implementation of locking | expand |
On Mon, Jul 31, 2023 at 06:50:19PM +0200, Mirsad Todorovac wrote: > NOTE: This patch is tested against 5.4 stable > > NOTE: This is a patch for the 5.4 stable branch, not for the torvalds tree. > > The torvalds tree, and stable tree 5.10, 5.15, 6.1 and 6.4 branches > were fixed in the separate > commit ID 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking") > which was incompatible with 5.4 > The above part is not part of the original commit, you also forgot to mention the upstream commit: [ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ] > Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests") > Cc: Luis R. Rodriguez <mcgrof@kernel.org> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: Russ Weight <russell.h.weight@intel.com> > Cc: Takashi Iwai <tiwai@suse.de> > Cc: Tianfei Zhang <tianfei.zhang@intel.com> > Cc: Shuah Khan <shuah@kernel.org> > Cc: Colin Ian King <colin.i.king@gmail.com> > Cc: Randy Dunlap <rdunlap@infradead.org> > Cc: linux-kselftest@vger.kernel.org > Cc: stable@vger.kernel.org # v5.4 > Suggested-by: Dan Carpenter <error27@gmail.com> Here you can add the above note in brackets: [ explain your changes here from the original commit ] Then, I see two commits upstream on Linus tree which are also fixes but not merged on v5.4, did you want those applied too? Luis
On 7/31/23 19:27, Luis Chamberlain wrote: > On Mon, Jul 31, 2023 at 06:50:19PM +0200, Mirsad Todorovac wrote: >> NOTE: This patch is tested against 5.4 stable >> >> NOTE: This is a patch for the 5.4 stable branch, not for the torvalds tree. >> >> The torvalds tree, and stable tree 5.10, 5.15, 6.1 and 6.4 branches >> were fixed in the separate >> commit ID 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking") >> which was incompatible with 5.4 >> > > The above part is not part of the original commit, you also forgot to > mention the upstream commit: > > [ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ] Will fix. Actually, I wasn't sure if it was required, because this backported patch isn't verbatim equal to commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 . Though they are cousins, addressing the same issue. There is a race to be fixed, despite not all racy functions present in the original commit c92316bf8e948. >> Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests") >> Cc: Luis R. Rodriguez <mcgrof@kernel.org> >> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> >> Cc: Russ Weight <russell.h.weight@intel.com> >> Cc: Takashi Iwai <tiwai@suse.de> >> Cc: Tianfei Zhang <tianfei.zhang@intel.com> >> Cc: Shuah Khan <shuah@kernel.org> >> Cc: Colin Ian King <colin.i.king@gmail.com> >> Cc: Randy Dunlap <rdunlap@infradead.org> >> Cc: linux-kselftest@vger.kernel.org >> Cc: stable@vger.kernel.org # v5.4 >> Suggested-by: Dan Carpenter <error27@gmail.com> > > Here you can add the above note in brackets: > > [ explain your changes here from the original commit ] > > Then, I see two commits upstream on Linus tree which are also fixes > but not merged on v5.4, did you want those applied too? These seem merged in the stable 5.4? commit 75d9e00f65cd2e0f2ce9ceeb395f821976773489 test_firmware: fix a memory leak with reqs buffer commit 94f3bc7e84af2f17dbfbc7afe93991c2a6f2f25e test_firmware: fix the memory leak of the allocated firmware buffer Maybe this commit should be backported instead: test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation [ Upstream commit 7dae593cd226a0bca61201cf85ceb9335cf63682 ] It was also merged into 6.4, 6.1, 5.15 and 5.10 stable, but not on 5.4 I might also check whether the 4.19 and 4.14 are vulnerable to these memory leaks and this race (Yes, they are, so it might be prudent that we backport this fix.) Mirsad > > Luis
On 8/1/23 10:24, Mirsad Todorovac wrote: > On 7/31/23 19:27, Luis Chamberlain wrote: >> On Mon, Jul 31, 2023 at 06:50:19PM +0200, Mirsad Todorovac wrote: >>> NOTE: This patch is tested against 5.4 stable >>> >>> NOTE: This is a patch for the 5.4 stable branch, not for the torvalds tree. >>> >>> The torvalds tree, and stable tree 5.10, 5.15, 6.1 and 6.4 branches >>> were fixed in the separate >>> commit ID 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking") >>> which was incompatible with 5.4 >>> >> >> The above part is not part of the original commit, you also forgot to >> mention the upstream commit: >> >> [ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ] > > Will fix. Actually, I wasn't sure if it was required, because this backported patch > isn't verbatim equal to commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 . > > Though they are cousins, addressing the same issue. > > There is a race to be fixed, despite not all racy functions present in the original commit c92316bf8e948. > >>> Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests") >>> Cc: Luis R. Rodriguez <mcgrof@kernel.org> >>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> >>> Cc: Russ Weight <russell.h.weight@intel.com> >>> Cc: Takashi Iwai <tiwai@suse.de> >>> Cc: Tianfei Zhang <tianfei.zhang@intel.com> >>> Cc: Shuah Khan <shuah@kernel.org> >>> Cc: Colin Ian King <colin.i.king@gmail.com> >>> Cc: Randy Dunlap <rdunlap@infradead.org> >>> Cc: linux-kselftest@vger.kernel.org >>> Cc: stable@vger.kernel.org # v5.4 >>> Suggested-by: Dan Carpenter <error27@gmail.com> >> >> Here you can add the above note in brackets: >> >> [ explain your changes here from the original commit ] >> >> Then, I see two commits upstream on Linus tree which are also fixes >> but not merged on v5.4, did you want those applied too? > > These seem merged in the stable 5.4? > > commit 75d9e00f65cd2e0f2ce9ceeb395f821976773489 test_firmware: fix a memory leak with reqs buffer > commit 94f3bc7e84af2f17dbfbc7afe93991c2a6f2f25e test_firmware: fix the memory leak of the allocated firmware buffer > > Maybe this commit should be backported instead: > > test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation > [ Upstream commit 7dae593cd226a0bca61201cf85ceb9335cf63682 ] > > It was also merged into 6.4, 6.1, 5.15 and 5.10 stable, but not on 5.4 > > I might also check whether the 4.19 and 4.14 are vulnerable to these memory leaks and this race > (Yes, they are, so it might be prudent that we backport this fix.) FYI, just checked, the patch applied w/o modifications to 4.19 and 4.14 LTS stable branches. Mirsad
On 01. 08. 2023. 11:57, Mirsad Todorovac wrote: > On 8/1/23 10:24, Mirsad Todorovac wrote: >> On 7/31/23 19:27, Luis Chamberlain wrote: >>> On Mon, Jul 31, 2023 at 06:50:19PM +0200, Mirsad Todorovac wrote: >>>> NOTE: This patch is tested against 5.4 stable >>>> >>>> NOTE: This is a patch for the 5.4 stable branch, not for the torvalds tree. >>>> >>>> The torvalds tree, and stable tree 5.10, 5.15, 6.1 and 6.4 branches >>>> were fixed in the separate >>>> commit ID 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking") >>>> which was incompatible with 5.4 >>>> >>> >>> The above part is not part of the original commit, you also forgot to >>> mention the upstream commit: >>> >>> [ Upstream commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 ] >> >> Will fix. Actually, I wasn't sure if it was required, because this backported patch >> isn't verbatim equal to commit 4acfe3dfde685a5a9eaec5555351918e2d7266a1 . >> >> Though they are cousins, addressing the same issue. >> >> There is a race to be fixed, despite not all racy functions present in the original commit c92316bf8e948. >> >>>> Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests") >>>> Cc: Luis R. Rodriguez <mcgrof@kernel.org> >>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> >>>> Cc: Russ Weight <russell.h.weight@intel.com> >>>> Cc: Takashi Iwai <tiwai@suse.de> >>>> Cc: Tianfei Zhang <tianfei.zhang@intel.com> >>>> Cc: Shuah Khan <shuah@kernel.org> >>>> Cc: Colin Ian King <colin.i.king@gmail.com> >>>> Cc: Randy Dunlap <rdunlap@infradead.org> >>>> Cc: linux-kselftest@vger.kernel.org >>>> Cc: stable@vger.kernel.org # v5.4 >>>> Suggested-by: Dan Carpenter <error27@gmail.com> >>> >>> Here you can add the above note in brackets: >>> >>> [ explain your changes here from the original commit ] >>> >>> Then, I see two commits upstream on Linus tree which are also fixes >>> but not merged on v5.4, did you want those applied too? >> >> These seem merged in the stable 5.4? >> >> commit 75d9e00f65cd2e0f2ce9ceeb395f821976773489 test_firmware: fix a memory leak with reqs buffer >> commit 94f3bc7e84af2f17dbfbc7afe93991c2a6f2f25e test_firmware: fix the memory leak of the allocated firmware buffer >> >> Maybe this commit should be backported instead: >> >> test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation >> [ Upstream commit 7dae593cd226a0bca61201cf85ceb9335cf63682 ] >> >> It was also merged into 6.4, 6.1, 5.15 and 5.10 stable, but not on 5.4 >> >> I might also check whether the 4.19 and 4.14 are vulnerable to these memory leaks and this race >> (Yes, they are, so it might be prudent that we backport this fix.) > > FYI, just checked, the patch applied w/o modifications to 4.19 and 4.14 LTS stable branches. Hi, Mr. Luis, I tried to guess the best way how to backport these four patches: 48e156023059 test_firmware: fix the memory leak of the allocated firmware buffer 5.4 [ALREADY IN THE TREE] 4.1[49] N/A be37bed754ed test_firmware: fix a memory leak with reqs buffer 5.4 [ALREADY IN THE TREE] 4.19 https://lore.kernel.org/lkml/20230801170746.191505-1-mirsad.todorovac@alu.unizg.hr/ 4.14 https://lore.kernel.org/lkml/20230802053253.667634-1-mirsad.todorovac@alu.unizg.hr/ 4acfe3dfde68 test_firmware: prevent race conditions by a correct implementation of locking 5.4,4.19,4.14 https://lore.kernel.org/lkml/20230803165304.9200-1-mirsad.todorovac@alu.unizg.hr/ 7dae593cd226 test_firmware: return ENOMEM instead of ENOSPC on failed memory allocation 5.4 https://lore.kernel.org/lkml/20230803165304.9200-2-mirsad.todorovac@alu.unizg.hr/ 4.1[49] https://lore.kernel.org/lkml/20230801185324.197544-1-mirsad.todorovac@alu.unizg.hr/ I have tested the 5.4 and 4.19 builds, but 4.14 still won't boot at my hw (black screen, no msgs at all to diagnose). I hope you will manage between the patches that have the same name and version, but address the backport to a different stable LTS branch. They differ by the patch proper, naturally, to state the obvious, or the upstream would apply of course. I don't know the exact formatting procedure for the backports, so I improvised, but I feel that backporting bug fixes is very important, even if they are not security fixes. I found no new weaknesses in the firmware driver after reviewing the code again. The buffer for name can be released twice, though, but kfree(NULL) is permissible: kfree_const(test_fw_config->name); test_fw_config->name = NULL; This about ends this chapter, and I am waiting for a review and an ACK. Kind regards, Mirsad Todorovac
diff --git a/lib/test_firmware.c b/lib/test_firmware.c index 38553944e967..92d7195d5b5b 100644 --- a/lib/test_firmware.c +++ b/lib/test_firmware.c @@ -301,16 +301,26 @@ static ssize_t config_test_show_str(char *dst, return len; } -static int test_dev_config_update_bool(const char *buf, size_t size, - bool *cfg) +static inline int __test_dev_config_update_bool(const char *buf, size_t size, + bool *cfg) { int ret; - mutex_lock(&test_fw_mutex); if (strtobool(buf, cfg) < 0) ret = -EINVAL; else ret = size; + + return ret; +} + +static int test_dev_config_update_bool(const char *buf, size_t size, + bool *cfg) +{ + int ret; + + mutex_lock(&test_fw_mutex); + ret = __test_dev_config_update_bool(buf, size, cfg); mutex_unlock(&test_fw_mutex); return ret; @@ -340,7 +350,7 @@ static ssize_t test_dev_config_show_int(char *buf, int cfg) return snprintf(buf, PAGE_SIZE, "%d\n", val); } -static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) +static inline int __test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) { int ret; long new; @@ -352,14 +362,23 @@ static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) if (new > U8_MAX) return -EINVAL; - mutex_lock(&test_fw_mutex); *(u8 *)cfg = new; - mutex_unlock(&test_fw_mutex); /* Always return full write size even if we didn't consume all */ return size; } +static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) +{ + int ret; + + mutex_lock(&test_fw_mutex); + ret = __test_dev_config_update_u8(buf, size, cfg); + mutex_unlock(&test_fw_mutex); + + return ret; +} + static ssize_t test_dev_config_show_u8(char *buf, u8 cfg) { u8 val; @@ -392,10 +411,10 @@ static ssize_t config_num_requests_store(struct device *dev, mutex_unlock(&test_fw_mutex); goto out; } - mutex_unlock(&test_fw_mutex); - rc = test_dev_config_update_u8(buf, count, - &test_fw_config->num_requests); + rc = __test_dev_config_update_u8(buf, count, + &test_fw_config->num_requests); + mutex_unlock(&test_fw_mutex); out: return rc;
NOTE: This patch is tested against 5.4 stable NOTE: This is a patch for the 5.4 stable branch, not for the torvalds tree. The torvalds tree, and stable tree 5.10, 5.15, 6.1 and 6.4 branches were fixed in the separate commit ID 4acfe3dfde68 ("test_firmware: prevent race conditions by a correct implementation of locking") which was incompatible with 5.4 Dan Carpenter spotted a race condition in a couple of situations like these in the test_firmware driver: static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) { u8 val; int ret; ret = kstrtou8(buf, 10, &val); if (ret) return ret; mutex_lock(&test_fw_mutex); *(u8 *)cfg = val; mutex_unlock(&test_fw_mutex); /* Always return full write size even if we didn't consume all */ return size; } static ssize_t config_num_requests_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { int rc; mutex_lock(&test_fw_mutex); if (test_fw_config->reqs) { pr_err("Must call release_all_firmware prior to changing config\n"); rc = -EINVAL; mutex_unlock(&test_fw_mutex); goto out; } mutex_unlock(&test_fw_mutex); // NOTE: HERE is the race!!! Function can be preempted! // test_fw_config->reqs can change between the release of // the lock about and acquire of the lock in the // test_dev_config_update_u8() rc = test_dev_config_update_u8(buf, count, &test_fw_config->num_requests); out: return rc; } static ssize_t config_read_fw_idx_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { return test_dev_config_update_u8(buf, count, &test_fw_config->read_fw_idx); } The function test_dev_config_update_u8() is called from both the locked and the unlocked context, function config_num_requests_store() and config_read_fw_idx_store() which can both be called asynchronously as they are driver's methods, while test_dev_config_update_u8() and siblings change their argument pointed to by u8 *cfg or similar pointer. To avoid deadlock on test_fw_mutex, the lock is dropped before calling test_dev_config_update_u8() and re-acquired within test_dev_config_update_u8() itself, but alas this creates a race condition. Having two locks wouldn't assure a race-proof mutual exclusion. This situation is best avoided by the introduction of a new, unlocked function __test_dev_config_update_u8() which can be called from the locked context and reducing test_dev_config_update_u8() to: static int test_dev_config_update_u8(const char *buf, size_t size, u8 *cfg) { int ret; mutex_lock(&test_fw_mutex); ret = __test_dev_config_update_u8(buf, size, cfg); mutex_unlock(&test_fw_mutex); return ret; } doing the locking and calling the unlocked primitive, which enables both locked and unlocked versions without duplication of code. Fixes: c92316bf8e948 ("test_firmware: add batched firmware tests") Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Russ Weight <russell.h.weight@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tianfei Zhang <tianfei.zhang@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: linux-kselftest@vger.kernel.org Cc: stable@vger.kernel.org # v5.4 Suggested-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/20230509084746.48259-1-mirsad.todorovac@alu.unizg.hr Signed-off-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> --- lib/test_firmware.c | 37 ++++++++++++++++++++++++++++--------- 1 file changed, 28 insertions(+), 9 deletions(-)