Message ID | 20200430151041.31053-1-bstroesser@ts.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] tcmu: wrong input from userspace can confuse tcmu | expand |
On 4/30/20 10:10 AM, Bodo Stroesser wrote: > Hi, > > When tcmu queues a new command - no matter whether in command > ring or in qfull_queue - a cmd_id from IDR udev->commands is > assigned to the command. > > If userspaces sends a wrong command completion containing the > cmd_id of a command on the qfull_queue, tcmu_handle_completions() > finds the command in the IDR and calls tcmu_handle_completion() > for it. This might do some nasty things, because commands in > qfull_queue do not have a valid dbi list. > > Of course this is easy to fix. E.g.: > > > --- a/drivers/target/target_core_user.c 2020-04-30 14:15:36.177184668 +0200 > +++ b/drivers/target/target_core_user.c 2020-04-30 14:29:43.331882066 +0200 > @@ -1242,13 +1242,14 @@ static unsigned int tcmu_handle_completi > } > WARN_ON(tcmu_hdr_get_op(entry->hdr.len_op) != TCMU_OP_CMD); > > - cmd = idr_remove(&udev->commands, entry->hdr.cmd_id); > - if (!cmd) { > + cmd = idr_find(&udev->commands, entry->hdr.cmd_id); > + if (!cmd || !test_bit(TCMU_CMD_BIT_INFLIGHT, &cmd->flags)) { > pr_err("cmd_id %u not found, ring is broken\n", > entry->hdr.cmd_id); > set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags); > break; > } > + idr_remove(&udev->commands, entry->hdr.cmd_id); > > tcmu_handle_completion(cmd, entry); > > > To avoid the second idr_*() call in main data path, it would also > be possible to not replace the idr_remove() by idr_find(), but > in case cmd was found without TCMU_CMD_BIT_INFLIGHT being set, > during error handling re-add the cmd to the idr with the same id > as before by calling idr_alloc(...,cmd_id, crmdd_id+1, ...). > > > OTOH, I'm wondering whether the better solution would be to > remove idr_alloc() from tcmu_setup_cmd_timer() and add it > to queue_cmd_ring() immediately before or after it calls > tcmu_setup_cmd_timer(). > Doing so would mean that commands in qfull_queue no longer have > a cmd_id (which is not necessary AFAICS) and therefore the problem It's done for logging, but its only debug logging so we could just print the pointer value. > cannot happen, because tcmu_handle_completions() cannot find them > in the IDR. > This is probably best if it's not a lot of trouble. If you go this route then in tcmu_reset_ring I think I made a mistake and we just continue the loop if TCMU_CMD_BIT_INFLIGHT is not set. I think we want to do: for cmd in qfull_list target_complete_cmd() for cmd in idr target_complete_cmd()
On 04/30/20 17:59, Mike Christie wrote: > On 4/30/20 10:10 AM, Bodo Stroesser wrote: >> Hi, >> >> When tcmu queues a new command - no matter whether in command >> ring or in qfull_queue - a cmd_id from IDR udev->commands is >> assigned to the command. >> >> If userspaces sends a wrong command completion containing the >> cmd_id of a command on the qfull_queue, tcmu_handle_completions() >> finds the command in the IDR and calls tcmu_handle_completion() >> for it. This might do some nasty things, because commands in >> qfull_queue do not have a valid dbi list. >> >> Of course this is easy to fix. E.g.: >> >> >> --- a/drivers/target/target_core_user.c 2020-04-30 14:15:36.177184668 +0200 >> +++ b/drivers/target/target_core_user.c 2020-04-30 14:29:43.331882066 +0200 >> @@ -1242,13 +1242,14 @@ static unsigned int tcmu_handle_completi >> } >> WARN_ON(tcmu_hdr_get_op(entry->hdr.len_op) != TCMU_OP_CMD); >> >> - cmd = idr_remove(&udev->commands, entry->hdr.cmd_id); >> - if (!cmd) { >> + cmd = idr_find(&udev->commands, entry->hdr.cmd_id); >> + if (!cmd || !test_bit(TCMU_CMD_BIT_INFLIGHT, &cmd->flags)) { >> pr_err("cmd_id %u not found, ring is broken\n", >> entry->hdr.cmd_id); >> set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags); >> break; >> } >> + idr_remove(&udev->commands, entry->hdr.cmd_id); >> >> tcmu_handle_completion(cmd, entry); >> >> >> To avoid the second idr_*() call in main data path, it would also >> be possible to not replace the idr_remove() by idr_find(), but >> in case cmd was found without TCMU_CMD_BIT_INFLIGHT being set, >> during error handling re-add the cmd to the idr with the same id >> as before by calling idr_alloc(...,cmd_id, crmdd_id+1, ...). >> >> >> OTOH, I'm wondering whether the better solution would be to >> remove idr_alloc() from tcmu_setup_cmd_timer() and add it >> to queue_cmd_ring() immediately before or after it calls >> tcmu_setup_cmd_timer(). >> Doing so would mean that commands in qfull_queue no longer have >> a cmd_id (which is not necessary AFAICS) and therefore the problem > > It's done for logging, but its only debug logging so we could just print > the pointer value. > Yes, and then add the pointer value to the pr_debug that prints the assigned cmd_id after idr_alloc() >> cannot happen, because tcmu_handle_completions() cannot find them >> in the IDR. >> > > This is probably best if it's not a lot of trouble. I already spent some time to check the necessary changes. I think there is no trouble, just some small changes. So, yes, the bigger fix is probably better. I'll wait till next week for further comments and then send a patch. > > If you go this route then in tcmu_reset_ring I think I made a mistake I'm not sure, but I don't think there is a mistake. If you cleanup the ring, you don't need to cancel the commands in qfull_queue. > and we just continue the loop if TCMU_CMD_BIT_INFLIGHT is not set. I > think we want to do: > > for cmd in qfull_list > target_complete_cmd() So I think, this is not really necessary, but after clear_bit(TCMU_DEV_BIT_BROKEN, &udev->flags); we could do run_qfull_queue(udev, false); right? > > for cmd in idr > target_complete_cmd() >
On 4/30/20 11:40 AM, Bodo Stroesser wrote: > > > On 04/30/20 17:59, Mike Christie wrote: >> On 4/30/20 10:10 AM, Bodo Stroesser wrote: >>> Hi, >>> >>> When tcmu queues a new command - no matter whether in command >>> ring or in qfull_queue - a cmd_id from IDR udev->commands is >>> assigned to the command. >>> >>> If userspaces sends a wrong command completion containing the >>> cmd_id of a command on the qfull_queue, tcmu_handle_completions() >>> finds the command in the IDR and calls tcmu_handle_completion() >>> for it. This might do some nasty things, because commands in >>> qfull_queue do not have a valid dbi list. >>> >>> Of course this is easy to fix. E.g.: >>> >>> >>> --- a/drivers/target/target_core_user.c 2020-04-30 >>> 14:15:36.177184668 +0200 >>> +++ b/drivers/target/target_core_user.c 2020-04-30 >>> 14:29:43.331882066 +0200 >>> @@ -1242,13 +1242,14 @@ static unsigned int tcmu_handle_completi >>> } >>> WARN_ON(tcmu_hdr_get_op(entry->hdr.len_op) != TCMU_OP_CMD); >>> - cmd = idr_remove(&udev->commands, entry->hdr.cmd_id); >>> - if (!cmd) { >>> + cmd = idr_find(&udev->commands, entry->hdr.cmd_id); >>> + if (!cmd || !test_bit(TCMU_CMD_BIT_INFLIGHT, &cmd->flags)) { >>> pr_err("cmd_id %u not found, ring is broken\n", >>> entry->hdr.cmd_id); >>> set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags); >>> break; >>> } >>> + idr_remove(&udev->commands, entry->hdr.cmd_id); >>> tcmu_handle_completion(cmd, entry); >>> >>> To avoid the second idr_*() call in main data path, it would also >>> be possible to not replace the idr_remove() by idr_find(), but >>> in case cmd was found without TCMU_CMD_BIT_INFLIGHT being set, >>> during error handling re-add the cmd to the idr with the same id >>> as before by calling idr_alloc(...,cmd_id, crmdd_id+1, ...). >>> >>> >>> OTOH, I'm wondering whether the better solution would be to >>> remove idr_alloc() from tcmu_setup_cmd_timer() and add it >>> to queue_cmd_ring() immediately before or after it calls >>> tcmu_setup_cmd_timer(). >>> Doing so would mean that commands in qfull_queue no longer have >>> a cmd_id (which is not necessary AFAICS) and therefore the problem >> >> It's done for logging, but its only debug logging so we could just print >> the pointer value. >> > > Yes, and then add the pointer value to the pr_debug that prints the > assigned cmd_id after idr_alloc() > >>> cannot happen, because tcmu_handle_completions() cannot find them >>> in the IDR. >>> >> >> This is probably best if it's not a lot of trouble. > > I already spent some time to check the necessary changes. I think > there is no trouble, just some small changes. > So, yes, the bigger fix is probably better. > > I'll wait till next week for further comments and then send a patch. > >> >> If you go this route then in tcmu_reset_ring I think I made a mistake > > I'm not sure, but I don't think there is a mistake. If you cleanup > the ring, you don't need to cancel the commands in qfull_queue. I was thinking about your userspace restart PGR question in another mail. Between the time userspace stopped and restarted the session->id to I_T nexus mapping could have changed, so id now refers to a different I_T nexus. Code path wise, it would be easier if we restarted the inflight and non-inflight commands the same way. > >> and we just continue the loop if TCMU_CMD_BIT_INFLIGHT is not set. I >> think we want to do: >> >> for cmd in qfull_list >> target_complete_cmd() > So I think, this is not really necessary, but after > clear_bit(TCMU_DEV_BIT_BROKEN, &udev->flags); > > we could do > run_qfull_queue(udev, false); > > right? > >> >> for cmd in idr >> target_complete_cmd() >> >
--- a/drivers/target/target_core_user.c 2020-04-30 14:15:36.177184668 +0200 +++ b/drivers/target/target_core_user.c 2020-04-30 14:29:43.331882066 +0200 @@ -1242,13 +1242,14 @@ static unsigned int tcmu_handle_completi } WARN_ON(tcmu_hdr_get_op(entry->hdr.len_op) != TCMU_OP_CMD); - cmd = idr_remove(&udev->commands, entry->hdr.cmd_id); - if (!cmd) { + cmd = idr_find(&udev->commands, entry->hdr.cmd_id); + if (!cmd || !test_bit(TCMU_CMD_BIT_INFLIGHT, &cmd->flags)) { pr_err("cmd_id %u not found, ring is broken\n", entry->hdr.cmd_id); set_bit(TCMU_DEV_BIT_BROKEN, &udev->flags); break; } + idr_remove(&udev->commands, entry->hdr.cmd_id); tcmu_handle_completion(cmd, entry);