Message ID | 20171205152110.2050975-1-arnd@arndb.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Dec 5, 2017 at 8:19 PM, Kees Cook <keescook@chromium.org> wrote: > On Tue, Dec 5, 2017 at 7:20 AM, Arnd Bergmann <arnd@arndb.de> wrote: >> While reviewing all callers of get_task_comm(), I stumbled >> over this one that claimed it was not exported, when in fact >> it is. Accessing task->comm directly is not safe, so better >> convert this one to using get_task_comm as well. > > Using get_task_comm() in cases like this is actually overkill (i.e. > using up stack space), since there's (currently) no benefit. Nothing > protects getting a "correct" view of task->comm (i.e. it could get > updated in the middle of a copy), but it _is_ always NULL terminated, > so it's safe to use with %s like this. While it does make me slightly > uncomfortable to _depend_ on this NULL termination, but there are lots > of open-coded %s users of task->comm. When we're trying to save a > _copy_ of task->comm, then we want get_task_comm(), just to make sure > we're doing it right. > > So, while I don't oppose this patch, it might be seen as a wasteful > use of stack space. It's only a few bytes of stack space in a leaf function, I'd not be worried about that. More generally speaking though, how exactly do we guarantee that there is NUL-termination on tsk->comm during a concurrent update? Could we ever get into a situation where overwrite the NUL byte while setting tsk->comm to a longer string, and read the new start of the string together with an unterminated end, or do we strictly guarantee that the last byte is still NUL? I assume the latter is true, just haven't found exactly where that guarantee is made. Arnd
On Tue, Dec 5, 2017 at 9:32 PM, Kees Cook <keescook@chromium.org> wrote: > On Tue, Dec 5, 2017 at 12:27 PM, Arnd Bergmann <arnd@arndb.de> wrote: >> >> More generally speaking though, how exactly do we guarantee that >> there is NUL-termination on tsk->comm during a concurrent update? >> Could we ever get into a situation where overwrite the NUL byte >> while setting tsk->comm to a longer string, and read the new start >> of the string together with an unterminated end, or do we strictly >> guarantee that the last byte is still NUL? I assume the latter is >> true, just haven't found exactly where that guarantee is made. > > strncpy will zero pad with the trailing NULL, so it's supposed to > always be safe... still gives me the creeps, though. But set_task_comm uses strlcpy(), not strncpy(), so you might get some of the old data back, the question is just whether it could leak uninitialized data or part of the task_struct up to the next NUL byte. I could not come up with any code path that would leave a non-NUL byte in at the end of task->comm though, so it's probably still safe. Arnd
On Tue, Dec 05, 2017 at 10:44:17PM +0100, Arnd Bergmann wrote: > On Tue, Dec 5, 2017 at 9:32 PM, Kees Cook <keescook@chromium.org> wrote: > > On Tue, Dec 5, 2017 at 12:27 PM, Arnd Bergmann <arnd@arndb.de> wrote: > >> > >> More generally speaking though, how exactly do we guarantee that > >> there is NUL-termination on tsk->comm during a concurrent update? > >> Could we ever get into a situation where overwrite the NUL byte > >> while setting tsk->comm to a longer string, and read the new start > >> of the string together with an unterminated end, or do we strictly > >> guarantee that the last byte is still NUL? I assume the latter is > >> true, just haven't found exactly where that guarantee is made. > > > > strncpy will zero pad with the trailing NULL, so it's supposed to > > always be safe... still gives me the creeps, though. > > But set_task_comm uses strlcpy(), not strncpy(), so you might > get some of the old data back, the question is just whether it could > leak uninitialized data or part of the task_struct up to the next > NUL byte. I could not come up with any code path that would leave > a non-NUL byte in at the end of task->comm though, so it's > probably still safe. So we used to have some magic code set_task_comm() which even included a memory barrier etc.. But since none of the reading sites include a memory barrier its all pointless. There is no guarantee that a tsk->comm user reads the bytes in string order. The only thing that ensures we never run over, is the hard guarantee that ->comm[TSK_COMM_LEN-1] == 0 at all times. If we don't trust str*cpy() to do the right thing here, we could simply open code the thing.
diff --git a/fs/ocfs2/cluster/netdebug.c b/fs/ocfs2/cluster/netdebug.c index 74a21f6695c8..a51d001b89b1 100644 --- a/fs/ocfs2/cluster/netdebug.c +++ b/fs/ocfs2/cluster/netdebug.c @@ -130,6 +130,7 @@ static void *nst_seq_next(struct seq_file *seq, void *v, loff_t *pos) static int nst_seq_show(struct seq_file *seq, void *v) { struct o2net_send_tracking *nst, *dummy_nst = seq->private; + char comm[TASK_COMM_LEN]; ktime_t now; s64 sock, send, status; @@ -142,8 +143,8 @@ static int nst_seq_show(struct seq_file *seq, void *v) sock = ktime_to_us(ktime_sub(now, nst->st_sock_time)); send = ktime_to_us(ktime_sub(now, nst->st_send_time)); status = ktime_to_us(ktime_sub(now, nst->st_status_time)); + get_task_comm(comm, nst->st_task); - /* get_task_comm isn't exported. oh well. */ seq_printf(seq, "%p:\n" " pid: %lu\n" " tgid: %lu\n" @@ -158,7 +159,7 @@ static int nst_seq_show(struct seq_file *seq, void *v) " wait start: %lld usecs ago\n", nst, (unsigned long)task_pid_nr(nst->st_task), (unsigned long)nst->st_task->tgid, - nst->st_task->comm, nst->st_node, + comm, nst->st_node, nst->st_sc, nst->st_id, nst->st_msg_type, nst->st_msg_key, (long long)sock,
While reviewing all callers of get_task_comm(), I stumbled over this one that claimed it was not exported, when in fact it is. Accessing task->comm directly is not safe, so better convert this one to using get_task_comm as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- fs/ocfs2/cluster/netdebug.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)