Message ID | CAH2r5msbRyGMY2XQifbxB0iU3a3EPp8UcemO8QE5bhq9HPMqBQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [SMB3,PATCHes] parallelizing decryption of large read responses | expand |
Had a minor typo in patch 2 - fixed in attached On Sun, Sep 8, 2019 at 11:31 PM Steve French <smfrench@gmail.com> wrote: > > I am seeing very good performance benefit from offload of decryption > of encrypted SMB3 read responses to a pool of worker threads > (optionally). See attached patches. > > I plan to add another patch to only offload when number of requests in > flight is > 1 (since there is no point offloading and doing a thread > switch if no other responses would overlap in the cifsd thread reading > from the socket). > > -- > Thanks, > > Steve
пн, 9 сент. 2019 г. в 07:21, Steve French <smfrench@gmail.com>: > > Had a minor typo in patch 2 - fixed in attached > > On Sun, Sep 8, 2019 at 11:31 PM Steve French <smfrench@gmail.com> wrote: > > > > I am seeing very good performance benefit from offload of decryption This is great news! > > of encrypted SMB3 read responses to a pool of worker threads > > (optionally). See attached patches. > > > > I plan to add another patch to only offload when number of requests in > > flight is > 1 (since there is no point offloading and doing a thread > > switch if no other responses would overlap in the cifsd thread reading > > from the socket). Good idea. The 2nd path looks good. See feedback from the 1st patch below: + mid = smb2_find_mid(dw->server, dw->buf); + if (mid == NULL) + cifs_dbg(VFS, "mid not found\n"); /* BB FIXME? change to FYI? */ Yes, let's keep it the same - FYI - as the non-offloaded scenario. --------------------- I think there is a race on mid structure between offloaded work items and the demultiplex thread. + mid = smb2_find_mid(dw->server, dw->buf); ^^^ Here we took a reference to the mid... + if (mid == NULL) + cifs_dbg(VFS, "mid not found\n"); /* BB FIXME? change to FYI? */ + else { + mid->decrypted = true; + rc = handle_read_data(dw->server, mid, dw->buf, ^^^ ...and the above call will dequeue the mid from the list. Between this two steps the demultiplex thread might hit reconnect (cifs_reconnect) and fire the mid callback. + dw->server->vals->read_rsp_size, + dw->ppages, dw->npages, dw->len); + } + + dw->server->lstrp = jiffies; The mid callback for async reads will fail the read request and then call DeleteMidQEntry which resets the mid state to MID_FREE and do other things we don't want to be done at this point. So, I think the following needs to be done to avoid it: 1) dequeue the mid inside analog of smb2_find_mid() - let's say smb2_find_mid_dequeue() and call it instead. 2) refactor handle_read_data() into two parts to separate mid handling (part 1) and dequeue'ing (part2) - note, that the latter is being called in non-negative return code cases. 3) call only the part 1 in the offloaded case because the mid has been already dequeue'ed at the step 1. + mid->callback(mid); + + cifs_mid_q_entry_release(mid); + +free_pages: + for (i = dw->npages-1; i >= 0; i--) + put_page(dw->ppages[i]); + + kfree(dw->ppages); + cifs_small_buf_release(dw->buf); + +/* FIXME Do we need the equivalent of this? */ No, we don't need this because the entire packet has been received by this point. +/* discard_data: + cifs_discard_remaining_data(server); */ +} -- Best regards, Pavel Shilovsky
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> We have now a decently large number of new mount options so we need a patch to the manpage. That said, we should also make sure that we try to set reasonable values by default, or even longer term, remove the options with heuristics. (Very very few people read the manpage or ever use any of these mount options so our default should be "as close to optimal as possible" and using a mount option should be the rare exception where our heuristics just went wrong.) On Tue, Sep 10, 2019 at 12:21 AM Steve French <smfrench@gmail.com> wrote: > > Had a minor typo in patch 2 - fixed in attached > > On Sun, Sep 8, 2019 at 11:31 PM Steve French <smfrench@gmail.com> wrote: > > > > I am seeing very good performance benefit from offload of decryption > > of encrypted SMB3 read responses to a pool of worker threads > > (optionally). See attached patches. > > > > I plan to add another patch to only offload when number of requests in > > flight is > 1 (since there is no point offloading and doing a thread > > switch if no other responses would overlap in the cifsd thread reading > > from the socket). > > > > -- > > Thanks, > > > > Steve > > > > -- > Thanks, > > Steve
By the way ... my thoughts on "esize" (as an example) is that for a release or two encryption offload is disabled (but mount option is mentioned) and then when we are very comfortable with its stability (and the heuristic we use to turn it on - currently "at least one other SMB3 requests in flight to the same server" ... we are comfortable with). So we have some other mount options like this that should be used rarely now, and can be noted as such. On Tue, Sep 10, 2019 at 4:19 AM ronnie sahlberg <ronniesahlberg@gmail.com> wrote: > > Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> > > > We have now a decently large number of new mount options so we need a > patch to the manpage. > That said, we should also make sure that we try to set reasonable > values by default, > or even longer term, remove the options with heuristics. > > (Very very few people read the manpage or ever use any of these mount options > so our default should be "as close to optimal as possible" and using a > mount option > should be the rare exception where our heuristics just went wrong.) > > On Tue, Sep 10, 2019 at 12:21 AM Steve French <smfrench@gmail.com> wrote: > > > > Had a minor typo in patch 2 - fixed in attached > > > > On Sun, Sep 8, 2019 at 11:31 PM Steve French <smfrench@gmail.com> wrote: > > > > > > I am seeing very good performance benefit from offload of decryption > > > of encrypted SMB3 read responses to a pool of worker threads > > > (optionally). See attached patches. > > > > > > I plan to add another patch to only offload when number of requests in > > > flight is > 1 (since there is no point offloading and doing a thread > > > switch if no other responses would overlap in the cifsd thread reading > > > from the socket). > > > > > > -- > > > Thanks, > > > > > > Steve > > > > > > > > -- > > Thanks, > > > > Steve
From 03b90249fa35fae25401f5f06f33ce9f36834c86 Mon Sep 17 00:00:00 2001 From: Steve French <stfrench@microsoft.com> Date: Sat, 7 Sep 2019 01:09:49 -0500 Subject: [PATCH 1/2] smb3: allow parallelizing decryption of reads decrypting large reads on encrypted shares can be slow (e.g. adding multiple milliseconds per-read on non-GCM capable servers or when mounting with dialects prior to SMB3.1.1) - allow parallelizing of read decryption by launching worker threads. Testing to Samba on localhost showed 25% improvement. Testing to remote server showed very large improvement when doing more than one 'cp' command was called at one time. Signed-off-by: Steve French <stfrench@microsoft.com> --- fs/cifs/cifsfs.c | 17 ++++++++- fs/cifs/cifsglob.h | 1 + fs/cifs/smb2ops.c | 87 ++++++++++++++++++++++++++++++++++++++++++++-- 3 files changed, 101 insertions(+), 4 deletions(-) diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index de90e665ef11..b0ea332af35c 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -109,6 +109,7 @@ extern mempool_t *cifs_req_poolp; extern mempool_t *cifs_mid_poolp; struct workqueue_struct *cifsiod_wq; +struct workqueue_struct *decrypt_wq; struct workqueue_struct *cifsoplockd_wq; __u32 cifs_lock_secret; @@ -1499,11 +1500,22 @@ init_cifs(void) goto out_clean_proc; } + /* + * BB Consider setting limit!=0 maybe to min(num_of_cores - 1, 3) so we + * don't launch too many worker threads + */ + decrypt_wq = alloc_workqueue("smb3decryptd", + WQ_FREEZABLE|WQ_MEM_RECLAIM, 0); + if (!decrypt_wq) { + rc = -ENOMEM; + goto out_destroy_cifsiod_wq; + } + cifsoplockd_wq = alloc_workqueue("cifsoplockd", WQ_FREEZABLE|WQ_MEM_RECLAIM, 0); if (!cifsoplockd_wq) { rc = -ENOMEM; - goto out_destroy_cifsiod_wq; + goto out_destroy_decrypt_wq; } rc = cifs_fscache_register(); @@ -1569,6 +1581,8 @@ init_cifs(void) cifs_fscache_unregister(); out_destroy_cifsoplockd_wq: destroy_workqueue(cifsoplockd_wq); +out_destroy_decrypt_wq: + destroy_workqueue(decrypt_wq); out_destroy_cifsiod_wq: destroy_workqueue(cifsiod_wq); out_clean_proc: @@ -1595,6 +1609,7 @@ exit_cifs(void) cifs_destroy_inodecache(); cifs_fscache_unregister(); destroy_workqueue(cifsoplockd_wq); + destroy_workqueue(decrypt_wq); destroy_workqueue(cifsiod_wq); cifs_proc_clean(); } diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index 1f53dee211d8..d66106ac031a 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1892,6 +1892,7 @@ void cifs_queue_oplock_break(struct cifsFileInfo *cfile); extern const struct slow_work_ops cifs_oplock_break_ops; extern struct workqueue_struct *cifsiod_wq; +extern struct workqueue_struct *decrypt_wq; extern struct workqueue_struct *cifsoplockd_wq; extern __u32 cifs_lock_secret; diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index 83b02d74d48e..64fe0d93c442 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -4017,8 +4017,62 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid, return length; } +struct smb2_decrypt_work { + struct work_struct decrypt; + struct TCP_Server_Info *server; + struct page **ppages; + char *buf; + unsigned int npages; + unsigned int len; +}; + + +static void smb2_decrypt_offload(struct work_struct *work) +{ + struct smb2_decrypt_work *dw = container_of(work, + struct smb2_decrypt_work, decrypt); + int i, rc; + struct mid_q_entry *mid; + + rc = decrypt_raw_data(dw->server, dw->buf, dw->server->vals->read_rsp_size, + dw->ppages, dw->npages, dw->len); + if (rc) { + cifs_dbg(VFS, "error decrypting rc=%d\n", rc); + goto free_pages; + } + + mid = smb2_find_mid(dw->server, dw->buf); + if (mid == NULL) + cifs_dbg(VFS, "mid not found\n"); /* BB FIXME? change to FYI? */ + else { + mid->decrypted = true; + rc = handle_read_data(dw->server, mid, dw->buf, + dw->server->vals->read_rsp_size, + dw->ppages, dw->npages, dw->len); + } + + dw->server->lstrp = jiffies; + + mid->callback(mid); + + cifs_mid_q_entry_release(mid); + +free_pages: + for (i = dw->npages-1; i >= 0; i--) + put_page(dw->ppages[i]); + + kfree(dw->ppages); + cifs_small_buf_release(dw->buf); + +/* FIXME Do we need the equivalent of this? */ +/* discard_data: + cifs_discard_remaining_data(server); */ +} + + static int -receive_encrypted_read(struct TCP_Server_Info *server, struct mid_q_entry **mid) +receive_encrypted_read(struct TCP_Server_Info *server, struct mid_q_entry **mid, + int *num_mids) { char *buf = server->smallbuf; struct smb2_transform_hdr *tr_hdr = (struct smb2_transform_hdr *)buf; @@ -4028,7 +4082,9 @@ receive_encrypted_read(struct TCP_Server_Info *server, struct mid_q_entry **mid) unsigned int buflen = server->pdu_size; int rc; int i = 0; + struct smb2_decrypt_work *dw; + *num_mids = 1; len = min_t(unsigned int, buflen, server->vals->read_rsp_size + sizeof(struct smb2_transform_hdr)) - HEADER_SIZE(server) + 1; @@ -4064,6 +4120,32 @@ receive_encrypted_read(struct TCP_Server_Info *server, struct mid_q_entry **mid) if (rc) goto free_pages; + /* + * For large reads, offload to different thread for better performance, + * use more cores decrypting which can be expensive + */ + + /* TODO: make the size limit to enable decrypt offload configurable */ + if (server->pdu_size > (512 * 1024)) { + dw = kmalloc(sizeof(struct smb2_decrypt_work), GFP_KERNEL); + if (dw == NULL) + goto non_offloaded_decrypt; + + dw->buf = server->smallbuf; + server->smallbuf = (char *)cifs_small_buf_get(); + + INIT_WORK(&dw->decrypt, smb2_decrypt_offload); + + dw->npages = npages; + dw->server = server; + dw->ppages = pages; + dw->len = len; + queue_work(cifsiod_wq, &dw->decrypt); + *num_mids = 0; /* worker thread takes care of finding mid */ + return -1; + } + +non_offloaded_decrypt: rc = decrypt_raw_data(server, buf, server->vals->read_rsp_size, pages, npages, len); if (rc) @@ -4210,8 +4292,7 @@ smb3_receive_transform(struct TCP_Server_Info *server, /* TODO: add support for compounds containing READ. */ if (pdu_length > CIFSMaxBufSize + MAX_HEADER_SIZE(server)) { - *num_mids = 1; - return receive_encrypted_read(server, &mids[0]); + return receive_encrypted_read(server, &mids[0], num_mids); } return receive_encrypted_standard(server, mids, bufs, num_mids); -- 2.20.1