From patchwork Mon May 8 15:08:55 2017
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 9716379
Subject: Re: Large latency on blk_queue_enter
To: Javier González
CC: Ming Lei, Christoph Hellwig, Dan Williams, Linux Kernel Mailing List,
 Matias Bjørling
From: Jens Axboe
Message-ID: <576f9601-b0de-c636-8195-07e12fe99734@fb.com>
Date: Mon, 8 May 2017 09:08:55 -0600
References: <1656B440-3ECA-4F2B-B95C-418CF0F347E9@lightnvm.io>
 <20170508122738.GC5696@ming.t460p>
 <76E35BA3-FEC9-46D6-B36F-554F464FA9ED@lightnvm.io>
 <661d4b67-cf0c-a703-331b-ce24d75e782d@fb.com>
 <375D00C3-8B76-40FA-BB81-69829270BF5A@lightnvm.io>
In-Reply-To: <375D00C3-8B76-40FA-BB81-69829270BF5A@lightnvm.io>
X-Mailing-List: linux-block@vger.kernel.org

On 05/08/2017 09:02 AM, Javier González wrote:
>> On 8 May 2017, at 16.52, Jens Axboe wrote:
>>
>> On 05/08/2017 08:46 AM, Javier González wrote:
>>>> On 8 May 2017, at 16.23, Jens Axboe wrote:
>>>>
>>>> On 05/08/2017 08:20 AM, Javier González wrote:
>>>>>> On 8 May 2017, at 16.13, Jens Axboe wrote:
>>>>>>
>>>>>> On 05/08/2017 07:44 AM, Javier González wrote:
>>>>>>>> On 8 May 2017, at 14.27, Ming Lei wrote:
>>>>>>>>
>>>>>>>> On Mon, May 08, 2017 at 01:54:58PM +0200, Javier González wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I find an unusual added latency(~20-30ms) on blk_queue_enter when
>>>>>>>>> allocating a request directly from the NVMe driver through
>>>>>>>>> nvme_alloc_request. I could use some help confirming that this is a bug
>>>>>>>>> and not an expected side effect due to something else.
>>>>>>>>>
>>>>>>>>> I can reproduce this latency consistently on LightNVM when mixing I/O
>>>>>>>>> from pblk and I/O sent through an ioctl using liblightnvm, but I don't
>>>>>>>>> see anything on the LightNVM side that could impact the request
>>>>>>>>> allocation.
>>>>>>>>>
>>>>>>>>> When I have a 100% read workload sent from pblk, the max. latency is
>>>>>>>>> constant throughout several runs at ~80us (which is normal for the media
>>>>>>>>> we are using at bs=4k, qd=1). All pblk I/Os reach the nvme_nvm_submit_io
>>>>>>>>> function on lightnvm.c., which uses nvme_alloc_request. When we send a
>>>>>>>>> command from user space through an ioctl, then the max latency goes up
>>>>>>>>> to ~20-30ms. This happens independently from the actual command
>>>>>>>>> (IN/OUT). I tracked down the added latency down to the call
>>>>>>>>> percpu_ref_tryget_live in blk_queue_enter. Seems that the queue
>>>>>>>>> reference counter is not released as it should through blk_queue_exit in
>>>>>>>>> blk_mq_alloc_request. For reference, all ioctl I/Os reach the
>>>>>>>>> nvme_nvm_submit_user_cmd on lightnvm.c
>>>>>>>>>
>>>>>>>>> Do you have any idea about why this might happen? I can dig more into
>>>>>>>>> it, but first I wanted to make sure that I am not missing any obvious
>>>>>>>>> assumption, which would explain the reference counter to be held for a
>>>>>>>>> longer time.
>>>>>>>>
>>>>>>>> You need to check if the .q_usage_counter is working at atomic mode.
>>>>>>>> This counter is initialized as atomic mode, and finally switchs to
>>>>>>>> percpu mode via percpu_ref_switch_to_percpu() in blk_register_queue().
>>>>>>>
>>>>>>> Thanks for commenting Ming.
>>>>>>>
>>>>>>> The .q_usage_counter is not working on atomic mode. The queue is
>>>>>>> initialized normally through blk_register_queue() and the counter is
>>>>>>> switched to percpu mode, as you mentioned. As I understand it, this is
>>>>>>> how it should be, right?
>>>>>>
>>>>>> That is how it should be, yes.
>>>>>> You're not running with any heavy
>>>>>> debugging options, like lockdep or anything like that?
>>>>>
>>>>> No lockdep, KASAN, kmemleak or any of the other usual suspects.
>>>>>
>>>>> What's interesting is that it only happens when one of the I/Os comes
>>>>> from user space through the ioctl. If I have several pblk instances on
>>>>> the same device (which would end up allocating a new request in
>>>>> parallel, potentially on the same core), the latency spike does not
>>>>> trigger.
>>>>>
>>>>> I also tried to bind the read thread and the liblightnvm thread issuing
>>>>> the ioctl to different cores, but it does not help...
>>>>
>>>> How do I reproduce this? Off the top of my head, and looking at the code,
>>>> I have no idea what is going on here.
>>>
>>> Using LightNVM and liblightnvm [1] you can reproduce it by:
>>>
>>> 1. Instantiate a pblk instance on the first channel (luns 0 - 7):
>>>    sudo nvme lnvm create -d nvme0n1 -n test0 -t pblk -b 0 -e 7 -f
>>> 2. Write 5GB to the test0 block device with a normal fio script
>>> 3. Read 5GB to verify that latencies are good (max. ~80-90us at bs=4k, qd=1)
>>> 4. Re-run 3. and in parallel send a command through liblightnvm to a
>>>    different channel. A simple command is an erase (erase block 900 on
>>>    channel 2, lun 0):
>>>    sudo nvm_vblk line_erase /dev/nvme0n1 2 2 0 0 900
>>>
>>> After 4. you should see a ~25-30ms latency on the read workload.
>>>
>>> I tried to reproduce the ioctl in a more generic way to reach
>>> __nvme_submit_user_cmd(), but SPDK steals the whole device. Also, qemu
>>> is not reliable for this kind of performance testing.
>>>
>>> If you have a suggestion on how I can mix an ioctl with normal block I/O
>>> read on a standard NVMe device, I'm happy to try it and see if I can
>>> reproduce the issue.
>>
>> Just to rule out this being any hardware related delays in processing
>> IO:
>>
>> 1) Does it reproduce with a simpler command, anything close to a no-op
>> that you can test?
>
> Yes.
> I tried with a 4KB read and with a fake command I drop right after
> allocation.
>
>> 2) What did you use to time the stall being blk_queue_enter()?
>>
>
> I have some debug code measuring time with ktime_get() in different
> places in the stack, and among other places, around blk_queue_enter(). I
> use them then to measure max latency and expose it through sysfs. I can
> see that the latency peak is recorded in the probe before
> blk_queue_enter() and not in the one after.
>
> I also did an experiment, where the normal I/O path allocates the
> request with BLK_MQ_REQ_NOWAIT. When running the experiment above, the
> read test fails since we reach:
>	if (nowait)
>		return -EBUSY;
>
> in blk_queue_enter.

OK, that's starting to make more sense, that indicates that there is
indeed something wrong with the refs. Does the below help?

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5d4ce7eb8dbf..df5ee82d28f8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -292,10 +292,11 @@ struct request *blk_mq_alloc_request(struct request_queue *q, int rw,
 	rq = blk_mq_sched_get_request(q, NULL, rw, &alloc_data);
 
 	blk_mq_put_ctx(alloc_data.ctx);
-	blk_queue_exit(q);
 
-	if (!rq)
+	if (!rq) {
+		blk_queue_exit(q);
 		return ERR_PTR(-EWOULDBLOCK);
+	}
 
 	rq->__data_len = 0;
 	rq->__sector = (sector_t) -1;
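
For readers following the thread, here is a rough user-space toy model of
why the BLK_MQ_REQ_NOWAIT experiment points at the queue reference. It is
not kernel code and is not taken from the thread: every toy_* name is
hypothetical, the counter is a plain integer rather than a percpu_ref, and
the "would sleep" return value only stands in for the wait that the real
blk_queue_enter() performs. The assumption being illustrated is that a
tryget on the usage counter fails while the counter is not live (for
example during a freeze), so a nowait caller gets -EBUSY and a normal
caller stalls until the outstanding references are dropped.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a request queue. Hypothetical, NOT struct request_queue. */
struct toy_queue {
	long usage;   /* models q_usage_counter */
	bool frozen;  /* models the counter no longer being "live" */
};

/* Hypothetical marker: the real caller would sleep at this point. */
#define TOY_WOULD_SLEEP 1

/* Models the idea of percpu_ref_tryget_live(): only succeeds while live. */
static inline bool toy_tryget_live(struct toy_queue *q)
{
	if (q->frozen)
		return false;
	q->usage++;
	return true;
}

/* Models the shape of blk_queue_enter(): fail fast with nowait, else wait. */
static inline int toy_queue_enter(struct toy_queue *q, bool nowait)
{
	if (toy_tryget_live(q))
		return 0;
	if (nowait)
		return -EBUSY;
	return TOY_WOULD_SLEEP;
}

/* Models blk_queue_exit(): drop one reference. */
static inline void toy_queue_exit(struct toy_queue *q)
{
	q->usage--;
}

/* Start a freeze: from now on new entries are refused. */
static inline void toy_freeze_start(struct toy_queue *q)
{
	q->frozen = true;
}

/* A freeze can only finish once every holder has dropped its reference. */
static inline bool toy_freeze_done(const struct toy_queue *q)
{
	return q->frozen && q->usage == 0;
}
```

In this model, one reference that is held longer than expected keeps
toy_freeze_done() false, and every toy_queue_enter() in that window either
returns -EBUSY (nowait) or would sleep, which matches the symptom reported
above. Whether that is what actually happens in the kernel here is exactly
what the patch is meant to test.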