From patchwork Fri Mar 22 09:35:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liu Shixin
X-Patchwork-Id: 13599846
X-Delivered-To: linux-mm@kvack.org
From: Liu Shixin
To: Jan Kara, Matthew Wilcox, Andrew Morton, Alexander Viro, Christian Brauner
CC: Liu Shixin
Subject: [PATCH v2 0/2] Fix I/O high when memory almost met memcg limit
Date: Fri, 22 Mar 2024 17:35:53 +0800
Message-ID: <20240322093555.226789-1-liushixin2@huawei.com>
X-Mailer: git-send-email 2.25.1
Sender: owner-linux-mm@kvack.org
Precedence: bulk

v1->v2:
1. Replace the variable active_refault with mmap_miss. Now mmap_miss will
   not be decreased if the folio was active prior to eviction.
2. Jan has given me two other patches which aim to let mmap_miss be
   properly increased when the page is not ready. But in my scenario, the
   problem is that the page will be reclaimed immediately. These two
   patches have no logical conflict with Jan's patches[3].

Recently, when installing a package in a docker container that had almost
reached its memory limit, the installer became unresponsive for more than
15 minutes.
During this period, I/O stays high (~1G/s) and affects the whole machine.
I've constructed a use case as follows:

1. create a docker container:

    $ cat test.sh
    #!/bin/bash

    docker rm centos7 --force

    docker create --name centos7 --memory 4G --memory-swap 6G centos:7 /usr/sbin/init
    docker start centos7
    sleep 1

    docker cp ./alloc_page centos7:/
    docker cp ./reproduce.sh centos7:/

    docker exec -it centos7 /bin/bash

2. try to reproduce the problem in the container:

    $ cat reproduce.sh
    #!/bin/bash

    while true; do
        flag=$(ps -ef | grep -v grep | grep alloc_page | wc -l)
        if [ "$flag" -eq 0 ]; then
            /alloc_page &
        fi

        sleep 30

        start_time=$(date +%s)
        yum install -y expect > /dev/null 2>&1

        end_time=$(date +%s)

        elapsed_time=$((end_time - start_time))

        echo "$elapsed_time seconds"
        yum remove -y expect > /dev/null 2>&1
    done

    $ cat alloc_page.c
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define SIZE (1 * 1024 * 1024) /* 1M */

    int main()
    {
        void *addr = NULL;
        int i;

        /* Allocate and touch (1024 * 6 - 50) blocks of 1M each. */
        for (i = 0; i < 1024 * 6 - 50; i++) {
            addr = malloc(SIZE);
            if (!addr)
                return -1;

            memset(addr, 0, SIZE);
        }

        /* Hold the memory while the installer runs. */
        sleep(99999);
        return 0;
    }

We found that this problem is caused by a lot of meaningless read-ahead.
Since the container has almost reached its memory limit, pages are
reclaimed immediately after read-ahead and are then read ahead again
immediately. The program executes slowly and wastes a lot of I/O
resources.

These two patches aim to break the read-ahead loop in the above scenario.

[1] https://lore.kernel.org/linux-mm/c2f4a2fa-3bde-72ce-66f5-db81a373fdbc@huawei.com/T/
[2] https://lore.kernel.org/all/20240201100835.1626685-1-liushixin2@huawei.com/
[3] https://lore.kernel.org/all/20240201173130.frpaqpy7iyzias5j@quack3/

Liu Shixin (2):
  mm/readahead: break read-ahead loop if filemap_add_folio return -ENOMEM
  mm/readahead: increase mmap_miss when folio in workingset

 include/linux/pagemap.h |  2 ++
 mm/filemap.c            |  7 ++++---
 mm/readahead.c          | 15 +++++++++++++--
 3 files changed, 19 insertions(+), 5 deletions(-)
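
One possible way to check whether a container is stuck in the reclaim and
re-read loop described above is to watch how fast its file pages refault.
The sketch below is not part of the series. It assumes cgroup v2 and a
memory.stat that exposes the workingset_refault_file counter; the helper
names stat_field and counter_delta are made up for illustration, and the
container's cgroup directory has to be supplied by the caller because it
depends on the host setup.

```shell
#!/bin/bash
# Sketch only: report how much a memory.stat counter grew over an interval.
# Assumes cgroup v2 and that memory.stat has workingset_refault_file.

# Print the value of counter $2 from the memory.stat-style file $1
# ("name value" per line); print 0 if the counter is absent.
stat_field() {
    awk -v key="$2" '$1 == key { print $2; found = 1 }
                     END { if (!found) print 0 }' "$1"
}

# Print how much counter $1 grew between snapshot files $2 and $3.
counter_delta() {
    echo $(( $(stat_field "$3" "$1") - $(stat_field "$2" "$1") ))
}

# Usage: <script> CGROUP_DIR [SECONDS]; does nothing without arguments.
if [ "$#" -ge 1 ]; then
    before=$(mktemp)
    after=$(mktemp)
    cat "$1/memory.stat" > "$before"
    sleep "${2:-10}"
    cat "$1/memory.stat" > "$after"
    echo "file refaults: $(counter_delta workingset_refault_file "$before" "$after")"
    rm -f "$before" "$after"
fi
```

A refault count that keeps growing quickly while yum makes no progress
would be consistent with pages being evicted and read back in repeatedly;
on its own it does not prove that read-ahead is the cause.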