Message ID: 20240628095955.34096-1-christian.loehle@arm.com
Series: cpuidle: teo: Fixing utilization and intercept logic
On Fri, Jun 28, 2024 at 12:02 PM Christian Loehle <christian.loehle@arm.com> wrote:
>
> Hi all,
> so my investigation into teo led to the following fixes.
>
> 1/3:
> As discussed, the utilization threshold is too high; while
> there are benefits in certain workloads, there are quite a few
> regressions, too. Revert the Util-awareness patch.
> This in itself leads to regressions, but part of it can be offset
> by the later patches.
> See
> https://lore.kernel.org/lkml/CAKfTPtA6ZzRR-zMN7sodOW+N_P+GqwNv4tGR+aMB5VXRT2b5bg@mail.gmail.com/
>
> 2/3:
> Remove the 'recent' intercept logic; see my findings in:
> https://lore.kernel.org/lkml/0ce2d536-1125-4df8-9a5b-0d5e389cd8af@arm.com/
> I haven't found a way to salvage this properly, so I removed it.
> The regular intercept seems to decay fast enough not to need this, but
> we could change that if it turns out we need it to be faster in
> ramp-up and decay.
>
> 3/3:
> The rest of the intercept logic had issues, too.
> See the commit.
>
> Happy for anyone to take a look and test as well.
>
> Some numbers for context, comparing:
> - IO workload (intercept-heavy)
> - Timer workload, very low utilization (check for deepest state)
> - hackbench (high utilization)
> - Geekbench 5 on Pixel6 (high utilization)
>
> Tests 1 to 3 are on RK3399 with CONFIG_HZ=100.
> target_residencies: 1, 900, 2000
>
> 1. IO workload, 5 runs, results sorted, in read IOPS.
> fio --minimal --time_based --name=fiotest --filename=/dev/nvme0n1 --runtime=30 --rw=randread --bs=4k --ioengine=psync --iodepth=1 --direct=1 | cut -d \; -f 8;
>
> teo fixed v2:
> /dev/nvme0n1
> [4599, 4658, 4692, 4694, 4720]
> /dev/mmcblk2
> [5700, 5730, 5735, 5747, 5977]
> /dev/mmcblk1
> [2052, 2054, 2066, 2067, 2073]
>
> teo mainline:
> /dev/nvme0n1
> [3793, 3825, 3846, 3865, 3964]
> /dev/mmcblk2
> [3831, 4110, 4154, 4203, 4228]
> /dev/mmcblk1
> [1559, 1564, 1596, 1611, 1618]
>
> menu:
> /dev/nvme0n1
> [2571, 2630, 2804, 2813, 2917]
> /dev/mmcblk2
> [4181, 4260, 5062, 5260, 5329]
> /dev/mmcblk1
> [1567, 1581, 1585, 1603, 1769]
>
>
> 2. Timer workload (through IO for my convenience
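The sorted runs above can be condensed into per-device medians and the relative change of the fixed teo series over mainline. A quick sketch, using only the numbers quoted above:

```python
import statistics

# Sorted read-IOPS results from the 5 fio runs quoted above.
results = {
    "teo fixed v2": {
        "/dev/nvme0n1": [4599, 4658, 4692, 4694, 4720],
        "/dev/mmcblk2": [5700, 5730, 5735, 5747, 5977],
        "/dev/mmcblk1": [2052, 2054, 2066, 2067, 2073],
    },
    "teo mainline": {
        "/dev/nvme0n1": [3793, 3825, 3846, 3865, 3964],
        "/dev/mmcblk2": [3831, 4110, 4154, 4203, 4228],
        "/dev/mmcblk1": [1559, 1564, 1596, 1611, 1618],
    },
}

# Median of 5 sorted runs is the middle run; compare fixed vs mainline.
for dev in results["teo fixed v2"]:
    fixed = statistics.median(results["teo fixed v2"][dev])
    mainline = statistics.median(results["teo mainline"][dev])
    print(f"{dev}: {fixed} vs {mainline} IOPS "
          f"({100 * (fixed / mainline - 1):+.0f}%)")
```

By median, the fixed series improves read IOPS on all three devices in this workload (roughly +22% on nvme0n1, +38% on mmcblk2, +29% on mmcblk1).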
On 6/28/24 20:06, Rafael J. Wysocki wrote:
> On Fri, Jun 28, 2024 at 12:02 PM Christian Loehle
> <christian.loehle@arm.com> wrote:
>> [...]