Message ID | 20201201201034.116760-1-wangyanan55@huawei.com (mailing list archive) |
---|---|
Headers | show |
Series | Fix several bugs in KVM stage 2 translation | expand |
On Wed, Dec 02, 2020 at 04:10:31AM +0800, Yanan Wang wrote: > When installing a new pte entry or updating an old valid entry in stage 2 > translation, we use get_page()/put_page() to record page_count of the page-table > pages. PATCH 1/3 aims to fix incorrect use of get_page()/put_page() in stage 2, > which might make page-table pages unable to be freed when unmapping a range. > > When dirty logging of a guest with hugepages is finished, we should merge tables > back into a block entry if adjustment of huge mapping is found necessary. > In addition to installing the block entry, we should not only free the non-huge > page-table pages but also invalidate all the TLB entries of non-huge mappings for > the block. PATCH 2/3 adds enough TLBI when merging tables into a block entry. > > The rewrite of page-table code and fault handling add two different handlers > for "just relaxing permissions" and "map by stage2 page-table walk", that's > good improvement. Yet, in function user_mem_abort(), conditions where we choose > the above two fault handlers are not strictly distinguished. This will causes > guest errors such as infinite-loop (soft lockup will occur in result), because of > calling the inappropriate fault handler. So, a solution that can strictly > distinguish conditions is introduced in PATCH 3/3. For the series: Acked-by: Will Deacon <will@kernel.org> Thanks for reporting these, helping me to understand the issues and then spinning a v2 so promptly. Will
Hi Will, Marc, On 2020/12/2 4:59, Will Deacon wrote: > On Wed, Dec 02, 2020 at 04:10:31AM +0800, Yanan Wang wrote: >> When installing a new pte entry or updating an old valid entry in stage 2 >> translation, we use get_page()/put_page() to record page_count of the page-table >> pages. PATCH 1/3 aims to fix incorrect use of get_page()/put_page() in stage 2, >> which might make page-table pages unable to be freed when unmapping a range. >> >> When dirty logging of a guest with hugepages is finished, we should merge tables >> back into a block entry if adjustment of huge mapping is found necessary. >> In addition to installing the block entry, we should not only free the non-huge >> page-table pages but also invalidate all the TLB entries of non-huge mappings for >> the block. PATCH 2/3 adds enough TLBI when merging tables into a block entry. >> >> The rewrite of page-table code and fault handling add two different handlers >> for "just relaxing permissions" and "map by stage2 page-table walk", that's >> good improvement. Yet, in function user_mem_abort(), conditions where we choose >> the above two fault handlers are not strictly distinguished. This will causes >> guest errors such as infinite-loop (soft lockup will occur in result), because of >> calling the inappropriate fault handler. So, a solution that can strictly >> distinguish conditions is introduced in PATCH 3/3. > For the series: > > Acked-by: Will Deacon <will@kernel.org> > > Thanks for reporting these, helping me to understand the issues and then > spinning a v2 so promptly. > > Will > . Thanks for the help and suggestions. BTW: there are two more things below that I want to talk about. 1. Recently, I have been focusing on the ARMv8.4-TTRem feature which is aimed at changing block size in stage 2 mapping. I have a plan to implement this feature for stage 2 translation when splitting a block into tables or merging tables into a block. This feature supports changing block size without performing *break-before-make*, which might have some improvement on performance. What do you think about this? 2. Given that the issues we discussed before were found in practice when guest state changes from dirty logging to dirty logging canceled. I could add a test file testing on this case to selftests/ or kvm unit tests/, if it's necessary. Yanan
Hi Yanan, [...] > BTW: there are two more things below that I want to talk about. > > 1. Recently, I have been focusing on the ARMv8.4-TTRem feature which > is aimed at changing block size in stage 2 mapping. > > I have a plan to implement this feature for stage 2 translation when > splitting a block into tables or merging tables into a block. > > This feature supports changing block size without performing > *break-before-make*, which might have some improvement on performance. > > What do you think about this? It would be interesting if you can demonstrate some significant performance improvements compared to the same workload with BBM. I'm not completely convinced this would change much, given that it is only when moving from a table to a block mapping that you can elide BBM when the support level is 1 or 2. As far as I can tell, this only happens in the "stop logging" case. Is that something that happens often enough to justify the added complexity? Having to handle TLB Conflict Abort is annoying, for example. > 2. Given that the issues we discussed before were found in practice > when guest state changes from dirty logging to dirty logging canceled. > > I could add a test file testing on this case to selftests/ or kvm unit > tests/, if it's necessary. That would be awesome, and I'd be very grateful if you did. It is the second time we break this exact case, and having a reliable way to verify it would definitely help. Thanks, M.
On Wed, 2 Dec 2020 04:10:31 +0800, Yanan Wang wrote: > When installing a new pte entry or updating an old valid entry in stage 2 > translation, we use get_page()/put_page() to record page_count of the page-table > pages. PATCH 1/3 aims to fix incorrect use of get_page()/put_page() in stage 2, > which might make page-table pages unable to be freed when unmapping a range. > > When dirty logging of a guest with hugepages is finished, we should merge tables > back into a block entry if adjustment of huge mapping is found necessary. > In addition to installing the block entry, we should not only free the non-huge > page-table pages but also invalidate all the TLB entries of non-huge mappings for > the block. PATCH 2/3 adds enough TLBI when merging tables into a block entry. > > [...] Applied to kvm-arm64/fixes-5.10, thanks! [1/3] KVM: arm64: Fix memory leak on stage2 update of a valid PTE commit: 5c646b7e1d8bcb12317426287c516dfa4c5171c2 [2/3] KVM: arm64: Fix handling of merging tables into a block entry commit: 3a0b870e3448302ca2ba703bea1b79b61c3f33c6 [3/3] KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort() commit: 7d894834a305568a0168c55d4729216f5f8cb4e6 Cheers, M.
On 2020/12/2 20:23, Marc Zyngier wrote: > Hi Yanan, > > [...] > >> BTW: there are two more things below that I want to talk about. >> >> 1. Recently, I have been focusing on the ARMv8.4-TTRem feature which >> is aimed at changing block size in stage 2 mapping. >> >> I have a plan to implement this feature for stage 2 translation when >> splitting a block into tables or merging tables into a block. >> >> This feature supports changing block size without performing >> *break-before-make*, which might have some improvement on performance. >> >> What do you think about this? > > It would be interesting if you can demonstrate some significant > performance improvements compared to the same workload with BBM. > > I'm not completely convinced this would change much, given that > it is only when moving from a table to a block mapping that you > can elide BBM when the support level is 1 or 2. As far as I can > tell, this only happens in the "stop logging" case. > > Is that something that happens often enough to justify the added > complexity? Having to handle TLB Conflict Abort is annoying, for > example. I will take more consideration about the necessity and maybe some tests on the performance will be made later. Thanks, Yanan > >> 2. Given that the issues we discussed before were found in practice >> when guest state changes from dirty logging to dirty logging canceled. >> >> I could add a test file testing on this case to selftests/ or kvm unit >> tests/, if it's necessary. > > That would be awesome, and I'd be very grateful if you did. It is the > second time we break this exact case, and having a reliable way to > verify it would definitely help. > > Thanks, > > M.