Message ID | 20210402093249.25137-3-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Cleanup and fixup for hugetlb | expand |
On 4/2/21 2:32 AM, Miaohe Lin wrote: > It's guaranteed that the vma is associated with a resv_map, i.e. either > VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would > have returned via !resv check above. So ret must be less than 0 in the > 'else' case. Simplify the return code to make this clear. I believe we still neeed that ternary operator in the return statement. Why? There are two basic types of mappings to be concerned with: shared and private. For private mappings, a task can 'own' the mapping as indicated by HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way to create a non-owner private mapping is to have a task with a private mapping fork. The parent process will have HPAGE_RESV_OWNER set, the child process will not. The idea is that since the child has a COW copy of the mapping it should not consume reservations made by the parent. Only the parent (HPAGE_RESV_OWNER) is allowed to consume the reservations. Hope that makens sense? > > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > --- > mm/hugetlb.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index a03a50b7c410..b7864abded3d 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, > return 1; > } > else This else also handles the case !HPAGE_RESV_OWNER. In this case, we never want to indicate reservations are available. The ternary makes sure a positive value is never returned.
Hi: On 2021/4/7 8:53, Mike Kravetz wrote: > On 4/2/21 2:32 AM, Miaohe Lin wrote: >> It's guaranteed that the vma is associated with a resv_map, i.e. either >> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >> have returned via !resv check above. So ret must be less than 0 in the >> 'else' case. Simplify the return code to make this clear. > > I believe we still neeed that ternary operator in the return statement. > Why? > > There are two basic types of mappings to be concerned with: > shared and private. > For private mappings, a task can 'own' the mapping as indicated by > HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way > to create a non-owner private mapping is to have a task with a private > mapping fork. The parent process will have HPAGE_RESV_OWNER set, the > child process will not. The idea is that since the child has a COW copy > of the mapping it should not consume reservations made by the parent. The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: /* * Clear hugetlb-related page reserves for children. This only * affects MAP_PRIVATE mappings. Faults generated by the child * are not guaranteed to succeed, even if read-only */ if (is_vm_hugetlb_page(tmp)) reset_vma_resv_huge_pages(tmp); i.e. we have vma->vm_private_data = (void *)0; for child process and vma_resv_map() will return NULL in this case. Or am I missed something? > Only the parent (HPAGE_RESV_OWNER) is allowed to consume the > reservations. > Hope that makens sense? > >> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >> --- >> mm/hugetlb.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> index a03a50b7c410..b7864abded3d 100644 >> --- a/mm/hugetlb.c >> +++ b/mm/hugetlb.c >> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, >> return 1; >> } >> else > > This else also handles the case !HPAGE_RESV_OWNER. In this case, we IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? > never want to indicate reservations are available. The ternary makes > sure a positive value is never returned. > Many thanks for review and reply! :)
On 4/6/21 7:05 PM, Miaohe Lin wrote: > Hi: > On 2021/4/7 8:53, Mike Kravetz wrote: >> On 4/2/21 2:32 AM, Miaohe Lin wrote: >>> It's guaranteed that the vma is associated with a resv_map, i.e. either >>> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >>> have returned via !resv check above. So ret must be less than 0 in the >>> 'else' case. Simplify the return code to make this clear. >> >> I believe we still neeed that ternary operator in the return statement. >> Why? >> >> There are two basic types of mappings to be concerned with: >> shared and private. >> For private mappings, a task can 'own' the mapping as indicated by >> HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way >> to create a non-owner private mapping is to have a task with a private >> mapping fork. The parent process will have HPAGE_RESV_OWNER set, the >> child process will not. The idea is that since the child has a COW copy >> of the mapping it should not consume reservations made by the parent. > > The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: > /* > * Clear hugetlb-related page reserves for children. This only > * affects MAP_PRIVATE mappings. Faults generated by the child > * are not guaranteed to succeed, even if read-only > */ > if (is_vm_hugetlb_page(tmp)) > reset_vma_resv_huge_pages(tmp); > i.e. we have vma->vm_private_data = (void *)0; for child process and vma_resv_map() will > return NULL in this case. > Or am I missed something? > >> Only the parent (HPAGE_RESV_OWNER) is allowed to consume the >> reservations. >> Hope that makens sense? >> >>> >>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>> --- >>> mm/hugetlb.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>> index a03a50b7c410..b7864abded3d 100644 >>> --- a/mm/hugetlb.c >>> +++ b/mm/hugetlb.c >>> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, >>> return 1; >>> } >>> else >> >> This else also handles the case !HPAGE_RESV_OWNER. In this case, we > > IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? > I think you are correct. However, if this is true we should be able to simply the code even further. There is no need to check for HPAGE_RESV_OWNER because we know it must be set. Correct? If so, the code could look something like: if (vma->vm_flags & VM_MAYSHARE) return ret; /* We know private mapping with HPAGE_RESV_OWNER */ * ... * * Add that existing comment */ if (ret > 0) return 0; if (ret == 0) return 1; return ret;
On 2021/4/7 10:37, Mike Kravetz wrote: > On 4/6/21 7:05 PM, Miaohe Lin wrote: >> Hi: >> On 2021/4/7 8:53, Mike Kravetz wrote: >>> On 4/2/21 2:32 AM, Miaohe Lin wrote: >>>> It's guaranteed that the vma is associated with a resv_map, i.e. either >>>> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >>>> have returned via !resv check above. So ret must be less than 0 in the >>>> 'else' case. Simplify the return code to make this clear. >>> >>> I believe we still neeed that ternary operator in the return statement. >>> Why? >>> >>> There are two basic types of mappings to be concerned with: >>> shared and private. >>> For private mappings, a task can 'own' the mapping as indicated by >>> HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way >>> to create a non-owner private mapping is to have a task with a private >>> mapping fork. The parent process will have HPAGE_RESV_OWNER set, the >>> child process will not. The idea is that since the child has a COW copy >>> of the mapping it should not consume reservations made by the parent. >> >> The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: >> /* >> * Clear hugetlb-related page reserves for children. This only >> * affects MAP_PRIVATE mappings. Faults generated by the child >> * are not guaranteed to succeed, even if read-only >> */ >> if (is_vm_hugetlb_page(tmp)) >> reset_vma_resv_huge_pages(tmp); >> i.e. we have vma->vm_private_data = (void *)0; for child process and vma_resv_map() will >> return NULL in this case. >> Or am I missed something? >> >>> Only the parent (HPAGE_RESV_OWNER) is allowed to consume the >>> reservations. >>> Hope that makens sense? >>> >>>> >>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>> --- >>>> mm/hugetlb.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>>> index a03a50b7c410..b7864abded3d 100644 >>>> --- a/mm/hugetlb.c >>>> +++ b/mm/hugetlb.c >>>> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, >>>> return 1; >>>> } >>>> else >>> >>> This else also handles the case !HPAGE_RESV_OWNER. In this case, we >> >> IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? >> > > I think you are correct. > > However, if this is true we should be able to simply the code even > further. There is no need to check for HPAGE_RESV_OWNER because we know > it must be set. Correct? If so, the code could look something like: > > if (vma->vm_flags & VM_MAYSHARE) > return ret; > > /* We know private mapping with HPAGE_RESV_OWNER */ > * ... * > * Add that existing comment */ > > if (ret > 0) > return 0; > if (ret == 0) > return 1; > return ret; > Many thanks for good suggestion! What do you mean is this ? diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a03a50b7c410..9b4c05699a90 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2163,27 +2163,26 @@ static long __vma_reservation_common(struct hstate *h, if (vma->vm_flags & VM_MAYSHARE) return ret; - else if (is_vma_resv_set(vma, HPAGE_RESV_OWNER) && ret >= 0) { - /* - * In most cases, reserves always exist for private mappings. - * However, a file associated with mapping could have been - * hole punched or truncated after reserves were consumed. - * As subsequent fault on such a range will not use reserves. - * Subtle - The reserve map for private mappings has the - * opposite meaning than that of shared mappings. If NO - * entry is in the reserve map, it means a reservation exists. - * If an entry exists in the reserve map, it means the - * reservation has already been consumed. As a result, the - * return value of this routine is the opposite of the - * value returned from reserve map manipulation routines above. - */ - if (ret) - return 0; - else - return 1; - } - else - return ret < 0 ? ret : 0; + /* + * We know private mapping must have HPAGE_RESV_OWNER set. + * + * In most cases, reserves always exist for private mappings. + * However, a file associated with mapping could have been + * hole punched or truncated after reserves were consumed. + * As subsequent fault on such a range will not use reserves. + * Subtle - The reserve map for private mappings has the + * opposite meaning than that of shared mappings. If NO + * entry is in the reserve map, it means a reservation exists. + * If an entry exists in the reserve map, it means the + * reservation has already been consumed. As a result, the + * return value of this routine is the opposite of the + * value returned from reserve map manipulation routines above. + */ + if (ret > 0) + return 0; + if (ret == 0) + return 1; + return ret; }
On 4/6/21 8:09 PM, Miaohe Lin wrote: > On 2021/4/7 10:37, Mike Kravetz wrote: >> On 4/6/21 7:05 PM, Miaohe Lin wrote: >>> Hi: >>> On 2021/4/7 8:53, Mike Kravetz wrote: >>>> On 4/2/21 2:32 AM, Miaohe Lin wrote: >>>>> It's guaranteed that the vma is associated with a resv_map, i.e. either >>>>> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >>>>> have returned via !resv check above. So ret must be less than 0 in the >>>>> 'else' case. Simplify the return code to make this clear. >>>> >>>> I believe we still neeed that ternary operator in the return statement. >>>> Why? >>>> >>>> There are two basic types of mappings to be concerned with: >>>> shared and private. >>>> For private mappings, a task can 'own' the mapping as indicated by >>>> HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way >>>> to create a non-owner private mapping is to have a task with a private >>>> mapping fork. The parent process will have HPAGE_RESV_OWNER set, the >>>> child process will not. The idea is that since the child has a COW copy >>>> of the mapping it should not consume reservations made by the parent. >>> >>> The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: >>> /* >>> * Clear hugetlb-related page reserves for children. This only >>> * affects MAP_PRIVATE mappings. Faults generated by the child >>> * are not guaranteed to succeed, even if read-only >>> */ >>> if (is_vm_hugetlb_page(tmp)) >>> reset_vma_resv_huge_pages(tmp); >>> i.e. we have vma->vm_private_data = (void *)0; for child process and vma_resv_map() will >>> return NULL in this case. >>> Or am I missed something? >>> >>>> Only the parent (HPAGE_RESV_OWNER) is allowed to consume the >>>> reservations. >>>> Hope that makens sense? >>>> >>>>> >>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>> --- >>>>> mm/hugetlb.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>>>> index a03a50b7c410..b7864abded3d 100644 >>>>> --- a/mm/hugetlb.c >>>>> +++ b/mm/hugetlb.c >>>>> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, >>>>> return 1; >>>>> } >>>>> else >>>> >>>> This else also handles the case !HPAGE_RESV_OWNER. In this case, we >>> >>> IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? >>> >> >> I think you are correct. >> >> However, if this is true we should be able to simply the code even >> further. There is no need to check for HPAGE_RESV_OWNER because we know >> it must be set. Correct? If so, the code could look something like: >> >> if (vma->vm_flags & VM_MAYSHARE) >> return ret; >> >> /* We know private mapping with HPAGE_RESV_OWNER */ >> * ... * >> * Add that existing comment */ >> >> if (ret > 0) >> return 0; >> if (ret == 0) >> return 1; >> return ret; >> > > Many thanks for good suggestion! What do you mean is this ? I think the below changes would work fine. However, this patch/discussion has made me ask the question. Do we need the HPAGE_RESV_OWNER flag? Is the followng true? !(vm_flags & VM_MAYSHARE) && vma_resv_map() ===> HPAGE_RESV_OWNER !(vm_flags & VM_MAYSHARE) && !vma_resv_map() ===> !HPAGE_RESV_OWNER I am not suggesting we eliminate the flag and make corresponding changes. Just curious if you believe we 'could' remove the flag and depend on the above conditions. One reason for NOT removing the flag is that that flag itself and supporting code and commnets help explain what happens with hugetlb reserves for COW mappings. That code is hard to understand and the existing code and coments around HPAGE_RESV_OWNER help with understanding.
On 2021/4/8 5:23, Mike Kravetz wrote: > On 4/6/21 8:09 PM, Miaohe Lin wrote: >> On 2021/4/7 10:37, Mike Kravetz wrote: >>> On 4/6/21 7:05 PM, Miaohe Lin wrote: >>>> Hi: >>>> On 2021/4/7 8:53, Mike Kravetz wrote: >>>>> On 4/2/21 2:32 AM, Miaohe Lin wrote: >>>>>> It's guaranteed that the vma is associated with a resv_map, i.e. either >>>>>> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >>>>>> have returned via !resv check above. So ret must be less than 0 in the >>>>>> 'else' case. Simplify the return code to make this clear. >>>>> >>>>> I believe we still neeed that ternary operator in the return statement. >>>>> Why? >>>>> >>>>> There are two basic types of mappings to be concerned with: >>>>> shared and private. >>>>> For private mappings, a task can 'own' the mapping as indicated by >>>>> HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way >>>>> to create a non-owner private mapping is to have a task with a private >>>>> mapping fork. The parent process will have HPAGE_RESV_OWNER set, the >>>>> child process will not. The idea is that since the child has a COW copy >>>>> of the mapping it should not consume reservations made by the parent. >>>> >>>> The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: >>>> /* >>>> * Clear hugetlb-related page reserves for children. This only >>>> * affects MAP_PRIVATE mappings. Faults generated by the child >>>> * are not guaranteed to succeed, even if read-only >>>> */ >>>> if (is_vm_hugetlb_page(tmp)) >>>> reset_vma_resv_huge_pages(tmp); >>>> i.e. we have vma->vm_private_data = (void *)0; for child process and vma_resv_map() will >>>> return NULL in this case. >>>> Or am I missed something? >>>> >>>>> Only the parent (HPAGE_RESV_OWNER) is allowed to consume the >>>>> reservations. >>>>> Hope that makens sense? >>>>> >>>>>> >>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>>> --- >>>>>> mm/hugetlb.c | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>>>>> index a03a50b7c410..b7864abded3d 100644 >>>>>> --- a/mm/hugetlb.c >>>>>> +++ b/mm/hugetlb.c >>>>>> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, >>>>>> return 1; >>>>>> } >>>>>> else >>>>> >>>>> This else also handles the case !HPAGE_RESV_OWNER. In this case, we >>>> >>>> IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? >>>> >>> >>> I think you are correct. >>> >>> However, if this is true we should be able to simply the code even >>> further. There is no need to check for HPAGE_RESV_OWNER because we know >>> it must be set. Correct? If so, the code could look something like: >>> >>> if (vma->vm_flags & VM_MAYSHARE) >>> return ret; >>> >>> /* We know private mapping with HPAGE_RESV_OWNER */ >>> * ... * >>> * Add that existing comment */ >>> >>> if (ret > 0) >>> return 0; >>> if (ret == 0) >>> return 1; >>> return ret; >>> >> >> Many thanks for good suggestion! What do you mean is this ? > > I think the below changes would work fine. > > However, this patch/discussion has made me ask the question. Do we need > the HPAGE_RESV_OWNER flag? Is the followng true? > !(vm_flags & VM_MAYSHARE) && vma_resv_map() ===> HPAGE_RESV_OWNER > !(vm_flags & VM_MAYSHARE) && !vma_resv_map() ===> !HPAGE_RESV_OWNER > I agree with you. HPAGE_RESV_OWNER is set in hugetlb_reserve_pages() and there's no way to clear it in the owner process. The child process can not inherit both HPAGE_RESV_OWNER and resv_map. So for !HPAGE_RESV_OWNER vma, it knows nothing about resv_map. IMO, in !(vm_flags & VM_MAYSHARE) case, we must have: !!vma_resv_map() == !!HPAGE_RESV_OWNER > I am not suggesting we eliminate the flag and make corresponding > changes. Just curious if you believe we 'could' remove the flag and > depend on the above conditions. > > One reason for NOT removing the flag is that that flag itself and > supporting code and commnets help explain what happens with hugetlb > reserves for COW mappings. That code is hard to understand and the > existing code and coments around HPAGE_RESV_OWNER help with > understanding. Agree. These codes took me several days to understand... > Thanks.
On 4/7/21 7:44 PM, Miaohe Lin wrote: > On 2021/4/8 5:23, Mike Kravetz wrote: >> On 4/6/21 8:09 PM, Miaohe Lin wrote: >>> On 2021/4/7 10:37, Mike Kravetz wrote: >>>> On 4/6/21 7:05 PM, Miaohe Lin wrote: >>>>> Hi: >>>>> On 2021/4/7 8:53, Mike Kravetz wrote: >>>>>> On 4/2/21 2:32 AM, Miaohe Lin wrote: >>>>>>> It's guaranteed that the vma is associated with a resv_map, i.e. either >>>>>>> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >>>>>>> have returned via !resv check above. So ret must be less than 0 in the >>>>>>> 'else' case. Simplify the return code to make this clear. >>>>>> >>>>>> I believe we still neeed that ternary operator in the return statement. >>>>>> Why? >>>>>> >>>>>> There are two basic types of mappings to be concerned with: >>>>>> shared and private. >>>>>> For private mappings, a task can 'own' the mapping as indicated by >>>>>> HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way >>>>>> to create a non-owner private mapping is to have a task with a private >>>>>> mapping fork. The parent process will have HPAGE_RESV_OWNER set, the >>>>>> child process will not. The idea is that since the child has a COW copy >>>>>> of the mapping it should not consume reservations made by the parent. >>>>> >>>>> The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: >>>>> /* >>>>> * Clear hugetlb-related page reserves for children. This only >>>>> * affects MAP_PRIVATE mappings. Faults generated by the child >>>>> * are not guaranteed to succeed, even if read-only >>>>> */ >>>>> if (is_vm_hugetlb_page(tmp)) >>>>> reset_vma_resv_huge_pages(tmp); >>>>> i.e. we have vma->vm_private_data = (void *)0; for child process and vma_resv_map() will >>>>> return NULL in this case. >>>>> Or am I missed something? >>>>> >>>>>> Only the parent (HPAGE_RESV_OWNER) is allowed to consume the >>>>>> reservations. >>>>>> Hope that makens sense? >>>>>> >>>>>>> >>>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>>>> --- >>>>>>> mm/hugetlb.c | 2 +- >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>>>>>> index a03a50b7c410..b7864abded3d 100644 >>>>>>> --- a/mm/hugetlb.c >>>>>>> +++ b/mm/hugetlb.c >>>>>>> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, >>>>>>> return 1; >>>>>>> } >>>>>>> else >>>>>> >>>>>> This else also handles the case !HPAGE_RESV_OWNER. In this case, we >>>>> >>>>> IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? >>>>> >>>> >>>> I think you are correct. >>>> >>>> However, if this is true we should be able to simply the code even >>>> further. There is no need to check for HPAGE_RESV_OWNER because we know >>>> it must be set. Correct? If so, the code could look something like: >>>> >>>> if (vma->vm_flags & VM_MAYSHARE) >>>> return ret; >>>> >>>> /* We know private mapping with HPAGE_RESV_OWNER */ >>>> * ... * >>>> * Add that existing comment */ >>>> >>>> if (ret > 0) >>>> return 0; >>>> if (ret == 0) >>>> return 1; >>>> return ret; >>>> >>> >>> Many thanks for good suggestion! What do you mean is this ? >> >> I think the below changes would work fine. >> >> However, this patch/discussion has made me ask the question. Do we need >> the HPAGE_RESV_OWNER flag? Is the followng true? >> !(vm_flags & VM_MAYSHARE) && vma_resv_map() ===> HPAGE_RESV_OWNER >> !(vm_flags & VM_MAYSHARE) && !vma_resv_map() ===> !HPAGE_RESV_OWNER >> > > I agree with you. > > HPAGE_RESV_OWNER is set in hugetlb_reserve_pages() and there's no way to clear it > in the owner process. The child process can not inherit both HPAGE_RESV_OWNER and > resv_map. So for !HPAGE_RESV_OWNER vma, it knows nothing about resv_map. > > IMO, in !(vm_flags & VM_MAYSHARE) case, we must have: > !!vma_resv_map() == !!HPAGE_RESV_OWNER > >> I am not suggesting we eliminate the flag and make corresponding >> changes. Just curious if you believe we 'could' remove the flag and >> depend on the above conditions. >> >> One reason for NOT removing the flag is that that flag itself and >> supporting code and commnets help explain what happens with hugetlb >> reserves for COW mappings. That code is hard to understand and the >> existing code and coments around HPAGE_RESV_OWNER help with >> understanding. > > Agree. These codes took me several days to understand... > Please prepare v2 with the changes to remove the HPAGE_RESV_OWNER check and move the large comment. I would prefer to leave other places that mention HPAGE_RESV_OWNER unchanged. Thanks,
On 2021/4/9 6:40, Mike Kravetz wrote: > On 4/7/21 7:44 PM, Miaohe Lin wrote: >> On 2021/4/8 5:23, Mike Kravetz wrote: >>> On 4/6/21 8:09 PM, Miaohe Lin wrote: >>>> On 2021/4/7 10:37, Mike Kravetz wrote: >>>>> On 4/6/21 7:05 PM, Miaohe Lin wrote: >>>>>> Hi: >>>>>> On 2021/4/7 8:53, Mike Kravetz wrote: >>>>>>> On 4/2/21 2:32 AM, Miaohe Lin wrote: >>>>>>>> It's guaranteed that the vma is associated with a resv_map, i.e. either >>>>>>>> VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would >>>>>>>> have returned via !resv check above. So ret must be less than 0 in the >>>>>>>> 'else' case. Simplify the return code to make this clear. >>>>>>> >>>>>>> I believe we still neeed that ternary operator in the return statement. >>>>>>> Why? >>>>>>> >>>>>>> There are two basic types of mappings to be concerned with: >>>>>>> shared and private. >>>>>>> For private mappings, a task can 'own' the mapping as indicated by >>>>>>> HPAGE_RESV_OWNER. Or, it may not own the mapping. The most common way >>>>>>> to create a non-owner private mapping is to have a task with a private >>>>>>> mapping fork. The parent process will have HPAGE_RESV_OWNER set, the >>>>>>> child process will not. The idea is that since the child has a COW copy >>>>>>> of the mapping it should not consume reservations made by the parent. >>>>>> >>>>>> The child process will not have HPAGE_RESV_OWNER set because at fork time, we do: >>>>>> /* >>>>>> * Clear hugetlb-related page reserves for children. This only >>>>>> * affects MAP_PRIVATE mappings. Faults generated by the child >>>>>> * are not guaranteed to succeed, even if read-only >>>>>> */ >>>>>> if (is_vm_hugetlb_page(tmp)) >>>>>> reset_vma_resv_huge_pages(tmp); >>>>>> i.e. we have vma->vm_private_data = (void *)0; for child process and vma_resv_map() will >>>>>> return NULL in this case. >>>>>> Or am I missed something? >>>>>> >>>>>>> Only the parent (HPAGE_RESV_OWNER) is allowed to consume the >>>>>>> reservations. >>>>>>> Hope that makens sense? >>>>>>> >>>>>>>> >>>>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>>>>> --- >>>>>>>> mm/hugetlb.c | 2 +- >>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>> >>>>>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>>>>>>> index a03a50b7c410..b7864abded3d 100644 >>>>>>>> --- a/mm/hugetlb.c >>>>>>>> +++ b/mm/hugetlb.c >>>>>>>> @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, >>>>>>>> return 1; >>>>>>>> } >>>>>>>> else >>>>>>> >>>>>>> This else also handles the case !HPAGE_RESV_OWNER. In this case, we >>>>>> >>>>>> IMO, for the case !HPAGE_RESV_OWNER, we won't reach here. What do you think? >>>>>> >>>>> >>>>> I think you are correct. >>>>> >>>>> However, if this is true we should be able to simply the code even >>>>> further. There is no need to check for HPAGE_RESV_OWNER because we know >>>>> it must be set. Correct? If so, the code could look something like: >>>>> >>>>> if (vma->vm_flags & VM_MAYSHARE) >>>>> return ret; >>>>> >>>>> /* We know private mapping with HPAGE_RESV_OWNER */ >>>>> * ... * >>>>> * Add that existing comment */ >>>>> >>>>> if (ret > 0) >>>>> return 0; >>>>> if (ret == 0) >>>>> return 1; >>>>> return ret; >>>>> >>>> >>>> Many thanks for good suggestion! What do you mean is this ? >>> >>> I think the below changes would work fine. >>> >>> However, this patch/discussion has made me ask the question. Do we need >>> the HPAGE_RESV_OWNER flag? Is the followng true? >>> !(vm_flags & VM_MAYSHARE) && vma_resv_map() ===> HPAGE_RESV_OWNER >>> !(vm_flags & VM_MAYSHARE) && !vma_resv_map() ===> !HPAGE_RESV_OWNER >>> >> >> I agree with you. >> >> HPAGE_RESV_OWNER is set in hugetlb_reserve_pages() and there's no way to clear it >> in the owner process. The child process can not inherit both HPAGE_RESV_OWNER and >> resv_map. So for !HPAGE_RESV_OWNER vma, it knows nothing about resv_map. >> >> IMO, in !(vm_flags & VM_MAYSHARE) case, we must have: >> !!vma_resv_map() == !!HPAGE_RESV_OWNER >> >>> I am not suggesting we eliminate the flag and make corresponding >>> changes. Just curious if you believe we 'could' remove the flag and >>> depend on the above conditions. >>> >>> One reason for NOT removing the flag is that that flag itself and >>> supporting code and commnets help explain what happens with hugetlb >>> reserves for COW mappings. That code is hard to understand and the >>> existing code and coments around HPAGE_RESV_OWNER help with >>> understanding. >> >> Agree. These codes took me several days to understand... >> > > Please prepare v2 with the changes to remove the HPAGE_RESV_OWNER check > and move the large comment. > Sure. Will do. Thanks. > > I would prefer to leave other places that mention HPAGE_RESV_OWNER > unchanged. > > Thanks, >
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a03a50b7c410..b7864abded3d 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2183,7 +2183,7 @@ static long __vma_reservation_common(struct hstate *h, return 1; } else - return ret < 0 ? ret : 0; + return ret; } static long vma_needs_reservation(struct hstate *h,
It's guaranteed that the vma is associated with a resv_map, i.e. either VM_MAYSHARE or HPAGE_RESV_OWNER, when the code reaches here or we would have returned via !resv check above. So ret must be less than 0 in the 'else' case. Simplify the return code to make this clear. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> --- mm/hugetlb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)