diff mbox series

mm/memory.c: avoid repeated set_page_dirty in fault_dirty_shared_page

Message ID 1576164078-28402-1-git-send-email-lixinhai.lxh@gmail.com (mailing list archive)
State New, archived
Headers show
Series mm/memory.c: avoid repeated set_page_dirty in fault_dirty_shared_page | expand

Commit Message

Li Xinhai Dec. 12, 2019, 3:21 p.m. UTC
When vm_ops->page_mkwrite is defined, and called from wp_page_shared and
do_shared_fault, the set_page_dirty must already called by page_mkwrite.
Then in fault_dirty_shared_page, avoid this repeated call.

Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
Cc: Jan Kara <jack@suse.cz>
---
 mm/memory.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

David Hildenbrand Dec. 12, 2019, 3:35 p.m. UTC | #1
On 12.12.19 16:21, Li Xinhai wrote:
> When vm_ops->page_mkwrite is defined, and called from wp_page_shared and
> do_shared_fault, the set_page_dirty must already called by page_mkwrite.
> Then in fault_dirty_shared_page, avoid this repeated call.
> 
> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> Cc: Jan Kara <jack@suse.cz>
> ---
>  mm/memory.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 606da18..34a83d7 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2300,10 +2300,11 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
>  	struct vm_area_struct *vma = vmf->vma;
>  	struct address_space *mapping;
>  	struct page *page = vmf->page;
> -	bool dirtied;
> +	bool dirtied = false;
>  	bool page_mkwrite = vma->vm_ops && vma->vm_ops->page_mkwrite;
>  
> -	dirtied = set_page_dirty(page);
> +	if(!page_mkwrite)
> +		dirtied = set_page_dirty(page);
>  	VM_BUG_ON_PAGE(PageAnon(page), page);
>  	/*
>  	 * Take a local copy of the address_space - page.mapping may be zeroed
> @@ -3645,7 +3646,7 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
>  	 * Check if the backing address space wants to know that the page is
>  	 * about to become writable
>  	 */
> -	if (vma->vm_ops->page_mkwrite) {
> +	if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
>  		unlock_page(vmf->page);
>  		tmp = do_page_mkwrite(vmf);
>  		if (unlikely(!tmp ||
> 

This hunk looks like an unrelated change to me.
Kirill A. Shutemov Dec. 12, 2019, 4:55 p.m. UTC | #2
On Thu, Dec 12, 2019 at 11:21:18PM +0800, Li Xinhai wrote:
> When vm_ops->page_mkwrite is defined, and called from wp_page_shared and
> do_shared_fault, the set_page_dirty must already called by page_mkwrite.

Must? Do all ->page_mkwrite implementation do this?

> @@ -3645,7 +3646,7 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
>  	 * Check if the backing address space wants to know that the page is
>  	 * about to become writable
>  	 */
> -	if (vma->vm_ops->page_mkwrite) {
> +	if (vma->vm_ops && vma->vm_ops->page_mkwrite) {

vma->vm_ops is always non-NULL here.
Li Xinhai Dec. 13, 2019, 7:28 a.m. UTC | #3
On 2019-12-13 at 00:55 Kirill A. Shutemov wrote:
>On Thu, Dec 12, 2019 at 11:21:18PM +0800, Li Xinhai wrote:
>> When vm_ops->page_mkwrite is defined, and called from wp_page_shared and
>> do_shared_fault, the set_page_dirty must already called by page_mkwrite.
>
>Must? Do all ->page_mkwrite implementation do this?

My understanding is that set_page_dirty need be called before PTE is set to allow
writing. If not in this sequence, other thread will see a writable PTE and dirty the page
before current thread set_page_dirty. 
In ->page_mkwrite, FS can decide if set_page_dirty should be called or not. I checked
a few FS, ext4/xfs/btrsfs/ceph and generic filemap_page_mkwrite, they called it. 
If FS provide ->page_mkwrite and decide don't call set_page_dirty, why
fault_dirty_shared_page call this function unconditionally? or, I missed something?

In case no ->page_mkwrite provided, call set_page_dirty looks reasonable for default
action.

>> @@ -3645,7 +3646,7 @@ static vm_fault_t do_shared_fault(struct vm_fault *vmf)
>>  * Check if the backing address space wants to know that the page is
>>  * about to become writable
>>  */
>> -	if (vma->vm_ops->page_mkwrite) {
>> +	if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
>
>vma->vm_ops is always non-NULL here.
yes, thanks point out.

>
>--
> Kirill A. Shutemov
Jan Kara Dec. 13, 2019, 5:50 p.m. UTC | #4
On Fri 13-12-19 15:28:32, lixinhai.lxh@gmail.com wrote:
> On 2019-12-13 at 00:55 Kirill A. Shutemov wrote:
> >On Thu, Dec 12, 2019 at 11:21:18PM +0800, Li Xinhai wrote:
> >> When vm_ops->page_mkwrite is defined, and called from wp_page_shared and
> >> do_shared_fault, the set_page_dirty must already called by page_mkwrite.
> >
> >Must? Do all ->page_mkwrite implementation do this?
> 
> My understanding is that set_page_dirty need be called before PTE is set
> to allow writing. If not in this sequence, other thread will see a
> writable PTE and dirty the page before current thread set_page_dirty. 

Yes, filesystems effectively do rely on this.

> In ->page_mkwrite, FS can decide if set_page_dirty should be called or
> not. I checked a few FS, ext4/xfs/btrsfs/ceph and generic
> filemap_page_mkwrite, they called it.  If FS provide ->page_mkwrite and
> decide don't call set_page_dirty, why fault_dirty_shared_page call this
> function unconditionally? or, I missed something?

Well, generally the responsibility for dirtying the page has been on the
generic MM code (i.e., fault_dirty_shared_page()). Now you're right that
lots of filesystems will end up dirtying the page because they are reusing
generic helpers for handling ->page_mkwrite() and that happens to dirty the
page. But that is mostly a coincidence and not guarantee. So to safely
remove page dirtying from fault_dirty_shared_page(), you'd need to audit
*all* ->page_mkwrite() implementations, make sure they all dirty the page,
and then document this requirement somewhere. Overall I don't think the
effort is really worth it since redirtying already dirty page is very
cheap.

								Honza
diff mbox series

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 606da18..34a83d7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2300,10 +2300,11 @@  static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf)
 	struct vm_area_struct *vma = vmf->vma;
 	struct address_space *mapping;
 	struct page *page = vmf->page;
-	bool dirtied;
+	bool dirtied = false;
 	bool page_mkwrite = vma->vm_ops && vma->vm_ops->page_mkwrite;
 
-	dirtied = set_page_dirty(page);
+	if(!page_mkwrite)
+		dirtied = set_page_dirty(page);
 	VM_BUG_ON_PAGE(PageAnon(page), page);
 	/*
 	 * Take a local copy of the address_space - page.mapping may be zeroed
@@ -3645,7 +3646,7 @@  static vm_fault_t do_shared_fault(struct vm_fault *vmf)
 	 * Check if the backing address space wants to know that the page is
 	 * about to become writable
 	 */
-	if (vma->vm_ops->page_mkwrite) {
+	if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
 		unlock_page(vmf->page);
 		tmp = do_page_mkwrite(vmf);
 		if (unlikely(!tmp ||