diff mbox series

[RFC,1/6] vdso/extable: fix calculation of base

Message ID 20210225072910.2811795-2-namit@vmware.com (mailing list archive)
State New, archived
Headers show
Series x86: prefetch_page() vDSO call | expand

Commit Message

Nadav Amit Feb. 25, 2021, 7:29 a.m. UTC
From: Nadav Amit <namit@vmware.com>

Apparently, the assembly considers __ex_table as the location when the
pushsection directive was issued. Therefore when there is more than a
single entry in the vDSO exception table, the calculations of the base
and fixup are wrong.

Fix the calculations of the expected fault IP and new IP by adjusting
the base after each entry.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: x86@kernel.org
Signed-off-by: Nadav Amit <namit@vmware.com>
---
 arch/x86/entry/vdso/extable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Sean Christopherson Feb. 25, 2021, 9:16 p.m. UTC | #1
On Wed, Feb 24, 2021, Nadav Amit wrote:
> From: Nadav Amit <namit@vmware.com>
> 
> Apparently, the assembly considers __ex_table as the location when the
> pushsection directive was issued. Therefore when there is more than a
> single entry in the vDSO exception table, the calculations of the base
> and fixup are wrong.
> 
> Fix the calculations of the expected fault IP and new IP by adjusting
> the base after each entry.
> 
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: x86@kernel.org
> Signed-off-by: Nadav Amit <namit@vmware.com>
> ---
>  arch/x86/entry/vdso/extable.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
> index afcf5b65beef..c81e78636220 100644
> --- a/arch/x86/entry/vdso/extable.c
> +++ b/arch/x86/entry/vdso/extable.c
> @@ -32,7 +32,7 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
>  	nr_entries = image->extable_len / (sizeof(*extable));
>  	extable = image->extable;
>  
> -	for (i = 0; i < nr_entries; i++) {
> +	for (i = 0; i < nr_entries; i++, base += sizeof(*extable)) {

It's been literally years since I wrote this code, but I distinctly remember the
addresses being relative to the base.  I also remember testing multiple entries,
but again, that was a long time ago.

Assuming things have changed, or I was flat out wrong, the comment above the
macro magic should also be updated.

/*
 * Inject exception fixup for vDSO code.  Unlike normal exception fixup,
 * vDSO uses a dedicated handler the addresses are relative to the overall
 * exception table, not each individual entry.
 */

>  		if (regs->ip == base + extable[i].insn) {
>  			regs->ip = base + extable[i].fixup;
>  			regs->di = trapnr;
> -- 
> 2.25.1
>
Nadav Amit Feb. 26, 2021, 5:24 p.m. UTC | #2
> On Feb 25, 2021, at 1:16 PM, Sean Christopherson <seanjc@google.com> wrote:
> 
> On Wed, Feb 24, 2021, Nadav Amit wrote:
>> From: Nadav Amit <namit@vmware.com>
>> 
>> Apparently, the assembly considers __ex_table as the location when the
>> pushsection directive was issued. Therefore when there is more than a
>> single entry in the vDSO exception table, the calculations of the base
>> and fixup are wrong.
>> 
>> Fix the calculations of the expected fault IP and new IP by adjusting
>> the base after each entry.
>> 
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Sean Christopherson <seanjc@google.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: x86@kernel.org
>> Signed-off-by: Nadav Amit <namit@vmware.com>
>> ---
>> arch/x86/entry/vdso/extable.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
>> index afcf5b65beef..c81e78636220 100644
>> --- a/arch/x86/entry/vdso/extable.c
>> +++ b/arch/x86/entry/vdso/extable.c
>> @@ -32,7 +32,7 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
>> 	nr_entries = image->extable_len / (sizeof(*extable));
>> 	extable = image->extable;
>> 
>> -	for (i = 0; i < nr_entries; i++) {
>> +	for (i = 0; i < nr_entries; i++, base += sizeof(*extable)) {
> 
> It's been literally years since I wrote this code, but I distinctly remember the
> addresses being relative to the base.  I also remember testing multiple entries,
> but again, that was a long time ago.
> 
> Assuming things have changed, or I was flat out wrong, the comment above the
> macro magic should also be updated.
> 
> /*
> * Inject exception fixup for vDSO code.  Unlike normal exception fixup,
> * vDSO uses a dedicated handler the addresses are relative to the overall
> * exception table, not each individual entry.
> */

I will update the comment. I am not very familiar with pushsection stuff,
but the offsets were wrong.

Since you say you checked it, I wonder whether it can somehow be caused
by having exception table entries defined from multiple object files.

Anyhow, this change follows the kernel’s (not vDSO) exception table
scheme.
Sean Christopherson Feb. 26, 2021, 5:47 p.m. UTC | #3
On Fri, Feb 26, 2021, Nadav Amit wrote:
> 
> > On Feb 25, 2021, at 1:16 PM, Sean Christopherson <seanjc@google.com> wrote:
> > It's been literally years since I wrote this code, but I distinctly remember the
> > addresses being relative to the base.  I also remember testing multiple entries,
> > but again, that was a long time ago.
> > 
> > Assuming things have changed, or I was flat out wrong, the comment above the
> > macro magic should also be updated.
> > 
> > /*
> > * Inject exception fixup for vDSO code.  Unlike normal exception fixup,
> > * vDSO uses a dedicated handler the addresses are relative to the overall
> > * exception table, not each individual entry.
> > */
> 
> I will update the comment. I am not very familiar with pushsection stuff,
> but the offsets were wrong.
> 
> Since you say you checked it, I wonder whether it can somehow be caused
> by having exception table entries defined from multiple object files.

Oooh, I think that would do it.  Have you checked what happens if there are
multiple object files and multiple fixups within an object file?

> Anyhow, this change follows the kernel’s (not vDSO) exception table
> scheme.
Nadav Amit Feb. 28, 2021, 9:20 a.m. UTC | #4
> On Feb 26, 2021, at 9:47 AM, Sean Christopherson <seanjc@google.com> wrote:
> 
> On Fri, Feb 26, 2021, Nadav Amit wrote:
>> 
>>> On Feb 25, 2021, at 1:16 PM, Sean Christopherson <seanjc@google.com> wrote:
>>> It's been literally years since I wrote this code, but I distinctly remember the
>>> addresses being relative to the base.  I also remember testing multiple entries,
>>> but again, that was a long time ago.
>>> 
>>> Assuming things have changed, or I was flat out wrong, the comment above the
>>> macro magic should also be updated.
>>> 
>>> /*
>>> * Inject exception fixup for vDSO code.  Unlike normal exception fixup,
>>> * vDSO uses a dedicated handler the addresses are relative to the overall
>>> * exception table, not each individual entry.
>>> */
>> 
>> I will update the comment. I am not very familiar with pushsection stuff,
>> but the offsets were wrong.
>> 
>> Since you say you checked it, I wonder whether it can somehow be caused
>> by having exception table entries defined from multiple object files.
> 
> Oooh, I think that would do it.  Have you checked what happens if there are
> multiple object files and multiple fixups within an object file?

Good thing that you insisted...

I certainly do not know well enough the assembly section directives,
but indeed it seems (after some experiments) that referring to the
section provides different values from different objects.

So both the current (yours) and this patch (mine) are broken. I think
the easiest thing is to fall back to the kernel exception table scheme.
I checked the following with both entries in the same and different
objects and it seems to work correctly:

-- >8 --

diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
index afcf5b65beef..3f395b782553 100644
--- a/arch/x86/entry/vdso/extable.c
+++ b/arch/x86/entry/vdso/extable.c
@@ -32,9 +32,11 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
 	nr_entries = image->extable_len / (sizeof(*extable));
 	extable = image->extable;

-	for (i = 0; i < nr_entries; i++) {
-		if (regs->ip == base + extable[i].insn) {
-			regs->ip = base + extable[i].fixup;
+	for (i = 0; i < nr_entries; i++, base += sizeof(*extable)) {
+		if (regs->ip == base + extable[i].insn +
+			offsetof(struct vdso_exception_table_entry, insn)) {
+			regs->ip = base + extable[i].fixup +
+				offsetof(struct vdso_exception_table_entry, fixup);
 			regs->di = trapnr;
 			regs->si = error_code;
 			regs->dx = fault_addr;
diff --git a/arch/x86/entry/vdso/extable.h b/arch/x86/entry/vdso/extable.h
index b56f6b012941..4ffe3d533148 100644
--- a/arch/x86/entry/vdso/extable.h
+++ b/arch/x86/entry/vdso/extable.h
@@ -13,8 +13,8 @@

 .macro ASM_VDSO_EXTABLE_HANDLE from:req to:req
 	.pushsection __ex_table, "a"
-	.long (\from) - __ex_table
-	.long (\to) - __ex_table
+	.long (\from) - .
+	.long (\to) - .
 	.popsection
 .endm
 #else
--
2.25.1
diff mbox series

Patch

diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
index afcf5b65beef..c81e78636220 100644
--- a/arch/x86/entry/vdso/extable.c
+++ b/arch/x86/entry/vdso/extable.c
@@ -32,7 +32,7 @@  bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
 	nr_entries = image->extable_len / (sizeof(*extable));
 	extable = image->extable;
 
-	for (i = 0; i < nr_entries; i++) {
+	for (i = 0; i < nr_entries; i++, base += sizeof(*extable)) {
 		if (regs->ip == base + extable[i].insn) {
 			regs->ip = base + extable[i].fixup;
 			regs->di = trapnr;