diff mbox series

[8/8] iommu: Improve map/unmap sanity checks

Message ID 524db1ec0139c964d26928a6a264945aa66d010c.1694525662.git.robin.murphy@arm.com (mailing list archive)
State New, archived
Headers show
Series iommu: Clean up map/unmap ops | expand

Commit Message

Robin Murphy Sept. 12, 2023, 4:18 p.m. UTC
The current checks for the __IOMMU_DOMAIN_PAGING capability seem a
bit stifled, since it is quite likely now that a non-paging domain
won't have a pgsize_bitmap and/or mapping ops, and thus get caught
by the earlier condition anyway. Swap them around to test the more
fundamental condition first, then we can reasonably also upgrade
the other to a WARN_ON, since if a driver does ever expose a paging
domain without the means to actually page, it's clearly very broken.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/iommu.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Jason Gunthorpe Sept. 13, 2023, 2:39 p.m. UTC | #1
On Tue, Sep 12, 2023 at 05:18:44PM +0100, Robin Murphy wrote:
> The current checks for the __IOMMU_DOMAIN_PAGING capability seem a
> bit stifled, since it is quite likely now that a non-paging domain
> won't have a pgsize_bitmap and/or mapping ops, and thus get caught
> by the earlier condition anyway. Swap them around to test the more
> fundamental condition first, then we can reasonably also upgrade
> the other to a WARN_ON, since if a driver does ever expose a paging
> domain without the means to actually page, it's clearly very broken.

> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/iommu.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index c83f2e4c56f5..391bcae4d02d 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -2427,12 +2427,12 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
>  	phys_addr_t orig_paddr = paddr;
>  	int ret = 0;
>  
> -	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
> -		return -ENODEV;
> -
>  	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
>  		return -EINVAL;

Why isn't this a WARN_ON? The caller is clearly lost its mind if it
is calling map on a non paging domain..

> +	if (WARN_ON(!ops->map_pages || domain->pgsize_bitmap == 0UL))
> +		return -ENODEV;
> +

And these could be moved to after attach so they are not in any fast
path, and eventually to after alloc..

Jason
Robin Murphy Sept. 13, 2023, 6:46 p.m. UTC | #2
On 2023-09-13 15:39, Jason Gunthorpe wrote:
> On Tue, Sep 12, 2023 at 05:18:44PM +0100, Robin Murphy wrote:
>> The current checks for the __IOMMU_DOMAIN_PAGING capability seem a
>> bit stifled, since it is quite likely now that a non-paging domain
>> won't have a pgsize_bitmap and/or mapping ops, and thus get caught
>> by the earlier condition anyway. Swap them around to test the more
>> fundamental condition first, then we can reasonably also upgrade
>> the other to a WARN_ON, since if a driver does ever expose a paging
>> domain without the means to actually page, it's clearly very broken.
> 
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   drivers/iommu/iommu.c | 10 +++++-----
>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index c83f2e4c56f5..391bcae4d02d 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -2427,12 +2427,12 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
>>   	phys_addr_t orig_paddr = paddr;
>>   	int ret = 0;
>>   
>> -	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
>> -		return -ENODEV;
>> -
>>   	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
>>   		return -EINVAL;
> 
> Why isn't this a WARN_ON? The caller is clearly lost its mind if it
> is calling map on a non paging domain..

Sure it's a dumb thing to do, however I don't think we can reasonably 
say without question that an external caller being dumb represents an 
unexpected and serious loss of internal consistency.

>> +	if (WARN_ON(!ops->map_pages || domain->pgsize_bitmap == 0UL))
>> +		return -ENODEV;
>> +
> 
> And these could be moved to after attach so they are not in any fast
> path, and eventually to after alloc..

Perhaps, although TBH I can't imagine it making any appreciable 
difference - they're both values we need to load and use soon anyway, so 
on a sensible CPU, the additional overhead should only really be a 
couple of not-taken branch-if-zero instructions, which is nothing at all 
compared to how much goes on in the main loop and the driver op call itself.

Thanks,
Robin.
Jason Gunthorpe Sept. 14, 2023, 12:48 p.m. UTC | #3
On Wed, Sep 13, 2023 at 07:46:45PM +0100, Robin Murphy wrote:
> On 2023-09-13 15:39, Jason Gunthorpe wrote:
> > On Tue, Sep 12, 2023 at 05:18:44PM +0100, Robin Murphy wrote:
> > > The current checks for the __IOMMU_DOMAIN_PAGING capability seem a
> > > bit stifled, since it is quite likely now that a non-paging domain
> > > won't have a pgsize_bitmap and/or mapping ops, and thus get caught
> > > by the earlier condition anyway. Swap them around to test the more
> > > fundamental condition first, then we can reasonably also upgrade
> > > the other to a WARN_ON, since if a driver does ever expose a paging
> > > domain without the means to actually page, it's clearly very broken.
> > 
> > > Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> > > ---
> > >   drivers/iommu/iommu.c | 10 +++++-----
> > >   1 file changed, 5 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > index c83f2e4c56f5..391bcae4d02d 100644
> > > --- a/drivers/iommu/iommu.c
> > > +++ b/drivers/iommu/iommu.c
> > > @@ -2427,12 +2427,12 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
> > >   	phys_addr_t orig_paddr = paddr;
> > >   	int ret = 0;
> > > -	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
> > > -		return -ENODEV;
> > > -
> > >   	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
> > >   		return -EINVAL;
> > 
> > Why isn't this a WARN_ON? The caller is clearly lost its mind if it
> > is calling map on a non paging domain..
> 
> Sure it's a dumb thing to do, however I don't think we can reasonably say
> without question that an external caller being dumb represents an unexpected
> and serious loss of internal consistency.

WARN_ON is not just for "serious loss of internal consistency" it
should be use in all places where invariant are violated. We can't
guess why the caller is using this wrong (most likely it is UAF or
memory corruption), but if this fires something definately has gone
wrong with the kernel.

eg if syzkaller somehow hits this we want the WARN_ON so it reports
it.

> > > +	if (WARN_ON(!ops->map_pages || domain->pgsize_bitmap == 0UL))
> > > +		return -ENODEV;
> > > +
> > 
> > And these could be moved to after attach so they are not in any fast
> > path, and eventually to after alloc..
> 
> Perhaps, although TBH I can't imagine it making any appreciable difference -
> they're both values we need to load and use soon anyway, so on a sensible
> CPU, the additional overhead should only really be a couple of not-taken
> branch-if-zero instructions, which is nothing at all compared to how much
> goes on in the main loop and the driver op call itself.

IMHO it is a cleaner pattern to check the objects the driver creates
when it creates them, not when we go to try to use them..

Jason
Robin Murphy Sept. 14, 2023, 2:23 p.m. UTC | #4
On 2023-09-14 13:48, Jason Gunthorpe wrote:
> On Wed, Sep 13, 2023 at 07:46:45PM +0100, Robin Murphy wrote:
>> On 2023-09-13 15:39, Jason Gunthorpe wrote:
>>> On Tue, Sep 12, 2023 at 05:18:44PM +0100, Robin Murphy wrote:
>>>> The current checks for the __IOMMU_DOMAIN_PAGING capability seem a
>>>> bit stifled, since it is quite likely now that a non-paging domain
>>>> won't have a pgsize_bitmap and/or mapping ops, and thus get caught
>>>> by the earlier condition anyway. Swap them around to test the more
>>>> fundamental condition first, then we can reasonably also upgrade
>>>> the other to a WARN_ON, since if a driver does ever expose a paging
>>>> domain without the means to actually page, it's clearly very broken.
>>>
>>>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>>>> ---
>>>>    drivers/iommu/iommu.c | 10 +++++-----
>>>>    1 file changed, 5 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>>> index c83f2e4c56f5..391bcae4d02d 100644
>>>> --- a/drivers/iommu/iommu.c
>>>> +++ b/drivers/iommu/iommu.c
>>>> @@ -2427,12 +2427,12 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
>>>>    	phys_addr_t orig_paddr = paddr;
>>>>    	int ret = 0;
>>>> -	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
>>>> -		return -ENODEV;
>>>> -
>>>>    	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
>>>>    		return -EINVAL;
>>>
>>> Why isn't this a WARN_ON? The caller is clearly lost its mind if it
>>> is calling map on a non paging domain..
>>
>> Sure it's a dumb thing to do, however I don't think we can reasonably say
>> without question that an external caller being dumb represents an unexpected
>> and serious loss of internal consistency.
> 
> WARN_ON is not just for "serious loss of internal consistency" it
> should be use in all places where invariant are violated. We can't
> guess why the caller is using this wrong (most likely it is UAF or
> memory corruption), but if this fires something definately has gone
> wrong with the kernel.

Has it? It's not *functionally* incorrect to obtain a valid domain by 
calling iommu_get_domain_for_dev(), pass that domain to iommu_map(), and 
handle any failure returned. Sure, it's *semantically* questionable, but 
so is calling iommu_iova_to_phys() on non-paging domains, and we have to 
support that, because callers are dumb, and "callers aren't dumb" is not 
a realistic invariant which we can uphold.

> eg if syzkaller somehow hits this we want the WARN_ON so it reports
> it.

But then an "IOMMU bug" is reported to us, and we say "yeah, that's not 
ours, that's some code somewhere down in the middle of that callstack 
being dumb", and then what? I know I don't have the time or inclination 
to go off debugging and redesigning random other bits of the kernel for 
calling our API in ways that look sketchy but aren't technically 
invalid, do you?

iommu_map() can already fail for numerous reasons that may or may not 
represent a bug in the caller, like the IOVA or PA being out-of-range, 
or the IOVA already being mapped, or whatever. The wrong type of domain 
is just another reason to fail (in, as far as we're concerned, an 
entirely safe and manageable way). Callers have to be prepared to handle 
failure, but it's up to them how unexpected or serious it is; we can't 
second-guess them.

Thanks,
Robin.
Jason Gunthorpe Sept. 14, 2023, 4:48 p.m. UTC | #5
On Thu, Sep 14, 2023 at 03:23:46PM +0100, Robin Murphy wrote:
> > > > > diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> > > > > index c83f2e4c56f5..391bcae4d02d 100644
> > > > > --- a/drivers/iommu/iommu.c
> > > > > +++ b/drivers/iommu/iommu.c
> > > > > @@ -2427,12 +2427,12 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
> > > > >    	phys_addr_t orig_paddr = paddr;
> > > > >    	int ret = 0;
> > > > > -	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
> > > > > -		return -ENODEV;
> > > > > -
> > > > >    	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
> > > > >    		return -EINVAL;
> > > > 
> > > > Why isn't this a WARN_ON? The caller is clearly lost its mind if it
> > > > is calling map on a non paging domain..
> > > 
> > > Sure it's a dumb thing to do, however I don't think we can reasonably say
> > > without question that an external caller being dumb represents an unexpected
> > > and serious loss of internal consistency.
> > 
> > WARN_ON is not just for "serious loss of internal consistency" it
> > should be use in all places where invariant are violated. We can't
> > guess why the caller is using this wrong (most likely it is UAF or
> > memory corruption), but if this fires something definately has gone
> > wrong with the kernel.
> 
> Has it? It's not *functionally* incorrect to obtain a valid domain by
> calling iommu_get_domain_for_dev(), pass that domain to iommu_map(), and
> handle any failure returned. 

This is not a theoretical question - does any in-kernel code actually
do that and expect it to work? I didn't notice any..

It doesn't matter that someone *could*, our task is not to make an
overly general API. We can, and should, have tight invarients because
it allows discovery of more bug classes, and prevents "creative" use
of APIs.

> is calling iommu_iova_to_phys() on non-paging domains, and we have to
> support that, because callers are dumb, and "callers aren't dumb" is not a
> realistic invariant which we can uphold.

Someone does that? :(

> > eg if syzkaller somehow hits this we want the WARN_ON so it reports
> > it.
> 
> But then an "IOMMU bug" is reported to us, and we say "yeah, that's not
> ours, that's some code somewhere down in the middle of that callstack being
> dumb", and then what?

The report gets handed off to the part of the callstack that is making
the problem.

> iommu_map() can already fail for numerous reasons that may or may not
> represent a bug in the caller, like the IOVA or PA being out-of-range, or
> the IOVA already being mapped, or whatever. 

IMHO some of those certainly could be WARN too, depends what you think
the API should be.

eg iommufd and vfio both are designed to never double map/use out of
aperture IOVA/etc, if they do it is a significant bug and a WARN_ON
would be welcomed.

> another reason to fail (in, as far as we're concerned, an entirely safe and
> manageable way). Callers have to be prepared to handle failure, but it's up
> to them how unexpected or serious it is; we can't second-guess them.

We can always convert things into error codes, but that can also hide
bugs if no existing caller needs that functionality. IMHO it is always
better to have tighter invarients, and be noisy when they fail. This
gives us better understandablity, bug discoverability and discourages
people from introducing new code doing crazy things.

Jason
Robin Murphy Sept. 19, 2023, 12:18 p.m. UTC | #6
On 2023-09-14 17:48, Jason Gunthorpe wrote:
> On Thu, Sep 14, 2023 at 03:23:46PM +0100, Robin Murphy wrote:
>>>>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>>>>> index c83f2e4c56f5..391bcae4d02d 100644
>>>>>> --- a/drivers/iommu/iommu.c
>>>>>> +++ b/drivers/iommu/iommu.c
>>>>>> @@ -2427,12 +2427,12 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
>>>>>>     	phys_addr_t orig_paddr = paddr;
>>>>>>     	int ret = 0;
>>>>>> -	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
>>>>>> -		return -ENODEV;
>>>>>> -
>>>>>>     	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
>>>>>>     		return -EINVAL;
>>>>>
>>>>> Why isn't this a WARN_ON? The caller is clearly lost its mind if it
>>>>> is calling map on a non paging domain..
>>>>
>>>> Sure it's a dumb thing to do, however I don't think we can reasonably say
>>>> without question that an external caller being dumb represents an unexpected
>>>> and serious loss of internal consistency.
>>>
>>> WARN_ON is not just for "serious loss of internal consistency" it
>>> should be use in all places where invariant are violated. We can't
>>> guess why the caller is using this wrong (most likely it is UAF or
>>> memory corruption), but if this fires something definately has gone
>>> wrong with the kernel.
>>
>> Has it? It's not *functionally* incorrect to obtain a valid domain by
>> calling iommu_get_domain_for_dev(), pass that domain to iommu_map(), and
>> handle any failure returned.
> 
> This is not a theoretical question - does any in-kernel code actually
> do that and expect it to work? I didn't notice any..

I was hoping to have to say I'd only seen it in unmerged (e.g. [1]) and 
downstream code, but then I found drivers/net/ipa/ :(

(and amusingly, the line I happened to look at first blamed you, but 
that was only for that mechanical change of the iommu_map() prototype)

> It doesn't matter that someone *could*, our task is not to make an
> overly general API. We can, and should, have tight invarients because
> it allows discovery of more bug classes, and prevents "creative" use
> of APIs.
> 
>> is calling iommu_iova_to_phys() on non-paging domains, and we have to
>> support that, because callers are dumb, and "callers aren't dumb" is not a
>> realistic invariant which we can uphold.
> 
> Someone does that? :(

Only every user of iommu_iova_to_phys() outside of drivers/iommu/, VFIO, 
and the arch/arm DMA ops...

>>> eg if syzkaller somehow hits this we want the WARN_ON so it reports
>>> it.
>>
>> But then an "IOMMU bug" is reported to us, and we say "yeah, that's not
>> ours, that's some code somewhere down in the middle of that callstack being
>> dumb", and then what?
> 
> The report gets handed off to the part of the callstack that is making
> the problem.

And if it's critical to them that a mapping succeeds, they should 
probably have their own warning already. An inappropriate domain type is 
far from critical to us, and is something we've already allowed to fail 
cleanly and gracefully for the best part of a decade.

>> iommu_map() can already fail for numerous reasons that may or may not
>> represent a bug in the caller, like the IOVA or PA being out-of-range, or
>> the IOVA already being mapped, or whatever.
> 
> IMHO some of those certainly could be WARN too, depends what you think
> the API should be.

We don't even have consistent behaviour across IOMMU drivers for most of 
them, and some callers are written around particular driver-specific 
behaviours (yes, even VFIO), so IMO there's no way we're in a position 
to suddenly decree that anyone's fundamentally wrong for doing something 
that we've long considered a non-critical failure, but have now decided 
looks sketchy.

> eg iommufd and vfio both are designed to never double map/use out of
> aperture IOVA/etc, if they do it is a significant bug and a WARN_ON
> would be welcomed.
> 
>> another reason to fail (in, as far as we're concerned, an entirely safe and
>> manageable way). Callers have to be prepared to handle failure, but it's up
>> to them how unexpected or serious it is; we can't second-guess them.
> 
> We can always convert things into error codes, but that can also hide
> bugs if no existing caller needs that functionality. IMHO it is always
> better to have tighter invarients, and be noisy when they fail. This
> gives us better understandablity, bug discoverability and discourages
> people from introducing new code doing crazy things.
All that said, I don't mind if you want to propose a separate patch to 
turn certain external caller conditions into noisy warnings, but I 
wouldn't want to tie it up with any other changes which would risk 
getting caught in the crossfire if someone does then hit a 
newly-introduced warning and start arguing to revert it. And I'm not 
going to do so in any of my patches since as above I don't personally 
believe it's a valuable thing to do.

Thanks,
Robin.
Jason Gunthorpe Sept. 22, 2023, 5:28 p.m. UTC | #7
On Tue, Sep 19, 2023 at 01:18:31PM +0100, Robin Murphy wrote:

> > This is not a theoretical question - does any in-kernel code actually
> > do that and expect it to work? I didn't notice any..
> 
> I was hoping to have to say I'd only seen it in unmerged (e.g. [1]) and
> downstream code, but then I found drivers/net/ipa/ :(

Aiiie, what is that doing? Mapping something into the dma-api owned
domain!?!

See this is why I think stronger invarients are a good idea, we can't
police the entire kernel but we can make the entry points explode if
someone is doing something WRONG.

> All that said, I don't mind if you want to propose a separate patch to turn
> certain external caller conditions into noisy warnings, but I wouldn't want
> to tie it up with any other changes which would risk getting caught in the
> crossfire if someone does then hit a newly-introduced warning and start
> arguing to revert it. And I'm not going to do so in any of my patches since
> as above I don't personally believe it's a valuable thing to do.

Ok, that make sense

Jason
diff mbox series

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index c83f2e4c56f5..391bcae4d02d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2427,12 +2427,12 @@  static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
 	phys_addr_t orig_paddr = paddr;
 	int ret = 0;
 
-	if (unlikely(!ops->map_pages || domain->pgsize_bitmap == 0UL))
-		return -ENODEV;
-
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return -EINVAL;
 
+	if (WARN_ON(!ops->map_pages || domain->pgsize_bitmap == 0UL))
+		return -ENODEV;
+
 	/* find out the minimum page size supported */
 	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
 
@@ -2510,10 +2510,10 @@  static size_t __iommu_unmap(struct iommu_domain *domain,
 	unsigned long orig_iova = iova;
 	unsigned int min_pagesz;
 
-	if (unlikely(!ops->unmap_pages || domain->pgsize_bitmap == 0UL))
+	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return 0;
 
-	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
+	if (WARN_ON(!ops->unmap_pages || domain->pgsize_bitmap == 0UL))
 		return 0;
 
 	/* find out the minimum page size supported */