diff mbox series

soc: qcom: mdt_loader: Drop PT_LOAD check on hash segment

Message ID 20210824094109.7272-1-shawnguo@kernel.org (mailing list archive)
State Changes Requested
Headers show
Series soc: qcom: mdt_loader: Drop PT_LOAD check on hash segment | expand

Commit Message

Shawn Guo Aug. 24, 2021, 9:41 a.m. UTC
From: Shawn Guo <shawn.guo@linaro.org>

It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has
the type of the second segment holding hashes just be PT_LOAD.  So drop
the check on phdrs[1].p_type to get it go on that phone.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/soc/qcom/mdt_loader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Marijn Suijten Aug. 24, 2021, 3:18 p.m. UTC | #1
Hi Shawn,

On 8/24/21 11:41 AM, Shawn Guo wrote:
> From: Shawn Guo <shawn.guo@linaro.org>
> 
> It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has
> the type of the second segment holding hashes just be PT_LOAD.  So drop
> the check on phdrs[1].p_type to get it go on that phone.
> 
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---
>   drivers/soc/qcom/mdt_loader.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
> index eba7f76f9d61..6034cd8992b0 100644
> --- a/drivers/soc/qcom/mdt_loader.c
> +++ b/drivers/soc/qcom/mdt_loader.c
> @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len)
>   	if (ehdr->e_phnum < 2)
>   		return ERR_PTR(-EINVAL);
>   
> -	if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD)
> +	if (phdrs[0].p_type == PT_LOAD)
>   		return ERR_PTR(-EINVAL);
>   
>   	if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH)
> 


Konrad (on the CC-list) originally came up with a similar patch to make 
his Sony phone boot (Xperia XZ, MSM8996).  We however concluded that 
this solution is wrong, for the simple fact that the code below expects 
a PT_NULL header with data in the right place.  Skipping this check most 
likely means that the code below will read out of bounds since the mdt 
file isn't large enough; this data is supposed to be read from an 
external file as indicated by the meaning of PT_LOAD.  We built a 
solution for this, and fortunately CAF independently submitted a similar 
solution to the lists a while ago which is more thorough:

https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/

It's been a while since I last compared mdt files with and without 
PT_LOAD, but this is the conclusion I remember coming to:

The code that packs the hash from program header 1 tightly after the ELF 
header in program header 0 doesn't actually need to run, since our mdt 
binaries already contain the signature in that place; the bytes aren't 
used for anything else.  Simply sending the contents of the entire file 
as-is (similar to our downstream kernels) worked just fine (of course 
files beyond the size of the mdt file still have to be appended from 
.bXX files, and I'm not sure why this isn't checking for PT_LOAD 
program-header type there).

- Marijn
Marijn Suijten Aug. 24, 2021, 3:34 p.m. UTC | #2
Looping Siddharth Gupta in this thread who sent the mdt_loader 
improvements referenced in my reply.  Siddharth, it appears review 
comments have been handled but no v2 made it to the lists yet; are you 
still able to send this?

On 8/24/21 5:18 PM, Marijn Suijten wrote:
> [..]
> 
> https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/
> 
> [..]

- Marijn
Shawn Guo Aug. 26, 2021, 2:18 p.m. UTC | #3
Hi Marijn,

Thanks much for the information!

On Tue, Aug 24, 2021 at 05:18:05PM +0200, Marijn Suijten wrote:
> Hi Shawn,
> 
> On 8/24/21 11:41 AM, Shawn Guo wrote:
> > From: Shawn Guo <shawn.guo@linaro.org>
> > 
> > It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has
> > the type of the second segment holding hashes just be PT_LOAD.  So drop
> > the check on phdrs[1].p_type to get it go on that phone.
> > 
> > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > ---
> >   drivers/soc/qcom/mdt_loader.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
> > index eba7f76f9d61..6034cd8992b0 100644
> > --- a/drivers/soc/qcom/mdt_loader.c
> > +++ b/drivers/soc/qcom/mdt_loader.c
> > @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len)
> >   	if (ehdr->e_phnum < 2)
> >   		return ERR_PTR(-EINVAL);
> > -	if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD)
> > +	if (phdrs[0].p_type == PT_LOAD)
> >   		return ERR_PTR(-EINVAL);
> >   	if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH)
> > 
> 
> 
> Konrad (on the CC-list) originally came up with a similar patch to make his
> Sony phone boot (Xperia XZ, MSM8996).  We however concluded that this
> solution is wrong, for the simple fact that the code below expects a PT_NULL
> header with data in the right place.  Skipping this check most likely means
> that the code below will read out of bounds since the mdt file isn't large
> enough; this data is supposed to be read from an external file as indicated
> by the meaning of PT_LOAD.  We built a solution for this, and fortunately
> CAF independently submitted a similar solution to the lists a while ago
> which is more thorough:
> 
> https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/

While a thorough solution is good, I do not think my patch makes
anything worse or breaks anything, while fixing my issue on Sony Xperia
M4 Aqua.  All the current assumptions hold true for my WCNSS firmware,
only except that phdrs[1] gets a PT_LOAD type.

There is a blog[1] illustrating the situation quite nicely.  Again, the
only thing my WCNSS firmware differs from the illustration is that
hash segment gets a PT_LOAD type.

Skipping the check will not cause problem for firmwares we have seen,
because hash segment is duplicated in .mdt file, and we are using that
duplication instead of reading from .b01 file.

Shawn

[1] http://bits-please.blogspot.com/2016/04/exploring-qualcomms-secure-execution.html
Marijn Suijten Aug. 26, 2021, 8:52 p.m. UTC | #4
Hi Shawn,

On 8/26/21 4:18 PM, Shawn Guo wrote:
> Hi Marijn,
> 
> Thanks much for the information!
> 
> On Tue, Aug 24, 2021 at 05:18:05PM +0200, Marijn Suijten wrote:
>> Hi Shawn,
>>
>> On 8/24/21 11:41 AM, Shawn Guo wrote:
>>> From: Shawn Guo <shawn.guo@linaro.org>
>>>
>>> It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has
>>> the type of the second segment holding hashes just be PT_LOAD.  So drop
>>> the check on phdrs[1].p_type to get it go on that phone.
>>>
>>> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
>>> ---
>>>    drivers/soc/qcom/mdt_loader.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
>>> index eba7f76f9d61..6034cd8992b0 100644
>>> --- a/drivers/soc/qcom/mdt_loader.c
>>> +++ b/drivers/soc/qcom/mdt_loader.c
>>> @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len)
>>>    	if (ehdr->e_phnum < 2)
>>>    		return ERR_PTR(-EINVAL);
>>> -	if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD)
>>> +	if (phdrs[0].p_type == PT_LOAD)
>>>    		return ERR_PTR(-EINVAL);
>>>    	if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH)
>>>
>>
>>
>> Konrad (on the CC-list) originally came up with a similar patch to make his
>> Sony phone boot (Xperia XZ, MSM8996).  We however concluded that this
>> solution is wrong, for the simple fact that the code below expects a PT_NULL
>> header with data in the right place.  Skipping this check most likely means
>> that the code below will read out of bounds since the mdt file isn't large
>> enough; this data is supposed to be read from an external file as indicated
>> by the meaning of PT_LOAD.  We built a solution for this, and fortunately
>> CAF independently submitted a similar solution to the lists a while ago
>> which is more thorough:
>>
>> https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/
> 
> While a thorough solution is good, I do not think my patch makes
> anything worse or breaks anything, while fixing my issue on Sony Xperia
> M4 Aqua.  All the current assumptions hold true for my WCNSS firmware,
> only except that phdrs[1] gets a PT_LOAD type.


It's not up to me to choose between a thorough or quick solution, but 
this patch seems to open up the possibility for an out-of-bounds read. 
The assertion was placed in this function for a reason after all.

> There is a blog[1] illustrating the situation quite nicely.  Again, the
> only thing my WCNSS firmware differs from the illustration is that
> hash segment gets a PT_LOAD type.


Yes, that blog post nicely explains the format but it doesn't cover that 
the hash is encoded in the second program header nor that it can be 
copied to be packed tightly against the ELF header?  Maybe that only 
applies to newer formats?

> Skipping the check will not cause problem for firmwares we have seen,
> because hash segment is duplicated in .mdt file, and we are using that
> duplication instead of reading from .b01 file.


Can you share some more details about the firmware file you're using: 
size and the program headers as shown by `readelf -l`?  If possible, can 
you also validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags 
& QCOM_MDT_TYPE_MASK (using something like GNU poke)?

All the files I'm able to test adhere to `/* Is the header and hash 
already packed */`: `ehdr_size + hash_size == fw->size` (meaning the 
file solely consists of a tightly packed ELF header and hash signature) 
but I won't be surprised if there are firmware files out in the wild 
with different headers or the hash missing, otherwise the else clause 
wouldn't exist.
This else clause causes a read starting at fw->data + phdrs[1].p_offset 
of phdrs[1].p_filesz bytes which is almost certainly out of bounds if 
the program header type is PT_LOAD instead of PT_NONE, otherwise it 
wouldn't need to be loaded from a .bXX file in the first place.

For this quick solution to be valid I propose to reject PT_LOAD in the 
else clause, and otherwise validate that phdrs[1].p_offset + hash_size 
<= fw->size.
The aforementioned patch series in qcom_mdt_bins_are_split (and certain 
firmware files from Sony phones) show that it is also possible to read 
PT_NULL headers from external files, as long as the header references a 
range that is out of bounds of the mdt file.

> Shawn
> 
> [1] http://bits-please.blogspot.com/2016/04/exploring-qualcomms-secure-execution.html
> 

- Marijn
Shawn Guo Aug. 27, 2021, 6:24 a.m. UTC | #5
On Thu, Aug 26, 2021 at 10:52:21PM +0200, Marijn Suijten wrote:
> Hi Shawn,
> 
> On 8/26/21 4:18 PM, Shawn Guo wrote:
> > Hi Marijn,
> > 
> > Thanks much for the information!
> > 
> > On Tue, Aug 24, 2021 at 05:18:05PM +0200, Marijn Suijten wrote:
> > > Hi Shawn,
> > > 
> > > On 8/24/21 11:41 AM, Shawn Guo wrote:
> > > > From: Shawn Guo <shawn.guo@linaro.org>
> > > > 
> > > > It's been observed on Sony Xperia M4 Aqua phone, that wcnss firmware has
> > > > the type of the second segment holding hashes just be PT_LOAD.  So drop
> > > > the check on phdrs[1].p_type to get it go on that phone.
> > > > 
> > > > Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> > > > ---
> > > >    drivers/soc/qcom/mdt_loader.c | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
> > > > index eba7f76f9d61..6034cd8992b0 100644
> > > > --- a/drivers/soc/qcom/mdt_loader.c
> > > > +++ b/drivers/soc/qcom/mdt_loader.c
> > > > @@ -98,7 +98,7 @@ void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len)
> > > >    	if (ehdr->e_phnum < 2)
> > > >    		return ERR_PTR(-EINVAL);
> > > > -	if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD)
> > > > +	if (phdrs[0].p_type == PT_LOAD)
> > > >    		return ERR_PTR(-EINVAL);
> > > >    	if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH)
> > > > 
> > > 
> > > 
> > > Konrad (on the CC-list) originally came up with a similar patch to make his
> > > Sony phone boot (Xperia XZ, MSM8996).  We however concluded that this
> > > solution is wrong, for the simple fact that the code below expects a PT_NULL
> > > header with data in the right place.  Skipping this check most likely means
> > > that the code below will read out of bounds since the mdt file isn't large
> > > enough; this data is supposed to be read from an external file as indicated
> > > by the meaning of PT_LOAD.  We built a solution for this, and fortunately
> > > CAF independently submitted a similar solution to the lists a while ago
> > > which is more thorough:
> > > 
> > > https://lore.kernel.org/linux-arm-msm/1609968211-7579-1-git-send-email-sidgup@codeaurora.org/
> > 
> > While a thorough solution is good, I do not think my patch makes
> > anything worse or breaks anything, while fixing my issue on Sony Xperia
> > M4 Aqua.  All the current assumptions hold true for my WCNSS firmware,
> > only except that phdrs[1] gets a PT_LOAD type.
> 
> 
> It's not up to me to choose between a thorough or quick solution,

To be clear, Siddharth's series doesn't resolve my problem, as the
assumption that hash segment type cannot be PT_LOAD is still there.
I have to add the following change on top to fix my problem.

@@ -126,8 +126,7 @@ void *qcom_mdt_read_metadata(struct device *dev, const struct firmware *fw, cons
                return ERR_PTR(-EINVAL);
 
        for (hash_index = 1; hash_index < ehdr->e_phnum; hash_index++) {
-               if (phdrs[hash_index].p_type != PT_LOAD &&
-                  (phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH)
+               if ((phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH)
                        break;
        }
        if (hash_index >= ehdr->e_phnum)

That said, my patch is merely to break the assumption that hash segment
type cannot be PT_LOAD, and it's really orthogonal to Siddharth's
series.

> but this
> patch seems to open up the possibility for an out-of-bounds read. The
> assertion was placed in this function for a reason after all.

I would much appreciate it if someone helps to elaborate the reason.

> 
> > There is a blog[1] illustrating the situation quite nicely.  Again, the
> > only thing my WCNSS firmware differs from the illustration is that
> > hash segment gets a PT_LOAD type.
> 
> 
> Yes, that blog post nicely explains the format but it doesn't cover that the
> hash is encoded in the second program header nor that it can be copied to be
> packed tightly against the ELF header?  Maybe that only applies to newer
> formats?

Hmm, it does cover the things.  It's been illustrated that .mdt is
simply a concatenating of .b00 and .b01.  The .b00 includes ELF header
and program headers, while .b01 is just hash segment.

$ cat wcnss.b00 wcnss.b01 > wcnss.b00_b01
$ cmp wcnss.b00_b01 wcnss.mdt
$

That said, .mdt = ELF header + program headers + hash

> 
> > Skipping the check will not cause problem for firmwares we have seen,
> > because hash segment is duplicated in .mdt file, and we are using that
> > duplication instead of reading from .b01 file.
> 
> 
> Can you share some more details about the firmware file you're using: size
> and the program headers as shown by `readelf -l`?

$ readelf -l wcnss.mdt

Elf file type is EXEC (Executable file)
Entry point 0x8b6018d4
There are 12 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  NULL           0x000000 0x00000000 0x00000000 0x001b4 0x00000     0
  LOAD           0x001000 0x8bbe0000 0x8bbe0000 0x00c58 0x02000     0x1000
  LOAD           0x003000 0x8b600000 0x8b600000 0x03308 0x03438 RWE 0x100000
  LOAD           0x00afcc 0x8b604000 0x8b604000 0x00000 0x08000 RW  0x4000
  LOAD           0x00afcc 0x8b60c000 0x8b60c000 0x0f000 0x0f000 RW  0x4
  LOAD           0x019fcc 0x8b61b000 0x8b61b000 0x00000 0x0e000 RW  0x4
  LOAD           0x019fcc 0x8b629000 0x8b629000 0x32eb04 0x4c2b10 RWE 0x80
  LOAD           0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x37cf8 RW  0x8
  LOAD           0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x21c44 RW  0x4
  LOAD           0x348ad0 0x8bb30000 0x8bb30000 0x00034 0x001a8 RW  0x8
  LOAD           0x349fcc 0x8bb31000 0x8bb31000 0xa0000 0xa0000 RW  0x1000
  LOAD           0x3e9fcc 0x8bbd1000 0x8bbd1000 0x0ac3c 0x0ee00 RWE 0x1000

> If possible, can you also
> validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags &
> QCOM_MDT_TYPE_MASK (using something like GNU poke)?

It is set, otherwise the QCOM_MDT_TYPE_HASH check in
qcom_mdt_read_metadata() will just fail.  To be clear, everything works
fine for me, as long as I drop the check of (phdrs[1].p_type == PT_LOAD).

> 
> All the files I'm able to test adhere to `/* Is the header and hash already
> packed */`: `ehdr_size + hash_size == fw->size` (meaning the file solely
> consists of a tightly packed ELF header and hash signature)

Yeah, my wcnss firmware is same here.  Actually, all split firmwares I
have seen are this case.  But bear it in mind there are non-split
firmware like a single .mbn file.  My understanding is that condition
(ehdr_size + hash_size == fw->size) is being used to check whether it's
a split firmware or non-split one.

> but I won't be
> surprised if there are firmware files out in the wild with different headers
> or the hash missing, otherwise the else clause wouldn't exist.
> This else clause causes a read starting at fw->data + phdrs[1].p_offset of
> phdrs[1].p_filesz bytes which is almost certainly out of bounds if the
> program header type is PT_LOAD instead of PT_NONE, otherwise it wouldn't
> need to be loaded from a .bXX file in the first place.

So the else clause is there to handle non-split firmware, which has
everything within fw->size.

> 
> For this quick solution to be valid I propose to reject PT_LOAD in the else
> clause, and otherwise validate that phdrs[1].p_offset + hash_size <=
> fw->size.

I'm not sure what you propose here are required for non-split firmware.

> The aforementioned patch series in qcom_mdt_bins_are_split (and certain
> firmware files from Sony phones) show that it is also possible to read
> PT_NULL headers from external files, as long as the header references a
> range that is out of bounds of the mdt file.

It's been shown that hashes can be read from .mdt or hash segment, and
independently the hash segment type could be PT_NULL or PT_LOAD.  With
Siddharth's patch, instead of using the hash duplication in .mdt, hash
will be read from hash segment.  But again, to resolve my problem, the
assertion that hash segment type cannot be PT_LOAD has to be dropped.
And that's the only thing my patch is doing.

Shawn
Marijn Suijten Aug. 27, 2021, 8:29 a.m. UTC | #6
Hi Shawn,

On 8/27/21 8:24 AM, Shawn Guo wrote:
>> [..]
>> It's not up to me to choose between a thorough or quick solution,
> 
> To be clear, Siddharth's series doesn't resolve my problem, as the
> assumption that hash segment type cannot be PT_LOAD is still there.
> I have to add the following change on top to fix my problem.
> 
> @@ -126,8 +126,7 @@ void *qcom_mdt_read_metadata(struct device *dev, const struct firmware *fw, cons
>                  return ERR_PTR(-EINVAL);
>   
>          for (hash_index = 1; hash_index < ehdr->e_phnum; hash_index++) {
> -               if (phdrs[hash_index].p_type != PT_LOAD &&
> -                  (phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH)
> +               if ((phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH)
>                          break;
>          }
>          if (hash_index >= ehdr->e_phnum)


Patch 3/3 allows the hash segment to be read from an external file and 
should indeed get rid of this comparison, as external file loading is 
possible with PT_NULL and required with PT_LOAD.  I'd go as far as 
moving the check into the if, next to qcom_mdt_bins_are_split:

  if (phdrs[hash_index].p_type == PT_LOAD || qcom_mdt_bins_are_split(fw))

Unfortunately it seems this patchset lost optimization for the packed 
`ehdr_size + hash_size == fw->size` case, where the hash segment is 
already available in the loaded mbt.

> That said, my patch is merely to break the assumption that hash segment
> type cannot be PT_LOAD, and it's really orthogonal to Siddharth's
> series.


It is fine - correct even - to break that assumption, but it should go 
with extra validation that we are dealing with packed mdt instead.

>> but this
>> patch seems to open up the possibility for an out-of-bounds read. The
>> assertion was placed in this function for a reason after all.
> 
> I would much appreciate it if someone helps to elaborate the reason.


PT_LOAD specifies that the segment is to be loaded externally.  The fact 
that our .mdt file is a tight pack of b00 + b01 is mere convenience, but 
is it also a given for the future?  Can we rely on this assumption to 
never change?

>>> There is a blog[1] illustrating the situation quite nicely.  Again, the
>>> only thing my WCNSS firmware differs from the illustration is that
>>> hash segment gets a PT_LOAD type.
>>
>>
>> Yes, that blog post nicely explains the format but it doesn't cover that the
>> hash is encoded in the second program header nor that it can be copied to be
>> packed tightly against the ELF header?  Maybe that only applies to newer
>> formats?
> 
> Hmm, it does cover the things.  It's been illustrated that .mdt is
> simply a concatenating of .b00 and .b01.  The .b00 includes ELF header
> and program headers, while .b01 is just hash segment.
> 
> $ cat wcnss.b00 wcnss.b01 > wcnss.b00_b01
> $ cmp wcnss.b00_b01 wcnss.mdt
> $
> 
> That said, .mdt = ELF header + program headers + hash


Ack, there's one picture halfway demonstrating the `ehdr_size + 
hash_size == fw->size` case.

>>> Skipping the check will not cause problem for firmwares we have seen,
>>> because hash segment is duplicated in .mdt file, and we are using that
>>> duplication instead of reading from .b01 file.
>>
>>
>> Can you share some more details about the firmware file you're using: size
>> and the program headers as shown by `readelf -l`?
> 
> $ readelf -l wcnss.mdt
> 
> Elf file type is EXEC (Executable file)
> Entry point 0x8b6018d4
> There are 12 program headers, starting at offset 52
> 
> Program Headers:
>    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
>    NULL           0x000000 0x00000000 0x00000000 0x001b4 0x00000     0
>    LOAD           0x001000 0x8bbe0000 0x8bbe0000 0x00c58 0x02000     0x1000
>    LOAD           0x003000 0x8b600000 0x8b600000 0x03308 0x03438 RWE 0x100000
>    LOAD           0x00afcc 0x8b604000 0x8b604000 0x00000 0x08000 RW  0x4000
>    LOAD           0x00afcc 0x8b60c000 0x8b60c000 0x0f000 0x0f000 RW  0x4
>    LOAD           0x019fcc 0x8b61b000 0x8b61b000 0x00000 0x0e000 RW  0x4
>    LOAD           0x019fcc 0x8b629000 0x8b629000 0x32eb04 0x4c2b10 RWE 0x80
>    LOAD           0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x37cf8 RW  0x8
>    LOAD           0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x21c44 RW  0x4
>    LOAD           0x348ad0 0x8bb30000 0x8bb30000 0x00034 0x001a8 RW  0x8
>    LOAD           0x349fcc 0x8bb31000 0x8bb31000 0xa0000 0xa0000 RW  0x1000
>    LOAD           0x3e9fcc 0x8bbd1000 0x8bbd1000 0x0ac3c 0x0ee00 RWE 0x1000
> 
>> If possible, can you also
>> validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags &
>> QCOM_MDT_TYPE_MASK (using something like GNU poke)?
> 
> It is set, otherwise the QCOM_MDT_TYPE_HASH check in
> qcom_mdt_read_metadata() will just fail.  To be clear, everything works
> fine for me, as long as I drop the check of (phdrs[1].p_type == PT_LOAD).


Thanks, this all checks out and is similar to what I'm seeing here.  It 
all comes down to the mdt packing b00 and b01 tightly together already.

>> All the files I'm able to test adhere to `/* Is the header and hash already
>> packed */`: `ehdr_size + hash_size == fw->size` (meaning the file solely
>> consists of a tightly packed ELF header and hash signature)
> 
> Yeah, my wcnss firmware is same here.  Actually, all split firmwares I
> have seen are this case.  But bear it in mind there are non-split
> firmware like a single .mbn file.  My understanding is that condition
> (ehdr_size + hash_size == fw->size) is being used to check whether it's
> a split firmware or non-split one.


Is it unreasonable to assume that more configurations besides split and 
non-split firmware might occur in the future (or are already out in the 
wild)?  These program headers allow for a variety of configurations 
which we should - in my opinion - parse and handle in a generic manner. 
  `ehdr_size + hash_size == fw->size` is convenient to have, but not 
something we should rely on.

>> but I won't be
>> surprised if there are firmware files out in the wild with different headers
>> or the hash missing, otherwise the else clause wouldn't exist.
>> This else clause causes a read starting at fw->data + phdrs[1].p_offset of
>> phdrs[1].p_filesz bytes which is almost certainly out of bounds if the
>> program header type is PT_LOAD instead of PT_NONE, otherwise it wouldn't
>> need to be loaded from a .bXX file in the first place.
> 
> So the else clause is there to handle non-split firmware, which has
> everything within fw->size.


Is it too much to ask for extra validation here, instead of always 
assuming either "split" or "non-split" firmware?  As mentioned before 
these program headers allow for a variety of configurations, which is 
confirmed by Siddharth's first patch looking for QCOM_MDT_TYPE_HASH in 
all headers instead of assuming it to reside in the second.

>>
>> For this quick solution to be valid I propose to reject PT_LOAD in the else
>> clause, and otherwise validate that phdrs[1].p_offset + hash_size <=
>> fw->size.
> 
> I'm not sure what you propose here are required for non-split firmware.


Would it help if I sent a patch based on yours, with this extra 
validation and comments + commit message explaining the cases?

Alternatively we can try re-spinning the patches from Siddharth while 
taking review comments, the possible .mdt == .b00 + . b01 packing, and 
my suggestion to handle the headers generically into account.

>> The aforementioned patch series in qcom_mdt_bins_are_split (and certain
>> firmware files from Sony phones) show that it is also possible to read
>> PT_NULL headers from external files, as long as the header references a
>> range that is out of bounds of the mdt file.
> 
> It's been shown that hashes can be read from .mdt or hash segment, and
> independently the hash segment type could be PT_NULL or PT_LOAD.  With
> Siddharth's patch, instead of using the hash duplication in .mdt, hash
> will be read from hash segment.  But again, to resolve my problem, the
> assertion that hash segment type cannot be PT_LOAD has to be dropped.
> And that's the only thing my patch is doing.

Correct.  If I were to respin Siddharth's patchset, I'd write 
qcom_mdt_read_metadata as follows:

1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE
    and PT_LOAD);
2. If `ehdr_size + hash_size == fw->size`, this is split firmware and
    the mdt file can be used as-is for metadata;
3. If the type is PT_LOAD, _or_ if the type is PT_NULL but
    `p_offset + p_filesz` does not fit inside the .mdt, this is (a
    variant of) split-firmware, and the hash needs to be loaded from an
    external file and appended to the ELF header.
4. If none of the above, this is a non-split firmware file.  Concatenate
    the ELF and hash headers by reading them directly from the firmware.

- Marijn
Shawn Guo Aug. 27, 2021, 9:57 a.m. UTC | #7
On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote:
> Hi Shawn,
> 
> On 8/27/21 8:24 AM, Shawn Guo wrote:
> > > [..]
> > > It's not up to me to choose between a thorough or quick solution,
> > 
> > To be clear, Siddharth's series doesn't resolve my problem, as the
> > assumption that hash segment type cannot be PT_LOAD is still there.
> > I have to add the following change on top to fix my problem.
> > 
> > @@ -126,8 +126,7 @@ void *qcom_mdt_read_metadata(struct device *dev, const struct firmware *fw, cons
> >                  return ERR_PTR(-EINVAL);
> >          for (hash_index = 1; hash_index < ehdr->e_phnum; hash_index++) {
> > -               if (phdrs[hash_index].p_type != PT_LOAD &&
> > -                  (phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH)
> > +               if ((phdrs[hash_index].p_flags & QCOM_MDT_TYPE_MASK) == QCOM_MDT_TYPE_HASH)
> >                          break;
> >          }
> >          if (hash_index >= ehdr->e_phnum)
> 
> 
> Patch 3/3 allows the hash segment to be read from an external file and
> should indeed get rid of this comparison, as external file loading is
> possible with PT_NULL and required with PT_LOAD.  I'd go as far as moving
> the check into the if, next to qcom_mdt_bins_are_split:
> 
>  if (phdrs[hash_index].p_type == PT_LOAD || qcom_mdt_bins_are_split(fw))
> 
> Unfortunately it seems this patchset lost optimization for the packed
> `ehdr_size + hash_size == fw->size` case, where the hash segment is already
> available in the loaded mbt.
> 
> > That said, my patch is merely to break the assumption that hash segment
> > type cannot be PT_LOAD, and it's really orthogonal to Siddharth's
> > series.
> 
> 
> It is fine - correct even - to break that assumption, but it should go with
> extra validation that we are dealing with packed mdt instead.
> 
> > > but this
> > > patch seems to open up the possibility for an out-of-bounds read. The
> > > assertion was placed in this function for a reason after all.
> > 
> > I would much appreciate it if someone helps to elaborate the reason.
> 
> 
> PT_LOAD specifies that the segment is to be loaded externally.  The fact
> that our .mdt file is a tight pack of b00 + b01 is mere convenience, but is
> it also a given for the future?  Can we rely on this assumption to never
> change?

My patch is trying to fix an existing issue, not anything for the
future.

> 
> > > > There is a blog[1] illustrating the situation quite nicely.  Again, the
> > > > only thing my WCNSS firmware differs from the illustration is that
> > > > hash segment gets a PT_LOAD type.
> > > 
> > > 
> > > Yes, that blog post nicely explains the format but it doesn't cover that the
> > > hash is encoded in the second program header nor that it can be copied to be
> > > packed tightly against the ELF header?  Maybe that only applies to newer
> > > formats?
> > 
> > Hmm, it does cover the things.  It's been illustrated that .mdt is
> > simply a concatenating of .b00 and .b01.  The .b00 includes ELF header
> > and program headers, while .b01 is just hash segment.
> > 
> > $ cat wcnss.b00 wcnss.b01 > wcnss.b00_b01
> > $ cmp wcnss.b00_b01 wcnss.mdt
> > $
> > 
> > That said, .mdt = ELF header + program headers + hash
> 
> 
> Ack, there's one picture halfway demonstrating the `ehdr_size + hash_size ==
> fw->size` case.
> 
> > > > Skipping the check will not cause problem for firmwares we have seen,
> > > > because hash segment is duplicated in .mdt file, and we are using that
> > > > duplication instead of reading from .b01 file.
> > > 
> > > 
> > > Can you share some more details about the firmware file you're using: size
> > > and the program headers as shown by `readelf -l`?
> > 
> > $ readelf -l wcnss.mdt
> > 
> > Elf file type is EXEC (Executable file)
> > Entry point 0x8b6018d4
> > There are 12 program headers, starting at offset 52
> > 
> > Program Headers:
> >    Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
> >    NULL           0x000000 0x00000000 0x00000000 0x001b4 0x00000     0
> >    LOAD           0x001000 0x8bbe0000 0x8bbe0000 0x00c58 0x02000     0x1000
> >    LOAD           0x003000 0x8b600000 0x8b600000 0x03308 0x03438 RWE 0x100000
> >    LOAD           0x00afcc 0x8b604000 0x8b604000 0x00000 0x08000 RW  0x4000
> >    LOAD           0x00afcc 0x8b60c000 0x8b60c000 0x0f000 0x0f000 RW  0x4
> >    LOAD           0x019fcc 0x8b61b000 0x8b61b000 0x00000 0x0e000 RW  0x4
> >    LOAD           0x019fcc 0x8b629000 0x8b629000 0x32eb04 0x4c2b10 RWE 0x80
> >    LOAD           0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x37cf8 RW  0x8
> >    LOAD           0x348ad0 0x8baebb80 0x8baebb80 0x00000 0x21c44 RW  0x4
> >    LOAD           0x348ad0 0x8bb30000 0x8bb30000 0x00034 0x001a8 RW  0x8
> >    LOAD           0x349fcc 0x8bb31000 0x8bb31000 0xa0000 0xa0000 RW  0x1000
> >    LOAD           0x3e9fcc 0x8bbd1000 0x8bbd1000 0x0ac3c 0x0ee00 RWE 0x1000
> > 
> > > If possible, can you also
> > > validate whether QCOM_MDT_TYPE_HASH is set in phdrs[1].p_flags &
> > > QCOM_MDT_TYPE_MASK (using something like GNU poke)?
> > 
> > It is set, otherwise the QCOM_MDT_TYPE_HASH check in
> > qcom_mdt_read_metadata() will just fail.  To be clear, everything works
> > fine for me, as long as I drop the check of (phdrs[1].p_type == PT_LOAD).
> 
> 
> Thanks, this all checks out and is similar to what I'm seeing here.  It all
> comes down to the mdt packing b00 and b01 tightly together already.
> 
> > > All the files I'm able to test adhere to `/* Is the header and hash already
> > > packed */`: `ehdr_size + hash_size == fw->size` (meaning the file solely
> > > consists of a tightly packed ELF header and hash signature)
> > 
> > Yeah, my wcnss firmware is same here.  Actually, all split firmwares I
> > have seen are this case.  But bear it in mind there are non-split
> > firmware like a single .mbn file.  My understanding is that condition
> > (ehdr_size + hash_size == fw->size) is being used to check whether it's
> > a split firmware or non-split one.
> 
> 
> Is it unreasonable to assume that more configurations besides split and
> non-split firmware might occur in the future (or are already out in the
> wild)?  These program headers allow for a variety of configurations which we
> should - in my opinion - parse and handle in a generic manner.  `ehdr_size +
> hash_size == fw->size` is convenient to have, but not something we should
> rely on.
> 
> > > but I won't be
> > > surprised if there are firmware files out in the wild with different headers
> > > or the hash missing, otherwise the else clause wouldn't exist.
> > > This else clause causes a read starting at fw->data + phdrs[1].p_offset of
> > > phdrs[1].p_filesz bytes which is almost certainly out of bounds if the
> > > program header type is PT_LOAD instead of PT_NONE, otherwise it wouldn't
> > > need to be loaded from a .bXX file in the first place.
> > 
> > So the else clause is there to handle non-split firmware, which has
> > everything within fw->size.
> 
> 
> Is it too much to ask for extra validation here, instead of always assuming
> either "split" or "non-split" firmware?  As mentioned before these program
> headers allow for a variety of configurations, which is confirmed by
> Siddharth's first patch looking for QCOM_MDT_TYPE_HASH in all headers
> instead of assuming it to reside in the second.
> 
> > > 
> > > For this quick solution to be valid I propose to reject PT_LOAD in the else
> > > clause, and otherwise validate that phdrs[1].p_offset + hash_size <=
> > > fw->size.
> > 
> > I'm not sure what you propose here are required for non-split firmware.
> 
> 
> Would it help if I sent a patch based on yours, with this extra validation
> and comments + commit message explaining the cases?
> 
> Alternatively we can try re-spinning the patches from Siddharth while taking
> review comments, the possible .mdt == .b00 + . b01 packing, and my
> suggestion to handle the headers generically into account.

I would suggest this path.  It's been 8 months, so Siddharth probably lost
the interest here.  It's reasonable that someone picks up the work.

> 
> > > The aforementioned patch series in qcom_mdt_bins_are_split (and certain
> > > firmware files from Sony phones) show that it is also possible to read
> > > PT_NULL headers from external files, as long as the header references a
> > > range that is out of bounds of the mdt file.
> > 
> > It's been shown that hashes can be read from .mdt or hash segment, and
> > independently the hash segment type could be PT_NULL or PT_LOAD.  With
> > Siddharth's patch, instead of using the hash duplication in .mdt, hash
> > will be read from hash segment.  But again, to resolve my problem, the
> > assertion that hash segment type cannot be PT_LOAD has to be dropped.
> > And that's the only thing my patch is doing.
> 
> Correct.  If I were to respin Siddharth's patchset, I'd write
> qcom_mdt_read_metadata as follows:
> 
> 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE
>    and PT_LOAD);
> 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and
>    the mdt file can be used as-is for metadata;
> 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but
>    `p_offset + p_filesz` does not fit inside the .mdt, this is (a
>    variant of) split-firmware, and the hash needs to be loaded from an
>    external file and appended to the ELF header.
> 4. If none of the above, this is a non-split firmware file.  Concatenate
>    the ELF and hash headers by reading them directly from the firmware.

Looks good to me.  I will be happy to review patches if you pick up the
work.

Shawn
Marijn Suijten Aug. 27, 2021, 10:46 a.m. UTC | #8
Hi Shawn,

On 8/27/21 11:57 AM, Shawn Guo wrote:
>> [..]
>> PT_LOAD specifies that the segment is to be loaded externally.  The fact
>> that our .mdt file is a tight pack of b00 + b01 is mere convenience, but is
>> it also a given for the future?  Can we rely on this assumption to never
>> change?
> 
> My patch is trying to fix an existing issue, not anything for the
> future.


We both agree that the PT_LOAD assertion here is too strict, but 
removing it altogether makes the function too lenient and allows for 
possible bugs.  To solve your issue in the simple case I have already 
suggested to add an extra bounds check.

>> [..]
>> Alternatively we can try re-spinning the patches from Siddharth while taking
>> review comments, the possible .mdt == .b00 + . b01 packing, and my
>> suggestion to handle the headers generically into account.
> 
> I would suggest this path.  It's been 8 months, so Siddharth probably lost
> the interest here.  It's reasonable that someone picks up the work.
> 
>>
>>>> The aforementioned patch series in qcom_mdt_bins_are_split (and certain
>>>> firmware files from Sony phones) show that it is also possible to read
>>>> PT_NULL headers from external files, as long as the header references a
>>>> range that is out of bounds of the mdt file.
>>>
>>> It's been shown that hashes can be read from .mdt or hash segment, and
>>> independently the hash segment type could be PT_NULL or PT_LOAD.  With
>>> Siddharth's patch, instead of using the hash duplication in .mdt, hash
>>> will be read from hash segment.  But again, to resolve my problem, the
>>> assertion that hash segment type cannot be PT_LOAD has to be dropped.
>>> And that's the only thing my patch is doing.
>>
>> Correct.  If I were to respin Siddharth's patchset, I'd write
>> qcom_mdt_read_metadata as follows:
>>
>> 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE
>>     and PT_LOAD);
>> 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and
>>     the mdt file can be used as-is for metadata;
>> 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but
>>     `p_offset + p_filesz` does not fit inside the .mdt, this is (a
>>     variant of) split-firmware, and the hash needs to be loaded from an
>>     external file and appended to the ELF header.
>> 4. If none of the above, this is a non-split firmware file.  Concatenate
>>     the ELF and hash headers by reading them directly from the firmware.
> 
> Looks good to me.  I will be happy to review patches if you pick up the
> work.


I'll try to get this going over the weekend or next week.  I don't think 
I can salvage anything from Siddharth's patchset and will likely start 
from scratch as I'm not confident "just detecting split firmware" is 
enough, but favour generic handling of the program headers as described 
above instead (for both the hash and firmware loading itself).

Will keep you posted!

- Marijn
Shawn Guo Aug. 27, 2021, 2:12 p.m. UTC | #9
On Fri, Aug 27, 2021 at 12:46:47PM +0200, Marijn Suijten wrote:
> Hi Shawn,
> 
> On 8/27/21 11:57 AM, Shawn Guo wrote:
> > > [..]
> > > PT_LOAD specifies that the segment is to be loaded externally.  The fact
> > > that our .mdt file is a tight pack of b00 + b01 is mere convenience, but is
> > > it also a given for the future?  Can we rely on this assumption to never
> > > change?
> > 
> > My patch is trying to fix an existing issue, not anything for the
> > future.
> 
> 
> We both agree that the PT_LOAD assertion here is too strict, but removing it
> altogether makes the function too lenient and allows for possible bugs.  To
> solve your issue in the simple case I have already suggested to add an extra
> bounds check.


So you proposed to reject PT_LOAD in the else clause, which right now
handles .mbn case, are you sure hash segment in .mbn is not going to be
PT_LOAD?

Shawn
Marijn Suijten Aug. 27, 2021, 3:13 p.m. UTC | #10
Hi Shawn,

On 8/27/21 4:12 PM, Shawn Guo wrote:
> [..]
> 
> So you proposed to reject PT_LOAD in the else clause, which right now
> handles .mbn case


Yes, I propose to reject PT_LOAD in the else-case, and additionally 
reject cases where p_offset+p_filesz > sw->size since PT_NULL can also 
cause external file loads (meaning split-firmware).  This is what 
Siddharth's patchset - or my respin of it - is going to implement.

> are you sure hash segment in .mbn is not going to be
> PT_LOAD?


PT_LOAD unambiguously indicates a program header that ought to be loaded 
from an external file.  Any mbn file (non-split firmware) without split 
files that set PT_LOAD are misusing this program header type field.  I 
have no way to validate whether such mbns are in circulation.

Of note, I have never referenced the definition of the program header 
types yet.  Please see [1]:

     PT_LOAD (1)
         Indicates that this program header describes a segment to be
         loaded from the file.

Let me know if you're planning to send a v2 of this patch with 
aforementioned improvements, then I'll adjust my plans to respin 
Siddharth's patchset accordingly.

- Marijn

[1]: https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_23.html
Bjorn Andersson Aug. 27, 2021, 4:07 p.m. UTC | #11
On Fri 27 Aug 04:57 CDT 2021, Shawn Guo wrote:

> On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote:
> > On 8/27/21 8:24 AM, Shawn Guo wrote:
[..]
> > > > 
> > > > For this quick solution to be valid I propose to reject PT_LOAD in the else
> > > > clause, and otherwise validate that phdrs[1].p_offset + hash_size <=
> > > > fw->size.
> > > 
> > > I'm not sure what you propose here are required for non-split firmware.
> > 
> > 
> > Would it help if I sent a patch based on yours, with this extra validation
> > and comments + commit message explaining the cases?
> > 
> > Alternatively we can try re-spinning the patches from Siddharth while taking
> > review comments, the possible .mdt == .b00 + . b01 packing, and my
> > suggestion to handle the headers generically into account.
> 
> I would suggest this path.  It's been 8 months, so Siddharth probably lost
> the interest here.  It's reasonable that someone picks up the work.
> 

Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split
images") we'd just blindly pass the entire .mdt into the SCM call.

So reading through your discussion and looking at the problem is that
I assumed (based on the firmware files I looked at) that the hash
segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1
are not PT_LOAD and define the part that should be passed to init_image,
and that this may or may not be packed.

And the problem we have here is that we have the packed case, but the
hash segment isn't described explicitly in the ELF header.

> > 
> > > > The aforementioned patch series in qcom_mdt_bins_are_split (and certain
> > > > firmware files from Sony phones) show that it is also possible to read
> > > > PT_NULL headers from external files, as long as the header references a
> > > > range that is out of bounds of the mdt file.
> > > 
> > > It's been shown that hashes can be read from .mdt or hash segment, and
> > > independently the hash segment type could be PT_NULL or PT_LOAD.  With
> > > Siddharth's patch, instead of using the hash duplication in .mdt, hash
> > > will be read from hash segment.  But again, to resolve my problem, the
> > > assertion that hash segment type cannot be PT_LOAD has to be dropped.
> > > And that's the only thing my patch is doing.
> > 
> > Correct.  If I were to respin Siddharth's patchset, I'd write
> > qcom_mdt_read_metadata as follows:
> > 
> > 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE
> >    and PT_LOAD);
> > 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and
> >    the mdt file can be used as-is for metadata;
> > 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but
> >    `p_offset + p_filesz` does not fit inside the .mdt, this is (a
> >    variant of) split-firmware, and the hash needs to be loaded from an
> >    external file and appended to the ELF header.

I would expect that PT_LOAD denotes that the segment should be loaded
into the final firmware region and that the hash segment would be
PT_NULL regardless of being part of the .mdt, single .mbn or a separate
.bNN segment.

> > 4. If none of the above, this is a non-split firmware file.  Concatenate
> >    the ELF and hash headers by reading them directly from the firmware.
> 

I'm happy with this plan and I think Sid will be as well.

I hope that we can introduce a change that fixes Shawn's reported
problem first, so that can be backported to stable, and then add Sid's
need for loading the additional .bNN after that (same series is fine).

> Looks good to me.  I will be happy to review patches if you pick up the
> work.
> 

Thanks to the both of you for looking into this!

Regards,
Bjorn
Marijn Suijten Aug. 27, 2021, 5:40 p.m. UTC | #12
Hi Bjorn,

On 8/27/21 6:07 PM, Bjorn Andersson wrote:
> On Fri 27 Aug 04:57 CDT 2021, Shawn Guo wrote:
> 
>> On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote:
>>> On 8/27/21 8:24 AM, Shawn Guo wrote:
> [..]
>>>>>
>>>>> For this quick solution to be valid I propose to reject PT_LOAD in the else
>>>>> clause, and otherwise validate that phdrs[1].p_offset + hash_size <=
>>>>> fw->size.
>>>>
>>>> I'm not sure what you propose here are required for non-split firmware.
>>>
>>>
>>> Would it help if I sent a patch based on yours, with this extra validation
>>> and comments + commit message explaining the cases?
>>>
>>> Alternatively we can try re-spinning the patches from Siddharth while taking
>>> review comments, the possible .mdt == .b00 + . b01 packing, and my
>>> suggestion to handle the headers generically into account.
>>
>> I would suggest this path.  It's been 8 months, so Siddharth probably lost
>> the interest here.  It's reasonable that someone picks up the work.
>>
> 
> Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split
> images") we'd just blindly pass the entire .mdt into the SCM call.
> 
> So reading through your discussion and looking at the problem is that
> I assumed (based on the firmware files I looked at) that the hash
> segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1
> are not PT_LOAD and define the part that should be passed to init_image,
> and that this may or may not be packed.
> 
> And the problem we have here is that we have the packed case, but the
> hash segment isn't described explicitly in the ELF header.


The hash segment is still explicitly described by the presence of 
QCOM_MDT_TYPE_HASH, and the external file (.b01 when the program header 
is in slot 1) contains the hash signature.  And as Shawn demonstrated, 
concatenating the ELF header in .b00 and this hash in .b01 results in 
the .mdt file.  This is the case we can (and already) optimize for, by 
passing the entire .mdt as-is into SCM if such packing is detected.

>>>>> The aforementioned patch series in qcom_mdt_bins_are_split (and certain
>>>>> firmware files from Sony phones) show that it is also possible to read
>>>>> PT_NULL headers from external files, as long as the header references a
>>>>> range that is out of bounds of the mdt file.
>>>>
>>>> It's been shown that hashes can be read from .mdt or hash segment, and
>>>> independently the hash segment type could be PT_NULL or PT_LOAD.  With
>>>> Siddharth's patch, instead of using the hash duplication in .mdt, hash
>>>> will be read from hash segment.  But again, to resolve my problem, the
>>>> assertion that hash segment type cannot be PT_LOAD has to be dropped.
>>>> And that's the only thing my patch is doing.
>>>
>>> Correct.  If I were to respin Siddharth's patchset, I'd write
>>> qcom_mdt_read_metadata as follows:
>>>
>>> 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE
>>>     and PT_LOAD);
>>> 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and
>>>     the mdt file can be used as-is for metadata;
>>> 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but
>>>     `p_offset + p_filesz` does not fit inside the .mdt, this is (a
>>>     variant of) split-firmware, and the hash needs to be loaded from an
>>>     external file and appended to the ELF header.
> 
> I would expect that PT_LOAD denotes that the segment should be loaded
> into the final firmware region and that the hash segment would be
> PT_NULL regardless of being part of the .mdt, single .mbn or a separate
> .bNN segment.


I have two mdt files here that load the hash externally, one with 
PT_LOAD and one with PT_NULL annotation (QCOM_MDT_TYPE_HASH is set). 
Both have their p_offset and/or p_offset+p_filesz well out of bounds of 
the .mdt file, but the .mdt file is of the packed variant (.b00 + .b01 
== .mdt).

This packing seems to be an optimization, making it possible to send the 
.mdt contents as-is over SCM without having to load and concatenate the 
header from a separate file.

The image produced by __qcom_mdt_load should still be reconstructed 
based on the layout specified in the program headers, of course.

>>> 4. If none of the above, this is a non-split firmware file.  Concatenate
>>>     the ELF and hash headers by reading them directly from the firmware.
>>
> 
> I'm happy with this plan and I think Sid will be as well.
> 
> I hope that we can introduce a change that fixes Shawn's reported
> problem first, so that can be backported to stable, and then add Sid's
> need for loading the additional .bNN after that (same series is fine).


I'm totally for fixing Shawn's reported issue first, before diving into 
extending mdt_loader as per Sid's patches.  In hindsight I'm not sure if 
I should be the person writing that until coming across a 
platform/firmware that needs this setup (hash in a different header, 
hash not included in the packed .mdt file).

Note that backporting needs a Fixes: tag, probably on 498b98e93900.

Shawn, do you want to send your reworded patch with the necessary 
safeguards as v2, or should I send it?

- Marijn
Bjorn Andersson Aug. 27, 2021, 9:25 p.m. UTC | #13
On Fri 27 Aug 10:40 PDT 2021, Marijn Suijten wrote:

> Hi Bjorn,
> 
> On 8/27/21 6:07 PM, Bjorn Andersson wrote:
> > On Fri 27 Aug 04:57 CDT 2021, Shawn Guo wrote:
> > 
> > > On Fri, Aug 27, 2021 at 10:29:59AM +0200, Marijn Suijten wrote:
> > > > On 8/27/21 8:24 AM, Shawn Guo wrote:
> > [..]
> > > > > > 
> > > > > > For this quick solution to be valid I propose to reject PT_LOAD in the else
> > > > > > clause, and otherwise validate that phdrs[1].p_offset + hash_size <=
> > > > > > fw->size.
> > > > > 
> > > > > I'm not sure what you propose here are required for non-split firmware.
> > > > 
> > > > 
> > > > Would it help if I sent a patch based on yours, with this extra validation
> > > > and comments + commit message explaining the cases?
> > > > 
> > > > Alternatively we can try re-spinning the patches from Siddharth while taking
> > > > review comments, the possible .mdt == .b00 + . b01 packing, and my
> > > > suggestion to handle the headers generically into account.
> > > 
> > > I would suggest this path.  It's been 8 months, so Siddharth probably lost
> > > the interest here.  It's reasonable that someone picks up the work.
> > > 
> > 
> > Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split
> > images") we'd just blindly pass the entire .mdt into the SCM call.
> > 
> > So reading through your discussion and looking at the problem is that
> > I assumed (based on the firmware files I looked at) that the hash
> > segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1
> > are not PT_LOAD and define the part that should be passed to init_image,
> > and that this may or may not be packed.
> > 
> > And the problem we have here is that we have the packed case, but the
> > hash segment isn't described explicitly in the ELF header.
> 
> 
> The hash segment is still explicitly described by the presence of
> QCOM_MDT_TYPE_HASH, and the external file (.b01 when the program header is
> in slot 1) contains the hash signature.

But in Shawn's firmware the hashes follows the elf header in the .mdt
and both are described in a single program header. Are you saying that
this entry is of type QCOM_MDT_TYPE_HASH?

> And as Shawn demonstrated,
> concatenating the ELF header in .b00 and this hash in .b01 results in the
> .mdt file.  This is the case we can (and already) optimize for, by passing
> the entire .mdt as-is into SCM if such packing is detected.
> 

In the other case, where the hash is described by a separate program
header, we need to ensure that it's concatenated directly following the
ELF header.

There are three cases here:
1) It's already directly following the ELF header and we should send
ehdr_size + hash_size bytes to PAS.
2) It's in the loaded file, but it's not tightly packed (this is what we
see in the .mbn). In this case we need to pack up the two pieces before
we send them to PAS.
3) It's not part of the loaded .mdt, in which case we need to read it
from whichever program header that happens to contain it (might not be
.b01).


In the case of #1 we should not do #3, because I've seen several cases
where the signature in .b01 does not match the one present in the .mdt.

I do expect that passing the metadata of ELF+hash+b01 will still work,
as it would ignore the tailing garbage. Replacing ELF+hash with ELF+b01
is different from what we've been doing all these years, so that will
require retesting on all possible platforms...


So I think what we're looking at it the three possible operations:
1) init_image(ELF+hash)
2) init_image(pack(ELF+hash))
3) init_image(pack(ELF+load(b01)))

Let's see if Sid agrees.

> > > > > > The aforementioned patch series in qcom_mdt_bins_are_split (and certain
> > > > > > firmware files from Sony phones) show that it is also possible to read
> > > > > > PT_NULL headers from external files, as long as the header references a
> > > > > > range that is out of bounds of the mdt file.
> > > > > 
> > > > > It's been shown that hashes can be read from .mdt or hash segment, and
> > > > > independently the hash segment type could be PT_NULL or PT_LOAD.  With
> > > > > Siddharth's patch, instead of using the hash duplication in .mdt, hash
> > > > > will be read from hash segment.  But again, to resolve my problem, the
> > > > > assertion that hash segment type cannot be PT_LOAD has to be dropped.
> > > > > And that's the only thing my patch is doing.
> > > > 
> > > > Correct.  If I were to respin Siddharth's patchset, I'd write
> > > > qcom_mdt_read_metadata as follows:
> > > > 
> > > > 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE
> > > >     and PT_LOAD);
> > > > 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and
> > > >     the mdt file can be used as-is for metadata;
> > > > 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but
> > > >     `p_offset + p_filesz` does not fit inside the .mdt, this is (a
> > > >     variant of) split-firmware, and the hash needs to be loaded from an
> > > >     external file and appended to the ELF header.
> > 
> > I would expect that PT_LOAD denotes that the segment should be loaded
> > into the final firmware region and that the hash segment would be
> > PT_NULL regardless of being part of the .mdt, single .mbn or a separate
> > .bNN segment.
> 
> 
> I have two mdt files here that load the hash externally, one with PT_LOAD
> and one with PT_NULL annotation (QCOM_MDT_TYPE_HASH is set). Both have their
> p_offset and/or p_offset+p_filesz well out of bounds of the .mdt file, but
> the .mdt file is of the packed variant (.b00 + .b01 == .mdt).
> 

Ahh, because mdt_phdr_valid() will discard them anyways as we look for
PT_LOAD segments...

> This packing seems to be an optimization, making it possible to send the
> .mdt contents as-is over SCM without having to load and concatenate the
> header from a separate file.
> 

Yes, that's probably the case. I was surprised to see the need for
repacking the two segments in the .mbn case. I can only assume that
Windows does this...

Regards,
Bjorn

> The image produced by __qcom_mdt_load should still be reconstructed based on
> the layout specified in the program headers, of course.
> 
> > > > 4. If none of the above, this is a non-split firmware file.  Concatenate
> > > >     the ELF and hash headers by reading them directly from the firmware.
> > > 
> > 
> > I'm happy with this plan and I think Sid will be as well.
> > 
> > I hope that we can introduce a change that fixes Shawn's reported
> > problem first, so that can be backported to stable, and then add Sid's
> > need for loading the additional .bNN after that (same series is fine).
> 
> 
> I'm totally for fixing Shawn's reported issue first, before diving into
> extending mdt_loader as per Sid's patches.  In hindsight I'm not sure if I
> should be the person writing that until coming across a platform/firmware
> that needs this setup (hash in a different header, hash not included in the
> packed .mdt file).
> 
> Note that backporting needs a Fixes: tag, probably on 498b98e93900.
> 
> Shawn, do you want to send your reworded patch with the necessary safeguards
> as v2, or should I send it?
> 
> - Marijn
Marijn Suijten Aug. 27, 2021, 11:42 p.m. UTC | #14
Hi Bjorn,

On 8/27/21 11:25 PM, Bjorn Andersson wrote:
> On Fri 27 Aug 10:40 PDT 2021, Marijn Suijten wrote:
 > [..]
>> On 8/27/21 6:07 PM, Bjorn Andersson wrote:
>>> [..]
>>> Prior to 498b98e93900 ("soc: qcom: mdt_loader: Support loading non-split
>>> images") we'd just blindly pass the entire .mdt into the SCM call.
>>>
>>> So reading through your discussion and looking at the problem is that
>>> I assumed (based on the firmware files I looked at) that the hash
>>> segment is explicitly mentioned in the ELF header - i.e. segment 0 and 1
>>> are not PT_LOAD and define the part that should be passed to init_image,
>>> and that this may or may not be packed.
>>>
>>> And the problem we have here is that we have the packed case, but the
>>> hash segment isn't described explicitly in the ELF header.
>>
>>
>> The hash segment is still explicitly described by the presence of
>> QCOM_MDT_TYPE_HASH, and the external file (.b01 when the program header is
>> in slot 1) contains the hash signature.
> 
> But in Shawn's firmware the hashes follows the elf header in the .mdt


Apologies, I wasn't denying that the hash doesn't follow the ELF header 
in the .mdt file (in his example, ELF (.b00) + hash (.b01) == .mdt), 
simply forgot to mention it.

> and both are described in a single program header.


Only the ELF header is described in the first program header, which is 
merely 0x001b4 bytes in size.  The hash segment is described by the 
second header, which loads it from an external file.

The trailing data in the .mdt file directly after the ELF header is 
unspecified by the program headers (all other tables are PT_LOAD after 
all).  As discussed below this seems to be an "ease-of-use" feature of 
the .mdt, doubling as metadata sent directly to SCM.  Its "presence" is 
(seemingly) detected by comparing the size of the mdt file against the 
summed size of the ELF header and the segment containing QCOM_MDT_TYPE_HASH.

> Are you saying that this entry is of type QCOM_MDT_TYPE_HASH?


According to Shawn's validation (matching what I see locally), the 
second program header starting at 0x1000 has the QCOM_MDT_TYPE_HASH bit set.
>> And as Shawn demonstrated,
>> concatenating the ELF header in .b00 and this hash in .b01 results in the
>> .mdt file.  This is the case we can (and already) optimize for, by passing
>> the entire .mdt as-is into SCM if such packing is detected.
>>
> 
> In the other case, where the hash is described by a separate program
> header, we need to ensure that it's concatenated directly following the
> ELF header.


Shawn's case has the hash described in a separate program header as 
well, but it is also appended directly after the ELF header.  The 
program headers do not specify anything about this region in the .mdt 
file though.

> There are three cases here:
> 1) It's already directly following the ELF header and we should send
> ehdr_size + hash_size bytes to PAS.


Ack.

> 2) It's in the loaded file, but it's not tightly packed (this is what we
> see in the .mbn). In this case we need to pack up the two pieces before
> we send them to PAS.


Ack.

> 3) It's not part of the loaded .mdt, in which case we need to read it
> from whichever program header that happens to contain it (might not be
> .b01).


How should we detect that it is not part of the .mdt?  Will the size of 
ELF header + hash segment simply not equal the size of the .mdt file?

> In the case of #1 we should not do #3, because I've seen several cases
> where the signature in .b01 does not match the one present in the .mdt.


Oh, that really complicates matters :(.  It means that SCM will see a 
different signature than what we eventually load into memory (recall 
that the hash bit in a .mdt is not specified in any program header, and 
will hence never be copied to memory.  The "memory size" [p_memsz] of 
the ELF header is zero, and will never be copied either).  Maybe that's 
never read though.

> I do expect that passing the metadata of ELF+hash+b01 will still work,


I don't think we have an ELF+hash+b01 case.  It's always ELF+hash, but 
that hash can come from one of three locations as described in your 
summation below.

> as it would ignore the tailing garbage. Replacing ELF+hash with ELF+b01
> is different from what we've been doing all these years, so that will
> require retesting on all possible platforms...


I don't think we should be doing this replacement (Sid's patch does, 
which seems to have lost this check) and simply keep the ehdr_size + 
hash_size == fw->size predicate for direct ELF+hash reading from a .mdt.

> So I think what we're looking at it the three possible operations:
> 1) init_image(ELF+hash)
> 2) init_image(pack(ELF+hash))
> 3) init_image(pack(ELF+load(b01)))


Correct, and all three cases should be handled by the four steps I 
mentioned to implement qcom_mdt_read_metadata.

> Let's see if Sid agrees.
> 
>>>>> [..]
>>>>> Correct.  If I were to respin Siddharth's patchset, I'd write
>>>>> qcom_mdt_read_metadata as follows:
>>>>>
>>>>> 1. Find the header that contains QCOM_MDT_TYPE_HASH (allowing PT_NONE
>>>>>      and PT_LOAD);
>>>>> 2. If `ehdr_size + hash_size == fw->size`, this is split firmware and
>>>>>      the mdt file can be used as-is for metadata;
>>>>> 3. If the type is PT_LOAD, _or_ if the type is PT_NULL but
>>>>>      `p_offset + p_filesz` does not fit inside the .mdt, this is (a
>>>>>      variant of) split-firmware, and the hash needs to be loaded from an
>>>>>      external file and appended to the ELF header.
>>>
>>> I would expect that PT_LOAD denotes that the segment should be loaded
>>> into the final firmware region and that the hash segment would be
>>> PT_NULL regardless of being part of the .mdt, single .mbn or a separate
>>> .bNN segment.
>>
>>
>> I have two mdt files here that load the hash externally, one with PT_LOAD
>> and one with PT_NULL annotation (QCOM_MDT_TYPE_HASH is set). Both have their
>> p_offset and/or p_offset+p_filesz well out of bounds of the .mdt file, but
>> the .mdt file is of the packed variant (.b00 + .b01 == .mdt).
>>
> 
> Ahh, because mdt_phdr_valid() will discard them anyways as we look for
> PT_LOAD segments...


Yes.  Curious how that works for mbn files then, as those should not 
have externally loadable files and hence no PT_LOAD segments?

>> This packing seems to be an optimization, making it possible to send the
>> .mdt contents as-is over SCM without having to load and concatenate the
>> header from a separate file.
>>
> 
> Yes, that's probably the case. I was surprised to see the need for
> repacking the two segments in the .mbn case. I can only assume that
> Windows does this...


I take it the .mbns also do not have a program header defining the 
memory contents directly after the ELF header?  Is this part 
zero-initialized in the file, or is it pre-initialized with the hash 
just like .mdt?

- Marijn
Shawn Guo Aug. 28, 2021, 6:03 a.m. UTC | #15
On Fri, Aug 27, 2021 at 05:13:34PM +0200, Marijn Suijten wrote:
> Hi Shawn,
> 
> On 8/27/21 4:12 PM, Shawn Guo wrote:
> > [..]
> > 
> > So you proposed to reject PT_LOAD in the else clause, which right now
> > handles .mbn case
> 
> 
> Yes, I propose to reject PT_LOAD in the else-case, and additionally reject
> cases where p_offset+p_filesz > sw->size since PT_NULL can also cause
> external file loads (meaning split-firmware).  This is what Siddharth's
> patchset - or my respin of it - is going to implement.
> 
> > are you sure hash segment in .mbn is not going to be
> > PT_LOAD?
> 
> 
> PT_LOAD unambiguously indicates a program header that ought to be loaded
> from an external file.  Any mbn file (non-split firmware) without split
> files that set PT_LOAD are misusing this program header type field.  I have
> no way to validate whether such mbns are in circulation.

Following your take on PT_LOAD, I assume that no PT_LOAD segment should
be found in .mbn file, correct?  Here are two .mbn I found in
circulation.  Both have PT_LOAD type for a few segments.

$ readelf -l venus.mbn 

Elf file type is EXEC (Executable file)
Entry point 0xf500000
There are 5 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  NULL           0x000000 0x00000000 0x00000000 0x000d4 0x00000     0
  LOAD           0x001000 0x0fa00000 0x0fa00000 0x00b88 0x02000     0x1000
  LOAD           0x003000 0x00000000 0x0f500000 0xeecd0 0xeecd0 R E 0x100000
  LOAD           0x0f1cd0 0x000ef000 0x0f5ef000 0x015c0 0x405000 RW  0x4000
  LOAD           0x0f3290 0x004ff000 0x0f9ff000 0x00020 0x00020 RW  0x4

$ readelf -l mba.mbn 

Elf file type is EXEC (Executable file)
Entry point 0x4417000
There are 5 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  NULL           0x000000 0x00000000 0x00000000 0x000d4 0x00000     0
  NULL           0x001000 0x8ea4a000 0x8ea4a000 0x019c8 0x02000     0x1000
  LOAD           0x003000 0x04417000 0x8ea00000 0x350d0 0x3bbc8 RWE 0x1000
  LOAD           0x039000 0x04460000 0x8ea49000 0x00380 0x00380 RW  0x1000
  GNU_STACK      0x002000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x4

Or you think these are all misusing of PT_LOAD?  Sorry, I hardly believe
your understanding on PT_LOAD matches the reality.  Instead, I'm inclined
to agree with Bjorn's comment.

Quote:

"I would expect that PT_LOAD denotes that the segment should be loaded
into the final firmware region and that the hash segment would be
PT_NULL regardless of being part of the .mdt, single .mbn or a separate
.bNN segment."

The only part that doesn't hold is "the hash segment would be PT_NULL".
But the point is that PT_LOAD doesn't mean the segment should be loaded
externally (from .bNN file). 

> 
> Of note, I have never referenced the definition of the program header types
> yet.  Please see [1]:
> 
>     PT_LOAD (1)
>         Indicates that this program header describes a segment to be
>         loaded from the file.
> 
> Let me know if you're planning to send a v2 of this patch with
> aforementioned improvements, then I'll adjust my plans to respin Siddharth's
> patchset accordingly.

I will send v2. However there will be no code change but just commit log
update based on all these discussion.  Thanks!

Shawn

> [1]: https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_23.html
Marijn Suijten Aug. 28, 2021, 8:58 a.m. UTC | #16
Hi Shawn,

On 8/28/21 8:03 AM, Shawn Guo wrote:
> [..]
> 
> Or you think these are all misusing of PT_LOAD?  Sorry, I hardly believe
> your understanding on PT_LOAD matches the reality.  Instead, I'm inclined
> to agree with Bjorn's comment.


Agreed, PT_LOAD does not imply _external_ loading at all, merely whether 
it has to be copied to the final memory region.  That's my 
misunderstanding, even after quoting the documentation.

This external loading is merely a qcom implementation detail AFAIK, and 
solely detected by the source boundaries (p_offset - p_offset+p_filesz) 
residing outside the .mdt file.

> Quote:
> 
> "I would expect that PT_LOAD denotes that the segment should be loaded
> into the final firmware region and that the hash segment would be
> PT_NULL regardless of being part of the .mdt, single .mbn or a separate
> .bNN segment."
> 
> The only part that doesn't hold is "the hash segment would be PT_NULL".
> But the point is that PT_LOAD doesn't mean the segment should be loaded
> externally (from .bNN file).


Yes, this is strange.  Why would the .mdt request the hash segment to be 
loaded to firmware memory when it's filtered out anyway?  Are we 
supposed to filter it out?

> [..]
> 
> I will send v2. However there will be no code change but just commit log
> update based on all these discussion.  Thanks!


Sounds good - glad we could have this discussion to get to the bottom of 
this and conclude that removing this check is the right approach without 
any side-effects, including a detailed justification in the commit-message.

- Marijn
diff mbox series

Patch

diff --git a/drivers/soc/qcom/mdt_loader.c b/drivers/soc/qcom/mdt_loader.c
index eba7f76f9d61..6034cd8992b0 100644
--- a/drivers/soc/qcom/mdt_loader.c
+++ b/drivers/soc/qcom/mdt_loader.c
@@ -98,7 +98,7 @@  void *qcom_mdt_read_metadata(const struct firmware *fw, size_t *data_len)
 	if (ehdr->e_phnum < 2)
 		return ERR_PTR(-EINVAL);
 
-	if (phdrs[0].p_type == PT_LOAD || phdrs[1].p_type == PT_LOAD)
+	if (phdrs[0].p_type == PT_LOAD)
 		return ERR_PTR(-EINVAL);
 
 	if ((phdrs[1].p_flags & QCOM_MDT_TYPE_MASK) != QCOM_MDT_TYPE_HASH)