diff mbox series

[RFC] remoteproc: k3-r5: Fix check performed in k3_r5_rproc_{mbox_callback/kick}

Message ID 20240916083131.2801755-1-s-vadapalli@ti.com (mailing list archive)
State New, archived
Headers show
Series [RFC] remoteproc: k3-r5: Fix check performed in k3_r5_rproc_{mbox_callback/kick} | expand

Commit Message

s-vadapalli Sept. 16, 2024, 8:31 a.m. UTC
Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
"k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
responsible for attaching to a remote core, updates the state of the remote
core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".

The "rproc_start_subdevices()" function triggers the probe of the Virtio
RPMsg devices associated with the remote core, which require that the
"k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
functional. Hence, drop the check in the callbacks.

Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
---

Hello,

Since the commit being fixed is not yet a part of Mainline Linux, this
patch is based on linux-next tagged next-20240913.

An alternative to this patch will be a change to the "__rproc_attach()"
function in the "remoteproc_core.c" driver with
rproc->state = RPROC_ATTACHED;
being set after "rproc_attach_device()" is invoked, but __before__
invoking "rproc_start_subdevices()". Since this change will be performed
in the common Remoteproc Core, it appeared to me that fixing it in the
TI remoteproc driver is the correct approach.

The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
required, which I shall post if the current patch is acceptable.

Kindly review and share your feedback on this patch.

Regards,
Siddharth.

 drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
 1 file changed, 8 deletions(-)

Comments

Mathieu Poirier Sept. 16, 2024, 3:20 p.m. UTC | #1
On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@ti.com> wrote:
>
> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
> responsible for attaching to a remote core, updates the state of the remote
> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
>
> The "rproc_start_subdevices()" function triggers the probe of the Virtio
> RPMsg devices associated with the remote core, which require that the
> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
> functional. Hence, drop the check in the callbacks.

Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.

>
> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> ---
>
> Hello,
>
> Since the commit being fixed is not yet a part of Mainline Linux, this
> patch is based on linux-next tagged next-20240913.
>
> An alternative to this patch will be a change to the "__rproc_attach()"
> function in the "remoteproc_core.c" driver with
> rproc->state = RPROC_ATTACHED;
> being set after "rproc_attach_device()" is invoked, but __before__
> invoking "rproc_start_subdevices()". Since this change will be performed
> in the common Remoteproc Core, it appeared to me that fixing it in the
> TI remoteproc driver is the correct approach.
>
> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
> required, which I shall post if the current patch is acceptable.
>
> Kindly review and share your feedback on this patch.
>
> Regards,
> Siddharth.
>
>  drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> index 747ee467da88..4894461aa65f 100644
> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
>         const char *name = kproc->rproc->name;
>         u32 msg = omap_mbox_message(data);
>
> -       /* Do not forward message from a detached core */
> -       if (kproc->rproc->state == RPROC_DETACHED)
> -               return;
> -
>         dev_dbg(dev, "mbox msg: 0x%x\n", msg);
>
>         switch (msg) {
> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
>         mbox_msg_t msg = (mbox_msg_t)vqid;
>         int ret;
>
> -       /* Do not forward message to a detached core */
> -       if (kproc->rproc->state == RPROC_DETACHED)
> -               return;
> -
>         /* send the index of the triggered virtqueue in the mailbox payload */
>         ret = mbox_send_message(kproc->mbox, (void *)msg);
>         if (ret < 0)
> --
> 2.40.1
>
Kumar, Udit Sept. 17, 2024, 5:20 a.m. UTC | #2
On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
> On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@ti.com> wrote:
>> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
>> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
>> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
>> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
>> responsible for attaching to a remote core, updates the state of the remote
>> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
>>
>> The "rproc_start_subdevices()" function triggers the probe of the Virtio
>> RPMsg devices associated with the remote core, which require that the
>> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
>> functional. Hence, drop the check in the callbacks.
> Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.


Please don't :) , it will break rproc in general for k3 devices.

Couple of solutions for this race around condition (in mine preference 
order)

1) In 
https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190 
have a check , if probe in is progress or not

2) 
https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205 
-- correct the state to ON or something else

3) Move condition 
https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360 
before rproc_start_subdevices 
<https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices> 
calling



>
>> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>> ---
>>
>> Hello,
>>
>> Since the commit being fixed is not yet a part of Mainline Linux, this
>> patch is based on linux-next tagged next-20240913.
>>
>> An alternative to this patch will be a change to the "__rproc_attach()"
>> function in the "remoteproc_core.c" driver with
>> rproc->state = RPROC_ATTACHED;
>> being set after "rproc_attach_device()" is invoked, but __before__
>> invoking "rproc_start_subdevices()". Since this change will be performed
>> in the common Remoteproc Core, it appeared to me that fixing it in the
>> TI remoteproc driver is the correct approach.
>>
>> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
>> required, which I shall post if the current patch is acceptable.
>>
>> Kindly review and share your feedback on this patch.
>>
>> Regards,
>> Siddharth.
>>
>>   drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
>>   1 file changed, 8 deletions(-)
>>
>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>> index 747ee467da88..4894461aa65f 100644
>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
>>          const char *name = kproc->rproc->name;
>>          u32 msg = omap_mbox_message(data);
>>
>> -       /* Do not forward message from a detached core */
>> -       if (kproc->rproc->state == RPROC_DETACHED)
>> -               return;
>> -
>>          dev_dbg(dev, "mbox msg: 0x%x\n", msg);
>>
>>          switch (msg) {
>> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
>>          mbox_msg_t msg = (mbox_msg_t)vqid;
>>          int ret;
>>
>> -       /* Do not forward message to a detached core */
>> -       if (kproc->rproc->state == RPROC_DETACHED)
>> -               return;
>> -
>>          /* send the index of the triggered virtqueue in the mailbox payload */
>>          ret = mbox_send_message(kproc->mbox, (void *)msg);
>>          if (ret < 0)
>> --
>> 2.40.1
>>
Mathieu Poirier Sept. 17, 2024, 8:37 a.m. UTC | #3
On Mon, 16 Sept 2024 at 23:20, Kumar, Udit <u-kumar1@ti.com> wrote:
>
> On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
> > On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@ti.com> wrote:
> >> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
> >> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
> >> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
> >> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
> >> responsible for attaching to a remote core, updates the state of the remote
> >> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
> >>
> >> The "rproc_start_subdevices()" function triggers the probe of the Virtio
> >> RPMsg devices associated with the remote core, which require that the
> >> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
> >> functional. Hence, drop the check in the callbacks.
> > Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.
>
>
> Please don't :) , it will break rproc in general for k3 devices.
>

Why not - it is already broken anyway.  Reverting the patches will
force TI to actually think about the feature in terms of design,
completeness and testing.  The merge window opened on Sunday - I'm not
going to merge whack-a-mole patches and hope the right fix comes
along.

> Couple of solutions for this race around condition (in mine preference
> order)
>

This is for the TI team to discuss _and_ test thoroughly.  From hereon
and until I see things improve, all patches from TI will need to be
tagged with R-B and T-B tags (collected on the mailing lists) from two
different individuals before I look at them.

> 1) In
> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190
> have a check , if probe in is progress or not
>
> 2)
> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205
> -- correct the state to ON or something else
>
> 3) Move condition
> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360
> before rproc_start_subdevices
> <https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices>
> calling
>
>
>
> >
> >> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
> >> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> >> ---
> >>
> >> Hello,
> >>
> >> Since the commit being fixed is not yet a part of Mainline Linux, this
> >> patch is based on linux-next tagged next-20240913.
> >>
> >> An alternative to this patch will be a change to the "__rproc_attach()"
> >> function in the "remoteproc_core.c" driver with
> >> rproc->state = RPROC_ATTACHED;
> >> being set after "rproc_attach_device()" is invoked, but __before__
> >> invoking "rproc_start_subdevices()". Since this change will be performed
> >> in the common Remoteproc Core, it appeared to me that fixing it in the
> >> TI remoteproc driver is the correct approach.
> >>
> >> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
> >> required, which I shall post if the current patch is acceptable.
> >>
> >> Kindly review and share your feedback on this patch.
> >>
> >> Regards,
> >> Siddharth.
> >>
> >>   drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
> >>   1 file changed, 8 deletions(-)
> >>
> >> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >> index 747ee467da88..4894461aa65f 100644
> >> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
> >>          const char *name = kproc->rproc->name;
> >>          u32 msg = omap_mbox_message(data);
> >>
> >> -       /* Do not forward message from a detached core */
> >> -       if (kproc->rproc->state == RPROC_DETACHED)
> >> -               return;
> >> -
> >>          dev_dbg(dev, "mbox msg: 0x%x\n", msg);
> >>
> >>          switch (msg) {
> >> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
> >>          mbox_msg_t msg = (mbox_msg_t)vqid;
> >>          int ret;
> >>
> >> -       /* Do not forward message to a detached core */
> >> -       if (kproc->rproc->state == RPROC_DETACHED)
> >> -               return;
> >> -
> >>          /* send the index of the triggered virtqueue in the mailbox payload */
> >>          ret = mbox_send_message(kproc->mbox, (void *)msg);
> >>          if (ret < 0)
> >> --
> >> 2.40.1
> >>
Kumar, Udit Sept. 17, 2024, 9:13 a.m. UTC | #4
Hi Mathieu,

On 9/17/2024 2:07 PM, Mathieu Poirier wrote:
> On Mon, 16 Sept 2024 at 23:20, Kumar, Udit <u-kumar1@ti.com> wrote:
>> On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
>>> On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@ti.com> wrote:
>>>> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
>>>> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
>>>> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
>>>> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
>>>> responsible for attaching to a remote core, updates the state of the remote
>>>> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
>>>>
>>>> The "rproc_start_subdevices()" function triggers the probe of the Virtio
>>>> RPMsg devices associated with the remote core, which require that the
>>>> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
>>>> functional. Hence, drop the check in the callbacks.
>>> Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.
>>
>> Please don't :) , it will break rproc in general for k3 devices.
>>
> Why not - it is already broken anyway.  Reverting the patches will
> force TI to actually think about the feature in terms of design,
> completeness and testing.  The merge window opened on Sunday - I'm not
> going to merge whack-a-mole patches and hope the right fix comes
> along.

Now, I am not advocating here to revert or not.

But where we stand currently

1-  Without this patch, IPC is broken in general.

2-  With this patch, IPC is conditionally broken.

In either case, we need to fix it.

your call to revert or keep it.


>
>> Couple of solutions for this race around condition (in mine preference
>> order)
>>
> This is for the TI team to discuss _and_ test thoroughly.  From hereon
> and until I see things improve, all patches from TI will need to be
> tagged with R-B and T-B tags (collected on the mailing lists) from two
> different individuals before I look at them.

Sure we will take care of above

and fair ask on R-B and T-B tags

>
>> 1) In
>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190
>> have a check , if probe in is progress or not
>>
>> 2)
>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205
>> -- correct the state to ON or something else
>>
>> 3) Move condition
>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360
>> before rproc_start_subdevices
>> <https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices>
>> calling
>>
>>
>>
>>>> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
>>>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>>>> ---
>>>>
>>>> Hello,
>>>>
>>>> Since the commit being fixed is not yet a part of Mainline Linux, this
>>>> patch is based on linux-next tagged next-20240913.
>>>>
>>>> An alternative to this patch will be a change to the "__rproc_attach()"
>>>> function in the "remoteproc_core.c" driver with
>>>> rproc->state = RPROC_ATTACHED;
>>>> being set after "rproc_attach_device()" is invoked, but __before__
>>>> invoking "rproc_start_subdevices()". Since this change will be performed
>>>> in the common Remoteproc Core, it appeared to me that fixing it in the
>>>> TI remoteproc driver is the correct approach.
>>>>
>>>> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
>>>> required, which I shall post if the current patch is acceptable.
>>>>
>>>> Kindly review and share your feedback on this patch.
>>>>
>>>> Regards,
>>>> Siddharth.
>>>>
>>>>    drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
>>>>    1 file changed, 8 deletions(-)
>>>>
>>>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>> index 747ee467da88..4894461aa65f 100644
>>>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
>>>>           const char *name = kproc->rproc->name;
>>>>           u32 msg = omap_mbox_message(data);
>>>>
>>>> -       /* Do not forward message from a detached core */
>>>> -       if (kproc->rproc->state == RPROC_DETACHED)
>>>> -               return;
>>>> -
>>>>           dev_dbg(dev, "mbox msg: 0x%x\n", msg);
>>>>
>>>>           switch (msg) {
>>>> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
>>>>           mbox_msg_t msg = (mbox_msg_t)vqid;
>>>>           int ret;
>>>>
>>>> -       /* Do not forward message to a detached core */
>>>> -       if (kproc->rproc->state == RPROC_DETACHED)
>>>> -               return;
>>>> -
>>>>           /* send the index of the triggered virtqueue in the mailbox payload */
>>>>           ret = mbox_send_message(kproc->mbox, (void *)msg);
>>>>           if (ret < 0)
>>>> --
>>>> 2.40.1
>>>>
Kumar, Udit Sept. 17, 2024, 9:19 a.m. UTC | #5
On 9/17/2024 2:43 PM, Kumar, Udit wrote:
> Hi Mathieu,
>
> On 9/17/2024 2:07 PM, Mathieu Poirier wrote:
>> On Mon, 16 Sept 2024 at 23:20, Kumar, Udit <u-kumar1@ti.com> wrote:
>>> On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
>>>> On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli 
>>>> <s-vadapalli@ti.com> wrote:
>>>>> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle 
>>>>> during
>>>>> probe routine") introduced a check in the 
>>>>> "k3_r5_rproc_mbox_callback()" and
>>>>> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote 
>>>>> core's
>>>>> state is "RPROC_DETACHED". However, the "__rproc_attach()" 
>>>>> function that is
>>>>> responsible for attaching to a remote core, updates the state of 
>>>>> the remote
>>>>> core to "RPROC_ATTACHED" only after invoking 
>>>>> "rproc_start_subdevices()".
>>>>>
>>>>> The "rproc_start_subdevices()" function triggers the probe of the 
>>>>> Virtio
>>>>> RPMsg devices associated with the remote core, which require that the
>>>>> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
>>>>> functional. Hence, drop the check in the callbacks.
>>>> Honestly, I am very tempted to just revert f3f11cfe8907 and 
>>>> ea1d6fb5b571.
>>>
>>> Please don't :) , it will break rproc in general for k3 devices.
>>>
>> Why not - it is already broken anyway.  Reverting the patches will
>> force TI to actually think about the feature in terms of design,
>> completeness and testing.  The merge window opened on Sunday - I'm not
>> going to merge whack-a-mole patches and hope the right fix comes
>> along.
>
> Now, I am not advocating here to revert or not.
>
> But where we stand currently
>
> 1-  Without this patch, IPC is broken in general.
>
> 2-  With this patch, IPC is conditionally broken.


Sorry for confusion,

here _this_ patch I meant below commit ids

f3f11cfe8907 and ea1d6fb5b571.

>
> In either case, we need to fix it.
>
> your call to revert or keep it.
>
>
>>
>>> Couple of solutions for this race around condition (in mine preference
>>> order)
>>>
>> This is for the TI team to discuss _and_ test thoroughly.  From hereon
>> and until I see things improve, all patches from TI will need to be
>> tagged with R-B and T-B tags (collected on the mailing lists) from two
>> different individuals before I look at them.
>
> Sure we will take care of above
>
> and fair ask on R-B and T-B tags
>
>>
>>> 1) In
>>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190 
>>>
>>> have a check , if probe in is progress or not
>>>
>>> 2)
>>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205 
>>>
>>> -- correct the state to ON or something else
>>>
>>> 3) Move condition
>>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360 
>>>
>>> before rproc_start_subdevices
>>> <https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices>
>>> calling
>>>
>>>
>>>
>>>>> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle 
>>>>> during probe routine")
>>>>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>>>>> ---
>>>>>
>>>>> Hello,
>>>>>
>>>>> Since the commit being fixed is not yet a part of Mainline Linux, 
>>>>> this
>>>>> patch is based on linux-next tagged next-20240913.
>>>>>
>>>>> An alternative to this patch will be a change to the 
>>>>> "__rproc_attach()"
>>>>> function in the "remoteproc_core.c" driver with
>>>>> rproc->state = RPROC_ATTACHED;
>>>>> being set after "rproc_attach_device()" is invoked, but __before__
>>>>> invoking "rproc_start_subdevices()". Since this change will be 
>>>>> performed
>>>>> in the common Remoteproc Core, it appeared to me that fixing it in 
>>>>> the
>>>>> TI remoteproc driver is the correct approach.
>>>>>
>>>>> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
>>>>> required, which I shall post if the current patch is acceptable.
>>>>>
>>>>> Kindly review and share your feedback on this patch.
>>>>>
>>>>> Regards,
>>>>> Siddharth.
>>>>>
>>>>>    drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
>>>>>    1 file changed, 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c 
>>>>> b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>>> index 747ee467da88..4894461aa65f 100644
>>>>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>>> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct 
>>>>> mbox_client *client, void *data)
>>>>>           const char *name = kproc->rproc->name;
>>>>>           u32 msg = omap_mbox_message(data);
>>>>>
>>>>> -       /* Do not forward message from a detached core */
>>>>> -       if (kproc->rproc->state == RPROC_DETACHED)
>>>>> -               return;
>>>>> -
>>>>>           dev_dbg(dev, "mbox msg: 0x%x\n", msg);
>>>>>
>>>>>           switch (msg) {
>>>>> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc 
>>>>> *rproc, int vqid)
>>>>>           mbox_msg_t msg = (mbox_msg_t)vqid;
>>>>>           int ret;
>>>>>
>>>>> -       /* Do not forward message to a detached core */
>>>>> -       if (kproc->rproc->state == RPROC_DETACHED)
>>>>> -               return;
>>>>> -
>>>>>           /* send the index of the triggered virtqueue in the 
>>>>> mailbox payload */
>>>>>           ret = mbox_send_message(kproc->mbox, (void *)msg);
>>>>>           if (ret < 0)
>>>>> -- 
>>>>> 2.40.1
>>>>>
Beleswar Prasad Padhi Sept. 17, 2024, 9:40 a.m. UTC | #6
Hi Mathieu,

On 17/09/24 14:07, Mathieu Poirier wrote:
> On Mon, 16 Sept 2024 at 23:20, Kumar, Udit <u-kumar1@ti.com> wrote:
>> On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
>>> On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@ti.com> wrote:
>>>> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
>>>> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
>>>> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
>>>> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
>>>> responsible for attaching to a remote core, updates the state of the remote
>>>> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
>>>>
>>>> The "rproc_start_subdevices()" function triggers the probe of the Virtio
>>>> RPMsg devices associated with the remote core, which require that the
>>>> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
>>>> functional. Hence, drop the check in the callbacks.
>>> Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.
>>
>> Please don't :) , it will break rproc in general for k3 devices.
>>
> Why not - it is already broken anyway.  Reverting the patches will
> force TI to actually think about the feature in terms of design,
> completeness and testing.  The merge window opened on Sunday - I'm not
> going to merge whack-a-mole patches and hope the right fix comes
> along.


Apologies for causing this trouble, Mathieu. I have accumulated various 
use-cases of the driver, including this, and hereon will keep in mind 
while posting further patches.

>
>> Couple of solutions for this race around condition (in mine preference
>> order)
>>
> This is for the TI team to discuss _and_ test thoroughly.  From hereon
> and until I see things improve, all patches from TI will need to be
> tagged with R-B and T-B tags (collected on the mailing lists) from two
> different individuals before I look at them.


Understood, that is a fair ask. Hereon, I will also attach my test logs 
for all the usecases I've tested a patch with, to give more visibility 
on the testing done.

>
>> 1) In
>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190
>> have a check , if probe in is progress or not
>>
>> 2)
>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205
>> -- correct the state to ON or something else
>>
>> 3) Move condition
>> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360
>> before rproc_start_subdevices
>> <https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices>
>> calling
>>
>>
>>
>>>> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
>>>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
>>>> ---
>>>>
>>>> Hello,
>>>>
>>>> Since the commit being fixed is not yet a part of Mainline Linux, this
>>>> patch is based on linux-next tagged next-20240913.
>>>>
>>>> An alternative to this patch will be a change to the "__rproc_attach()"
>>>> function in the "remoteproc_core.c" driver with
>>>> rproc->state = RPROC_ATTACHED;
>>>> being set after "rproc_attach_device()" is invoked, but __before__
>>>> invoking "rproc_start_subdevices()". Since this change will be performed
>>>> in the common Remoteproc Core, it appeared to me that fixing it in the
>>>> TI remoteproc driver is the correct approach.
>>>>
>>>> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
>>>> required, which I shall post if the current patch is acceptable.
>>>>
>>>> Kindly review and share your feedback on this patch.
>>>>
>>>> Regards,
>>>> Siddharth.
>>>>
>>>>    drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
>>>>    1 file changed, 8 deletions(-)
>>>>
>>>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>> index 747ee467da88..4894461aa65f 100644
>>>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>>>> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
>>>>           const char *name = kproc->rproc->name;
>>>>           u32 msg = omap_mbox_message(data);
>>>>
>>>> -       /* Do not forward message from a detached core */
>>>> -       if (kproc->rproc->state == RPROC_DETACHED)
>>>> -               return;
>>>> -
>>>>           dev_dbg(dev, "mbox msg: 0x%x\n", msg);
>>>>
>>>>           switch (msg) {
>>>> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
>>>>           mbox_msg_t msg = (mbox_msg_t)vqid;
>>>>           int ret;
>>>>
>>>> -       /* Do not forward message to a detached core */
>>>> -       if (kproc->rproc->state == RPROC_DETACHED)
>>>> -               return;
>>>> -
>>>>           /* send the index of the triggered virtqueue in the mailbox payload */
>>>>           ret = mbox_send_message(kproc->mbox, (void *)msg);
>>>>           if (ret < 0)
>>>> --
>>>> 2.40.1
>>>>
Mathieu Poirier Sept. 19, 2024, 8:26 a.m. UTC | #7
On Tue, 17 Sept 2024 at 03:13, Kumar, Udit <u-kumar1@ti.com> wrote:
>
> Hi Mathieu,
>
> On 9/17/2024 2:07 PM, Mathieu Poirier wrote:
> > On Mon, 16 Sept 2024 at 23:20, Kumar, Udit <u-kumar1@ti.com> wrote:
> >> On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
> >>> On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@ti.com> wrote:
> >>>> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
> >>>> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
> >>>> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
> >>>> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
> >>>> responsible for attaching to a remote core, updates the state of the remote
> >>>> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
> >>>>
> >>>> The "rproc_start_subdevices()" function triggers the probe of the Virtio
> >>>> RPMsg devices associated with the remote core, which require that the
> >>>> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
> >>>> functional. Hence, drop the check in the callbacks.
> >>> Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.
> >>
> >> Please don't :) , it will break rproc in general for k3 devices.
> >>
> > Why not - it is already broken anyway.  Reverting the patches will
> > force TI to actually think about the feature in terms of design,
> > completeness and testing.  The merge window opened on Sunday - I'm not
> > going to merge whack-a-mole patches and hope the right fix comes
> > along.
>
> Now, I am not advocating here to revert or not.
>
> But where we stand currently
>
> 1-  Without this patch, IPC is broken in general.
>
> 2-  With this patch, IPC is conditionally broken.
>
> In either case, we need to fix it.
>
> your call to revert or keep it.
>

I will keep f3f11cfe8907 and ea1d6fb5b571 but will not take this one
because 1) we are in the merge window and 2) I have no assurance that
this is the right (and complete) fix.

>
> >
> >> Couple of solutions for this race around condition (in mine preference
> >> order)
> >>
> > This is for the TI team to discuss _and_ test thoroughly.  From hereon
> > and until I see things improve, all patches from TI will need to be
> > tagged with R-B and T-B tags (collected on the mailing lists) from two
> > different individuals before I look at them.
>
> Sure we will take care of above
>
> and fair ask on R-B and T-B tags
>
> >
> >> 1) In
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190
> >> have a check , if probe in is progress or not
> >>
> >> 2)
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205
> >> -- correct the state to ON or something else
> >>
> >> 3) Move condition
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360
> >> before rproc_start_subdevices
> >> <https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices>
> >> calling
> >>
> >>
> >>
> >>>> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
> >>>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> >>>> ---
> >>>>
> >>>> Hello,
> >>>>
> >>>> Since the commit being fixed is not yet a part of Mainline Linux, this
> >>>> patch is based on linux-next tagged next-20240913.
> >>>>
> >>>> An alternative to this patch will be a change to the "__rproc_attach()"
> >>>> function in the "remoteproc_core.c" driver with
> >>>> rproc->state = RPROC_ATTACHED;
> >>>> being set after "rproc_attach_device()" is invoked, but __before__
> >>>> invoking "rproc_start_subdevices()". Since this change will be performed
> >>>> in the common Remoteproc Core, it appeared to me that fixing it in the
> >>>> TI remoteproc driver is the correct approach.
> >>>>
> >>>> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
> >>>> required, which I shall post if the current patch is acceptable.
> >>>>
> >>>> Kindly review and share your feedback on this patch.
> >>>>
> >>>> Regards,
> >>>> Siddharth.
> >>>>
> >>>>    drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
> >>>>    1 file changed, 8 deletions(-)
> >>>>
> >>>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> index 747ee467da88..4894461aa65f 100644
> >>>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
> >>>>           const char *name = kproc->rproc->name;
> >>>>           u32 msg = omap_mbox_message(data);
> >>>>
> >>>> -       /* Do not forward message from a detached core */
> >>>> -       if (kproc->rproc->state == RPROC_DETACHED)
> >>>> -               return;
> >>>> -
> >>>>           dev_dbg(dev, "mbox msg: 0x%x\n", msg);
> >>>>
> >>>>           switch (msg) {
> >>>> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
> >>>>           mbox_msg_t msg = (mbox_msg_t)vqid;
> >>>>           int ret;
> >>>>
> >>>> -       /* Do not forward message to a detached core */
> >>>> -       if (kproc->rproc->state == RPROC_DETACHED)
> >>>> -               return;
> >>>> -
> >>>>           /* send the index of the triggered virtqueue in the mailbox payload */
> >>>>           ret = mbox_send_message(kproc->mbox, (void *)msg);
> >>>>           if (ret < 0)
> >>>> --
> >>>> 2.40.1
> >>>>
Mathieu Poirier Sept. 19, 2024, 8:33 a.m. UTC | #8
s

On Tue, 17 Sept 2024 at 03:40, Beleswar Prasad Padhi <b-padhi@ti.com> wrote:
>
> Hi Mathieu,
>
> On 17/09/24 14:07, Mathieu Poirier wrote:
> > On Mon, 16 Sept 2024 at 23:20, Kumar, Udit <u-kumar1@ti.com> wrote:
> >> On 9/16/2024 8:50 PM, Mathieu Poirier wrote:
> >>> On Mon, 16 Sept 2024 at 02:31, Siddharth Vadapalli <s-vadapalli@ti.com> wrote:
> >>>> Commit f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during
> >>>> probe routine") introduced a check in the "k3_r5_rproc_mbox_callback()" and
> >>>> "k3_r5_rproc_kick()" callbacks, causing them to exit if the remote core's
> >>>> state is "RPROC_DETACHED". However, the "__rproc_attach()" function that is
> >>>> responsible for attaching to a remote core, updates the state of the remote
> >>>> core to "RPROC_ATTACHED" only after invoking "rproc_start_subdevices()".
> >>>>
> >>>> The "rproc_start_subdevices()" function triggers the probe of the Virtio
> >>>> RPMsg devices associated with the remote core, which require that the
> >>>> "k3_r5_rproc_kick()" and "k3_r5_rproc_mbox_callback()" callbacks are
> >>>> functional. Hence, drop the check in the callbacks.
> >>> Honestly, I am very tempted to just revert f3f11cfe8907 and ea1d6fb5b571.
> >>
> >> Please don't :) , it will break rproc in general for k3 devices.
> >>
> > Why not - it is already broken anyway.  Reverting the patches will
> > force TI to actually think about the feature in terms of design,
> > completeness and testing.  The merge window opened on Sunday - I'm not
> > going to merge whack-a-mole patches and hope the right fix comes
> > along.
>
>
> Apologies for causing this trouble, Mathieu. I have accumulated various
> use-cases of the driver, including this, and hereon will keep in mind
> while posting further patches.
>
> >
> >> Couple of solutions for this race around condition (in mine preference
> >> order)
> >>
> > This is for the TI team to discuss _and_ test thoroughly.  From hereon
> > and until I see things improve, all patches from TI will need to be
> > tagged with R-B and T-B tags (collected on the mailing lists) from two
> > different individuals before I look at them.
>
>
> Understood, that is a fair ask. Hereon, I will also attach my test logs
> for all the usecases I've tested a patch with, to give more visibility
> on the testing done.
>

I am not in a position (nor have the time) to look at test logs and as
such, there is no point in doing that.  That time should be better
spent building a CI that goes through all the scenarios and making
sure that all patches pass the tests before being sent to the mailing
list.

> >
> >> 1) In
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L190
> >> have a check , if probe in is progress or not
> >>
> >> 2)
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/ti_k3_r5_remoteproc.c#L1205
> >> -- correct the state to ON or something else
> >>
> >> 3) Move condition
> >> https://elixir.bootlin.com/linux/v6.11/source/drivers/remoteproc/remoteproc_core.c#L1360
> >> before rproc_start_subdevices
> >> <https://elixir.bootlin.com/linux/v6.11/C/ident/rproc_start_subdevices>
> >> calling
> >>
> >>
> >>
> >>>> Fixes: f3f11cfe8907 ("remoteproc: k3-r5: Acquire mailbox handle during probe routine")
> >>>> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
> >>>> ---
> >>>>
> >>>> Hello,
> >>>>
> >>>> Since the commit being fixed is not yet a part of Mainline Linux, this
> >>>> patch is based on linux-next tagged next-20240913.
> >>>>
> >>>> An alternative to this patch will be a change to the "__rproc_attach()"
> >>>> function in the "remoteproc_core.c" driver with
> >>>> rproc->state = RPROC_ATTACHED;
> >>>> being set after "rproc_attach_device()" is invoked, but __before__
> >>>> invoking "rproc_start_subdevices()". Since this change will be performed
> >>>> in the common Remoteproc Core, it appeared to me that fixing it in the
> >>>> TI remoteproc driver is the correct approach.
> >>>>
> >>>> The equivalent of this patch for ti_k3_dsp_remoteproc.c might also be
> >>>> required, which I shall post if the current patch is acceptable.
> >>>>
> >>>> Kindly review and share your feedback on this patch.
> >>>>
> >>>> Regards,
> >>>> Siddharth.
> >>>>
> >>>>    drivers/remoteproc/ti_k3_r5_remoteproc.c | 8 --------
> >>>>    1 file changed, 8 deletions(-)
> >>>>
> >>>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> index 747ee467da88..4894461aa65f 100644
> >>>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> >>>> @@ -194,10 +194,6 @@ static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
> >>>>           const char *name = kproc->rproc->name;
> >>>>           u32 msg = omap_mbox_message(data);
> >>>>
> >>>> -       /* Do not forward message from a detached core */
> >>>> -       if (kproc->rproc->state == RPROC_DETACHED)
> >>>> -               return;
> >>>> -
> >>>>           dev_dbg(dev, "mbox msg: 0x%x\n", msg);
> >>>>
> >>>>           switch (msg) {
> >>>> @@ -233,10 +229,6 @@ static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
> >>>>           mbox_msg_t msg = (mbox_msg_t)vqid;
> >>>>           int ret;
> >>>>
> >>>> -       /* Do not forward message to a detached core */
> >>>> -       if (kproc->rproc->state == RPROC_DETACHED)
> >>>> -               return;
> >>>> -
> >>>>           /* send the index of the triggered virtqueue in the mailbox payload */
> >>>>           ret = mbox_send_message(kproc->mbox, (void *)msg);
> >>>>           if (ret < 0)
> >>>> --
> >>>> 2.40.1
> >>>>
diff mbox series

Patch

diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
index 747ee467da88..4894461aa65f 100644
--- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
+++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
@@ -194,10 +194,6 @@  static void k3_r5_rproc_mbox_callback(struct mbox_client *client, void *data)
 	const char *name = kproc->rproc->name;
 	u32 msg = omap_mbox_message(data);
 
-	/* Do not forward message from a detached core */
-	if (kproc->rproc->state == RPROC_DETACHED)
-		return;
-
 	dev_dbg(dev, "mbox msg: 0x%x\n", msg);
 
 	switch (msg) {
@@ -233,10 +229,6 @@  static void k3_r5_rproc_kick(struct rproc *rproc, int vqid)
 	mbox_msg_t msg = (mbox_msg_t)vqid;
 	int ret;
 
-	/* Do not forward message to a detached core */
-	if (kproc->rproc->state == RPROC_DETACHED)
-		return;
-
 	/* send the index of the triggered virtqueue in the mailbox payload */
 	ret = mbox_send_message(kproc->mbox, (void *)msg);
 	if (ret < 0)