Message ID | 20241023-pmic-glink-ecancelled-v2-1-ebc268129407@oss.qualcomm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | soc: qcom: pmic_glink: Resolve failures to bring up pmic_glink | expand |
On 10/23/2024 10:24 AM, Bjorn Andersson wrote: > GLINK operates using pre-allocated buffers, aka intents, where incoming > messages are aggregated before being passed up the stack. In the case > that no suitable intents have been announced by the receiver, the sender > can request an intent to be allocated. > > The initial implementation of the response to such request dealt > with two outcomes; granted allocations, and all other cases being > considered -ECANCELLED (likely from "cancelling the operation as the > remote is going down"). > > But on some channels intent allocation is not supported, instead the > remote will pre-allocate and announce a fixed number of intents for the > sender to use. If for such channels an rpmsg_send() is being invoked > before any channels have been announced, an intent request will be > issued and as this comes back rejected the call fails with -ECANCELED. > > Given that this is reported in the same way as the remote being shut > down, there's no way for the client to differentiate the two cases. > > In line with the original GLINK design, change the return value to > -EAGAIN for the case where the remote rejects an intent allocation > request. > > It's tempting to handle this case in the GLINK core, as we expect > intents to show up in this case. But there's no way to distinguish > between this case and a rejection for a too big allocation, nor is it > possible to predict if a currently used (and seemingly suitable) intent > will be returned for reuse or not. As such, returning the error to the > client and allow it to react seems to be the only sensible solution. > > In addition to this, commit 'c05dfce0b89e ("rpmsg: glink: Wait for > intent, not just request ack")' changed the logic such that the code > always wait for an intent request response and an intent. This works out > in most cases, but in the event that an intent request is rejected and no > further intent arrives (e.g. client asks for a too big intent), the code > will stall for 10 seconds and then return -ETIMEDOUT; instead of a more > suitable error. > > This change also resulted in intent requests racing with the shutdown of > the remote would be exposed to this same problem, unless some intent > happens to arrive. A patch for this was developed and posted by Sarannya > S [1], and has been incorporated here. > > To summarize, the intent request can end in 4 ways: > - Timeout, no response arrived => return -ETIMEDOUT > - Abort TX, the edge is going away => return -ECANCELLED > - Intent request was rejected => return -EAGAIN > - Intent request was accepted, and an intent arrived => return 0 > > This patch was developed with input from Sarannya S, Deepak Kumar Singh, > and Chris Lew. > > [1] https://lore.kernel.org/all/20240925072328.1163183-1-quic_deesin@quicinc.com/ > > Fixes: c05dfce0b89e ("rpmsg: glink: Wait for intent, not just request ack") > Cc: stable@vger.kernel.org > Tested-by: Johan Hovold <johan+linaro@kernel.org> > Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com> > --- Reviewed-by: Chris Lew <quic_clew@quicinc.com>
diff --git a/drivers/rpmsg/qcom_glink_native.c b/drivers/rpmsg/qcom_glink_native.c index 0b2f290069080638581a13b3a580054d31e176c2..d3af1dfa3c7d71b95dda911dfc7ad844679359d6 100644 --- a/drivers/rpmsg/qcom_glink_native.c +++ b/drivers/rpmsg/qcom_glink_native.c @@ -1440,14 +1440,18 @@ static int qcom_glink_request_intent(struct qcom_glink *glink, goto unlock; ret = wait_event_timeout(channel->intent_req_wq, - READ_ONCE(channel->intent_req_result) >= 0 && - READ_ONCE(channel->intent_received), + READ_ONCE(channel->intent_req_result) == 0 || + (READ_ONCE(channel->intent_req_result) > 0 && + READ_ONCE(channel->intent_received)) || + glink->abort_tx, 10 * HZ); if (!ret) { dev_err(glink->dev, "intent request timed out\n"); ret = -ETIMEDOUT; + } else if (glink->abort_tx) { + ret = -ECANCELED; } else { - ret = READ_ONCE(channel->intent_req_result) ? 0 : -ECANCELED; + ret = READ_ONCE(channel->intent_req_result) ? 0 : -EAGAIN; } unlock: