Message ID: 20170722095323.9964-3-christian.gmeiner@gmail.com (mailing list archive)
State: New, archived
On Saturday, 22.07.2017 at 11:53 +0200, Christian Gmeiner wrote:
> This makes it possible to allocate multiple events under the event
> spinlock. This change is needed to support 'sync'-points.
>
> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 31 ++++++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> index fa9c7bd98e9c..ab108b0ed573 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> @@ -1137,10 +1137,12 @@ int etnaviv_gpu_fence_sync_obj(struct etnaviv_gem_object *etnaviv_obj,
>   * event management:
>   */
>
> -static unsigned int event_alloc(struct etnaviv_gpu *gpu)
> +static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
> +	unsigned int *events)
>  {
>  	unsigned long ret, flags;
> -	unsigned int event;
> +	unsigned used, i;
> +	int err = 0;
>
>  	ret = wait_for_completion_timeout(&gpu->event_free,
>  					  msecs_to_jiffies(10 * 10000));

This isn't obvious from the current code, but there are exactly as much
completions in the queue, as there are events. See initialization of the
completions in etnaviv_gpu_init(). This means the event allocation under
spinlock always succeeds if the wait hasn't timed out. To keep this
working you need to wait for the completion nr_event times, probably
changing this to an absolute timeout, so we don't lengthen the timeout
with the number of events.

> @@ -1149,16 +1151,24 @@ static unsigned int event_alloc(struct etnaviv_gpu *gpu)
>
>  	spin_lock_irqsave(&gpu->event_spinlock, flags);
>
> -	/* find first free event */
> -	event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
> -	if (event < ETNA_NR_EVENTS)
> +	/* are there enough free events? */
> +	used = bitmap_weight(gpu->event_bitmap, ETNA_NR_EVENTS);
> +	if (used + nr_events > ETNA_NR_EVENTS) {
> +		err = -EBUSY;
> +		goto out;
> +	}

This isn't necessary if you waited successfully for the completion to
signal nr_event times. The allocation is guaranteed to succeed in that
case. We just want an early return before even locking the spinlock,
giving back the already collected completions if one of the waits timed
out.

> +
> +	for (i = 0; i < nr_events; i++) {
> +		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
> +
> +		events[i] = event;
>  		set_bit(event, gpu->event_bitmap);
> -	else
> -		event = ~0U;
> +	}
>
> +out:
>  	spin_unlock_irqrestore(&gpu->event_spinlock, flags);
>
> -	return event;
> +	return err;
>  }
>
>  static void event_free(struct etnaviv_gpu *gpu, unsigned int event)
> @@ -1327,10 +1337,9 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu,
>  	 *
>  	 */
>
> -	event = event_alloc(gpu);
> -	if (unlikely(event == ~0U)) {
> +	ret = event_alloc(gpu, 1, &event);
> +	if (!ret) {
>  		DRM_ERROR("no free event\n");
> -		ret = -EBUSY;
>  		goto out_pm_put;
>  	}
>
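[Editor's note: the one-completion-per-event invariant Lucas describes is easier to follow next to the init side. The following is a rough, paraphrased sketch, not code quoted from this thread; see etnaviv_gpu_init() in the driver for the authoritative version.]

/*
 * Rough sketch of the init-side invariant (paraphrased, names taken from
 * the thread): the event_free completion is "charged" once per event
 * slot, so each successful wait_for_completion_timeout() in event_alloc()
 * corresponds to exactly one free bit in gpu->event_bitmap.
 */
static void event_queue_init_sketch(struct etnaviv_gpu *gpu)
{
	unsigned int i;

	init_completion(&gpu->event_free);
	bitmap_zero(gpu->event_bitmap, ETNA_NR_EVENTS);

	/* one completion credit per hardware event slot */
	for (i = 0; i < ETNA_NR_EVENTS; i++)
		complete(&gpu->event_free);
}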
Hi Lucas.

Thanks for your review - hopefully there will be only one last v3
series of that patches.

2017-08-08 12:00 GMT+02:00 Lucas Stach <l.stach@pengutronix.de>:
> On Saturday, 22.07.2017 at 11:53 +0200, Christian Gmeiner wrote:
>> This makes it possible to allocate multiple events under the event
>> spinlock. This change is needed to support 'sync'-points.
>>
>> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
>> ---
>>  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 31 ++++++++++++++++++++-----------
>>  1 file changed, 20 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>> index fa9c7bd98e9c..ab108b0ed573 100644
>> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>> @@ -1137,10 +1137,12 @@ int etnaviv_gpu_fence_sync_obj(struct etnaviv_gem_object *etnaviv_obj,
>>   * event management:
>>   */
>>
>> -static unsigned int event_alloc(struct etnaviv_gpu *gpu)
>> +static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
>> +	unsigned int *events)
>>  {
>>  	unsigned long ret, flags;
>> -	unsigned int event;
>> +	unsigned used, i;
>> +	int err = 0;
>>
>>  	ret = wait_for_completion_timeout(&gpu->event_free,
>>  					  msecs_to_jiffies(10 * 10000));
>
> This isn't obvious from the current code, but there are exactly as much
> completions in the queue, as there are events. See initialization of the
> completions in etnaviv_gpu_init(). This means the event allocation under
> spinlock always succeeds if the wait hasn't timed out. To keep this
> working you need to wait for the completion nr_event times, probably
> changing this to an absolute timeout, so we don't lengthen the timeout
> with the number of events.
>

Makes sense but I not really sure how to best implement an absolute
timeout. We could wait
absolute timeout / nr_events in each wait_for_completion_timeout(..) call.

>> @@ -1149,16 +1151,24 @@ static unsigned int event_alloc(struct etnaviv_gpu *gpu)
>>
>>  	spin_lock_irqsave(&gpu->event_spinlock, flags);
>>
>> -	/* find first free event */
>> -	event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
>> -	if (event < ETNA_NR_EVENTS)
>> +	/* are there enough free events? */
>> +	used = bitmap_weight(gpu->event_bitmap, ETNA_NR_EVENTS);
>> +	if (used + nr_events > ETNA_NR_EVENTS) {
>> +		err = -EBUSY;
>> +		goto out;
>> +	}
>
> This isn't necessary if you waited successfully for the completion to
> signal nr_event times. The allocation is guaranteed to succeed in that
> case. We just want an early return before even locking the spinlock,
> giving back the already collected completions if one of the waits timed
> out.
>

Yes - feels more elegant that way.

>> +
>> +	for (i = 0; i < nr_events; i++) {
>> +		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
>> +
>> +		events[i] = event;
>>  		set_bit(event, gpu->event_bitmap);
>> -	else
>> -		event = ~0U;
>> +	}
>>
>> +out:
>>  	spin_unlock_irqrestore(&gpu->event_spinlock, flags);
>>
>> -	return event;
>> +	return err;
>>  }
>>
>>  static void event_free(struct etnaviv_gpu *gpu, unsigned int event)
>> @@ -1327,10 +1337,9 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu,
>>  	 *
>>  	 */
>>
>> -	event = event_alloc(gpu);
>> -	if (unlikely(event == ~0U)) {
>> +	ret = event_alloc(gpu, 1, &event);
>> +	if (!ret) {
>>  		DRM_ERROR("no free event\n");
>> -		ret = -EBUSY;
>>  		goto out_pm_put;
>>  	}
>>
>
>

What about something like this:

static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
	unsigned int *events)
{
	unsigned long flags;
	unsigned i;
	int err = 0;

	for (i = 0; i < nr_events; i++) {
		unsigned long ret;

		ret = wait_for_completion_timeout(&gpu->event_free,
						  msecs_to_jiffies(10 * 10000));
		if (!ret) {
			dev_err(gpu->dev, "wait_for_completion_timeout failed");
			err = -EBUSY;
			goto out;
		}
	}

	spin_lock_irqsave(&gpu->event_spinlock, flags);

	for (i = 0; i < nr_events; i++) {
		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);

		events[i] = event;
		set_bit(event, gpu->event_bitmap);
	}

	spin_unlock_irqrestore(&gpu->event_spinlock, flags);

out:
	return err;
}

greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info
Hi Christian,

On Tuesday, 22.08.2017 at 10:27 +0200, Christian Gmeiner wrote:
> Hi Lucas.
>
> Thanks for your review - hopefully there will be only one last v3
> series of that patches.

Yep, I would really like to merge this series. It's been in the making
for long enough. :)

> 2017-08-08 12:00 GMT+02:00 Lucas Stach <l.stach@pengutronix.de>:
> > On Saturday, 22.07.2017 at 11:53 +0200, Christian Gmeiner wrote:
> >> This makes it possible to allocate multiple events under the event
> >> spinlock. This change is needed to support 'sync'-points.
> >>
> >> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
> >> ---
> >>  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 31 ++++++++++++++++++++-----------
> >>  1 file changed, 20 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> >> index fa9c7bd98e9c..ab108b0ed573 100644
> >> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> >> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
> >> @@ -1137,10 +1137,12 @@ int etnaviv_gpu_fence_sync_obj(struct etnaviv_gem_object *etnaviv_obj,
> >>   * event management:
> >>   */
> >>
> >> -static unsigned int event_alloc(struct etnaviv_gpu *gpu)
> >> +static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
> >> +	unsigned int *events)
> >>  {
> >>  	unsigned long ret, flags;
> >> -	unsigned int event;
> >> +	unsigned used, i;
> >> +	int err = 0;
> >>
> >>  	ret = wait_for_completion_timeout(&gpu->event_free,
> >>  					  msecs_to_jiffies(10 * 10000));
> >
> > This isn't obvious from the current code, but there are exactly as much
> > completions in the queue, as there are events. See initialization of the
> > completions in etnaviv_gpu_init(). This means the event allocation under
> > spinlock always succeeds if the wait hasn't timed out. To keep this
> > working you need to wait for the completion nr_event times, probably
> > changing this to an absolute timeout, so we don't lengthen the timeout
> > with the number of events.
> >
>
> Makes sense but I not really sure how to best implement an absolute
> timeout. We could wait
> absolute timeout / nr_events in each wait_for_completion_timeout(..) call.

I think we should just keep the 10sec timeout regardless of the number
of events. If we don't acquire the needed events after 10secs, it's
probably fine to give up.

> >> @@ -1149,16 +1151,24 @@ static unsigned int event_alloc(struct etnaviv_gpu *gpu)
> >>
> >>  	spin_lock_irqsave(&gpu->event_spinlock, flags);
> >>
> >> -	/* find first free event */
> >> -	event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
> >> -	if (event < ETNA_NR_EVENTS)
> >> +	/* are there enough free events? */
> >> +	used = bitmap_weight(gpu->event_bitmap, ETNA_NR_EVENTS);
> >> +	if (used + nr_events > ETNA_NR_EVENTS) {
> >> +		err = -EBUSY;
> >> +		goto out;
> >> +	}
> >
> > This isn't necessary if you waited successfully for the completion to
> > signal nr_event times. The allocation is guaranteed to succeed in that
> > case. We just want an early return before even locking the spinlock,
> > giving back the already collected completions if one of the waits timed
> > out.
> >
>
> Yes - feels more elegant that way.
>
> >> +
> >> +	for (i = 0; i < nr_events; i++) {
> >> +		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
> >> +
> >> +		events[i] = event;
> >>  		set_bit(event, gpu->event_bitmap);
> >> -	else
> >> -		event = ~0U;
> >> +	}
> >>
> >> +out:
> >>  	spin_unlock_irqrestore(&gpu->event_spinlock, flags);
> >>
> >> -	return event;
> >> +	return err;
> >>  }
> >>
> >>  static void event_free(struct etnaviv_gpu *gpu, unsigned int event)
> >> @@ -1327,10 +1337,9 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu,
> >>  	 *
> >>  	 */
> >>
> >> -	event = event_alloc(gpu);
> >> -	if (unlikely(event == ~0U)) {
> >> +	ret = event_alloc(gpu, 1, &event);
> >> +	if (!ret) {
> >>  		DRM_ERROR("no free event\n");
> >> -		ret = -EBUSY;
> >>  		goto out_pm_put;
> >>  	}
> >>
> >
> >
>
> What about something like this:

A few notes inline.

>
> static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
> 	unsigned int *events)
> {
> 	unsigned long flags;
> 	unsigned i;
> 	int err = 0;
>
> 	for (i = 0; i < nr_events; i++) {
> 		unsigned long ret;

timeout = msecs_to_jiffies(10 * 10000);

>
> 		ret = wait_for_completion_timeout(&gpu->event_free,
> 						  msecs_to_jiffies(10 * 10000));

  ^^ use timeout here

> 		if (!ret) {
> 			dev_err(gpu->dev, "wait_for_completion_timeout failed");
> 			err = -EBUSY;
> 			goto out;
> 		}

If we waited successfully, wait_for_completion_timeout will return the
remaining jiffies, so we can re-initialize the timeout with the return
value to shorten the next wait.

timeout = ret;

> 	}
>
> 	spin_lock_irqsave(&gpu->event_spinlock, flags);
>
> 	for (i = 0; i < nr_events; i++) {
> 		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
>
> 		events[i] = event;
> 		set_bit(event, gpu->event_bitmap);
> 	}
>
> 	spin_unlock_irqrestore(&gpu->event_spinlock, flags);
>
> out:

If you end up here you need to return the already acquired completions,
by rolling back your for loop above and do a complete(&gpu->event_free)
for each acquired completion to avoid depleting the completion queue.

> 	return err;
> }

Regards,
Lucas
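[Editor's note: taking the proposal above together with Lucas's three inline notes (wait once per requested event, reuse the remaining jiffies as the next timeout, roll back on failure), a consolidated sketch could look like the following. This is an editorial illustration of the review feedback, not necessarily the code that was eventually merged.]

/*
 * Editorial sketch, not the merged code: wait once per requested event,
 * carry the remaining jiffies over as the next timeout so the total wait
 * stays around 10 seconds regardless of nr_events, and hand back any
 * completions already consumed if a wait times out.
 */
static int event_alloc(struct etnaviv_gpu *gpu, unsigned int nr_events,
		       unsigned int *events)
{
	unsigned long timeout = msecs_to_jiffies(10 * 10000);
	unsigned long flags;
	unsigned int i, acquired = 0;
	int err;

	for (i = 0; i < nr_events; i++) {
		unsigned long remaining;

		remaining = wait_for_completion_timeout(&gpu->event_free,
							timeout);
		if (!remaining) {
			dev_err(gpu->dev, "wait_for_completion_timeout failed");
			err = -EBUSY;
			goto out;
		}

		acquired++;
		timeout = remaining;
	}

	spin_lock_irqsave(&gpu->event_spinlock, flags);

	for (i = 0; i < nr_events; i++) {
		int event = find_first_zero_bit(gpu->event_bitmap,
						ETNA_NR_EVENTS);

		events[i] = event;
		set_bit(event, gpu->event_bitmap);
	}

	spin_unlock_irqrestore(&gpu->event_spinlock, flags);

	return 0;

out:
	/* Roll back: return the completions we already took. */
	for (i = 0; i < acquired; i++)
		complete(&gpu->event_free);

	return err;
}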
Hi Lucas,

2017-08-22 10:39 GMT+02:00 Lucas Stach <l.stach@pengutronix.de>:
> Hi Christian,
>
> On Tuesday, 22.08.2017 at 10:27 +0200, Christian Gmeiner wrote:
>> Hi Lucas.
>>
>> Thanks for your review - hopefully there will be only one last v3
>> series of that patches.
>
> Yep, I would really like to merge this series. It's been in the making
> for long enough. :)
>
>> 2017-08-08 12:00 GMT+02:00 Lucas Stach <l.stach@pengutronix.de>:
>> > On Saturday, 22.07.2017 at 11:53 +0200, Christian Gmeiner wrote:
>> >> This makes it possible to allocate multiple events under the event
>> >> spinlock. This change is needed to support 'sync'-points.
>> >>
>> >> Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
>> >> ---
>> >>  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 31 ++++++++++++++++++++-----------
>> >>  1 file changed, 20 insertions(+), 11 deletions(-)
>> >>
>> >> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>> >> index fa9c7bd98e9c..ab108b0ed573 100644
>> >> --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>> >> +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
>> >> @@ -1137,10 +1137,12 @@ int etnaviv_gpu_fence_sync_obj(struct etnaviv_gem_object *etnaviv_obj,
>> >>   * event management:
>> >>   */
>> >>
>> >> -static unsigned int event_alloc(struct etnaviv_gpu *gpu)
>> >> +static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
>> >> +	unsigned int *events)
>> >>  {
>> >>  	unsigned long ret, flags;
>> >> -	unsigned int event;
>> >> +	unsigned used, i;
>> >> +	int err = 0;
>> >>
>> >>  	ret = wait_for_completion_timeout(&gpu->event_free,
>> >>  					  msecs_to_jiffies(10 * 10000));
>> >
>> > This isn't obvious from the current code, but there are exactly as much
>> > completions in the queue, as there are events. See initialization of the
>> > completions in etnaviv_gpu_init(). This means the event allocation under
>> > spinlock always succeeds if the wait hasn't timed out. To keep this
>> > working you need to wait for the completion nr_event times, probably
>> > changing this to an absolute timeout, so we don't lengthen the timeout
>> > with the number of events.
>> >
>>
>> Makes sense but I not really sure how to best implement an absolute
>> timeout. We could wait
>> absolute timeout / nr_events in each wait_for_completion_timeout(..) call.
>
> I think we should just keep the 10sec timeout regardless of the number
> of events. If we don't acquire the needed events after 10secs, it's
> probably fine to give up.
>

Yeah :)

>> >> @@ -1149,16 +1151,24 @@ static unsigned int event_alloc(struct etnaviv_gpu *gpu)
>> >>
>> >>  	spin_lock_irqsave(&gpu->event_spinlock, flags);
>> >>
>> >> -	/* find first free event */
>> >> -	event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
>> >> -	if (event < ETNA_NR_EVENTS)
>> >> +	/* are there enough free events? */
>> >> +	used = bitmap_weight(gpu->event_bitmap, ETNA_NR_EVENTS);
>> >> +	if (used + nr_events > ETNA_NR_EVENTS) {
>> >> +		err = -EBUSY;
>> >> +		goto out;
>> >> +	}
>> >
>> > This isn't necessary if you waited successfully for the completion to
>> > signal nr_event times. The allocation is guaranteed to succeed in that
>> > case. We just want an early return before even locking the spinlock,
>> > giving back the already collected completions if one of the waits timed
>> > out.
>> >
>>
>> Yes - feels more elegant that way.
>>
>> >> +
>> >> +	for (i = 0; i < nr_events; i++) {
>> >> +		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
>> >> +
>> >> +		events[i] = event;
>> >>  		set_bit(event, gpu->event_bitmap);
>> >> -	else
>> >> -		event = ~0U;
>> >> +	}
>> >>
>> >> +out:
>> >>  	spin_unlock_irqrestore(&gpu->event_spinlock, flags);
>> >>
>> >> -	return event;
>> >> +	return err;
>> >>  }
>> >>
>> >>  static void event_free(struct etnaviv_gpu *gpu, unsigned int event)
>> >> @@ -1327,10 +1337,9 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu,
>> >>  	 *
>> >>  	 */
>> >>
>> >> -	event = event_alloc(gpu);
>> >> -	if (unlikely(event == ~0U)) {
>> >> +	ret = event_alloc(gpu, 1, &event);
>> >> +	if (!ret) {
>> >>  		DRM_ERROR("no free event\n");
>> >> -		ret = -EBUSY;
>> >>  		goto out_pm_put;
>> >>  	}
>> >>
>> >
>> >
>>
>> What about something like this:
>
> A few notes inline.
>
>>
>> static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
>> 	unsigned int *events)
>> {
>> 	unsigned long flags;
>> 	unsigned i;
>> 	int err = 0;
>>
>> 	for (i = 0; i < nr_events; i++) {
>> 		unsigned long ret;
> timeout = msecs_to_jiffies(10 * 10000);
>>
>> 		ret = wait_for_completion_timeout(&gpu->event_free,
>> 						  msecs_to_jiffies(10 * 10000));
> ^^ use timeout here
>> 		if (!ret) {
>> 			dev_err(gpu->dev, "wait_for_completion_timeout failed");
>> 			err = -EBUSY;
>> 			goto out;
>> 		}
>
> If we waited successfully, wait_for_completion_timeout will return the
> remaining jiffies, so we can re-initialize the timeout with the return
> value to shorten the next wait.
>
> timeout = ret;

ok

>> 	}
>>
>> 	spin_lock_irqsave(&gpu->event_spinlock, flags);
>>
>> 	for (i = 0; i < nr_events; i++) {
>> 		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
>>
>> 		events[i] = event;
>> 		set_bit(event, gpu->event_bitmap);
>> 	}
>>
>> 	spin_unlock_irqrestore(&gpu->event_spinlock, flags);
>>
>> out:
>
> If you end up here you need to return the already acquired completions,
> by rolling back your for loop above and do a complete(&gpu->event_free)
> for each acquired completion to avoid depleting the completion queue.
>

Oops.. totally missed that part - time for a coffee.

greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info
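[Editor's note: for context on the "depleting the completion queue" remark, the free side pairs the bitmap bit with a complete() call, so an event_alloc() that consumed completions but never handed out events has to return them itself. The following is a paraphrased sketch; see event_free() in the driver for the real code.]

/*
 * Paraphrased sketch of the free path: clearing the bitmap bit releases
 * the event ID, and complete() returns one credit to the queue that
 * event_alloc() waits on. This is why a timed-out event_alloc() must call
 * complete() for every wait that already succeeded, as Lucas points out.
 */
static void event_free_sketch(struct etnaviv_gpu *gpu, unsigned int event)
{
	clear_bit(event, gpu->event_bitmap);
	complete(&gpu->event_free);
}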
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index fa9c7bd98e9c..ab108b0ed573 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -1137,10 +1137,12 @@ int etnaviv_gpu_fence_sync_obj(struct etnaviv_gem_object *etnaviv_obj,
  * event management:
  */
 
-static unsigned int event_alloc(struct etnaviv_gpu *gpu)
+static int event_alloc(struct etnaviv_gpu *gpu, unsigned nr_events,
+	unsigned int *events)
 {
 	unsigned long ret, flags;
-	unsigned int event;
+	unsigned used, i;
+	int err = 0;
 
 	ret = wait_for_completion_timeout(&gpu->event_free,
 					  msecs_to_jiffies(10 * 10000));
@@ -1149,16 +1151,24 @@ static unsigned int event_alloc(struct etnaviv_gpu *gpu)
 
 	spin_lock_irqsave(&gpu->event_spinlock, flags);
 
-	/* find first free event */
-	event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
-	if (event < ETNA_NR_EVENTS)
+	/* are there enough free events? */
+	used = bitmap_weight(gpu->event_bitmap, ETNA_NR_EVENTS);
+	if (used + nr_events > ETNA_NR_EVENTS) {
+		err = -EBUSY;
+		goto out;
+	}
+
+	for (i = 0; i < nr_events; i++) {
+		int event = find_first_zero_bit(gpu->event_bitmap, ETNA_NR_EVENTS);
+
+		events[i] = event;
 		set_bit(event, gpu->event_bitmap);
-	else
-		event = ~0U;
+	}
 
+out:
 	spin_unlock_irqrestore(&gpu->event_spinlock, flags);
 
-	return event;
+	return err;
 }
 
 static void event_free(struct etnaviv_gpu *gpu, unsigned int event)
@@ -1327,10 +1337,9 @@ int etnaviv_gpu_submit(struct etnaviv_gpu *gpu,
 	 *
 	 */
 
-	event = event_alloc(gpu);
-	if (unlikely(event == ~0U)) {
+	ret = event_alloc(gpu, 1, &event);
+	if (!ret) {
 		DRM_ERROR("no free event\n");
-		ret = -EBUSY;
 		goto out_pm_put;
 	}
This makes it possible to allocate multiple events under the event
spinlock. This change is needed to support 'sync'-points.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)
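[Editor's note: to illustrate why event_alloc() gains an nr_events parameter at all, a submit that carries sync points needs several events reserved in one atomic step, while the plain submit path keeps asking for a single one. The sketch below is a hypothetical caller; the count of three and the helper name are invented for illustration, only the calling convention (0 on success, negative errno on failure) comes from the patch. Note that the posted hunk in etnaviv_gpu_submit() checks "if (!ret)", which looks inverted relative to that convention.]

/*
 * Hypothetical caller sketch: reserve several events in one go, e.g. for
 * a submit that needs sync points. The count of three and the helper
 * name are assumptions for illustration; only event_alloc()'s calling
 * convention (0 on success, negative errno on failure) comes from the
 * patch above.
 */
static int reserve_sync_point_events(struct etnaviv_gpu *gpu,
				     unsigned int *sync_events)
{
	int ret;

	ret = event_alloc(gpu, 3, sync_events);
	if (ret) {
		DRM_ERROR("no free events\n");
		return ret;
	}

	return 0;
}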