[v2,5/5] drm/vkms: Implement CRC debugfs API

Message ID	dab6e073c5061d549e608f4e02515938651175bf.1531570020.git.hamohammed.sa@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> Date: Sat, 14 Jul 2018 15:23:32 +0300 From: Haneen Mohammed <hamohammed.sa@gmail.com> To: dri-devel@lists.freedesktop.org Subject: [PATCH v2 5/5] drm/vkms: Implement CRC debugfs API Message-ID: <dab6e073c5061d549e608f4e02515938651175bf.1531570020.git.hamohammed.sa@gmail.com> References: <cover.1531570020.git.hamohammed.sa@gmail.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <cover.1531570020.git.hamohammed.sa@gmail.com> User-Agent: Mutt/1.9.4 (2018-02-28) Precedence: list Cc: hamohammed.sa@gmail.com, rodrigosiqueiramelo@gmail.com Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>

Haneen Mohammed July 14, 2018, 12:23 p.m. UTC

Implement the set_crc_source() callback.
Compute CRC using crc32 on the visible part of the framebuffer.
Use ordered workqueue to compute and add CRC at the end of a vblank.

Use appropriate synchronization methods since the crc computation must
be atomic wrt the generated vblank event for a given atomic update,

Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
---
Changes in v2:
- Include this patch in this patchset.

 drivers/gpu/drm/vkms/Makefile     |  2 +-
 drivers/gpu/drm/vkms/vkms_crtc.c  | 60 ++++++++++++++++++++++++++-----
 drivers/gpu/drm/vkms/vkms_drv.c   |  1 +
 drivers/gpu/drm/vkms/vkms_drv.h   | 21 +++++++++++
 drivers/gpu/drm/vkms/vkms_plane.c | 10 ++++++
 5 files changed, 85 insertions(+), 9 deletions(-)

Sean Paul July 16, 2018, 6:22 p.m. UTC | #1

On Sat, Jul 14, 2018 at 03:23:32PM +0300, Haneen Mohammed wrote:
> Implement the set_crc_source() callback.
> Compute CRC using crc32 on the visible part of the framebuffer.
> Use ordered workqueue to compute and add CRC at the end of a vblank.
> 
> Use appropriate synchronization methods since the crc computation must
> be atomic wrt the generated vblank event for a given atomic update,
> 
> Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>

Hey Haneen,
Thanks for revising this patch set. Things are looking good across the series,
just a few more comments :-)

> ---
> Changes in v2:
> - Include this patch in this patchset.
> 
>  drivers/gpu/drm/vkms/Makefile     |  2 +-
>  drivers/gpu/drm/vkms/vkms_crtc.c  | 60 ++++++++++++++++++++++++++-----
>  drivers/gpu/drm/vkms/vkms_drv.c   |  1 +
>  drivers/gpu/drm/vkms/vkms_drv.h   | 21 +++++++++++
>  drivers/gpu/drm/vkms/vkms_plane.c | 10 ++++++
>  5 files changed, 85 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> index 986297da51bf..37966914f70b 100644
> --- a/drivers/gpu/drm/vkms/Makefile
> +++ b/drivers/gpu/drm/vkms/Makefile
> @@ -1,3 +1,3 @@
> -vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o
> +vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o vkms_crc.o
>  
>  obj-$(CONFIG_DRM_VKMS) += vkms.o
> diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> index 26babb85ca77..f3a674db33b8 100644
> --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> @@ -10,18 +10,36 @@
>  #include <drm/drm_atomic_helper.h>
>  #include <drm/drm_crtc_helper.h>
>  
> -static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> +static void _vblank_handle(struct vkms_output *output)
>  {
> -	struct vkms_output *output = container_of(timer, struct vkms_output,
> -						  vblank_hrtimer);
>  	struct drm_crtc *crtc = &output->crtc;
> -	int ret_overrun;
> +	struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
>  	bool ret;
>  
> +	int crc_enabled = 0;
> +
> +	spin_lock(&output->lock);
> +	crc_enabled = output->crc_enabled;

Aside from the implicit bool -> int cast, I don't think you need this local var,
just use output->crc_enabled directly below.

>  	ret = drm_crtc_handle_vblank(crtc);
>  	if (!ret)
>  		DRM_ERROR("vkms failure on handling vblank");

This would be more useful with the error code printed.

>  
> +	if (state && crc_enabled) {
> +		state->n_frame = drm_crtc_accurate_vblank_count(crtc);
> +		queue_work(output->crc_workq, &state->crc_work);
> +	}
> +
> +	spin_unlock(&output->lock);
> +}
> +
> +static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> +{
> +	struct vkms_output *output = container_of(timer, struct vkms_output,
> +						  vblank_hrtimer);
> +	int ret_overrun;
> +
> +	_vblank_handle(output);
> +
>  	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
>  					  output->period_ns);
>  
> @@ -97,6 +115,8 @@ vkms_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
>  
>  	__drm_atomic_helper_crtc_duplicate_state(crtc, &vkms_state->base);
>  
> +	INIT_WORK(&vkms_state->crc_work, vkms_crc_work_handle);
> +
>  	return &vkms_state->base;
>  }
>  
> @@ -104,11 +124,18 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
>  					   struct drm_crtc_state *state)
>  {
>  	struct vkms_crtc_state *vkms_state;
> +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
>  
>  	vkms_state = to_vkms_crtc_state(state);
>  
>  	__drm_atomic_helper_crtc_destroy_state(state);
> -	kfree(vkms_state);
> +
> +	if (vkms_state) {
> +		flush_workqueue(vkms_out->crc_workq);

I'm a little worried about this bit. Since the workqueue is per-output, is it
possible you'll be waiting for more frames to complete than you need to be?

> +		drm_framebuffer_put(&vkms_state->data.fb);
> +		memset(&vkms_state->data, 0, sizeof(struct vkms_crc_data));
> +		kfree(vkms_state);
> +	}
>  }
>  
>  static const struct drm_crtc_funcs vkms_crtc_funcs = {
> @@ -120,6 +147,7 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
>  	.atomic_destroy_state   = vkms_atomic_crtc_destroy_state,
>  	.enable_vblank		= vkms_enable_vblank,
>  	.disable_vblank		= vkms_disable_vblank,
> +	.set_crc_source		= vkms_set_crc_source,
>  };
>  
>  static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
> @@ -134,26 +162,37 @@ static void vkms_crtc_atomic_disable(struct drm_crtc *crtc,
>  	drm_crtc_vblank_off(crtc);
>  }
>  
> +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> +				   struct drm_crtc_state *old_crtc_state)
> +{
> +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> +
> +	spin_lock_irq(&vkms_output->lock);

Hmm, you can't lock across atomic_begin/flush. What is this protecting against?

> +}
> +
>  static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
>  				   struct drm_crtc_state *old_crtc_state)
>  {
> -	unsigned long flags;
> +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
>  
>  	if (crtc->state->event) {
> -		spin_lock_irqsave(&crtc->dev->event_lock, flags);
> +		spin_lock(&crtc->dev->event_lock);

What's the rationale behind this change?

>  
>  		if (drm_crtc_vblank_get(crtc) != 0)
>  			drm_crtc_send_vblank_event(crtc, crtc->state->event);
>  		else
>  			drm_crtc_arm_vblank_event(crtc, crtc->state->event);
>  
> -		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
> +		spin_unlock(&crtc->dev->event_lock);
>  
>  		crtc->state->event = NULL;
>  	}
> +
> +	spin_unlock_irq(&vkms_output->lock);
>  }
>  
>  static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> +	.atomic_begin	= vkms_crtc_atomic_begin,
>  	.atomic_flush	= vkms_crtc_atomic_flush,
>  	.atomic_enable	= vkms_crtc_atomic_enable,
>  	.atomic_disable	= vkms_crtc_atomic_disable,
> @@ -162,6 +201,7 @@ static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
>  int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
>  		   struct drm_plane *primary, struct drm_plane *cursor)
>  {
> +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
>  	int ret;
>  
>  	ret = drm_crtc_init_with_planes(dev, crtc, primary, cursor,
> @@ -173,5 +213,9 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
>  
>  	drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
>  
> +	spin_lock_init(&vkms_out->lock);
> +
> +	vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
> +
>  	return ret;
>  }
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> index 37aa2ef33b21..5d78bd97e69c 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.c
> +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> @@ -46,6 +46,7 @@ static void vkms_release(struct drm_device *dev)
>  	platform_device_unregister(vkms->platform);
>  	drm_mode_config_cleanup(&vkms->drm);
>  	drm_dev_fini(&vkms->drm);
> +	destroy_workqueue(vkms->output.crc_workq);
>  }
>  
>  static struct drm_driver vkms_driver = {
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index bf811d0ec349..95985649768c 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -20,12 +20,23 @@ static const u32 vkms_formats[] = {
>  	DRM_FORMAT_XRGB8888,
>  };
>  
> +struct vkms_crc_data {
> +	struct drm_rect src;
> +	struct drm_framebuffer fb;
> +};
> +
>  /**
>   * vkms_crtc_state - Driver specific CRTC state
>   * @base: base CRTC state
> + * @crc_work: work struct to compute and add CRC entries
> + * @data: data required for CRC computation
> + * @n_frame: frame number for computed CRC
>   */
>  struct vkms_crtc_state {
>  	struct drm_crtc_state base;
> +	struct work_struct crc_work;
> +	struct vkms_crc_data data;
> +	unsigned int n_frame;
>  };
>  
>  struct vkms_output {
> @@ -35,6 +46,11 @@ struct vkms_output {
>  	struct hrtimer vblank_hrtimer;
>  	ktime_t period_ns;
>  	struct drm_pending_vblank_event *event;
> +	bool crc_enabled;

Where is this set to true?

> +	/* ordered wq for crc_work */
> +	struct workqueue_struct *crc_workq;
> +	/* protects concurrent access to crc_data */
> +	spinlock_t lock;
>  };
>  
>  struct vkms_device {
> @@ -95,4 +111,9 @@ void *vkms_gem_vmap(struct drm_gem_object *obj);
>  
>  void vkms_gem_vunmap(struct drm_gem_object *obj);
>  
> +/* CRC Support */
> +int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> +			size_t *values_cnt);
> +void vkms_crc_work_handle(struct work_struct *work);
> +
>  #endif /* _VKMS_DRV_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index aaf7c35faed6..1e0ee8ca8bd6 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -24,6 +24,16 @@ static const struct drm_plane_funcs vkms_plane_funcs = {
>  static void vkms_primary_plane_update(struct drm_plane *plane,
>  				      struct drm_plane_state *old_state)
>  {
> +	struct vkms_crtc_state *state;
> +
> +	if (!plane->state->crtc || !plane->state->fb)
> +		return;
> +
> +	state = to_vkms_crtc_state(plane->state->crtc->state);
> +	memcpy(&state->data.src, &plane->state->src, sizeof(struct drm_rect));
> +	memcpy(&state->data.fb, plane->state->fb,
> +	       sizeof(struct drm_framebuffer));
> +	drm_framebuffer_get(&state->data.fb);
>  }
>  
>  static int vkms_plane_atomic_check(struct drm_plane *plane,
> -- 
> 2.17.1
>

Haneen Mohammed July 16, 2018, 11:52 p.m. UTC | #2

On Mon, Jul 16, 2018 at 02:22:56PM -0400, Sean Paul wrote:
> On Sat, Jul 14, 2018 at 03:23:32PM +0300, Haneen Mohammed wrote:
> > Implement the set_crc_source() callback.
> > Compute CRC using crc32 on the visible part of the framebuffer.
> > Use ordered workqueue to compute and add CRC at the end of a vblank.
> > 
> > Use appropriate synchronization methods since the crc computation must
> > be atomic wrt the generated vblank event for a given atomic update,
> > 
> > Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
> 
> Hey Haneen,
> Thanks for revising this patch set. Things are looking good across the series,
> just a few more comments :-)
> 

Thank you so much for the review! 

> > ---
> > Changes in v2:
> > - Include this patch in this patchset.
> > 
> >  drivers/gpu/drm/vkms/Makefile     |  2 +-
> >  drivers/gpu/drm/vkms/vkms_crtc.c  | 60 ++++++++++++++++++++++++++-----
> >  drivers/gpu/drm/vkms/vkms_drv.c   |  1 +
> >  drivers/gpu/drm/vkms/vkms_drv.h   | 21 +++++++++++
> >  drivers/gpu/drm/vkms/vkms_plane.c | 10 ++++++
> >  5 files changed, 85 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> > index 986297da51bf..37966914f70b 100644
> > --- a/drivers/gpu/drm/vkms/Makefile
> > +++ b/drivers/gpu/drm/vkms/Makefile
> > @@ -1,3 +1,3 @@
> > -vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o
> > +vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o vkms_crc.o
> >  
> >  obj-$(CONFIG_DRM_VKMS) += vkms.o
> > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > index 26babb85ca77..f3a674db33b8 100644
> > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > @@ -10,18 +10,36 @@
> >  #include <drm/drm_atomic_helper.h>
> >  #include <drm/drm_crtc_helper.h>
> >  
> > -static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > +static void _vblank_handle(struct vkms_output *output)
> >  {
> > -	struct vkms_output *output = container_of(timer, struct vkms_output,
> > -						  vblank_hrtimer);
> >  	struct drm_crtc *crtc = &output->crtc;
> > -	int ret_overrun;
> > +	struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
> >  	bool ret;
> >  
> > +	int crc_enabled = 0;
> > +
> > +	spin_lock(&output->lock);
> > +	crc_enabled = output->crc_enabled;
> 
> Aside from the implicit bool -> int cast, I don't think you need this local var,
> just use output->crc_enabled directly below.
> 
> >  	ret = drm_crtc_handle_vblank(crtc);
> >  	if (!ret)
> >  		DRM_ERROR("vkms failure on handling vblank");
> 
> This would be more useful with the error code printed.
> 

I think this only returns false on failure. Also I've noticed most of the usage of
drm_crtc_handle_vblank don't check the return value, should I do the
same as well and drop ret and error message?

> >  
> > +	if (state && crc_enabled) {
> > +		state->n_frame = drm_crtc_accurate_vblank_count(crtc);
> > +		queue_work(output->crc_workq, &state->crc_work);
> > +	}
> > +
> > +	spin_unlock(&output->lock);
> > +}
> > +
> > +static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > +{
> > +	struct vkms_output *output = container_of(timer, struct vkms_output,
> > +						  vblank_hrtimer);
> > +	int ret_overrun;
> > +
> > +	_vblank_handle(output);
> > +
> >  	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> >  					  output->period_ns);
> >  
> > @@ -97,6 +115,8 @@ vkms_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
> >  
> >  	__drm_atomic_helper_crtc_duplicate_state(crtc, &vkms_state->base);
> >  
> > +	INIT_WORK(&vkms_state->crc_work, vkms_crc_work_handle);
> > +
> >  	return &vkms_state->base;
> >  }
> >  
> > @@ -104,11 +124,18 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> >  					   struct drm_crtc_state *state)
> >  {
> >  	struct vkms_crtc_state *vkms_state;
> > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> >  
> >  	vkms_state = to_vkms_crtc_state(state);
> >  
> >  	__drm_atomic_helper_crtc_destroy_state(state);
> > -	kfree(vkms_state);
> > +
> > +	if (vkms_state) {
> > +		flush_workqueue(vkms_out->crc_workq);
> 
> I'm a little worried about this bit. Since the workqueue is per-output, is it
> possible you'll be waiting for more frames to complete than you need to be?
> 

I see, maybe I should flush per work_struct instead?

> > +		drm_framebuffer_put(&vkms_state->data.fb);
> > +		memset(&vkms_state->data, 0, sizeof(struct vkms_crc_data));
> > +		kfree(vkms_state);
> > +	}
> >  }
> >  
> >  static const struct drm_crtc_funcs vkms_crtc_funcs = {
> > @@ -120,6 +147,7 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
> >  	.atomic_destroy_state   = vkms_atomic_crtc_destroy_state,
> >  	.enable_vblank		= vkms_enable_vblank,
> >  	.disable_vblank		= vkms_disable_vblank,
> > +	.set_crc_source		= vkms_set_crc_source,
> >  };
> >  
> >  static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
> > @@ -134,26 +162,37 @@ static void vkms_crtc_atomic_disable(struct drm_crtc *crtc,
> >  	drm_crtc_vblank_off(crtc);
> >  }
> >  
> > +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> > +				   struct drm_crtc_state *old_crtc_state)
> > +{
> > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > +
> > +	spin_lock_irq(&vkms_output->lock);
> 
> Hmm, you can't lock across atomic_begin/flush. What is this protecting against?
> 

I did this per Daniel recommendation:
1- spin_lock_irq -> vkms_crtc_atomic_begin
2- update vkms_crc_data and get ref to fb -> vkms_primary_plane_update
3- spin_unlock_irq after vblank event -> vkms_crtc_atomic_flush

> > +}
> > +
> >  static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> >  				   struct drm_crtc_state *old_crtc_state)
> >  {
> > -	unsigned long flags;
> > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> >  
> >  	if (crtc->state->event) {
> > -		spin_lock_irqsave(&crtc->dev->event_lock, flags);
> > +		spin_lock(&crtc->dev->event_lock);
> 
> What's the rationale behind this change?
> 

hm I am not sure if this is correct, but I assume because spin_lock_irq in atomic_begin
would disable interrupts and since this is before we unlock vkms_output->lock with 
spin_unlock_irq so irqsave here is not necessary?

> >  
> >  		if (drm_crtc_vblank_get(crtc) != 0)
> >  			drm_crtc_send_vblank_event(crtc, crtc->state->event);
> >  		else
> >  			drm_crtc_arm_vblank_event(crtc, crtc->state->event);
> >  
> > -		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
> > +		spin_unlock(&crtc->dev->event_lock);
> >  
> >  		crtc->state->event = NULL;
> >  	}
> > +
> > +	spin_unlock_irq(&vkms_output->lock);
> >  }
> >  
> >  static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > +	.atomic_begin	= vkms_crtc_atomic_begin,
> >  	.atomic_flush	= vkms_crtc_atomic_flush,
> >  	.atomic_enable	= vkms_crtc_atomic_enable,
> >  	.atomic_disable	= vkms_crtc_atomic_disable,
> > @@ -162,6 +201,7 @@ static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> >  int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> >  		   struct drm_plane *primary, struct drm_plane *cursor)
> >  {
> > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> >  	int ret;
> >  
> >  	ret = drm_crtc_init_with_planes(dev, crtc, primary, cursor,
> > @@ -173,5 +213,9 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> >  
> >  	drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
> >  
> > +	spin_lock_init(&vkms_out->lock);
> > +
> > +	vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
> > +
> >  	return ret;
> >  }
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> > index 37aa2ef33b21..5d78bd97e69c 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.c
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> > @@ -46,6 +46,7 @@ static void vkms_release(struct drm_device *dev)
> >  	platform_device_unregister(vkms->platform);
> >  	drm_mode_config_cleanup(&vkms->drm);
> >  	drm_dev_fini(&vkms->drm);
> > +	destroy_workqueue(vkms->output.crc_workq);
> >  }
> >  
> >  static struct drm_driver vkms_driver = {
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index bf811d0ec349..95985649768c 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -20,12 +20,23 @@ static const u32 vkms_formats[] = {
> >  	DRM_FORMAT_XRGB8888,
> >  };
> >  
> > +struct vkms_crc_data {
> > +	struct drm_rect src;
> > +	struct drm_framebuffer fb;
> > +};
> > +
> >  /**
> >   * vkms_crtc_state - Driver specific CRTC state
> >   * @base: base CRTC state
> > + * @crc_work: work struct to compute and add CRC entries
> > + * @data: data required for CRC computation
> > + * @n_frame: frame number for computed CRC
> >   */
> >  struct vkms_crtc_state {
> >  	struct drm_crtc_state base;
> > +	struct work_struct crc_work;
> > +	struct vkms_crc_data data;
> > +	unsigned int n_frame;
> >  };
> >  
> >  struct vkms_output {
> > @@ -35,6 +46,11 @@ struct vkms_output {
> >  	struct hrtimer vblank_hrtimer;
> >  	ktime_t period_ns;
> >  	struct drm_pending_vblank_event *event;
> > +	bool crc_enabled;
> 
> Where is this set to true?
> 

Oh sorry I forgot to add a vkms_crc.c which sets crc_enabled in
set_crc_source callback! 

I'll work on fixing the issues you mentioned here and the other patches
as well, thank you so much again!
Haneen

> > +	/* ordered wq for crc_work */
> > +	struct workqueue_struct *crc_workq;
> > +	/* protects concurrent access to crc_data */
> > +	spinlock_t lock;
> >  };
> >  
> >  struct vkms_device {
> > @@ -95,4 +111,9 @@ void *vkms_gem_vmap(struct drm_gem_object *obj);
> >  
> >  void vkms_gem_vunmap(struct drm_gem_object *obj);
> >  
> > +/* CRC Support */
> > +int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> > +			size_t *values_cnt);
> > +void vkms_crc_work_handle(struct work_struct *work);
> > +
> >  #endif /* _VKMS_DRV_H_ */
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index aaf7c35faed6..1e0ee8ca8bd6 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -24,6 +24,16 @@ static const struct drm_plane_funcs vkms_plane_funcs = {
> >  static void vkms_primary_plane_update(struct drm_plane *plane,
> >  				      struct drm_plane_state *old_state)
> >  {
> > +	struct vkms_crtc_state *state;
> > +
> > +	if (!plane->state->crtc || !plane->state->fb)
> > +		return;
> > +
> > +	state = to_vkms_crtc_state(plane->state->crtc->state);
> > +	memcpy(&state->data.src, &plane->state->src, sizeof(struct drm_rect));
> > +	memcpy(&state->data.fb, plane->state->fb,
> > +	       sizeof(struct drm_framebuffer));
> > +	drm_framebuffer_get(&state->data.fb);
> >  }
> >  
> >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > -- 
> > 2.17.1
> > 
> 
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS

Sean Paul July 17, 2018, 8:43 p.m. UTC | #3

On Tue, Jul 17, 2018 at 02:52:20AM +0300, Haneen Mohammed wrote:
> On Mon, Jul 16, 2018 at 02:22:56PM -0400, Sean Paul wrote:
> > On Sat, Jul 14, 2018 at 03:23:32PM +0300, Haneen Mohammed wrote:
> > > Implement the set_crc_source() callback.
> > > Compute CRC using crc32 on the visible part of the framebuffer.
> > > Use ordered workqueue to compute and add CRC at the end of a vblank.
> > > 
> > > Use appropriate synchronization methods since the crc computation must
> > > be atomic wrt the generated vblank event for a given atomic update,
> > > 
> > > Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
> > 
> > Hey Haneen,
> > Thanks for revising this patch set. Things are looking good across the series,
> > just a few more comments :-)
> > 
> 
> Thank you so much for the review! 
> 
> > > ---
> > > Changes in v2:
> > > - Include this patch in this patchset.
> > > 
> > >  drivers/gpu/drm/vkms/Makefile     |  2 +-
> > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 60 ++++++++++++++++++++++++++-----
> > >  drivers/gpu/drm/vkms/vkms_drv.c   |  1 +
> > >  drivers/gpu/drm/vkms/vkms_drv.h   | 21 +++++++++++
> > >  drivers/gpu/drm/vkms/vkms_plane.c | 10 ++++++
> > >  5 files changed, 85 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> > > index 986297da51bf..37966914f70b 100644
> > > --- a/drivers/gpu/drm/vkms/Makefile
> > > +++ b/drivers/gpu/drm/vkms/Makefile
> > > @@ -1,3 +1,3 @@
> > > -vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o
> > > +vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o vkms_crc.o
> > >  
> > >  obj-$(CONFIG_DRM_VKMS) += vkms.o
> > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > index 26babb85ca77..f3a674db33b8 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > @@ -10,18 +10,36 @@
> > >  #include <drm/drm_atomic_helper.h>
> > >  #include <drm/drm_crtc_helper.h>
> > >  
> > > -static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > > +static void _vblank_handle(struct vkms_output *output)
> > >  {
> > > -	struct vkms_output *output = container_of(timer, struct vkms_output,
> > > -						  vblank_hrtimer);
> > >  	struct drm_crtc *crtc = &output->crtc;
> > > -	int ret_overrun;
> > > +	struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
> > >  	bool ret;
> > >  
> > > +	int crc_enabled = 0;
> > > +
> > > +	spin_lock(&output->lock);
> > > +	crc_enabled = output->crc_enabled;
> > 
> > Aside from the implicit bool -> int cast, I don't think you need this local var,
> > just use output->crc_enabled directly below.
> > 
> > >  	ret = drm_crtc_handle_vblank(crtc);
> > >  	if (!ret)
> > >  		DRM_ERROR("vkms failure on handling vblank");
> > 
> > This would be more useful with the error code printed.
> > 
> 
> I think this only returns false on failure. Also I've noticed most of the usage of
> drm_crtc_handle_vblank don't check the return value, should I do the
> same as well and drop ret and error message?

Ahh, I didn't see that ret was bool. In that case, the error message is fine.
It's always good practice to assume that if a function returns a status, it's
worth checking.

> 
> > >  
> > > +	if (state && crc_enabled) {
> > > +		state->n_frame = drm_crtc_accurate_vblank_count(crtc);
> > > +		queue_work(output->crc_workq, &state->crc_work);
> > > +	}
> > > +
> > > +	spin_unlock(&output->lock);
> > > +}
> > > +
> > > +static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > > +{
> > > +	struct vkms_output *output = container_of(timer, struct vkms_output,
> > > +						  vblank_hrtimer);
> > > +	int ret_overrun;
> > > +
> > > +	_vblank_handle(output);
> > > +
> > >  	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> > >  					  output->period_ns);
> > >  
> > > @@ -97,6 +115,8 @@ vkms_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
> > >  
> > >  	__drm_atomic_helper_crtc_duplicate_state(crtc, &vkms_state->base);
> > >  
> > > +	INIT_WORK(&vkms_state->crc_work, vkms_crc_work_handle);
> > > +
> > >  	return &vkms_state->base;
> > >  }
> > >  
> > > @@ -104,11 +124,18 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> > >  					   struct drm_crtc_state *state)
> > >  {
> > >  	struct vkms_crtc_state *vkms_state;
> > > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> > >  
> > >  	vkms_state = to_vkms_crtc_state(state);
> > >  
> > >  	__drm_atomic_helper_crtc_destroy_state(state);
> > > -	kfree(vkms_state);
> > > +
> > > +	if (vkms_state) {
> > > +		flush_workqueue(vkms_out->crc_workq);
> > 
> > I'm a little worried about this bit. Since the workqueue is per-output, is it
> > possible you'll be waiting for more frames to complete than you need to be?
> > 
> 
> I see, maybe I should flush per work_struct instead?

Yeah, that would make more sense IMO.

> 
> > > +		drm_framebuffer_put(&vkms_state->data.fb);
> > > +		memset(&vkms_state->data, 0, sizeof(struct vkms_crc_data));
> > > +		kfree(vkms_state);
> > > +	}
> > >  }
> > >  
> > >  static const struct drm_crtc_funcs vkms_crtc_funcs = {
> > > @@ -120,6 +147,7 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
> > >  	.atomic_destroy_state   = vkms_atomic_crtc_destroy_state,
> > >  	.enable_vblank		= vkms_enable_vblank,
> > >  	.disable_vblank		= vkms_disable_vblank,
> > > +	.set_crc_source		= vkms_set_crc_source,
> > >  };
> > >  
> > >  static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
> > > @@ -134,26 +162,37 @@ static void vkms_crtc_atomic_disable(struct drm_crtc *crtc,
> > >  	drm_crtc_vblank_off(crtc);
> > >  }
> > >  
> > > +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> > > +				   struct drm_crtc_state *old_crtc_state)
> > > +{
> > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > +
> > > +	spin_lock_irq(&vkms_output->lock);
> > 
> > Hmm, you can't lock across atomic_begin/flush. What is this protecting against?
> > 
> 
> I did this per Daniel recommendation:
> 1- spin_lock_irq -> vkms_crtc_atomic_begin
> 2- update vkms_crc_data and get ref to fb -> vkms_primary_plane_update
> 3- spin_unlock_irq after vblank event -> vkms_crtc_atomic_flush

I think I'm interpreting his mail a little differently. The point of the
spinlock is to protect the pointer to the crc_data in vkms_crtc (right now it's
not a pointer, but it should be).

So I think you need to do the following:
- Change crc_data to a pointer
- In plane_update, allocate new memory to hold crc_data and, under the spinlock,
  update the crc_data to point to the new memory (add a WARN_ON if the pointer
  is NULL)
- In the hrtimer, while holding the spinlock, take the pointer from crc_data
  into a local variable and clear the crtc->crc_data pointer
- Pass the pointer (from the local variable) to the crc_worker

I don't think you need to hold the spinlock across the atomic hooks, just grab
it in plane_update.

Sean

> 
> > > +}
> > > +
> > >  static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> > >  				   struct drm_crtc_state *old_crtc_state)
> > >  {
> > > -	unsigned long flags;
> > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > >  
> > >  	if (crtc->state->event) {
> > > -		spin_lock_irqsave(&crtc->dev->event_lock, flags);
> > > +		spin_lock(&crtc->dev->event_lock);
> > 
> > What's the rationale behind this change?
> > 
> 
> hm I am not sure if this is correct, but I assume because spin_lock_irq in atomic_begin
> would disable interrupts and since this is before we unlock vkms_output->lock with 
> spin_unlock_irq so irqsave here is not necessary?
> 

Once you limit the scope of the spinlock above, you will want to revert this
change.

> > >  
> > >  		if (drm_crtc_vblank_get(crtc) != 0)
> > >  			drm_crtc_send_vblank_event(crtc, crtc->state->event);
> > >  		else
> > >  			drm_crtc_arm_vblank_event(crtc, crtc->state->event);
> > >  
> > > -		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
> > > +		spin_unlock(&crtc->dev->event_lock);
> > >  
> > >  		crtc->state->event = NULL;
> > >  	}
> > > +
> > > +	spin_unlock_irq(&vkms_output->lock);
> > >  }
> > >  
> > >  static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > > +	.atomic_begin	= vkms_crtc_atomic_begin,
> > >  	.atomic_flush	= vkms_crtc_atomic_flush,
> > >  	.atomic_enable	= vkms_crtc_atomic_enable,
> > >  	.atomic_disable	= vkms_crtc_atomic_disable,
> > > @@ -162,6 +201,7 @@ static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > >  int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> > >  		   struct drm_plane *primary, struct drm_plane *cursor)
> > >  {
> > > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> > >  	int ret;
> > >  
> > >  	ret = drm_crtc_init_with_planes(dev, crtc, primary, cursor,
> > > @@ -173,5 +213,9 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> > >  
> > >  	drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
> > >  
> > > +	spin_lock_init(&vkms_out->lock);
> > > +
> > > +	vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
> > > +
> > >  	return ret;
> > >  }
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> > > index 37aa2ef33b21..5d78bd97e69c 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> > > @@ -46,6 +46,7 @@ static void vkms_release(struct drm_device *dev)
> > >  	platform_device_unregister(vkms->platform);
> > >  	drm_mode_config_cleanup(&vkms->drm);
> > >  	drm_dev_fini(&vkms->drm);
> > > +	destroy_workqueue(vkms->output.crc_workq);
> > >  }
> > >  
> > >  static struct drm_driver vkms_driver = {
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > index bf811d0ec349..95985649768c 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > > @@ -20,12 +20,23 @@ static const u32 vkms_formats[] = {
> > >  	DRM_FORMAT_XRGB8888,
> > >  };
> > >  
> > > +struct vkms_crc_data {
> > > +	struct drm_rect src;
> > > +	struct drm_framebuffer fb;
> > > +};
> > > +
> > >  /**
> > >   * vkms_crtc_state - Driver specific CRTC state
> > >   * @base: base CRTC state
> > > + * @crc_work: work struct to compute and add CRC entries
> > > + * @data: data required for CRC computation
> > > + * @n_frame: frame number for computed CRC
> > >   */
> > >  struct vkms_crtc_state {
> > >  	struct drm_crtc_state base;
> > > +	struct work_struct crc_work;
> > > +	struct vkms_crc_data data;
> > > +	unsigned int n_frame;
> > >  };
> > >  
> > >  struct vkms_output {
> > > @@ -35,6 +46,11 @@ struct vkms_output {
> > >  	struct hrtimer vblank_hrtimer;
> > >  	ktime_t period_ns;
> > >  	struct drm_pending_vblank_event *event;
> > > +	bool crc_enabled;
> > 
> > Where is this set to true?
> > 
> 
> Oh sorry I forgot to add a vkms_crc.c which sets crc_enabled in
> set_crc_source callback! 
> 
> I'll work on fixing the issues you mentioned here and the other patches
> as well, thank you so much again!

Thanks for working through this!

Sean

> Haneen
> 
> > > +	/* ordered wq for crc_work */
> > > +	struct workqueue_struct *crc_workq;
> > > +	/* protects concurrent access to crc_data */
> > > +	spinlock_t lock;
> > >  };
> > >  
> > >  struct vkms_device {
> > > @@ -95,4 +111,9 @@ void *vkms_gem_vmap(struct drm_gem_object *obj);
> > >  
> > >  void vkms_gem_vunmap(struct drm_gem_object *obj);
> > >  
> > > +/* CRC Support */
> > > +int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> > > +			size_t *values_cnt);
> > > +void vkms_crc_work_handle(struct work_struct *work);
> > > +
> > >  #endif /* _VKMS_DRV_H_ */
> > > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > > index aaf7c35faed6..1e0ee8ca8bd6 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > > @@ -24,6 +24,16 @@ static const struct drm_plane_funcs vkms_plane_funcs = {
> > >  static void vkms_primary_plane_update(struct drm_plane *plane,
> > >  				      struct drm_plane_state *old_state)
> > >  {
> > > +	struct vkms_crtc_state *state;
> > > +
> > > +	if (!plane->state->crtc || !plane->state->fb)
> > > +		return;
> > > +
> > > +	state = to_vkms_crtc_state(plane->state->crtc->state);
> > > +	memcpy(&state->data.src, &plane->state->src, sizeof(struct drm_rect));
> > > +	memcpy(&state->data.fb, plane->state->fb,
> > > +	       sizeof(struct drm_framebuffer));
> > > +	drm_framebuffer_get(&state->data.fb);
> > >  }
> > >  
> > >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > > -- 
> > > 2.17.1
> > > 
> > 
> > -- 
> > Sean Paul, Software Engineer, Google / Chromium OS

Sean Paul July 17, 2018, 9:05 p.m. UTC | #4

On Tue, Jul 17, 2018 at 04:43:16PM -0400, Sean Paul wrote:
> On Tue, Jul 17, 2018 at 02:52:20AM +0300, Haneen Mohammed wrote:
> > On Mon, Jul 16, 2018 at 02:22:56PM -0400, Sean Paul wrote:
> > > On Sat, Jul 14, 2018 at 03:23:32PM +0300, Haneen Mohammed wrote:
> > > > Implement the set_crc_source() callback.
> > > > Compute CRC using crc32 on the visible part of the framebuffer.
> > > > Use ordered workqueue to compute and add CRC at the end of a vblank.
> > > > 
> > > > Use appropriate synchronization methods since the crc computation must
> > > > be atomic wrt the generated vblank event for a given atomic update,
> > > > 
> > > > Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
> > > 
> > > Hey Haneen,
> > > Thanks for revising this patch set. Things are looking good across the series,
> > > just a few more comments :-)
> > > 
> > 
> > Thank you so much for the review! 
> > 
> > > > ---
> > > > Changes in v2:
> > > > - Include this patch in this patchset.
> > > > 
> > > >  drivers/gpu/drm/vkms/Makefile     |  2 +-
> > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 60 ++++++++++++++++++++++++++-----
> > > >  drivers/gpu/drm/vkms/vkms_drv.c   |  1 +
> > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 21 +++++++++++
> > > >  drivers/gpu/drm/vkms/vkms_plane.c | 10 ++++++
> > > >  5 files changed, 85 insertions(+), 9 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> > > > index 986297da51bf..37966914f70b 100644
> > > > --- a/drivers/gpu/drm/vkms/Makefile
> > > > +++ b/drivers/gpu/drm/vkms/Makefile
> > > > @@ -1,3 +1,3 @@
> > > > -vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o
> > > > +vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o vkms_crc.o
> > > >  
> > > >  obj-$(CONFIG_DRM_VKMS) += vkms.o
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > index 26babb85ca77..f3a674db33b8 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > @@ -10,18 +10,36 @@
> > > >  #include <drm/drm_atomic_helper.h>
> > > >  #include <drm/drm_crtc_helper.h>
> > > >  
> > > > -static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > > > +static void _vblank_handle(struct vkms_output *output)
> > > >  {
> > > > -	struct vkms_output *output = container_of(timer, struct vkms_output,
> > > > -						  vblank_hrtimer);
> > > >  	struct drm_crtc *crtc = &output->crtc;
> > > > -	int ret_overrun;
> > > > +	struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
> > > >  	bool ret;
> > > >  
> > > > +	int crc_enabled = 0;
> > > > +
> > > > +	spin_lock(&output->lock);
> > > > +	crc_enabled = output->crc_enabled;
> > > 
> > > Aside from the implicit bool -> int cast, I don't think you need this local var,
> > > just use output->crc_enabled directly below.
> > > 
> > > >  	ret = drm_crtc_handle_vblank(crtc);
> > > >  	if (!ret)
> > > >  		DRM_ERROR("vkms failure on handling vblank");
> > > 
> > > This would be more useful with the error code printed.
> > > 
> > 
> > I think this only returns false on failure. Also I've noticed most of the usage of
> > drm_crtc_handle_vblank don't check the return value, should I do the
> > same as well and drop ret and error message?
> 
> Ahh, I didn't see that ret was bool. In that case, the error message is fine.
> It's always good practice to assume that if a function returns a status, it's
> worth checking.
> 
> > 
> > > >  
> > > > +	if (state && crc_enabled) {
> > > > +		state->n_frame = drm_crtc_accurate_vblank_count(crtc);
> > > > +		queue_work(output->crc_workq, &state->crc_work);
> > > > +	}
> > > > +
> > > > +	spin_unlock(&output->lock);
> > > > +}
> > > > +
> > > > +static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > > > +{
> > > > +	struct vkms_output *output = container_of(timer, struct vkms_output,
> > > > +						  vblank_hrtimer);
> > > > +	int ret_overrun;
> > > > +
> > > > +	_vblank_handle(output);
> > > > +
> > > >  	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> > > >  					  output->period_ns);
> > > >  
> > > > @@ -97,6 +115,8 @@ vkms_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
> > > >  
> > > >  	__drm_atomic_helper_crtc_duplicate_state(crtc, &vkms_state->base);
> > > >  
> > > > +	INIT_WORK(&vkms_state->crc_work, vkms_crc_work_handle);
> > > > +
> > > >  	return &vkms_state->base;
> > > >  }
> > > >  
> > > > @@ -104,11 +124,18 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> > > >  					   struct drm_crtc_state *state)
> > > >  {
> > > >  	struct vkms_crtc_state *vkms_state;
> > > > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> > > >  
> > > >  	vkms_state = to_vkms_crtc_state(state);
> > > >  
> > > >  	__drm_atomic_helper_crtc_destroy_state(state);
> > > > -	kfree(vkms_state);
> > > > +
> > > > +	if (vkms_state) {
> > > > +		flush_workqueue(vkms_out->crc_workq);
> > > 
> > > I'm a little worried about this bit. Since the workqueue is per-output, is it
> > > possible you'll be waiting for more frames to complete than you need to be?
> > > 
> > 
> > I see, maybe I should flush per work_struct instead?
> 
> Yeah, that would make more sense IMO.
> 
> > 
> > > > +		drm_framebuffer_put(&vkms_state->data.fb);
> > > > +		memset(&vkms_state->data, 0, sizeof(struct vkms_crc_data));
> > > > +		kfree(vkms_state);
> > > > +	}
> > > >  }
> > > >  
> > > >  static const struct drm_crtc_funcs vkms_crtc_funcs = {
> > > > @@ -120,6 +147,7 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
> > > >  	.atomic_destroy_state   = vkms_atomic_crtc_destroy_state,
> > > >  	.enable_vblank		= vkms_enable_vblank,
> > > >  	.disable_vblank		= vkms_disable_vblank,
> > > > +	.set_crc_source		= vkms_set_crc_source,
> > > >  };
> > > >  
> > > >  static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
> > > > @@ -134,26 +162,37 @@ static void vkms_crtc_atomic_disable(struct drm_crtc *crtc,
> > > >  	drm_crtc_vblank_off(crtc);
> > > >  }
> > > >  
> > > > +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> > > > +				   struct drm_crtc_state *old_crtc_state)
> > > > +{
> > > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > > +
> > > > +	spin_lock_irq(&vkms_output->lock);
> > > 
> > > Hmm, you can't lock across atomic_begin/flush. What is this protecting against?
> > > 
> > 
> > I did this per Daniel recommendation:
> > 1- spin_lock_irq -> vkms_crtc_atomic_begin
> > 2- update vkms_crc_data and get ref to fb -> vkms_primary_plane_update
> > 3- spin_unlock_irq after vblank event -> vkms_crtc_atomic_flush
> 
> I think I'm interpreting his mail a little differently. The point of the
> spinlock is to protect the pointer to the crc_data in vkms_crtc (right now it's
> not a pointer, but it should be).
> 
> So I think you need to do the following:
> - Change crc_data to a pointer
> - In plane_update, allocate new memory to hold crc_data and, under the spinlock,
>   update the crc_data to point to the new memory (add a WARN_ON if the pointer
>   is NULL)
> - In the hrtimer, while holding the spinlock, take the pointer from crc_data
>   into a local variable and clear the crtc->crc_data pointer
> - Pass the pointer (from the local variable) to the crc_worker
> 
> I don't think you need to hold the spinlock across the atomic hooks, just grab
> it in plane_update.

So the more I think about this, I don't think it's quite right. Perhaps I'm
missing an email from Daniel, or (more likely) misunderstanding what you're
trying to do. However, the email I read suggested storing the crc_data
to be consumed by the crc worker in vkms_crtc. However in this patch it's being
stored (along with the work item) in crtc state. Further, the state cannot be
destroyed without completing the crc_work, which kind of defeats the purpose of
the workqueue in the first place.

Storing a copy of the crc_data in crtc allows you to free the crtc_state, so you
can start working on the next frame while the crc worker is scheduled and picks
up the crc_data from crtc. Storing the crc_data in crtc_state and waiting for
the work to finish is easier since you don't need to worry about copying and
locking (so much). However, it seems like we're doing both in this patch.

Again, I'm likely just not understanding what the goal of this is, so any
clarification would be greatly appreciated :)

Sean

> 
> Sean
> 
> > 
> > > > +}
> > > > +
> > > >  static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> > > >  				   struct drm_crtc_state *old_crtc_state)
> > > >  {
> > > > -	unsigned long flags;
> > > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > >  
> > > >  	if (crtc->state->event) {
> > > > -		spin_lock_irqsave(&crtc->dev->event_lock, flags);
> > > > +		spin_lock(&crtc->dev->event_lock);
> > > 
> > > What's the rationale behind this change?
> > > 
> > 
> > hm I am not sure if this is correct, but I assume because spin_lock_irq in atomic_begin
> > would disable interrupts and since this is before we unlock vkms_output->lock with 
> > spin_unlock_irq so irqsave here is not necessary?
> > 
> 
> Once you limit the scope of the spinlock above, you will want to revert this
> change.
> 
> > > >  
> > > >  		if (drm_crtc_vblank_get(crtc) != 0)
> > > >  			drm_crtc_send_vblank_event(crtc, crtc->state->event);
> > > >  		else
> > > >  			drm_crtc_arm_vblank_event(crtc, crtc->state->event);
> > > >  
> > > > -		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
> > > > +		spin_unlock(&crtc->dev->event_lock);
> > > >  
> > > >  		crtc->state->event = NULL;
> > > >  	}
> > > > +
> > > > +	spin_unlock_irq(&vkms_output->lock);
> > > >  }
> > > >  
> > > >  static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > > > +	.atomic_begin	= vkms_crtc_atomic_begin,
> > > >  	.atomic_flush	= vkms_crtc_atomic_flush,
> > > >  	.atomic_enable	= vkms_crtc_atomic_enable,
> > > >  	.atomic_disable	= vkms_crtc_atomic_disable,
> > > > @@ -162,6 +201,7 @@ static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > > >  int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> > > >  		   struct drm_plane *primary, struct drm_plane *cursor)
> > > >  {
> > > > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> > > >  	int ret;
> > > >  
> > > >  	ret = drm_crtc_init_with_planes(dev, crtc, primary, cursor,
> > > > @@ -173,5 +213,9 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> > > >  
> > > >  	drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
> > > >  
> > > > +	spin_lock_init(&vkms_out->lock);
> > > > +
> > > > +	vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
> > > > +
> > > >  	return ret;
> > > >  }
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> > > > index 37aa2ef33b21..5d78bd97e69c 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_drv.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> > > > @@ -46,6 +46,7 @@ static void vkms_release(struct drm_device *dev)
> > > >  	platform_device_unregister(vkms->platform);
> > > >  	drm_mode_config_cleanup(&vkms->drm);
> > > >  	drm_dev_fini(&vkms->drm);
> > > > +	destroy_workqueue(vkms->output.crc_workq);
> > > >  }
> > > >  
> > > >  static struct drm_driver vkms_driver = {
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > index bf811d0ec349..95985649768c 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > @@ -20,12 +20,23 @@ static const u32 vkms_formats[] = {
> > > >  	DRM_FORMAT_XRGB8888,
> > > >  };
> > > >  
> > > > +struct vkms_crc_data {
> > > > +	struct drm_rect src;
> > > > +	struct drm_framebuffer fb;
> > > > +};
> > > > +
> > > >  /**
> > > >   * vkms_crtc_state - Driver specific CRTC state
> > > >   * @base: base CRTC state
> > > > + * @crc_work: work struct to compute and add CRC entries
> > > > + * @data: data required for CRC computation
> > > > + * @n_frame: frame number for computed CRC
> > > >   */
> > > >  struct vkms_crtc_state {
> > > >  	struct drm_crtc_state base;
> > > > +	struct work_struct crc_work;
> > > > +	struct vkms_crc_data data;
> > > > +	unsigned int n_frame;
> > > >  };
> > > >  
> > > >  struct vkms_output {
> > > > @@ -35,6 +46,11 @@ struct vkms_output {
> > > >  	struct hrtimer vblank_hrtimer;
> > > >  	ktime_t period_ns;
> > > >  	struct drm_pending_vblank_event *event;
> > > > +	bool crc_enabled;
> > > 
> > > Where is this set to true?
> > > 
> > 
> > Oh sorry I forgot to add a vkms_crc.c which sets crc_enabled in
> > set_crc_source callback! 
> > 
> > I'll work on fixing the issues you mentioned here and the other patches
> > as well, thank you so much again!
> 
> Thanks for working through this!
> 
> Sean
> 
> > Haneen
> > 
> > > > +	/* ordered wq for crc_work */
> > > > +	struct workqueue_struct *crc_workq;
> > > > +	/* protects concurrent access to crc_data */
> > > > +	spinlock_t lock;
> > > >  };
> > > >  
> > > >  struct vkms_device {
> > > > @@ -95,4 +111,9 @@ void *vkms_gem_vmap(struct drm_gem_object *obj);
> > > >  
> > > >  void vkms_gem_vunmap(struct drm_gem_object *obj);
> > > >  
> > > > +/* CRC Support */
> > > > +int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> > > > +			size_t *values_cnt);
> > > > +void vkms_crc_work_handle(struct work_struct *work);
> > > > +
> > > >  #endif /* _VKMS_DRV_H_ */
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > > > index aaf7c35faed6..1e0ee8ca8bd6 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > > > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > > > @@ -24,6 +24,16 @@ static const struct drm_plane_funcs vkms_plane_funcs = {
> > > >  static void vkms_primary_plane_update(struct drm_plane *plane,
> > > >  				      struct drm_plane_state *old_state)
> > > >  {
> > > > +	struct vkms_crtc_state *state;
> > > > +
> > > > +	if (!plane->state->crtc || !plane->state->fb)
> > > > +		return;
> > > > +
> > > > +	state = to_vkms_crtc_state(plane->state->crtc->state);
> > > > +	memcpy(&state->data.src, &plane->state->src, sizeof(struct drm_rect));
> > > > +	memcpy(&state->data.fb, plane->state->fb,
> > > > +	       sizeof(struct drm_framebuffer));
> > > > +	drm_framebuffer_get(&state->data.fb);
> > > >  }
> > > >  
> > > >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > > > -- 
> > > > 2.17.1
> > > > 
> > > 
> > > -- 
> > > Sean Paul, Software Engineer, Google / Chromium OS
> 
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS

Haneen Mohammed July 18, 2018, 4:24 p.m. UTC | #5

First, just so you have the complete view, this is the missing part in this patch:
------ vkms_crc.c
static uint32_t _vkms_get_crc(struct vkms_crc_data *crc_data)
{
	struct drm_framebuffer *fb = &crc_data->fb;
	struct drm_gem_object *gem_obj = drm_gem_fb_get_obj(fb, 0);
	struct vkms_gem_object *vkms_obj = drm_gem_to_vkms_gem(gem_obj);
	u32 crc = 0;
	int i = 0;
	unsigned int x = crc_data->src.x1 >> 16;
	unsigned int y = crc_data->src.y1 >> 16;
	unsigned int height = drm_rect_height(&crc_data->src) >> 16;
	unsigned int width = drm_rect_width(&crc_data->src) >> 16;
	unsigned int cpp = fb->format->cpp[0];
	unsigned int src_offset;
	unsigned int size_byte = width * cpp;
	void *vaddr = vkms_obj->vaddr;

	if (WARN_ON(!vaddr))
		return crc;

	for (i = y; i < y + height; i++) {
		src_offset = fb->offsets[0] + (i * fb->pitches[0]) + (x * cpp);
		crc = crc32_le(crc, vaddr + src_offset, size_byte);
	}

	return crc;
}

void vkms_crc_work_handle(struct work_struct *work)
{
	struct vkms_crtc_state *crtc_state = container_of(work,
						struct vkms_crtc_state,
						crc_work);
	struct drm_crtc *crtc = crtc_state->base.crtc;
	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);

	if (crtc_state && out->crc_enabled) {
		u32 crc32 = _vkms_get_crc(&crtc_state->data);
		unsigned int n_frame = crtc_state->n_frame;

		drm_crtc_add_crc_entry(crtc, true, n_frame, &crc32);
	}
}

int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
			size_t *values_cnt)
{
	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);

	if (!src_name) {
		out->crc_enabled = false;
	} else if (strcmp(src_name, "auto") == 0) {
		out->crc_enabled = true;
	} else {
		out->crc_enabled = false;
		return -EINVAL;
	}

	*values_cnt = 1;

	return 0;
}
--- end vkms_crc.c

On Tue, Jul 17, 2018 at 05:05:03PM -0400, Sean Paul wrote:
> On Tue, Jul 17, 2018 at 04:43:16PM -0400, Sean Paul wrote:
> > On Tue, Jul 17, 2018 at 02:52:20AM +0300, Haneen Mohammed wrote:
> > > On Mon, Jul 16, 2018 at 02:22:56PM -0400, Sean Paul wrote:
> > > > On Sat, Jul 14, 2018 at 03:23:32PM +0300, Haneen Mohammed wrote:
> > > > > Implement the set_crc_source() callback.
> > > > > Compute CRC using crc32 on the visible part of the framebuffer.
> > > > > Use ordered workqueue to compute and add CRC at the end of a vblank.
> > > > > 
> > > > > Use appropriate synchronization methods since the crc computation must
> > > > > be atomic wrt the generated vblank event for a given atomic update,
> > > > > 
> > > > > Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
> > > > 
> > > > Hey Haneen,
> > > > Thanks for revising this patch set. Things are looking good across the series,
> > > > just a few more comments :-)
> > > > 
> > > 
> > > Thank you so much for the review! 
> > > 
> > > > > ---
> > > > > Changes in v2:
> > > > > - Include this patch in this patchset.
> > > > > 
> > > > >  drivers/gpu/drm/vkms/Makefile     |  2 +-
> > > > >  drivers/gpu/drm/vkms/vkms_crtc.c  | 60 ++++++++++++++++++++++++++-----
> > > > >  drivers/gpu/drm/vkms/vkms_drv.c   |  1 +
> > > > >  drivers/gpu/drm/vkms/vkms_drv.h   | 21 +++++++++++
> > > > >  drivers/gpu/drm/vkms/vkms_plane.c | 10 ++++++
> > > > >  5 files changed, 85 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/vkms/Makefile b/drivers/gpu/drm/vkms/Makefile
> > > > > index 986297da51bf..37966914f70b 100644
> > > > > --- a/drivers/gpu/drm/vkms/Makefile
> > > > > +++ b/drivers/gpu/drm/vkms/Makefile
> > > > > @@ -1,3 +1,3 @@
> > > > > -vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o
> > > > > +vkms-y := vkms_drv.o vkms_plane.o vkms_output.o vkms_crtc.o vkms_gem.o vkms_crc.o
> > > > >  
> > > > >  obj-$(CONFIG_DRM_VKMS) += vkms.o
> > > > > diff --git a/drivers/gpu/drm/vkms/vkms_crtc.c b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > > index 26babb85ca77..f3a674db33b8 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_crtc.c
> > > > > @@ -10,18 +10,36 @@
> > > > >  #include <drm/drm_atomic_helper.h>
> > > > >  #include <drm/drm_crtc_helper.h>
> > > > >  
> > > > > -static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > > > > +static void _vblank_handle(struct vkms_output *output)
> > > > >  {
> > > > > -	struct vkms_output *output = container_of(timer, struct vkms_output,
> > > > > -						  vblank_hrtimer);
> > > > >  	struct drm_crtc *crtc = &output->crtc;
> > > > > -	int ret_overrun;
> > > > > +	struct vkms_crtc_state *state = to_vkms_crtc_state(crtc->state);
> > > > >  	bool ret;
> > > > >  
> > > > > +	int crc_enabled = 0;
> > > > > +
> > > > > +	spin_lock(&output->lock);
> > > > > +	crc_enabled = output->crc_enabled;
> > > > 
> > > > Aside from the implicit bool -> int cast, I don't think you need this local var,
> > > > just use output->crc_enabled directly below.
> > > > 
> > > > >  	ret = drm_crtc_handle_vblank(crtc);
> > > > >  	if (!ret)
> > > > >  		DRM_ERROR("vkms failure on handling vblank");
> > > > 
> > > > This would be more useful with the error code printed.
> > > > 
> > > 
> > > I think this only returns false on failure. Also I've noticed most of the usage of
> > > drm_crtc_handle_vblank don't check the return value, should I do the
> > > same as well and drop ret and error message?
> > 
> > Ahh, I didn't see that ret was bool. In that case, the error message is fine.
> > It's always good practice to assume that if a function returns a status, it's
> > worth checking.
> > 
> > > 
> > > > >  
> > > > > +	if (state && crc_enabled) {
> > > > > +		state->n_frame = drm_crtc_accurate_vblank_count(crtc);
> > > > > +		queue_work(output->crc_workq, &state->crc_work);
> > > > > +	}
> > > > > +
> > > > > +	spin_unlock(&output->lock);
> > > > > +}
> > > > > +
> > > > > +static enum hrtimer_restart vkms_vblank_simulate(struct hrtimer *timer)
> > > > > +{
> > > > > +	struct vkms_output *output = container_of(timer, struct vkms_output,
> > > > > +						  vblank_hrtimer);
> > > > > +	int ret_overrun;
> > > > > +
> > > > > +	_vblank_handle(output);
> > > > > +
> > > > >  	ret_overrun = hrtimer_forward_now(&output->vblank_hrtimer,
> > > > >  					  output->period_ns);
> > > > >  
> > > > > @@ -97,6 +115,8 @@ vkms_atomic_crtc_duplicate_state(struct drm_crtc *crtc)
> > > > >  
> > > > >  	__drm_atomic_helper_crtc_duplicate_state(crtc, &vkms_state->base);
> > > > >  
> > > > > +	INIT_WORK(&vkms_state->crc_work, vkms_crc_work_handle);
> > > > > +
> > > > >  	return &vkms_state->base;
> > > > >  }
> > > > >  
> > > > > @@ -104,11 +124,18 @@ static void vkms_atomic_crtc_destroy_state(struct drm_crtc *crtc,
> > > > >  					   struct drm_crtc_state *state)
> > > > >  {
> > > > >  	struct vkms_crtc_state *vkms_state;
> > > > > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> > > > >  
> > > > >  	vkms_state = to_vkms_crtc_state(state);
> > > > >  
> > > > >  	__drm_atomic_helper_crtc_destroy_state(state);
> > > > > -	kfree(vkms_state);
> > > > > +
> > > > > +	if (vkms_state) {
> > > > > +		flush_workqueue(vkms_out->crc_workq);
> > > > 
> > > > I'm a little worried about this bit. Since the workqueue is per-output, is it
> > > > possible you'll be waiting for more frames to complete than you need to be?
> > > > 
> > > 
> > > I see, maybe I should flush per work_struct instead?
> > 
> > Yeah, that would make more sense IMO.
> > 
> > > 
> > > > > +		drm_framebuffer_put(&vkms_state->data.fb);
> > > > > +		memset(&vkms_state->data, 0, sizeof(struct vkms_crc_data));
> > > > > +		kfree(vkms_state);
> > > > > +	}
> > > > >  }
> > > > >  
> > > > >  static const struct drm_crtc_funcs vkms_crtc_funcs = {
> > > > > @@ -120,6 +147,7 @@ static const struct drm_crtc_funcs vkms_crtc_funcs = {
> > > > >  	.atomic_destroy_state   = vkms_atomic_crtc_destroy_state,
> > > > >  	.enable_vblank		= vkms_enable_vblank,
> > > > >  	.disable_vblank		= vkms_disable_vblank,
> > > > > +	.set_crc_source		= vkms_set_crc_source,
> > > > >  };
> > > > >  
> > > > >  static void vkms_crtc_atomic_enable(struct drm_crtc *crtc,
> > > > > @@ -134,26 +162,37 @@ static void vkms_crtc_atomic_disable(struct drm_crtc *crtc,
> > > > >  	drm_crtc_vblank_off(crtc);
> > > > >  }
> > > > >  
> > > > > +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> > > > > +				   struct drm_crtc_state *old_crtc_state)
> > > > > +{
> > > > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > > > +
> > > > > +	spin_lock_irq(&vkms_output->lock);
> > > > 
> > > > Hmm, you can't lock across atomic_begin/flush. What is this protecting against?
> > > > 
> > > 
> > > I did this per Daniel recommendation:
> > > 1- spin_lock_irq -> vkms_crtc_atomic_begin
> > > 2- update vkms_crc_data and get ref to fb -> vkms_primary_plane_update
> > > 3- spin_unlock_irq after vblank event -> vkms_crtc_atomic_flush
> > 
> > I think I'm interpreting his mail a little differently. The point of the
> > spinlock is to protect the pointer to the crc_data in vkms_crtc (right now it's
> > not a pointer, but it should be).
> > 
> > So I think you need to do the following:
> > - Change crc_data to a pointer
> > - In plane_update, allocate new memory to hold crc_data and, under the spinlock,
> >   update the crc_data to point to the new memory (add a WARN_ON if the pointer
> >   is NULL)
> > - In the hrtimer, while holding the spinlock, take the pointer from crc_data
> >   into a local variable and clear the crtc->crc_data pointer
> > - Pass the pointer (from the local variable) to the crc_worker
> > 

The problem with this approach is managing the pointer allocation and
deallocation. If I grab it in the hrtimer, then where/when should I free it?
I can't after the crc worker is done with it because what I noticed is
that the crc computation is expected to run multiple times after one
plane_atomic_update.

> > I don't think you need to hold the spinlock across the atomic hooks, just grab
> > it in plane_update.
> 

I asked Daniel in irc about releasing the vkms_output->lock in the begginning of atomic_flush
to avoid modifying the event_lock and he mentioned that "output->lock must wrap the event
handling code completely since we need to make sure that the flip event
and the crc all match up".

So maybe I can grab the spinlock in plane_atomic_update and release it at
the end of crtc_atomic_flush? Is plane_atomic_update always called
before crtc_atomic_flush?

Otherwise if I grab the spinlock in atomic_begin I won't be able to
use pointer for crc_data and allocate it in plane_atomic_update since
I'll be holding a spinlock already. 

I have based this patch on Daniel's second feedback to my email
"[PATCH] drm/vkms: Update CRC support after lock review" which I didn't
cc the dri mailing list in it, quoting Daniel:

"Overall flow:
1. in crtc_funcs->atomic_begin grab the spinlock, to make sure
everything is visible completely or not at all to the vblank handler.
Also update a pointer to the vkms_crtc_state in vkms_output->crc_data
(or something like that.
2. copy the plane data (like you do here) in plane_funcs->atomic_commit.
3. drop the spinlock in crtc_funcs->atomic_flush after the vblank
event is armed.

In the vblank hrtimer handler:
1. grab the spinlock
2. drm_crtc_handle_vblank() - we must do this while holding the
spinlock, otherwise the event and crc might get out of sync. Also grab
the current vblank number here and store it somewhere in
vkms_crtc_state so that the worker can use the right one.
3. launch the worker. Note that the struct_work needs to be within the
vkms_crtc_state structure, otherwise the crc computation worker
doesn't know really for which data it should compute the crc - just
having one copy in data[1] doesn't really solve this race.
4. drop spinlock

So one big problem we have here still is what happens if the crc
worker falls behind with computing crcs. There's 2 cases, let's first
look at how to fixe the crc worker falling behind atomic updates. The
problem there is that the next atomic update might free the
vkms_crtc_state structure before the worker is done with it. That's
fairly easy to fix:
- add a flush_work call to your vkms_crtc_state_destroy function. This
means you need to initialize the work already in
vkms_crtc_state_duplicate (you'll create both of these when doing the
vkms_crtc_state subclassing).
- 2nd we need to make sure crc workers are run in order easiest fix
for that is to create your own private crc work queue (1 per crtc we
have) with the WQ_ORDERED flag.

2nd issue if the vblanks get ahead of the crc worker. This happens
when e.g. there's no atomic updates, but  we still want to generate 1
vblank per frame. This is probably better done in a follow-up patch,
but rough solution is:
- in the vblank hrtimer check whether the work is still scheduled and
not yet complete (while holding the spinlock). in that case simply
tell the worker to fill out the same crc for multiple vblanks, by
using a vblank_seqno_start/end.
- if the worker is not busy, then we tell it to compute a new crc (in
case something in the buffers has changed - in the future we might
change this behaviour)."

> So the more I think about this, I don't think it's quite right. Perhaps I'm
> missing an email from Daniel, or (more likely) misunderstanding what you're
> trying to do. However, the email I read suggested storing the crc_data
> to be consumed by the crc worker in vkms_crtc. However in this patch it's being
> stored (along with the work item) in crtc state. Further, the state cannot be
> destroyed without completing the crc_work, which kind of defeats the purpose of
> the workqueue in the first place.
> 
> Storing a copy of the crc_data in crtc allows you to free the crtc_state, so you
> can start working on the next frame while the crc worker is scheduled and picks
> up the crc_data from crtc. Storing the crc_data in crtc_state and waiting for
> the work to finish is easier since you don't need to worry about copying and
> locking (so much). However, it seems like we're doing both in this patch.
> 
> Again, I'm likely just not understanding what the goal of this is, so any
> clarification would be greatly appreciated :)
> 
> Sean
> 
> > 
> > Sean
> > 
> > > 
> > > > > +}
> > > > > +
> > > > >  static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
> > > > >  				   struct drm_crtc_state *old_crtc_state)
> > > > >  {
> > > > > -	unsigned long flags;
> > > > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > > >  
> > > > >  	if (crtc->state->event) {
> > > > > -		spin_lock_irqsave(&crtc->dev->event_lock, flags);
> > > > > +		spin_lock(&crtc->dev->event_lock);
> > > > 
> > > > What's the rationale behind this change?
> > > > 
> > > 
> > > hm I am not sure if this is correct, but I assume because spin_lock_irq in atomic_begin
> > > would disable interrupts and since this is before we unlock vkms_output->lock with 
> > > spin_unlock_irq so irqsave here is not necessary?
> > > 
> > 
> > Once you limit the scope of the spinlock above, you will want to revert this
> > change.
> > 
> > > > >  
> > > > >  		if (drm_crtc_vblank_get(crtc) != 0)
> > > > >  			drm_crtc_send_vblank_event(crtc, crtc->state->event);
> > > > >  		else
> > > > >  			drm_crtc_arm_vblank_event(crtc, crtc->state->event);
> > > > >  
> > > > > -		spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
> > > > > +		spin_unlock(&crtc->dev->event_lock);
> > > > >  
> > > > >  		crtc->state->event = NULL;
> > > > >  	}
> > > > > +
> > > > > +	spin_unlock_irq(&vkms_output->lock);
> > > > >  }
> > > > >  
> > > > >  static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > > > > +	.atomic_begin	= vkms_crtc_atomic_begin,
> > > > >  	.atomic_flush	= vkms_crtc_atomic_flush,
> > > > >  	.atomic_enable	= vkms_crtc_atomic_enable,
> > > > >  	.atomic_disable	= vkms_crtc_atomic_disable,
> > > > > @@ -162,6 +201,7 @@ static const struct drm_crtc_helper_funcs vkms_crtc_helper_funcs = {
> > > > >  int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> > > > >  		   struct drm_plane *primary, struct drm_plane *cursor)
> > > > >  {
> > > > > +	struct vkms_output *vkms_out = drm_crtc_to_vkms_output(crtc);
> > > > >  	int ret;
> > > > >  
> > > > >  	ret = drm_crtc_init_with_planes(dev, crtc, primary, cursor,
> > > > > @@ -173,5 +213,9 @@ int vkms_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
> > > > >  
> > > > >  	drm_crtc_helper_add(crtc, &vkms_crtc_helper_funcs);
> > > > >  
> > > > > +	spin_lock_init(&vkms_out->lock);
> > > > > +
> > > > > +	vkms_out->crc_workq = alloc_ordered_workqueue("vkms_crc_workq", 0);
> > > > > +
> > > > >  	return ret;
> > > > >  }
> > > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> > > > > index 37aa2ef33b21..5d78bd97e69c 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_drv.c
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> > > > > @@ -46,6 +46,7 @@ static void vkms_release(struct drm_device *dev)
> > > > >  	platform_device_unregister(vkms->platform);
> > > > >  	drm_mode_config_cleanup(&vkms->drm);
> > > > >  	drm_dev_fini(&vkms->drm);
> > > > > +	destroy_workqueue(vkms->output.crc_workq);
> > > > >  }
> > > > >  
> > > > >  static struct drm_driver vkms_driver = {
> > > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > > index bf811d0ec349..95985649768c 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > > @@ -20,12 +20,23 @@ static const u32 vkms_formats[] = {
> > > > >  	DRM_FORMAT_XRGB8888,
> > > > >  };
> > > > >  
> > > > > +struct vkms_crc_data {
> > > > > +	struct drm_rect src;
> > > > > +	struct drm_framebuffer fb;
> > > > > +};
> > > > > +
> > > > >  /**
> > > > >   * vkms_crtc_state - Driver specific CRTC state
> > > > >   * @base: base CRTC state
> > > > > + * @crc_work: work struct to compute and add CRC entries
> > > > > + * @data: data required for CRC computation
> > > > > + * @n_frame: frame number for computed CRC
> > > > >   */
> > > > >  struct vkms_crtc_state {
> > > > >  	struct drm_crtc_state base;
> > > > > +	struct work_struct crc_work;
> > > > > +	struct vkms_crc_data data;
> > > > > +	unsigned int n_frame;
> > > > >  };
> > > > >  
> > > > >  struct vkms_output {
> > > > > @@ -35,6 +46,11 @@ struct vkms_output {
> > > > >  	struct hrtimer vblank_hrtimer;
> > > > >  	ktime_t period_ns;
> > > > >  	struct drm_pending_vblank_event *event;
> > > > > +	bool crc_enabled;
> > > > 
> > > > Where is this set to true?
> > > > 
> > > 
> > > Oh sorry I forgot to add a vkms_crc.c which sets crc_enabled in
> > > set_crc_source callback! 
> > > 
> > > I'll work on fixing the issues you mentioned here and the other patches
> > > as well, thank you so much again!
> > 
> > Thanks for working through this!
> > 

Thank you for your feedback!

- Haneen

> > Sean
> > 
> > > Haneen
> > > 
> > > > > +	/* ordered wq for crc_work */
> > > > > +	struct workqueue_struct *crc_workq;
> > > > > +	/* protects concurrent access to crc_data */
> > > > > +	spinlock_t lock;
> > > > >  };
> > > > >  
> > > > >  struct vkms_device {
> > > > > @@ -95,4 +111,9 @@ void *vkms_gem_vmap(struct drm_gem_object *obj);
> > > > >  
> > > > >  void vkms_gem_vunmap(struct drm_gem_object *obj);
> > > > >  
> > > > > +/* CRC Support */
> > > > > +int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> > > > > +			size_t *values_cnt);
> > > > > +void vkms_crc_work_handle(struct work_struct *work);
> > > > > +
> > > > >  #endif /* _VKMS_DRV_H_ */
> > > > > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > > > > index aaf7c35faed6..1e0ee8ca8bd6 100644
> > > > > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > > > > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > > > > @@ -24,6 +24,16 @@ static const struct drm_plane_funcs vkms_plane_funcs = {
> > > > >  static void vkms_primary_plane_update(struct drm_plane *plane,
> > > > >  				      struct drm_plane_state *old_state)
> > > > >  {
> > > > > +	struct vkms_crtc_state *state;
> > > > > +
> > > > > +	if (!plane->state->crtc || !plane->state->fb)
> > > > > +		return;
> > > > > +
> > > > > +	state = to_vkms_crtc_state(plane->state->crtc->state);
> > > > > +	memcpy(&state->data.src, &plane->state->src, sizeof(struct drm_rect));
> > > > > +	memcpy(&state->data.fb, plane->state->fb,
> > > > > +	       sizeof(struct drm_framebuffer));
> > > > > +	drm_framebuffer_get(&state->data.fb);
> > > > >  }
> > > > >  
> > > > >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > > > > -- 
> > > > > 2.17.1
> > > > > 
> > > > 
> > > > -- 
> > > > Sean Paul, Software Engineer, Google / Chromium OS
> > 
> > -- 
> > Sean Paul, Software Engineer, Google / Chromium OS
> 
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS

Sean Paul July 18, 2018, 6:44 p.m. UTC | #6

On Wed, Jul 18, 2018 at 07:24:34PM +0300, Haneen Mohammed wrote:
> First, just so you have the complete view, this is the missing part in this patch:
> ------ vkms_crc.c
> static uint32_t _vkms_get_crc(struct vkms_crc_data *crc_data)
> {
> 	struct drm_framebuffer *fb = &crc_data->fb;
> 	struct drm_gem_object *gem_obj = drm_gem_fb_get_obj(fb, 0);
> 	struct vkms_gem_object *vkms_obj = drm_gem_to_vkms_gem(gem_obj);
> 	u32 crc = 0;
> 	int i = 0;
> 	unsigned int x = crc_data->src.x1 >> 16;
> 	unsigned int y = crc_data->src.y1 >> 16;
> 	unsigned int height = drm_rect_height(&crc_data->src) >> 16;
> 	unsigned int width = drm_rect_width(&crc_data->src) >> 16;
> 	unsigned int cpp = fb->format->cpp[0];
> 	unsigned int src_offset;
> 	unsigned int size_byte = width * cpp;
> 	void *vaddr = vkms_obj->vaddr;
> 
> 	if (WARN_ON(!vaddr))
> 		return crc;
> 
> 	for (i = y; i < y + height; i++) {
> 		src_offset = fb->offsets[0] + (i * fb->pitches[0]) + (x * cpp);
> 		crc = crc32_le(crc, vaddr + src_offset, size_byte);
> 	}
> 
> 	return crc;
> }
> 
> void vkms_crc_work_handle(struct work_struct *work)
> {
> 	struct vkms_crtc_state *crtc_state = container_of(work,
> 						struct vkms_crtc_state,
> 						crc_work);
> 	struct drm_crtc *crtc = crtc_state->base.crtc;
> 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> 
> 	if (crtc_state && out->crc_enabled) {
> 		u32 crc32 = _vkms_get_crc(&crtc_state->data);
> 		unsigned int n_frame = crtc_state->n_frame;
> 
> 		drm_crtc_add_crc_entry(crtc, true, n_frame, &crc32);
> 	}
> }
> 
> int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> 			size_t *values_cnt)
> {
> 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> 
> 	if (!src_name) {
> 		out->crc_enabled = false;
> 	} else if (strcmp(src_name, "auto") == 0) {
> 		out->crc_enabled = true;
> 	} else {
> 		out->crc_enabled = false;
> 		return -EINVAL;
> 	}
> 
> 	*values_cnt = 1;
> 
> 	return 0;
> }
> --- end vkms_crc.c

/snip

> > > > > > +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> > > > > > +				   struct drm_crtc_state *old_crtc_state)
> > > > > > +{
> > > > > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > > > > +
> > > > > > +	spin_lock_irq(&vkms_output->lock);
> > > > > 
> > > > > Hmm, you can't lock across atomic_begin/flush. What is this protecting against?
> > > > > 
> > > > 
> > > > I did this per Daniel recommendation:
> > > > 1- spin_lock_irq -> vkms_crtc_atomic_begin
> > > > 2- update vkms_crc_data and get ref to fb -> vkms_primary_plane_update
> > > > 3- spin_unlock_irq after vblank event -> vkms_crtc_atomic_flush
> > > 
> > > I think I'm interpreting his mail a little differently. The point of the
> > > spinlock is to protect the pointer to the crc_data in vkms_crtc (right now it's
> > > not a pointer, but it should be).
> > > 
> > > So I think you need to do the following:
> > > - Change crc_data to a pointer
> > > - In plane_update, allocate new memory to hold crc_data and, under the spinlock,
> > >   update the crc_data to point to the new memory (add a WARN_ON if the pointer
> > >   is NULL)
> > > - In the hrtimer, while holding the spinlock, take the pointer from crc_data
> > >   into a local variable and clear the crtc->crc_data pointer
> > > - Pass the pointer (from the local variable) to the crc_worker
> > > 
> 
> The problem with this approach is managing the pointer allocation and
> deallocation. If I grab it in the hrtimer, then where/when should I free it?
> I can't after the crc worker is done with it because what I noticed is
> that the crc computation is expected to run multiple times after one
> plane_atomic_update.
> 
> > > I don't think you need to hold the spinlock across the atomic hooks, just grab
> > > it in plane_update.
> > 
> 
> I asked Daniel in irc about releasing the vkms_output->lock in the begginning of atomic_flush
> to avoid modifying the event_lock and he mentioned that "output->lock must wrap the event
> handling code completely since we need to make sure that the flip event
> and the crc all match up".
> 
> So maybe I can grab the spinlock in plane_atomic_update and release it at
> the end of crtc_atomic_flush? Is plane_atomic_update always called
> before crtc_atomic_flush?
> 
> Otherwise if I grab the spinlock in atomic_begin I won't be able to
> use pointer for crc_data and allocate it in plane_atomic_update since
> I'll be holding a spinlock already. 
> 
> I have based this patch on Daniel's second feedback to my email
> "[PATCH] drm/vkms: Update CRC support after lock review" which I didn't
> cc the dri mailing list in it, quoting Daniel:
> 
> "Overall flow:
> 1. in crtc_funcs->atomic_begin grab the spinlock, to make sure
> everything is visible completely or not at all to the vblank handler.
> Also update a pointer to the vkms_crtc_state in vkms_output->crc_data
> (or something like that.
> 2. copy the plane data (like you do here) in plane_funcs->atomic_commit.
> 3. drop the spinlock in crtc_funcs->atomic_flush after the vblank
> event is armed.
> 
> In the vblank hrtimer handler:
> 1. grab the spinlock
> 2. drm_crtc_handle_vblank() - we must do this while holding the
> spinlock, otherwise the event and crc might get out of sync. Also grab
> the current vblank number here and store it somewhere in
> vkms_crtc_state so that the worker can use the right one.
> 3. launch the worker. Note that the struct_work needs to be within the
> vkms_crtc_state structure, otherwise the crc computation worker
> doesn't know really for which data it should compute the crc - just
> having one copy in data[1] doesn't really solve this race.
> 4. drop spinlock

Ok, I think I understand what we're trying to do. It seems like the spinlock is
simulating what we would do in hw by telling the display controller not to latch
in new addresses until we're ready. By blocking the vblank handler (and, by
extension, the event), we're delay sending an event out until the crc data is
complete. In hw, this would be a dropped frame.

I'd probably prefer doing something more similar to hw. Something like simulating
a latch to tell the vblank handler to skip sending the event until the next
timer interrupt, instead of blocking it. I think that would avoid having to hold
the spinlock across begin/<everything>/flush. However Daniel is smarter than I am,
so I'll go with his suggestion since I'm sure there's a very good reason this is a
very bad idea :-)

In that case, what you have makes sense to me, so if we could flush the work
item instead of the whole queue, I think we're there.

Sean


> 
> So one big problem we have here still is what happens if the crc
> worker falls behind with computing crcs. There's 2 cases, let's first
> look at how to fixe the crc worker falling behind atomic updates. The
> problem there is that the next atomic update might free the
> vkms_crtc_state structure before the worker is done with it. That's
> fairly easy to fix:
> - add a flush_work call to your vkms_crtc_state_destroy function. This
> means you need to initialize the work already in
> vkms_crtc_state_duplicate (you'll create both of these when doing the
> vkms_crtc_state subclassing).
> - 2nd we need to make sure crc workers are run in order easiest fix
> for that is to create your own private crc work queue (1 per crtc we
> have) with the WQ_ORDERED flag.
> 
> 2nd issue if the vblanks get ahead of the crc worker. This happens
> when e.g. there's no atomic updates, but  we still want to generate 1
> vblank per frame. This is probably better done in a follow-up patch,
> but rough solution is:
> - in the vblank hrtimer check whether the work is still scheduled and
> not yet complete (while holding the spinlock). in that case simply
> tell the worker to fill out the same crc for multiple vblanks, by
> using a vblank_seqno_start/end.
> - if the worker is not busy, then we tell it to compute a new crc (in
> case something in the buffers has changed - in the future we might
> change this behaviour)."
> 
> > So the more I think about this, I don't think it's quite right. Perhaps I'm
> > missing an email from Daniel, or (more likely) misunderstanding what you're
> > trying to do. However, the email I read suggested storing the crc_data
> > to be consumed by the crc worker in vkms_crtc. However in this patch it's being
> > stored (along with the work item) in crtc state. Further, the state cannot be
> > destroyed without completing the crc_work, which kind of defeats the purpose of
> > the workqueue in the first place.
> > 
> > Storing a copy of the crc_data in crtc allows you to free the crtc_state, so you
> > can start working on the next frame while the crc worker is scheduled and picks
> > up the crc_data from crtc. Storing the crc_data in crtc_state and waiting for
> > the work to finish is easier since you don't need to worry about copying and
> > locking (so much). However, it seems like we're doing both in this patch.
> > 
> > Again, I'm likely just not understanding what the goal of this is, so any
> > clarification would be greatly appreciated :)
> > 
> > Sean
> > 
> > > 
> > > Sean
> > > 
> > > > 
> > > > > > +}
> > > > > > +

/snip

Haneen Mohammed July 18, 2018, 7:42 p.m. UTC | #7

On Wed, Jul 18, 2018 at 02:44:18PM -0400, Sean Paul wrote:
> On Wed, Jul 18, 2018 at 07:24:34PM +0300, Haneen Mohammed wrote:
> > First, just so you have the complete view, this is the missing part in this patch:
> > ------ vkms_crc.c
> > static uint32_t _vkms_get_crc(struct vkms_crc_data *crc_data)
> > {
> > 	struct drm_framebuffer *fb = &crc_data->fb;
> > 	struct drm_gem_object *gem_obj = drm_gem_fb_get_obj(fb, 0);
> > 	struct vkms_gem_object *vkms_obj = drm_gem_to_vkms_gem(gem_obj);
> > 	u32 crc = 0;
> > 	int i = 0;
> > 	unsigned int x = crc_data->src.x1 >> 16;
> > 	unsigned int y = crc_data->src.y1 >> 16;
> > 	unsigned int height = drm_rect_height(&crc_data->src) >> 16;
> > 	unsigned int width = drm_rect_width(&crc_data->src) >> 16;
> > 	unsigned int cpp = fb->format->cpp[0];
> > 	unsigned int src_offset;
> > 	unsigned int size_byte = width * cpp;
> > 	void *vaddr = vkms_obj->vaddr;
> > 
> > 	if (WARN_ON(!vaddr))
> > 		return crc;
> > 
> > 	for (i = y; i < y + height; i++) {
> > 		src_offset = fb->offsets[0] + (i * fb->pitches[0]) + (x * cpp);
> > 		crc = crc32_le(crc, vaddr + src_offset, size_byte);
> > 	}
> > 
> > 	return crc;
> > }
> > 
> > void vkms_crc_work_handle(struct work_struct *work)
> > {
> > 	struct vkms_crtc_state *crtc_state = container_of(work,
> > 						struct vkms_crtc_state,
> > 						crc_work);
> > 	struct drm_crtc *crtc = crtc_state->base.crtc;
> > 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> > 
> > 	if (crtc_state && out->crc_enabled) {
> > 		u32 crc32 = _vkms_get_crc(&crtc_state->data);
> > 		unsigned int n_frame = crtc_state->n_frame;
> > 
> > 		drm_crtc_add_crc_entry(crtc, true, n_frame, &crc32);
> > 	}
> > }
> > 
> > int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> > 			size_t *values_cnt)
> > {
> > 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> > 
> > 	if (!src_name) {
> > 		out->crc_enabled = false;
> > 	} else if (strcmp(src_name, "auto") == 0) {
> > 		out->crc_enabled = true;
> > 	} else {
> > 		out->crc_enabled = false;
> > 		return -EINVAL;
> > 	}
> > 
> > 	*values_cnt = 1;
> > 
> > 	return 0;
> > }
> > --- end vkms_crc.c
> 
> /snip
> 
> > > > > > > +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> > > > > > > +				   struct drm_crtc_state *old_crtc_state)
> > > > > > > +{
> > > > > > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > > > > > +
> > > > > > > +	spin_lock_irq(&vkms_output->lock);
> > > > > > 
> > > > > > Hmm, you can't lock across atomic_begin/flush. What is this protecting against?
> > > > > > 
> > > > > 
> > > > > I did this per Daniel recommendation:
> > > > > 1- spin_lock_irq -> vkms_crtc_atomic_begin
> > > > > 2- update vkms_crc_data and get ref to fb -> vkms_primary_plane_update
> > > > > 3- spin_unlock_irq after vblank event -> vkms_crtc_atomic_flush
> > > > 
> > > > I think I'm interpreting his mail a little differently. The point of the
> > > > spinlock is to protect the pointer to the crc_data in vkms_crtc (right now it's
> > > > not a pointer, but it should be).
> > > > 
> > > > So I think you need to do the following:
> > > > - Change crc_data to a pointer
> > > > - In plane_update, allocate new memory to hold crc_data and, under the spinlock,
> > > >   update the crc_data to point to the new memory (add a WARN_ON if the pointer
> > > >   is NULL)
> > > > - In the hrtimer, while holding the spinlock, take the pointer from crc_data
> > > >   into a local variable and clear the crtc->crc_data pointer
> > > > - Pass the pointer (from the local variable) to the crc_worker
> > > > 
> > 
> > The problem with this approach is managing the pointer allocation and
> > deallocation. If I grab it in the hrtimer, then where/when should I free it?
> > I can't after the crc worker is done with it because what I noticed is
> > that the crc computation is expected to run multiple times after one
> > plane_atomic_update.
> > 
> > > > I don't think you need to hold the spinlock across the atomic hooks, just grab
> > > > it in plane_update.
> > > 
> > 
> > I asked Daniel in irc about releasing the vkms_output->lock in the begginning of atomic_flush
> > to avoid modifying the event_lock and he mentioned that "output->lock must wrap the event
> > handling code completely since we need to make sure that the flip event
> > and the crc all match up".
> > 
> > So maybe I can grab the spinlock in plane_atomic_update and release it at
> > the end of crtc_atomic_flush? Is plane_atomic_update always called
> > before crtc_atomic_flush?
> > 
> > Otherwise if I grab the spinlock in atomic_begin I won't be able to
> > use pointer for crc_data and allocate it in plane_atomic_update since
> > I'll be holding a spinlock already. 
> > 
> > I have based this patch on Daniel's second feedback to my email
> > "[PATCH] drm/vkms: Update CRC support after lock review" which I didn't
> > cc the dri mailing list in it, quoting Daniel:
> > 
> > "Overall flow:
> > 1. in crtc_funcs->atomic_begin grab the spinlock, to make sure
> > everything is visible completely or not at all to the vblank handler.
> > Also update a pointer to the vkms_crtc_state in vkms_output->crc_data
> > (or something like that.
> > 2. copy the plane data (like you do here) in plane_funcs->atomic_commit.
> > 3. drop the spinlock in crtc_funcs->atomic_flush after the vblank
> > event is armed.
> > 
> > In the vblank hrtimer handler:
> > 1. grab the spinlock
> > 2. drm_crtc_handle_vblank() - we must do this while holding the
> > spinlock, otherwise the event and crc might get out of sync. Also grab
> > the current vblank number here and store it somewhere in
> > vkms_crtc_state so that the worker can use the right one.
> > 3. launch the worker. Note that the struct_work needs to be within the
> > vkms_crtc_state structure, otherwise the crc computation worker
> > doesn't know really for which data it should compute the crc - just
> > having one copy in data[1] doesn't really solve this race.
> > 4. drop spinlock
> 
> Ok, I think I understand what we're trying to do. It seems like the spinlock is
> simulating what we would do in hw by telling the display controller not to latch
> in new addresses until we're ready. By blocking the vblank handler (and, by
> extension, the event), we're delay sending an event out until the crc data is
> complete. In hw, this would be a dropped frame.
> 
> I'd probably prefer doing something more similar to hw. Something like simulating
> a latch to tell the vblank handler to skip sending the event until the next
> timer interrupt, instead of blocking it. I think that would avoid having to hold
> the spinlock across begin/<everything>/flush. However Daniel is smarter than I am,
> so I'll go with his suggestion since I'm sure there's a very good reason this is a
> very bad idea :-)
> 
> In that case, what you have makes sense to me, so if we could flush the work
> item instead of the whole queue, I think we're there.
> 

Alright, will do that. Thank you so much!

- Haneen
> Sean
> 
> 
> > 
> > So one big problem we have here still is what happens if the crc
> > worker falls behind with computing crcs. There's 2 cases, let's first
> > look at how to fixe the crc worker falling behind atomic updates. The
> > problem there is that the next atomic update might free the
> > vkms_crtc_state structure before the worker is done with it. That's
> > fairly easy to fix:
> > - add a flush_work call to your vkms_crtc_state_destroy function. This
> > means you need to initialize the work already in
> > vkms_crtc_state_duplicate (you'll create both of these when doing the
> > vkms_crtc_state subclassing).
> > - 2nd we need to make sure crc workers are run in order easiest fix
> > for that is to create your own private crc work queue (1 per crtc we
> > have) with the WQ_ORDERED flag.
> > 
> > 2nd issue if the vblanks get ahead of the crc worker. This happens
> > when e.g. there's no atomic updates, but  we still want to generate 1
> > vblank per frame. This is probably better done in a follow-up patch,
> > but rough solution is:
> > - in the vblank hrtimer check whether the work is still scheduled and
> > not yet complete (while holding the spinlock). in that case simply
> > tell the worker to fill out the same crc for multiple vblanks, by
> > using a vblank_seqno_start/end.
> > - if the worker is not busy, then we tell it to compute a new crc (in
> > case something in the buffers has changed - in the future we might
> > change this behaviour)."
> > 
> > > So the more I think about this, I don't think it's quite right. Perhaps I'm
> > > missing an email from Daniel, or (more likely) misunderstanding what you're
> > > trying to do. However, the email I read suggested storing the crc_data
> > > to be consumed by the crc worker in vkms_crtc. However in this patch it's being
> > > stored (along with the work item) in crtc state. Further, the state cannot be
> > > destroyed without completing the crc_work, which kind of defeats the purpose of
> > > the workqueue in the first place.
> > > 
> > > Storing a copy of the crc_data in crtc allows you to free the crtc_state, so you
> > > can start working on the next frame while the crc worker is scheduled and picks
> > > up the crc_data from crtc. Storing the crc_data in crtc_state and waiting for
> > > the work to finish is easier since you don't need to worry about copying and
> > > locking (so much). However, it seems like we're doing both in this patch.
> > > 
> > > Again, I'm likely just not understanding what the goal of this is, so any
> > > clarification would be greatly appreciated :)
> > > 
> > > Sean
> > > 
> > > > 
> > > > Sean
> > > > 
> > > > > 
> > > > > > > +}
> > > > > > > +
> 
> /snip
> 
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS

Haneen Mohammed July 28, 2018, 6:40 p.m. UTC | #8

On Wed, Jul 18, 2018 at 02:44:18PM -0400, Sean Paul wrote:
> On Wed, Jul 18, 2018 at 07:24:34PM +0300, Haneen Mohammed wrote:
> > First, just so you have the complete view, this is the missing part in this patch:
> > ------ vkms_crc.c
> > static uint32_t _vkms_get_crc(struct vkms_crc_data *crc_data)
> > {
> > 	struct drm_framebuffer *fb = &crc_data->fb;
> > 	struct drm_gem_object *gem_obj = drm_gem_fb_get_obj(fb, 0);
> > 	struct vkms_gem_object *vkms_obj = drm_gem_to_vkms_gem(gem_obj);
> > 	u32 crc = 0;
> > 	int i = 0;
> > 	unsigned int x = crc_data->src.x1 >> 16;
> > 	unsigned int y = crc_data->src.y1 >> 16;
> > 	unsigned int height = drm_rect_height(&crc_data->src) >> 16;
> > 	unsigned int width = drm_rect_width(&crc_data->src) >> 16;
> > 	unsigned int cpp = fb->format->cpp[0];
> > 	unsigned int src_offset;
> > 	unsigned int size_byte = width * cpp;
> > 	void *vaddr = vkms_obj->vaddr;
> > 
> > 	if (WARN_ON(!vaddr))
> > 		return crc;
> > 
> > 	for (i = y; i < y + height; i++) {
> > 		src_offset = fb->offsets[0] + (i * fb->pitches[0]) + (x * cpp);
> > 		crc = crc32_le(crc, vaddr + src_offset, size_byte);
> > 	}
> > 
> > 	return crc;
> > }
> > 
> > void vkms_crc_work_handle(struct work_struct *work)
> > {
> > 	struct vkms_crtc_state *crtc_state = container_of(work,
> > 						struct vkms_crtc_state,
> > 						crc_work);
> > 	struct drm_crtc *crtc = crtc_state->base.crtc;
> > 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> > 
> > 	if (crtc_state && out->crc_enabled) {
> > 		u32 crc32 = _vkms_get_crc(&crtc_state->data);
> > 		unsigned int n_frame = crtc_state->n_frame;
> > 
> > 		drm_crtc_add_crc_entry(crtc, true, n_frame, &crc32);
> > 	}
> > }
> > 
> > int vkms_set_crc_source(struct drm_crtc *crtc, const char *src_name,
> > 			size_t *values_cnt)
> > {
> > 	struct vkms_output *out = drm_crtc_to_vkms_output(crtc);
> > 
> > 	if (!src_name) {
> > 		out->crc_enabled = false;
> > 	} else if (strcmp(src_name, "auto") == 0) {
> > 		out->crc_enabled = true;
> > 	} else {
> > 		out->crc_enabled = false;
> > 		return -EINVAL;
> > 	}
> > 
> > 	*values_cnt = 1;
> > 
> > 	return 0;
> > }
> > --- end vkms_crc.c
> 
> /snip
> 
> > > > > > > +static void vkms_crtc_atomic_begin(struct drm_crtc *crtc,
> > > > > > > +				   struct drm_crtc_state *old_crtc_state)
> > > > > > > +{
> > > > > > > +	struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
> > > > > > > +
> > > > > > > +	spin_lock_irq(&vkms_output->lock);
> > > > > > 
> > > > > > Hmm, you can't lock across atomic_begin/flush. What is this protecting against?
> > > > > > 
> > > > > 
> > > > > I did this per Daniel recommendation:
> > > > > 1- spin_lock_irq -> vkms_crtc_atomic_begin
> > > > > 2- update vkms_crc_data and get ref to fb -> vkms_primary_plane_update
> > > > > 3- spin_unlock_irq after vblank event -> vkms_crtc_atomic_flush
> > > > 
> > > > I think I'm interpreting his mail a little differently. The point of the
> > > > spinlock is to protect the pointer to the crc_data in vkms_crtc (right now it's
> > > > not a pointer, but it should be).
> > > > 
> > > > So I think you need to do the following:
> > > > - Change crc_data to a pointer
> > > > - In plane_update, allocate new memory to hold crc_data and, under the spinlock,
> > > >   update the crc_data to point to the new memory (add a WARN_ON if the pointer
> > > >   is NULL)
> > > > - In the hrtimer, while holding the spinlock, take the pointer from crc_data
> > > >   into a local variable and clear the crtc->crc_data pointer
> > > > - Pass the pointer (from the local variable) to the crc_worker
> > > > 
> > 
> > The problem with this approach is managing the pointer allocation and
> > deallocation. If I grab it in the hrtimer, then where/when should I free it?
> > I can't after the crc worker is done with it because what I noticed is
> > that the crc computation is expected to run multiple times after one
> > plane_atomic_update.
> > 
> > > > I don't think you need to hold the spinlock across the atomic hooks, just grab
> > > > it in plane_update.
> > > 
> > 
> > I asked Daniel in irc about releasing the vkms_output->lock in the begginning of atomic_flush
> > to avoid modifying the event_lock and he mentioned that "output->lock must wrap the event
> > handling code completely since we need to make sure that the flip event
> > and the crc all match up".
> > 
> > So maybe I can grab the spinlock in plane_atomic_update and release it at
> > the end of crtc_atomic_flush? Is plane_atomic_update always called
> > before crtc_atomic_flush?
> > 
> > Otherwise if I grab the spinlock in atomic_begin I won't be able to
> > use pointer for crc_data and allocate it in plane_atomic_update since
> > I'll be holding a spinlock already. 
> > 
> > I have based this patch on Daniel's second feedback to my email
> > "[PATCH] drm/vkms: Update CRC support after lock review" which I didn't
> > cc the dri mailing list in it, quoting Daniel:
> > 
> > "Overall flow:
> > 1. in crtc_funcs->atomic_begin grab the spinlock, to make sure
> > everything is visible completely or not at all to the vblank handler.
> > Also update a pointer to the vkms_crtc_state in vkms_output->crc_data
> > (or something like that.
> > 2. copy the plane data (like you do here) in plane_funcs->atomic_commit.
> > 3. drop the spinlock in crtc_funcs->atomic_flush after the vblank
> > event is armed.
> > 
> > In the vblank hrtimer handler:
> > 1. grab the spinlock
> > 2. drm_crtc_handle_vblank() - we must do this while holding the
> > spinlock, otherwise the event and crc might get out of sync. Also grab
> > the current vblank number here and store it somewhere in
> > vkms_crtc_state so that the worker can use the right one.
> > 3. launch the worker. Note that the struct_work needs to be within the
> > vkms_crtc_state structure, otherwise the crc computation worker
> > doesn't know really for which data it should compute the crc - just
> > having one copy in data[1] doesn't really solve this race.
> > 4. drop spinlock
> 
> Ok, I think I understand what we're trying to do. It seems like the spinlock is
> simulating what we would do in hw by telling the display controller not to latch
> in new addresses until we're ready. By blocking the vblank handler (and, by
> extension, the event), we're delay sending an event out until the crc data is
> complete. In hw, this would be a dropped frame.
> 

Hi Sean, now that I've moved the crc_data to plane_state, I think I can
move the spinlock to plane_update.

> I'd probably prefer doing something more similar to hw. Something like simulating
> a latch to tell the vblank handler to skip sending the event until the next
> timer interrupt, instead of blocking it. I think that would avoid having to hold

I didn't completely understand this method, but the best I can think of
is that if I want to try this approach, then I should move the spinlock to
plane_update, which mean it can be acquired again by the vblank_handle()
before we send the vblank event in crtc_atomic_flush(), and thus we need
to delay the call to vblank_handle() until we make sure the event was
sent from atomic_flush?


> the spinlock across begin/<everything>/flush. However Daniel is smarter than I am,
> so I'll go with his suggestion since I'm sure there's a very good reason this is a
> very bad idea :-)
> 

> In that case, what you have makes sense to me, so if we could flush the work
> item instead of the whole queue, I think we're there.
> 
> Sean
> 
> 
> > 
> > So one big problem we have here still is what happens if the crc
> > worker falls behind with computing crcs. There's 2 cases, let's first
> > look at how to fixe the crc worker falling behind atomic updates. The
> > problem there is that the next atomic update might free the
> > vkms_crtc_state structure before the worker is done with it. That's
> > fairly easy to fix:
> > - add a flush_work call to your vkms_crtc_state_destroy function. This
> > means you need to initialize the work already in
> > vkms_crtc_state_duplicate (you'll create both of these when doing the
> > vkms_crtc_state subclassing).
> > - 2nd we need to make sure crc workers are run in order easiest fix
> > for that is to create your own private crc work queue (1 per crtc we
> > have) with the WQ_ORDERED flag.
> > 
> > 2nd issue if the vblanks get ahead of the crc worker. This happens
> > when e.g. there's no atomic updates, but  we still want to generate 1
> > vblank per frame. This is probably better done in a follow-up patch,
> > but rough solution is:
> > - in the vblank hrtimer check whether the work is still scheduled and
> > not yet complete (while holding the spinlock). in that case simply
> > tell the worker to fill out the same crc for multiple vblanks, by
> > using a vblank_seqno_start/end.
> > - if the worker is not busy, then we tell it to compute a new crc (in
> > case something in the buffers has changed - in the future we might
> > change this behaviour)."
> > 
> > > So the more I think about this, I don't think it's quite right. Perhaps I'm
> > > missing an email from Daniel, or (more likely) misunderstanding what you're
> > > trying to do. However, the email I read suggested storing the crc_data
> > > to be consumed by the crc worker in vkms_crtc. However in this patch it's being
> > > stored (along with the work item) in crtc state. Further, the state cannot be
> > > destroyed without completing the crc_work, which kind of defeats the purpose of
> > > the workqueue in the first place.
> > > 
> > > Storing a copy of the crc_data in crtc allows you to free the crtc_state, so you
> > > can start working on the next frame while the crc worker is scheduled and picks
> > > up the crc_data from crtc. Storing the crc_data in crtc_state and waiting for
> > > the work to finish is easier since you don't need to worry about copying and
> > > locking (so much). However, it seems like we're doing both in this patch.
> > > 
> > > Again, I'm likely just not understanding what the goal of this is, so any
> > > clarification would be greatly appreciated :)
> > > 
> > > Sean
> > > 
> > > > 
> > > > Sean
> > > > 
> > > > > 
> > > > > > > +}
> > > > > > > +
> 
> /snip
> 
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS

[v2,5/5] drm/vkms: Implement CRC debugfs API

Commit Message

Comments

Patch