From patchwork Wed Nov 25 19:04:26 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mario Kleiner X-Patchwork-Id: 7729371 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 38027BEEE5 for ; Mon, 30 Nov 2015 20:13:58 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id D802D20546 for ; Mon, 30 Nov 2015 20:13:56 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 269AC20544 for ; Mon, 30 Nov 2015 20:13:55 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AA3736E66F; Mon, 30 Nov 2015 12:13:52 -0800 (PST) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTP id 01D946E66C for ; Mon, 30 Nov 2015 12:13:49 -0800 (PST) Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga103.fm.intel.com with ESMTP; 30 Nov 2015 12:13:41 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,365,1444719600"; d="scan'208";a="831201847" Received: from stinkbox.fi.intel.com (HELO stinkbox) ([10.237.72.174]) by orsmga001.jf.intel.com with SMTP; 30 Nov 2015 12:13:40 -0800 Received: by stinkbox (sSMTP sendmail emulation); Mon, 30 Nov 2015 22:13:38 +0200 Resent-From: Ville =?iso-8859-1?Q?Syrj=E4l=E4?= Resent-Date: Mon, 30 Nov 2015 22:13:38 +0200 Resent-Message-ID: <20151130201338.GL4437@intel.com> Resent-To: dri-devel@lists.freedesktop.org X-Original-To: ville.syrjala@linux.intel.com Delivered-To: ville.syrjala@linux.intel.com Received: from linux.intel.com [10.23.219.25] by localhost with IMAP (fetchmail-6.3.26) for (single-drop); Wed, 25 Nov 2015 21:09:12 +0200 (EET) Received: from orsmga002.jf.intel.com (orsmga002.jf.intel.com [10.7.209.21]) by linux.intel.com (Postfix) with ESMTP id C0C3A6A4083 for ; Wed, 25 Nov 2015 11:03:47 -0800 (PST) Received: from orsmga101.jf.intel.com ([10.7.208.22]) by orsmga002-1.jf.intel.com with ESMTP; 25 Nov 2015 11:04:54 -0800 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0CDAAABBlZWlDVSfUpWCBkBAQIPAQEBAYRNrQeRQIFmhg8CgUM6EgEBAQEBAQERAQEBAQcLCwkfMIQ0AQEBAQIBDAYVGQEbHAEBAwELBgULBgMBAgEJFg8JAwIBAgEPAhEBBQEUCAYNBgIBARcHh3YBAwoIBaAlgTE+MY0ygnmGBwoZJw1WhA0BAQEBAQEBAQEBAQEBAQEBAQESAwEFDotEgUABgRKBYwg9hD4Fh0qPDYJciGOBdoFchzQMBCOLRYNkgiU2gRcnAYI/DQ0JB4FXcQGFKwEBAQ X-IPAS-Result: A0CDAAABBlZWlDVSfUpWCBkBAQIPAQEBAYRNrQeRQIFmhg8CgUM6EgEBAQEBAQERAQEBAQcLCwkfMIQ0AQEBAQIBDAYVGQEbHAEBAwELBgULBgMBAgEJFg8JAwIBAgEPAhEBBQEUCAYNBgIBARcHh3YBAwoIBaAlgTE+MY0ygnmGBwoZJw1WhA0BAQEBAQEBAQEBAQEBAQEBAQESAwEFDotEgUABgRKBYwg9hD4Fh0qPDYJciGOBdoFchzQMBCOLRYNkgiU2gRcnAYI/DQ0JB4FXcQGFKwEBAQ X-IronPort-AV: E=Sophos;i="5.20,343,1444719600"; d="scan'208,223";a="343145289" Received: from mail-wm0-f53.google.com ([74.125.82.53]) by mtab.intel.com with ESMTP; 25 Nov 2015 11:04:31 -0800 Received: by wmvv187 with SMTP id v187so176981wmv.1 for ; Wed, 25 Nov 2015 11:04:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-type; bh=UBObVkIkN5Lcxx/xMI7gjAeOJXm58teDVdcp6X5Dbx4=; b=kKBGVxR9EhNbF2WOVMsRFmz/NU3Q1Zy7/RHZFQdFnznuIZECUhJ28nW1WUw2AkfCra 71qYWjZAa3RfXs120zPKq6FGJO+jLVllztjiZ9UzsSr9E8vnikKqygmMjM/EKpHXy3Cj 4AbVjH/p8A4Tb4KHJih5xJmKAyHpZlOMaiQfKgCV6QaNCezHAP1DiWq6U3By2Q8irdVe K/JPbAdQUtrY7Fopi5Ob+9vJ9HXjWE7NySnIqNgsSqRMvgdvZ0q3vYvtKM1VKxm5Fj98 x78YXnXhZN8y+fMv1v4US9x4GrZTEyifII1M0AX5BkbGnEqGxFCRY1DmRwfXlxCi5G6g nmwg== X-Received: by 10.28.72.137 with SMTP id v131mr6645828wma.63.1448478270266; Wed, 25 Nov 2015 11:04:30 -0800 (PST) Received: from [172.25.194.222] (cin-11.medizin.uni-tuebingen.de. [134.2.118.242]) by smtp.gmail.com with ESMTPSA id da10sm24371382wjb.22.2015.11.25.11.04.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 25 Nov 2015 11:04:28 -0800 (PST) Subject: Re: Funky new vblank counter regressions in Linux 4.4-rc1 To: =?UTF-8?B?VmlsbGUgU3lyasOkbMOk?= References: <20151119182051.GU4437@intel.com> <564E1F18.5070508@gmail.com> <20151119194557.GB4437@intel.com> <564F3B42.5050803@gmail.com> <20151120153435.GR4437@intel.com> <56532F69.9080001@gmail.com> <20151123155109.GB4437@intel.com> <565353CA.5010909@gmail.com> <20151123202426.GC4437@intel.com> <5655EEBD.5040403@gmail.com> <20151125174600.GV4437@intel.com> From: Mario Kleiner Message-ID: <5656063A.8080504@gmail.com> Date: Wed, 25 Nov 2015 20:04:26 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <20151125174600.GV4437@intel.com> Cc: =?UTF-8?Q?Michel_D=c3=a4nzer?= X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED,RCVD_IN_DNSWL_MED,T_DKIM_INVALID,T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On 11/25/2015 06:46 PM, Ville Syrjälä wrote: > On Wed, Nov 25, 2015 at 06:24:13PM +0100, Mario Kleiner wrote: >> On 11/23/2015 09:24 PM, Ville Syrjälä wrote: >>> On Mon, Nov 23, 2015 at 06:58:34PM +0100, Mario Kleiner wrote: >>>> >>>> >>>> On 11/23/2015 04:51 PM, Ville Syrjälä wrote: >>>>> On Mon, Nov 23, 2015 at 04:23:21PM +0100, Mario Kleiner wrote: >>>>>> On 11/20/2015 04:34 PM, Ville Syrjälä wrote: >>>>>>> On Fri, Nov 20, 2015 at 04:24:50PM +0100, Mario Kleiner wrote: >>>>>> >>>>>> ... >>>>>> Ok, but why would that be a bad thing? I think we want it to think it is >>>>>> in the previous frame if it is called outside the vblank irq context. >>>>>> The only reason we fudge it to the next frames vblank if i vblank irq is >>>>>> because we know the vblank irq handler we are executing atm. was meant >>>>>> to execute within the upcoming vblank for the next frame, so we fudge >>>>>> the scanout positions and thereby timestamp to correspond to that new >>>>>> frame. But if something called outside irq context it should get a >>>>>> scanout position/timestamp that corresponds to "reality". >>>>> >>>>> It would be a bad thing since it would cause the timestamp to jump >>>>> backwards, and that would also cause the frame count guesstimate to go >>>>> backwards. >>>>> >>>> >>>> But only if we don't use the dev->driver->get_vblank_counter() method, >>>> which we try to use on AMD. >>> >>> Well, if you do it that way then you have the problem of the hw counter >>> seeming to jump forward by one after crossing the start of vblank (when >>> compared to the value you sampled when you processed the early vblank >>> interrupt). >>> >> >> Ok, finally i see the bad scenario that wouldn't get prevented by our >> current locking with the new vblank counting in the core. The vblank >> enable path is safe due to locking and discounting of redundant >> timestamps etc. But the disable path could go wrong: >> >> 1. Vblank irq fires, drm_handle_vblank() -> drm_update_vblank_count(), >> updates timestamps and counts "as if" in vblank -> incremented vblank >> count and timestamp now set in the future. >> >> 2. After vblank irq finishes, but just before leading edge of vblank, >> vblank_disable_and_save() executes, doesn't get bumped timestamp or >> count because before vblank and not in vblank irq. Now >> drm_update_vblank_count() would process a >> "new" timestamp and count from the past and we'd have time and counts >> going backwards, and bad things would happen. >> >> I haven't observed such a thing happening during testing so far, >> probably because the time window in which it could happen is tiny, but >> given how awfully bad it would be, it needs to be prevented. >> >> I had a look at the description of the Vblank irq in the "M76 Register >> Reference Guide" for older asics and the description suggests that the >> vblank irq fires when the crtc's line buffer is finished reading pixel >> data from the scanout buffer in memory for a frame, ie., when the line >> buffer read "enters" vblank. That would explain why the irq happens a >> few scanlines before actual vblank, because line buffer refills must >> obviously happen before the crtc can send pixel data from the line >> buffer to the encoders, so it would lead a bit in time. That also means >> we can't delay the vblank irq to actually happen at start of vblank and >> have to deal with the early vblank irq. >> >>> I guess one silly idea would be to defer the vblank interrupt processing >>> to a timer, and just schedule it a bit into the future from the actual >>> interrupt handler. >>> >> >> Timer would be bad because then we get problems with the pageflip >> completion irq sometimes being processed before the vblank irq, > > You you'd need to move page flip completion to happen from the vblank > timer too I suppose. > >> and >> because we want to be fast in vblank irq handling, delivering vblank >> events etc. I wouldn't trust a timer to be reliable enough for such >> short waits. > > hrtimers should be accurate. But maybe more expensive than the timer > wheel. > Sounds all a bit complex and fraught with new possible complications. I can't spend much more time on this than i did so far, and we need to get this into 4.4-rc, so i am aiming for the most simple solution that would work. >> Busy waiting wouldn't be great either in irq. >> >> So what about this relatively simple one? >> >> 1. In radeon_get_crtc_scanoutpos() we artifically define the >> vblank_start line to be, e.g, 5 scanlines before the true start of >> vblank, so for the purpose of vblank counter queries and timestamping >> our "vblank" would start a bit earlier and the vblank irq would always >> execute in "vblank". Non-Irq invocations like vblank_disable_and_save() >> would also be treated to this early vblank start and so what the DRM >> core observes would always be consistent. >> >> 2. At least in radeon-kms we also use radeon_get_crtc_scanoutpos() >> internally for "dynpm" dynamic power management/reclocking, and to >> implement pageflip completion detection on asics older than DCE3 which >> don't have pageflip interrupts. For those cases we need to use the true >> start of vblank, so for this internal use we pass in some special flag >> into radeon_get_crtc_scanoutpos() to tell it to not shift the vblank >> start around. >> >> 3. I've added another check to my patch for drm_update_vblank_count() to >> catch timestamps going backwards and discounting such cases, for a bit >> of extra robustness against driver trouble. >> >> Any ideas why this would be a stupid idea? I'll try this now and just >> hope meanwhile nobody finds a reason why this would be bad. > > What I don't like is leaking any of this into the core code. All the > hacks should live in the hw specific code. We've managed to keep all > the i915 hacks hidden that way. > > So if you would: > - fudge your get_scanout_position as you suggested > - _not_ expose the hw counter to the vblank core since it > disagrees with the scanout position > - keep the internal get_scanout_position variant/flag > purely internal > > then I think I might like it. That way working hardware doesn't have to > be exposed to these hacks, and there's possible a bit less danger that it > all gets broken again next time the core needs some work. > Ok. Exposing the fudged hw counter shouldn't be a problem though, given that it would be fudged in a consistent way with the fudged scanout positions and to have incremented at time of drm_handle_vblank(). We'd bump the hw counter to the count of the new vblank at the same time when the scanout positions would start counting backwards from minus something to 0, showing how many vblank lines to go until start of scanout, etc. The only difference to reality would be that this simulated vblank would start 5 scanlines earlier than the true hw vblank, but i can't see how the core would care about that. > One problem with that approach could be that the vblank event and page > flip event would be delievered at different times if you keep using the > page flip interrupt, but I'm not sure that would be a problem. At least > they should both have the same timestamp and counter value. > That's the same now, and the timestamps and counts be the same. I'll check with my measurement equipment that the timestamps will be still accurate wrt. to reality. Attached is my current patch i wanted to submit for the drm core's drm_update_vblank_count(). I think it's good to make the core somewhat robust against potential kms driver bugs or glitches. But if you wouldn't like that patch, there wouldn't be much of a point sending it out at all. thanks, -mario >From 2d5d58a1c575ad002ce2cb643f395d0e4757d959 Mon Sep 17 00:00:00 2001 From: Mario Kleiner Date: Wed, 25 Nov 2015 18:48:31 +0100 Subject: [PATCH] drm/irq: Make drm_update_vblank_count() more robust. The changes to drm_update_vblank_count() for Linux 4.4-rc made the function more fragile wrt. some hw quirks. E.g., at dev->driver->enable_vblank(), AMD gpu's fire a spurious redundant vblank irq shortly after enabling vblank irqs, not locked to vblank. This causes a redundant call which needs to be suppressed to avoid miscounting. To increase robustness, shuffle things around a bit: On drivers with high precision vblank timestamping always evaluate the timestamp difference between current timestamp and previously recorded timestamp to detect such redundant invocations and no-op in that case. Also detect and warn about timestamps going backwards to catch potential kms driver bugs. This patch is meant for Linux 4.4-rc and later. Signed-off-by: Mario Kleiner --- drivers/gpu/drm/drm_irq.c | 53 ++++++++++++++++++++++++++++++++++++----------- 1 file changed, 41 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 819b8c1..8728c3c 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, unsigned long flags) { struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; - u32 cur_vblank, diff; + u32 cur_vblank, diff = 0; bool rc; struct timeval t_vblank; + const struct timeval *t_old; + u64 diff_ns; int count = DRM_TIMESTAMP_MAXRETRIES; int framedur_ns = vblank->framedur_ns; @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags); } while (cur_vblank != dev->driver->get_vblank_counter(dev, pipe) && --count > 0); - if (dev->max_vblank_count != 0) { - /* trust the hw counter when it's around */ - diff = (cur_vblank - vblank->last) & dev->max_vblank_count; - } else if (rc && framedur_ns) { - const struct timeval *t_old; - u64 diff_ns; - + /* + * Always use vblank timestamping based method if supported to reject + * redundant vblank irqs. E.g., AMD hardware needs this to not screw up + * due to some irq handling quirk. + * + * This also sets the diff value for use as fallback below in case the + * hw does not support a suitable hw vblank counter. + */ + if (rc && framedur_ns) { t_old = &vblanktimestamp(dev, pipe, vblank->count); diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old); @@ -212,11 +216,36 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, */ diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns); - if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) + /* Catch driver timestamping bugs and prevent worse. */ + if ((s32) diff < 0) { + DRM_DEBUG_VBL("crtc %u: Timestamp going backward!" + " diff_ns = %lld, framedur_ns = %d)\n", + pipe, (long long) diff_ns, framedur_ns); + return; + } + + if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) { DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored." - " diff_ns = %lld, framedur_ns = %d)\n", - pipe, (long long) diff_ns, framedur_ns); - } else { + " diff_ns = %lld, framedur_ns = %d)\n", + pipe, (long long) diff_ns, framedur_ns); + + /* Treat this redundant invocation as no-op. */ + WARN_ON_ONCE(cur_vblank != vblank->last); + return; + } + } + + /* + * hw counters, if marked as supported via max_vblank_count != 0, + * *must* increment at leading edge of vblank or at least before + * the firing of the hw vblank irq, otherwise we have a race here + * between gpu and us, where small variations in our execution + * timing can cause off-by-one miscounting of vblanks. + */ + if (dev->max_vblank_count != 0) { + /* trust the hw counter when it's around */ + diff = (cur_vblank - vblank->last) & dev->max_vblank_count; + } else if (!(rc && framedur_ns)) { /* some kind of default for drivers w/o accurate vbl timestamping */ diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; } -- 1.9.1