Message ID: 20131030190123.GM16735@n2100.arm.linux.org.uk (mailing list archive)
State: New, archived
On Wed, Oct 30, 2013 at 2:01 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> On Wed, Oct 30, 2013 at 01:01:15PM -0500, Matt Sealey wrote:
>> I seem to remember it needs to be within 2% - 1 bit extra on the fraction and you're 3 times that over the range of acceptable clocks monitors MUST accept by specification.
>
> I think 2% may be too generous - that's what I originally tried when fiddling around with this and it wasn't good enough in some circumstances for my TV to produce a picture.

One day I *will* find the VESA spec where I read that. It is mentioned somewhere in the HDMI spec as a reference, but the VESA doc isn't public. There was a time I had access to it - so I have a copy. Will figure it out...

>> On i.MX51, too, the IPU HSC can't go above 133MHz without making the system unstable, so any mode that needs 148MHz (most high HDMI modes) won't work.
>
> That's where validating the modes you're prepared to support becomes important, and there's hooks in DRM for doing that.

Nope! Not in the right place.

> If you can't support those with 148.5MHz dotclocks, then you have to return MODE_CLOCK_HIGH from the connector's mode_valid callback.

Here's the problem: connectors and encoders are initialized under DRM before the display driver. For the connector to even KNOW what the highest valid clock is, it would need to ask the lower-level display driver, since the connector and even the encoder have no clue of this limitation, shouldn't hardcode it, and shouldn't need to know what the input pixel clock capabilities are.

If you specify a mode that the SII9022 can't display at all, or the connected monitor (via EDID information) says it can't support it because of the maximum TMDS clock field in the EDID, sure, it can send back MODE_CLOCK_HIGH, but otherwise what needs to happen is that the sii9022 driver needs to know FAR too much.

> result in DRM pruning the modes reported by the connector to those which the display hardware can support.

Except that the connector cannot know what the display hardware can support, as above. I already pointed the above out on the DRM list maybe 18 months ago; it's a big flaw in the DRM design, and the only way I can figure around it is to have the connector driver look for its display controller in the device tree and read out the clocks from there, or just a property (max-tmds-clock?) which would hack around it, overriding a higher value from EDID.

http://thread.gmane.org/gmane.comp.video.dri.devel/65394

>> There's no good reason to use the DI internal clock from IPU HSC, unless you can guarantee an EXACT rate coming out (i.e. if you need 64MHz, then you can get that with 128MHz/2.0 - which is fine.)
>
> This is exactly what I'm doing on my Carrier-1 board. If the output can use the internal clock (iow, if IPU_DI_CLKMODE_EXT is clear) and the IPU clock can generate the dotclock within 1% of the requested rate, then we set the DI to use the IPU clock and set the divisor appropriately.

In theory there are two things. One is that old VESA spec which says "we expect some crappy clocks, so anything within a certain range will work", and the CEA spec which says any mode which is specified as X MHz as the clock rate would also be required to be displayed at X*(1000/1001) (which covers NTSC weirdness). So in fact it may only be 0.01% variance that really matters.. 1% could be too high. It depends whether they're following the CEA spec to the letter or the VESA specs to the letter.
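A rough sketch of the 1% check described above (illustrative only - this is the same arithmetic, not the actual imx-drm code): pick the nearest integer divisor of the IPU clock and see whether the resulting rate stays within 1% of the requested pixel clock.

/*
 * Illustrative sketch only: can an integer divisor of the IPU clock
 * hit the requested pixel clock to within 1%?
 */
static int ipu_rate_close_enough(unsigned long ipu_rate,
                                 unsigned long pixelclock)
{
        unsigned long div, rate, error;

        if (pixelclock < 1000)
                return 0;

        div = (ipu_rate + pixelclock / 2) / pixelclock; /* round to nearest */
        if (div == 0)
                div = 1;

        rate = ipu_rate / div;

        /* error in tenths of a percent: 1000 == exact match */
        error = rate / (pixelclock / 1000);

        return error < 1010 && error >= 990;    /* allow 1% either way */
}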
One thing that always got me, and I know this is an aside, is how many "HDMI" systems people have where they're setting the default video mode to 1024x768. You'd think nobody ever read the CEA-861 spec, ever - it specifically states that the ONLY mode guaranteed is 640x480. In fact, 1024x768 is not in the list for either primary OR secondary defined CEA modes, so there's a high chance your TV won't do it. Monitors - sure, they may do it, they may do it because they have to for Windows Logo certification, but TVs don't get certified by Microsoft.

> Each of the four cases I handle separately - and I've gotten rid of the horrid CCF stuff in ipu-di.c: there's really no need for that additional complication here, we're not sharing this mux and divisor which are both internal to the DI with anything else.
>
> Consequently, my ipu-di.c (relative to v3.12-rc7) looks like this, and works pretty well with HDMI, providing a stable picture with almost all mode configurations reported by my TV.
>
> It's also fewer lines too. :)

True, and it looks a lot like the Freescale drivers now, but I do think creating a clockdev clock is probably the "correct" way to do it, especially since in debug situations you'd be able to see this - and its value, and its current parent - in sysfs along with the other clocks.

It might also help to encourage people not to just play with internal clocks, but to create clock devices and use them inside their drivers - tons of audio stuff has manual clock management right now, and all this information gets lost through (IMO) needless abstraction in the audio subsystem. I might be convinced that it's not needless, but I would still say there's no good reason NOT to implement any clock whose parent is a registered clkdev clock as a clkdev clock itself, and expose it with the rest of the clock debug infrastructure.

I don't think I like that fractional dividers aren't used; until someone can come up with a great reason why not (except, I think, odd-fraction dividers are not stable, so it could be that just masking off the bottom bit of the fraction is the key) and someone tests it on more than one monitor..

I can break out my monitor testing regime, once I figure out the i.MX51 stuff. I have a few HDMI monitors I bought over the years which were notoriously flakey, and a few DVI ones that had real trouble being connected to a real HDMI source at some point.

Thanks,
Matt Sealey <neko@bakuhatsu.net>
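As a rough illustration of the "register it as a clock device" argument above (assumptions: only the integer part of the DI divider is modelled, since the Q12.4 fraction does not fit the generic divider type, and the clock name and helper here are made up):

#include <linux/clk-provider.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(di_clkgen_lock);

/*
 * Hypothetical sketch: expose the integer part of the DI pixel divider
 * (DI_BS_CLKGEN0 bits 11:4) as a common clock framework divider, so it
 * shows up in the clk debug tree next to the SoC clocks instead of
 * being invisible driver-internal state.
 */
static struct clk *di_register_pixel_div(struct device *dev,
                                         void __iomem *clkgen0,
                                         const char *parent)
{
        return clk_register_divider(dev, "di_pixel_div", parent, 0,
                                    clkgen0, 4, 8,
                                    CLK_DIVIDER_ONE_BASED,
                                    &di_clkgen_lock);
}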
On Tue, Nov 05, 2013 at 04:39:40PM -0600, Matt Sealey wrote:
> On Wed, Oct 30, 2013 at 2:01 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> > That's where validating the modes you're prepared to support becomes important, and there's hooks in DRM for doing that.
>
> Nope! Not in the right place.

Sure it's the right place. It's the connector which gets to read the modes, and the connector is part of the DRM system.

> > If you can't support those with 148.5MHz dotclocks, then you have to return MODE_CLOCK_HIGH from the connector's mode_valid callback.
>
> Here's the problem; connectors and encoders are initialized under DRM before the display driver. For the connector to even KNOW what the highest valid clock is, it would need to ask the lower level display driver, since the connector and even encoder has no clue of this limitation and shouldn't be hardcoded, or even know what the input pixel clock capabilities are.

That's because the way imx-drm is set up with its multitude of individual drivers is utter madness; there's no way that it will ever move out of drivers/staging as long as it messes around violating the internals of DRM by playing those kinds of games. This was discussed at some considerable length at kernel summit, with me as the main motivator for it - including some private discussions with various people involved.

The DRM model is a card model, just like ALSA. In this model, the subsystem gets initialised once, and at that point all components of the "card" are expected to be known. The reverse is true when tearing down a card: the whole card goes away in one go, individual components do not.

What this means for DRM is that all CRTCs, connectors and encoders are registered with DRM entirely within the card-level ->load callback, and all of those are torn down at the ->unload callback. Between those two points in time, nothing in the configuration of the "card" ever changes.

The same is true of ALSA: ALSA too does not support hotplugging individual components. This is why we have ASoC - ASoC handles the grouping up of individual components, works out when all components are available, and only then does it register the "card" with ALSA. When any of those components go away, ASoC pulls the whole card from ALSA.

As I say, this was discussed at kernel summit. One thing was made abundantly clear: DRM is _not_ going to support hotplugging individual components of a card. Moreover, what has been made perfectly clear is that imx-drm is not going to move out of drivers/staging as long as it abuses DRM's card model.

> I already pointed the above out on the DRM list maybe 18 months ago, it's a big flaw in the DRM design, and the only way I can figure around it is to have the connector driver look for its display controller in the device tree and read out the clocks from there, or just a property (max-tmds-clock?) which would hack around it, overriding a higher value from EDID.

No, not at all. If you build up the set of components first, before involving DRM, then you have the full information. I'm doing that today: the DRM "card" can only finish initialisation when all known connectors can find their CRTCs within the DRM card. If any CRTC (== IPU DI) specified in the DT file for a "connector" (eg, HDMI interface, LVDS bridge, etc) is not present, the DRM card doesn't initialise.
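A rough sketch of that "gate card creation on every component being present" idea (the "crtcs" property name and the lookup helper are hypothetical, used only to illustrate the deferral - this is not the actual imx-drm-core code):

#include <linux/of.h>
#include <linux/errno.h>

/*
 * For each CRTC phandle a connector node references, check that the
 * corresponding CRTC has already probed and registered itself.  If any
 * is missing, defer - the DRM "card" is only created once every
 * referenced component is present.
 */
static int check_connector_crtcs(struct device_node *connector)
{
        struct device_node *crtc_np;
        int i = 0;

        while ((crtc_np = of_parse_phandle(connector, "crtcs", i++))) {
                bool found = crtc_is_registered(crtc_np); /* hypothetical helper */

                of_node_put(crtc_np);
                if (!found)
                        return -EPROBE_DEFER;
        }

        return 0;       /* all referenced CRTCs are present */
}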
Once you have that in place, you have a way to find out the properties of the CRTC, and you then have a way to validate the modes properly against all parts of the display system.

> True, and it looks a lot like the Freescale drivers now,

Not at all. The Freescale drivers use the clk API as this did, and just as badly. I'm afraid that by making that comment, you've just shown that you much prefer writing long verbose emails rather than actually reading and understanding the subject for which you're responding.

I refer you to this:

http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/drivers/mxc/ipu3/ipu_disp.c?h=imx_3.0.35_4.1.0#n155

which clearly shows Freescale's 4.1.0 BSP using the clk API to control the DI clock mux.

When you look at the mess that using the clk API creates in the imx-drm driver in staging:

static int clk_di_set_rate(struct clk_hw *hw, unsigned long rate,
                unsigned long parent_rate)
{
        struct ipu_di *di = container_of(hw, struct ipu_di, clk_hw_out);
        int div;
        u32 clkgen0;

        clkgen0 = ipu_di_read(di, DI_BS_CLKGEN0) & ~0xfff;

        div = ipu_di_clk_calc_div(parent_rate, rate);

        ipu_di_write(di, clkgen0 | div, DI_BS_CLKGEN0);

int ipu_di_init_sync_panel(struct ipu_di *di, struct ipu_di_signal_cfg *sig)
{
        ..
        ret = clk_set_rate(di->clk_di_pixel, round);
        ...
        div = ipu_di_read(di, DI_BS_CLKGEN0) & 0xfff;
        div = div / 16;         /* Now divider is integer portion */

        /* Setup pixel clock timing */
        /* Down time is half of period */
        ipu_di_write(di, (div << 16), DI_BS_CLKGEN1);

So, we call clk_set_rate(), that calls down to clk_di_set_rate() which sets up the DI_BS_CLKGEN0 register. We then have to read that register back later in ipu_di_init_sync_panel() to get the divider to program the duty cycle which we want generated.

Now, compare that to this, where we choose which clock locally according to some rules set down by the requirements of the attached display bridge, calculate the divider required, and then:

        /* Set the divider */
        ipu_di_write(di, clkgen0, DI_BS_CLKGEN0);

        /*
         * Set the high/low periods. Bits 24:16 give us the falling edge,
         * and bits 8:0 give the rising edge. LSB is fraction, and is
         * based on the divider above. We want a 50% duty cycle, so set
         * the falling edge to be half the divider.
         */
        ipu_di_write(di, (clkgen0 >> 4) << 16, DI_BS_CLKGEN1);

That's quite a bit simpler, and doesn't violate abstraction levels at all. And getting rid of this clk API crap where it's not required (a) ends up with a better driver which actually _works_, and (b) results in fewer lines of code:

 drivers/staging/imx-drm/ipu-v3/ipu-di.c |  328 ++++++++++---------------------
 1 files changed, 105 insertions(+), 223 deletions(-)

> but I do think creating a clockdev clock is probably the "correct" way to do it, especially since in debug situations you'd be able to see this - and its value, and its current parent - in sysfs along with the other clocks.

The clk API just gets in the way of making the correct decisions. Really, it does. We need to calculate the divider to work out what is the right clock to use, or whether we can use a clock. Having that hidden behind the clk API does not give us the ability to make that decision.

> It might also help to encourage people not to just play with internal clocks, but to create clock devices and use them inside their drivers

Drivers playing with their own _internal_ clocks is entirely reasonable, and it's entirely reasonable that they need not be exposed to userspace.
Take for instance your average PC: it has lots of clocks, especially on the video card, none of them exposed. They've gotten away with that for decades. Why are we any different? "Oh, we're embedded, we're special!" I hear you cry. No, we're not special at all.

> I don't think I like that fractional dividers aren't used; until someone can come up with a great reason why not (except, I think, odd-fraction dividers are not stable, so it could be that just masking off the bottom bit of the fraction is the key) and someone tests it on more than one monitor..

Why fractional dividers are bad news - this is the TMDS clock that resulted from the DI being configured with a fractional divider:

http://www.home.arm.linux.org.uk/~rmk/cubox/IMG_1545-adj.JPG

No, that's not blur. That's the result of a fractional divider before the HDMI phy PLL. The result is that the PLL is frequency modulated, and the connected TV basically tells the IMX to "piss off".

It's not "odd fractions" that are a problem - I believe Freescale's code is correct, the problem is that no one has _thought_ about what this is doing:

#ifdef WTF_IS_THIS
        /*
         * Freescale has this in their Kernel. It is neither clear what
         * it does nor why it does it
         */
        if (div & 0x10)
                div &= ~0x7;
        else {
                /* Round up divider if it gets us closer to desired pix clk */
                if ((div & 0xC) == 0xC) {
                        div += 0x10;
                        div &= ~0xF;
                }
        }
#endif

I know what that's doing, it's not rocket science. But if you have to resort to using fractional dividers when you have another clock available to generate a precise clock without the inherent instability caused by fractional dividers, you might as well make use of it.

Moreover, I'm not the only one to run into _real_ problems with fractional dividers - other people have too with the iMX stuff, and have also had to disable these fractional dividers too.
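To make the "frequency modulated" point concrete: a fractional divider cannot produce fractional periods, it can only dither between adjacent integer divisors so that the *average* rate comes out right. A small stand-alone sketch (illustrative only - the real DI mechanism may differ, but the constraint is the same) of the period sequence a 3.5 divider produces:

#include <stdio.h>

/*
 * Illustrative only: a Q12.4 "3.5" divider (0x38) has to wait an
 * integer number of input cycles per output cycle, so it alternates
 * between 3 and 4.  The average is 3.5, but the instantaneous period
 * jumps around - which is exactly what upsets a downstream HDMI PLL.
 */
int main(void)
{
        unsigned div_q4 = 0x38;         /* 3.5 in Q12.4 */
        unsigned acc = 0, i;

        for (i = 0; i < 8; i++) {
                unsigned next = acc + div_q4;

                printf("output cycle %u: %u input cycles\n",
                       i, (next >> 4) - (acc >> 4));
                acc = next;
        }
        return 0;
}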
On Thu, Nov 7, 2013 at 11:29 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> On Tue, Nov 05, 2013 at 04:39:40PM -0600, Matt Sealey wrote:
>> On Wed, Oct 30, 2013 at 2:01 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
>> > That's where validating the modes you're prepared to support becomes important, and there's hooks in DRM for doing that.
>>
>> Nope! Not in the right place.
>
> Sure it's the right place. It's the connector which gets to read the modes, and the connector is part of the DRM system.

It's really not the right place.

>> > If you can't support those with 148.5MHz dotclocks, then you have to return MODE_CLOCK_HIGH from the connector's mode_valid callback.
>>
>> Here's the problem; connectors and encoders are initialized under DRM before the display driver. For the connector to even KNOW what the highest valid clock is, it would need to ask the lower level display driver, since the connector and even encoder has no clue of this limitation and shouldn't be hardcoded, or even know what the input pixel clock capabilities are.
>
> That's because the way imx-drm is setup with its multitude of individual drivers is utter madness;

This is how all DRM cards are written, even the good ones. For the connector to be able to ask the crtc "is this mode even supportable?" it would need a mode_valid callback to call, down to the encoder (and maybe the encoder would call the crtc) and down again to the crtc. No such callback exists, for this exact reason. At the point the connector pulls EDID modes and uses mode_valid, it's entirely possible and nearly always true that it doesn't even HAVE an encoder or CRTC.

> there's no way that it will ever move out of drivers/staging as long as it messes around violating the internals of DRM by playing those kinds of games. This was discussed at some considerable length at kernel summit, with me as the main motivator for it - including some private discussions with various people involved.

What it lacks, in this case, is not any functional code in those components, but a workable model to glue those components together in the correct order, on a per-board basis. The appropriate card model isn't implemented - except in crazy functions in imx-drm-core.c - the rest of it is absolutely fine.

What that doesn't fix is the underlying, inescapable fact above. At the time the encoder or connector actually goes off and pulls EDID, parses modes and does its thing, at least until about 6 months ago, it was long, long before it was attached to a CRTC. The ONLY thing that can happen is a call to the crtc mode_fixup() which exists only to say "go fuck yourself" to a mode_set call. Nothing ever culls these mode lists after the connector generates them, because it owns that object and ditching items from it is a semi-dangerous layering violation.
So here is the user experience with the current model: * User clicks a button in GNOME to set the fanciest resolution they can find, which is listed in their dialog * User is told "that mode can't be set, sorry" * User waits for you to go to sleep and then suffocates you with a pillow And what it should be is: * User clicks a button in GNOME to set the fanciest resolution actually supported with this combination, because the drivers knew enough to pre-filter invalid modes * User always gets a valid mode set because it only ever reports valid modes That second, consumer-oriented, model where the usable mode list is predicated on results from the *entire* card and not just what the monitor said, simply wasn't - and I assume still isn't - possible. Why not? Because when a Radeon is plugged into a monitor it bloody well supports it, and that's the end of it. People don't make displays that modern graphics cards can't use. By the time 1080p TVs in common circulation rolled around for consumer devices, WQXGA monitors already existed, so desktop PC graphics cards followed suit pretty quickly. However, some embedded devices have restrictions. I have a couple devices at home that have a maximum resolution of 1280x720 - because the SoC doesn't provide anything that can do more than a 75MHz pixel clock or so. So, that sucks, but it's a real limitation of the SoC that is essentially insurmountable. In the case on the i.MX51, it was just never designed to be a 1080p-capable device. However, there is NOTHING in the architecture of the chip except the maximum clock and some bus bandwidth foibles that says it's impossible. I can run 1080p@30 and it operates great in 2D and even does respectable 3D. The video decoders still work - if you have a low enough bitrate/complexity movie, it *will* decode 1080p at 30fps. So artificially restricting the displays to "720p maximum" is overly restrictive to customers, considering that 1366x768,1440x900 are well within the performance of the units. 1600x900, 1920x1080 (low enough refresh rate) are still usable. The problem is under DRM > The DRM model is a card model, just like ALSA. In this model, the > subsystem gets initialised once, and at that point all components of > the "card" are expected to be known. The reverse is true when tearing > down a card: the whole card goes away in one go, individual components > do not. > > What this means for DRM is that all CRTCs, connectors and encoders are > registered with DRM entirely within the card-level ->load callback, > and all of those are torn down at the ->unload callback. Between those > two points in time, nothing in the configuration of the "card" ever > changes. Right but there is no card driver, and all the current "embedded" DRM drivers get around this by making some crass assumptions about the connections - most of them just have an LCD controller and HDMI controller built-in so they haven't trampled over the case where the device tree allowed an i2c device to probe and initialize and somehow needs discovery to glue into the framework as an encoder. The "encoder slave" system right now is totally futzed, while it would be a GREAT point to implement every encoder-connector combination that card drivers need to pick up, it doesn't have the right functionality. The reason there's no card driver.. discussed below.. > The same is true of ALSA: ALSA too does not support hotplugging > individual components. 
> This is why we have ASoC - ASoC handles the grouping up of individual components, works out when all components are available, and only then does it register the "card" with ALSA. When any of those components go away, ASoC pulls the whole card from ALSA.

There's a distinct lack of actual procedure and process standardized for collecting card data from the device tree, and in the case of a single SoC, the current ALSA/DRM model will be one driver for every board, and every board needs its own compatible property, and every board does *the exact same thing*. You can see this with the current imx{51|53|6x}-board-{codec} driver model. Every single driver is doing almost identical stuff; where they share codecs they are doing ABSOLUTELY identical stuff. There is stuff in the card-level driver that doesn't even do anything at the card level (and the card level always calls down for this functionality to the codec level).

If device trees were adopted with the intent to allow removal of the "board setup code written in C and glommed in a directory with the SoC support code", implementing "card support" for every board is just moving that crappy platform code from one directory to another with absolutely zero benefit or flexibility or ability to expose a common configuration. What happened with ALSA is the same shit we're going to get with DRM.

> No, not at all. If you build up the set of components first, before involving DRM, then you have the full information. I'm doing that today: the DRM "card" can only finish initialisation when all known connectors can find their CRTCs within the DRM card. If any CRTC (== IPU DIs) specified in the DT file for a "connector" (eg, HDMI interface, lvds bridge, etc) are not present, the DRM card doesn't initialise.

And this is what's missing. So, here's the question: how do you implement a card-level driver model that marshals this information together and submits it to the separate components without, in the process, creating a whole new driver framework?

The need for having the connector ask the CRTC what a valid mode is doesn't make sense here, in that DRM doesn't have such a framework to do such a thing (reasons above). In the i.MX51 case, the maximum display clock is *absolutely not* a function of the connector, but it must know this information in the same way that it would also need to know the maximum TMDS output clock for the encoder (bzzt - sorry, sir, it doesn't) so that it can validate the modes, and in the way it knows to gather the maximum supported TMDS clock from the monitor EDID (if present).

> Once you have that in place, you have a way to find out the properties of the CRTC, and you then have a way to validate the modes properly against all parts of the display system.

Right, and your card driver "isn't DRM" in the same way as DRM has a card model - it operates on the level of a PCIe device driver that probes and magically sets up *all* its subordinate buses, encoders, connectors. You can do that with i.MX6 HDMI because the HDMI is built into the SoC - if you added a node for it and added a cross-reference to it, then it's fine. What happens in the case where you defined an HDMI transmitter as an external I2C device - it's connected on this bus, and Linux already probed the thing. Your card model would have to go out and find this i2c device somehow, and there's not a great way of doing it. What if the EDID can't be obtained by straight I2C and you have to set some weird bits inside the transmitter to access the DDC bus?
That requires an API that doesn't exist, and any topology-finding drivers that are built off the single use case of "we just attach a panel direct to the SoC by some parallel or differential link", rather than the intermediate-transmitters case, are going to miss out the finer points of the intermediate-transmitters case.

>> True, and it looks a lot like the Freescale drivers now,
>
> Not at all. The Freescale drivers use the clk API as this did, and just as badly. I'm afraid that by making that comment, you've just shown that you much prefer writing long verbose emails rather than actually reading and understanding the subject for which you're responding.
>
> I refer you to this:
>
> http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/drivers/mxc/ipu3/ipu_disp.c?h=imx_3.0.35_4.1.0#n155
>
> which clearly shows Freescale's 4.1.0 BSP using the clk API to control the DI clock mux.

I understand THAT driver like the back of my hand..

>> think creating a clockdev clock is probably the "correct" way to do it, especially since in debug situations you'd be able to see this - and its value, and its current parent - in sysfs along with the other clocks.
>
> The clk API just gets in the way of making the correct decisions. Really, it does. We need to calculate the divider to work out what is the right clock to use, or whether we can use a clock. Having that hidden behind the clk API does not give us the ability to make that decision.

It shouldn't ever have been hidden anyway; someone completely over-abstracted the clock API at some point by making it a query/get/set interface for a generic clock object, where the only reasonable values are the rate and, in a limited fashion, the parent. Why not add clk_get_divider() which essentially returns 0 for anything that's not actually a real divider clock?

>> It might also help to encourage people not to just play with internal clocks, but to create clock devices and use them inside their drivers
>
> Drivers playing with their own _internal_ clocks is entirely reasonable, and it's entirely reasonable that they need not be exposed to userspace. Take for instance your average PC, it has lots of clocks, especially on the video card, none of them exposed. They've gotten away with that for decades. Why are we any different.
>
> "Oh, we're embedded, we're special!" I hear you cry. No we're not special at all.

No, we're not special in that there's no functional difference, but the real situation is that as a developer it's REALLY difficult to get the information you want out of some internal clock somewhere compared to a core SoC clock inside the clkdev framework, and every internal clock needs its own magic, custom implementation - which just makes maintaining it more work, since 'everyone' then has to learn not only how to derive operation from a generic clkdev device, but also how the actual implementation of the internal clock works underneath it. Which is what clkdev was supposed to get rid of, right? What it does is make simple cases much easier.

I will agree with you that in this case, this is totally screwy, but I don't want to see it discourage people from still doing it... that said........

>> I don't think I like that fractional dividers aren't used; until someone can come up with a great reason why not (except, I think, odd-fraction dividers are not stable, so it could be that just masking off the bottom bit of the fraction is the key) and someone tests it on more than one monitor..
>
> Why fractional dividers are bad news - this is the TMDS clock that resulted from the DI being configured with a fractional divider:
>
> http://www.home.arm.linux.org.uk/~rmk/cubox/IMG_1545-adj.JPG
>
> No, that's not blur. That's the result of a fractional divider before the HDMI phy PLL. The result is that the PLL is frequency modulated, and the connected TV basically tells the IMX to "piss off".

Understood.

> It's not "odd fractions" that are a problem

Maybe I should have made myself a bit clearer.. the further away from a 0.0 or 0.5 fractional part you get, the less it works, and the further up from 0.5, the less reliable it is regardless. There are some odd (as in oddball, not the opposite of even) fractions that don't make any sense, and I found I got more monitors to work if I just masked off the bottom bit of the divider, or the bottom two, but that broke a couple of monitors more than it fixed.

The effect for me, if I quit using the fractional divider completely, was that in many cases it would always get caught by the "clock within a reasonable distance of the requested rate" check, and then we ignore it and use an external clock - and the amount of testing someone can do with only 11 monitors, limited spare units, and not a lot of time to spend is limited. Which you propose later.. who gives a shit if we can't use fractional rates? Just set the clock right! There's a funky reason why dropping the divider actually works, or why masking off certain parts of it for certain modes probably works..

> - I believe Freescale's code is correct

I disagree with that, and I'll explain why :)

> the problem is that no one has _thought_ about what this is doing:

I disagree with that, since I was thinking about it in 2009.. I just can't figure out EXACTLY what the real fix is; I never had enough hardware, time, or patience. I totally agree with the assessment Alexander had, and I trust your scopes. A screwy clock gets generated. I don't think it's because of the "fractional part" though.

They've been doing odd tricks since the earliest BSP I could find, and I started working on it at this BSP version:

http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/drivers/mxc/ipu3/ipu_disp.c?h=imx_2.6.28#n865

Which was modified at this commit, going further back:

http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/commit/drivers/mxc/ipu3/ipu_disp.c?h=imx_2.6.28&id=1c930f0b2f1a67c7eb46a045c7517376e8ecaa32

I only fiddled with this thing for the shortest time before moving to 2.6.31 as fast as possible; the code reappears in later BSPs just ported in with no git history... but here, note that set_rate sets DI_BS_CLKGEN1, too. There's not really a good reason why it WOULDN'T.

http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tree/drivers/mxc/ipu3/ipu_common.c?h=imx_2.6.31_10.08.01#n151

And we get to more modern times, and DI_BS_CLKGEN1 is not in the clk API code.

The rounding essentially says you can only have a fraction of 1/2 if your clock divider integer part is odd. If your clock divider integer part is even, and the fractional part has a particular set of values (0.75, 0.8125, 0.875, 0.9375), it'll bump it to the next (odd-integer-part), unfractioned divider.
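To make that rounding visible, here is a small stand-alone sketch (illustrative only) applying the WTF_IS_THIS logic quoted earlier to a few Q12.4 divider values:

#include <stdio.h>

/*
 * Apply the Freescale rounding quoted above: odd integer parts keep at
 * most a .5 fraction; even integer parts with a fraction of .75 or
 * more get bumped to the next odd integer with no fraction.
 */
static unsigned fsl_round(unsigned div)
{
        if (div & 0x10)
                div &= ~0x7;
        else if ((div & 0xC) == 0xC) {
                div += 0x10;
                div &= ~0xF;
        }
        return div;
}

int main(void)
{
        unsigned samples[] = { 0x38, 0x3C, 0x2C, 0x26 }; /* 3.5, 3.75, 2.75, 2.375 */
        unsigned i, r;

        for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
                r = fsl_round(samples[i]);
                printf("%u.%04u -> %u.%04u\n",
                       samples[i] >> 4, (samples[i] & 0xf) * 625,
                       r >> 4, (r & 0xf) * 625);
        }
        return 0;
}

This prints 3.5 -> 3.5, 3.75 -> 3.5, 2.75 -> 3.0 and 2.375 -> 2.375, which matches the description above.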
I never got how that even worked, but it is, in my opinion, somewhat to do with this gem in setting DI_BS_CLKGEN1:

        div = ipu_di_read(di, DI_BS_CLKGEN0) & 0xfff;
        div = div / 16;         /* Now divider is integer portion */

        /* Setup pixel clock timing */
        /* Down time is half of period */
        ipu_di_write(di, (div << 16), DI_BS_CLKGEN1);   <-- right here

Recall that the original code (1c930f0b2f1a) also set the "up" portion here. If you will notice, it writes the integer part of the divider (so in Alexander's case, 0x3) to the "down period" component of the second clock generator register by shifting the entire thing up 16 bits. According to the docs, DI_BS_CLKGEN1[16] is ANOTHER FRACTION BIT (as is DI_BS_CLKGEN1[0] for the "up period").

My theory, which I could never confirm, is that DI_BS_CLKGEN1 has always been misprogrammed because it's off-by-one on the actual integer part of the divider, and there's probably a case for setting the delay period (top of DI_BS_CLKGEN0) and the "up period" of DI_BS_CLKGEN1 too. I just never had the amount of hardware I'd need to do a functional test with real devices that gave us those clock dividers.

In Alex's case with his 3.5 divider, the down period is being set to 1.5 (i.e. (0x38 >> 4) << 16) in that bit position. With a 3.0 divider, dropping 1.5 in the down period is sorta correct. Clock works. Yay?

It also *ALWAYS* works in the external clock case, since the external clock (as per Freescale) is always used as parent, generating the DI clock with an exact divider of 2, 4, or 8.

Does that maybe go so far as to explain why those fractional dividers really don't work properly? There's a REALLY good explanation in the i.MX6QDRM manual around "34.7.10.3 Timing generator" which gives exactly how this clock is set up. What you're expecting as a straight fractional division of a clock isn't true - every pin output on the DI is derived from a fundamental timebase which is defined by that fractional divider in DI_BS_CLKGEN0. What actually goes to the display is THEN derived from that fundamental timebase, modified by the valueS (plural!) in DI_BS_CLKGEN1. There's a significant amount of math, not being done here, to figure out whether the exact display timing is going to be valid, but it turns out that through pure luck it tends to work out for a couple of cases..

I never sat down and thought of the actual math required, and since it impacts ALL the waveform generation too, it seemed like a lot of work I wasn't prepared to do at the time. I don't think I was paid enough for it, when I was being paid to do it. I don't get paid to do it at all right now (thank Glob) and nobody's giving me any free hardware that makes me want to do it... so it's in your court.

> I know what that's doing, it's not rocket science. But if you have to resort to using fractional dividers when you have another clock available to generate a precise clock without the inherent instability caused by fractional dividers

If DI_BS_CLKGEN1 can be proven mis-programmed, and fixing that fixes the instability, then awesome. I'd like to see that investigated though. If not, sure, I agree with you 100%. There's no reason

> Moreover, I'm not the only one to run into _real_ problems with fractional dividers - other people have too with the iMX stuff, and have also had to disable these fractional dividers too.

Well I've been dealing with them since the i.MX51 taped out and honestly..
it worked a lot better with the 2.6.26 kernel with the original hacks, but at that time I don't recall we had any support for an external clock. As soon as 2.6.28 rolled round we found some weird problems, and Freescale added code to use external clocks and we stopped caring because it pretty much always worked, with some veeerrrrrry strange caveats - most of which were masked by API deficiencies in the framebuffer subsystem and a distinct inability to ask the IPU driver what clock it actually set from the code in our transmitter driver. Which is, back to an earlier point, *why I want real clkdevs internal to drivers* since it makes it a lot easier to pass the information around in a standardized way. The transmitter may need to know *precisely* what pixel clock was set, and at the encoder level what I can get back is clk_round_rate rounded by the crtc to the exact clock it's outputting, rather than recalculating the clock from the mode information (syncs+totals+blah, you patched this out in your fixes) and not being able to take into account the rounding EXCEPT if I add my own API.. to fetch the information. It's more reasonable if the encoder can just request the crtc's clk handle and do what it may, than for a special API for passing "mode clock information" came into it. What we've not ascertained in reality is not "does it generate a weird clock" but why did Freescale only limit only those very specific fractions (why would (if (div & 0xc)==0xc) div+=0x1; div &= 0xf be "working" for them?) and why don't they set the "up period" in their driver since 2009? And what happens if we really, really properly configure the clock generation? Matt Sealey <neko@bakuhatsu.net>
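For anyone trying to follow the register-level argument, here is the field layout as described in this thread, as a small decode sketch (taken from the discussion above, not re-checked against the i.MX reference manual - treat the bit positions as the thread's claim, not gospel):

/*
 * DI_BS_CLKGEN0 bits 11:0: pixel divider, Q12.4 (4 fraction bits).
 * DI_BS_CLKGEN1 bits 24:16 and 8:0: "down" and "up" periods, Q8.1
 * (1 fraction bit each), expressed in timebase cycles.
 */
struct di_clkgen {
        unsigned div_int;       /* integer part of the divider */
        unsigned div_frac16;    /* fraction, in 1/16ths */
        unsigned down_half;     /* down period, in half cycles */
        unsigned up_half;       /* up period, in half cycles */
};

static void di_clkgen_decode(unsigned clkgen0, unsigned clkgen1,
                             struct di_clkgen *d)
{
        d->div_int = (clkgen0 & 0xfff) >> 4;
        d->div_frac16 = clkgen0 & 0xf;
        d->down_half = (clkgen1 >> 16) & 0x1ff;
        d->up_half = clkgen1 & 0x1ff;
}

With the staging code quoted earlier, a 3.5 divider gives clkgen0 = 0x38 and clkgen1 = (0x38 >> 4) << 16, which decodes to a divider of 3 + 8/16 but a down period of 3 half-cycles = 1.5 - the mismatch Matt is pointing at.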
On Fri, Nov 08, 2013 at 06:23:37PM -0600, Matt Sealey wrote: > On Thu, Nov 7, 2013 at 11:29 AM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Tue, Nov 05, 2013 at 04:39:40PM -0600, Matt Sealey wrote: > >> On Wed, Oct 30, 2013 at 2:01 PM, Russell King - ARM Linux > >> <linux@arm.linux.org.uk> wrote: > >> > That's where validating the modes you're prepared to support becomes > >> > important, and there's hooks in DRM for doing that. > >> > >> Nope! Not in the right place. > > > > Sure it's the right place. It's the connector which gets to read the > > modes, and the connector is part of the DRM system. > > It's really not the right place. > > >> > If you can't support those with 148.5MHz dotclocks, then you have to return > >> > MODE_CLOCK_HIGH from the connector's mode_valid callback. > >> > >> Here's the problem; connectors and encoders are initialized under DRM > >> before the display driver. For the connector to even KNOW what the > >> highest valid clock is, it would need to ask the lower level display > >> driver, since the connector and even encoder has no clue of this > >> limitation and shouldn't be hardcoded, or even know what the input > >> pixel clock capabilities are. > > > > That's because the way imx-drm is setup with its multitude of individual > > drivers is utter madness; > > This is how all DRM cards are written, even the good ones. For the > connector to be able to ask the crtc "is this mode even supportable?" > it would need a mode_valid callback to call, down to the encoder (and > maybe the encoder would call crtc) and down down to the crtc. No such > callback exists for this exact reason. At the point the connector > pulls EDID modes and uses mode_valid, it's entirely possible and > nearly always true that it doesn't even HAVE an encoder or CRTC. Now look at the bigger picture. What makes the decision about whether to use a mode with a particular CRTC? Is it: (a) DRM (b) Userspace The answer is not (a) - DRM just does what userspace instructs. What this means is that you can't do per-CRTC validation of mode settings without big changes to the API, which aren't going to happen. So the best you can do is to limit the displayed modes according to what the hardware in general is able to do. Luckily, with imx-drm, in the case of single IPUs, both "crtcs" have exactly the same capability as far as clocks go - which is common with every other DRM system so far. So really, there's no need to know which CRTC is going to be associated with the connector - we already know the capabilities there. > What it lacks, in this case, is not any functional code in those > components, but a workable model to glue those components together in > the correct order, on a per-board basis. > > The appropriate card model isn't implemented - except in crazy > functions in imx-drm-core.c - the rest of it is absolutely fine. Well, I've actually fixed the problem *right* *now*. > The ONLY thing that can happen is a call to the crtc mode_fixup() > which exists only to say "go fuck yourself" to a mode_set call. > Nothing ever culls these mode lists after the connector generates > them, because it owns that object and ditching items from it is > semi-dangerous layering violation. 
> So here is the user experience with the current model:
>
> * User clicks a button in GNOME to set the fanciest resolution they can find, which is listed in their dialog
> * User is told "that mode can't be set, sorry"
> * User waits for you to go to sleep and then suffocates you with a pillow
>
> And what it should be is:
>
> * User clicks a button in GNOME to set the fanciest resolution actually supported with this combination, because the drivers knew enough to pre-filter invalid modes
> * User always gets a valid mode set because it only ever reports valid modes
>
> That second, consumer-oriented, model where the usable mode list is predicated on results from the *entire* card and not just what the monitor said, simply wasn't - and I assume still isn't - possible. Why not? Because when a Radeon is plugged into a monitor it bloody well supports it, and that's the end of it. People don't make displays that modern graphics cards can't use. By the time 1080p TVs in common circulation rolled around for consumer devices, WQXGA monitors already existed, so desktop PC graphics cards followed suit pretty quickly.

Sorry, no, I'm not buying your arguments here - you may be right, but your apparent bias towards "oh, DRM was written for Radeon" is soo wrong. Maybe you should consider that the DRM maintainer works for Intel, who are a competitor of AMD not only in the CPU market but also the video market too.

However, even so. Getting back to the hardware we have, which is imx-drm, there is no difference between the capabilities of the two "CRTCs" in a single IPU as far as mode pixel rates are concerned. There _may_ be a difference between the two IPUs on IMX6Q, but not within a single IPU.

> However, some embedded devices have restrictions. I have a couple devices at home that have a maximum resolution of 1280x720 - because the SoC doesn't provide anything that can do more than a 75MHz pixel clock or so. So, that sucks, but it's a real limitation of the SoC that is essentially insurmountable.

Right, so turning everything into micro-components is a bad idea. That's already been said (I think I've already said that about imx-drm.)

> In the case on the i.MX51, it was just never designed to be a 1080p-capable device. However, there is NOTHING in the architecture of the chip except the maximum clock and some bus bandwidth foibles that says it's impossible. I can run 1080p@30 and it operates great in 2D and even does respectable 3D. The video decoders still work - if you have a low enough bitrate/complexity movie, it *will* decode 1080p at 30fps. So artificially restricting the displays to "720p maximum" is overly restrictive to customers, considering that 1366x768,1440x900 are well within the performance of the units. 1600x900, 1920x1080 (low enough refresh rate) are still usable.
>
> The problem is under DRM

Err no. The reason 1080p@30 works is because it uses the same pixel rate as 1280x720: 74.25MHz.

So the problem is with your thinking. What you're thinking is "it won't do 1080p so we have to deny modes based on the resolution". There's absolutely no problem whatsoever restricting the set of available modes based on _pixel_ _clock_ _rate_ in DRM. DRM fully supports this. There is no problem here. So you _can_ have 1080p at 30Hz if you have a bandwidth limitation. All it takes is for the _correct_ limitations to be imposed rather than the artificial ones you seem to be making up.

And at this point I give up reading your diatribe.
You've been told many times in the past not to write huge long essays as emails, because frankly people won't read them. It's 1 am, I'm not going to spend another hour or more reading whatever you've written below this point because its another 300 lines I just don't have time to read at this. If you'd like to rephrase in a more concise manner then I may read your message.
On Fri, Nov 8, 2013 at 6:53 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> On Fri, Nov 08, 2013 at 06:23:37PM -0600, Matt Sealey wrote:
>> On Thu, Nov 7, 2013 at 11:29 AM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
>
> Now look at the bigger picture. What makes the decision about whether to use a mode with a particular CRTC? Is it:
>
> (a) DRM
> (b) Userspace
>
> The answer is not (a) - DRM just does what userspace instructs. What this means is that you can't do per-CRTC validation of mode settings without big changes to the API, which aren't going to happen.

That's exactly what I was saying in the first place, and what Alex Deucher said to me a year ago after I spent 2 days straight crashing an i.MX51 trying to figure out a way to get the connector to ask the CRTC for the right information.

So I accept that we can't do validation of modes at a CRTC level, but what's the solution? I don't accept that it would be to shove mode culling policy to userspace. The xorg-video-modesetting driver is a gold standard - if all users want is to get a linear framebuffer with a particular mode (and we should all be happy for this on ARM since it means KMS is working, hooray!), it should not have to know which modes will actually work or not, or be modified to be platform-specific.

In the case of "card drivers" for DRM, on the kernel side, having a "card" driver per board to micromanage settings will get as unwieldy as having multiple platform-specific xorg-video-modesetting patches, when there are 10 or 20 boards based on a particular chip in mainline, and 20 or 30 SoCs supported in active use. ALSA is getting away with it right now because barely anyone has gotten to the point of having working audio with device tree except the i.MX6 and i.MX5 cases. It won't scale as it goes forward.

I am betting the one you "have *right* *now*" is the above: you wrote a driver which, given a very specific set of device tree entries with specific compatible properties in combination, specifically initializes several i.MX6-specific components together in the right order. This also won't scale, going forward.

> Luckily, with imx-drm, in the case of single IPUs, both "crtcs" have exactly the same capability as far as clocks go - which is common with every other DRM system so far. So really, there's no need to know which CRTC is going to be associated with the connector - we already know the capabilities there.

That's not the point I was trying to get across.

> If you'd like to rephrase in a more concise manner then I may read your message.

Maybe I am just not explaining it well enough. Here it is in the shortest way I can do it.

* i.MX6Q - two discrete IPUs, each with two DIs. The IPUs have independent clocks, up to 200MHz for the HSC.
* i.MX53/50 - one discrete IPU with two DIs. IPU clock up to 150MHz for the HSC.
* i.MX51 - one discrete IPU with two DIs. IPU clock up to 133MHz for the HSC.

Same driver.

The DI can divide the HSC clock, or accept an external clock, to generate - using a fractional divider - any clock rate it wants. This clock becomes the "fundamental timebase" for DISP_CLK, which is generated as expected, based on "up" and "down" period values against that fundamental timebase. Whether HSC clock or external clock, the maximum DISP_CLK which goes out to the display is bounded by the current HSC clock rate.

Two problems.
* i.MX51 maximum clock rate (133MHz) is lower than the standard pixel clock rate for 1080p60 and other high-resolution modes with high refresh rates.
* DI_BS_CLKGEN0 and 1 need to be programmed correctly to generate a correct DISP_CLK out.

Problem without solution (i.e. DRM filters modes back-asswards):

* The i.MX51 connector is probably not part of the SoC, but an independent and totally divorced component from the IPU. One example is an IPU parallel display connected to an active HDMI encoder, which may be provided by companies such as Texas Instruments or Silicon Image. These things exist in the wild. It is not reasonable for a Silicon Image transmitter driver, as a generic i2c device (which it really is), to have intimate knowledge of the i.MX51. It is not reasonable for the Silicon Image transmitter to be predicated upon a specific SoC. DRM offers no way for an encoder/connector to find out this information from the correct place - the CRTC driver - at the time it fetches EDID from the monitor, or gains mode information some other way. DRM only offers a way to use the maximum TMDS clock from the EDID to cull modes the *monitor* can't handle in the CVT/GTF mode timing case (or disable usage of CVT/GTF completely).

Problem with solution (i.e. don't just strip fractions please):

The current programming methodology for DI_BS_CLKGEN1 only takes into account the exact case where the DI_BS_CLKGEN0 divider is an integer and is stuffed into the CLKGEN1 "down period" shifted one bit right, which is why the "strip the fractions" part works. CLKGEN1 only has a Q8.1 fractional divider whereas CLKGEN0 has a Q12.4 fractional divider. Any use of anything but the integer part of the CLKGEN0 divider probably cannot be represented as an exact value in CLKGEN1 without the original parent clock being multiplied by a suitable value, or the CLKGEN0 divider (Q12.4) being further adjusted to allow valid CLKGEN1 values (Q8.1). Using CLKGEN1 better and doing more comprehensive 'upstream' clock management may give better results.

In the configurations Alex and yourself tried, it is not possible to derive the correct DISP_CLK from the fundamental timebase created by CLKGEN0 using the values CLKGEN1 was programmed with. Previous mail: a GEN0 divider of 3.5 ends up as a GEN1 down period of 1.5. A divider of 3.0 ends up as a GEN1 down period of 1.5. In this case the solution is to correct the fundamental timebase such that it is possible to generate that value.

Assuming an input clock of 133MHz and an expected rate of 38MHz, the simplest-case divider is 133/38 = 3.5. With the current code, the "down period" will be 1.5 (i.e. 3/2), but it should be 1.75, which we cannot represent because of the single fraction bit.

Possible fix? Set the divider to 1.75 instead. The DI clock generator will create a 76MHz clock as the fundamental timebase. This *shouldn't* go directly out on any pins, but it's used to synchronize everything else on the waveform generators. What I could never work out is what to do next: set the GEN1 down period to 3.5? Or 2.0? The diagrams don't explain very clearly whether the period is the relation from timebase to parent clock, or timebase to DISP_CLK, but it seems 2.0 would be the correct one in this instance (it makes more sense given the current working divider/down period settings).
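Purely to make the arithmetic of that suggestion easy to check (this is Matt's hypothesis about the DI, not a verified programming model), the worked numbers:

#include <stdio.h>

/* 133MHz HSC, 38MHz wanted - the example above. */
int main(void)
{
        double hsc = 133e6, pixel = 38e6;

        double div_direct = hsc / pixel;        /* 3.5: wants a GEN1 down period of 1.75, not representable in Q8.1 */
        double div_alt = 1.75;                  /* proposed GEN0 divider instead */
        double timebase = hsc / div_alt;        /* 76MHz fundamental timebase */
        double down = timebase / pixel;         /* 2.0: representable in Q8.1 */

        printf("direct div %.2f, alt timebase %.1f MHz, down period %.2f\n",
               div_direct, timebase / 1e6, down);
        return 0;
}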
Here's my notes from sometime in 2010:

"When the divider is calculated for the pixel clock, what it should be trying to generate is a Q8.1 number for the "down" and/or "up" periods in GEN1 (you could just set one of them, dependent on clock polarity setting, assuming you want your data clocked at the edge of the input clock.. all stuff the driver kinda glosses over) rather than concentrating on the fractional divider in GEN0 first. The Q12.4 divider is there so you have more flexibility when you find a perfect Q8.1 for the up/down clock periods depending on the input clock."

Apart from that contrived case above (I used Alex's case), I couldn't sit down and write that algorithm even if I had all the coffee in Colombia (and some other stuff too); I've had 3 years to try and just didn't. I'm dealing with getting the platform that explains the cases I have up and running in the first place, dealing with *all* the same problems you are, with the added problem of the ARM sometimes not working. I'm not rushing to get it upstream by a deadline - sometimes my wife complains that I spend too much time in my office after work - so when she lets me be, and it actually boots reliably, and I figure out a couple of U-Boot problems, maybe we'll get in sync on the issue and be fixing the same things in collaboration, but until then...

I'm suggesting you might want to do it as it might cause fewer weird results later down the road. Of course if you TL;DR it, then you can live with the broken display driver for as long as you like.

Concise? Thanks :)

Matt Sealey <neko@bakuhatsu.net>
On Sat, Nov 09, 2013 at 12:11:51AM -0600, Matt Sealey wrote:
> That's exactly what I was saying in the first place, and what Alex Deucher said to me a year ago after I spent 2 days straight crashing an i.MX51 trying to figure out a way to get the connector to ask the CRTC for the right information.

I still completely disagree with you on that. I've yet to see a situation where you have different restrictions on what can be done depending on which "CRTC" in the SoC you connect an interface to.

> In the case of "card drivers" for DRM, on the kernel side, having a "card" driver per board to micromanage settings will get as unwieldy as having multiple platform-specific xorg-video-modesetting patches,

*sigh* completely the wrong end of the stick, and I bet most of the rest of this email is pounded out based on that.

> I am betting the one you "have *right* *now*" is the above, you wrote a driver which, given a very specific set of device tree entries with specific compatible properties in combination, specifically initializes several i.MX6-specific components together in the right order. This also won't scale, going forward.

What utter rubbish. I have nothing further to add other than you are wrong. I'm not going to give you any further information for you to pound your poor keyboard with to compose another verbose reply which has the wrong end of the stick.

> > If you'd like to rephrase in a more concise manner then I may read your message.
>
> Maybe I am just not explaining it well enough. Here it is in the shortest way I can do it.
>
> * i.MX6Q - two discrete IPU with two DI. IPU have independent clock, up to 200MHz for HSC.
> * i.MX53/50 - one discrete IPU with two DI. IPU clock up to 150MHz for HSC.
> * i.MX51 - one discrete IPU with two DI. IPU clock up to 133MHz for HSC.
>
> Same driver.

Right, so what you've just told me in a concise way is that the restrictions are per-SoC, not per-CRTC as you've just been claiming in your ridiculously verbose emails (the rest of your email cut without reading.)

So what we do is we arrange for the imx-drm connector layers to call into the imx-drm-core layer, to ask the imx-drm CRTC layer what the maximum pixel clock is. We then have the connector compare the maximum pixel clock rate with the clock rate required for the mode, and if the mode has a pixel clock rate greater than the CRTC can provide, we return MODE_CLOCK_HIGH.

If done correctly, this doesn't require any code in the connector layer: you arrange for imx-drm-core to provide a _generic_ cross-connector function which provides this information, and you point the connector's .mode_valid at that generic function.

There is no problem here.
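A minimal sketch of what that generic callback could look like (illustrative only: imx_drm_max_pixelclock() is a hypothetical stand-in for however imx-drm-core reports the per-SoC limit; the DRM parts - mode->clock in kHz, MODE_OK/MODE_CLOCK_HIGH, and the connector helper's mode_valid taking only the connector and the mode - are the existing API):

#include <drm/drm_crtc.h>
#include <drm/drm_crtc_helper.h>

/*
 * Generic mode_valid: prune any mode whose pixel clock exceeds what the
 * SoC's display path can generate.  Every connector points its
 * drm_connector_helper_funcs.mode_valid at this one function.
 */
static int imx_drm_connector_mode_valid(struct drm_connector *connector,
                                        struct drm_display_mode *mode)
{
        /* hypothetical: maximum pixel clock in kHz, per SoC */
        unsigned long max_khz = imx_drm_max_pixelclock(connector);

        if (mode->clock > max_khz)
                return MODE_CLOCK_HIGH;

        return MODE_OK;
}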
diff --git a/drivers/staging/imx-drm/ipu-v3/ipu-di.c b/drivers/staging/imx-drm/ipu-v3/ipu-di.c
index d766e18bfca0..82a9ebad697c 100644
--- a/drivers/staging/imx-drm/ipu-v3/ipu-di.c
+++ b/drivers/staging/imx-drm/ipu-v3/ipu-di.c
@@ -19,9 +19,6 @@
 #include <linux/io.h>
 #include <linux/err.h>
 #include <linux/platform_device.h>
-#include <linux/clk.h>
-#include <linux/clk-provider.h>
-#include <linux/clkdev.h>
 
 #include "imx-ipu-v3.h"
 #include "ipu-prv.h"
@@ -33,10 +30,7 @@ struct ipu_di {
 	struct clk *clk_di;	/* display input clock */
 	struct clk *clk_ipu;	/* IPU bus clock */
 	struct clk *clk_di_pixel; /* resulting pixel clock */
-	struct clk_hw clk_hw_out;
-	char *clk_name;
 	bool inuse;
-	unsigned long clkflags;
 	struct ipu_soc *ipu;
 };
 
@@ -141,130 +135,6 @@ static inline void ipu_di_write(struct ipu_di *di, u32 value, unsigned offset)
 	writel(value, di->base + offset);
 }
 
-static int ipu_di_clk_calc_div(unsigned long inrate, unsigned long outrate)
-{
-	u64 tmp = inrate;
-	int div;
-
-	tmp *= 16;
-
-	do_div(tmp, outrate);
-
-	div = tmp;
-
-	if (div < 0x10)
-		div = 0x10;
-
-#ifdef WTF_IS_THIS
-	/*
-	 * Freescale has this in their Kernel. It is neither clear what
-	 * it does nor why it does it
-	 */
-	if (div & 0x10)
-		div &= ~0x7;
-	else {
-		/* Round up divider if it gets us closer to desired pix clk */
-		if ((div & 0xC) == 0xC) {
-			div += 0x10;
-			div &= ~0xF;
-		}
-	}
-#endif
-	return div;
-}
-
-static unsigned long clk_di_recalc_rate(struct clk_hw *hw,
-		unsigned long parent_rate)
-{
-	struct ipu_di *di = container_of(hw, struct ipu_di, clk_hw_out);
-	unsigned long outrate;
-	u32 div = ipu_di_read(di, DI_BS_CLKGEN0);
-
-	if (div < 0x10)
-		div = 0x10;
-
-	outrate = (parent_rate / div) * 16;
-
-	return outrate;
-}
-
-static long clk_di_round_rate(struct clk_hw *hw, unsigned long rate,
-		unsigned long *prate)
-{
-	struct ipu_di *di = container_of(hw, struct ipu_di, clk_hw_out);
-	unsigned long outrate;
-	int div;
-	u32 val;
-
-	div = ipu_di_clk_calc_div(*prate, rate);
-
-	outrate = (*prate / div) * 16;
-
-	val = ipu_di_read(di, DI_GENERAL);
-
-	if (!(val & DI_GEN_DI_CLK_EXT) && outrate > *prate / 2)
-		outrate = *prate / 2;
-
-	dev_dbg(di->ipu->dev,
-		"%s: inrate: %ld div: 0x%08x outrate: %ld wanted: %ld\n",
-		__func__, *prate, div, outrate, rate);
-
-	return outrate;
-}
-
-static int clk_di_set_rate(struct clk_hw *hw, unsigned long rate,
-		unsigned long parent_rate)
-{
-	struct ipu_di *di = container_of(hw, struct ipu_di, clk_hw_out);
-	int div;
-	u32 clkgen0;
-
-	clkgen0 = ipu_di_read(di, DI_BS_CLKGEN0) & ~0xfff;
-
-	div = ipu_di_clk_calc_div(parent_rate, rate);
-
-	ipu_di_write(di, clkgen0 | div, DI_BS_CLKGEN0);
-
-	dev_dbg(di->ipu->dev, "%s: inrate: %ld desired: %ld div: 0x%08x\n",
-		__func__, parent_rate, rate, div);
-	return 0;
-}
-
-static u8 clk_di_get_parent(struct clk_hw *hw)
-{
-	struct ipu_di *di = container_of(hw, struct ipu_di, clk_hw_out);
-	u32 val;
-
-	val = ipu_di_read(di, DI_GENERAL);
-
-	return val & DI_GEN_DI_CLK_EXT ? 1 : 0;
-}
-
-static int clk_di_set_parent(struct clk_hw *hw, u8 index)
-{
-	struct ipu_di *di = container_of(hw, struct ipu_di, clk_hw_out);
-	u32 val;
-
-	val = ipu_di_read(di, DI_GENERAL);
-
-	if (index)
-		val |= DI_GEN_DI_CLK_EXT;
-	else
-		val &= ~DI_GEN_DI_CLK_EXT;
-
-	ipu_di_write(di, val, DI_GENERAL);
-
-	return 0;
-}
-
-static struct clk_ops clk_di_ops = {
-	.round_rate = clk_di_round_rate,
-	.set_rate = clk_di_set_rate,
-	.recalc_rate = clk_di_recalc_rate,
-	.set_parent = clk_di_set_parent,
-	.get_parent = clk_di_get_parent,
-};
-
 static void ipu_di_data_wave_config(struct ipu_di *di,
 			int wave_gen,
 			int access_size, int component_size)
@@ -528,42 +398,58 @@ static void ipu_di_sync_config_noninterlaced(struct ipu_di *di,
 		ipu_di_sync_config(di, cfg_vga, 0, ARRAY_SIZE(cfg_vga));
 }
 
-int ipu_di_init_sync_panel(struct ipu_di *di, struct ipu_di_signal_cfg *sig)
+static void ipu_di_config_clock(struct ipu_di *di,
+	const struct ipu_di_signal_cfg *sig)
 {
-	u32 reg;
-	u32 di_gen, vsync_cnt;
-	u32 div;
-	u32 h_total, v_total;
-	int ret;
-	unsigned long round;
-	struct clk *parent;
+	struct clk *clk;
+	unsigned clkgen0;
+	uint32_t val;
 
-	dev_dbg(di->ipu->dev, "disp %d: panel size = %d x %d\n",
-		di->id, sig->width, sig->height);
+	if (sig->clkflags & IPU_DI_CLKMODE_EXT) {
+		/*
+		 * CLKMODE_EXT means we must use the DI clock: this is
+		 * needed for things like LVDS which needs to feed the
+		 * DI and LDB with the same pixel clock.
+		 */
+		clk = di->clk_di;
+
+		if (sig->clkflags & IPU_DI_CLKMODE_SYNC) {
+			/*
+			 * CLKMODE_SYNC means that we want the DI to be
+			 * clocked at the same rate as the parent clock.
+			 * This is needed (eg) for LDB which needs to be
+			 * fed with the same pixel clock. We assume that
+			 * the LDB clock has already been set correctly.
+			 */
+			clkgen0 = 1 << 4;
+		} else {
+			/*
+			 * We can use the divider. We should really have
+			 * a flag here indicating whether the bridge can
+			 * cope with a fractional divider or not. For the
+			 * time being, let's go for simplicity and
+			 * reliability.
+			 */
+			unsigned long in_rate;
+			unsigned div;
 
-	if ((sig->v_sync_width == 0) || (sig->h_sync_width == 0))
-		return -EINVAL;
+			clk_set_rate(clk, sig->pixelclock);
 
-	dev_dbg(di->ipu->dev, "Clocks: IPU %luHz DI %luHz Needed %luHz\n",
-		clk_get_rate(di->clk_ipu),
-		clk_get_rate(di->clk_di),
-		sig->pixelclock);
+			in_rate = clk_get_rate(clk);
+			div = (in_rate + sig->pixelclock / 2) / sig->pixelclock;
+			if (div == 0)
+				div = 1;
 
-	/*
-	 * CLKMODE_EXT means we must use the DI clock: this is needed
-	 * for things like LVDS which needs to feed the DI and LDB with
-	 * the same pixel clock.
-	 *
-	 * For other interfaces, we can arbitrarily select between the DI
-	 * specific clock and the internal IPU clock. See DI_GENERAL
-	 * bit 20. We select the IPU clock if it can give us a clock
-	 * rate within 1% of the requested frequency, otherwise we use
-	 * the DI clock.
-	 */
-	round = sig->pixelclock;
-	if (sig->clkflags & IPU_DI_CLKMODE_EXT) {
-		parent = di->clk_di;
+			clkgen0 = div << 4;
+		}
 	} else {
+		/*
+		 * For other interfaces, we can arbitrarily select between
+		 * the DI specific clock and the internal IPU clock. See
+		 * DI_GENERAL bit 20. We select the IPU clock if it can
+		 * give us a clock rate within 1% of the requested frequency,
+		 * otherwise we use the DI clock.
+		 */
 		unsigned long rate, clkrate;
 		unsigned div, error;
 
@@ -578,54 +464,80 @@ int ipu_di_init_sync_panel(struct ipu_di *di, struct ipu_di_signal_cfg *sig)
 
 		/* Allow a 1% error */
 		if (error < 1010 && error >= 990) {
-			parent = di->clk_ipu;
+			clk = di->clk_ipu;
+
+			clkgen0 = div << 4;
 		} else {
-			parent = di->clk_di;
+			unsigned long in_rate;
+			unsigned div;
+
+			clk = di->clk_di;
 
-			ret = clk_set_rate(parent, sig->pixelclock);
-			if (ret)
-				dev_err(di->ipu->dev, "Setting of DI clock failed: %d\n", ret);
+			clk_set_rate(clk, sig->pixelclock);
 
-			/* Use the integer divisor rate - avoid fractional dividers */
-			round = rate;
+			in_rate = clk_get_rate(clk);
+			div = (in_rate + sig->pixelclock / 2) / sig->pixelclock;
+			if (div == 0)
+				div = 1;
+
+			clkgen0 = div << 4;
 		}
 	}
 
-	ret = clk_set_parent(di->clk_di_pixel, parent);
-	if (ret) {
-		dev_err(di->ipu->dev,
-			"setting pixel clock to parent %s failed with %d\n",
-			__clk_get_name(parent), ret);
-		return ret;
-	}
+	di->clk_di_pixel = clk;
+
+	/* Set the divider */
+	ipu_di_write(di, clkgen0, DI_BS_CLKGEN0);
 
 	/*
-	 * CLKMODE_SYNC means that we want the DI to be clocked at the
-	 * same rate as the parent clock. This is needed (eg) for LDB
-	 * which needs to be fed with the same pixel clock.
-	 *
-	 * Note: clk_set_rate(clk, clk_round_rate(clk, rate)) is the
-	 * same as clk_set_rate(clk, rate);
+	 * Set the high/low periods. Bits 24:16 give us the falling edge,
+	 * and bits 8:0 give the rising edge. LSB is fraction, and is
+	 * based on the divider above. We want a 50% duty cycle, so set
+	 * the falling edge to be half the divider.
 	 */
-	if (sig->clkflags & IPU_DI_CLKMODE_SYNC)
-		round = clk_get_rate(parent);
+	ipu_di_write(di, (clkgen0 >> 4) << 16, DI_BS_CLKGEN1);
 
-	ret = clk_set_rate(di->clk_di_pixel, round);
+	/* Finally select the input clock */
+	val = ipu_di_read(di, DI_GENERAL) & ~DI_GEN_DI_CLK_EXT;
+	if (clk == di->clk_di)
+		val |= DI_GEN_DI_CLK_EXT;
+	ipu_di_write(di, val, DI_GENERAL);
 
-	dev_dbg(di->ipu->dev, "Want %luHz IPU %luHz DI %luHz using %s, got %luHz\n",
+	dev_dbg(di->ipu->dev, "Want %luHz IPU %luHz DI %luHz using %s, %luHz\n",
		sig->pixelclock,
		clk_get_rate(di->clk_ipu),
		clk_get_rate(di->clk_di),
-		parent == di->clk_di ? "DI" : "IPU",
-		clk_get_rate(di->clk_di_pixel));
+		clk == di->clk_di ? "DI" : "IPU",
+		clk_get_rate(di->clk_di_pixel) / (clkgen0 >> 4));
+}
+
+int ipu_di_init_sync_panel(struct ipu_di *di, struct ipu_di_signal_cfg *sig)
+{
+	u32 reg;
+	u32 di_gen, vsync_cnt;
+	u32 div;
+	u32 h_total, v_total;
+
+	dev_dbg(di->ipu->dev, "disp %d: panel size = %d x %d\n",
+		di->id, sig->width, sig->height);
+
+	if ((sig->v_sync_width == 0) || (sig->h_sync_width == 0))
+		return -EINVAL;
 
 	h_total = sig->width + sig->h_sync_width + sig->h_start_width +
 		sig->h_end_width;
 	v_total = sig->height + sig->v_sync_width + sig->v_start_width +
 		sig->v_end_width;
 
+	dev_dbg(di->ipu->dev, "Clocks: IPU %luHz DI %luHz Needed %luHz\n",
+		clk_get_rate(di->clk_ipu),
+		clk_get_rate(di->clk_di),
+		sig->pixelclock);
+
 	mutex_lock(&di_mutex);
 
+	ipu_di_config_clock(di, sig);
+
 	div = ipu_di_read(di, DI_BS_CLKGEN0) & 0xfff;
 	div = div / 16;		/* Now divider is integer portion */
@@ -709,7 +621,11 @@ EXPORT_SYMBOL_GPL(ipu_di_init_sync_panel);
 
 int ipu_di_enable(struct ipu_di *di)
 {
-	int ret = clk_prepare_enable(di->clk_di_pixel);
+	int ret;
+
+	WARN_ON(IS_ERR(di->clk_di_pixel));
+
+	ret = clk_prepare_enable(di->clk_di_pixel);
 
 	if (ret)
 		return ret;
@@ -721,6 +637,8 @@ EXPORT_SYMBOL_GPL(ipu_di_enable);
 
 int ipu_di_disable(struct ipu_di *di)
 {
+	WARN_ON(IS_ERR(di->clk_di_pixel));
+
 	ipu_module_disable(di->ipu, di->module);
 
 	clk_disable_unprepare(di->clk_di_pixel);
@@ -776,13 +694,6 @@ int ipu_di_init(struct ipu_soc *ipu, struct device *dev, int id,
 		u32 module, struct clk *clk_ipu)
 {
 	struct ipu_di *di;
-	int ret;
-	const char *di_parent[2];
-	struct clk_init_data init = {
-		.ops = &clk_di_ops,
-		.num_parents = 2,
-		.flags = 0,
-	};
 
 	if (id > 1)
 		return -ENODEV;
@@ -804,45 +715,16 @@ int ipu_di_init(struct ipu_soc *ipu, struct device *dev, int id,
 	if (!di->base)
 		return -ENOMEM;
 
-	di_parent[0] = __clk_get_name(di->clk_ipu);
-	di_parent[1] = __clk_get_name(di->clk_di);
-
 	ipu_di_write(di, 0x10, DI_BS_CLKGEN0);
 
-	init.parent_names = (const char **)&di_parent;
-
-	di->clk_name = kasprintf(GFP_KERNEL, "%s_di%d_pixel",
-			dev_name(dev), id);
-	if (!di->clk_name)
-		return -ENOMEM;
-
-	init.name = di->clk_name;
-
-	di->clk_hw_out.init = &init;
-	di->clk_di_pixel = clk_register(dev, &di->clk_hw_out);
-
-	if (IS_ERR(di->clk_di_pixel)) {
-		ret = PTR_ERR(di->clk_di_pixel);
-		goto failed_clk_register;
-	}
-
 	dev_dbg(dev, "DI%d base: 0x%08lx remapped to %p\n",
 			id, base, di->base);
 	di->inuse = false;
 	di->ipu = ipu;
 
 	return 0;
-
-failed_clk_register:
-
-	kfree(di->clk_name);
-
-	return ret;
 }
 
 void ipu_di_exit(struct ipu_soc *ipu, int id)
 {
-	struct ipu_di *di = ipu->di_priv[id];
-
-	clk_unregister(di->clk_di_pixel);
-	kfree(di->clk_name);
 }