Message ID | d1559db8eddb5b06747b152ae69b63c3aa66ed09.1541995959.git.fthain@telegraphics.com.au (mailing list archive)
---|---
State | RFC
Series | [RFC,01/13] arm: Fix mutual exclusion in arch_gettimeoffset
On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> Implementations of arch_gettimeoffset are generally not re-entrant
> and assume that interrupts have been disabled. Unfortunately this
> pre-condition got broken in v2.6.32.
>
> Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
> Signed-off-by: Finn Thain <fthain@telegraphics.com.au>

This looks like a sensible fix for -stable backporting.  But with m68k
converted over it might be time for these two arm platforms to either
be converted to the clocksource API or to be dropped..
On Mon, 12 Nov 2018, Christoph Hellwig wrote:

> On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> > Implementations of arch_gettimeoffset are generally not re-entrant and
> > assume that interrupts have been disabled. Unfortunately this
> > pre-condition got broken in v2.6.32.
> >
> > Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
> > Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
>
> This looks like a sensible fix for -stable backporting.  But with m68k
> converted over it might be time for these two arm platforms to either
> be converted to the clocksource API or to be dropped..
>

You could remove the old arch_gettimeoffset API without dropping any
platforms.

If no-one converts a given platform to the clocksource API it would mean
that the default 'jiffies' clocksource will get used on that platform.

Clock resolution and timer precision would be degraded, but that might not
matter.

Anyway, if someone who has this hardware is willing to test a clocksource
API conversion, they can let me know and I'll attempt that patch.
On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
> On Mon, 12 Nov 2018, Christoph Hellwig wrote:
>
> > On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> > > Implementations of arch_gettimeoffset are generally not re-entrant and
> > > assume that interrupts have been disabled. Unfortunately this
> > > pre-condition got broken in v2.6.32.
> > >
> > > Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
> > > Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
> >
> > This looks like a sensible fix for -stable backporting.  But with m68k
> > converted over it might be time for these two arm platforms to either
> > be converted to the clocksource API or to be dropped..
> >
> You could remove the old arch_gettimeoffset API without dropping any
> platforms.
>
> If no-one converts a given platform to the clocksource API it would mean
> that the default 'jiffies' clocksource will get used on that platform.
>
> Clock resolution and timer precision would be degraded, but that might not
> matter.
>
> Anyway, if someone who has this hardware is willing to test a clocksource
> API conversion, they can let me know and I'll attempt that patch.

There are reasons why that's not appropriate - such as not having two
separate timers in order to supply a clocksource and separate clock
event.

Not all hardware is suited to the clocksource + clockevent idea.
On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:

> On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
> >
> > You could remove the old arch_gettimeoffset API without dropping any
> > platforms.
> >
> > If no-one converts a given platform to the clocksource API it would mean
> > that the default 'jiffies' clocksource will get used on that platform.
> >
> > Clock resolution and timer precision would be degraded, but that might not
> > matter.
> >
> > Anyway, if someone who has this hardware is willing to test a clocksource
> > API conversion, they can let me know and I'll attempt that patch.
>
> There are reasons why that's not appropriate - such as not having two
> separate timers in order to supply a clocksource and separate clock
> event.
>
> Not all hardware is suited to the clocksource + clockevent idea.
>

Sorry, I don't follow.

AFAIK, clocksources and clock event devices are orthogonal concepts. There
are platforms with !ARCH_USES_GETTIMEOFFSET && !GENERIC_CLOCKEVENTS (and
every other combination).

A clocksource read method just provides a cycle count, and in this sense
arch_gettimeoffset() is equivalent to a clocksource.

If these two arm platforms have an existing clock event device which
somehow precludes any new clocksources, why doesn't that also render
arch_gettimeoffset() impossible?
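A clocksource in this sense is little more than a read callback plus a
counter mask and a clock rate. A minimal sketch of registering one against
a hypothetical free-running 16-bit timer - the names, the base-address
handling and the 2 MHz rate are placeholders, not the actual RPC or
EBSA110 hardware:

#include <linux/clocksource.h>
#include <linux/init.h>
#include <linux/io.h>

/* Hypothetical free-running 16-bit down-counter clocked at 2 MHz. */
#define EXAMPLE_TIMER_RATE	2000000

static void __iomem *example_timer_base;	/* would be ioremap()ed during setup */

static u64 example_timer_read(struct clocksource *cs)
{
	/* Invert the down-counter so the returned value counts upwards. */
	return ~readw_relaxed(example_timer_base) & 0xffff;
}

static struct clocksource example_clocksource = {
	.name	= "example-timer",
	.rating	= 200,
	.read	= example_timer_read,
	.mask	= CLOCKSOURCE_MASK(16),
	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
};

static int __init example_timer_init(void)
{
	return clocksource_register_hz(&example_clocksource, EXAMPLE_TIMER_RATE);
}

Given the mask and rate, the timekeeping core derives the cycles-to-
nanoseconds conversion and handles wrap-around - the arithmetic that
arch_gettimeoffset() implementations currently open-code.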
On Wed, Nov 14, 2018 at 08:55:37AM +1100, Finn Thain wrote:
> On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:
>
> > On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
> > >
> > > You could remove the old arch_gettimeoffset API without dropping any
> > > platforms.
> > >
> > > If no-one converts a given platform to the clocksource API it would mean
> > > that the default 'jiffies' clocksource will get used on that platform.
> > >
> > > Clock resolution and timer precision would be degraded, but that might not
> > > matter.
> > >
> > > Anyway, if someone who has this hardware is willing to test a clocksource
> > > API conversion, they can let me know and I'll attempt that patch.
> >
> > There are reasons why that's not appropriate - such as not having two
> > separate timers in order to supply a clocksource and separate clock
> > event.
> >
> > Not all hardware is suited to the clocksource + clockevent idea.
> >
>
> Sorry, I don't follow.
>
> AFAIK, clocksources and clock event devices are orthogonal concepts. There
> are platforms with !ARCH_USES_GETTIMEOFFSET && !GENERIC_CLOCKEVENTS (and
> every other combination).
>
> A clocksource read method just provides a cycle count, and in this sense
> arch_gettimeoffset() is equivalent to a clocksource.

A clocksource provides a cycle counter that monotonically changes and
does not wrap between clockevent events.

A clock event is responsible for providing events to the system when
some work needs to be done, limited by the wrap interval of the
clocksource.

Each time the clock event triggers an interrupt, the clocksource is
read to determine how much time has passed, using:

	count = (new_value - old_value) & available_bits
	nanosecs = count * scale >> shift;

If you try to combine the clocksource and clockevent because you only
have a single counter, and the counter has the behaviour of:
- counting down towards zero
- when reaching zero, triggers an interrupt, and reloads with N

then this provides your clockevent, but you can't use this as a clock
source, because each time you receive an interrupt and try to read the
counter value, it will be approximately the same value.  This means
that the above calculation fails to register the correct number of
nanoseconds passing.  Hence, this does not work.

Also note where I said above that the clock event device must be able
to provide an interrupt _before_ the clocksource wraps - clearly with
such a timer, that is utterly impossible.

The simple solution is to not use such a counter as the clocksource,
which means you fall back to using the jiffies clocksource, and your
timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
want a 1kHz timer interval.  For most applications, that's simply way
too coarse, as was realised in the very early days of Linux.

If only there was a way to interpolate between timer interrupts...
which is exactly what arch_gettimeoffset() does, and is a historical
remnant of the pre-clocksource/clockevent days of the kernel - but
it is the only way to get better resolution from this sort of setup.
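To make the failure mode concrete, here is a small stand-alone sketch of
the delta arithmetic described above. The 16-bit mask and the mult/shift
pair (equivalent to 500 ns per tick) are made-up illustrations, not values
from any real driver: a free-running counter yields a delta that reflects
the elapsed time, while an auto-reload down-counter sampled just after its
own interrupt reads nearly the same value each time, so almost a whole
period goes unaccounted.

#include <stdint.h>
#include <stdio.h>

/* Illustrative values only: a 16-bit counter, 500 ns/tick as mult >> shift. */
#define COUNTER_MASK	0xffffu
#define MULT		512000u		/* 500 << SHIFT */
#define SHIFT		10

/* The generic timekeeping arithmetic: delta in ticks -> nanoseconds. */
static uint64_t delta_to_ns(uint32_t new_value, uint32_t old_value)
{
	uint32_t count = (new_value - old_value) & COUNTER_MASK;

	return ((uint64_t)count * MULT) >> SHIFT;
}

int main(void)
{
	/*
	 * Free-running counter: reads taken at successive clock events
	 * differ by the elapsed ticks, so the delta is meaningful
	 * (here ~9.87 ms).
	 */
	printf("free-running counter: %llu ns\n",
	       (unsigned long long)delta_to_ns(0x8a20, 0x3d00));

	/*
	 * Auto-reload down-counter read from its own interrupt handler:
	 * it has just been reloaded, so the old and new samples are nearly
	 * equal and only ~2 ticks (1000 ns) are accounted for instead of
	 * a full 10 ms period.
	 */
	printf("reload down-counter:  %llu ns\n",
	       (unsigned long long)delta_to_ns(0x4e22, 0x4e20));

	return 0;
}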
On 14/11/18 12:43 PM, Russell King - ARM Linux wrote:
> On Wed, Nov 14, 2018 at 08:55:37AM +1100, Finn Thain wrote:
>> On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:
>>
>>> On Tue, Nov 13, 2018 at 02:39:00PM +1100, Finn Thain wrote:
>>>> You could remove the old arch_gettimeoffset API without dropping any
>>>> platforms.
>>>>
>>>> If no-one converts a given platform to the clocksource API it would mean
>>>> that the default 'jiffies' clocksource will get used on that platform.
>>>>
>>>> Clock resolution and timer precision would be degraded, but that might not
>>>> matter.
>>>>
>>>> Anyway, if someone who has this hardware is willing to test a clocksource
>>>> API conversion, they can let me know and I'll attempt that patch.
>>> There are reasons why that's not appropriate - such as not having two
>>> separate timers in order to supply a clocksource and separate clock
>>> event.
>>>
>>> Not all hardware is suited to the clocksource + clockevent idea.
>>>
>> Sorry, I don't follow.
>>
>> AFAIK, clocksources and clock event devices are orthogonal concepts. There
>> are platforms with !ARCH_USES_GETTIMEOFFSET && !GENERIC_CLOCKEVENTS (and
>> every other combination).
>>
>> A clocksource read method just provides a cycle count, and in this sense
>> arch_gettimeoffset() is equivalent to a clocksource.
> A clocksource provides a cycle counter that monotonically changes and
> does not wrap between clockevent events.
>
> A clock event is responsible for providing events to the system when
> some work needs to be done, limited by the wrap interval of the
> clocksource.
>
> Each time the clock event triggers an interrupt, the clocksource is
> read to determine how much time has passed, using:
>
> count = (new_value - old_value) & available_bits
> nanosecs = count * scale >> shift;
>
> If you try to combine the clocksource and clockevent because you only
> have a single counter, and the counter has the behaviour of:
> - counting down towards zero
> - when reaching zero, triggers an interrupt, and reloads with N
>
> then this provides your clockevent, but you can't use this as a clock
> source, because each time you receive an interrupt and try to read the
> counter value, it will be approximately the same value.  This means
> that the above calculation fails to register the correct number of
> nanoseconds passing.  Hence, this does not work.

So we'd still have to use jiffies + interpolation from the current timer
rundown counter as clocksource (since that will be monotonic and won't
wrap)?

The way Finn did the clocksource for m68k, the clocksource counter does
behave as it should (monotonic, doesn't wrap) at least so far as I've
tested. Using the same timer for clocksource and clock events will degrade
timekeeping based on timer events to jiffies resolution, as you point out.
That would already have been the case before, so no gain in resolution.
Other timekeeping would have worked at higher resolution before
(interpolating through arch_gettimeoffset) just the same as now
(interpolating through clocksource reads). Finn's clocksource read code
essentially is arch_gettimeoffset under a new name.

Are you saying that's not possible on arm, because the current timer
rundown counter can't be read while the timer is running?

If I were to run a second timer at higher rate for clocksource, but
keeping the 10 ms timer as clock event (could easily do that by using
timer D on Atari Falcon) - how would that improve my timekeeping? Clock
events still only happen 10 ms apart ...

Cheers, Michael

> Also note where I said above that the clock event device must be able
> to provide an interrupt _before_ the clocksource wraps - clearly with
> such a timer, that is utterly impossible.
>
> The simple solution is to not use such a counter as the clocksource,
> which means you fall back to using the jiffies clocksource, and your
> timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
> want a 1kHz timer interval.  For most applications, that's simply way
> too coarse, as was realised in the very early days of Linux.
>
> If only there was a way to interpolate between timer interrupts...
> which is exactly what arch_gettimeoffset() does, and is a historical
> remnant of the pre-clocksource/clockevent days of the kernel - but
> it is the only way to get better resolution from this sort of setup.
>
On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:
>
> A clocksource provides a cycle counter that monotonically changes and
> does not wrap between clockevent events.
>
> A clock event is responsible for providing events to the system when
> some work needs to be done, limited by the wrap interval of the
> clocksource.
>
> Each time the clock event triggers an interrupt, the clocksource is
> read to determine how much time has passed, using:
>
> count = (new_value - old_value) & available_bits
> nanosecs = count * scale >> shift;
>
> If you try to combine the clocksource and clockevent because you only
> have a single counter, and the counter has the behaviour of:
> - counting down towards zero
> - when reaching zero, triggers an interrupt, and reloads with N
>
> then this provides your clockevent, but you can't use this as a clock
> source, because each time you receive an interrupt and try to read the
> counter value, it will be approximately the same value.  This means
> that the above calculation fails to register the correct number of
> nanoseconds passing.  Hence, this does not work.
>
> Also note where I said above that the clock event device must be able
> to provide an interrupt _before_ the clocksource wraps - clearly with
> such a timer, that is utterly impossible.
>
> The simple solution is to not use such a counter as the clocksource,
> which means you fall back to using the jiffies clocksource, and your
> timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
> want a 1kHz timer interval.  For most applications, that's simply way
> too coarse, as was realised in the very early days of Linux.
>
> If only there was a way to interpolate between timer interrupts...
> which is exactly what arch_gettimeoffset() does, and is a historical
> remnant of the pre-clocksource/clockevent days of the kernel - but
> it is the only way to get better resolution from this sort of setup.
>

Both of the platforms in question (RPC and EBSA110) have not
defined(CONFIG_GENERIC_CLOCKEVENTS) and have not defined any struct
clock_event_device, AFAICT.

So, even assuming that you're right about the limitations of single-timer
platforms in general, removal of arch_gettimeoffset wouldn't require the
removal of any platforms, AFAICT.
On Wed, Nov 14, 2018 at 02:35:29PM +1300, Michael Schmitz wrote:
> So we'd still have to use jiffies + interpolation from the current timer
> rundown counter as clocksource (since that will be monotonic and won't
> wrap)?
>
> The way Finn did the clocksource for m68k, the clocksource counter does
> behave as it should (monotonic, doesn't wrap) at least so far as I've
> tested. Using the same timer for clocksource and clock events will degrade
> timekeeping based on timer events to jiffies resolution, as you point out.
> That would already have been the case before, so no gain in resolution.

... and that is where you are wrong.  RPC, for example, has
gettimeofday() resolution of 500ns.  Removing gettimeoffset() will
result in a resolution of 1/HZ since there is no longer any
interpolation between interrupts.

> Other timekeeping would have worked at higher resolution before
> (interpolating through arch_gettimeoffset) just the same as now
> (interpolating through clocksource reads). Finn's clocksource read code
> essentially is arch_gettimeoffset under a new name.

Where is this code - all I've seen is code adding IRQ exclusion in the
gettimeoffset method.  If some other solution is being proposed, then
it's no good beating about the bush - show the damn code, and then
that can be considered.

However, what has been said in this thread is basically "remove the
gettimeoffset method", which is _not_ acceptable, it will cause
gettimeofday() on these platforms to lose resolution, which is a user
visible REGRESSION, plain and simple.

If what is actually meant is "we have a replacement for gettimeoffset"
then, and I'm sure we all know this, there needs to be a patch proposed
showing what is being proposed, rather than waffling around the issue.

> Are you saying that's not possible on arm, because the current timer rundown
> counter can't be read while the timer is running?
>
> If I were to run a second timer at higher rate for clocksource, but keeping
> the 10 ms timer as clock event (could easily do that by using timer D on
> Atari Falcon) - how would that improve my timekeeping? Clock events still
> only happen 10 ms apart ...

Ah, I think you're talking about something else.

You seem to be talking about what happens when time keeping interrupts
happen.

I'm talking about the resolution of gettimeofday() and the other
interfaces that are used (eg) for packet capture timestamping and
the like - the _user_ visible effects of the timekeeping system.

With the existing implementation, all these have better-than-jiffy
resolution - in the case of RPC, that's 500ns, rather than 10ms
which would be the case without gettimeoffset().  Removing
gettimeoffset() as Finn has proposed without preserving that
resolution is simply completely unacceptable.
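To put numbers on that (simple arithmetic from the figures quoted above,
assuming HZ=100, which the 10ms figure implies): a 500ns gettimeofday()
step corresponds to an interpolating counter running at 1 / 500ns = 2MHz,
whereas the jiffies clocksource only advances every 10ms = 10,000,000ns -
roughly 20,000 times coarser for every caller of these interfaces.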
On Mon, Nov 12, 2018 at 03:12:39PM +1100, Finn Thain wrote:
> Implementations of arch_gettimeoffset are generally not re-entrant
> and assume that interrupts have been disabled. Unfortunately this
> pre-condition got broken in v2.6.32.

To me, it looks way worse than what you think.

The original code sequence before conversion to arch_gettimeoffset() was:

1. lock (inc. disabling IRQs)
2. read gettimeoffset correction
3. add offset to current xtime
4. unlock

This means there is no chance for a timer interrupt to happen between
reading the current offset and adding it to the current kernel time.
If the timer does roll over, then the interrupt will remain pending, so
the whole "read offset and add to xtime" is one atomic block and we can
use the pending interrupt to indicate whether the timer has rolled over
and apply the appropriate offset correction.

With the arch_gettimeoffset() code, that changes:

1. read clocksource (which will be jiffies)
2. compute clocksource delta
3. increment nanoseconds
4. add gettimeoffset correction

If we receive a timer interrupt part-way through this, for example,
between 2 and 3, then jiffies will increment (via do_timer()) and the
interrupt will be cleared.  This means that the check in
ioc_timer_gettimeoffset() for a pending interrupt, for example, will be
false, and we will not know that an interrupt has happened.

So, unless I'm missing something, commit 5cfc8ee0bb51 well and truly
broke the accuracy of timekeeping on these platforms, and will result
in timekeeping spontaneously jumping backwards.  I had been wondering
why NTP had become less stable on EBSA110, but never investigated.

However, that still does not justify blowing away this functionality
without replacement, as Finn has been proposing.
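For illustration, a simplified gettimeoffset-style routine with the
rollover correction referred to above. The constants and accessor
functions are placeholders (a hypothetical 2 MHz, 16-bit down-counter),
not the real RPC or EBSA110 code; the comment marks where the
post-2.6.32 calling sequence breaks the assumption:

#include <linux/types.h>

/*
 * Placeholder definitions, not the real RPC/EBSA110 ones: a 16-bit
 * down-counter clocked at 2 MHz and reloaded with EXAMPLE_LATCH
 * every 10 ms.
 */
#define EXAMPLE_LATCH		20000
#define EXAMPLE_NSECS_PER_TICK	500

extern u16 example_read_counter(void);		/* latched counter read */
extern bool example_timer_irq_pending(void);	/* hypothetical pending test */

static u32 example_gettimeoffset(void)
{
	u32 count, nsecs;

	count = example_read_counter();

	/* Nanoseconds elapsed since the counter was last reloaded. */
	nsecs = (EXAMPLE_LATCH - count) * EXAMPLE_NSECS_PER_TICK;

	/*
	 * Rollover correction: if the counter has reloaded but its
	 * interrupt has not been serviced yet, jiffies has not advanced,
	 * so a full period must be added here.
	 *
	 * The race: with interrupts enabled, the timer IRQ can fire after
	 * the caller has sampled the jiffies clocksource but before this
	 * test runs.  The handler increments jiffies and clears the
	 * pending flag, so a near-zero offset gets added to a stale
	 * jiffies sample, and the result can be earlier than a reading
	 * taken just before the interrupt - time appears to jump backwards.
	 */
	if (example_timer_irq_pending())
		nsecs += EXAMPLE_LATCH * EXAMPLE_NSECS_PER_TICK;

	return nsecs;
}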
On Wed, Nov 14, 2018 at 02:17:09PM +1100, Finn Thain wrote:
> On Tue, 13 Nov 2018, Russell King - ARM Linux wrote:
> >
> > A clocksource provides a cycle counter that monotonically changes and
> > does not wrap between clockevent events.
> >
> > A clock event is responsible for providing events to the system when
> > some work needs to be done, limited by the wrap interval of the
> > clocksource.
> >
> > Each time the clock event triggers an interrupt, the clocksource is
> > read to determine how much time has passed, using:
> >
> > count = (new_value - old_value) & available_bits
> > nanosecs = count * scale >> shift;
> >
> > If you try to combine the clocksource and clockevent because you only
> > have a single counter, and the counter has the behaviour of:
> > - counting down towards zero
> > - when reaching zero, triggers an interrupt, and reloads with N
> >
> > then this provides your clockevent, but you can't use this as a clock
> > source, because each time you receive an interrupt and try to read the
> > counter value, it will be approximately the same value.  This means
> > that the above calculation fails to register the correct number of
> > nanoseconds passing.  Hence, this does not work.
> >
> > Also note where I said above that the clock event device must be able
> > to provide an interrupt _before_ the clocksource wraps - clearly with
> > such a timer, that is utterly impossible.
> >
> > The simple solution is to not use such a counter as the clocksource,
> > which means you fall back to using the jiffies clocksource, and your
> > timekeeping has jiffy resolution - so 10ms, or possibly 1ms if you
> > want a 1kHz timer interval.  For most applications, that's simply way
> > too coarse, as was realised in the very early days of Linux.
> >
> > If only there was a way to interpolate between timer interrupts...
> > which is exactly what arch_gettimeoffset() does, and is a historical
> > remnant of the pre-clocksource/clockevent days of the kernel - but
> > it is the only way to get better resolution from this sort of setup.
> >
>
> Both of the platforms in question (RPC and EBSA110) have not
> defined(CONFIG_GENERIC_CLOCKEVENTS) and have not defined any struct
> clock_event_device, AFAICT.

Correct, because they don't use clockevents.  This is pre-clocksource,
pre-clockevent code, it uses the old timekeeping infrastructure, which
is xtime_update() and providing a gettimeoffset implementation.

> So, even assuming that you're right about the limitations of single-timer
> platforms in general, removal of arch_gettimeoffset wouldn't require the
> removal of any platforms, AFAICT.

I haven't proposed removing platforms.

I'm just objecting to the idea of removing arch_gettimeoffset(),
thereby causing a regression by changing the resolution of
gettimeofday() without any sign of equivalent functionality.

However, I now see (having searched mailing lists) what you are
trying to do - you have _not_ copied me or the mailing lists I'm
on with your cover message, so I'm *totally* lacking in the context
of your patch series, particularly where you are converting m68k
to use clocksources without needing the gettimeoffset() stuff.

You have failed to explain that in this thread - probably assuming
that I've read your cover message.  I haven't until now, because
you never sent it to me or the linux-arm-kernel mailing list.

I have found this thread _very_ frustrating, and frankly a waste of
my time discussing the finer points because of this lack of context.

Please ensure that if you're going to be sending a patch series,
that the cover message at least finds its way to the intended
audience of your patches, so that everyone has the context they
need when looking at (eg) the single patch they may receive.

Alternatively, if someone raises a problem with the patch, and you
_know_ you haven't done that, then please consider informing them
where they can get more context, eg, by providing a link to your
archived cover message.  It would help avoid misunderstandings.

Thanks.
Hi Russell,

On Wed, Nov 14, 2018 at 3:16 PM Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Wed, Nov 14, 2018 at 02:17:09PM +1100, Finn Thain wrote:
> > So, even assuming that you're right about the limitations of single-timer
> > platforms in general, removal of arch_gettimeoffset wouldn't require the
> > removal of any platforms, AFAICT.
>
> I haven't proposed removing platforms.
>
> I'm just objecting to the idea of removing arch_gettimeoffset(),
> thereby causing a regression by changing the resolution of
> gettimeofday() without any sign of equivalent functionality.
>
> However, I now see (having searched mailing lists) what you are
> trying to do - you have _not_ copied me or the mailing lists I'm
> on with your cover message, so I'm *totally* lacking in the context
> of your patch series, particularly where you are converting m68k
> to use clocksources without needing the gettimeoffset() stuff.
>
> You have failed to explain that in this thread - probably assuming
> that I've read your cover message.  I haven't until now, because
> you never sent it to me or the linux-arm-kernel mailing list.
>
> I have found this thread _very_ frustrating, and frankly a waste of
> my time discussing the finer points because of this lack of context.
>
> Please ensure that if you're going to be sending a patch series,
> that the cover message at least finds its way to the intended
> audience of your patches, so that everyone has the context they
> need when looking at (eg) the single patch they may receive.
>
> Alternatively, if someone raises a problem with the patch, and you
> _know_ you haven't done that, then please consider informing them
> where they can get more context, eg, by providing a link to your
> archived cover message.  It would help avoid misunderstandings.

Sorry for the lack of context.
The real trigger was also not explained in the cover message, and was
the threat to remove platforms not using modern timekeeping APIs, cfr.
"Removing support for old hardware from the kernel"
(https://lwn.net/Articles/769468/).

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like
that.
-- Linus Torvalds
On Wed, Nov 14, 2018 at 03:58:36PM +0100, Geert Uytterhoeven wrote:
> Hi Russell,
>
> On Wed, Nov 14, 2018 at 3:16 PM Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > On Wed, Nov 14, 2018 at 02:17:09PM +1100, Finn Thain wrote:
> > > So, even assuming that you're right about the limitations of single-timer
> > > platforms in general, removal of arch_gettimeoffset wouldn't require the
> > > removal of any platforms, AFAICT.
> >
> > I haven't proposed removing platforms.
> >
> > I'm just objecting to the idea of removing arch_gettimeoffset(),
> > thereby causing a regression by changing the resolution of
> > gettimeofday() without any sign of equivalent functionality.
> >
> > However, I now see (having searched mailing lists) what you are
> > trying to do - you have _not_ copied me or the mailing lists I'm
> > on with your cover message, so I'm *totally* lacking in the context
> > of your patch series, particularly where you are converting m68k
> > to use clocksources without needing the gettimeoffset() stuff.
> >
> > You have failed to explain that in this thread - probably assuming
> > that I've read your cover message.  I haven't until now, because
> > you never sent it to me or the linux-arm-kernel mailing list.
> >
> > I have found this thread _very_ frustrating, and frankly a waste of
> > my time discussing the finer points because of this lack of context.
> >
> > Please ensure that if you're going to be sending a patch series,
> > that the cover message at least finds its way to the intended
> > audience of your patches, so that everyone has the context they
> > need when looking at (eg) the single patch they may receive.
> >
> > Alternatively, if someone raises a problem with the patch, and you
> > _know_ you haven't done that, then please consider informing them
> > where they can get more context, eg, by providing a link to your
> > archived cover message.  It would help avoid misunderstandings.
>
> Sorry for the lack of context.
> The real trigger was also not explained in the cover message, and was
> the threat to remove platforms not using modern timekeeping APIs, cfr.
> "Removing support for old hardware from the kernel"
> (https://lwn.net/Articles/769468/).

And naturally, because kernel developers are oh so great at
communication, that decision has been communicated to those affected
by it.  *NOT*.

Clearly there is a need for much better communication.  We're better
at it when dealing with patches, but not when it comes to physical
meetings.

Maybe when some decision like "we're going to kill stuff still using
the old gettimeoffset API" is made, _someone_ needs to identify which
platforms are using that and make sure that those maintainers know
about that decision, much the same way that if a patch were to touch
the gettimeoffset API, we'd make damn sure that such a patch went to
those affected by the change.
On 14/11/18 8:58 PM, Russell King - ARM Linux wrote:
>
>> Are you saying that's not possible on arm, because the current timer rundown
>> counter can't be read while the timer is running?
>>
>> If I were to run a second timer at higher rate for clocksource, but keeping
>> the 10 ms timer as clock event (could easily do that by using timer D on
>> Atari Falcon) - how would that improve my timekeeping? Clock events still
>> only happen 10 ms apart ...
> Ah, I think you're talking about something else.
>
> You seem to be talking about what happens when time keeping interrupts
> happen.

That's what I understood your comment was about.

> I'm talking about the resolution of gettimeofday() and the other
> interfaces that are used (eg) for packet capture timestamping and
> the like - the _user_ visible effects of the timekeeping system.
>
> With the existing implementation, all these have better-than-jiffy
> resolution - in the case of RPC, that's 500ns, rather than 10ms
> which would be the case without gettimeoffset().  Removing
> gettimeoffset() as Finn has proposed without preserving that
> resolution is simply completely unacceptable.

I agree - but Finn had also offered to supply patches to arm that would
have added a clocksource_read() function very much like for those m68k
platforms that had used gettimeoffset(). I wondered what reason there was
for these not to work on arm. I now realize you'd never seen that offer.

The proposal to drop architectures still relying on arch_gettimeoffset()
had been raising enough of a response on linux-m68k to make me conclude
that same proposal had been kicked on to other arch MLs affected as well.
I'm a bit naive that way.

Now your criticism of arch_gettimeoffset() (missing timer rollover
because the timer interrupt has been cleared by the interrupt handler)
still stands. I just can't find the offset() functions shown in the
5cfc8ee0bb51 patch. Any hints?

Cheers, Michael
On Wed, 14 Nov 2018, Russell King - ARM Linux wrote:

> However, I now see (having searched mailing lists) what you are trying
> to do - you have _not_ copied me or the mailing lists I'm on with your
> cover message, so I'm *totally* lacking in the context of your patch
> series, particularly where you are converting m68k to use clocksources
> without needing the gettimeoffset() stuff.
>

True. I should have included all interested parties in the recipients of
the cover letter. My bad.

> You have failed to explain that in this thread - probably assuming that
> I've read your cover message.

I offered to write a patch to add a clocksource to replace the
arch_gettimeoffset functionality for RPC and EBSA110. You even replied to
that offer.

I did not propose degrading functionality while there is someone able to
test modernization patches (assuming there is...).

Would you consider merging untested modernization patches for the
arch_gettimeoffset API?

> I haven't until now, because you never sent it to me or the
> linux-arm-kernel mailing list.
>
> I have found this thread _very_ frustrating, and frankly a waste of my
> time discussing the finer points because of this lack of context.

Sorry for any frustration I've caused you.

The thread went way off-topic when Christoph took the opportunity to
suggest the removal of RPC and EBSA110. That doesn't interest me.

My interest remains API modernization. The actual patches I've sent are
intended to modernize the clock API *without* degrading any functionality.

> Please ensure that if you're going to be sending a patch series, that
> the cover message at least finds its way to the intended audience of
> your patches, so that everyone has the context they need when looking at
> (eg) the single patch they may receive.
>

OK. I'll have to improve my patch submission scripts.
On Thu, Nov 15, 2018 at 03:12:17PM +1100, Finn Thain wrote:
> The thread went way off-topic when Christoph took the opportunity to
> suggest the removal of RPC and EBSA110. That doesn't interest me.

I suspect that is the right solution - I've been trying to get 4.19 to
boot on my RPC and it's proving very difficult for several reasons, not
limited to the HDD seeming to be throwing the odd disk error, as well
as the kernel being rather large (a 4.19 kernel is 2.7M compressed as
opposed to 2.6.29 being 1.2M compressed for the equivalent
configuration.)

Given that memory on the RPC is not contiguous, that probably means an
uncompressed kernel overflows the size of the first memory bank, so
there's probably no hope for modern kernels to boot on the machine.

The EBSA110 is probably in a similar boat - I don't remember whether it
had 16MB or 32MB as the maximal amount of memory, but memory was getting
tight with some kernels even running a minimalist userspace.

So, it's probably time to say goodbye to the kernel support for these
platforms.
On Fri, 16 Nov 2018, Russell King - ARM Linux wrote:
>
> The EBSA110 is probably in a similar boat - I don't remember whether it
> had 16MB or 32MB as the maximal amount of memory, but memory was getting
> tight with some kernels even running a minimalist userspace.
>
> So, it's probably time to say goodbye to the kernel support for these
> platforms.
>

Your call.

Note that removing code from mainline won't help users obtain older,
smaller, -stable kernel releases, free from the bug we were discussing.
(The bug appeared in Linux v2.6.32.)

BTW, if you did want to boot Linux on a 16 MB system, you do have some
options.

https://lwn.net/Articles/741494/
https://lwn.net/Articles/608945/
https://tiny.wiki.kernel.org/

Contributing to this kind of effort probably has value for IoT
deployments. I suspect it also cuts a small amount of bloat from a large
number of other Linux systems.
Hi Finn,

On 17.11.2018 at 11:49, Finn Thain wrote:
> On Fri, 16 Nov 2018, Russell King - ARM Linux wrote:
>
>> The EBSA110 is probably in a similar boat - I don't remember whether it
>> had 16MB or 32MB as the maximal amount of memory, but memory was getting
>> tight with some kernels even running a minimalist userspace.
>>
>> So, it's probably time to say goodbye to the kernel support for these
>> platforms.
>>
>
> Your call.
>
> Note that removing code from mainline won't help users obtain older,
> smaller, -stable kernel releases, free from the bug we were discussing.
> (The bug appeared in Linux v2.6.32.)
>
> BTW, if you did want to boot Linux on a 16 MB system, you do have some
> options.
>
> https://lwn.net/Articles/741494/
> https://lwn.net/Articles/608945/
> https://tiny.wiki.kernel.org/
>
> Contributing to this kind of effort probably has value for IoT
> deployments. I suspect it also cuts a small amount of bloat from a large
> number of other Linux systems.

I boot 4.19 on a system with 14 MB RAM - 10 MB remain for user space once
the kernel loads. After remote login, 4 MB of that remain free for buffers
or user code (1.5 MB swapped). That's with sysvinit - Christian tried to
boot a systemd userland on his Falcon once, and ran out of memory before
swap could be activated.

I shouldn't say 16 or 32 MB are hopeless. And the 2.6 kernels were a lot
more sluggish, to my recollection.

Cheers, Michael
diff --git a/arch/arm/mach-ebsa110/core.c b/arch/arm/mach-ebsa110/core.c
index 688e5fed49a7..479f89a1accf 100644
--- a/arch/arm/mach-ebsa110/core.c
+++ b/arch/arm/mach-ebsa110/core.c
@@ -160,12 +160,17 @@ static void __init ebsa110_init_early(void)
  */
 static u32 ebsa110_gettimeoffset(void)
 {
+	unsigned long flags;
 	unsigned long offset, count;
 
+	local_irq_save(flags);
+
 	__raw_writeb(0x40, PIT_CTRL);
 	count = __raw_readb(PIT_T1);
 	count |= __raw_readb(PIT_T1) << 8;
 
+	local_irq_restore(flags);
+
 	/*
 	 * If count > COUNT, make the number negative.
 	 */
diff --git a/arch/arm/mach-rpc/time.c b/arch/arm/mach-rpc/time.c
index 2689771c1d38..852bb3801638 100644
--- a/arch/arm/mach-rpc/time.c
+++ b/arch/arm/mach-rpc/time.c
@@ -29,9 +29,12 @@
 
 static u32 ioc_timer_gettimeoffset(void)
 {
+	unsigned long flags;
 	unsigned int count1, count2, status;
 	long offset;
 
+	local_irq_save(flags);
+
 	ioc_writeb (0, IOC_T0LATCH);
 	barrier ();
 	count1 = ioc_readb(IOC_T0CNTL) | (ioc_readb(IOC_T0CNTH) << 8);
@@ -42,6 +45,8 @@ static u32 ioc_timer_gettimeoffset(void)
 	barrier ();
 	count2 = ioc_readb(IOC_T0CNTL) | (ioc_readb(IOC_T0CNTH) << 8);
 
+	local_irq_restore(flags);
+
 	offset = count2;
 	if (count2 < count1) {
 		/*
Implementations of arch_gettimeoffset are generally not re-entrant
and assume that interrupts have been disabled. Unfortunately this
pre-condition got broken in v2.6.32.

Fixes: 5cfc8ee0bb51 ("ARM: convert arm to arch_gettimeoffset()")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
---
 arch/arm/mach-ebsa110/core.c | 5 +++++
 arch/arm/mach-rpc/time.c     | 5 +++++
 2 files changed, 10 insertions(+)