Message ID | 20190718031113.25040-1-yang.jie@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3,1/2] ASoC: SOF: add flag for position update ipc | expand |
On 7/17/19 10:11 PM, Keyon Jie wrote: > From: Marcin Rajwa <marcin.rajwa@linux.intel.com> > > In some cases, FW might need to use the host_period_bytes field to > synchronize the DMA copying (with host side) but the driver does not it's your right to edit my suggested wording, but the notion of 'synchronization' is far from clear. it's my understanding that the host_period_bytes defines the DMA transfer size requested by the firmware, which isn't a value that matters to the host except for rewind usages. > need any position information returned over the IPC channel by the > firmware. The current IPC definition prevents this capability, so add > new field. > > Signed-off-by: Marcin Rajwa <marcin.rajwa@linux.intel.com> > Signed-off-by: Keyon Jie <yang.jie@linux.intel.com> > Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > --- > include/sound/sof/stream.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/sound/sof/stream.h b/include/sound/sof/stream.h > index 643f175cb479..06af4ecb2584 100644 > --- a/include/sound/sof/stream.h > +++ b/include/sound/sof/stream.h > @@ -83,10 +83,10 @@ struct sof_ipc_stream_params { > uint16_t sample_valid_bytes; > uint16_t sample_container_bytes; > > - /* for notifying host period has completed - 0 means no period IRQ */ > uint32_t host_period_bytes; > + uint32_t no_position_ipc; /* 1 means no IPC for position upadte */ typo: update > > - uint32_t reserved[2]; > + uint16_t reserved[3]; > uint16_t chmap[SOF_IPC_MAX_CHANNELS]; /**< channel map - SOF_CHMAP_ */ > } __packed; > >
On 2019/7/18 上午11:35, Pierre-Louis Bossart wrote: > > > On 7/17/19 10:11 PM, Keyon Jie wrote: >> From: Marcin Rajwa <marcin.rajwa@linux.intel.com> >> >> In some cases, FW might need to use the host_period_bytes field to >> synchronize the DMA copying (with host side) but the driver does not > > it's your right to edit my suggested wording, but the notion of > 'synchronization' is far from clear. it's my understanding that the > host_period_bytes defines the DMA transfer size requested by the > firmware, which isn't a value that matters to the host except for rewind > usages. Hi Pierre, here the host_period_bytes is requested by host, FW has its own period size, and DMA will transfer data in FW buffer period size. It works like this: FW buffer[period 0, period 1, ...] <==> DMA <==> host/alsa buffer[host_period 0, host_priod 1, ...] We need this host_preiod_bytes information in FW to do fast draining(e.g. record 2 seconds data within 10ms) in mmap capture, we are slowing down the draining in smaller host_period_bytes cases, otherwise, arecord can't read the buffer in time and overrun will happen. Maybe the wording "synchronize" here is inaccurate, how about something like this: "FW might need to use the host_period_bytes field to configure and control the DMA copying speed but the driver does not..." > >> need any position information returned over the IPC channel by the >> firmware. The current IPC definition prevents this capability, so add >> new field. >> >> Signed-off-by: Marcin Rajwa <marcin.rajwa@linux.intel.com> >> Signed-off-by: Keyon Jie <yang.jie@linux.intel.com> >> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> >> --- >> include/sound/sof/stream.h | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/include/sound/sof/stream.h b/include/sound/sof/stream.h >> index 643f175cb479..06af4ecb2584 100644 >> --- a/include/sound/sof/stream.h >> +++ b/include/sound/sof/stream.h >> @@ -83,10 +83,10 @@ struct sof_ipc_stream_params { >> uint16_t sample_valid_bytes; >> uint16_t sample_container_bytes; >> - /* for notifying host period has completed - 0 means no period >> IRQ */ >> uint32_t host_period_bytes; >> + uint32_t no_position_ipc; /* 1 means no IPC for position upadte */ > > typo: update OK, thanks, another update version needed for it. Thanks, ~Keyon > >> - uint32_t reserved[2]; >> + uint16_t reserved[3]; >> uint16_t chmap[SOF_IPC_MAX_CHANNELS]; /**< channel map - >> SOF_CHMAP_ */ >> } __packed; >> >
On 7/18/2019 7:06 AM, Keyon Jie wrote: > > > On 2019/7/18 上午11:35, Pierre-Louis Bossart wrote: >> >> >> On 7/17/19 10:11 PM, Keyon Jie wrote: >>> From: Marcin Rajwa <marcin.rajwa@linux.intel.com> >>> >>> In some cases, FW might need to use the host_period_bytes field to >>> synchronize the DMA copying (with host side) but the driver does not >> >> it's your right to edit my suggested wording, but the notion of >> 'synchronization' is far from clear. it's my understanding that the >> host_period_bytes defines the DMA transfer size requested by the >> firmware, which isn't a value that matters to the host except for >> rewind usages. > > Hi Pierre, here the host_period_bytes is requested by host, FW has its > own period size, and DMA will transfer data in FW buffer period size. > It works like this: > > FW buffer[period 0, period 1, ...] <==> DMA <==> host/alsa > buffer[host_period 0, host_priod 1, ...] > > We need this host_preiod_bytes information in FW to do fast > draining(e.g. record 2 seconds data within 10ms) in mmap capture, we > are slowing down the draining in smaller host_period_bytes cases, > otherwise, arecord can't read the buffer in time and overrun will happen. > > Maybe the wording "synchronize" here is inaccurate, how about > something like this: > > "FW might need to use the host_period_bytes field to configure and > control the DMA copying speed but the driver does not..." > Hi, we need *host_period_bytes *in FW to properly drain data - by properly I mean to no override host buffer but in the same time to avoid situation when host is waiting for data and doesn't get it. The former is known as *overrun *while the later *underrun**. * So that's why I originally used the word *"to synchronize" *because it best reflects the use of this variable in FW. *The problem *- host establishes the *period_size/**host_period_bytes *and uses it as a "data copy tempo controller" meaning it will copy data buffered in its own buffer (copied there by FW) in *period_size *time intervals. Now, in regular copy (real time stream) firmware doesn't need to care about how fast host copy because the host buffer is big enough to store even several FW periods (each one or few milliseconds) without overriding it. That is why we did not need this *period_size *variable before. However the draining task is very different ball game - it copies *2,1 seconds* of data as fast as possible to the host. Therefore we are very prone to *XRUNs* *The solution* - in FW we need to know how often host copies data out from its own buffer and this information is stored in *host_period_bytes, *lets send this information down to firmware. Now, the FW knowing this can fill the host buffer and wait the time calculated by *host_period_bytes* before next copy. This way we copy as much/fast as we can and in the same time we are safe that host will handle this and no XRUN will ever happen. Hope it helped to understand the need of *host_period_bytes *in the firmware. ** Marcin
On 7/18/19 3:42 AM, Rajwa, Marcin wrote: > > On 7/18/2019 7:06 AM, Keyon Jie wrote: >> >> >> On 2019/7/18 上午11:35, Pierre-Louis Bossart wrote: >>> >>> >>> On 7/17/19 10:11 PM, Keyon Jie wrote: >>>> From: Marcin Rajwa <marcin.rajwa@linux.intel.com> >>>> >>>> In some cases, FW might need to use the host_period_bytes field to >>>> synchronize the DMA copying (with host side) but the driver does not >>> >>> it's your right to edit my suggested wording, but the notion of >>> 'synchronization' is far from clear. it's my understanding that the >>> host_period_bytes defines the DMA transfer size requested by the >>> firmware, which isn't a value that matters to the host except for >>> rewind usages. >> >> Hi Pierre, here the host_period_bytes is requested by host, FW has its >> own period size, and DMA will transfer data in FW buffer period size. >> It works like this: >> >> FW buffer[period 0, period 1, ...] <==> DMA <==> host/alsa >> buffer[host_period 0, host_priod 1, ...] >> >> We need this host_preiod_bytes information in FW to do fast >> draining(e.g. record 2 seconds data within 10ms) in mmap capture, we >> are slowing down the draining in smaller host_period_bytes cases, >> otherwise, arecord can't read the buffer in time and overrun will happen. >> >> Maybe the wording "synchronize" here is inaccurate, how about >> something like this: >> >> "FW might need to use the host_period_bytes field to configure and >> control the DMA copying speed but the driver does not..." >> > > Hi, > > we need *host_period_bytes *in FW to properly drain data - by properly I > mean to no override host buffer but in the same time to avoid situation > when host is waiting for data and doesn't get it. The former is known as > *overrun *while the later *underrun**. > * > > So that's why I originally used the word *"to synchronize" *because it > best reflects the use of this variable in FW. > > *The problem *- host establishes the *period_size/**host_period_bytes > *and uses it as a "data copy tempo controller" meaning it will copy data > buffered in its own buffer (copied there by FW) in *period_size *time > intervals. Now, in regular copy (real time stream) firmware doesn't > > need to care about how fast host copy because the host buffer is big > enough to store even several FW periods (each one or few milliseconds) > without overriding it. That is why we did not need this *period_size > *variable before. However the draining task is very > > different ball game - it copies *2,1 seconds* of data as fast as > possible to the host. Therefore we are very prone to *XRUNs* > > *The solution* - in FW we need to know how often host copies data out > from its own buffer and this information is stored in > *host_period_bytes, *lets send this information down to firmware. Now, > the FW knowing this can fill the host buffer and wait the time > calculated by *host_period_bytes* > > before next copy. This way we copy as much/fast as we can and in the > same time we are safe that host will handle this and no XRUN will ever > happen. We knew from Day1 that draining faster than real-time could potentially lead to the driver detecting overflows or timeouts. It's been documented left and right, with an assumption that the ring buffer is actually big enough to contain all the data stored in the DSP. Before I provide more feedback, can you clarify if the 'host_period_bytes' is the same value as the ALSA period size (in bytes)? And what would be the value when the no_irq mode is used? > > > Hope it helped to understand the need of *host_period_bytes *in the > firmware. > ** > > Marcin >
On 7/18/2019 5:24 PM, Pierre-Louis Bossart wrote: > > > On 7/18/19 3:42 AM, Rajwa, Marcin wrote: >> >> On 7/18/2019 7:06 AM, Keyon Jie wrote: >>> >>> >>> On 2019/7/18 上午11:35, Pierre-Louis Bossart wrote: >>>> >>>> >>>> On 7/17/19 10:11 PM, Keyon Jie wrote: >>>>> From: Marcin Rajwa <marcin.rajwa@linux.intel.com> >>>>> >>>>> In some cases, FW might need to use the host_period_bytes field to >>>>> synchronize the DMA copying (with host side) but the driver does not >>>> >>>> it's your right to edit my suggested wording, but the notion of >>>> 'synchronization' is far from clear. it's my understanding that the >>>> host_period_bytes defines the DMA transfer size requested by the >>>> firmware, which isn't a value that matters to the host except for >>>> rewind usages. >>> >>> Hi Pierre, here the host_period_bytes is requested by host, FW has >>> its own period size, and DMA will transfer data in FW buffer period >>> size. It works like this: >>> >>> FW buffer[period 0, period 1, ...] <==> DMA <==> host/alsa >>> buffer[host_period 0, host_priod 1, ...] >>> >>> We need this host_preiod_bytes information in FW to do fast >>> draining(e.g. record 2 seconds data within 10ms) in mmap capture, we >>> are slowing down the draining in smaller host_period_bytes cases, >>> otherwise, arecord can't read the buffer in time and overrun will >>> happen. >>> >>> Maybe the wording "synchronize" here is inaccurate, how about >>> something like this: >>> >>> "FW might need to use the host_period_bytes field to configure and >>> control the DMA copying speed but the driver does not..." >>> >> >> Hi, >> >> we need *host_period_bytes *in FW to properly drain data - by >> properly I mean to no override host buffer but in the same time to >> avoid situation when host is waiting for data and doesn't get it. The >> former is known as *overrun *while the later *underrun**. >> * >> >> So that's why I originally used the word *"to synchronize" *because >> it best reflects the use of this variable in FW. >> >> *The problem *- host establishes the *period_size/**host_period_bytes >> *and uses it as a "data copy tempo controller" meaning it will copy >> data buffered in its own buffer (copied there by FW) in *period_size >> *time intervals. Now, in regular copy (real time stream) firmware >> doesn't >> >> need to care about how fast host copy because the host buffer is big >> enough to store even several FW periods (each one or few >> milliseconds) without overriding it. That is why we did not need >> this *period_size *variable before. However the draining task is very >> >> different ball game - it copies *2,1 seconds* of data as fast as >> possible to the host. Therefore we are very prone to *XRUNs* >> >> *The solution* - in FW we need to know how often host copies data out >> from its own buffer and this information is stored in >> *host_period_bytes, *lets send this information down to firmware. >> Now, the FW knowing this can fill the host buffer and wait the time >> calculated by *host_period_bytes* >> >> before next copy. This way we copy as much/fast as we can and in the >> same time we are safe that host will handle this and no XRUN will >> ever happen. > > > We knew from Day1 that draining faster than real-time could > potentially lead to the driver detecting overflows or timeouts. It's > been documented left and right, with an assumption that the ring > buffer is actually big enough to contain all the data stored in the DSP. @Pierre, indeed the buffer that kernel allocates is big enough to store all the data. However *arecord* introduces its own buffer which is just a multiple of *period_sizes *- and it is the buffer which overflows. > > Before I provide more feedback, can you clarify if the > 'host_period_bytes' is the same value as the ALSA period size (in > bytes)? And what would be the value when the no_irq mode is used? Yes, this is the same value. It is obtained by *params_period_bytes**()* in kernel. *no_irq mode *- it does not affect the value of *host_period_bytes *after my patch has been applied. Before my patch however, driver and FW used this value to send stream position information from FW to kernel. In short, when *(hda && hda->no_ipc_position)* then *ipc_params->host_period_bytes = 0; *was set in kernel.**Firmware then, read this *host_period_bytes = 0 *and treated it as "OK does not send any IPC regarding stream position". So once *no_ipc_position *was set we lost information about *host_period_bytes *in the FW. So all my patches in FW and Kernel do is to *relax****host_period_bytes *and introduce new parameter dedicated for this stream position IPC. At that time we called it *no_period_irq *to resemble old name but now I think it would be better if we rename it to something like *no_stream_position *as it is more informative. What do you think? > >> >> >> Hope it helped to understand the need of *host_period_bytes *in the >> firmware. >> ** >> >> Marcin >>
>> We knew from Day1 that draining faster than real-time could >> potentially lead to the driver detecting overflows or timeouts. It's >> been documented left and right, with an assumption that the ring >> buffer is actually big enough to contain all the data stored in the DSP. > > @Pierre, indeed the buffer that kernel allocates is big enough to store > all the data. However *arecord* introduces its own buffer which is just > a multiple of *period_sizes *- and it is the buffer which overflows. When I tested this feature with the closed-source firmware on KBL, arecord worked fine. Care to provide more details so that we are on the same page? > >> >> Before I provide more feedback, can you clarify if the >> 'host_period_bytes' is the same value as the ALSA period size (in >> bytes)? And what would be the value when the no_irq mode is used? > > Yes, this is the same value. It is obtained by *params_period_bytes**()* > in kernel. ok good. I was afraid this would be a different concept. So what you are saying is that the draining happens by bursts whose size is tied to the period defined by the host, yes? > > *no_irq mode *- it does not affect the value of *host_period_bytes > *after my patch has been applied. Before my patch however, driver and FW > used this value to send stream position information from FW to kernel. > In short, when *(hda && hda->no_ipc_position)* > > then *ipc_params->host_period_bytes = 0; *was set in kernel.**Firmware > then, read this *host_period_bytes = 0 *and treated it as "OK does not > send any IPC regarding stream position". So once *no_ipc_position *was > set we lost information about *host_period_bytes *in the FW. > > So all my patches in FW and Kernel do is to *relax****host_period_bytes > *and introduce new parameter dedicated for this stream position IPC. At > that time we called it *no_period_irq *to resemble old name but now I > think it would be better if we rename it to something like > > *no_stream_position *as it is more informative. What do you think? The text is quite difficult to read with all the *... Please use plain text. It just occurred to me that the traditional use of the timer-based scheduling (with no_irq mode) is not very smart with this sort of application. The host has absolutely no way of predicting for how long it needs to sleep before the firmware completes the initial flush. This time is linked to hardware, bandwidth to memory, etc, and might vary quite a bit between platforms. In this case, it's much better to rely on events set by hardware upon data availability. in terms of naming, no_stream_position_ipc would be my choice, but it's a bit long. >>> >>> Hope it helped to understand the need of *host_period_bytes *in the >>> firmware. Yes, thanks for taking the time to explain the issue.
On 7/19/2019 4:25 PM, Pierre-Louis Bossart wrote: > >>> We knew from Day1 that draining faster than real-time could >>> potentially lead to the driver detecting overflows or timeouts. It's >>> been documented left and right, with an assumption that the ring >>> buffer is actually big enough to contain all the data stored in the >>> DSP. >> >> @Pierre, indeed the buffer that kernel allocates is big enough to >> store all the data. However *arecord* introduces its own buffer which >> is just a multiple of *period_sizes *- and it is the buffer which >> overflows. > > When I tested this feature with the closed-source firmware on KBL, > arecord worked fine. Care to provide more details so that we are on > the same page? > >> >>> >>> Before I provide more feedback, can you clarify if the >>> 'host_period_bytes' is the same value as the ALSA period size (in >>> bytes)? And what would be the value when the no_irq mode is used? >> >> Yes, this is the same value. It is obtained by >> *params_period_bytes**()* in kernel. > > ok good. I was afraid this would be a different concept. > > So what you are saying is that the draining happens by bursts whose > size is tied to the period defined by the host, yes? Yes. We try to fill host buffer as much as we can to gain fast draining but in the same time we give host time to read it. > >> >> *no_irq mode *- it does not affect the value of *host_period_bytes >> *after my patch has been applied. Before my patch however, driver and >> FW used this value to send stream position information from FW to >> kernel. In short, when *(hda && hda->no_ipc_position)* >> >> then *ipc_params->host_period_bytes = 0; *was set in >> kernel.**Firmware then, read this *host_period_bytes = 0 *and treated >> it as "OK does not send any IPC regarding stream position". So once >> *no_ipc_position *was set we lost information about >> *host_period_bytes *in the FW. >> >> So all my patches in FW and Kernel do is to >> *relax****host_period_bytes *and introduce new parameter dedicated >> for this stream position IPC. At that time we called it >> *no_period_irq *to resemble old name but now I think it would be >> better if we rename it to something like >> >> *no_stream_position *as it is more informative. What do you think? > > The text is quite difficult to read with all the *... Please use plain > text. Ahh I used bold text to stress some important keywords but it seems your mail client can't read it back. I will stop using it, thanks for letting me know. > > It just occurred to me that the traditional use of the timer-based > scheduling (with no_irq mode) is not very smart with this sort of > application. The host has absolutely no way of predicting for how long > it needs to sleep before the firmware completes the initial flush. > This time is linked to hardware, bandwidth to memory, etc, and might > vary quite a bit between platforms. In this case, it's much better to > rely on events set by hardware upon data availability. Exactly. We were facing DMA issues along the way which slowed down draining considerably. So if it all was driven by some timer it would never work properly. > > in terms of naming, no_stream_position_ipc would be my choice, but > it's a bit long. Heh, I had the same idea and the same problem - length of the variable. So I cut _ipc suffix and I think it is good enough. > >>>> >>>> Hope it helped to understand the need of *host_period_bytes *in the >>>> firmware. > > Yes, thanks for taking the time to explain the issue.
diff --git a/include/sound/sof/stream.h b/include/sound/sof/stream.h index 643f175cb479..06af4ecb2584 100644 --- a/include/sound/sof/stream.h +++ b/include/sound/sof/stream.h @@ -83,10 +83,10 @@ struct sof_ipc_stream_params { uint16_t sample_valid_bytes; uint16_t sample_container_bytes; - /* for notifying host period has completed - 0 means no period IRQ */ uint32_t host_period_bytes; + uint32_t no_position_ipc; /* 1 means no IPC for position upadte */ - uint32_t reserved[2]; + uint16_t reserved[3]; uint16_t chmap[SOF_IPC_MAX_CHANNELS]; /**< channel map - SOF_CHMAP_ */ } __packed;