Message ID | 20240416162908.24180-3-fancer.lancer@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | dmaengine: dw: Fix src/dst addr width misconfig | expand |
On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote: > Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory > data width (single transfer width) is determined based on the buffer > length, buffer base address or DMA master-channel max address width > capability. It isn't enough in case of the channel disabling prior the > block transfer is finished. Here is what DW AHB DMA IP-core databook says > regarding the port suspension (DMA-transfer pause) implementation in the > controller: > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may > still be data in the channel FIFO, but not enough to form a single > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is > disabled, the remaining data in the channel FIFO is not transferred to the > destination peripheral." > > So in case if the port gets to be suspended and then disabled it's > possible to have the data silently discarded even though the controller > reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped > data already received from the source device. This looks as if the data > somehow got lost on a way from the peripheral device to memory and causes > problems for instance in the DW APB UART driver, which pauses and disables > the DMA-transfer as soon as the recv data timeout happens. Here is the way > it looks: > > Memory <------- DMA FIFO <------ UART FIFO <---------------- UART > DST_TR_WIDTH -+--------| | | > | | | | No more data > Current lvl -+--------| |---------+- DMA-burst lvl > | | |---------+- Leftover data > | | |---------+- SRC_TR_WIDTH > -+--------+-------+---------+ > > In the example above: no more data is getting received over the UART port > and BLOCK_TS is not even close to be fully received; some data is left in > the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA > FIFO; some data is left in the DMA FIFO, but not enough to be passed > further to the system memory in a single transfer. In this situation the > 8250 UART driver catches the recv timeout interrupt, pauses the > DMA-transfer and terminates it completely, after which the IRQ handler > manually fetches the leftover data from the UART FIFO into the > recv-buffer. But since the DMA-channel has been disabled with the data > left in the DMA FIFO, that data will be just discarded and the recv-buffer > will have a gap of the "current lvl" size in the recv-buffer at the tail > of the lately received data portion. So the data will be lost just due to > the misconfigured DMA transfer. > > Note this is only relevant for the case of the transfer suspension and > _disabling_. No problem will happen if the transfer will be re-enabled > afterwards or the block transfer is fully completed. In the later case the > "FIFO flush mode" will be executed at the transfer final stage in order to > push out the data left in the DMA FIFO. > > In order to fix the denoted problem the DW AHB DMA-engine driver needs to > make sure that the _bursted_ source transfer width is greater or equal to > the single destination transfer (note the HW databook describes more > strict constraint than actually required). Since the peripheral-device > side is prescribed by the client driver logic, the memory-side can be only > used for that. The solution can be easily implemented for the DEV_TO_MEM > transfers just by adjusting the memory-channel address width. Sadly it's > not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size > is normally dynamically determined by the controller. So the only thing > that can be done is to make sure that memory-side address width can be > greater than the peripheral device address width. ... > +static int dwc_verify_m_buswidth(struct dma_chan *chan) > +{ > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan); > + struct dw_dma *dw = to_dw_dma(chan->device); > + u32 reg_width, reg_burst, mem_width; > + > + mem_width = dw->pdata->data_width[dwc->dws.m_master]; > + > + /* Make sure src and dst word widths are coherent */ > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) { > + reg_width = dwc->dma_sconfig.dst_addr_width; > + if (mem_width < reg_width) > + return -EINVAL; > + > + dwc->dma_sconfig.src_addr_width = mem_width; > + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) { > + reg_width = dwc->dma_sconfig.src_addr_width; > + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst); > + > + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst); I understand the desire to go this way, but wouldn't be better to have a symmetrical check and return an error? > + } > + > + return 0; > +} IIRC MEM side of the DMA channel will ignore those in HW, so basically you are (re-)using them purely for the __ffs() corrections. ... > dwc->dma_sconfig.src_maxburst = > - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst); > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst); > dwc->dma_sconfig.dst_maxburst = > - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst); > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
On Tue, Apr 16, 2024 at 09:47:02PM +0300, Andy Shevchenko wrote: > On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote: > > Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory > > data width (single transfer width) is determined based on the buffer > > length, buffer base address or DMA master-channel max address width > > capability. It isn't enough in case of the channel disabling prior the > > block transfer is finished. Here is what DW AHB DMA IP-core databook says > > regarding the port suspension (DMA-transfer pause) implementation in the > > controller: > > > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is > > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not > > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may > > still be data in the channel FIFO, but not enough to form a single > > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is > > disabled, the remaining data in the channel FIFO is not transferred to the > > destination peripheral." > > > > So in case if the port gets to be suspended and then disabled it's > > possible to have the data silently discarded even though the controller > > reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped > > data already received from the source device. This looks as if the data > > somehow got lost on a way from the peripheral device to memory and causes > > problems for instance in the DW APB UART driver, which pauses and disables > > the DMA-transfer as soon as the recv data timeout happens. Here is the way > > it looks: > > > > Memory <------- DMA FIFO <------ UART FIFO <---------------- UART > > DST_TR_WIDTH -+--------| | | > > | | | | No more data > > Current lvl -+--------| |---------+- DMA-burst lvl > > | | |---------+- Leftover data > > | | |---------+- SRC_TR_WIDTH > > -+--------+-------+---------+ > > > > In the example above: no more data is getting received over the UART port > > and BLOCK_TS is not even close to be fully received; some data is left in > > the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA > > FIFO; some data is left in the DMA FIFO, but not enough to be passed > > further to the system memory in a single transfer. In this situation the > > 8250 UART driver catches the recv timeout interrupt, pauses the > > DMA-transfer and terminates it completely, after which the IRQ handler > > manually fetches the leftover data from the UART FIFO into the > > recv-buffer. But since the DMA-channel has been disabled with the data > > left in the DMA FIFO, that data will be just discarded and the recv-buffer > > will have a gap of the "current lvl" size in the recv-buffer at the tail > > of the lately received data portion. So the data will be lost just due to > > the misconfigured DMA transfer. > > > > Note this is only relevant for the case of the transfer suspension and > > _disabling_. No problem will happen if the transfer will be re-enabled > > afterwards or the block transfer is fully completed. In the later case the > > "FIFO flush mode" will be executed at the transfer final stage in order to > > push out the data left in the DMA FIFO. > > > > In order to fix the denoted problem the DW AHB DMA-engine driver needs to > > make sure that the _bursted_ source transfer width is greater or equal to > > the single destination transfer (note the HW databook describes more > > strict constraint than actually required). Since the peripheral-device > > side is prescribed by the client driver logic, the memory-side can be only > > used for that. The solution can be easily implemented for the DEV_TO_MEM > > transfers just by adjusting the memory-channel address width. Sadly it's > > not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size > > is normally dynamically determined by the controller. So the only thing > > that can be done is to make sure that memory-side address width can be > > greater than the peripheral device address width. > > ... > > > +static int dwc_verify_m_buswidth(struct dma_chan *chan) > > +{ > > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan); > > + struct dw_dma *dw = to_dw_dma(chan->device); > > + u32 reg_width, reg_burst, mem_width; > > + > > + mem_width = dw->pdata->data_width[dwc->dws.m_master]; > > + > > + /* Make sure src and dst word widths are coherent */ > > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) { > > + reg_width = dwc->dma_sconfig.dst_addr_width; > > + if (mem_width < reg_width) > > + return -EINVAL; > > + > > + dwc->dma_sconfig.src_addr_width = mem_width; > > + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) { > > + reg_width = dwc->dma_sconfig.src_addr_width; > > + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst); > > + > > + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst); > > I understand the desire to go this way, but wouldn't be better to have > a symmetrical check and return an error? Sadly the situation isn't symmetrical. The main idea of the solution proposed in this patch is to make sure that the DMA transactions would fill in the DMA FIFO in a way so in case of the suspension all the data would be delivered to the destination with nothing left in the DMA FIFO and the CFGx.FIFO_EMPTY flag would mean the real FIFO emptiness. It can be reached only if (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE) >= CTLx.DST_TR_WIDTH (calculated in the real values of course). But CTLx.SRC_MSIZE is only relevant for the flow-control/non-memory peripherals. Thus the conditions under which the problem can be avoided are: DMA_MEM_TO_DEV: CTLx.SRC_TR_WIDTH >= CTLx.DST_TR_WIDTH DMA_DEV_TO_MEM: CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH In both cases the non-memory peripheral side parameters (DEV-side) can't be changed because they are selected by the client drivers based on their specific logic (Device FIFO depth, watermarks, CSR widths, etc). But we can vary the memory-side transfer width as long as it's within the permitted limits. In case of the DMA_MEM_TO_DEV transfers we can change the CTLx.SRC_TR_WIDTH because it represents the memory side transfer width. But if the maximum memory transfer width is smaller than the specified destination register width, there is nothing we can do. Thus returning the EINVAL error. Note this is mainly a hypothetical situation since normally the max width of the memory master xfers is greater than the peripheral master xfer max width (in my case it's 128 and 32 bits respectively). In case of the DMA_DEV_TO_MEM transfers we can change the CTLx.DST_TR_WIDTH parameter because it's the memory side. Thus if the maximum memory transfer width is smaller than the bursted source transfer, then we can stick to the maximum memory transfer width. But if it's greater than the bursted source transfer, we can freely reduce it so to support the safe suspension+disable DMA-usage pattern. > > > + } > > + > > + return 0; > > +} > > IIRC MEM side of the DMA channel will ignore those in HW, so basically you are > (re-)using them purely for the __ffs() corrections. No. DMAC ignores the _burst length_ parameters CTLx.SRC_MSIZE and CTLx.DEST_MSIZE for the memory side (also see my comment above): "The CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties valid only for peripherals with a handshaking interface; they cannot be used for defining the burst length for memory peripherals. When the peripherals are memory, the DW_ahb_dmac is always the flow controller and uses DMA transfers to move blocks; thus the CTLx.SRC_MSIZE and CTLx.DEST_MSIZE values are not used for memory peripherals. The SRC_MSIZE/DEST_MSIZE limitations are used to accommodate devices that have limited resources, such as a FIFO. Memory does not normally have limitations similar to the FIFOs." In my case the problem is in the CTLx.SRC_TR_WIDTH and CTLx.DST_TR_WIDTH values misconfiguration. Here is the crucial comment in the HW-manual about that (cited in the commit messages): "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may still be data in the channel FIFO, but not enough to form a single transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is disabled, the remaining data in the channel FIFO is not transferred to the destination peripheral." See Chapter 7.7 "Disabling a Channel Prior to Transfer Completion" of the DW DMAC HW manual for more details. -Serge(y) > > ... > > > dwc->dma_sconfig.src_maxburst = > > - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst); > > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst); > > dwc->dma_sconfig.dst_maxburst = > > - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst); > > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst); > > -- > With Best Regards, > Andy Shevchenko > >
On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote: > On Tue, Apr 16, 2024 at 09:47:02PM +0300, Andy Shevchenko wrote: > > On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote: > > > Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory > > > data width (single transfer width) is determined based on the buffer > > > length, buffer base address or DMA master-channel max address width > > > capability. It isn't enough in case of the channel disabling prior the > > > block transfer is finished. Here is what DW AHB DMA IP-core databook says > > > regarding the port suspension (DMA-transfer pause) implementation in the > > > controller: > > > > > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is > > > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not > > > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may > > > still be data in the channel FIFO, but not enough to form a single > > > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is > > > disabled, the remaining data in the channel FIFO is not transferred to the > > > destination peripheral." > > > > > > So in case if the port gets to be suspended and then disabled it's > > > possible to have the data silently discarded even though the controller > > > reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped > > > data already received from the source device. This looks as if the data > > > somehow got lost on a way from the peripheral device to memory and causes > > > problems for instance in the DW APB UART driver, which pauses and disables > > > the DMA-transfer as soon as the recv data timeout happens. Here is the way > > > it looks: > > > > > > Memory <------- DMA FIFO <------ UART FIFO <---------------- UART > > > DST_TR_WIDTH -+--------| | | > > > | | | | No more data > > > Current lvl -+--------| |---------+- DMA-burst lvl > > > | | |---------+- Leftover data > > > | | |---------+- SRC_TR_WIDTH > > > -+--------+-------+---------+ > > > > > > In the example above: no more data is getting received over the UART port > > > and BLOCK_TS is not even close to be fully received; some data is left in > > > the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA > > > FIFO; some data is left in the DMA FIFO, but not enough to be passed > > > further to the system memory in a single transfer. In this situation the > > > 8250 UART driver catches the recv timeout interrupt, pauses the > > > DMA-transfer and terminates it completely, after which the IRQ handler > > > manually fetches the leftover data from the UART FIFO into the > > > recv-buffer. But since the DMA-channel has been disabled with the data > > > left in the DMA FIFO, that data will be just discarded and the recv-buffer > > > will have a gap of the "current lvl" size in the recv-buffer at the tail > > > of the lately received data portion. So the data will be lost just due to > > > the misconfigured DMA transfer. > > > > > > Note this is only relevant for the case of the transfer suspension and > > > _disabling_. No problem will happen if the transfer will be re-enabled > > > afterwards or the block transfer is fully completed. In the later case the > > > "FIFO flush mode" will be executed at the transfer final stage in order to > > > push out the data left in the DMA FIFO. > > > > > > In order to fix the denoted problem the DW AHB DMA-engine driver needs to > > > make sure that the _bursted_ source transfer width is greater or equal to > > > the single destination transfer (note the HW databook describes more > > > strict constraint than actually required). Since the peripheral-device > > > side is prescribed by the client driver logic, the memory-side can be only > > > used for that. The solution can be easily implemented for the DEV_TO_MEM > > > transfers just by adjusting the memory-channel address width. Sadly it's > > > not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size > > > is normally dynamically determined by the controller. So the only thing > > > that can be done is to make sure that memory-side address width can be > > > greater than the peripheral device address width. ... > > > +static int dwc_verify_m_buswidth(struct dma_chan *chan) > > > +{ > > > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan); > > > + struct dw_dma *dw = to_dw_dma(chan->device); > > > + u32 reg_width, reg_burst, mem_width; > > > + > > > + mem_width = dw->pdata->data_width[dwc->dws.m_master]; > > > + > > > + /* Make sure src and dst word widths are coherent */ > > > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) { > > > + reg_width = dwc->dma_sconfig.dst_addr_width; > > > + if (mem_width < reg_width) > > > + return -EINVAL; > > > + > > > + dwc->dma_sconfig.src_addr_width = mem_width; > > > + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) { > > > + reg_width = dwc->dma_sconfig.src_addr_width; > > > + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst); > > > + > > > + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst); > > > > > I understand the desire to go this way, but wouldn't be better to have > > a symmetrical check and return an error? > > Sadly the situation isn't symmetrical. > > The main idea of the solution proposed in this patch is to make sure > that the DMA transactions would fill in the DMA FIFO in a way so in > case of the suspension all the data would be delivered to the > destination with nothing left in the DMA FIFO and the CFGx.FIFO_EMPTY > flag would mean the real FIFO emptiness. It can be reached only if > (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE) >= CTLx.DST_TR_WIDTH > (calculated in the real values of course). But CTLx.SRC_MSIZE is only > relevant for the flow-control/non-memory peripherals. Thus the Oh, if it involves flow control, shouldn't you check for that as well? We have a (PPC? IIRC) hardware with this IP that can have peripheral as flow control. > conditions under which the problem can be avoided are: > > DMA_MEM_TO_DEV: CTLx.SRC_TR_WIDTH >= CTLx.DST_TR_WIDTH > DMA_DEV_TO_MEM: CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH > > In both cases the non-memory peripheral side parameters (DEV-side) > can't be changed because they are selected by the client drivers based > on their specific logic (Device FIFO depth, watermarks, CSR widths, > etc). But we can vary the memory-side transfer width as long as it's > within the permitted limits. > > In case of the DMA_MEM_TO_DEV transfers we can change the > CTLx.SRC_TR_WIDTH because it represents the memory side transfer > width. But if the maximum memory transfer width is smaller than the > specified destination register width, there is nothing we can do. Thus > returning the EINVAL error. Note this is mainly a hypothetical > situation since normally the max width of the memory master xfers is > greater than the peripheral master xfer max width (in my case it's 128 > and 32 bits respectively). > > In case of the DMA_DEV_TO_MEM transfers we can change the CTLx.DST_TR_WIDTH > parameter because it's the memory side. Thus if the maximum > memory transfer width is smaller than the bursted source transfer, > then we can stick to the maximum memory transfer width. But if it's > greater than the bursted source transfer, we can freely reduce it > so to support the safe suspension+disable DMA-usage pattern. > > > > > > + } > > > + > > > + return 0; > > > +} > > > > > IIRC MEM side of the DMA channel will ignore those in HW, so basically you are > > (re-)using them purely for the __ffs() corrections. > > No. DMAC ignores the _burst length_ parameters CTLx.SRC_MSIZE and > CTLx.DEST_MSIZE for the memory side (also see my comment above): > > "The CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties valid only for > peripherals with a handshaking interface; they cannot be used for > defining the burst length for memory peripherals. > > When the peripherals are memory, the DW_ahb_dmac is always the flow > controller and uses DMA transfers to move blocks; thus the > CTLx.SRC_MSIZE and CTLx.DEST_MSIZE values are not used for memory > peripherals. The SRC_MSIZE/DEST_MSIZE limitations are used to > accommodate devices that have limited resources, such as a FIFO. > Memory does not normally have limitations similar to the FIFOs." > > In my case the problem is in the CTLx.SRC_TR_WIDTH and > CTLx.DST_TR_WIDTH values misconfiguration. Here is the crucial comment > in the HW-manual about that (cited in the commit messages): > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may > still be data in the channel FIFO, but not enough to form a single > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is > disabled, the remaining data in the channel FIFO is not transferred to the > destination peripheral." > > See Chapter 7.7 "Disabling a Channel Prior to Transfer Completion" of > the DW DMAC HW manual for more details. Got it. Maybe a little summary in the code to explain all this magic? ... > > > dwc->dma_sconfig.src_maxburst = > > > - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst); > > > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst); > > > dwc->dma_sconfig.dst_maxburst = > > > - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst); > > > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst);
On Wed, Apr 17, 2024 at 08:28:06PM +0300, Andy Shevchenko wrote: > On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote: > > On Tue, Apr 16, 2024 at 09:47:02PM +0300, Andy Shevchenko wrote: > > > On Tue, Apr 16, 2024 at 07:28:56PM +0300, Serge Semin wrote: > > > > Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory > > > > data width (single transfer width) is determined based on the buffer > > > > length, buffer base address or DMA master-channel max address width > > > > capability. It isn't enough in case of the channel disabling prior the > > > > block transfer is finished. Here is what DW AHB DMA IP-core databook says > > > > regarding the port suspension (DMA-transfer pause) implementation in the > > > > controller: > > > > > > > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is > > > > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not > > > > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may > > > > still be data in the channel FIFO, but not enough to form a single > > > > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is > > > > disabled, the remaining data in the channel FIFO is not transferred to the > > > > destination peripheral." > > > > > > > > So in case if the port gets to be suspended and then disabled it's > > > > possible to have the data silently discarded even though the controller > > > > reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped > > > > data already received from the source device. This looks as if the data > > > > somehow got lost on a way from the peripheral device to memory and causes > > > > problems for instance in the DW APB UART driver, which pauses and disables > > > > the DMA-transfer as soon as the recv data timeout happens. Here is the way > > > > it looks: > > > > > > > > Memory <------- DMA FIFO <------ UART FIFO <---------------- UART > > > > DST_TR_WIDTH -+--------| | | > > > > | | | | No more data > > > > Current lvl -+--------| |---------+- DMA-burst lvl > > > > | | |---------+- Leftover data > > > > | | |---------+- SRC_TR_WIDTH > > > > -+--------+-------+---------+ > > > > > > > > In the example above: no more data is getting received over the UART port > > > > and BLOCK_TS is not even close to be fully received; some data is left in > > > > the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA > > > > FIFO; some data is left in the DMA FIFO, but not enough to be passed > > > > further to the system memory in a single transfer. In this situation the > > > > 8250 UART driver catches the recv timeout interrupt, pauses the > > > > DMA-transfer and terminates it completely, after which the IRQ handler > > > > manually fetches the leftover data from the UART FIFO into the > > > > recv-buffer. But since the DMA-channel has been disabled with the data > > > > left in the DMA FIFO, that data will be just discarded and the recv-buffer > > > > will have a gap of the "current lvl" size in the recv-buffer at the tail > > > > of the lately received data portion. So the data will be lost just due to > > > > the misconfigured DMA transfer. > > > > > > > > Note this is only relevant for the case of the transfer suspension and > > > > _disabling_. No problem will happen if the transfer will be re-enabled > > > > afterwards or the block transfer is fully completed. In the later case the > > > > "FIFO flush mode" will be executed at the transfer final stage in order to > > > > push out the data left in the DMA FIFO. > > > > > > > > In order to fix the denoted problem the DW AHB DMA-engine driver needs to > > > > make sure that the _bursted_ source transfer width is greater or equal to > > > > the single destination transfer (note the HW databook describes more > > > > strict constraint than actually required). Since the peripheral-device > > > > side is prescribed by the client driver logic, the memory-side can be only > > > > used for that. The solution can be easily implemented for the DEV_TO_MEM > > > > transfers just by adjusting the memory-channel address width. Sadly it's > > > > not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size > > > > is normally dynamically determined by the controller. So the only thing > > > > that can be done is to make sure that memory-side address width can be > > > > greater than the peripheral device address width. > > ... > > > > > +static int dwc_verify_m_buswidth(struct dma_chan *chan) > > > > +{ > > > > + struct dw_dma_chan *dwc = to_dw_dma_chan(chan); > > > > + struct dw_dma *dw = to_dw_dma(chan->device); > > > > + u32 reg_width, reg_burst, mem_width; > > > > + > > > > + mem_width = dw->pdata->data_width[dwc->dws.m_master]; > > > > + > > > > + /* Make sure src and dst word widths are coherent */ > > > > + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) { > > > > + reg_width = dwc->dma_sconfig.dst_addr_width; > > > > + if (mem_width < reg_width) > > > > + return -EINVAL; > > > > + > > > > + dwc->dma_sconfig.src_addr_width = mem_width; > > > > + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) { > > > > + reg_width = dwc->dma_sconfig.src_addr_width; > > > > + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst); > > > > + > > > > + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst); > > > > > > > > I understand the desire to go this way, but wouldn't be better to have > > > a symmetrical check and return an error? > > > > Sadly the situation isn't symmetrical. > > > > The main idea of the solution proposed in this patch is to make sure > > that the DMA transactions would fill in the DMA FIFO in a way so in > > case of the suspension all the data would be delivered to the > > destination with nothing left in the DMA FIFO and the CFGx.FIFO_EMPTY > > flag would mean the real FIFO emptiness. It can be reached only if > > (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE) >= CTLx.DST_TR_WIDTH > > (calculated in the real values of course). But CTLx.SRC_MSIZE is only > > relevant for the flow-control/non-memory peripherals. Thus the > > Oh, if it involves flow control, shouldn't you check for that as well? > We have a (PPC? IIRC) hardware with this IP that can have peripheral > as flow control. Oops. Sorry, I was wrong in saying that the peripheral is a flow-controller. All my cases imply DMAC as flow-controller. The correct sentence would be "But CTLx.SRC_MSIZE is only relevant for the non-memory peripherals." > > > conditions under which the problem can be avoided are: > > > > DMA_MEM_TO_DEV: CTLx.SRC_TR_WIDTH >= CTLx.DST_TR_WIDTH > > DMA_DEV_TO_MEM: CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH > > > > In both cases the non-memory peripheral side parameters (DEV-side) > > can't be changed because they are selected by the client drivers based > > on their specific logic (Device FIFO depth, watermarks, CSR widths, > > etc). But we can vary the memory-side transfer width as long as it's > > within the permitted limits. > > > > In case of the DMA_MEM_TO_DEV transfers we can change the > > CTLx.SRC_TR_WIDTH because it represents the memory side transfer > > width. But if the maximum memory transfer width is smaller than the > > specified destination register width, there is nothing we can do. Thus > > returning the EINVAL error. Note this is mainly a hypothetical > > situation since normally the max width of the memory master xfers is > > greater than the peripheral master xfer max width (in my case it's 128 > > and 32 bits respectively). > > > > In case of the DMA_DEV_TO_MEM transfers we can change the CTLx.DST_TR_WIDTH > > parameter because it's the memory side. Thus if the maximum > > memory transfer width is smaller than the bursted source transfer, > > then we can stick to the maximum memory transfer width. But if it's > > greater than the bursted source transfer, we can freely reduce it > > so to support the safe suspension+disable DMA-usage pattern. > > > > > > > > > + } > > > > + > > > > + return 0; > > > > +} > > > > > > > > IIRC MEM side of the DMA channel will ignore those in HW, so basically you are > > > (re-)using them purely for the __ffs() corrections. > > > > No. DMAC ignores the _burst length_ parameters CTLx.SRC_MSIZE and > > CTLx.DEST_MSIZE for the memory side (also see my comment above): > > > > "The CTLx.SRC_MSIZE and CTLx.DEST_MSIZE are properties valid only for > > peripherals with a handshaking interface; they cannot be used for > > defining the burst length for memory peripherals. > > > > When the peripherals are memory, the DW_ahb_dmac is always the flow > > controller and uses DMA transfers to move blocks; thus the > > CTLx.SRC_MSIZE and CTLx.DEST_MSIZE values are not used for memory > > peripherals. The SRC_MSIZE/DEST_MSIZE limitations are used to > > accommodate devices that have limited resources, such as a FIFO. > > Memory does not normally have limitations similar to the FIFOs." > > > > In my case the problem is in the CTLx.SRC_TR_WIDTH and > > CTLx.DST_TR_WIDTH values misconfiguration. Here is the crucial comment > > in the HW-manual about that (cited in the commit messages): > > > > "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is > > high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not > > permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may > > still be data in the channel FIFO, but not enough to form a single > > transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is > > disabled, the remaining data in the channel FIFO is not transferred to the > > destination peripheral." > > > > See Chapter 7.7 "Disabling a Channel Prior to Transfer Completion" of > > the DW DMAC HW manual for more details. > > Got it. Maybe a little summary in the code to explain all this magic? Will it be enough to add something like this: /* * It's possible to have a data portion locked in the DMA FIFO in case * of the channel suspension. Subsequent channel disabling will cause * that data silent loss. In order to prevent that maintain the src * and dst transfer widths coherency by means of the relation: * (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH) */ ? -Serge(y) > > ... > > > > > dwc->dma_sconfig.src_maxburst = > > > > - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst); > > > > + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst); > > > > dwc->dma_sconfig.dst_maxburst = > > > > - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst); > > > > + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst); > > -- > With Best Regards, > Andy Shevchenko > >
On Wed, Apr 17, 2024 at 09:52:42PM +0300, Serge Semin wrote: > On Wed, Apr 17, 2024 at 08:28:06PM +0300, Andy Shevchenko wrote: > > On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote: ... > > Got it. Maybe a little summary in the code to explain all this magic? > > Will it be enough to add something like this: > /* > * It's possible to have a data portion locked in the DMA FIFO in case > * of the channel suspension. Subsequent channel disabling will cause > * that data silent loss. In order to prevent that maintain the src > * and dst transfer widths coherency by means of the relation: > * (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH) > */ Yes, and you may add something like "Look for the details in the commit message that brings this change." at the end of it.
On Thu, Apr 18, 2024 at 12:37:09PM +0300, Andy Shevchenko wrote: > On Wed, Apr 17, 2024 at 09:52:42PM +0300, Serge Semin wrote: > > On Wed, Apr 17, 2024 at 08:28:06PM +0300, Andy Shevchenko wrote: > > > On Wed, Apr 17, 2024 at 08:13:59PM +0300, Serge Semin wrote: > > ... > > > > Got it. Maybe a little summary in the code to explain all this magic? > > > > Will it be enough to add something like this: > > /* > > * It's possible to have a data portion locked in the DMA FIFO in case > > * of the channel suspension. Subsequent channel disabling will cause > > * that data silent loss. In order to prevent that maintain the src > > * and dst transfer widths coherency by means of the relation: > > * (CTLx.SRC_TR_WIDTH * CTLx.SRC_MSIZE >= CTLx.DST_TR_WIDTH) > > */ > > Yes, and you may add something like > "Look for the details in the commit message that brings this change." > at the end of it. Agreed. Thanks. -Serge(y) > > -- > With Best Regards, > Andy Shevchenko > >
diff --git a/drivers/dma/dw/core.c b/drivers/dma/dw/core.c index c297ca9d5706..61e026310dd8 100644 --- a/drivers/dma/dw/core.c +++ b/drivers/dma/dw/core.c @@ -622,12 +622,10 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, struct dw_desc *prev; struct dw_desc *first; u32 ctllo, ctlhi; - u8 m_master = dwc->dws.m_master; - u8 lms = DWC_LLP_LMS(m_master); + u8 lms = DWC_LLP_LMS(dwc->dws.m_master); dma_addr_t reg; unsigned int reg_width; unsigned int mem_width; - unsigned int data_width = dw->pdata->data_width[m_master]; unsigned int i; struct scatterlist *sg; size_t total_len = 0; @@ -661,7 +659,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, mem = sg_dma_address(sg); len = sg_dma_len(sg); - mem_width = __ffs(data_width | mem | len); + mem_width = __ffs(sconfig->src_addr_width | mem | len); slave_sg_todev_fill_desc: desc = dwc_desc_get(dwc); @@ -721,7 +719,7 @@ dwc_prep_slave_sg(struct dma_chan *chan, struct scatterlist *sgl, lli_write(desc, sar, reg); lli_write(desc, dar, mem); lli_write(desc, ctlhi, ctlhi); - mem_width = __ffs(data_width | mem); + mem_width = __ffs(sconfig->dst_addr_width | mem); lli_write(desc, ctllo, ctllo | DWC_CTLL_DST_WIDTH(mem_width)); desc->len = dlen; @@ -813,6 +811,31 @@ static int dwc_verify_p_buswidth(struct dma_chan *chan) return 0; } +static int dwc_verify_m_buswidth(struct dma_chan *chan) +{ + struct dw_dma_chan *dwc = to_dw_dma_chan(chan); + struct dw_dma *dw = to_dw_dma(chan->device); + u32 reg_width, reg_burst, mem_width; + + mem_width = dw->pdata->data_width[dwc->dws.m_master]; + + /* Make sure src and dst word widths are coherent */ + if (dwc->dma_sconfig.direction == DMA_MEM_TO_DEV) { + reg_width = dwc->dma_sconfig.dst_addr_width; + if (mem_width < reg_width) + return -EINVAL; + + dwc->dma_sconfig.src_addr_width = mem_width; + } else if (dwc->dma_sconfig.direction == DMA_DEV_TO_MEM) { + reg_width = dwc->dma_sconfig.src_addr_width; + reg_burst = rounddown_pow_of_two(dwc->dma_sconfig.src_maxburst); + + dwc->dma_sconfig.dst_addr_width = min(mem_width, reg_width * reg_burst); + } + + return 0; +} + static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig) { struct dw_dma_chan *dwc = to_dw_dma_chan(chan); @@ -822,14 +845,18 @@ static int dwc_config(struct dma_chan *chan, struct dma_slave_config *sconfig) memcpy(&dwc->dma_sconfig, sconfig, sizeof(*sconfig)); dwc->dma_sconfig.src_maxburst = - clamp(dwc->dma_sconfig.src_maxburst, 0U, dwc->max_burst); + clamp(dwc->dma_sconfig.src_maxburst, 1U, dwc->max_burst); dwc->dma_sconfig.dst_maxburst = - clamp(dwc->dma_sconfig.dst_maxburst, 0U, dwc->max_burst); + clamp(dwc->dma_sconfig.dst_maxburst, 1U, dwc->max_burst); err = dwc_verify_p_buswidth(chan); if (err) return err; + err = dwc_verify_m_buswidth(chan); + if (err) + return err; + dw->encode_maxburst(dwc, &dwc->dma_sconfig.src_maxburst); dw->encode_maxburst(dwc, &dwc->dma_sconfig.dst_maxburst);
Currently in case of the DEV_TO_MEM or MEM_TO_DEV DMA transfers the memory data width (single transfer width) is determined based on the buffer length, buffer base address or DMA master-channel max address width capability. It isn't enough in case of the channel disabling prior the block transfer is finished. Here is what DW AHB DMA IP-core databook says regarding the port suspension (DMA-transfer pause) implementation in the controller: "When CTLx.SRC_TR_WIDTH < CTLx.DST_TR_WIDTH and the CFGx.CH_SUSP bit is high, the CFGx.FIFO_EMPTY is asserted once the contents of the FIFO do not permit a single word of CTLx.DST_TR_WIDTH to be formed. However, there may still be data in the channel FIFO, but not enough to form a single transfer of CTLx.DST_TR_WIDTH. In this scenario, once the channel is disabled, the remaining data in the channel FIFO is not transferred to the destination peripheral." So in case if the port gets to be suspended and then disabled it's possible to have the data silently discarded even though the controller reported that FIFO is empty and the CTLx.BLOCK_TS indicated the dropped data already received from the source device. This looks as if the data somehow got lost on a way from the peripheral device to memory and causes problems for instance in the DW APB UART driver, which pauses and disables the DMA-transfer as soon as the recv data timeout happens. Here is the way it looks: Memory <------- DMA FIFO <------ UART FIFO <---------------- UART DST_TR_WIDTH -+--------| | | | | | | No more data Current lvl -+--------| |---------+- DMA-burst lvl | | |---------+- Leftover data | | |---------+- SRC_TR_WIDTH -+--------+-------+---------+ In the example above: no more data is getting received over the UART port and BLOCK_TS is not even close to be fully received; some data is left in the UART FIFO, but not enough to perform a bursted DMA-xfer to the DMA FIFO; some data is left in the DMA FIFO, but not enough to be passed further to the system memory in a single transfer. In this situation the 8250 UART driver catches the recv timeout interrupt, pauses the DMA-transfer and terminates it completely, after which the IRQ handler manually fetches the leftover data from the UART FIFO into the recv-buffer. But since the DMA-channel has been disabled with the data left in the DMA FIFO, that data will be just discarded and the recv-buffer will have a gap of the "current lvl" size in the recv-buffer at the tail of the lately received data portion. So the data will be lost just due to the misconfigured DMA transfer. Note this is only relevant for the case of the transfer suspension and _disabling_. No problem will happen if the transfer will be re-enabled afterwards or the block transfer is fully completed. In the later case the "FIFO flush mode" will be executed at the transfer final stage in order to push out the data left in the DMA FIFO. In order to fix the denoted problem the DW AHB DMA-engine driver needs to make sure that the _bursted_ source transfer width is greater or equal to the single destination transfer (note the HW databook describes more strict constraint than actually required). Since the peripheral-device side is prescribed by the client driver logic, the memory-side can be only used for that. The solution can be easily implemented for the DEV_TO_MEM transfers just by adjusting the memory-channel address width. Sadly it's not that easy for the MEM_TO_DEV transfers since the mem-to-dma burst size is normally dynamically determined by the controller. So the only thing that can be done is to make sure that memory-side address width can be greater than the peripheral device address width. Fixes: a09820043c9e ("dw_dmac: autoconfigure data_width or get it via platform data") Signed-off-by: Serge Semin <fancer.lancer@gmail.com> --- drivers/dma/dw/core.c | 41 ++++++++++++++++++++++++++++++++++------- 1 file changed, 34 insertions(+), 7 deletions(-)