Message ID | 20241021084053.2443545-2-andriy.shevchenko@linux.intel.com (mailing list archive) |
---|---|
State | Changes Requested, archived |
Series | platform/x86: intel_scu_ipc: Avoid working around IO and cleanups |

On Mon, Oct 21, 2024 at 11:38:51AM +0300, Andy Shevchenko wrote:
> The theory is that the so called workaround in pwr_reg_rdwr() is
> the actual reader of the data in 32-bit chunks. For some reason
> the 8-bit IO won't fail after that. Replace the workaround by using
> 32-bit IO explicitly and then memcpy() as much data as was requested
> by the user. The same approach is already in use in
> intel_scu_ipc_dev_command_with_size().
>
> Tested-by: Ferry Toth <fntoth@gmail.com>
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>

On Mon, 21 Oct 2024, Andy Shevchenko wrote:

> The theory is that the so called workaround in pwr_reg_rdwr() is
> the actual reader of the data in 32-bit chunks. For some reason
> the 8-bit IO won't fail after that. Replace the workaround by using
> 32-bit IO explicitly and then memcpy() as much data as was requested
> by the user. The same approach is already in use in
> intel_scu_ipc_dev_command_with_size().
>
> Tested-by: Ferry Toth <fntoth@gmail.com>
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> ---
>  drivers/platform/x86/intel_scu_ipc.c | 15 ++++-----------
>  1 file changed, 4 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
> index 5b16d29c93d7..290b38627542 100644
> --- a/drivers/platform/x86/intel_scu_ipc.c
> +++ b/drivers/platform/x86/intel_scu_ipc.c
> @@ -217,12 +217,6 @@ static inline u8 ipc_read_status(struct intel_scu_ipc_dev *scu)
>  	return __raw_readl(scu->ipc_base + IPC_STATUS);
>  }
>
> -/* Read ipc byte data */
> -static inline u8 ipc_data_readb(struct intel_scu_ipc_dev *scu, u32 offset)
> -{
> -	return readb(scu->ipc_base + IPC_READ_BUFFER + offset);
> -}
> -
>  /* Read ipc u32 data */
>  static inline u32 ipc_data_readl(struct intel_scu_ipc_dev *scu, u32 offset)
>  {
> @@ -325,11 +319,10 @@ static int pwr_reg_rdwr(struct intel_scu_ipc_dev *scu, u16 *addr, u8 *data,
>  	}
>
>  	err = intel_scu_ipc_check_status(scu);
> -	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
> -		/* Workaround: values are read as 0 without memcpy_fromio */
> -		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
> -		for (nc = 0; nc < count; nc++)
> -			data[nc] = ipc_data_readb(scu, nc);
> +	if (!err) { /* Read rbuf */

What is the reason for the removal of that id check? This seems a clear
logic change but why? And if you want to remove that check, what
that comment then means?

> +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> +			wbuf[nc] = ipc_data_readl(scu, offset);
> +		memcpy(data, wbuf, count);

So do we actually need to read more than
DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach
used in intel_scu_ipc_dev_command_with_size() which you referred to.

>  	}
>  	mutex_unlock(&ipclock);
>  	return err;

FYI (unrelated to this patch), there seems to be some open-coded
FIELD_PREP()s in pwr_reg_rdwr(), some of which is common code between
those if branches too.

On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> On Mon, 21 Oct 2024, Andy Shevchenko wrote:
>
> > The theory is that the so called workaround in pwr_reg_rdwr() is
> > the actual reader of the data in 32-bit chunks. For some reason
> > the 8-bit IO won't fail after that. Replace the workaround by using
> > 32-bit IO explicitly and then memcpy() as much data as was requested
> > by the user. The same approach is already in use in
> > intel_scu_ipc_dev_command_with_size().

...

> >  	err = intel_scu_ipc_check_status(scu);
> > -	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
> > -		/* Workaround: values are read as 0 without memcpy_fromio */
> > -		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
> > -		for (nc = 0; nc < count; nc++)
> > -			data[nc] = ipc_data_readb(scu, nc);
> > +	if (!err) { /* Read rbuf */
>
> What is the reason for the removal of that id check? This seems a clear
> logic change but why? And if you want to remove that check, what
> that comment then means?

Let me split this to a separate change with better explanation then.

> > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > +		memcpy(data, wbuf, count);
>
> So do we actually need to read more than
> DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach
> used in intel_scu_ipc_dev_command_with_size() which you referred to.

I'm not sure I follow. We do IO for whole (16-bytes) buffer, but return only
asked _bytes_ to the user.

> >  	}
> >  	mutex_unlock(&ipclock);
> >  	return err;
>
> FYI (unrelated to this patch), there seems to be some open-coded
> FIELD_PREP()s in pwr_reg_rdwr(), some of which is common code between
> those if branches too.

This code is quite old and full of tricks that has to be tested. So, yes
while it's possible to convert, I would like to do it in a small (baby)
steps. This series is already quite intrusive from this perspective :-)

On Mon, 21 Oct 2024, Andy Shevchenko wrote:

> On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> >
> > > The theory is that the so called workaround in pwr_reg_rdwr() is
> > > the actual reader of the data in 32-bit chunks. For some reason
> > > the 8-bit IO won't fail after that. Replace the workaround by using
> > > 32-bit IO explicitly and then memcpy() as much data as was requested
> > > by the user. The same approach is already in use in
> > > intel_scu_ipc_dev_command_with_size().
>
> ...
>
> > >  	err = intel_scu_ipc_check_status(scu);
> > > -	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
> > > -		/* Workaround: values are read as 0 without memcpy_fromio */
> > > -		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
> > > -		for (nc = 0; nc < count; nc++)
> > > -			data[nc] = ipc_data_readb(scu, nc);
> > > +	if (!err) { /* Read rbuf */
> >
> > What is the reason for the removal of that id check? This seems a clear
> > logic change but why? And if you want to remove that check, what
> > that comment then means?
>
> Let me split this to a separate change with better explanation then.
>
> > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > +		memcpy(data, wbuf, count);
> >
> > So do we actually need to read more than
> > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach
> > used in intel_scu_ipc_dev_command_with_size() which you referred to.
>
> I'm not sure I follow. We do IO for whole (16-bytes) buffer, but return only
> asked _bytes_ to the user.

So always reading 16 bytes is not part of the old workaround? Because it
has a "lets read enough" feel.

> > >  	}
> > >  	mutex_unlock(&ipclock);
> > >  	return err;
> >
> > FYI (unrelated to this patch), there seems to be some open-coded
> > FIELD_PREP()s in pwr_reg_rdwr(), some of which is common code between
> > those if branches too.
>
> This code is quite old and full of tricks that has to be tested. So, yes
> while it's possible to convert, I would like to do it in a small (baby)
> steps. This series is already quite intrusive from this perspective :-)

Yeah, no pressure, I just noted down what I saw. :-)
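
For readers unfamiliar with the FIELD_PREP() remark, the following user-space sketch contrasts an open-coded command word with a mask-based one. Everything here is an assumption made for illustration: `MY_GENMASK()` and `MY_FIELD_PREP()` are simplified stand-ins for the kernel's GENMASK() and FIELD_PREP(), and the `EX_CMD_*` field layout is hypothetical, not taken from pwr_reg_rdwr().

```c
#include <assert.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel macros (assumption). */
#define MY_GENMASK(h, l) \
	(((~UINT32_C(0)) >> (31 - (h))) & ((~UINT32_C(0)) << (l)))
/* Multiply by the lowest set bit of the mask, i.e. shift into place. */
#define MY_FIELD_PREP(mask, val) \
	(((uint32_t)(val) * ((mask) & (~(mask) + 1))) & (mask))

/* Hypothetical command word layout, for illustration only. */
#define EX_CMD_OP	MY_GENMASK(7, 0)
#define EX_CMD_ID	MY_GENMASK(15, 12)
#define EX_CMD_SIZE	MY_GENMASK(23, 16)

/* Open-coded variant: the field positions are hidden in magic shifts. */
static uint32_t cmd_open_coded(uint32_t size, uint32_t id, uint32_t op)
{
	return size << 16 | id << 12 | op;
}

/* Mask-based variant: each field is named and bounded by its mask. */
static uint32_t cmd_field_prep(uint32_t size, uint32_t id, uint32_t op)
{
	/* Runtime checks standing in for the kernel's compile-time ones. */
	assert(size <= 0xff && id <= 0xf && op <= 0xff);

	return MY_FIELD_PREP(EX_CMD_SIZE, size) |
	       MY_FIELD_PREP(EX_CMD_ID, id) |
	       MY_FIELD_PREP(EX_CMD_OP, op);
}
```

Both helpers build the same word for in-range values; the mask-based form mainly documents the layout and keeps a value from spilling into a neighbouring field.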

On Mon, Oct 21, 2024 at 12:49:08PM +0300, Ilpo Järvinen wrote:
> On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> > On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:

...

> > > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > > +		memcpy(data, wbuf, count);
> > >
> > > So do we actually need to read more than
> > > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach
> > > used in intel_scu_ipc_dev_command_with_size() which you referred to.
> >
> > I'm not sure I follow. We do IO for whole (16-bytes) buffer, but return only
> > asked _bytes_ to the user.
>
> So always reading 16 bytes is not part of the old workaround? Because it
> has a "lets read enough" feel.

Ah, now I got it! Yes, we may reduce the reads to just needed ones.
The idea is that we always have to perform 32-bit reads independently
on the amount of data we want.

> > > >  	}
> > > >  	mutex_unlock(&ipclock);
> > > >  	return err;
> > >
> > > FYI (unrelated to this patch), there seems to be some open-coded
> > > FIELD_PREP()s in pwr_reg_rdwr(), some of which is common code between
> > > those if branches too.
> >
> > This code is quite old and full of tricks that has to be tested. So, yes
> > while it's possible to convert, I would like to do it in a small (baby)
> > steps. This series is already quite intrusive from this perspective :-)
>
> Yeah, no pressure, I just noted down what I saw. :-)

Thanks, I will keep this.
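
The reduction discussed here (issue only as many 32-bit reads as cover the requested bytes, then memcpy() the requested byte count) can be sketched in plain C. This is a minimal user-space illustration under stated assumptions, not the driver code: `fake_rbuf`, `fake_reads`, `fake_data_readl()` and `read_rbuf()` are hypothetical names, and the array stands in for the MMIO read buffer that ipc_data_readl() would access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_BYTES		16U	/* read buffer size, per the thread */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Hypothetical stand-in for the SCU read buffer (real code uses MMIO). */
static uint32_t fake_rbuf[RBUF_BYTES / sizeof(uint32_t)];
/* Counts 32-bit accesses so the effect of the reduction is visible. */
static unsigned int fake_reads;

/* Hypothetical stand-in for ipc_data_readl(): one 32-bit read. */
static uint32_t fake_data_readl(uint32_t offset)
{
	fake_reads++;
	return fake_rbuf[offset / sizeof(uint32_t)];
}

/* Copy up to 'count' bytes into 'data' using only 32-bit reads. */
static void read_rbuf(uint8_t *data, size_t count)
{
	uint32_t wbuf[RBUF_BYTES / sizeof(uint32_t)] = { 0 };
	/* min(count, 16U): never read past the end of the buffer. */
	size_t nbytes = count < RBUF_BYTES ? count : RBUF_BYTES;
	/* Number of whole 32-bit words needed to cover 'nbytes'. */
	size_t nwords = DIV_ROUND_UP(nbytes, sizeof(uint32_t));
	size_t nc;

	assert(data != NULL);

	for (nc = 0; nc < nwords; nc++)
		wbuf[nc] = fake_data_readl((uint32_t)(nc * sizeof(uint32_t)));
	memcpy(data, wbuf, nbytes);
}
```

With this shape a 5-byte request costs two 32-bit reads instead of four, while every access to the buffer stays 32 bits wide, which is the property the patch description relies on.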

On Mon, Oct 21, 2024 at 12:54:16PM +0300, Andy Shevchenko wrote:
> On Mon, Oct 21, 2024 at 12:49:08PM +0300, Ilpo Järvinen wrote:
> > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> > > On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:

...

> > > > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > > > +		memcpy(data, wbuf, count);
> > > >
> > > > So do we actually need to read more than
> > > > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach
> > > > used in intel_scu_ipc_dev_command_with_size() which you referred to.
> > >
> > > I'm not sure I follow. We do IO for whole (16-bytes) buffer, but return only
> > > asked _bytes_ to the user.
> >
> > So always reading 16 bytes is not part of the old workaround? Because it
> > has a "lets read enough" feel.
>
> Ah, now I got it! Yes, we may reduce the reads to just needed ones.
> The idea is that we always have to perform 32-bit reads independently
> on the amount of data we want.

Oh, looking at the code (*) it seems they are really messed up in the original
with bytes vs. 32-bit words! Since the above has been tested, let me put this
on TODO list to clarify this mess and run with another testing.

Sounds good to you?

*) the mythical comment about max 5 items for 20-byte buffer is worrying and
now I know why,

On Mon, 21 Oct 2024, Andy Shevchenko wrote:

> On Mon, Oct 21, 2024 at 12:54:16PM +0300, Andy Shevchenko wrote:
> > On Mon, Oct 21, 2024 at 12:49:08PM +0300, Ilpo Järvinen wrote:
> > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
> > > > On Mon, Oct 21, 2024 at 12:24:57PM +0300, Ilpo Järvinen wrote:
> > > > > On Mon, 21 Oct 2024, Andy Shevchenko wrote:
>
> ...
>
> > > > > > +		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
> > > > > > +			wbuf[nc] = ipc_data_readl(scu, offset);
> > > > > > +		memcpy(data, wbuf, count);
> > > > >
> > > > > So do we actually need to read more than
> > > > > DIV_ROUND_UP(min(count, 16U), sizeof(u32))? Because that's the approach
> > > > > used in intel_scu_ipc_dev_command_with_size() which you referred to.
> > > >
> > > > I'm not sure I follow. We do IO for whole (16-bytes) buffer, but return only
> > > > asked _bytes_ to the user.
> > >
> > > So always reading 16 bytes is not part of the old workaround? Because it
> > > has a "lets read enough" feel.
> >
> > Ah, now I got it! Yes, we may reduce the reads to just needed ones.
> > The idea is that we always have to perform 32-bit reads independently
> > on the amount of data we want.
>
> Oh, looking at the code (*) it seems they are really messed up in the original
> with bytes vs. 32-bit words! Since the above has been tested, let me put this
> on TODO list to clarify this mess and run with another testing.
>
> Sounds good to you?

Sure, I'm fine with taking the careful approach.

> *) the mythical comment about max 5 items for 20-byte buffer is worrying and
> now I know why,

Those functions with that comment seem to only be called from
scu_reg_access() which error checks count > 4.

diff --git a/drivers/platform/x86/intel_scu_ipc.c b/drivers/platform/x86/intel_scu_ipc.c
index 5b16d29c93d7..290b38627542 100644
--- a/drivers/platform/x86/intel_scu_ipc.c
+++ b/drivers/platform/x86/intel_scu_ipc.c
@@ -217,12 +217,6 @@ static inline u8 ipc_read_status(struct intel_scu_ipc_dev *scu)
 	return __raw_readl(scu->ipc_base + IPC_STATUS);
 }
 
-/* Read ipc byte data */
-static inline u8 ipc_data_readb(struct intel_scu_ipc_dev *scu, u32 offset)
-{
-	return readb(scu->ipc_base + IPC_READ_BUFFER + offset);
-}
-
 /* Read ipc u32 data */
 static inline u32 ipc_data_readl(struct intel_scu_ipc_dev *scu, u32 offset)
 {
@@ -325,11 +319,10 @@ static int pwr_reg_rdwr(struct intel_scu_ipc_dev *scu, u16 *addr, u8 *data,
 	}
 
 	err = intel_scu_ipc_check_status(scu);
-	if (!err && id == IPC_CMD_PCNTRL_R) { /* Read rbuf */
-		/* Workaround: values are read as 0 without memcpy_fromio */
-		memcpy_fromio(cbuf, scu->ipc_base + 0x90, 16);
-		for (nc = 0; nc < count; nc++)
-			data[nc] = ipc_data_readb(scu, nc);
+	if (!err) { /* Read rbuf */
+		for (nc = 0, offset = 0; nc < 4; nc++, offset += 4)
+			wbuf[nc] = ipc_data_readl(scu, offset);
+		memcpy(data, wbuf, count);
 	}
 	mutex_unlock(&ipclock);
 	return err;