Message ID | 20231117-fix-cdat-cs-v2-1-715399976d4d@intel.com |
---|---|
State | New, archived |
Series | cxl/cdat: Fixes for CXL CDAT processing |
On Wed, Nov 29, 2023 at 05:33:03PM -0800, Ira Weiny wrote:
> The callback for building CDAT tables may return negative error codes.
> This was previously unhandled and will result in potentially huge
> allocations later on in ct3_build_cdat()
>
> Detect the negative error code and defer cdat building.
>
> Fixes: f5ee7413d592 ("hw/mem/cxl-type3: Add CXL CDAT Data Object Exchange")
> Cc: Huai-Cheng Kuo <hchkuo@avery-design.com.tw>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
>  hw/cxl/cxl-cdat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c
> index 639a2db3e17b..24829cf2428d 100644
> --- a/hw/cxl/cxl-cdat.c
> +++ b/hw/cxl/cxl-cdat.c
> @@ -63,7 +63,7 @@ static void ct3_build_cdat(CDATObject *cdat, Error **errp)
>      cdat->built_buf_len = cdat->build_cdat_table(&cdat->built_buf,
>                                                   cdat->private);
>
> -    if (!cdat->built_buf_len) {
> +    if (cdat->built_buf_len <= 0) {
>          /* Build later as not all data available yet */
>          cdat->to_update = true;
>          return;
>

The fix looks good to me. Just curious how to really build cdat table
again when an error occurs, for example, the memory allocation fails.

Fan

> --
> 2.42.0
>
fan wrote:
> On Wed, Nov 29, 2023 at 05:33:03PM -0800, Ira Weiny wrote:

[snip]

> The fix looks good to me. Just curious how to really build cdat table
> again when an error occurs, for example, the memory allocation fails.

I did not go that far as I am unsure as well.

Ira
On Wed, 20 Dec 2023 11:55:33 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> fan wrote:
> > On Wed, Nov 29, 2023 at 05:33:03PM -0800, Ira Weiny wrote:

[snip]

> > The fix looks good to me. Just curious how to really build cdat table
> > again when an error occurs, for example, the memory allocation fails.
>
> I did not go that far as I am unsure as well.

Memory allocations in qemu don't fail (well if they do it crashes)
Side effect of using glib which makes for simpler cases.
https://docs.gtk.org/glib/func.malloc.html

There shouldn't even be any checks :( I'll fix that up at some point
across all the CXL emulation. Sometimes reviewers noticed and
we dropped it at earlier stages, but clearly didn't catch them all.

Which come to think of it is why this error condition is in practice
not actually buggy as the code won't ever manage to return -ENOMEM and
I don't think there are other error codes.

Jonathan

>
> Ira
>
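For readers unfamiliar with the abort-on-failure allocation style Jonathan describes, here is a minimal sketch of why callers of such an allocator have no -ENOMEM path. The names xmalloc() and make_table() are hypothetical stand-ins written for this illustration; this is not GLib's implementation and not QEMU code.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for an abort-on-failure allocator.
 * It either returns usable memory or terminates the program,
 * so it never hands NULL back to the caller.
 */
static void *xmalloc(size_t n)
{
    void *p = malloc(n ? n : 1);

    if (!p) {
        fprintf(stderr, "out of memory\n");
        abort();
    }
    return p;
}

/*
 * Hypothetical caller: with an allocator like the one above there is
 * nothing to check, so no "return -ENOMEM" branch is needed.
 */
static int *make_table(size_t entries)
{
    int *t = xmalloc(entries * sizeof(*t));

    for (size_t i = 0; i < entries; i++) {
        t[i] = (int)i;
    }
    return t;
}
```

Under this assumption, any NULL check after the allocation is dead code, which is the cleanup being discussed.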
Jonathan Cameron wrote:
> On Wed, 20 Dec 2023 11:55:33 -0800
> Ira Weiny <ira.weiny@intel.com> wrote:

[snip]

> Memory allocations in qemu don't fail (well if they do it crashes)
> Side effect of using glib which makes for simpler cases.
> https://docs.gtk.org/glib/func.malloc.html
>
> There shouldn't even be any checks :( I'll fix that up at some point
> across all the CXL emulation. Sometimes reviewers noticed and
> we dropped it at earlier stages, but clearly didn't catch them all.
>
> Which come to think of it is why this error condition is in practice
> not actually buggy as the code won't ever manage to return -ENOMEM and
> I don't think there are other error codes.

Ah. Ok but in that case I would say that build_cdat_table() should never
return < 0 to be clear at this level what can happen.

Would you like a patch for that? (/me assumes you dropped this patch)

Ira
On Mon, 8 Jan 2024 08:06:32 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> Jonathan Cameron wrote:

[snip]

> > Which come to think of it is why this error condition is in practice
> > not actually buggy as the code won't ever manage to return -ENOMEM and
> > I don't think there are other error codes.
>
> Ah. Ok but in that case I would say that build_cdat_table() should never
> return < 0 to be clear at this level what can happen.
>
> Would you like a patch for that? (/me assumes you dropped this patch)

Probably needs to first rip out all the -ENOMEM returns that got into
the CXL code in general, then tidy up the return type to be unsigned.

If you want to do that it would be welcome!

Jonathan

>
> Ira
>
On Mon, 8 Jan 2024 18:00:42 +0000
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Mon, 8 Jan 2024 08:06:32 -0800
> Ira Weiny <ira.weiny@intel.com> wrote:

[snip]

> > Ah. Ok but in that case I would say that build_cdat_table() should never
> > return < 0 to be clear at this level what can happen.
> >
> > Would you like a patch for that? (/me assumes you dropped this patch)
>
> Probably needs to first rip out all the -ENOMEM returns that got into
> the CXL code in general, then tidy up the return type to be unsigned.
>
> If you want to do that it would be welcome!

Actually. Build_cdat_table() can return errors just not for this reason.

host_memory_backend_get_memory() can fail for example.

So original patch is good
as is, just that the discussion of memory allocation failure threw me
off and should be cleaned up separately.

Jonathan

>
> Jonathan
>
> >
> > Ira
> >
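As an illustration of the point above, that the build callback can fail for reasons unrelated to memory allocation, here is a small self-contained sketch. Everything in it is hypothetical: struct fake_dev, struct fake_cdat, fake_build_table(), fake_build_cdat(), and the choice of -EINVAL are assumptions made for the example and are not taken from the QEMU sources.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state: region is NULL when no backing memory exists. */
struct fake_dev {
    const void *region;
};

/* Hypothetical, trimmed-down analogue of the CDAT object. */
struct fake_cdat {
    int built_buf_len;
    bool to_update;
};

/* Hypothetical callback: returns an entry count, or a negative errno. */
static int fake_build_table(const struct fake_dev *dev)
{
    if (!dev->region) {
        return -EINVAL; /* assumed error code for a missing backend */
    }
    return 2; /* pretend two table entries were built */
}

/* Caller applying the patched "<= 0" check from the diff. */
static void fake_build_cdat(struct fake_cdat *cdat, const struct fake_dev *dev)
{
    cdat->built_buf_len = fake_build_table(dev);
    if (cdat->built_buf_len <= 0) {
        cdat->to_update = true; /* defer the build */
        return;
    }
    cdat->to_update = false;
}
```

With a NULL region the sketch defers the build instead of continuing with a negative length; with a valid region it proceeds as before.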
Jonathan Cameron wrote:

[snip]

> > Probably needs to first rip out all the -ENOMEM returns that got into
> > the CXL code in general, then tidy up the return type to be unsigned.
> >
> > If you want to do that it would be welcome!
> Actually. Build_cdat_table() can return errors just not for this reason.
>
> host_memory_backend_get_memory() can fail for example.

I must be on a different version because I don't see that.

>
> So original patch is good
> as is, just that the discussion of memory allocation failure threw me
> off and should be cleaned up separately.
>

I did this testing on Fan's DCD version... :-/ ... probably very out of
date.

Fan do you have a newer version than your 2023-11-16 branch?

Ira
On Mon, 8 Jan 2024 18:48:48 -0800
Ira Weiny <ira.weiny@intel.com> wrote:

> Jonathan Cameron wrote:
>
> [snip]
>
> > Actually. Build_cdat_table() can return errors just not for this reason.
> >
> > host_memory_backend_get_memory() can fail for example.
>
> I must be on a different version because I don't see that.
>
> >
> > So original patch is good
> > as is, just that the discussion of memory allocation failure threw me
> > off and should be cleaned up separately.
> >
>
> I did this testing on Fan's DCD version... :-/ ... probably very out of
> date.

https://elixir.bootlin.com/qemu/latest/source/hw/mem/cxl_type3.c#L183
https://elixir.bootlin.com/qemu/v8.1.0/source/hw/mem/cxl_type3.c#L171
been there a while, but meh, too many branches floating around :)

>
> Fan do you have a newer version than your 2023-11-16 branch?
>
> Ira
>
diff --git a/hw/cxl/cxl-cdat.c b/hw/cxl/cxl-cdat.c
index 639a2db3e17b..24829cf2428d 100644
--- a/hw/cxl/cxl-cdat.c
+++ b/hw/cxl/cxl-cdat.c
@@ -63,7 +63,7 @@ static void ct3_build_cdat(CDATObject *cdat, Error **errp)
     cdat->built_buf_len = cdat->build_cdat_table(&cdat->built_buf,
                                                  cdat->private);
 
-    if (!cdat->built_buf_len) {
+    if (cdat->built_buf_len <= 0) {
         /* Build later as not all data available yet */
         cdat->to_update = true;
         return;
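
To make the one-line change concrete, the following sketch contrasts the old and new checks on a signed length and shows what a negative value becomes if it is later treated as an allocation size, which is the "potentially huge allocations" concern from the commit message. The helper names are hypothetical, and the sketch assumes the length field is a signed int; it is not the QEMU code itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Old check: only a length of exactly zero defers the build. */
static bool defers_old(int built_buf_len)
{
    return !built_buf_len;
}

/* Patched check: zero and negative error codes both defer the build. */
static bool defers_new(int built_buf_len)
{
    return built_buf_len <= 0;
}

/*
 * What a negative length turns into when converted to an unsigned
 * size, as happens if it reaches an allocation call unchecked.
 */
static size_t as_alloc_count(int built_buf_len)
{
    return (size_t)built_buf_len;
}
```

A value such as -ENOMEM is nonzero, so the old check lets it through, while the patched check defers; converted to size_t it wraps to a value close to SIZE_MAX.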