Message ID | 20221007152156.24883-5-Jonathan.Cameron@huawei.com |
---|---|
State | New, archived |
Headers | show |
Series | QEMU PCIe DOE for PCIe 4.0/5.0 and CXL 2.0 | expand |
This code contains heap corruption on free, and I think should be refactored to pre-allocate all the entries we're interested in putting into the table. This would flatten the code and simplify the error handling steps. Also, should we consider making a union with all the possible entries to make entry allocation easier? It may eat a few extra bytes of memory, but it would simplify the allocation/cleanup code here further. Given that every allocation has to be checked, i'm also not convinced the use of g_autofree is worth the potential footguns associated with it. > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c > index 568c9d62f5..3fa5d70662 100644 > --- a/hw/mem/cxl_type3.c > +++ b/hw/mem/cxl_type3.c > @@ -12,9 +12,218 @@ > +static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, > + void *priv) > +{ (snip) > + /* For now, no memory side cache, plausiblish numbers */ > + dslbis_nonvolatile = g_malloc(sizeof(*dslbis_nonvolatile) * dslbis_nonvolatile_num); > + if (!dslbis_nonvolatile) > + return -ENOMEM; this allocation creates a table of entries, which is later freed incorrectly > + > + *cdat_table = g_malloc0(len * sizeof(*cdat_table)); this allocation needs to be checked > + /* Header always at start of structure */ > + if (dsmas_nonvolatile) { > + (*cdat_table)[i++] = g_steal_pointer(&dsmas_nonvolatile); > + } > + if (dslbis_nonvolatile) { > + CDATDslbis *dslbis = g_steal_pointer(&dslbis_nonvolatile); using a local reference used to avoid a g_autofree footgun suggests we should not use g_autofree here, and possibly reconsider the overall strategy for allocation and cleanup > + int j; > + > + for (j = 0; j < dslbis_nonvolatile_num; j++) { > + (*cdat_table)[i++] = (CDATSubHeader *)&dslbis[j]; > + } this fills the CDAT table with sub-references to the table allocated above, which leads to heap corruption with the current code, or complicated cleanup if we decide to keep it > + > + return len; > +} > + > +static void ct3_free_cdat_table(CDATSubHeader **cdat_table, int num, void *priv) > +{ > + int i; > + And here we free every entry of the table, which can/will cause heap corruption when the sub-table entries are freed > + for (i = 0; i < num; i++) { > + g_free(cdat_table[i]); > + } > + g_free(cdat_table); > +}
Included in this response is a recommended patch set on top of this patch that resolves a number of issues, including style and a heap corruption bug. The purpose of this patch set is to refactor the CDAT initialization code to support future patch sets that will introduce multi-region support in CXL Type3 devices. 1) Checkpatch errors in the immediately prior patch 2) Flatting of code in cdat initialization 3) Changes in allocation and error checking for cleanliness 4) Change in the allocation/free strategy of CDAT sub-tables to simplify multi-region allocation in the future. Also resolves a heap corruption bug 5) Refactor of CDAT initialization code into a function that initializes sub-tables per memory-region. Gregory Price (5): hw/mem/cxl_type3: fix checkpatch errors hw/mem/cxl_type3: Pull validation checks ahead of functional code hw/mem/cxl_type3: CDAT pre-allocate and check resources prior to work hw/mem/cxl_type3: Change the CDAT allocation/free strategy hw/mem/cxl_type3: Refactor CDAT sub-table entry initialization into a function hw/mem/cxl_type3.c | 240 +++++++++++++++++++++++---------------------- 1 file changed, 122 insertions(+), 118 deletions(-)
On Wed, 12 Oct 2022 14:21:15 -0400 Gregory Price <gourry.memverge@gmail.com> wrote: > Included in this response is a recommended patch set on top of this > patch that resolves a number of issues, including style and a heap > corruption bug. > > The purpose of this patch set is to refactor the CDAT initialization > code to support future patch sets that will introduce multi-region > support in CXL Type3 devices. > > 1) Checkpatch errors in the immediately prior patch > 2) Flatting of code in cdat initialization > 3) Changes in allocation and error checking for cleanliness > 4) Change in the allocation/free strategy of CDAT sub-tables to simplify > multi-region allocation in the future. Also resolves a heap > corruption bug > 5) Refactor of CDAT initialization code into a function that initializes > sub-tables per memory-region. > > Gregory Price (5): > hw/mem/cxl_type3: fix checkpatch errors > hw/mem/cxl_type3: Pull validation checks ahead of functional code > hw/mem/cxl_type3: CDAT pre-allocate and check resources prior to work > hw/mem/cxl_type3: Change the CDAT allocation/free strategy > hw/mem/cxl_type3: Refactor CDAT sub-table entry initialization into a > function > > hw/mem/cxl_type3.c | 240 +++++++++++++++++++++++---------------------- > 1 file changed, 122 insertions(+), 118 deletions(-) > Thanks, I'm going to roll this stuff into the original patch set for v8. Some of this I already have (like the check patch stuff). Some I may disagree with in which case I'll reply to the patches - note I haven't looked at them in detail yet! Jonathan
On Wed, 12 Oct 2022 12:01:54 -0400 Gregory Price <gregory.price@memverge.com> wrote: > This code contains heap corruption on free, and I think should be > refactored to pre-allocate all the entries we're interested in putting > into the table. Good point on the heap corruption.. (oops. Particularly as I raised that I didn't like the complexity of your free in your previous version and still failed to notice the current code was wrong...) > This would flatten the code and simplify the error > handling steps. I'm not so keen on this. Error handling is pretty trivial because of the autofree magic. It will get a tiny bit harder once we have two calls to the factored out function, but not too bad - we just need to free the handed off pointers in reverse from wherever we got to before the error. > > Also, should we consider making a union with all the possible entries to > make entry allocation easier? It may eat a few extra bytes of memory, > but it would simplify the allocation/cleanup code here further. An interesting point, though gets trickier once we have variable numbers of elements. I'm not sure it's worth the effort to save a few lines of code. > > Given that every allocation has to be checked, i'm also not convinced > the use of g_autofree is worth the potential footguns associated with > it. After rolling a version with some of your suggested changes incorporated the autofree logic is all nice and localized so I think it's well worth having. Only slightly messy bit is we end up with 4 separate pointers for the bandwidth and latency elements.
On Wed, 12 Oct 2022 12:01:54 -0400 Gregory Price <gregory.price@memverge.com> wrote: > This code contains heap corruption on free, and I think should be > refactored to pre-allocate all the entries we're interested in putting > into the table. This would flatten the code and simplify the error > handling steps. > > Also, should we consider making a union with all the possible entries to > make entry allocation easier? It may eat a few extra bytes of memory, > but it would simplify the allocation/cleanup code here further. > > Given that every allocation has to be checked, i'm also not convinced > the use of g_autofree is worth the potential footguns associated with > it. > > > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c > > index 568c9d62f5..3fa5d70662 100644 > > --- a/hw/mem/cxl_type3.c > > +++ b/hw/mem/cxl_type3.c > > @@ -12,9 +12,218 @@ > > +static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, > > + void *priv) > > +{ > (snip) > > + /* For now, no memory side cache, plausiblish numbers */ > > + dslbis_nonvolatile = g_malloc(sizeof(*dslbis_nonvolatile) * dslbis_nonvolatile_num); > > + if (!dslbis_nonvolatile) > > + return -ENOMEM; > > this allocation creates a table of entries, which is later freed > incorrectly > > > + > > + *cdat_table = g_malloc0(len * sizeof(*cdat_table)); > > this allocation needs to be checked I just realized that sizeof should be sizeof(**cdat_table) I've moved to a local autofree pointer after factoring out the guts of the code so this gets simpler anyway (and was more obviously wrong!) Jonathan
On Thu, 13 Oct 2022 07:36:28 -0400 Gregory Price <gourry.memverge@gmail.com> wrote: > Reading through your notes, everything seems reasonable, though I'm not > sure I agree with the two pass notion, though I'll wait to see the patch > set. > > The enum is a good idea, *forehead slap*, I should have done it. If we > have a local enum, why not just make it global (within the file) and > allocate the table as I have once we know how many MRs are present? It's not global as we need the entries to be packed. So if just one mr (which ever one) the entries for that need to be at the beginning of cdat_table. I also don't want to bake into the outer caller that the entries will always be the same size for different MRs. For the two pass case... I'll send code in a few mins, but in meantime my thought is that the extended code for volatile + non volatile will looks something like: (variable names made up) if (ct3d->volatile_mem) { volatile_mr = host_memory_backend_get_memory(ct3d->volatile_mem....); if (!volatile_mr) { return -ENINVAL; } rc = ct3_build_cdat_entries_for_mr(NULL, dsmad++, volatile_mr); if (rc < 0) { return rc; } volatile_len = rc; } if (ct3d->nonvolatile_mem) { nonvolatile_mr = host_memory_backend_get_memory(ct3d->nonvolatile_mem); if (!nonvolatile_mr) { return -ENINVAL; } rc = ct3_build_cdat_entries_for_mr(NULL, dmsmad++, nonvolatile_mr....); if (rc < 0) { return rc; } nonvolatile_len = rc; } dsmad = 0; table = g_malloc(0, (volatile_len + nonvolatile_len) * sizeof(*table)); if (!table) { return -ENOMEM; } if (volatile_len) { rc = ct3_build_cdat_entries_for_mr(&table[0], dmsad++, volatile_mr....); if (rc < 0) { return rc; } } if (nonvolatile_len) { rc = ct3_build_cdat_entries_for_mr(&table[volatile_len], dsmad++, nonvolatile_mr...); if (rc < 0) { /* Only place we need error handling. Could make it more generic of course */ for (i = 0; i < volatile_len; i++) { g_free(cdat_table[i]); } return rc; } } *cdat_table = g_steal_pointer(&table); Jonathan > > 6 eggs/half dozen though, I'm ultimately fine with either. > > On Thu, Oct 13, 2022, 4:58 AM Jonathan Cameron <Jonathan.Cameron@huawei.com> > wrote: > > > On Wed, 12 Oct 2022 14:21:15 -0400 > > Gregory Price <gourry.memverge@gmail.com> wrote: > > > > > Included in this response is a recommended patch set on top of this > > > patch that resolves a number of issues, including style and a heap > > > corruption bug. > > > > > > The purpose of this patch set is to refactor the CDAT initialization > > > code to support future patch sets that will introduce multi-region > > > support in CXL Type3 devices. > > > > > > 1) Checkpatch errors in the immediately prior patch > > > 2) Flatting of code in cdat initialization > > > 3) Changes in allocation and error checking for cleanliness > > > 4) Change in the allocation/free strategy of CDAT sub-tables to simplify > > > multi-region allocation in the future. Also resolves a heap > > > corruption bug > > > 5) Refactor of CDAT initialization code into a function that initializes > > > sub-tables per memory-region. > > > > > > Gregory Price (5): > > > hw/mem/cxl_type3: fix checkpatch errors > > > hw/mem/cxl_type3: Pull validation checks ahead of functional code > > > hw/mem/cxl_type3: CDAT pre-allocate and check resources prior to work > > > hw/mem/cxl_type3: Change the CDAT allocation/free strategy > > > hw/mem/cxl_type3: Refactor CDAT sub-table entry initialization into a > > > function > > > > > > hw/mem/cxl_type3.c | 240 +++++++++++++++++++++++---------------------- > > > 1 file changed, 122 insertions(+), 118 deletions(-) > > > > > > > Thanks, I'm going to roll this stuff into the original patch set for v8. > > Some of this I already have (like the check patch stuff). > > Some I may disagree with in which case I'll reply to the patches - note > > I haven't looked at them in detail yet! > > > > Jonathan > > >
fwiw this is what my function looked like after the prior changes, very similar to yours proposed below static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, void *priv) { CXLType3Dev *ct3d = priv; MemoryRegion *vmr = NULL, *pmr = NULL; uint64_t dpa_base = 0; int dsmad_handle = 0; int num_ents = 0; int cur_ent = 0; int ret = 0; if (ct3d->hostvmem) { vmr = host_memory_backend_get_memory(ct3d->hostvmem); if (!vmr) return -EINVAL; num_ents += CT3_CDAT_SUBTABLE_SIZE; } if (ct3d->hostpmem) { pmr = host_memory_backend_get_memory(ct3d->hostpmem); if (!pmr) return -EINVAL; num_ents += CT3_CDAT_SUBTABLE_SIZE; } if (!num_ents) { return 0; } *cdat_table = g_malloc0(num_ents * sizeof(*cdat_table)); if (!*cdat_table) { return -ENOMEM; } /* Volatile aspects are mapped first */ if (vmr) { ret = ct3_build_cdat_subtable(*cdat_table, vmr, dsmad_handle++, false, dpa_base); if (ret < 0) { goto error_cleanup; } dpa_base = vmr->size; cur_ent += ret; } /* Non volatile aspects */ if (pmr) { /* non-volatile entries follow the volatile entries */ ret = ct3_build_cdat_subtable(&(*cdat_table)[cur_ent], pmr, dsmad_handle, true, dpa_base); if (ret < 0) { goto error_cleanup; } cur_ent += ret; } assert(cur_ent == num_ents); return ret; error_cleanup: int i; for (i = 0; i < num_ents; i++) { g_free(*cdat_table[i]); } g_free(*cdat_table); return ret; } On Thu, Oct 13, 2022 at 12:53:13PM +0100, Jonathan Cameron wrote: > On Thu, 13 Oct 2022 07:36:28 -0400 > Gregory Price <gourry.memverge@gmail.com> wrote: > > > Reading through your notes, everything seems reasonable, though I'm not > > sure I agree with the two pass notion, though I'll wait to see the patch > > set. > > > > The enum is a good idea, *forehead slap*, I should have done it. If we > > have a local enum, why not just make it global (within the file) and > > allocate the table as I have once we know how many MRs are present? > > It's not global as we need the entries to be packed. So if just one mr > (which ever one) the entries for that need to be at the beginning of > cdat_table. I also don't want to bake into the outer caller that the > entries will always be the same size for different MRs. > > For the two pass case... > > I'll send code in a few mins, but in meantime my thought is that > the extended code for volatile + non volatile will looks something like: > (variable names made up) > > if (ct3d->volatile_mem) { > volatile_mr = host_memory_backend_get_memory(ct3d->volatile_mem....); > if (!volatile_mr) { > return -ENINVAL; > } > rc = ct3_build_cdat_entries_for_mr(NULL, dsmad++, volatile_mr); > if (rc < 0) { > return rc; > } > volatile_len = rc; > } > > if (ct3d->nonvolatile_mem) { > nonvolatile_mr = host_memory_backend_get_memory(ct3d->nonvolatile_mem); > if (!nonvolatile_mr) { > return -ENINVAL; > } > rc = ct3_build_cdat_entries_for_mr(NULL, dmsmad++, nonvolatile_mr....); > if (rc < 0) { > return rc; > } > nonvolatile_len = rc; > } > > dsmad = 0; > > table = g_malloc(0, (volatile_len + nonvolatile_len) * sizeof(*table)); > if (!table) { > return -ENOMEM; > } > > if (volatile_len) { > rc = ct3_build_cdat_entries_for_mr(&table[0], dmsad++, volatile_mr....); > if (rc < 0) { > return rc; > } > } > if (nonvolatile_len) { > rc = ct3_build_cdat_entries_for_mr(&table[volatile_len], dsmad++, nonvolatile_mr...); > if (rc < 0) { > /* Only place we need error handling. Could make it more generic of course */ > for (i = 0; i < volatile_len; i++) { > g_free(cdat_table[i]); > } > return rc; > } > } > > *cdat_table = g_steal_pointer(&table); > > > Jonathan > > > > > 6 eggs/half dozen though, I'm ultimately fine with either. > > > > On Thu, Oct 13, 2022, 4:58 AM Jonathan Cameron <Jonathan.Cameron@huawei.com> > > wrote: > > > > > On Wed, 12 Oct 2022 14:21:15 -0400 > > > Gregory Price <gourry.memverge@gmail.com> wrote: > > > > > > > Included in this response is a recommended patch set on top of this > > > > patch that resolves a number of issues, including style and a heap > > > > corruption bug. > > > > > > > > The purpose of this patch set is to refactor the CDAT initialization > > > > code to support future patch sets that will introduce multi-region > > > > support in CXL Type3 devices. > > > > > > > > 1) Checkpatch errors in the immediately prior patch > > > > 2) Flatting of code in cdat initialization > > > > 3) Changes in allocation and error checking for cleanliness > > > > 4) Change in the allocation/free strategy of CDAT sub-tables to simplify > > > > multi-region allocation in the future. Also resolves a heap > > > > corruption bug > > > > 5) Refactor of CDAT initialization code into a function that initializes > > > > sub-tables per memory-region. > > > > > > > > Gregory Price (5): > > > > hw/mem/cxl_type3: fix checkpatch errors > > > > hw/mem/cxl_type3: Pull validation checks ahead of functional code > > > > hw/mem/cxl_type3: CDAT pre-allocate and check resources prior to work > > > > hw/mem/cxl_type3: Change the CDAT allocation/free strategy > > > > hw/mem/cxl_type3: Refactor CDAT sub-table entry initialization into a > > > > function > > > > > > > > hw/mem/cxl_type3.c | 240 +++++++++++++++++++++++---------------------- > > > > 1 file changed, 122 insertions(+), 118 deletions(-) > > > > > > > > > > Thanks, I'm going to roll this stuff into the original patch set for v8. > > > Some of this I already have (like the check patch stuff). > > > Some I may disagree with in which case I'll reply to the patches - note > > > I haven't looked at them in detail yet! > > > > > > Jonathan > > > > > >
On Thu, 13 Oct 2022 08:35:13 -0400 Gregory Price <gourry.memverge@gmail.com> wrote: > fwiw this is what my function looked like after the prior changes, very > similar to yours proposed below Makes sense given only real change is exactly where the size comes from ;) FYI, I've pushed out latest version on top of qemu/master at gitlab.com/jic23/ as tag doe-v8 Just as soon as I finish bouncing patches to a machine I can push from I'll push out the rest of my queue. My current thought is to slide your series under the rest of that queue (so directly on top of the DOE set - v8+ depending on reviews). The other series coming through is Ira's event injection but my guess is that will take a bit more time to stabilize. Jonathan > > static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, > void *priv) > { > CXLType3Dev *ct3d = priv; > MemoryRegion *vmr = NULL, *pmr = NULL; > uint64_t dpa_base = 0; > int dsmad_handle = 0; > int num_ents = 0; > int cur_ent = 0; > int ret = 0; > > if (ct3d->hostvmem) { > vmr = host_memory_backend_get_memory(ct3d->hostvmem); > if (!vmr) > return -EINVAL; > num_ents += CT3_CDAT_SUBTABLE_SIZE; > } > if (ct3d->hostpmem) { > pmr = host_memory_backend_get_memory(ct3d->hostpmem); > if (!pmr) > return -EINVAL; > num_ents += CT3_CDAT_SUBTABLE_SIZE; > } > if (!num_ents) { > return 0; > } > > *cdat_table = g_malloc0(num_ents * sizeof(*cdat_table)); > if (!*cdat_table) { > return -ENOMEM; > } > > /* Volatile aspects are mapped first */ > if (vmr) { > ret = ct3_build_cdat_subtable(*cdat_table, vmr, dsmad_handle++, > false, dpa_base); > if (ret < 0) { > goto error_cleanup; > } > dpa_base = vmr->size; > cur_ent += ret; > } > /* Non volatile aspects */ > if (pmr) { > /* non-volatile entries follow the volatile entries */ > ret = ct3_build_cdat_subtable(&(*cdat_table)[cur_ent], pmr, > dsmad_handle, true, dpa_base); > if (ret < 0) { > goto error_cleanup; > } > cur_ent += ret; > } > assert(cur_ent == num_ents); > > return ret; > error_cleanup: > int i; > for (i = 0; i < num_ents; i++) { Might as well loop only to cur_ent as the rest will be NULL. > g_free(*cdat_table[i]); > } > g_free(*cdat_table); > return ret; > } > > > On Thu, Oct 13, 2022 at 12:53:13PM +0100, Jonathan Cameron wrote: > > On Thu, 13 Oct 2022 07:36:28 -0400 > > Gregory Price <gourry.memverge@gmail.com> wrote: > > > > > Reading through your notes, everything seems reasonable, though I'm not > > > sure I agree with the two pass notion, though I'll wait to see the patch > > > set. > > > > > > The enum is a good idea, *forehead slap*, I should have done it. If we > > > have a local enum, why not just make it global (within the file) and > > > allocate the table as I have once we know how many MRs are present? > > > > It's not global as we need the entries to be packed. So if just one mr > > (which ever one) the entries for that need to be at the beginning of > > cdat_table. I also don't want to bake into the outer caller that the > > entries will always be the same size for different MRs. > > > > For the two pass case... > > > > I'll send code in a few mins, but in meantime my thought is that > > the extended code for volatile + non volatile will looks something like: > > (variable names made up) > > > > if (ct3d->volatile_mem) { > > volatile_mr = host_memory_backend_get_memory(ct3d->volatile_mem....); > > if (!volatile_mr) { > > return -ENINVAL; > > } > > rc = ct3_build_cdat_entries_for_mr(NULL, dsmad++, volatile_mr); > > if (rc < 0) { > > return rc; > > } > > volatile_len = rc; > > } > > > > if (ct3d->nonvolatile_mem) { > > nonvolatile_mr = host_memory_backend_get_memory(ct3d->nonvolatile_mem); > > if (!nonvolatile_mr) { > > return -ENINVAL; > > } > > rc = ct3_build_cdat_entries_for_mr(NULL, dmsmad++, nonvolatile_mr....); > > if (rc < 0) { > > return rc; > > } > > nonvolatile_len = rc; > > } > > > > dsmad = 0; > > > > table = g_malloc(0, (volatile_len + nonvolatile_len) * sizeof(*table)); > > if (!table) { > > return -ENOMEM; > > } > > > > if (volatile_len) { > > rc = ct3_build_cdat_entries_for_mr(&table[0], dmsad++, volatile_mr....); > > if (rc < 0) { > > return rc; > > } > > } > > if (nonvolatile_len) { > > rc = ct3_build_cdat_entries_for_mr(&table[volatile_len], dsmad++, nonvolatile_mr...); > > if (rc < 0) { > > /* Only place we need error handling. Could make it more generic of course */ > > for (i = 0; i < volatile_len; i++) { > > g_free(cdat_table[i]); > > } > > return rc; > > } > > } > > > > *cdat_table = g_steal_pointer(&table); > > > > > > Jonathan > > > > > > > > 6 eggs/half dozen though, I'm ultimately fine with either. > > > > > > On Thu, Oct 13, 2022, 4:58 AM Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > wrote: > > > > > > > On Wed, 12 Oct 2022 14:21:15 -0400 > > > > Gregory Price <gourry.memverge@gmail.com> wrote: > > > > > > > > > Included in this response is a recommended patch set on top of this > > > > > patch that resolves a number of issues, including style and a heap > > > > > corruption bug. > > > > > > > > > > The purpose of this patch set is to refactor the CDAT initialization > > > > > code to support future patch sets that will introduce multi-region > > > > > support in CXL Type3 devices. > > > > > > > > > > 1) Checkpatch errors in the immediately prior patch > > > > > 2) Flatting of code in cdat initialization > > > > > 3) Changes in allocation and error checking for cleanliness > > > > > 4) Change in the allocation/free strategy of CDAT sub-tables to simplify > > > > > multi-region allocation in the future. Also resolves a heap > > > > > corruption bug > > > > > 5) Refactor of CDAT initialization code into a function that initializes > > > > > sub-tables per memory-region. > > > > > > > > > > Gregory Price (5): > > > > > hw/mem/cxl_type3: fix checkpatch errors > > > > > hw/mem/cxl_type3: Pull validation checks ahead of functional code > > > > > hw/mem/cxl_type3: CDAT pre-allocate and check resources prior to work > > > > > hw/mem/cxl_type3: Change the CDAT allocation/free strategy > > > > > hw/mem/cxl_type3: Refactor CDAT sub-table entry initialization into a > > > > > function > > > > > > > > > > hw/mem/cxl_type3.c | 240 +++++++++++++++++++++++---------------------- > > > > > 1 file changed, 122 insertions(+), 118 deletions(-) > > > > > > > > > > > > > Thanks, I'm going to roll this stuff into the original patch set for v8. > > > > Some of this I already have (like the check patch stuff). > > > > Some I may disagree with in which case I'll reply to the patches - note > > > > I haven't looked at them in detail yet! > > > > > > > > Jonathan > > > > > > > > >
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c index 568c9d62f5..3fa5d70662 100644 --- a/hw/mem/cxl_type3.c +++ b/hw/mem/cxl_type3.c @@ -12,9 +12,218 @@ #include "qemu/range.h" #include "qemu/rcu.h" #include "sysemu/hostmem.h" +#include "sysemu/numa.h" #include "hw/cxl/cxl.h" #include "hw/pci/msix.h" +#define DWORD_BYTE 4 + +static int ct3_build_cdat_table(CDATSubHeader ***cdat_table, + void *priv) +{ + g_autofree CDATDsmas *dsmas_nonvolatile = NULL; + g_autofree CDATDslbis *dslbis_nonvolatile = NULL; + g_autofree CDATDsemts *dsemts_nonvolatile = NULL; + CXLType3Dev *ct3d = priv; + int len = 0; + int i = 0; + int next_dsmad_handle = 0; + int nonvolatile_dsmad = -1; + int dslbis_nonvolatile_num = 4; + MemoryRegion *mr; + + /* Non volatile aspects */ + if (ct3d->hostmem) { + dsmas_nonvolatile = g_malloc(sizeof(*dsmas_nonvolatile)); + if (!dsmas_nonvolatile) { + return -ENOMEM; + } + nonvolatile_dsmad = next_dsmad_handle++; + mr = host_memory_backend_get_memory(ct3d->hostmem); + if (!mr) { + return -EINVAL; + } + *dsmas_nonvolatile = (CDATDsmas) { + .header = { + .type = CDAT_TYPE_DSMAS, + .length = sizeof(*dsmas_nonvolatile), + }, + .DSMADhandle = nonvolatile_dsmad, + .flags = CDAT_DSMAS_FLAG_NV, + .DPA_base = 0, + .DPA_length = int128_get64(mr->size), + }; + len++; + + /* For now, no memory side cache, plausiblish numbers */ + dslbis_nonvolatile = g_malloc(sizeof(*dslbis_nonvolatile) * dslbis_nonvolatile_num); + if (!dslbis_nonvolatile) + return -ENOMEM; + + dslbis_nonvolatile[0] = (CDATDslbis) { + .header = { + .type = CDAT_TYPE_DSLBIS, + .length = sizeof(*dslbis_nonvolatile), + }, + .handle = nonvolatile_dsmad, + .flags = HMAT_LB_MEM_MEMORY, + .data_type = HMAT_LB_DATA_READ_LATENCY, + .entry_base_unit = 10000, /* 10ns base */ + .entry[0] = 15, /* 150ns */ + }; + len++; + + dslbis_nonvolatile[1] = (CDATDslbis) { + .header = { + .type = CDAT_TYPE_DSLBIS, + .length = sizeof(*dslbis_nonvolatile), + }, + .handle = nonvolatile_dsmad, + .flags = HMAT_LB_MEM_MEMORY, + .data_type = HMAT_LB_DATA_WRITE_LATENCY, + .entry_base_unit = 10000, + .entry[0] = 25, /* 250ns */ + }; + len++; + + dslbis_nonvolatile[2] = (CDATDslbis) { + .header = { + .type = CDAT_TYPE_DSLBIS, + .length = sizeof(*dslbis_nonvolatile), + }, + .handle = nonvolatile_dsmad, + .flags = HMAT_LB_MEM_MEMORY, + .data_type = HMAT_LB_DATA_READ_BANDWIDTH, + .entry_base_unit = 1000, /* GB/s */ + .entry[0] = 16, + }; + len++; + + dslbis_nonvolatile[3] = (CDATDslbis) { + .header = { + .type = CDAT_TYPE_DSLBIS, + .length = sizeof(*dslbis_nonvolatile), + }, + .handle = nonvolatile_dsmad, + .flags = HMAT_LB_MEM_MEMORY, + .data_type = HMAT_LB_DATA_WRITE_BANDWIDTH, + .entry_base_unit = 1000, /* GB/s */ + .entry[0] = 16, + }; + len++; + + mr = host_memory_backend_get_memory(ct3d->hostmem); + if (!mr) { + return -EINVAL; + } + dsemts_nonvolatile = g_malloc(sizeof(*dsemts_nonvolatile)); + *dsemts_nonvolatile = (CDATDsemts) { + .header = { + .type = CDAT_TYPE_DSEMTS, + .length = sizeof(*dsemts_nonvolatile), + }, + .DSMAS_handle = nonvolatile_dsmad, + .EFI_memory_type_attr = 2, /* Reserved - the non volatile from DSMAS matters */ + .DPA_offset = 0, + .DPA_length = int128_get64(mr->size), + }; + len++; + } + + *cdat_table = g_malloc0(len * sizeof(*cdat_table)); + /* Header always at start of structure */ + if (dsmas_nonvolatile) { + (*cdat_table)[i++] = g_steal_pointer(&dsmas_nonvolatile); + } + if (dslbis_nonvolatile) { + CDATDslbis *dslbis = g_steal_pointer(&dslbis_nonvolatile); + int j; + + for (j = 0; j < dslbis_nonvolatile_num; j++) { + (*cdat_table)[i++] = (CDATSubHeader *)&dslbis[j]; + } + } + if (dsemts_nonvolatile) { + (*cdat_table)[i++] = g_steal_pointer(&dsemts_nonvolatile); + } + + return len; +} + +static void ct3_free_cdat_table(CDATSubHeader **cdat_table, int num, void *priv) +{ + int i; + + for (i = 0; i < num; i++) { + g_free(cdat_table[i]); + } + g_free(cdat_table); +} + +static bool cxl_doe_cdat_rsp(DOECap *doe_cap) +{ + CDATObject *cdat = &CXL_TYPE3(doe_cap->pdev)->cxl_cstate.cdat; + uint16_t ent; + void *base; + uint32_t len; + CDATReq *req = pcie_doe_get_write_mbox_ptr(doe_cap); + CDATRsp rsp; + + assert(cdat->entry_len); + + /* Discard if request length mismatched */ + if (pcie_doe_get_obj_len(req) < + DIV_ROUND_UP(sizeof(CDATReq), DWORD_BYTE)) { + return false; + } + + ent = req->entry_handle; + base = cdat->entry[ent].base; + len = cdat->entry[ent].length; + + rsp = (CDATRsp) { + .header = { + .vendor_id = CXL_VENDOR_ID, + .data_obj_type = CXL_DOE_TABLE_ACCESS, + .reserved = 0x0, + .length = DIV_ROUND_UP((sizeof(rsp) + len), DWORD_BYTE), + }, + .rsp_code = CXL_DOE_TAB_RSP, + .table_type = CXL_DOE_TAB_TYPE_CDAT, + .entry_handle = (ent < cdat->entry_len - 1) ? + ent + 1 : CXL_DOE_TAB_ENT_MAX, + }; + + memcpy(doe_cap->read_mbox, &rsp, sizeof(rsp)); + memcpy(doe_cap->read_mbox + DIV_ROUND_UP(sizeof(rsp), DWORD_BYTE), + base, len); + + doe_cap->read_mbox_len += rsp.header.length; + + return true; +} + +static uint32_t ct3d_config_read(PCIDevice *pci_dev, uint32_t addr, int size) +{ + CXLType3Dev *ct3d = CXL_TYPE3(pci_dev); + uint32_t val; + + if (pcie_doe_read_config(&ct3d->doe_cdat, addr, size, &val)) { + return val; + } + + return pci_default_read_config(pci_dev, addr, size); +} + +static void ct3d_config_write(PCIDevice *pci_dev, uint32_t addr, uint32_t val, + int size) +{ + CXLType3Dev *ct3d = CXL_TYPE3(pci_dev); + + pcie_doe_write_config(&ct3d->doe_cdat, addr, val, size); + pci_default_write_config(pci_dev, addr, val, size); +} + /* * Null value of all Fs suggested by IEEE RA guidelines for use of * EU, OUI and CID @@ -140,6 +349,11 @@ static bool cxl_setup_memory(CXLType3Dev *ct3d, Error **errp) return true; } +static DOEProtocol doe_cdat_prot[] = { + { CXL_VENDOR_ID, CXL_DOE_TABLE_ACCESS, cxl_doe_cdat_rsp }, + { } +}; + static void ct3_realize(PCIDevice *pci_dev, Error **errp) { CXLType3Dev *ct3d = CXL_TYPE3(pci_dev); @@ -189,6 +403,14 @@ static void ct3_realize(PCIDevice *pci_dev, Error **errp) for (i = 0; i < msix_num; i++) { msix_vector_use(pci_dev, i); } + + /* DOE Initailization */ + pcie_doe_init(pci_dev, &ct3d->doe_cdat, 0x190, doe_cdat_prot, true, 0); + + cxl_cstate->cdat.build_cdat_table = ct3_build_cdat_table; + cxl_cstate->cdat.free_cdat_table = ct3_free_cdat_table; + cxl_cstate->cdat.private = ct3d; + cxl_doe_cdat_init(cxl_cstate, errp); } static void ct3_exit(PCIDevice *pci_dev) @@ -197,6 +419,7 @@ static void ct3_exit(PCIDevice *pci_dev) CXLComponentState *cxl_cstate = &ct3d->cxl_cstate; ComponentRegisters *regs = &cxl_cstate->crb; + cxl_doe_cdat_release(cxl_cstate); g_free(regs->special_ops); address_space_destroy(&ct3d->hostmem_as); } @@ -296,6 +519,7 @@ static Property ct3_props[] = { DEFINE_PROP_LINK("lsa", CXLType3Dev, lsa, TYPE_MEMORY_BACKEND, HostMemoryBackend *), DEFINE_PROP_UINT64("sn", CXLType3Dev, sn, UI64_NULL), + DEFINE_PROP_STRING("cdat", CXLType3Dev, cxl_cstate.cdat.filename), DEFINE_PROP_END_OF_LIST(), }; @@ -361,6 +585,9 @@ static void ct3_class_init(ObjectClass *oc, void *data) pc->device_id = 0xd93; /* LVF for now */ pc->revision = 1; + pc->config_write = ct3d_config_write; + pc->config_read = ct3d_config_read; + set_bit(DEVICE_CATEGORY_STORAGE, dc->categories); dc->desc = "CXL PMEM Device (Type 3)"; dc->reset = ct3d_reset;