Message ID | 169645731012.624805.15404457479294344934.stgit@djiang5-mobl3 (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Vishal Verma |
Headers | show |
Series | [NDCTL,1/2] cxl: Save the number of decoders committed to a port | expand |
> Add a check for memdev disable to see if there are active regions present > before disabling the device. This is necessary now regions are present to > fulfill the TODO that was left there. The best way to determine if a > region is active is to see if there are decoders enabled for the mem > device. This is also best effort as the state is only a snapshot the > kernel provides and is not atomic WRT the memdev disable operation. The > expectation is the admin issuing the command has full control of the mem > device and there are no other agents also attempt to control the device. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > --- > cxl/memdev.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/cxl/memdev.c b/cxl/memdev.c > index f6a2d3f1fdca..314bac082719 100644 > --- a/cxl/memdev.c > +++ b/cxl/memdev.c > @@ -373,11 +373,21 @@ static int action_free_dpa(struct cxl_memdev *memdev, > > static int action_disable(struct cxl_memdev *memdev, struct action_context *actx) > { > + struct cxl_endpoint *ep; > + struct cxl_port *port; > + > if (!cxl_memdev_is_enabled(memdev)) > return 0; > > - if (!param.force) { > - /* TODO: actually detect rather than assume active */ > + ep = cxl_memdev_get_endpoint(memdev); > + if (!ep) > + return -ENODEV; > + > + port = cxl_endpoint_get_port(ep); > + if (!port) > + return -ENODEV; > + > + if (cxl_port_decoders_committed(port) && !param.force) { > log_err(&ml, "%s is part of an active region\n", > cxl_memdev_get_devname(memdev)); > return -EBUSY; > > Hi Dave, Based on my understanding of the logic in the "disable_region" and "destroy_region" code, in the code logic of 'disable-region -f,' after the check, it proceeds with the offline operation. In the code logic of 'destroy-region -f,' after the check, it performs a disable operation on the region. For the 'disable-memdev -f' operation, after completing the check, is it also necessary to perform corresponding operations on the region(such as disabling region/destroying region) before disabling memdev?
On Tue, 2023-11-28 at 14:35 +0800, Cao, Quanquan/曹 全全 wrote: > > > Add a check for memdev disable to see if there are active regions present > > before disabling the device. This is necessary now regions are present to > > fulfill the TODO that was left there. The best way to determine if a > > region is active is to see if there are decoders enabled for the mem > > device. This is also best effort as the state is only a snapshot the > > kernel provides and is not atomic WRT the memdev disable operation. The > > expectation is the admin issuing the command has full control of the mem > > device and there are no other agents also attempt to control the device. > > > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > > --- > > cxl/memdev.c | 14 ++++++++++++-- > > 1 file changed, 12 insertions(+), 2 deletions(-) > > > > diff --git a/cxl/memdev.c b/cxl/memdev.c > > index f6a2d3f1fdca..314bac082719 100644 > > --- a/cxl/memdev.c > > +++ b/cxl/memdev.c > > @@ -373,11 +373,21 @@ static int action_free_dpa(struct cxl_memdev *memdev, > > > > static int action_disable(struct cxl_memdev *memdev, struct action_context *actx) > > { > > + struct cxl_endpoint *ep; > > + struct cxl_port *port; > > + > > if (!cxl_memdev_is_enabled(memdev)) > > return 0; > > > > - if (!param.force) { > > - /* TODO: actually detect rather than assume active */ > > + ep = cxl_memdev_get_endpoint(memdev); > > + if (!ep) > > + return -ENODEV; > > + > > + port = cxl_endpoint_get_port(ep); > > + if (!port) > > + return -ENODEV; > > + > > + if (cxl_port_decoders_committed(port) && !param.force) { > > log_err(&ml, "%s is part of an active region\n", > > cxl_memdev_get_devname(memdev)); > > return -EBUSY; > > > > > Hi Dave, > > Based on my understanding of the logic in the "disable_region" and > "destroy_region" code, in the code logic of 'disable-region -f,' after > the check, it proceeds with the offline operation. In the code logic of > 'destroy-region -f,' after the check, it performs a disable operation on > the region. For the 'disable-memdev -f' operation, after completing the > check, is it also necessary to perform corresponding operations on the > region(such as disabling region/destroying region) before disabling memdev? > I think it is better for a disable-memdev to just perform the check and complain, instead of trying to unwind a _lot_ of things - i.e. offline memory, disable region, and then disable memdev. If there is an error in one of the intermediate steps, it can make it hard to trace down what happened, and what state the system is in. I do think that the current -f behavior of just unbind the driver without the safety checks should come with a big loud warning in the man page at least about what -f will do, and how it will potentially break existing regions.
On Wed, Oct 04, 2023 at 03:08:30PM -0700, Dave Jiang wrote: > Add a check for memdev disable to see if there are active regions present > before disabling the device. This is necessary now regions are present to > fulfill the TODO that was left there. The best way to determine if a > region is active is to see if there are decoders enabled for the mem > device. This is also best effort as the state is only a snapshot the > kernel provides and is not atomic WRT the memdev disable operation. The > expectation is the admin issuing the command has full control of the mem > device and there are no other agents also attempt to control the device. > snip > + > + if (cxl_port_decoders_committed(port) && !param.force) { > log_err(&ml, "%s is part of an active region\n", > cxl_memdev_get_devname(memdev)); > return -EBUSY; Can you emit the message either way, that is, even if it is forced, let the user know what they just trampled on. How easy is it to add the region names in the message? > > >
On 11/28/23 11:31, Alison Schofield wrote: > On Wed, Oct 04, 2023 at 03:08:30PM -0700, Dave Jiang wrote: >> Add a check for memdev disable to see if there are active regions present >> before disabling the device. This is necessary now regions are present to >> fulfill the TODO that was left there. The best way to determine if a >> region is active is to see if there are decoders enabled for the mem >> device. This is also best effort as the state is only a snapshot the >> kernel provides and is not atomic WRT the memdev disable operation. The >> expectation is the admin issuing the command has full control of the mem >> device and there are no other agents also attempt to control the device. >> > > snip > >> + >> + if (cxl_port_decoders_committed(port) && !param.force) { >> log_err(&ml, "%s is part of an active region\n", >> cxl_memdev_get_devname(memdev)); >> return -EBUSY; > > Can you emit the message either way, that is, even if it is forced, > let the user know what they just trampled on. Ok I'll change that. > > How easy is it to add the region names in the message? I don't think we can get to a region easily. The way we currently check is to see if there are active decoders and not region existence. > >> >> >>
diff --git a/cxl/memdev.c b/cxl/memdev.c index f6a2d3f1fdca..314bac082719 100644 --- a/cxl/memdev.c +++ b/cxl/memdev.c @@ -373,11 +373,21 @@ static int action_free_dpa(struct cxl_memdev *memdev, static int action_disable(struct cxl_memdev *memdev, struct action_context *actx) { + struct cxl_endpoint *ep; + struct cxl_port *port; + if (!cxl_memdev_is_enabled(memdev)) return 0; - if (!param.force) { - /* TODO: actually detect rather than assume active */ + ep = cxl_memdev_get_endpoint(memdev); + if (!ep) + return -ENODEV; + + port = cxl_endpoint_get_port(ep); + if (!port) + return -ENODEV; + + if (cxl_port_decoders_committed(port) && !param.force) { log_err(&ml, "%s is part of an active region\n", cxl_memdev_get_devname(memdev)); return -EBUSY;
Add a check for memdev disable to see if there are active regions present before disabling the device. This is necessary now regions are present to fulfill the TODO that was left there. The best way to determine if a region is active is to see if there are decoders enabled for the mem device. This is also best effort as the state is only a snapshot the kernel provides and is not atomic WRT the memdev disable operation. The expectation is the admin issuing the command has full control of the mem device and there are no other agents also attempt to control the device. Signed-off-by: Dave Jiang <dave.jiang@intel.com> --- cxl/memdev.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)