Message ID | 20220811143043.126029-6-p.raghav@samsung.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | support zoned block devices with non-power-of-2 zone sizes | expand |
On Thu, Aug 11, 2022 at 04:30:35PM +0200, Pankaj Raghav wrote: > @@ -101,13 +101,6 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf) > } > > ns->zsze = nvme_lba_to_sect(ns, le64_to_cpu(id->lbafe[lbaf].zsze)); > - if (!is_power_of_2(ns->zsze)) { > - dev_warn(ns->ctrl->device, > - "invalid zone size:%llu for namespace:%u\n", > - ns->zsze, ns->head->ns_id); > - status = -ENODEV; > - goto free_data; > - } We used to bail out early if the format wasn't supported, which wouldn't create the namespace. Now you are relying on blk_revalidate_disk_zones() to report an error for invalid format, but the handling for an error there is different: the namespace still gets created. I'm not sure if that's a problem, but it's different.
Hi Keith, On 2022-08-16 23:14, Keith Busch wrote: > On Thu, Aug 11, 2022 at 04:30:35PM +0200, Pankaj Raghav wrote: >> @@ -101,13 +101,6 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf) >> } >> >> ns->zsze = nvme_lba_to_sect(ns, le64_to_cpu(id->lbafe[lbaf].zsze)); >> - if (!is_power_of_2(ns->zsze)) { >> - dev_warn(ns->ctrl->device, >> - "invalid zone size:%llu for namespace:%u\n", >> - ns->zsze, ns->head->ns_id); >> - status = -ENODEV; >> - goto free_data; >> - } > > We used to bail out early if the format wasn't supported, which wouldn't create > the namespace. > > Now you are relying on blk_revalidate_disk_zones() to report an error for > invalid format, but the handling for an error there is different: the namespace > still gets created. I'm not sure if that's a problem, but it's different. That is right but it is not a problem. Once the block layer supports the non-po2 zone sizes, we don't need to bail out here because it will be compatible. If unequal zone sizes are found, block layer will report an error during blk_revalidate_disk_zones() which is the current behavior anyway. Other checks such as zone operation support are still in nvme_update_zone_info() resulting in an early bail out if violated.
diff --git a/drivers/nvme/host/zns.c b/drivers/nvme/host/zns.c index 12316ab51bda..73e4ad495ae8 100644 --- a/drivers/nvme/host/zns.c +++ b/drivers/nvme/host/zns.c @@ -101,13 +101,6 @@ int nvme_update_zone_info(struct nvme_ns *ns, unsigned lbaf) } ns->zsze = nvme_lba_to_sect(ns, le64_to_cpu(id->lbafe[lbaf].zsze)); - if (!is_power_of_2(ns->zsze)) { - dev_warn(ns->ctrl->device, - "invalid zone size:%llu for namespace:%u\n", - ns->zsze, ns->head->ns_id); - status = -ENODEV; - goto free_data; - } disk_set_zoned(ns->disk, BLK_ZONED_HM); blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q); @@ -129,7 +122,7 @@ static void *nvme_zns_alloc_report_buffer(struct nvme_ns *ns, sizeof(struct nvme_zone_descriptor); nr_zones = min_t(unsigned int, nr_zones, - get_capacity(ns->disk) >> ilog2(ns->zsze)); + div64_u64(get_capacity(ns->disk), ns->zsze)); bufsize = sizeof(struct nvme_zone_report) + nr_zones * sizeof(struct nvme_zone_descriptor); @@ -182,6 +175,7 @@ int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector, int ret, zone_idx = 0; unsigned int nz, i; size_t buflen; + u64 remainder = 0; if (ns->head->ids.csi != NVME_CSI_ZNS) return -EINVAL; @@ -197,7 +191,11 @@ int nvme_ns_report_zones(struct nvme_ns *ns, sector_t sector, c.zmr.zrasf = NVME_ZRASF_ZONE_REPORT_ALL; c.zmr.pr = NVME_REPORT_ZONE_PARTIAL; - sector &= ~(ns->zsze - 1); + /* + * Round down the sector value to the nearest zone start + */ + div64_u64_rem(sector, ns->zsze, &remainder); + sector -= remainder; while (zone_idx < nr_zones && sector < get_capacity(ns->disk)) { memset(report, 0, buflen);