Message ID | 20201014191211.27029-4-nsaenzjulienne@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: Default to 32-bit wide ZONE_DMA | expand |
On Wed, Oct 14, 2020 at 2:12 PM Nicolas Saenz Julienne <nsaenzjulienne@suse.de> wrote: > > Introduce of_dma_get_max_cpu_address(), which provides the highest CPU > physical address addressable by all DMA masters in the system. It's > specially useful for setting memory zones sizes at early boot time. > > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> > > --- > > Changes since v2: > - Use PHYS_ADDR_MAX > - return phys_dma_t > - Rename function > - Correct subject > - Add support to start parsing from an arbitrary device node in order > for the function to work with unit tests > > drivers/of/address.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > include/linux/of.h | 7 +++++++ > 2 files changed, 49 insertions(+) > > diff --git a/drivers/of/address.c b/drivers/of/address.c > index eb9ab4f1e80b..b5a9695aaf82 100644 > --- a/drivers/of/address.c > +++ b/drivers/of/address.c > @@ -1024,6 +1024,48 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) > } > #endif /* CONFIG_HAS_DMA */ > > +/** > + * of_dma_get_max_cpu_address - Gets highest CPU address suitable for DMA > + * @np: The node to start searching from or NULL to start from the root > + * > + * Gets the highest CPU physical address that is addressable by all DMA masters > + * in the system (or subtree when np is non-NULL). If no DMA constrained device > + * is found, it returns PHYS_ADDR_MAX. > + */ > +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) > +{ > + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; One issue with using phys_addr_t is it may be 32-bit even though the DT is 64-bit addresses. LPAE capable system with LPAE disabled. Maybe the truncation is fine here? Maybe not. > + struct of_range_parser parser; > + phys_addr_t subtree_max_addr; > + struct device_node *child; > + phys_addr_t cpu_end = 0; > + struct of_range range; > + const __be32 *ranges; > + int len; > + > + if (!np) > + np = of_root; > + > + ranges = of_get_property(np, "dma-ranges", &len); I'm not really following why you changed the algorithm here. You're skipping disabled nodes which is good. Was there some other reason? > + if (ranges && len) { > + of_dma_range_parser_init(&parser, np); > + for_each_of_range(&parser, &range) > + if (range.cpu_addr + range.size > cpu_end) > + cpu_end = range.cpu_addr + range.size; > + > + if (max_cpu_addr > cpu_end) > + max_cpu_addr = cpu_end; > + } > + > + for_each_available_child_of_node(np, child) { > + subtree_max_addr = of_dma_get_max_cpu_address(child); > + if (max_cpu_addr > subtree_max_addr) > + max_cpu_addr = subtree_max_addr; > + } > + > + return max_cpu_addr; > +} > + > /** > * of_dma_is_coherent - Check if device is coherent > * @np: device node > diff --git a/include/linux/of.h b/include/linux/of.h > index 481ec0467285..db8db8f2c967 100644 > --- a/include/linux/of.h > +++ b/include/linux/of.h > @@ -558,6 +558,8 @@ int of_map_id(struct device_node *np, u32 id, > const char *map_name, const char *map_mask_name, > struct device_node **target, u32 *id_out); > > +phys_addr_t of_dma_get_max_cpu_address(struct device_node *np); > + > #else /* CONFIG_OF */ > > static inline void of_core_init(void) > @@ -995,6 +997,11 @@ static inline int of_map_id(struct device_node *np, u32 id, > return -EINVAL; > } > > +static inline phys_addr_t of_dma_get_max_cpu_address(struct device_node *np) > +{ > + return PHYS_ADDR_MAX; > +} > + > #define of_match_ptr(_ptr) NULL > #define of_match_node(_matches, _node) NULL > #endif /* CONFIG_OF */ > -- > 2.28.0 >
> +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) > +{ > + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; > + struct of_range_parser parser; > + phys_addr_t subtree_max_addr; > + struct device_node *child; > + phys_addr_t cpu_end = 0; > + struct of_range range; > + const __be32 *ranges; > + int len; > + > + if (!np) > + np = of_root; Requiring of_root to be passed explicitly would seem more natural to me than the magic NULL argument. There doesn't seem to be any precedent for that kind of calling convention either.
On Thu, 15 Oct 2020 at 00:03, Rob Herring <robh+dt@kernel.org> wrote: > > On Wed, Oct 14, 2020 at 2:12 PM Nicolas Saenz Julienne > <nsaenzjulienne@suse.de> wrote: > > > > Introduce of_dma_get_max_cpu_address(), which provides the highest CPU > > physical address addressable by all DMA masters in the system. It's > > specially useful for setting memory zones sizes at early boot time. > > > > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> > > > > --- > > > > Changes since v2: > > - Use PHYS_ADDR_MAX > > - return phys_dma_t > > - Rename function > > - Correct subject > > - Add support to start parsing from an arbitrary device node in order > > for the function to work with unit tests > > > > drivers/of/address.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > > include/linux/of.h | 7 +++++++ > > 2 files changed, 49 insertions(+) > > > > diff --git a/drivers/of/address.c b/drivers/of/address.c > > index eb9ab4f1e80b..b5a9695aaf82 100644 > > --- a/drivers/of/address.c > > +++ b/drivers/of/address.c > > @@ -1024,6 +1024,48 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) > > } > > #endif /* CONFIG_HAS_DMA */ > > > > +/** > > + * of_dma_get_max_cpu_address - Gets highest CPU address suitable for DMA > > + * @np: The node to start searching from or NULL to start from the root > > + * > > + * Gets the highest CPU physical address that is addressable by all DMA masters > > + * in the system (or subtree when np is non-NULL). If no DMA constrained device > > + * is found, it returns PHYS_ADDR_MAX. > > + */ > > +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) > > +{ > > + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; > > One issue with using phys_addr_t is it may be 32-bit even though the > DT is 64-bit addresses. LPAE capable system with LPAE disabled. Maybe > the truncation is fine here? Maybe not. > PHYS_ADDR_MAX is the max addressable CPU address on the system, and so it makes sense to use it for the return type, and for the preliminary return value: this is actually what /prevents/ truncation, because we will only overwrite max_cpu_addr if the new u64 value is lower. > > + struct of_range_parser parser; > > + phys_addr_t subtree_max_addr; > > + struct device_node *child; > > + phys_addr_t cpu_end = 0; > > + struct of_range range; > > + const __be32 *ranges; > > + int len; > > + > > + if (!np) > > + np = of_root; > > + > > + ranges = of_get_property(np, "dma-ranges", &len); > > I'm not really following why you changed the algorithm here. You're > skipping disabled nodes which is good. Was there some other reason? > > > + if (ranges && len) { > > + of_dma_range_parser_init(&parser, np); > > + for_each_of_range(&parser, &range) > > + if (range.cpu_addr + range.size > cpu_end) > > + cpu_end = range.cpu_addr + range.size; > > + > > + if (max_cpu_addr > cpu_end) > > + max_cpu_addr = cpu_end; > > + } > > + > > + for_each_available_child_of_node(np, child) { > > + subtree_max_addr = of_dma_get_max_cpu_address(child); > > + if (max_cpu_addr > subtree_max_addr) > > + max_cpu_addr = subtree_max_addr; > > + } > > + > > + return max_cpu_addr; > > +} > > + > > /** > > * of_dma_is_coherent - Check if device is coherent > > * @np: device node > > diff --git a/include/linux/of.h b/include/linux/of.h > > index 481ec0467285..db8db8f2c967 100644 > > --- a/include/linux/of.h > > +++ b/include/linux/of.h > > @@ -558,6 +558,8 @@ int of_map_id(struct device_node *np, u32 id, > > const char *map_name, const char *map_mask_name, > > struct device_node **target, u32 *id_out); > > > > +phys_addr_t of_dma_get_max_cpu_address(struct device_node *np); > > + > > #else /* CONFIG_OF */ > > > > static inline void of_core_init(void) > > @@ -995,6 +997,11 @@ static inline int of_map_id(struct device_node *np, u32 id, > > return -EINVAL; > > } > > > > +static inline phys_addr_t of_dma_get_max_cpu_address(struct device_node *np) > > +{ > > + return PHYS_ADDR_MAX; > > +} > > + > > #define of_match_ptr(_ptr) NULL > > #define of_match_node(_matches, _node) NULL > > #endif /* CONFIG_OF */ > > -- > > 2.28.0 > >
On Wed, 2020-10-14 at 17:02 -0500, Rob Herring wrote: > On Wed, Oct 14, 2020 at 2:12 PM Nicolas Saenz Julienne > <nsaenzjulienne@suse.de> wrote: > > Introduce of_dma_get_max_cpu_address(), which provides the highest CPU > > physical address addressable by all DMA masters in the system. It's > > specially useful for setting memory zones sizes at early boot time. > > > > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> > > > > --- [...] > > + struct of_range_parser parser; > > + phys_addr_t subtree_max_addr; > > + struct device_node *child; > > + phys_addr_t cpu_end = 0; > > + struct of_range range; > > + const __be32 *ranges; > > + int len; > > + > > + if (!np) > > + np = of_root; > > + > > + ranges = of_get_property(np, "dma-ranges", &len); > > I'm not really following why you changed the algorithm here. You're > skipping disabled nodes which is good. Was there some other reason? Yes, it's a little more complex. But I had to change it in order to be able to start parsing down from an arbitrary device node, which is needed for the unit tests. for_each_of_allnodes() and friends will traverse the whole tree, regardless of the starting point. I couldn't find a similar function that would just iterate over a subsection of the tree, so I went with this recursive approach. Regards, Nicolas
On Thu, 2020-10-15 at 08:56 +0200, Ard Biesheuvel wrote: > On Thu, 15 Oct 2020 at 00:03, Rob Herring <robh+dt@kernel.org> wrote: > > On Wed, Oct 14, 2020 at 2:12 PM Nicolas Saenz Julienne > > <nsaenzjulienne@suse.de> wrote: > > > Introduce of_dma_get_max_cpu_address(), which provides the highest CPU > > > physical address addressable by all DMA masters in the system. It's > > > specially useful for setting memory zones sizes at early boot time. > > > > > > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> > > > > > > --- > > > > > > Changes since v2: > > > - Use PHYS_ADDR_MAX > > > - return phys_dma_t > > > - Rename function > > > - Correct subject > > > - Add support to start parsing from an arbitrary device node in order > > > for the function to work with unit tests > > > > > > drivers/of/address.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > > > include/linux/of.h | 7 +++++++ > > > 2 files changed, 49 insertions(+) > > > > > > diff --git a/drivers/of/address.c b/drivers/of/address.c > > > index eb9ab4f1e80b..b5a9695aaf82 100644 > > > --- a/drivers/of/address.c > > > +++ b/drivers/of/address.c > > > @@ -1024,6 +1024,48 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) > > > } > > > #endif /* CONFIG_HAS_DMA */ > > > > > > +/** > > > + * of_dma_get_max_cpu_address - Gets highest CPU address suitable for DMA > > > + * @np: The node to start searching from or NULL to start from the root > > > + * > > > + * Gets the highest CPU physical address that is addressable by all DMA masters > > > + * in the system (or subtree when np is non-NULL). If no DMA constrained device > > > + * is found, it returns PHYS_ADDR_MAX. > > > + */ > > > +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) > > > +{ > > > + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; > > > > One issue with using phys_addr_t is it may be 32-bit even though the > > DT is 64-bit addresses. LPAE capable system with LPAE disabled. Maybe > > the truncation is fine here? Maybe not. > > > > PHYS_ADDR_MAX is the max addressable CPU address on the system, and so > it makes sense to use it for the return type, and for the preliminary > return value: this is actually what /prevents/ truncation, because we > will only overwrite max_cpu_addr if the new u64 value is lower. > Actually I now see how things might go south. > > > + if (ranges && len) { > > > + of_dma_range_parser_init(&parser, np); > > > + for_each_of_range(&parser, &range) > > > + if (range.cpu_addr + range.size > cpu_end) > > > + cpu_end = range.cpu_addr + range.size; If cpu_end hits 0x1_00000000, it'll overflow to 0. This is possible on 32-bit systems (LPAE or not). And something similar might happen on LPAE disabled systems. I could add some extra logic, something like: /* We overflowed */ if (cpu_end < range.cpu_addr) cpu_end = PHYS_ADDR_MAX; Which is not perfect but will cover most sensible cases. Or simply deal internally in u64s, and upon returning, check if "max_cpu_addr" falls higher than PHYS_ADDR_MAX. > > > + > > > + if (max_cpu_addr > cpu_end) > > > + max_cpu_addr = cpu_end; > > > + } > > > + > > > + for_each_available_child_of_node(np, child) { > > > + subtree_max_addr = of_dma_get_max_cpu_address(child); > > > + if (max_cpu_addr > subtree_max_addr) > > > + max_cpu_addr = subtree_max_addr; > > > + } > > > + > > > + return max_cpu_addr; > > > +} Regards, Nicolas
On Thu, 15 Oct 2020 at 11:16, Nicolas Saenz Julienne <nsaenzjulienne@suse.de> wrote: > > On Thu, 2020-10-15 at 08:56 +0200, Ard Biesheuvel wrote: > > On Thu, 15 Oct 2020 at 00:03, Rob Herring <robh+dt@kernel.org> wrote: > > > On Wed, Oct 14, 2020 at 2:12 PM Nicolas Saenz Julienne > > > <nsaenzjulienne@suse.de> wrote: > > > > Introduce of_dma_get_max_cpu_address(), which provides the highest CPU > > > > physical address addressable by all DMA masters in the system. It's > > > > specially useful for setting memory zones sizes at early boot time. > > > > > > > > Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> > > > > > > > > --- > > > > > > > > Changes since v2: > > > > - Use PHYS_ADDR_MAX > > > > - return phys_dma_t > > > > - Rename function > > > > - Correct subject > > > > - Add support to start parsing from an arbitrary device node in order > > > > for the function to work with unit tests > > > > > > > > drivers/of/address.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > > > > include/linux/of.h | 7 +++++++ > > > > 2 files changed, 49 insertions(+) > > > > > > > > diff --git a/drivers/of/address.c b/drivers/of/address.c > > > > index eb9ab4f1e80b..b5a9695aaf82 100644 > > > > --- a/drivers/of/address.c > > > > +++ b/drivers/of/address.c > > > > @@ -1024,6 +1024,48 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) > > > > } > > > > #endif /* CONFIG_HAS_DMA */ > > > > > > > > +/** > > > > + * of_dma_get_max_cpu_address - Gets highest CPU address suitable for DMA > > > > + * @np: The node to start searching from or NULL to start from the root > > > > + * > > > > + * Gets the highest CPU physical address that is addressable by all DMA masters > > > > + * in the system (or subtree when np is non-NULL). If no DMA constrained device > > > > + * is found, it returns PHYS_ADDR_MAX. > > > > + */ > > > > +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) > > > > +{ > > > > + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; > > > > > > One issue with using phys_addr_t is it may be 32-bit even though the > > > DT is 64-bit addresses. LPAE capable system with LPAE disabled. Maybe > > > the truncation is fine here? Maybe not. > > > > > > > PHYS_ADDR_MAX is the max addressable CPU address on the system, and so > > it makes sense to use it for the return type, and for the preliminary > > return value: this is actually what /prevents/ truncation, because we > > will only overwrite max_cpu_addr if the new u64 value is lower. > > > > Actually I now see how things might go south. > > > > > + if (ranges && len) { > > > > + of_dma_range_parser_init(&parser, np); > > > > + for_each_of_range(&parser, &range) > > > > + if (range.cpu_addr + range.size > cpu_end) > > > > + cpu_end = range.cpu_addr + range.size; > > If cpu_end hits 0x1_00000000, it'll overflow to 0. This is possible on 32-bit > systems (LPAE or not). And something similar might happen on LPAE disabled > systems. > > I could add some extra logic, something like: > > /* We overflowed */ > if (cpu_end < range.cpu_addr) > cpu_end = PHYS_ADDR_MAX; > > Which is not perfect but will cover most sensible cases. > > Or simply deal internally in u64s, and upon returning, check if "max_cpu_addr" > falls higher than PHYS_ADDR_MAX. > Just use a u64 for cpu_end > > > > + > > > > + if (max_cpu_addr > cpu_end) > > > > + max_cpu_addr = cpu_end; ... then this comparison and assignment will work as expected. > > > > + } > > > > + > > > > + for_each_available_child_of_node(np, child) { > > > > + subtree_max_addr = of_dma_get_max_cpu_address(child); > > > > + if (max_cpu_addr > subtree_max_addr) > > > > + max_cpu_addr = subtree_max_addr; > > > > + } > > > > + > > > > + return max_cpu_addr; > > > > +} > > Regards, > Nicolas >
On Thu, 2020-10-15 at 07:42 +0200, Christoph Hellwig wrote: > > +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) > > +{ > > + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; > > + struct of_range_parser parser; > > + phys_addr_t subtree_max_addr; > > + struct device_node *child; > > + phys_addr_t cpu_end = 0; > > + struct of_range range; > > + const __be32 *ranges; > > + int len; > > + > > + if (!np) > > + np = of_root; > > Requiring of_root to be passed explicitly would seem more natural > to me than the magic NULL argument. There doesn't seem to be any > precedent for that kind of calling convention either. I inspired that behavior from __of_find_all_nodes(). I'll change it.
On Thu, Oct 15, 2020 at 12:42 AM Christoph Hellwig <hch@lst.de> wrote: > > > +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) > > +{ > > + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; > > + struct of_range_parser parser; > > + phys_addr_t subtree_max_addr; > > + struct device_node *child; > > + phys_addr_t cpu_end = 0; > > + struct of_range range; > > + const __be32 *ranges; > > + int len; > > + > > + if (!np) > > + np = of_root; > > Requiring of_root to be passed explicitly would seem more natural > to me than the magic NULL argument. There doesn't seem to be any > precedent for that kind of calling convention either. I prefer that of_root is not more widely exposed and NULL regularly means 'the whole tree'. Rob
diff --git a/drivers/of/address.c b/drivers/of/address.c index eb9ab4f1e80b..b5a9695aaf82 100644 --- a/drivers/of/address.c +++ b/drivers/of/address.c @@ -1024,6 +1024,48 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map) } #endif /* CONFIG_HAS_DMA */ +/** + * of_dma_get_max_cpu_address - Gets highest CPU address suitable for DMA + * @np: The node to start searching from or NULL to start from the root + * + * Gets the highest CPU physical address that is addressable by all DMA masters + * in the system (or subtree when np is non-NULL). If no DMA constrained device + * is found, it returns PHYS_ADDR_MAX. + */ +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np) +{ + phys_addr_t max_cpu_addr = PHYS_ADDR_MAX; + struct of_range_parser parser; + phys_addr_t subtree_max_addr; + struct device_node *child; + phys_addr_t cpu_end = 0; + struct of_range range; + const __be32 *ranges; + int len; + + if (!np) + np = of_root; + + ranges = of_get_property(np, "dma-ranges", &len); + if (ranges && len) { + of_dma_range_parser_init(&parser, np); + for_each_of_range(&parser, &range) + if (range.cpu_addr + range.size > cpu_end) + cpu_end = range.cpu_addr + range.size; + + if (max_cpu_addr > cpu_end) + max_cpu_addr = cpu_end; + } + + for_each_available_child_of_node(np, child) { + subtree_max_addr = of_dma_get_max_cpu_address(child); + if (max_cpu_addr > subtree_max_addr) + max_cpu_addr = subtree_max_addr; + } + + return max_cpu_addr; +} + /** * of_dma_is_coherent - Check if device is coherent * @np: device node diff --git a/include/linux/of.h b/include/linux/of.h index 481ec0467285..db8db8f2c967 100644 --- a/include/linux/of.h +++ b/include/linux/of.h @@ -558,6 +558,8 @@ int of_map_id(struct device_node *np, u32 id, const char *map_name, const char *map_mask_name, struct device_node **target, u32 *id_out); +phys_addr_t of_dma_get_max_cpu_address(struct device_node *np); + #else /* CONFIG_OF */ static inline void of_core_init(void) @@ -995,6 +997,11 @@ static inline int of_map_id(struct device_node *np, u32 id, return -EINVAL; } +static inline phys_addr_t of_dma_get_max_cpu_address(struct device_node *np) +{ + return PHYS_ADDR_MAX; +} + #define of_match_ptr(_ptr) NULL #define of_match_node(_matches, _node) NULL #endif /* CONFIG_OF */
Introduce of_dma_get_max_cpu_address(), which provides the highest CPU physical address addressable by all DMA masters in the system. It's specially useful for setting memory zones sizes at early boot time. Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de> --- Changes since v2: - Use PHYS_ADDR_MAX - return phys_dma_t - Rename function - Correct subject - Add support to start parsing from an arbitrary device node in order for the function to work with unit tests drivers/of/address.c | 42 ++++++++++++++++++++++++++++++++++++++++++ include/linux/of.h | 7 +++++++ 2 files changed, 49 insertions(+)