Message ID | f6b75a33df6f5fd94da3cfecb1e9e7590bf8cd37.1668963937.git.christophe.jaillet@wanadoo.fr (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Viresh Kumar |
Series | cpufreq: tegra186: Use flexible array to simplify memory allocation |
On 20-11-22, 18:19, Christophe JAILLET wrote:
> Use flexible array to simplify memory allocation.
> It saves some memory, avoids an indirection when reading the 'clusters'
> array and removes some LoC.
>
>
> Detailed explanation:
> ====================
> Knowing that:
>   - each devm_ allocation over-allocates 40 bytes for internal needs
>   - Some rounding is done by the memory allocator on 8, 16, 32, 64, 96,
>     128, 192, 256, 512, 1024, 2048, 4096, 8192 boundaries
>
> and that:
>   - sizeof(struct tegra186_cpufreq_data) = 24
>   - sizeof(struct tegra186_cpufreq_cluster) = 16
>
> Memory allocations in tegra186_cpufreq_probe() are:
>   data:           (24 + 40)     = 64 => 64 bytes
>   data->clusters: (2 * 16 + 40) = 72 => 96 bytes
> So a total of 160 bytes are allocated.
> 56 for the real need, 80 for internal uses and 24 are wasted.
>
>
> If 'struct tegra186_cpufreq_data' is reordered so that 'clusters' is a
> flexible array:
>   - it saves one pointer in the structure
>   - only one allocation is needed
>
> So, only 96 bytes are allocated:
>   16 + 2 * 16 + 40 = 88 => 96 bytes
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---

Applied. Thanks.
diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
index 6c88827f4e62..f98f53bf1011 100644
--- a/drivers/cpufreq/tegra186-cpufreq.c
+++ b/drivers/cpufreq/tegra186-cpufreq.c
@@ -65,8 +65,8 @@ struct tegra186_cpufreq_cluster {
 
 struct tegra186_cpufreq_data {
 	void __iomem *regs;
-	struct tegra186_cpufreq_cluster *clusters;
 	const struct tegra186_cpufreq_cpu *cpus;
+	struct tegra186_cpufreq_cluster clusters[];
 };
 
 static int tegra186_cpufreq_init(struct cpufreq_policy *policy)
@@ -221,15 +221,12 @@ static int tegra186_cpufreq_probe(struct platform_device *pdev)
 	struct tegra_bpmp *bpmp;
 	unsigned int i = 0, err;
 
-	data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
+	data = devm_kzalloc(&pdev->dev,
+			    struct_size(data, clusters, TEGRA186_NUM_CLUSTERS),
+			    GFP_KERNEL);
 	if (!data)
 		return -ENOMEM;
 
-	data->clusters = devm_kcalloc(&pdev->dev, TEGRA186_NUM_CLUSTERS,
-				      sizeof(*data->clusters), GFP_KERNEL);
-	if (!data->clusters)
-		return -ENOMEM;
-
 	data->cpus = tegra186_cpus;
 
 	bpmp = tegra_bpmp_get(&pdev->dev);
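
For readers who want to try the pattern outside the kernel, here is a minimal userspace sketch of the same idea: a structure ending in a flexible array member, allocated in one zeroed block. Everything in it is an assumption made for illustration. `struct cluster`, `struct data`, `data_size()` and `data_alloc()` are hypothetical names, the field layout is only a stand-in for the driver's structures, `calloc()` replaces `devm_kzalloc()`, and `data_size()` is a hand-written approximation of what `struct_size()` computes (header size plus element count times element size, saturating at `SIZE_MAX` on overflow).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative stand-in for struct tegra186_cpufreq_cluster. */
struct cluster {
	void *table;
	uint32_t ref_clk_khz;
	uint32_t div;
};

/* Illustrative stand-in for the reordered struct tegra186_cpufreq_data. */
struct data {
	void *regs;
	const void *cpus;
	struct cluster clusters[];	/* flexible array member, must be last */
};

/*
 * Bytes needed for a struct data holding n clusters.
 * Returns SIZE_MAX if the computation would overflow.
 */
static size_t data_size(size_t n)
{
	if (n > (SIZE_MAX - sizeof(struct data)) / sizeof(struct cluster))
		return SIZE_MAX;
	return sizeof(struct data) + n * sizeof(struct cluster);
}

/* Single zeroed allocation covering the header and all n clusters. */
static struct data *data_alloc(size_t n)
{
	size_t bytes = data_size(n);

	if (bytes == SIZE_MAX)
		return NULL;
	return calloc(1, bytes);
}
```

A caller would write `struct data *d = data_alloc(2);`, check for NULL, access `d->clusters[0]` and `d->clusters[1]` directly with no second pointer to follow, and release everything with a single `free(d)`.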
Use flexible array to simplify memory allocation.
It saves some memory, avoids an indirection when reading the 'clusters'
array and removes some LoC.


Detailed explanation:
====================
Knowing that:
  - each devm_ allocation over-allocates 40 bytes for internal needs
  - Some rounding is done by the memory allocator on 8, 16, 32, 64, 96,
    128, 192, 256, 512, 1024, 2048, 4096, 8192 boundaries

and that:
  - sizeof(struct tegra186_cpufreq_data) = 24
  - sizeof(struct tegra186_cpufreq_cluster) = 16

Memory allocations in tegra186_cpufreq_probe() are:
  data:           (24 + 40)     = 64 => 64 bytes
  data->clusters: (2 * 16 + 40) = 72 => 96 bytes
So a total of 160 bytes are allocated.
56 for the real need, 80 for internal uses and 24 are wasted.


If 'struct tegra186_cpufreq_data' is reordered so that 'clusters' is a
flexible array:
  - it saves one pointer in the structure
  - only one allocation is needed

So, only 96 bytes are allocated:
  16 + 2 * 16 + 40 = 88 => 96 bytes

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Compile tested only
---
 drivers/cpufreq/tegra186-cpufreq.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
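
The arithmetic in the detailed explanation can be reproduced with a small userspace sketch. It relies only on the numbers stated in the commit message: the list of allocator size classes and the 40-byte overhead per devm_ allocation are copied from there, not read from a running kernel. The helper names `kmalloc_bucket()` and `devm_footprint()` are hypothetical and are not kernel APIs.

```c
#include <stddef.h>

/* Size classes as listed in the commit message (assumed, not queried). */
static const size_t size_classes[] = {
	8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Per-allocation devm_ bookkeeping overhead quoted in the commit message. */
#define DEVM_OVERHEAD 40

/* Round a request up to the next size class; 0 if it exceeds the table. */
static size_t kmalloc_bucket(size_t request)
{
	size_t i;

	for (i = 0; i < sizeof(size_classes) / sizeof(size_classes[0]); i++)
		if (request <= size_classes[i])
			return size_classes[i];
	return 0;
}

/* Bytes actually consumed by one devm_ allocation of 'payload' bytes. */
static size_t devm_footprint(size_t payload)
{
	return kmalloc_bucket(payload + DEVM_OVERHEAD);
}
```

With these helpers, `devm_footprint(24) + devm_footprint(2 * 16)` evaluates to 160 for the two-allocation layout, and `devm_footprint(16 + 2 * 16)` evaluates to 96 for the flexible-array layout, matching the figures in the commit message.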