From patchwork Fri Mar 3 08:37:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Palmer Dabbelt
X-Patchwork-Id: 13158502
Subject: [PULL 58/59] target/riscv/vector_helper.c: create vext_set_tail_elems_1s()
Date: Fri, 3 Mar 2023 00:37:39 -0800
Message-Id: <20230303083740.12817-59-palmer@rivosinc.com>
In-Reply-To: <20230303083740.12817-1-palmer@rivosinc.com>
References: <20230303083740.12817-1-palmer@rivosinc.com>
Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org, Daniel Henrique Barboza, Weiwei Li, Frank Chang, Palmer Dabbelt
From: Palmer Dabbelt
To: Peter Maydell

From: Daniel Henrique Barboza

Commit 752614cab8e6 ("target/riscv: rvv: Add tail agnostic for vector
load / store instructions") added code to set the tail elements to 1 at
the end of vext_ldst_stride(), vext_ldst_us(), vext_ldst_index() and
vext_ldff(). Aside from the first loop using evl in vext_ldst_us() and
env->vl in the other three, the code is repeated 4 times.

Create a helper to avoid code repetition in all those functions.
Values that the callers already compute (nf, esz and max_elems) are
passed as arguments. All other values are derived inside the helper.

Reviewed-by: Weiwei Li
Reviewed-by: Frank Chang
Signed-off-by: Daniel Henrique Barboza
Message-ID: <20230226170514.588071-2-dbarboza@ventanamicro.com>
Signed-off-by: Palmer Dabbelt
---
 target/riscv/vector_helper.c | 86 +++++++++++++-----------------------
 1 file changed, 30 insertions(+), 56 deletions(-)

diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 3073c54871..7e476ea8c3 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -267,6 +267,28 @@ GEN_VEXT_ST_ELEM(ste_h, int16_t, H2, stw)
 GEN_VEXT_ST_ELEM(ste_w, int32_t, H4, stl)
 GEN_VEXT_ST_ELEM(ste_d, int64_t, H8, stq)
 
+static void vext_set_tail_elems_1s(CPURISCVState *env, target_ulong vl,
+                                   void *vd, uint32_t desc, uint32_t nf,
+                                   uint32_t esz, uint32_t max_elems)
+{
+    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
+    uint32_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
+    uint32_t vta = vext_vta(desc);
+    uint32_t registers_used;
+    int k;
+
+    for (k = 0; k < nf; ++k) {
+        vext_set_elems_1s(vd, vta, (k * max_elems + vl) * esz,
+                          (k * max_elems + max_elems) * esz);
+    }
+
+    if (nf * max_elems % total_elems != 0) {
+        registers_used = ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
+        vext_set_elems_1s(vd, vta, (nf * max_elems) * esz,
+                          registers_used * vlenb);
+    }
+}
+
 /*
  *** stride: access vector element from strided memory
  */
@@ -281,8 +303,6 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
     uint32_t nf = vext_nf(desc);
     uint32_t max_elems = vext_max_elems(desc, log2_esz);
     uint32_t esz = 1 << log2_esz;
-    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t vma = vext_vma(desc);
 
     for (i = env->vstart; i < env->vl; i++, env->vstart++) {
@@ -301,18 +321,8 @@ vext_ldst_stride(void *vd, void *v0, target_ulong base,
         }
     }
     env->vstart = 0;
-    /* set tail elements to 1s */
-    for (k = 0; k < nf; ++k) {
-        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
-                          (k * max_elems + max_elems) * esz);
-    }
-    if (nf * max_elems % total_elems != 0) {
-        uint32_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
-        uint32_t registers_used =
-            ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
-        vext_set_elems_1s(vd, vta, (nf * max_elems) * esz,
-                          registers_used * vlenb);
-    }
+
+    vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
 }
 
 #define GEN_VEXT_LD_STRIDE(NAME, ETYPE, LOAD_FN) \
@@ -359,8 +369,6 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
     uint32_t nf = vext_nf(desc);
     uint32_t max_elems = vext_max_elems(desc, log2_esz);
     uint32_t esz = 1 << log2_esz;
-    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
 
     /* load bytes from guest memory */
     for (i = env->vstart; i < evl; i++, env->vstart++) {
@@ -372,18 +380,8 @@ vext_ldst_us(void *vd, target_ulong base, CPURISCVState *env, uint32_t desc,
         }
     }
     env->vstart = 0;
-    /* set tail elements to 1s */
-    for (k = 0; k < nf; ++k) {
-        vext_set_elems_1s(vd, vta, (k * max_elems + evl) * esz,
-                          (k * max_elems + max_elems) * esz);
-    }
-    if (nf * max_elems % total_elems != 0) {
-        uint32_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
-        uint32_t registers_used =
-            ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
-        vext_set_elems_1s(vd, vta, (nf * max_elems) * esz,
-                          registers_used * vlenb);
-    }
+
+    vext_set_tail_elems_1s(env, evl, vd, desc, nf, esz, max_elems);
 }
 
 /*
@@ -484,8 +482,6 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
     uint32_t vm = vext_vm(desc);
     uint32_t max_elems = vext_max_elems(desc, log2_esz);
     uint32_t esz = 1 << log2_esz;
-    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t vma = vext_vma(desc);
 
     /* load bytes from guest memory */
@@ -505,18 +501,8 @@ vext_ldst_index(void *vd, void *v0, target_ulong base,
         }
     }
     env->vstart = 0;
-    /* set tail elements to 1s */
-    for (k = 0; k < nf; ++k) {
-        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
-                          (k * max_elems + max_elems) * esz);
-    }
-    if (nf * max_elems % total_elems != 0) {
-        uint32_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
-        uint32_t registers_used =
-            ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
-        vext_set_elems_1s(vd, vta, (nf * max_elems) * esz,
-                          registers_used * vlenb);
-    }
+
+    vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
 }
 
 #define GEN_VEXT_LD_INDEX(NAME, ETYPE, INDEX_FN, LOAD_FN) \
@@ -585,8 +571,6 @@ vext_ldff(void *vd, void *v0, target_ulong base,
     uint32_t vm = vext_vm(desc);
     uint32_t max_elems = vext_max_elems(desc, log2_esz);
     uint32_t esz = 1 << log2_esz;
-    uint32_t total_elems = vext_get_total_elems(env, desc, esz);
-    uint32_t vta = vext_vta(desc);
     uint32_t vma = vext_vma(desc);
     target_ulong addr, offset, remain;
 
@@ -647,18 +631,8 @@ ProbeSuccess:
         }
     }
     env->vstart = 0;
-    /* set tail elements to 1s */
-    for (k = 0; k < nf; ++k) {
-        vext_set_elems_1s(vd, vta, (k * max_elems + env->vl) * esz,
-                          (k * max_elems + max_elems) * esz);
-    }
-    if (nf * max_elems % total_elems != 0) {
-        uint32_t vlenb = env_archcpu(env)->cfg.vlen >> 3;
-        uint32_t registers_used =
-            ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
-        vext_set_elems_1s(vd, vta, (nf * max_elems) * esz,
-                          registers_used * vlenb);
-    }
+
+    vext_set_tail_elems_1s(env, env->vl, vd, desc, nf, esz, max_elems);
 }
 
 #define GEN_VEXT_LDFF(NAME, ETYPE, LOAD_FN) \
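
For readers who want to check the byte ranges the new helper writes, here is
a minimal standalone sketch of its arithmetic. It is not QEMU code:
set_elems_1s() and set_tail_elems_1s() are hypothetical stand-ins, and
total_elems and vlenb are passed in as plain parameters (an assumption of
this sketch), whereas the patch derives them from env and desc.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for vext_set_elems_1s(): fill bytes [cnt, tot)
 * of the register group with 1s when the tail-agnostic policy is on.
 */
static void set_elems_1s(uint8_t *base, uint32_t agnostic,
                         uint32_t cnt, uint32_t tot)
{
    if (!agnostic || tot <= cnt) {
        return; /* tail undisturbed, or nothing to fill */
    }
    memset(base + cnt, 0xff, tot - cnt);
}

/*
 * Model of the helper's arithmetic. total_elems and vlenb are plain
 * parameters here (assumption of this sketch, not the patch's signature).
 */
static void set_tail_elems_1s(uint8_t *vd, uint32_t vta, uint32_t vl,
                              uint32_t nf, uint32_t esz, uint32_t max_elems,
                              uint32_t total_elems, uint32_t vlenb)
{
    uint32_t registers_used;
    uint32_t k;

    /* Sketch-only precondition: the active length fits in one field. */
    assert(vl <= max_elems);

    /* Tail of each of the nf fields: elements vl .. max_elems - 1. */
    for (k = 0; k < nf; ++k) {
        set_elems_1s(vd, vta, (k * max_elems + vl) * esz,
                     (k * max_elems + max_elems) * esz);
    }

    /* Pad up to the end of the last register that the fields touch. */
    if (nf * max_elems % total_elems != 0) {
        registers_used = ((nf * max_elems) * esz + (vlenb - 1)) / vlenb;
        set_elems_1s(vd, vta, (nf * max_elems) * esz,
                     registers_used * vlenb);
    }
}
```

With vta = 1, vl = 3, nf = 2, esz = 2, max_elems = 4, total_elems = 8 and
vlenb = 16, the loop fills bytes 6-7 and 14-15, and the second step is
skipped because nf * max_elems is a multiple of total_elems. With vl = 4,
nf = 3, esz = 1, max_elems = 4, total_elems = 16 and vlenb = 16, the loop
fills nothing and the second step fills bytes 12-15.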