From patchwork Thu Nov 14 10:57:33 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luis Henriques
X-Patchwork-Id: 11243535
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9B9CF13BD for ; Thu, 14 Nov 2019 10:57:55 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 86CDE20709 for ; Thu, 14 Nov 2019 10:57:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726997AbfKNK5m (ORCPT ); Thu, 14 Nov 2019 05:57:42 -0500
Received: from mx2.suse.de ([195.135.220.15]:33316 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726142AbfKNK5l (ORCPT ); Thu, 14 Nov 2019 05:57:41 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 25E9CAD00; Thu, 14 Nov 2019 10:57:39 +0000 (UTC)
From: Luis Henriques
To: Jeff Layton, Sage Weil, Ilya Dryomov, "Yan, Zheng"
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques
Subject: [RFC PATCH v2 1/4] ceph: add support for TYPE_MSGR2 address decode
Date: Thu, 14 Nov 2019 10:57:33 +0000
Message-Id: <20191114105736.8636-2-lhenriques@suse.com>
In-Reply-To: <20191114105736.8636-1-lhenriques@suse.com>
References: <20191114105736.8636-1-lhenriques@suse.com>
MIME-Version: 1.0
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

The new format actually includes two addresses: one for the new messenger v2,
and the other for the legacy v1, which is the only one currently understood by
kernel clients.  Add code to pick the legacy address and ignore the v2 one.
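For illustration only, here is a minimal userspace sketch of the selection rule
described above, assuming both addresses have already been decoded into an
array.  The entity_addr struct, the ADDR_TYPE_* enum and pick_legacy_addr() are
hypothetical stand-ins, not the kernel definitions; the patch itself works on
the wire format in place rather than on a decoded array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel's address types. */
enum addr_type {
	ADDR_TYPE_NONE   = 0,
	ADDR_TYPE_LEGACY = 1,	/* messenger v1 */
	ADDR_TYPE_MSGR2  = 2,	/* messenger v2 */
};

struct entity_addr {
	uint32_t type;
	uint32_t nonce;
};

/*
 * Copy the first legacy (v1) address found in addrs[] into *out.
 * Returns 0 on success, -1 if none of the n addresses is a legacy one.
 */
static int pick_legacy_addr(const struct entity_addr *addrs, size_t n,
			    struct entity_addr *out)
{
	for (size_t i = 0; i < n; i++) {
		if (addrs[i].type == ADDR_TYPE_LEGACY) {
			*out = addrs[i];
			return 0;
		}
	}
	return -1;
}
```

The order of the two addresses does not matter in this sketch: whichever slot
holds the v1 address is returned and the v2 one is ignored.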
Signed-off-by: Luis Henriques
---
 include/linux/ceph/decode.h |  3 ++-
 net/ceph/decode.c           | 33 +++++++++++++++++++++++++++++++--
 2 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/include/linux/ceph/decode.h b/include/linux/ceph/decode.h
index 450384fe487c..2a2f07dfb39c 100644
--- a/include/linux/ceph/decode.h
+++ b/include/linux/ceph/decode.h
@@ -219,7 +219,8 @@ static inline void ceph_encode_timespec64(struct ceph_timespec *tv,
  * sockaddr_storage <-> ceph_sockaddr
  */
 #define CEPH_ENTITY_ADDR_TYPE_NONE	0
-#define CEPH_ENTITY_ADDR_TYPE_LEGACY	__cpu_to_le32(1)
+#define CEPH_ENTITY_ADDR_TYPE_LEGACY	__cpu_to_le32(1) /* legacy msgr1 */
+#define CEPH_ENTITY_ADDR_TYPE_MSGR2	__cpu_to_le32(2) /* msgr2 protocol */

 static inline void ceph_encode_banner_addr(struct ceph_entity_addr *a)
 {
diff --git a/net/ceph/decode.c b/net/ceph/decode.c
index eea529595a7a..613a2bc6f805 100644
--- a/net/ceph/decode.c
+++ b/net/ceph/decode.c
@@ -67,16 +67,45 @@ ceph_decode_entity_addr_legacy(void **p, void *end,
 	return ret;
 }

+static int
+ceph_decode_entity_addr_versioned_msgr2(void **p, void *end,
+					struct ceph_entity_addr *addr)
+{
+	struct ceph_entity_addr tmp_addr;
+	struct ceph_entity_addr *paddr = addr;
+	int ret = -EINVAL;
+
+	ceph_decode_skip_32(p, end, bad); /* hard-coded '2' */
+	ceph_decode_skip_8(p, end, bad); /* hard-coded '1' */
+
+	ret = ceph_decode_entity_addr_versioned(p, end, paddr);
+	if (ret)
+		goto bad;
+	/* If we already have a v1 address, simply skip over the other address */
+	if (paddr->type == CEPH_ENTITY_ADDR_TYPE_LEGACY)
+		paddr = &tmp_addr;
+
+	ceph_decode_skip_8(p, end, bad); /* hard-coded '1' */
+
+	ret = ceph_decode_entity_addr_versioned(p, end, paddr);
+
+bad:
+	return ret;
+}
+
 int
 ceph_decode_entity_addr(void **p, void *end, struct ceph_entity_addr *addr)
 {
 	u8 marker;

 	ceph_decode_8_safe(p, end, marker, bad);
-	if (marker == 1)
+	if (marker == CEPH_ENTITY_ADDR_TYPE_MSGR2)
+		return ceph_decode_entity_addr_versioned_msgr2(p, end, addr);
+	else if (marker == CEPH_ENTITY_ADDR_TYPE_LEGACY)
 		return ceph_decode_entity_addr_versioned(p, end, addr);
-	else if (marker == 0)
+	else if (marker == CEPH_ENTITY_ADDR_TYPE_NONE)
 		return ceph_decode_entity_addr_legacy(p, end, addr);
+
 bad:
 	return -EINVAL;
 }

From patchwork Thu Nov 14 10:57:34 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luis Henriques
X-Patchwork-Id: 11243537
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 977B8930 for ; Thu, 14 Nov 2019 10:57:57 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7D4D820709 for ; Thu, 14 Nov 2019 10:57:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727126AbfKNK54 (ORCPT ); Thu, 14 Nov 2019 05:57:56 -0500
Received: from mx2.suse.de ([195.135.220.15]:33340 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726923AbfKNK5l (ORCPT ); Thu, 14 Nov 2019 05:57:41 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 014F4AD07; Thu, 14 Nov 2019 10:57:40 +0000 (UTC)
From: Luis Henriques
To: Jeff Layton, Sage Weil, Ilya Dryomov, "Yan, Zheng"
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques
Subject: [RFC PATCH v2 2/4] ceph: get the require_osd_release field from the osdmap
Date: Thu, 14 Nov 2019 10:57:34 +0000
Message-Id: <20191114105736.8636-3-lhenriques@suse.com>
In-Reply-To: <20191114105736.8636-1-lhenriques@suse.com>
References: <20191114105736.8636-1-lhenriques@suse.com>
MIME-Version: 1.0
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Since Ceph Octopus, OSDs are encoding require_osd_release into the client data
part of the osdmap.  This patch adds code to decode this extra field.

Signed-off-by: Luis Henriques
---
 include/linux/ceph/ceph_features.h | 10 ++++++++--
 include/linux/ceph/osdmap.h        |  1 +
 net/ceph/osdmap.c                  | 21 +++++++++++++++++++++
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/linux/ceph/ceph_features.h b/include/linux/ceph/ceph_features.h
index 39e6f4c57580..f329d1907dd7 100644
--- a/include/linux/ceph/ceph_features.h
+++ b/include/linux/ceph/ceph_features.h
@@ -9,6 +9,7 @@
  */
 #define CEPH_FEATURE_INCARNATION_1 (0ull)
 #define CEPH_FEATURE_INCARNATION_2 (1ull<<57) // CEPH_FEATURE_SERVER_JEWEL
+#define CEPH_FEATURE_INCARNATION_3 ((1ull<<57)|(1ull<<28)) // SERVER_MIMIC

 #define DEFINE_CEPH_FEATURE(bit, incarnation, name)			\
 	static const uint64_t CEPH_FEATURE_##name = (1ULL<pg_upmap_items));
 	}

+	if (struct_v >= 6)
+		/* crush version */
+		ceph_decode_skip_32(p, end, e_inval);
+	if (struct_v >= 7) {
+		/*
+		 * skip removed_snaps and purged_snaps
+		 * (snap_interval_set_t = 8 + 8)
+		 */
+		ceph_decode_skip_set(p, end, 16, e_inval);
+		ceph_decode_skip_set(p, end, 16, e_inval);
+	}
+	if (struct_v >= 9) {
+		struct ceph_timespec ts;
+
+		/* last_up_change and last_in_change */
+		ceph_decode_copy_safe(p, end, &ts, sizeof(ts), e_inval);
+		ceph_decode_copy_safe(p, end, &ts, sizeof(ts), e_inval);
+	}
+	if (struct_v >= 10)
+		ceph_decode_8_safe(p, end, map->require_osd_release, e_inval);
+
 	/* ignore the rest */
 	*p = end;

From patchwork Thu Nov 14 10:57:35 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luis Henriques
X-Patchwork-Id: 11243533
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A363D13BD for ; Thu, 14 Nov 2019 10:57:54 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8D0AF20709 for ; Thu, 14 Nov
2019 10:57:54 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727022AbfKNK5n (ORCPT ); Thu, 14 Nov 2019 05:57:43 -0500
Received: from mx2.suse.de ([195.135.220.15]:33358 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725977AbfKNK5m (ORCPT ); Thu, 14 Nov 2019 05:57:42 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id C7E62AD5F; Thu, 14 Nov 2019 10:57:40 +0000 (UTC)
From: Luis Henriques
To: Jeff Layton, Sage Weil, Ilya Dryomov, "Yan, Zheng"
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques
Subject: [RFC PATCH v2 3/4] ceph: add require_osd_release field to osdmap debugfs
Date: Thu, 14 Nov 2019 10:57:35 +0000
Message-Id: <20191114105736.8636-4-lhenriques@suse.com>
In-Reply-To: <20191114105736.8636-1-lhenriques@suse.com>
References: <20191114105736.8636-1-lhenriques@suse.com>
MIME-Version: 1.0
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Add the require_osd_release information to debugfs.
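As a rough userspace illustration of the debugfs line this patch produces, the
sketch below maps a release number to its name and formats the line with
snprintf().  It uses a lookup table instead of the switch statement in the
patch, which is a substitution made here for brevity; release_name() and
format_release_line() are hypothetical names.  The release numbers (argonaut =
1 through octopus = 15) are the ones defined by the series.

```c
#include <stddef.h>
#include <stdio.h>

/* Index is the release number; slot 0 is unused. */
static const char *const release_names[] = {
	NULL, "argonaut", "bobtail", "cuttlefish", "dumpling", "emperor",
	"firefly", "giant", "hammer", "infernalis", "jewel", "kraken",
	"luminous", "mimic", "nautilus", "octopus",
};

/* Return the release name, or "unknown" for out-of-range numbers. */
static const char *release_name(int r)
{
	size_t n = sizeof(release_names) / sizeof(release_names[0]);

	if (r < 1 || (size_t)r >= n)
		return "unknown";
	return release_names[r];
}

/* Format the line as it would appear in the osdmap debugfs file. */
static int format_release_line(char *buf, size_t len, int r)
{
	return snprintf(buf, len, "require_osd_release: %s\n",
			release_name(r));
}
```

A value of 0, which is what an osdmap that does not carry the field would leave
behind if the structure is zero-initialised (an assumption here), is reported
as "unknown".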
Signed-off-by: Luis Henriques
---
 include/linux/ceph/rados.h | 22 ++++++++++++++++++++++
 net/ceph/ceph_strings.c    | 38 ++++++++++++++++++++++++++++++++++++++
 net/ceph/debugfs.c         |  2 ++
 3 files changed, 62 insertions(+)

diff --git a/include/linux/ceph/rados.h b/include/linux/ceph/rados.h
index 3eb0e55665b4..68bc65f971b4 100644
--- a/include/linux/ceph/rados.h
+++ b/include/linux/ceph/rados.h
@@ -164,6 +164,28 @@ extern const char *ceph_osd_state_name(int s);
 #define CEPH_OSDMAP_REQUIRE_LUMINOUS (1<<18) /* require l for booting osds */
 #define CEPH_OSDMAP_RECOVERY_DELETES (1<<19) /* deletes performed during recovery instead of peering */

+/*
+ * major ceph release numbers
+ */
+#define CEPH_RELEASE_ARGONAUT	1
+#define CEPH_RELEASE_BOBTAIL	2
+#define CEPH_RELEASE_CUTTLEFISH	3
+#define CEPH_RELEASE_DUMPLING	4
+#define CEPH_RELEASE_EMPEROR	5
+#define CEPH_RELEASE_FIREFLY	6
+#define CEPH_RELEASE_GIANT	7
+#define CEPH_RELEASE_HAMMER	8
+#define CEPH_RELEASE_INFERNALIS	9
+#define CEPH_RELEASE_JEWEL	10
+#define CEPH_RELEASE_KRAKEN	11
+#define CEPH_RELEASE_LUMINOUS	12
+#define CEPH_RELEASE_MIMIC	13
+#define CEPH_RELEASE_NAUTILUS	14
+#define CEPH_RELEASE_OCTOPUS	15
+#define CEPH_RELEASE_MAX	16	/* highest + 1 */
+
+extern const char *ceph_release_name(int r);
+
 /*
  * The error code to return when an OSD can't handle a write
  * because it is too large.
diff --git a/net/ceph/ceph_strings.c b/net/ceph/ceph_strings.c
index 10e01494993c..3f280f17bbcb 100644
--- a/net/ceph/ceph_strings.c
+++ b/net/ceph/ceph_strings.c
@@ -60,3 +60,41 @@ const char *ceph_osd_state_name(int s)
 		return "???";
 	}
 }
+
+const char *ceph_release_name(int r)
+{
+	switch (r) {
+	case CEPH_RELEASE_ARGONAUT:
+		return "argonaut";
+	case CEPH_RELEASE_BOBTAIL:
+		return "bobtail";
+	case CEPH_RELEASE_CUTTLEFISH:
+		return "cuttlefish";
+	case CEPH_RELEASE_DUMPLING:
+		return "dumpling";
+	case CEPH_RELEASE_EMPEROR:
+		return "emperor";
+	case CEPH_RELEASE_FIREFLY:
+		return "firefly";
+	case CEPH_RELEASE_GIANT:
+		return "giant";
+	case CEPH_RELEASE_HAMMER:
+		return "hammer";
+	case CEPH_RELEASE_INFERNALIS:
+		return "infernalis";
+	case CEPH_RELEASE_JEWEL:
+		return "jewel";
+	case CEPH_RELEASE_KRAKEN:
+		return "kraken";
+	case CEPH_RELEASE_LUMINOUS:
+		return "luminous";
+	case CEPH_RELEASE_MIMIC:
+		return "mimic";
+	case CEPH_RELEASE_NAUTILUS:
+		return "nautilus";
+	case CEPH_RELEASE_OCTOPUS:
+		return "octopus";
+	default:
+		return "unknown";
+	}
+}
diff --git a/net/ceph/debugfs.c b/net/ceph/debugfs.c
index 7cb992e55475..d42071f6ab57 100644
--- a/net/ceph/debugfs.c
+++ b/net/ceph/debugfs.c
@@ -65,6 +65,8 @@ static int osdmap_show(struct seq_file *s, void *p)
 	down_read(&osdc->lock);
 	seq_printf(s, "epoch %u barrier %u flags 0x%x\n", map->epoch,
 			osdc->epoch_barrier, map->flags);
+	seq_printf(s, "require_osd_release: %s\n",
+		   ceph_release_name(map->require_osd_release));

 	for (n = rb_first(&map->pg_pools); n; n = rb_next(n)) {
 		struct ceph_pg_pool_info *pi =

From patchwork Thu Nov 14 10:57:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Luis Henriques
X-Patchwork-Id: 11243531
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 27278930 for ; Thu, 14 Nov 2019 10:57:53 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 11F1E20709 for ; Thu, 14 Nov 2019 10:57:53 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727056AbfKNK5o (ORCPT ); Thu, 14 Nov 2019 05:57:44 -0500
Received: from mx2.suse.de ([195.135.220.15]:33384 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726992AbfKNK5n (ORCPT ); Thu, 14 Nov 2019 05:57:43 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id A2FBEAE5E; Thu, 14 Nov 2019 10:57:41 +0000 (UTC)
From: Luis Henriques
To: Jeff Layton, Sage Weil, Ilya Dryomov, "Yan, Zheng"
Cc: ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques
Subject: [RFC PATCH v2 4/4] ceph: add support for sending truncate_{seq,size} in 'copy-from' Op
Date: Thu, 14 Nov 2019 10:57:36 +0000
Message-Id: <20191114105736.8636-5-lhenriques@suse.com>
In-Reply-To: <20191114105736.8636-1-lhenriques@suse.com>
References: <20191114105736.8636-1-lhenriques@suse.com>
MIME-Version: 1.0
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Doing an object copy in Ceph copies not only the data but also the
truncate_seq value.  This may make sense for generic RADOS object copies, but
in the specific case of a file copy it results in data corruption in the
destination file.

To fix this, the 'copy-from' operation had to be modified so that it can
receive two extra parameters: the destination object's truncate_seq and
truncate_size.  This patch adds support for these extra parameters to the
kernel client.

Unfortunately, the modified operation is available only as of Ceph Octopus, so
it is necessary to ensure that the OSD doing the copy does indeed support this
feature.
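The two steps described above (check the release, then append the extra
parameters) can be sketched in userspace as follows.  This is an illustration
under stated assumptions, not the kernel code: encode_truncate_params() and
put_le() are hypothetical helpers, and the sketch folds the release check and
the encoding into one function, whereas the patch does the check in the file
copy path and the encoding in the OSD client.  The field order (32-bit
truncate_seq, then 64-bit truncate_size, both little-endian) follows the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define RELEASE_OCTOPUS 15	/* same value as CEPH_RELEASE_OCTOPUS */

/* Store the low nbytes of v at *p in little-endian order and advance *p. */
static void put_le(uint8_t **p, uint64_t v, size_t nbytes)
{
	for (size_t i = 0; i < nbytes; i++)
		*(*p)++ = (uint8_t)(v >> (8 * i));
}

/*
 * Append truncate_seq and truncate_size to the buffer at *p.
 * Returns 0 on success, -EOPNOTSUPP if the cluster is older than Octopus,
 * or -ERANGE if fewer than 12 bytes remain before end.
 */
static int encode_truncate_params(uint8_t **p, const uint8_t *end,
				  int require_osd_release,
				  uint32_t truncate_seq,
				  uint64_t truncate_size)
{
	if (require_osd_release < RELEASE_OCTOPUS)
		return -EOPNOTSUPP;
	if ((size_t)(end - *p) < sizeof(uint32_t) + sizeof(uint64_t))
		return -ERANGE;

	put_le(p, truncate_seq, sizeof(uint32_t));
	put_le(p, truncate_size, sizeof(uint64_t));
	return 0;
}
```

On failure the write pointer is left untouched, so a caller can fall back to
another copy method without having to rewind the buffer.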
Link: https://tracker.ceph.com/issues/37378
Signed-off-by: Luis Henriques
---
 fs/ceph/file.c                  | 10 +++++++++-
 include/linux/ceph/osd_client.h |  1 +
 include/linux/ceph/rados.h      |  1 +
 net/ceph/osd_client.c           |  7 ++++++-
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index bd77adb64bfd..f45bb3837a31 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1928,6 +1928,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,
 	struct ceph_fs_client *src_fsc = ceph_inode_to_client(src_inode);
 	struct ceph_object_locator src_oloc, dst_oloc;
 	struct ceph_object_id src_oid, dst_oid;
+	struct ceph_osdmap *map = src_fsc->client->osdc.osdmap;
 	loff_t endoff = 0, size;
 	ssize_t ret = -EIO;
 	u64 src_objnum, dst_objnum, src_objoff, dst_objoff;
@@ -1958,6 +1959,11 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,

 	if (ceph_test_mount_opt(src_fsc, NOCOPYFROM))
 		return -EOPNOTSUPP;
+	if (map->require_osd_release < CEPH_RELEASE_OCTOPUS) {
+		pr_warn_once("copy_file_range not supported in '%s' release\n",
+			     ceph_release_name(map->require_osd_release));
+		return -EOPNOTSUPP;
+	}

 	/*
 	 * Striped file layouts require that we copy partial objects, but the
@@ -2086,7 +2092,9 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,
 			CEPH_OSD_OP_FLAG_FADVISE_NOCACHE,
 			&dst_oid, &dst_oloc,
 			CEPH_OSD_OP_FLAG_FADVISE_SEQUENTIAL |
-			CEPH_OSD_OP_FLAG_FADVISE_DONTNEED, 0);
+			CEPH_OSD_OP_FLAG_FADVISE_DONTNEED,
+			dst_ci->i_truncate_seq, dst_ci->i_truncate_size,
+			CEPH_OSD_COPY_FROM_FLAG_TRUNCATE_SEQ);
 		if (err) {
 			dout("ceph_osdc_copy_from returned %d\n", err);
 			if (!ret)
diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index eaffbdddf89a..5a62dbd3f4c2 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -534,6 +534,7 @@ int ceph_osdc_copy_from(struct ceph_osd_client *osdc,
 			struct ceph_object_id *dst_oid,
 			struct ceph_object_locator *dst_oloc,
 			u32 dst_fadvise_flags,
+			u32 truncate_seq, u64 truncate_size,
 			u8 copy_from_flags);

 /* watch/notify */
diff --git a/include/linux/ceph/rados.h b/include/linux/ceph/rados.h
index 68bc65f971b4..318da211bb79 100644
--- a/include/linux/ceph/rados.h
+++ b/include/linux/ceph/rados.h
@@ -468,6 +468,7 @@ enum {
 	CEPH_OSD_COPY_FROM_FLAG_MAP_SNAP_CLONE = 8, /* map snap direct to
 						     * cloneid */
 	CEPH_OSD_COPY_FROM_FLAG_RWORDERED = 16,     /* order with write */
+	CEPH_OSD_COPY_FROM_FLAG_TRUNCATE_SEQ = 32,  /* send truncate_{seq,size} */
 };

 enum {
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index ba45b074a362..02abf2790e99 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -5315,6 +5315,7 @@ static int osd_req_op_copy_from_init(struct ceph_osd_request *req,
 				     struct ceph_object_locator *src_oloc,
 				     u32 src_fadvise_flags,
 				     u32 dst_fadvise_flags,
+				     u32 truncate_seq, u64 truncate_size,
 				     u8 copy_from_flags)
 {
 	struct ceph_osd_req_op *op;
@@ -5335,6 +5336,8 @@ static int osd_req_op_copy_from_init(struct ceph_osd_request *req,
 	end = p + PAGE_SIZE;
 	ceph_encode_string(&p, end, src_oid->name, src_oid->name_len);
 	encode_oloc(&p, end, src_oloc);
+	ceph_encode_32(&p, truncate_seq);
+	ceph_encode_64(&p, truncate_size);
 	op->indata_len = PAGE_SIZE - (end - p);

 	ceph_osd_data_pages_init(&op->copy_from.osd_data, pages,
@@ -5350,6 +5353,7 @@ int ceph_osdc_copy_from(struct ceph_osd_client *osdc,
 			struct ceph_object_id *dst_oid,
 			struct ceph_object_locator *dst_oloc,
 			u32 dst_fadvise_flags,
+			u32 truncate_seq, u64 truncate_size,
 			u8 copy_from_flags)
 {
 	struct ceph_osd_request *req;
@@ -5366,7 +5370,8 @@ int ceph_osdc_copy_from(struct ceph_osd_client *osdc,

 	ret = osd_req_op_copy_from_init(req, src_snapid, src_version, src_oid,
 					src_oloc, src_fadvise_flags,
-					dst_fadvise_flags, copy_from_flags);
+					dst_fadvise_flags, truncate_seq,
+					truncate_size, copy_from_flags);
 	if (ret)
 		goto out;