From patchwork Tue Dec 17 11:25:39 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexandre Oliva
X-Patchwork-Id: 3360711
From: Alexandre Oliva
To: ceph-devel@vger.kernel.org
Cc: Zheng Yan
Subject: [PATCH] mds: handle setxattr ceph.parent
Organization: Free thinker, not speaking for the GNU Project
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)
Date: Tue, 17 Dec 2013 09:25:39 -0200
X-Mailing-List: ceph-devel@vger.kernel.org

This is a (probably half-baked) solution for the problem of cephfs
clusters encountering recovery problems when clients are accessing
files that don't have a parent attribute. It enables clients to
request the parent attribute to be updated right away, by a simple
setxattr call:

  # setfattr -n ceph.parent /cephfs/mount/path/name

I had to relax the assert because there's no reason I can think of to
force the object and its parent dirty just to set this internal
bookkeeping xattr. The operation is not journaled, as it takes effect
immediately. Although there's no assurance that the operation is
completed before success is returned, my tests indicate that running
rados getxattr right after setfattr already gets an attribute, for
objects that had been created before the introduction of the parent
attribute.

I realize Zheng Yan posted a patch that would mark for update missing
or too-old parent attributes on the fly, when inodes were brought into
the cache, so that the parent attribute would be updated when the
inodes were to be expired from the MDS log. I had mds running with
that patch for a while, and I even explicitly touched and linked files
and dirs that were missing the parent attribute; many, but not all of
the files and dirs got the attribute, and I'm having some difficulties
getting it to work on the remaining ones, in part because it takes so
long to take effect (as in, perform an operation, then wait for
several hours until the then-current MDS log segment gets expired,
then check whether the attribute was set).

This patch causes the parent attribute to be set right away.
I'm not sure this immediate behavior would be appropriate for use in
production (as in, I'm not sure creating an inode and trying to set
the parent attribute might cause a failure because the inode object
isn't there yet by the time we try to set an attribute on it), but it
should be ok for retrofitting ancient inode objects so that they don't
cause recovery problems due to the lack of the parent xattr.

BTW, Zheng Yan, thanks for the patch that fixed mds readdir with dirs
ending in remote (hard) links; this one had annoyed me for a long
time, and I was just about to start actually digging into it when I
saw the 0.73 announcement that mentioned what appeared to be a fix for
the problem I was running into, and indeed, it was. I merged it into
my 0.72.1 build and it's been working great!


mds: handle setxattr ceph.parent

From: Alexandre Oliva

Enable clients to setxattr ceph.parent to update the parent xattr.

Signed-off-by: Alexandre Oliva
---
 src/mds/CInode.cc |    2 +-
 src/mds/Server.cc |    5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mds/CInode.cc b/src/mds/CInode.cc
index 1fc57fe..7c692d4 100644
--- a/src/mds/CInode.cc
+++ b/src/mds/CInode.cc
@@ -1009,7 +1009,7 @@ struct C_Inode_StoredBacktrace : public Context {
 void CInode::store_backtrace(Context *fin)
 {
   dout(10) << "store_backtrace on " << *this << dendl;
-  assert(is_dirty_parent());
+  assert(!fin || is_dirty_parent());
 
   auth_pin(this);
 
diff --git a/src/mds/Server.cc b/src/mds/Server.cc
index 6bb3aef..2afb6d7 100644
--- a/src/mds/Server.cc
+++ b/src/mds/Server.cc
@@ -3615,6 +3615,11 @@ void Server::handle_set_vxattr(MDRequest *mdr, CInode *cur,
     journal_and_reply(mdr, cur, 0, le, new C_MDS_inode_update_finish(mds, mdr, cur));
     return;
   }
+  else if (name == "ceph.parent" && value == "") {
+    cur->store_backtrace(NULL);
+    reply_request(mdr, 0);
+    return;
+  }
 
   dout(10) << " unknown vxattr " << name << dendl;
   reply_request(mdr, -EINVAL);
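
For readers who want to reason about the two hunks above without
building the MDS, here is a minimal standalone sketch of the intended
behavior. It is not Ceph code: InodeModel and handle_set_vxattr_model
are hypothetical stand-ins invented for illustration, and the bool
has_completion parameter merely models whether a non-NULL Context is
passed to store_backtrace. It only mirrors the relaxed assert and the
new "ceph.parent" branch with its -EINVAL fallthrough.

```cpp
#include <cassert>
#include <cerrno>
#include <string>

// Hypothetical stand-in for CInode; models only the state the patch touches.
struct InodeModel {
  bool dirty_parent = false;  // models is_dirty_parent()
  int backtraces_stored = 0;  // counts store_backtrace() calls

  // Models the relaxed precondition: a caller that supplies a completion
  // callback must still have a dirty parent, while a caller that supplies
  // none (the new vxattr path) may store the backtrace unconditionally.
  void store_backtrace(bool has_completion) {
    assert(!has_completion || dirty_parent);
    ++backtraces_stored;
  }
};

// Models only the new branch added to the vxattr handler and the existing
// "unknown vxattr" fallthrough. Returns 0 on success, -EINVAL otherwise.
int handle_set_vxattr_model(InodeModel &cur,
                            const std::string &name,
                            const std::string &value) {
  if (name == "ceph.parent" && value.empty()) {
    cur.store_backtrace(false);  // no completion callback, as in the patch
    return 0;
  }
  return -EINVAL;
}
```

In this model, a request for "ceph.parent" with an empty value stores a
backtrace even on a clean inode, while any other name, or a non-empty
value, is rejected the same way an unknown vxattr is.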