From patchwork Tue Mar 26 22:07:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872247 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 977071708 for ; Tue, 26 Mar 2019 22:07:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8247F28569 for ; Tue, 26 Mar 2019 22:07:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 76B6D28CCE; Tue, 26 Mar 2019 22:07:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DD81D28569 for ; Tue, 26 Mar 2019 22:07:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731535AbfCZWHd (ORCPT ); Tue, 26 Mar 2019 18:07:33 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45530 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731589AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7974931688F7; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2E78C600C1; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id C8BE620C54; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 1/8] Revert "nfsdcltrack: remove the nfsdcld daemon" Date: Tue, 26 Mar 2019 18:07:23 -0400 Message-Id: <20190326220730.3763-2-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.41]); Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This reverts commit 2cf11ec6ed261ef56bbd0d73ff404fe69f1fefb0. nfsdcld was originally deprecated in favor of a usermodehelper, so as to not require another running daemon. But that turned out to make namespaced nfsd's too difficult, so this commit brings back nfsdcld. A few things have been changed since nfsdcld was removed in 2012: - libnfs.a is now libnfs.la - sqlite_insert_client and sqlite_check_client both take a bool has_session which is being hard-coded to false for now. - sqlite_insert_client also takes a bool zerotime which is being hard-coded to false. - an xlog() call whose arguments did not match the format string (and could crash nfsdcld) has been fixed - cld_pipe_open() now calls event_initialized() rather than checking for EVLIST_INIT directly (which does not compile with newer versions of libevent). Signed-off-by: Scott Mayhew --- .gitignore | 1 + utils/nfsdcltrack/Makefile.am | 7 +- utils/nfsdcltrack/nfsdcld.c | 611 ++++++++++++++++++++++++++++++++++ utils/nfsdcltrack/nfsdcld.man | 185 ++++++++++ 4 files changed, 802 insertions(+), 2 deletions(-) create mode 100644 utils/nfsdcltrack/nfsdcld.c create mode 100644 utils/nfsdcltrack/nfsdcld.man diff --git a/.gitignore b/.gitignore index e91e7a2..d4f4f34 100644 --- a/.gitignore +++ b/.gitignore @@ -54,6 +54,7 @@ utils/rquotad/rquotad utils/rquotad/rquota.h utils/rquotad/rquota_xdr.c utils/showmount/showmount +utils/nfsdcltrack/nfsdcld utils/nfsdcltrack/nfsdcltrack utils/statd/statd tools/locktest/testlk diff --git a/utils/nfsdcltrack/Makefile.am b/utils/nfsdcltrack/Makefile.am index 2f7fe3d..0f599c0 100644 --- a/utils/nfsdcltrack/Makefile.am +++ b/utils/nfsdcltrack/Makefile.am @@ -4,11 +4,14 @@ # overridden at config time. The kernel "knows" the /sbin name. sbindir = /sbin -man8_MANS = nfsdcltrack.man +man8_MANS = nfsdcld.man nfsdcltrack.man EXTRA_DIST = $(man8_MANS) AM_CFLAGS += -D_LARGEFILE64_SOURCE -sbin_PROGRAMS = nfsdcltrack +sbin_PROGRAMS = nfsdcld nfsdcltrack + +nfsdcld_SOURCES = nfsdcld.c sqlite.c +nfsdcld_LDADD = ../../support/nfs/libnfs.la $(LIBEVENT) $(LIBSQLITE) $(LIBCAP) noinst_HEADERS = sqlite.h diff --git a/utils/nfsdcltrack/nfsdcld.c b/utils/nfsdcltrack/nfsdcld.c new file mode 100644 index 0000000..082f3ab --- /dev/null +++ b/utils/nfsdcltrack/nfsdcld.c @@ -0,0 +1,611 @@ +/* + * nfsdcld.c -- NFSv4 client name tracking daemon + * + * Copyright (C) 2011 Red Hat, Jeff Layton + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, + * Boston, MA 02110-1301, USA. + */ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif /* HAVE_CONFIG_H */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#ifdef HAVE_SYS_CAPABILITY_H +#include +#include +#endif + +#include "xlog.h" +#include "nfslib.h" +#include "cld.h" +#include "sqlite.h" + +#ifndef PIPEFS_DIR +#define PIPEFS_DIR NFS_STATEDIR "/rpc_pipefs" +#endif + +#define DEFAULT_CLD_PATH PIPEFS_DIR "/nfsd/cld" + +#ifndef CLD_DEFAULT_STORAGEDIR +#define CLD_DEFAULT_STORAGEDIR NFS_STATEDIR "/nfsdcld" +#endif + +#define UPCALL_VERSION 1 + +/* private data structures */ +struct cld_client { + int cl_fd; + struct event cl_event; + struct cld_msg cl_msg; +}; + +/* global variables */ +static char *pipepath = DEFAULT_CLD_PATH; +static int inotify_fd = -1; +static struct event pipedir_event; + +static struct option longopts[] = +{ + { "help", 0, NULL, 'h' }, + { "foreground", 0, NULL, 'F' }, + { "debug", 0, NULL, 'd' }, + { "pipe", 1, NULL, 'p' }, + { "storagedir", 1, NULL, 's' }, + { NULL, 0, 0, 0 }, +}; + +/* forward declarations */ +static void cldcb(int UNUSED(fd), short which, void *data); + +static void +usage(char *progname) +{ + printf("%s [ -hFd ] [ -p pipe ] [ -s dir ]\n", progname); +} + +static int +cld_set_caps(void) +{ + int ret = 0; +#ifdef HAVE_SYS_CAPABILITY_H + unsigned long i; + cap_t caps; + + if (getuid() != 0) { + xlog(L_ERROR, "Not running as root. Daemon won't be able to " + "open the pipe after dropping capabilities!"); + return -EINVAL; + } + + /* prune the bounding set to nothing */ + for (i = 0; prctl(PR_CAPBSET_READ, i, 0, 0, 0) >= 0 ; ++i) { + ret = prctl(PR_CAPBSET_DROP, i, 0, 0, 0); + if (ret) { + xlog(L_ERROR, "Unable to prune capability %lu from " + "bounding set: %m", i); + return -errno; + } + } + + /* get a blank capset */ + caps = cap_init(); + if (caps == NULL) { + xlog(L_ERROR, "Unable to get blank capability set: %m"); + return -errno; + } + + /* reset the process capabilities */ + if (cap_set_proc(caps) != 0) { + xlog(L_ERROR, "Unable to set process capabilities: %m"); + ret = -errno; + } + cap_free(caps); +#endif + return ret; +} + +#define INOTIFY_EVENT_MAX (sizeof(struct inotify_event) + NAME_MAX) + +static int +cld_pipe_open(struct cld_client *clnt) +{ + int fd; + + xlog(D_GENERAL, "%s: opening upcall pipe %s", __func__, pipepath); + fd = open(pipepath, O_RDWR, 0); + if (fd < 0) { + xlog(D_GENERAL, "%s: open of %s failed: %m", __func__, pipepath); + return -errno; + } + + if (event_initialized(&clnt->cl_event)) + event_del(&clnt->cl_event); + if (clnt->cl_fd >= 0) + close(clnt->cl_fd); + + clnt->cl_fd = fd; + event_set(&clnt->cl_event, clnt->cl_fd, EV_READ, cldcb, clnt); + /* event_add is done by the caller */ + return 0; +} + +static void +cld_inotify_cb(int UNUSED(fd), short which, void *data) +{ + int ret; + size_t elen; + ssize_t rret; + char evbuf[INOTIFY_EVENT_MAX]; + char *dirc = NULL, *pname; + struct inotify_event *event = (struct inotify_event *)evbuf; + struct cld_client *clnt = data; + + if (which != EV_READ) + return; + + xlog(D_GENERAL, "%s: called for EV_READ", __func__); + + dirc = strndup(pipepath, PATH_MAX); + if (!dirc) { + xlog(L_ERROR, "%s: unable to allocate memory", __func__); + goto out; + } + + rret = read(inotify_fd, evbuf, INOTIFY_EVENT_MAX); + if (rret < 0) { + xlog(L_ERROR, "%s: read from inotify fd failed: %m", __func__); + goto out; + } + + /* check to see if we have a filename in the evbuf */ + if (!event->len) { + xlog(D_GENERAL, "%s: no filename in inotify event", __func__); + goto out; + } + + pname = basename(dirc); + elen = strnlen(event->name, event->len); + + /* does the filename match our pipe? */ + if (strlen(pname) != elen || memcmp(pname, event->name, elen)) { + xlog(D_GENERAL, "%s: wrong filename (%s)", __func__, + event->name); + goto out; + } + + ret = cld_pipe_open(clnt); + switch (ret) { + case 0: + /* readd the event for the cl_event pipe */ + event_add(&clnt->cl_event, NULL); + break; + case -ENOENT: + /* pipe must have disappeared, wait for it to come back */ + goto out; + default: + /* anything else is fatal */ + xlog(L_FATAL, "%s: unable to open new pipe (%d). Aborting.", + ret, __func__); + exit(ret); + } + +out: + event_add(&pipedir_event, NULL); + free(dirc); +} + +static int +cld_inotify_setup(void) +{ + int ret; + char *dirc, *dname; + + dirc = strndup(pipepath, PATH_MAX); + if (!dirc) { + xlog_err("%s: unable to allocate memory", __func__); + ret = -ENOMEM; + goto out_free; + } + + dname = dirname(dirc); + + inotify_fd = inotify_init(); + if (inotify_fd < 0) { + xlog_err("%s: inotify_init failed: %m", __func__); + ret = -errno; + goto out_free; + } + + ret = inotify_add_watch(inotify_fd, dname, IN_CREATE); + if (ret < 0) { + xlog_err("%s: inotify_add_watch failed: %m", __func__); + ret = -errno; + goto out_err; + } + +out_free: + free(dirc); + return 0; +out_err: + close(inotify_fd); + goto out_free; +} + +/* + * Set an inotify watch on the directory that should contain the pipe, and then + * try to open it. If it fails with anything but -ENOENT, return the error + * immediately. + * + * If it succeeds, then set up the pipe event handler. At that point, set up + * the inotify event handler and go ahead and return success. + */ +static int +cld_pipe_init(struct cld_client *clnt) +{ + int ret; + + xlog(D_GENERAL, "%s: init pipe handlers", __func__); + + ret = cld_inotify_setup(); + if (ret != 0) + goto out; + + clnt->cl_fd = -1; + ret = cld_pipe_open(clnt); + switch (ret) { + case 0: + /* add the event and we're good to go */ + event_add(&clnt->cl_event, NULL); + break; + case -ENOENT: + /* ignore this error -- cld_inotify_cb will handle it */ + ret = 0; + break; + default: + /* anything else is fatal */ + close(inotify_fd); + goto out; + } + + /* set event for inotify read */ + event_set(&pipedir_event, inotify_fd, EV_READ, cld_inotify_cb, clnt); + event_add(&pipedir_event, NULL); +out: + return ret; +} + +static void +cld_not_implemented(struct cld_client *clnt) +{ + int ret; + ssize_t bsize, wsize; + struct cld_msg *cmsg = &clnt->cl_msg; + + xlog(D_GENERAL, "%s: downcalling with not implemented error", __func__); + + /* set up reply */ + cmsg->cm_status = -EOPNOTSUPP; + + bsize = sizeof(*cmsg); + + wsize = atomicio((void *)write, clnt->cl_fd, cmsg, bsize); + if (wsize != bsize) + xlog(L_ERROR, "%s: problem writing to cld pipe (%ld): %m", + __func__, wsize); + + /* reopen pipe, just to be sure */ + ret = cld_pipe_open(clnt); + if (ret) { + xlog(L_FATAL, "%s: unable to reopen pipe: %d", __func__, ret); + exit(ret); + } +} + +static void +cld_create(struct cld_client *clnt) +{ + int ret; + ssize_t bsize, wsize; + struct cld_msg *cmsg = &clnt->cl_msg; + + xlog(D_GENERAL, "%s: create client record.", __func__); + + + ret = sqlite_insert_client(cmsg->cm_u.cm_name.cn_id, + cmsg->cm_u.cm_name.cn_len, + false, + false); + + cmsg->cm_status = ret ? -EREMOTEIO : ret; + + bsize = sizeof(*cmsg); + + xlog(D_GENERAL, "Doing downcall with status %d", cmsg->cm_status); + wsize = atomicio((void *)write, clnt->cl_fd, cmsg, bsize); + if (wsize != bsize) { + xlog(L_ERROR, "%s: problem writing to cld pipe (%ld): %m", + __func__, wsize); + ret = cld_pipe_open(clnt); + if (ret) { + xlog(L_FATAL, "%s: unable to reopen pipe: %d", + __func__, ret); + exit(ret); + } + } +} + +static void +cld_remove(struct cld_client *clnt) +{ + int ret; + ssize_t bsize, wsize; + struct cld_msg *cmsg = &clnt->cl_msg; + + xlog(D_GENERAL, "%s: remove client record.", __func__); + + ret = sqlite_remove_client(cmsg->cm_u.cm_name.cn_id, + cmsg->cm_u.cm_name.cn_len); + + cmsg->cm_status = ret ? -EREMOTEIO : ret; + + bsize = sizeof(*cmsg); + + xlog(D_GENERAL, "%s: downcall with status %d", __func__, + cmsg->cm_status); + wsize = atomicio((void *)write, clnt->cl_fd, cmsg, bsize); + if (wsize != bsize) { + xlog(L_ERROR, "%s: problem writing to cld pipe (%ld): %m", + __func__, wsize); + ret = cld_pipe_open(clnt); + if (ret) { + xlog(L_FATAL, "%s: unable to reopen pipe: %d", + __func__, ret); + exit(ret); + } + } +} + +static void +cld_check(struct cld_client *clnt) +{ + int ret; + ssize_t bsize, wsize; + struct cld_msg *cmsg = &clnt->cl_msg; + + xlog(D_GENERAL, "%s: check client record", __func__); + + ret = sqlite_check_client(cmsg->cm_u.cm_name.cn_id, + cmsg->cm_u.cm_name.cn_len, + false); + + /* set up reply */ + cmsg->cm_status = ret ? -EACCES : ret; + + bsize = sizeof(*cmsg); + + xlog(D_GENERAL, "%s: downcall with status %d", __func__, + cmsg->cm_status); + wsize = atomicio((void *)write, clnt->cl_fd, cmsg, bsize); + if (wsize != bsize) { + xlog(L_ERROR, "%s: problem writing to cld pipe (%ld): %m", + __func__, wsize); + ret = cld_pipe_open(clnt); + if (ret) { + xlog(L_FATAL, "%s: unable to reopen pipe: %d", + __func__, ret); + exit(ret); + } + } +} + +static void +cld_gracedone(struct cld_client *clnt) +{ + int ret; + ssize_t bsize, wsize; + struct cld_msg *cmsg = &clnt->cl_msg; + + xlog(D_GENERAL, "%s: grace done. cm_gracetime=%ld", __func__, + cmsg->cm_u.cm_gracetime); + + ret = sqlite_remove_unreclaimed(cmsg->cm_u.cm_gracetime); + + /* set up reply: downcall with 0 status */ + cmsg->cm_status = ret ? -EREMOTEIO : ret; + + bsize = sizeof(*cmsg); + + xlog(D_GENERAL, "Doing downcall with status %d", cmsg->cm_status); + wsize = atomicio((void *)write, clnt->cl_fd, cmsg, bsize); + if (wsize != bsize) { + xlog(L_ERROR, "%s: problem writing to cld pipe (%ld): %m", + __func__, wsize); + ret = cld_pipe_open(clnt); + if (ret) { + xlog(L_FATAL, "%s: unable to reopen pipe: %d", + __func__, ret); + exit(ret); + } + } +} + +static void +cldcb(int UNUSED(fd), short which, void *data) +{ + ssize_t len; + struct cld_client *clnt = data; + struct cld_msg *cmsg = &clnt->cl_msg; + + if (which != EV_READ) + goto out; + + len = atomicio(read, clnt->cl_fd, cmsg, sizeof(*cmsg)); + if (len <= 0) { + xlog(L_ERROR, "%s: pipe read failed: %m", __func__); + cld_pipe_open(clnt); + goto out; + } + + if (cmsg->cm_vers > UPCALL_VERSION) { + xlog(L_ERROR, "%s: unsupported upcall version: %hu", + __func__, cmsg->cm_vers); + cld_pipe_open(clnt); + goto out; + } + + switch(cmsg->cm_cmd) { + case Cld_Create: + cld_create(clnt); + break; + case Cld_Remove: + cld_remove(clnt); + break; + case Cld_Check: + cld_check(clnt); + break; + case Cld_GraceDone: + cld_gracedone(clnt); + break; + default: + xlog(L_WARNING, "%s: command %u is not yet implemented", + __func__, cmsg->cm_cmd); + cld_not_implemented(clnt); + } +out: + event_add(&clnt->cl_event, NULL); +} + +int +main(int argc, char **argv) +{ + char arg; + int rc = 0; + bool foreground = false; + char *progname; + char *storagedir = CLD_DEFAULT_STORAGEDIR; + struct cld_client clnt; + + memset(&clnt, 0, sizeof(clnt)); + + progname = strdup(basename(argv[0])); + if (!progname) { + fprintf(stderr, "%s: unable to allocate memory.\n", argv[0]); + return 1; + } + + event_init(); + xlog_syslog(0); + xlog_stderr(1); + + /* process command-line options */ + while ((arg = getopt_long(argc, argv, "hdFp:s:", longopts, + NULL)) != EOF) { + switch (arg) { + case 'd': + xlog_config(D_ALL, 1); + break; + case 'F': + foreground = true; + break; + case 'p': + pipepath = optarg; + break; + case 's': + storagedir = optarg; + break; + default: + usage(progname); + return 0; + } + } + + + xlog_open(progname); + if (!foreground) { + xlog_syslog(1); + xlog_stderr(0); + rc = daemon(0, 0); + if (rc) { + xlog(L_ERROR, "Unable to daemonize: %m"); + goto out; + } + } + + /* drop all capabilities */ + rc = cld_set_caps(); + if (rc) + goto out; + + /* + * now see if the storagedir is writable by root w/o CAP_DAC_OVERRIDE. + * If it isn't then give the user a warning but proceed as if + * everything is OK. If the DB has already been created, then + * everything might still work. If it doesn't exist at all, then + * assume that the maindb init will be able to create it. Fail on + * anything else. + */ + if (access(storagedir, W_OK) == -1) { + switch (errno) { + case EACCES: + xlog(L_WARNING, "Storage directory %s is not writable. " + "Should be owned by root and writable " + "by owner!", storagedir); + break; + case ENOENT: + /* ignore and assume that we can create dir as root */ + break; + default: + xlog(L_ERROR, "Unexpected error when checking access " + "on %s: %m", storagedir); + rc = -errno; + goto out; + } + } + + /* set up storage db */ + rc = sqlite_prepare_dbh(storagedir); + if (rc) { + xlog(L_ERROR, "Failed to open main database: %d", rc); + goto out; + } + + /* set up event handler */ + rc = cld_pipe_init(&clnt); + if (rc) + goto out; + + xlog(D_GENERAL, "%s: Starting event dispatch handler.", __func__); + rc = event_dispatch(); + if (rc < 0) + xlog(L_ERROR, "%s: event_dispatch failed: %m", __func__); + + close(clnt.cl_fd); + close(inotify_fd); +out: + free(progname); + return rc; +} diff --git a/utils/nfsdcltrack/nfsdcld.man b/utils/nfsdcltrack/nfsdcld.man new file mode 100644 index 0000000..9ddaf64 --- /dev/null +++ b/utils/nfsdcltrack/nfsdcld.man @@ -0,0 +1,185 @@ +.\" Automatically generated by Pod::Man 2.22 (Pod::Simple 3.13) +.\" +.\" Standard preamble: +.\" ======================================================================== +.de Sp \" Vertical space (when we can't use .PP) +.if t .sp .5v +.if n .sp +.. +.de Vb \" Begin verbatim text +.ft CW +.nf +.ne \\$1 +.. +.de Ve \" End verbatim text +.ft R +.fi +.. +.\" Set up some character translations and predefined strings. \*(-- will +.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left +.\" double quote, and \*(R" will give a right double quote. \*(C+ will +.\" give a nicer C++. Capital omega is used to do unbreakable dashes and +.\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, +.\" nothing in troff, for use with C<>. +.tr \(*W- +.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' +.ie n \{\ +. ds -- \(*W- +. ds PI pi +. if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch +. if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch +. ds L" "" +. ds R" "" +. ds C` "" +. ds C' "" +'br\} +.el\{\ +. ds -- \|\(em\| +. ds PI \(*p +. ds L" `` +. ds R" '' +'br\} +.\" +.\" Escape single quotes in literal strings from groff's Unicode transform. +.ie \n(.g .ds Aq \(aq +.el .ds Aq ' +.\" +.\" If the F register is turned on, we'll generate index entries on stderr for +.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index +.\" entries marked with X<> in POD. Of course, you'll have to process the +.\" output yourself in some meaningful fashion. +.ie \nF \{\ +. de IX +. tm Index:\\$1\t\\n%\t"\\$2" +.. +. nr % 0 +. rr F +.\} +.el \{\ +. de IX +.. +.\} +.\" +.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2). +.\" Fear. Run. Save yourself. No user-serviceable parts. +. \" fudge factors for nroff and troff +.if n \{\ +. ds #H 0 +. ds #V .8m +. ds #F .3m +. ds #[ \f1 +. ds #] \fP +.\} +.if t \{\ +. ds #H ((1u-(\\\\n(.fu%2u))*.13m) +. ds #V .6m +. ds #F 0 +. ds #[ \& +. ds #] \& +.\} +. \" simple accents for nroff and troff +.if n \{\ +. ds ' \& +. ds ` \& +. ds ^ \& +. ds , \& +. ds ~ ~ +. ds / +.\} +.if t \{\ +. ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u" +. ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u' +. ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u' +. ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u' +. ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u' +. ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u' +.\} +. \" troff and (daisy-wheel) nroff accents +.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V' +.ds 8 \h'\*(#H'\(*b\h'-\*(#H' +.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#] +.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H' +.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u' +.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#] +.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#] +.ds ae a\h'-(\w'a'u*4/10)'e +.ds Ae A\h'-(\w'A'u*4/10)'E +. \" corrections for vroff +.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u' +.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u' +. \" for low resolution devices (crt and lpr) +.if \n(.H>23 .if \n(.V>19 \ +\{\ +. ds : e +. ds 8 ss +. ds o a +. ds d- d\h'-1'\(ga +. ds D- D\h'-1'\(hy +. ds th \o'bp' +. ds Th \o'LP' +. ds ae ae +. ds Ae AE +.\} +.rm #[ #] #H #V #F C +.\" ======================================================================== +.\" +.IX Title "NFSDCLD 8" +.TH NFSDCLD 8 "2011-12-21" "" "" +.\" For nroff, turn off justification. Always turn off hyphenation; it makes +.\" way too many mistakes in technical documents. +.if n .ad l +.nh +.SH "NAME" +nfsdcld \- NFSv4 Client Tracking Daemon +.SH "SYNOPSIS" +.IX Header "SYNOPSIS" +nfsdcld [\-d] [\-F] [\-p path] [\-s stable storage dir] +.SH "DESCRIPTION" +.IX Header "DESCRIPTION" +nfsdcld is the NFSv4 client tracking daemon. It is not necessary to run +this daemon on machines that are not acting as NFSv4 servers. +.PP +When a network partition is combined with a server reboot, there are +edge conditions that can cause the server to grant lock reclaims when +other clients have taken conflicting locks in the interim. A more detailed +explanation of this issue is described in \s-1RFC\s0 3530, section 8.6.3. +.PP +In order to prevent these problems, the server must track a small amount +of per-client information on stable storage. This daemon provides the +userspace piece of that functionality. +.SH "OPTIONS" +.IX Header "OPTIONS" +.IP "\fB\-d\fR, \fB\-\-debug\fR" 4 +.IX Item "-d, --debug" +Enable debug level logging. +.IP "\fB\-F\fR, \fB\-\-foreground\fR" 4 +.IX Item "-F, --foreground" +Runs the daemon in the foreground and prints all output to stderr +.IP "\fB\-p\fR \fIpipe\fR, \fB\-\-pipe\fR=\fIpipe\fR" 4 +.IX Item "-p pipe, --pipe=pipe" +Location of the \*(L"cld\*(R" upcall pipe. The default value is +\&\fI/var/lib/nfs/rpc_pipefs/nfsd/cld\fR. If the pipe does not exist when the +daemon starts then it will wait for it to be created. +.IP "\fB\-s\fR \fIstorage_dir\fR, \fB\-\-storagedir\fR=\fIstorage_dir\fR" 4 +.IX Item "-s storagedir, --storagedir=storage_dir" +Directory where stable storage information should be kept. The default +value is \fI/var/lib/nfs/nfsdcld\fR. +.SH "NOTES" +.IX Header "NOTES" +The Linux kernel NFSv4 server has historically tracked this information +on stable storage by manipulating information on the filesystem +directly, in the directory to which \fI/proc/fs/nfsd/nfsv4recoverydir\fR +points. +.PP +This daemon requires a kernel that supports the nfsdcld upcall. If the +kernel does not support the new upcall, or is using the legacy client +name tracking code then it will not create the pipe that nfsdcld uses to +talk to the kernel. +.PP +This daemon should be run as root, as the pipe that it uses to communicate +with the kernel is only accessable by root. The daemon however does drop all +superuser capabilities after starting. Because of this, the \fIstoragedir\fR +should be owned by root, and be readable and writable by owner. +.SH "AUTHORS" +.IX Header "AUTHORS" +The nfsdcld daemon was developed by Jeff Layton . From patchwork Tue Mar 26 22:07:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872245 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C6E191708 for ; Tue, 26 Mar 2019 22:07:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B1F1B28A9C for ; Tue, 26 Mar 2019 22:07:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A65DE28CCE; Tue, 26 Mar 2019 22:07:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 04DEA28A9C for ; Tue, 26 Mar 2019 22:07:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731491AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42888 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731535AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7B76A307D978; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id 30B2C1001DD1; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id CD4B620D22; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 2/8] nfsdcld: move nfsdcld to its own directory Date: Tue, 26 Mar 2019 18:07:24 -0400 Message-Id: <20190326220730.3763-3-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The schema for the db used by nfsdcld is going to diverge quite a bit from that used by nfsdcltrack. It will be easier if the programs are kept in separate directories. Signed-off-by: Scott Mayhew --- .gitignore | 2 +- configure.ac | 23 + utils/Makefile.am | 4 + utils/nfsdcld/Makefile.am | 19 + utils/{nfsdcltrack => nfsdcld}/nfsdcld.c | 0 utils/{nfsdcltrack => nfsdcld}/nfsdcld.man | 0 utils/nfsdcld/sqlite.c | 601 +++++++++++++++++++++ utils/nfsdcld/sqlite.h | 32 ++ utils/nfsdcltrack/Makefile.am | 7 +- 9 files changed, 682 insertions(+), 6 deletions(-) create mode 100644 utils/nfsdcld/Makefile.am rename utils/{nfsdcltrack => nfsdcld}/nfsdcld.c (100%) rename utils/{nfsdcltrack => nfsdcld}/nfsdcld.man (100%) create mode 100644 utils/nfsdcld/sqlite.c create mode 100644 utils/nfsdcld/sqlite.h diff --git a/.gitignore b/.gitignore index d4f4f34..e97b31f 100644 --- a/.gitignore +++ b/.gitignore @@ -54,7 +54,7 @@ utils/rquotad/rquotad utils/rquotad/rquota.h utils/rquotad/rquota_xdr.c utils/showmount/showmount -utils/nfsdcltrack/nfsdcld +utils/nfsdcld/nfsdcld utils/nfsdcltrack/nfsdcltrack utils/statd/statd tools/locktest/testlk diff --git a/configure.ac b/configure.ac index 4bf5aea..8598d50 100644 --- a/configure.ac +++ b/configure.ac @@ -242,6 +242,12 @@ else AM_CONDITIONAL(MOUNT_CONFIG, [test "$enable_mount" = "yes"]) fi +AC_ARG_ENABLE(nfsdcld, + [AC_HELP_STRING([--disable-nfsdcld], + [disable NFSv4 clientid tracking daemon @<:@default=no@:>@])], + enable_nfsdcld=$enableval, + enable_nfsdcld="yes") + AC_ARG_ENABLE(nfsdcltrack, [AC_HELP_STRING([--disable-nfsdcltrack], [disable NFSv4 clientid tracking programs @<:@default=no@:>@])], @@ -321,6 +327,20 @@ if test "$enable_nfsv4" = yes; then dnl Check for sqlite3 AC_SQLITE3_VERS + if test "$enable_nfsdcld" = "yes"; then + AC_CHECK_HEADERS([libgen.h sys/inotify.h], , + AC_MSG_ERROR([Cannot find header needed for nfsdcld])) + + case $libsqlite3_cv_is_recent in + yes) ;; + unknown) + dnl do not fail when cross-compiling + AC_MSG_WARN([assuming sqlite is at least v3.3]) ;; + *) + AC_MSG_ERROR([nfsdcld requires sqlite-devel]) ;; + esac + fi + if test "$enable_nfsdcltrack" = "yes"; then AC_CHECK_HEADERS([libgen.h sys/inotify.h], , AC_MSG_ERROR([Cannot find header needed for nfsdcltrack])) @@ -336,6 +356,7 @@ if test "$enable_nfsv4" = yes; then fi else + enable_nfsdcld="no" enable_nfsdcltrack="no" fi @@ -346,6 +367,7 @@ if test "$enable_nfsv41" = yes; then fi dnl enable nfsidmap when its support by libnfsidmap +AM_CONDITIONAL(CONFIG_NFSDCLD, [test "$enable_nfsdcld" = "yes" ]) AM_CONDITIONAL(CONFIG_NFSDCLTRACK, [test "$enable_nfsdcltrack" = "yes" ]) @@ -623,6 +645,7 @@ AC_CONFIG_FILES([ tools/nfsconf/Makefile utils/Makefile utils/blkmapd/Makefile + utils/nfsdcld/Makefile utils/nfsdcltrack/Makefile utils/exportfs/Makefile utils/gssd/Makefile diff --git a/utils/Makefile.am b/utils/Makefile.am index 0a5b062..4c930a4 100644 --- a/utils/Makefile.am +++ b/utils/Makefile.am @@ -19,6 +19,10 @@ if CONFIG_MOUNT OPTDIRS += mount endif +if CONFIG_NFSDCLD +OPTDIRS += nfsdcld +endif + if CONFIG_NFSDCLTRACK OPTDIRS += nfsdcltrack endif diff --git a/utils/nfsdcld/Makefile.am b/utils/nfsdcld/Makefile.am new file mode 100644 index 0000000..8239be8 --- /dev/null +++ b/utils/nfsdcld/Makefile.am @@ -0,0 +1,19 @@ +## Process this file with automake to produce Makefile.in + +# These binaries go in /sbin (not /usr/sbin), and that cannot be +# overridden at config time. The kernel "knows" the /sbin name. +sbindir = /sbin + +man8_MANS = nfsdcld.man +EXTRA_DIST = $(man8_MANS) + +AM_CFLAGS += -D_LARGEFILE64_SOURCE +sbin_PROGRAMS = nfsdcld + +nfsdcld_SOURCES = nfsdcld.c sqlite.c +nfsdcld_LDADD = ../../support/nfs/libnfs.la $(LIBEVENT) $(LIBSQLITE) $(LIBCAP) + +noinst_HEADERS = sqlite.h + +MAINTAINERCLEANFILES = Makefile.in + diff --git a/utils/nfsdcltrack/nfsdcld.c b/utils/nfsdcld/nfsdcld.c similarity index 100% rename from utils/nfsdcltrack/nfsdcld.c rename to utils/nfsdcld/nfsdcld.c diff --git a/utils/nfsdcltrack/nfsdcld.man b/utils/nfsdcld/nfsdcld.man similarity index 100% rename from utils/nfsdcltrack/nfsdcld.man rename to utils/nfsdcld/nfsdcld.man diff --git a/utils/nfsdcld/sqlite.c b/utils/nfsdcld/sqlite.c new file mode 100644 index 0000000..c59f777 --- /dev/null +++ b/utils/nfsdcld/sqlite.c @@ -0,0 +1,601 @@ +/* + * Copyright (C) 2011 Red Hat, Jeff Layton + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, + * Boston, MA 02110-1301, USA. + */ + +/* + * Explanation: + * + * This file contains the code to manage the sqlite backend database for the + * nfsdcltrack usermodehelper upcall program. + * + * The main database is called main.sqlite and contains the following tables: + * + * parameters: simple key/value pairs for storing database info + * + * clients: an "id" column containing a BLOB with the long-form clientid as + * sent by the client, a "time" column containing a timestamp (in + * epoch seconds) of when the record was last updated, and a + * "has_session" column containing a boolean value indicating + * whether the client has sessions (v4.1+) or not (v4.0). + */ + +#ifdef HAVE_CONFIG_H +#include "config.h" +#endif /* HAVE_CONFIG_H */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "xlog.h" +#include "sqlite.h" + +#define CLTRACK_SQLITE_LATEST_SCHEMA_VERSION 2 + +/* in milliseconds */ +#define CLTRACK_SQLITE_BUSY_TIMEOUT 10000 + +/* private data structures */ + +/* global variables */ + +/* reusable pathname and sql command buffer */ +static char buf[PATH_MAX]; + +/* global database handle */ +static sqlite3 *dbh; + +/* forward declarations */ + +/* make a directory, ignoring EEXIST errors unless it's not a directory */ +static int +mkdir_if_not_exist(const char *dirname) +{ + int ret; + struct stat statbuf; + + ret = mkdir(dirname, S_IRWXU); + if (ret && errno != EEXIST) + return -errno; + + ret = stat(dirname, &statbuf); + if (ret) + return -errno; + + if (!S_ISDIR(statbuf.st_mode)) + ret = -ENOTDIR; + + return ret; +} + +static int +sqlite_query_schema_version(void) +{ + int ret; + sqlite3_stmt *stmt = NULL; + + /* prepare select query */ + ret = sqlite3_prepare_v2(dbh, + "SELECT value FROM parameters WHERE key == \"version\";", + -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(D_GENERAL, "Unable to prepare select statement: %s", + sqlite3_errmsg(dbh)); + ret = 0; + goto out; + } + + /* query schema version */ + ret = sqlite3_step(stmt); + if (ret != SQLITE_ROW) { + xlog(D_GENERAL, "Select statement execution failed: %s", + sqlite3_errmsg(dbh)); + ret = 0; + goto out; + } + + ret = sqlite3_column_int(stmt, 0); +out: + sqlite3_finalize(stmt); + return ret; +} + +static int +sqlite_maindb_update_v1_to_v2(void) +{ + int ret, ret2; + char *err; + + /* begin transaction */ + ret = sqlite3_exec(dbh, "BEGIN EXCLUSIVE TRANSACTION;", NULL, NULL, + &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to begin transaction: %s", err); + goto rollback; + } + + /* + * Check schema version again. This time, under an exclusive + * transaction to guard against racing DB setup attempts + */ + ret = sqlite_query_schema_version(); + switch (ret) { + case 1: + /* Still at v1 -- do conversion */ + break; + case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION: + /* Someone else raced in and set it up */ + ret = 0; + goto rollback; + default: + /* Something went wrong -- fail! */ + ret = -EINVAL; + goto rollback; + } + + /* create v2 clients table */ + ret = sqlite3_exec(dbh, "ALTER TABLE clients ADD COLUMN " + "has_session INTEGER;", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to update clients table: %s", err); + goto rollback; + } + + ret = snprintf(buf, sizeof(buf), "UPDATE parameters SET value = %d " + "WHERE key = \"version\";", + CLTRACK_SQLITE_LATEST_SCHEMA_VERSION); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + ret = -EINVAL; + goto rollback; + } + + ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to update schema version: %s", err); + goto rollback; + } + + ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to commit transaction: %s", err); + goto rollback; + } +out: + sqlite3_free(err); + return ret; +rollback: + ret2 = sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err); + if (ret2 != SQLITE_OK) + xlog(L_ERROR, "Unable to rollback transaction: %s", err); + goto out; +} + +/* + * Start an exclusive transaction and recheck the DB schema version. If it's + * still zero (indicating a new database) then set it up. If that all works, + * then insert schema version into the parameters table and commit the + * transaction. On any error, rollback the transaction. + */ +static int +sqlite_maindb_init_v2(void) +{ + int ret, ret2; + char *err = NULL; + + /* Start a transaction */ + ret = sqlite3_exec(dbh, "BEGIN EXCLUSIVE TRANSACTION;", NULL, NULL, + &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to begin transaction: %s", err); + return ret; + } + + /* + * Check schema version again. This time, under an exclusive + * transaction to guard against racing DB setup attempts + */ + ret = sqlite_query_schema_version(); + switch (ret) { + case 0: + /* Query failed again -- set up DB */ + break; + case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION: + /* Someone else raced in and set it up */ + ret = 0; + goto rollback; + default: + /* Something went wrong -- fail! */ + ret = -EINVAL; + goto rollback; + } + + ret = sqlite3_exec(dbh, "CREATE TABLE parameters " + "(key TEXT PRIMARY KEY, value TEXT);", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to create parameter table: %s", err); + goto rollback; + } + + /* create the "clients" table */ + ret = sqlite3_exec(dbh, "CREATE TABLE clients (id BLOB PRIMARY KEY, " + "time INTEGER, has_session INTEGER);", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to create clients table: %s", err); + goto rollback; + } + + + /* insert version into parameters table */ + ret = snprintf(buf, sizeof(buf), "INSERT OR FAIL INTO parameters " + "values (\"version\", \"%d\");", + CLTRACK_SQLITE_LATEST_SCHEMA_VERSION); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + ret = -EINVAL; + goto rollback; + } + + ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to insert into parameter table: %s", err); + goto rollback; + } + + ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to commit transaction: %s", err); + goto rollback; + } +out: + sqlite3_free(err); + return ret; + +rollback: + /* Attempt to rollback the transaction */ + ret2 = sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err); + if (ret2 != SQLITE_OK) + xlog(L_ERROR, "Unable to rollback transaction: %s", err); + goto out; +} + +/* Open the database and set up the database handle for it */ +int +sqlite_prepare_dbh(const char *topdir) +{ + int ret; + + /* Do nothing if the database handle is already set up */ + if (dbh) + return 0; + + ret = snprintf(buf, PATH_MAX - 1, "%s/main.sqlite", topdir); + if (ret < 0) + return ret; + + buf[PATH_MAX - 1] = '\0'; + + /* open a new DB handle */ + ret = sqlite3_open(buf, &dbh); + if (ret != SQLITE_OK) { + /* try to create the dir */ + ret = mkdir_if_not_exist(topdir); + if (ret) + goto out_close; + + /* retry open */ + ret = sqlite3_open(buf, &dbh); + if (ret != SQLITE_OK) + goto out_close; + } + + /* set busy timeout */ + ret = sqlite3_busy_timeout(dbh, CLTRACK_SQLITE_BUSY_TIMEOUT); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to set sqlite busy timeout: %s", + sqlite3_errmsg(dbh)); + goto out_close; + } + + ret = sqlite_query_schema_version(); + switch (ret) { + case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION: + /* DB is already set up. Do nothing */ + ret = 0; + break; + case 1: + /* Old DB -- update to new schema */ + ret = sqlite_maindb_update_v1_to_v2(); + if (ret) + goto out_close; + break; + case 0: + /* Query failed -- try to set up new DB */ + ret = sqlite_maindb_init_v2(); + if (ret) + goto out_close; + break; + default: + /* Unknown DB version -- downgrade? Fail */ + xlog(L_ERROR, "Unsupported database schema version! " + "Expected %d, got %d.", + CLTRACK_SQLITE_LATEST_SCHEMA_VERSION, ret); + ret = -EINVAL; + goto out_close; + } + + return ret; +out_close: + sqlite3_close(dbh); + dbh = NULL; + return ret; +} + +/* + * Create a client record + * + * Returns a non-zero sqlite error code, or SQLITE_OK (aka 0) + */ +int +sqlite_insert_client(const unsigned char *clname, const size_t namelen, + const bool has_session, const bool zerotime) +{ + int ret; + sqlite3_stmt *stmt = NULL; + + if (zerotime) + ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients " + "VALUES (?, 0, ?);", -1, &stmt, NULL); + else + ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients " + "VALUES (?, strftime('%s', 'now'), ?);", -1, + &stmt, NULL); + + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: insert statement prepare failed: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; + } + + ret = sqlite3_bind_blob(stmt, 1, (const void *)clname, namelen, + SQLITE_STATIC); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: bind blob failed: %s", __func__, + sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_bind_int(stmt, 2, (int)has_session); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: bind int failed: %s", __func__, + sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_step(stmt); + if (ret == SQLITE_DONE) + ret = SQLITE_OK; + else + xlog(L_ERROR, "%s: unexpected return code from insert: %s", + __func__, sqlite3_errmsg(dbh)); + +out_err: + xlog(D_GENERAL, "%s: returning %d", __func__, ret); + sqlite3_finalize(stmt); + return ret; +} + +/* Remove a client record */ +int +sqlite_remove_client(const unsigned char *clname, const size_t namelen) +{ + int ret; + sqlite3_stmt *stmt = NULL; + + ret = sqlite3_prepare_v2(dbh, "DELETE FROM clients WHERE id==?", -1, + &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: statement prepare failed: %s", + __func__, sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_bind_blob(stmt, 1, (const void *)clname, namelen, + SQLITE_STATIC); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: bind blob failed: %s", __func__, + sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_step(stmt); + if (ret == SQLITE_DONE) + ret = SQLITE_OK; + else + xlog(L_ERROR, "%s: unexpected return code from delete: %d", + __func__, ret); + +out_err: + xlog(D_GENERAL, "%s: returning %d", __func__, ret); + sqlite3_finalize(stmt); + return ret; +} + +/* + * Is the given clname in the clients table? If so, then update its timestamp + * and return success. If the record isn't present, or the update fails, then + * return an error. + */ +int +sqlite_check_client(const unsigned char *clname, const size_t namelen, + const bool has_session) +{ + int ret; + sqlite3_stmt *stmt = NULL; + + ret = sqlite3_prepare_v2(dbh, "SELECT count(*) FROM clients WHERE " + "id==?", -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: unable to prepare update statement: %s", + __func__, sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_bind_blob(stmt, 1, (const void *)clname, namelen, + SQLITE_STATIC); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: bind blob failed: %s", + __func__, sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_step(stmt); + if (ret != SQLITE_ROW) { + xlog(L_ERROR, "%s: unexpected return code from select: %d", + __func__, ret); + goto out_err; + } + + ret = sqlite3_column_int(stmt, 0); + xlog(D_GENERAL, "%s: select returned %d rows", __func__, ret); + if (ret != 1) { + ret = -EACCES; + goto out_err; + } + + /* Only update timestamp for v4.0 clients */ + if (has_session) { + ret = SQLITE_OK; + goto out_err; + } + + sqlite3_finalize(stmt); + stmt = NULL; + ret = sqlite3_prepare_v2(dbh, "UPDATE OR FAIL clients SET " + "time=strftime('%s', 'now') WHERE id==?", + -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: unable to prepare update statement: %s", + __func__, sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_bind_blob(stmt, 1, (const void *)clname, namelen, + SQLITE_STATIC); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: bind blob failed: %s", + __func__, sqlite3_errmsg(dbh)); + goto out_err; + } + + ret = sqlite3_step(stmt); + if (ret == SQLITE_DONE) + ret = SQLITE_OK; + else + xlog(L_ERROR, "%s: unexpected return code from update: %s", + __func__, sqlite3_errmsg(dbh)); + +out_err: + xlog(D_GENERAL, "%s: returning %d", __func__, ret); + sqlite3_finalize(stmt); + return ret; +} + +/* + * remove any client records that were not reclaimed since grace_start. + */ +int +sqlite_remove_unreclaimed(time_t grace_start) +{ + int ret; + char *err = NULL; + + ret = snprintf(buf, sizeof(buf), "DELETE FROM clients WHERE time < %ld", + grace_start); + if (ret < 0) { + return ret; + } else if ((size_t)ret >= sizeof(buf)) { + ret = -EINVAL; + return ret; + } + + ret = sqlite3_exec(dbh, buf, NULL, NULL, &err); + if (ret != SQLITE_OK) + xlog(L_ERROR, "%s: delete failed: %s", __func__, err); + + xlog(D_GENERAL, "%s: returning %d", __func__, ret); + sqlite3_free(err); + return ret; +} + +/* + * Are there any clients that are possibly still reclaiming? Return a positive + * integer (usually number of clients) if so. If not, then return 0. On any + * error, return non-zero. + */ +int +sqlite_query_reclaiming(const time_t grace_start) +{ + int ret; + sqlite3_stmt *stmt = NULL; + + ret = sqlite3_prepare_v2(dbh, "SELECT count(*) FROM clients WHERE " + "time < ? OR has_session != 1", -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: unable to prepare select statement: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; + } + + ret = sqlite3_bind_int64(stmt, 1, (sqlite3_int64)grace_start); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: bind int64 failed: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; + } + + ret = sqlite3_step(stmt); + if (ret != SQLITE_ROW) { + xlog(L_ERROR, "%s: unexpected return code from select: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; + } + + ret = sqlite3_column_int(stmt, 0); + sqlite3_finalize(stmt); + xlog(D_GENERAL, "%s: there are %d clients that have not completed " + "reclaim", __func__, ret); + return ret; +} diff --git a/utils/nfsdcld/sqlite.h b/utils/nfsdcld/sqlite.h new file mode 100644 index 0000000..06e7c04 --- /dev/null +++ b/utils/nfsdcld/sqlite.h @@ -0,0 +1,32 @@ +/* + * Copyright (C) 2011 Red Hat, Jeff Layton + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, + * Boston, MA 02110-1301, USA. + */ + +#ifndef _SQLITE_H_ +#define _SQLITE_H_ + +int sqlite_prepare_dbh(const char *topdir); +int sqlite_insert_client(const unsigned char *clname, const size_t namelen, + const bool has_session, const bool zerotime); +int sqlite_remove_client(const unsigned char *clname, const size_t namelen); +int sqlite_check_client(const unsigned char *clname, const size_t namelen, + const bool has_session); +int sqlite_remove_unreclaimed(const time_t grace_start); +int sqlite_query_reclaiming(const time_t grace_start); + +#endif /* _SQLITE_H */ diff --git a/utils/nfsdcltrack/Makefile.am b/utils/nfsdcltrack/Makefile.am index 0f599c0..2f7fe3d 100644 --- a/utils/nfsdcltrack/Makefile.am +++ b/utils/nfsdcltrack/Makefile.am @@ -4,14 +4,11 @@ # overridden at config time. The kernel "knows" the /sbin name. sbindir = /sbin -man8_MANS = nfsdcld.man nfsdcltrack.man +man8_MANS = nfsdcltrack.man EXTRA_DIST = $(man8_MANS) AM_CFLAGS += -D_LARGEFILE64_SOURCE -sbin_PROGRAMS = nfsdcld nfsdcltrack - -nfsdcld_SOURCES = nfsdcld.c sqlite.c -nfsdcld_LDADD = ../../support/nfs/libnfs.la $(LIBEVENT) $(LIBSQLITE) $(LIBCAP) +sbin_PROGRAMS = nfsdcltrack noinst_HEADERS = sqlite.h From patchwork Tue Mar 26 22:07:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872251 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8482D1708 for ; Tue, 26 Mar 2019 22:07:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6A38B28569 for ; Tue, 26 Mar 2019 22:07:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5EA0B28A9C; Tue, 26 Mar 2019 22:07:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 295CC28B87 for ; Tue, 26 Mar 2019 22:07:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731589AbfCZWHd (ORCPT ); Tue, 26 Mar 2019 18:07:33 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59314 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731644AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 93C97C050012; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id 2BEE51001E67; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id D1FFD21341; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 3/8] nfsdcld: a few enhancements Date: Tue, 26 Mar 2019 18:07:25 -0400 Message-Id: <20190326220730.3763-4-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP 1) Adopt the concept of "reboot epochs" (but not coordinated grace periods via the "need" and "enforcing" flags) from Jeff Layton's "Active/Active NFS Server Recovery" presentation from the Fall 2018 NFS Bakeathon. See http://nfsv4bat.org/Documents/BakeAThon/2018/Active_Active%20NFS%20Server%20Recovery.pdf - add a new table "grace" which contains two integer columns representing the "current" epoch (where new client records are stored) and the "recovery" epoch (which has the records for clients that are allowed to recover) - replace the "clients" table with table(s) named "rec-CCCCCCCCCCCCCCCC" (where C is the hex value of the epoch), containing a single column "id" which stores the client id string - when going from normal operation into grace, the current epoch becomes the recovery epoch, the current epoch is incremented, and a new table is created for the current epoch. Clients are allowed to reclaim if they have a record in the table corresponding to the recovery epoch and new records are added to the table corresponding to the current epoch. - when moving from grace back to normal operation, the table associated with the recovery epoch is deleted and the recovery epoch becomes zero. - if the server restarts before exiting the previous grace period, then the epochs are not changed, and all records in the table associated with the "current" epoch are cleared out. 2) Allow knfsd to "slurp" the client records during startup. During client tracking initialization, knfsd will do an upcall to get a list of clients from the database. nfsdcld will do one downcall with a status of -EINPROGRESS for each client record in the database, followed by a final downcall with a status of 0. This will allow 2 things - knfsd can check whether a client is allowed to reclaim without performing an upcall to nfsdcld - knfsd can decide to end the grace period early by tracking the number of RECLAIM_COMPLETE operations it receives from "known" clients, or it can skip the grace period altogether if no clients are allowed to reclaim. Signed-off-by: Scott Mayhew --- support/include/cld.h | 1 + utils/nfsdcld/Makefile.am | 2 +- utils/nfsdcld/cld-internal.h | 30 +++ utils/nfsdcld/nfsdcld.c | 160 +++++++++++- utils/nfsdcld/sqlite.c | 483 ++++++++++++++++++++++++++++------- utils/nfsdcld/sqlite.h | 11 +- 6 files changed, 579 insertions(+), 108 deletions(-) create mode 100644 utils/nfsdcld/cld-internal.h diff --git a/support/include/cld.h b/support/include/cld.h index f14a9ab..c1f5b70 100644 --- a/support/include/cld.h +++ b/support/include/cld.h @@ -33,6 +33,7 @@ enum cld_command { Cld_Remove, /* remove record of this cm_id */ Cld_Check, /* is this cm_id allowed? */ Cld_GraceDone, /* grace period is complete */ + Cld_GraceStart, }; /* representation of long-form NFSv4 client ID */ diff --git a/utils/nfsdcld/Makefile.am b/utils/nfsdcld/Makefile.am index 8239be8..d1da749 100644 --- a/utils/nfsdcld/Makefile.am +++ b/utils/nfsdcld/Makefile.am @@ -13,7 +13,7 @@ sbin_PROGRAMS = nfsdcld nfsdcld_SOURCES = nfsdcld.c sqlite.c nfsdcld_LDADD = ../../support/nfs/libnfs.la $(LIBEVENT) $(LIBSQLITE) $(LIBCAP) -noinst_HEADERS = sqlite.h +noinst_HEADERS = sqlite.h cld-internal.h MAINTAINERCLEANFILES = Makefile.in diff --git a/utils/nfsdcld/cld-internal.h b/utils/nfsdcld/cld-internal.h new file mode 100644 index 0000000..a90cced --- /dev/null +++ b/utils/nfsdcld/cld-internal.h @@ -0,0 +1,30 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, + * Boston, MA 02110-1301, USA. + */ + +#ifndef _CLD_INTERNAL_H_ +#define _CLD_INTERNAL_H_ + +struct cld_client { + int cl_fd; + struct event cl_event; + struct cld_msg cl_msg; +}; + +uint64_t current_epoch; +uint64_t recovery_epoch; + +#endif /* _CLD_INTERNAL_H_ */ diff --git a/utils/nfsdcld/nfsdcld.c b/utils/nfsdcld/nfsdcld.c index 082f3ab..9b1ad98 100644 --- a/utils/nfsdcld/nfsdcld.c +++ b/utils/nfsdcld/nfsdcld.c @@ -42,7 +42,9 @@ #include "xlog.h" #include "nfslib.h" #include "cld.h" +#include "cld-internal.h" #include "sqlite.h" +#include "../mount/version.h" #ifndef PIPEFS_DIR #define PIPEFS_DIR NFS_STATEDIR "/rpc_pipefs" @@ -54,19 +56,17 @@ #define CLD_DEFAULT_STORAGEDIR NFS_STATEDIR "/nfsdcld" #endif +#define NFSD_END_GRACE_FILE "/proc/fs/nfsd/v4_end_grace" + #define UPCALL_VERSION 1 /* private data structures */ -struct cld_client { - int cl_fd; - struct event cl_event; - struct cld_msg cl_msg; -}; /* global variables */ static char *pipepath = DEFAULT_CLD_PATH; static int inotify_fd = -1; static struct event pipedir_event; +static bool old_kernel = false; static struct option longopts[] = { @@ -298,6 +298,43 @@ out: return ret; } +/* + * Older kernels will not tell nfsdcld when a grace period has started. + * Therefore we have to peek at the /proc/fs/nfsd/v4_end_grace file to + * see if nfsd is in grace. We have to do this for create and remove + * upcalls to ensure that the correct table is being updated - otherwise + * we could lose client records when the grace period is lifted. + */ +static int +cld_check_grace_period(void) +{ + int fd, ret = 0; + char c; + + if (!old_kernel) + return 0; + if (recovery_epoch != 0) + return 0; + fd = open(NFSD_END_GRACE_FILE, O_RDONLY); + if (fd < 0) { + xlog(L_WARNING, "Unable to open %s: %m", + NFSD_END_GRACE_FILE); + return 1; + } + if (read(fd, &c, 1) < 0) { + xlog(L_WARNING, "Unable to read from %s: %m", + NFSD_END_GRACE_FILE); + return 1; + } + close(fd); + if (c == 'N') { + xlog(L_WARNING, "nfsd is in grace but didn't send a gracestart upcall, " + "please update the kernel"); + ret = sqlite_grace_start(); + } + return ret; +} + static void cld_not_implemented(struct cld_client *clnt) { @@ -332,14 +369,17 @@ cld_create(struct cld_client *clnt) ssize_t bsize, wsize; struct cld_msg *cmsg = &clnt->cl_msg; + ret = cld_check_grace_period(); + if (ret) + goto reply; + xlog(D_GENERAL, "%s: create client record.", __func__); ret = sqlite_insert_client(cmsg->cm_u.cm_name.cn_id, - cmsg->cm_u.cm_name.cn_len, - false, - false); + cmsg->cm_u.cm_name.cn_len); +reply: cmsg->cm_status = ret ? -EREMOTEIO : ret; bsize = sizeof(*cmsg); @@ -365,11 +405,16 @@ cld_remove(struct cld_client *clnt) ssize_t bsize, wsize; struct cld_msg *cmsg = &clnt->cl_msg; + ret = cld_check_grace_period(); + if (ret) + goto reply; + xlog(D_GENERAL, "%s: remove client record.", __func__); ret = sqlite_remove_client(cmsg->cm_u.cm_name.cn_id, cmsg->cm_u.cm_name.cn_len); +reply: cmsg->cm_status = ret ? -EREMOTEIO : ret; bsize = sizeof(*cmsg); @@ -396,12 +441,26 @@ cld_check(struct cld_client *clnt) ssize_t bsize, wsize; struct cld_msg *cmsg = &clnt->cl_msg; + /* + * If we get a check upcall at all, it means we're talking to an old + * kernel. Furthermore, if we're not in grace it means this is the + * first client to do a reclaim. Log a message and use + * sqlite_grace_start() to advance the epoch numbers. + */ + if (recovery_epoch == 0) { + xlog(D_GENERAL, "%s: received a check upcall, please update the kernel", + __func__); + ret = sqlite_grace_start(); + if (ret) + goto reply; + } + xlog(D_GENERAL, "%s: check client record", __func__); ret = sqlite_check_client(cmsg->cm_u.cm_name.cn_id, - cmsg->cm_u.cm_name.cn_len, - false); + cmsg->cm_u.cm_name.cn_len); +reply: /* set up reply */ cmsg->cm_status = ret ? -EACCES : ret; @@ -429,11 +488,27 @@ cld_gracedone(struct cld_client *clnt) ssize_t bsize, wsize; struct cld_msg *cmsg = &clnt->cl_msg; - xlog(D_GENERAL, "%s: grace done. cm_gracetime=%ld", __func__, - cmsg->cm_u.cm_gracetime); + /* + * If we got a "gracedone" upcall while we're not in grace, then + * 1) we must be talking to an old kernel + * 2) no clients attempted to reclaim + * In that case, log a message and use sqlite_grace_start() to + * advance the epoch numbers, and then proceed as normal. + */ + if (recovery_epoch == 0) { + xlog(D_GENERAL, "%s: received gracedone upcall " + "while not in grace, please update the kernel", + __func__); + ret = sqlite_grace_start(); + if (ret) + goto reply; + } + + xlog(D_GENERAL, "%s: grace done.", __func__); - ret = sqlite_remove_unreclaimed(cmsg->cm_u.cm_gracetime); + ret = sqlite_grace_done(); +reply: /* set up reply: downcall with 0 status */ cmsg->cm_status = ret ? -EREMOTEIO : ret; @@ -453,6 +528,59 @@ cld_gracedone(struct cld_client *clnt) } } +static int +gracestart_callback(struct cld_client *clnt) { + ssize_t bsize, wsize; + struct cld_msg *cmsg = &clnt->cl_msg; + + cmsg->cm_status = -EINPROGRESS; + + bsize = sizeof(struct cld_msg); + + xlog(D_GENERAL, "Sending client %.*s", + cmsg->cm_u.cm_name.cn_len, cmsg->cm_u.cm_name.cn_id); + wsize = atomicio((void *)write, clnt->cl_fd, cmsg, bsize); + if (wsize != bsize) + return -EIO; + return 0; +} + +static void +cld_gracestart(struct cld_client *clnt) +{ + int ret; + ssize_t bsize, wsize; + struct cld_msg *cmsg = &clnt->cl_msg; + + xlog(D_GENERAL, "%s: updating grace epochs", __func__); + + ret = sqlite_grace_start(); + if (ret) + goto reply; + + xlog(D_GENERAL, "%s: sending client records to the kernel", __func__); + + ret = sqlite_iterate_recovery(&gracestart_callback, clnt); + +reply: + /* set up reply: downcall with 0 status */ + cmsg->cm_status = ret ? -EREMOTEIO : ret; + + bsize = sizeof(struct cld_msg); + xlog(D_GENERAL, "Doing downcall with status %d", cmsg->cm_status); + wsize = atomicio((void *)write, clnt->cl_fd, cmsg, bsize); + if (wsize != bsize) { + xlog(L_ERROR, "%s: problem writing to cld pipe (%ld): %m", + __func__, wsize); + ret = cld_pipe_open(clnt); + if (ret) { + xlog(L_FATAL, "%s: unable to reopen pipe: %d", + __func__, ret); + exit(ret); + } + } +} + static void cldcb(int UNUSED(fd), short which, void *data) { @@ -490,6 +618,9 @@ cldcb(int UNUSED(fd), short which, void *data) case Cld_GraceDone: cld_gracedone(clnt); break; + case Cld_GraceStart: + cld_gracestart(clnt); + break; default: xlog(L_WARNING, "%s: command %u is not yet implemented", __func__, cmsg->cm_cmd); @@ -586,6 +717,9 @@ main(int argc, char **argv) } } + if (linux_version_code() < MAKE_VERSION(4, 20, 0)) + old_kernel = true; + /* set up storage db */ rc = sqlite_prepare_dbh(storagedir); if (rc) { diff --git a/utils/nfsdcld/sqlite.c b/utils/nfsdcld/sqlite.c index c59f777..82140ea 100644 --- a/utils/nfsdcld/sqlite.c +++ b/utils/nfsdcld/sqlite.c @@ -21,17 +21,24 @@ * Explanation: * * This file contains the code to manage the sqlite backend database for the - * nfsdcltrack usermodehelper upcall program. + * nfsdcld client tracking daemon. * * The main database is called main.sqlite and contains the following tables: * * parameters: simple key/value pairs for storing database info * - * clients: an "id" column containing a BLOB with the long-form clientid as - * sent by the client, a "time" column containing a timestamp (in - * epoch seconds) of when the record was last updated, and a - * "has_session" column containing a boolean value indicating - * whether the client has sessions (v4.1+) or not (v4.0). + * grace: a "current" column containing an INTEGER representing the current + * epoch (where should new values be stored) and a "recovery" column + * containing an INTEGER representing the recovery epoch (from what + * epoch are we allowed to recover). A recovery epoch of 0 means + * normal operation (grace period not in force). Note: sqlite stores + * integers as signed values, so these must be cast to a uint64_t when + * retrieving them from the database and back to an int64_t when storing + * them in the database. + * + * rec-CCCCCCCCCCCCCCCC (where C is the hex representation of the epoch value): + * a single "id" column containing a BLOB with the long-form clientid + * as sent by the client. */ #ifdef HAVE_CONFIG_H @@ -47,16 +54,21 @@ #include #include #include +#include +#include +#include #include #include #include "xlog.h" #include "sqlite.h" +#include "cld.h" +#include "cld-internal.h" -#define CLTRACK_SQLITE_LATEST_SCHEMA_VERSION 2 +#define CLD_SQLITE_LATEST_SCHEMA_VERSION 3 /* in milliseconds */ -#define CLTRACK_SQLITE_BUSY_TIMEOUT 10000 +#define CLD_SQLITE_BUSY_TIMEOUT 10000 /* private data structures */ @@ -124,7 +136,7 @@ out: } static int -sqlite_maindb_update_v1_to_v2(void) +sqlite_maindb_update_schema(int oldversion) { int ret, ret2; char *err; @@ -142,32 +154,66 @@ sqlite_maindb_update_v1_to_v2(void) * transaction to guard against racing DB setup attempts */ ret = sqlite_query_schema_version(); - switch (ret) { - case 1: - /* Still at v1 -- do conversion */ - break; - case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION: - /* Someone else raced in and set it up */ - ret = 0; + if (ret != oldversion) { + if (ret == CLD_SQLITE_LATEST_SCHEMA_VERSION) + /* Someone else raced in and set it up */ + ret = 0; + else + /* Something went wrong -- fail! */ + ret = -EINVAL; goto rollback; - default: - /* Something went wrong -- fail! */ - ret = -EINVAL; + } + + /* Still at old version -- do conversion */ + + /* create grace table */ + ret = sqlite3_exec(dbh, "CREATE TABLE grace " + "(current INTEGER , recovery INTEGER);", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to create grace table: %s", err); + goto rollback; + } + + /* insert initial epochs into grace table */ + ret = sqlite3_exec(dbh, "INSERT OR FAIL INTO grace " + "values (1, 0);", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to set initial epochs: %s", err); + goto rollback; + } + + /* create recovery table for current epoch */ + ret = sqlite3_exec(dbh, "CREATE TABLE \"rec-0000000000000001\" " + "(id BLOB PRIMARY KEY);", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to create recovery table " + "for current epoch: %s", err); + goto rollback; + } + + /* copy records from old clients table */ + ret = sqlite3_exec(dbh, "INSERT INTO \"rec-0000000000000001\" " + "SELECT id FROM clients;", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to copy client records: %s", err); goto rollback; } - /* create v2 clients table */ - ret = sqlite3_exec(dbh, "ALTER TABLE clients ADD COLUMN " - "has_session INTEGER;", + /* drop the old clients table */ + ret = sqlite3_exec(dbh, "DROP TABLE clients;", NULL, NULL, &err); if (ret != SQLITE_OK) { - xlog(L_ERROR, "Unable to update clients table: %s", err); + xlog(L_ERROR, "Unable to drop old clients table: %s", err); goto rollback; } ret = snprintf(buf, sizeof(buf), "UPDATE parameters SET value = %d " "WHERE key = \"version\";", - CLTRACK_SQLITE_LATEST_SCHEMA_VERSION); + CLD_SQLITE_LATEST_SCHEMA_VERSION); if (ret < 0) { xlog(L_ERROR, "sprintf failed!"); goto rollback; @@ -205,7 +251,7 @@ rollback: * transaction. On any error, rollback the transaction. */ static int -sqlite_maindb_init_v2(void) +sqlite_maindb_init_v3(void) { int ret, ret2; char *err = NULL; @@ -227,7 +273,7 @@ sqlite_maindb_init_v2(void) case 0: /* Query failed again -- set up DB */ break; - case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION: + case CLD_SQLITE_LATEST_SCHEMA_VERSION: /* Someone else raced in and set it up */ ret = 0; goto rollback; @@ -245,20 +291,38 @@ sqlite_maindb_init_v2(void) goto rollback; } - /* create the "clients" table */ - ret = sqlite3_exec(dbh, "CREATE TABLE clients (id BLOB PRIMARY KEY, " - "time INTEGER, has_session INTEGER);", + /* create grace table */ + ret = sqlite3_exec(dbh, "CREATE TABLE grace " + "(current INTEGER , recovery INTEGER);", NULL, NULL, &err); if (ret != SQLITE_OK) { - xlog(L_ERROR, "Unable to create clients table: %s", err); + xlog(L_ERROR, "Unable to create grace table: %s", err); goto rollback; } + /* insert initial epochs into grace table */ + ret = sqlite3_exec(dbh, "INSERT OR FAIL INTO grace " + "values (1, 0);", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to set initial epochs: %s", err); + goto rollback; + } + + /* create recovery table for current epoch */ + ret = sqlite3_exec(dbh, "CREATE TABLE \"rec-0000000000000001\" " + "(id BLOB PRIMARY KEY);", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to create recovery table " + "for current epoch: %s", err); + goto rollback; + } /* insert version into parameters table */ ret = snprintf(buf, sizeof(buf), "INSERT OR FAIL INTO parameters " "values (\"version\", \"%d\");", - CLTRACK_SQLITE_LATEST_SCHEMA_VERSION); + CLD_SQLITE_LATEST_SCHEMA_VERSION); if (ret < 0) { xlog(L_ERROR, "sprintf failed!"); goto rollback; @@ -291,6 +355,42 @@ rollback: goto out; } +static int +sqlite_startup_query_grace(void) +{ + int ret; + uint64_t tcur; + uint64_t trec; + sqlite3_stmt *stmt = NULL; + + /* prepare select query */ + ret = sqlite3_prepare_v2(dbh, "SELECT * FROM grace;", -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(D_GENERAL, "Unable to prepare select statement: %s", + sqlite3_errmsg(dbh)); + goto out; + } + + ret = sqlite3_step(stmt); + if (ret != SQLITE_ROW) { + xlog(D_GENERAL, "Select statement execution failed: %s", + sqlite3_errmsg(dbh)); + goto out; + } + + tcur = (uint64_t)sqlite3_column_int64(stmt, 0); + trec = (uint64_t)sqlite3_column_int64(stmt, 1); + + current_epoch = tcur; + recovery_epoch = trec; + ret = 0; + xlog(D_GENERAL, "%s: current_epoch=%lu recovery_epoch=%lu", + __func__, current_epoch, recovery_epoch); +out: + sqlite3_finalize(stmt); + return ret; +} + /* Open the database and set up the database handle for it */ int sqlite_prepare_dbh(const char *topdir) @@ -322,7 +422,7 @@ sqlite_prepare_dbh(const char *topdir) } /* set busy timeout */ - ret = sqlite3_busy_timeout(dbh, CLTRACK_SQLITE_BUSY_TIMEOUT); + ret = sqlite3_busy_timeout(dbh, CLD_SQLITE_BUSY_TIMEOUT); if (ret != SQLITE_OK) { xlog(L_ERROR, "Unable to set sqlite busy timeout: %s", sqlite3_errmsg(dbh)); @@ -331,19 +431,26 @@ sqlite_prepare_dbh(const char *topdir) ret = sqlite_query_schema_version(); switch (ret) { - case CLTRACK_SQLITE_LATEST_SCHEMA_VERSION: + case CLD_SQLITE_LATEST_SCHEMA_VERSION: /* DB is already set up. Do nothing */ ret = 0; break; + case 2: + /* Old DB -- update to new schema */ + ret = sqlite_maindb_update_schema(2); + if (ret) + goto out_close; + break; + case 1: /* Old DB -- update to new schema */ - ret = sqlite_maindb_update_v1_to_v2(); + ret = sqlite_maindb_update_schema(1); if (ret) goto out_close; break; case 0: /* Query failed -- try to set up new DB */ - ret = sqlite_maindb_init_v2(); + ret = sqlite_maindb_init_v3(); if (ret) goto out_close; break; @@ -351,11 +458,13 @@ sqlite_prepare_dbh(const char *topdir) /* Unknown DB version -- downgrade? Fail */ xlog(L_ERROR, "Unsupported database schema version! " "Expected %d, got %d.", - CLTRACK_SQLITE_LATEST_SCHEMA_VERSION, ret); + CLD_SQLITE_LATEST_SCHEMA_VERSION, ret); ret = -EINVAL; goto out_close; } + ret = sqlite_startup_query_grace(); + return ret; out_close: sqlite3_close(dbh); @@ -369,20 +478,22 @@ out_close: * Returns a non-zero sqlite error code, or SQLITE_OK (aka 0) */ int -sqlite_insert_client(const unsigned char *clname, const size_t namelen, - const bool has_session, const bool zerotime) +sqlite_insert_client(const unsigned char *clname, const size_t namelen) { int ret; sqlite3_stmt *stmt = NULL; - if (zerotime) - ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients " - "VALUES (?, 0, ?);", -1, &stmt, NULL); - else - ret = sqlite3_prepare_v2(dbh, "INSERT OR REPLACE INTO clients " - "VALUES (?, strftime('%s', 'now'), ?);", -1, - &stmt, NULL); + ret = snprintf(buf, sizeof(buf), "INSERT OR REPLACE INTO \"rec-%016lx\" " + "VALUES (?);", current_epoch); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + return ret; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + return -EINVAL; + } + ret = sqlite3_prepare_v2(dbh, buf, -1, &stmt, NULL); if (ret != SQLITE_OK) { xlog(L_ERROR, "%s: insert statement prepare failed: %s", __func__, sqlite3_errmsg(dbh)); @@ -397,13 +508,6 @@ sqlite_insert_client(const unsigned char *clname, const size_t namelen, goto out_err; } - ret = sqlite3_bind_int(stmt, 2, (int)has_session); - if (ret != SQLITE_OK) { - xlog(L_ERROR, "%s: bind int failed: %s", __func__, - sqlite3_errmsg(dbh)); - goto out_err; - } - ret = sqlite3_step(stmt); if (ret == SQLITE_DONE) ret = SQLITE_OK; @@ -424,8 +528,18 @@ sqlite_remove_client(const unsigned char *clname, const size_t namelen) int ret; sqlite3_stmt *stmt = NULL; - ret = sqlite3_prepare_v2(dbh, "DELETE FROM clients WHERE id==?", -1, - &stmt, NULL); + ret = snprintf(buf, sizeof(buf), "DELETE FROM \"rec-%016lx\" " + "WHERE id==?;", current_epoch); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + return ret; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + return -EINVAL; + } + + ret = sqlite3_prepare_v2(dbh, buf, -1, &stmt, NULL); + if (ret != SQLITE_OK) { xlog(L_ERROR, "%s: statement prepare failed: %s", __func__, sqlite3_errmsg(dbh)); @@ -459,18 +573,26 @@ out_err: * return an error. */ int -sqlite_check_client(const unsigned char *clname, const size_t namelen, - const bool has_session) +sqlite_check_client(const unsigned char *clname, const size_t namelen) { int ret; sqlite3_stmt *stmt = NULL; - ret = sqlite3_prepare_v2(dbh, "SELECT count(*) FROM clients WHERE " - "id==?", -1, &stmt, NULL); + ret = snprintf(buf, sizeof(buf), "SELECT count(*) FROM \"rec-%016lx\" " + "WHERE id==?;", recovery_epoch); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + return ret; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + return -EINVAL; + } + + ret = sqlite3_prepare_v2(dbh, buf, -1, &stmt, NULL); if (ret != SQLITE_OK) { - xlog(L_ERROR, "%s: unable to prepare update statement: %s", - __func__, sqlite3_errmsg(dbh)); - goto out_err; + xlog(L_ERROR, "%s: select statement prepare failed: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; } ret = sqlite3_bind_blob(stmt, 1, (const void *)clname, namelen, @@ -495,37 +617,10 @@ sqlite_check_client(const unsigned char *clname, const size_t namelen, goto out_err; } - /* Only update timestamp for v4.0 clients */ - if (has_session) { - ret = SQLITE_OK; - goto out_err; - } - sqlite3_finalize(stmt); - stmt = NULL; - ret = sqlite3_prepare_v2(dbh, "UPDATE OR FAIL clients SET " - "time=strftime('%s', 'now') WHERE id==?", - -1, &stmt, NULL); - if (ret != SQLITE_OK) { - xlog(L_ERROR, "%s: unable to prepare update statement: %s", - __func__, sqlite3_errmsg(dbh)); - goto out_err; - } - ret = sqlite3_bind_blob(stmt, 1, (const void *)clname, namelen, - SQLITE_STATIC); - if (ret != SQLITE_OK) { - xlog(L_ERROR, "%s: bind blob failed: %s", - __func__, sqlite3_errmsg(dbh)); - goto out_err; - } - - ret = sqlite3_step(stmt); - if (ret == SQLITE_DONE) - ret = SQLITE_OK; - else - xlog(L_ERROR, "%s: unexpected return code from update: %s", - __func__, sqlite3_errmsg(dbh)); + /* Now insert the client into the table for the current epoch */ + return sqlite_insert_client(clname, namelen); out_err: xlog(D_GENERAL, "%s: returning %d", __func__, ret); @@ -599,3 +694,211 @@ sqlite_query_reclaiming(const time_t grace_start) "reclaim", __func__, ret); return ret; } + +int +sqlite_grace_start(void) +{ + int ret, ret2; + char *err; + uint64_t tcur = current_epoch; + uint64_t trec = recovery_epoch; + + /* begin transaction */ + ret = sqlite3_exec(dbh, "BEGIN EXCLUSIVE TRANSACTION;", NULL, NULL, + &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to begin transaction: %s", err); + goto rollback; + } + + if (trec == 0) { + /* + * A normal grace start - update the epoch values in the grace + * table and create a new table for the current reboot epoch. + */ + trec = tcur; + tcur++; + + ret = snprintf(buf, sizeof(buf), "UPDATE grace " + "SET current = %ld, recovery = %ld;", + (int64_t)tcur, (int64_t)trec); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", + ret); + ret = -EINVAL; + goto rollback; + } + + ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to update epochs: %s", err); + goto rollback; + } + + ret = snprintf(buf, sizeof(buf), "CREATE TABLE \"rec-%016lx\" " + "(id BLOB PRIMARY KEY);", + tcur); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", + ret); + ret = -EINVAL; + goto rollback; + } + + ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to create table for current epoch: %s", + err); + goto rollback; + } + } else { + /* Server restarted while in grace - don't update the epoch + * values in the grace table, just clear out the records for + * the current reboot epoch. + */ + ret = snprintf(buf, sizeof(buf), "DELETE FROM \"rec-%016lx\";", + tcur); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + ret = -EINVAL; + goto rollback; + } + + ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to clear table for current epoch: %s", + err); + goto rollback; + } + } + + ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to commit transaction: %s", err); + goto rollback; + } + + current_epoch = tcur; + recovery_epoch = trec; + xlog(D_GENERAL, "%s: current_epoch=%lu recovery_epoch=%lu", + __func__, current_epoch, recovery_epoch); + +out: + sqlite3_free(err); + return ret; +rollback: + ret2 = sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err); + if (ret2 != SQLITE_OK) + xlog(L_ERROR, "Unable to rollback transaction: %s", err); + goto out; +} + +int +sqlite_grace_done(void) +{ + int ret, ret2; + char *err; + + /* begin transaction */ + ret = sqlite3_exec(dbh, "BEGIN EXCLUSIVE TRANSACTION;", NULL, NULL, + &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to begin transaction: %s", err); + goto rollback; + } + + ret = sqlite3_exec(dbh, "UPDATE grace SET recovery = \"0\";", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to clear recovery epoch: %s", err); + goto rollback; + } + + ret = snprintf(buf, sizeof(buf), "DROP TABLE \"rec-%016lx\";", + recovery_epoch); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + ret = -EINVAL; + goto rollback; + } + + ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to drop table for recovery epoch: %s", + err); + goto rollback; + } + + ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to commit transaction: %s", err); + goto rollback; + } + + recovery_epoch = 0; + xlog(D_GENERAL, "%s: current_epoch=%lu recovery_epoch=%lu", + __func__, current_epoch, recovery_epoch); + +out: + sqlite3_free(err); + return ret; +rollback: + ret2 = sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err); + if (ret2 != SQLITE_OK) + xlog(L_ERROR, "Unable to rollback transaction: %s", err); + goto out; +} + + +int +sqlite_iterate_recovery(int (*cb)(struct cld_client *clnt), struct cld_client *clnt) +{ + int ret; + sqlite3_stmt *stmt = NULL; + struct cld_msg *cmsg = &clnt->cl_msg; + + if (recovery_epoch == 0) { + xlog(D_GENERAL, "%s: not in grace!", __func__); + return -EINVAL; + } + + ret = snprintf(buf, sizeof(buf), "SELECT * FROM \"rec-%016lx\";", + recovery_epoch); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + return ret; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + return -EINVAL; + } + + ret = sqlite3_prepare_v2(dbh, buf, -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: select statement prepare failed: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; + } + + while ((ret = sqlite3_step(stmt)) == SQLITE_ROW) { + memcpy(&cmsg->cm_u.cm_name.cn_id, sqlite3_column_blob(stmt, 0), + NFS4_OPAQUE_LIMIT); + cmsg->cm_u.cm_name.cn_len = sqlite3_column_bytes(stmt, 0); + cb(clnt); + } + if (ret == SQLITE_DONE) + ret = 0; + sqlite3_finalize(stmt); + return ret; +} diff --git a/utils/nfsdcld/sqlite.h b/utils/nfsdcld/sqlite.h index 06e7c04..5c56f75 100644 --- a/utils/nfsdcld/sqlite.h +++ b/utils/nfsdcld/sqlite.h @@ -20,13 +20,16 @@ #ifndef _SQLITE_H_ #define _SQLITE_H_ +struct cld_client; + int sqlite_prepare_dbh(const char *topdir); -int sqlite_insert_client(const unsigned char *clname, const size_t namelen, - const bool has_session, const bool zerotime); +int sqlite_insert_client(const unsigned char *clname, const size_t namelen); int sqlite_remove_client(const unsigned char *clname, const size_t namelen); -int sqlite_check_client(const unsigned char *clname, const size_t namelen, - const bool has_session); +int sqlite_check_client(const unsigned char *clname, const size_t namelen); int sqlite_remove_unreclaimed(const time_t grace_start); int sqlite_query_reclaiming(const time_t grace_start); +int sqlite_grace_start(void); +int sqlite_grace_done(void); +int sqlite_iterate_recovery(int (*cb)(struct cld_client *clnt), struct cld_client *clnt); #endif /* _SQLITE_H */ From patchwork Tue Mar 26 22:07:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872235 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1022F1708 for ; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EF9E828A9C for ; Tue, 26 Mar 2019 22:07:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E415228C8A; Tue, 26 Mar 2019 22:07:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7E29028A9C for ; Tue, 26 Mar 2019 22:07:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731827AbfCZWHb (ORCPT ); Tue, 26 Mar 2019 18:07:31 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46764 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731491AbfCZWHb (ORCPT ); Tue, 26 Mar 2019 18:07:31 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5462830B4ACE; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id 361732A2BF; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id D688021343; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 4/8] nfsdcld: remove some unused functions Date: Tue, 26 Mar 2019 18:07:26 -0400 Message-Id: <20190326220730.3763-5-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Get rid of sqlite_query_reclaiming() and sqlite_remove_unreclaimed(), which are not used. Signed-off-by: Scott Mayhew --- utils/nfsdcld/sqlite.c | 67 ------------------------------------------ utils/nfsdcld/sqlite.h | 2 -- 2 files changed, 69 deletions(-) diff --git a/utils/nfsdcld/sqlite.c b/utils/nfsdcld/sqlite.c index 82140ea..6e4eb66 100644 --- a/utils/nfsdcld/sqlite.c +++ b/utils/nfsdcld/sqlite.c @@ -628,73 +628,6 @@ out_err: return ret; } -/* - * remove any client records that were not reclaimed since grace_start. - */ -int -sqlite_remove_unreclaimed(time_t grace_start) -{ - int ret; - char *err = NULL; - - ret = snprintf(buf, sizeof(buf), "DELETE FROM clients WHERE time < %ld", - grace_start); - if (ret < 0) { - return ret; - } else if ((size_t)ret >= sizeof(buf)) { - ret = -EINVAL; - return ret; - } - - ret = sqlite3_exec(dbh, buf, NULL, NULL, &err); - if (ret != SQLITE_OK) - xlog(L_ERROR, "%s: delete failed: %s", __func__, err); - - xlog(D_GENERAL, "%s: returning %d", __func__, ret); - sqlite3_free(err); - return ret; -} - -/* - * Are there any clients that are possibly still reclaiming? Return a positive - * integer (usually number of clients) if so. If not, then return 0. On any - * error, return non-zero. - */ -int -sqlite_query_reclaiming(const time_t grace_start) -{ - int ret; - sqlite3_stmt *stmt = NULL; - - ret = sqlite3_prepare_v2(dbh, "SELECT count(*) FROM clients WHERE " - "time < ? OR has_session != 1", -1, &stmt, NULL); - if (ret != SQLITE_OK) { - xlog(L_ERROR, "%s: unable to prepare select statement: %s", - __func__, sqlite3_errmsg(dbh)); - return ret; - } - - ret = sqlite3_bind_int64(stmt, 1, (sqlite3_int64)grace_start); - if (ret != SQLITE_OK) { - xlog(L_ERROR, "%s: bind int64 failed: %s", - __func__, sqlite3_errmsg(dbh)); - return ret; - } - - ret = sqlite3_step(stmt); - if (ret != SQLITE_ROW) { - xlog(L_ERROR, "%s: unexpected return code from select: %s", - __func__, sqlite3_errmsg(dbh)); - return ret; - } - - ret = sqlite3_column_int(stmt, 0); - sqlite3_finalize(stmt); - xlog(D_GENERAL, "%s: there are %d clients that have not completed " - "reclaim", __func__, ret); - return ret; -} - int sqlite_grace_start(void) { diff --git a/utils/nfsdcld/sqlite.h b/utils/nfsdcld/sqlite.h index 5c56f75..757e5cc 100644 --- a/utils/nfsdcld/sqlite.h +++ b/utils/nfsdcld/sqlite.h @@ -26,8 +26,6 @@ int sqlite_prepare_dbh(const char *topdir); int sqlite_insert_client(const unsigned char *clname, const size_t namelen); int sqlite_remove_client(const unsigned char *clname, const size_t namelen); int sqlite_check_client(const unsigned char *clname, const size_t namelen); -int sqlite_remove_unreclaimed(const time_t grace_start); -int sqlite_query_reclaiming(const time_t grace_start); int sqlite_grace_start(void); int sqlite_grace_done(void); int sqlite_iterate_recovery(int (*cb)(struct cld_client *clnt), struct cld_client *clnt); From patchwork Tue Mar 26 22:07:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872237 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B23D6186E for ; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9EF1128C8A for ; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 924E328A9C; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2993628A9C for ; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726277AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:44381 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731779AbfCZWHb (ORCPT ); Tue, 26 Mar 2019 18:07:31 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A2ED5307EA9F; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id 871E65C644; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id DAE9921345; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 5/8] nfsdcld: the -p option should specify the rpc_pipefs mountpoint Date: Tue, 26 Mar 2019 18:07:27 -0400 Message-Id: <20190326220730.3763-6-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Change the -p option to specify the rpc_pipefs mountpoint rather than the full path to the cld pipe file. This is consistent with other daemons that use the rpc_pipefs filesystem. Signed-off-by: Scott Mayhew --- utils/nfsdcld/nfsdcld.c | 17 ++++++++++------- utils/nfsdcld/nfsdcld.man | 9 ++++----- 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/utils/nfsdcld/nfsdcld.c b/utils/nfsdcld/nfsdcld.c index 9b1ad98..272c7c5 100644 --- a/utils/nfsdcld/nfsdcld.c +++ b/utils/nfsdcld/nfsdcld.c @@ -46,11 +46,11 @@ #include "sqlite.h" #include "../mount/version.h" -#ifndef PIPEFS_DIR -#define PIPEFS_DIR NFS_STATEDIR "/rpc_pipefs" +#ifndef DEFAULT_PIPEFS_DIR +#define DEFAULT_PIPEFS_DIR NFS_STATEDIR "/rpc_pipefs" #endif -#define DEFAULT_CLD_PATH PIPEFS_DIR "/nfsd/cld" +#define DEFAULT_CLD_PATH "/nfsd/cld" #ifndef CLD_DEFAULT_STORAGEDIR #define CLD_DEFAULT_STORAGEDIR NFS_STATEDIR "/nfsdcld" @@ -63,7 +63,8 @@ /* private data structures */ /* global variables */ -static char *pipepath = DEFAULT_CLD_PATH; +static char pipefs_dir[PATH_MAX] = DEFAULT_PIPEFS_DIR; +static char pipepath[PATH_MAX]; static int inotify_fd = -1; static struct event pipedir_event; static bool old_kernel = false; @@ -73,7 +74,7 @@ static struct option longopts[] = { "help", 0, NULL, 'h' }, { "foreground", 0, NULL, 'F' }, { "debug", 0, NULL, 'd' }, - { "pipe", 1, NULL, 'p' }, + { "pipefsdir", 1, NULL, 'p' }, { "storagedir", 1, NULL, 's' }, { NULL, 0, 0, 0 }, }; @@ -84,7 +85,7 @@ static void cldcb(int UNUSED(fd), short which, void *data); static void usage(char *progname) { - printf("%s [ -hFd ] [ -p pipe ] [ -s dir ]\n", progname); + printf("%s [ -hFd ] [ -p pipefsdir ] [ -s storagedir ]\n", progname); } static int @@ -663,7 +664,7 @@ main(int argc, char **argv) foreground = true; break; case 'p': - pipepath = optarg; + strlcpy(pipefs_dir, optarg, sizeof(pipefs_dir)); break; case 's': storagedir = optarg; @@ -674,6 +675,8 @@ main(int argc, char **argv) } } + strlcpy(pipepath, pipefs_dir, sizeof(pipepath)); + strlcat(pipepath, DEFAULT_CLD_PATH, sizeof(pipepath)); xlog_open(progname); if (!foreground) { diff --git a/utils/nfsdcld/nfsdcld.man b/utils/nfsdcld/nfsdcld.man index 9ddaf64..b607ba6 100644 --- a/utils/nfsdcld/nfsdcld.man +++ b/utils/nfsdcld/nfsdcld.man @@ -155,11 +155,10 @@ Enable debug level logging. .IP "\fB\-F\fR, \fB\-\-foreground\fR" 4 .IX Item "-F, --foreground" Runs the daemon in the foreground and prints all output to stderr -.IP "\fB\-p\fR \fIpipe\fR, \fB\-\-pipe\fR=\fIpipe\fR" 4 -.IX Item "-p pipe, --pipe=pipe" -Location of the \*(L"cld\*(R" upcall pipe. The default value is -\&\fI/var/lib/nfs/rpc_pipefs/nfsd/cld\fR. If the pipe does not exist when the -daemon starts then it will wait for it to be created. +.IP "\fB\-p\fR \fIpath\fR, \fB\-\-pipefsdir\fR=\fIpath\fR" 4 +.IX Item "-p path, --pipefsdir=path" +Location of the rpc_pipefs filesystem. The default value is +\&\fI/var/lib/nfs/rpc_pipefs\fR. .IP "\fB\-s\fR \fIstorage_dir\fR, \fB\-\-storagedir\fR=\fIstorage_dir\fR" 4 .IX Item "-s storagedir, --storagedir=storage_dir" Directory where stable storage information should be kept. The default From patchwork Tue Mar 26 22:07:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872239 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0076C1708 for ; Tue, 26 Mar 2019 22:07:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E29DF28A9C for ; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D72D528C8A; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4E17D28B87 for ; Tue, 26 Mar 2019 22:07:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731880AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46772 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726277AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id AE9F1307CDD5; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8E2C9600C1; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id DFB3321347; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 6/8] nfsdcld: add /etc/nfs.conf support Date: Tue, 26 Mar 2019 18:07:28 -0400 Message-Id: <20190326220730.3763-7-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Scott Mayhew --- nfs.conf | 4 ++++ utils/nfsdcld/nfsdcld.c | 13 +++++++++++++ utils/nfsdcld/nfsdcld.man | 15 +++++++++++++++ 3 files changed, 32 insertions(+) diff --git a/nfs.conf b/nfs.conf index 796bee4..aabf300 100644 --- a/nfs.conf +++ b/nfs.conf @@ -34,6 +34,10 @@ # state-directory-path=/var/lib/nfs # ha-callout= # +[nfsdcld] +# debug=0 +# storagedir=/var/lib/nfs/nfsdcld +# [nfsdcltrack] # debug=0 # storagedir=/var/lib/nfs/nfsdcltrack diff --git a/utils/nfsdcld/nfsdcld.c b/utils/nfsdcld/nfsdcld.c index 272c7c5..313c68f 100644 --- a/utils/nfsdcld/nfsdcld.c +++ b/utils/nfsdcld/nfsdcld.c @@ -45,6 +45,7 @@ #include "cld-internal.h" #include "sqlite.h" #include "../mount/version.h" +#include "conffile.h" #ifndef DEFAULT_PIPEFS_DIR #define DEFAULT_PIPEFS_DIR NFS_STATEDIR "/rpc_pipefs" @@ -640,6 +641,7 @@ main(int argc, char **argv) char *progname; char *storagedir = CLD_DEFAULT_STORAGEDIR; struct cld_client clnt; + char *s; memset(&clnt, 0, sizeof(clnt)); @@ -653,6 +655,17 @@ main(int argc, char **argv) xlog_syslog(0); xlog_stderr(1); + conf_init_file(NFS_CONFFILE); + s = conf_get_str("general", "pipefs-directory"); + if (s) + strlcpy(pipefs_dir, s, sizeof(pipefs_dir)); + s = conf_get_str("nfsdcld", "storagedir"); + if (s) + storagedir = s; + rc = conf_get_num("nfsdcld", "debug", 0); + if (rc > 0) + xlog_config(D_ALL, 1); + /* process command-line options */ while ((arg = getopt_long(argc, argv, "hdFp:s:", longopts, NULL)) != EOF) { diff --git a/utils/nfsdcld/nfsdcld.man b/utils/nfsdcld/nfsdcld.man index b607ba6..c271d14 100644 --- a/utils/nfsdcld/nfsdcld.man +++ b/utils/nfsdcld/nfsdcld.man @@ -163,6 +163,21 @@ Location of the rpc_pipefs filesystem. The default value is .IX Item "-s storagedir, --storagedir=storage_dir" Directory where stable storage information should be kept. The default value is \fI/var/lib/nfs/nfsdcld\fR. +.SH "CONFIGURATION FILE" +.IX Header "CONFIGURATION FILE" +The following values are recognized in the \fB[nfsdcld]\fR section +of the \fI/etc/nfs.conf\fR configuration file: +.IP "\fBstoragedir\fR" 4 +.IX Item "storagedir" +Equivalent to \fB\-s\fR/\fB\-\-storagedir\fR. +.IP "\fBdebug\fR" 4 +.IX Item "debug" +Setting "debug = 1" is equivalent to \fB\-d\fR/\fB\-\-debug\fR. +.LP +In addition, the following value is recognized from the \fB[general]\fR section: +.IP "\fBpipefs\-directory\fR" 4 +.IX Item "pipefs-directory" +Equivalent to \fB\-p\fR/\fB\-\-pipefsdir\fR. .SH "NOTES" .IX Header "NOTES" The Linux kernel NFSv4 server has historically tracked this information From patchwork Tue Mar 26 22:07:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872243 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A65A4186D for ; Tue, 26 Mar 2019 22:07:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9496628569 for ; Tue, 26 Mar 2019 22:07:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8942728B87; Tue, 26 Mar 2019 22:07:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 43ECB28CCE for ; Tue, 26 Mar 2019 22:07:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731779AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46776 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731491AbfCZWHb (ORCPT ); Tue, 26 Mar 2019 18:07:31 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C968730FD84B; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id A9CCC5D6A9; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id E496321349; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 7/8] systemd: add a unit file for nfsdcld Date: Tue, 26 Mar 2019 18:07:29 -0400 Message-Id: <20190326220730.3763-8-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: Scott Mayhew --- systemd/nfs-server.service | 2 ++ systemd/nfsdcld.service | 10 ++++++++++ 2 files changed, 12 insertions(+) create mode 100644 systemd/nfsdcld.service diff --git a/systemd/nfs-server.service b/systemd/nfs-server.service index 136552b..24118d6 100644 --- a/systemd/nfs-server.service +++ b/systemd/nfs-server.service @@ -6,10 +6,12 @@ Requires= nfs-mountd.service Wants=rpcbind.socket network-online.target Wants=rpc-statd.service nfs-idmapd.service Wants=rpc-statd-notify.service +Wants=nfsdcld.service After= network-online.target local-fs.target After= proc-fs-nfsd.mount rpcbind.socket nfs-mountd.service After= nfs-idmapd.service rpc-statd.service +After= nfsdcld.service Before= rpc-statd-notify.service # GSS services dependencies and ordering diff --git a/systemd/nfsdcld.service b/systemd/nfsdcld.service new file mode 100644 index 0000000..a32d243 --- /dev/null +++ b/systemd/nfsdcld.service @@ -0,0 +1,10 @@ +[Unit] +Description=NFSv4 Client Tracking Daemon +DefaultDependencies=no +Conflicts=umount.target +Requires=rpc_pipefs.target proc-fs-nfsd.mount +After=rpc_pipefs.target proc-fs-nfsd.mount + +[Service] +Type=forking +ExecStart=/usr/sbin/nfsdcld From patchwork Tue Mar 26 22:07:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Scott Mayhew X-Patchwork-Id: 10872249 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4171917E0 for ; Tue, 26 Mar 2019 22:07:37 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2D16228569 for ; Tue, 26 Mar 2019 22:07:37 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 21B8528C8A; Tue, 26 Mar 2019 22:07:37 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CD5B428A9C for ; Tue, 26 Mar 2019 22:07:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731644AbfCZWHf (ORCPT ); Tue, 26 Mar 2019 18:07:35 -0400 Received: from mx1.redhat.com ([209.132.183.28]:41704 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731859AbfCZWHc (ORCPT ); Tue, 26 Mar 2019 18:07:32 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0BC5330FE5D2; Tue, 26 Mar 2019 22:07:32 +0000 (UTC) Received: from coeurl.usersys.redhat.com (ovpn-122-200.rdu2.redhat.com [10.10.122.200]) by smtp.corp.redhat.com (Postfix) with ESMTP id B8BF017DC5; Tue, 26 Mar 2019 22:07:31 +0000 (UTC) Received: by coeurl.usersys.redhat.com (Postfix, from userid 1000) id E92CB2134A; Tue, 26 Mar 2019 18:07:30 -0400 (EDT) From: Scott Mayhew To: steved@redhat.com Cc: bfields@fieldses.org, jlayton@kernel.org, linux-nfs@vger.kernel.org Subject: [nfs-utils PATCH RFC v3 8/8] nfsdcld: add a facility for migrating from older client tracking methods Date: Tue, 26 Mar 2019 18:07:30 -0400 Message-Id: <20190326220730.3763-9-smayhew@redhat.com> In-Reply-To: <20190326220730.3763-1-smayhew@redhat.com> References: <20190326220730.3763-1-smayhew@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.47]); Tue, 26 Mar 2019 22:07:32 +0000 (UTC) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds a one-time "upgrade" when using nfsdcld for the first time. During initialization of the database, we try to copy records from the old nfdscltrack database and from the legacy v4recovery directory (although we don't expect to find records in both places). nfsdcltrack stores the client name string so the handling of those records is straightforward. Legacy records are md5 hashes of the client name string, so we prefix them with "hash:" as way to let knfsd know that this is a legacy record. knfsd will need a fallback check in the event that it cannot find a reclaim record using the client name string. Upon receipt of the first "GraceDone" upcall we clean out any records from the old nfsdcltrack database and delete any subdirectories under the v4recovery directory. Signed-off-by: Scott Mayhew --- utils/nfsdcld/Makefile.am | 4 +- utils/nfsdcld/cld-internal.h | 3 + utils/nfsdcld/legacy.c | 180 +++++++++++++++++++++++ utils/nfsdcld/legacy.h | 24 ++++ utils/nfsdcld/nfsdcld.c | 13 ++ utils/nfsdcld/sqlite.c | 272 +++++++++++++++++++++++++++++++++++ utils/nfsdcld/sqlite.h | 2 + 7 files changed, 496 insertions(+), 2 deletions(-) create mode 100644 utils/nfsdcld/legacy.c create mode 100644 utils/nfsdcld/legacy.h diff --git a/utils/nfsdcld/Makefile.am b/utils/nfsdcld/Makefile.am index d1da749..2c4e5a1 100644 --- a/utils/nfsdcld/Makefile.am +++ b/utils/nfsdcld/Makefile.am @@ -10,10 +10,10 @@ EXTRA_DIST = $(man8_MANS) AM_CFLAGS += -D_LARGEFILE64_SOURCE sbin_PROGRAMS = nfsdcld -nfsdcld_SOURCES = nfsdcld.c sqlite.c +nfsdcld_SOURCES = nfsdcld.c sqlite.c legacy.c nfsdcld_LDADD = ../../support/nfs/libnfs.la $(LIBEVENT) $(LIBSQLITE) $(LIBCAP) -noinst_HEADERS = sqlite.h cld-internal.h +noinst_HEADERS = sqlite.h cld-internal.h legacy.h MAINTAINERCLEANFILES = Makefile.in diff --git a/utils/nfsdcld/cld-internal.h b/utils/nfsdcld/cld-internal.h index a90cced..76e97db 100644 --- a/utils/nfsdcld/cld-internal.h +++ b/utils/nfsdcld/cld-internal.h @@ -26,5 +26,8 @@ struct cld_client { uint64_t current_epoch; uint64_t recovery_epoch; +int first_time; +int num_cltrack_records; +int num_legacy_records; #endif /* _CLD_INTERNAL_H_ */ diff --git a/utils/nfsdcld/legacy.c b/utils/nfsdcld/legacy.c new file mode 100644 index 0000000..f0ca316 --- /dev/null +++ b/utils/nfsdcld/legacy.c @@ -0,0 +1,180 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, + * Boston, MA 02110-1301, USA. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "cld.h" +#include "sqlite.h" +#include "xlog.h" +#include "legacy.h" + +#define NFSD_RECDIR_FILE "/proc/fs/nfsd/nfsv4recoverydir" + +/* + * Loads client records from the v4recovery directory into the database. + * Records are prefixed with the string "hash:" and include the '\0' byte. + * + * Called during database initialization as part of a one-time "upgrade". + */ +void +legacy_load_clients_from_recdir(int *num_records) +{ + int fd; + DIR *v4recovery; + struct dirent *entry; + char recdirname[PATH_MAX]; + char buf[NFS4_OPAQUE_LIMIT]; + struct stat st; + char *nl; + + fd = open(NFSD_RECDIR_FILE, O_RDONLY); + if (fd < 0) { + xlog(D_GENERAL, "Unable to open %s: %m", NFSD_RECDIR_FILE); + return; + } + if (read(fd, recdirname, PATH_MAX) < 0) { + xlog(D_GENERAL, "Unable to read from %s: %m", NFSD_RECDIR_FILE); + return; + } + close(fd); + /* the output from the proc file isn't null-terminated */ + nl = strchr(recdirname, '\n'); + if (!nl) + return; + *nl = '\0'; + if (stat(recdirname, &st) < 0) { + xlog(D_GENERAL, "Unable to stat %s: %d", recdirname, errno); + return; + } + if (!S_ISDIR(st.st_mode)) { + xlog(D_GENERAL, "%s is not a directory: mode=0%o", recdirname + , st.st_mode); + return; + } + v4recovery = opendir(recdirname); + if (!v4recovery) + return; + while ((entry = readdir(v4recovery))) { + int ret; + + /* skip "." and ".." */ + if (entry->d_name[0] == '.') { + switch (entry->d_name[1]) { + case '\0': + continue; + case '.': + if (entry->d_name[2] == '\0') + continue; + } + } + /* prefix legacy records with the string "hash:" */ + ret = snprintf(buf, sizeof(buf), "hash:%s", entry->d_name); + /* if there's a problem, then skip this entry */ + if (ret < 0 || (size_t)ret >= sizeof(buf)) { + xlog(L_WARNING, "%s: unable to build client string for %s!", + __func__, entry->d_name); + continue; + } + /* legacy client records need to include the null terminator */ + ret = sqlite_insert_client((unsigned char *)buf, strlen(buf) + 1); + if (ret) + xlog(L_WARNING, "%s: unable to insert %s: %d", __func__, + entry->d_name, ret); + else + (*num_records)++; + } + closedir(v4recovery); +} + +/* + * Cleans out the v4recovery directory. + * + * Called upon receipt of the first "GraceDone" upcall only. + */ +void +legacy_clear_recdir(void) +{ + int fd; + DIR *v4recovery; + struct dirent *entry; + char recdirname[PATH_MAX]; + char dirname[PATH_MAX]; + struct stat st; + char *nl; + + fd = open(NFSD_RECDIR_FILE, O_RDONLY); + if (fd < 0) { + xlog(D_GENERAL, "Unable to open %s: %m", NFSD_RECDIR_FILE); + return; + } + if (read(fd, recdirname, PATH_MAX) < 0) { + xlog(D_GENERAL, "Unable to read from %s: %m", NFSD_RECDIR_FILE); + return; + } + close(fd); + /* the output from the proc file isn't null-terminated */ + nl = strchr(recdirname, '\n'); + if (!nl) + return; + *nl = '\0'; + if (stat(recdirname, &st) < 0) { + xlog(D_GENERAL, "Unable to stat %s: %d", recdirname, errno); + return; + } + if (!S_ISDIR(st.st_mode)) { + xlog(D_GENERAL, "%s is not a directory: mode=0%o", recdirname + , st.st_mode); + return; + } + v4recovery = opendir(recdirname); + if (!v4recovery) + return; + while ((entry = readdir(v4recovery))) { + int len; + + /* skip "." and ".." */ + if (entry->d_name[0] == '.') { + switch (entry->d_name[1]) { + case '\0': + continue; + case '.': + if (entry->d_name[2] == '\0') + continue; + } + } + len = snprintf(dirname, sizeof(dirname), "%s/%s", recdirname, + entry->d_name); + /* if there's a problem, then skip this entry */ + if (len < 0 || (size_t)len >= sizeof(dirname)) { + xlog(L_WARNING, "%s: unable to build filename for %s!", + __func__, entry->d_name); + continue; + } + len = rmdir(dirname); + if (len) + xlog(L_WARNING, "%s: unable to rmdir %s: %d", __func__, + dirname, len); + } + closedir(v4recovery); +} diff --git a/utils/nfsdcld/legacy.h b/utils/nfsdcld/legacy.h new file mode 100644 index 0000000..8988f6e --- /dev/null +++ b/utils/nfsdcld/legacy.h @@ -0,0 +1,24 @@ +/* + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, + * Boston, MA 02110-1301, USA. + */ + +#ifndef _LEGACY_H_ +#define _LEGACY_H_ + +void legacy_load_clients_from_recdir(int *); +void legacy_clear_recdir(void); + +#endif /* _LEGACY_H_ */ diff --git a/utils/nfsdcld/nfsdcld.c b/utils/nfsdcld/nfsdcld.c index 313c68f..cbf71fc 100644 --- a/utils/nfsdcld/nfsdcld.c +++ b/utils/nfsdcld/nfsdcld.c @@ -46,6 +46,7 @@ #include "sqlite.h" #include "../mount/version.h" #include "conffile.h" +#include "legacy.h" #ifndef DEFAULT_PIPEFS_DIR #define DEFAULT_PIPEFS_DIR NFS_STATEDIR "/rpc_pipefs" @@ -510,6 +511,15 @@ cld_gracedone(struct cld_client *clnt) ret = sqlite_grace_done(); + if (first_time) { + if (num_cltrack_records > 0) + sqlite_delete_cltrack_records(); + if (num_legacy_records > 0) + legacy_clear_recdir(); + sqlite_first_time_done(); + first_time = 0; + } + reply: /* set up reply: downcall with 0 status */ cmsg->cm_status = ret ? -EREMOTEIO : ret; @@ -642,6 +652,9 @@ main(int argc, char **argv) char *storagedir = CLD_DEFAULT_STORAGEDIR; struct cld_client clnt; char *s; + first_time = 0; + num_cltrack_records = 0; + num_legacy_records = 0; memset(&clnt, 0, sizeof(clnt)); diff --git a/utils/nfsdcld/sqlite.c b/utils/nfsdcld/sqlite.c index 6e4eb66..faa62f9 100644 --- a/utils/nfsdcld/sqlite.c +++ b/utils/nfsdcld/sqlite.c @@ -64,8 +64,11 @@ #include "sqlite.h" #include "cld.h" #include "cld-internal.h" +#include "conffile.h" +#include "legacy.h" #define CLD_SQLITE_LATEST_SCHEMA_VERSION 3 +#define CLTRACK_DEFAULT_STORAGEDIR NFS_STATEDIR "/nfsdcltrack" /* in milliseconds */ #define CLD_SQLITE_BUSY_TIMEOUT 10000 @@ -73,6 +76,7 @@ /* private data structures */ /* global variables */ +static char *cltrack_storagedir = CLTRACK_DEFAULT_STORAGEDIR; /* reusable pathname and sql command buffer */ static char buf[PATH_MAX]; @@ -135,6 +139,37 @@ out: return ret; } +static int +sqlite_query_first_time(int *first_time) +{ + int ret; + sqlite3_stmt *stmt = NULL; + + /* prepare select query */ + ret = sqlite3_prepare_v2(dbh, + "SELECT value FROM parameters WHERE key == \"first_time\";", + -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(D_GENERAL, "Unable to prepare select statement: %s", + sqlite3_errmsg(dbh)); + goto out; + } + + /* query first_time */ + ret = sqlite3_step(stmt); + if (ret != SQLITE_ROW) { + xlog(D_GENERAL, "Select statement execution failed: %s", + sqlite3_errmsg(dbh)); + goto out; + } + + *first_time = sqlite3_column_int(stmt, 0); + ret = 0; +out: + sqlite3_finalize(stmt); + return ret; +} + static int sqlite_maindb_update_schema(int oldversion) { @@ -229,6 +264,18 @@ sqlite_maindb_update_schema(int oldversion) goto rollback; } + ret = sqlite_query_first_time(&first_time); + if (ret != SQLITE_OK) { + /* insert first_time into parameters table */ + ret = sqlite3_exec(dbh, "INSERT OR FAIL INTO parameters " + "values (\"first_time\", \"1\");", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to insert into parameter table: %s", err); + goto rollback; + } + } + ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err); if (ret != SQLITE_OK) { xlog(L_ERROR, "Unable to commit transaction: %s", err); @@ -338,6 +385,15 @@ sqlite_maindb_init_v3(void) goto rollback; } + /* insert first_time into parameters table */ + ret = sqlite3_exec(dbh, "INSERT OR FAIL INTO parameters " + "values (\"first_time\", \"1\");", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to insert into parameter table: %s", err); + goto rollback; + } + ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err); if (ret != SQLITE_OK) { xlog(L_ERROR, "Unable to commit transaction: %s", err); @@ -391,6 +447,153 @@ out: return ret; } +static int +sqlite_attach_db(const char *path) +{ + int ret; + char dbpath[PATH_MAX]; + struct stat stb; + sqlite3_stmt *stmt = NULL; + + ret = snprintf(dbpath, PATH_MAX - 1, "%s/main.sqlite", path); + if (ret < 0) + return ret; + + dbpath[PATH_MAX - 1] = '\0'; + ret = stat(dbpath, &stb); + if (ret < 0) + return ret; + + xlog(D_GENERAL, "attaching %s", dbpath); + ret = sqlite3_prepare_v2(dbh, "ATTACH DATABASE ? AS attached;", + -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: unable to prepare attach statement: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; + } + + ret = sqlite3_bind_text(stmt, 1, dbpath, strlen(dbpath), SQLITE_STATIC); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: bind text failed: %s", + __func__, sqlite3_errmsg(dbh)); + return ret; + } + + ret = sqlite3_step(stmt); + if (ret == SQLITE_DONE) + ret = SQLITE_OK; + else + xlog(L_ERROR, "%s: unexpected return code from attach: %s", + __func__, sqlite3_errmsg(dbh)); + + sqlite3_finalize(stmt); + stmt = NULL; + return ret; +} + +static int +sqlite_detach_db(void) +{ + int ret; + char *err = NULL; + + xlog(D_GENERAL, "detaching database"); + ret = sqlite3_exec(dbh, "DETACH DATABASE attached;", NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to detach attached db: %s", err); + } + + sqlite3_free(err); + return ret; +} + +/* + * Copies client records from the nfsdcltrack database as part of a one-time + * "upgrade". + * + * Returns a non-zero sqlite error code, or SQLITE_OK (aka 0). + * Returns the number of records copied via "num_rec". + */ +static int +sqlite_copy_cltrack_records(int *num_rec) +{ + int ret, ret2; + char *s; + char *err = NULL; + sqlite3_stmt *stmt = NULL; + + s = conf_get_str("nfsdcltrack", "storagedir"); + if (s) + cltrack_storagedir = s; + ret = sqlite_attach_db(cltrack_storagedir); + if (ret) + goto out; + ret = sqlite3_exec(dbh, "BEGIN EXCLUSIVE TRANSACTION;", NULL, NULL, + &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to begin transaction: %s", err); + goto rollback; + } + ret = snprintf(buf, sizeof(buf), "DELETE FROM \"rec-%016lx\";", + current_epoch); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + ret = -EINVAL; + goto rollback; + } + ret = sqlite3_exec(dbh, (const char *)buf, NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to clear records from current epoch: %s", err); + goto rollback; + } + ret = snprintf(buf, sizeof(buf), "INSERT INTO \"rec-%016lx\" " + "SELECT id FROM attached.clients;", + current_epoch); + if (ret < 0) { + xlog(L_ERROR, "sprintf failed!"); + goto rollback; + } else if ((size_t)ret >= sizeof(buf)) { + xlog(L_ERROR, "sprintf output too long! (%d chars)", ret); + ret = -EINVAL; + goto rollback; + } + ret = sqlite3_prepare_v2(dbh, buf, -1, &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "%s: insert statement prepare failed: %s", + __func__, sqlite3_errmsg(dbh)); + goto rollback; + } + ret = sqlite3_step(stmt); + if (ret != SQLITE_DONE) { + xlog(L_ERROR, "%s: unexpected return code from insert: %s", + __func__, sqlite3_errmsg(dbh)); + goto rollback; + } + *num_rec = sqlite3_changes(dbh); + ret = sqlite3_exec(dbh, "COMMIT TRANSACTION;", NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to commit transaction: %s", err); + goto rollback; + } +cleanup: + sqlite3_finalize(stmt); + sqlite3_free(err); + sqlite_detach_db(); +out: + xlog(D_GENERAL, "%s: returning %d", __func__, ret); + return ret; +rollback: + *num_rec = 0; + ret2 = sqlite3_exec(dbh, "ROLLBACK TRANSACTION;", NULL, NULL, &err); + if (ret2 != SQLITE_OK) + xlog(L_ERROR, "Unable to rollback transaction: %s", err); + goto cleanup; +} + /* Open the database and set up the database handle for it */ int sqlite_prepare_dbh(const char *topdir) @@ -465,6 +668,23 @@ sqlite_prepare_dbh(const char *topdir) ret = sqlite_startup_query_grace(); + ret = sqlite_query_first_time(&first_time); + if (ret) + goto out_close; + + /* one-time "upgrade" from older client tracking methods */ + if (first_time) { + sqlite_copy_cltrack_records(&num_cltrack_records); + xlog(D_GENERAL, "%s: num_cltrack_records = %d\n", + __func__, num_cltrack_records); + legacy_load_clients_from_recdir(&num_legacy_records); + xlog(D_GENERAL, "%s: num_legacy_records = %d\n", + __func__, num_legacy_records); + if (num_cltrack_records > 0 && num_legacy_records > 0) + xlog(L_WARNING, "%s: first-time upgrade detected " + "both cltrack and legacy records!\n", __func__); + } + return ret; out_close: sqlite3_close(dbh); @@ -835,3 +1055,55 @@ sqlite_iterate_recovery(int (*cb)(struct cld_client *clnt), struct cld_client *c sqlite3_finalize(stmt); return ret; } + +/* + * Cleans out the old nfsdcltrack database. + * + * Called upon receipt of the first "GraceDone" upcall only. + */ +int +sqlite_delete_cltrack_records(void) +{ + int ret; + char *s; + char *err = NULL; + + s = conf_get_str("nfsdcltrack", "storagedir"); + if (s) + cltrack_storagedir = s; + ret = sqlite_attach_db(cltrack_storagedir); + if (ret) + goto out; + ret = sqlite3_exec(dbh, "DELETE FROM attached.clients;", + NULL, NULL, &err); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to clear records from cltrack db: %s", + err); + } + sqlite_detach_db(); +out: + sqlite3_free(err); + return ret; +} + +/* + * Sets first_time to 0 in the parameters table to ensure we only + * copy old client tracking records into the database one time. + * + * Called upon receipt of the first "GraceDone" upcall only. + */ +int +sqlite_first_time_done(void) +{ + int ret; + char *err = NULL; + + ret = sqlite3_exec(dbh, "UPDATE parameters SET value = \"0\" " + "WHERE key = \"first_time\";", + NULL, NULL, &err); + if (ret != SQLITE_OK) + xlog(L_ERROR, "Unable to clear first_time: %s", err); + + sqlite3_free(err); + return ret; +} diff --git a/utils/nfsdcld/sqlite.h b/utils/nfsdcld/sqlite.h index 757e5cc..7741382 100644 --- a/utils/nfsdcld/sqlite.h +++ b/utils/nfsdcld/sqlite.h @@ -29,5 +29,7 @@ int sqlite_check_client(const unsigned char *clname, const size_t namelen); int sqlite_grace_start(void); int sqlite_grace_done(void); int sqlite_iterate_recovery(int (*cb)(struct cld_client *clnt), struct cld_client *clnt); +int sqlite_delete_cltrack_records(void); +int sqlite_first_time_done(void); #endif /* _SQLITE_H */