From patchwork Thu Feb 17 13:15:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Weinberger X-Patchwork-Id: 12750037 From: Richard Weinberger To: linux-nfs@vger.kernel.org Cc: david@sigma-star.at,
bfields@fieldses.org, luis.turcitu@appsbroker.com, david.young@appsbroker.com, david.oberhollenzer@sigma-star.at, trond.myklebust@hammerspace.com, anna.schumaker@netapp.com, chris.chilvers@appsbroker.com, Richard Weinberger
Subject: [RFC PATCH 1/6] Implement reexport helper library
Date: Thu, 17 Feb 2022 14:15:26 +0100
Message-Id: <20220217131531.2890-2-richard@nod.at>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20220217131531.2890-1-richard@nod.at>
References: <20220217131531.2890-1-richard@nod.at>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

This internal library contains code that will be used by various
tools within the nfs-utils package to deal better with NFS re-export,
especially cross mounts.

Signed-off-by: Richard Weinberger
---
 configure.ac                 |  12 +
 support/Makefile.am          |   4 +
 support/reexport/Makefile.am |   6 +
 support/reexport/reexport.c  | 477 +++++++++++++++++++++++++++++++++++
 support/reexport/reexport.h  |  53 ++++
 5 files changed, 552 insertions(+)
 create mode 100644 support/reexport/Makefile.am
 create mode 100644 support/reexport/reexport.c
 create mode 100644 support/reexport/reexport.h

diff --git a/configure.ac b/configure.ac
index 93626d62..86bf8ba9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -274,6 +274,17 @@ AC_ARG_ENABLE(nfsv4server,
 	fi
 	AM_CONDITIONAL(CONFIG_NFSV4SERVER, [test "$enable_nfsv4server" = "yes" ])

+AC_ARG_ENABLE(reexport,
+	[AC_HELP_STRING([--enable-reexport],
+			[enable support for re-exporting NFS mounts @<:@default=no@:>@])],
+	enable_reexport=$enableval,
+	enable_reexport="no")
+	if test "$enable_reexport" = yes; then
+		AC_DEFINE(HAVE_REEXPORT_SUPPORT, 1,
+			[Define this if you want NFS re-export support compiled in])
+	fi
+	AM_CONDITIONAL(CONFIG_REEXPORT, [test "$enable_reexport" = "yes" ])
+
 dnl Check for TI-RPC library and headers
 AC_LIBTIRPC

@@ -730,6 +741,7 @@ AC_CONFIG_FILES([
 	support/nsm/Makefile
 	support/nfsidmap/Makefile
 	support/nfsidmap/libnfsidmap.pc
+	support/reexport/Makefile
tools/Makefile tools/locktest/Makefile tools/nlmtest/Makefile diff --git a/support/Makefile.am b/support/Makefile.am index c962d4d4..986e9b5f 100644 --- a/support/Makefile.am +++ b/support/Makefile.am @@ -10,6 +10,10 @@ if CONFIG_JUNCTION OPTDIRS += junction endif +if CONFIG_REEXPORT +OPTDIRS += reexport +endif + SUBDIRS = export include misc nfs nsm $(OPTDIRS) MAINTAINERCLEANFILES = Makefile.in diff --git a/support/reexport/Makefile.am b/support/reexport/Makefile.am new file mode 100644 index 00000000..9d544a8f --- /dev/null +++ b/support/reexport/Makefile.am @@ -0,0 +1,6 @@ +## Process this file with automake to produce Makefile.in + +noinst_LIBRARIES = libreexport.a +libreexport_a_SOURCES = reexport.c + +MAINTAINERCLEANFILES = Makefile.in diff --git a/support/reexport/reexport.c b/support/reexport/reexport.c new file mode 100644 index 00000000..551ec278 --- /dev/null +++ b/support/reexport/reexport.c @@ -0,0 +1,477 @@ +#ifdef HAVE_CONFIG_H +#include +#endif + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "nfslib.h" +#include "reexport.h" +#include "xlog.h" + +#define REEXPDB_SHM_NAME "/nfs_reexport_db_lock" +#define REEXPDB_SHM_SZ 4096 +#define REEXPDB_INIT_LOCK NFS_STATEDIR "/reexpdb_init.lock" +#define REEXPDB_DBFILE NFS_STATEDIR "/reexpdb.sqlite3" + +static const char initdb_sql[] = "CREATE TABLE IF NOT EXISTS fsidnums (num INTEGER PRIMARY KEY CHECK (num > 0 AND num < 4294967296), path TEXT UNIQUE); CREATE TABLE IF NOT EXISTS subvolumes (path TEXT PRIMARY KEY); CREATE INDEX IF NOT EXISTS idx_ids_path ON fsidnums (path);"; +/* + * This query is a little tricky. We use SQL to find and claim the smallest free fsid number. + * To find a free fsid the fsidnums is left joined to itself but with an offset of 1. + * Everything after the UNION statement is to handle the corner case where fsidnums + * is empty. In this case we want 1 as first fsid number. 
+ */ +static const char new_fsidnum_by_path_sql[] = "INSERT INTO fsidnums VALUES ((SELECT ids1.num + 1 FROM fsidnums AS ids1 LEFT JOIN fsidnums AS ids2 ON ids2.num = ids1.num + 1 WHERE ids2.num IS NULL UNION SELECT 1 WHERE NOT EXISTS (SELECT NULL FROM fsidnums WHERE num = 1) LIMIT 1), ?1) RETURNING num;"; +static const char fsidnum_by_path_sql[] = "SELECT num FROM fsidnums WHERE path = ?1;"; +static const char add_crossed_volume_sql[] = "REPLACE INTO subvolumes VALUES(?1);"; +static const char drop_crossed_volume_sql[] = "DELETE FROM subvolumes WHERE path = ?1;"; +static const char get_crossed_volumes_sql[] = "SELECT path from subvolumes;"; + +static sqlite3 *db; +static pthread_rwlock_t *reexpdb_rwlock; +static int init_done; + +static void reexpdb_wrlock(void) +{ + assert(pthread_rwlock_wrlock(reexpdb_rwlock) == 0); +} + +static void reexpdb_rdlock(void) +{ + assert(pthread_rwlock_rdlock(reexpdb_rwlock) == 0); +} + +static void reexpdb_unlock(void) +{ + assert(pthread_rwlock_unlock(reexpdb_rwlock) == 0); +} + +static int init_shm_lock(void) +{ + int lockfd = -1, shmfd = -1; + int initlock = 0; + int ret = 0; + + assert(sizeof(*reexpdb_rwlock) <= REEXPDB_SHM_SZ); + + lockfd = open(REEXPDB_INIT_LOCK, O_RDWR | O_CREAT, 0600); + if (lockfd == -1) { + ret = -1; + xlog(L_FATAL, "Unable to open %s: %m", REEXPDB_INIT_LOCK); + + goto out; + } + + ret = flock(lockfd, LOCK_EX); + if (ret == -1) { + ret = -1; + xlog(L_FATAL, "Unable to lock %s: %m", REEXPDB_INIT_LOCK); + + goto out_close; + } + + shmfd = shm_open(REEXPDB_SHM_NAME, O_RDWR, 0600); + if (shmfd == -1 && errno == ENOENT) { + shmfd = shm_open(REEXPDB_SHM_NAME, O_RDWR | O_CREAT, 0600); + if (shmfd == -1) { + ret = -1; + xlog(L_FATAL, "Unable to create shared memory: %m"); + goto out_unflock; + } + + ret = ftruncate(shmfd, REEXPDB_SHM_SZ); + if (ret == -1) { + ret = -1; + xlog(L_FATAL, "Unable to ftruncate shared memory: %m"); + goto out_unflock; + } + + initlock = 1; + } else if (shmfd == -1) { + ret = -1; + 
xlog(L_FATAL, "Unable to open shared memory: %m"); + goto out_unflock; + } + + reexpdb_rwlock = mmap(NULL, REEXPDB_SHM_SZ, PROT_READ | PROT_WRITE, MAP_SHARED, shmfd, 0); + close(shmfd); + if (reexpdb_rwlock == (void *)-1) { + xlog(L_FATAL, "Unable to mmap shared memory: %m"); + ret = -1; + goto out_unflock; + } + + if (initlock) { + pthread_rwlockattr_t attr; + + ret = pthread_rwlockattr_init(&attr); + if (ret != 0) { + xlog(L_FATAL, "Unable to pthread_rwlockattr_init: %m"); + ret = -1; + goto out_unflock; + } + + ret = pthread_rwlockattr_setpshared(&attr, PTHREAD_PROCESS_SHARED); + if (ret != 0) { + xlog(L_FATAL, "Unable to set PTHREAD_PROCESS_SHARED: %m"); + ret = -1; + goto out_unflock; + } + + ret = pthread_rwlock_init(reexpdb_rwlock, &attr); + if (ret != 0) { + xlog(L_FATAL, "Unable to pthread_rwlock_init: %m"); + ret = -1; + goto out_unflock; + } + } + + ret = 0; + +out_unflock: + flock(lockfd, LOCK_UN); +out_close: + close(lockfd); +out: + return ret; +} + +/* + * reexpdb_init - Initialize reexport database + * + * Setup shared lock (database is concurrently used by multiple processes), + * if needed create tables and create database handle. + * It is okay to call this function multiple times per process. + */ +int reexpdb_init(void) +{ + char *sqlerr; + int ret; + + if (init_done) + return 0; + + ret = init_shm_lock(); + if (ret) + return -1; + + ret = sqlite3_open_v2(REEXPDB_DBFILE, &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_FULLMUTEX, NULL); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to open reexport database: %s", sqlite3_errstr(ret)); + return -1; + } + + reexpdb_wrlock(); + ret = sqlite3_exec(db, initdb_sql, NULL, NULL, &sqlerr); + reexpdb_unlock(); + if (ret != SQLITE_OK) { + xlog(L_ERROR, "Unable to init reexport database: %s", sqlite3_errstr(ret)); + sqlite3_free(sqlerr); + sqlite3_close_v2(db); + ret = -1; + } else { + init_done = 1; + ret = 0; + } + + return ret; +} + +/* + * reexpdb_destroy - Undo reexpdb_init(). 
+ * + * The shared lock is kept. We cannot know which other + * processes are still using the database. + */ +void reexpdb_destroy(void) +{ + if (!init_done) + return; + + sqlite3_close_v2(db); + munmap((void *)reexpdb_rwlock, REEXPDB_SHM_SZ); + reexpdb_rwlock = NULL; +} + +static int get_fsidnum_by_path(char *path, uint32_t *fsidnum) +{ + sqlite3_stmt *stmt = NULL; + int found = 0; + int ret; + + ret = sqlite3_prepare_v2(db, fsidnum_by_path_sql, sizeof(fsidnum_by_path_sql), &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to prepare SQL query: %s", sqlite3_errstr(ret)); + goto out; + } + + ret = sqlite3_bind_text(stmt, 1, path, -1, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to bind \"%s\" SQL query: %s", __func__, sqlite3_errstr(ret)); + goto out; + } + + reexpdb_rdlock(); + ret = sqlite3_step(stmt); + if (ret == SQLITE_ROW) { + *fsidnum = sqlite3_column_int(stmt, 0); + found = 1; + } else if (ret == SQLITE_DONE) { + /* No hit */ + found = 0; + } else { + xlog(L_WARNING, "Error while looking up \"%s\" in database: %s", path, sqlite3_errstr(ret)); + } + reexpdb_unlock(); + +out: + sqlite3_finalize(stmt); + return found; +} + +static int new_fsidnum_by_path(char *path, uint32_t *fsidnum) +{ + sqlite3_stmt *stmt = NULL; + int found = 0, check = 0; + int ret; + + ret = sqlite3_prepare_v2(db, new_fsidnum_by_path_sql, sizeof(new_fsidnum_by_path_sql), &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to prepare SQL query: %s", sqlite3_errstr(ret)); + goto out; + } + + ret = sqlite3_bind_text(stmt, 1, path, -1, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to bind \"%s\" SQL query: %s", path, sqlite3_errstr(ret)); + goto out; + } + + reexpdb_wrlock(); + ret = sqlite3_step(stmt); + if (ret == SQLITE_ROW) { + *fsidnum = sqlite3_column_int(stmt, 0); + found = 1; + } else if (ret == SQLITE_CONSTRAINT) { + /* Maybe we lost the race against another writer and the path is now present.
*/ + check = 1; + } else { + xlog(L_WARNING, "Error while looking up \"%s\" in database: %s", path, sqlite3_errstr(ret)); + } + reexpdb_unlock(); + +out: + sqlite3_finalize(stmt); + + if (check) { + found = get_fsidnum_by_path(path, fsidnum); + if (!found) + xlog(L_WARNING, "SQLITE_CONSTRAINT error while inserting \"%s\" in database", path); + } + + return found; +} + +int reexpdb_fsidnum_by_path(char *path, uint32_t *fsidnum, int may_create) +{ + int found; + + found = get_fsidnum_by_path(path, fsidnum); + + if (!found && may_create) + found = new_fsidnum_by_path(path, fsidnum); + + return found; +} + +int reexpdb_apply_reexport_settings(struct exportent *ep, char *flname, int flline) +{ + int ret = 0; + + switch (ep->e_reexport) { + case REEXP_REMOTE_DEVFSID: + if (!ep->e_fsid && !ep->e_uuid) { + xlog(L_ERROR, "%s:%i: Selected 'reexport=' mode needs either a numerical or UUID 'fsid='\n", + flname, flline); + ret = -1; + } + break; + case REEXP_AUTO_FSIDNUM: + case REEXP_PREDEFINED_FSIDNUM: { + uint32_t fsidnum; + int found; + + if (ep->e_uuid) + break; + + if (reexpdb_init() != 0) { + ret = -1; + + break; + } + + found = reexpdb_fsidnum_by_path(ep->e_path, &fsidnum, 0); + if (!found) { + if (ep->e_reexport == REEXP_AUTO_FSIDNUM) { + found = reexpdb_fsidnum_by_path(ep->e_path, &fsidnum, 1); + if (!found) { + xlog(L_ERROR, "%s:%i: Unable to generate fsid for %s", + flname, flline, ep->e_path); + ret = -1; + + break; + } + } else { + if (!ep->e_fsid) { + xlog(L_ERROR, "%s:%i: Selected 'reexport=' mode requires either a UUID 'fsid=' or a numerical 'fsid=' or a reexport db entry %d", + flname, flline, ep->e_fsid); + ret = -1; + } + + break; + } + } + + if (ep->e_fsid) { + if (ep->e_fsid != fsidnum) { + xlog(L_ERROR, "%s:%i: Selected 'reexport=' mode requires configured numerical 'fsid=' to agree with reexport db entry", + flname, flline); + ret = -1; + } + + break; + } + + ep->e_fsid = fsidnum; + + break; + } + } + + return ret; +} + +int reexpdb_add_subvolume(char 
*path) +{ + sqlite3_stmt *stmt = NULL; + int ret; + + reexpdb_wrlock(); + ret = sqlite3_prepare_v2(db, add_crossed_volume_sql, sizeof(add_crossed_volume_sql), &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to prepare SQL query: %s", sqlite3_errstr(ret)); + ret = -1; + goto out; + } + + ret = sqlite3_bind_text(stmt, 1, path, -1, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to bind \"%s\" SQL query: %s", __func__, sqlite3_errstr(ret)); + ret = -1; + goto out; + } + + ret = sqlite3_step(stmt); + if (ret != SQLITE_DONE) { + xlog(L_WARNING, "Error while adding \"%s\" to database: %s", path, sqlite3_errstr(ret)); + ret = -1; + } else { + ret = 0; + } + +out: + reexpdb_unlock(); + sqlite3_finalize(stmt); + return ret; +} + +int reexpdb_drop_subvolume_unlocked(char *path) +{ + sqlite3_stmt *stmt = NULL; + int ret; + + ret = sqlite3_prepare_v2(db, drop_crossed_volume_sql, sizeof(drop_crossed_volume_sql), &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to prepare SQL query: %s", sqlite3_errstr(ret)); + ret = -1; + goto out; + } + + ret = sqlite3_bind_text(stmt, 1, path, -1, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to bind \"%s\" SQL query: %s", __func__, sqlite3_errstr(ret)); + ret = -1; + goto out; + } + + ret = sqlite3_step(stmt); + if (ret != SQLITE_DONE) { + xlog(L_WARNING, "Error while deleting \"%s\" from database: %s", path, sqlite3_errstr(ret)); + ret = -1; + } else { + ret = 0; + } + +out: + sqlite3_finalize(stmt); + return ret; +} + + +int reexpdb_uncover_subvolumes(void (*cb)(char *path)) +{ + sqlite3_stmt *stmt = NULL; + struct statfs st; + const unsigned char *path; + int ret; + + if (cb) + reexpdb_wrlock(); + else + reexpdb_rdlock(); + + ret = sqlite3_prepare_v2(db, get_crossed_volumes_sql, sizeof(get_crossed_volumes_sql), &stmt, NULL); + if (ret != SQLITE_OK) { + xlog(L_WARNING, "Unable to prepare SQL query: %s", sqlite3_errstr(ret)); + ret = -1; + goto out_unlock; + } + + for (;;) { + ret =
sqlite3_step(stmt); + if (ret != SQLITE_ROW) + break; + + path = sqlite3_column_text(stmt, 0); + if (cb) + cb((char *)path); + else + statfs((char *)path, &st); + } + + if (ret != SQLITE_DONE) { + xlog(L_WARNING, "Error while reading all subvolumes: %s", sqlite3_errstr(ret)); + ret = -1; + goto out_unlock; + } + + ret = 0; + +out_unlock: + reexpdb_unlock(); + sqlite3_finalize(stmt); +out: + return ret; +} diff --git a/support/reexport/reexport.h b/support/reexport/reexport.h new file mode 100644 index 00000000..46ec8a96 --- /dev/null +++ b/support/reexport/reexport.h @@ -0,0 +1,53 @@ +#ifndef REEXPORT_H +#define REEXPORT_H + +enum { + REEXP_NONE = 0, + REEXP_AUTO_FSIDNUM, + REEXP_PREDEFINED_FSIDNUM, + REEXP_REMOTE_DEVFSID, +}; + +#ifdef HAVE_REEXPORT_SUPPORT +int reexpdb_init(void); +void reexpdb_destroy(void); +int reexpdb_fsidnum_by_path(char *path, uint32_t *fsidnum, int may_create); +int reexpdb_apply_reexport_settings(struct exportent *ep, char *flname, int flline); +int reexpdb_add_subvolume(char *path); +int reexpdb_uncover_subvolumes(void (*cb)(char *path)); +int reexpdb_drop_subvolume_unlocked(char *path); +#else +static inline int reexpdb_init(void) { return 0; } +static inline void reexpdb_destroy(void) {} +static inline int reexpdb_fsidnum_by_path(char *path, uint32_t *fsidnum, int may_create) +{ + (void)path; + (void)may_create; + *fsidnum = 0; + return 0; +} +static inline int reexpdb_apply_reexport_settings(struct exportent *ep, char *flname, int flline) +{ + (void)ep; + (void)flname; + (void)flline; + return 0; +} +static inline int reexpdb_add_subvolume(char *path) +{ + (void)path; + return 0; +} +static inline int reexpdb_uncover_subvolumes(void (*cb)(char *path)) +{ + (void)cb; + return 0; +} +static inline int reexpdb_drop_subvolume_unlocked(char *path) +{ + (void)path; + return 0; +} +#endif /* HAVE_REEXPORT_SUPPORT */ + +#endif /* REEXPORT_H */ From patchwork Thu Feb 17 13:15:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Weinberger X-Patchwork-Id: 12750033 From: Richard Weinberger To: linux-nfs@vger.kernel.org Cc: david@sigma-star.at, bfields@fieldses.org, luis.turcitu@appsbroker.com, david.young@appsbroker.com,
david.oberhollenzer@sigma-star.at, trond.myklebust@hammerspace.com, anna.schumaker@netapp.com, chris.chilvers@appsbroker.com, Richard Weinberger
Subject: [RFC PATCH 2/6] exports: Implement new export option reexport=
Date: Thu, 17 Feb 2022 14:15:27 +0100
Message-Id: <20220217131531.2890-3-richard@nod.at>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20220217131531.2890-1-richard@nod.at>
References: <20220217131531.2890-1-richard@nod.at>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

When re-exporting an NFS volume it is mandatory to specify either a UUID
or a numerical fsid= option, because nfsd is unable to derive an identifier
on its own.

For NFS cross mounts this becomes a problem, because nfsd also needs an
identifier for every crossed mount. A common workaround is to list every
single subvolume in the exports list too. But this defeats the purpose of
the crossmnt option and is tedious.

This is where the reexport= option tries to help. It offers various
strategies to automatically derive an identifier for NFS volumes and
subvolumes. Each has its pros and cons.

Currently three modes are implemented:

1. auto-fsidnum
   In this mode mountd/exportd will create a new numerical fsid for an NFS
   volume and subvolume. The numbers are stored in a database so that the
   server will always use the same fsid. The entry in the exports file is
   allowed to omit fsid= entirely, but stating a UUID is allowed if needed.
   This mode has the obvious downside that load balancing is not possible,
   since multiple re-exporting NFS servers would generate different ids.

2. predefined-fsidnum
   This mode works just like auto-fsidnum but does not generate ids for you.
   It helps in the load balancing case. A system administrator has to
   manually maintain the database and install it on all re-exporting NFS
   servers. If you have a massive number of subvolumes this mode will help,
   because you don't have to bloat the exports list.

3. remote-devfsid
   If this mode is selected mountd/exportd will derive a UUID from the
   re-exported NFS volume's fsid (RFC 7530, section 5.8.1.9). No further
   local state is needed on the re-exporting server. The export list entry
   still needs a fsid= setting, because the NFS mounts might not be there
   yet while the exports file is being parsed. This mode is dangerous; use
   it only if you're absolutely sure that the NFS server you're re-exporting
   has a stable fsid. Chances are good that it can change. Since a UUID is
   derived, re-exporting from NFSv3 to NFSv3 is not possible. The file
   handle space is too small. NFSv3 to NFSv4 works, though.

Signed-off-by: Richard Weinberger
---
 support/include/nfslib.h   |  1 +
 support/nfs/Makefile.am    |  1 +
 support/nfs/exports.c      | 73 ++++++++++++++++++++++++++++++++++++++
 utils/exportfs/Makefile.am |  4 +++
 utils/mount/Makefile.am    |  6 ++++
 5 files changed, 85 insertions(+)

diff --git a/support/include/nfslib.h b/support/include/nfslib.h
index 6faba71b..0465a1ff 100644
--- a/support/include/nfslib.h
+++ b/support/include/nfslib.h
@@ -85,6 +85,7 @@ struct exportent {
 	struct sec_entry e_secinfo[SECFLAVOR_COUNT+1];
 	unsigned int e_ttl;
 	char * e_realpath;
+	int e_reexport;
 };

 struct rmtabent {
diff --git a/support/nfs/Makefile.am b/support/nfs/Makefile.am
index 67e3a8e1..c4357e7d 100644
--- a/support/nfs/Makefile.am
+++ b/support/nfs/Makefile.am
@@ -9,6 +9,7 @@ libnfs_la_SOURCES = exports.c rmtab.c xio.c rpcmisc.c rpcdispatch.c \
 	svc_socket.c cacheio.c closeall.c nfs_mntent.c \
 	svc_create.c atomicio.c strlcat.c strlcpy.c
 libnfs_la_LIBADD = libnfsconf.la
+libnfs_la_CPPFLAGS = -I$(top_srcdir)/support/reexport

 libnfsconf_la_SOURCES = conffile.c xlog.c

diff --git a/support/nfs/exports.c b/support/nfs/exports.c
index 2c8f0752..13129d68 100644
--- a/support/nfs/exports.c
+++ b/support/nfs/exports.c
@@ -31,6 +31,7 @@
 #include "xlog.h"
 #include "xio.h"
 #include "pseudoflavors.h"
+#include "reexport.h"

 #define EXPORT_DEFAULT_FLAGS \
(NFSEXP_READONLY|NFSEXP_ROOTSQUASH|NFSEXP_GATHERED_WRITES|NFSEXP_NOSUBTREECHECK) @@ -103,6 +104,7 @@ static void init_exportent (struct exportent *ee, int fromkernel) ee->e_nsqgids = 0; ee->e_uuid = NULL; ee->e_ttl = default_ttl; + ee->e_reexport = REEXP_NONE; } struct exportent * @@ -302,6 +304,26 @@ putexportent(struct exportent *ep) } if (ep->e_uuid) fprintf(fp, "fsid=%s,", ep->e_uuid); + + if (ep->e_reexport) { + fprintf(fp, "reexport="); + switch (ep->e_reexport) { + case REEXP_AUTO_FSIDNUM: + fprintf(fp, "auto-fsidnum"); + break; + case REEXP_PREDEFINED_FSIDNUM: + fprintf(fp, "predefined-fsidnum"); + break; + case REEXP_REMOTE_DEVFSID: + fprintf(fp, "remote-devfsid"); + break; + default: + xlog(L_ERROR, "unknown reexport method %i", ep->e_reexport); + fprintf(fp, "none"); + } + fprintf(fp, ","); + } + if (ep->e_mountpoint) fprintf(fp, "mountpoint%s%s,", ep->e_mountpoint[0]?"=":"", ep->e_mountpoint); @@ -538,6 +560,7 @@ parseopts(char *cp, struct exportent *ep, int warn, int *had_subtree_opt_ptr) char *flname = efname?efname:"command line"; int flline = efp?efp->x_line:0; unsigned int active = 0; + int saw_reexport = 0; squids = ep->e_squids; nsquids = ep->e_nsquids; sqgids = ep->e_sqgids; nsqgids = ep->e_nsqgids; @@ -644,6 +667,13 @@ bad_option: } } else if (strncmp(opt, "fsid=", 5) == 0) { char *oe; + + if (saw_reexport) { + xlog(L_ERROR, "%s:%d: 'fsid=' has to be before 'reexport=' %s\n", + flname, flline, opt); + goto bad_option; + } + if (strcmp(opt+5, "root") == 0) { ep->e_fsid = 0; setflags(NFSEXP_FSID, active, ep); @@ -688,6 +718,49 @@ bad_option: active = parse_flavors(opt+4, ep); if (!active) goto bad_option; + } else if (strncmp(opt, "reexport=", 9) == 0) { +#ifdef HAVE_REEXPORT_SUPPORT + char *strategy = strchr(opt, '='); + + if (!strategy) { + xlog(L_ERROR, "%s:%d: bad option %s\n", + flname, flline, opt); + goto bad_option; + } + strategy++; + + if (saw_reexport) { + xlog(L_ERROR, "%s:%d: only one 'reexport=' is allowed %s\n", + flname, flline,
opt); + goto bad_option; + } + + if (strcmp(strategy, "auto-fsidnum") == 0) { + ep->e_reexport = REEXP_AUTO_FSIDNUM; + } else if (strcmp(strategy, "predefined-fsidnum") == 0) { + ep->e_reexport = REEXP_PREDEFINED_FSIDNUM; + } else if (strcmp(strategy, "remote-devfsid") == 0) { + ep->e_reexport = REEXP_REMOTE_DEVFSID; + } else if (strcmp(strategy, "none") == 0) { + ep->e_reexport = REEXP_NONE; + } else { + xlog(L_ERROR, "%s:%d: bad option %s\n", + flname, flline, strategy); + goto bad_option; + } + + if (reexpdb_apply_reexport_settings(ep, flname, flline) != 0) + goto bad_option; + + if (ep->e_fsid) + setflags(NFSEXP_FSID, active, ep); + + saw_reexport = 1; +#else + xlog(L_ERROR, "%s:%d: 'reexport=' not available, rebuild with --enable-reexport\n", + flname, flline); + goto bad_option; +#endif } else { xlog(L_ERROR, "%s:%d: unknown keyword \"%s\"\n", flname, flline, opt); diff --git a/utils/exportfs/Makefile.am b/utils/exportfs/Makefile.am index 96524c72..9eabef14 100644 --- a/utils/exportfs/Makefile.am +++ b/utils/exportfs/Makefile.am @@ -12,4 +12,8 @@ exportfs_LDADD = ../../support/export/libexport.a \ ../../support/misc/libmisc.a \ $(LIBWRAP) $(LIBNSL) $(LIBPTHREAD) +if CONFIG_REEXPORT +exportfs_LDADD += ../../support/reexport/libreexport.a $(LIBSQLITE) -lrt +endif + MAINTAINERCLEANFILES = Makefile.in diff --git a/utils/mount/Makefile.am b/utils/mount/Makefile.am index 3101f7ab..f4d5b182 100644 --- a/utils/mount/Makefile.am +++ b/utils/mount/Makefile.am @@ -32,6 +32,12 @@ mount_nfs_LDADD = ../../support/nfs/libnfs.la \ ../../support/misc/libmisc.a \ $(LIBTIRPC) +if CONFIG_REEXPORT +mount_nfs_LDADD += ../../support/reexport/libreexport.a \ + $(LIBSQLITE) -lrt $(LIBPTHREAD) +endif + + mount_nfs_SOURCES = $(mount_common) if CONFIG_LIBMOUNT From patchwork Thu Feb 17 13:15:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Weinberger X-Patchwork-Id: 12750034 Return-Path: 
From: Richard Weinberger To: linux-nfs@vger.kernel.org Cc: david@sigma-star.at, bfields@fieldses.org, luis.turcitu@appsbroker.com, david.young@appsbroker.com, david.oberhollenzer@sigma-star.at, trond.myklebust@hammerspace.com, anna.schumaker@netapp.com, chris.chilvers@appsbroker.com, Richard
Weinberger
Subject: [RFC PATCH 3/6] export: Implement logic behind reexport=
Date: Thu, 17 Feb 2022 14:15:28 +0100
Message-Id: <20220217131531.2890-4-richard@nod.at>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20220217131531.2890-1-richard@nod.at>
References: <20220217131531.2890-1-richard@nod.at>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-nfs@vger.kernel.org

This covers the cross mount case. When mountd/exportd detects a cross
mount on a re-exported NFS volume, an identifier has to be found to make
nfsd happy.

Signed-off-by: Richard Weinberger
---
 support/export/Makefile.am |   2 +
 support/export/cache.c     | 140 +++++++++++++++++++++++++++++++++----
 utils/exportd/Makefile.am  |   8 ++-
 utils/exportd/exportd.c    |   2 +
 utils/mountd/Makefile.am   |   6 ++
 5 files changed, 144 insertions(+), 14 deletions(-)

diff --git a/support/export/Makefile.am b/support/export/Makefile.am
index eec737f6..90109b1e 100644
--- a/support/export/Makefile.am
+++ b/support/export/Makefile.am
@@ -14,6 +14,8 @@ libexport_a_SOURCES = client.c export.c hostname.c \
 	xtab.c mount_clnt.c mount_xdr.c \
 	cache.c auth.c v4root.c fsloc.c \
 	v4clients.c
+libexport_a_CPPFLAGS = -I$(top_srcdir)/support/reexport
+
 BUILT_SOURCES = $(GENFILES)

 noinst_HEADERS = mount.h
diff --git a/support/export/cache.c b/support/export/cache.c
index a5823e92..6039745e 100644
--- a/support/export/cache.c
+++ b/support/export/cache.c
@@ -33,6 +33,7 @@
 #include "export.h"
 #include "pseudoflavors.h"
 #include "xcommon.h"
+#include "reexport.h"

 #ifdef HAVE_JUNCTION_SUPPORT
 #include "fsloc.h"
@@ -235,6 +236,16 @@ static void auth_unix_gid(int f)
 		xlog(L_ERROR, "auth_unix_gid: error writing reply");
 }

+static int match_crossmnt_fsidnum(uint32_t parsed_fsidnum, char *path)
+{
+	uint32_t fsidnum;
+
+	if (reexpdb_fsidnum_by_path(path, &fsidnum, 0) == 0)
+		return 0;
+
+	return fsidnum == parsed_fsidnum;
+}
+
 #ifdef USE_BLKID
 static const char *get_uuid_blkdev(char *path)
 {
@@ -331,7 +342,52 @@ static const unsigned long
nonblkid_filesystems[] = { 0 /* last */ }; -static int uuid_by_path(char *path, int type, size_t uuidlen, char *uuid) +static int get_uuid_from_fsid(char *path, char *uuid_str, size_t len) +{ + unsigned int min_dev, maj_dev, min_fsid, maj_fsid; + int rc, n, found = 0, header_seen = 0; + struct stat stb; + FILE *nfsfs_fd; + char line[128]; + + rc = nfsd_path_stat(path, &stb); + if (rc) { + xlog(L_WARNING, "Unable to stat %s", path); + return 0; + } + + nfsfs_fd = fopen("/proc/fs/nfsfs/volumes", "r"); + if (nfsfs_fd == NULL) { + xlog(L_WARNING, "Unable to open nfsfs volume file: %m"); + return 0; + } + + while (fgets(line, sizeof(line), nfsfs_fd) != NULL) { + if (!header_seen) { + header_seen = 1; + continue; + } + n = sscanf(line, "v%*u %*x %*u %u:%u %x:%x %*s", &maj_dev, + &min_dev, &maj_fsid, &min_fsid); + + if (n != 4) { + xlog(L_WARNING, "Unable to parse nfsfs volume line: %d, %s", n, line); + continue; + } + + if (makedev(maj_dev, min_dev) == stb.st_dev) { + found = 1; + snprintf(uuid_str, len, "%08x%08x", maj_fsid, min_fsid); + break; + } + } + + fclose(nfsfs_fd); + + return found; +} + +static int uuid_by_path(struct exportent *exp, char *path, int type, size_t uuidlen, char *uuid) { /* get a uuid for the filesystem found at 'path'. 
* There are several possible ways of generating the @@ -362,7 +418,7 @@ static int uuid_by_path(char *path, int type, size_t uuidlen, char *uuid) */ struct statfs64 st; char fsid_val[17]; - const char *blkid_val = NULL; + const char *fsuuid_val = NULL; const char *val; int rc; @@ -375,7 +431,20 @@ static int uuid_by_path(char *path, int type, size_t uuidlen, char *uuid) break; } if (*bad == 0) - blkid_val = get_uuid_blkdev(path); + fsuuid_val = get_uuid_blkdev(path); + else if (exp->e_reexport == REEXP_REMOTE_DEVFSID && + *bad == 0x6969 /* NFS_SUPER_MAGIC */) { + char tmp[17]; + int ret = get_uuid_from_fsid(path, tmp, sizeof(tmp)); + + if (ret < 0) { + xlog(L_WARNING, "Unable to read nfsfs volume file: %i", ret); + } else if (ret == 0) { + xlog(L_WARNING, "Unable to find nfsfs volume entry for %s", path); + } else { + fsuuid_val = tmp; + } + } } if (rc == 0 && @@ -385,8 +454,8 @@ static int uuid_by_path(char *path, int type, size_t uuidlen, char *uuid) else fsid_val[0] = 0; - if (blkid_val && (type--) == 0) - val = blkid_val; + if (fsuuid_val && (type--) == 0) + val = fsuuid_val; else if (fsid_val[0] && (type--) == 0) val = fsid_val; else @@ -684,8 +753,13 @@ static int match_fsid(struct parsed_fsid *parsed, nfs_export *exp, char *path) goto match; case FSID_NUM: if (((exp->m_export.e_flags & NFSEXP_FSID) == 0 || - exp->m_export.e_fsid != parsed->fsidnum)) + exp->m_export.e_fsid != parsed->fsidnum)) { + if (exp->m_export.e_flags & NFSEXP_CROSSMOUNT && + match_crossmnt_fsidnum(parsed->fsidnum, path)) + goto match; + goto nomatch; + } goto match; case FSID_UUID4_INUM: case FSID_UUID16_INUM: @@ -708,7 +782,7 @@ static int match_fsid(struct parsed_fsid *parsed, nfs_export *exp, char *path) } else for (type = 0; - uuid_by_path(path, type, parsed->uuidlen, u); + uuid_by_path(&exp->m_export, path, type, parsed->uuidlen, u); type++) if (memcmp(u, parsed->fhuuid, parsed->uuidlen) == 0) goto match; @@ -932,7 +1006,7 @@ static void write_fsloc(char **bp, int *blen, struct 
exportent *ep) release_replicas(servers); } #endif -static void write_secinfo(char **bp, int *blen, struct exportent *ep, int flag_mask) +static void write_secinfo(char **bp, int *blen, struct exportent *ep, int flag_mask, int extra_flag) { struct sec_entry *p; @@ -947,11 +1021,20 @@ static void write_secinfo(char **bp, int *blen, struct exportent *ep, int flag_m qword_addint(bp, blen, p - ep->e_secinfo); for (p = ep->e_secinfo; p->flav; p++) { qword_addint(bp, blen, p->flav->fnum); - qword_addint(bp, blen, p->flags & flag_mask); + qword_addint(bp, blen, (p->flags | extra_flag) & flag_mask); } } +static int can_reexport_via_fsidnum(struct exportent *exp, struct statfs64 *st) +{ + if (st->f_type != 0x6969 /* NFS_SUPER_MAGIC */) + return 0; + + return exp->e_reexport == REEXP_PREDEFINED_FSIDNUM || + exp->e_reexport == REEXP_AUTO_FSIDNUM; +} + static int dump_to_cache(int f, char *buf, int blen, char *domain, char *path, struct exportent *exp, int ttl) { @@ -968,21 +1051,52 @@ static int dump_to_cache(int f, char *buf, int blen, char *domain, if (exp) { int different_fs = strcmp(path, exp->e_path) != 0; int flag_mask = different_fs ? 
~NFSEXP_FSID : ~0; + int rc, do_fsidnum = 0; + uint32_t fsidnum = exp->e_fsid; + + if (different_fs) { + struct statfs64 st; + + rc = nfsd_path_statfs64(path, &st); + if (rc) { + xlog(L_WARNING, "unable to statfs %s", path); + errno = EINVAL; + return -1; + } + + if (can_reexport_via_fsidnum(exp, &st)) { + do_fsidnum = 1; + flag_mask = ~0; + } + } qword_adduint(&bp, &blen, now + exp->e_ttl); - qword_addint(&bp, &blen, exp->e_flags & flag_mask); + + if (do_fsidnum) { + uint32_t search_fsidnum = 0; + if (reexpdb_fsidnum_by_path(path, &search_fsidnum, + exp->e_reexport == REEXP_AUTO_FSIDNUM) == 0) { + errno = EINVAL; + return -1; + } + fsidnum = search_fsidnum; + qword_addint(&bp, &blen, exp->e_flags | NFSEXP_FSID); + } else { + qword_addint(&bp, &blen, exp->e_flags & flag_mask); + } + qword_addint(&bp, &blen, exp->e_anonuid); qword_addint(&bp, &blen, exp->e_anongid); - qword_addint(&bp, &blen, exp->e_fsid); + qword_addint(&bp, &blen, fsidnum); #ifdef HAVE_JUNCTION_SUPPORT write_fsloc(&bp, &blen, exp); #endif - write_secinfo(&bp, &blen, exp, flag_mask); + write_secinfo(&bp, &blen, exp, flag_mask, do_fsidnum ? 
NFSEXP_FSID : 0); if (exp->e_uuid == NULL || different_fs) { char u[16]; if ((exp->e_flags & flag_mask & NFSEXP_FSID) == 0 && - uuid_by_path(path, 0, 16, u)) { + uuid_by_path(exp, path, 0, 16, u)) { qword_add(&bp, &blen, "uuid"); qword_addhex(&bp, &blen, u, 16); } diff --git a/utils/exportd/Makefile.am b/utils/exportd/Makefile.am index c95bdee7..b0ec9034 100644 --- a/utils/exportd/Makefile.am +++ b/utils/exportd/Makefile.am @@ -16,11 +16,17 @@ exportd_SOURCES = exportd.c exportd_LDADD = ../../support/export/libexport.a \ ../../support/nfs/libnfs.la \ ../../support/misc/libmisc.a \ - $(OPTLIBS) $(LIBBLKID) $(LIBPTHREAD) -luuid + $(OPTLIBS) $(LIBBLKID) $(LIBPTHREAD) \ + -luuid +if CONFIG_REEXPORT +exportd_LDADD += ../../support/reexport/libreexport.a $(LIBSQLITE) -lrt +endif exportd_CPPFLAGS = $(AM_CPPFLAGS) $(CPPFLAGS) \ -I$(top_srcdir)/support/export +exportd_CPPFLAGS += -I$(top_srcdir)/support/reexport + MAINTAINERCLEANFILES = Makefile.in ####################################################################### diff --git a/utils/exportd/exportd.c b/utils/exportd/exportd.c index 2dd12cb6..4ddfed35 100644 --- a/utils/exportd/exportd.c +++ b/utils/exportd/exportd.c @@ -22,6 +22,7 @@ #include "conffile.h" #include "exportfs.h" #include "export.h" +#include "reexport.h" extern void my_svc_run(void); @@ -296,6 +297,7 @@ main(int argc, char **argv) /* Open files now to avoid sharing descriptors among forked processes */ cache_open(); v4clients_init(); + reexpdb_init(); /* Process incoming upcalls */ cache_process_loop(); diff --git a/utils/mountd/Makefile.am b/utils/mountd/Makefile.am index 13b25c90..569d335a 100644 --- a/utils/mountd/Makefile.am +++ b/utils/mountd/Makefile.am @@ -20,10 +20,16 @@ mountd_LDADD = ../../support/export/libexport.a \ $(OPTLIBS) \ $(LIBBSD) $(LIBWRAP) $(LIBNSL) $(LIBBLKID) -luuid $(LIBTIRPC) \ $(LIBPTHREAD) +if CONFIG_REEXPORT +mountd_LDADD += ../../support/reexport/libreexport.a $(LIBSQLITE) -lrt +endif + mountd_CPPFLAGS = $(AM_CPPFLAGS) 
$(CPPFLAGS) \ -I$(top_builddir)/support/include \ -I$(top_srcdir)/support/export +mountd_CPPFLAGS += -I$(top_srcdir)/support/reexport + MAINTAINERCLEANFILES = Makefile.in ####################################################################### From patchwork Thu Feb 17 13:15:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Weinberger X-Patchwork-Id: 12750032 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3233C433F5 for ; Thu, 17 Feb 2022 13:22:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240881AbiBQNXK (ORCPT ); Thu, 17 Feb 2022 08:23:10 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:50992 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240666AbiBQNXJ (ORCPT ); Thu, 17 Feb 2022 08:23:09 -0500 X-Greylist: delayed 370 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Thu, 17 Feb 2022 05:22:53 PST Received: from lithops.sigma-star.at (lithops.sigma-star.at [195.201.40.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA26B996AE for ; Thu, 17 Feb 2022 05:22:52 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by lithops.sigma-star.at (Postfix) with ESMTP id E69A160F6B69; Thu, 17 Feb 2022 14:16:40 +0100 (CET) Received: from lithops.sigma-star.at ([127.0.0.1]) by localhost (lithops.sigma-star.at [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id lk-DzrSJFwxk; Thu, 17 Feb 2022 14:16:40 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by lithops.sigma-star.at (Postfix) with ESMTP id 0D132608A38A; Thu, 17 Feb 2022 14:16:40 +0100 (CET) Received: from lithops.sigma-star.at ([127.0.0.1]) by localhost (lithops.sigma-star.at [127.0.0.1]) (amavisd-new, port 10026) with 
ESMTP id V0GTcbLbDGXO; Thu, 17 Feb 2022 14:16:39 +0100 (CET) Received: from blindfold.corp.sigma-star.at (213-47-184-186.cable.dynamic.surfer.at [213.47.184.186]) by lithops.sigma-star.at (Postfix) with ESMTPSA id 71E23605DED6; Thu, 17 Feb 2022 14:16:39 +0100 (CET) From: Richard Weinberger To: linux-nfs@vger.kernel.org Cc: david@sigma-star.at, bfields@fieldses.org, luis.turcitu@appsbroker.com, david.young@appsbroker.com, david.oberhollenzer@sigma-star.at, trond.myklebust@hammerspace.com, anna.schumaker@netapp.com, chris.chilvers@appsbroker.com, Richard Weinberger Subject: [RFC PATCH 4/6] export: Record mounted volumes Date: Thu, 17 Feb 2022 14:15:29 +0100 Message-Id: <20220217131531.2890-5-richard@nod.at> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20220217131531.2890-1-richard@nod.at> References: <20220217131531.2890-1-richard@nod.at> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org As soon as a client mounts a volume, record it in the database to be able to uncover NFS subvolumes after a reboot. Signed-off-by: Richard Weinberger --- support/export/cache.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/support/export/cache.c b/support/export/cache.c index 6039745e..b5763b1d 100644 --- a/support/export/cache.c +++ b/support/export/cache.c @@ -967,8 +967,10 @@ static void nfsd_fh(int f) * line.
*/ qword_addint(&bp, &blen, 0x7fffffff); - if (found) + if (found) { + reexpdb_add_subvolume(found_path); qword_add(&bp, &blen, found_path); + } qword_addeol(&bp, &blen); if (blen <= 0 || cache_write(f, buf, bp - buf) != bp - buf) xlog(L_ERROR, "nfsd_fh: error writing reply"); From patchwork Thu Feb 17 13:15:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Weinberger X-Patchwork-Id: 12750036 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4CD7C433F5 for ; Thu, 17 Feb 2022 13:23:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240885AbiBQNXN (ORCPT ); Thu, 17 Feb 2022 08:23:13 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:51266 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240882AbiBQNXM (ORCPT ); Thu, 17 Feb 2022 08:23:12 -0500 Received: from lithops.sigma-star.at (lithops.sigma-star.at [195.201.40.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0190BAEF12 for ; Thu, 17 Feb 2022 05:22:56 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by lithops.sigma-star.at (Postfix) with ESMTP id 2125560F6B8E; Thu, 17 Feb 2022 14:16:41 +0100 (CET) Received: from lithops.sigma-star.at ([127.0.0.1]) by localhost (lithops.sigma-star.at [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 7AdH86T1LhPA; Thu, 17 Feb 2022 14:16:40 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by lithops.sigma-star.at (Postfix) with ESMTP id 7AE3860D482C; Thu, 17 Feb 2022 14:16:40 +0100 (CET) Received: from lithops.sigma-star.at ([127.0.0.1]) by localhost (lithops.sigma-star.at [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id sDOCmXgzzmK9; Thu, 17 Feb 2022 14:16:40 +0100 (CET) Received: from 
blindfold.corp.sigma-star.at (213-47-184-186.cable.dynamic.surfer.at [213.47.184.186]) by lithops.sigma-star.at (Postfix) with ESMTPSA id E43C9605DEBB; Thu, 17 Feb 2022 14:16:39 +0100 (CET) From: Richard Weinberger To: linux-nfs@vger.kernel.org Cc: david@sigma-star.at, bfields@fieldses.org, luis.turcitu@appsbroker.com, david.young@appsbroker.com, david.oberhollenzer@sigma-star.at, trond.myklebust@hammerspace.com, anna.schumaker@netapp.com, chris.chilvers@appsbroker.com, Richard Weinberger Subject: [RFC PATCH 5/6] nfsd: statfs() every known subvolume upon start Date: Thu, 17 Feb 2022 14:15:30 +0100 Message-Id: <20220217131531.2890-6-richard@nod.at> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20220217131531.2890-1-richard@nod.at> References: <20220217131531.2890-1-richard@nod.at> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org This will trigger an automount of a subvolume and existing file handles will continue to work. Signed-off-by: Richard Weinberger --- utils/nfsd/Makefile.am | 6 ++++++ utils/nfsd/nfsd.c | 10 ++++++++++ 2 files changed, 16 insertions(+) diff --git a/utils/nfsd/Makefile.am b/utils/nfsd/Makefile.am index 8acc9a04..3acc8354 100644 --- a/utils/nfsd/Makefile.am +++ b/utils/nfsd/Makefile.am @@ -11,6 +11,12 @@ noinst_HEADERS = nfssvc.h nfsd_SOURCES = nfsd.c nfssvc.c nfsd_LDADD = ../../support/nfs/libnfs.la $(LIBTIRPC) +if CONFIG_REEXPORT +nfsd_LDADD += ../../support/reexport/libreexport.a $(LIBSQLITE) $(LIBPTHREAD) -lrt +endif + +nfsd_CPPFLAGS = -I$(top_srcdir)/support/reexport + MAINTAINERCLEANFILES = Makefile.in ####################################################################### diff --git a/utils/nfsd/nfsd.c b/utils/nfsd/nfsd.c index b0741718..b5175f7a 100644 --- a/utils/nfsd/nfsd.c +++ b/utils/nfsd/nfsd.c @@ -29,6 +29,7 @@ #include "nfssvc.h" #include "xlog.h" #include "xcommon.h" +#include "reexport.h" #ifndef NFSD_NPROC #define NFSD_NPROC 8 @@ -347,6 +348,15 @@ main(int argc, char **argv) exit(1); } + /* 
+ * Make sure that uncovered NFS subvolumes are present such that + * existing file handles continue working. + */ + if (reexpdb_init() == 0) { + reexpdb_uncover_subvolumes(NULL); + reexpdb_destroy(); + } + /* make sure nfsdfs is mounted if it's available */ nfssvc_mount_nfsdfs(progname); From patchwork Thu Feb 17 13:15:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Weinberger X-Patchwork-Id: 12750035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7379CC433EF for ; Thu, 17 Feb 2022 13:22:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238854AbiBQNXL (ORCPT ); Thu, 17 Feb 2022 08:23:11 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:51030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230153AbiBQNXJ (ORCPT ); Thu, 17 Feb 2022 08:23:09 -0500 Received: from lithops.sigma-star.at (lithops.sigma-star.at [195.201.40.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA54399ECE for ; Thu, 17 Feb 2022 05:22:52 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by lithops.sigma-star.at (Postfix) with ESMTP id A7713605DEBB; Thu, 17 Feb 2022 14:16:41 +0100 (CET) Received: from lithops.sigma-star.at ([127.0.0.1]) by localhost (lithops.sigma-star.at [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id epkojYBpEXMZ; Thu, 17 Feb 2022 14:16:41 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by lithops.sigma-star.at (Postfix) with ESMTP id 030CD60765A6; Thu, 17 Feb 2022 14:16:41 +0100 (CET) Received: from lithops.sigma-star.at ([127.0.0.1]) by localhost (lithops.sigma-star.at [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id u5LINtTGGjJ2; Thu, 17 Feb 2022 14:16:40 +0100 (CET) 
Received: from blindfold.corp.sigma-star.at (213-47-184-186.cable.dynamic.surfer.at [213.47.184.186]) by lithops.sigma-star.at (Postfix) with ESMTPSA id 5D644605DED6; Thu, 17 Feb 2022 14:16:40 +0100 (CET) From: Richard Weinberger To: linux-nfs@vger.kernel.org Cc: david@sigma-star.at, bfields@fieldses.org, luis.turcitu@appsbroker.com, david.young@appsbroker.com, david.oberhollenzer@sigma-star.at, trond.myklebust@hammerspace.com, anna.schumaker@netapp.com, chris.chilvers@appsbroker.com, Richard Weinberger Subject: [RFC PATCH 6/6] export: Garbage collect orphaned subvolumes upon start Date: Thu, 17 Feb 2022 14:15:31 +0100 Message-Id: <20220217131531.2890-7-richard@nod.at> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20220217131531.2890-1-richard@nod.at> References: <20220217131531.2890-1-richard@nod.at> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org Make sure the database contains no orphaned subvolumes. We have to be careful. Signed-off-by: Richard Weinberger --- support/export/cache.c | 97 +++++++++++++++++++++++++++++++++++++++++ support/export/export.h | 3 ++ utils/exportd/exportd.c | 17 +++++++- utils/mountd/mountd.c | 1 + utils/mountd/svc_run.c | 18 ++++++++ 5 files changed, 135 insertions(+), 1 deletion(-) diff --git a/support/export/cache.c b/support/export/cache.c index b5763b1d..94a0d79a 100644 --- a/support/export/cache.c +++ b/support/export/cache.c @@ -1181,6 +1181,103 @@ lookup_export(char *dom, char *path, struct addrinfo *ai) return found; } +static char *get_export_path(char *path) +{ + int i; + nfs_export *exp; + nfs_export *found = NULL; + + for (i = 0; i < MCL_MAXTYPES; i++) { + for (exp = exportlist[i].p_head; exp; exp = exp->m_next) { + if (!path_matches(exp, path)) + continue; + + if (!found) { + found = exp; + continue; + } + + /* Always prefer non-V4ROOT exports */ + if (exp->m_export.e_flags & NFSEXP_V4ROOT) + continue; + if (found->m_export.e_flags & NFSEXP_V4ROOT) { + found = exp; + continue; + } + + 
/* If one is a CROSSMOUNT, then prefer the longest path */ + if (((found->m_export.e_flags & NFSEXP_CROSSMOUNT) || + (exp->m_export.e_flags & NFSEXP_CROSSMOUNT)) && + strlen(found->m_export.e_path) != + strlen(exp->m_export.e_path)) { + + if (strlen(exp->m_export.e_path) > + strlen(found->m_export.e_path)) { + found = exp; + } + continue; + } + } + } + + if (found) + return found->m_export.e_path; + else + return NULL; +} + +int export_subvol_orphaned(char *path) +{ + struct statfs st, stp; + char *path_parent; + int ret; + + path_parent = get_export_path(path); + if (!path_parent) + /* + * Path has no parent in export list. + * Must be orphaned. + */ + return 1; + + ret = statfs(path_parent, &stp); + if (ret == -1) + /* + * Parent path is not statfs'able. Maybe not yet mounted? + * Can't be sure, don't treat path as orphaned. + */ + return 0; + + if (strcmp(path_parent, path) == 0) + /* + * This is not a subvolume, it is listed in exports. + * No need to keep track of it. + */ + return 1; + + if (stp.f_type != 0x6969 /* NFS_SUPER_MAGIC */) + /* + * Parent is not an NFS mount. Maybe not yet mounted? + * Can't be sure either. + */ + return 0; + + ret = statfs(path, &st); + if (ret == -1) { + if (errno == ENOENT) + /* + * Parent is an NFS mount but path is gone. + * Must be orphaned. + */ + return 1; + } + + /* + * For all remaining cases we can't be sure either.
+ */ + return 0; +} + #ifdef HAVE_JUNCTION_SUPPORT #include diff --git a/support/export/export.h b/support/export/export.h index 8d5a0d30..45dd3da4 100644 --- a/support/export/export.h +++ b/support/export/export.h @@ -38,4 +38,7 @@ static inline bool is_ipaddr_client(char *dom) { return dom[0] == '$'; } + +int export_subvol_orphaned(char *path); + #endif /* EXPORT__H */ diff --git a/utils/exportd/exportd.c b/utils/exportd/exportd.c index 4ddfed35..6dc51a32 100644 --- a/utils/exportd/exportd.c +++ b/utils/exportd/exportd.c @@ -208,6 +208,12 @@ read_exportd_conf(char *progname, char **argv) default_ttl = ttl; } +static void subvol_cb(char *path) +{ + if (export_subvol_orphaned(path)) + reexpdb_drop_subvolume_unlocked(path); +} + int main(int argc, char **argv) { @@ -297,7 +303,16 @@ main(int argc, char **argv) /* Open files now to avoid sharing descriptors among forked processes */ cache_open(); v4clients_init(); - reexpdb_init(); + if (reexpdb_init() != 0) { + xlog(L_ERROR, "%s: Failed to init reexport database", __func__); + exit(1); + } + + /* + * Load exports into memory and garbage collect orphaned subvolumes. + */ + auth_reload(); + reexpdb_uncover_subvolumes(subvol_cb); /* Process incoming upcalls */ cache_process_loop(); diff --git a/utils/mountd/mountd.c b/utils/mountd/mountd.c index bcf749fa..8555d746 100644 --- a/utils/mountd/mountd.c +++ b/utils/mountd/mountd.c @@ -32,6 +32,7 @@ #include "nfsd_path.h" #include "nfslib.h" #include "export.h" +#include "reexport.h" extern void my_svc_run(void); diff --git a/utils/mountd/svc_run.c b/utils/mountd/svc_run.c index 167b9757..9a891ff0 100644 --- a/utils/mountd/svc_run.c +++ b/utils/mountd/svc_run.c @@ -57,6 +57,7 @@ #include #endif #include "export.h" +#include "reexport.h" void my_svc_run(void); @@ -87,6 +88,12 @@ my_svc_getreqset (fd_set *readfds) #endif +static void subvol_cb(char *path) +{ + if (export_subvol_orphaned(path)) + reexpdb_drop_subvolume_unlocked(path); +} + /* * The heart of the server. 
A crib from libc for the most part... */ @@ -96,6 +103,17 @@ my_svc_run(void) fd_set readfds; int selret; + if (reexpdb_init() != 0) { + xlog(L_ERROR, "%s: Failed to init reexport database", __func__); + return; + } + + /* + * Load exports into memory and garbage collect orphaned subvolumes. + */ + auth_reload(); + reexpdb_uncover_subvolumes(subvol_cb); + for (;;) { readfds = svc_fdset;