From patchwork Wed Apr 20 20:43:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Hostetler X-Patchwork-Id: 12820812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4CA26C43217 for ; Wed, 20 Apr 2022 20:44:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1382201AbiDTUr3 (ORCPT ); Wed, 20 Apr 2022 16:47:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58966 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1382165AbiDTUrK (ORCPT ); Wed, 20 Apr 2022 16:47:10 -0400 Received: from mail-wm1-x32c.google.com (mail-wm1-x32c.google.com [IPv6:2a00:1450:4864:20::32c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7842E45AE2 for ; Wed, 20 Apr 2022 13:43:54 -0700 (PDT) Received: by mail-wm1-x32c.google.com with SMTP id o20-20020a05600c511400b0038ebbbb2ad8so2054067wms.0 for ; Wed, 20 Apr 2022 13:43:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=bIorACD9MvGhf+EOOe9S7KEedrit1duOzRZ5v8bUeZQ=; b=aDZm//1/KlbHB3RTSchqdsIIZImLw/3+wjUb/0CxQmBEeFO5gyOHpbjH1GRw7bhNF3 fhpzm7mezgbHL023CvbgrYaf4XyYbsmshvjv7oiujpqTEvVrMgoOFJsKoofVEuyxzoCt iHvUNUhb57c0nQweVoRi00iAJ8SSJIhO9H+vYW1lImzRDkDlCxZVi+QSRDrSsUVyXO0m K2L70IYme7O0tUBrn3KRQjFJzuj8jfGjYvuC/g4uuPJ+3Pjhi5s4J6vJRQIPYRW18s67 55u15NT26IrtQSKR5EcGtHegF4XSNws0g8NCpjzqdDMkrjJPaX4BLLzz/iRkrdAIthsZ 4l8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=bIorACD9MvGhf+EOOe9S7KEedrit1duOzRZ5v8bUeZQ=; b=V9U8NwQiiRfd8viVja9YCLVZVvV6Q8f4BbMQbm9HsC6NbHsG7sXNDCBqlfyBw9JA2h ZjXzllef6ppzRxdsd999wRALG8oTJv8qx8dBXOSMIwGcJaL+oLI00Ri0z+MWHA9cxj6i GkpeXYv14flCzlOXgjibthup3VcHx6HQgOEojR/elQjoE7dDA3ZUzzrEBjB8JvJYwivX FbcRWSHNyAK7O5MREROtuLNfi0sNHeiu5X8Z7/Co/i28le67M/EuwTw8RGqIJ8wGRG7J GReuy0Q9xAU6D29Oo6akASc+i+5XyqHV9CPbCUFILVTnX42GjxCptVxAGrHSM8Cuy3vl Ww3g== X-Gm-Message-State: AOAM533mTSbiqGk9y7G7ko9/sE8ePBwdDp2SKvgw7ouOOyPQjX13MuBf wJEGCBj25k5Parw/9IPl3JH8mhdum5k= X-Google-Smtp-Source: ABdhPJyUAIZXh46fdZln0Kn1eqprnV7e4zoldw/jh3hvxnpb46xd46OjIa1gF1sINjEp4I6cgYZacw== X-Received: by 2002:a7b:cd04:0:b0:38e:d7a4:1548 with SMTP id f4-20020a7bcd04000000b0038ed7a41548mr5277065wmj.73.1650487432699; Wed, 20 Apr 2022 13:43:52 -0700 (PDT) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id p1-20020a1c7401000000b0038ed3bb00c9sm342825wmc.6.2022.04.20.13.43.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Apr 2022 13:43:52 -0700 (PDT) Message-Id: In-Reply-To: References: Date: Wed, 20 Apr 2022 20:43:15 +0000 Subject: [PATCH v5 25/28] fsmonitor: on macOS also emit NFC spelling for NFD pathname Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Jeff Hostetler , Derrick Stolee , =?utf-8?b?w4Z2YXIgQXJuZmrDtnI=?= =?utf-8?b?w7A=?= Bjarmason , Torsten =?unknown-8bit?q?B?= =?unknown-8bit?q?=C3=B6gershausen?= , rsbecker@nexbridge.com, Bagas Sanjaya , Jeff Hostetler , Jeff Hostetler Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jeff Hostetler From: Jeff Hostetler Emit NFC or NFC and NFD spellings of pathnames on macOS. MacOS is Unicode composition insensitive, so NFC and NFD spellings are treated as aliases and collide. While the spelling of pathnames in filesystem events depends upon the underlying filesystem, such as APFS, HFS+ or FAT32, the OS enforces such collisions regardless of filesystem. Teach the daemon to always report the NFC spelling and to report the NFD spelling when stored in that format on the disk. This is slightly more general than "core.precomposeUnicode". Signed-off-by: Jeff Hostetler --- compat/fsmonitor/fsm-listen-darwin.c | 33 ++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/compat/fsmonitor/fsm-listen-darwin.c b/compat/fsmonitor/fsm-listen-darwin.c index 83d38e8ac6c..823cf63999e 100644 --- a/compat/fsmonitor/fsm-listen-darwin.c +++ b/compat/fsmonitor/fsm-listen-darwin.c @@ -155,6 +155,35 @@ static int ef_ignore_xattr(const FSEventStreamEventFlags ef) return ((ef & mask) == kFSEventStreamEventFlagItemXattrMod); } +/* + * On MacOS we have to adjust for Unicode composition insensitivity + * (where NFC and NFD spellings are not respected). The different + * spellings are essentially aliases regardless of how the path is + * actually stored on the disk. + * + * This is related to "core.precomposeUnicode" (which wants to try + * to hide NFD completely and treat everything as NFC). Here, we + * don't know what the value the client has (or will have) for this + * config setting when they make a query, so assume the worst and + * emit both when the OS gives us an NFD path. + */ +static void my_add_path(struct fsmonitor_batch *batch, const char *path) +{ + char *composed; + + /* add the NFC or NFD path as received from the OS */ + fsmonitor_batch__add_path(batch, path); + + /* if NFD, also add the corresponding NFC spelling */ + composed = (char *)precompose_string_if_needed(path); + if (!composed || composed == path) + return; + + fsmonitor_batch__add_path(batch, composed); + free(composed); +} + + static void fsevent_callback(ConstFSEventStreamRef streamRef, void *ctx, size_t num_of_events, @@ -305,7 +334,7 @@ static void fsevent_callback(ConstFSEventStreamRef streamRef, if (!batch) batch = fsmonitor_batch__new(); - fsmonitor_batch__add_path(batch, rel); + my_add_path(batch, rel); } if (event_flags[k] & kFSEventStreamEventFlagItemIsDir) { @@ -318,7 +347,7 @@ static void fsevent_callback(ConstFSEventStreamRef streamRef, if (!batch) batch = fsmonitor_batch__new(); - fsmonitor_batch__add_path(batch, tmp.buf); + my_add_path(batch, tmp.buf); } break;