Message ID | cover.1679328580.git.phillip.wood@dunelm.org.uk (mailing list archive) |
---|---|
Series | wildmatch: fix exponential behavior |
Phillip Wood <phillip.wood123@gmail.com> writes:

> This series is based on maint. Unfortunately it conflicts with
> my/wildmatch-cleanups when merged with seen. There are semantic
> conflicts with the removal of dowild() in e303cf8092 (wildmatch:
> more cleanups after killing uchar, 2023-02-26) as well as textual
> conflicts around the change of uchar->char.

Thanks. What's not in 'next' are fair game to break and force
reroll ;-)

> Phillip Wood (3):
>   wildmatch: fix exponential behavior
>   wildmatch: avoid undefined behavior
>   wildmatch: hide internal return values

On 3/20/2023 12:09 PM, Phillip Wood wrote:
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> The wildmatch implementation in git suffers from exponential behavior as
> described in [1] where the time taken for a failing match is exponential
> in the number of wildcards it contains. The original implementation
> imported from rsync is immune but the optimizations introduced by [2,3]
> failed to prevent unnecessary backtracking when handling '*' and '/**/'.
>
> This bug was discussed on the security list and the conclusion was
> that it only affects operations that are already potential DoS vectors.
>
> In the long term it would be nice to get rid of the recursion in the
> wildmatch() code but the patches here focus on a minimal fix.

Thanks for these changes. The patches look good to me.

I particularly appreciate that there is a regression test to avoid
this accidentally happening again in the future. The two second
timeout is a reasonable balance between "not taking too long" and
"will not be flaky, assuming the code is correct". I could imagine
that it might _pass_ unexpectedly if it runs on fast-enough hardware,
but that's not a huge concern right now. CI machines are not normally
significantly more powerful than a typical developer machine.

Thanks,
-Stolee

Hi Stolee

On 23/03/2023 14:19, Derrick Stolee wrote:
> On 3/20/2023 12:09 PM, Phillip Wood wrote:
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> The wildmatch implementation in git suffers from exponential behavior as
>> described in [1] where the time taken for a failing match is exponential
>> in the number of wildcards it contains. The original implementation
>> imported from rsync is immune but the optimizations introduced by [2,3]
>> failed to prevent unnecessary backtracking when handling '*' and '/**/'.
>>
>> This bug was discussed on the security list and the conclusion was
>> that it only affects operations that are already potential DoS vectors.
>>
>> In the long term it would be nice to get rid of the recursion in the
>> wildmatch() code but the patches here focus on a minimal fix.
>
> Thanks for these changes. The patches look good to me.
>
> I particularly appreciate that there is a regression test to avoid
> this accidentally happening again in the future. The two second
> timeout is a reasonable balance between "not taking too long" and
> "will not be flaky, assuming the code is correct". I could imagine
> that it might _pass_ unexpectedly if it runs on fast-enough hardware,
> but that's not a huge concern right now. CI machines are not normally
> significantly more powerful than a typical developer machine.

Thanks for taking the time to look at these again and for prompting me
to add the regression test in the first place.

Best Wishes

Phillip

> Thanks,
> -Stolee

From: Phillip Wood <phillip.wood@dunelm.org.uk>

The wildmatch implementation in git suffers from exponential behavior as
described in [1] where the time taken for a failing match is exponential
in the number of wildcards it contains. The original implementation
imported from rsync is immune but the optimizations introduced by [2,3]
failed to prevent unnecessary backtracking when handling '*' and '/**/'.

This bug was discussed on the security list and the conclusion was
that it only affects operations that are already potential DoS vectors.

In the long term it would be nice to get rid of the recursion in the
wildmatch() code but the patches here focus on a minimal fix.

This series is based on maint. Unfortunately it conflicts with
my/wildmatch-cleanups when merged with seen. There are semantic
conflicts with the removal of dowild() in e303cf8092 (wildmatch:
more cleanups after killing uchar, 2023-02-26) as well as textual
conflicts around the change of uchar->char.

[1] https://research.swtch.com/glob
[2] 6f1a31f0aa (wildmatch: advance faster in <asterisk> + <literal>
    patterns, 2013-01-01)
[3] 46983441ae (wildmatch: make a special case for "*/" with
    FNM_PATHNAME, 2013-01-01)

Published-As: https://github.com/phillipwood/git/releases/tag/wildmatch-fixes%2Fv1
View-Changes-At: https://github.com/phillipwood/git/compare/73876f486...a74ab7138
Fetch-It-Via: git fetch https://github.com/phillipwood/git wildmatch-fixes/v1

Phillip Wood (3):
  wildmatch: fix exponential behavior
  wildmatch: avoid undefined behavior
  wildmatch: hide internal return values

 t/t3070-wildmatch.sh |  9 +++++++++
 wildmatch.c          | 23 ++++++++++++++++-------
 wildmatch.h          |  2 --
 3 files changed, 25 insertions(+), 9 deletions(-)
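
For readers who want to see the class of problem the cover letter describes, here is a minimal sketch. It is not the code from wildmatch.c or from these patches. It is a toy matcher that handles only literals and '*', and all of its names (match(), compare_modes(), MATCH, NOMATCH, ABORT_ALL, the steps counter) are invented for illustration. It assumes the fix is in the spirit of the "abort" return value used by rsync-style matchers: once a recursive attempt runs out of text, no enclosing '*' can succeed by retrying, so the failure is propagated all the way up instead of being retried at every level.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch only. */
enum { MATCH = 0, NOMATCH = 1, ABORT_ALL = -1 };

/* Counts calls to match() so the two modes can be compared. */
static unsigned long steps;

/*
 * Match text t against pattern p, where p contains literals and '*'.
 * With use_abort set, running out of text is reported as ABORT_ALL,
 * which makes every enclosing '*' loop stop retrying.
 */
static int match(const char *p, const char *t, int use_abort)
{
	steps++;
	for (; *p; p++, t++) {
		if (*p == '*') {
			while (*++p == '*')
				; /* collapse runs of '*' */
			if (!*p)
				return MATCH; /* trailing '*' matches the rest */
			for (; *t; t++) {
				int r = match(p, t, use_abort);
				if (r != NOMATCH)
					return r; /* MATCH or ABORT_ALL */
			}
			/* Text exhausted: retrying further out cannot help. */
			return use_abort ? ABORT_ALL : NOMATCH;
		}
		if (!*t)
			return use_abort ? ABORT_ALL : NOMATCH;
		if (*p != *t)
			return NOMATCH;
	}
	return *t ? NOMATCH : MATCH;
}

/*
 * Run the matcher in both modes, store the call counts in the out
 * parameters, and return 1 on a match, 0 otherwise.
 */
static int compare_modes(const char *p, const char *t,
			 unsigned long *naive_steps,
			 unsigned long *abort_steps)
{
	int naive, pruned;

	steps = 0;
	naive = match(p, t, 0) == MATCH;
	*naive_steps = steps;

	steps = 0;
	pruned = match(p, t, 1) == MATCH;
	*abort_steps = steps;

	assert(naive == pruned); /* pruning must not change the answer */
	return pruned;
}
```

Calling compare_modes() on a failing match such as pattern "a*a*a*a*a*a*b" against a run of 'a' characters shows the naive count growing rapidly as wildcards are added, while the count with the abort value stays roughly proportional to the text length. The sketch ignores '/', '**', '?' and bracket expressions, all of which the real wildmatch() code has to handle.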