Message ID | 20220608215444.1216-1-ematsumiya@suse.de (mailing list archive) |
---|---|
Headers | show |
Series | Introduce dns_interval procfs setting | expand |
Looks good. Reviewed-by for both patches. On Thu, 9 Jun 2022 at 07:55, Enzo Matsumiya <ematsumiya@suse.de> wrote: > > Hi, > > These 2 patches are a simple way to fix the DNS issue that > currently exists in cifs, where the upcall to key.dns_resolver will > always return 0 for the record TTL, hence, making the resolve worker > always use the default value SMB_DNS_RESOLVE_INTERVAL_DEFAULT > (currently 600 seconds). > > This also makes the new setting `dns_interval' user-configurable via > procfs (/proc/fs/cifs/dns_interval). > > One disadvantage here is that the interval is applied to all hosts > resolution. This is still how it works today, because we're always using > the default value anyway, but should someday this be fixed, then we can > go back to rely on the keys infrastructure to cache each hostname with > its own separate TTL. > > Please review and test. All feedback is welcome. > > > Cheers > > Enzo > > Enzo Matsumiya (2): > cifs: create procfs for dns_interval setting > cifs: reschedule DNS resolve worker based on dns_interval > > fs/cifs/cifs_debug.c | 63 +++++++++++++++++++++++++++++++++++++++++++ > fs/cifs/cifs_debug.h | 2 ++ > fs/cifs/cifsfs.c | 1 + > fs/cifs/connect.c | 4 +-- > fs/cifs/dns_resolve.c | 8 ++++++ > 5 files changed, 76 insertions(+), 2 deletions(-) > > -- > 2.36.1 >
On 6/8/2022 5:54 PM, Enzo Matsumiya wrote: > Hi, > > These 2 patches are a simple way to fix the DNS issue that > currently exists in cifs, where the upcall to key.dns_resolver will > always return 0 for the record TTL, hence, making the resolve worker > always use the default value SMB_DNS_RESOLVE_INTERVAL_DEFAULT > (currently 600 seconds). > > This also makes the new setting `dns_interval' user-configurable via > procfs (/proc/fs/cifs/dns_interval). > > One disadvantage here is that the interval is applied to all hosts > resolution. This is still how it works today, because we're always using > the default value anyway, but should someday this be fixed, then we can > go back to rely on the keys infrastructure to cache each hostname with > its own separate TTL. Curious, why did you choose a procfs global approach? Wouldn't it be more appropriate to make this a mount option, so it would be applied selectively per-server? Tom. > Please review and test. All feedback is welcome. > > > Cheers > > Enzo > > Enzo Matsumiya (2): > cifs: create procfs for dns_interval setting > cifs: reschedule DNS resolve worker based on dns_interval > > fs/cifs/cifs_debug.c | 63 +++++++++++++++++++++++++++++++++++++++++++ > fs/cifs/cifs_debug.h | 2 ++ > fs/cifs/cifsfs.c | 1 + > fs/cifs/connect.c | 4 +-- > fs/cifs/dns_resolve.c | 8 ++++++ > 5 files changed, 76 insertions(+), 2 deletions(-) >
Hi Enzo, Enzo Matsumiya <ematsumiya@suse.de> writes: > These 2 patches are a simple way to fix the DNS issue that > currently exists in cifs, where the upcall to key.dns_resolver will > always return 0 for the record TTL, hence, making the resolve worker > always use the default value SMB_DNS_RESOLVE_INTERVAL_DEFAULT > (currently 600 seconds). > > This also makes the new setting `dns_interval' user-configurable via > procfs (/proc/fs/cifs/dns_interval). How is the user supposed to know which TTL value will be used? If the expire time is set by either key.dns_resolver or any other program used for dns_resolver key, the above setting would no longer work and users might get confused.
On 06/09, Tom Talpey wrote: >On 6/8/2022 5:54 PM, Enzo Matsumiya wrote: >>Hi, >> >>These 2 patches are a simple way to fix the DNS issue that >>currently exists in cifs, where the upcall to key.dns_resolver will >>always return 0 for the record TTL, hence, making the resolve worker >>always use the default value SMB_DNS_RESOLVE_INTERVAL_DEFAULT >>(currently 600 seconds). >> >>This also makes the new setting `dns_interval' user-configurable via >>procfs (/proc/fs/cifs/dns_interval). >> >>One disadvantage here is that the interval is applied to all hosts >>resolution. This is still how it works today, because we're always using >>the default value anyway, but should someday this be fixed, then we can >>go back to rely on the keys infrastructure to cache each hostname with >>its own separate TTL. > >Curious, why did you choose a procfs global approach? Wouldn't it be >more appropriate to make this a mount option, so it would be applied >selectively per-server? I tried to stick to the current behaviour, while also fixing the issue of getting a TTL=0 from key.dns_resolver upcall. A mount option looks indeed better initially, and that was also my initial approach to this. But in a, e.g. multi-domain forest, with multiple DFS targets, the per-mount setting starts to look irrelevant again as well. So I took the "per-client" approach instead. Cheers, Enzo
Hi Paulo, On 06/09, Paulo Alcantara wrote: >Hi Enzo, > >Enzo Matsumiya <ematsumiya@suse.de> writes: > >> These 2 patches are a simple way to fix the DNS issue that >> currently exists in cifs, where the upcall to key.dns_resolver will >> always return 0 for the record TTL, hence, making the resolve worker >> always use the default value SMB_DNS_RESOLVE_INTERVAL_DEFAULT >> (currently 600 seconds). >> >> This also makes the new setting `dns_interval' user-configurable via >> procfs (/proc/fs/cifs/dns_interval). > >How is the user supposed to know which TTL value will be used? If the >expire time is set by either key.dns_resolver or any other program used >for dns_resolver key, the above setting would no longer work and users >might get confused. The initial value is the same as before (SMB_DNS_RESOLVE_INTERVAL_DEFAULT, 600s). The user don't need to know a value to be used, unless they know the value the server uses (either manually set by themselves or by some external script) and then they can use that same value for this new dns_interval setting. A very simple example on how one could do it, if there's no desire to use the default 600s: # TTL=$(dig +noall +answer my.server.domain | awk '{ print $2 }') # echo $TTL > /proc/fs/cifs/dns_interval Cheers, Enzo
Enzo Matsumiya <ematsumiya@suse.de> writes: > The initial value is the same as before (SMB_DNS_RESOLVE_INTERVAL_DEFAULT, 600s). > > The user don't need to know a value to be used, unless they know the > value the server uses (either manually set by themselves or by some > external script) and then they can use that same value for this new > dns_interval setting. > > A very simple example on how one could do it, if there's no desire to > use the default 600s: > > # TTL=$(dig +noall +answer my.server.domain | awk '{ print $2 }') > # echo $TTL > /proc/fs/cifs/dns_interval That's not what I meant. Sorry if I was unclear. If the upcalled program sets the record TTL (key's expire time), then any values set in /proc/fs/cifs/dns_interval will not work. The user might expect it to work, though.
On 6/9/2022 11:03 AM, Enzo Matsumiya wrote: > On 06/09, Tom Talpey wrote: >> On 6/8/2022 5:54 PM, Enzo Matsumiya wrote: >>> Hi, >>> >>> These 2 patches are a simple way to fix the DNS issue that >>> currently exists in cifs, where the upcall to key.dns_resolver will >>> always return 0 for the record TTL, hence, making the resolve worker >>> always use the default value SMB_DNS_RESOLVE_INTERVAL_DEFAULT >>> (currently 600 seconds). >>> >>> This also makes the new setting `dns_interval' user-configurable via >>> procfs (/proc/fs/cifs/dns_interval). >>> >>> One disadvantage here is that the interval is applied to all hosts >>> resolution. This is still how it works today, because we're always using >>> the default value anyway, but should someday this be fixed, then we can >>> go back to rely on the keys infrastructure to cache each hostname with >>> its own separate TTL. >> >> Curious, why did you choose a procfs global approach? Wouldn't it be >> more appropriate to make this a mount option, so it would be applied >> selectively per-server? > > I tried to stick to the current behaviour, while also fixing the issue > of getting a TTL=0 from key.dns_resolver upcall. > > A mount option looks indeed better initially, and that was also my > initial approach to this. But in a, e.g. multi-domain forest, with > multiple DFS targets, the per-mount setting starts to look irrelevant > again as well. So I took the "per-client" approach instead. Ok, I guess. It seems like kicking the can down the road, a little. I agree that rearchitecting the DNS cache might be extreme, but many distros consider procfs to be a user API, and they'll require it to be supported basically forever. That would be unfortunate... Tom.
On 06/09, Paulo Alcantara wrote: >Enzo Matsumiya <ematsumiya@suse.de> writes: > >> The initial value is the same as before (SMB_DNS_RESOLVE_INTERVAL_DEFAULT, 600s). >> >> The user don't need to know a value to be used, unless they know the >> value the server uses (either manually set by themselves or by some >> external script) and then they can use that same value for this new >> dns_interval setting. >> >> A very simple example on how one could do it, if there's no desire to >> use the default 600s: >> >> # TTL=$(dig +noall +answer my.server.domain | awk '{ print $2 }') >> # echo $TTL > /proc/fs/cifs/dns_interval > >That's not what I meant. Sorry if I was unclear. > >If the upcalled program sets the record TTL (key's expire time), then >any values set in /proc/fs/cifs/dns_interval will not work. The user >might expect it to work, though. Exactly. Currently, key.dns_resolver uses getaddrinfo() to resolve names, which doesn't contain the TTL for the record, hence *always* returns 0 to cifs.ko. This patch is just a way to provide some flexibility to the user, in case they don't want to use the currently-always-fixed 600s. As we discussed, this is supposed to be fixed in key.dns_resolver, but my proposed patch for that never got ack'd, or even replied to. I might try again at some point. Cheers, Enzo
Enzo Matsumiya <ematsumiya@suse.de> writes: > Currently, key.dns_resolver uses getaddrinfo() to resolve names, which > doesn't contain the TTL for the record, hence *always* returns 0 to cifs.ko. > This patch is just a way to provide some flexibility to the user, in > case they don't want to use the currently-always-fixed 600s. It is not limited to key.dns_resolver. The user is free to choose whatever program he/she wants to be upcalled for dns_resolver key. For instance, some distros might still be using cifs.upcall(8) that actually set record TTL, thus making it impossible for the user to change default via /proc/fs/cifs/dns_interval.
On 06/09, Paulo Alcantara wrote: >Enzo Matsumiya <ematsumiya@suse.de> writes: > >> Currently, key.dns_resolver uses getaddrinfo() to resolve names, which >> doesn't contain the TTL for the record, hence *always* returns 0 to cifs.ko. >> This patch is just a way to provide some flexibility to the user, in >> case they don't want to use the currently-always-fixed 600s. > >It is not limited to key.dns_resolver. The user is free to choose >whatever program he/she wants to be upcalled for dns_resolver key. > >For instance, some distros might still be using cifs.upcall(8) that >actually set record TTL, thus making it impossible for the user to >change default via /proc/fs/cifs/dns_interval. Ah sorry, I misunderstood. But this patch isn't supposed to allow the user to change the _default_ TTL value, but only to give them a chance to change the TTL value *iff* the upcall returned 0. In case the upcall returns TTL != 0, dns_resolve_server_name_to_ip() will use that value instead, which, again, maintains the current behaviour. But yes, if desired, I can adjust the patch to completely ignore the TTL value from upcall and manage it by ourselves always, either by procfs or by mount option. What do you think? Cheers, Enzo
On 06/09, Tom Talpey wrote: >>I tried to stick to the current behaviour, while also fixing the issue >>of getting a TTL=0 from key.dns_resolver upcall. >> >>A mount option looks indeed better initially, and that was also my >>initial approach to this. But in a, e.g. multi-domain forest, with >>multiple DFS targets, the per-mount setting starts to look irrelevant >>again as well. So I took the "per-client" approach instead. > >Ok, I guess. It seems like kicking the can down the road, a little. >I agree that rearchitecting the DNS cache might be extreme, but many >distros consider procfs to be a user API, and they'll require it to >be supported basically forever. That would be unfortunate... Thanks, I didn't know that. I'll re-work on the patch and take this in consideration then. Cheers, Enzo
Enzo Matsumiya <ematsumiya@suse.de> writes: > On 06/09, Paulo Alcantara wrote: >>Enzo Matsumiya <ematsumiya@suse.de> writes: >> >>> Currently, key.dns_resolver uses getaddrinfo() to resolve names, which >>> doesn't contain the TTL for the record, hence *always* returns 0 to cifs.ko. >>> This patch is just a way to provide some flexibility to the user, in >>> case they don't want to use the currently-always-fixed 600s. >> >>It is not limited to key.dns_resolver. The user is free to choose >>whatever program he/she wants to be upcalled for dns_resolver key. >> >>For instance, some distros might still be using cifs.upcall(8) that >>actually set record TTL, thus making it impossible for the user to >>change default via /proc/fs/cifs/dns_interval. > > Ah sorry, I misunderstood. > > But this patch isn't supposed to allow the user to change the _default_ TTL > value, but only to give them a chance to change the TTL value *iff* the > upcall returned 0. That was my concern. If we expose it to users, they would probably expect it work at all times regardless whether the key's expire time was set or not. > In case the upcall returns TTL != 0, dns_resolve_server_name_to_ip() > will use that value instead, which, again, maintains the current behaviour. Then it would start ignoring values from /proc/fs/cifs/dns_interval and not telling the user why. > But yes, if desired, I can adjust the patch to completely ignore the > TTL value from upcall and manage it by ourselves always, either by > procfs or by mount option. That would work, too. BTW, I'd personally go with the mount option version.