| Message ID | 20240508184743778PSWkv_r8dMoye7WmZ7enP@zte.com.cn (mailing list archive) |
| --- | --- |
| State | New, archived |
| Series | KVM: introduce vm's max_halt_poll_ns to debugfs |
On Wed, May 08, 2024, cheng.lin130@zte.com.cn wrote:
> From: Cheng Lin <cheng.lin130@zte.com.cn>
>
> Introduce vm's max_halt_poll_ns and override_halt_poll_ns to
> debugfs. Provide a way to check and modify them.

Why?
> From: seanjc <seanjc@google.com>
> > From: Cheng Lin <cheng.lin130@zte.com.cn>
> >
> > Introduce vm's max_halt_poll_ns and override_halt_poll_ns to
> > debugfs. Provide a way to check and modify them.
> Why?
If a vm's max_halt_poll_ns has been set using KVM_CAP_HALT_POLL, the
module parameter kvm.halt_poll_ns will no longer indicate the maximum
halt polling interval for that vm. After introducing these two
attributes into debugfs, they can be used to check whether the
individual configuration of the vm is enabled and what its working
value is. This patch provides a way to check and modify them through
debugfs.
On Thu, May 09, 2024, cheng.lin130@zte.com.cn wrote:
> > From: seanjc <seanjc@google.com>
> > > From: Cheng Lin <cheng.lin130@zte.com.cn>
> > >
> > > Introduce vm's max_halt_poll_ns and override_halt_poll_ns to
> > > debugfs. Provide a way to check and modify them.
> > Why?
> If a vm's max_halt_poll_ns has been set using KVM_CAP_HALT_POLL,
> the module parameter kvm.halt_poll_ns will no longer indicate the maximum
> halt polling interval for that vm. After introducing these two attributes into
> debugfs, they can be used to check whether the individual configuration of the
> vm is enabled and what its working value is.

But why is max_halt_poll_ns special enough to warrant debugfs entries? There is
a _lot_ of state in KVM that is configurable per-VM; it simply isn't feasible to
dump everything into debugfs.

I do think it would be reasonable to capture the max allowed polling time in
the existing tracepoint though, e.g.

diff --git a/include/trace/events/kvm.h b/include/trace/events/kvm.h
index 74e40d5d4af4..7e66e9b2e497 100644
--- a/include/trace/events/kvm.h
+++ b/include/trace/events/kvm.h
@@ -41,24 +41,26 @@ TRACE_EVENT(kvm_userspace_exit,
 );
 
 TRACE_EVENT(kvm_vcpu_wakeup,
-	TP_PROTO(__u64 ns, bool waited, bool valid),
-	TP_ARGS(ns, waited, valid),
+	TP_PROTO(__u64 ns, __u32 max_ns, bool waited, bool valid),
+	TP_ARGS(ns, max_ns, waited, valid),
 
 	TP_STRUCT__entry(
 		__field(	__u64,		ns		)
+		__field(	__u32,		max_ns		)
 		__field(	bool,		waited		)
 		__field(	bool,		valid		)
 	),
 
 	TP_fast_assign(
 		__entry->ns	= ns;
+		__entry->max_ns	= max_ns;
 		__entry->waited	= waited;
 		__entry->valid	= valid;
 	),
 
-	TP_printk("%s time %lld ns, polling %s",
+	TP_printk("%s time %llu ns (max poll %u ns), polling %s",
 		  __entry->waited ? "wait" : "poll",
-		  __entry->ns,
+		  __entry->ns, __entry->max_ns,
 		  __entry->valid ? "valid" : "invalid")
 );
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2e388972d856..f093138f3cd7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3846,7 +3846,8 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 		}
 	}
 
-	trace_kvm_vcpu_wakeup(halt_ns, waited, vcpu_valid_wakeup(vcpu));
+	trace_kvm_vcpu_wakeup(halt_ns, max_halt_poll_ns, waited,
+			      vcpu_valid_wakeup(vcpu));
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_halt);
> > > From: seanjc <seanjc@google.com>
> > > > From: Cheng Lin <cheng.lin130@zte.com.cn>
> > > >
> > > > Introduce vm's max_halt_poll_ns and override_halt_poll_ns to
> > > > debugfs. Provide a way to check and modify them.
> > > Why?
> > If a vm's max_halt_poll_ns has been set using KVM_CAP_HALT_POLL,
> > the module parameter kvm.halt_poll_ns will no longer indicate the maximum
> > halt polling interval for that vm. After introducing these two attributes into
> > debugfs, they can be used to check whether the individual configuration of the
> > vm is enabled and what its working value is.
> But why is max_halt_poll_ns special enough to warrant debugfs entries? There is
> a _lot_ of state in KVM that is configurable per-VM; it simply isn't feasible to
> dump everything into debugfs.
If we want to provide a direct modification interface under /sys for the per-vm
max_halt_poll_ns, like the module parameter /sys/module/kvm/parameters/halt_poll_ns,
using debugfs may be worthwhile. Further, if the override_halt_poll_ns entry under
debugfs were made writable, it could even be used to set the per-vm
max_halt_poll_ns, as the KVM_CAP_HALT_POLL interface does.

> I do think it would be reasonable to capture the max allowed polling time in
> the existing tracepoint though, e.g.
Yes, I agree. Getting the per-vm max_halt_poll_ns through the tracepoint is
sufficient if KVM_CAP_HALT_POLL is used as the only setting interface.

Do you think it is worthwhile to provide a setting interface other than
KVM_CAP_HALT_POLL?
On Fri, May 10, 2024, cheng.lin130@zte.com.cn wrote:
> > > > From: seanjc <seanjc@google.com>
> > > > > From: Cheng Lin <cheng.lin130@zte.com.cn>
> > > > >
> > > > > Introduce vm's max_halt_poll_ns and override_halt_poll_ns to
> > > > > debugfs. Provide a way to check and modify them.
> > > > Why?
> > > If a vm's max_halt_poll_ns has been set using KVM_CAP_HALT_POLL,
> > > the module parameter kvm.halt_poll_ns will no longer indicate the maximum
> > > halt polling interval for that vm. After introducing these two attributes into
> > > debugfs, they can be used to check whether the individual configuration of the
> > > vm is enabled and what its working value is.
> > But why is max_halt_poll_ns special enough to warrant debugfs entries? There is
> > a _lot_ of state in KVM that is configurable per-VM; it simply isn't feasible to
> > dump everything into debugfs.
> If we want to provide a direct modification interface under /sys for the per-vm
> max_halt_poll_ns, like the module parameter /sys/module/kvm/parameters/halt_poll_ns,
> using debugfs may be worthwhile.

Yes, but _why_? I know _what_ a debugfs knob allows, but you have yet to explain
why this is needed.

Generally speaking, functionality of any kind should not be routed through debugfs;
it really is meant for debug. E.g. it's typically root-only, is not guaranteed
to exist, its population is best-effort, etc.

> Further, if the override_halt_poll_ns entry under debugfs were made writable, it
> could even be used to set the per-vm max_halt_poll_ns, as the KVM_CAP_HALT_POLL
> interface does.
> > I do think it would be reasonable to capture the max allowed polling time in
> > the existing tracepoint though, e.g.
> Yes, I agree. Getting the per-vm max_halt_poll_ns through the tracepoint is
> sufficient if KVM_CAP_HALT_POLL is used as the only setting interface.
>
> Do you think it is worthwhile to provide a setting interface other than
> KVM_CAP_HALT_POLL?
> > > > > From: seanjc <seanjc@google.com>
> > > > > > From: Cheng Lin <cheng.lin130@zte.com.cn>
> > > > > >
> > > > > > Introduce vm's max_halt_poll_ns and override_halt_poll_ns to
> > > > > > debugfs. Provide a way to check and modify them.
> > > > > Why?
> > > > If a vm's max_halt_poll_ns has been set using KVM_CAP_HALT_POLL,
> > > > the module parameter kvm.halt_poll_ns will no longer indicate the maximum
> > > > halt polling interval for that vm. After introducing these two attributes into
> > > > debugfs, they can be used to check whether the individual configuration of the
> > > > vm is enabled and what its working value is.
> > > But why is max_halt_poll_ns special enough to warrant debugfs entries? There is
> > > a _lot_ of state in KVM that is configurable per-VM; it simply isn't feasible to
> > > dump everything into debugfs.
> > If we want to provide a direct modification interface under /sys for the per-vm
> > max_halt_poll_ns, like the module parameter /sys/module/kvm/parameters/halt_poll_ns,
> > using debugfs may be worthwhile.
> Yes, but _why_? I know _what_ a debugfs knob allows, but you have yet to explain
> why this is needed.
I think that if such an interface is provided, it can be used to check the source
of a vm's max_halt_poll_ns: the general module parameter or a per-vm
configuration. When it is configured per-vm, such an interface can be used to
monitor that configuration. And if there is an error in a setting made through
KVM_CAP_HALT_POLL, such an interface can be used to fix or reset it dynamically.

> Generally speaking, functionality of any kind should not be routed through debugfs;
> it really is meant for debug. E.g. it's typically root-only, is not guaranteed
> to exist, its population is best-effort, etc.
> > Further, if the override_halt_poll_ns entry under debugfs were made writable, it
> > could even be used to set the per-vm max_halt_poll_ns, as the KVM_CAP_HALT_POLL
> > interface does.
> > > I do think it would be reasonable to capture the max allowed polling time in
> > > the existing tracepoint though, e.g.
> > Yes, I agree. Getting the per-vm max_halt_poll_ns through the tracepoint is
> > sufficient if KVM_CAP_HALT_POLL is used as the only setting interface.
> >
> > Do you think it is worthwhile to provide a setting interface other than
> > KVM_CAP_HALT_POLL?
On Sat, May 11, 2024, cheng.lin130@zte.com.cn wrote:
> > > > > > From: seanjc <seanjc@google.com>
> > > > > > > From: Cheng Lin <cheng.lin130@zte.com.cn>
> > > > > > >
> > > > > > > Introduce vm's max_halt_poll_ns and override_halt_poll_ns to
> > > > > > > debugfs. Provide a way to check and modify them.
> > > > > > Why?
> > > > > If a vm's max_halt_poll_ns has been set using KVM_CAP_HALT_POLL,
> > > > > the module parameter kvm.halt_poll_ns will no longer indicate the maximum
> > > > > halt polling interval for that vm. After introducing these two attributes into
> > > > > debugfs, they can be used to check whether the individual configuration of the
> > > > > vm is enabled and what its working value is.
> > > > But why is max_halt_poll_ns special enough to warrant debugfs entries? There is
> > > > a _lot_ of state in KVM that is configurable per-VM; it simply isn't feasible to
> > > > dump everything into debugfs.
> > > If we want to provide a direct modification interface under /sys for the per-vm
> > > max_halt_poll_ns, like the module parameter /sys/module/kvm/parameters/halt_poll_ns,
> > > using debugfs may be worthwhile.
> > Yes, but _why_? I know _what_ a debugfs knob allows, but you have yet to explain
> > why this is needed.
> I think that if such an interface is provided, it can be used to check the source
> of a vm's max_halt_poll_ns: the general module parameter or a per-vm
> configuration. When it is configured per-vm, such an interface can be used to
> monitor that configuration. And if there is an error in a setting made through
> KVM_CAP_HALT_POLL, such an interface can be used to fix or reset it dynamically.

But again, that argument can be made for myriad settings in KVM. And unlike many
settings, a "bad" max_halt_poll_ns can be fixed simply by redoing KVM_CAP_HALT_POLL.

It's not KVM's responsibility to police userspace for bugs/errors, and IMO a
backdoor into max_halt_poll_ns isn't justified.
> > > Yes, but _why_? I know _what_ a debugfs knob allows, but you have yet to explain
> > > why this is needed.
> > I think that if such an interface is provided, it can be used to check the source
> > of a vm's max_halt_poll_ns: the general module parameter or a per-vm
> > configuration. When it is configured per-vm, such an interface can be used to
> > monitor that configuration. And if there is an error in a setting made through
> > KVM_CAP_HALT_POLL, such an interface can be used to fix or reset it dynamically.
> But again, that argument can be made for myriad settings in KVM. And unlike many
> settings, a "bad" max_halt_poll_ns can be fixed simply by redoing KVM_CAP_HALT_POLL.
Yes, though whether it is convenient to redo it will depend on the userspace.

> It's not KVM's responsibility to police userspace for bugs/errors, and IMO a
> backdoor into max_halt_poll_ns isn't justified.
Yes, it's not KVM's responsibility to police userspace. But in addition to
depending on the userspace to redo it, this interface can be seen as a plan B to
ensure that the VM works as expected.
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ff0a20565..60dae952c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1151,6 +1151,11 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, const char *fdname)
 				    &stat_fops_per_vm);
 	}
 
+	debugfs_create_bool("override_halt_poll_ns", 0444, kvm->debugfs_dentry,
+			    &kvm->override_halt_poll_ns);
+	debugfs_create_u32("max_halt_poll_ns", 0644, kvm->debugfs_dentry,
+			   &kvm->max_halt_poll_ns);
+
 	kvm_arch_create_vm_debugfs(kvm);
 
 	return 0;
 out_err: