Message ID | 20090116211323.22836.40477.stgit@debian.lart (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
On Fri, 2009-01-16 at 14:13 -0700, Alex Williamson wrote: > This will be used for RX mode, MAC filter table, VLAN filtering, etc... > > The control transaction consists of one or more "out" sg entries and > one or more "in" sg entries. The first out entry contains a header > defining the class and command. Additional out entries may provide > data for the command. The last in entry provides a status response > back from the command. > > Virtqueues typically run asynchronous, running a callback function > when there's data in the channel. We can't readily make use of this > in the command paths where we need to use this. Instead, we kick > the virtqueue and spin. The kick causes an I/O write, triggering an > immediate trap into the hypervisor. > > Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Cheers, Mark. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Rusty, On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote: > On Saturday 17 January 2009 07:43:23 Alex Williamson wrote: > > + return status ? -EFAULT : 0; > > This is wrong. Currently this can't happen, right? But you put it in > so someone in future may want some kind of return from the commands. > In which case, they'll want to see the value. > > If we're sure they never want to see the value, then we don't need to > be synchronous at all: just spin if add_buf() fails. In my qemu series of patches it can happen if the command isn't properly defined, something bad happens, or the command is unrecognized. As I was hashing this out, I first had more errnos, but I wasn't sure how extensively to fill out the error returns, and eventually defaulted to ok/fail. Should I expand on these some? Suggestions for a reasonable small yet complete set? Should we use Linux errno values and let other OS virtio-net implementations create a switch table? I would like to keep this interface synchronous, particularly I'm wondering if there's anything we might want to do for ethtool like statistics. In that case, the backend might fill a buffer of data along with returning a status code. I could imagine other similar uses as well. > > > +struct virtio_net_ctrl_hdr { > > > + __u8 class; > > > + __u8 cmd; > > > +}; > > This would need to be __attribute__((packed)). On ARM, that struct > would be 4 bytes long. Thanks, I'll fix that. > > + > > > +typedef __u8 virtio_net_ctrl_ack; > > > + > > > +#define VIRTIO_NET_OK 0 > > > +#define VIRTIO_NET_ERR 1 > > Hmm, we define it and don't use it. And we never expect it to actually > error (did your qemu implementation ever actually return non-zero?). Yup, good point. These are mainly here to stay in sync with the qemu backend, which does make use of them. Should I remove them here, or should we make a more worthwhile set of return values? I have tried manually returning non-zero status from qemu to verify it's reflected in the response. Thanks for the comments, Alex
On Tuesday 27 January 2009 14:30:06 Alex Williamson wrote: > Hi Rusty, > > On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote: > > On Saturday 17 January 2009 07:43:23 Alex Williamson wrote: > > > + return status ? -EFAULT : 0; > > > > This is wrong. Currently this can't happen, right? But you put it in > > so someone in future may want some kind of return from the commands. > > In which case, they'll want to see the value. > > > > If we're sure they never want to see the value, then we don't need to > > be synchronous at all: just spin if add_buf() fails. > > In my qemu series of patches it can happen if the command isn't properly > defined ie. guest bug. > , something bad happens ??? You're going to tell me to read the code, aren't you? :) > , or the command is unrecognized. If we go for feature bits, this is also a guest bug. And I think we should, since that's what feature bits are for. > As I > was hashing this out, I first had more errnos, but I wasn't sure how > extensively to fill out the error returns, and eventually defaulted to > ok/fail. Should I expand on these some? Suggestions for a reasonable > small yet complete set? Should we use Linux errno values and let other > OS virtio-net implementations create a switch table? Not error codes. In future we may have a command where return codes are meaningful, but I'm pretty sure they'll be command specific so we don't know how to define them now anyway. AFAICT that's the only real justification for defining this interface as sync. So I think just defining 0 on success is fine. Defining that to always happen to well-formed commands is even better :) Thanks, Rusty. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2009-01-28 at 23:35 +1030, Rusty Russell wrote: > On Tuesday 27 January 2009 14:30:06 Alex Williamson wrote: > > Hi Rusty, > > > > On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote: > > > On Saturday 17 January 2009 07:43:23 Alex Williamson wrote: > > > > + return status ? -EFAULT : 0; > > > > > > This is wrong. Currently this can't happen, right? But you put it in > > > so someone in future may want some kind of return from the commands. > > > In which case, they'll want to see the value. > > > > > If we're sure they never want to see the value, then we don't need to > > > be synchronous at all: just spin if add_buf() fails. > > > > In my qemu series of patches it can happen if the command isn't properly > > defined > > ie. guest bug. > > > , something bad happens > > ??? You're going to tell me to read the code, aren't you? :) The only one of these that stands out is if the qemu_mallocz() for the MAC filter table fails for a size we think is reasonable. > > , or the command is unrecognized. > > If we go for feature bits, this is also a guest bug. And I think we > should, since that's what feature bits are for. One of the reasons I had avoided using a feature bit is that it's only a 32bit field, and we've already used up to bit 16 (though I'm not sure what to do about the sparse-ness of it). The virtqueue interface I've designed supports up to 255 command classes, each with 255 commands. We can't add a feature bit for every one, or even much of a subset. I'd be happy to add a feature bit indicating the virtqueue command channel is supported, but beyond that, I think we need to fall back to enable or initialization commands failing on various classes. We could also define that command 0 for each class as a probe if we want to make it more explicit. Thanks, Alex
On Thursday 29 January 2009 05:32:21 Alex Williamson wrote: > On Wed, 2009-01-28 at 23:35 +1030, Rusty Russell wrote: > > On Tuesday 27 January 2009 14:30:06 Alex Williamson wrote: > > > On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote: > > > > If we're sure they never want to see the value, then we don't need to > > > > be synchronous at all: just spin if add_buf() fails. ... > The only one of these that stands out is if the qemu_mallocz() for the > MAC filter table fails for a size we think is reasonable. Yes, that seems fair. Of course it's quite probable that the malloc will "succeed" then we'll segv later on if we're really low on mem. > > > , or the command is unrecognized. > > > > If we go for feature bits, this is also a guest bug. And I think we > > should, since that's what feature bits are for. > > One of the reasons I had avoided using a feature bit is that it's only a > 32bit field, and we've already used up to bit 16 (though I'm not sure > what to do about the sparse-ness of it). The virtqueue interface I've > designed supports up to 255 command classes, each with 255 commands. We > can't add a feature bit for every one, or even much of a subset. We will need to figure out how to figure out how to expand the feature bits for virtio_pci anyway. I think the plan is to use feature bit 31 to mean "look here for more feature bits". Anthony would know more... s390 and lguest didn't fall into this trap; we have infinite feature bits. > I'd be > happy to add a feature bit indicating the virtqueue command channel is > supported We probably should, for good measure. If someone added another virtqueue for some other purpose in future and didn't want to support the command channel it would get icky. If we have 255 features, the problem is that we have 255 features, not that it's a lot of bits :) Thanks, Rusty. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index e7700de..d4be0a2 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -39,7 +39,7 @@ module_param(gso, bool, 0444); struct virtnet_info { struct virtio_device *vdev; - struct virtqueue *rvq, *svq; + struct virtqueue *rvq, *svq, *cvq; struct net_device *dev; struct napi_struct napi; @@ -603,6 +603,41 @@ static int virtnet_open(struct net_device *dev) return 0; } +static int virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd, + void *data, unsigned int len) +{ + struct scatterlist sg[3]; + struct virtio_net_ctrl_hdr ctrl; + virtio_net_ctrl_ack status; + unsigned int tmp; + int i = 0; + + if (!vi->cvq) + return -EIO; + + sg_init_table(sg, len ? 3 : 2); + + sg_set_buf(&sg[i++], &ctrl, sizeof(ctrl)); + if (len) + sg_set_buf(&sg[i++], data, len); + sg_set_buf(&sg[i], &status, sizeof(status)); + + ctrl.class = class; + ctrl.cmd = cmd; + + status = ~0; + + if (vi->cvq->vq_ops->add_buf(vi->cvq, sg, i, 1, vi) != 0) + BUG(); + + vi->cvq->vq_ops->kick(vi->cvq); + + while (!vi->cvq->vq_ops->get_buf(vi->cvq, &tmp)) + cpu_relax(); + + return status ? -EFAULT : 0; +} + static int virtnet_close(struct net_device *dev) { struct virtnet_info *vi = netdev_priv(dev); @@ -733,6 +768,14 @@ static int virtnet_probe(struct virtio_device *vdev) goto free_recv; } + /* + * Outbound control channel virtqueue. We can live without it, + * so don't go fatal if it's not there. + */ + vi->cvq = vdev->config->find_vq(vdev, 2, NULL); + if (IS_ERR(vi->cvq)) + vi->cvq = NULL; + /* Initialize our empty receive and send queues. */ skb_queue_head_init(&vi->recv); skb_queue_head_init(&vi->send); @@ -763,6 +806,8 @@ static int virtnet_probe(struct virtio_device *vdev) unregister: unregister_netdev(dev); free_send: + if (vi->cvq) + vdev->config->del_vq(vi->cvq); vdev->config->del_vq(vi->svq); free_recv: vdev->config->del_vq(vi->rvq); @@ -793,6 +838,8 @@ static void virtnet_remove(struct virtio_device *vdev) vdev->config->del_vq(vi->svq); vdev->config->del_vq(vi->rvq); + if (vi->cvq) + vdev->config->del_vq(vi->cvq); unregister_netdev(vi->dev); while (vi->pages) diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h index 5cdd0aa..e2c1d81 100644 --- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -53,4 +53,21 @@ struct virtio_net_hdr_mrg_rxbuf { __u16 num_buffers; /* Number of merged rx buffers */ }; +/* + * Control virtqueue data structures + * + * The control virtqueue expects a header in the first sg entry + * and an ack/status response in the last entry. Data for the + * command goes in between. + */ +struct virtio_net_ctrl_hdr { + __u8 class; + __u8 cmd; +}; + +typedef __u8 virtio_net_ctrl_ack; + +#define VIRTIO_NET_OK 0 +#define VIRTIO_NET_ERR 1 + #endif /* _LINUX_VIRTIO_NET_H */
This will be used for RX mode, MAC filter table, VLAN filtering, etc... The control transaction consists of one or more "out" sg entries and one or more "in" sg entries. The first out entry contains a header defining the class and command. Additional out entries may provide data for the command. The last in entry provides a status response back from the command. Virtqueues typically run asynchronous, running a callback function when there's data in the channel. We can't readily make use of this in the command paths where we need to use this. Instead, we kick the virtqueue and spin. The kick causes an I/O write, triggering an immediate trap into the hypervisor. Signed-off-by: Alex Williamson <alex.williamson@hp.com> --- drivers/net/virtio_net.c | 49 +++++++++++++++++++++++++++++++++++++++++++- include/linux/virtio_net.h | 17 +++++++++++++++ 2 files changed, 65 insertions(+), 1 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html