Message ID | cover.1657920926.git.linux_oss@crudebyte.com (mailing list archive) |
---|---|
Series | remove msize limit in virtio transport |

On Saturday, 16 July 2022 01:28:51 CEST Dominique Martinet wrote:
> Dominique Martinet wrote on Sat, Jul 16, 2022 at 07:30:45AM +0900:
> > Christian Schoenebeck wrote on Fri, Jul 15, 2022 at 11:35:26PM +0200:
> > > * Patches 7..11 tremendously reduce unnecessarily huge 9p message sizes
> > >   and therefore provide performance gain as well. So far, almost all 9p
> > >   messages simply allocated message buffers exactly msize large, even for
> > >   messages that actually just needed few bytes. So these patches make
> > >   sense by themselves, independent of this overall series, however for
> > >   this series even more, because the larger msize, the more this issue
> > >   would have hurt otherwise.
> >
> > Unless they got stuck somewhere the mails are missing patches 10 and 11,
> > one too many 0s to git send-email ?
>
> nevermind, they just got in after 1h30... I thought it'd been 1h since
> the first mails because the first ones were already 50 mins late and I
> hadn't noticed!
> I wonder where they're stuck, that's the time
> lizzy.crudebyte.com received them and it filters earlier headers so
> probably between you and it?

Certainly an outbound SMTP greylisting delay, i.e. lack of karma. Sometimes my
patches make it to lists after 3 hours. I haven't figured out though why some
patches within the same series arrive significantly faster than certain other
ones, which is especially weird when that happens not in the order they were
sent.

> ohwell.
>
> > I'll do a quick review from github commit meanwhile
>
> Looks good to me, I'll try to get some tcp/rdma testing done this
> weekend and stash them up to next

Great, thanks!

> --
> Dominique

Christian Schoenebeck wrote on Sat, Jul 16, 2022 at 11:54:29AM +0200:
> > Looks good to me, I'll try to get some tcp/rdma testing done this
> > weekend and stash them up to next
>
> Great, thanks!

Quick update on this: tcp seems to work fine, I need to let it run a bit
longer but not expecting any trouble.

RDMA is... complicated.
I was certain an adapter in loopback mode ought to work so I just
bought a cheap card alone, but I couldn't get it to work (ipoib works
but I think that's just the linux tcp stack cheating, I'm getting unable
to resolve route (rdma_resolve_route) errors when trying real rdma
applications...)

OTOH, linux got softiwarp merged in as RDMA_SIW which works perfectly
with my rdma applications, after fixing/working around a couple of bugs
on the server I'm getting hangs that I can't reproduce with debug on
current master so this isn't exactly great, not sure where it goes
wrong :|
At least with debug still enabled I'm not getting any new hang with your
patches, so let's call it ok...?

I'll send a mail to ex-colleagues who might care about it (and
investigate a bit more if so), and a more open mail if that falls
short...

--
Dominique

On Saturday, 16 July 2022 13:54:08 CEST Dominique Martinet wrote:
> Christian Schoenebeck wrote on Sat, Jul 16, 2022 at 11:54:29AM +0200:
> > > Looks good to me, I'll try to get some tcp/rdma testing done this
> > > weekend and stash them up to next
> >
> > Great, thanks!
>
> Quick update on this: tcp seems to work fine, I need to let it run a bit
> longer but not expecting any trouble.
>
> RDMA is... complicated.
> I was certain an adapter in loopback mode ought to work so I just
> bought a cheap card alone, but I couldn't get it to work (ipoib works
> but I think that's just the linux tcp stack cheating, I'm getting unable
> to resolve route (rdma_resolve_route) errors when trying real rdma
> applications...)
>
>
> OTOH, linux got softiwarp merged in as RDMA_SIW which works perfectly
> with my rdma applications, after fixing/working around a couple of bugs
> on the server I'm getting hangs that I can't reproduce with debug on
> current master so this isn't exactly great, not sure where it goes
> wrong :|
> At least with debug still enabled I'm not getting any new hang with your
> patches, so let's call it ok...?

Well, I would need more info to judge or resolve that, like which patch
exactly broke RDMA behaviour for you?

> I'll send a mail to ex-colleagues who might care about it (and
> investigate a bit more if so), and a more open mail if that falls
> short...
>
> --
> Dominique

Christian Schoenebeck wrote on Sat, Jul 16, 2022 at 02:10:05PM +0200:
> > OTOH, linux got softiwarp merged in as RDMA_SIW which works perfectly
> > with my rdma applications, after fixing/working around a couple of bugs
> > on the server I'm getting hangs that I can't reproduce with debug on
> > current master so this isn't exactly great, not sure where it goes
> > wrong :|
> > At least with debug still enabled I'm not getting any new hang with your
> > patches, so let's call it ok...?
>
> Well, I would need more info to judge or resolve that, like which patch
> exactly broke RDMA behaviour for you?

I wouldn't have troubles if I knew that, I don't have access to the
hardware I last used 9p/rdma on so it might very well be a softiwarp
compatibility problem, server version, or anything else.

At the very least I'm not getting new errors and the server does receive
everything we sent, so as far as these patches are concerned I don't think
we're making anything worse.

I'll get back to you once I hear back from former employer (if they can
have someone run some tests, confirm it works and/or bisect that), I
really spent too much time trying to get the old adapter I got working
already...

All I can say is that there's no error anywhere, I've finally reproduced
it once with debug and I can confirm the server sent the reply and didn't
get any error in ibv_post_send() so the message should have been sent,
but the client just never processed it.
Next step would be to add/enable some logs on the client to see if it
actually received something or not and go from there, but I'd like to see
something that works first...

--
Dominique