Message ID | 20220905135019.3749982-1-marmarek@invisiblethingslab.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] vchan-socket-proxy: add reconnect marker support | expand |
On Mon, Sep 5, 2022 at 9:50 AM Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> wrote: > > When vchan client reconnect quickly, the server may not notice it. This > means, it won't reconnect the UNIX socket either. For QMP, it will > prevent the client to see the QMP protocol handshake, and the > communication will timeout. > Solve the issue by sending in-band connect marker. Whenever server sees > one (elsewhere than the first byte in the connection), handle it as a > client had reconnected. The marker is a one byte, and the user need to > choose something that doesn't appear in the data stream elsewhere. > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Here are some of my thoughts on the correctness of this approach: The QMP data is json - ascii (utf-8?) - so using \x01 is fine as it won't be in the data stream. The dropping of any partial message works since the close and re-open of the QMP socket resets the data stream. There shouldn't be any partial data in vchan-socket-proxy to drop since the client side should not have sent any new data before it closed and re-opened. If there is partial data, we shouldn't send it into QEMU since the new QMP client isn't expecting any response it would generate. So that all seems okay. Regards, Jason
On Mon, Sep 05, 2022 at 03:50:18PM +0200, Marek Marczykowski-Górecki wrote: > + reconnect_marker_value = atoi(optarg); atoi() isn't great, if there's garbage it just return 0, which is validated below. Would there be value here in using strtol() which can detect input error? To at least help with maybe hard to debug issues? > + if (reconnect_marker_value < 0 || reconnect_marker_value > 255) { > + fprintf(stderr, "invalid argument for --reconnect-marker, " > + "must be a number between 0 and 255\n"); > + usage(argv); > + } > + break; > case '?': > usage(argv); > } > @@ -509,6 +549,15 @@ int main(int argc, char **argv) > ret = 1; > break; > } > + if (reconnect_marker_value != -1) { > + const char marker_buf[] = { reconnect_marker_value }; > + > + if (libxenvchan_write(state.ctrl, marker_buf, sizeof(marker_buf)) > + != sizeof(marker_buf)) { > + fprintf(stderr, "failed to send reconnect marker\n"); Is this an expected failure? If so, maybe adding "(ignored)" might be valuable to someone reading the logs? > + break; > + } > + } > if (data_loop(&state) != 0) > break; > /* don't reconnect if output was stdout */ Otherwise, the patch looks fine. Even if kept as is: Acked-by: Anthony PERARD <anthony.perard@citrix.com> Thanks,
On Wed, Dec 07, 2022 at 03:22:46PM +0000, Anthony PERARD wrote: > On Mon, Sep 05, 2022 at 03:50:18PM +0200, Marek Marczykowski-Górecki wrote: > > + reconnect_marker_value = atoi(optarg); > > atoi() isn't great, if there's garbage it just return 0, which is > validated below. > > Would there be value here in using strtol() which can detect input > error? To at least help with maybe hard to debug issues? I'll send v2 with this changed. > > + if (reconnect_marker_value < 0 || reconnect_marker_value > 255) { > > + fprintf(stderr, "invalid argument for --reconnect-marker, " > > + "must be a number between 0 and 255\n"); > > + usage(argv); > > + } > > + break; > > case '?': > > usage(argv); > > } > > @@ -509,6 +549,15 @@ int main(int argc, char **argv) > > ret = 1; > > break; > > } > > + if (reconnect_marker_value != -1) { > > + const char marker_buf[] = { reconnect_marker_value }; > > + > > + if (libxenvchan_write(state.ctrl, marker_buf, sizeof(marker_buf)) > > + != sizeof(marker_buf)) { > > + fprintf(stderr, "failed to send reconnect marker\n"); > > Is this an expected failure? If so, maybe adding "(ignored)" might be > valuable to someone reading the logs? No, it isn't, it may mean server is gone or other fatal communication error. The break below will terminate the process (after performing cleanup). > > + break; > > + } > > + } > > if (data_loop(&state) != 0) > > break; > > /* don't reconnect if output was stdout */ > > > Otherwise, the patch looks fine. Even if kept as is: > Acked-by: Anthony PERARD <anthony.perard@citrix.com> > > Thanks, > > -- > Anthony PERARD
diff --git a/tools/vchan/vchan-socket-proxy.c b/tools/vchan/vchan-socket-proxy.c index e1d959c6d15c..1e7defe9bae7 100644 --- a/tools/vchan/vchan-socket-proxy.c +++ b/tools/vchan/vchan-socket-proxy.c @@ -31,6 +31,7 @@ * One client is served at a time, clients needs to coordinate this themselves. */ +#define _GNU_SOURCE #include <stdlib.h> #include <stdio.h> #include <string.h> @@ -54,6 +55,9 @@ static void usage(char** argv) "\t-m, --mode=client|server - vchan connection mode (client by default)\n" "\t-s, --state-path=path - xenstore path where write \"running\" to \n" "\t at startup\n" + "\t-r, --reconnect-marker=value - send(client)/expect(server) a\n" + "\t single-byte marker to detect quick reconnects and\n" + "\t force reconnecting UNIX socket\n" "\t-v, --verbose - verbose logging\n" "\n" "client: client of a vchan connection, fourth parameter can be:\n" @@ -61,7 +65,7 @@ static void usage(char** argv) "\t whenever new connection is accepted;\n" "\t handle multiple _subsequent_ connections, until terminated\n" "\n" - "\tfile-no: except open FD of a socket in listen mode;\n" + "\tfile-no: expect open FD of a socket in listen mode;\n" "\t otherwise similar to socket-path\n" "\n" "\t-: open vchan connection immediately and pass the data\n" @@ -88,6 +92,7 @@ char outbuf[BUFSIZE]; int insiz = 0; int outsiz = 0; int verbose = 0; +int reconnect_marker_value = -1; struct vchan_proxy_state { struct libxenvchan *ctrl; @@ -291,6 +296,7 @@ int data_loop(struct vchan_proxy_state *state) int ret; int libxenvchan_fd; int max_fd; + bool just_connected = true; libxenvchan_fd = libxenvchan_fd_for_select(state->ctrl); for (;;) { @@ -368,8 +374,33 @@ int data_loop(struct vchan_proxy_state *state) exit(1); if (verbose) fprintf(stderr, "from-vchan: %.*s\n", ret, outbuf + outsiz); + if (reconnect_marker_value != -1) { + char *reconnect_found = + memrchr(outbuf + outsiz, reconnect_marker_value, ret); + if (just_connected && reconnect_found == outbuf + outsiz) { + /* skip reconnect marker at the very first byte of the data + * stream */ + memmove(outbuf + outsiz, outbuf + outsiz + 1, ret - 1); + ret -= 1; + } else if (reconnect_found) { + size_t newsiz = outbuf + outsiz + ret - reconnect_found - 1; + if (verbose) + fprintf(stderr, "reconnect marker found\n"); + /* discard everything before and including the reconnect + * marker */ + memmove(outbuf, reconnect_found + 1, newsiz); + outsiz = newsiz; + /* then handle it as the client had just disconnected */ + close(state->output_fd); + state->output_fd = -1; + close(state->input_fd); + state->input_fd = -1; + return 0; + } + } outsiz += ret; socket_wr(state->output_fd); + just_connected = false; } } return 0; @@ -385,6 +416,7 @@ static struct option options[] = { { "mode", required_argument, NULL, 'm' }, { "verbose", no_argument, NULL, 'v' }, { "state-path", required_argument, NULL, 's' }, + { "reconnect-marker", required_argument, NULL, 'r' }, { } }; @@ -421,6 +453,14 @@ int main(int argc, char **argv) case 's': state_path = optarg; break; + case 'r': + reconnect_marker_value = atoi(optarg); + if (reconnect_marker_value < 0 || reconnect_marker_value > 255) { + fprintf(stderr, "invalid argument for --reconnect-marker, " + "must be a number between 0 and 255\n"); + usage(argv); + } + break; case '?': usage(argv); } @@ -509,6 +549,15 @@ int main(int argc, char **argv) ret = 1; break; } + if (reconnect_marker_value != -1) { + const char marker_buf[] = { reconnect_marker_value }; + + if (libxenvchan_write(state.ctrl, marker_buf, sizeof(marker_buf)) + != sizeof(marker_buf)) { + fprintf(stderr, "failed to send reconnect marker\n"); + break; + } + } if (data_loop(&state) != 0) break; /* don't reconnect if output was stdout */
When vchan client reconnect quickly, the server may not notice it. This means, it won't reconnect the UNIX socket either. For QMP, it will prevent the client to see the QMP protocol handshake, and the communication will timeout. Solve the issue by sending in-band connect marker. Whenever server sees one (elsewhere than the first byte in the connection), handle it as a client had reconnected. The marker is a one byte, and the user need to choose something that doesn't appear in the data stream elsewhere. Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> --- tools/vchan/vchan-socket-proxy.c | 51 +++++++++++++++++++++++++++++++- 1 file changed, 50 insertions(+), 1 deletion(-)