From patchwork Sat Nov 11 17:45:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Grant Erickson X-Patchwork-Id: 13453154 Received: from mohas.pair.com (mohas.pair.com [209.68.5.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C1F6134D8 for ; Sat, 11 Nov 2023 17:45:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=nuovations.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=nuovations.com Authentication-Results: smtp.subspace.kernel.org; dkim=none Received: from mohas.pair.com (localhost [127.0.0.1]) by mohas.pair.com (Postfix) with ESMTP id 82A19730EE; Sat, 11 Nov 2023 12:45:53 -0500 (EST) Received: from [IPv6:2601:647:5a00:15c1:34e1:cabf:fe5f:4f18] (unknown [IPv6:2601:647:5a00:15c1:34e1:cabf:fe5f:4f18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mohas.pair.com (Postfix) with ESMTPSA id 26DAC73122; Sat, 11 Nov 2023 12:45:53 -0500 (EST) Precedence: bulk X-Mailing-List: connman@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\)) Subject: [PATCH v6 1/3] gweb: Rename 'parse_url'. From: Grant Erickson In-Reply-To: Date: Sat, 11 Nov 2023 09:45:52 -0800 Cc: Marcel Holtmann , Denis Kenzior Message-Id: References: To: connman@lists.linux.dev X-Mailer: Apple Mail (2.3608.120.23.2.4) X-Scanned-By: mailmunge 3.11 on 209.68.5.112 This renames 'parse_url' to 'parse_request_and_proxy_urls' to more accurately reflect its role and function in processing BOTH a request URL and, optionally, a proxy URL for the specified web request session. 
--- gweb/gweb.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gweb/gweb.c b/gweb/gweb.c index 97ff0ab21f69..b68ec123438f 100644 --- a/gweb/gweb.c +++ b/gweb/gweb.c @@ -1110,7 +1110,7 @@ static int create_transport(struct web_session *session) return 0; } -static int parse_url(struct web_session *session, +static int parse_request_and_proxy_urls(struct web_session *session, const char *url, const char *proxy) { char *scheme, *host, *port, *path; @@ -1300,7 +1300,7 @@ static guint do_request(GWeb *web, const char *url, if (!session) return 0; - if (parse_url(session, url, web->proxy) < 0) { + if (parse_request_and_proxy_urls(session, url, web->proxy) < 0) { free_session(session); return 0; } From patchwork Sat Nov 11 17:47:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Grant Erickson X-Patchwork-Id: 13453155 Received: from mohas.pair.com (mohas.pair.com [209.68.5.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AAB281BDEE for ; Sat, 11 Nov 2023 17:47:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=nuovations.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=nuovations.com Authentication-Results: smtp.subspace.kernel.org; dkim=none Received: from mohas.pair.com (localhost [127.0.0.1]) by mohas.pair.com (Postfix) with ESMTP id 0BA78730EE; Sat, 11 Nov 2023 12:47:23 -0500 (EST) Received: from [IPv6:2601:647:5a00:15c1:34e1:cabf:fe5f:4f18] (unknown [IPv6:2601:647:5a00:15c1:34e1:cabf:fe5f:4f18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mohas.pair.com (Postfix) with ESMTPSA id 7BBBE73106; Sat, 11 Nov 2023 12:47:22 -0500 (EST) Precedence: bulk X-Mailing-List: connman@lists.linux.dev List-Id: List-Subscribe: 
List-Unsubscribe:
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\))
Subject: [PATCH v6 2/3] gweb: Refactor 'parse_request_and_proxy_urls'.
From: Grant Erickson
In-Reply-To:
Date: Sat, 11 Nov 2023 09:47:21 -0800
Cc: Marcel Holtmann, Denis Kenzior
Message-Id:
References:
To: connman@lists.linux.dev
X-Mailer: Apple Mail (2.3608.120.23.2.4)
X-Scanned-By: mailmunge 3.11 on 209.68.5.112

Prior to this change, 'parse_request_and_proxy_urls' failed to
correctly handle RFC 2732-compliant URLs with bracketed IPv6 addresses
such as:

    http://[2001:db8:4006:812::200e]:8080/online/status.html

Such bracketing is necessary when using IPv6 addresses to disambiguate
the host component from the port component due to the presence of the
colon (':') in IPv6 addresses. As such, prior to this change, such
URLs resulted in the brackets and the IPv6 address being passed to
GResolv which, unsurprisingly, failed to successfully forward resolve
since the resulting host was neither a valid host name nor a valid
IPv6 address.

As a result, support for such RFC 2732-compliant bracketed IPv6
addresses has been added with this change which refactors the
previously-monolithic 'parse_request_and_proxy_urls' into several,
focused functions:

  * parse_request_and_proxy_urls
    - parse_request_url
      o parse_url_components
        + parse_url_scheme
        + parse_url_host_and_port
          * parse_url_host
          * parse_url_port
        + parse_url_path
    - parse_proxy_url

In particular, 'parse_url_host' is the new function responsible for
parsing the host and correctly handling one of seven possible
combinations of host and port, two of which include bracketed IPv6
addresses.

In addition, 'parse_url_host' will now return an error on an empty,
non-existent host and 'parse_url_port' will return an error on
invalid, out-of-range ports.
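For illustration only, the bracket handling described above can be sketched outside of gweb as follows. This is a minimal, standalone sketch under stated assumptions and not the code in this patch: 'split_host_and_port' is a hypothetical helper, it uses plain libc rather than GLib, and it assumes the scheme and path have already been removed so that only the "host[:port]" portion remains.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of gweb.c): split "host", "host:port",
 * "[v6]", or "[v6]:port" into a host string and a port. The host is
 * copied into 'host' (capacity 'host_size'); '*port' is set to -1
 * when no port is present.
 */
static int split_host_and_port(const char *authority,
				char *host, size_t host_size,
				long *port)
{
	const char *host_start = authority;
	const char *host_end;
	const char *rest;
	size_t host_length;
	long value;
	char *end;

	if (!authority || !host || !host_size || !port)
		return -EINVAL;

	if (*authority == '[') {
		/* Bracketed IPv6 literal: the host is what is inside. */
		host_start = authority + 1;
		host_end = strchr(host_start, ']');
		if (!host_end)
			return -EINVAL;
		rest = host_end + 1;
	} else {
		/* Host name or IPv4 literal: ends at ':' or end of string. */
		host_end = host_start + strcspn(host_start, ":");
		rest = host_end;
	}

	host_length = (size_t)(host_end - host_start);
	if (!host_length || host_length >= host_size)
		return -EINVAL;

	memcpy(host, host_start, host_length);
	host[host_length] = '\0';

	*port = -1;

	/* No port delimiter: done. */
	if (*rest == '\0')
		return 0;

	/* Require ':' followed by at least one digit. */
	if (*rest != ':' || !isdigit((unsigned char)rest[1]))
		return -EINVAL;

	errno = 0;
	value = strtol(rest + 1, &end, 10);
	if (*end != '\0')
		return -EINVAL;
	if (errno == ERANGE || value > 65535)
		return -ERANGE;

	*port = value;

	return 0;
}
```

Under those assumptions, "[2001:db8:4006:812::200e]:8080" yields the host "2001:db8:4006:812::200e" (without brackets, suitable for a resolver) and port 8080, while "example.com" yields port -1. The port is kept in a 'long' here so that the full [0, 65535] range is representable.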
--- gweb/gweb.c | 455 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 390 insertions(+), 65 deletions(-) diff --git a/gweb/gweb.c b/gweb/gweb.c index b68ec123438f..16980b0e6a4e 100644 --- a/gweb/gweb.c +++ b/gweb/gweb.c @@ -1110,101 +1110,426 @@ static int create_transport(struct web_session *session) return 0; } -static int parse_request_and_proxy_urls(struct web_session *session, - const char *url, const char *proxy) +static int parse_url_scheme(const char *url, size_t url_length, + const char **cursor, + char **scheme) { - char *scheme, *host, *port, *path; + static const char * const scheme_delimiter = "://"; + static const size_t scheme_delimiter_length = 3; + const char *result; + size_t remaining_length; + size_t scheme_length = 0; - scheme = g_strdup(url); - if (!scheme) + if (!url || !url_length || !cursor) return -EINVAL; - host = strstr(scheme, "://"); - if (host) { - *host = '\0'; - host += 3; + remaining_length = url_length - (size_t)(*cursor - url); + if (remaining_length) { + result = memmem(*cursor, + remaining_length, + scheme_delimiter, + scheme_delimiter_length); + if (result) { + scheme_length = (size_t)(result - *cursor); - if (strcasecmp(scheme, "https") == 0) { - session->port = 443; - session->flags |= SESSION_FLAG_USE_TLS; - } else if (strcasecmp(scheme, "http") == 0) { - session->port = 80; - } else { - g_free(scheme); + if (scheme) + *scheme = g_strndup(*cursor, scheme_length); + + *cursor += scheme_length + scheme_delimiter_length; + } else if (scheme) + *scheme = NULL; + } else if (scheme) + *scheme = NULL; + + return 0; +} + +static int parse_url_host(const char *url, size_t url_length, + const char **cursor, + char **host) +{ + static char port_delimiter = ':'; + static char path_delimiter = '/'; + size_t remaining_length; + size_t host_length = 0; + const char *result; + const char *opening_bracket; + const char *closing_bracket; + int err = 0; + + if (!url || !url_length || !cursor) + return -EINVAL; + + /* + * 
Since it's the easiest to detect, first rule out an IPv6 + * address. The only reliable way to do so is to search for the + * delimiting '[' and ']'. Searching for ':' may incorrectly yield + * one of the other forms above (for example, (2), (5), or (7)). + */ + remaining_length = url_length - (size_t)(*cursor - url); + + opening_bracket = memchr(*cursor, '[', remaining_length); + if (opening_bracket) { + /* + * We found an opening bracket; this might be an IPv6 + * address. Search for its peer closing bracket. + */ + remaining_length = url_length - (size_t)(opening_bracket - url); + + closing_bracket = memchr(opening_bracket, + ']', + remaining_length); + if (!closing_bracket) return -EINVAL; - } + + /* + * Assign the first character of the IPv6 address after the + * opening bracket up to, but not including, the closing + * bracket to the host name. + */ + host_length = closing_bracket - opening_bracket - 1; + + if (host_length && host) + *host = g_strndup(opening_bracket + 1, host_length); } else { - host = scheme; - session->port = 80; - } + /* + * At this point, we either have an IPv4 address or a host + * name, maybe with a port and maybe with a path. + * + * Whether we have a port or not, we definitively know where + * the IPv4 address or host name ends. If we have a port, it + * ends at the port delimiter, ':'. If we don't have a port, + * then it ends at the end of the string or at the path + * delimiter, if any. + */ + result = memchr(*cursor, port_delimiter, remaining_length); + + /* + * There was no port delimiter; attempt to find a path + * delimiter. + */ + if (!result) + result = memchr(*cursor, path_delimiter, remaining_length); + + /* + * Whether stopping at the port or path delimiter, if we had a + * result, the end of the host is the span from the cursor to + * that result. Otherwise, it is simply the remaining length + * of the string. 
+ */ + if (result) + host_length = result - *cursor; + else + host_length = remaining_length; - path = strchr(host, '/'); - if (path) - *(path++) = '\0'; + if (host_length && host) + *host = g_strndup(*cursor, host_length); + } - if (!proxy) - session->request = g_strdup_printf("/%s", path ? path : ""); + if (!host_length) + err = -EINVAL; else - session->request = g_strdup(url); + *cursor += host_length; + + return err; +} + +static int parse_url_port(const char *url, size_t url_length, + const char **cursor, + int16_t *port) +{ + static char port_delimiter = ':'; + static const size_t port_delimiter_length = 1; + const char *result; + size_t remaining_length; + size_t port_length = 0; + char *end; + unsigned long tmp_port; + + if (!url || !url_length || !cursor) + return -EINVAL; + + remaining_length = url_length - (size_t)(*cursor - url); + + result = memchr(*cursor, port_delimiter, remaining_length); + if (result) { + tmp_port = strtoul(result + port_delimiter_length, &end, 10); + if (tmp_port == ULONG_MAX) + return -ERANGE; + else if (tmp_port > UINT16_MAX) + return -ERANGE; + else if (result + port_delimiter_length == end) + return -EINVAL; - port = strrchr(host, ':'); - if (port) { - char *end; - int tmp = strtol(port + 1, &end, 10); + port_length = end - (result + port_delimiter_length); - if (*end == '\0') { - *port = '\0'; - session->port = tmp; + *cursor += port_length; + } else + tmp_port = -1; + + if (port) + *port = (int16_t)tmp_port; + + return 0; +} + +static int parse_url_host_and_port(const char *url, size_t url_length, + const char **cursor, + char **host, + int16_t *port) +{ + g_autofree char *temp_host = NULL; + int err = 0; + + if (!url || !url_length || !cursor) + return -EINVAL; + + /* Attempt to handle the host component. */ + + err = parse_url_host(url, url_length, cursor, &temp_host); + if (err != 0) + goto done; + + /* Attempt to handle the port component. 
*/ + + err = parse_url_port(url, url_length, cursor, port); + if (err != 0) + goto done; + + if (host) + *host = g_steal_pointer(&temp_host); + +done: + return err; +} + +static int parse_url_path(const char *url, size_t url_length, + const char **cursor, + char **path) +{ + static char path_delimiter = '/'; + static const size_t path_delimiter_length = 1; + const char *result; + size_t remaining_length; + size_t path_length = 0; + + if (!url || !url_length || !cursor) + return -EINVAL; + + remaining_length = url_length - (size_t)(*cursor - url); + + result = memchr(*cursor, path_delimiter, remaining_length); + if (result) { + path_length = url_length - + (size_t)(result + path_delimiter_length - url); + + if (path) + *path = g_strndup(result + path_delimiter_length, path_length); + + *cursor += path_length + path_delimiter_length; + } else if (path) + *path = NULL; + + return 0; +} + +static int parse_url_components(const char *url, + char **scheme, + char **host, + int16_t *port, + char **path) +{ + size_t total_length; + const char *p; + g_autofree char *temp_scheme = NULL; + g_autofree char *temp_host = NULL; + int err = 0; + + if (!url) + return -EINVAL; + + p = url; + + total_length = strlen(p); + if (!total_length) + return -EINVAL; + + /* Skip any leading space, if any. */ + + while (g_ascii_isspace(*p)) + p++; + + /* Attempt to handle the scheme component. */ + + err = parse_url_scheme(url, total_length, &p, &temp_scheme); + if (err != 0) + goto done; + + /* Attempt to handle the host component. */ + + err = parse_url_host_and_port(url, total_length, &p, &temp_host, port); + if (err != 0) + goto done; + + /* Attempt to handle the path component. 
*/ + + err = parse_url_path(url, total_length, &p, path); + if (err != 0) + goto done; + + if (scheme) + *scheme = g_steal_pointer(&temp_scheme); + + if (host) + *host = g_steal_pointer(&temp_host); + +done: + return err; +} + +static int parse_request_url(struct web_session *session, + const char *request_url, bool has_proxy_url) +{ + g_autofree char *scheme = NULL; + g_autofree char *host = NULL; + g_autofree char *path = NULL; + int16_t port = -1; + int err = 0; + + if (!session || !request_url) + return -EINVAL; + + /* Parse the request URL components. */ + + err = parse_url_components(request_url, + &scheme, + &host, + &port, + &path); + if (err != 0) + goto done; + + /* + * Handle the URL scheme, if any, for the session, defaulting to + * the "http" scheme and port 80. + */ + if (scheme) { + if (g_ascii_strcasecmp(scheme, "https") == 0) + session->port = 443; + else if (g_ascii_strcasecmp(scheme, "http") == 0) + session->port = 80; + else { + err = -EINVAL; + goto done; } + } else + session->port = 80; + + /* Handle the URL host and port, if any, for the session. */ - if (!proxy) + if (port != -1) { + session->port = port; + + if (!has_proxy_url) session->host = g_strdup(host); else - session->host = g_strdup_printf("%s:%u", host, tmp); + session->host = g_strdup_printf("%s:%u", host, port); } else session->host = g_strdup(host); - g_free(scheme); + /* Handle the URL path, if any, for the session. */ - if (!proxy) - return 0; + if (!has_proxy_url) + session->request = g_strdup_printf("/%s", path ? 
path : ""); + else + session->request = g_strdup(request_url); + +done: + return err; +} - scheme = g_strdup(proxy); - if (!scheme) +static int parse_proxy_url(struct web_session *session, const char *proxy_url) +{ + const char *p; + size_t proxy_length; + g_autofree char *scheme = NULL; + g_autofree char *host = NULL; + int16_t port = -1; + int err = 0; + + if (!session || !proxy_url) return -EINVAL; - host = strstr(proxy, "://"); - if (host) { - *host = '\0'; - host += 3; + /* + * Parse the proxy URL scheme, host, and port, the only three + * components we care about. + */ + p = proxy_url; + proxy_length = strlen(p); + + err = parse_url_scheme(proxy_url, + proxy_length, + &p, + &scheme); + if (err != 0) + goto done; + + err = parse_url_host_and_port(proxy_url, + proxy_length, + &p, + &host, + &port); + if (err != 0) + goto done; + + /* + * Handle the proxy URL scheme, if any, for the session. Only + * "http" is allowed. + */ + if (scheme && g_ascii_strcasecmp(scheme, "http") != 0) { + err = -EINVAL; + goto done; + } - if (strcasecmp(scheme, "http") != 0) { - g_free(scheme); - return -EINVAL; - } - } else - host = scheme; + /* + * Handle the proxy URL host and port for the session. 
+ */ + if (host) + session->address = g_steal_pointer(&host); - path = strchr(host, '/'); - if (path) - *(path++) = '\0'; + if (port != -1) + session->port = port; - port = strrchr(host, ':'); - if (port) { - char *end; - int tmp = strtol(port + 1, &end, 10); +done: + return err; +} - if (*end == '\0') { - *port = '\0'; - session->port = tmp; - } - } +static int parse_request_and_proxy_urls(struct web_session *session, + const char *url, const char *proxy) +{ + const bool has_proxy_url = (proxy != NULL); + int err = 0; - session->address = g_strdup(host); + if (!session || !url) + return -EINVAL; - g_free(scheme); + /* Parse and handle the request URL */ - return 0; + err = parse_request_url(session, url, has_proxy_url); + if (err != 0) + goto done; + + if (!has_proxy_url) + goto done; + + /* Parse and handle the proxy URL */ + + err = parse_proxy_url(session, proxy); + if (err != 0) + goto done; + +done: + return err; } static void handle_resolved_address(struct web_session *session) From patchwork Sat Nov 11 17:47:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Grant Erickson X-Patchwork-Id: 13453156 Received: from mohas.pair.com (mohas.pair.com [209.68.5.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 71E80134D8 for ; Sat, 11 Nov 2023 17:47:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=nuovations.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=nuovations.com Authentication-Results: smtp.subspace.kernel.org; dkim=none Received: from mohas.pair.com (localhost [127.0.0.1]) by mohas.pair.com (Postfix) with ESMTP id 9CFF673105; Sat, 11 Nov 2023 12:47:48 -0500 (EST) Received: from [IPv6:2601:647:5a00:15c1:34e1:cabf:fe5f:4f18] (unknown [IPv6:2601:647:5a00:15c1:34e1:cabf:fe5f:4f18]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mohas.pair.com (Postfix) with ESMTPSA id 26B7B7310E; Sat, 11 Nov 2023 12:47:48 -0500 (EST) Precedence: bulk X-Mailing-List: connman@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\)) Subject: [PATCH v6 3/3] gweb: Add documentation to URL parsing functions. From: Grant Erickson In-Reply-To: Date: Sat, 11 Nov 2023 09:47:47 -0800 Cc: Marcel Holtmann , Denis Kenzior Message-Id: <6028B970-B14E-420E-82F8-81F58171299C@nuovations.com> References: To: connman@lists.linux.dev X-Mailer: Apple Mail (2.3608.120.23.2.4) X-Scanned-By: mailmunge 3.11 on 209.68.5.112 This adds documentation to the following URL parsing functions: * parse_request_and_proxy_urls - parse_request_url o parse_url_components + parse_url_scheme + parse_url_host_and_port * parse_url_host * parse_url_port + parse_url_path - parse_proxy_url --- gweb/gweb.c | 398 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 398 insertions(+) diff --git a/gweb/gweb.c b/gweb/gweb.c index 16980b0e6a4e..65c075d2128e 100644 --- a/gweb/gweb.c +++ b/gweb/gweb.c @@ -1110,6 +1110,40 @@ static int create_transport(struct web_session *session) return 0; } +/** + * @brief + * Attempt to parse the scheme component from a URL. + * + * This attempts to parse the scheme component from the specified URL + * of the provided length at the specified cursor point in the + * URL. If provided, the parsed scheme is copied and + * assigned to @a scheme. + * + * @note + * The caller is responsible for deallocating the memory assigned + * to @a *scheme, if provided, on success. + * + * @param[in] url A pointer to the immutable string, + * of length @a url_length, from + * which to parse the scheme + * component. + * @param[in] url_length The length, in bytes, of @a url. 
+ * @param[in,out] cursor A pointer to the current parsing + * position within @a url at which to + * start parsing the scheme. On + * success, this is updated to the + * first byte past the parsed scheme. + * @param[in,out] scheme An optional pointer to storage to + * assign a copy of the parsed scheme + * on success. + * + * @retval 0 If successful. + * @retval -EINVAL If @a url was null, @a url_length was zero, or @a + * cursor was null. + * + * @sa parse_url_components + * + */ static int parse_url_scheme(const char *url, size_t url_length, const char **cursor, char **scheme) @@ -1144,6 +1178,52 @@ static int parse_url_scheme(const char *url, size_t url_length, return 0; } +/** + * @brief + * Attempt to parse the host component from a URL. + * + * This attempts to parse the host component from the specified + * URL of the provided length at the specified cursor point in the + * URL. If provided, the parsed host is copied and assigned to @a + * host. + * + * Compliant with RFC 2732, the format of the host component of the + * URL may be one of the following: + * + * 1. "[\<IPv6 address>]" + * 2. "[\<IPv6 address>]:<port>" + * 4. "\<IPv4 address>" + * 5. "\<IPv4 address>:<port>" + * 6. "\<host name>" + * 7. "\<host name>:<port>" + * + * @note + * The caller is responsible for deallocating the memory assigned + * to @a *host, if provided, on success. + * + * @param[in] url A pointer to the immutable string, + * of length @a url_length, from + * which to parse the host + * component. + * @param[in] url_length The length, in bytes, of @a url. + * @param[in,out] cursor A pointer to the current parsing + * position within @a url at which to + * start parsing the host. On + * success, this is updated to the + * first byte past the parsed host. + * @param[in,out] host An optional pointer to storage to + * assign a copy of the parsed host + * on success. + * + * @retval 0 If successful. + * @retval -EINVAL If @a url was null, @a url_length was zero, @a + * cursor was null, or if the host portion of @a + * url is malformed. 
+ * + * @sa parse_url_host_and_port + * @sa parse_url_components + * + */ static int parse_url_host(const char *url, size_t url_length, const char **cursor, char **host) @@ -1234,6 +1314,51 @@ static int parse_url_host(const char *url, size_t url_length, return err; } +/** + * @brief + * Attempt to parse the port component from a URL. + * + * This attempts to parse the port component from the specified URL + * of the provided length at the specified cursor point in the + * URL. If provided, the parsed port is assigned to @a port. + * + * Compliant with RFC 2732, the format of the host component of the + * URL may be one of the following: + * + * 1. "[\<IPv6 address>]" + * 2. "[\<IPv6 address>]:<port>" + * 4. "\<IPv4 address>" + * 5. "\<IPv4 address>:<port>" + * 6. "\<host name>" + * 7. "\<host name>:<port>" + * + * @param[in] url A pointer to the immutable string, + * of length @a url_length, from + * which to parse the port + * component. + * @param[in] url_length The length, in bytes, of @a url. + * @param[in,out] cursor A pointer to the current parsing + * position within @a url at which to + * start parsing the port. On + * success, this is updated to the + * first byte past the parsed port. + * @param[in,out] port An optional pointer to storage to + * assign the parsed port on + * success. On failure or absence of + * a port to parse, this is assigned + * -1. + * + * @retval 0 If successful. + * @retval -EINVAL If @a url was null, @a url_length was zero, @a + * cursor was null, or if there were no characters + * to parse after the port delimiter (':'). + * @retval -ERANGE If the parsed port was outside of the range [0, + * 65535], inclusive. + * + * @sa parse_url_host_and_port + * @sa parse_url_components + * + */ static int parse_url_port(const char *url, size_t url_length, const char **cursor, int16_t *port) @@ -1273,6 +1398,62 @@ static int parse_url_port(const char *url, size_t url_length, return 0; } +/** + * @brief + * Attempt to parse the host and port components from a URL. 
+ * + * This attempts to parse the host and port components from the + * specified URL of the provided length at the specified cursor point + * in the URL. If provided, the parsed host is copied and assigned to + * @a host and, if provided, the parsed port is assigned to @a port. + * + * Compliant with RFC 2732, the format of the host component of the + * URL may be one of the following: + * + * 1. "[\<IPv6 address>]" + * 2. "[\<IPv6 address>]:<port>" + * 4. "\<IPv4 address>" + * 5. "\<IPv4 address>:<port>" + * 6. "\<host name>" + * 7. "\<host name>:<port>" + * + * @note + * The caller is responsible for deallocating the memory assigned + * to @a *host, if provided, on success. + * + * @param[in] url A pointer to the immutable string, + * of length @a url_length, from + * which to parse the host and port + * components. + * @param[in] url_length The length, in bytes, of @a url. + * @param[in,out] cursor A pointer to the current parsing + * position within @a url at which to + * start parsing the host and + * port. On success, this is updated + * to the first byte past the parsed + * host or port, if present. + * @param[in,out] host An optional pointer to storage to + * assign a copy of the parsed host + * on success. + * @param[in,out] port An optional pointer to storage to + * assign the parsed port on + * success. On failure or absence of + * a port to parse, this is assigned + * -1. + * + * @retval 0 If successful. + * @retval -EINVAL If @a url was null, @a url_length was zero, @a + * cursor was null, if the host portion of @a url + * is malformed, or if there were no characters to + * parse after the port delimiter (':'). + * @retval -ERANGE If the parsed port was outside of the range [0, + * 65535], inclusive. + * + * @sa parse_url_host + * @sa parse_url_port + * @sa parse_url_components + * + */ static int parse_url_host_and_port(const char *url, size_t url_length, const char **cursor, char **host, @@ -1303,6 +1484,40 @@ done: return err; } +/** + * @brief + * Attempt to parse the path component from a URL. 
+ * + * This attempts to parse the path component from the specified + * URL of the provided length at the specified cursor point in the + * URL. If provided, the parsed path is copied and assigned to @a + * path. + * + * @note + * The caller is responsible for deallocating the memory assigned + * to @a *path, if provided, on success. + * + * @param[in] url A pointer to the immutable string, + * of length @a url_length, from + * which to parse the path + * component. + * @param[in] url_length The length, in bytes, of @a url. + * @param[in,out] cursor A pointer to the current parsing + * position within @a url at which to + * start parsing the path. On + * success, this is updated to the + * first byte past the parsed path. + * @param[in,out] path An optional pointer to storage to + * assign a copy of the parsed path + * on success. + * + * @retval 0 If successful. + * @retval -EINVAL If @a url was null, @a url_length was zero, or @a + * cursor was null. + * + * @sa parse_url_components + * + */ static int parse_url_path(const char *url, size_t url_length, const char **cursor, char **path) @@ -1333,6 +1548,61 @@ static int parse_url_path(const char *url, size_t url_length, return 0; } +/** + * @brief + * Attempt to parse the scheme, host, port, and path components + * from a URL. + * + * This attempts to parse the scheme, host, port, and path components + * from the specified URL. If provided, the parsed scheme, host and + * path are copied and assigned to @a scheme, @a host, and @a path, + * respectively, and the parsed port is assigned to @a port. + * + * Compliant with RFC 2732, the format of the host component of the + * URL may be one of the following: + * + * 1. "[\<IPv6 address>]" + * 2. "[\<IPv6 address>]:<port>" + * 4. "\<IPv4 address>" + * 5. "\<IPv4 address>:<port>" + * 6. "\<host name>" + * 7. "\<host name>:<port>" + * + * @param[in] url A pointer to the immutable null- + * terminated C string from which to + * parse the scheme, host, port, and + * path components. 
+ * @param[in,out] scheme An optional pointer to storage to + * assign a copy of the parsed scheme + * on success. + * @param[in,out] host An optional pointer to storage to + * assign a copy of the parsed host + * on success. + * @param[in,out] port An optional pointer to storage to + * assign the parsed port on + * success. On failure or absence of + * a port to parse, this is assigned + * -1. + * @param[in,out] path An optional pointer to storage to + * assign a copy of the parsed path + * on success. + * + * @retval 0 If successful. + * @retval -EINVAL If @a url was null or of zero length, if + * the host portion of @a url is malformed, or if + * there were no characters to parse after the port + * delimiter (':'). + * @retval -ERANGE If the parsed port was outside of the range [0, + * 65535], inclusive. + * + * @sa parse_url_scheme_with_default + * @sa parse_url_scheme + * @sa parse_url_host + * @sa parse_url_port + * @sa parse_url_host_and_port + * @sa parse_url_path + * + */ static int parse_url_components(const char *url, char **scheme, char **host, @@ -1387,6 +1657,46 @@ done: return err; } +/** + * @brief + * Attempt to parse the request URL for the web request session. + * + * This attempts to parse the specified request URL for the specified + * web request session. From the request URL, the scheme is parsed, + * mapped and assigned to the @a session port field and the host and + * path are parsed, copied, and assigned to the host and request + * fields, respectively. + * + * Compliant with RFC 2732, the format of the host component of the + * request and proxy URLs may be one of the following: + * + * 1. "[\<IPv6 address>]" + * 2. "[\<IPv6 address>]:<port>" + * 4. "\<IPv4 address>" + * 5. "\<IPv4 address>:<port>" + * 6. "\<host name>" + * 7. "\<host name>:<port>" + * + * @note + * The caller is responsible for deallocating the memory assigned + * to the @a session host and request fields. 
+ * + * @param[in,out] session A pointer to the mutable web session + * request object to be populated from + * @a request_url. On + * success, the session port, host, + * and request fields will be + * populated from the parsed request URL. + * @param[in] request_url A pointer to the immutable null- + * terminated C string containing the + * request URL to parse. + * + * @retval 0 If successful. + * @retval -EINVAL If @a request_url was not a valid URL. + * + * @sa parse_url_components + * + */ static int parse_request_url(struct web_session *session, const char *request_url, bool has_proxy_url) { @@ -1448,6 +1758,46 @@ done: return err; } +/** + * @brief + * Attempt to parse the proxy URL for the web request session. + * + * This attempts to parse the specified proxy URL for the specified + * web request session. From the proxy URL, the port component is + * parsed and assigned to the @a session port field and the host + * component is parsed, copied, and assigned to the address field. + * + * Compliant with RFC 2732, the format of the host component of the + * request and proxy URLs may be one of the following: + * + * 1. "[\<IPv6 address>]" + * 2. "[\<IPv6 address>]:<port>" + * 4. "\<IPv4 address>" + * 5. "\<IPv4 address>:<port>" + * 6. "\<host name>" + * 7. "\<host name>:<port>" + * + * @note + * The caller is responsible for deallocating the memory assigned + * to the @a session address field. + * + * @param[in,out] session A pointer to the mutable web session + * request object to be populated from + * @a proxy_url. On + * success, the session port and address + * fields will be populated from the + * parsed proxy URL. + * @param[in] proxy_url A pointer to the immutable null- + * terminated C string containing the + * web proxy URL to parse. + * + * @retval 0 If successful. + * @retval -EINVAL If @a proxy_url was not a valid URL. 
+ * + * @sa parse_url_scheme + * @sa parse_url_host_and_port + * + */ static int parse_proxy_url(struct web_session *session, const char *proxy_url) { const char *p; @@ -1504,6 +1854,54 @@ done: return err; } +/** + * @brief + * Attempt to parse the request and proxy URLs for the web request + * session. + * + * This attempts to parse the specified request and optional proxy + * URL for the specified web request session. From the request URL, + * the scheme is parsed, mapped and assigned to the @a session port + * field and the host and path are parsed, copied, and assigned to + * the host and request fields, respectively. From the proxy URL, if + * present, the port component is parsed and assigned to the @a + * session port field and the host component is parsed, copied, and + * assigned to the address field. + * + * Compliant with RFC 2732, the format of the host component of the + * request and proxy URLs may be one of the following: + * + * 1. "[\<IPv6 address>]" + * 2. "[\<IPv6 address>]:<port>" + * 4. "\<IPv4 address>" + * 5. "\<IPv4 address>:<port>" + * 6. "\<host name>" + * 7. "\<host name>:<port>" + * + * @note + * The caller is responsible for deallocating the memory assigned + * to the @a session host, request, and address fields. + * + * @param[in,out] session A pointer to the mutable web session request + * object to be populated from @a url and, + * if provided, @a proxy. On success, the + * session port, host, request, and address + * fields will be populated from the parsed + * request and/or proxy URLs. + * @param[in] url A pointer to the immutable null-terminated + * C string containing the request URL to + * parse. + * @param[in] proxy An optional pointer to the immutable null- + * terminated C string containing the web + * proxy URL, if any, to parse. + * + * @retval 0 If successful. + * @retval -EINVAL If @a url was not a valid URL. + * + * @sa parse_request_url + * @sa parse_proxy_url + * + */ static int parse_request_and_proxy_urls(struct web_session *session, const char *url, const char *proxy) {