[v6,0/3] gweb: refactor parse_url for IPv6 addresses.

Message ID D7CD4553-EAFB-4A13-8847-668F8E909424@nuovations.com (mailing list archive)

Message

Grant Erickson Nov. 11, 2023, 5:44 p.m. UTC
Prior to this change, 'parse_url' failed to correctly handle RFC 2732-
compliant URLs with bracketed IPv6 addresses such as:

    http://[2001:db8:4006:812::200e]:8080/online/status.html

Such bracketing is necessary because IPv6 addresses themselves contain
colons (':'), which would otherwise make the boundary between the host
and port components ambiguous. Prior to this change, such URLs resulted
in the brackets, together with the IPv6 address, being passed to GResolv
which, unsurprisingly, failed to resolve them since the resulting host
was neither a valid host name nor a valid IPv6 address.
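
To illustrate why a bracketed host cannot be handed to a resolver
unchanged, here is a minimal sketch that uses the POSIX 'inet_pton'
function in place of GResolv. The helper name 'is_ipv6_literal' is
hypothetical and is not part of gweb; the point is only that the
brackets must be stripped before the address parses as IPv6.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of gweb): return true if 'host' is a
 * valid IPv6 address literal according to inet_pton().
 */
bool is_ipv6_literal(const char *host)
{
	struct in6_addr addr;

	assert(host != NULL);

	/* inet_pton() returns 1 only when the text parses as an address. */
	return inet_pton(AF_INET6, host, &addr) == 1;
}
```

With this helper, "2001:db8:4006:812::200e" is accepted whereas
"[2001:db8:4006:812::200e]" is rejected, which matches the failure
described above.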

This change adds support for such RFC 2732-compliant bracketed IPv6
addresses by refactoring the previously-monolithic 'parse_url' into
several focused functions:

    * parse_request_and_proxy_urls
        - parse_request_url
            o parse_url_components
                + parse_url_scheme
                + parse_url_host_and_port
                    * parse_url_host
                    * parse_url_port
                + parse_url_path
        - parse_proxy_url

In particular, 'parse_url_host' is the new function responsible for
parsing the host and correctly handling each of the seven possible
combinations of host and port, two of which include bracketed IPv6
addresses.

In addition, 'parse_url_host' will now return an error on an empty or
non-existent host and 'parse_url_port' will return an error on invalid
or out-of-range ports.
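
As a rough illustration of the host and port handling described above,
the following is a minimal sketch and not the code from this series.
The function name 'split_host_port', its signature, the specific errno
values, and the treatment of port 0 as invalid are all assumptions made
for the example. It covers only a few common forms and does not attempt
to enumerate the seven host and port combinations handled by
'parse_url_host'.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch (not the gweb implementation): split an authority
 * such as "[2001:db8::1]:8080" or "example.com:80" into host and port.
 * Returns 0 on success or a negative errno value on failure. A missing
 * port leaves '*port' at 0 so that the caller can apply a default.
 */
int split_host_port(const char *authority, char *host, size_t host_size,
						uint16_t *port)
{
	const char *host_start = authority;
	const char *host_end;
	const char *rest;
	size_t host_len;
	unsigned long value;
	char *end;

	assert(authority != NULL && host != NULL && port != NULL);

	*port = 0;

	if (*authority == '[') {
		/* Bracketed IPv6 literal: the host lies between the brackets. */
		host_start = authority + 1;
		host_end = strchr(host_start, ']');
		if (host_end == NULL)
			return -EINVAL;
		rest = host_end + 1;
	} else {
		/* Host name or IPv4 literal: the host ends at the first colon. */
		host_end = strchr(authority, ':');
		if (host_end == NULL)
			host_end = authority + strlen(authority);
		rest = host_end;
	}

	host_len = (size_t)(host_end - host_start);
	if (host_len == 0)
		return -EINVAL;		/* empty host */
	if (host_len >= host_size)
		return -ENAMETOOLONG;

	memcpy(host, host_start, host_len);
	host[host_len] = '\0';

	if (*rest == '\0')
		return 0;		/* no port present */
	if (*rest != ':' || !isdigit((unsigned char)rest[1]))
		return -EINVAL;		/* junk after host or non-numeric port */

	errno = 0;
	value = strtoul(rest + 1, &end, 10);
	if (*end != '\0')
		return -EINVAL;		/* trailing characters after the port */
	if (errno != 0 || value == 0 || value > UINT16_MAX)
		return -ERANGE;		/* assumed valid range: 1 to 65535 */

	*port = (uint16_t)value;

	return 0;
}
```

Given "[2001:db8:4006:812::200e]:8080", this sketch yields the host
"2001:db8:4006:812::200e" and the port 8080, while an empty host such as
":80" or an out-of-range port such as "example.com:70000" is rejected.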

Grant Erickson (3):
  gweb: Rename 'parse_url'.
  gweb: Refactor 'parse_request_and_proxy_urls'.
  gweb: Add documentation to URL parsing functions.

 gweb/gweb.c | 855 ++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 789 insertions(+), 66 deletions(-)

Comments

Marcel Holtmann Nov. 11, 2023, 6:16 p.m. UTC | #1
Hi Grant,

all 3 patches have been applied.

Regards

Marcel