sanic/sanic/headers.py

183 lines
6.6 KiB
Python
Raw Normal View History

import re
from typing import Any, Dict, Iterable, List, Optional, Tuple, Union
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
from urllib.parse import unquote
HeaderIterable = Iterable[Tuple[str, Any]] # Values convertible to str
Options = Dict[str, Union[int, str]] # key=value fields in various headers
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
OptionsIterable = Iterable[Tuple[str, str]] # May contain duplicate keys
_token, _quoted = r"([\w!#$%&'*+\-.^_`|~]+)", r'"([^"]*)"'
_param = re.compile(fr";\s*{_token}=(?:{_token}|{_quoted})", re.ASCII)
_firefox_quote_escape = re.compile(r'\\"(?!; |\s*$)')
_ipv6 = "(?:[0-9A-Fa-f]{0,4}:){2,7}[0-9A-Fa-f]{0,4}"
_ipv6_re = re.compile(_ipv6)
_host_re = re.compile(
r"((?:\[" + _ipv6 + r"\])|[a-zA-Z0-9.\-]{1,253})(?::(\d{1,5}))?"
)
# RFC's quoted-pair escapes are mostly ignored by browsers. Chrome, Firefox and
# curl all have different escaping, that we try to handle as well as possible,
# even though no client espaces in a way that would allow perfect handling.
# For more information, consult ../tests/test_requests.py
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
def parse_content_header(value: str) -> Tuple[str, Options]:
"""Parse content-type and content-disposition header values.
E.g. 'form-data; name=upload; filename=\"file.txt\"' to
('form-data', {'name': 'upload', 'filename': 'file.txt'})
Mostly identical to cgi.parse_header and werkzeug.parse_options_header
but runs faster and handles special characters better. Unescapes quotes.
"""
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
value = _firefox_quote_escape.sub("%22", value)
pos = value.find(";")
if pos == -1:
options: Dict[str, Union[int, str]] = {}
else:
options = {
m.group(1).lower(): m.group(2) or m.group(3).replace("%22", '"')
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
for m in _param.finditer(value[pos:])
}
value = value[:pos]
return value.strip().lower(), options
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
# https://tools.ietf.org/html/rfc7230#section-3.2.6 and
# https://tools.ietf.org/html/rfc7239#section-4
# This regex is for *reversed* strings because that works much faster for
# right-to-left matching than the other way around. Be wary that all things are
# a bit backwards! _rparam matches forwarded pairs alike ";key=value"
_rparam = re.compile(f"(?:{_token}|{_quoted})={_token}\\s*($|[;,])", re.ASCII)
def parse_forwarded(headers, config) -> Optional[Options]:
"""Parse RFC 7239 Forwarded headers.
The value of `by` or `secret` must match `config.FORWARDED_SECRET`
:return: dict with keys and values, or None if nothing matched
"""
header = headers.getall("forwarded", None)
secret = config.FORWARDED_SECRET
if header is None or not secret:
return None
header = ",".join(header) # Join multiple header lines
if secret not in header:
return None
# Loop over <separator><key>=<value> elements from right to left
sep = pos = None
options: List[Tuple[str, str]] = []
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
found = False
for m in _rparam.finditer(header[::-1]):
# Start of new element? (on parser skips and non-semicolon right sep)
if m.start() != pos or sep != ";":
# Was the previous element (from right) what we wanted?
if found:
break
# Clear values and parse as new element
del options[:]
pos = m.end()
val_token, val_quoted, key, sep = m.groups()
key = key.lower()[::-1]
val = (val_token or val_quoted.replace('"\\', '"'))[::-1]
options.append((key, val))
if key in ("secret", "by") and val == secret:
found = True
# Check if we would return on next round, to avoid useless parse
if found and sep != ";":
break
# If secret was found, return the matching options in left-to-right order
return fwd_normalize(reversed(options)) if found else None
def parse_xforwarded(headers, config) -> Optional[Options]:
"""Parse traditional proxy headers."""
real_ip_header = config.REAL_IP_HEADER
proxies_count = config.PROXIES_COUNT
addr = real_ip_header and headers.get(real_ip_header)
if not addr and proxies_count:
assert proxies_count > 0
try:
# Combine, split and filter multiple headers' entries
forwarded_for = headers.getall(config.FORWARDED_FOR_HEADER)
proxies = [
p
for p in (
p.strip() for h in forwarded_for for p in h.split(",")
)
if p
]
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
addr = proxies[-proxies_count]
except (KeyError, IndexError):
pass
# No processing of other headers if no address is found
if not addr:
return None
def options():
yield "for", addr
for key, header in (
("proto", "x-scheme"),
("proto", "x-forwarded-proto"), # Overrides X-Scheme if present
("host", "x-forwarded-host"),
("port", "x-forwarded-port"),
("path", "x-forwarded-path"),
):
yield key, headers.get(header)
return fwd_normalize(options())
def fwd_normalize(fwd: OptionsIterable) -> Options:
"""Normalize and convert values extracted from forwarded headers."""
ret: Dict[str, Union[int, str]] = {}
Forwarded headers and otherwise improved proxy handling (#1638) * Added support for HTTP Forwarded header and combined parsing of other proxy headers. - Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded - parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found - parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation - This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers. * Use req.forwarded in req properties server_name, server_port, scheme and remote_addr. X-Scheme handling moved to parse_xforwarded. * Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used. * Update docstrings to incidate that forwarded header is used first. * Remove testing function. * Fix tests and linting. - One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect. - Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation). * Try to workaround buggy tools complaining about incorrect ordering of imports. * Cleanup forwarded processing, add comments. secret is now also returned. * Added tests, fixed quoted string handling, cleanup. * Further tests for full coverage. * Try'n make linter happy. * Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded. * Implement multiple headers support for X-Forwarded-For. - Previously only the first header was used, so this BUGFIX may affect functionality. * Bugfix for request.server_name: strip port and other parts. - request.server_name docs claim that it returns the hostname only (no port). - config.SERVER_NAME may be full URL, so strip scheme, port and path - HTTP Host and consequently forwarded Host may include port number, so strip that also for forwarded hosts (previously only done for HTTP Host). - Possible performance benefit of limiting to one split. * Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented). * Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used. * Heil lintnazi. * Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for. * Forwarded and Host header parsing improved. - request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses - forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")). - more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected * Fixed typo in docstring. * Added IPv6 address tests for Host header. * Fix regex. * Further tests and stricter forwarded handling. * Fix merge commit * Linter * Linter * Linter * Add to avoid re-using the variable. Make a few raw strings non-raw. * Remove unnecessary or * Updated docs (work in progress). * Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting. - Also cleanup and added comments * New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests. * Remove support for PROXIES_COUNT=-1. * Linter errors. - This is getting ridiculous: cannot fit an URL on one line, linter requires splitting the string literal! * Add support for by=_proxySecret, updated docs, updated tests. * Forwarded headers' semantics tuning. - Forwarded host is now preserved in original format - request.host now returns a forwarded host if available, else the Host header - Forwarded options are preserved in original order, and later keys override earlier ones - Forwarded path is automatically URL-unquoted - Forwarded 'by' and 'for' are omitted if their value is unknown - Tests modified accordingly - Cleanup and improved documentation * Add ASGI test. * Linter * Linter #2
2019-09-02 14:50:56 +01:00
for key, val in fwd:
if val is not None:
try:
if key in ("by", "for"):
ret[key] = fwd_normalize_address(val)
elif key in ("host", "proto"):
ret[key] = val.lower()
elif key == "port":
ret[key] = int(val)
elif key == "path":
ret[key] = unquote(val)
else:
ret[key] = val
except ValueError:
pass
return ret
def fwd_normalize_address(addr: str) -> str:
"""Normalize address fields of proxy headers."""
if addr == "unknown":
raise ValueError() # omit unknown value identifiers
if addr.startswith("_"):
return addr # do not lower-case obfuscated strings
if _ipv6_re.fullmatch(addr):
addr = f"[{addr}]" # bracket IPv6
return addr.lower()
def parse_host(host: str) -> Tuple[Optional[str], Optional[int]]:
"""Split host:port into hostname and port.
:return: None in place of missing elements
"""
m = _host_re.fullmatch(host)
if not m:
return None, None
host, port = m.groups()
return host.lower(), int(port) if port is not None else None
def format_http1(headers: HeaderIterable) -> bytes:
"""Convert a headers iterable into HTTP/1 header format.
- Outputs UTF-8 bytes where each header line ends with \\r\\n.
- Values are converted into strings if necessary.
"""
return "".join(f"{name}: {val}\r\n" for name, val in headers).encode()