sanic/docs/sanic/config.md
L. Kärkkäinen 1e4b1c4d1a Forwarded headers and otherwise improved proxy handling (#1638)
* Added support for HTTP Forwarded header and combined parsing of other proxy headers.

- Accessible via request.forwarded that tries parse_forwarded and then parse_xforwarded
- parse_forwarded uses the Forwarded header, if config.FORWARDED_SECRET is provided and a matching header field is found
- parse_xforwarded uses X-Real-IP and X-Forwarded-* much alike the existing implementation
- This commit does not change existing request properties that still use the old code and won't make use of Forwarded headers.

* Use req.forwarded in req properties server_name, server_port, scheme and remote_addr.

X-Scheme handling moved to parse_xforwarded.

* Cleanup and fix req.server_port; no longer reports socket port if any forwards headers are used.

* Update docstrings to incidate that forwarded header is used first.

* Remove testing function.

* Fix tests and linting.

- One test removed due to change of semantics - no socket port will be used if any forwarded headers are in effect.
- Other tests augmented with X-Forwarded-For, to allow the header being tested take effect (shouldn't affect old implementation).

* Try to workaround buggy tools complaining about incorrect ordering of imports.

* Cleanup forwarded processing, add comments. secret is now also returned.

* Added tests, fixed quoted string handling, cleanup.

* Further tests for full coverage.

* Try'n make linter happy.

* Add support for multiple Forwarded headers. Unify parse_forwarded parameters with parse_xforwarded.

* Implement multiple headers support for X-Forwarded-For.

- Previously only the first header was used, so this BUGFIX may affect functionality.

* Bugfix for request.server_name: strip port and other parts.

- request.server_name docs claim that it returns the hostname only (no port).
- config.SERVER_NAME may be full URL, so strip scheme, port and path
- HTTP Host and consequently forwarded Host may include port number, so
  strip that also for forwarded hosts (previously only done for HTTP Host).
- Possible performance benefit of limiting to one split.

* Fallback to app.url_for and let it handle SERVER_NAME if defined (until a proper solution is implemented).

* Revise previous commit. Only fallback for full URL SERVER_NAMEs; allows host to be defined and proxied information still being used.

* Heil lintnazi.

* Modify testcase not to use underscores in URLs. Use hyphens which the spec allows for.

* Forwarded and Host header parsing improved.

- request.forwarded lowercases hosts, separates host:port into their own fields and lowercases addresses
- forwarded.parse_host helper function added and used for parsing all host-style headers (IPv6 cannot be simply split(":")).
- more tests fixed not to use underscores in hosts as those are no longer accepted and lead to the field being rejected

* Fixed typo in docstring.

* Added IPv6 address tests for Host header.

* Fix regex.

* Further tests and stricter forwarded handling.

* Fix merge commit

* Linter

* Linter

* Linter

* Add  to avoid re-using the  variable. Make a few raw strings non-raw.

* Remove unnecessary or

* Updated docs (work in progress).

* Enable REAL_IP_HEADER parsing irregardless of PROXIES_COUNT setting.

- Also cleanup and added comments

* New defaults for PROXIES_COUNT and REAL_IP_HEADER, updated tests.

* Remove support for PROXIES_COUNT=-1.

* Linter errors.

- This is getting ridiculous: cannot fit an URL on one line, linter requires
  splitting the string literal!

* Add support for by=_proxySecret, updated docs, updated tests.

* Forwarded headers' semantics tuning.

- Forwarded host is now preserved in original format
- request.host now returns a forwarded host if available, else the Host header
- Forwarded options are preserved in original order, and later keys override earlier ones
- Forwarded path is automatically URL-unquoted
- Forwarded 'by' and 'for' are omitted if their value is unknown
- Tests modified accordingly
- Cleanup and improved documentation

* Add ASGI test.

* Linter

* Linter #2
2019-09-02 08:50:56 -05:00

212 lines
11 KiB
Markdown

# Configuration
Any reasonably complex application will need configuration that is not baked into the actual code. Settings might be different for different environments or installations.
## Basics
Sanic holds the configuration in the `config` attribute of the application object. The configuration object is merely an object that can be modified either using dot-notation or like a dictionary:
```
app = Sanic('myapp')
app.config.DB_NAME = 'appdb'
app.config.DB_USER = 'appuser'
```
Since the config object actually is a dictionary, you can use its `update` method in order to set several values at once:
```
db_settings = {
'DB_HOST': 'localhost',
'DB_NAME': 'appdb',
'DB_USER': 'appuser'
}
app.config.update(db_settings)
```
In general the convention is to only have UPPERCASE configuration parameters. The methods described below for loading configuration only look for such uppercase parameters.
## Loading Configuration
There are several ways how to load configuration.
### From Environment Variables
Any variables defined with the `SANIC_` prefix will be applied to the sanic config. For example, setting `SANIC_REQUEST_TIMEOUT` will be loaded by the application automatically and fed into the `REQUEST_TIMEOUT` config variable. You can pass a different prefix to Sanic:
```python
app = Sanic(load_env='MYAPP_')
```
Then the above variable would be `MYAPP_REQUEST_TIMEOUT`. If you want to disable loading from environment variables you can set it to `False` instead:
```python
app = Sanic(load_env=False)
```
### From an Object
If there are a lot of configuration values and they have sensible defaults it might be helpful to put them into a module:
```
import myapp.default_settings
app = Sanic('myapp')
app.config.from_object(myapp.default_settings)
```
or also by path to config:
```
app = Sanic('myapp')
app.config.from_object('config.path.config.Class')
```
You could use a class or any other object as well.
### From a File
Usually you will want to load configuration from a file that is not part of the distributed application. You can load configuration from a file using `from_pyfile(/path/to/config_file)`. However, that requires the program to know the path to the config file. So instead you can specify the location of the config file in an environment variable and tell Sanic to use that to find the config file:
```
app = Sanic('myapp')
app.config.from_envvar('MYAPP_SETTINGS')
```
Then you can run your application with the `MYAPP_SETTINGS` environment variable set:
```
$ MYAPP_SETTINGS=/path/to/config_file python3 myapp.py
INFO: Goin' Fast @ http://0.0.0.0:8000
```
The config files are regular Python files which are executed in order to load them. This allows you to use arbitrary logic for constructing the right configuration. Only uppercase variables are added to the configuration. Most commonly the configuration consists of simple key value pairs:
```
# config_file
DB_HOST = 'localhost'
DB_NAME = 'appdb'
DB_USER = 'appuser'
```
## Builtin Configuration Values
Out of the box there are just a few predefined values which can be overwritten when creating the application.
| Variable | Default | Description |
| ------------------------- | ----------------- | --------------------------------------------------------------------------- |
| REQUEST_MAX_SIZE | 100000000 | How big a request may be (bytes) |
| REQUEST_BUFFER_QUEUE_SIZE | 100 | Request streaming buffer queue size |
| REQUEST_TIMEOUT | 60 | How long a request can take to arrive (sec) |
| RESPONSE_TIMEOUT | 60 | How long a response can take to process (sec) |
| KEEP_ALIVE | True | Disables keep-alive when False |
| KEEP_ALIVE_TIMEOUT | 5 | How long to hold a TCP connection open (sec) |
| GRACEFUL_SHUTDOWN_TIMEOUT | 15.0 | How long to wait to force close non-idle connection (sec) |
| ACCESS_LOG | True | Disable or enable access log |
| PROXIES_COUNT | -1 | The number of proxy servers in front of the app (e.g. nginx; see below) |
| FORWARDED_FOR_HEADER | "X-Forwarded-For" | The name of "X-Forwarded-For" HTTP header that contains client and proxy ip |
| REAL_IP_HEADER | "X-Real-IP" | The name of "X-Real-IP" HTTP header that contains real client ip |
### The different Timeout variables:
#### `REQUEST_TIMEOUT`
A request timeout measures the duration of time between the instant when a new open TCP connection is passed to the
Sanic backend server, and the instant when the whole HTTP request is received. If the time taken exceeds the
`REQUEST_TIMEOUT` value (in seconds), this is considered a Client Error so Sanic generates an `HTTP 408` response
and sends that to the client. Set this parameter's value higher if your clients routinely pass very large request payloads
or upload requests very slowly.
#### `RESPONSE_TIMEOUT`
A response timeout measures the duration of time between the instant the Sanic server passes the HTTP request to the
Sanic App, and the instant a HTTP response is sent to the client. If the time taken exceeds the `RESPONSE_TIMEOUT`
value (in seconds), this is considered a Server Error so Sanic generates an `HTTP 503` response and sends that to the
client. Set this parameter's value higher if your application is likely to have long-running process that delay the
generation of a response.
#### `KEEP_ALIVE_TIMEOUT`
##### What is Keep Alive? And what does the Keep Alive Timeout value do?
`Keep-Alive` is a HTTP feature introduced in `HTTP 1.1`. When sending a HTTP request, the client (usually a web browser application)
can set a `Keep-Alive` header to indicate the http server (Sanic) to not close the TCP connection after it has send the response.
This allows the client to reuse the existing TCP connection to send subsequent HTTP requests, and ensures more efficient
network traffic for both the client and the server.
The `KEEP_ALIVE` config variable is set to `True` in Sanic by default. If you don't need this feature in your application,
set it to `False` to cause all client connections to close immediately after a response is sent, regardless of
the `Keep-Alive` header on the request.
The amount of time the server holds the TCP connection open is decided by the server itself.
In Sanic, that value is configured using the `KEEP_ALIVE_TIMEOUT` value. By default, it is set to 5 seconds.
This is the same default setting as the Apache HTTP server and is a good balance between allowing enough time for
the client to send a new request, and not holding open too many connections at once. Do not exceed 75 seconds unless
you know your clients are using a browser which supports TCP connections held open for that long.
For reference:
```
Apache httpd server default keepalive timeout = 5 seconds
Nginx server default keepalive timeout = 75 seconds
Nginx performance tuning guidelines uses keepalive = 15 seconds
IE (5-9) client hard keepalive limit = 60 seconds
Firefox client hard keepalive limit = 115 seconds
Opera 11 client hard keepalive limit = 120 seconds
Chrome 13+ client keepalive limit > 300+ seconds
```
### Proxy configuration
When you use a reverse proxy server (e.g. nginx), the value of `request.ip` will contain ip of a proxy, typically `127.0.0.1`. Sanic may be configured to use proxy headers for determining the true client IP, available as `request.remote_addr`. The full external URL is also constructed from header fields if available.
Without proper precautions, a malicious client may use proxy headers to spoof its own IP. To avoid such issues, Sanic does not use any proxy headers unless explicitly enabled.
Services behind reverse proxies must configure `FORWARDED_SECRET`, `REAL_IP_HEADER` and/or `PROXIES_COUNT`.
#### Forwarded header
```
Forwarded: for="1.2.3.4"; proto="https"; host="yoursite.com"; secret="Pr0xy",
for="10.0.0.1"; proto="http"; host="proxy.internal"; by="_1234proxy"
```
* Set `FORWARDED_SECRET` to an identifier used by the proxy of interest.
The secret is used to securely identify a specific proxy server. Given the above header, secret `Pr0xy` would use the information on the first line and secret `_1234proxy` would use the second line. The secret must exactly match the value of `secret` or `by`. A secret in `by` must begin with an underscore and use only characters specified in [RFC 7239 section 6.3](https://tools.ietf.org/html/rfc7239#section-6.3), while `secret` has no such restrictions.
Sanic ignores any elements without the secret key, and will not even parse the header if no secret is set.
All other proxy headers are ignored once a trusted forwarded element is found, as it already carries complete information about the client.
#### Traditional proxy headers
```
X-Real-IP: 1.2.3.4
X-Forwarded-For: 1.2.3.4, 10.0.0.1
X-Forwarded-Proto: https
X-Forwarded-Host: yoursite.com
```
* Set `REAL_IP_HEADER` to `x-real-ip`, `true-client-ip`, `cf-connecting-ip` or other name of such header.
* Set `PROXIES_COUNT` to the number of entries expected in `x-forwarded-for` (name configurable via `FORWARDED_FOR_HEADER`).
If client IP is found by one of these methods, Sanic uses the following headers for URL parts:
* `x-forwarded-proto`, `x-forwarded-host`, `x-forwarded-port`, `x-forwarded-path` and if necessary, `x-scheme`.
#### Proxy config if using ...
* a proxy that supports `forwarded`: set `FORWARDED_SECRET` to the value that the proxy inserts in the header
* Apache Traffic Server: `CONFIG proxy.config.http.insert_forwarded STRING for|proto|host|by=_secret`
* NGHTTPX: `nghttpx --add-forwarded=for,proto,host,by --forwarded-for=ip --forwarded-by=_secret`
* NGINX: after [the official instructions](https://www.nginx.com/resources/wiki/start/topics/examples/forwarded/), add anywhere in your config:
proxy_set_header Forwarded "$proxy_add_forwarded;by=\"_$server_name\";proto=$scheme;host=\"$http_host\";path=\"$request_uri\";secret=_secret";
* a custom header with client IP: set `REAL_IP_HEADER` to the name of that header
* `x-forwarded-for`: set `PROXIES_COUNT` to `1` for a single proxy, or a greater number to allow Sanic to select the correct IP
* no proxies: no configuration required!
#### Changes in Sanic 19.9
Earlier Sanic versions had unsafe default settings. From 19.9 onwards proxy settings must be set manually, and support for negative PROXIES_COUNT has been removed.