PEP 594 has cgi module scheduled for deprecation in Python 3.8 (#1649)

* PEP 594 has cgi module scheduled for deprecation in Python 3.8. Reimplement
cgi.parse_header in Sanic. The new implementation is much faster than either
cgi.parse_header or equivalent werkzeug.parse_options_header, and unlike the
two, handles also quoted values with semicolons or \" in them.

* Fix string escape.

* Useless linter complaints.

* More linter issues

* Add return type hint.

* Do not support quoted-pair escapes.

- Improved documentation and renamed the function more aptly as it only seems
  to apply to content-type and content-disposition headers.

* Unquote filenames also in normal mode.

* Add tests for headers. Adapted from CPython parse_header tests with changes on the final test.

* Linter

* Revert "Unquote filenames also in normal mode."

This reverts commit bf0d502bcd.

* Improved parse_content_header and added tests with Firefox and Chrome.

- Unescaping of quotes moved to parse_content_header because it affects all fields,
  not just filenames.
- It is impossible to handle all cases correctly but the current heuristics should
  suffice well for typical cases and beyond.
- Added comparisons with cgi.parse_header and werkzeug.parse_options_header.

* Updated comments as well.
This commit is contained in:
L. Kärkkäinen
2019-08-27 16:30:23 +03:00
committed by Stephen Sadowski
parent 228a31ee0a
commit 2011f3a0b2
3 changed files with 97 additions and 3 deletions

57
tests/test_headers.py Normal file
View File

@@ -0,0 +1,57 @@
import pytest
from sanic import headers
@pytest.mark.parametrize(
"input, expected",
[
("text/plain", ("text/plain", {})),
("text/vnd.just.made.this.up ; ", ("text/vnd.just.made.this.up", {})),
("text/plain;charset=us-ascii", ("text/plain", {"charset": "us-ascii"})),
('text/plain ; charset="us-ascii"', ("text/plain", {"charset": "us-ascii"})),
(
'text/plain ; charset="us-ascii"; another=opt',
("text/plain", {"charset": "us-ascii", "another": "opt"})
),
(
'attachment; filename="silly.txt"',
("attachment", {"filename": "silly.txt"})
),
(
'attachment; filename="strange;name"',
("attachment", {"filename": "strange;name"})
),
(
'attachment; filename="strange;name";size=123;',
("attachment", {"filename": "strange;name", "size": "123"})
),
(
'form-data; name="files"; filename="fo\\"o;bar\\"',
('form-data', {'name': 'files', 'filename': 'fo"o;bar\\'})
# cgi.parse_header:
# ('form-data', {'name': 'files', 'filename': 'fo"o;bar\\'})
# werkzeug.parse_options_header:
# ('form-data', {'name': 'files', 'filename': '"fo\\"o', 'bar\\"': None})
),
# <input type=file name="foo&quot;;bar\"> with Unicode filename!
(
# Chrome:
# Content-Disposition: form-data; name="foo%22;bar\"; filename="😀"
'form-data; name="foo%22;bar\\"; filename="😀"',
('form-data', {'name': 'foo";bar\\', 'filename': '😀'})
# cgi: ('form-data', {'name': 'foo%22;bar"; filename="😀'})
# werkzeug: ('form-data', {'name': 'foo%22;bar"; filename='})
),
(
# Firefox:
# Content-Disposition: form-data; name="foo\";bar\"; filename="😀"
'form-data; name="foo\\";bar\\"; filename="😀"',
('form-data', {'name': 'foo";bar\\', 'filename': '😀'})
# cgi: ('form-data', {'name': 'foo";bar"; filename="😀'})
# werkzeug: ('form-data', {'name': 'foo";bar"; filename='})
),
]
)
def test_parse_headers(input, expected):
assert headers.parse_content_header(input) == expected