Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexDefect in httpclient because of a header spanning multiple lines #19261

Closed
ghost opened this issue Dec 17, 2021 · 6 comments · Fixed by #22886
Closed

IndexDefect in httpclient because of a header spanning multiple lines #19261

ghost opened this issue Dec 17, 2021 · 6 comments · Fixed by #22886

Comments

@ghost
Copy link

ghost commented Dec 17, 2021

I was doing some scanning stuff and found a host that replies with one header that spans multiple lines. Chromium seems to be happy with it, but httpclient does an out of bounds access which results in IndexDefect.

Example

No usable-by-default Nim repro code yet, but here's the whole response from the server:

> GET / HTTP/1.1
> Host: [REDACTED]
> User-Agent: curl/7.80.0
> Accept: */*
> 
< HTTP/1.1 301 Moved Permanently
< Server: nginx
< Date: Fri, 17 Dec 2021 00:23:58 GMT
< Content-Type: text/html
< Content-Length: 178
< Connection: keep-alive
< Keep-Alive: timeout=5
< Location: ./login.html
< Content-Security-Policy: default-src 'self';
<                                     script-src 'self'
<                                                'unsafe-inline'
<                                                'unsafe-eval';
<                                     img-src 'self'
<                                             data:;
<                                     style-src 'self'
<                                               'unsafe-inline';
<                                     font-src 'self';
<                                     child-src 'self';
<                                     object-src 'none'
< X-Content-Type-Options: nosniff
< X-Frame-Options: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< Strict-Transport-Security: max-age=31536000; includeSubDomains
< 
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx</center>
</body>
</html>

Notice how Content-Security-Policy spans multiple lines.

Current Output

/home/dian/Projects/PrivWeb/scratchpad.nim(47) scratchpad
/home/dian/Things/nim/lib/pure/httpclient.nim(1188) getContent
/home/dian/Things/nim/lib/pure/httpclient.nim(1183) get
/home/dian/Things/nim/lib/pure/httpclient.nim(1110) request
/home/dian/Things/nim/lib/pure/httpclient.nim(1057) requestAux
/home/dian/Things/nim/lib/pure/httpclient.nim(846) parseResponse
/home/dian/Things/nim/lib/system/fatal.nim(53) sysFatal
Error: unhandled exception: index 53 not in 0 .. 52 [IndexDefect]

Expected Output

No error from httpclient, and it should parse that header into a single line, like Chromium:
image

Additional Information

$ nim -v
Nim Compiler Version 1.7.1 [Linux: amd64]
Compiled at 2021-12-17
Copyright (c) 2006-2021 by Andreas Rumpf

git hash: 0d0c249074d6a1041de16108dc247396efef5513
active boot switches: -d:release
@ghost ghost added the Standard Library label Dec 17, 2021
@ghost ghost changed the title IndexDefect in httpclient because of a weird header in response IndexDefect in httpclient because of a header spanning multiple lines Dec 17, 2021
@ghost
Copy link
Author

ghost commented Dec 17, 2021

Also, seems like that code uses parseUntil which says it returns 0 in case of an error, but I don't see anything related to that in the proc's body.

@ghost
Copy link
Author

ghost commented Dec 17, 2021

I'm trying to make a simple patch to try to fix it, but the data:; case is quite weird - it can be interpreted as a new header instead of an addition to the CSP header. And it's valid spec too - see https://content-security-policy.com/ :
data: - img-src 'self' data: - Allows loading resources via the data scheme (eg Base64 encoded images).

@ghost
Copy link
Author

ghost commented Dec 17, 2021

I think I have a solution, but it might break backwards compatibility if someone relied on this behaviour because httpclient wasn't enforcing the spec regarding HTTP headers.

From https://datatracker.ietf.org/doc/html/rfc7230#section-3.2:

     header-field   = field-name ":" OWS field-value OWS

     field-name     = token
     field-value    = *( field-content / obs-fold )
     field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
     field-vchar    = VCHAR / obs-text

     obs-fold       = CRLF 1*( SP / HTAB )
                    ; obsolete line folding
                    ; see Section 3.2.4

Definition of token:

   tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
    "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
   token = 1*tchar

The spec forbids spaces before the header name and after the header name before :. In fact, there was a security vulnerability related to this because Firefox allowed those types of headers, although I don't know if it applies to httpclient - https://www.mozilla.org/en-US/security/advisories/mfsa2006-33/.

I can make a more or less clean fix for this issue if I change the httpclient to follow the spec more strictly.

@ghost
Copy link
Author

ghost commented Dec 17, 2021

Also, from my manual testing, if there is leading whitespace before the header, then both Chromium and Firefox just append it to the previous one, so we should make httpclient do the same as that'll fix this issue.

The behaviour for whitespace after the header name before the colon differs though - Chromium parses it as a valid header while Firefox discards the header. So I'm not sure if we should enforce the spec like Firefox does, or be more lax like Chromium.

@avahe-kellenberger
Copy link
Contributor

In my opinion, we should enforce the spec.

@dom96
Copy link
Contributor

dom96 commented Dec 17, 2021

Yes, follow spec always :)

Araq added a commit that referenced this issue Nov 1, 2023
Continuation of #19262

Fixes #19261

The parsing code is still too lenient (e.g. it will happily parse header
names with spaces in them, which is outright invalid by the spec), but I
didn't want to touch it beyond the simple changes to make sure that
`std/httpclient` won't throw `IndexDefect`s like it does now on those
cases:
- Multiline header values
- No colon after the header name
- No value after the header name + colon

One question remains - should I keep `toCaseInsensitive` exported in
`httpcore` or just copy-paste the implementation?

---------

Co-authored-by: Andreas Rumpf <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants