Add support for customizing HTTP headers#1196
Conversation
There was a problem hiding this comment.
Thanks for the contribution! 🎉
I have a few primary concerns here:
- Although rare, it is valid to send a specific header more than once in a request. When there are headers with the same name their order may also matter.
- How does your implementation handle "reserved" headers that users should not override like
OriginorUser-Agent? - It seems less than ideal to send the same headers to all hosts, eg. if host A requires auth but host B does not, Pkl should not send a configured
Authorizationheader to both hosts. I think a prefix matching scheme like the one used for rewrites is necessary here to ensure the proper security controls are in place.
To this end, I think the Pkl API for this needs to be something along these lines:
headers: Mapping<HttpRewrite, Listing<Pair<String, String>>>This complicates the CLI a bit, of course, but maybe something like this is reasonable?
--http-header <prefix>='<header name>: <header value>'
|
Just a heads up, we want reviews from the entire core team because this feature affects the core language and has security implications. One member of the team is currently out on leave so it's going to be a little bit before we can fully review this, but we haven't forgotten about it! |
| val headerParts = header.split(":", limit = 2) | ||
| require(headerParts.size == 2) { "Header '$header' is not in 'name:value' format. " } | ||
| PPair(headerParts[0], headerParts[1]) |
There was a problem hiding this comment.
curl accepts this same syntax, but also accepts <name>:[ \t]+<value> and appears to strip the leading (but not trailing) whitespace from header values before sending the request. To avoid surprise for users of Pkl, it might be good to do the same. In theory, HTTP servers should be ignoring this leading (and trailing) whitespace when parsing headers so it shouldn't matter, but it might be best to avoid taking chances here.
I'm also wondering if it makes sense to do some validation of characters in headers here (and in the EvaluatorSettings API) to eagerly report invalid characters instead of lazily doing so when a matching request is made for the first time.
70161b3 to
c2632cc
Compare
bioball
left a comment
There was a problem hiding this comment.
Hey, thanks for the PR! I agree with @stackoverflow; it'd be good to have this discussed in a SPICE.
If you'd like to drive this feature, can you write one up? It should be submitted as PR to pkl-evolution.
Your SPICE should start by copy/pasting the template. Your SPICE number would be the next number after the latest proposed SPICE see (https://github.com/apple/pkl-evolution/pulls)
|
I will resume working on this PR after getting the SPICE merged! |
|
As a heads up: the SPICE can only be merged once this pr is merged and released on a new version. So you only need the SPICE to be approved. |
94962c3 to
ecf2d8b
Compare
|
@kyokuping was the close intentional? |
|
Sorry, accidentally closed the PR. 😢 |
|
Gotcha, feel free to re-open a new PR! |
|
Happy to see this, just went looking for http(s) auth support :D |
e8ea9f6 to
ffbf0cf
Compare
|
When will this be merged? Really looking forward to access private repos 🙏🏻 |
HT154
left a comment
There was a problem hiding this comment.
Hey @kyokuping would you mind rebasing this to resolve the conflict(s)?
I have some pretty minor feedback, but from a technical standpoint this looks pretty good. I'd like to try and ship this in 0.32 in a few months, so it'd be great if this was ready for another set of eyes by the time the core team's all back together in a few weeks!
| if (!headerName.equals(headerName.toLowerCase())) { | ||
| throw new IllegalArgumentException( | ||
| "HTTP header '%s' should be all lowercase".formatted(headerName)); | ||
| } |
There was a problem hiding this comment.
Is this validation actually necessary? Why would this be preferable to normalizing header names during CLI parsing?
There was a problem hiding this comment.
Based on the previous review on the SPICE, I updated the settings to enforce lowercase values and therefore expected the CLI to match this behavior. If we want to change this in the CLI, we should probably align the EvaluatorSettings implementation too. Which way do you prefer to go with?
There was a problem hiding this comment.
In the evaluator settings, "pre"-normalization is required to ensure the Mapping has unique keys. On the CLI, each instance of the option is parsed linearly, so the CLI could perform its own normalization without issue.
In general, I think it's best that we accept the broadest input possible for each input method (even if those are not consistent). It's very possible the other core folks could disagree, so I'm happy to put a pin in this until they're back over the next few weeks.
There was a problem hiding this comment.
Hey, had any chance discussing around this?
There was a problem hiding this comment.
We just chatted on this; we've settled on: we don't think it's necessary to require any special rules for header names when configured using CLI flags.
Co-authored-by: Jen Basch <jbasch94@gmail.com>
- Introduce pattern-based URL matches - Both accept a single and multiple values for header values - Move header append logic to RequestRewritingClient
Co-authored-by: Jen Basch <jbasch94@gmail.com>
| for ((stringPattern, header) in it) { | ||
| val headerRegex = Regex("""^(.+?):[ \t]*(.+)$""") |
There was a problem hiding this comment.
| for ((stringPattern, header) in it) { | |
| val headerRegex = Regex("""^(.+?):[ \t]*(.+)$""") | |
| val headerRegex = Regex("""^(.+?):[ \t]*(.+)$""") | |
| for ((stringPattern, header) in it) { |
Wasteful to recompile the Regex on every iteration. Even better to move it to the companion object.
| } | ||
| } | ||
|
|
||
| val HEADER_NAME_REGEX = Pattern.compile("^[a-zA-Z0-9!#$%&'*+-.^_`|~]+$") |
There was a problem hiding this comment.
This is duplicated verbatim from IoUtils. You can make it public there and use it here.
| } | ||
|
|
||
| val HEADER_NAME_REGEX = Pattern.compile("^[a-zA-Z0-9!#$%&'*+-.^_`|~]+$") | ||
| val HEADER_VALUE_REGEX = Pattern.compile("^[\\t\\u0020-\\u007E\\u0080-\\u00FF]*$") |
There was a problem hiding this comment.
Same here (duplicated from IoUtils).
| .multiple() | ||
| .toMap() | ||
|
|
||
| val httpHeaders: List<PPair<Pattern, List<PPair<String, String>>>> by |
There was a problem hiding this comment.
This is more of a nit, but: List<PPair<Pattern, List<PPair<String, String>>>> is a quite generic type that doesn't convey much information. It would be better to have a properly-named record that stores the URL glob and the header name-value pairs.
There was a problem hiding this comment.
I agree this type's gotten a bit out of hand and a record would be a welcome improvement. I'm not eager to block this PR more than necessary, so I'm fine leaving this as follow-up work.
| "via" | ||
| }; | ||
|
|
||
| private static final String[] reservedHeaderPrefixs = {"proxy-", "sec-", "access-control-"}; |
There was a problem hiding this comment.
| private static final String[] reservedHeaderPrefixs = {"proxy-", "sec-", "access-control-"}; | |
| private static final String[] reservedHeaderPrefixes = {"proxy-", "sec-", "access-control-"}; |
| "access-control-request-headers", | ||
| "access-control-request-method", |
There was a problem hiding this comment.
Do you need these here if you already have "access-control-" in the reservedHeaderPrefixs?
Co-authored-by: Islon Scherer <islonscherer@gmail.com>
HT154
left a comment
There was a problem hiding this comment.
I think this is getting real close now. From my perspective, just a little bit to hash out with URI patterns.
| typealias HttpRewrite = String(startsWith(Regex("https?://")), endsWith("/"), hasNonEmptyHostname) | ||
|
|
||
| @Since { version = "0.32.0" } | ||
| typealias UrlPattern = String(endsWith(Regex("[/*]"))) |
There was a problem hiding this comment.
I think this validation needs to be tightened:
| typealias UrlPattern = String(endsWith(Regex("[/*]"))) | |
| typealias UrlPattern = String(endsWith(Regex(#"/\*{0,2}"#))) |
The idea here is that attempts to match a prefix should always end on a path boundary (exact or prefix) to avoid accidentally matching unintended origins, e.g. pattern *.example.com matches foo.example.company.com, which could leak credentials.
There was a problem hiding this comment.
I actually think we don't need to tighten this. Unlike string prefix matching, glob expressions are unlikely to be accidentally misconfigured.
These two are clear enough, in my opinion:
https://example.com** -- match everything that starts with example.com
https://example.com/** -- match everything within the host example.com
I do wonder if we want a external boolean isGlobPattern on class String, so this can be:
String(isGlobPattern)
There was a problem hiding this comment.
Hmm I still think it would be ideal to prevent prefix matching on hosts, but I do see what you mean. I like the idea of isGlobPattern to mirror isRegex.
| IoUtils.validateHeaderName(headerName) | ||
| IoUtils.validateHeaderValue(headerValue) | ||
| headersMap | ||
| .computeIfAbsent(stringPattern) { mutableListOf() } |
There was a problem hiding this comment.
The pattern here isn't validated by the same rule as the pattern values in pkl:EvaluatorSettings
| .multiple() | ||
| .toMap() | ||
|
|
||
| val httpHeaders: List<PPair<Pattern, List<PPair<String, String>>>> by |
There was a problem hiding this comment.
I agree this type's gotten a bit out of hand and a record would be a welcome improvement. I'm not eager to block this PR more than necessary, so I'm fine leaving this as follow-up work.
| .toList(); | ||
| parsedHeaderDefs.add(new Pair(urlPattern, pairs)); | ||
| } catch (InvalidGlobPatternException e) { | ||
| throw new PklException(ErrorMessages.create("invalidUri", stringPattern)); |
There was a problem hiding this comment.
| throw new PklException(ErrorMessages.create("invalidUri", stringPattern)); | |
| throw new PklException(ErrorMessages.create("invalidGlobPattern", stringPattern)); |
| } | ||
|
|
||
| @Override | ||
| public Builder setHeaders(List<Pair<Pattern, List<Pair<String, String>>>> headers) { |
There was a problem hiding this comment.
I think it would be better if this signature matched the in-language evaluator settings.
I think this should be something like:
setHeaderRules(Map<String, Map<String, List<String>>> headerRules)Where the top-level map key is the glob pattern.
This method can throw IllegalArgumentException if a glob pattern is invalid.
And, I think we should also have:
addHeaderRule(String globPattern, Map<String, List<String>> headers)It'd also be good to avoid using Pair here.
| typealias HttpRewrite = String(startsWith(Regex("https?://")), endsWith("/"), hasNonEmptyHostname) | ||
|
|
||
| @Since { version = "0.32.0" } | ||
| typealias UrlPattern = String(endsWith(Regex("[/*]"))) |
There was a problem hiding this comment.
I actually think we don't need to tighten this. Unlike string prefix matching, glob expressions are unlikely to be accidentally misconfigured.
These two are clear enough, in my opinion:
https://example.com** -- match everything that starts with example.com
https://example.com/** -- match everything within the host example.com
I do wonder if we want a external boolean isGlobPattern on class String, so this can be:
String(isGlobPattern)
bioball
left a comment
There was a problem hiding this comment.
This is in a good enough place to merge; there's some things that I plan on addressing in a follow-up PR!
HT154
left a comment
There was a problem hiding this comment.
With @bioball on this one.
Thanks again for all the work on this, @kyokuping! We really appreciate the contribution.
This PR adds support for custom HTTP headers, introducing a
--http-headerCLI flag to acceptkey=valuepairs. These headers can also be specified within thesetting.pklfile.Closes #633
SPICE: apple/pkl-evolution#24