I think it's reasonable to expect strings embedded in html to be html-escaped.
I modified the Node::Text match arm to call html_escape::encode_text_to_string on string literals in text nodes in this commit as a demonstration: 9626096 However, this approach is deficient. First, this implementation will become less useful once the unquoted-text branch is merged. But more importantly, this doesn't address the bigger issue which is serializing user-generated data into a node which will arrive via some embedded rust code. Further, it would be nice to be able to nest calls to html! but not double-escape the html tags generated by the inner html!.
Here are some cases I think would be reasonable to support:
// Case 1: Embedded string literals should be escaped (see demo commit):
let case1 = html! { <p>"Some string in a text node that contains lt (<) and gt (>) characters"</p> }`;
// Case 2: Dynamic strings embedded into nodes should be escaped:
let user_comment = r#"I'm an innocent comment 👼 <script src="//example.com/infogulchs_steal_your_money.js"></script>😈"#;
let case2 = html! { <p>{ user_comment }</p> };
// Case 3: HTML fragments previously generated by calling html! should *not* be escaped again:
let user_name = "infogulch";
let case3 = html! { <div>"By: "<i>{ user_name }</i><br>{ case2 }</div> };
// Case 4: There should be some way to bypass automatic escaping:
let case4 = html! { <div>{ this_is_safe_html_pinky_promise(my_vetted_html_template.render(user_comment)) }</div>
// Case 5: Escaping should work on attribute values as well:
let oops_id = "myid\"oops";
let case5 = html! { <div id={oops_id}>foo</div> };
To support Cases 3 and 4 I suspect the best approach will be to wrap the output of html! with a newtype that indicates that it will emit correctly escaped html content, where if wrapped again it will not escape twice.
Steps
This is probably too big an effort to resolve all at once, here are some intermediate steps that I think would be useful in the meantime:
Thoughts?
🏹🎯, 👀☝️🌠🌠
I think it's reasonable to expect strings embedded in html to be html-escaped.
I modified the
Node::Textmatch arm to callhtml_escape::encode_text_to_stringon string literals in text nodes in this commit as a demonstration: 9626096 However, this approach is deficient. First, this implementation will become less useful once theunquoted-textbranch is merged. But more importantly, this doesn't address the bigger issue which is serializing user-generated data into a node which will arrive via some embedded rust code. Further, it would be nice to be able to nest calls tohtml!but not double-escape the html tags generated by the innerhtml!.Here are some cases I think would be reasonable to support:
To support Cases 3 and 4 I suspect the best approach will be to wrap the output of
html!with a newtype that indicates that it will emit correctly escaped html content, where if wrapped again it will not escape twice.Steps
This is probably too big an effort to resolve all at once, here are some intermediate steps that I think would be useful in the meantime:
html-to-string-macrodoesn't do any escaping and that users should be careful to correctly escape inputs to avoid injections.Thoughts?
🏹🎯, 👀☝️🌠🌠