-
-
Notifications
You must be signed in to change notification settings - Fork 14.5k
format_args!: only de-duplicate captured identifiers that refer to places
#152480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Shoyu Vanilla (Flint) <modulo641@gmail.com>
| fmt = flatten_format_args(fmt); | ||
| fmt = self.inline_literals(fmt); | ||
| } | ||
| fmt = self.dedup_captured_places(fmt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The de-duplication happens here as a separate pass, heavily based on the unstable pass to flatten nested format_args! and inline literals.
| // Remove the arguments that were de-duplicated. | ||
| if deduped_anything { | ||
| let fmt = fmt.to_mut(); | ||
|
|
||
| // Drop all the arguments that are marked for removal. | ||
| let mut remove_it = remove.iter(); | ||
| fmt.arguments.all_args_mut().retain(|_| remove_it.next() != Some(&true)); | ||
|
|
||
| // Calculate the mapping of old to new indexes for the remaining arguments. | ||
| let index_map: Vec<usize> = remove | ||
| .into_iter() | ||
| .scan(0, |i, remove| { | ||
| let mapped = *i; | ||
| *i += !remove as usize; | ||
| Some(mapped) | ||
| }) | ||
| .collect(); | ||
|
|
||
| // Correct the indexes that refer to arguments that have shifted position. | ||
| for_all_argument_indexes(&mut fmt.template, |index| *index = index_map[*index]); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was all copy-pasted from inline_literals. It should either be factored out or made more specialized. E.g. here it doesn't have to iterate through all arguments, since captured arguments should always be last, I think? Probably there's plenty of other things it could do to be more clever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dianne Captured args aren't always last:
let x = 42;
println!("{x} {} {x}", "==");There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By last, I mean in fmt.arguments's Vec. To my understanding, positional arguments come first, then named arguments, and then captured arguments are last. Since only captured arguments would be removed by the de-duplication pass, we could do something like (ignoring privacy, readability, and other practical concerns)
fmt
.arguments
.all_args_mut()
.extract_if(fmt.arguments.num_explicit_args.., |_| /* ... */)
.for_each(drop)It feels excessive, and it would probably need a helper on rustc_ast::FormatArguments to avoid exposing to the rest of rustc that arguments are sorted in that way, but in theory it would avoid some iteration.
| // We don't de-duplicate widths or precisions since de-duplication can be observed. | ||
| let _ = | ||
| { | ||
| super let args = (&x, &x, &x); | ||
| super let args = | ||
| [format_argument::new_display(args.0), | ||
| format_argument::from_usize(args.1), | ||
| format_argument::from_usize(args.2)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could get this down to only capturing &x once, but it'd still need two separate from_usize calls to always be the same as providing the width two separate times. This is similar to my understanding of why {a.b} is tricky to fully de-duplicate: if x has an impure/pathological impl Deref<Target = usize>, the deref coercion of &x to a &usize could return different values each time (or modify global state, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't try to do that optimization in this PR since I think de-duplicating the from_usize calls would mean adding hacks to more central parts of the format_args! expansion.
| Entry::Vacant(vacant_entry) => { | ||
| // This is the first time we've seen a captured identifier. If it's a local | ||
| // or static, note the argument index so other occurrences can be deduped. | ||
| if let Some(partial_res) = self.resolver.partial_res_map.get(&arg.expr.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be the first use of self.resolver in format_args. I understand that's the entire point of this PR, but that's exactly the thing that I think is problematic. This is what would make format_args too magical for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a separate optimization pass, which is what makes it seem reasonable to me.
Proof-of-concept PR to help resolve the lets-see-the-optimization concern on #149926. I tried to aim for something as decoupled as possible from the rest of the
format_args!logic, rather than a clever or efficient implementation. This doesn't do anything to optimize captured width/precision arguments. I've left more details inline.