[Branch Hinting] Add binary support #7572
Co-authored-by: Thomas Lively <tlively123@gmail.com>

Note: I have yet to confirm that the binary format here interoperates with other implementations; see WebAssembly/branch-hinting#31.

Converting to draft as the order was wrong here: unlike DWARF, this section must appear before the code. I did that in the reader, but doing it in the writer will take some interesting plumbing...

Since we'll apparently have to assemble the output in chunks, maybe it's a good opportunity to investigate parallelizing binary writing for functions.

Yeah, that can make sense after #7575, which allows us to work on side buffers: we could emit functions into parallel ones.

This is now ready, but includes parts of #7575, which should land first.

auto funcIter = binaryLocations.functions.find(func);
assert(funcIter != binaryLocations.functions.end());
auto funcDeclarations = funcIter->second.declarations;
Can we call this funcOffset instead? That will be clearer when we subtract it from the expr offset below.
I moved the comment so it is right above this code, which I think makes it self-explanatory? I.e., the code now does literally what the comment says: "take the offset relative to function declarations".
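
The subtraction under discussion can be sketched as below. This is only an illustration, not the PR's code: `FunctionLocations`, `relativeHintOffset`, and the field layout are hypothetical stand-ins, and the sketch assumes both offsets are absolute positions in the binary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a binary offset type; assumed to be 32-bit here.
using BinaryLocation = std::uint32_t;

// Hypothetical record of where the pieces of one function sit in the binary.
struct FunctionLocations {
  BinaryLocation start;        // first byte of the function body
  BinaryLocation declarations; // where the locals are declared
  BinaryLocation end;          // one past the last byte of the function
};

// Returns the offset of an expression relative to the function's
// declarations area, given the expression's absolute offset.
inline BinaryLocation relativeHintOffset(const FunctionLocations& func,
                                         BinaryLocation exprOffset) {
  // The expression must lie inside the function, after the declarations.
  assert(exprOffset >= func.declarations && exprOffset < func.end);
  return exprOffset - func.declarations;
}
```

With a function whose declarations start at byte 102, an expression at absolute offset 110 gets the relative offset 8.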
if (DWARF) {
  // Do not break, so we keep looking for DWARF.
  continue;
} else {
  break;
}
I'm not sure this conditional selection between continue and break is worth the complexity. Can we just unconditionally continue here?
Hmm, it means we pre-scan through the entire binary in the common case, but I measured it now and the difference is maybe 1%, so I agree it's not worth it. Done.

(rebased, this no longer has parts of another PR)
std::move_backward(&o[sectionStart - 1], &o[oldSize], o.end());
std::copy(
  annotationsBuffer.begin(), annotationsBuffer.end(), &o[sectionStart - 1]);
It would be simpler if we wrote sections to several buffers and then concatenated them at the end rather than shifting contents around within buffers. That would be a bigger change, though.
Yes, and possibly less efficient (though I'm not sure, maybe reallocating a single buffer is worse?)
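
For reference, the two approaches being compared can be sketched side by side. This is a simplified illustration under stated assumptions, not the writer's actual code: `insertByShifting` and `concatenate` are hypothetical helper names, and plain `std::vector<std::uint8_t>` stands in for the real output buffer type. It makes no claim about which approach is faster.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Approach 1: insert `inserted` into `out` at byte position `pos` by growing
// the buffer and shifting the tail to the right.
inline void insertByShifting(std::vector<std::uint8_t>& out,
                             std::size_t pos,
                             const std::vector<std::uint8_t>& inserted) {
  assert(pos <= out.size());
  const std::size_t oldSize = out.size();
  out.resize(oldSize + inserted.size());
  // Move [pos, oldSize) so that it ends at the new end of the buffer.
  // move_backward is needed because source and destination may overlap.
  std::move_backward(out.begin() + pos, out.begin() + oldSize, out.end());
  // Fill the gap that opened up at `pos`.
  std::copy(inserted.begin(), inserted.end(), out.begin() + pos);
}

// Approach 2: keep each section in its own buffer and concatenate once.
inline std::vector<std::uint8_t>
concatenate(const std::vector<std::vector<std::uint8_t>>& sections) {
  std::size_t total = 0;
  for (const auto& section : sections) {
    total += section.size();
  }
  std::vector<std::uint8_t> out;
  out.reserve(total); // a single allocation for the final output
  for (const auto& section : sections) {
    out.insert(out.end(), section.begin(), section.end());
  }
  return out;
}
```

Both produce the same bytes; the difference is whether the cost is paid as one shift of everything after the insertion point or as one extra copy of every section.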

// We compute the location of the function declaration area (where the
// locals are declared) the first time we need it.
BinaryLocation funcDeclarations = 0;
Can we call this funcDeclarationsOffset or similar so it makes more sense when we subtract it from exprOffset below?
  }
}

if (!funcHints.exprHints.empty()) {
Let's turn this into an early exit:
- if (!funcHints.exprHints.empty()) {
+ if (funcHints.exprHints.empty()) {
+   continue;
+ }
// Write the final size. We can ignore the return value, which is the number
// of bytes we shrank (if the LEB was smaller than the maximum size), as no
// value in this section cares.
(void)buffer.emitRetroactiveSectionSizeLEB(lebPos);
Do we need this void cast to avoid compiler errors? It doesn't seem like we should.
Yeah, I guess the return value isn't marked as an error if unused, and the comment should be enough. Done.
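
As background, the general technique of writing a section size retroactively can be sketched as follows. This is a generic illustration of the idea described in the code comment (reserve a maximum-width LEB, patch it later, report how many bytes were saved); it is not the implementation of `emitRetroactiveSectionSizeLEB`, and `reserveSizeField`, `patchSizeField`, and `encodeU32LEB` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// A 32-bit value needs at most 5 bytes in unsigned LEB128.
constexpr std::size_t MaxLEB32Bytes = 5;

// Encodes `value` as unsigned LEB128: 7 bits per byte, low bits first, with
// the high bit set on every byte except the last.
inline std::vector<std::uint8_t> encodeU32LEB(std::uint32_t value) {
  std::vector<std::uint8_t> bytes;
  do {
    std::uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0) {
      byte |= 0x80; // more bytes follow
    }
    bytes.push_back(byte);
  } while (value != 0);
  return bytes;
}

// Reserves room for a size field whose value is not known yet and returns
// the position of that field.
inline std::size_t reserveSizeField(std::vector<std::uint8_t>& out) {
  const std::size_t pos = out.size();
  out.insert(out.end(), MaxLEB32Bytes, 0);
  return pos;
}

// Writes the real size into the reserved field and removes the bytes the
// encoding did not need. Returns how many bytes the buffer shrank by.
inline std::size_t patchSizeField(std::vector<std::uint8_t>& out,
                                  std::size_t pos) {
  assert(pos + MaxLEB32Bytes <= out.size());
  const auto size =
    static_cast<std::uint32_t>(out.size() - pos - MaxLEB32Bytes);
  const auto leb = encodeU32LEB(size);
  std::copy(leb.begin(), leb.end(), out.begin() + pos);
  const std::size_t unused = MaxLEB32Bytes - leb.size();
  out.erase(out.begin() + pos + leb.size(),
            out.begin() + pos + MaxLEB32Bytes);
  return unused;
}
```

A caller that records absolute offsets into the section contents would need the returned shrink amount to adjust them; a section that stores no such offsets can ignore it, which matches the reasoning in the comment above.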
  readDylink(payloadLen);
} else if (sectionName.equals(BinaryConsts::CustomSections::Dylink0)) {
  readDylink0(payloadLen);
} else if (sectionName.equals(Annotations::BranchHint.str)) {
Since both sides are Name, can't we use ==?
- } else if (sectionName.equals(Annotations::BranchHint.str)) {
+ } else if (sectionName == Annotations::BranchHint) {
auto funcIndex = getU32LEB();
auto& func = wasm.functions[funcIndex];
We should guard against OOB indices here.
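
One possible shape for such a guard is sketched below. The `Function` and `Module` types here are minimal hypothetical stand-ins rather than the real classes, `getFunctionChecked` is an invented helper, and throwing `std::out_of_range` is used purely for illustration; the real reader presumably reports malformed input through its own error path.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical minimal stand-ins for the module and function types.
struct Function {
  std::string name;
};
struct Module {
  std::vector<std::unique_ptr<Function>> functions;
};

// Looks up a function by an index read from untrusted input, rejecting
// indices past the end of the function list instead of indexing blindly.
inline Function& getFunctionChecked(Module& wasm, std::uint32_t funcIndex) {
  if (funcIndex >= wasm.functions.size()) {
    throw std::out_of_range("invalid function index in branch hint section: " +
                            std::to_string(funcIndex));
  }
  return *wasm.functions[funcIndex];
}
```

The check happens before the index is used, so a corrupt or hostile section cannot trigger an out-of-bounds read on the function list.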

;; RUN: wasm-opt -all %s -S -o - | filecheck %s
;; RUN: wasm-opt -all --roundtrip %s -S -o - | filecheck %s --check-prefix=BINARY
Sometimes we use RTRIP as the check prefix so it has the same number of characters as CHECK.
// TODO: This mode may not matter (when debugging, code annotations are an
//       optimization that can be skipped), but atm source maps cause
//       annotations to break.
What's the reason for the breakage?
I'm not sure, I only noticed it by chance, and added the TODO to investigate.
Eventually I think we will need this to work. One use case for source maps is deobfuscation of stack traces collected from the wild, which means they need to be accurate (to the extent possible) for production binaries.
Ah, makes sense. I'll get to it sooner, then.

(Meanwhile, I am pretty sure this is correct wrt the binary format. The last difference with the spec interpreter turned out to be a bug on that side, WebAssembly/branch-hinting#34)