Panic on invalid programs and bad seed inputs#94
Conversation
Our custom mutators are the only supported producers of IR programs and are designed to *always* create valid programs. Thus validating programs on every mutation iteration is useless work for all supported use cases. The only time it's helpful is when not using our custom mutators or when the seed corpus was generated by something other than the current custom mutators. Subsequent commits will remove program validation entirely, instead relying on the ProgramBuilder and executor panicking when programs are invalid.
Now that smite-ir-mutator no longer calls Program::validate, nothing else uses it.
Make the executor treat ill-formed IR programs as panics rather than recoverable errors. We now require our AFL++ custom mutator shim to be used when fuzzing the IR scenario -- using the default AFL++ mutators or any other mutator library is highly likely to produce ill-formed programs that trigger panics. A major benefit of this approach is that any bugs in our mutators or generators can surface as panics rather than being hidden as errors that the executor recovered from. The tradeoff is that manually-created seeds or even seeds from previous versions of Smite will cause panics rather than being silently discarded.
I'm not sure how future proof this is. I think this approach will force mutators to do two things:
IMO, mutators should be dumb and try to do only (1). I guess doing (2) is fine as well (and we already do it), as long as we're talking about structural validity, i.e. validity of the program graph, because the information required to enforce it is already provided to the mutators (in the form of the program itself). But as discussed in #75, once we start implementing more advanced mutation schemes, this means we will have to maintain a rulebook of sorts for every new mutator that tries to modify the semantic structure of the program.
Okay, so the "rulebook" is supposed to reside in the executor. I don't feel very strongly about the executor panicking or not panicking, but I think it makes more sense for |
|
I think with this we get slightly faster code, since all the validation paths are gone. Previously, we had to maintain the validity of IR programs for the custom mutator anyway, and the same still holds now. So the main thing we lose is the ability to use AFL++’s default mutators, which were not very helpful anyway (especially once we have more interesting mutators in smite). I would lean towards the simpler and faster approach here, and be explicit that IR programs should use a custom mutator instead of wasting cycles on Skip. I think optimizations and performance improvements on the IR or mutator side are much more likely to uncover target bugs than relying on AFL++’s default mutators.
Though right now, adding a new operation is not backward compatible anyway, so previously collected corpora would already become invalid. Because of that, I don’t think resuming old corpora after adding new operations would help much for further fuzzing, so it is probably fine for them to be paniced/discarded completely. |
| let key_bytes = resolve_private_key(&variables, instr.inputs[0]); | ||
| let sk = SecretKey::from_slice(&key_bytes) | ||
| .map_err(|_| ExecuteError::InvalidPrivateKey)?; | ||
| .expect("mutator produced an invalid private key"); |
There was a problem hiding this comment.
As written, the message suggests that we're expecting the mutator to produce an invalid private key (but it didn't), while we're actually expecting that it didn't (but it did). So similar to how expect() is used in other places, I suggest this:
| .expect("mutator produced an invalid private key"); | |
| .expect("valid private key"); |
I think we are in agreement here. I'm using the term valid in the structural sense -- i.e. a valid program is runnable by the executor. I want mutators to always produce runnable programs but otherwise be ignorant of the higher-level LN protocol semantics (e.g., message sequencing). I think generators are better suited to encapsulate the semantics, and mutators should be free to break them. The main changes proposed in this PR are really:
|
Right.
Okay, this is where I think the friction arises from. Both
Yeah, I don't feel too strongly about this, either way is fine. |
Demonstrates how we can simplify our code using the approach described in #92 (comment).
Pros:
Cons:
Lightly tested with a 10 minute LDK fuzzing session. If we decide we want to do this, we'll need to test longer runs of all 4 targets.