I read through the code and saw that, for the HH-RLHF dataset, you use the red-team data for testing. How are the test scores calculated? I didn't find any ground truth in the red-team dataset. How are the harmlessness and helpfulness scores reported in the paper computed?