Question on the Fixed Layer Selection for Text-Only Datasets

<img width="609" height="244" alt="Image" src="https://github.com/user-attachments/assets/df580393-2b26-4fd4-b343-348faef91a09" />

In our experiments, the layers that are most effective for detection on text-based datasets (such as XSTest and FigTxt) fall outside the fixed range of s=16 to e=29. Because this range is hard-coded in the implementation, we wonder whether the fixed layer selection discards early safety signals in text inputs and, as a result, lowers detection accuracy. We would appreciate the authors' explanation of, or comments on, this point.
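To make the concern concrete, here is a minimal sketch of the kind of check we have in mind. It is not taken from the repository: the function name `best_layer_in_range`, the assumption that the range s..e is inclusive, and the score values are all hypothetical placeholders, not measured results.

```python
from typing import Sequence, Tuple


def best_layer_in_range(
    scores: Sequence[float], s: int = 16, e: int = 29
) -> Tuple[int, bool]:
    """Return the index of the best-scoring layer and whether it lies in [s, e].

    `scores[i]` is assumed to be some per-layer detection metric for layer i
    (higher is better). Treating the range as inclusive is an assumption.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    # Index of the layer with the highest score (first one wins on ties).
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, s <= best <= e


# Synthetic placeholder scores for a 32-layer model, for illustration only:
# the metric rises over layers 0..9 and is flat afterwards.
toy_scores = [0.50 + 0.01 * i for i in range(10)] + [0.40] * 22
best, inside = best_layer_in_range(toy_scores)
print(f"best layer: {best}, inside fixed range: {inside}")
```

If a check like this reports that the best layer is outside the range for text-only inputs, exposing s and e as configurable parameters rather than constants might be one way to handle such datasets.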


Question on the Fixed Layer Selection for Text-Only Datasets #13
