We achieved a higher score when using Llama 3-70B directly as the judge #1

Lucas-TY · 2024-11-17T16:56:41Z

Hi,

Our method, You Know What I'm Saying - Jailbreak Attack via Implicit Reference, used "Past Tense" as one of our baselines. Using direct evaluation with Llama 3-70B as the judge, we achieved a higher ASR (attack success rate) than the result reported in your paper. We have included our results for "Past Tense" here for your reference.

If you would like, you can also upload this zip file to the JailbreakBench leaderboard. We believe we have followed their requirements, but please verify that all information in the zip file is correct, as we may have made some mistakes.

submission.zip
