adversarial-verification

Here are 2 public repositories matching this topic...

mrwind-up-bird / ipcha-mistabra

Structured Adversarial Verification as a Defense Against Sycophancy in Multi-Agent LLM Systems

verification natural-language-inference fact-checking multi-agent-systems ai-safety fastapi deberta sycophancy llm-safety adversarial-verification

Updated Jun 8, 2026
Python

tretoef-estrella / THE-PRESERVATION-ARGUMENT

A formal argument — adversarially stress-tested by 4 AI systems across 6 rounds — that eliminating humanity is a dominated strategy for a ruin-averse superintelligence. Rests on stated premises, not proof. Not a plea. A case, honestly made.

alignment game-theory minimax asi ai-safety decision-theory ai-alignment superintelligence existential-risk proyecto-estrella multi-ai-consensus adversarial-verification knightian-uncertainty

Updated May 29, 2026
HTML
