AI safety in the public interest

Research, evaluation, and governance of AI.

An independent, non-profit research foundation studying the safety, behaviour, and governance of artificial intelligence systems at the intersection of engineering, law, and the humanities.

Why Icaro

Understanding the materials we fly with.

Icarus did not fall because he flew. He fell because he did not understand the materials he was flying with.

Artificial intelligence systems are now materials we are learning to work with: powerful, useful, and still poorly understood. Evaluation is not a brake on innovation; it is the condition for using them well.

31 Frontier models evaluated
07 Open papers 2025–2026
08 International outlets

Research areas

A few areas where technical evaluation, legal analysis, and the humanities genuinely need each other.

01

Behavioural evaluation of AI systems

How models behave in context, under pressure, and across languages and cultures, including Italian and other under-represented settings.

02

Multi-agent and compositional safety

How individually aligned components can create emergent dynamics once models, tools, and agents interact.

03

AI, fundamental rights, and governance

How technical properties of AI systems interact with legal and political structures designed to protect people and communities.

04

Language, interpretation, and the humanities

How linguistic, philosophical, and cultural analysis reveals model behaviours that narrow benchmarks often miss.

Selected work

Public artifacts from the research laboratory.

  1. 2026

    Adversarial Humanities Benchmark: Results on Stylistic Robustness in Frontier Model Safety

    Results from the AHB safety benchmark, showing that stylistic reformulations substantially increase attack success rates across 31 frontier models.

  2. 2026

    Agentic Microphysics: A Manifesto for Generative AI Safety

    A methodological proposal for studying agentic AI safety from local interaction dynamics up to population-level risks.

  3. 2026

    Institutional AI: Governing LLM Collusion in Multi-Agent Cournot Markets via Public Governance Graphs

    An experimental governance-graph framework for reducing collusion in multi-agent LLM Cournot markets.

In the News

Coverage of our work on AI safety and adversarial language.

Work with the Foundation

Research in the public interest needs durable institutions.

We collaborate with universities, public institutions, foundations, and civil society on research, evaluation, and policy work. The Scientific Committee is being finalised; expressions of interest are accepted on a rolling basis.