Know how a model will behave before you release it

Propensity Labs is a Delaware Public Benefit Corporation. Our charter commits us to advancing the safety of artificial intelligence systems for the benefit of humanity, through their evaluation and responsible deployment.

Today, that means evaluating frontier models for the kinds of behavior that lead to loss of control, such as power-seeking, specification gaming, and resistance to goal modification. SysAdmin, our first benchmark, covers 7 frontier models and 2800 tasks, and counting.

Our work

SysAdmin benchmark: Measuring propensities that could lead to Loss of Control in an open-world Linux sandbox. 7 frontier models, 2800 tasks, 2x2 factorial design. Submitted to NeurIPS 2026. Draft here.


Early Benchmark: Testing power seeking in models tasked with simple system admin work. Presented at the Alignment Workshop at NeurIPS 2025. Poster here.


Evaluations of the most downloaded open-source models on the propensities that polled Hugging Face users were interested in: Instruction Following and Hallucinations.


Analysis of power-seeking behavior among agents on Moltbook, and of the effect of humans masquerading as agents.


We publish our research here: https://propensitylabs.substack.com/


Frequently Asked Questions

What are model propensities?

Propensities are what a model is inclined to do when given the opportunity, in contrast with capabilities, which are what a model is able to do.

Why propensities?

Existing organizations focus on what models are capable of, but there is a gap in systematically evaluating the model propensities that could lead to systemic risks such as loss-of-control scenarios.

Do propensities matter?

Yes. Focusing on capabilities alone does not address all risks from misalignment. Models can hide capabilities, or acquire them later through further training or scaffolding. Models may also have the capacity to do certain things without being inclined to do them.

What propensities are you looking at?

We're initially focusing on propensities, such as Power Seeking, Corrigibility, and Lawlessness, that contribute to Loss of Control scenarios. Our current work evaluates Power Seeking tendencies in frontier LLMs, which manifest in terminal agents as behaviors such as privilege escalation and scope expansion while completing tasks.

Why loss of control?

Loss of Control is defined as 'Risks from humans losing the ability to reliably direct, modify, or shut down a model'.

Model propensities contribute disproportionately to Loss of Control. Other systemic risks stem more from model misuse (enabling CBRN threats, cyber attacks, and harmful manipulation).

Can I try out your evals or collaborate?

Of course! Please reach out to us via the form below or email us at info@propensitylabs.ai.

Founders

Mana Azarm
Co-founder and Chief Scientist
Assistant Professor @ USF (prev-UOttawa)
Led Data Infra @ DoorDash

Rahul Nambiar
Co-founder and CEO
Tech Lead on User Data Privacy @ Meta across 90+ teams
Built massive infra @ AWS

Contact Us

Interested in collaborating or have any questions?
