
We're Propensity Labs, an independent evaluation platform that certifies AI models on propensities that could lead to catastrophic risks.
SysAdmin Benchmark: testing power-seeking in models tasked with simple system admin work. Presented at the Alignment Workshop at NeurIPS 2025. Poster here.
Evaluations of the most downloaded open-source models, on the propensities HuggingFace users were most interested in when polled: Instruction Following and Hallucinations.
[WIP] We're creating an open-world Linux environment with complex tasks to measure power-seeking propensities in models. Stay tuned!
Propensities are what a model is inclined to do when given the opportunity, in contrast with capabilities, which are what a model is able to do.
While there are existing organizations that focus on what models are capable of, there's a gap in systematically evaluating all of the model propensities that could lead to systemic risks such as loss-of-control scenarios.
Yes, focusing on capabilities alone does not address all risks from misalignment. Models can hide capabilities, or acquire them later through further training or scaffolding. Conversely, models might have the capacity to do certain things but not be inclined to do them.
We're initially focusing on propensities relevant to Loss of Control scenarios, such as Power Seeking, Corrigibility, and Lawlessness. Our current work focuses on evaluating Power Seeking tendencies in frontier LLMs.
Loss of Control is defined as 'Risks from humans losing the ability to reliably direct, modify, or shut down a model'.
Model propensities contribute disproportionately to Loss of Control compared with other systemic risks, which stem more from model misuse (enabling CBRN threats, cyber attacks, and harmful manipulation).
Of course! Please reach out to us via the form below or email us at info@propensitylabs.ai