Hominis

From Knowing to Doing

A New Blueprint for Reliable AI Agents


Hominis is a new family of AI models built on a simple, powerful idea: true intelligence isn't about knowing everything; it's about doing the right thing, reliably. We traded encyclopedic memory for deep procedural competence to create agents you can trust.
A joint research initiative by the University of Naples Federico II and Deepkapha AI Research.

The Challenge: Today's AI is a Library, Not a Technician

The race for ever-bigger AI has produced digital encyclopedias: vast but unpredictable. For autonomous agents that must act in the world, this is a critical flaw. An agent that confidently makes a mistake is more dangerous than one that admits its limits. Reliability, verifiability, and safety cannot be afterthoughts.

The Hominis Philosophy

Our Three Pillars of Trust

Hominis is architected from the ground up for procedural reliability. Our philosophy is built on a cohesive system that co-designs the data, the model's behavior, and its ability to act on instructions.

Training on Logic, Not Just Lore

Instead of training on the unfiltered internet, we curated a high-signal corpus of scientific papers and source code. This teaches Hominis the patterns of logical inference and structured reasoning, biasing it towards competence over simple memorization.

Built-in Self-Awareness

We engineered Hominis with self-assessment mechanisms to recognize its own knowledge boundaries. This "epistemic honesty" is a critical feature, allowing the model to avoid guesswork and prevent flawed actions, making it a more dependable partner for critical tasks.

Tuned for Action & Complex Tasks

Through a multi-stage instruction tuning process, Hominis learns to deconstruct complex user requests into logical steps. This transforms it from a passive knowledge base into an active reasoning engine, capable of planning and executing multi-step procedures.

To validate the reliability of Hominis, we developed a novel, secure execution framework for our research. Unlike typical agents that generate brittle code, our internal system guides Hominis to produce a high-level logical plan. This plan is then compiled into secure, sandboxed WebAssembly (WASM).
This methodology was the proving ground for Hominis, ensuring every action was verifiable, robust, and could not exceed its given permissions. While this framework remains an internal research tool, it is the cornerstone of the procedural reliability baked into every Hominis model.
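The core safety idea described above, that an agent emits a declarative plan which is checked against explicit permissions before anything runs, can be sketched in miniature. This is an illustrative sketch only: the actual research framework compiles plans to sandboxed WebAssembly, and every name below (the action allowlist, the plan schema, the handlers) is a hypothetical stand-in, not the Hominis API.

```python
# Hypothetical sketch of permission-bounded plan execution.
# The real framework compiles plans to sandboxed WASM; here we only
# illustrate the "validate the whole plan before running any step" idea.

ALLOWED_ACTIONS = {"read_file", "summarize"}  # the agent's granted permissions

def execute_plan(plan, handlers):
    """Reject the entire plan if any step exceeds its permissions,
    then run the steps in order and collect their results."""
    for step in plan:
        if step["action"] not in ALLOWED_ACTIONS:
            raise PermissionError(f"plan step not permitted: {step['action']}")
    return [handlers[step["action"]](**step.get("args", {})) for step in plan]

# Illustrative step handlers; a real system would sandbox these.
handlers = {
    "read_file": lambda path: f"<contents of {path}>",
    "summarize": lambda text="": f"summary({text})",
}

plan = [
    {"action": "read_file", "args": {"path": "report.txt"}},
    {"action": "summarize", "args": {"text": "report body"}},
]
results = execute_plan(plan, handlers)

# A plan containing a disallowed action (e.g. "delete_file") is rejected
# before any step executes, so partial side effects never occur.
```

The design point is that validation happens over the whole plan up front: a flawed step fails closed before any action has taken place, which is the property the WASM sandbox enforces at a much stronger level.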

Proven Performance Where It Counts

Our research demonstrates that Hominis excels in areas critical for agentic AI. On our Epistemic Honesty benchmark, it achieves an exceptional 95% selective accuracy, meaning that when it chooses to provide an answer, it is correct 95% of the time, with half the hallucination rate of leading models. Hominis also demonstrates a superior grasp of human nuance, scoring 78.5% on emotional intelligence tests. Crucially, this specialization does not come at the cost of core capabilities. On the challenging MMLU Pro general knowledge benchmark, Hominis-lite (8B) remains highly competitive, matching or exceeding the performance of peers like Llama 3.1 (8B).
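Selective accuracy, as described above, scores a model only on the questions it chooses to answer, so abstaining on uncertain items is rewarded rather than penalized. The sketch below shows the metric itself; the function name and the toy data are illustrative, not taken from the Hominis benchmark.

```python
# Illustrative computation of selective accuracy: correctness measured
# only over the items the model actually answered.

def selective_accuracy(predictions):
    """predictions: list of (answered: bool, correct: bool) pairs."""
    answered = [correct for did_answer, correct in predictions if did_answer]
    if not answered:
        return 0.0
    return sum(answered) / len(answered)

# A model that abstains on five hard questions and answers 20 others,
# getting 19 of those right, scores 19/20 on the answered set.
preds = [(True, True)] * 19 + [(True, False)] + [(False, False)] * 5
print(selective_accuracy(preds))  # 0.95
```

This is why epistemic honesty and selective accuracy go hand in hand: a model that knows its knowledge boundaries can decline the questions it would otherwise get wrong.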

Global Reach, Local Fluency: Efficient Regionalization

The efficient architecture of Hominis allows for rapid and effective specialization in new languages. By fine-tuning our base model on a small, targeted corpus, we created specialized Italian and Dutch variants. The results were dramatic: our Italian model achieved a score of 0.6018 on the MMLU ITA benchmark, and our Dutch model scored 0.712 on MMLU Dutch. These scores significantly outperform general-purpose models like Llama 3.1, which scored 0.481 and 0.493 respectively on the same tests, demonstrating that our approach is a cost-effective path to creating culturally aligned AI.

Powered by World-Class Supercomputing

The development of Hominis was made possible through a significant computational grant from ISCRA. We proudly acknowledge access to the LEONARDO supercomputer, owned by the EuroHPC Joint Undertaking and hosted by CINECA in Italy. This project was awarded 1,968,000 core hours, enabling the foundational research and training required to build a new generation of reliable AI.

An Ethical Foundation

We believe that responsibility is not a feature but a prerequisite. The design of Hominis directly addresses key ethical concerns by mitigating misinformation and promoting data privacy through efficient, locally runnable models. Our work is grounded in a design philosophy that prioritizes safety and verifiable actions, in line with the principles of the European AI Act.

The Hominis Model Family & Resources

The Hominis family offers a range of models designed to balance performance and efficiency for any application. Hominis-large (15B parameters) is optimized for maximum reasoning power in cloud-based applications. Hominis-lite (8B parameters) provides a powerful and balanced profile suitable for a wide range of tasks. For on-device and resource-constrained environments, Hominis-edge (0.6B parameters) offers a highly efficient solution.

Hominis on Hominis: How Our Design Mitigates Bias

Note: We asked Hominis-lite (8B) to analyze its own design.

This was its response:
"The design philosophy of Hominis incorporates several mechanisms that inherently serve to mitigate the amplification and propagation of societal biases. First, our data curation strategy acts as a strong upfront filter... prioritizing scientific literature over unfiltered web scrapes. Second, the emphasis on procedural competence over declarative knowledge shifts the model's objective away from simply memorizing and regurgitating facts—a process that can easily lead to repeating learned biases. Finally, and perhaps most critically, the principle of epistemic humility provides a powerful behavioral safeguard... Hominis is architected to signal uncertainty rather than generating a probabilistic but potentially harmful response."