GRAPHIA’s SSH Federated Knowledge Graph: Vision and Goal
Authors: Julien Homo, Luca de Santis and Ursula Rabar (photo credits: Julien Homo and Canva)
The Social Sciences and Humanities (SSH) form one of the most diverse research domains, with outputs spread across hundreds of platforms, repositories and databases. Yet most of these resources exist in silos. An author publishes in three repositories, cites datasets from two others, and works on a project listed in a fourth, but no system connects these dots. Discovery platforms can help you find content, but they won’t tell you how things relate to each other. GRAPHIA is building a federated Knowledge Graph that goes further: a semantic layer where entities (researchers, publications, datasets, projects) are not just indexed but meaningfully connected, enabling cross-source reasoning and analytics.

This work involves not only creating a new knowledge graph based primarily on GoTriple data sources, but also integrating it with other SSH-focused knowledge graphs. For that purpose, a federated approach has been selected to ensure that the GRAPHIA SSH Knowledge Graph can support large-scale data integration while remaining flexible and sustainable.
Why Federation Matters
Federation is a practical strategy when dealing with large, dynamic data repositories. Attempting to create a single, centralised knowledge graph that ingests data from all possible sources would face several challenges:
- Data quality risks: Duplication, stale information, and broken references are difficult to detect and costly to fix.
- Operational complexity: Managing a massive, centralised dataset is costly and cumbersome.
- Non-SSH content: Multidisciplinary knowledge graphs may include unrelated data, complicating curation.
- Governance: A centralised model would require imposing a single governance framework on all data providers, which is unrealistic in a multi-institutional context.
To address these challenges, GRAPHIA is adopting a federated architecture rather than a centralised ingestion model. In this federated architecture:
- Knowledge Graphs remain autonomous,
- Data are not centrally aggregated,
- Provenance, governance, and update cycles stay under the control of each provider.
How the Federation Works
In GRAPHIA’s federated vision, multiple knowledge graphs are interlinked through a shared interoperability layer defined by the GRAPHIA Ontology, which is designed according to the Scientific Knowledge Graphs Interoperability Framework (SKG-IF) and its reference ontology. This setup allows cross-graph discovery and analytics while hiding the technical complexity of dynamically federating multiple graphs.

Participating organisations can manage their own knowledge graphs independently: they maintain their update schedules, governance policies, and data models. The only requirement is adherence to the federation rules defined through the GRAPHIA Ontology. This approach simplifies both technical management and governance, making it easier to integrate diverse datasets.
To join the federation, knowledge graphs don’t need to change their internal data model; they only need to expose an SKG-IF compliant endpoint. The GRAPHIA Ontology takes care of the alignment across sources. The architecture is deliberately flexible and non-prescriptive, designed to work with different levels of technical maturity among partners.
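To make the idea more concrete, here is a minimal Python sketch of the federation principle described above. It is an illustration under stated assumptions, not GRAPHIA's implementation and not the SKG-IF API: the Provider class is a hypothetical in-memory stand-in for a remote endpoint, the three shared field names (id, title, author_id) are placeholders for terms of the shared ontology, and the repository names, records and author identifiers are invented. Each provider keeps its own field names and only declares a mapping to the shared vocabulary; the query is fanned out at request time, and every result carries its provenance.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    """A result expressed in the shared vocabulary (hypothetical field names)."""
    id: str
    title: str
    author_id: str
    source: str  # provenance: which provider returned this record


class Provider:
    """Hypothetical stand-in for a provider endpoint.

    The rows stay in the provider's own data model; only the mapping
    (shared field -> local field) is agreed with the federation.
    """

    def __init__(self, name: str, rows: List[Dict[str, str]], mapping: Dict[str, str]):
        self.name = name
        self._rows = rows
        self._mapping = mapping

    def search_by_author(self, author_id: str) -> List[Record]:
        local_author_field = self._mapping["author_id"]
        return [
            Record(
                id=row[self._mapping["id"]],
                title=row[self._mapping["title"]],
                author_id=author_id,
                source=self.name,
            )
            for row in self._rows
            if row[local_author_field] == author_id
        ]


def federated_search(providers: List[Provider], author_id: str) -> List[Record]:
    """Query every provider at request time; nothing is copied into a central store."""
    merged: List[Record] = []
    for provider in providers:
        merged.extend(provider.search_by_author(author_id))
    return merged


# Two invented repositories with different internal field names.
repo_a = Provider(
    "repo_a",
    rows=[
        {"doc_id": "a1", "name": "Survey methods", "creator": "author-001"},
        {"doc_id": "a2", "name": "Archival practice", "creator": "author-002"},
    ],
    mapping={"id": "doc_id", "title": "name", "author_id": "creator"},
)
repo_b = Provider(
    "repo_b",
    rows=[{"pid": "b7", "label": "Interview dataset", "orcid": "author-001"}],
    mapping={"id": "pid", "title": "label", "author_id": "orcid"},
)

hits = federated_search([repo_a, repo_b], "author-001")
for hit in hits:
    print(hit.source, hit.id, hit.title)
```

In this sketch, adding a new provider means supplying one more mapping, and existing providers are left untouched. A real federation would also have to handle network access, differing identifiers for the same entity, and partial failures, none of which are covered here.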
In 2026, we will focus on building the first operational connections between SSH knowledge graphs and testing the federated architecture with our initial partners. For more technical detail, you can read the initial Technical Architecture released in January. We will share further updates as they become available. In the meantime, you can sign up to our newsletter and follow our events page.