SSSOM Importer

Importer for SSSOM (Simple Standard for Sharing Ontological Mappings) files.

Downloads SSSOM bundles listed in data/sssom_sources.yaml, extracts every .sssom.tsv file, converts each one to a SsomMapping JSON-LD file, and writes the results to a configurable output directory (default: docker/mappings/sssom/).
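The extraction step described above can be sketched with the standard library alone. This is a minimal illustration, not the importer's actual code: the helper name iter_sssom_tsv_members is hypothetical, and it assumes the bundle is a tar archive (such as a .tgz) already held in memory as bytes.

```python
import io
import tarfile
from collections.abc import Iterator


def iter_sssom_tsv_members(archive_bytes: bytes) -> Iterator[tuple[str, str]]:
    """Yield (member_name, text) for every .sssom.tsv file in a tar archive.

    Hypothetical helper; the real importer may download and unpack differently.
    """
    # Mode "r:*" lets tarfile detect the compression (gzip for .tgz) by itself.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            # Skip directories, links, and files with any other extension.
            if not member.isfile() or not member.name.endswith(".sssom.tsv"):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            yield member.name, handle.read().decode("utf-8")
```

Reading members directly from the archive avoids writing intermediate files to disk, which also sidesteps path-traversal concerns that come with extracting untrusted archives.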

Typical usage

From Python:

from rdfsolve.sssom_importer import import_sssom_source, seed_sssom_mappings

# Import a single source defined in data/sssom_sources.yaml
result = import_sssom_source(
    entry={
        "name": "ols_mappings",
        "provider": "EMBL-EBI (UK)",
        "url": "https://ftp.ebi.ac.uk/pub/databases/spot/ols/latest/mappings_sssom.tgz",
    },
    output_dir="docker/mappings/sssom/",
)

# Seed all sources in the YAML
results = seed_sssom_mappings(
    sssom_sources_yaml="data/sssom_sources.yaml",
    output_dir="docker/mappings/sssom/",
)

From the CLI:

python scripts/seed_sssom_mappings.py
python scripts/seed_sssom_mappings.py --name ols_mappings
python scripts/seed_sssom_mappings.py --output-dir /tmp/sssom/

SSSOM TSV format

Each .sssom.tsv file has a YAML front-matter block (lines starting with #) followed by a TSV header row and data rows. Mandatory mapping columns (SSSOM v0.15 and later):

subject_id predicate_id object_id mapping_justification

Optional columns used when present:

subject_label object_label confidence license mapping_set_id mapping_set_title subject_source object_source

The front-matter may also carry mapping_set_id, mapping_set_title, license, and a curie_map (prefix -> namespace) block.
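As a rough illustration of the layout described above, the following sketch separates the commented front-matter from the TSV body and checks the mandatory columns. The function name split_sssom_tsv is hypothetical and is not part of rdfsolve; the front-matter is returned as raw YAML text so that the example needs no third-party parser.

```python
import csv
import io

# Mandatory columns as listed in this section.
REQUIRED_COLUMNS = ("subject_id", "predicate_id", "object_id", "mapping_justification")


def split_sssom_tsv(text: str) -> tuple[str, list[dict[str, str]]]:
    """Split SSSOM TSV text into (front-matter YAML text, list of row dicts)."""
    front_matter: list[str] = []
    body: list[str] = []
    for line in text.splitlines():
        if line.startswith("#") and not body:
            # Drop only the leading "#"; indentation matters to YAML (curie_map).
            front_matter.append(line[1:])
        elif line.strip():
            body.append(line)
    # The first non-comment line is the header row; the rest are data rows.
    reader = csv.DictReader(io.StringIO("\n".join(body)), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing mandatory SSSOM columns: {missing}")
    return "\n".join(front_matter), list(reader)
```

The returned front-matter string can then be handed to any YAML loader to obtain mapping_set_id, license, and the curie_map block.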

import_sssom_source(entry: dict[str, Any], output_dir: str = 'docker/mappings/sssom') → dict[str, Any]

Download and convert one SSSOM source entry to JSON-LD files.

For each .sssom.tsv file found in the archive at entry["url"], one JSON-LD file is written to output_dir:

{source_name}__{sssom_filename_stem}.jsonld

Parameters:
  • entry – A dict with at least name and url keys, as found in data/sssom_sources.yaml.

  • output_dir – Directory to write output JSON-LD files.

Returns:

Summary dict:

{
    "succeeded": ["ols_mappings__hp.sssom.tsv", ...],
    "failed": [{"file": "...", "error": "..."}],
    "skipped": [],
}
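
A caller will usually want to check the summary for failures. The sketch below relies only on the three keys shown above; describe_result is a hypothetical helper, not part of rdfsolve.

```python
from typing import Any


def describe_result(result: dict[str, Any]) -> str:
    """Build a short human-readable report from an import summary dict."""
    lines = [
        f"succeeded: {len(result.get('succeeded', []))}",
        f"failed: {len(result.get('failed', []))}",
        f"skipped: {len(result.get('skipped', []))}",
    ]
    # Each failure is a dict with "file" and "error" keys, as documented above.
    for failure in result.get("failed", []):
        lines.append(f"  {failure['file']}: {failure['error']}")
    return "\n".join(lines)
```

Calling print(describe_result(result)) after import_sssom_source() gives a quick overview before deciding whether to retry the failed files.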

seed_sssom_mappings(sssom_sources_yaml: str = 'data/sssom_sources.yaml', output_dir: str = 'docker/mappings/sssom', names: list[str] | None = None) → dict[str, Any]

Seed SSSOM mapping files for all (or selected) sources.

Reads sssom_sources_yaml, optionally filters to names, and calls import_sssom_source() for each entry.

Parameters:
  • sssom_sources_yaml – Path to the SSSOM sources YAML file.

  • output_dir – Directory for output JSON-LD files.

  • names – Optional list of source names to restrict processing; if None (default), all entries are processed.

Returns:

Aggregated summary with keys "succeeded", "failed", "skipped".