SSSOM Importer
SSSOM (Simple Standard for Sharing Ontology Mappings) importer.
Downloads SSSOM bundles listed in data/sssom_sources.yaml, extracts every
.sssom.tsv file, converts each one to a SsomMapping JSON-LD file, and writes
the results to a configurable output directory (default: docker/mappings/sssom/).
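The extraction step can be pictured with the standard library alone. The following is a minimal sketch, not the importer's actual code: the helper names iter_sssom_tsv_members and _make_demo_archive are hypothetical, and the archive is built in memory so that nothing is downloaded.

```python
import io
import tarfile


def iter_sssom_tsv_members(archive_bytes: bytes):
    """Yield (member_name, file_bytes) for every .sssom.tsv file in a tar archive."""
    # Mode "r:*" lets tarfile detect the compression (gzip, bz2, xz) on its own.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        for member in tar:
            if member.isfile() and member.name.endswith(".sssom.tsv"):
                extracted = tar.extractfile(member)
                if extracted is not None:
                    yield member.name, extracted.read()


def _make_demo_archive() -> bytes:
    """Build a tiny in-memory .tgz (hypothetical content) so the sketch runs offline."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        files = [("mappings/hp.sssom.tsv", b"subject_id\n"), ("README.txt", b"hi\n")]
        for name, payload in files:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


# Only the .sssom.tsv member is kept; README.txt is ignored.
found = dict(iter_sssom_tsv_members(_make_demo_archive()))
```

Reading members into memory keeps the sketch short; for large bundles, extracting to a temporary directory may be the better choice.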
Typical usage
From Python:
from rdfsolve.sssom_importer import import_sssom_source, seed_sssom_mappings

# Import a single source defined in data/sssom_sources.yaml
result = import_sssom_source(
    entry={
        "name": "ols_mappings",
        "provider": "EMBL-EBI (UK)",
        "url": "https://ftp.ebi.ac.uk/pub/databases/spot/ols/latest/mappings_sssom.tgz",
    },
    output_dir="docker/mappings/sssom/",
)

# Seed all sources in the YAML
results = seed_sssom_mappings(
    sssom_sources_yaml="data/sssom_sources.yaml",
    output_dir="docker/mappings/sssom/",
)
From the CLI:
python scripts/seed_sssom_mappings.py
python scripts/seed_sssom_mappings.py --name ols_mappings
python scripts/seed_sssom_mappings.py --output-dir /tmp/sssom/
SSSOM TSV format
Each .sssom.tsv file has a YAML front-matter block (lines starting with
#) followed by a TSV header row and data rows. Mandatory mapping columns
(SSSOM v0.15 and later):
subject_id predicate_id object_id mapping_justification
Optional columns used when present:
subject_label object_label confidence license mapping_set_id mapping_set_title subject_source object_source
The front-matter may also carry mapping_set_id, mapping_set_title,
license, and a curie_map (prefix -> namespace) block.
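To make the file layout concrete, here is a minimal sketch that separates the commented front-matter from the table and checks the mandatory columns. It is an illustration under stated assumptions, not the importer's parser: split_sssom_tsv and REQUIRED_COLUMNS are hypothetical names, and the sample mapping row is made-up data.

```python
import csv
import io

# The four mandatory columns listed above.
REQUIRED_COLUMNS = ("subject_id", "predicate_id", "object_id", "mapping_justification")


def split_sssom_tsv(text: str):
    """Return (front_matter_text, rows) for the text of one .sssom.tsv file."""
    meta_lines, table_lines = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            # Strip only the "#" marker; the relative indentation that nests
            # the curie_map entries is kept.
            meta_lines.append(line[1:])
        elif line.strip():
            table_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(table_lines)), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing mandatory SSSOM columns: {missing}")
    return "\n".join(meta_lines), list(reader)


# Made-up sample: two metadata keys, then a header row and one data row.
sample = (
    "# mapping_set_id: https://example.org/sets/demo\n"
    "# curie_map:\n"
    "#   HP: http://purl.obolibrary.org/obo/HP_\n"
    "#   MP: http://purl.obolibrary.org/obo/MP_\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    "HP:0000001\tskos:exactMatch\tMP:0000001\tsemapv:ManualMappingCuration\n"
)
front_matter, rows = split_sssom_tsv(sample)
```

The returned front-matter string is intended to be handed to a YAML parser (for example PyYAML's yaml.safe_load, if that package is installed) to obtain mapping_set_id, license, and the curie_map.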
- import_sssom_source(entry: dict[str, Any], output_dir: str = 'docker/mappings/sssom') -> dict[str, Any]
Download and convert one SSSOM source entry to JSON-LD files.
For each .sssom.tsv file found in the archive at entry["url"], one JSON-LD
file is written to output_dir:
{source_name}__{sssom_filename_stem}.jsonld
- Parameters:
entry – A dict with at least name and url keys, as found in data/sssom_sources.yaml.
output_dir – Directory to write output JSON-LD files.
- Returns:
Summary dict:
{
    "succeeded": ["ols_mappings__hp.sssom.tsv", ...],
    "failed": [{"file": "...", "error": "..."}],
    "skipped": [],
}
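A caller will usually want to turn this summary into something readable. The sketch below shows one way to do that; describe_summary is a hypothetical helper, the example values are invented, and the entries of "skipped" are assumed to be printable values because the documentation only shows an empty list.

```python
from typing import Any


def describe_summary(summary: dict[str, Any]) -> list[str]:
    """Turn a summary dict of the documented shape into one line per file."""
    lines = [f"ok: {name}" for name in summary.get("succeeded", [])]
    # Each failure is a dict with "file" and "error" keys, as documented above.
    lines += [f"failed: {item['file']} ({item['error']})" for item in summary.get("failed", [])]
    # Assumption: skipped entries can simply be converted with str().
    lines += [f"skipped: {item}" for item in summary.get("skipped", [])]
    return lines


# Invented example data in the documented shape.
example = {
    "succeeded": ["ols_mappings__hp.sssom.tsv"],
    "failed": [{"file": "broken.sssom.tsv", "error": "missing column"}],
    "skipped": [],
}
report = describe_summary(example)
```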
- seed_sssom_mappings(sssom_sources_yaml: str = 'data/sssom_sources.yaml', output_dir: str = 'docker/mappings/sssom', names: list[str] | None = None) -> dict[str, Any]
Seed SSSOM mapping files for all (or selected) sources.
Reads sssom_sources_yaml, optionally filters to names, and calls
import_sssom_source() for each entry.
- Parameters:
sssom_sources_yaml – Path to the SSSOM sources YAML file.
output_dir – Directory for output JSON-LD files.
names – Optional list of source names to restrict processing; if
None (default), all entries are processed.
- Returns:
Aggregated summary with keys "succeeded", "failed", and "skipped".
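The filter-then-aggregate behaviour described for seed_sssom_mappings() can be sketched as follows. This is an assumption-laden illustration rather than the library's implementation: seed_sketch and fake_importer are hypothetical names, "other_source" is an invented entry, the YAML file is replaced by an in-memory list, and sources excluded by names are assumed to be left out of the summary entirely.

```python
from typing import Any, Callable, Iterable, Optional

Summary = dict[str, list[Any]]


def seed_sketch(
    entries: Iterable[dict[str, Any]],
    importer: Callable[[dict[str, Any]], Summary],
    names: Optional[list[str]] = None,
) -> Summary:
    """Filter entries by name, import each one, and merge the per-source summaries."""
    total: Summary = {"succeeded": [], "failed": [], "skipped": []}
    for entry in entries:
        # names=None means "process everything", mirroring the documented default.
        if names is not None and entry["name"] not in names:
            continue
        result = importer(entry)
        for key in total:
            total[key].extend(result.get(key, []))
    return total


def fake_importer(entry: dict[str, Any]) -> Summary:
    """Hypothetical stand-in for import_sssom_source(); it performs no download."""
    return {"succeeded": [f"{entry['name']}__demo.sssom.tsv"], "failed": [], "skipped": []}


entries = [{"name": "ols_mappings", "url": "..."}, {"name": "other_source", "url": "..."}]
everything = seed_sketch(entries, fake_importer)
only_ols = seed_sketch(entries, fake_importer, names=["ols_mappings"])
```

Passing the importer as a parameter keeps the sketch testable without network access; the real function calls import_sssom_source() directly.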