SeMRA Converter

Converter layer between rdfsolve mapping types and semra types.

This module is the only place where rdfsolve and semra types meet. All other rdfsolve modules import from here; they never import semra directly.

Key functions

rdfsolve_edges_to_semra

Convert a list of MappingEdge + provenance into list[semra.Mapping].

semra_to_rdfsolve_edges

Convert list[semra.Mapping] back to MappingEdge list.

semra_evidence_to_jsonld_about

Serialise a semra evidence chain into a JSON-LD @about fragment.

import_source(source: str, keep_prefixes: list[str] | None = None, output_dir: str = 'docker/mappings/semra') → dict[str, Any]

Fetch mappings from a SeMRA source and write JSON-LD files.

For each unique subject prefix in the fetched mappings, writes {output_dir}/{source}_{prefix}.jsonld.

Handles the Wikidata special case (per-prefix fetch via get_wikidata_mappings_by_prefix).

Parameters:
  • source – SeMRA source key (e.g. "biomappings").

  • keep_prefixes – Optional prefix filter.

  • output_dir – Directory for output files.

Returns:

Summary dict {"succeeded": [...], "failed": [...], "skipped": [...]}.
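
The per-prefix file naming described above can be sketched as follows. This is a minimal illustration, not the module's code: plan_output_files is a hypothetical helper, subjects are assumed to be plain "prefix:identifier" CURIE strings, and applying keep_prefixes to the subject prefix is an assumption. The Wikidata special case is not covered.

```python
from collections import defaultdict
from pathlib import Path


def plan_output_files(subject_curies, source, output_dir="docker/mappings/semra",
                      keep_prefixes=None):
    """Group subject CURIEs by prefix and compute one output path per prefix.

    Hypothetical helper: it only mirrors the documented naming scheme
    {output_dir}/{source}_{prefix}.jsonld and writes nothing to disk.
    """
    groups = defaultdict(list)
    for curie in subject_curies:
        prefix, _, _identifier = curie.partition(":")
        # Assumption: the optional filter is applied to the subject prefix.
        if keep_prefixes is not None and prefix not in keep_prefixes:
            continue
        groups[prefix].append(curie)
    return {
        prefix: Path(output_dir) / f"{source}_{prefix}.jsonld"
        for prefix in sorted(groups)
    }


plan = plan_output_files(["chebi:1", "chebi:2", "mesh:D1"], "biomappings", output_dir="out")
```

Here plan holds one entry per unique subject prefix, which matches the "one file per prefix" behaviour described for import_source.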

rdfsolve_edges_to_semra(edges: list[MappingEdge], about: AboutMetadata | None = None) → list[SemraMapping_]

Convert rdfsolve MappingEdge list to semra Mapping list.

Each MappingEdge becomes one semra.Mapping with a single SimpleEvidence. The evidence carries:

  • justification derived from about.strategy (defaults to semapv:UnspecifiedMatchingProcess).

  • mapping_set whose name is the source dataset and whose purl is the source endpoint URL (if available).
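
The justification default can be pictured with a small lookup. This is a sketch under stated assumptions: the strategy keys and the STRATEGY_TO_JUSTIFICATION table are hypothetical, since the actual mapping from about.strategy to a justification is not shown here. Only the fallback value comes from the description above.

```python
from typing import Optional

# Default named in the documentation above.
DEFAULT_JUSTIFICATION = "semapv:UnspecifiedMatchingProcess"

# Hypothetical strategy names; the real lookup table may differ.
STRATEGY_TO_JUSTIFICATION = {
    "manual": "semapv:ManualMappingCuration",
    "lexical": "semapv:LexicalMatching",
}


def justification_for(strategy: Optional[str]) -> str:
    """Return a justification CURIE, falling back to the documented default."""
    if strategy is None:
        return DEFAULT_JUSTIFICATION
    # Assumption: unknown strategies also fall back to the default.
    return STRATEGY_TO_JUSTIFICATION.get(strategy, DEFAULT_JUSTIFICATION)
```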

Predicates in the curated map are converted to their canonical semra Reference. Any other predicate URI is parsed directly into a Reference via bioregistry; only edges whose predicate URI cannot be resolved at all are dropped (and logged at DEBUG level).
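
The three-step predicate handling (curated map, direct parse, drop with a DEBUG log) might look roughly like this. The sketch is self-contained and does not call semra or bioregistry: CURATED_PREDICATES and URI_PREFIXES are hypothetical stand-ins, and the result is a CURIE string rather than a semra Reference.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

# Hypothetical curated map from predicate URI to a canonical CURIE.
CURATED_PREDICATES = {
    "http://www.w3.org/2004/02/skos/core#exactMatch": "skos:exactMatch",
    "http://www.w3.org/2002/07/owl#sameAs": "owl:sameAs",
}

# Hypothetical stand-in for URI parsing; the module is described as using bioregistry.
URI_PREFIXES = {
    "http://www.w3.org/2004/02/skos/core#": "skos",
    "http://purl.obolibrary.org/obo/RO_": "RO",
}


def resolve_predicate(uri: str) -> Optional[str]:
    """Resolve a predicate URI to a CURIE, or return None if the edge should be dropped."""
    # Step 1: predicates in the curated map take their canonical form.
    curated = CURATED_PREDICATES.get(uri)
    if curated is not None:
        return curated
    # Step 2: any other URI is parsed by matching a known URI prefix.
    for base, prefix in URI_PREFIXES.items():
        if uri.startswith(base):
            return f"{prefix}:{uri[len(base):]}"
    # Step 3: unresolvable predicates are dropped and logged at DEBUG level.
    logger.debug("Dropping edge with unresolvable predicate: %s", uri)
    return None
```

A caller would skip any edge for which resolve_predicate returns None, so unresolvable predicates never reach the output list.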

Parameters:
  • edges – List of MappingEdge to convert.

  • about – Optional provenance metadata; used for justification lookup.

Returns:

List of semra.Mapping objects.

semra_evidence_to_jsonld_about(evidence_list: list[SimpleEvidence | ReasonedEvidence]) → list[dict[str, Any]]

Serialise a semra evidence chain into a list of JSON-LD dicts.

Returns a list suitable for embedding in @about.evidence.

Each SimpleEvidence becomes:

{
    "type": "simple",
    "justification": "<prefix>:<identifier>",
    "mapping_set": "<name>",
    "purl": "<purl>",
}

Each ReasonedEvidence becomes:

{
    "type": "reasoned",
    "justification": "<prefix>:<identifier>",
    "source_mapping_hexdigests": ["<hex1>", ...],
    "confidence_factor": <float>
}
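
The two shapes above can be produced by a small dispatch on evidence type. The following sketch uses hypothetical stub dataclasses in place of semra's SimpleEvidence and ReasonedEvidence, so the attribute names are assumptions. Only the output keys are taken from the templates above.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class SimpleEvidenceStub:
    """Hypothetical stand-in for semra's SimpleEvidence."""
    justification: str
    mapping_set_name: str
    purl: str


@dataclass
class ReasonedEvidenceStub:
    """Hypothetical stand-in for semra's ReasonedEvidence."""
    justification: str
    source_mapping_hexdigests: list
    confidence_factor: float


def evidence_to_jsonld_dicts(
    evidence_list: list,
) -> list:
    """Serialise stub evidence objects into the documented dict shapes."""
    result: list = []
    for evidence in evidence_list:
        item: dict
        if isinstance(evidence, SimpleEvidenceStub):
            item = {
                "type": "simple",
                "justification": evidence.justification,
                "mapping_set": evidence.mapping_set_name,
                "purl": evidence.purl,
            }
        elif isinstance(evidence, ReasonedEvidenceStub):
            item = {
                "type": "reasoned",
                "justification": evidence.justification,
                "source_mapping_hexdigests": list(evidence.source_mapping_hexdigests),
                "confidence_factor": evidence.confidence_factor,
            }
        else:
            raise TypeError(f"Unsupported evidence type: {type(evidence).__name__}")
        result.append(item)
    return result
```

Output order follows input order, so the chain can be embedded in @about.evidence without re-sorting.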

semra_to_rdfsolve_edges(mappings: list[SemraMapping_], dataset_hint: str = 'semra', endpoint_hint: str = '') → list[MappingEdge] | list[None]

Convert semra Mapping list to rdfsolve MappingEdge list.

Confidence is intentionally omitted (left as None); see the integration plan for a discussion of confidence aggregation.

Parameters:
  • mappings – semra Mapping objects to convert.

  • dataset_hint – Fallback dataset name when evidence doesn’t carry one.

  • endpoint_hint – Fallback endpoint URL.

Returns:

List of MappingEdge.
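
The role of the two hint parameters can be sketched as a simple fallback. This assumes the reverse conversion mirrors the forward one, with the mapping set name read as the dataset and its purl as the endpoint. pick_dataset_and_endpoint is a hypothetical helper, not part of the module.

```python
from typing import Optional


def pick_dataset_and_endpoint(
    mapping_set_name: Optional[str],
    purl: Optional[str],
    dataset_hint: str = "semra",
    endpoint_hint: str = "",
) -> tuple:
    """Prefer values carried by the evidence; fall back to the hints otherwise."""
    # Assumption: a missing or empty value triggers the fallback.
    dataset = mapping_set_name or dataset_hint
    endpoint = purl or endpoint_hint
    return dataset, endpoint
```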