Mapping Models

Mapping models - public API re-exports.

Core

Core mapping models: MappingEdge, InstanceMatchResult, Mapping.

Base class and helpers for all mapping types.

class MappingEdge(*, source_class: str, target_class: str, predicate: str = 'http://www.w3.org/2004/02/skos/core#narrowMatch', source_dataset: str, target_dataset: str, source_endpoint: str | None = None, target_endpoint: str | None = None, confidence: Annotated[float | None, Ge(ge=0), Le(le=1)] = None)

Bases: BaseModel

A single mapping edge between two classes.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

source_class: str

target_class: str

predicate: str

source_dataset: str

target_dataset: str

source_endpoint: str | None

target_endpoint: str | None

confidence: float | None

model_config = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class InstanceMatchResult(*, dataset_name: str, endpoint_url: str, uri_format: str, matched_class: str | None = None)

Bases: BaseModel

Raw result of probing one URI format against one endpoint.

dataset_name: str

endpoint_url: str

uri_format: str

matched_class: str | None

model_config = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class Mapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'unknown')

Bases: BaseModel

Container for a set of mapping edges with provenance.

Base class for all mapping types.

edges: list[MappingEdge]

about: AboutMetadata

mapping_type: str

classmethod from_jsonld(path: str | Path) → Mapping

Reconstruct from a mapping JSON-LD file.

Inverse of to_jsonld(). Expands CURIEs using the file’s own @context block.
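
The CURIE expansion step can be illustrated with a minimal sketch. The helper name expand_curie and the pass-through behavior for unknown prefixes are assumptions made for illustration, not the library's actual implementation; only the idea of resolving prefixes against the file's @context comes from the description above.

```python
def expand_curie(value: str, context: dict[str, str]) -> str:
    """Expand 'prefix:local' using a prefix -> namespace map (hypothetical helper)."""
    if "://" in value:
        return value  # already a full IRI, leave untouched
    prefix, sep, local = value.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return value  # assumption: unknown prefixes are passed through unchanged


# Example @context block as it might appear in a mapping JSON-LD file.
context = {"skos": "http://www.w3.org/2004/02/skos/core#"}
expanded = expand_curie("skos:narrowMatch", context)
```

Here expanded holds the full predicate IRI that MappingEdge uses as its default.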

to_networkx() → Any: Export the mapping as an nx.MultiDiGraph.

classmethod dataset_graph(paths: Iterable[str | Path], class_to_datasets: dict[str, set[str]], *, base_graph: Any | None = None, strategies: Collection[str] | None = None) → Any

Stream mapping files into a weighted dataset-pair graph.

For every mapping edge whose source and target classes both appear in class_to_datasets, increment the weight of the (dataset_a, dataset_b) pair in the output graph.
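
The weighting rule can be sketched with the standard library alone. This is an illustration under stated assumptions, not the method itself: it uses a Counter instead of a graph object, takes edges as plain (source_class, target_class) tuples instead of streaming files, and assumes that pairs are unordered and that same-dataset pairs are skipped. The function name count_dataset_pairs is hypothetical.

```python
from collections import Counter
from itertools import product


def count_dataset_pairs(edges, class_to_datasets):
    """Count how many mapping edges link each unordered pair of datasets."""
    weights = Counter()
    for source_class, target_class in edges:
        # Skip the edge unless both endpoint classes are known.
        if source_class not in class_to_datasets or target_class not in class_to_datasets:
            continue
        pairs = product(class_to_datasets[source_class], class_to_datasets[target_class])
        for dataset_a, dataset_b in pairs:
            if dataset_a != dataset_b:  # assumption: no self-loops
                weights[tuple(sorted((dataset_a, dataset_b)))] += 1
    return weights


class_to_datasets = {
    "ex:Gene": {"dataset_a"},
    "ex:GeneRecord": {"dataset_b"},
    "ex:Protein": {"dataset_a", "dataset_c"},
}
edges = [
    ("ex:Gene", "ex:GeneRecord"),
    ("ex:GeneRecord", "ex:Protein"),
    ("ex:Gene", "ex:Unknown"),  # ignored: target class is not in class_to_datasets
]
weights = count_dataset_pairs(edges, class_to_datasets)
```

In this toy input, the dataset_a/dataset_b pair is reached by two edges and the dataset_b/dataset_c pair by one.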

to_jsonld() → dict[str, Any]

Export as JSON-LD with @context, @graph, @about.

Edges are grouped by source_class.
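
A rough sketch of what grouping by source_class could look like follows. The node layout (one node per source class, with a list of {"@id": ...} targets under each predicate) is an assumption for illustration; the actual @graph structure is not specified here. The function name group_edges_by_source is hypothetical, and edges are shown as plain dicts rather than MappingEdge instances.

```python
from collections import defaultdict


def group_edges_by_source(edges):
    """Build one node per source class, listing targets under each predicate."""
    grouped = defaultdict(lambda: defaultdict(list))
    for edge in edges:
        targets = grouped[edge["source_class"]][edge["predicate"]]
        target = {"@id": edge["target_class"]}
        if target not in targets:  # skip duplicate targets
            targets.append(target)
    return [{"@id": source, **predicates} for source, predicates in grouped.items()]


edges = [
    {"source_class": "ex:Gene", "predicate": "skos:narrowMatch", "target_class": "ex2:Gene"},
    {"source_class": "ex:Gene", "predicate": "skos:narrowMatch", "target_class": "ex3:Gene"},
    {"source_class": "ex:Protein", "predicate": "skos:narrowMatch", "target_class": "ex2:Protein"},
]
graph = group_edges_by_source(edges)
```

The three edges collapse into two nodes, one for ex:Gene with two targets and one for ex:Protein with a single target.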

model_config = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

Instance Mapping

InstanceMapping - instance-based matching.

class InstanceMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'instance_matcher', resource_prefix: str, uri_formats: list[str] = <factory>, match_results: list[InstanceMatchResult] = <factory>)

Bases: Mapping

Mapping generated by instance-based matching.

Probes SPARQL endpoints for instances matching bioregistry URI patterns to discover which classes across different datasets represent the same kind of entity.

mapping_type: str

resource_prefix: str

uri_formats: list[str]

match_results: list[InstanceMatchResult]

to_jsonld() → dict[str, Any]: Extend base JSON-LD with instance-matcher provenance.

classmethod from_bioregistry_resource(prefix: str, datasources: Any, predicate: str = 'http://www.w3.org/2004/02/skos/core#narrowMatch', dataset_names: list[str] | None = None, timeout: float = 60.0) → InstanceMapping

Probe all endpoints for a bioregistry resource.

Parameters:

prefix – Bioregistry prefix (e.g. "ensembl").
datasources – DataFrame with columns [dataset_name, endpoint_url].
predicate – Mapping predicate URI.
dataset_names – Optional subset of datasets to query.
timeout – SPARQL request timeout in seconds.

Returns:

InstanceMapping ready for export.
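
How probe results might turn into cross-dataset class pairs can be sketched as below. This is a guess at the general idea, not the library's algorithm: it assumes every pair of classes that matched the same resource in different datasets becomes a candidate edge, and it works on plain dicts that mirror two InstanceMatchResult fields. No SPARQL request is made, and the function name pair_matched_classes is hypothetical.

```python
from itertools import combinations


def pair_matched_classes(match_results):
    """Pair classes that matched the same resource in different datasets."""
    # Keep only probes that found a class; de-duplicate and sort for stable output.
    hits = sorted(
        {(r["dataset_name"], r["matched_class"]) for r in match_results if r["matched_class"]}
    )
    return [
        {
            "source_dataset": ds_a,
            "source_class": cls_a,
            "target_dataset": ds_b,
            "target_class": cls_b,
        }
        for (ds_a, cls_a), (ds_b, cls_b) in combinations(hits, 2)
        if ds_a != ds_b  # assumption: only cross-dataset pairs are of interest
    ]


match_results = [
    {"dataset_name": "dataset_a", "matched_class": "http://example.org/a/Gene"},
    {"dataset_name": "dataset_b", "matched_class": "http://example.org/b/Gene"},
    {"dataset_name": "dataset_c", "matched_class": None},  # probe found nothing
]
pairs = pair_matched_classes(match_results)
```

Only dataset_a and dataset_b yield a pair here, because the dataset_c probe returned no matched class.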

model_config = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

merge_instance_jsonld(existing: dict[str, Any], new: dict[str, Any]) → dict[str, Any]

Merge new instance-mapping JSON-LD into existing.

Merges:

@context - union of all prefix->namespace entries.
@graph - nodes keyed by @id; predicate targets are merged (duplicates skipped).
@about - uri_formats_queried is unioned; pattern_count recomputed; generated_at refreshed.

Returns: The existing dict, mutated in place (also returned for convenience).
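
The three merge rules can be sketched as follows. This is a simplified stand-in, not the function's source: the name merge_jsonld_sketch is hypothetical, and it assumes that predicate targets are stored as lists, that pattern_count equals the number of unioned URI formats, that generated_at is an ISO 8601 UTC timestamp, and that entries from new win on @context conflicts.

```python
from datetime import datetime, timezone


def merge_jsonld_sketch(existing, new):
    """Merge `new` into `existing` in place and return `existing`."""
    # @context: union of prefix -> namespace entries.
    existing.setdefault("@context", {}).update(new.get("@context", {}))

    # @graph: index nodes by @id, then merge predicate targets without duplicates.
    graph = existing.setdefault("@graph", [])
    nodes = {node["@id"]: node for node in graph}
    for node in new.get("@graph", []):
        if node["@id"] not in nodes:
            nodes[node["@id"]] = node
            graph.append(node)
            continue
        merged = nodes[node["@id"]]
        for predicate, targets in node.items():
            if predicate == "@id":
                continue
            known = merged.setdefault(predicate, [])
            for target in targets:
                if target not in known:
                    known.append(target)

    # @about: union the queried formats, recompute the count, refresh the timestamp.
    about = existing.setdefault("@about", {})
    formats = set(about.get("uri_formats_queried", []))
    formats |= set(new.get("@about", {}).get("uri_formats_queried", []))
    about["uri_formats_queried"] = sorted(formats)
    about["pattern_count"] = len(formats)
    about["generated_at"] = datetime.now(timezone.utc).isoformat()
    return existing


existing = {
    "@context": {"ex": "http://example.org/"},
    "@graph": [{"@id": "ex:Gene", "skos:narrowMatch": [{"@id": "ex2:Gene"}]}],
    "@about": {"uri_formats_queried": ["http://a.example/"], "pattern_count": 1},
}
new = {
    "@context": {"ex2": "http://example.org/2/"},
    "@graph": [
        {"@id": "ex:Gene", "skos:narrowMatch": [{"@id": "ex2:Gene"}, {"@id": "ex3:Gene"}]}
    ],
    "@about": {"uri_formats_queried": ["http://a.example/", "http://b.example/"]},
}
merged = merge_jsonld_sketch(existing, new)
```

Because the merge happens in place, merged and existing refer to the same dict; callers that need the original should copy it first.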

SSSOM Mapping

SsomMapping - SSSOM-derived mappings.

class SsomMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'sssom_import', source_name: str, sssom_file: str, mapping_set_id: str | None = None, mapping_set_title: str | None = None, license: str | None = None, curie_map: dict[str, str] = <factory>)

Bases: Mapping

Mapping imported from an SSSOM source.

Each instance corresponds to one .sssom.tsv file extracted from an SSSOM bundle (e.g. the EBI OLS SSSOM archive).

mapping_type: str

source_name: str

sssom_file: str

mapping_set_id: str | None

mapping_set_title: str | None

license: str | None

curie_map: dict[str, str]

to_jsonld() → dict[str, Any]: Extend base JSON-LD with SSSOM provenance.

model_config = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

SeMRA Mapping

SemraMapping - SeMRA-derived mappings.

class SemraMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'semra_import', source_name: str, source_prefix: str | None = None, evidence_chain: list[dict[str, Any]] = <factory>)

Bases: Mapping

Mapping imported from a SeMRA external source.

Carries the semra source key (e.g. "biomappings") and, for per-prefix sources, the bioregistry prefix.

mapping_type: str

source_name: str

source_prefix: str | None

evidence_chain: list[dict[str, Any]]

to_jsonld() → dict[str, Any]: Extend base JSON-LD with SeMRA provenance.

model_config = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

Inference Mapping

InferencedMapping - inference pipeline mappings.

class InferencedMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'inferenced', inference_types: list[str] = <factory>, source_mapping_files: list[str] = <factory>, evidence_chain: list[dict[str, Any]] = <factory>, stats: dict[str, Any] = <factory>)

Bases: Mapping

Mapping produced by the rdfsolve/SeMRA inference pipeline.

Carries the set of inference types applied, source mapping files, evidence chain, and optional aggregate stats.

mapping_type: str

inference_types: list[str]

source_mapping_files: list[str]

evidence_chain: list[dict[str, Any]]

stats: dict[str, Any]

to_jsonld() → dict[str, Any]: Extend base JSON-LD with inference provenance.

model_config = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].