Mapping Models
Mapping models - public API re-exports.
Core
Core mapping models: MappingEdge, InstanceMatchResult, Mapping.
Base class and helpers for all mapping types.
- class MappingEdge(*, source_class: str, target_class: str, predicate: str = 'http://www.w3.org/2004/02/skos/core#narrowMatch', source_dataset: str, target_dataset: str, source_endpoint: str | None = None, target_endpoint: str | None = None, confidence: Annotated[float | None, Ge(ge=0), Le(le=1)] = None)
Bases: BaseModel
A single mapping edge between two classes.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class InstanceMatchResult(*, dataset_name: str, endpoint_url: str, uri_format: str, matched_class: str | None = None)
Bases: BaseModel
Raw result of probing one URI format against one endpoint.
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- class Mapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'unknown')
Bases: BaseModel
Container for a set of mapping edges with provenance.
Base class for all mapping types.
- edges: list[MappingEdge]
- about: AboutMetadata
- classmethod from_jsonld(path: str | Path) -> Mapping
Reconstruct from a mapping JSON-LD file.
Inverse of to_jsonld(). Expands CURIEs using the file’s own @context block.
- classmethod dataset_graph(paths: Iterable[str | Path], class_to_datasets: dict[str, set[str]], *, base_graph: Any | None = None, strategies: Collection[str] | None = None) -> Any
Stream mapping files into a weighted dataset-pair graph.
For every mapping edge whose source and target classes both appear in class_to_datasets, increment the weight of the (dataset_a, dataset_b) pair in the output graph.
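The weighting rule above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the library's implementation: edges are reduced to plain (source_class, target_class) tuples, the graph is a Counter keyed by sorted dataset pairs, every combination of a source-class dataset with a target-class dataset is counted, and same-dataset pairs are skipped. The helper name count_dataset_pairs and the ex: class names are hypothetical.

```python
from collections import Counter
from itertools import product


def count_dataset_pairs(edges, class_to_datasets):
    """Count mapping edges per unordered dataset pair (hypothetical helper)."""
    weights = Counter()
    for source_class, target_class in edges:
        # Skip the edge unless both endpoint classes are known.
        if source_class not in class_to_datasets or target_class not in class_to_datasets:
            continue
        pairs = product(class_to_datasets[source_class], class_to_datasets[target_class])
        # Assumption: pairs are unordered and same-dataset pairs carry no weight.
        # The set ensures each dataset pair is counted once per edge.
        for pair in {tuple(sorted(p)) for p in pairs if p[0] != p[1]}:
            weights[pair] += 1
    return weights


class_to_datasets = {
    "ex:Gene": {"dataset_a"},
    "ex:GeneRecord": {"dataset_b"},
    "ex:Protein": {"dataset_a", "dataset_c"},
}
edges = [
    ("ex:Gene", "ex:GeneRecord"),
    ("ex:GeneRecord", "ex:Protein"),
    ("ex:Gene", "ex:Unknown"),  # ignored: target class is not in class_to_datasets
]
weights = count_dataset_pairs(edges, class_to_datasets)
```

Here the (dataset_a, dataset_b) pair ends up with weight 2 and (dataset_b, dataset_c) with weight 1; the third edge contributes nothing because one of its classes is unknown.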
- to_jsonld() -> dict[str, Any]
Export as JSON-LD with @context, @graph, @about.
Edges are grouped by source_class.
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
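The grouping done by to_jsonld() and the CURIE expansion done by from_jsonld() can be illustrated with a small standalone sketch. The exact JSON-LD layout is not documented here, so the node shape below (an @id plus predicate keys holding lists of {"@id": ...} targets) is an assumption, and group_edges and expand_curie are hypothetical helpers rather than part of the API.

```python
from collections import defaultdict


def group_edges(edges):
    """Group (source, predicate, target) CURIE triples into @graph nodes."""
    grouped = defaultdict(lambda: defaultdict(list))
    for source, predicate, target in edges:
        grouped[source][predicate].append({"@id": target})
    # One node per source class; predicates become keys on that node.
    return [{"@id": source, **predicates} for source, predicates in grouped.items()]


def expand_curie(curie, context):
    """Expand prefix:local via a prefix -> namespace context; pass through otherwise."""
    prefix, sep, local = curie.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return curie


context = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/",  # illustrative namespace
}
edges = [
    ("ex:Gene", "skos:narrowMatch", "ex:GeneRecord"),
    ("ex:Gene", "skos:narrowMatch", "ex:Locus"),
    ("ex:Protein", "skos:narrowMatch", "ex:Polypeptide"),
]
graph = group_edges(edges)
expanded = expand_curie("skos:narrowMatch", context)
```

With this input, graph holds two nodes (one per source class) and expanded is the full SKOS narrowMatch URI used as the default predicate. Strings whose prefix is not in the context, such as full IRIs, are returned unchanged.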
Instance Mapping
InstanceMapping - instance-based matching.
- class InstanceMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'instance_matcher', resource_prefix: str, uri_formats: list[str] = <factory>, match_results: list[InstanceMatchResult] = <factory>)
Bases: Mapping
Mapping generated by instance-based matching.
Probes SPARQL endpoints for instances matching bioregistry URI patterns to discover which classes across different datasets represent the same kind of entity.
- match_results: list[InstanceMatchResult]
- classmethod from_bioregistry_resource(prefix: str, datasources: Any, predicate: str = 'http://www.w3.org/2004/02/skos/core#narrowMatch', dataset_names: list[str] | None = None, timeout: float = 60.0) -> InstanceMapping
Probe all endpoints for a bioregistry resource.
- Parameters:
prefix – Bioregistry prefix (e.g. "ensembl").
datasources – DataFrame with columns [dataset_name, endpoint_url].
predicate – Mapping predicate URI.
dataset_names – Optional subset of datasets to query.
timeout – SPARQL request timeout in seconds.
- Returns:
InstanceMapping ready for export.
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- merge_instance_jsonld(existing: dict[str, Any], new: dict[str, Any]) -> dict[str, Any]
Merge new instance-mapping JSON-LD into existing.
Merges:
@context - union of all prefix->namespace entries.
@graph - nodes keyed by @id; predicate targets are merged (duplicates skipped).
@about - uri_formats_queried is unioned; pattern_count recomputed; generated_at refreshed.
- Returns:
The mutated existing dict (also returned for convenience).
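The three merge rules can be sketched as follows. This is an illustration under assumptions, not the actual merge_instance_jsonld source: predicate values are taken to be lists of {"@id": ...} objects, an existing @context entry wins on conflict, pattern_count is taken to be the number of distinct URI formats, and generated_at is written as a UTC ISO 8601 timestamp. The function name merge_jsonld_sketch and the sample URIs are hypothetical.

```python
from datetime import datetime, timezone


def merge_jsonld_sketch(existing, new):
    """Merge `new` into `existing` in place and return `existing`."""
    # @context: union of prefix -> namespace entries.
    context = existing.setdefault("@context", {})
    for prefix, namespace in new.get("@context", {}).items():
        context.setdefault(prefix, namespace)  # assumption: existing entry wins

    # @graph: index existing nodes by @id, then merge predicate targets.
    graph = existing.setdefault("@graph", [])
    nodes = {node["@id"]: node for node in graph}
    for node in new.get("@graph", []):
        known = nodes.get(node["@id"])
        if known is None:
            graph.append(node)
            nodes[node["@id"]] = node
            continue
        for predicate, targets in node.items():
            if predicate == "@id":
                continue
            merged = known.setdefault(predicate, [])
            for target in targets:
                if target not in merged:  # skip duplicates
                    merged.append(target)

    # @about: union the URI formats, recompute the count, refresh the timestamp.
    about = existing.setdefault("@about", {})
    formats = list(about.get("uri_formats_queried", []))
    for uri_format in new.get("@about", {}).get("uri_formats_queried", []):
        if uri_format not in formats:
            formats.append(uri_format)
    about["uri_formats_queried"] = formats
    about["pattern_count"] = len(formats)  # assumption: count of distinct formats
    about["generated_at"] = datetime.now(timezone.utc).isoformat()
    return existing


existing = {
    "@context": {"ex": "http://example.org/"},
    "@graph": [{"@id": "ex:Gene", "skos:narrowMatch": [{"@id": "ex:GeneRecord"}]}],
    "@about": {"uri_formats_queried": ["http://example.org/gene/"], "pattern_count": 1},
}
new = {
    "@context": {"skos": "http://www.w3.org/2004/02/skos/core#"},
    "@graph": [
        {"@id": "ex:Gene", "skos:narrowMatch": [{"@id": "ex:GeneRecord"}, {"@id": "ex:Locus"}]},
        {"@id": "ex:Protein", "skos:narrowMatch": [{"@id": "ex:Polypeptide"}]},
    ],
    "@about": {"uri_formats_queried": ["http://example.org/gene/", "http://example.org/locus/"]},
}
merged = merge_jsonld_sketch(existing, new)
```

After the call, merged is the same object as existing, its @graph has two nodes, the ex:Gene node lists two distinct targets, and pattern_count is 2.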
SSSOM Mapping
SsomMapping - SSSOM-derived mappings.
- class SsomMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'sssom_import', source_name: str, sssom_file: str, mapping_set_id: str | None = None, mapping_set_title: str | None = None, license: str | None = None, curie_map: dict[str, str] = <factory>)
Bases: Mapping
Mapping imported from an SSSOM source.
Each instance corresponds to one .sssom.tsv file extracted from an SSSOM bundle (e.g. the EBI OLS SSSOM archive).
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
SemraMapping - SeMRA-derived mappings.
- class SemraMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'semra_import', source_name: str, source_prefix: str | None = None, evidence_chain: list[dict[str, Any]] = <factory>)
Bases: Mapping
Mapping imported from a SeMRA external source.
Carries the semra source key (e.g. "biomappings") and, for per-prefix sources, the bioregistry prefix.
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
Inference Mapping
InferencedMapping - inference pipeline mappings.
- class InferencedMapping(*, edges: list[MappingEdge] = <factory>, about: AboutMetadata, mapping_type: str = 'inferenced', inference_types: list[str] = <factory>, source_mapping_files: list[str] = <factory>, evidence_chain: list[dict[str, Any]] = <factory>, stats: dict[str, Any] = <factory>)
Bases: Mapping
Mapping produced by the rdfsolve/SeMRA inference pipeline.
Carries the set of inference types applied, source mapping files, evidence chain, and optional aggregate stats.
- model_config = {}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.