Inference
SeMRA-powered inference pipeline for rdfsolve mappings.
Takes one or more mapping JSON-LD files, converts their edges to
semra.Mapping objects, applies the requested inference operations
(inversion, transitivity/chain, generalisation), deduplicates via
semra.api.assemble_evidences, and writes the result as an
InferencedMapping JSON-LD file.
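The inversion and deduplication steps described above can be pictured with a small, self-contained sketch. It does not use SeMRA or rdfsolve: the `Edge` tuple, the `INVERSES` table, and the `invert` and `deduplicate` helpers are hypothetical names introduced only for illustration, and plain duplicate removal stands in for the evidence merging that semra.api.assemble_evidences performs.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """Toy stand-in for a mapping edge (not the rdfsolve or semra type)."""
    subject: str
    predicate: str
    obj: str


# Assumed inverse table for illustration: symmetric predicates map to
# themselves, broader/narrower predicates swap.
INVERSES = {
    "skos:exactMatch": "skos:exactMatch",
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}


def invert(edges):
    """Return the inverse of every edge whose predicate has a known inverse."""
    return [
        Edge(e.obj, INVERSES[e.predicate], e.subject)
        for e in edges
        if e.predicate in INVERSES
    ]


def deduplicate(edges):
    """Drop repeated edges while preserving first-seen order."""
    return list(dict.fromkeys(edges))


edges = [
    Edge("ex:a", "skos:exactMatch", "ex:b"),
    Edge("ex:b", "skos:exactMatch", "ex:a"),  # already the inverse of the first
    Edge("ex:c", "skos:broadMatch", "ex:d"),
]

# Inversion regenerates the first two edges, so deduplication keeps one copy
# of each and adds only the inverse of the broadMatch edge.
result = deduplicate(edges + invert(edges))
```

In this toy run only one new edge survives deduplication, which is why the pipeline deduplicates after inference rather than before.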
Main entry points
infer_mappings() - full pipeline.
seed_inferenced_mappings() - convenience wrapper for CLI/scripts.
- infer_mappings(input_paths: list[str], output_path: str, *, inversion: bool = True, transitivity: bool = True, generalisation: bool = False, chain_cutoff: int = 3, dataset_name: str | None = None) -> dict[str, Any]
Run the inference pipeline over a set of mapping JSON-LD files.
Loads all mapping edges from input_paths, converts them to semra Mappings, applies the chosen inference operations, deduplicates via semra.api.assemble_evidences, converts back to rdfsolve edges, and writes an InferencedMapping JSON-LD to output_path.
- Parameters:
input_paths – Paths to input mapping JSON-LD files.
output_path – Path to write the inferenced mapping JSON-LD.
inversion – Apply symmetric inversion of every mapping.
transitivity – Apply transitive chain inference.
generalisation – Apply generalisation (broader/narrower).
chain_cutoff – Max chain length for transitivity inference.
dataset_name – Override for the @about.dataset_name field.
- Returns:
Summary dict with keys "input_edges", "output_edges", "inference_types", "output_path".
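To make the effect of chain_cutoff concrete, the sketch below infers new pairs from chains of existing ones. It is a toy illustration, not the SeMRA implementation: the hypothetical `infer_chains_sketch` helper treats every pair as an exact match and assumes that chain length is counted in edges.

```python
def infer_chains_sketch(pairs, cutoff=3):
    """Infer (start, end) pairs linked by a simple path of 2..cutoff edges.

    `pairs` is an iterable of (subject, object) tuples. Pairs that are
    already present are not reported again.
    """
    pairs = list(pairs)
    graph = {}
    for subject, obj in pairs:
        graph.setdefault(subject, []).append(obj)
    known = set(pairs)
    inferred = set()

    def walk(start, node, visited, depth):
        # depth is the number of edges followed so far from start.
        for nxt in graph.get(node, []):
            if nxt in visited:
                continue  # keep paths simple (no repeated nodes)
            if depth + 1 >= 2 and (start, nxt) not in known:
                inferred.add((start, nxt))
            if depth + 1 < cutoff:
                walk(start, nxt, visited | {nxt}, depth + 1)

    for start in list(graph):
        walk(start, start, {start}, 0)
    return inferred


chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
short = infer_chains_sketch(chain, cutoff=2)
default = infer_chains_sketch(chain, cutoff=3)
```

Under these assumptions, raising the cutoff from 2 to 3 adds the three-edge chains (such as a to d), while the four-edge chain from a to e stays out of reach until the cutoff is 4.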
- seed_inferenced_mappings(input_dir: str = 'docker/mappings', output_dir: str = 'docker/mappings/inferenced', output_name: str = 'inferenced_mappings', inversion: bool = True, transitivity: bool = True, generalisation: bool = False, chain_cutoff: int = 3) -> dict[str, Any]
Infer over all mappings in input_dir and write to output_dir.
Collects all *.jsonld files under input_dir (instance_matching/, semra/, and sssom/ subdirs), runs infer_mappings(), and writes {output_dir}/{output_name}.jsonld.
This is the convenience entry-point for the CLI and seed scripts.
- Parameters:
input_dir – Directory that contains mapping subdirs.
output_dir – Directory to write inferenced output.
output_name – Stem for the output file (without .jsonld).
inversion – Apply inversion inference.
transitivity – Apply transitivity inference.
generalisation – Apply generalisation inference.
chain_cutoff – Max chain length for transitivity.
- Returns:
Summary from infer_mappings().
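The file collection step can be sketched with pathlib. The helpers below are hypothetical and are not the rdfsolve implementation: they assume that only the top level of each of the three named subdirectories is searched, that missing subdirectories are skipped, and that files are processed in sorted order.

```python
from pathlib import Path

# Subdirectory names taken from the description above.
SUBDIRS = ("instance_matching", "semra", "sssom")


def collect_inputs(input_dir, subdirs=SUBDIRS):
    """Return the *.jsonld files found directly inside each mapping subdir."""
    root = Path(input_dir)
    paths = []
    for sub in subdirs:
        folder = root / sub
        if not folder.is_dir():
            continue  # assumption: a missing subdir is not an error
        paths.extend(sorted(folder.glob("*.jsonld")))
    return [str(p) for p in paths]


def build_output_path(output_dir, output_name):
    """Join the output directory and file stem, appending the .jsonld suffix."""
    return str(Path(output_dir) / f"{output_name}.jsonld")
```

With the documented defaults, `build_output_path("docker/mappings/inferenced", "inferenced_mappings")` gives the path of `inferenced_mappings.jsonld` inside the output directory. Because only the three named subdirectories are searched in this sketch, a previously written output file under `inferenced/` would not be picked up as input on a later run.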