polymerist.genutils.sequences.similarity.edits
For calculating the edit distance between sequences and inspecting the edits needed to go between them
Attributes
- T
Classes
- EditOperation: For annotating distinct kinds of sequence edits and their associated index offsets
- EditInfo: For bundling together information about a sequence edit step
Functions
- compute_wf_matrix: Compute (N+1)x(M+1) matrix of Levenshtein distances between all partial prefixes of a pair of sequences
- traverse_wf_matrix: Takes a Wagner-Fischer Levenshtein distance matrix and returns the indices of the minimal path through the matrix
- describe_edits: Describes step-by-step the insertion, deletion, or substitution operations needed to transform one sequence into another
- levenshtein_distance: Compute the Levenshtein (edit) distance between a pair of sequences with elements of compatible type
Module Contents
- polymerist.genutils.sequences.similarity.edits.T
- class polymerist.genutils.sequences.similarity.edits.EditOperation(*args, **kwds)
Bases: enum.Enum
For annotating distinct kinds of sequence edits and their associated index offsets
- NULL = 0
- INSERTION = 1
- DELETION = 2
- SUBSTITUTION = 3
- property bits: tuple[int, int]
Convert the integer value of the Enum field into its binary bits
- offsets
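As a rough illustration of how the four integer values above can map onto a pair of bits, the sketch below defines a stand-in enum. The name SketchEditOp and the bit ordering (most significant bit first) are assumptions made for this example, not details confirmed by the library.

```python
from enum import Enum


class SketchEditOp(Enum):
    """Hypothetical stand-in for EditOperation, used only for illustration."""
    NULL = 0
    INSERTION = 1
    DELETION = 2
    SUBSTITUTION = 3

    @property
    def bits(self) -> tuple[int, int]:
        # Assumption: most significant bit first, so 2 -> (1, 0)
        return (self.value >> 1) & 1, self.value & 1


for op in SketchEditOp:
    print(op.name, op.bits)
```

Under this assumed ordering, each pair could be read as a (row, column) step in a distance matrix, which would be one plausible reading of the "index offsets" mentioned in the class description.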
- class polymerist.genutils.sequences.similarity.edits.EditInfo
For bundling together information about a sequence edit step
- edit_op: EditOperation
- indices: tuple[int, int]
- distance: int
- polymerist.genutils.sequences.similarity.edits.compute_wf_matrix(seq1: Sequence[T], seq2: Sequence[T], int_type: Type = int) -> numpy.ndarray[polymerist.genutils.typetools.numpytypes.Shape[polymerist.genutils.typetools.numpytypes.N, polymerist.genutils.typetools.numpytypes.M], int]
Compute the (N+1)x(M+1) matrix of Levenshtein distances between all partial prefixes of a pair of sequences, where N and M are the lengths of the first and second sequence, respectively. Implements the Wagner-Fischer algorithm.
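The Wagner-Fischer recurrence can be sketched in plain Python as follows. This is a minimal illustration using nested lists rather than a NumPy array; the function name wf_matrix_sketch is hypothetical, and the code is not taken from the library's implementation.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def wf_matrix_sketch(seq1: Sequence[T], seq2: Sequence[T]) -> list[list[int]]:
    """Build the (N+1)x(M+1) table of prefix-to-prefix edit distances."""
    n, m = len(seq1), len(seq2)
    mat = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        mat[i][0] = i  # deleting i elements turns a prefix into the empty sequence
    for j in range(m + 1):
        mat[0][j] = j  # inserting j elements builds a prefix from the empty sequence
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if seq1[i - 1] == seq2[j - 1] else 1
            mat[i][j] = min(
                mat[i - 1][j] + 1,         # deletion
                mat[i][j - 1] + 1,         # insertion
                mat[i - 1][j - 1] + cost,  # match or substitution
            )
    return mat


mat = wf_matrix_sketch("kitten", "sitting")
print(len(mat), len(mat[0]))  # 7 8
print(mat[-1][-1])            # 3
```

The bottom-right entry is the edit distance between the full sequences, and every other entry is the distance between the corresponding pair of prefixes.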
- polymerist.genutils.sequences.similarity.edits.traverse_wf_matrix(wf_matrix: numpy.ndarray[polymerist.genutils.typetools.numpytypes.Shape[polymerist.genutils.typetools.numpytypes.N, polymerist.genutils.typetools.numpytypes.M], int], begin_idxs: tuple[int, int] = (0, 0), end_idxs: tuple[int, int] = (-1, -1)) -> Generator[list[EditInfo], None, None]
Takes a Wagner-Fischer Levenshtein distance matrix and returns the indices of the minimal path through the matrix from the origin (i.e. empty sequences) to the
- polymerist.genutils.sequences.similarity.edits.describe_edits(seq1: Sequence[T], seq2: Sequence[T], int_type: Type = int, indicator: str = ' -> ', delimiter: str = '\n') -> Generator[str, None, None]
Describes step-by-step the insertion, deletion, or substitution operations needed to transform one sequence into another
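One common way to recover such a step-by-step description is to build the distance matrix and then walk back from its last cell to the origin, recording each non-matching move. The sketch below illustrates that idea under stated assumptions: the function name describe_edits_sketch and the wording of the messages are hypothetical, and the library's actual traversal order and output format may differ.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def describe_edits_sketch(seq1: Sequence[T], seq2: Sequence[T]) -> list[str]:
    """Return one message per edit needed to turn seq1 into seq2."""
    n, m = len(seq1), len(seq2)
    # Wagner-Fischer table of prefix-to-prefix distances
    mat = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        mat[i][0] = i
    for j in range(m + 1):
        mat[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if seq1[i - 1] == seq2[j - 1] else 1
            mat[i][j] = min(mat[i - 1][j] + 1, mat[i][j - 1] + 1, mat[i - 1][j - 1] + cost)

    # Walk back from the full sequences (n, m) to the empty prefixes (0, 0)
    steps: list[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and seq1[i - 1] == seq2[j - 1]:
            i, j = i - 1, j - 1  # elements match, no edit needed
        elif i > 0 and j > 0 and mat[i][j] == mat[i - 1][j - 1] + 1:
            steps.append(f"substitute {seq1[i - 1]!r} -> {seq2[j - 1]!r} at index {i - 1}")
            i, j = i - 1, j - 1
        elif i > 0 and mat[i][j] == mat[i - 1][j] + 1:
            steps.append(f"delete {seq1[i - 1]!r} at index {i - 1}")
            i -= 1
        else:
            steps.append(f"insert {seq2[j - 1]!r} at index {j - 1}")
            j -= 1
    steps.reverse()  # report edits from the start of the sequences onward
    return steps


for step in describe_edits_sketch("kitten", "sitting"):
    print(step)
```

Each recorded move lowers the matrix value by exactly one, so the number of messages always equals the edit distance. When several minimal paths exist, the order of the branch checks decides which one is reported.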
- polymerist.genutils.sequences.similarity.edits.levenshtein_distance(seq1: Sequence[T], seq2: Sequence[T], int_type: Type = int) -> int
Compute the Levenshtein (edit) distance between a pair of sequences with elements of compatible type. Denotes the minimal number of insertion, deletion, or substitution operations needed to transform either sequence into the other.
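When only the distance is needed, the full matrix does not have to be kept. The sketch below uses a two-row variant of the Wagner-Fischer recurrence to show this and to check the symmetry implied by "either sequence into the other". The name levenshtein_two_row is hypothetical, and this is not necessarily how the library computes the value.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def levenshtein_two_row(seq1: Sequence[T], seq2: Sequence[T]) -> int:
    """Edit distance computed while storing only two rows of the table."""
    prev = list(range(len(seq2) + 1))  # distances from the empty prefix of seq1
    for i, a in enumerate(seq1, start=1):
        curr = [i]  # distance from seq1[:i] to the empty prefix of seq2
        for j, b in enumerate(seq2, start=1):
            cost = 0 if a == b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


print(levenshtein_two_row("kitten", "sitting"))  # 3
print(levenshtein_two_row("sitting", "kitten"))  # 3
print(levenshtein_two_row([1, 2, 3], [1, 3]))    # 1
```

The elements only need to support equality comparison, so strings, lists, and tuples of comparable items all work in this sketch.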