polymerist.genutils.sequences.similarity.edits
==============================================

.. py:module:: polymerist.genutils.sequences.similarity.edits

.. autoapi-nested-parse::

   For calculating the edit distance between sequences and inspecting the edits needed to go between them


Attributes
----------

.. autoapisummary::

   polymerist.genutils.sequences.similarity.edits.T


Classes
-------

.. autoapisummary::

   polymerist.genutils.sequences.similarity.edits.EditOperation
   polymerist.genutils.sequences.similarity.edits.EditInfo


Functions
---------

.. autoapisummary::

   polymerist.genutils.sequences.similarity.edits.compute_wf_matrix
   polymerist.genutils.sequences.similarity.edits.traverse_wf_matrix
   polymerist.genutils.sequences.similarity.edits.describe_edits
   polymerist.genutils.sequences.similarity.edits.levenshtein_distance


Module Contents
---------------

.. py:data:: T

.. py:class:: EditOperation(*args, **kwds)

   Bases: :py:obj:`enum.Enum`


   For annotating distinct kinds of sequence edits and their associated index offsets


   .. py:attribute:: NULL
      :value: 0


   .. py:attribute:: INSERTION
      :value: 1


   .. py:attribute:: DELETION
      :value: 2


   .. py:attribute:: SUBSTITUTION
      :value: 3


   .. py:property:: bits
      :type: tuple[int, int]


      Convert the integer value of the Enum field into its binary bits


   .. py:attribute:: offsets


.. py:class:: EditInfo

   For bundling together information about a sequence edit step


   .. py:attribute:: edit_op
      :type: EditOperation


   .. py:attribute:: indices
      :type: tuple[int, int]


   .. py:attribute:: distance
      :type: int


.. py:function:: compute_wf_matrix(seq1: Sequence[T], seq2: Sequence[T], int_type: Type = int) -> numpy.ndarray[polymerist.genutils.typetools.numpytypes.Shape[polymerist.genutils.typetools.numpytypes.N, polymerist.genutils.typetools.numpytypes.M], int]

   Compute the (N+1)x(M+1) matrix of Levenshtein distances between all prefixes of a pair of sequences,
   where N and M are the lengths of the first and second sequence, respectively.

   Implements the Wagner-Fischer algorithm.


.. py:function:: traverse_wf_matrix(wf_matrix: numpy.ndarray[polymerist.genutils.typetools.numpytypes.Shape[polymerist.genutils.typetools.numpytypes.N, polymerist.genutils.typetools.numpytypes.M], int], begin_idxs: tuple[int, int] = (0, 0), end_idxs: tuple[int, int] = (-1, -1)) -> Generator[list[EditInfo], None, None]

   Takes a Wagner-Fischer Levenshtein distance matrix and returns the indices of the minimal path through the matrix from the origin (i.e. empty sequences) to the


.. py:function:: describe_edits(seq1: Sequence[T], seq2: Sequence[T], int_type: Type = int, indicator: str = ' -> ', delimiter: str = '\n') -> Generator[str, None, None]

   Describes, step by step, the insertion, deletion, or substitution operations needed to transform one sequence into another


.. py:function:: levenshtein_distance(seq1: Sequence[T], seq2: Sequence[T], int_type: Type = int) -> int

   Compute the Levenshtein (edit) distance between a pair of sequences with elements of compatible type.

   The distance is the minimal number of insertion, deletion, or substitution operations needed to transform either sequence into the other.
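
For orientation, the Wagner-Fischer recurrence named above can be sketched in a few lines of plain Python. This is a minimal, standalone illustration and not this module's implementation: the names ``wf_matrix_sketch`` and ``levenshtein_sketch`` are hypothetical, nested lists are assumed in place of the ``numpy.ndarray`` that ``compute_wf_matrix`` returns, and the ``int_type`` parameter is not modelled.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def wf_matrix_sketch(seq1: Sequence[T], seq2: Sequence[T]) -> list[list[int]]:
    """Hypothetical helper: build the (N+1)x(M+1) Wagner-Fischer table as nested lists."""
    n, m = len(seq1), len(seq2)
    # table[i][j] holds the edit distance between the prefixes seq1[:i] and seq2[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][0] = i  # i deletions turn seq1[:i] into the empty sequence
    for j in range(m + 1):
        table[0][j] = j  # j insertions turn the empty sequence into seq2[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if seq1[i - 1] == seq2[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + cost,  # substitution, or a free match when cost is 0
            )
    return table


def levenshtein_sketch(seq1: Sequence[T], seq2: Sequence[T]) -> int:
    """Hypothetical helper: the distance is the bottom-right entry of the table."""
    return wf_matrix_sketch(seq1, seq2)[-1][-1]


print(levenshtein_sketch("kitten", "sitting"))  # 3
```

The bottom-right entry of the table compares the two full sequences, which is why the distance can be read off directly once the table is filled. Any pair of sequences whose elements support ``==`` works here, for example lists of integers as well as strings.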