polymerist.polymers.building.sequencing

For generating and manipulating sequences of symbols which correspond to monomer ordering in blocky and random copolymers

Attributes

LOGGER

Classes

LinearCopolymerSequencer

For encapsulating information about the sequence of repeat units in a periodic, linear copolymer

Module Contents

polymerist.polymers.building.sequencing.LOGGER
class polymerist.polymers.building.sequencing.LinearCopolymerSequencer

For encapsulating information about the sequence of repeat units in a periodic, linear copolymer. Also covers, as trivial special cases, homopolymers and alternating copolymers.

Parameters:
  • sequence_kernel (str) – A sequence indicating a periodic ordering of monomers in a linear polymer block (e.g. “A”, “ABAC”, etc.). Each unique symbol in the sequence corresponds to a distinct monomer in the block

  • n_repeat_units (int) – The desired total number of monomers (including terminal monomers) in a polymer chain

  • n_repeat_units_terminal (int) – The number of terminal monomers (“end groups”) which are to be included in the chain in addition to the middle monomers described by sequence_kernel. Defaults to 0

Raises:
  • EmptyBlockSequence – The sequence provided is empty (can’t be used to define a nonzero-length chain)

  • EndGroupDominatedChain – The number of terminal monomers exceeds the total number of monomers

sequence_kernel: str
n_repeat_units: int
n_repeat_units_terminal: int = 0
copy() → LinearCopolymerSequencer

Returns another equivalent instance of the current sequence info more efficiently than a complete deepcopy

reduce() → None

Determines whether there is a shorter repeating subsequence making up the current sequence kernel. If there is, adjusts the sequence kernel to that minimal sequence; does nothing otherwise

Reduction is idempotent, and guarantees that the smallest possible kernel is used when sequencing
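The reduction described above can be illustrated with a minimal, self-contained sketch. The helper name minimal_kernel is hypothetical and is not part of polymerist; it only assumes that a kernel is reducible when it equals a shorter prefix repeated a whole number of times, and it may differ from how reduce() is actually implemented.

```python
def minimal_kernel(sequence: str) -> str:
    """Return the shortest prefix whose whole-number repetition reproduces `sequence`.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    n = len(sequence)
    for size in range(1, n + 1):
        # A candidate prefix must divide the sequence length evenly
        if n % size == 0 and sequence[:size] * (n // size) == sequence:
            return sequence[:size]
    return sequence  # only reached for the empty string


print(minimal_kernel("ABAB"))  # AB
print(minimal_kernel("ABAC"))  # ABAC (already minimal)
```

Applying the helper to its own output returns the same string, which mirrors the idempotence noted above.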

reduced() → LinearCopolymerSequencer

Return a sequence-reduced version of the current sequence info

property n_repeat_units_middle: int

Number of middle (i.e. non-terminal) repeat units

property block_size: int

Number of repeat units in one whole iteration of the kernel block

period
property n_full_periods: int

Largest number of complete repetitions of the sequence kernel which, when taken together, contain no more repeat units than the specified number of middle units

property n_residual_repeat_units: int

Difference between number of middle repeat units and units which would occur in maximal full periods of the kernel

By construction, this is strictly less than the block size, and is zero exactly when the middle units are made up of a whole number of kernel repeats
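As a worked illustration of how these counts relate, the sketch below reproduces the arithmetic with plain integer division. The function split_middle_units is a hypothetical stand-in, not a polymerist API; it assumes the middle count is the total minus the terminal count, as the property descriptions suggest.

```python
def split_middle_units(
    sequence_kernel: str,
    n_repeat_units: int,
    n_repeat_units_terminal: int = 0,
) -> tuple[int, int, int]:
    """Return (n_middle, n_full_periods, n_residual) for the given chain spec.

    Hypothetical helper for illustration only; assumes a non-empty kernel.
    """
    block_size = len(sequence_kernel)
    n_middle = n_repeat_units - n_repeat_units_terminal
    # Quotient = complete kernel repeats, remainder = leftover middle units
    n_full_periods, n_residual = divmod(n_middle, block_size)
    return n_middle, n_full_periods, n_residual


# 12 units total, 2 of them end groups, kernel "ABAC" (block size 4):
# 10 middle units = 2 full periods (8 units) + 2 residual units
print(split_middle_units("ABAC", 12, 2))  # (10, 2, 2)
```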

property has_residual: bool

Whether a partial kernel repeat is needed, i.e. whether the target number of middle repeat units cannot be attained by a whole number of kernel repeats alone

property sequence_residual: str

Partial repeat of the kernel sequence needed to attain the specified number of middle units

residual
procrustean_alignment(allow_partial_sequences: bool = False) → tuple[str, int]

PROCRUSTEAN: Periodic Recurrence Of Cyclic Repeat Unit Sequences, Truncated to an Exact and Arbitrary Number

Stretches or truncates the sequence kernel to achieve a target sequence length, cycling through the kernel’s period as many times as needed

Given the sequence kernel “S”, the algorithm produces a sequence string “P” and number of repeats “r” which, taken together, satisfy the following:

  • The number of units in r repeats of P plus the number of terminal monomers is precisely equal to the target number of monomers

  • The units in P cycle through the units in S, in the order they appear in S

  • The number of times S is cycled through in P is always a rational multiple of the length of S

If no satisfiable sequence-count pair can be found, raises an appropriate informative exception
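One way to satisfy the conditions above is sketched below. This is an assumption-laden illustration, not the library's implementation: the function name procrustean_alignment_sketch is hypothetical, a plain ValueError stands in for the library's own exceptions, and returning the bare kernel when no partial repeat is needed is a guess at the intended behavior.

```python
def procrustean_alignment_sketch(
    sequence_kernel: str,
    n_repeat_units: int,
    n_repeat_units_terminal: int = 0,
    allow_partial_sequences: bool = False,
) -> tuple[str, int]:
    """Return (P, r) such that len(P) * r + n_repeat_units_terminal == n_repeat_units.

    Hypothetical sketch; ValueError stands in for the library's exceptions.
    """
    if not sequence_kernel:
        raise ValueError("sequence kernel is empty")
    n_middle = n_repeat_units - n_repeat_units_terminal
    if n_middle < 0:
        raise ValueError("more terminal units than total units")

    n_full, n_residual = divmod(n_middle, len(sequence_kernel))
    if n_residual == 0:
        if n_full < 1:
            raise ValueError("target length leaves no room for middle units")
        # Whole number of periods: repeat the kernel itself
        return sequence_kernel, n_full
    if not allow_partial_sequences:
        raise ValueError("target length requires a partial kernel repeat")
    # Otherwise, one long sequence: all full periods plus a truncated final period
    return n_full * sequence_kernel + sequence_kernel[:n_residual], 1


print(procrustean_alignment_sketch("ABAC", 10, 2))        # ('ABAC', 2)
print(procrustean_alignment_sketch("ABAC", 12, 2, True))  # ('ABACABACAB', 1)
```

In both calls the length of P times r, plus the two terminal units, equals the requested total, which is the first condition in the list.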

describe_order(end_group_names: Iterable[str] | None = None, default_end_group_name: str = 'END-GROUP') → str

Descriptive string presenting a condensed view of the order of repeat units in the final sequence

describe_tally() → str

Descriptive string indicating how all parts of the overall sequence contribute to the target number of repeat units