:mod:`biotransformers.utils.msa_utils`
======================================

.. py:module:: biotransformers.utils.msa_utils


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   biotransformers.utils.msa_utils.get_translation
   biotransformers.utils.msa_utils.read_sequence
   biotransformers.utils.msa_utils.remove_insertions
   biotransformers.utils.msa_utils.read_msa
   biotransformers.utils.msa_utils.get_msa_list
   biotransformers.utils.msa_utils.get_msa_lengths
   biotransformers.utils.msa_utils.msa_to_remove



.. function:: get_translation() -> Dict[int, Any]

   get translation dict to convert unused character in MSA


.. function:: read_sequence(filename: str) -> Tuple[str, str]

   Reads the first (reference) sequences from a fasta or MSA file.


.. function:: remove_insertions(sequence: str) -> str

   Removes any insertions into the sequence.
   Needed to load aligned sequences in an MSA.


.. function:: read_msa(filename: str, nseq: int) -> List[Tuple[str, str]]

   Reads the first nseq sequences from an MSA file,
   automatically removes insertions.


.. function:: get_msa_list(path_msa: Optional[str]) -> List[str]

   Get all files of the msa folder and check file format

   :param path_msa: path of the folder with a3m file
   :type path_msa: Optional[str]


.. function:: get_msa_lengths(list_msa: List[List[Tuple[str, str]]], nseq: int) -> List[int]

   Get length of an MSA list

   All MSA must have at least nseq in msa

   :param list_msa: list of MSA. MSA is a list of tuple
   :type list_msa: List[List[Tuple[str,str]]]
   :param nseq:

   :returns: [description]
   :rtype: List[int]


.. function:: msa_to_remove(path_msa: str, n_seq) -> List[str]

   Get list of msa with less than nseq sequence

   :param path_msa: [description]
   :type path_msa: str

   :returns: List of msa filepath that don't have enough enough sequences.


