stcrpy.tcr_formats package

Submodules

stcrpy.tcr_formats.tcr_formats module

stcrpy.tcr_formats.tcr_formats.get_sequences(entity: Bio.PDB.Entity, amino_acids_only: bool = True, residues_to_include: list = None) → dict[source]

Extract sequences from structure objects as a dictionary.

Parameters:

entity (Bio.PDB.Entity) – Structure object
amino_acids_only (bool, optional) – Whether to remove non-amino acid ‘X’ from sequences. Defaults to True.
residues_to_include (list, optional) – List of residue IDs to include in sequence. Defaults to None.

Raises:

AttributeError – If entity has no attribute .get_chains(). The function then assumes entity is chain level and returns a single sequence.

Returns:

Dictionary of amino acid sequences, keyed by chain ID in the structure entity.

Return type:

dict
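The filtering described above can be illustrated without Biopython. The sketch below is not the stcrpy implementation: `sequences_by_chain`, the `THREE_TO_ONE` table, and the plain-dict input are hypothetical stand-ins, and it assumes that any residue outside the 20 standard amino acids is represented as 'X'.

```python
# Hypothetical stand-in for get_sequences; plain dicts replace Bio.PDB entities.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequences_by_chain(chains, amino_acids_only=True, residues_to_include=None):
    """Build one sequence per chain.

    chains maps a chain ID to a list of (residue_id, three_letter_name) pairs.
    """
    sequences = {}
    for chain_id, residues in chains.items():
        letters = []
        for residue_id, name in residues:
            # Skip residues that are not in the optional allow-list.
            if residues_to_include is not None and residue_id not in residues_to_include:
                continue
            # Assumption: anything that is not a standard amino acid becomes 'X'.
            letter = THREE_TO_ONE.get(name, "X")
            if amino_acids_only and letter == "X":
                continue
            letters.append(letter)
        sequences[chain_id] = "".join(letters)
    return sequences


example = {
    "D": [(1, "LYS"), (2, "GLU"), (3, "HOH"), (4, "VAL")],
    "E": [(1, "ASN"), (2, "ALA")],
}
print(sequences_by_chain(example))
print(sequences_by_chain(example, amino_acids_only=False))
print(sequences_by_chain(example, residues_to_include=[1, 2]))
```

With amino_acids_only left at True the water entry is dropped from chain D. Setting it to False keeps the entry as 'X'.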

stcrpy.tcr_formats.tcr_formats.to_AF3_json(tcr: TCR, tcr_only: bool = True, save: bool = True, save_dir: str = '', name: str = None, V_domain_only: bool = False) → dict[source]

Converts a TCR object to a dict in AlphaFold 3 JSON input format, i.e. amino acid sequences. E.g.:

{
    "name": Job name,
    "modelSeeds": [],
    "sequences": [
        {"proteinChain": {"sequence": AAAAAAAAAAAAAA, "count": 1}},
        {"proteinChain": {"sequence": AAAAAAAAAAAAAA, "count": 1}},
        {"proteinChain": {"sequence": AAAAAAAAAAAAAA, "count": 1}},
    ],
}

Parameters:

tcr (TCR) – TCR structure object
tcr_only (bool, optional) – Whether to include TCR sequence only, excluding antigen and MHC. Defaults to True.
save (bool, optional) – Whether to save dict as JSON file. Defaults to True.
save_dir (str, optional) – Directory to save JSON files to. Defaults to “”.
name (str, optional) – TCR ID to use as name for AF3 job. Defaults to None.
V_domain_only (bool, optional) – Include full TCR sequence or only the variable domain (1-128 IMGT numbering). Defaults to False.

Returns:

Nested dictionary of AF3 sequence inputs.

Return type:

dict
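As a minimal sketch of the dictionary layout shown in the example, the snippet below assembles the same structure with the standard library only. `build_af3_job`, the job name, and the two sequences are hypothetical placeholders; this does not call to_AF3_json.

```python
import json


def build_af3_job(name, sequences):
    """Assemble a dict in the layout shown in the to_AF3_json example."""
    return {
        "name": name,
        "modelSeeds": [],
        "sequences": [
            {"proteinChain": {"sequence": sequence, "count": 1}}
            for sequence in sequences
        ],
    }


# Placeholder sequences, not taken from a real TCR.
job = build_af3_job("example_tcr", ["KEVEQNSG", "NAGVTQTP"])
print(json.dumps(job, indent=2))
```

Serialising with json.dumps gives the text that would be written to disk when save is True, assuming the file holds exactly this dictionary.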

stcrpy.tcr_formats.tcr_haddock module

class stcrpy.tcr_formats.tcr_haddock.HADDOCKFormatter(save_dir: str = None)[source]

Bases: object

pMHC_to_haddock(mhc: MHC, antigen: list[Antigen])[source]

Bound reformatting of MHC and antigen structure objects to a HADDOCK-compatible PDB file.

Parameters:

mhc (MHC) – MHC structure object
antigen (list[Antigen]) – List containing the antigen structure object

tcr_to_haddock(tcr: TCR)[source]

Bound reformatting of TCR structure object to HADDOCK compatible PDB file.

Parameters:

tcr (TCR) – TCR structure object

write_TCR_pdb_file(tcr: TCR, save_dir: str)[source]

Writes TCR structure to a PDB file in a format HADDOCK can deal with. Generates a PDB file, a mapping from the old to the new numbering, and a list of active residues to restrain the HADDOCK simulation.

Parameters:

tcr (TCR) – The TCR structure.
save_dir (str) – The directory to save the files (default is current directory).
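The renumbering and active-residue outputs described above might be pictured roughly as follows. This is a sketch under stated assumptions, not the library's method: `renumber_for_docking` and `in_imgt_cdr` are hypothetical helpers, consecutive integer numbering is assumed, and the active residues are assumed to be those in the standard IMGT CDR ranges (27-38, 56-65, 105-117).

```python
def renumber_for_docking(residue_ids, start=1):
    """Map (chain, number, insertion_code) IDs to consecutive integers.

    Assumption: the docking input needs plain integer residue numbers
    without insertion codes.
    """
    return {
        residue_id: new_number
        for new_number, residue_id in enumerate(residue_ids, start=start)
    }


def in_imgt_cdr(number):
    """True if an IMGT position lies in CDR1, CDR2 or CDR3."""
    return 27 <= number <= 38 or 56 <= number <= 65 or 105 <= number <= 117


# Residue IDs in chain order, including IMGT insertion codes around 111/112.
residue_ids = [
    ("D", 104, " "), ("D", 105, " "), ("D", 111, " "), ("D", 111, "A"),
    ("D", 112, "A"), ("D", 112, " "), ("D", 118, " "),
]
mapping = renumber_for_docking(residue_ids)

# Assumption: CDR residues are the ones flagged as active for docking.
active = [
    new_number
    for (chain, number, icode), new_number in mapping.items()
    if in_imgt_cdr(number)
]
print(mapping)
print(active)
```

Keeping the mapping alongside the PDB file is what makes it possible to restore the original numbering after docking.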

write_antigen_pdb_file(mhc: MHC, antigen: list[Antigen], save_dir: str)[source]

Writes the antigen PDB file for docking with HADDOCK. Generates a PDB file, a file containing the renumbering mapping, and a list of active residues to restrict the simulation.

Parameters:

mhc (MHC) – MHC structure object.
antigen (list[Antigen]) – List containing antigen chain. Should be length 1.
save_dir (str, optional) – The directory to save the PDB file. Defaults to “.”.

Returns:

The filename of the saved antigen PDB file.

Return type:

str

class stcrpy.tcr_formats.tcr_haddock.HADDOCKResultsParser(haddock_results_dir: str, tcr_renumbering_file: str = None, pmhc_renumbering_file: str = None)[source]

Bases: object

get_haddock_scores() → pandas.DataFrame[source]

Retrieve HADDOCK energy scores and RMSD evaluations from simulation output.

Columns:

"haddock_score"
"interface_rmsd"
"ligand_rmsd"
"frac_common_contacts"
"E_vdw"
"E_elec"
"E_air"
"E_desolv"
"ligand_rmsd_2"
"cluster_id"

Raises:

FileNotFoundError – If the HADDOCK file containing scores is not found.

Returns:

DataFrame with HADDOCK simulation metrics.

Return type:

pandas.DataFrame

renumber_all_haddock_predictions()[source]

Renumber all HADDOCK predictions contained in the results folder. Requires the standard HADDOCK output directory format.

renumber_haddock_prediction(docked_prediction_file: str, haddock_renumbering_file: str, antigen_renumbering_file: str = None) → Model[source]

Renumber the HADDOCK prediction based on the renumbering files.

Parameters:

docked_prediction_file (str) – Path to the docked prediction file.
haddock_renumbering_file (str) – Path to the HADDOCK renumbering file.
antigen_renumbering_file (str, optional) – Path to the antigen renumbering file. Needed for TCR only PDBs with no antigen. Defaults to None.

Returns:

The renumbered HADDOCK prediction.

Return type:

Bio.PDB.Model.Model

Raises:

ValueError – If the renumbering index is not found in the renumbering file.
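The lookup-and-fail behaviour documented above can be sketched with a plain dictionary. `restore_numbering`, the key layout, and the chain labels are hypothetical. The real method works on structure files and returns a Bio.PDB model, which this sketch does not attempt.

```python
def restore_numbering(docked_ids, renumbering):
    """Translate docked residue keys back to their original IDs.

    renumbering maps (chain, docking_number) -> (original_chain, original_id).
    Raises ValueError when a docked residue has no entry in the mapping.
    """
    restored = []
    for key in docked_ids:
        if key not in renumbering:
            raise ValueError(f"Residue {key} not found in renumbering mapping")
        restored.append(renumbering[key])
    return restored


# Hypothetical mapping with one entry: docking number 203 was residue 3 of chain O.
renumbering = {("A", 203): ("O", (" ", 3, " "))}
print(restore_numbering([("A", 203)], renumbering))
```

Failing loudly on a missing index avoids silently writing a model whose residues carry a mix of original and docking numbers.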

stcrpy.tcr_formats.tcr_haddock.imgt_insertion_char_to_int(char: str) → int[source]

Converts an IMGT insertion character to an integer.

Parameters:

char (str) – The IMGT insertion character.

Returns:

The corresponding integer value.

Return type:

int
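The source does not state which integer each character maps to. One plausible convention, shown here purely as an assumption, is alphabetical position with a blank insertion code mapped to 0. `insertion_char_to_int` is a hypothetical name, not the library function.

```python
def insertion_char_to_int(char):
    """Map an insertion code to an integer: '' or ' ' -> 0, 'A' -> 1, 'B' -> 2, ...

    Assumption: alphabetical position; the library's actual mapping may differ.
    """
    char = char.strip()
    if not char:
        return 0
    return ord(char.upper()) - ord("A") + 1


for code in [" ", "A", "B", "Z"]:
    print(repr(code), insertion_char_to_int(code))
```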

stcrpy.tcr_formats.tcr_haddock.parse_renumbered_line(line: str) → tuple[source]

Parses a renumbered line from a file and extracts the chain ID, original numbering, and HADDOCK numbering.

Parameters:

line (str) – The renumbered line to parse.

Returns:

A tuple containing the chain ID, original numbering, and HADDOCK numbering.

Return type:

tuple

Example

line = "(O,( ,3, ),( ,203, )"
result = parse_renumbered_line(line)
# Output: ('O', ('', '3', ''), ('', '203', ''))
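A line in this format can be taken apart with a regular expression. The sketch below is an independent illustration, not the library's parser: `parse_line` and `LINE_PATTERN` are hypothetical, and the pattern assumes every line has exactly the chain ID followed by two three-field groups.

```python
import re

# Chain ID, then two parenthesised groups of three comma-separated fields.
LINE_PATTERN = re.compile(
    r"\((?P<chain>[^,]+),"
    r"\((?P<o_het>[^,]*),(?P<o_num>[^,]*),(?P<o_ins>[^,)]*)\),"
    r"\((?P<h_het>[^,]*),(?P<h_num>[^,]*),(?P<h_ins>[^,)]*)\)"
)


def parse_line(line):
    """Return (chain_id, original_numbering, docking_numbering) as stripped strings."""
    match = LINE_PATTERN.search(line)
    if match is None:
        raise ValueError(f"Unrecognised renumbering line: {line!r}")
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    return (
        fields["chain"],
        (fields["o_het"], fields["o_num"], fields["o_ins"]),
        (fields["h_het"], fields["h_num"], fields["h_ins"]),
    )


print(parse_line("(O,( ,3, ),( ,203, )"))
```

The fields are returned as strings, so a caller that needs residue numbers as integers would convert the middle element of each group.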

stcrpy.tcr_formats.tcr_haddock.sort_residues_by_imgt_numbering(residues: list[Bio.PDB.Residue]) → list[Bio.PDB.Residue][source]

Sort residues in order by IMGT numbering.

Parameters:

residues (list[Bio.PDB.Residue]) – List of IMGT numbered residues.

Returns:

Sorted list of IMGT numbered residues.

Return type:

list[Bio.PDB.Residue]
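Sorting by IMGT numbering is not a plain sort on (number, insertion code): in the IMGT scheme, insertions at position 112 count down towards 112 (112B, 112A, 112), while insertions at 111 count up. The sketch below shows one way to encode that in a sort key. `imgt_sort_key` is hypothetical and operates on Biopython-style ID tuples rather than Residue objects. Whether the library function handles position 112 this way is not stated in the source.

```python
def imgt_sort_key(residue_id):
    """Sort key for (hetero_flag, number, insertion_code) tuples.

    Insertion codes rank blank=0, A=1, B=2, ... and the rank is negated at
    position 112, where IMGT insertions are ordered in reverse.
    """
    _, number, icode = residue_id
    icode = icode.strip()
    rank = 0 if not icode else ord(icode.upper()) - ord("A") + 1
    if number == 112:
        return (number, -rank)
    return (number, rank)


residue_ids = [
    (" ", 112, " "), (" ", 111, "A"), (" ", 112, "B"), (" ", 111, " "),
    (" ", 112, "A"), (" ", 110, " "), (" ", 111, "B"),
]
ordered = sorted(residue_ids, key=imgt_sort_key)
print([f"{number}{icode.strip()}" for _, number, icode in ordered])
```

With Bio.PDB residues the same key could be applied to each residue's ID tuple.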
