stcrpy.tcr_processing package

Subpackages

Submodules

stcrpy.tcr_processing.AGchain module

Created on 10 May 2017 @author: leem

Based on the AGchain class from ABDB.

class stcrpy.tcr_processing.AGchain.AGchain(identifier)[source]

Bases: Chain

Non-TCR and non-MHC (peptide) chains are described using this class.

get_type()[source]
is_engineered()[source]
set_engineered(engineered)[source]
set_type()[source]

Use the type check to check the residue name from the chemical component dictionary. For ease of use I have binned these into four types: peptide, nucleic-acid, saccharide (carbohydrate) and non-polymer.

stcrpy.tcr_processing.Chemical_components module

Created on 12 May 2017

@author: leem Based on the ABDB.AbPDB.Chemical_components module by dunbar.

Analyse the chemical component dictionary http://www.wwpdb.org/ccd.html

These are the types of chemical components in the PDB: ( grep "_chem_comp.type" components.cif | cut -c 50-120 | sort | uniq )

We will bin them into:

peptide:

“D-beta-peptide, C-gamma linking” “D-gamma-peptide, C-delta linking” “D-peptide linking” “D-PEPTIDE LINKING” “D-peptide NH3 amino terminus” “D-PEPTIDE NH3 AMINO TERMINUS” “L-beta-peptide, C-gamma linking” “L-gamma-peptide, C-delta linking” “L-peptide COOH carboxy terminus” “L-PEPTIDE COOH CARBOXY TERMINUS” “L-peptide linking” “L-PEPTIDE LINKING” “L-peptide NH3 amino terminus” peptide-like PEPTIDE-LIKE “peptide linking” “PEPTIDE LINKING”

nucleic-acid:

“DNA linking” “DNA LINKING” “DNA OH 3 prime terminus” “DNA OH 3 PRIME TERMINUS” “L-DNA LINKING” “L-RNA LINKING” “RNA linking” “RNA LINKING” “RNA OH 3 prime terminus”

saccharide:

D-saccharide D-SACCHARIDE “D-saccharide 1,4 and 1,4 linking” “D-SACCHARIDE 1,4 AND 1,4 LINKING” L-saccharide L-SACCHARIDE “L-SACCHARIDE 1,4 AND 1,4 LINKING” saccharide SACCHARIDE

non-polymer:

non-polymer NON-POLYMER

This has been done in resname_to_type
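The binning described above can be sketched as a case-insensitive keyword lookup. This is only an illustration: `bin_component_type` and its keyword rules are hypothetical names and choices made here, and the sketch is not the module's `resname_to_type` implementation.

```python
def bin_component_type(chem_comp_type):
    """Map a raw _chem_comp.type string to one of four bins (hypothetical helper)."""
    # Normalise quoting and case so "L-PEPTIDE LINKING" and "L-peptide linking" agree.
    t = chem_comp_type.strip().strip('"').lower()
    if "peptide" in t:
        return "peptide"
    if "dna" in t or "rna" in t:
        return "nucleic-acid"
    if "saccharide" in t:
        return "saccharide"
    if t == "non-polymer":
        return "non-polymer"
    # Assumption: anything outside the listed types is reported rather than guessed.
    raise ValueError(f"unrecognised chemical component type: {chem_comp_type!r}")


print(bin_component_type("L-PEPTIDE LINKING"))
print(bin_component_type('"D-saccharide 1,4 and 1,4 linking"'))
```

Normalising case first means the upper-case and lower-case variants listed above do not need separate entries.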

Common buffers/molecules in the PDB which are unlikely to be hapten antigens

Method 1:

This list was taken from the supplementary material (Table 2) of: Visualizing ligand molecules in twilight electron density. C. X. Weichenberger, E. Pozharski and B. Rupp. Acta Cryst. (2013). F69, 195-200.

Acknowledgement - Anthony’s JC 03/04/13

Method 2:

The list of chemical component code to pdb code was taken from: http://ligand-expo.rcsb.org/ld-download.html Saved as: ./Antibody/AbPDB/dat/Resources/cc-to-pdb.tdd.txt

We look at the number of structures with each code. Distribution of number of pdb codes with the chemical component found at: ./Antibody/AbPDB/dat/Resources/Frequency_of_cc_in_pdb.pdf

We use a cut-off of 15 (and manually examine those codes which are over the cut-off but under 50 and occur mainly in known antibody structures, just in case one is a pet hapten antigen).

This is a harsh cut-off! It is fine for analysing antibodies (as you are unlikely to have more than 15 bound to the same antigen). However, change the cut-off for other purposes (at least 200 is suggested for "common").
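The frequency cut-off of Method 2 can be sketched as a count of distinct PDB entries per chemical component code. The function name, the pair-based input and the toy data are assumptions for illustration; the actual layout of the Ligand Expo file is not reproduced here.

```python
from collections import defaultdict


def common_components(cc_to_pdb_pairs, cutoff=15):
    """Return the component codes found in more than `cutoff` distinct PDB entries."""
    entries = defaultdict(set)
    for cc_code, pdb_code in cc_to_pdb_pairs:
        entries[cc_code].add(pdb_code)
    return {cc for cc, pdbs in entries.items() if len(pdbs) > cutoff}


# Toy data (made up): GOL appears in 20 entries, XYZ in only 3.
pairs = [("GOL", f"pdb{i}") for i in range(20)] + [("XYZ", f"pdb{i}") for i in range(3)]
print(common_components(pairs))              # {'GOL'}
print(common_components(pairs, cutoff=200))  # set()
```

Raising `cutoff` to 200, as suggested above for non-antibody use, leaves only very frequently observed codes flagged as common.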

There is still a problem if there is a rarely used code or a newly introduced code for buffer.

Fix 120613: the chemical component dictionary API queries the online database if it cannot find a code locally. This requires web access.

Acknowledgement - JP for suggesting the method.

The following have been removed from the list as they are either common sugars or peptides:

BGC saccharide, NAG saccharide, XYP saccharide, XYS saccharide, MAL saccharide, MAN saccharide, GLA saccharide, GLC saccharide, A2G saccharide, LMT saccharide, PE1 peptide, F6P saccharide, DPN peptide, GAL saccharide, BOG saccharide, NGA saccharide, FUC saccharide, BMA saccharide, SUC saccharide, FUL saccharide, NDG saccharide

I have updated method 1 list with method 2 list

It is left as a dictionary of dictionaries if we decide to add annotations

Functions provided are:

is_aa, is_common_buffer, is_carbohydrate, is_nucleic_acid, is_polymer, get_type, get_chemical_name

Each takes either a three-letter code or a residue object as its argument.

stcrpy.tcr_processing.Chemical_components.get_chemical_name(residue)[source]
stcrpy.tcr_processing.Chemical_components.get_from_expo(residue)[source]

The PDB has a habit of updating … therefore, if we don't have the three-letter code, try to get it from the Ligand Expo database online.

stcrpy.tcr_processing.Chemical_components.get_name_type(residue)[source]
stcrpy.tcr_processing.Chemical_components.get_res_type(residue)[source]
stcrpy.tcr_processing.Chemical_components.is_aa(residue, standard=False)[source]
stcrpy.tcr_processing.Chemical_components.is_carbohydrate(residue)[source]
stcrpy.tcr_processing.Chemical_components.is_common_buffer(residue)[source]

Is the residue a common buffer? If it occurs in the common_buffers list it is considered a common buffer.

Parameters:

residue – An AbPDB residue object or a residue identifier, e.g. PO4

Returns:

Flag if the residue is a common buffer.

stcrpy.tcr_processing.Chemical_components.is_complete(residue, quiet=True)[source]

Check whether a residue has all the heavy atoms we would expect for the residue type. Works on the residue object in Biopython's PDB module. Returns a flag (True or False) and a list of the missing atom names.
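A completeness check of this kind can be sketched as a set difference between expected and observed atom names. The table below is a hypothetical three-residue subset, and `missing_heavy_atoms` is an illustrative name that takes plain strings rather than a residue object; it is not the module's `is_complete`.

```python
# Hypothetical subset of expected heavy atoms; a real table would cover every residue type.
EXPECTED_HEAVY_ATOMS = {
    "GLY": {"N", "CA", "C", "O"},
    "ALA": {"N", "CA", "C", "O", "CB"},
    "SER": {"N", "CA", "C", "O", "CB", "OG"},
}


def missing_heavy_atoms(resname, atom_names):
    """Return (is_complete, sorted list of expected heavy atoms that are absent)."""
    expected = EXPECTED_HEAVY_ATOMS[resname]
    missing = sorted(expected - set(atom_names))
    return (not missing, missing)


print(missing_heavy_atoms("SER", ["N", "CA", "C", "O", "CB"]))  # (False, ['OG'])
print(missing_heavy_atoms("GLY", ["N", "CA", "C", "O"]))        # (True, [])
```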

stcrpy.tcr_processing.Chemical_components.is_nucleic_acid(residue)[source]
stcrpy.tcr_processing.Chemical_components.is_polymer(residue)[source]

stcrpy.tcr_processing.Entity module

Created on 9 May 2017 @author: leem

A modified Entity class based on SAbDab’s ABDB.AbPDB and Bio.PDB’s entity

class stcrpy.tcr_processing.Entity.Entity(id)[source]

Bases: Entity

A modified entity object allows for direct writing of coordinates.

copy()[source]

Copy has been played with a bit. For my purposes the version in 1.61 did not work as explicit copying of the child list meant that the child objects became referenced to both self and shallow. This may be due to overriding the residue and chain classes so may not be a bug in biopython.

When copying the child_list in the loop, I use the list to iterate over instead of the dictionary. This preserves the ordering of the children.

save(output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>, renumber=True, selection=False, remarks=True)[source]

Save the coordinates of the entity.

Example

>>> entity.save("path/to/file/filename.pdb")
>>> residue.save("residue1.pdb")

Parameters:
  • output – Where to write coordinates to. Should be an open file, a string or sys.stdout. By default the output is written to stdout.

  • renumber – Flag whether to renumber the atoms to the IMGT scheme. Default is to renumber the atoms so that the first is 1 etc. Use renumber = False to retain the original atom numbering from the PDB file.

  • selection – Provide a selector object to select which children of the entity should be outputted. Selection should be a selector object from TcrPDB.Select. Some basic selector classes are provided in the module. More complex classes can be created by inheriting from these. If selection = False (default) all atoms in the entity are output

  • remarks – Flag to print out remarks generated by TcrPDB. Default is True.

transform(rot, tran)[source]

Apply rotation and translation to the atomic coordinates.

Example

>>> rotation=rotmat(pi, Vector(1,0,0))
>>> translation=array((0,0,1), 'f')
>>> entity.transform(rotation, translation)
Parameters:
  • rot – A right multiplying rotation matrix (3x3 Numeric array)

  • tran – the translation vector (size 3 Numeric array)
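The arithmetic behind a right-multiplying rotation followed by a translation can be sketched in plain Python. Each coordinate row x becomes x · rot + tran, which is the convention the parameter description implies. Nested lists stand in for arrays here, and `transform_coords` is a hypothetical helper, not the method itself.

```python
def transform_coords(coords, rot, tran):
    """Right-multiply each coordinate row by rot, then add tran."""
    moved = []
    for x in coords:
        row = [sum(x[k] * rot[k][j] for k in range(3)) + tran[j] for j in range(3)]
        moved.append(row)
    return moved


# A 180-degree rotation about the x axis, then a shift of 1 along z.
rot = [[1.0, 0.0, 0.0],
       [0.0, -1.0, 0.0],
       [0.0, 0.0, -1.0]]
tran = [0.0, 0.0, 1.0]
print(transform_coords([[1.0, 2.0, 3.0]], rot, tran))  # [[1.0, -2.0, -2.0]]
```

With a right-multiplying convention the matrix is the transpose of the one used for column vectors. For this 180-degree example the two coincide.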

stcrpy.tcr_processing.Fragment module

Created on 9 May 2017 @author: leem Modified version of the ABDB.AbPDB.Fragment class

class stcrpy.tcr_processing.Fragment.Fragment(id)[source]

Bases: Entity

A modified Entity class that can be thought of as a way of grouping children, e.g.
TCR (TCR object) -> TCRchain (TCRchain object) -> Fragment CDRB3 (Fragment object) -> Residue B110 (Residue object)

Does not modify the parent/child attributes of its children. For instance, one might define a fragment and add residues to it in order to visualise them.

add(entity)[source]

Add a child to the Entity.

get_atoms()[source]
get_residues()[source]
insert(pos, entity)[source]

Add a child to the Entity at a specified position.

stcrpy.tcr_processing.Holder module

Created on 9 May 2017 @author: leem

A generic holder class that can be used to contain individual chains, etc.

class stcrpy.tcr_processing.Holder.Holder(identifier)[source]

Bases: Entity

stcrpy.tcr_processing.MHC module

Created on 30 Apr 2016

@author: leem, based on work by dunbar

The MHC class. This is similar to the Fab class.

class stcrpy.tcr_processing.MHC.CD1(c1, c2)[source]

Bases: MHC

CD1 class. Holds paired CD1/B2M domains.

get_B2M()[source]
get_CD1()[source]
class stcrpy.tcr_processing.MHC.MH1(c1, c2)[source]

Bases: MHC

Type 1 MHC class. Holds paired MHC domains.

get_B2M()[source]
get_GA1()[source]
get_GA2()[source]
get_MH1()[source]
get_alpha()[source]
class stcrpy.tcr_processing.MHC.MH2(c1, c2)[source]

Bases: MHC

Type 2 MHC class. Holds paired MHC domains.

get_GA()[source]
get_GB()[source]
class stcrpy.tcr_processing.MHC.MHC(c1, c2)[source]

Bases: Entity

MHC class. Holds paired MHC domains.

get_MHC_type()[source]
get_TCR()[source]
get_allele_assignments()[source]
get_antigen()[source]

Return a list of bound antigens. If the antigen has more than one chain, those in contact with the MHC will be returned.

get_atoms()[source]
get_chains()[source]
get_residues()[source]
is_bound()[source]

Check whether there is an antigen bound to the MHC.

class stcrpy.tcr_processing.MHC.MR1(c1, c2)[source]

Bases: MHC

MR1 class. Holds paired MR1/B2M domains.

get_B2M()[source]
get_MR1()[source]
class stcrpy.tcr_processing.MHC.scCD1(c1)[source]

Bases: MHC

Type 1 MHC class. Holds single chain MHC domains of type CD1 for Class I MHC if the identified chain is the double alpha helix, i.e. CD1 without B2M.

get_B2M()[source]
get_CD1()[source]
get_GA1L()[source]
class stcrpy.tcr_processing.MHC.scMH1(c1)[source]

Bases: MHC

Type 1 MHC class. Holds single chain MHC domains for Class I MHC if the identified chain is the double alpha helix, i.e. MH1 without B2M, with an exception for GA1.

get_B2M()[source]
get_GA1()[source]
get_GA2()[source]
get_MH1()[source]
get_alpha()[source]
class stcrpy.tcr_processing.MHC.scMH2(c1)[source]

Bases: MHC

Single chain MHC class 2. Holds a single GA or GB chain. Usually this will only occur if ANARCI has not identified one of the two chains correctly.

get_GA()[source]
get_GB()[source]

stcrpy.tcr_processing.MHCchain module

Created on 30 Apr 2016 @author: leem

Based on the ABchain class from @dunbar

class stcrpy.tcr_processing.MHCchain.MHCchain(id)[source]

Bases: Chain, Entity

A class to hold an MHC chain

add_unnumbered(residue)[source]
analyse(chain_type)[source]
annotate_children()[source]
get_TCR()[source]
get_allele_assignments()[source]
get_antigen()[source]
get_chains()[source]
get_fragments()[source]
get_sequence(type=<class 'dict'>)[source]
get_unnumbered()[source]
is_bound()[source]

Check whether there is an antigen bound to the MHC chain.

set_chain_type(chain_type)[source]

Set the MHC’s chain type

set_engineered(engineered)[source]
set_sequence()[source]

stcrpy.tcr_processing.Model module

Created on 9 May 2017 @author: leem

Based on the ABDB.AbPDB.Model class.

class stcrpy.tcr_processing.Model.Model(identifier, serial_num=None)[source]

Bases: Model, Entity

Override to use our Entity

@change: __getitem__ changed so that single chains can be called as well as holder object from a model.

e.g. s[0]["B"] and s[0]["BA"] get the B chain and the BA TCR respectively.

stcrpy.tcr_processing.Select module

Select.py Created on 9 May 2017 @author: leem

These are selection classes for the save method of the TcrPDB entity. They are based on the ABDB.AbPDB.Select and Bio.PDB.PDBIO Selection classes.

class stcrpy.tcr_processing.Select.backbone[source]

Bases: select_all

Select only backbone (no side chain) atoms in the structure. Backbone is defined as the "C", "CA", "N", "CB" and "O" atom identifiers in amino acids (PDB notation).

accept_atom(atom)[source]

Overload this to reject atoms for output.

class stcrpy.tcr_processing.Select.cdr3[source]

Bases: variable_only

Select only CDR3.

accept_residue(residue)[source]

Overload this to reject residues for output.

class stcrpy.tcr_processing.Select.fv_only_backbone[source]

Bases: variable_only, backbone

Select the backbone atoms of the variable region. Example of combining selection classes.
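How selection classes combine through multiple inheritance can be sketched with stand-in classes. Everything below is illustrative: the class names, the namedtuple stand-ins for residues and atoms, and the 1 to 128 residue range used for the variable region are assumptions, not the module's actual code.

```python
from collections import namedtuple

# Hypothetical stand-ins for residue and atom objects.
Residue = namedtuple("Residue", "number")
Atom = namedtuple("Atom", "name")


class SelectAll:
    """Accept everything; subclasses override individual accept_* methods."""

    def accept_residue(self, residue):
        return 1

    def accept_atom(self, atom):
        return 1


class VariableOnly(SelectAll):
    def accept_residue(self, residue):
        # Assumed range: IMGT variable-domain positions 1 to 128.
        return 1 if 1 <= residue.number <= 128 else 0


class Backbone(SelectAll):
    def accept_atom(self, atom):
        return 1 if atom.name in ("C", "CA", "N", "CB", "O") else 0


class VariableBackbone(VariableOnly, Backbone):
    """Takes accept_residue from VariableOnly and accept_atom from Backbone."""


selector = VariableBackbone()
print(selector.accept_residue(Residue(105)), selector.accept_atom(Atom("CG")))  # 1 0
```

Because each parent overrides a different `accept_*` method, the combined class needs no body of its own; Python's method resolution order picks the residue filter from one parent and the atom filter from the other.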

class stcrpy.tcr_processing.Select.select_all[source]

Bases: object

Default selection (everything) during writing - can be used as base class to implement selective output. This selects which entities will be written out.

accept(ob)[source]
accept_atom(atom)[source]

Overload this to reject atoms for output.

accept_chain(chain)[source]

Overload this to reject chains for output.

accept_fragment(fragment)[source]

Overload this to reject fragments for output.

accept_holder(model)[source]

Overload this to reject holders for output. (TCRs, TCRchains-holder, MHCchains-holder, AGchains-holder)

accept_model(model)[source]

Overload this to reject models for output.

accept_residue(residue)[source]

Overload this to reject residues for output.

class stcrpy.tcr_processing.Select.variable_only[source]

Bases: select_all

Select the variable region(s) of the structure.

accept_holder(holder)[source]

Overload this to reject holders for output.

accept_residue(residue)[source]

Overload this to reject residues for output.

stcrpy.tcr_processing.TCR module

Created on 3rd April 2024 @author: Nele Quast, based on work by Dunbar and Leem

The TCR class.

class stcrpy.tcr_processing.TCR.TCR(id)[source]

Bases: Entity

TCR class. Inherits from PDB.Entity. This is a base class for TCR structures, enabling antigen and MHC association. abTCR and gdTCR are the instantiated subclasses of this class.

calculate_docking_geometry(mode='rudolph', as_df=False)[source]

Calculate docking geometry of TCR to MHC. This is a wrapper function for the TCRGeom class.

Parameters:
  • mode (str, optional) – Mode for calculating the geometry. Options “rudolph”, “cys”, “com”. Defaults to “rudolph”.

  • as_df (bool, optional) – Whether to return as dictionary or dataframe. Defaults to False.

Returns:

TCR to MHC geometry.

Return type:

[dict, DataFrame]

get_CDRs()[source]

Obtain complementarity determining regions (CDRs) from a TCR structure object as generator.

Yields:

Fragment – TCR CDR regions

get_MHC()[source]

Return a list of TCR associated MHCs.

get_MHC_allele_assignments()[source]

Retrieve MHC allele assignments for all TCR associated MHCs. This is a list of dictionaries with the MHC ID as key and the allele assignments as value.

Returns:

dict with MHC chain ID as key and allele assignments as value

Return type:

dict

get_TCR_type()[source]

Get TCR type according to variable region assignments.

Returns:

TCR type (abTCR, gdTCR, dbTCR)

Return type:

str

get_antigen()[source]

Return a list of TCR associated antigens.

get_atoms()[source]

Returns generator of TCR atoms.

Yields:

Atom – TCR atoms

get_chains()[source]

Returns generator of TCR chains.

Yields:

Chain – TCR chain

get_frameworks()[source]

Obtain framework regions from a TCR structure object as generator.

Yields:

Fragment – TCR framework regions

get_germline_assignments()[source]

Retrieve germline assignments for all TCR chains. This is a dictionary with the chain ID as key and the germline assignments as value.

Returns:

dict with TCR chain ID as key and germline assignments as value

Return type:

dict

get_germlines_and_alleles()[source]

Get all germline and allele assignments for TCR and MHC chains as a dictionary with the chain ID as key and the germline assignments as value.

Returns:

Dictionary of TCR germline and MHC allele assignments with amino acid sequences.

Return type:

dict

get_interaction_heatmap(plotting_kwargs={}, **interaction_kwargs)[source]

Get interaction heatmap of TCR to MHC and peptide. Generates heatmap image. Plotting kwargs are passed to heatmap function.

Parameters:
  • plotting_kwargs (dict, optional) –

    save_as: path to save heatmap image to
    interaction_type: type of interaction (e.g. saltbridge, h_bond) to plot. All interactions are plotted by default.
    antigen_name: name of antigen for plot title
    mutation_index: index of antigen residues to highlight in plot

    Defaults to {save_as: None, interaction_type: None, antigen_name: None, mutation_index: None}.

  • interaction_kwargs – kwargs for TCRInteractionProfiler class. See TCRInteractionProfiler for details.

get_pitch_angle(mode='cys')[source]

Returns TCR:pMHC complex pitch angle of TCR to MHC. See paper for details.

Parameters:

mode (str, optional) – Mode for calculating the pitch angle. Options “rudolph”, “cys”, “com”. Defaults to “cys”.

Returns:

Pitch angle of TCR to MHC in degrees

Return type:

float

get_residues()[source]

Returns generator of TCR residues.

Yields:

Residue – TCR residue

get_scanning_angle(mode='rudolph')[source]

Returns TCR:pMHC complex scanning (aka crossing, incident) angle of TCR to MHC. See paper for details.

Parameters:

mode (str, optional) – Mode for calculating the scanning angle. Options “rudolph”, “cys”, “com”. Defaults to “rudolph”.

Returns:

Scanning angle of TCR to MHC in degrees

Return type:

float

is_bound()[source]

True or False if the TCR is associated with an antigen.

Returns:

Whether TCR is associated with an antigen.

Return type:

bool

profile_MHC_interactions()[source]
profile_TCR_interactions()[source]
profile_peptide_interactions(renumber: bool = True, save_to: str = None, **kwargs) pd.DataFrame[source]

Profile the interactions of the peptide to the TCR and the MHC.

Parameters:
  • renumber (bool, optional) – Whether to renumber the interacting residues. Defaults to True.

  • save_to (str, optional) – Path to save interaction data to as csv. Defaults to None.

Returns:

Dataframe of peptide interactions

Return type:

pd.DataFrame

save(save_as=None, tcr_only: bool = False, format: str = 'pdb')[source]

Save TCR object as PDB or MMCIF file.

Parameters:
  • save_as (str, optional) – File path to save TCR to. Defaults to None.

  • tcr_only (bool, optional) – Whether to save TCR only or to include MHC and antigen. Defaults to False.

  • format (str, optional) – Whether to save as PDB or MMCIF. Defaults to “pdb”.

score_docking_geometry(**kwargs)[source]

Score docking geometry of TCR to MHC. This is a wrapper function for the TCRGeomFiltering class. The score is calculated as the negative log of the TCR:pMHC complex geometry feature probabilities, based on distributions fit by maximum likelihood estimation to TCR:Class I MHC structures from STCRDab. Please see the paper methods for details.

Returns:

TCR:pMHC complex score as negative log of TCR:pMHC complex geometry feature probabilities

Return type:

float
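A negative-log-probability score of this general form can be sketched as follows. The description above does not say which distribution families are fitted or how features are combined, so the Gaussian densities, the independence assumption (summing per-feature terms) and every number below are illustrative assumptions, not the fitted values from the paper.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def negative_log_score(features, distributions):
    """Sum of -log p(feature value), treating the features as independent."""
    return sum(
        -math.log(gaussian_pdf(value, *distributions[name]))
        for name, value in features.items()
    )


# Made-up (mean, sd) pairs, used purely to exercise the function.
toy_distributions = {"scanning_angle": (60.0, 15.0), "pitch_angle": (5.0, 4.0)}
typical = {"scanning_angle": 62.0, "pitch_angle": 6.0}
unusual = {"scanning_angle": 140.0, "pitch_angle": 25.0}
print(negative_log_score(typical, toy_distributions)
      < negative_log_score(unusual, toy_distributions))  # True
```

Under this construction a lower score means the geometry sits closer to the bulk of the reference distributions, and a higher score flags an unusual docking geometry.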

class stcrpy.tcr_processing.TCR.abTCR(c1, c2)[source]

Bases: TCR

abTCR class. Inherits from TCR. This is a subclass of TCR for TCRs with alpha and beta chains.

get_VA()[source]

Retrieve the variable alpha chain of the TCR

Returns:

VA chain

Return type:

TCRchain

get_VB()[source]

Retrieve the variable beta chain of the TCR

Returns:

VB chain

Return type:

TCRchain

get_domain_assignment()[source]

Retrieve the domain assignment of the TCR as a dict with variable domain type as key and chain ID as value.

Returns:

domain assignment from domain to chain ID, e.g. {“VA”: “D”, “VB”: “E”}

Return type:

dict

get_fragments()[source]

Retrieve the fragments, i.e. the framework (FW) regions and CDR loops of the TCR, as a generator.

Yields:

Fragment – fragment of TCR chain.

is_engineered()[source]

Flag for engineered TCRs.

Returns:

Flag for engineered TCRs

Return type:

bool

class stcrpy.tcr_processing.TCR.dbTCR(c1, c2)[source]

Bases: TCR

get_VB()[source]
get_VD()[source]
get_domain_assignment()[source]
get_fragments()[source]
is_engineered()[source]
class stcrpy.tcr_processing.TCR.gdTCR(c1, c2)[source]

Bases: TCR

get_VD()[source]
get_VG()[source]
get_domain_assignment()[source]
get_fragments()[source]
is_engineered()[source]

stcrpy.tcr_processing.TCRIO module

class stcrpy.tcr_processing.TCRIO.TCRIO[source]

Bases: PDBIO

save(tcr: TCR, save_as: str = None, tcr_only: bool = False, format: str = 'pdb')[source]

Save structure to a file.

Parameters:
  • file (string or filehandle) – output file

  • select (object) – selects which entities will be written.

Typically select is a subclass of Select; it should have the following methods:

  • accept_model(model)

  • accept_chain(chain)

  • accept_residue(residue)

  • accept_atom(atom)

These methods should return 1 if the entity is to be written out, 0 otherwise.


stcrpy.tcr_processing.TCRParser module

Created on 3 April 2024 @author: Nele Quast, based on leem

TCRParser object which is based on ABDB’s AntibodyParser and BioPython’s PDB parser.

class stcrpy.tcr_processing.TCRParser.TCRParser(PERMISSIVE=True, get_header=True, QUIET=False)[source]

Bases: PDBParser, MMCIFParser

get_tcr_structure(id, file, prenumbering=None, ali_dict={}, crystal_contacts=[])[source]

Post processing of the TCRPDB.Bio.PDB structure object into a TCR context.

Parameters:
  • id – a string to identify the structure

  • file – the path to the .pdb file

  • prenumbering (optional) – prenumbering for the chains in the structure.

stcrpy.tcr_processing.TCRStructure module

Created on 10 May 2017 @author: leem Based on the ABDB.AbPDB.AntibodyStructure class.

class stcrpy.tcr_processing.TCRStructure.TCRStructure(identifier)[source]

Bases: Entity

The TCRStructure class contains a collection of models

get_MHCs()[source]

Get any instance of the MHC object. Hierarchy:

TCRStructure

|______ TCR | |______ MHC

get_TCRchains()[source]

Gets all TCR chains

get_TCRs()[source]

Get any instance of the TCR object. Hierarchy:

TCRStructure

|______ TCR | |______ MHC

get_antigens()[source]

This gets the ‘antigen’ chains in the structure, that have been assigned to a TCR or an MHC.

get_atoms()[source]
get_chains()[source]
get_header()[source]
get_holders()[source]
get_models()[source]
get_residues()[source]
get_seq(model=0)[source]
get_unpaired_TCRchains()[source]

This gets the TCR chains that are not paired

set_header(header)[source]

Set the header as the parsed header dictionary from biopython

stcrpy.tcr_processing.TCRchain module

class stcrpy.tcr_processing.TCRchain.TCRchain(identifier)[source]

Bases: Chain, Entity

A class to hold a TCR chain.

add_unnumbered(residue)[source]
analyse(chain_type)[source]
annotate_children()[source]
get_CDRs()[source]
get_MHC()[source]
get_antigen()[source]
get_fragments()[source]
get_frameworks()[source]

Obtain framework regions from a TCRchain object.

get_germline_assignments()[source]
get_sequence(type=<class 'dict'>)[source]
get_unnumbered()[source]
is_bound()[source]

Check whether there is an antigen bound to the TCR

is_engineered()[source]
set_chain_type(chain_type)[source]

Set the chain type to B, A, D, or G

set_engineered(engineered)[source]
set_sequence()[source]

stcrpy.tcr_processing.annotate module

Created on 10 May 2017 @author: leem

Implementation to call ANARCI (built in to STCRDab) to annotate structures.

stcrpy.tcr_processing.annotate.align_numbering(numbering, sequence_list, alignment_dict={})[source]

Align the sequence that has been numbered to the sequence you input. The numbered sequence should be "in" the input sequence. If not, supply an alignment dictionary (align the sequences and use get_alignment_dict(ali1, ali2)).

stcrpy.tcr_processing.annotate.align_scTCR_numbering(numbering, sequence_list, sequence_str)[source]

Align the sequence that has been numbered to a scTCR structure.

Parameters:
  • numbering – numbered list of residues; this is usually a two-element list/tuple from TCRDB.anarci.number

  • sequence_list – list of residues (e.g. from a structure) in its original numbering

  • sequence_str – string form of sequence_list

stcrpy.tcr_processing.annotate.annotate(chain)[source]

Annotate the sequence of a chain object from TCRDB.TcrPDB. For example, if you have chains B, A and X, you want to force the annotator to return the annotation for B and A but not for X (the antigen).

Returns a dictionary with the residue ids as keys and the annotations as values (or False), and the chain type, which is B/A/G/D/MH1/GA/GB/B2M or False.

stcrpy.tcr_processing.annotate.call_anarci(seq, allow={'A', 'B', 'B2M', 'D', 'G', 'GA', 'GA1', 'GA1L', 'GA2', 'GA2L', 'GB', 'MH1', 'MR1', 'MR2'})[source]

Use the ANARCI program to number the sequence.

Parameters:

seq – An amino acid sequence that you wish to number.

Returns:

numbering, chain type, germline information

stcrpy.tcr_processing.annotate.cleanup_scTCR_numbering(numbering_dict, sequence_list)[source]

The scTCR numbering method, while useful for sequences with two domains, can have gaps in between (e.g. CD1 molecule of 4lhu). This is to close the gaps in the numbering so that residues that were unnumbered by anarci don’t move around during structural parsing (when they’re probably just connections between domains).

Parameters:
  • numbering_dict – numbered dictionary from align_scTCR_numbering

  • sequence_list – sequence list from the structure for alignment.

stcrpy.tcr_processing.annotate.easy_alignment(seq1, seq2)[source]

Function to align two sequences by checking if one is in the other. This function will conserve gaps.

stcrpy.tcr_processing.annotate.extract_sequence(chain, selection=False, return_warnings=False, ignore_hets=False, backbone=False)[source]

Get the amino acid sequence of the chain. Residues containing HETATOMs are no longer skipped: they are checked to be amino acids and the single-letter code is returned.

This works provided the residues in the chain are in the correct order.

Parameters:
  • selection – a selection object to select certain residues

  • return_warnings – Flag to return a list of warnings or not

  • backbone – Flag whether to only show residues with a complete backbone (in the structure) or not.

Returns:

The sequence as a list of (resid, aa) tuples, and the sequence as a string.

stcrpy.tcr_processing.annotate.get_alignment_dict(ali1, ali2)[source]

Get a dictionary which tells you the index in sequence 2 that should align with the index in sequence 1 (key)

ali1: ----bcde-f---   seq1: bcdef
ali2: ---abcd--f---   seq2: abcdf

alignment_dict = {0: 1, 1: 2, 2: 3, 4: 4}

If the index is aligned with a gap, do not include it in the dictionary, e.g. 1 in alignment_dict -> True, 3 in alignment_dict -> False.
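The mapping can be sketched by walking the two aligned strings in parallel and counting non-gap characters. The name `alignment_dict_sketch` is hypothetical and this is not the module's implementation. The sketch assumes that gaps are plain hyphens and that both aligned strings have the same length.

```python
def alignment_dict_sketch(ali1, ali2):
    """Map each index of sequence 1 to the index of sequence 2 it is aligned with."""
    if len(ali1) != len(ali2):
        raise ValueError("aligned strings must have the same length")
    mapping = {}
    i1 = i2 = 0  # positions in the ungapped sequences
    for c1, c2 in zip(ali1, ali2):
        if c1 != "-" and c2 != "-":
            mapping[i1] = i2
        # Advance each counter only when its string contributes a residue.
        if c1 != "-":
            i1 += 1
        if c2 != "-":
            i2 += 1
    return mapping


print(alignment_dict_sketch("----bcde-f---", "---abcd--f---"))
# {0: 1, 1: 2, 2: 3, 4: 4}
```

Index 3 of sequence 1 (the residue "e") is aligned with a gap, so it is absent from the dictionary, as in the example above.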

stcrpy.tcr_processing.annotate.interpret(x)[source]

Function to interpret an annotation in the form H100A into the form (100, 'A').
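The conversion can be sketched as dropping the leading chain letter and splitting off an optional trailing insertion letter. `interpret_sketch` is a hypothetical name. Returning a single space when there is no insertion letter is an assumption borrowed from the Biopython residue-id convention, not something stated above.

```python
def interpret_sketch(annotation):
    """Turn e.g. 'H100A' into (100, 'A'); assumes one leading chain-type letter."""
    body = annotation[1:]  # drop the chain-type letter
    if body[-1].isalpha():
        return int(body[:-1]), body[-1]
    # Assumed convention: a blank insertion code when no letter is present.
    return int(body), " "


print(interpret_sketch("H100A"))  # (100, 'A')
print(interpret_sketch("B27"))    # (27, ' ')
```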

stcrpy.tcr_processing.annotate.pairwise_alignment(seq1, seq2, exact=False)[source]

Function to do a pairwise alignment of two sequences using Biopython.

stcrpy.tcr_processing.annotate.validate_sequence(seq)[source]

Check whether a sequence is a protein sequence or if someone has submitted something nasty.
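One simple form such a check could take is an alphabet test. The sketch below is an assumption about the kind of validation meant: `is_protein_sequence` is a hypothetical name, and the real function may accept other symbols (for example X) or enforce length limits.

```python
# The 20 standard amino acids in one-letter code.
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")


def is_protein_sequence(seq):
    """True if seq is non-empty and contains only standard amino acid letters."""
    return len(seq) > 0 and set(seq.upper()) <= VALID_AA


print(is_protein_sequence("CASSLGQAYEQYF"))   # True
print(is_protein_sequence("rm -rf /"))        # False
```

Rejecting anything outside a fixed alphabet before passing a sequence to an external program is a cheap way to keep unexpected input out of the annotation step.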

Module contents