Package org.snpeff.snpEffect
Class HgvsProtein
- java.lang.Object
-
- org.snpeff.snpEffect.Hgvs
-
- org.snpeff.snpEffect.HgvsProtein
-
public class HgvsProtein extends Hgvs
Coding change in HGVS notation (amino acid changes).
References: http://www.hgvs.org/mutnomen/recs.html
-
-
Field Summary
Modifier and Type Field Description
static boolean
debug
-
Fields inherited from class org.snpeff.snpEffect.Hgvs
duplication, genome, hgvsTrId, marker, MAX_SEQUENCE_LEN_HGVS, strandMinus, strandPlus, tr, variant, variantEffect
-
-
Constructor Summary
Constructor Description
HgvsProtein(VariantEffect variantEffect)
-
Method Summary
Modifier and Type Method Description
protected java.lang.String
aaCode(char aa1Letter)
protected java.lang.String
aaCode(java.lang.String aa1Letter)
Use one-letter / three-letter AA codes. Most of the time we want to convert to the three-letter code. HGVS: the three-letter amino acid code is preferred (see Discussion), with "*" designating a translation termination codon; for clarity this page describes changes using the three-letter amino acid code.
protected java.lang.String
del()
Deletions remove one or more amino acid residues from the protein and are described using "del" after an indication of the first and last amino acid(s) deleted, separated by a "_" (underscore).
protected java.lang.String
delins()
Mixed variants. Deletion/insertions (indels) replace one or more amino acid residues with one or more other amino acid residues.
protected java.lang.String
dup()
Duplications
protected java.lang.String
fs()
Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11).
protected java.lang.String
ins()
Insertions add one or more amino acid residues between two existing amino acids, where the insertion is not a copy of a sequence immediately 5'-flanking (see Duplication).
protected boolean
isDuplication()
Is this variant a duplication? Reference: http://www.hgvs.org/mutnomen/disc.html#dupins ...the description "dup" (see Standards) may by definition only be used when the additional copy is directly 3'-flanking of the original copy (tandem duplication)
protected java.lang.String
pos(int codonNum)
Protein position
protected java.lang.String
pos(int start, int end)
protected java.lang.String
pos(Transcript tr, int codonNum)
Protein position
protected java.lang.String
pos(Transcript tr, int start, int end)
Position string given two coordinates
protected java.lang.String
posDel()
Position for deletions
protected java.lang.String
posDelIns()
Position for 'delins'
protected java.lang.String
posDup()
Position for 'duplications' (a special kind of insertion)
protected java.lang.String
posFs()
Frame shifts ... are described using ...
protected java.lang.String
posIns()
Position for insertions
protected java.lang.String
posSnpOrMnp()
Position: SNP or MNP
protected java.lang.String
snpOrMnp()
SNP or MNP changes
java.lang.String
toString()
protected java.lang.String
translocation()
Translocation nomenclature.
protected java.lang.String
typeOfReference()
Return "p." string with/without transcript ID, according to user command line options.
-
Methods inherited from class org.snpeff.snpEffect.Hgvs
initStrand, parseTranscript, removeTranscript
-
Constructor Detail
-
HgvsProtein
public HgvsProtein(VariantEffect variantEffect)
-
-
Method Detail
-
aaCode
protected java.lang.String aaCode(char aa1Letter)
-
aaCode
protected java.lang.String aaCode(java.lang.String aa1Letter)
Use one-letter / three-letter AA codes. Most of the time we want to convert to the three-letter code. HGVS: the three-letter amino acid code is preferred (see Discussion), with "*" designating a translation termination codon; for clarity this page describes changes using the three-letter amino acid code.
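As an illustration of the one-letter to three-letter conversion described above, here is a minimal sketch. The class AaCodeSketch and its lookup table are hypothetical and are not SnpEff's implementation; rendering "*" as "Ter" is an assumption (HGVS also accepts "*").

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of SnpEff): one-letter to three-letter amino acid codes. */
final class AaCodeSketch {
    private static final Map<Character, String> THREE_LETTER = new HashMap<>();

    static {
        String[][] table = {
            {"A", "Ala"}, {"R", "Arg"}, {"N", "Asn"}, {"D", "Asp"}, {"C", "Cys"},
            {"Q", "Gln"}, {"E", "Glu"}, {"G", "Gly"}, {"H", "His"}, {"I", "Ile"},
            {"L", "Leu"}, {"K", "Lys"}, {"M", "Met"}, {"F", "Phe"}, {"P", "Pro"},
            {"S", "Ser"}, {"T", "Thr"}, {"W", "Trp"}, {"Y", "Tyr"}, {"V", "Val"},
            {"*", "Ter"} // assumption: stop codon rendered as "Ter"
        };
        for (String[] row : table) {
            THREE_LETTER.put(row[0].charAt(0), row[1]);
        }
    }

    /** Convert a single one-letter code; unknown symbols are rejected. */
    static String toThreeLetter(char aa) {
        String code = THREE_LETTER.get(Character.toUpperCase(aa));
        if (code == null) {
            throw new IllegalArgumentException("Unknown amino acid: " + aa);
        }
        return code;
    }

    /** Convert a one-letter sequence by concatenating the three-letter codes. */
    static String toThreeLetter(String aas) {
        StringBuilder sb = new StringBuilder();
        for (char aa : aas.toCharArray()) {
            sb.append(toThreeLetter(aa));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toThreeLetter("QSK")); // GlnSerLys
    }
}
```

The two overloads mirror the char and String variants of aaCode; how the real methods treat unknown symbols is not documented here, so the exception is only one possible choice.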
-
del
protected java.lang.String del()
Deletions remove one or more amino acid residues from the protein and are described using "del" after an indication of the first and last amino acid(s) deleted, separated by a "_" (underscore). Deletions remove either a small internal segment of the protein (in-frame deletion), part of the N-terminus of the protein (initiation codon change) or the entire C-terminal part of the protein (nonsense change). A nonsense change is a special type of deletion removing the entire C-terminal part of a protein starting at the site of the variant (specified 2013-03-16).
1) In-frame deletions are described using "del" after an indication of the first and last amino acid(s) deleted, separated by a "_" (underscore).
   p.Gln8del in the sequence MKMGHQQQCC denotes a Glutamine-8 (Gln, Q) deletion to MKMGHQQCC
   p.(Cys28_Met30del) denotes that neither RNA nor protein was analysed but the predicted change is a deletion of three amino acids, from Cysteine-28 to Methionine-30
2) Initiating methionine change (Met1) causing an N-terminal deletion (see Discussion, see Examples). NOTE: changes extending the N-terminal protein sequence are described as an extension.
   p.0 - no protein is produced (experimental data should be available). NOTE: this change is not described as p.Met1_Leu833del, i.e. as a deletion removing the entire protein coding sequence
   p.Met1? - denotes that amino acid Methionine-1 (translation initiation site) is changed and that it is unclear what the consequence of this change is
   p.Met1_Lys45del - a new translation initiation site is activated (at Met46)
3) Nonsense variants are a special type of amino acid deletion removing the entire C-terminal part of a protein starting at the site of the variant. A nonsense change is described using the format p.Trp26Ter (alternatively p.Trp26*). The description does not include the deletion at protein level from the site of the change to the C-terminal end of the protein (stop codon) like p.Trp26_Leu833del (the deletion of amino acid residue Trp26 to the last amino acid of the protein Leu833).
   p.(Trp26Ter) indicates that neither RNA nor protein was analysed but amino acid Tryptophan26 (Trp, W) is predicted to change to a stop codon (Ter) (alternatively p.(W26*) or p.(Trp26*))
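The in-frame deletion and nonsense formats above can be sketched as follows. DelSketch and all of its methods are hypothetical names used for illustration only; the sketch takes the deleted range as given and does not attempt the position shifting or the Met1 special cases that a real implementation has to handle.

```java
/** Hypothetical helper (not part of SnpEff): HGVS-style strings for deletions and nonsense changes. */
final class DelSketch {
    private static final String ONE = "ARNDCQEGHILKMFPSTWYV";
    private static final String THREE =
        "AlaArgAsnAspCysGlnGluGlyHisIleLeuLysMetPheProSerThrTrpTyrVal";

    /** Three-letter code for a one-letter amino acid. */
    static String three(char aa) {
        int i = ONE.indexOf(aa);
        if (i < 0) throw new IllegalArgumentException("Unknown amino acid: " + aa);
        return THREE.substring(3 * i, 3 * i + 3);
    }

    /** In-frame deletion of residues start..end (1-based, inclusive). */
    static String del(String protein, int start, int end) {
        String first = three(protein.charAt(start - 1)) + start;
        if (start == end) return "p." + first + "del";
        return "p." + first + "_" + three(protein.charAt(end - 1)) + end + "del";
    }

    /** Protein sequence after removing residues start..end. */
    static String apply(String protein, int start, int end) {
        return protein.substring(0, start - 1) + protein.substring(end);
    }

    /** Nonsense change: the residue at pos becomes a stop codon. */
    static String nonsense(char refAa, int pos) {
        return "p." + three(refAa) + pos + "Ter";
    }

    public static void main(String[] args) {
        String protein = "MKMGHQQQCC";
        System.out.println(del(protein, 8, 8));   // p.Gln8del
        System.out.println(apply(protein, 8, 8)); // MKMGHQQCC
        System.out.println(nonsense('W', 26));    // p.Trp26Ter
    }
}
```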
-
delins
protected java.lang.String delins()
Mixed variants. Deletion/insertions (indels) replace one or more amino acid residues with one or more other amino acid residues. Deletion/insertions are described using "delins" as a deletion followed by an insertion after an indication of the amino acid(s) flanking the site of the deletion/insertion, separated by a "_" (underscore, see Discussion).
Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11). A frame shift is described using "fs" after the first amino acid affected by the change. Descriptions use either a short ("fs") or long ("fsTer#") form. The description of frame shifts does not include the deletion at protein level from the site of the frame shift to the natural end of the protein (stop codon). The inserted amino acid residues are not described; only the total length of the new shifted frame is given (i.e. including the first amino acid changed).
-
dup
protected java.lang.String dup()
Duplications
-
fs
protected java.lang.String fs()
Frame shifts are a special type of amino acid deletion/insertion affecting an amino acid between the first (initiation, ATG) and last codon (termination, stop), replacing the normal C-terminal sequence with one encoded by another reading frame (specified 2013-10-11). A frame shift is described using "fs" after the first amino acid affected by the change. Descriptions use either a short ("fs") or long ("fsTer#") form.
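The short and long frame-shift forms can be illustrated with a small sketch. FsSketch, its methods and the toy sequences are hypothetical. The long form shown (reference residue, position, new residue, "fsTer", length of the new frame counted from the first changed residue up to and including the stop) follows the HGVS convention as understood here and may differ from what fs() actually emits.

```java
/** Hypothetical helper (not part of SnpEff): short and long frame-shift descriptions. */
final class FsSketch {
    private static final String ONE = "ARNDCQEGHILKMFPSTWYV";
    private static final String THREE =
        "AlaArgAsnAspCysGlnGluGlyHisIleLeuLysMetPheProSerThrTrpTyrVal";

    /** Three-letter code; '*' is rejected, since a change to stop is a nonsense variant. */
    static String three(char aa) {
        int i = ONE.indexOf(aa);
        if (i < 0) throw new IllegalArgumentException("Unknown amino acid: " + aa);
        return THREE.substring(3 * i, 3 * i + 3);
    }

    /** 0-based index of the first residue that differs between the two proteins. */
    static int firstDiff(String ref, String alt) {
        int n = Math.min(ref.length(), alt.length());
        for (int i = 0; i < n; i++) {
            if (ref.charAt(i) != alt.charAt(i)) return i;
        }
        throw new IllegalArgumentException("No difference found in the compared region");
    }

    /** Short form: "fs" after the first amino acid affected. */
    static String fsShort(String ref, String alt) {
        int i = firstDiff(ref, alt);
        return "p." + three(ref.charAt(i)) + (i + 1) + "fs";
    }

    /** Long form: adds the new residue and the length of the shifted frame. */
    static String fsLong(String ref, String alt) {
        int i = firstDiff(ref, alt);
        int stop = alt.indexOf('*', i);
        if (stop < 0) return fsShort(ref, alt); // assumption: fall back when no stop is seen
        return "p." + three(ref.charAt(i)) + (i + 1)
            + three(alt.charAt(i)) + "fsTer" + (stop - i + 1);
    }

    public static void main(String[] args) {
        String ref = "MKMGHQQQCC*"; // toy reference protein, '*' marks the stop
        String alt = "MKMGHRSAV*";  // toy shifted protein
        System.out.println(fsShort(ref, alt)); // p.Gln6fs
        System.out.println(fsLong(ref, alt));  // p.Gln6ArgfsTer5
    }
}
```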
-
ins
protected java.lang.String ins()
Insertions add one or more amino acid residues between two existing amino acids, where the insertion is not a copy of a sequence immediately 5'-flanking (see Duplication). Insertions are described using "ins" after an indication of the amino acids flanking the insertion site, separated by a "_" (underscore) and followed by a description of the amino acid(s) inserted. Since for large insertions the amino acids can be derived from the DNA and/or RNA descriptions, they need not be described exactly; the total number may be given instead (like "ins17").
Examples:
1) p.Lys2_Met3insGlnSerLys denotes that the sequence GlnSerLys (QSK) was inserted between amino acids Lysine-2 (Lys, K) and Methionine-3 (Met, M), changing MKMGHQQQCC to MKQSKMGHQQQCC
2) p.Trp182_Gln183ins17 describes a variant that inserts 17 amino acids between amino acids Trp182 and Gln183. NOTE: it must be possible to deduce the 17 inserted amino acids from the description given at DNA or RNA level
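The insertion format can be sketched as below. InsSketch and its methods are hypothetical and only reproduce the string layout described above; when to switch from the explicit residues to the count form is left to the caller, since no threshold is documented here.

```java
/** Hypothetical helper (not part of SnpEff): HGVS-style insertion descriptions. */
final class InsSketch {
    private static final String ONE = "ARNDCQEGHILKMFPSTWYV";
    private static final String THREE =
        "AlaArgAsnAspCysGlnGluGlyHisIleLeuLysMetPheProSerThrTrpTyrVal";

    static String three(char aa) {
        int i = ONE.indexOf(aa);
        if (i < 0) throw new IllegalArgumentException("Unknown amino acid: " + aa);
        return THREE.substring(3 * i, 3 * i + 3);
    }

    /** Flanking residues of an insertion placed after 1-based position afterPos. */
    static String flanks(String protein, int afterPos) {
        return "p." + three(protein.charAt(afterPos - 1)) + afterPos
            + "_" + three(protein.charAt(afterPos)) + (afterPos + 1) + "ins";
    }

    /** Explicit form: the inserted residues are spelled out. */
    static String ins(String protein, int afterPos, String inserted) {
        StringBuilder sb = new StringBuilder(flanks(protein, afterPos));
        for (char aa : inserted.toCharArray()) {
            sb.append(three(aa));
        }
        return sb.toString();
    }

    /** Count form for large insertions, e.g. "ins17". */
    static String insCount(String protein, int afterPos, int count) {
        return flanks(protein, afterPos) + count;
    }

    /** Protein sequence after the insertion. */
    static String apply(String protein, int afterPos, String inserted) {
        return protein.substring(0, afterPos) + inserted + protein.substring(afterPos);
    }

    public static void main(String[] args) {
        String protein = "MKMGHQQQCC";
        System.out.println(ins(protein, 2, "QSK"));   // p.Lys2_Met3insGlnSerLys
        System.out.println(apply(protein, 2, "QSK")); // MKQSKMGHQQQCC
    }
}
```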
-
isDuplication
protected boolean isDuplication()
Is this variant a duplication? Reference: http://www.hgvs.org/mutnomen/disc.html#dupins ...the description "dup" (see Standards) may by definition only be used when the additional copy is directly 3'-flanking of the original copy (tandem duplication)
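One way to read the tandem-duplication rule at protein level is sketched below: an insertion counts as a duplication only when the inserted residues equal the residues immediately preceding the insertion point, so that the new copy directly follows the original. DupSketch is a hypothetical name and this check is an assumption about the rule, not a description of how isDuplication() is implemented.

```java
/** Hypothetical helper (not part of SnpEff): tandem-duplication check on a protein sequence. */
final class DupSketch {
    /**
     * @param protein  reference protein, one-letter codes
     * @param afterPos 1-based position after which the residues are inserted
     * @param inserted inserted residues, one-letter codes
     * @return true if the inserted residues copy the residues ending at afterPos
     */
    static boolean isTandemDuplication(String protein, int afterPos, String inserted) {
        int len = inserted.length();
        if (len == 0 || len > afterPos || afterPos > protein.length()) {
            return false; // nothing inserted, or not enough upstream sequence to copy
        }
        return protein.substring(afterPos - len, afterPos).equals(inserted);
    }

    public static void main(String[] args) {
        String protein = "MKMGHQQQCC";
        System.out.println(isTandemDuplication(protein, 5, "GH"));  // true: copies Gly4_His5
        System.out.println(isTandemDuplication(protein, 2, "QSK")); // false: plain insertion
    }
}
```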
-
pos
protected java.lang.String pos(int codonNum)
Protein position
-
pos
protected java.lang.String pos(int start, int end)
-
pos
protected java.lang.String pos(Transcript tr, int codonNum)
Protein position
-
pos
protected java.lang.String pos(Transcript tr, int start, int end)
Position string given two coordinates
-
posDel
protected java.lang.String posDel()
Position for deletions
-
posDelIns
protected java.lang.String posDelIns()
Position for 'delins'
-
posDup
protected java.lang.String posDup()
Position for 'duplications' (a special kind of insertion)
-
posFs
protected java.lang.String posFs()
Frame shifts ... are described using ... the change of the first amino acid affected ... the description does not include a description of the deletion from the site of the change
-
posIns
protected java.lang.String posIns()
Position for insertions
-
posSnpOrMnp
protected java.lang.String posSnpOrMnp()
Position: SNP or MNP
-
snpOrMnp
protected java.lang.String snpOrMnp()
SNP or MNP changes
-
toString
public java.lang.String toString()
- Overrides:
toString
in class java.lang.Object
-
translocation
protected java.lang.String translocation()
Translocation nomenclature. From HGVS: Translocations at protein level occur when a translocation at DNA level leads to the production of a fusion protein, joining the N-terminal end of the protein on one chromosome to the C-terminal end of the protein on the other chromosome (and vice versa). No recommendations have been made so far to describe protein translocations.
Example: t(X;17)(DMD:p.Met1_Val1506; SGCA:p.Val250_*387) describes a fusion protein resulting from a translocation between chromosomes X and 17; the fusion protein contains an N-terminal segment of DMD (dystrophin, amino acids Methionine-1 to Valine-1506) and a C-terminal segment of SGCA (alpha-sarcoglycan, amino acids Valine-250 to the stop codon at 387)
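The layout of the example string can be reproduced with a few lines of string assembly. TranslocationSketch and its methods are hypothetical; since HGVS gives no formal recommendation here, the sketch only mirrors the single example quoted above.

```java
/** Hypothetical helper (not part of SnpEff): assembles a translocation string like the quoted example. */
final class TranslocationSketch {
    /** One protein segment, e.g. DMD:p.Met1_Val1506. */
    static String segment(String gene, String first, String last) {
        return gene + ":p." + first + "_" + last;
    }

    /** Full description: t(chrA;chrB)(segmentA; segmentB). */
    static String translocation(String chrA, String chrB, String segA, String segB) {
        return "t(" + chrA + ";" + chrB + ")(" + segA + "; " + segB + ")";
    }

    public static void main(String[] args) {
        String nTerminal = segment("DMD", "Met1", "Val1506");
        String cTerminal = segment("SGCA", "Val250", "*387");
        // t(X;17)(DMD:p.Met1_Val1506; SGCA:p.Val250_*387)
        System.out.println(translocation("X", "17", nTerminal, cTerminal));
    }
}
```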
-
typeOfReference
protected java.lang.String typeOfReference()
Return "p." string with/without transcript ID, according to user command line options.
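A minimal sketch of the behaviour described above, assuming a boolean that stands in for the command line option and a ":" separator between the transcript ID and "p." (both assumptions; neither the option name nor the exact output format is documented here). TypeOfReferenceSketch and the transcript ID used are made up for illustration.

```java
/** Hypothetical helper (not part of SnpEff): "p." prefix with or without a transcript ID. */
final class TypeOfReferenceSketch {
    /**
     * @param transcriptId        transcript ID to prepend, may be null
     * @param includeTranscriptId stand-in for the user's command line choice
     */
    static String prefix(String transcriptId, boolean includeTranscriptId) {
        if (includeTranscriptId && transcriptId != null && !transcriptId.isEmpty()) {
            return transcriptId + ":p."; // assumption: ":" separates the ID from the description
        }
        return "p.";
    }

    public static void main(String[] args) {
        System.out.println(prefix("TR_EXAMPLE_1", true) + "Trp26Ter");  // TR_EXAMPLE_1:p.Trp26Ter
        System.out.println(prefix("TR_EXAMPLE_1", false) + "Trp26Ter"); // p.Trp26Ter
    }
}
```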
-
-