Class GoTerms

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Iterable<GoTerm>

    public class GoTerms
    extends java.lang.Object
    implements java.lang.Iterable<GoTerm>, java.io.Serializable
    A collection of GO terms
    Author:
    Pablo Cingolani
    See Also:
    Serialized Form
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static boolean debug  
      static boolean verbose  
    • Constructor Summary

      Constructors 
      Constructor Description
      GoTerms()
      Default constructor
      GoTerms​(java.lang.String oboFile, java.lang.String nameSpace, java.lang.String interestingGenesFile, java.lang.String geneAssocFile, boolean removeObsolete, boolean useGeneId)
      Constructor
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      GoTerm add​(GoTerm goTerm)
      Add a GOTerm (if not already in this GOTerms). WARNING: Creates 'fake' symbolNames based on symbolIds.
      void addInterestingSymbol​(java.lang.String symbolId, int rank, java.util.HashSet<java.lang.String> noGoTermFound)
      Add a symbol as 'interesting' symbol (to every corresponding GOTerm in this set)
      boolean addSymbolId​(GoTerm goTerm, java.lang.String symbolId)
      Add a symbolId (as well as all needed mappings)
      void addSymbolsFromChilds()
      Use symbols from children in the DAG: for every GOTerm, each child's symbols are added to the term, so that the root term contains every symbol and every interestingSymbol
      java.util.Set<java.lang.String> allSymbols()
      Create a set with all the symbols
      void checkInterestingSymbolIds​(java.util.Set<java.lang.String> interestingSymbolIds)
      Checks that every symbolId is in the set (as 'interesting' symbols)
      GoTerm disjointSet​(java.util.List<GoTerm> goTermList, int activeSets)
      Produce a GOTerm based on a list of GOTerms and a 'mask'
      GoTerm getGoTerm​(java.lang.String goTermAcc)  
      java.util.HashMap<java.lang.String,​GoTerm> getGoTermsByGoTermAcc()  
      java.util.HashMap<java.lang.String,​java.util.Set<GoTerm>> getGoTermsBySymbolId()  
      java.util.Set<GoTerm> getGoTermsBySymbolId​(java.lang.String symbolId)  
      java.util.HashSet<java.lang.String> getInterestingSymbolIdsSet()  
      int getInterestingSymbolIdsSize()  
      java.lang.String getLabel()  
      int getMaxRank()  
      java.lang.String getNameSpace()  
      int getRank​(java.lang.String symbolId)
      Get symbol's rank
      java.util.HashMap<java.lang.String,​java.lang.Integer> getRankSymbolId()  
      java.util.Iterator<GoTerm> iterator()
      Iterate through each GOterm in this GOTerms
      java.util.Set<java.lang.String> keySet()  
      int levels()
      Calculate each node's level (in DAG)
      java.util.List<GoTerm> listTopTerms​(int numberToSelect)
      Select a number of GOTerms
      int numberOfInterestingSymbols()
      Calculate how many interesting symbol-IDs there are in all these GOTerms
      int numberOfNodes()
      Number of nodes in this DAG
      int numberOfNodesWithOneInterestingSymbol()
      Calculate the number of nodes that have at least one interesting symbol
      int numberOfNodesWithOneSymbol()
      Calculate the number of nodes that have at least one annotated symbol
      int numberOfSymbols()
      Calculate how many symbol-IDs there are in all these GOTerms
      void readGeneAssocFile​(java.lang.String goGenesFile, boolean useGeneId)
      Reads a file containing every gene (names and ids) associated with GO terms
      void readInterestingSymbolIdsFile​(java.lang.String fileName)
      Reads a file with a list of 'interesting' genes (one per line)
      void readOboFile​(java.lang.String oboFile, boolean removeObsolete)
      Read an OBO file
      void removeGOTerm​(java.lang.String goTermAcc)
      Remove a GOTerm
      void resetInterestingSymbolIds()
      Reset every 'interesting' symbolId (on every single GOTerm in this GOTerms)
      java.util.Set<GoTerm> rootNodes()  
      void saveGseaGeneSets​(java.lang.String fileName)
      Save gene sets file for GSEA analysis. Format specification: http://www.broad.mit.edu/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29
      void setLabel​(java.lang.String label)  
      java.lang.String toString()  
      java.util.Collection<GoTerm> values()  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
      • Methods inherited from interface java.lang.Iterable

        forEach, spliterator
    • Field Detail

      • debug

        public static boolean debug
      • verbose

        public static boolean verbose
    • Constructor Detail

      • GoTerms

        public GoTerms()
        Default constructor
      • GoTerms

        public GoTerms​(java.lang.String oboFile,
                       java.lang.String nameSpace,
                       java.lang.String interestingGenesFile,
                       java.lang.String geneAssocFile,
                       boolean removeObsolete,
                       boolean useGeneId)
        Constructor
        Parameters:
        oboFile - Path to OBO description file
        nameSpace - Can be 'null' for "all namespaces"
        interestingGenesFile - Path to a file containing a list of 'interesting' genes (one geneName per line)
        geneAssocFile - A file containing lines like: "GOterm \t gene_product_id \t gene_name \n"
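        The geneAssocFile line format described above can be parsed roughly as follows. This is a minimal sketch using plain Java collections, not the class's actual reader: the class name GeneAssocSketch, the parse method, the choice to skip malformed lines, and the reading of useGeneId as "use the gene_product_id column instead of gene_name" are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class GeneAssocSketch {

    /**
     * Groups symbols by GO term accession from lines shaped like
     * "GOterm \t gene_product_id \t gene_name".
     * Assumption: useGeneId selects column 2 (id) instead of column 3 (name).
     */
    static Map<String, Set<String>> parse(List<String> lines, boolean useGeneId) {
        Map<String, Set<String>> symbolsByGoTerm = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length < 3) continue; // Assumption: malformed lines are skipped
            String goTermAcc = fields[0].trim();
            String symbol = (useGeneId ? fields[1] : fields[2]).trim();
            symbolsByGoTerm.computeIfAbsent(goTermAcc, k -> new TreeSet<>()).add(symbol);
        }
        return symbolsByGoTerm;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "GO:0000001\tP1\tgeneA",
                "GO:0000001\tP2\tgeneB",
                "GO:0000002\tP1\tgeneA");
        // Symbols annotated to GO:0000001, by gene name
        System.out.println(parse(lines, false).get("GO:0000001")); // prints [geneA, geneB]
    }
}
```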
    • Method Detail

      • add

        public GoTerm add​(GoTerm goTerm)
        Add a GOTerm (if not already in this GOTerms). WARNING: Creates 'fake' symbolNames based on symbolIds. This method is used mostly for testing / debugging
      • addInterestingSymbol

        public void addInterestingSymbol​(java.lang.String symbolId,
                                         int rank,
                                         java.util.HashSet<java.lang.String> noGoTermFound)
        Add a symbol as 'interesting' symbol (to every corresponding GOTerm in this set)
        Parameters:
        symbolId - Symbol's ID
        rank - Symbol's rank
        noGoTermFound - Add symbol here if there are no GOTerms associated with this symbol
      • addSymbolId

        public boolean addSymbolId​(GoTerm goTerm,
                                   java.lang.String symbolId)
        Add a symbolId (as well as all needed mappings)
        Parameters:
        goTerm - GOTerm to add the symbol to
        symbolId - Symbol's ID
        Returns:
        true if OK, false on error (GOTerm not found)
      • addSymbolsFromChilds

        public void addSymbolsFromChilds()
        Use symbols from children in the DAG: for every GOTerm, each child's symbols are added to the term, so that the root term contains every symbol and every interestingSymbol
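        The upward propagation described above can be sketched with plain maps. This is an illustration of the idea, not the class's implementation: PropagateSymbolsSketch, its method names, and the map-based DAG representation (node name to child names, node name to symbol set) are hypothetical. The same pass would be repeated for the interesting symbols.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PropagateSymbolsSketch {

    /** Adds every descendant's symbols to 'node', visiting each node once. */
    static void propagate(String node, Map<String, List<String>> children,
                          Map<String, Set<String>> symbols, Set<String> done) {
        if (!done.add(node)) return; // already processed (a DAG node can have several parents)
        Set<String> own = symbols.computeIfAbsent(node, k -> new HashSet<>());
        for (String child : children.getOrDefault(node, Collections.<String>emptyList())) {
            propagate(child, children, symbols, done); // child is complete after this call
            own.addAll(symbols.get(child));
        }
    }

    /** Runs the propagation over every node mentioned in either map. */
    static void addSymbolsFromChildren(Map<String, List<String>> children,
                                       Map<String, Set<String>> symbols) {
        Set<String> done = new HashSet<>();
        // Copy the keys first: 'symbols' may gain entries while we propagate
        Set<String> nodes = new HashSet<>(children.keySet());
        nodes.addAll(symbols.keySet());
        for (String node : nodes) propagate(node, children, symbols, done);
    }

    public static void main(String[] args) {
        Map<String, List<String>> children = new HashMap<>();
        children.put("root", Arrays.asList("A", "B"));
        children.put("A", Arrays.asList("C"));
        children.put("B", Arrays.asList("C"));

        Map<String, Set<String>> symbols = new HashMap<>();
        symbols.put("A", new HashSet<>(Arrays.asList("g2")));
        symbols.put("B", new HashSet<>(Arrays.asList("g3")));
        symbols.put("C", new HashSet<>(Arrays.asList("g1")));

        addSymbolsFromChildren(children, symbols);
        System.out.println(symbols.get("root").size()); // prints 3
    }
}
```

        Marking nodes as done keeps the pass linear in the number of edges even when a term is shared by several parents.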
      • allSymbols

        public java.util.Set<java.lang.String> allSymbols()
        Create a set with all the symbols
      • checkInterestingSymbolIds

        public void checkInterestingSymbolIds​(java.util.Set<java.lang.String> interestingSymbolIds)
        Checks that every symbolId is in the set (as 'interesting' symbols)
        Parameters:
        interestingSymbolIds - A set of interesting symbols. Throws an exception on error
      • disjointSet

        public GoTerm disjointSet​(java.util.List<GoTerm> goTermList,
                                  int activeSets)
        Produce a GOTerm based on a list of GOTerms and a 'mask'
        Parameters:
        goTermList - A list of GOTerms
        activeSets - An integer (binary mask) that specifies whether each set in the list should be taken into account or not. The operation performed is: Intersection{ GOTerms where mask_bit == 1 } - Union{ GOTerms where mask_bit == 0 }, where the minus sign '-' is a 'set minus' operation. This operation is done for both sets in GOTerm (i.e. symbolIds and interestingSymbolIds)
        Returns:
        A GOTerm
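        The mask operation described above can be illustrated on plain string sets. This is a sketch under stated assumptions rather than the method's actual code: DisjointSetSketch is a hypothetical class, bit i of the mask (least significant bit first) is assumed to refer to element i of the list, and an all-zero mask is assumed to give an empty result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DisjointSetSketch {

    /**
     * Intersection{ sets where mask_bit == 1 } minus Union{ sets where mask_bit == 0 }.
     * Assumption: bit i of activeSets (least significant first) refers to sets.get(i).
     */
    static Set<String> disjointSet(List<Set<String>> sets, int activeSets) {
        Set<String> intersection = null; // null means "no active set seen yet"
        Set<String> union = new HashSet<>();

        for (int i = 0; i < sets.size(); i++) {
            boolean active = ((activeSets >> i) & 1) == 1;
            if (active) {
                if (intersection == null) intersection = new HashSet<>(sets.get(i));
                else intersection.retainAll(sets.get(i));
            } else {
                union.addAll(sets.get(i));
            }
        }

        // Assumption: with no active set at all, the result is empty
        if (intersection == null) return new HashSet<>();
        intersection.removeAll(union); // the 'set minus' step
        return intersection;
    }

    public static void main(String[] args) {
        List<Set<String>> sets = new ArrayList<>();
        sets.add(new HashSet<>(Arrays.asList("g1", "g2", "g3")));
        sets.add(new HashSet<>(Arrays.asList("g2", "g3", "g4")));
        sets.add(new HashSet<>(Arrays.asList("g3", "g5")));

        // Mask 0b011: symbols in sets 0 and 1, but not in set 2
        System.out.println(disjointSet(sets, 0b011)); // prints [g2]
    }
}
```

        In the class itself the same operation is applied twice per call, once to symbolIds and once to interestingSymbolIds.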
      • getGoTerm

        public GoTerm getGoTerm​(java.lang.String goTermAcc)
      • getGoTermsByGoTermAcc

        public java.util.HashMap<java.lang.String,​GoTerm> getGoTermsByGoTermAcc()
      • getGoTermsBySymbolId

        public java.util.HashMap<java.lang.String,​java.util.Set<GoTerm>> getGoTermsBySymbolId()
      • getGoTermsBySymbolId

        public java.util.Set<GoTerm> getGoTermsBySymbolId​(java.lang.String symbolId)
      • getInterestingSymbolIdsSet

        public java.util.HashSet<java.lang.String> getInterestingSymbolIdsSet()
      • getInterestingSymbolIdsSize

        public int getInterestingSymbolIdsSize()
      • getLabel

        public java.lang.String getLabel()
      • getMaxRank

        public int getMaxRank()
      • getNameSpace

        public java.lang.String getNameSpace()
      • getRank

        public int getRank​(java.lang.String symbolId)
        Get symbol's rank
        Parameters:
        symbolId -
        Returns:
      • getRankSymbolId

        public java.util.HashMap<java.lang.String,​java.lang.Integer> getRankSymbolId()
      • iterator

        public java.util.Iterator<GoTerm> iterator()
        Iterate through each GOterm in this GOTerms
        Specified by:
        iterator in interface java.lang.Iterable<GoTerm>
      • keySet

        public java.util.Set<java.lang.String> keySet()
      • levels

        public int levels()
        Calculate each node's level (in DAG)
        Returns:
        maximum level
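        One way to compute per-node levels in a DAG is sketched below. The documentation above does not define "level", so this sketch assumes that root nodes have level 0 and that every other node sits one level below its deepest parent; DagLevelsSketch and its map-based representation (node name to parent names) are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DagLevelsSketch {

    /**
     * Level of one node, memoized in 'levels'.
     * Assumption: roots are level 0; otherwise 1 + the maximum level among the parents.
     */
    static int level(String node, Map<String, List<String>> parents, Map<String, Integer> levels) {
        Integer cached = levels.get(node);
        if (cached != null) return cached;
        int nodeLevel = 0;
        for (String parent : parents.getOrDefault(node, Collections.<String>emptyList())) {
            nodeLevel = Math.max(nodeLevel, 1 + level(parent, parents, levels));
        }
        levels.put(node, nodeLevel);
        return nodeLevel;
    }

    /** Fills 'levels' for every node and returns the maximum level. */
    static int levels(Set<String> nodes, Map<String, List<String>> parents, Map<String, Integer> levels) {
        int max = 0;
        for (String node : nodes) max = Math.max(max, level(node, parents, levels));
        return max;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parents = new HashMap<>();
        parents.put("A", Arrays.asList("root"));
        parents.put("B", Arrays.asList("root", "A"));
        parents.put("C", Arrays.asList("B"));

        Map<String, Integer> levels = new HashMap<>();
        int max = levels(parents.keySet(), parents, levels);
        System.out.println(max); // prints 3
    }
}
```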
      • listTopTerms

        public java.util.List<GoTerm> listTopTerms​(int numberToSelect)
        Select a number of GOTerms
        Parameters:
        numberToSelect -
        Returns:
      • numberOfInterestingSymbols

        public int numberOfInterestingSymbols()
        Calculate how many interesting symbol-IDs there are in all these GOTerms
        Returns:
        Number of interesting symbols
      • numberOfNodes

        public int numberOfNodes()
        Number of nodes in this DAG
        Returns:
      • numberOfNodesWithOneInterestingSymbol

        public int numberOfNodesWithOneInterestingSymbol()
        Calculate the number of nodes that have at least one interesting symbol
        Returns:
      • numberOfNodesWithOneSymbol

        public int numberOfNodesWithOneSymbol()
        Calculate the number of nodes that have at least one annotated symbol
        Returns:
      • numberOfSymbols

        public int numberOfSymbols()
        Calculate how many symbol-IDs there are in all these GOTerms
        Returns:
        Number of symbols
      • readGeneAssocFile

        public void readGeneAssocFile​(java.lang.String goGenesFile,
                                      boolean useGeneId)
        Reads a file containing every gene (names and ids) associated with GO terms
        Parameters:
        goGenesFile - A file containing gene associations to GO terms
      • readInterestingSymbolIdsFile

        public void readInterestingSymbolIdsFile​(java.lang.String fileName)
        Reads a file with a list of 'interesting' genes (one per line)
        Parameters:
        fileName - Can be "-" for no file
      • readOboFile

        public void readOboFile​(java.lang.String oboFile,
                                boolean removeObsolete)
        Read an OBO file
        Parameters:
        oboFile -
        removeObsolete -
      • removeGOTerm

        public void removeGOTerm​(java.lang.String goTermAcc)
        Remove a GOTerm
      • resetInterestingSymbolIds

        public void resetInterestingSymbolIds()
        Reset every 'interesting' symbolId (on every single GOTerm in this GOTerms)
      • rootNodes

        public java.util.Set<GoTerm> rootNodes()
      • saveGseaGeneSets

        public void saveGseaGeneSets​(java.lang.String fileName)
        Save gene sets file for GSEA analysis. Format specification: http://www.broad.mit.edu/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29
        Parameters:
        fileName -
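        In the GMT format, each line describes one gene set as tab-separated fields: a set name, a description, and then the genes. The sketch below builds one such line. It is not the method's actual output code: GmtSketch and gmtLine are hypothetical names, and using the GO accession as the set name is an assumption.

```java
import java.util.Arrays;
import java.util.Collection;

class GmtSketch {

    /** Builds one GMT line: name, description, then one tab-separated field per gene. */
    static String gmtLine(String name, String description, Collection<String> genes) {
        StringBuilder sb = new StringBuilder();
        sb.append(name).append('\t').append(description);
        for (String gene : genes) sb.append('\t').append(gene);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the GO accession serves as the gene set name
        System.out.println(gmtLine("GO:0000001", "example term", Arrays.asList("geneA", "geneB")));
    }
}
```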
      • setLabel

        public void setLabel​(java.lang.String label)
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • values

        public java.util.Collection<GoTerm> values()