Package com.actelion.research.chem.mmp
Class MMPServices
java.lang.Object
  com.actelion.research.chem.mmp.MMPServices
public class MMPServices extends java.lang.Object
-
Constructor Summary
MMPServices()
-
Method Summary
Each entry gives the return type and signature, followed by the description. All methods are concrete instance methods.
java.util.List<java.lang.String> getCategoryNames(java.lang.String datasetName)
- Returns the list of categories for each available numeric field
java.util.List<java.lang.String> getChemicalSpace(java.lang.String datasetName, java.lang.String[] keys, java.lang.String value, java.lang.String dataField)
- Gets the chemical space for a specific data set
java.lang.String getChemicalSpaceDWAR(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String dataField)
- Generates the DWAR file of the Chemical Space for a specific data set and a specific 'key'
int getChemicalSpaceSize(java.lang.String datasetName, java.lang.String key)
- Gets the size of the chemical space for a specific data set
int getChemicalSpaceSize(java.lang.String datasetName, java.lang.String[] keys)
- Gets the size of the chemical space for a specific data set
java.util.List<java.lang.String> getDataFields(java.lang.String datasetName)
- Returns a list of available (numerical) data fields for a specific data set
java.lang.String getDatasetInformations(java.util.ArrayList<java.lang.String> datasetNames)
- Returns general information about the specified data sets
java.lang.String getIDCodeFromMolName(java.lang.String datasetName, java.lang.String molName)
- Returns the idCode of a molecule from its name
java.util.List<java.lang.String> getLongDataFields(java.lang.String datasetName)
- Returns a list of long field names for each available numeric field
java.lang.String getMMPsDWAR(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String value1, java.lang.String value2, int replacementSize, java.util.List<java.lang.String> properties)
- Generates the DWAR file of Matched Molecular Pairs for a specific data set and a specific transformation
java.util.List<java.lang.String> getPercentiles5(java.lang.String datasetName)
- Returns the list of the 5% percentiles for each available numeric field
java.util.List<java.lang.String> getPercentiles95(java.lang.String datasetName)
- Returns the list of the 95% percentiles for each available numeric field
java.lang.String getTransformationsDWAR(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String value1, int minAtoms, int maxAtoms, int environmentSize, java.util.List<java.lang.String> properties)
- Generates the DWAR file of the Transformations for a specific data set
java.lang.String getTransformationsJSON(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String value1, int minAtoms, int maxAtoms, java.lang.String sortBy)
- Generates the main JSON string for a seeded 'value'
int getTransformationsSize(java.lang.String datasetName, java.lang.String value1, int minAtoms, int maxAtoms)
- Gets the number of transformations for a specific data set, seed 'value' and deltas of heavy atoms
java.util.List<java.lang.String[]> getTransformationsTable(java.lang.String datasetName, java.lang.String[] keys, java.lang.String value1, int minAtoms, int maxAtoms)
- Returns a list of transformations
java.lang.String readMMPFile(java.io.BufferedReader br, boolean verbose)
- Reads a new MMP File
-
-
-
Method Detail
-
readMMPFile
public java.lang.String readMMPFile(java.io.BufferedReader br, boolean verbose) throws java.io.IOException, java.lang.Exception
Reads a new MMP File
Parameters:
  br - BufferedReader
  verbose - Verbose
Returns:
  short name of the data set
Throws:
  java.io.IOException
  java.lang.Exception
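A minimal sketch of handing a file to a method with the readMMPFile signature. It assumes the MMP file is UTF-8 text, which this page does not state. MmpReader is a hypothetical stand-in interface that mirrors the signature above so the sketch compiles without the library; with the real class, a method reference to readMMPFile on an MMPServices instance could be passed instead. The stub in main only returns the first line of the file and does not reproduce the real parsing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in mirroring readMMPFile(BufferedReader, boolean). */
interface MmpReader {
    String readMMPFile(BufferedReader br, boolean verbose) throws IOException, Exception;
}

final class MmpFileLoading {
    private MmpFileLoading() {}

    /** Opens the file (UTF-8 is an assumption) and closes the reader afterwards. */
    static String load(Path file, MmpReader reader, boolean verbose) throws Exception {
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return reader.readMMPFile(br, verbose);
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder file content; a real MMP file has its own format.
        Path demo = Files.createTempFile("demo", ".txt");
        Files.write(demo, "demo-dataset\n".getBytes(StandardCharsets.UTF_8));

        // Stub reader for illustration only: returns the first line.
        MmpReader stub = (br, verbose) -> br.readLine();
        System.out.println(load(demo, stub, false));
    }
}
```

The try-with-resources block guarantees the reader is closed even when the method throws, which matters here because the signature declares both IOException and Exception.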
-
getChemicalSpaceSize
public int getChemicalSpaceSize(java.lang.String datasetName, java.lang.String key)
Gets the size of the chemical space for a specific data set
Parameters:
  datasetName - Short name of the data set
  key - idCode of the 'key' (constant part of the molecule)
Returns:
  Size of the chemical space, -1 if the data set does not exist
-
getChemicalSpaceSize
public int getChemicalSpaceSize(java.lang.String datasetName, java.lang.String[] keys)
Gets the size of the chemical space for a specific data set
Parameters:
  datasetName - Short name of the data set
  keys - Array of 'keys' idCodes (constant part of the molecule, one for single cut, two for double cuts)
Returns:
  Size of the chemical space, -1 if the data set does not exist
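Both overloads signal a missing data set with -1 rather than an exception, so callers need to check the result before using it as a count. One possible way to make that check explicit is sketched below; DatasetSize is a hypothetical helper, not part of the library, and the raw value in main is a placeholder for an actual call to getChemicalSpaceSize.

```java
import java.util.OptionalInt;

/** Hypothetical helper that turns the documented -1 sentinel into an empty OptionalInt. */
final class DatasetSize {
    private DatasetSize() {}

    static OptionalInt fromRaw(int rawSize) {
        // Only -1 is documented as "data set does not exist"; other values pass through.
        return rawSize == -1 ? OptionalInt.empty() : OptionalInt.of(rawSize);
    }

    public static void main(String[] args) {
        int raw = -1; // placeholder for the value returned by getChemicalSpaceSize
        OptionalInt size = fromRaw(raw);
        System.out.println(size.isPresent()
                ? "size = " + size.getAsInt()
                : "unknown data set");
    }
}
```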
-
getChemicalSpace
public java.util.List<java.lang.String> getChemicalSpace(java.lang.String datasetName, java.lang.String[] keys, java.lang.String value, java.lang.String dataField)
Gets the chemical space for a specific data set
Parameters:
  datasetName - Short name of the data set
  keys - Array of 'keys' idCodes (constant part of the molecule)
  value - 'value' idCode (variable part of the molecule - not used yet). Can be null
  dataField - Name of the data field for which data should be returned. Can be null
Returns:
  a List of tab-delimited [idCodes, moleculeName, data] entries
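The returned entries are plain tab-delimited strings, so they have to be split before use. The sketch below assumes the bracketed description maps to exactly three tab-separated columns in that order, and that the data column may be missing or empty when dataField is null; neither detail is confirmed on this page. ChemicalSpaceEntry is a hypothetical class and the sample strings are placeholders, not real idCodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical value class for one tab-delimited chemical space entry. */
final class ChemicalSpaceEntry {
    final String idCodes;
    final String moleculeName;
    final String data; // null when no data column is present (assumption)

    ChemicalSpaceEntry(String idCodes, String moleculeName, String data) {
        this.idCodes = idCodes;
        this.moleculeName = moleculeName;
        this.data = data;
    }

    /** Splits one entry on tabs; the column order is assumed from the documentation. */
    static ChemicalSpaceEntry parse(String line) {
        String[] parts = line.split("\t", -1); // -1 keeps trailing empty columns
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected at least 2 tab-delimited fields: " + line);
        }
        String data = parts.length > 2 && !parts[2].isEmpty() ? parts[2] : null;
        return new ChemicalSpaceEntry(parts[0], parts[1], data);
    }

    static List<ChemicalSpaceEntry> parseAll(List<String> lines) {
        List<ChemicalSpaceEntry> entries = new ArrayList<>();
        for (String line : lines) {
            entries.add(parse(line));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Placeholder strings standing in for the list returned by getChemicalSpace.
        List<String> raw = Arrays.asList("IDCODE_A\tmol-1\t7.2", "IDCODE_B\tmol-2\t");
        for (ChemicalSpaceEntry e : parseAll(raw)) {
            System.out.println(e.moleculeName + " -> " + e.data);
        }
    }
}
```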
-
getMMPsDWAR
public java.lang.String getMMPsDWAR(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String value1, java.lang.String value2, int replacementSize, java.util.List<java.lang.String> properties)
Generates the DWAR file of Matched Molecular Pairs for a specific data set and a specific transformation
Parameters:
  datasetName - Short name of the data set
  idCode - idCode of the seed molecule
  keys - Array of 'keys' idCodes (constant part of the molecule)
  value1 - seeded 'value' idCode (variable part of the molecule)
  value2 - target 'value' idCode (transformation)
  replacementSize - Difference in number of heavy atoms between seed and target fragments
  properties - List of data fields for which data should be returned
Returns:
  a String containing the content of the whole DWAR file
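Because the DWAR methods return the file content as a String rather than writing a file, the caller is responsible for persisting it. A minimal sketch follows, assuming UTF-8 is an acceptable encoding for the output (not stated on this page). DwarFileWriter is a hypothetical helper, and the content in main is a placeholder rather than valid DWAR data.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that saves a DWAR string returned by one of the *DWAR methods. */
final class DwarFileWriter {
    private DwarFileWriter() {}

    /** Writes the content to the target path; UTF-8 is an assumption. */
    static Path write(String dwarContent, Path target) throws IOException {
        if (dwarContent == null) {
            throw new IllegalArgumentException("dwarContent must not be null");
        }
        return Files.write(target, dwarContent.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String content = "placeholder for the String returned by getMMPsDWAR\n";
        Path target = Files.createTempFile("mmps", ".dwar");
        write(content, target);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

The same helper would apply to the strings returned by getChemicalSpaceDWAR and getTransformationsDWAR, since all three are documented as the content of a whole DWAR file.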
-
getChemicalSpaceDWAR
public java.lang.String getChemicalSpaceDWAR(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String dataField)
Generates the DWAR file of the Chemical Space for a specific data set and a specific 'key'
Parameters:
  datasetName - Short name of the data set
  idCode - idCode of the seed molecule
  keys - Array of 'keys' idCodes (constant part of the molecule)
  dataField - Name of the data field for which data should be returned. Can be null
Returns:
  a String containing the content of the whole DWAR file
-
getTransformationsSize
public int getTransformationsSize(java.lang.String datasetName, java.lang.String value1, int minAtoms, int maxAtoms)
Gets the number of transformations for a specific data set, seed 'value' and deltas of heavy atoms
Parameters:
  datasetName - Short name of the data set
  value1 - idCode of the seed 'value' (variable part of the molecule)
  minAtoms - minimal delta number of heavy atoms (compared to the seed fragment)
  maxAtoms - maximal delta number of heavy atoms (compared to the seed fragment)
Returns:
  the number of transformations, -1 if the data set does not exist
-
getTransformationsTable
public java.util.List<java.lang.String[]> getTransformationsTable(java.lang.String datasetName, java.lang.String[] keys, java.lang.String value1, int minAtoms, int maxAtoms)
Returns a list of transformations
Parameters:
  datasetName - Short name of the data set
  keys - Array of 'keys' idCodes (constant part of the molecule)
  value1 - seeded 'value' idCode (variable part of the molecule)
  minAtoms - minimal delta number of heavy atoms (compared to the seed fragment)
  maxAtoms - maximal delta number of heavy atoms (compared to the seed fragment)
Returns:
  List of String arrays ([seed, target, number of examples, transformed molecule exists])
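Each row is an untyped String array, so the column positions carry the meaning. The sketch below names the indices and sorts rows by the number of examples, assuming that column parses as a decimal integer (the page does not specify the encoding of any column). TransformationRows is a hypothetical helper and the sample rows are placeholders; the last column is left as an opaque string because its format is not documented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper for rows shaped like [seed, target, number of examples, exists]. */
final class TransformationRows {
    static final int SEED = 0;
    static final int TARGET = 1;
    static final int EXAMPLES = 2;
    static final int EXISTS = 3;

    private TransformationRows() {}

    /** Assumes the examples column is an integer in decimal notation. */
    static int exampleCount(String[] row) {
        return Integer.parseInt(row[EXAMPLES].trim());
    }

    /** Returns a sorted copy, most frequently observed transformations first. */
    static List<String[]> sortedByExamplesDescending(List<String[]> rows) {
        List<String[]> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt(TransformationRows::exampleCount).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Placeholder rows standing in for the result of getTransformationsTable.
        List<String[]> rows = Arrays.asList(
                new String[] {"SEED_X", "TARGET_A", "3", "?"},
                new String[] {"SEED_X", "TARGET_B", "12", "?"},
                new String[] {"SEED_X", "TARGET_C", "7", "?"});
        for (String[] row : sortedByExamplesDescending(rows)) {
            System.out.println(row[TARGET] + " (" + row[EXAMPLES] + " examples)");
        }
    }
}
```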
-
getTransformationsJSON
public java.lang.String getTransformationsJSON(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String value1, int minAtoms, int maxAtoms, java.lang.String sortBy)
Generates the main JSON string for a seeded 'value'
Parameters:
  datasetName - Short name of the data set
  idCode - idCode of the whole seed molecule
  keys - Array of 'keys' idCodes (constant part of the molecule)
  value1 - seeded 'value' idCode (variable part of the molecule)
  minAtoms - minimal delta number of heavy atoms (compared to the seed fragment)
  maxAtoms - maximal delta number of heavy atoms (compared to the seed fragment)
  sortBy - SORT_BY_NUMBER_OF_EXAMPLES or SORT_BY_SIMILARITY
Returns:
  a JSON string with all data
-
getTransformationsDWAR
public java.lang.String getTransformationsDWAR(java.lang.String datasetName, java.lang.String idCode, java.lang.String[] keys, java.lang.String value1, int minAtoms, int maxAtoms, int environmentSize, java.util.List<java.lang.String> properties)
Generates the DWAR file of the Transformations for a specific data set
Parameters:
  datasetName - Short name of the data set
  idCode - idCode of the whole seed molecule
  keys - Array of 'keys' idCodes (constant part of the molecule)
  value1 - seeded 'value' idCode (variable part of the molecule)
  minAtoms - minimal delta number of heavy atoms (compared to the seed fragment)
  maxAtoms - maximal delta number of heavy atoms (compared to the seed fragment)
  environmentSize - Size of the local environment (0-5)
  properties - List of data fields for which data should be returned
Returns:
  a String containing the content of the whole DWAR file
-
getIDCodeFromMolName
public java.lang.String getIDCodeFromMolName(java.lang.String datasetName, java.lang.String molName)
Returns the idCode of a molecule from its name
Parameters:
  datasetName - Short name of the data set
  molName - Molecule name
Returns:
  idCode string
-
getDataFields
public java.util.List<java.lang.String> getDataFields(java.lang.String datasetName)
Returns a list of available (numerical) data fields for a specific data set
Parameters:
  datasetName - Short name of the data set
Returns:
  List of field names
-
getLongDataFields
public java.util.List<java.lang.String> getLongDataFields(java.lang.String datasetName)
Returns a list of long field names for each available numeric field
Parameters:
  datasetName - Short name of the data set
Returns:
  List of long field names (or short names if long names are not available)
-
getCategoryNames
public java.util.List<java.lang.String> getCategoryNames(java.lang.String datasetName)
Returns the list of categories for each available numeric field
Parameters:
  datasetName - Short name of the data set
Returns:
  List of categories, or 'other' if no categories are available
-
getPercentiles5
public java.util.List<java.lang.String> getPercentiles5(java.lang.String datasetName)
Returns the list of the 5% percentiles for each available numeric field
Parameters:
  datasetName - Short name of the data set
Returns:
  List of 5% percentiles
-
getPercentiles95
public java.util.List<java.lang.String> getPercentiles95(java.lang.String datasetName)
Returns the list of the 95% percentiles for each available numeric field
Parameters:
  datasetName - Short name of the data set
Returns:
  List of 95% percentiles
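getDataFields, getPercentiles5 and getPercentiles95 each return one entry per numeric field, which suggests the three lists can be combined by position. The sketch below does that under two assumptions that this page does not confirm: the lists share the same order and length, and every percentile string parses as a double. PercentileRanges is a hypothetical helper and the field names and values are placeholders.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that pairs each field name with its [5%, 95%] percentile range. */
final class PercentileRanges {
    private PercentileRanges() {}

    /** Assumes all three lists are parallel and the values are parseable as doubles. */
    static Map<String, double[]> byField(List<String> fields, List<String> p5, List<String> p95) {
        if (fields.size() != p5.size() || fields.size() != p95.size()) {
            throw new IllegalArgumentException("Lists must have the same length");
        }
        Map<String, double[]> ranges = new LinkedHashMap<>(); // keeps field order
        for (int i = 0; i < fields.size(); i++) {
            double low = Double.parseDouble(p5.get(i).trim());
            double high = Double.parseDouble(p95.get(i).trim());
            ranges.put(fields.get(i), new double[] {low, high});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Placeholders for getDataFields, getPercentiles5 and getPercentiles95 results.
        List<String> fields = Arrays.asList("fieldA", "fieldB");
        List<String> p5 = Arrays.asList("0.5", "10");
        List<String> p95 = Arrays.asList("4.5", "250");
        for (Map.Entry<String, double[]> e : byField(fields, p5, p95).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue()[0] + " to " + e.getValue()[1]);
        }
    }
}
```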
-
getDatasetInformations
public java.lang.String getDatasetInformations(java.util.ArrayList<java.lang.String> datasetNames)
Returns general information about the specified data sets
Parameters:
  datasetNames - Ordered list of data set names; required to ensure that the order is identical to the one in the settings file
Returns:
  Tab-delimited [short data set name, number of molecules, data generation date, one random molecule name]
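A minimal sketch of splitting one tab-delimited record into its four documented fields. It assumes a single record contains exactly four columns and that the molecule count is a decimal integer. How records for several data sets are separated within the returned String is not documented here, so the sketch handles one record only. DatasetInfo is a hypothetical class, the sample record is a placeholder, and the date is kept as a string because its format is not specified.

```java
/** Hypothetical value class for one record of the getDatasetInformations result. */
final class DatasetInfo {
    final String shortName;
    final int moleculeCount;
    final String generationDate; // kept as text; the date format is not documented
    final String randomMoleculeName;

    DatasetInfo(String shortName, int moleculeCount, String generationDate, String randomMoleculeName) {
        this.shortName = shortName;
        this.moleculeCount = moleculeCount;
        this.generationDate = generationDate;
        this.randomMoleculeName = randomMoleculeName;
    }

    /** Parses a single record; the four-column layout is assumed from the documentation. */
    static DatasetInfo parse(String record) {
        String[] parts = record.split("\t", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 tab-delimited fields, got " + parts.length);
        }
        return new DatasetInfo(parts[0], Integer.parseInt(parts[1].trim()), parts[2], parts[3]);
    }

    public static void main(String[] args) {
        // Placeholder record, not real output.
        DatasetInfo info = parse("demo\t1500\tsome-date\tmol-42");
        System.out.println(info.shortName + ": " + info.moleculeCount + " molecules");
    }
}
```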