Toolkits#

app.modules.toolkits.rdkit_wrapper module#

app.modules.toolkits.rdkit_wrapper.check_RO5_violations(molecule)[source]#

Check the molecule for violations of Lipinski’s Rule of Five.

Return type:

int

Args:

molecule (Chem.Mol): RDKit molecule object.

Returns:

int: Number of Lipinski Rule violations.
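As an illustration of what is being counted, the sketch below tallies Rule of Five violations from precomputed descriptor values. The name `count_ro5_violations` and its parameters are hypothetical, and the thresholds are the commonly cited Lipinski limits (MW <= 500, LogP <= 5, HBD <= 5, HBA <= 10). The wrapper itself derives these descriptors from the RDKit molecule and may treat boundary values differently.

```python
def count_ro5_violations(mol_wt: float, logp: float, hbd: int, hba: int) -> int:
    """Count how many of the four Lipinski limits are exceeded."""
    checks = [
        mol_wt > 500,  # molecular weight above 500
        logp > 5,      # partition coefficient above 5
        hbd > 5,       # more than 5 hydrogen bond donors
        hba > 10,      # more than 10 hydrogen bond acceptors
    ]
    # True counts as 1, so the sum is the number of violated limits.
    return sum(checks)


# Illustrative descriptor values, not computed from real structures.
small = count_ro5_violations(mol_wt=350.4, logp=2.1, hbd=2, hba=5)
large = count_ro5_violations(mol_wt=812.0, logp=6.3, hbd=7, hba=12)
print(small, large)
```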

app.modules.toolkits.rdkit_wrapper.get_2d_mol(molecule)[source]#

Generate a 2D Mol block representation of the given molecule.

Return type:

str

Args:

molecule (Chem.Mol): RDKit molecule object.

Returns:

str: 2D Mol block representation. If an error occurs during SMILES parsing, an error message is returned.

app.modules.toolkits.rdkit_wrapper.get_3d_conformers(molecule, depict=True)[source]#

Generate 3D coordinates for the given RDKit molecule.

Return type:

Mol

Args:

molecule (Chem.Mol): RDKit molecule object.

depict (bool, optional): If True, returns the molecule’s 3D structure in MolBlock format. If False, returns the 3D molecule without hydrogen atoms.

Returns:

str or rdkit.Chem.rdchem.Mol: If depict is True, returns the 3D structure in MolBlock format. Otherwise, returns an RDKit Mol object.

app.modules.toolkits.rdkit_wrapper.get_GhoseFilter(molecule)[source]#

Determine if a molecule satisfies Ghose’s filter criteria.

Ghose’s filter is a set of criteria for drug-like molecules. This function checks if a given molecule meets the criteria defined by Ghose.

Parameters:

molecule (any): A molecule represented as an RDKit Mol object.

Returns:

bool: True if the molecule meets Ghose’s criteria, False otherwise.

Ghose’s criteria:

  • Molecular Weight (MW) should be between 160 and 480.

  • LogP (Partition Coefficient) should be between 0.4 and 5.6.

  • Number of Atoms (NoAtoms) should be between 20 and 70.

  • Molar Refractivity (MolarRefractivity) should be between 40 and 130.

Return type:

bool
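The four ranges above can be expressed as a small lookup table. This is a minimal sketch that assumes the descriptor values are already available in a dictionary; the names `GHOSE_RANGES` and `passes_ghose` and the dictionary keys are hypothetical, and treating both ends of each range as inclusive is an assumption.

```python
# Allowed (low, high) range for each descriptor, taken from the criteria above.
GHOSE_RANGES = {
    "mol_wt": (160, 480),
    "logp": (0.4, 5.6),
    "atom_count": (20, 70),
    "molar_refractivity": (40, 130),
}


def passes_ghose(descriptors: dict) -> bool:
    """Return True if every descriptor lies inside its range (bounds inclusive)."""
    return all(
        low <= descriptors[name] <= high
        for name, (low, high) in GHOSE_RANGES.items()
    )


# Illustrative values only.
example = {"mol_wt": 180.2, "logp": 1.2, "atom_count": 21, "molar_refractivity": 44.5}
print(passes_ghose(example))
```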

app.modules.toolkits.rdkit_wrapper.get_PAINS(molecule)[source]#

Check if a molecule contains a PAINS (Pan Assay INterference compoundS) substructure.

Parameters:

molecule (any): A molecule represented as an RDKit Mol object.

Returns:

Union[bool, Tuple[str, str]]: A tuple with the PAINS family and its description if a PAINS substructure is detected in the molecule. Otherwise, False.

This function uses the RDKit library to check if the given molecule contains any PAINS substructure. PAINS are known substructures that may interfere with various biological assays.

Return type:

Union[bool, Tuple[str, str]]

app.modules.toolkits.rdkit_wrapper.get_REOSFilter(molecule)[source]#

Determine if a molecule passes the REOS (Rapid Elimination Of Swill) filter.

The REOS filter is a set of criteria that a molecule must meet to be considered a viable drug-like compound. This function takes a molecule as input and checks its properties against the following criteria:

  • Molecular Weight (MW): Must be in the range [200, 500].

  • LogP (Partition Coefficient): Must be in the range [-5, 5].

  • Hydrogen Bond Donors (HBD): Must be in the range [0, 5].

  • Hydrogen Bond Acceptors (HBA): Must be in the range [0, 10].

  • Formal Charge: Must be in the range [-2, 2].

  • Number of Rotatable Bonds: Must be in the range [0, 8].

  • Number of Heavy Atoms (non-hydrogen atoms): Must be in the range [15, 50].

Parameters:

molecule (any): A molecule represented as an RDKit Mol object.

Returns:

bool: True if the molecule passes the REOS filter, False otherwise.
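When a molecule fails the filter it is often useful to know which criterion was responsible. The sketch below, which is not the wrapper's implementation, encodes the seven ranges listed above and reports the failing ones. The names `REOS_RANGES`, `reos_failures`, and `passes_reos`, and the dictionary keys, are hypothetical; descriptor values are assumed to be precomputed.

```python
# Inclusive (low, high) ranges copied from the REOS criteria listed above.
REOS_RANGES = {
    "mol_wt": (200, 500),
    "logp": (-5, 5),
    "hbd": (0, 5),
    "hba": (0, 10),
    "formal_charge": (-2, 2),
    "rotatable_bonds": (0, 8),
    "heavy_atoms": (15, 50),
}


def reos_failures(descriptors: dict) -> list:
    """Return the names of all descriptors that fall outside their range."""
    return [
        name
        for name, (low, high) in REOS_RANGES.items()
        if not low <= descriptors[name] <= high
    ]


def passes_reos(descriptors: dict) -> bool:
    """A molecule passes only if no criterion fails."""
    return not reos_failures(descriptors)


# Illustrative values only: too light and too flexible.
candidate = {
    "mol_wt": 150.0, "logp": 1.0, "hbd": 1, "hba": 3,
    "formal_charge": 0, "rotatable_bonds": 12, "heavy_atoms": 20,
}
print(reos_failures(candidate))
```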

app.modules.toolkits.rdkit_wrapper.get_RuleofThree(molecule)[source]#

Check if a molecule meets the Rule of Three criteria.

The Rule of Three is a guideline for drug-likeness in chemical compounds. It suggests that a molecule is more likely to be a good drug candidate if it meets the following criteria:

  1. Molecular Weight (MW) <= 300

  2. LogP (partition coefficient) <= 3

  3. Number of Hydrogen Bond Donors (HBD) <= 3

  4. Number of Hydrogen Bond Acceptors (HBA) <= 3

  5. Number of Rotatable Bonds <= 3

Return type:

bool

Parameters:

molecule (any): A molecule represented as an RDKit Mol object.

Returns:

bool: True if the molecule meets the Rule of Three criteria, False otherwise.
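Unlike the range-based filters, the Rule of Three uses upper limits only. A minimal sketch of the five checks, assuming precomputed descriptor values; `FragmentDescriptors` and `passes_rule_of_three` are hypothetical names used for illustration.

```python
from typing import NamedTuple


class FragmentDescriptors(NamedTuple):
    """Precomputed values for one molecule (hypothetical container)."""
    mol_wt: float
    logp: float
    hbd: int
    hba: int
    rotatable_bonds: int


def passes_rule_of_three(d: FragmentDescriptors) -> bool:
    """Return True only if every descriptor is at or below its limit."""
    return (
        d.mol_wt <= 300
        and d.logp <= 3
        and d.hbd <= 3
        and d.hba <= 3
        and d.rotatable_bonds <= 3
    )


# Illustrative values only.
fragment = FragmentDescriptors(mol_wt=151.2, logp=0.9, hbd=2, hba=2, rotatable_bonds=1)
print(passes_rule_of_three(fragment))
```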

app.modules.toolkits.rdkit_wrapper.get_VeberFilter(molecule)[source]#

Apply the Veber filter to evaluate the drug-likeness of a molecule.

The Veber filter assesses drug-likeness based on two criteria: the number of rotatable bonds and the topological polar surface area (TPSA). A molecule is considered drug-like if it has 10 or fewer rotatable bonds and a TPSA of 140 Å² or less.

Return type:

bool

Parameters:

molecule (any): A molecule represented as an RDKit Mol object.

Returns:

bool: True if the molecule passes the Veber filter criteria, indicating drug-likeness; False otherwise.

Note: The function relies on RDKit functions to calculate the number of rotatable bonds and TPSA, and it returns a boolean value to indicate whether the input molecule passes the Veber filter criteria.

Reference: Veber, D. F., Johnson, S. R., Cheng, H. Y., Smith, B. R., Ward, K. W., & Kopple, K. D. (2002). Molecular properties that influence the oral bioavailability of drug candidates. Journal of Medicinal Chemistry, 45(12), 2615-2623. DOI: 10.1021/jm020017n
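The two Veber criteria reduce to a pair of comparisons. The sketch below assumes the rotatable bond count and TPSA have already been calculated; `passes_veber` and its parameters are hypothetical names, not part of the wrapper.

```python
def passes_veber(rotatable_bonds: int, tpsa: float) -> bool:
    """Return True if both Veber criteria hold (limits inclusive)."""
    return rotatable_bonds <= 10 and tpsa <= 140.0


# Illustrative values only: the second case fails on TPSA alone.
print(passes_veber(rotatable_bonds=4, tpsa=63.6))
print(passes_veber(rotatable_bonds=4, tpsa=155.0))
```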

app.modules.toolkits.rdkit_wrapper.get_ertl_functional_groups(molecule)[source]#

This function takes an organic molecule as input and uses the algorithm proposed by Peter Ertl to identify functional groups within the molecule. The identification is based on the analysis of chemical fragments present in the molecular structure.

Return type:

list

Parameters:

molecule (any): A molecule represented as an RDKit Mol object.

Returns:

list: A list of identified functional groups in the molecule.

References: - Ertl, Peter. “Implementation of an algorithm to identify functional groups in organic molecules.” Journal of Cheminformatics 9.1 (2017): 9. https://jcheminf.springeropen.com/articles/10.1186/s13321-017-0225-z

If no functional groups are found, the function returns a list with a single element: [{‘None’: ‘No fragments found’}]

app.modules.toolkits.rdkit_wrapper.get_properties(sdf_file)[source]#

Extracts properties from a single molecule contained in an SDF file.

This function uses the RDKit library to read an SDF (Structure-Data File) and extract properties from the first molecule in the file. It checks if the supplied SDF file contains a valid molecule and retrieves its properties as a dictionary.

Return type:

dict

Args:

sdf_file (str): The path to the SDF file containing the molecule.

Returns:

Dict or None: A dictionary containing the properties of the molecule. If the SDF file contains a valid molecule, the dictionary will have property names as keys and property values as values. If no valid molecule is found, or if there are no properties associated with the molecule, None is returned.

Raises:

ValueError: If the SDF file is not found or cannot be read.
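To show what "properties" means here, the following sketch parses the data items of the first record directly from SDF text using only the standard library. It is not how the wrapper works (the wrapper takes a file path and reads it with RDKit); `read_first_record_properties` is a hypothetical helper, and it assumes a V2000-style record in which data items follow the `M  END` line.

```python
from typing import Dict, Optional


def read_first_record_properties(sdf_text: str) -> Optional[Dict[str, str]]:
    """Return the data items of the first SDF record, or None if there are none."""
    # Records are separated by a "$$$$" line; keep only the first one.
    first_record = sdf_text.split("$$$$", 1)[0]
    # Data items come after the end of the connection table.
    _, marker, data_block = first_record.partition("M  END")
    if not marker:
        return None

    properties: Dict[str, str] = {}
    current_name = None
    value_lines = []
    for line in data_block.splitlines():
        if line.startswith(">"):
            # Data header such as "> <NAME>": the field name sits between < and >.
            start, end = line.find("<"), line.rfind(">")
            current_name = line[start + 1:end] if 0 <= start < end else None
            value_lines = []
        elif current_name is not None:
            if line.strip() == "":
                # A blank line terminates the value of the current data item.
                properties[current_name] = "\n".join(value_lines)
                current_name = None
            else:
                value_lines.append(line)
    if current_name is not None:
        properties[current_name] = "\n".join(value_lines)
    return properties or None


# Hand-written record with an empty atom block, for illustration only.
SAMPLE = """example
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <NAME>
ethanol

> <SOURCE>
hand-written

$$$$
"""
print(read_first_record_properties(SAMPLE))
```

As in the documented behaviour, the result is a dictionary keyed by property name, and None when the record carries no properties.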

app.modules.toolkits.rdkit_wrapper.get_rdkit_CXSMILES(molecule)[source]#

Generate CXSMILES representation with coordinates from a given molecule.

Return type:

str

Args:

molecule (Chem.Mol): RDKit molecule object.

Returns:

str: CXSMILES representation with coordinates. If an error occurs during SMILES parsing, an error message is returned.

async app.modules.toolkits.rdkit_wrapper.get_rdkit_HOSE_codes(molecule, noOfSpheres)[source]#

Calculate and retrieve RDKit HOSE codes for a given molecule.

This function takes an RDKit molecule as input and returns the calculated HOSE codes.

Return type:

List[str]

Args:

molecule (Chem.Mol): RDKit molecule object.

noOfSpheres (int): Number of spheres for which to generate HOSE codes.

Returns:

List[str]: List of HOSE codes generated for each atom.

Raises:

ValueError: If the input SMILES string is empty or contains whitespace.

app.modules.toolkits.rdkit_wrapper.get_rdkit_descriptors(molecule)[source]#

Calculate a selected set of molecular descriptors for the input molecule.

Return type:

Union[tuple, str]

Args:

molecule (Chem.Mol): RDKit molecule object.

Returns:

dict: Dictionary of calculated molecular descriptors. If an error occurs during SMILES parsing, an error message is returned.

app.modules.toolkits.rdkit_wrapper.get_sas_score(molecule)[source]#

Calculate the Synthetic Accessibility Score (SAS) for a given molecule.

The Synthetic Accessibility Score is a measure of how easy or difficult it is to synthesize a given molecule. A higher score indicates a molecule that is more challenging to synthesize, while a lower score suggests a molecule that is easier to synthesize.

Return type:

float

Parameters:

molecule (Chem.Mol): An RDKit molecule object representing the chemical structure.

Returns:

float: The Synthetic Accessibility Score rounded to two decimal places.

Note:
  • The SAS is calculated using the sascorer.calculateScore() function from the RDKit Contrib library.

  • The SAS score can be used as a factor in drug design and compound optimization, with lower scores often indicating more drug-like and synthesizable molecules.

References:
  • Ertl, P., & Schuffenhauer, A. (2009). Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of Cheminformatics, 1(1), 8. DOI: 10.1186/1758-2946-1-8

  • RDKit Documentation: https://www.rdkit.org/docs/index.html

app.modules.toolkits.rdkit_wrapper.get_tanimoto_similarity_rdkit(mol1, mol2, fingerprinter='ECFP', radius=2, nBits=2048)[source]#

Calculate the Tanimoto similarity index between two molecular structures represented as RDKit Mol objects.

This function computes the Tanimoto similarity index, a measure of structural similarity, between two chemical compounds using various fingerprinting methods available in RDKit.

Return type:

Union[float, str]

Args:

mol1 (Chem.Mol): The RDKit Mol object representing the first molecule.

mol2 (Chem.Mol): The RDKit Mol object representing the second molecule.

fingerprinter (str, optional): The type of fingerprint to use. Defaults to “ECFP”.

radius (int, optional): The radius parameter for ECFP fingerprints. Ignored for other fingerprint types.

nBits (int, optional): The number of bits for fingerprint vectors. Ignored for MACCS keys.

Returns:

Union[float, str]: The Tanimoto similarity index between the two molecules if they are valid. If molecules are not valid, returns a string indicating an error.

Note:
  • Supported fingerprinter options: “ECFP”, “RDKit”, “Atompairs”, “MACCS”.

  • ECFP fingerprints are based on atom environments up to a specified radius.

  • RDKit and Atom Pair fingerprints are based on different molecular descriptors.

  • MACCS keys are a fixed-length binary fingerprint.
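Whatever fingerprint is chosen, the Tanimoto index itself is the number of bits set in both fingerprints divided by the number of bits set in either. The sketch below shows that calculation on plain Python sets of on-bit positions. The function name `tanimoto` is hypothetical, and returning 0.0 for two empty fingerprints is a convention chosen for this sketch; in the wrapper the fingerprints and the comparison come from RDKit.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto index of two fingerprints given as sets of on-bit positions."""
    if not bits_a and not bits_b:
        # Both fingerprints empty: the ratio is undefined, so use 0.0 here.
        return 0.0
    shared = len(bits_a & bits_b)
    # |A ∪ B| = |A| + |B| - |A ∩ B|
    return shared / (len(bits_a) + len(bits_b) - shared)


# Illustrative bit positions only: 2 shared bits out of 5 distinct bits.
fp1 = {1, 5, 9, 12}
fp2 = {1, 5, 7}
print(tanimoto(fp1, fp2))
```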

app.modules.toolkits.rdkit_wrapper.has_stereochemistry(molecule)[source]#

Check if the given molecule contains stereochemistry information.

Return type:

bool

Args:

molecule (Chem.Mol): RDKit molecule object.

Returns:

bool: True if the molecule contains stereochemistry information, False otherwise.

app.modules.toolkits.rdkit_wrapper.is_valid_molecule(input_text)[source]#

Check whether the input text represents a valid molecule in SMILES or Molblock format.

Return type:

Union[str, bool]

Args:

input_text (str): SMILES string or Molblock.

Returns:

Union[str, bool]: “smiles” if the input is a valid SMILES, “mol” if the input is a valid Molblock, otherwise False.

app.modules.toolkits.cdk_wrapper module#

async app.modules.toolkits.cdk_wrapper.get_CDK_HOSE_codes(molecule, noOfSpheres, ringsize)[source]#

Generate HOSE codes for the given molecule using CDK.

Return type:

List[str]

Args:

molecule (IAtomContainer): molecule given by the user.

noOfSpheres (int): Number of spheres for HOSECode generation.

ringsize (bool): Whether to consider ring size for HOSECode generation.

Returns:

List[str]: List of CDK-generated HOSECodes.

app.modules.toolkits.cdk_wrapper.get_CDK_IAtomContainer(smiles)[source]#

This function takes the input SMILES and creates a CDK IAtomContainer.

Args:

smiles (str): SMILES string as input.

Returns:

mol (object): IAtomContainer with CDK.

app.modules.toolkits.cdk_wrapper.get_CDK_SDG(molecule)[source]#

This function takes the input IAtomContainer and creates a Structure Diagram Layout using the CDK.

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

mol object: mol object with CDK SDG.

app.modules.toolkits.cdk_wrapper.get_CDK_SDG_mol(molecule, V3000=False)[source]#

Returns a mol block string with Structure Diagram Layout for the given molecule.

Return type:

str

Args:

molecule (IAtomContainer): molecule given by the user.

V3000 (bool, optional): Option to return V3000 mol. Defaults to False.

Returns:

str: CDK Structure Diagram Layout mol block.

app.modules.toolkits.cdk_wrapper.get_CDK_descriptors(molecule)[source]#

Take an input molecule and generate a selected set of molecular descriptors using CDK, returned as a list.

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

list: A list of calculated descriptors.

Return type:

Union[tuple, str]

app.modules.toolkits.cdk_wrapper.get_CXSMILES(molecule)[source]#

Generate CXSMILES representation with 2D atom coordinates from the given molecule.

Return type:

str

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

str: CXSMILES representation with 2D atom coordinates.

app.modules.toolkits.cdk_wrapper.get_InChI(molecule, InChIKey=False)[source]#

Generate InChI or InChIKey from the given molecule.

Return type:

str

Args:

molecule (IAtomContainer): molecule given by the user.

InChIKey (bool): If True, return InChIKey instead of InChI. The default is False.

Returns:

str: InChI or InChIKey string.

app.modules.toolkits.cdk_wrapper.get_aromatic_ring_count(molecule)[source]#

Calculate the number of aromatic rings present in a given molecule.

Return type:

int

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

int: The number of aromatic rings present in the molecule.

app.modules.toolkits.cdk_wrapper.get_canonical_SMILES(molecule)[source]#

Generate the canonical SMILES representation of the given molecule.

Return type:

str

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

str: Canonical SMILES representation.

app.modules.toolkits.cdk_wrapper.get_cip_annotation(molecule)[source]#

Return the CIP (Cahn–Ingold–Prelog) annotations using the CDK CIP toolkit.

This function takes a molecule as input and returns a CIP-annotated molecule block using the CDK CIP toolkit.

Return type:

str

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

str: A CIP annotated molecule block.

app.modules.toolkits.cdk_wrapper.get_murko_framework(molecule)[source]#

This function takes the molecule given by the user and returns the Murcko framework.

Return type:

str

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

smiles (string): Murcko framework as SMILES.

app.modules.toolkits.cdk_wrapper.get_smiles_opsin(input_text)[source]#

Convert IUPAC chemical name to SMILES notation using OPSIN.

Parameters:

input_text (str): The IUPAC chemical name to be converted.

Returns:

str: The SMILES notation corresponding to the given IUPAC name.

Raises:

Exception: If the IUPAC name is not valid or if there are issues in the conversion process. The exception message will guide the user to check the data again.

Return type:

str

app.modules.toolkits.cdk_wrapper.get_tanimoto_similarity_CDK(mol1, mol2, fingerprinter='PubChem', ECFP=6)[source]#

Calculate the Tanimoto similarity between two molecules using PubChem/CircularFingerprints in CDK.

Return type:

float

Args:

mol1 (IAtomContainer): First molecule given by the user.

mol2 (IAtomContainer): Second molecule given by the user.

fingerprinter (str, optional): The fingerprinter to use. Currently, only “PubChem/ECFP6” is supported. Defaults to “PubChem”.

Returns:

float: The Tanimoto similarity score between the two molecules.

Raises:

ValueError: If an unsupported fingerprinter is specified.

app.modules.toolkits.cdk_wrapper.get_tanimoto_similarity_ECFP_CDK(mol1, mol2, ECFP=2)[source]#

Calculate the Tanimoto similarity index between two molecules using CircularFingerprinter fingerprints.

https://cdk.github.io/cdk/2.8/docs/api/org/openscience/cdk/fingerprint/CircularFingerprinter.html

Return type:

str

Args:

mol1 (IAtomContainer): First molecule given by the user.

mol2 (IAtomContainer): Second molecule given by the user.

Returns:

str: The Tanimoto similarity as a string with 5 decimal places, or an error message.
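Because the result is a string that may hold either a number or an error message, callers typically convert it before comparing scores. A minimal sketch of both directions, assuming that any non-numeric string is an error message; `format_similarity` and `parse_similarity` are hypothetical helpers, not part of the wrapper.

```python
from typing import Optional


def format_similarity(value: float) -> str:
    """Render a similarity value as a string with 5 decimal places."""
    return f"{value:.5f}"


def parse_similarity(result: str) -> Optional[float]:
    """Convert a returned string to float, or None if it is not numeric."""
    try:
        return float(result)
    except ValueError:
        # Assumed here to be an error message rather than a score.
        return None


print(format_similarity(2 / 3))
print(parse_similarity("0.66667"))
print(parse_similarity("Check the input molecules"))
```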

app.modules.toolkits.cdk_wrapper.get_tanimoto_similarity_PubChem_CDK(mol1, mol2)[source]#

Calculate the Tanimoto similarity index between two molecules using PubChem fingerprints.

Return type:

str

Args:

mol1 (IAtomContainer): First molecule given by the user.

mol2 (IAtomContainer): Second molecule given by the user.

Returns:

str: The Tanimoto similarity as a string with 5 decimal places, or an error message.

app.modules.toolkits.cdk_wrapper.get_vander_waals_volume(molecule)[source]#

Calculate the Van der Waals volume of a given molecule.

Return type:

float

Args:

molecule (IAtomContainer): molecule given by the user.

Returns:

float: The Van der Waals volume of the molecule.

app.modules.toolkits.openbabel_wrapper module#

app.modules.toolkits.openbabel_wrapper.get_ob_InChI(smiles, InChIKey=False)[source]#

Convert a SMILES string to InChI.

Return type:

str

Args:

smiles (str): Input SMILES string.

InChIKey (bool, optional): Whether to return InChIKey. Defaults to False.

Returns:

str: InChI string or InChIKey string if InChIKey is True.

app.modules.toolkits.openbabel_wrapper.get_ob_canonical_SMILES(smiles)[source]#

Convert a SMILES string to Canonical SMILES.

Return type:

str

Args:

smiles (str): Input SMILES string.

Returns:

str: Canonical SMILES string.

app.modules.toolkits.openbabel_wrapper.get_ob_mol(smiles, threeD=False, depict=False)[source]#

Convert a SMILES string to a 2D/3D mol block.

Return type:

str

Args:

smiles (str): Input SMILES string.

threeD (bool, optional): Generate 3D structure. Defaults to False.

depict (bool, optional): Generate 3D structure for depiction. Defaults to False.

Returns:

str: Mol block (2D/3D).