.. _pyrosetta_cyp51_docking: Docking ligands to P450 enzymes with PyRosetta =================================================== This protocol has been adpted from `Rosetta Ligand Docking tutorial`_, `Ligand Dock Application`_ and `08.01 PyRosetta notebook`_ and utilised during our study on docking various inhibitors to P450 orthologs. Comparative modelling for the twelve CYP51 orthologus sequences has been completed as described in :ref:`comparative modelling with Modeller` section. The protocol has been utilised in our large scale CYP51 docking study: O.O. Akapo, J.M. Macnar, J.D. Kryƛ, P.R. Syed, K. Syed and D. Gront, *In silico structural modeling and analysis of interactions of Tremellomycetes cytochrome P450 monooxygenases CYP51s with substrates and azoles*, submitted to *Scientific Reports* The following Rosetta version was employed: .. code-block:: bash INFO:pyrosetta.rosetta:PyRosetta-4 2019 PyRosetta4.conda.linux.CentOS.python36.Release2019.44+release.5aed75f2e796a33ab71515b6c1daa321eb2294a2 2019-10-29T08:37:43 retrieved from: http://www.pyrosetta.org .. contents:: :local: Preparation of docking ~~~~~~~~~~~~~~~~~~~~~~~~~ Protein preparation ^^^^^^^^^^^^^^^^^^^^^^ We used Rosetta `clean-pdb.py` script (distributed with the software package) to prepare a protein chain for further steps: .. code-block:: bash python2.7 $ROSETTA/tools/protein_tools/scripts/clean_pdb.py protein.pdb chain_letter where ``chain_letter`` is a single-letter ID of the chain (typically 'a') and ``$ROSETTA`` shell variable provides the locations of your Rosetta package installation. As a result you will get a ``protein_chain_letter.pdb`` and ``protein_chain_letter.fasta`` files. Cofactor preparation ^^^^^^^^^^^^^^^^^^^^^^^ To prepare HEM `.params` file you will need a ``HEM.sdf`` file that have to be prepared from a protein-HEM complex to have the correct coordinates and have added hydrogens. You can easily do that using for example `PyMOL`_ or any other software you like. .. code-block:: bash python2.7 $ROSETTA/main/source/scripts/python/public/molfile_to_params.py \ -n HEM -p HEM --chain=B HEM.sdf As a result, you will get a ``HEM_0001.pdb`` and ``HEM.params`` files in which HEM will be present in chain B. Ligand preparation ^^^^^^^^^^^^^^^^^^^^^ Ligands subjected for docking also needs a `.params` file but this time the conformers will be required as well. To get them use `bcl`_ following the protocole used in Meiler Lab. The input `ligand.sdf` file can be downloaded from `PubChem`_, `PDB`_ or prepared using `PyMOL`_ or `openbabel`_ from other type files. First, check if the molecule has reasonable bonds/geometries/clashing/etc. Ligands that pass are the *matched* and those that fail are the *unmatched*. .. code-block:: bash bcl.exe molecule:Filter -add_h -neutralize -defined_atom_types \ -3d -input_filenames ligand.sdf \ -output_matched ligand.CLEANED.sdf \ -output_unmatched ligand.UNCLEANED.sdf -message_level Debug bcl.exe molecule:ConformerGenerator -rotamer_library cod \ -top_models 100 -ensemble_filenames ligand.CLEANED.sdf \ -conformers_single_file ligand.CLEANED.conf.sdf \ -conformation_comparer 'Dihedral(method=Max)' 30 -max_iterations 1000 Then `.params` file can be prepared as follow: .. code-block:: bash python2.7 $ROSETTA/main/source/scripts/python/public/molfile_to_params.py \ -n LIG -p LIG --chain=X --conformers-in-one-file ligand.CLEANED.conf.sdf Output should contain `LIG.pdb`, `LIG_conformers.pdb` and `LIG.params` files. Check the `.params` if `PDB_ROTAMERS` line is present at the end with the correct conformers filename. Putting protein, cofactor and ligand together ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ All those components must be in the same `.pdb` file. First, add HEM and minimize the side chians: .. code-block:: bash cat protein_chain_letter.pdb HEM_0001.pdb > protein_HEM.pdb $ROSETTA/main/source/bin/ligand_rpkmin.linuxgccrelease \ -database $ROSETTA/main/database \ -in:file:extra_res_fa HEM.params \ -ex1 -ex2 -no_optH false -flip_HNQ \ -docking:ligand:old_estat \ -docking:ligand:soft_rep \ -nstruct 10 \ -s protein_HEM.pdb Then the structure with the lowest energy should be choosen for the next steps with the following command: .. code-block:: bash k=` awk '/^pose / {print FILENAME, $NF}' protein_HEM_*.pdb | sort -nk2 |head -1|awk '{print $1}'` && cp $k protein_HEM_input.pdb which simply sorts file names by the respective scores and selects the file corresponding to the lowest energy value. The last step, which adds the ligand `.pdb` at the end of the input file, is done by: .. code-block:: bash cat protein_HEM_input.pdb LIG.pdb > protein_HEM_LIG.pdb Docking Procedures ~~~~~~~~~~~~~~~~~~~~~ To perform docking the following Python script have to be run: .. code-block:: python """ This script was prepared on the canva of https://github.com/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/08.01-Ligand-Docking-XMLObjects.ipynb Before you use this script remove "outputs" folder from your curent directory! """ # Setup import sys import logging logging.basicConfig(level=logging.INFO) import matplotlib import pandas as pd import os import pyrosetta from pyrosetta import * from pyrosetta.rosetta.protocols.rosetta_scripts import * import pyrosetta.distributed.viewer as viewer from pyrosetta.rosetta.protocols.ligand_docking import * # Declare path to input files pdb_filename = sys.argv[1] #"protein_HEM_LIG.pdb" ligand_params = sys.argv[2] # "LIG.params" ligand_conformers = sys.argv[3] # "LIG_conformers.pdb" ligand_filename = sys.argv[4] # "LIG.pdb" HEM_params = "HEM.params" # assuming that it is present in the current directory # Flags normaly used in options.short flags = f""" -extra_res_fa {HEM_params} {ligand_params} -s '{pdb_filename} {ligand_filename}' -mute all -ex1 -ex2 -no_optH false -flip_HNQ true -ignore_ligand_chi true -overwrite """ pyrosetta.distributed.init(flags) # Uncomment if you would like to use options.short as input parameter file #pyrosetta.init('-no_fconfig @options2.short') # Create a pose object - we need it for further docking pose = pyrosetta.io.pose_from_file(filename=pdb_filename) # xml file # information about coordinates of a center of box # Complex filename # Ligand to SlideTogether (I'm not sure if we still need it) xml_ = """ """.format(pdb_filename) #print(xml_) xml = pyrosetta.rosetta.protocols.rosetta_scripts.XmlObjects.create_from_string(xml_).get_mover("ParsedProtocol") # Produce 5 local ligand docking trajectories in "outputs" directory, using `PyJobDistributor`: scorefxn = pyrosetta.create_score_function("ligand_soft_rep") if not os.getenv("DEBUG"): working_dir = os.getcwd() output_dir = "output" if not os.path.exists(output_dir): os.mkdir(output_dir) os.chdir(output_dir) jd = pyrosetta.toolbox.py_jobdistributor.PyJobDistributor(pdb_name=pdb_filename, nstruct=5, scorefxn=scorefxn) # Output name, number of structures, scorefunction used jd.native_pose = pose #df = pd.DataFrame() while not jd.job_complete: test_pose = pose.clone() xml.apply(test_pose) #test_df = pd.DataFrame.from_records(dict(test_pose.scores), index=[jd.current_name]) #df = df.append(test_df) jd.output_decoy(test_pose) os.chdir(working_dir) Results clustering using hierarchical agglomerative algorithm ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Docking calculations are followed by results analysis, which starts from hierarchical clustering of the obtained poses. This step has been done according to :ref:`this protocol` .. rubric:: References .. target-notes:: .. _`Rosetta Ligand Docking tutorial` : https://www.rosettacommons.org/demos/latest/tutorials/ligand_docking/ligand_docking_tutorial/ .. _`Ligand Dock Application` : https://www.rosettacommons.org/manuals/archive/rosetta3.1_user_guide/app_ligand_docking.html .. _`08.01 PyRosetta notebook` : https://github.com/RosettaCommons/PyRosetta.notebooks/blob/master/notebooks/08.01-Ligand-Docking-XMLObjects.ipynb .. _`PDB`: http://www.rcsb.org/ .. _`MODELLER`: https://salilab.org/modeller/ .. _`PyMOL`: https://pymol.org/2/ .. _`PubChem` : https://pubchem.ncbi.nlm.nih.gov/ .. _`openbabel`: https://openbabel.org/docs/dev/Command-line_tools/babel.html .. _`bcl` : http://www.meilerlab.org/index.php/bclcommons/show/b_apps_id/1 .. _`BioShell 3.0` : https://bioshell.readthedocs.io/en/latest/