# Cesg/gw/11/2005
#
# THIS IS AN NMR DATA_TEMPLATE FOR DOING A SEMI-AUTOMATED PDB
# DEPOSITION. ADDITIONAL COMMENTS HAVE BEEN EMBEDDED IN THIS FILE
# TO HELP CESG STRUCTURE SOLVERS IN FILLING OUT THIS FORM. THESE
# COMMENTS ARE MARKED WITH A POUND SIGN (#). ONLY DOUBLE QUOTED
# INFORMATION BETWEEN THE BRA-KET SYMBOLS ( < > ) IS PARSED BY THE
# AUTODEPOSITION SOFTWARE - THIS IS WHERE YOU WILL EDIT-IN YOUR
# INFORMATION. NOTE THAT THE ARCHIVED .mmcif FILE MAINTAINS UPPER
# AND LOWER CASE, SO YOU SHOULD USE BOTH CASES IF IT ADDS CLARITY
# TO THE TEXT, EVEN THOUGH THE PDB FILE WILL ONLY CONTAIN UPPER CASE.
#
# IF YOU HAVE ANY QUESTIONS OR SUGGESTIONS PLEASE CONTACT ME:
#
# gary@biochem.wisc.edu
#
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
      AUTOMATED NMR STRUCTURE DEPOSITION TO PROTEIN DATA BANK
      (April 8, 2004, last modified June 6, 2005)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

GUIDELINES FOR USING THIS FILE!

1 Only the strings in brackets will be parsed for evaluation.
  Therefore, never write a left or right bracket sign in the file!
2 All values (strings or numbers) in brackets must be double-quoted!
  Therefore, NEVER include a double quotation mark (") in your string values.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

PLEASE CHECK CATEGORY 1 before proceeding any further and make sure
sequences are complete and correct. You may choose to fill in the
remaining CATEGORIES either here or later in ADIT.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~START INPUT DATA BELOW~~~~~~~~~~~~~~~~~~~~~~~

================CATEGORY 1: Molecular Entity Sequence===================
Enter one letter code sequence for each molecular entity
--------------------------------------------------------------------------
SOME DEFINITIONS

An ENTITY is defined as any unique molecule present in the asymmetric
unit. Each unique biological polymer (protein or nucleic acid) in the
structure is considered an entity.
Thus, if there are five copies of a single protein in the asymmetric
unit, the molecular entity is still only one. Water and non-polymers
like ions, ligands and sugars are also entities. Here we only consider
the sequences of polymeric entities (protein or nucleic acid).

GUIDELINES FOR COMPLETING THIS CATEGORY

* In a PDB or mmCIF format file, all residues of a single polymeric
  entity should have one chain ID. Multiple copies of the same entity
  should each be assigned a unique chain ID. The multiple chain IDs
  should be separated by commas as 'A,B,C,...'. If incorrect chain IDs
  are used, the entity groups extracted by this program will not be
  correct. To avoid this, make the necessary corrections in the PDB or
  mmCIF file used to generate the data_template file and regenerate the
  data_template.text file. Alternatively, edit the extracted sequence in
  this file to correctly represent the sequence and chain IDs of each
  polymeric entity.

* In addition to chain IDs, this program uses distance geometry to
  assess whether there are any breaks in the polymer sequence. These
  breaks may occur due to missing residues (not included in the model
  due to missing electron density) or due to poor geometry. Four
  question marks '????' are used to denote these chain breaks. Replace
  these question marks with the sequence of residues missing from the
  coordinates. Also add any residues missing from the N- and/or
  C-termini here.

* If there are non-standard residues in the coordinates, this program
  lists them according to the three letter code used in the coordinate
  file as (ABC). If all the residues in your sequence are non-standard,
  check and edit the sequence manually to represent it correctly in
  this file.

* If any residue was modeled as Ala or Gly due to lack of side-chain
  density, the sequence extracted here will represent it as A or G
  respectively. Correct this to the original sequence that was present
  in the crystal.
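# The checks described in the guidelines above can be sketched as a small
# Python script. This is only an illustration and is not part of the
# autodeposition software: the helper name check_sequence and the STANDARD
# set are hypothetical, and the sketch assumes a protein entity given as
# one-letter codes. It is written without bracket signs or double quotes
# so that it respects the guidelines for this file.

```python
import re

# Assumption: protein entities only, using the 20 standard one-letter codes.
STANDARD = set('ACDEFGHIKLMNPQRSTVWY')

def check_sequence(sequence, chain_ids):
    """Return a list of problems found in one polymeric entity."""
    problems = []
    seq = ''.join(sequence.split())  # drop spaces and line breaks
    # Chain breaks are written as four question marks and must be replaced.
    if '????' in seq:
        problems.append('chain break marker ???? still present')
    # Non-standard residues appear as three-letter codes in parentheses.
    nonstandard = re.findall(r'\(([A-Z0-9]{3})\)', seq)
    if nonstandard:
        problems.append('non-standard residues: ' + ','.join(nonstandard))
    # Whatever remains should be plain one-letter amino acid codes.
    rest = re.sub(r'\([A-Z0-9]{3}\)', '', seq).replace('?', '')
    unknown = sorted(set(rest) - STANDARD)
    if unknown:
        problems.append('unexpected characters: ' + ''.join(unknown))
    # Chain IDs are comma-separated and each copy needs its own ID.
    ids = [cid.strip() for cid in chain_ids.split(',')]
    if any(not cid for cid in ids) or len(set(ids)) != len(ids):
        problems.append('chain IDs must be unique and comma-separated')
    return problems

if __name__ == '__main__':
    for line in check_sequence('GHHH????LELE(MSE)V', 'A,B'):
        print(line)
```

# An empty list means none of these particular leftovers were found; it
# does not prove that the sequence matches the actual sample.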
----------------------------------------------------------------------------
# GIVE THE FULL SEQUENCE FOR THE PROTEIN IN THE SAMPLE, THAT IS, EVEN IF THE
# MODEL IS INCOMPLETE, INCLUDE RESIDUES MISSING IN THE MODEL (AND TAGS OR
# MUTATIONS, IF THERE ARE ANY IN THE ACTUAL SOLUTION SAMPLE BEING STUDIED)
# EXAMPLE: GHHHHHHLELEVHIPSVGPEAEGPRQSPEKSHMVFRVEVLCSGRRHTVPRR
# YSEFHALHKRIKKLYKVPDFPSKRLPNWRTRGLEQRRQGLEAYIQGILYLNQEVPKELLEFLRLRHFPTDPKASNWG

< molecule_chain_id="A,B" >

# IF THERE IS MORE THAN ONE CHAIN WITH THE SAME AA SEQUENCE
# IN THE STRUCTURE, ADD THE CHAIN LABELS HERE, FOR EXAMPLE:
# FOR A HOMO-DIMER YOU WOULD USE: molecule_chain_id="A,B"

# "GO." + ORF_NUMBER_STRING, GET ORF NUMBER FROM SESAME
# IF YOU DON'T ALREADY KNOW IT. EXAMPLE: 'GO.12345'

# IF THERE IS MORE THAN ONE TYPE OF PEPTIDE OR NUCLEIC ACID CHAIN, USE MORE
# molecule_entity BLOCKS FIELDS HERE::::

< target_DB_id=" " > (if known)

# NOTE THAT CONTACT AUTHORS ARE USED BY THE PDB FOR INTERNAL PURPOSES
# ONLY AND THAT THESE NAMES ARE NOT PUBLICLY RELEASED. AUTHOR FIELDS
# ARE DEFINED ELSEWHERE. THE CONTACT AUTHORS SHOULD INCLUDE ONLY THE
# INDIVIDUALS INVOLVED WITH THE DEPOSITION.

================CATEGORY 2: Contact Authors=============================
Enter information about the contact authors.
Note: items marked by (e.g. ) are mandatory.
PI information should always be given.

# GIVE PI INFO, ONLY ONE ALLOWED AT THIS TIME, PDB SAID THEY COULD
# CHANGE ADIT TO ACCEPT MORE THAN ONE PI IF WE ASKED THEM TO, LET ME
# KNOW IF YOU THINK WE WILL EVER NEED MORE THAN ONE PI.

# PICK ONE BELOW, DELETE OTHER:

1. Information about the Principal investigator (PI) should be given.

   ( Dr./Prof./Mr./Mrs./Ms.)
   (e.g. John)
   (e.g. Rodgers)
   (e.g. investigator/responsible scientist)
   (e.g. academic/commercial/government/other)
   (e.g. name@host.domain.country)
   (e.g. Piscataway)
   (e.g. New Jersey)
   (e.g. 08864)
   (e.g. UNITED STATES)

#
#   ( Dr./Prof./Mr./Mrs./Ms.)
#   (e.g. John)
#   (e.g. Rodgers)
#
#   (e.g. investigator/responsible scientist)
#   (e.g. academic/commercial/government/other)
#   (e.g. name@host.domain.country)
#
#   (e.g. Piscataway)
#   (e.g. New Jersey)
#   (e.g. 08864)
#   (e.g. UNITED STATES)
#
#
# GIVE OTHER NON-PI CONTACT AUTHORS. IN ADDITION TO THE PI ABOVE,
# MEMBERS OF THIS LIST WILL RECEIVE COPIES OF ALL THE CORRESPONDENCE
# RELATED TO THIS STRUCTURE DEPOSITION.

# DELETE OR ADD TO BELOW AS APPROPRIATE:

2. Information about other contact authors

   (e.g. 2,3,4..)
   (e.g. 2,3,4..)
   ...(add more if needed)...

#
# THIS INFO SHOULD NOT CHANGE:
#
================CATEGORY 3: Structural Genomics=========================
If it is a structural genomics project, give the information

   Structure Initiative)

#
# THIS INFO SHOULD NOT CHANGE:
#
================CATEGORY 4: Release Status==============================
Enter Release Status for Coordinates, Constraints, Sequence
Status should be chosen from one of the following:
(release now, hold for publication, hold for 6 weeks,
 hold for 6 months, hold for 1 year)

#
# PICK AN APPROPRIATE TITLE FOR THE PDB FILE (NOT THE TITLE OF PLANNED
# PUBLICATION, THOUGH IT COULD BE.)
# EXAMPLE: "NMR STRUCTURE OF the sorting nexin-22 PX domain from Human Hs.157607"
#
================CATEGORY 5: Title=======================================
Enter a title for the structure

#
# I ASKED PDB ABOUT THIS NEW FIELD, HERE'S WHAT THEY (KYLE) TOLD ME:
# "_struct.ndb_details doesn't map to any remark in the PDB file. The annotator
# reads the information in the token and then decides where the information
# be placed in the file."
#
# I WOULD SUGGEST THAT WE MAKE USE OF THIS FIELD TO INCLUDE INFORMATION
# LIKE:
# " REMARK 210 REMARK: ALL TRIPLE-RESONANCE AND NOESY SPECTRA WERE ACQUIRED
#   USING A CRYOGENIC PROBE."
#
# THEN WHEN A HUMAN READS THIS, THEY WILL HOPEFULLY KNOW EXACTLY WHERE TO PUT IT.
#
#
#
# THIS IS THE LIST OF PEOPLE (OR GROUPS) WHO ARE AUTHORS OF THE PDB ENTRY.
# CENTER FOR EUKARYOTIC STRUCTURAL GENOMICS (CESG) IS A REQUIRED AUTHOR.
#
================CATEGORY 5B: Authors of Structure============================
Enter authors of the deposited structures

   (e.g. Surname, F.M.)
   ...add more if needed...

#
# GIVE AUTHORS OR LEAVE BLANK TO USE SAME LIST OF AUTHORS GIVEN UNDER
# CATEGORY 5B ABOVE. THIS INFORMATION MAY BE UPDATED AT A LATER TIME
# WHEN A PUBLICATION IS RELEASED.
#
================CATEGORY 6: Citation Authors============================
Enter author names for the publications associated with this deposition.
The primary citation is the article in which the deposited coordinates
were first reported. Other related citations may also be provided.

1. For the primary citation

   (e.g. Surname, F.M.)
   ...add more if needed...

2. For other related citations (if applicable)

   (e.g. 1, 2 ..)
   ...add more if needed...

...(add more other citations if needed)...

#
# LEAVE BLANK UNLESS YOU ALREADY HAVE A PUBLICATION OUT OR A TITLE
# OTHER THAN THE STRUCTURE TITLE CHOSEN. THIS INFORMATION MAY BE UPDATED
# AT A LATER TIME WHEN A PUBLICATION IS RELEASED.
#
================CATEGORY 7: Citation Article============================
Enter citation article (journal, title, year, volume, page)
If the citation has not yet been published, use 'To be published' for
the category 'journal_abbrev' and leave pages and volume blank.

1. For primary citation

   (e.g. to be published)

2. For other related citation (if applicable)

   (e.g. 1, 2, 3 ...)

...(add more citations if needed)...

================CATEGORY 8: Molecule Names==============================
Enter the name of the molecule for each entity
The name of molecule should be obtained from the appropriate sequence
database reference, if available. Otherwise the gene name or other
common name of the entity may be used.
e.g. HIV-1 integrase for protein
     RNA Hammerhead Ribozyme for RNA
The number of entities should be the same as in CATEGORY 1.

   (entity 1)
   (entity 2)
   ...(add more if needed)...
================CATEGORY 9: Molecule Details============================
Enter additional information about each entity, if known. (optional)
Additional information would include details such as fragment name
(if applicable), mutation, and E.C. number.

1. For entity 1

   (e.g. 1, 2, ...)
   (e.g. ligand binding domain, hairpin)
   (e.g. C280S)
   (if known: e.g. 2.7.7.7)

2. For entity 2

...(add more if needed)...

================CATEGORY 10: Genetically Manipulated Source==============
Enter data in the genetically manipulated source category
If the biomolecule has been genetically manipulated, describe its
source and expression system here.

1. For entity 1

   (e.g. 1, 2, ...)
   (e.g. Homo sapiens)
   (e.g. RPOD, ALKA...)
   (e.g. Escherichia coli)
   (e.g. BL21(DE3))
   (e.g. plasmid)
   (e.g. pET26)
   (any other relevant information)

#
# HERE'S AN EXAMPLE FROM A RECENT XRAY PIPELINE TARGET:
#
#   Manipulated_entity_id             = "1 "
#   Source_organism_scientific_name   = "Arabidopsis thaliana "
#   Source_organism_gene              = "At5g21482.1"
#   Expression_system_scientific_name = "Escherichia coli "
#   Expression_system_strain          = "B834 P(RARE2) "
#   Expression_system_vector_type     = "PLASMID "
#   Expression_system_plasmid_name    = "PVP 16 "
#   Manipulated_source_details        = " "
#
# HERE'S A CELL FREE EXAMPLE FROM A RECENT PIPELINE TARGET:
#
#   Manipulated_entity_id_1             = "1"
#   Source_organism_scientific_name_1   = "Homo sapiens"
#   Source_organism_gene_1              = "BC019655, SNX22_HUMAN, Hs.157607 "
#   Expression_system_scientific_name_1 = "wheat germ cell-free"
#   Expression_system_strain_1          = " "
#   Expression_system_vector_type_1     = " "
#   Expression_system_plasmid_name_1    = " "
#   Manipulated_source_details_1        = "in vitro expression"

2. For entity 2

...(add more if needed)...

#
# BELOW IS RARELY USED:
#
================CATEGORY 11: Natural Source=============================
Enter data in the natural source category (if applicable)
If the biomolecule was derived from a natural source, describe it here.

1. For entity 1

   (e.g. 1, 2, ...)
   (e.g. Homo sapiens)
   (e.g. DH5a , BMH 71-18)
   (e.g. organ, tissue, cell ..)

2. For entity 2

...(add more if needed)...

#
# BELOW IS RARELY USED (?):
#
================CATEGORY 12: Synthetic Source=============================
If the biomolecule was chemically synthesized rather than genetically
manipulated or taken from a natural source, describe its source here.

1. For entity 1

   (e.g. 1, 2, ...)
   (if known)

2. For entity 2

...(add more if needed)...

# DON'T BE SHY, PICK OUT SOME GOOD KEYWORDS.
# REQUIRED: "Structural Genomics, Protein Structure Initiative, PSI,
#            Center for Eukaryotic Structural Genomics"
# PLACE YOUR CHOSEN KEYWORDS IN FRONT OF REQUIRED ONES.

================CATEGORY 13: Keywords===================================
Enter a list of keywords that describe important features of the
deposited structure. For example, beta barrel, protein-DNA complex,
double helix, hydrolase, structural genomics etc.

#
# RECENT KEYWORDS EXAMPLE:
#
#   structure_keywords = "PX domain, sorting nexin 22,
#                         BC019655, SNX22_HUMAN, Hs.157607,
#                         Structural Genomics, Protein Structure Initiative, PSI,
#                         Center for Eukaryotic Structural Genomics"

================CATEGORY 14: Ensemble===================================
Enter data in category ensemble
Skip this section if only one average structure has been deposited.

   (e.g. 200)
   (e.g. 20)
   (e.g. 20 structures for lowest energy)

================CATEGORY 15: Representative Conformers==================
Enter data in category representative conformers
Normally, only one member of the ensemble is selected as a
representative structure.

   (e.g. 1,2..)
   (e.g. lowest energy, fewest violations)

#
# PLEASE GIVE DETAILS OF ALL COMPONENTS OF THE SAMPLE SOLUTION.
# ESTIMATE IONIC STRENGTH BASED ON THESE COMPONENTS.
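#
# One way to make this estimate is the usual ionic strength formula,
# I = 1/2 * sum(c_i * z_i^2). The Python sketch below is only an
# illustration: the function name ionic_strength_mM is hypothetical, and
# it assumes that each salt dissociates completely into one +1 and one -1
# ion. Buffer species, DTT and the protein itself are ignored. The salt
# concentrations are the ones from the OVERALL EXAMPLE in this template.

```python
def ionic_strength_mM(ions):
    """I = 1/2 * sum(c_i * z_i**2), with c_i in mM and z_i the ion charge."""
    return 0.5 * sum(conc * charge ** 2 for conc, charge in ions)

# Assumption: all three components behave as fully dissociated 1:1 salts.
salts_mM = {
    'NaCl': 100.0,
    'arginine hydrochloride': 50.0,
    'sodium glutamate': 50.0,
}

# Each 1:1 salt contributes one cation and one anion at its concentration.
ions = []
for conc in salts_mM.values():
    ions.append((conc, +1))
    ions.append((conc, -1))

print(ionic_strength_mM(ions), 'mM')  # prints 200.0 mM
```

# The OVERALL EXAMPLE lists 202 MM. The small remainder, for instance from
# charged buffer species, is not modeled in this sketch.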
#
# OVERALL EXAMPLE:
#
# REMARK 210  TEMPERATURE           (KELVIN) : 298
# REMARK 210  PH                             : 7.0
# REMARK 210  IONIC STRENGTH                 : 202 MM
# REMARK 210  PRESSURE                       : 1 ATM
# REMARK 210  SAMPLE CONTENTS                : 0.5 MM 13C,15N-LABELED
# REMARK 210                                   BC019655, 10 MM BIS-TRIS, 100
# REMARK 210                                   MM NACL, 10 MM DTT, 50 MM
# REMARK 210                                   ARGININE HYDROCHLORIDE, 50 MM
# REMARK 210                                   SODIUM GLUTAMATE, 90% H2O, 10%
# REMARK 210                                   D2O
# REMARK 210
# REMARK 210  NMR EXPERIMENTS CONDUCTED      : 1H,15N-HSQC, 1H,13C-HSQC,
# REMARK 210                                   HNCACB, CBCACONH, CCONH,
# REMARK 210                                   HCCONH, HBACONH, HNHA, HNCOCA,
# REMARK 210                                   HCCH-TOCSY, 13C-EDITED 1H,1H-
# REMARK 210                                   NOESY, 15N-EDITED 1H,1H-NOESY
# REMARK 210  SPECTROMETER FIELD STRENGTH    : 600 MHZ
# REMARK 210  SPECTROMETER MODEL             : INOVA
# REMARK 210  SPECTROMETER MANUFACTURER      : VARIAN

================CATEGORY 16: Sample Details=============================
Enter a description of each NMR sample, including the solvent system used.

1. for sample 1.

   (e.g. 1, 2.. )
   (e.g. 50mM phosphate buffer NA; 90% H2O, 10% D2O)
   (e.g. 90% H2O, 10% D2O )

2. for sample 2.

....add more if needed....

================CATEGORY 17: Sample Conditions==========================
Enter experimental conditions used for each sample. Each set of
conditions is identified by a numerical code.

1. for sample 1.

   (e.g. 1, 2..)
   (e.g. 298) (in Kelvin)
   (e.g. ambient, 1atm)
   (e.g. 7.2)
   (e.g. 100MM KCL)

2. for sample 2.

....add more if needed....

================CATEGORY 18: Spectrometer===============================
Enter the details about each spectrometer used to collect data.

1. for experiment 1:

   (e.g. 1, 2..)
   (e.g. Bruker ..)
   (e.g. DRX)
   (e.g. 500, 700)

2. for experiment 2:

....add more if needed....

================CATEGORY 19: Experiment Type============================
Enter information for those experiments that were used to generate
constraint data. For each NMR experiment, indicate which sample and
which sample conditions were used for the experiment.

1. for experiment type 1:

   (e.g. 1, 2..)
   (same ID as solution_id_1 in CATEGORY 16)
   (same ID as conditions_id_1 in CATEGORY 17)
   (e.g. 3D_15N-separated_NOESY)

2. for experiment type 2:

   (e.g. 1, 2..)
   (same ID as solution_id_1 in CATEGORY 16)
   (same ID as conditions_id_1 in CATEGORY 17)

3. for experiment type 3:

   (e.g. 1, 2..)
   (same ID as solution_id_1 in CATEGORY 16)
   (same ID as conditions_id_1 in CATEGORY 17)

4. for experiment type 4:

   (e.g. 1, 2..)
   (same ID as solution_id_1 in CATEGORY 16)
   (same ID as conditions_id_1 in CATEGORY 17)

....add more if needed....

================CATEGORY 20: Method and Details=========================
Enter the method and details of the refinement for the deposited structure.

#
# I ASKED PDB ABOUT ALLOWED VALUES,
# HERE'S THE ANSWER THOUGH IT MAY NOT
# BE COMPLETE:
#
# "For the Nmr_method field in the template,
#  the method can be
#  simulated annealing,
#  torsion angle dynamics,
#  distance geometry, etc.
#  To be honest, I do not know much about NMR."
#  (ME NEITHER)
#

   (e.g. simulated annealing)

#
# LAST TIME THE INFORMATION ENTERED IN THIS FIELD ENDED UP
# AS 'REMARK 3 OTHER REFINEMENT REMARKS: ...'
#
# SO, GOOD NEWS, THIS IS WHERE WE CAN PUT THE STUFF LIKE:
#
# REMARK 3 OTHER REFINEMENT REMARKS: STRUCTURES ARE BASED ON A TOTAL OF
# REMARK 3 1885 NOE RESTRAINTS (801 INTRA, 449 SEQUENTIAL, 277 MEDIUM,
# REMARK 3 AND 358 LONG RANGE), 76 HBOND RESTRAINTS, AND 172 PHI AND PSI
# REMARK 3 DIHEDRAL ANGLE CONSTRAINTS.
#

   (enter details about the NMR refinement)

=====================================END==================================