BB-Reader - manual for version 2.2

Table of Contents


Legal Stuff
Using BB-Reader for Scientific Work
Platform
Installing BB-Reader
What does BB-Reader do for you?
Before you run the program
Getting started
The output
Groups of equivalent atoms - ambiguous assignment
Calculation of the score
Nomenclature of atomtypes

Logfile
Command-line arguments
Limitations (and how to work around)
Feedback


Legal Stuff


The software and accompanying instructions are provided "as is" without warranty of any kind. The authors do not warrant, guarantee, or make any representations regarding the use, or the results of the use of the software or accompanying instructions in terms of correctness, accuracy, reliability, currentness or otherwise. The entire risk as to the results and performance of the software is assumed by you. If the software or instructions are defective, you, and not the authors, assume the entire cost of all necessary servicing, repair or correction.

BB-Reader is copyrighted software. You may use and distribute it free of charge, but it must not be sold or offered as an inducement to buy other products. Moreover, modified source-code must not be distributed, and the source-code must not be redistributed without the accompanying files (manual).


Using BB-Reader for Scientific Work


If you used BB-Reader for scientific work, and you are publishing that work, please cite the following article: Wimmer, R., Müller, N., and Petersen, S.B., "B-B-Reader: A Computer Program for the Combined Use of the BioMagResBank and the PDB databases," J. Biomol. NMR 9, 101-104 (1997).


Platform


BB-Reader is written in C, and it is designed to run under a UNIX-environment. It does not have any graphical user-interface.


Installing BB-Reader


You can obtain the file BBReader.tar from http://www.bmrb.wisc.edu/bbreader/BBReader.html
Create a directory and move the tarfile into it. Extract the files from the archive with "tar -xf BBReader.tar", and you will find the source-code ("BBReader.c") and the manual as a text file ("bbr_man.txt") and as an RTF file ("bbr_man.rtf").
The source-code is written in C, and the necessary compiling-commands differ from system to system. In any case you must link the mathematics library, because the program calculates square-roots. On an SGI-workstation, the necessary command is:
cc -lm BBReader.c -o BBReader.exe
On an HP9000/735, running under HP-UX, the necessary command is:
cc -A a BBReader.c -lm -o BBReader.exe
If you are working on a different kind of system, you have to look up the specific compiling-command in your compiler manual.


What does BB-Reader do for you?


BB-Reader is designed for protein-chemists dealing with NMR or for NMR-spectroscopists working on proteins. Given chemical shift data, the program searches for possible assignments. Given peak positions, BB-Reader suggests assignments for user-specified homo- and heteronuclear one- to three-dimensional COSY- and NOESY-type experiments. It can handle 1H, 13C and 15N shift-data. Distance-information from PDB-files can be utilized for filtering possible NOESY-cross-peak assignments. BB-Reader will provide you with a list of possible assignments along with a ranking to help you with the final decision.

Before you run the program


The basic requirement is that the assignment of the NMR-signals of the protein you are dealing with is known, and that this assignment is contained in the BioMagResBank (Seavey et al. 1991) (http://www.bmrb.wisc.edu) in the STAR flat-file-format (Hall 1991; Hall and Spadaccini 1994; Hall and Cook 1995). Pay attention to the experimental conditions under which the assignment was done (pH, temperature, solvent,...)! You have to download the file to your own computer. If you plan to work with NOESY-spectra, it is recommended (but not necessary) that you also download the PDB-file of that protein from the Brookhaven database (Abola et al., 1987; Bernstein et al., 1977) (http://www.pdb.bnl.gov). BB-Reader can work with PDB-files in the current format; only the lines beginning with ATOM are used.
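
As an illustration of how the ATOM lines of a PDB-file can be read, the following C sketch extracts atom name, residue name, residue number and coordinates from a single line. The column positions follow the fixed-column PDB format. The type PdbAtom and the functions copy_columns and parse_atom_line are made up for this example and are not taken from BBReader.c.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One atom taken from a PDB "ATOM" record (hypothetical type). */
typedef struct {
    char   name[5];     /* atom name, columns 13-16 */
    char   res_name[4]; /* residue name, columns 18-20 */
    int    res_seq;     /* residue sequence number, columns 23-26 */
    double x, y, z;     /* coordinates in Angstrom, columns 31-54 */
} PdbAtom;

/* Copy the 1-based, inclusive column range [first, last] into out. */
static void copy_columns(const char *line, int first, int last, char *out)
{
    int len = last - first + 1;
    memcpy(out, line + first - 1, (size_t)len);
    out[len] = '\0';
}

/* Parse one line; return 1 for a usable ATOM record, 0 otherwise. */
int parse_atom_line(const char *line, PdbAtom *atom)
{
    char field[16];

    /* Lines that do not begin with ATOM, or that end before the
       z coordinate, are skipped. */
    if (strncmp(line, "ATOM", 4) != 0 || strlen(line) < 54)
        return 0;

    atom->name[0] = '\0';
    atom->res_name[0] = '\0';
    copy_columns(line, 13, 16, field);
    sscanf(field, "%4s", atom->name);      /* strips padding blanks */
    copy_columns(line, 18, 20, field);
    sscanf(field, "%3s", atom->res_name);
    copy_columns(line, 23, 26, field);
    atom->res_seq = atoi(field);
    copy_columns(line, 31, 38, field);
    atom->x = atof(field);
    copy_columns(line, 39, 46, field);
    atom->y = atof(field);
    copy_columns(line, 47, 54, field);
    atom->z = atof(field);
    return 1;
}
```

Reading by column rather than by whitespace is used here because neighbouring fields in a PDB-file may touch each other without a separating blank.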


Getting started


After starting the program by entering BBReader, you are prompted for the name of the BioMagRes-file. The program will scan the file to obtain assignment information. BB-Reader can work with 1H, 13C and 15N-shift-data.
Then you will be asked for the number of dimensions your spectrum has; you can enter a number from 1 to 3. Version 2.0 cannot treat 4D and 5D-spectra, but you can work around that (see section "Limitations").
You will be prompted for the nucleus for each dimension (this question is suppressed if data for only one nucleus is available). You are free to enter any combination of nuclei, and BB-Reader will not refuse any. It will, however, warn you if you enter highly unusual combinations (e.g. NN-correlations).
All coherence transfers between 1H and 13C or 15N, respectively, will be considered to be of COSY-type (i.e. to be based on scalar coupling). For coherence transfers between two proton-dimensions you will be asked whether the coherence transfer is of COSY- or of NOESY-type.
If you have a NOESY-step in your sequence, the name of a PDB-file will be requested. The PDB-file will be used to calculate internuclear distances and to create a distance-dependent score. This step is optional: you may just enter "0" if there is no PDB-file available or if you don't want to use it.


The output


After the setup described above, you will be asked for the output-type you wish. There are two possibilities:
- "single-peak-mode": you enter the position of a cross-peak (i.e. one chemical shift per dimension) and BB-Reader will provide you with an ordered list of those possible cross-peaks, that come closest to your input;

- "range-mode": you enter a shift-range for each dimension, and BB-Reader will give you a list of all possible cross-peaks, that fall into that range.
If you have specified a PDB-file and are working in "range-mode", you will also be asked for a distance threshold - a maximum distance above which proton-pairs are no longer listed in the result-list.
After you have defined the characteristics of spectrum and output, BB-Reader will search the database for nuclei with matching chemical shifts. The resulting potential cross-peaks are checked for whether they are possible from the spectroscopic viewpoint. If they pass the test, they will be marked for output. In the single-peak-mode the score is calculated and this "hit" will be written to its place in the temporary hitlist.
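
The range-mode test described above can be sketched in C as follows. This is only an illustration: the type ShiftRange and the functions in_range and passes_distance are hypothetical names, and treating the range limits as inclusive is an assumption that the manual does not state.

```c
#include <stddef.h>

/* Shift window for one spectral dimension, in ppm (hypothetical type). */
typedef struct {
    double low;
    double high;
} ShiftRange;

/* Return 1 if every database shift lies inside its window, else 0.
   Limits are treated as inclusive (an assumption). */
int in_range(const double *shifts, const ShiftRange *ranges, size_t ndim)
{
    size_t i;

    for (i = 0; i < ndim; i++) {
        if (shifts[i] < ranges[i].low || shifts[i] > ranges[i].high)
            return 0;
    }
    return 1;
}

/* Distance filter for a NOESY-step. A negative distance stands for
   "no PDB-file given", in which case nothing is filtered out. */
int passes_distance(double distance, double threshold)
{
    return distance < 0.0 || distance <= threshold;
}
```

A candidate cross-peak would be listed only if in_range returns 1 and, for each NOESY-step, passes_distance returns 1.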
The output line follows the general format:

r1 no1 at1 s1 | r2 no2 at2 s2 | t12 | r3 no3 at3 s3 | t23 | score

the symbols mean (with subscripts referring to the spectral dimension):
r: three-letter-code for residue
no: sequence number of residue
at: atom-code as given in the BioMagResBank (see section "Nomenclature of atomtypes" below for a detailed explanation)
t12: additional information: for a COSY-type spectrum either "geminal" for a 2J-coupling between protons or "allylic" for a 4J-coupling between protons is given, otherwise this field is left empty. For a NOESY-spectrum the internuclear distance between the two atoms [Å] is printed there. If one or both coupling partners belong to a group (see section "Groups of equivalent atoms - ambiguous assignment" below), the shortest distance between any two atoms of the groups is given.
t23: same as t12, but for dimension two and three.
score: only if single-peak-mode is chosen.

Finally BB-Reader gives information on what fraction of all nuclei in the protein is contained in the BioMagResBank-file.


Groups of equivalent atoms - ambiguous assignment


For many pairs of diastereotopic protons the stereospecific assignment is not known. Therefore two possible assignments exist for each chemical shift. These nuclei are combined into a group, and the shift is assigned to that group. In the output the group is given (e.g. "HB2/HB3") instead of giving two separate assignments or giving only one of them.
Combining equivalent atoms into groups reduces the output and improves clarity. In case of a NOESY-type-spectrum, the shortest distance between any members of the two groups will be given in the output.
Other isochronous atoms (e.g. HD1/HD2 in Phe and Tyr) are treated in the same way.
The three equivalent atoms of methyl groups form a "natural" group.


Calculation of the score:


The score, which is used for the ranking of the hits, is calculated in the following way:
A distance-penalty is calculated for each NOESY-step (provided that a PDB-file was used). For distances below 4 Å no penalty is calculated. If the distance is above 4 Å, the distance penalty is calculated as
distance penalty = k*(distance - 4 Å)^2
in order to make such relatively big distances even more unlikely (k was determined empirically). If there are two NOESY-steps, the distance-penalties for each step are added.
A shift-penalty is calculated for each dimension in the following way:
shift penalty = (f*(experimental shift - database shift))^2,
f being a nucleus-dependent factor.
All shift penalties and distance penalties are added and the sum is divided by the number of dimensions. This accounts for the fact that the sum of penalty points increases automatically with the number of dimensions. This penalty is then subtracted from 100 to yield the final score.
The calculation of the score is highly empirical, and it might be subject to future changes.
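
The calculation described in this section can be written down as the following C sketch. It follows the text of the manual, not the source-code: the function names are hypothetical, and the values of k and f have to be supplied by the caller, because the manual does not state the empirical values used by BB-Reader.

```c
#include <stddef.h>

#define NOE_CUTOFF 4.0 /* Angstrom; no distance penalty below this */

/* k * (distance - 4)^2 above the cutoff, otherwise 0. */
double distance_penalty(double distance, double k)
{
    double excess = distance - NOE_CUTOFF;

    return excess > 0.0 ? k * excess * excess : 0.0;
}

/* (f * (experimental shift - database shift))^2 for one dimension. */
double shift_penalty(double exp_shift, double db_shift, double f)
{
    double d = f * (exp_shift - db_shift);

    return d * d;
}

/* score = 100 - (all shift penalties + all distance penalties) / ndim
   exp_shift, db_shift, f: one entry per dimension (ndim entries)
   noe_dist: one distance per NOESY-step (nnoe entries, may be 0) */
double hit_score(const double *exp_shift, const double *db_shift,
                 const double *f, size_t ndim,
                 const double *noe_dist, size_t nnoe, double k)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < ndim; i++)
        sum += shift_penalty(exp_shift[i], db_shift[i], f[i]);
    for (i = 0; i < nnoe; i++)
        sum += distance_penalty(noe_dist[i], k);
    return 100.0 - sum / (double)ndim;
}
```

As a worked example with made-up factors k = 2 and f = 2: a 2D-NOESY peak at 8.5/4.0 ppm compared with database shifts of 8.0/4.0 ppm and a distance of 5 Å gives shift penalties of 1 and 0 and a distance penalty of 2, so the score is 100 - 3/2 = 98.5.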


Nomenclature of atomtypes


The nomenclature of atomtypes is the same as that used in the BioMagResBank. It can be looked up at http://www.bmrb.wisc.edu/Nomenclature/commonaa.html. This file contains the amino-acids along with the atom-types of all atoms as they are given by the BioMagResBank and in the output of BB-Reader.


Logfile


BB-Reader can write a logfile on request. The logfile will contain all settings and the result. The logfile is always named "RBMlog.out"; it is a text-file.

Printing the Logfile


BB-Reader allows you to write a default print-command into a text-file called "RBM_default_printcmd". This file must be in the same directory as the program. If you have not specified a print command, "lp RBMlog.out" will be used.
Before the program exits, it asks you whether you'd like to have the logfile printed. Instead of a print command you can also specify an editor-command like "jot RBMlog.out" or "vi RBMlog.out".


Command-line arguments:


Typically, one has many cross-peaks to examine from one spectrum. To prevent being prompted for the same input every time you start the program for examining a new cross-peak, several command-line-arguments are possible:
The generalized syntax is:

BBReader [-hdrlfp123] [spectrum]

The first argument can contain:
h (Help) overview of the available arguments
d (Default) the same BioMagResBank-file and the same PDB-file as last time you ran the program are used again.
r (Ranking) single-peak-mode
l (List) range-mode
f (File) the logfile is written
p (Print) the logfile is written and printed
1 1D-spectrum
2 2D-spectrum
3 3D-spectrum

"1","2" and "3" as well as "r" and "l" are mutually exclusive.

The second command-line-argument ("spectrum") can be used to specify the nuclei and the coherence transfer-modes. It must be of the following format: the first letter must be "H", "C" or "N" and represents the nucleus of the first dimension. For 1D-spectra, the argument must end here. For multidimensional spectra, the second letter must be "c" or "n" and represents the coherence-transfer-mode between dimension one and two (COSY or NOESY). The third letter must be the nucleus of the second dimension. If the spectrum is three-dimensional, another letter for the coherence transfer between dimension 2 and 3 and another letter for the nucleus in dimension 3 are necessary.
For example: "BBReader -3 HnHcC" means that the spectrum is a 3D-H-H-C spectrum with a NOESY-type-transfer between dimension one and two and a COSY-type-transfer between dimension two and three.
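
The rules for the two command-line-arguments can be summarized in a short C sketch. It is written from the description above and is not the parser used in BBReader.c; the types Options and Spectrum and the functions parse_options and parse_spectrum are hypothetical. Rejecting a NOESY-step that does not connect two proton-dimensions is an assumption derived from the section "Getting started".

```c
#include <string.h>

typedef struct {
    int  help, use_default, write_log, print_log;
    char mode;  /* 'r' (ranking), 'l' (list) or 0 if not given */
    int  ndim;  /* 1, 2, 3 or 0 if not given */
} Options;

typedef struct {
    int  ndim;
    char nucleus[3];   /* 'H', 'C' or 'N' for each dimension */
    char transfer[2];  /* 'c' (COSY) or 'n' (NOESY) between dimensions */
} Spectrum;

/* Parse the first argument, e.g. "-dr2". Returns 1 if valid, else 0. */
int parse_options(const char *arg, Options *opt)
{
    const char *p;

    memset(opt, 0, sizeof *opt);
    if (arg[0] != '-' || arg[1] == '\0')
        return 0;
    for (p = arg + 1; *p != '\0'; p++) {
        switch (*p) {
        case 'h': opt->help = 1; break;
        case 'd': opt->use_default = 1; break;
        case 'f': opt->write_log = 1; break;
        case 'p': opt->write_log = 1; opt->print_log = 1; break;
        case 'r': case 'l':  /* mutually exclusive */
            if (opt->mode != 0 && opt->mode != *p)
                return 0;
            opt->mode = *p;
            break;
        case '1': case '2': case '3':  /* mutually exclusive */
            if (opt->ndim != 0 && opt->ndim != *p - '0')
                return 0;
            opt->ndim = *p - '0';
            break;
        default:
            return 0;
        }
    }
    return 1;
}

/* Parse the second argument, e.g. "HnHcC". Returns 1 if valid, else 0. */
int parse_spectrum(const char *arg, Spectrum *sp)
{
    size_t len = strlen(arg);
    size_t i;

    if (len != 1 && len != 3 && len != 5)
        return 0;
    for (i = 0; i < len; i++) {
        if (i % 2 == 0) {  /* positions 1, 3, 5: nucleus */
            if (strchr("HCN", arg[i]) == NULL)
                return 0;
            sp->nucleus[i / 2] = arg[i];
        } else {           /* positions 2, 4: transfer mode */
            if (strchr("cn", arg[i]) == NULL)
                return 0;
            sp->transfer[i / 2] = arg[i];
        }
    }
    sp->ndim = (int)((len + 1) / 2);
    /* Assumption: a NOESY-step must connect two proton-dimensions. */
    for (i = 0; i + 1 < (size_t)sp->ndim; i++) {
        if (sp->transfer[i] == 'n' &&
            (sp->nucleus[i] != 'H' || sp->nucleus[i + 1] != 'H'))
            return 0;
    }
    return 1;
}
```

With this sketch, "HnHcC" is accepted as a 3D-spectrum with the nuclei H, H, C and the transfers n, c, whereas "-rl", "-12", "Hn" and "HnC" are rejected.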


Limitations (and how to work around):


4D-spectra:

Version 2.0 of BB-Reader is restricted to 1D, 2D and 3D-spectra. Future versions may extend this. For the time being, you can handle more than three dimensions by performing one database-search with the first three dimensions and one with the last three dimensions. Then combine those (3D) cross-peaks that share the same two atoms, in dimension two and three of the first search and in dimension one and two of the second search.
For example, to simulate a 4D-H-C-C-H-experiment, make a search for two 3D-experiments, one being H-C-C and one being C-C-H. Then combine those cross-peaks from both output lists whose second and third atoms in the first experiment are the same as the first and second atoms in the second experiment.
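
If the two output lists are available in a program, this combination is a simple join on the shared atoms. The following C sketch illustrates it; BB-Reader itself does not perform this step, and the types AtomId, Hit3D and Hit4D as well as the function combine_hits are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* One atom of a hit: residue number plus atom-code (hypothetical type). */
typedef struct {
    int  res_no;
    char atom[12];
} AtomId;

typedef struct { AtomId dim[3]; } Hit3D; /* one 3D cross-peak */
typedef struct { AtomId dim[4]; } Hit4D; /* one combined 4D cross-peak */

static int same_atom(const AtomId *a, const AtomId *b)
{
    return a->res_no == b->res_no && strcmp(a->atom, b->atom) == 0;
}

/* Combine the hits of the first search (dimensions 1-3) with those of
   the second search (dimensions 2-4). Writes at most max_out combined
   hits to out and returns the number written. */
size_t combine_hits(const Hit3D *first, size_t nfirst,
                    const Hit3D *second, size_t nsecond,
                    Hit4D *out, size_t max_out)
{
    size_t i, j, n = 0;

    for (i = 0; i < nfirst; i++) {
        for (j = 0; j < nsecond; j++) {
            /* atoms 2 and 3 of the first hit must equal
               atoms 1 and 2 of the second hit */
            if (same_atom(&first[i].dim[1], &second[j].dim[0]) &&
                same_atom(&first[i].dim[2], &second[j].dim[1])) {
                if (n == max_out)
                    return n;
                out[n].dim[0] = first[i].dim[0];
                out[n].dim[1] = first[i].dim[1];
                out[n].dim[2] = first[i].dim[2];
                out[n].dim[3] = second[j].dim[2];
                n++;
            }
        }
    }
    return n;
}
```

Atoms are compared by residue number and atom-code only, which assumes that both searches were run against the same BioMagResBank-file.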

2D-TOCSY:

Version 2.0 contains only COSY- and NOESY-type coherence transfers. You can combine several COSY-steps in order to simulate a TOCSY. The number of COSY-steps necessary to cover all possibilities depends on the residue and reaches a maximum of 5 for lysine.

Amino-Acids:

Version 2.0 is restricted to the 20 classical amino-acids in the L-isomeric form. Other amino-acids will be ignored and the user will be informed about their occurrence.


Bugs:


No software is completely bug-free. If you find a bug, please send a message to the address mentioned below.


Feedback:


If you have suggestions on how to improve the program or other remarks (such as fan-post, for example (flames are piped to /dev/null)), send a message to:

Reinhard Wimmer
Department of Life Sciences
Aalborg University

Sohngaardsholmsvej 49
DK-9000 Aalborg
DENMARK

Aalborg, June 2004


References:


Abola, E.E., Bernstein, F.C., Bryant, S.H., Koetzle, T.F. and Weng, J. (1987). In Crystallographic Databases - Information Content, Software Systems, Scientific Applications: Protein Data Bank, (Eds. F.H. Allen, G. Berghoff and R. Sievers), Data Commission of the International Union of Crystallography, Bonn/Cambridge/Chester, pp. 107-132.

Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F.Jr., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T. and Tasumi, M. (1977) J. Mol. Biol., 112, 535-542.

Hall, S. R. (1991) J. Chem. Inf. Comput. Sci., 31, 326-333.

Hall, S. R. and Cook, A. P. F. (1995) J. Chem. Inf. Comput. Sci., 35, 819-825.

Hall, S. R. and Spadaccini, N. (1994) J. Chem. Inf. Comput. Sci., 34, 505-508.

Seavey, B. R., Farr, E. A., Westler, W. M. and Markley, J. L. (1991) J. Biomol. NMR, 1, 217-236.