NHLBI-AbDesigner

Frequently Asked Questions


What is the objective of AbDesigner?
The objective of AbDesigner is to display the features of a protein relevant to the choice of a synthetic peptide sequence to be used as an immunogen in antibody production. It does so in a manner that allows the user to judge trade-offs for candidate peptide sequences with respect to multiple factors including hydropathy, secondary structure, uniqueness, conservation among species, and the presence or absence of post-translational modifications.

What is the input of AbDesigner?
To specify a protein for analysis, the program accepts input for proteins from any of the following seven species: Homo sapiens, Rattus norvegicus, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, or Arabidopsis thaliana. Proteins from other species can be analyzed by entering the sequence of that protein in FASTA format. (FASTA input should be avoided when analyzing proteins from the above seven species because of the limitations associated with submitting a FASTA amino acid sequence, as described below.)

What are the limitations associated with submitting a FASTA amino acid sequence?
By submitting just a FASTA amino acid sequence, the program assumes that the input protein is not present in the local Swiss-Prot protein database (from the following seven species: Homo sapiens, Rattus norvegicus, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, or Arabidopsis thaliana). Thus, the following processes that make use of the database cannot be executed:

Process not executed when submitting a FASTA amino acid sequence | Consequence
Extraction of Protein Features | Protein Features will not be displayed in the graphical output
Calculation of Tail Bonus | Immunogenicity Score will be calculated without factoring in Tail Bonus
Calculation of Uniqueness Score | Uniqueness Score will not be calculated, the graphical output of Uniqueness Score will be the default brightest yellow, and Uniqueness-optimized rank will be the same as Immunogenicity Score rank
Calculation of Conservation Score | Conservation Score will not be calculated, the graphical output of Conservation Score will be the default black, and Conservation-optimized rank will be the same as Immunogenicity Score rank

How long should I set the 'Peptide Length'?
How long should I set the 'Epitope Length'?
What is the output of AbDesigner?
After the input values are entered, AbDesigner calculates and displays a variety of information and allows users to 'mouse-over' each element to obtain further data. The individual elements of the output are described in the questions that follow.
What information is displayed on the amino acid sequence in the main output panel?
When users mouse-over the main panel, the peptide of interest of the appropriate length is highlighted in yellow with the central amino acid underlined. The low complexity regions of a protein, identified by the segmasker program (based on the SEG filtering algorithm, obtained from the NCBI C++ Toolkit), are shown in red font and indicated on mouse-over along the protein sequence row.

What information is displayed on the Chou-Fasman secondary structure prediction in the main output panel?
The Chou-Fasman prediction is performed as described except for slight modifications. The predicted secondary structure (alpha helix, beta sheet, strong beta turn, or weak beta turn) is indicated by color and shown on mouse-over. High immunogenicity correlates with a lack of alpha helices or beta sheets and with the presence of beta turns. The chief practical value of this analysis is that it identifies the locations of prolines (the chief determinant of a 'beta-turn' prediction), which aid immunogenicity by interfering with α-helix formation.

What information is displayed on the Kyte-Doolittle (K-D) hydropathy heat map in the main output panel?
The main panel displays the Kyte-Doolittle hydropathy index (KDHI) of each peptide along a protein sequence. The KDHI is displayed as an RGB color heat map on an 8-bit scale (0 - 255) and on mouse-over. Because lower KDHI values correlate with greater immunogenicity, the 8-bit transformation of the KDHI is performed in the reverse direction, so that the lowest value (-4.5, most hydrophilic) is transformed to 255 (displayed in the brightest cyan, by default) while the highest value (4.5, most hydrophobic) is transformed to 0.
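The reverse transformation described above can be sketched in a few lines of Python. This is only an illustration, not AbDesigner's own code: the text fixes the two endpoints (-4.5 maps to 255, 4.5 maps to 0), and the linear interpolation between them is an assumption. The function names `average_kdhi` and `kdhi_to_intensity` are hypothetical; the scale values are the published Kyte-Doolittle indices.

```python
# Kyte-Doolittle hydropathy scale (published values, range -4.5 to 4.5).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_kdhi(peptide):
    """Mean hydropathy index over all residues of the peptide."""
    peptide = peptide.upper()
    return sum(KD_SCALE[aa] for aa in peptide) / len(peptide)


def kdhi_to_intensity(kdhi, lo=-4.5, hi=4.5):
    """Reverse map onto 0-255: lo -> 255 (brightest), hi -> 0.

    ASSUMPTION: values between the endpoints are interpolated linearly.
    """
    kdhi = min(max(kdhi, lo), hi)  # clamp to the scale's range
    return round(255 * (hi - kdhi) / (hi - lo))
```

For example, a peptide made only of arginine has an average KDHI of -4.5 and would be drawn at full intensity, while a peptide made only of isoleucine (4.5) would be drawn at intensity 0.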

What is the Immunogenicity Score (Ig Score)?
Immunogenicity Score is a predictor of immunogenicity. The higher the score, the greater the predicted immunogenicity. The Immunogenicity Score of a peptide is calculated from the following formula:



Where: KDHI = Kyte-Doolittle hydropathy index, Pt (average) = average Chou-Fasman conformational parameter of beta turn, and Tail Bonus = a value ranging from 1.0 to 1.5.

KDHI is the average of the hydropathy indices of the consecutive amino acids in a peptide. It is calculated using the hydropathy scale (a range of -4.5 to 4.5), with a negative KDHI predicted to be more immunogenic than a positive KDHI. Thus, the negative value of the KDHI is used in the formula. A value of 4.5 is added in order to keep the Immunogenicity Score in a positive range.

Pt (average) is the average of the Chou-Fasman conformational parameters of a beta turn (Pt) of the amino acids in a peptide. It is calculated using Pt values from a reference database of 29 proteins as described.

The 'topological domain' information of a protein extracted from the Swiss-Prot database is used for determining the Tail Bonus. Tail Bonus is only given to a peptide that resides in the NH2- or COOH-terminal tail of an integral membrane protein. A Tail Bonus value of 1.5 corresponds to the full length of a peptide being present in a tail region, while a peptide whose full length is present in a non-tail region is given a Tail Bonus value of 1.0. Intermediate values scale linearly with the fraction of the peptide's amino acids that lie in the tail.
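The following Python sketch illustrates how the three terms could be combined. The formula itself is not reproduced in this text, so the multiplicative form used in `ig_score` is an assumption chosen to be consistent with the definitions above (negated KDHI plus 4.5, the Pt average, and a Tail Bonus that equals 1.0 when no bonus applies); it should not be read as the exact AbDesigner formula. Both function names are hypothetical.

```python
def tail_bonus(residues_in_tail, peptide_length):
    """Linear bonus: 1.0 when no residue lies in a tail, 1.5 when all do."""
    if peptide_length <= 0:
        raise ValueError("peptide_length must be positive")
    if not 0 <= residues_in_tail <= peptide_length:
        raise ValueError("residues_in_tail must be between 0 and peptide_length")
    return 1.0 + 0.5 * residues_in_tail / peptide_length


def ig_score(kdhi, pt_average, bonus=1.0):
    """Combine the three terms described in the text.

    ASSUMPTION: the terms are multiplied. (4.5 - kdhi) is the negated
    KDHI shifted by 4.5 so that it stays non-negative.
    """
    return (4.5 - kdhi) * pt_average * bonus
```

Leaving `bonus` at its default of 1.0 corresponds to the FASTA-input case, in which the score is calculated without factoring in the Tail Bonus. A 20-residue peptide with 10 residues in a terminal tail would receive `tail_bonus(10, 20)`, i.e. 1.25.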

What is the Uniqueness Score?
A typical immunogenic peptide is assumed to contain multiple linear, overlapping potential epitopes (~ 5-10 amino acids), each of which can theoretically evoke an immune response. The uniqueness of these linear epitope sequences compared with other proteins of the same species helps determine the specificity of an antibody produced against that peptide. In AbDesigner, the sequence of each successive linear epitope (shifting by one amino acid) along a peptide sequence is compared against the entire protein sequence database of that species. The length of a linear epitope can be set from 5 amino acids to the full length of the peptide. The total number of linear epitopes in other proteins that have sequences identical to the linear epitopes in a given peptide is calculated as follows:

$$M = \sum_{i=1}^{n} m_i$$



Where: M = the total number of linear epitope matches, n = the number of successive linear epitopes along a peptide sequence, and mi = the number of the linear epitopes in the other proteins that have sequences identical to the linear epitope i. Redundant linear epitopes in a protein are counted only once. A higher value of M corresponds to a less unique peptide and vice versa. The Uniqueness Score of a peptide is then calculated based on the following formula:

$$\text{Uniqueness Score} = \begin{cases} M_c - M, & M < M_c \\ 0, & M \ge M_c \end{cases}$$



Where: Mc = the cutoff value for M. The highest Uniqueness Score is equal to Mc (when none of the linear epitopes in a peptide has a match, M = 0) and the lowest Uniqueness Score is 0 (when the total number of linear epitope matches equals or exceeds the cutoff, M ≥ Mc). Mc is arbitrarily set to three times n.
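As a rough illustration of the counting described above, the Python sketch below enumerates the successive linear epitopes of a peptide, counts matches in a list of other protein sequences, and applies the cutoff Mc = 3n. It rests on two assumptions: "counted only once" is read as one match per protein per epitope, however often the epitope recurs in that protein, and the score falls linearly from Mc to 0 as M rises. The names `linear_epitopes` and `uniqueness_score` are hypothetical, and `other_proteins` stands in for the species database with the target protein excluded.

```python
def linear_epitopes(peptide, epitope_length):
    """Successive windows shifted by one residue.

    n = len(peptide) - epitope_length + 1 windows are returned.
    """
    return [peptide[i:i + epitope_length]
            for i in range(len(peptide) - epitope_length + 1)]


def uniqueness_score(peptide, epitope_length, other_proteins):
    """Return Mc - M, floored at 0, with Mc = 3 * n."""
    epitopes = linear_epitopes(peptide, epitope_length)
    n = len(epitopes)
    # m_i = number of other proteins that contain epitope i.
    # ASSUMPTION: repeats of an epitope inside one protein count once.
    total_matches = sum(
        sum(1 for seq in other_proteins if epitope in seq)
        for epitope in epitopes
    )
    cutoff = 3 * n
    return max(cutoff - total_matches, 0)
```

With the peptide `ACDEFGH` and an epitope length of 5 there are n = 3 epitopes, so Mc = 9. A single other protein that contains `ACDEF` (even twice) gives M = 1 and a score of 8.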

What is the Conservation Score?
Conservation Score predicts the likelihood that the target protein will be detectable by the antibody in multiple orthologous species. The higher the score, the greater the predicted conservation. The Conservation Score of a peptide is determined in a comparable manner to the Uniqueness Score. To begin with, the sequence of each successive linear epitope (shifting by one amino acid) along a peptide in a protein is compared against the sequences of the orthologous proteins, based on a mnemonic protein identification code in the Swiss-Prot entry name, among three commonly studied species (i.e. human, rat, and mouse). The Conservation Score of a peptide is then calculated from the total number of linear epitopes in orthologous proteins that have sequences identical to the linear epitopes in a given peptide as follows:

$$\text{Conservation Score} = \sum_{i=1}^{n} c_i$$



Where: n = the number of successive linear epitopes along a peptide sequence and ci = the number of the linear epitopes in the orthologous proteins that have sequences identical to the linear epitope i. Redundant linear epitopes in a protein are counted only once. The highest Conservation Score is reached when a peptide is completely conserved across all three species and is equal to the total number of the orthologous species evaluated multiplied by n. The lowest Conservation Score is equal to 0 (when there is no conservation).
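The same sliding-window idea can be sketched for conservation. This Python example is illustrative only: `conservation_score` is a hypothetical name, `orthologs` is assumed to be a list of orthologous protein sequences already selected by the caller, and each epitope is assumed to count at most once per ortholog.

```python
def conservation_score(peptide, epitope_length, orthologs):
    """Sum over epitopes of the number of orthologs containing that epitope."""
    n = len(peptide) - epitope_length + 1
    epitopes = [peptide[i:i + epitope_length] for i in range(n)]
    # c_i = number of orthologous proteins that contain epitope i.
    # ASSUMPTION: an epitope is counted once per ortholog.
    return sum(
        sum(1 for seq in orthologs if epitope in seq)
        for epitope in epitopes
    )


def max_conservation_score(peptide, epitope_length, orthologs):
    """Upper bound: number of orthologs evaluated multiplied by n."""
    n = len(peptide) - epitope_length + 1
    return len(orthologs) * n
```

For the peptide `ACDEFGH` with an epitope length of 5, an ortholog containing the whole peptide contributes 3, and an ortholog containing `ACDEFGY` contributes 2, for a score of 5 out of a possible 6.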

How are the Immunogenicity Score, Uniqueness Score, and Conservation Score displayed in the main output panel?
The Immunogenicity Score, Uniqueness Score, and Conservation Score of each peptide are displayed in 8-bit RGB color heat maps. The transformation of those values into a density representation (D) on an 8-bit scale (0 - 255) is performed using linear scaling: D = 255 × (X - Xmin)/(Xmax - Xmin), where X is the value, Xmin is the minimum value and Xmax is the maximum value as defined below:

X | Xmin | Xmax
Ig Score | the lowest Ig Score in that protein | the highest Ig Score in that protein
Uniqueness Score | 0 | 3 times n
Conservation Score | 0 | the total number of the orthologous species evaluated multiplied by n
Note: n = the number of successive linear epitopes along a peptide sequence
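
The linear scaling D = 255 × (X - Xmin)/(Xmax - Xmin) can be written as a small Python helper. The function name `to_density` is hypothetical, and two details are assumptions added for robustness rather than behavior stated in the text: values outside the range are clamped, and a degenerate range (Xmax equal to Xmin) returns 0.

```python
def to_density(x, x_min, x_max):
    """Map x linearly from [x_min, x_max] onto the 8-bit range 0-255."""
    if x_max == x_min:
        # ASSUMPTION: a degenerate range is drawn at intensity 0.
        return 0
    x = min(max(x, x_min), x_max)  # ASSUMPTION: clamp out-of-range values
    return round(255 * (x - x_min) / (x_max - x_min))
```

For a peptide with n = 3 linear epitopes, the Uniqueness Score ranges from 0 to 9, so a score of 3 maps to `to_density(3, 0, 9)`, which is 85.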

By default, the highest Immunogenicity Score, Uniqueness Score, and Conservation Score are displayed in the brightest green, yellow, and red, respectively. However, user-defined colors can be selected from the menu bar. In addition, for each peptide, the Immunogenicity Score value and rank are displayed on mouse-over in the Immunogenicity Score heat map, whereas multiple sequence alignments are displayed on mouse-over in the Uniqueness and Conservation heat maps.

What are the Protein Features?
The Protein Features reported are position-dependent annotations of regions or sites of interest, such as post-translational modifications, binding sites, enzyme active sites, local secondary structure, sequence conflicts, and other characteristics culled from the appropriate Swiss-Prot protein record. They are shown in distinct colors, and on mouse-over, along the protein sequence.

What are the three separate lists of peptides under the main output panel?
Under the main output panel are three separate lists of peptides sorted by Immunogenicity Score rank, Uniqueness-optimized rank, and Conservation-optimized rank. These lists provide an alternative to the heat maps for identification of candidate peptides for immunization. However, use of these lists alone has the pitfall that they do not take into account the protein annotations provided in the main display.