Journal Article

Modeling the functional consequences of single residue replacements in bacteriophage f1 gene V protein

Majid Masso, Ewy Mathe, Nida Parvez, Kahkeshan Hijazi and Iosif I. Vaisman

in Protein Engineering, Design and Selection

Volume 22, issue 11, pages 665-671
Published in print November 2009 | ISSN: 1741-0126
Published online August 2009 | e-ISSN: 1741-0134 | DOI:


A computational mutagenesis methodology utilizing a four-body, knowledge-based, statistical contact potential is applied toward globally quantifying relative environmental perturbations (residual scores) in bacteriophage f1 gene V protein (GVP) due to single amino acid substitutions. We show that residual scores correlate well with experimentally measured relative changes in protein function upon mutation. Residual scores also distinguish between GVP amino acid positions grouped according to protein structural or functional roles or based on similarities in physicochemical characteristics. For each mutant, the in silico mutagenesis additionally yields local measures of environmental change (EC scores) occurring at every residue position (residual profile) relative to the native protein. Implementation of the random forest (RF) algorithm, utilizing experimental GVP mutants whose feature vector components include EC scores at the mutated position and at six structurally nearest neighbors, correctly classifies mutants based on function with up to 77% cross-validation accuracy while achieving 0.82 area under the receiver operating characteristic curve. A control experiment highlights the effectiveness of mutant feature vector signals, and a variety of learning curves are generated to analyze the impact of GVP mutant data set size on performance measures. An optimally trained RF model is subsequently used for inferring function for all the remaining unexplored GVP mutants.
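As an illustration of the feature vectors described above, the following minimal sketch assembles a seven-component vector from a mutant's residual profile: the EC score at the mutated position followed by the EC scores at its six nearest neighbors. It is not the authors' implementation. It assumes that each residue is represented by a single 3D point, that "structurally nearest" means smallest Euclidean distance, and that neighbors are ordered by increasing distance; the function names, coordinates and EC values are hypothetical toy data.

```python
import math


def nearest_neighbors(coords, pos, k=6):
    """Return indices of the k residues closest to residue `pos`.

    Assumption: closeness is Euclidean distance between one
    representative point per residue (hypothetical choice).
    """
    others = [i for i in range(len(coords)) if i != pos]
    others.sort(key=lambda i: math.dist(coords[pos], coords[i]))
    return others[:k]


def feature_vector(ec_profile, coords, pos, k=6):
    """EC score at the mutated position, then EC scores at its k neighbors."""
    neighbors = nearest_neighbors(coords, pos, k)
    return [ec_profile[pos]] + [ec_profile[i] for i in neighbors]


# Toy data (not from the paper): 8 residues placed along a line,
# with a made-up residual profile for a mutation at index 3.
coords = [(float(i), 0.0, 0.0) for i in range(8)]
ec_profile = [0.0, -0.2, 0.5, -1.3, 0.4, 0.1, 0.0, -0.1]

fv = feature_vector(ec_profile, coords, pos=3)
print(len(fv))  # 7 components: mutated position plus six neighbors
```

Vectors built this way, one per experimentally characterized mutant and labeled by functional class, could then be passed to any random forest implementation for training and cross-validation.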

Keywords: computational mutagenesis; Delaunay tessellation; knowledge-based statistical potential; random forest supervised classification; structure–function relationship

Journal Article.  4634 words.  Illustrated.

Subjects: Proteins
