Assistant Member of The Staff
Department of Microbiology
National Taiwan University, Taipei, Taiwan, B.S., 1986, Agricultural Chemistry
National Taiwan University, Taipei, Taiwan, M.S., 1988, Microbiology
University of Massachusetts, Amherst, MA, Ph.D. 1995, Microbiology
Bioinformatics is a thriving scientific discipline that integrates computer science with applications in molecular biology. During the last decade, the completion of genomic sequences from many organisms has provided much valuable data to be analyzed, organized and stored. Among the genomes that have been or are currently being sequenced are those of a number of pathogens associated with oral and dental diseases. The Chen lab develops and uses high-throughput computational and experimental approaches to unveil information hidden in certain oral pathogenic genome sequences. The results of our work, to be shared via public and proprietary databases, will lead to better understanding of oral disease mechanisms and, ultimately, to better treatments or disease prevention.
Specific projects in our lab include the following:
Several problems hinder the effective use of genomic sequences and their related information. First, multiple, differing annotations are often available for the same genome. While each annotation provides unique and valuable features, the different gene identification schemes and functional assignment methods they use make cross-referencing difficult. Second, genome annotations are updated too infrequently to reflect the rapid growth of molecular sequences in the public databases, on which the homology searches are based. Third, although versatile bioinformatics tools abound, few are integrated with the data that they are designed to analyze. Using these tools requires downloading and installing them on the user's local computer and reformatting the input data, a task that often proves daunting to biologists without the necessary computer skills.
The immediate goal of The Bioinformatics Resource for Oral Pathogens project is to provide solutions to the above problems—an integrated graphical genome viewer that assimilates and shows multiple annotations of the same genome for easy reconciliation; a 24-7 automatic data-mining pipeline providing up-to-date annotations for the oral pathogen genomes; and seamlessly integrated bioinformatics software tools such as EMBOSS (a molecular biology software package containing more than 150 applications) and SAOPMD (for statistical analysis of microarray data) which can be used right in the Web browser.
The long-term goal of the BROP project is to provide a comprehensive and "specialized" online bioinformatics resource center for studying oral pathogens. By focusing on a group of pathogens that are associated with oral infectious diseases, the vision of the BROP is to serve as a working model for a modern bioinformatics resource center with three essential features—integrative, current, and community-centric.
One of the major goals of the BROP project is to provide state-of-the-art integrated bioinformatics software tools to the research community for analyzing oral pathogen genomes. Software tools that are currently provided in BROP include: BROP Genome Explorer, BROP Genome Viewer, "Genomewide ORF Alignment" (GOAL), Significance Analysis of Oral Pathogen Microarray Data (SAOPMD), and The European Molecular Biology Open Software Suite (EMBOSS). We continue to develop novel software tools, and to incorporate existing open-source ones, in the context of oral pathogen genomes so that the oral and dental research community can use them to analyze these genomes.
To date, homologous sequence matching remains the most common and useful way of inferring the function of newly identified genes in a genome. The amount of molecular data against which most new sequences are searched continues to increase exponentially, and the major public sequence databanks update and exchange their data daily. However, most genomes, once annotated, published and deposited in the public databases, are not frequently re-searched against the new sequences that are constantly added to the databanks. BROP continuously re-annotates oral pathogen genomes and provides annotations that are updated every month. The current BROP data-mining searches include: i) BLASTP search against weekly updated NCBI non-redundant protein data; ii) BLASTP search against Swiss-Prot protein data; and iii) InterProScan search against the ScanRegExp, BlastProDom, ProfileScan, HMMPfam, superfamily, HMMTigr, Seg, Coil, HMMPIR, FPrintScan, and HMMSmart databases. The pipeline currently mines data for 14 complete or partial genomes. BROP also provides a live statistics and status Web page for monitoring all the data-mining work, so that users know the date on which the information they are exploring was deposited.
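As an illustration of the homology-based functional inference described above, the following minimal Python sketch selects the best-scoring database hit for each query protein from BLAST tabular output. It is not the BROP pipeline itself: it assumes the standard 12-column tabular format produced by BLAST+ (`-outfmt 6`), and the function name `best_hits`, the E-value cutoff, and the sequence identifiers in the example are hypothetical.

```python
import csv
import io

# Column order of the standard BLAST+ tabular format (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text, max_evalue=1e-5):
    """Return {query_id: (subject_id, evalue, bitscore)} for the top hit.

    Hits with an E-value above max_evalue (an assumed cutoff) are ignored;
    among the remaining hits, the one with the highest bit score is kept.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bitscore)
    return best


# Hypothetical example data: two query ORFs, three hits in total.
example = "\n".join([
    "\t".join(["ORF0001", "protA", "45.0", "200", "110", "2",
               "1", "200", "5", "204", "1e-40", "150"]),
    "\t".join(["ORF0001", "protB", "98.5", "200", "3", "0",
               "1", "200", "1", "200", "1e-100", "400"]),
    "\t".join(["ORF0002", "protC", "30.0", "50", "35", "1",
               "10", "60", "3", "52", "0.5", "28"]),
])

hits = best_hits(example)
```

In this example, ORF0001 is assigned its higher-scoring hit (protB), while ORF0002 receives no assignment because its only hit fails the E-value cutoff. Re-running such a step whenever the protein databases are updated is what keeps the inferred functions current.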
MyOPMD and SAOPMD are two new microarray data analysis and management tools specially designed for oral pathogen microarrays. Together they provide the "integrative" and "community-centric" features of the BROP project mentioned above.
SAOPMD (Significance Analysis of Oral Pathogen Microarray Data) uses a robust statistical inference program, the Limma (Linear Models for Microarray Data) package, to evaluate the statistical significance of differential gene expression. SAOPMD is easy to use: upload the microarray result files (e.g., GenePix gpr files) and a single click returns all the results, including multiple diagnostic plots for each array, the data before and after normalization, and a searchable and sortable list of statistics, right on the Web page. The results also provide links for each gene to the BROP Genome Viewer, the oligo information, and the original raw signal intensities. The results can also be downloaded as plain tab-delimited text or Microsoft Excel files. SAOPMD can analyze all the 70-mer oligo arrays printed by TIGR-NIAID, including slides of Actinobacillus actinomycetemcomitans HK1651, Porphyromonas gingivalis W83, Streptococcus mutans UA159, and Treponema denticola ATCC 35405.
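The kind of calculation behind such an analysis can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method SAOPMD or Limma uses: Limma fits linear models and applies empirical Bayes moderation of the variances, whereas the sketch below uses plain median centering and an ordinary one-sample t statistic. All function names and intensity values are hypothetical.

```python
import math
import statistics


def log_ratios(red, green):
    """Per-spot log2(red/green) ratios for one two-color array.

    Intensities are assumed to be positive (already background-corrected).
    """
    return [math.log2(r / g) for r, g in zip(red, green)]


def median_center(values):
    """Simple normalization: subtract the array-wide median log ratio."""
    m = statistics.median(values)
    return [v - m for v in values]


def one_sample_t(values):
    """Ordinary t statistic testing whether the mean log ratio is zero.

    Returns NaN when the replicates have zero variance.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return math.nan
    return mean / (sd / math.sqrt(n))


# Hypothetical data: three replicate arrays, three genes per array,
# given as (red intensities, green intensities).
arrays = [
    ([800, 100, 200], [100, 100, 200]),
    ([3200, 200, 400], [100, 100, 200]),
    ([1600, 50, 100], [100, 100, 200]),
]

# Normalize each array, then collect the replicate values gene by gene.
normalized = [median_center(log_ratios(red, green)) for red, green in arrays]
per_gene = list(zip(*normalized))

t_gene_a = one_sample_t(per_gene[0])
```

Here the first gene has normalized log ratios of 3, 4 and 5 across the three arrays, giving a large t statistic, while the other two genes sit at zero after normalization. With only a handful of replicates the ordinary variance estimate is unstable, which is the problem that moderated statistics such as Limma's are designed to address.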
Best of all, all the SAOPMD results, including the original data, are automatically saved to MyOPMD, an online microarray data management system. Data saved in MyOPMD are visible only to the owner but can also be securely shared with other designated users. If the data are later used in your publications, they can be assigned a "public" status, and a unique ID will be generated automatically for easy referencing in a consistent URL format (e.g., http://www.brop.org/idn:11316418814083).
Dewhirst FE, Chen T, Izard J, Paster BJ, Tanner AC, Yu WH, Lakshmanan A, Wade WG. The human oral microbiome. J Bacteriol. 2010 Oct;192(19):5002-17. Epub 2010 Jul 23.
Chen T, Yu WH, Izard J, Baranova OV, Lakshmanan A, Dewhirst FE. The Human Oral Microbiome Database: a web accessible resource for investigating oral microbe taxonomic and genomic information. Database (Oxford). 2010 Jul 6;2010:baq013. Print 2010.
Yu WH, Høvik H, Chen T. A hidden Markov support vector machine framework incorporating profile geometry learning for identifying microbial RNA in tiling array data. Bioinformatics. 2010 Jun 1;26(11):1423-30.
King WF, Chen T, Nogueira R, Mattos-Graner R, Smith DJ. (2010) Epitopes shared among pioneer oral flora and Streptococcus mutans GbpB. 1st Tohoku-Forsyth Symposium, Boston, MA, March 10-11, 2009. In: Sasano T, Suzuki O. (eds). Interface Oral Health Science 2009. (In press).
Downes J, Vartoukian S, Dewhirst FE, Izard J, Chen T, Yu W, Wade WG. (2009) Pyramidobacter piscolens gen. nov., sp. nov., a member of the phylum 'Synergistetes' isolated from the human oral cavity. Int. J. Syst. Evol. Microbiol. 59(Pt. 5):972-980.
Downes J, Vartoukian S, Chen T. (2006) DNA Microarrays—An armory for combating infectious diseases in the new century. Infect. Disord. Drug Targets 6(3):263–279.
Chen T, Abbey K, Deng W-J, Cheng M-C. (2005) The Bioinformatics Resource for Oral Pathogens. Nucleic Acids Res. 33(Web Server Issue):W734–W740.
Dewhirst FE, Shen Z, Scimeca MS, Stokes LN, Boumenna T, Chen T, Paster BJ, Fox JG. (2005) Discordant 16S and 23S rRNA gene phylogenies for the genus Helicobacter: Implications for phylogenetic inference and systematics. J. Bacteriol. 187(17):6106–6118.
Wen-Han Yu, M.S.