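- My research centers on computer-aided drug design and structural bioinformatics. In computer-aided drug design, we are one of the providers of molecular docking software (GEMDOCK), and we collaborate with more than ten laboratories at home and abroad. Biological experiments have confirmed that our methods can identify lead compounds for drug targets (e.g., envelope protein, shikimate kinase, and influenza virus neuraminidase), functional catalytic sites (e.g., β-lactoglobulin), and protein function designs (e.g., converting endo-chitosanase to exo-chitosanase). These results have been published in several papers in the leading journals of these fields, and the papers have been cited more than 70 times since 2004 (according to ISI). In addition, GEMDOCK is used by dozens of laboratories at home and abroad, for teaching as well as research. These achievements earned us the 2007 National Innovation Award (organized by the Institute for Biotechnology and Medicine Industry); we were the only team to win with computer software.
- In structural bioinformatics, we have obtained substantial results in protein structure prediction (PS2), fast protein structure search and its applications (3D-BLAST), and structure-based protein networks (3D-partner). This work has been published in the leading journals of the field (e.g., Nucleic Acids Research and Genome Biology). The most notable result is 3D-BLAST, which searches protein structures as fast as BLAST searches amino acid sequences while retaining BLAST's strengths and interface: a reliable statistical basis (E-value) and highly efficient search. 3D-BLAST may therefore become a standard for protein structure search and have a major impact on the field, and it has already drawn attention and discussion from researchers in the area. Since 2007 these papers have been cited more than 13 times, and our services have been used more than 5,100 times by users in 44 countries.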
- CELLO -- protein subcellular localization predictor
- pKNOT -- the first knotted protein server
- We aim to develop various evolutionary computation algorithms and optimization methodologies, and provide user-friendly tools for system optimization. A representative paper is as follows: S.-Y. Ho*, L.-S. Shu and J.-H. Chen, "Intelligent Evolutionary Algorithms for Large Parameter Optimization Problems," IEEE Trans. Evolutionary Computation, vol. 8, no. 6, pp. 522-541, Dec. 2004. (Highly cited article)
- Based on the expertise of intelligent computation, we develop various bio-inspired optimization algorithms for computational proteomics, computational systems biology, computational biology, bioinformatics, etc.
- We establish national/international interdisciplinary collaboration with biologists to investigate systems biology and validate our proposed models.
- We are interested in establishing a user-friendly system for computer-aided vaccine design.
(Electrocardiology and Cardiovascular Bioinformatics Lab – PI: Dr. Ten-Fang Yang)
- Multiple Sequence Alignment with Constraints (MuSiC, MuSiC-ME, RE-MuSiC)
We are the first to propose the concept of constrained sequence alignment, which allows biologists to incorporate their knowledge about the structures, functions, or consensus features of their datasets into sequence alignment. When known functionally, structurally, or evolutionarily related residues/nucleotides of the input sequences are specified as constraints, the output is an optimal sequence alignment under the condition that the user-specified residues/nucleotides are aligned together, so that the alignment more accurately reflects the true biological relationships among the input sequences. In this work, we first designed an efficient algorithm for computing a constrained alignment of multiple sequences and developed a web server, called MuSiC (Multiple Sequence alignment with Constraints), for online analysis. Using MuSiC, we successfully located the fragment of the SARS RNA sequence that is capable of folding into a stable RNA secondary structure with a pseudoknot responsible for the replication of SARS viruses (Bioinformatics, 20:2309-2311, 2004). We then developed a memory-efficient version, called MuSiC-ME (Memory-Efficient Multiple Sequence alignment with Constraints), that allows users to align multiple sequences of up to several thousand residues (Bioinformatics, 21:20-30, 2005). More recently, we developed RE-MuSiC by extending the constraint formulation of MuSiC to regular expressions, which are convenient for expressing many biologically significant patterns, such as those collected in the PROSITE database, or structural consensuses that often involve variable ranges between conserved parts. Experiments demonstrate that RE-MuSiC can be used to help predict important residues and locate evolutionarily conserved structural elements (Nucleic Acids Research, 35:W639-644, 2007).
Figure 1: Three GST (Glutathione S-Transferase) proteins: The structural similarity between these three proteins is very high, but their pairwise sequence identities are extremely low.
Figure 2: The constrained sequence alignment produced by RE-MuSiC, using the pattern of "[ST]-x(2)-[DE]" as the constraint, in which the residues shaded in yellow match the pattern. In addition, the residues in green boxes that correspond to the active sites shared by these three GST proteins are aligned together.
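To illustrate how a PROSITE-style pattern such as "[ST]-x(2)-[DE]" can act as a regular-expression constraint, the following minimal Python sketch translates a pattern into a regular expression and reports where it matches in a sequence. It is not the RE-MuSiC implementation: the function names `prosite_to_regex` and `find_motif` and the example sequence are hypothetical, and only the most common PROSITE elements are handled (`x`, `[...]`, `{...}`, repetition counts, and the `<` / `>` terminal anchors).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration; covers only common elements.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # '<' anchors the pattern at the N-terminus, '>' at the C-terminus.
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.strip("<>")

        # Split off an optional repetition such as (2) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"cannot parse element: {element!r}")
        core, low, high = match.groups()

        if core in ("x", "X"):
            core = "."                      # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # any residue except those listed
        # A bracketed set like [ST] or a single letter is already valid regex.

        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_motif(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every occurrence, overlaps included."""
    regex = prosite_to_regex(pattern)
    # The lookahead lets overlapping occurrences be reported as well.
    return [(m.start(), m.group(1)) for m in re.finditer(f"(?=({regex}))", sequence)]


if __name__ == "__main__":
    toy_sequence = "MKSAAEGTLLDK"  # made-up sequence, not one of the GST proteins
    print(prosite_to_regex("[ST]-x(2)-[DE]"))
    print(find_motif(toy_sequence, "[ST]-x(2)-[DE]"))
```

In a constrained alignment, positions found this way in each input sequence would be the candidates that must be aligned together; choosing among multiple candidates per sequence is part of the alignment algorithm itself and is not shown here.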
- [MicroRNA Regulation: Databases and Tools]
Recent work has demonstrated that microRNAs (miRNAs) are involved in critical biological processes by suppressing the translation of coding genes. To facilitate the investigation of microRNA regulation, we developed several biological databases and computational tools in this important field. Six articles in this field were published in Nucleic Acids Research (2007 SCI Impact = 6.954). miRNAMap was selected as hot research in the 2006 NAR Database Issue. miRNAMap provides genomic maps of microRNA genes and their targets in metazoan genomes (Nucl Acids Res, 2006; Nucl Acids Res, 2008). ViTa is a database of host microRNA targets on viruses (Nucl Acids Res, 2007). We also surveyed the literature to extract RNA structural motifs and their functions in order to construct the RegRNA database (Nucl Acids Res, 2006). The RNAMST web server was developed for searching RNA structural homologs (Nucl Acids Res, 2006). RNALogo is designed as a new approach to displaying structural RNA alignments (Nucl Acids Res, 2008). These databases and tools were cited more than 53 times during the last two years.
- [Protein Post-translational Modification: Database and Tools]
Protein post-translational modification (PTM) plays an essential role in cellular control mechanisms that adjust protein physical and chemical properties, folding, conformation, stability, and activity, thereby also altering protein function. Four articles in this field were published in Nucleic Acids Research (2007 SCI Impact = 6.954). dbPTM is a comprehensive information repository of protein post-translational modifications (Nucl Acids Res, 2006). Furthermore, we developed KinasePhos [1.0, 2.0], a web tool for identifying protein kinase-specific phosphorylation sites (Nucl Acids Res, 2005, 2007; J. Comp Chem 2005). In addition, ProKware was designed as an integrated software tool for presenting protein structural properties in protein tertiary structures (Nucl Acids Res, 2006). These databases and tools were cited more than 50 times in total during the last three years.
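- Theory of protein structure and reactions: we use computational chemistry methods such as ab initio theory, density functional theory, semi-empirical methods, and molecular mechanics, alone or in hybrid form, to study the theoretical properties of protein conformations and related biochemical reactions.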
- Many Saccharomyces cerevisiae duplicate genes that were derived from an ancient whole-genome duplication (WGD) unexpectedly show a small synonymous divergence (Ks), a higher sequence similarity to each other than to orthologues in Saccharomyces bayanus, or slow evolution compared with the orthologue in Kluyveromyces waltii, a non-WGD species. This decelerated evolution was attributed to gene conversion between duplicates. Using ≈300 WGD gene pairs in four species and their orthologues in non-WGD species, we show that codon-usage bias and protein-sequence conservation are two important causes for decelerated evolution of duplicate genes, whereas gene conversion is effective only in the presence of strong codon-usage bias or protein-sequence conservation. Furthermore, we find that change in mutation pattern or in tDNA copy number changed codon-usage bias and increased the Ks distance between K. waltii and S. cerevisiae. Intriguingly, some proteins showed fast evolution before the radiation of WGD species but little or no sequence divergence between orthologues and paralogues thereafter, indicating that functional conservation after the radiation may also be responsible for decelerated evolution in duplicates.
Y.-S. Lin, J.K. Byrnes, J.-K. Hwang, and W.-H. Li*. (2006). Codon usage bias versus gene conversion in the evolution of yeast duplicate genes. Proc. Natl. Acad. Sci. USA. 103: 14412-14416.
- We used pluripotent P19 cells to study the function of microtubule-associated proteins during neuritogenesis. Multi-dimensional protein identification technology (one type of gel-free, high-throughput proteomics) was performed on microtubule-associated proteins prepared before versus shortly after neurite induction. More than 800 proteins were consistently identified in both proteomes. Surprisingly, when these two proteomes were quantitatively compared, the majority of the proteome remained unchanged. Substantial changes in the microtubule-associated proteome occurred at the level of individual proteins. Based on our proteomic results, we assayed primary neurons using RNA interference to identify a novel inhibitory role for the protein TRIM2 in neurite elongation.