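- My research centers on computer-aided drug design and structural bioinformatics. In computer-aided drug design, we are one of the providers of molecular docking software (GEMDOCK). We also collaborate with more than ten laboratories at home and abroad, and biological experiments have confirmed that our methods can identify lead compounds for drug targets (such as envelope protein, shikimate kinase, and influenza virus neuraminidase), functional catalytic sites (such as β-lactoglobulin), and protein function designs (such as endo-chitosanase to exo-chitosanase). These results have been published in multiple papers in the leading journals of these fields. Since 2004 these papers have been cited more than 70 times (according to ISI). In addition, GEMDOCK is used by dozens of laboratories at home and abroad, in teaching as well as in research. These achievements earned us the 2007 National Innovation Award (organized by the Institute for Biotechnology and Medicine Industry); we were the only team to win with computer software.
- In structural bioinformatics, we have achieved substantial results in protein structure prediction (PS2), fast protein structure search and its applications (3D-BLAST), and structure-based protein networks (3D-partner). This work has been published in leading journals of the field (such as Nucleic Acids Research and Genome Biology). The most notable result is 3D-BLAST, which searches protein structures as fast as BLAST searches amino acid sequences and shares the strengths and interface of BLAST: a reliable statistical basis (E-value) and highly efficient search. 3D-BLAST may therefore become a standard for protein structure search and have a major impact on the field; the work has already attracted attention and discussion among researchers. Since 2007 these papers have been cited more than 13 times, and our services have been used more than 5,100 times from 44 countries.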
- CELLO -- protein subcellular localization predictor
- pKNOT -- the first knotted protein server
- We aim to develop various evolutionary computation algorithms and optimization methodologies, and provide user-friendly tools for system optimization. A representative paper is as follows: S.-Y. Ho*, L.-S. Shu and J.-H. Chen, "Intelligent Evolutionary Algorithms for Large Parameter Optimization Problems," IEEE Trans. Evolutionary Computation, vol. 8, no. 6, pp. 522-541, Dec. 2004. (Highly cited article)
- Based on the expertise of intelligent computation, we develop various bio-inspired optimization algorithms for computational proteomics, computational systems biology, computational biology, bioinformatics, etc.
- We establish national/international interdisciplinary collaboration with biologists to investigate systems biology and validate our proposed models.
- We are interested in establishing a user-friendly system for computer-aided vaccine design.
(Electrocardiology and Cardiovascular Bioinformatics Lab – PI: Dr. Ten-Fang Yang)
- Multiple Sequence Alignment with Constraints (MuSiC, MuSiC-ME, RE-MuSiC)
We were the first to propose constrained sequence alignment, which lets biologists incorporate their knowledge about the structures, functions, or consensuses of their datasets into sequence alignment. The user specifies known functionally, structurally, or evolutionarily related residues/nucleotides of the input sequences as constraints, and the output is an optimal sequence alignment under the condition that the user-specified residues/nucleotides are aligned together, so that the alignment more accurately reflects the true biological relationships among the input sequences. In this study, we first designed an efficient algorithm for computing a constrained alignment of multiple sequences and developed a web server, called MuSiC (Multiple Sequence alignment with Constraints), for online analysis. Using MuSiC, we successfully located the fragment of the SARS RNA sequence that is capable of folding into a stable RNA secondary structure with a pseudoknot responsible for the replication of SARS viruses (Bioinformatics, 20:2309-2311, 2004). We then developed a memory-efficient version, called MuSiC-ME (Memory-Efficient Multiple Sequence alignment with Constraints), that allows users to align multiple sequences of up to several thousand residues (Bioinformatics, 21:20-30, 2005). More recently, we developed RE-MuSiC by extending the constraint formulation of MuSiC to regular expressions, which are convenient for expressing many biologically significant patterns, such as those collected in the PROSITE database, or structural consensuses that often involve variable ranges between conserved parts. Experiments demonstrate that RE-MuSiC can help predict important residues and locate evolutionarily conserved structural elements (Nucleic Acids Research, 35:W639-644, 2007).
Figure 1: Three GST (Glutathione S-Transferase) proteins: The structural similarity between these three proteins is very high, but their pairwise sequence identities are extremely low.
Figure 2: The constrained sequence alignment produced by RE-MuSiC, using the pattern of "[ST]-x(2)-[DE]" as the constraint, in which the residues shaded in yellow match the pattern. In addition, the residues in green boxes that correspond to the active sites shared by these three GST proteins are aligned together.
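To make the constraint notation concrete, the following minimal Python sketch translates a PROSITE-style pattern such as "[ST]-x(2)-[DE]" into an ordinary regular expression and reports where it occurs in a sequence. It illustrates only the pattern syntax, not the RE-MuSiC alignment algorithm; the function names `prosite_to_regex` and `find_matches` and the toy sequence are assumptions made for this example.

```python
import re

# One PROSITE pattern element, e.g. "x(2)", "[ST]", "{P}", "C", "x(2,4)".
_ELEMENT = re.compile(
    r"(<?)"                              # optional N-terminal anchor
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})"   # any residue, one residue, allowed set, excluded set
    r"(?:\((\d+)(?:,(\d+))?\))?"         # optional repeat count (n) or range (n,m)
    r"(>?)"                              # optional C-terminal anchor
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = match.groups()
        if core == "x":
            atom = "."                        # x matches any residue
        elif core.startswith("{"):
            atom = "[^" + core[1:-1] + "]"    # {ABC} excludes residues A, B, C
        else:
            atom = core                       # single residue or [ABC] set
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high is not None else "") + "}"
        parts.append(("^" if start else "") + atom + repeat + ("$" if end else ""))
    return "".join(parts)


def find_matches(pattern: str, sequence: str):
    """Return (0-based start, matched fragment) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    print(prosite_to_regex("[ST]-x(2)-[DE]"))
    print(find_matches("[ST]-x(2)-[DE]", "MKSAAEL"))  # toy sequence, not real data
```

In a constrained alignment, the positions found this way in each input sequence would be the residues required to line up in the same columns.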
- [MicroRNA Regulation: Databases and Tools]
Recent work has demonstrated that microRNAs (miRNAs) are involved in critical biological processes by suppressing the translation of coding genes. To facilitate the investigation of microRNA regulation, we developed several biological databases and computational tools in this important field. Six articles in this field were published in Nucleic Acids Research (2007 SCI impact factor = 6.954). miRNAMap, which was selected as hot research in the 2006 NAR Database Issue, provides genomic maps of microRNA genes and their targets in metazoan genomes (Nucl Acids Res, 2006, Nucl Acids Res, 2008). ViTa is a database of host microRNA targets on viruses (Nucl Acids Res, 2007). We also surveyed the literature to extract RNA structural motifs and their functions and constructed the RegRNA database (Nucl Acids Res, 2006). The RNAMST web server was developed for searching RNA structural homologs (Nucl Acids Res, 2006). RNALogo is designed as a new approach to displaying structural RNA alignments (Nucl Acids Res, 2008). These databases and tools were cited more than 53 times during the last two years.
- [Protein Post-translational Modification: Database and Tools]
Protein post-translational modification (PTM) plays an essential role in cellular control mechanisms that adjust protein physical and chemical properties, folding, conformation, stability, and activity, thus also altering protein function. Four articles in this field were published in Nucleic Acids Research (2007 SCI impact factor = 6.954). dbPTM is a comprehensive information repository of protein post-translational modifications (Nucl Acids Res, 2006). Furthermore, we developed KinasePhos [1.0, 2.0], a web tool for identifying protein kinase-specific phosphorylation sites (Nucl Acids Res, 2005, 2007, J. Comp Chem 2005). In addition, ProKware was designed as integrated software for presenting protein structural properties in protein tertiary structures (Nucl Acids Res, 2006). These databases and tools were cited more than 50 times in total during the last three years.
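- Theory of protein structure and reactions: we use computational chemistry methods such as ab initio theory, density functional theory, semi-empirical methods, and molecular mechanics, or hybrids of these, to investigate the theoretical properties of protein conformations and related biochemical reactions.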
- Many Saccharomyces cerevisiae duplicate genes that were derived from an ancient whole-genome duplication (WGD) unexpectedly show a small synonymous divergence (Ks), a higher sequence similarity to each other than to orthologues in Saccharomyces bayanus, or slow evolution compared with the orthologue in Kluyveromyces waltii, a non-WGD species. This decelerated evolution was attributed to gene conversion between duplicates. Using ≈300 WGD gene pairs in four species and their orthologues in non-WGD species, we show that codon-usage bias and protein-sequence conservation are two important causes for decelerated evolution of duplicate genes, whereas gene conversion is effective only in the presence of strong codon-usage bias or protein-sequence conservation. Furthermore, we find that change in mutation pattern or in tDNA copy number changed codon-usage bias and increased the Ks distance between K. waltii and S. cerevisiae. Intriguingly, some proteins showed fast evolution before the radiation of WGD species but little or no sequence divergence between orthologues and paralogues thereafter, indicating that functional conservation after the radiation may also be responsible for decelerated evolution in duplicates.
Y.-S. Lin, J.K. Byrnes, J.-K. Hwang, and W.-H. Li*. (2006). Codon usage bias versus gene conversion in the evolution of yeast duplicate genes. Proc. Natl. Acad. Sci. USA. 103: 14412-14416.
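Codon-usage bias, which the summary above identifies as one cause of slow synonymous divergence, can be quantified in several ways. The summary does not say which measure the paper used, so the minimal Python sketch below swaps in relative synonymous codon usage (RSCU), a common general-purpose measure: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu` and the toy sequence are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for an in-frame coding sequence.

    Returns {codon: RSCU}; 1.0 means no bias, values above 1.0 mean the
    codon is used more often than expected under equal synonymous usage.
    Amino acids absent from the sequence, and stop codons, are omitted.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        families[amino_acid].append(codon)

    values = {}
    for amino_acid, family in families.items():
        if amino_acid == "*":
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)        # equal usage across synonyms
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


if __name__ == "__main__":
    # Toy sequence: leucine encoded three times by CTG and once by TTA.
    for codon, value in sorted(rscu("CTGCTGCTGTTA").items()):
        print(codon, round(value, 2))
```

A gene whose RSCU values sit far from 1.0 for many amino acids is strongly biased, and under the argument above such genes are expected to accumulate synonymous changes more slowly.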
- We used pluripotent P19 cells to study the function of microtubule-associated proteins during neuritogenesis. Multi-dimensional protein identification technology (a type of gel-free, high-throughput proteomics) was performed on microtubule-associated proteins prepared before versus shortly after neurite induction. More than 800 proteins were consistently identified in both proteomes. Surprisingly, when these two proteomes were quantitatively compared, the majority of the proteome remained unchanged. Substantial changes in the microtubule-associated proteome occurred at the level of individual proteins. Based on our proteomic results, we assayed primary neurons using RNA interference and identified a novel inhibitory role for the protein TRIM2 in neurite elongation.