Applying advanced algorithms and machine learning techniques to clinical proteomics

2022-02-25

Host : Prof. Hyojin Sung

Biography

Dr. Sangtae Kim is the Chief Technology Officer at Bertis Bioscience, where he leads the company’s R&D and innovation in product development. Prior to his current position, he was a principal scientist at Seer, a staff bioinformatics scientist at Illumina, and a senior research scientist at Pacific Northwest National Laboratory. Dr. Kim has dedicated his career to developing innovative algorithms and software for high-throughput biological data analysis, ranging from genomic variant calling via next-generation sequencing to protein identification via mass spectrometry. He is well known in the scientific community as the developer of two widely used software tools, MS-GF+ and Strelka2. He has authored over 25 research papers, and his publications have received over 4,300 citations. Dr. Kim received his B.S. and M.S. from Seoul National University and his Ph.D. from the University of California, San Diego, all in computer science.

Abstract

Proteomics is becoming increasingly popular in basic, translational, and clinical research. The success of proteomics research depends on computational tools that effectively and efficiently mine information from large proteomics datasets. In this talk, I will give an overview of the basic concepts of interpreting mass spectrometry data and share my experience in developing MS-GF+, a popular bioinformatics tool for peptide identification. In particular, I will explain how a dynamic programming algorithm enabled unbiased estimation of the statistical significance of peptide identifications. Then, I will introduce recent progress at Bertis on applying machine learning and explainable artificial intelligence techniques to the interpretation of clinical proteomics data.

 
