
Thanh Dat Nguyen

Researcher in Bioinformatics & Biotechnology

Thanh Dat Nguyen is a former researcher at the Research Center for Genetics and Reproductive Health, University of Health Sciences, Vietnam National University – Ho Chi Minh City (VNU-HCM). He holds a B.Sc. in Biotechnology from Tan Tao University (2018–2022), where his thesis focused on developing a risk assessment system for predicting lung metastasis in breast cancer. Currently, he is pursuing a master’s program at the Department of Artificial Intelligence, Chang Gung University, Taoyuan City, Taiwan. A dedicated researcher in biotechnology and bioinformatics, he is passionate about advancing cancer diagnostics, genetic analysis, and machine learning applications in healthcare. Through his graduate studies, he aims to enhance his expertise and contribute to innovative research in medical AI and data science.

Thanh Dat has co-authored multiple research publications and presented his work at various scientific conferences.

Download CV (PDF)

Research Highlights

  • Bioinformatics for cancer diagnosis and treatment: Conducting genomic and transcriptomic analyses to discover biomarkers and improve cancer diagnostics.
  • Genetic analysis of hereditary diseases: Investigating genetic risk factors and mutations in hereditary conditions to understand disease mechanisms.
  • Machine learning applications in healthcare: Developing predictive models (e.g., for cancer diagnosis, cancer metastasis, and blood disorders) to support early detection and personalized treatment.

Research Projects

  • Validity of Red Cell Distribution Width Index to Differentiate between Iron Deficiency Anemia and Thalassemia – Member, Vietnam National University HCMC (2025–2027)
  • Screening Nature’s Melanin Inhibitors: Medicinal Plants for Skin Brightening – Member, Vietnam National University HCMC (2024–2026)
  • Building a Machine Learning Model for Diagnosis of Hepatocellular Carcinoma – Member, Vietnam National University HCMC (2024–2026)
  • Develop a method for early detection of prostate cancer based on the expression of miRNA in serum – Member, Vietnam National University HCMC (2023–2025)
  • Prediction of Lung Metastasis in Breast Cancer Patients – Member, Vietnam National University HCMC (2023–2025)
  • In silico screening of potential antidiabetic phytochemicals from Ipomoea batatas (L.) leaf against multiple therapeutic targets of T2DM – Member, Vietnam National University HCMC (2023–2025)
  • GADD45B Expression Is Associated with Risk of Bone Metastasis and Survival Outcome in Breast Cancer Patients – Principal Investigator, Tan Tao University (2021–2022)

Contact

Name: Thanh Dat Nguyen

Email: M1461025@cgu.edu.tw

Phone: (+886) 989 895 917

Affiliation: Master’s Student, Department of Artificial Intelligence, Chang Gung University

Address: No. 259, Wenhua 1st Rd, Guishan District, Taoyuan City, Taiwan

Publications

2025

Gene expression-based machine learning model for diagnosis, prognosis, and treatment response prediction in hepatocellular carcinoma: a retrospective study
Tan Thinh Nguyen *, Thanh Dat Nguyen *, Phu Qui Le Nguyen, Phuong Thi Bui, Minh Nam Nguyen
Journal of Yeungnam Medical Science (DOI: 10.12701/jyms.2026.43.21), First Online: March 4, 2026

Background: Hepatocellular carcinoma (HCC) remains a leading cause of cancer-related mortality worldwide, largely because of challenges in early diagnosis and the limited sensitivity of conventional biomarkers. Therefore, reliable molecular tools for early detection, prognostic stratification, and individualized treatment predictions are urgently required. Methods: This retrospective study analyzed publicly available gene expression datasets. Candidate biomarkers were identified from the GSE14520 cohort using a multistep screening workflow that integrated differential expression analysis, diagnostic performance, and prognostic relevance. A 10-gene diagnostic model was constructed using least absolute shrinkage and selection operator logistic regression and subsequently validated across multiple independent cohorts. Survival outcomes were evaluated using the Kaplan-Meier analysis and treatment responses to sorafenib and transarterial chemoembolization (TACE) were assessed using receiver operating characteristic analysis. Results: A 10-gene signature (TOP2A, CDK1, CYP3A4, MASP2, EPHX2, HAO1, RACGAP1, GLYAT, ADH1B, and CYP4A11) was established. The model demonstrated robust internal performance and consistent accuracy across external validation cohorts (area under the curve [AUC], >0.9). This signature effectively identified early-stage HCC and distinguished malignancy from cirrhosis. High-risk scores were significantly associated with poor overall survival and recurrence-free survival (p < 0.05). Furthermore, the model could predict treatment sensitivity, with higher risk scores associated with better outcomes for sorafenib (AUC, 0.791), whereas lower risk scores correlated with an improved response to TACE (AUC, 0.768). Conclusion: Our gene expression-based machine learning model provides a robust tool for HCC diagnosis, prognosis, and treatment response prediction, with potential as a supportive system for personalized clinical decision-making.

Prediction of Lung Metastasis in Breast Cancer Patients Using Machine Learning Classifiers
Thanh Dat Nguyen, Quynh-Mai Thi Nguyen, Tuong Van Nguyen, Phuong Thi Bui, Kim Nhuong Thi Nguyen, Minh Nam Nguyen
The Journal of Molecular Diagnostics (DOI: 10.1016/j.jmoldx.2025.10.010), First Online: 27 November 2025

Breast cancer is the most common cancer among women, and metastasis to the lung is associated with poor prognosis. Reliable biomarkers for predicting lung metastasis are urgently needed to improve early detection and clinical decision-making. This study used microarray data sets comprising gene expression profiles and clinical data from primary breast cancer patients who were followed up for lung metastasis outcomes. High-throughput screening combined with Venn diagram analysis was used to identify common candidate probes, and the least absolute shrinkage and selection operator method was used to select 11 genes for model development. Logistic regression was used to construct predictive models, and the final risk signature consisted of 10 candidate genes (CDK19, GLUD1, GTPBP4, HLCS, HYI, KCND3, MAP2K1, NMUR1, PRKD3, and SLC16A3). The model achieved strong performance in training and validation cohorts (areas under the curve > 0.87) and generalized to the independent METABRIC data set (area under the curve = 0.706). Subset analyses restricted to patients with early-stage disease confirmed that the signature retained predictive value. Kaplan-Meier analyses showed that patients with high-risk scores had shorter lung metastasis–free survival, recurrence-free survival, and overall survival. Multivariate Cox analysis confirmed that the risk signature provided independent predictive information from clinical variables. In conclusion, the risk signature accurately identifies patients with breast cancer at risk of lung metastasis, enabling clinicians to better assess risk and tailor effective treatment strategies.

Application of Machine Learning in Predicting Lung Metastasis in Breast Cancer Patients
Tuong Van Nguyen, Quynh Mai T. Nguyen, Thanh Dat Nguyen, Minh Linh Nguyen, Minh Nam Nguyen
IFMBE Proceedings (IFMBE 2024, volume 122), First Online: 05 June 2025

Background: Breast cancer is the most common cancer in women, and the lung is one of the frequent sites for metastasis in breast cancer. Identifying biomarkers to predict lung metastasis is essential for early detection and targeted intervention, thereby improving survival rates for patients with lung metastatic breast cancer. Method: Four datasets (E-MTAB-365, GSE2603, GSE11078, and GSE14020) from NCBI and Array Express databases, containing gene expression profiles and clinical data from breast cancer patients, were selected. High-throughput screening identified potential biological markers by evaluating the predictive ability of each gene for lung metastasis, and a Venn diagram was used to find common genes across datasets with an AUC > 0.65. Using the linear regression algorithm, we developed a gene-based model to predict the risk of lung metastasis with E-MTAB-365 as the training dataset and the remaining datasets for validation. Model performance was evaluated through ROC curve analysis and the Kaplan-Meier curve. Results: The model, consisting of three genes (IRAK1, ATP11A, and LYN), achieved a good AUC across the analyzed datasets. The model demonstrated that patients with lung metastasis had significantly higher risk scores than those without. In addition, patients with high-risk scores faced a higher risk of lung metastasis and a shorter non-metastatic survival time. Conclusion: We successfully built a machine-learning model based on gene expression to predict lung metastasis in breast cancer patients and validated its reliability. Our results suggest the potential application of the established model as a predictive tool to assist physicians in practical diagnosis.
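The screening step described in this abstract (score each gene by its AUC for lung metastasis, keep genes above 0.65, then intersect across datasets) can be illustrated with a minimal sketch. The gene names, expression values, and dictionary layout below are hypothetical toy data, not the study's data or code, and the sketch only scores genes whose higher expression indicates metastasis, which is a simplification.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive sample ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def screen(dataset, threshold=0.65):
    """Genes whose expression alone gives AUC above the threshold."""
    labels = dataset["labels"]
    return {
        gene
        for gene, expr in dataset["expression"].items()
        if auc(expr, labels) > threshold
    }


def common_genes(datasets, threshold=0.65):
    """Intersection of screened genes across datasets (the Venn-diagram core)."""
    if not datasets:
        return set()
    selected = [screen(d, threshold) for d in datasets]
    return set.intersection(*selected)


# Hypothetical toy datasets: label 1 = lung metastasis, 0 = no metastasis.
toy_a = {
    "labels": [1, 1, 0, 0],
    "expression": {"GENE_A": [5, 6, 1, 2], "GENE_B": [1, 5, 2, 6]},
}
toy_b = {
    "labels": [1, 0, 1, 0],
    "expression": {"GENE_A": [4, 1, 3, 2], "GENE_B": [9, 1, 8, 2]},
}

common = common_genes([toy_a, toy_b])
print(sorted(common))
```

In this toy case GENE_B passes the threshold in only one dataset, so only GENE_A survives the intersection.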

Long-Term Mental Health Impact of COVID-19 on Pregnant Women in Vietnam
Minh Nam Nguyen, Huy Dung Tran, Kim Nhuong T. Nguyen, Thanh Dat Nguyen, Dieu Hien T. Huynh, Thu Suong T. Nguyen, Bao An H. Nguyen, Giau Van Vo
IFMBE Proceedings (IFMBE 2024, volume 123), First Online: 05 June 2025

The adverse effects of COVID-19 infections during pregnancy have been extensively documented; however, the persistent sequelae of this infection remain unexplored. This study aimed to investigate the mental health status of women who recovered from COVID-19 infection during pregnancy using the Depression, Anxiety, and Stress Scale (DASS-21) questionnaire. A cross-sectional observational study enrolled 104 women, including 56 participants who recovered from COVID-19 infection during pregnancy (COVID-19 pregnant women, CPW) and 48 controls who were never infected with COVID-19 during pregnancy (non-COVID-19 pregnant women, NCPW) from November 15th, 2022, to February 15th, 2023, at the Respiratory Department in Children’s Hospital 1, Ho Chi Minh City. Data were collected through a questionnaire that gathered general participant information and included the DASS-21 questionnaire. Results showed that 77% of the CPW reported at least one long-term symptom, with fatigue (62.50%), sore throat (60.71%), and cough (57.14%) being the most common. CPW exhibited significantly higher stress, anxiety, and depression scores compared to NCPW (p < 0.0108, p < 0.0006 and p < 0.0007, respectively). Depression correlated significantly with residence (p < 0.0394). Spearman correlation analysis indicated positive associations between depression, anxiety, and stress scores in CPW. These findings suggest that COVID-19 infection during pregnancy exerts a notable impact on mental health outcomes, including anxiety, depression, and stress. Further investigations should be performed to elucidate potential mechanisms underlying these findings to develop interventions supporting maternal and infant mental health.

Insights into the epidemiology and clinical aspects of post-COVID-19 conditions in adults
Dieu Hien T. Huynh *, Dat T. Nguyen *, Thu Suong T. Nguyen, Bao An H. Nguyen, Anh T. T. Huynh, Vy N. N. Nguyen, Dat Q. Tran, Thi N. N. Hoang, Huy Dung Tran, Dao Thanh Liem, Giau V. Vo, Minh Nam Nguyen
Chronic Illness, 21(1):157-169. Published: March 2025

Objectives: While most individuals infected with COVID-19 recover completely within a few weeks, some continue to experience lingering symptoms. This study was conducted to identify and describe the clinical and subclinical manifestations of adult patients with long-term effects of COVID-19. Methods: The study analyzed 205 medical records of inpatients (age ≥ 16 years, ≥ 4 weeks post-COVID-19 recovery, and a negative SARS-CoV-2 status at enrollment) at Thong Nhat Hospital, Vietnam, from 6 September 2021 to 26 August 2022, using R software. Results: The majority of patients hospitalized with long COVID-19 symptoms (92.68%) had normal consciousness. The most common symptoms on admission were fatigue (59.02%), dyspnea (52.68%), and cough (42.93%). In total, 80% of patients had respiratory symptoms, primarily dyspnea, while 42.44% reported neurological symptoms, with sleep disturbance being the most common. Notably, 42.93% of patients experienced respiratory failure in the post-COVID-19 period, resembling acute respiratory distress syndrome. Discussion: These findings provide crucial insights into the epidemiology, clinical, and subclinical aspects of post-COVID-19 conditions, shedding light on the prevalence of common symptoms and the demographic distribution of affected patients. Understanding these manifestations is vital for patient well-being, improved clinical practice, and targeted healthcare planning, potentially leading to better patient care, management, and future interventions.

2024

Evaluation of Thalassemia Screening Results and Epidemiological Characteristics in Pregnant Women at Hung Vuong Hospital
Đỗ Nguyễn Thảo Vy, Nguyễn Thành Đạt, Nguyễn Lê Phú Quí, Nguyễn Vạn Thông, Phạm Nguyễn Hữu Phúc, Hứa Thị Mỹ Huyền, Trần Phương Huy, Nguyễn Nữ Hải Long, Phạm Thị Vân Anh, Nguyễn Minh Nam
Vietnam Journal of Community Medicine, Vol. 65, Special Issue 12, 153-158. Published: December 12, 2024

Objective: This study aimed to evaluate the screening rate and demographic characteristics of Thalassemia among pregnant women at Hung Vuong Hospital. Subjects and methods: A cross-sectional study was conducted on a cohort of pregnant women who underwent screening for Thalassemia using mean corpuscular hemoglobin and mean corpuscular volume indices at Hung Vuong Hospital. Results: From January 2023 to January 2024, a total of 23,015 pregnant women were screened, with 5,372 cases (23.34%) identified as high-risk for Thalassemia. Among these high-risk cases, 63 pregnant women consented to further confirmatory diagnostic testing, revealing that 38.10% of those screened as high-risk did not carry the disease gene. The prevalence of Alpha-Thalassemia was notably high, accounting for 41.27%, while Beta-Thalassemia was detected in 15.87% of cases. Approximately 4.76% of pregnant women were found to have both Alpha- and Beta-Thalassemia. Among those diagnosed with Alpha-Thalassemia, the SEA heterozygous genotype was the most prevalent (68.97%). In the Beta-Thalassemia group, the HBE heterozygous genotype was the most common, comprising 34.48% of cases. Conclusion: The rate of Thalassemia gene carriers is high, especially among ethnic minorities, with Alpha-Thalassemia being predominant. Expanding screening and appropriately managing iron deficiency play a crucial role in improving the health of pregnant women.

2023

Prediction of Lung Metastasis Risk in Breast Cancer Patients Based on CFAP410 Expression
Hoang Dang Hieu, Nguyen Thanh Dat, Nguyen Thi Quynh Mai, Nguyen Minh Nam
Thai Nguyen University Journal of Science and Technology, 228(13): 148–156. Published: August 4, 2023

Breast cancer is the most common cancer in women, but the main cause of death in breast cancer is metastasis to other organs in the body. Among these, the lung is one of the common sites of breast cancer metastasis. Determining whether breast cancer has metastasized or not will make the treatment of patients easier. Therefore, the development of a marker to predict breast cancer metastasis has the potential to facilitate more effective treatment and increase the survival time of breast cancer patients. In this study, we analyzed the correlation of Cilia and flagella-associated protein 410 (CFAP410) with breast cancer lung metastasis. The results showed that the expression levels of CFAP410 in breast cancer patients without lung metastasis were significantly higher than the expression levels of this gene in patients with lung metastatic breast cancer. In addition, patients with high expression of CFAP410 had a low risk of metastasis, and their survival time was also markedly higher than that of patients with low expression levels of CFAP410. Our research also shows that CFAP410 can be used as an independent marker to identify breast cancer lung metastasis early, which helps patients receive better treatment and increases survival time for breast cancer lung metastasis patients.

Predicting the severity of COVID-19 patients using the CD24-CSF1R index in whole blood samples
D. Nguyen Thanh, N. T. Thanh Giang, T. V. Le, N. M. Truong, T. V. Ngo, T. N. Lam, D. T. Nguyen, Q. H. Tran, M. N. Nguyen
Heliyon, 9(3):e13945. Published: March 2023

Coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2, has become one of the most serious public health crises worldwide. Most infected people are asymptomatic but are still able to spread the virus. People with mild or moderate illnesses are likely to recover without hospitalization, while critically ill patients face a higher risk of organ injury or even death. In this study, we aimed to identify a novel biomarker that can predict the severity of COVID-19 patients. Clinical information and RNA-seq data of leukocytes from whole blood samples with and without a COVID-19 diagnosis (n = 100 and 26, respectively) were retrieved from the National Center for Biotechnology Information Gene Expression Omnibus database. Raw data were processed using the Transcripts Per Million (TPM) method and then transformed using log2 (TPM+1) for normalization. The CD24-CSF1R index was established. Violin plots, Kaplan-Meier curves, ROC curves, and multivariate Cox proportional hazards regression analyses were performed to evaluate the prognostic value of the established index. The CD24-CSF1R index was significantly associated with ICU admission (n = 50 ICU, 50 non-ICU) and ventilatory status (n = 42 ventilation, 58 non-ventilation) with p = 4.186e-11 and p = 1.278e-07, respectively. The ROC curve produced a relatively accurate prediction of ICU admission with an AUC of 0.8524. Additionally, patients with a high index had significantly fewer mechanical ventilation-free days than patients with a low index (p = 6.07e-07). Furthermore, the established index showed a strong prognostic ability for the risk of using a ventilator in the multivariate Cox regression model (p < 0.001). The CD24-CSF1R index was significantly associated with COVID-19 severity. The established index could have potential implications for prognosis, disease severity stratification, and clinical management.
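The normalization mentioned in this abstract (TPM followed by a log2(TPM + 1) transform) can be sketched as below. This is an illustration under stated assumptions rather than the study's pipeline: the read counts and gene lengths are hypothetical toy values, and the construction of the CD24-CSF1R index itself is not described here, so the sketch stops at the transformed expression values.

```python
import math


def tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to Transcripts Per Million."""
    # Reads per kilobase: divide each gene's count by its length in kb.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    scale = sum(rpk)
    if scale == 0:
        return [0.0 for _ in rpk]
    # Rescale so that the values in the sample sum to one million.
    return [r / scale * 1e6 for r in rpk]


def log2_tpm(tpm_values):
    """Apply the log2(TPM + 1) transform; the +1 keeps zero counts defined."""
    return [math.log2(v + 1.0) for v in tpm_values]


# Hypothetical toy sample with three genes.
counts = [100, 300, 0]
lengths_bp = [1000, 3000, 2000]

values = tpm(counts, lengths_bp)
transformed = log2_tpm(values)
print(values)
print(transformed)
```

Because the first two toy genes have the same count-to-length ratio, they receive equal TPM values, and the unexpressed gene maps to exactly 0 after the transform.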

2021

GADD45B Expression Associated with Bone Metastasis Risk in Breast Cancer Patients
Nguyen Thanh Dat, Tran Quynh Hoa, Nguyen Dinh Truong, Nguyen Minh Nam
Journal of Science, Technology, and Food, 21(4), 94-109 (2021).

Bone metastatic breast cancer is an incurable disease. Approximately 80% of patients with advanced breast cancer develop bone metastasis, and these patients' life expectancy is limited to 2-3 years following diagnosis of bone metastasis. Currently, there is no effective therapy for this condition because therapeutic targets are lacking. Therefore, identifying biomarkers that can be used for the early diagnosis and prognosis of bone metastatic breast cancer is necessary. In this study, we used bioinformatics tools to analyse the expression of GADD45B and its correlation with bone metastatic breast cancer. Our results showed that GADD45B expression correlates with the risk and duration of bone metastasis of breast cancer. Breast cancer patients with high expression levels of GADD45B had a high risk of bone metastasis and poor bone metastasis-free survival outcomes. GADD45B is an independent biomarker to predict bone metastasis of breast cancer. Elevated expression of GADD45B predicts a poor prognosis in breast cancer patients. GADD45B could be a potential therapeutic target for patients with bone metastatic breast cancer. Taken together, the current results show that GADD45B is a valuable biomarker for improving diagnostic efficiency and prognosis in the age of precision medicine.

Awards & Achievements

  • MOE Taiwan Scholarship (2025): Awarded by the Ministry of Education, Taiwan, for Master's studies. (2025–2027)
  • The 4th Ho Chi Minh City Creative Award (2025): Third Prize, Field of Science and Technology (Field 6): "Application of Big Data analysis and machine learning in identifying biomarkers for the diagnosis, prognosis, and treatment of primary liver cancer." (June 2025)
  • The 28th Ho Chi Minh City Technical Creativity Contest: Third Prize: "Development of an ensemble machine learning model and a website for prenatal Thalassemia screening." (June 2025)
  • The 11th Binh Duong Province Technical Creativity Contest (2025): Second Prize on “Application of Big Data analysis and machine learning in identifying biomarkers for diagnosis, prognosis, and treatment of primary liver cancer.” (May 2025)
  • 22nd National Youth Conference on Medical Science and Technology (2024): Third Prize (Biomedical Engineering Section) for Oral presentation “An ensemble learning model for thalassemia prediction from complete blood count test.” (Dec 2024)
  • Smart City 2024 Competition:
    • First Prize (Group A: Digital Technology/IoT) – “Developing an AI-based website for prenatal screening of Thalassemia.”
    • Third Prize (Group B: Biotechnology) – “Developing software for early diagnosis, prognosis, and treatment of liver cancer.”
    (Dec 2024)
  • AI.STAR 2024 Competition:
    • Certification for Advancing to Incubation – “AI-based machine learning model and prenatal screening website for Thalassemia prediction.”
    • Certification for Advancing to Incubation – “Developing an AI-integrated website for diagnosis, prognosis, and treatment of liver cancer.”
    (Nov 2024)
  • Scientific Workshop on Digital Transformation & AI in Diagnosis and Treatment (2024):
    • Second Prize (Oral & Poster) – “Application of machine learning models for early detection and personalized treatment of liver cancer.”
    • Consolation Prize (Poster) – “In silico screening of biomarkers for early diagnosis and prognosis in non-small cell lung cancer.”
    • Impressive Poster Award – “Application of machine learning in predicting lung metastasis in breast cancer patients.”
    (May 2024)
  • Scientific Initiative 2024: Consolation Prize – “Diagnosis, prognosis and treatment prediction of liver cancer based on F12 gene expression.” (May 2024)
  • Young Science Conference “COVID-19 Vaccine: Research and Application” (2021): Second Prize for Oral presentation “Prognosis of COVID-19 severity in patients using the CD24-CSF1R index.” (Nov 2021)
  • Student Bio-Science Conference 2021: Outstanding Oral Presentation Award for “Up-regulation of GADD45B is associated with high risk of bone metastasis and poor survival outcome in breast cancer patients.” (May 2021)
  • Tan Tao University Scholarship: Full scholarship (100%) for the first year and a 70% scholarship for the subsequent years, with two additional semesters upgraded to full scholarships (100%) for academic excellence. (2018–2022)