A clinical study on using AI to enhance the consistency of embryo assessment

Inconsistent embryo assessment has long constrained the development of assisted reproductive technology. The paper summarized here combines artificial neural networks with genetic algorithms to explore whether an intelligent embryo assessment system can be built. The results show that this approach improves assessment consistency while maintaining clinical interpretability.

This study suggests that artificial intelligence (AI) can not only improve assessment accuracy but, more importantly, compensate for the unavoidable subjective differences in manual assessment. The method offers a technical reference for embryo assessment, although its clinical value still needs to be verified by further studies.

Research Background

In assisted reproductive technology, morphological assessment of embryos usually relies on the subjective judgment of embryologists, and this approach shows substantial inter-evaluator variability. Studies have found that agreement among different experts scoring the inner cell mass (ICM) and trophectoderm (TE) of the same embryo is low, with a Kappa coefficient of approximately 0.3, which directly affects the accuracy of embryo selection. Although AI technology has been introduced into the field of embryo assessment, most current AI models suffer from the “black box” problem: their decision-making processes lack interpretability, so they have not gained widespread trust from clinicians. Developing an AI tool that both improves assessment accuracy and remains interpretable is therefore of great clinical significance for making embryo assessment more objective and standardized.

Research Methods

In this study, static images from time-lapse imaging of 223 human blastocysts were collected, all acquired using an EmbryoScope® incubator under standardized conditions. During data preprocessing, grayscale conversion and resolution adjustment were first performed to bring the images to a common standard; the Hough transform was then used to identify blastocyst boundaries and extract blastocyst contours. To obtain key morphological features, the research team applied the gray-level co-occurrence matrix (GLCM) for texture analysis and the watershed algorithm for image segmentation, yielding the texture and geometric features of each blastocyst.
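The texture step described above can be illustrated with a minimal sketch. It assumes a toy image given as nested lists, a single horizontal pixel offset, and a small number of gray levels; the function names and parameters are illustrative and are not taken from the paper, whose GLCM settings are not stated in this summary. The Hough transform and watershed steps are not shown.

```python
from collections import Counter

def to_gray(rgb_image):
    """Convert rows of (r, g, b) pixels to 0-255 grayscale (BT.601 luma weights)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_image]

def quantize(gray, levels):
    """Map 0-255 intensities onto `levels` bins so the GLCM stays small."""
    return [[min(p * levels // 256, levels - 1) for p in row] for row in gray]

def glcm(img, dx=1, dy=0, symmetric=True):
    """Normalized co-occurrence matrix for offset (dx, dy).

    Stored sparsely as {(i, j): probability}, where i and j are the gray
    levels of a pixel and of its neighbor at the given offset.
    """
    counts = Counter()
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                counts[(img[y][x], img[ny][nx])] += 1
                if symmetric:
                    counts[(img[ny][nx], img[y][x])] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

def texture_features(p):
    """Three common GLCM descriptors (assumed here; the paper may use others)."""
    contrast = sum(prob * (i - j) ** 2 for (i, j), prob in p.items())
    homogeneity = sum(prob / (1 + (i - j) ** 2) for (i, j), prob in p.items())
    energy = sum(prob ** 2 for prob in p.values())
    return {"contrast": contrast, "homogeneity": homogeneity, "energy": energy}

if __name__ == "__main__":
    # Hypothetical 2x4 RGB patch, not real embryo data.
    patch = [[(10, 10, 10), (200, 200, 200), (10, 10, 10), (200, 200, 200)],
             [(200, 200, 200), (10, 10, 10), (200, 200, 200), (10, 10, 10)]]
    levels = quantize(to_gray(patch), levels=8)
    print(texture_features(glcm(levels)))
```

On a perfectly uniform patch the contrast is 0 and the homogeneity is 1; a rougher texture pushes contrast up and homogeneity down, which is the kind of signal a downstream classifier could use.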

For model construction, the study combined genetic algorithms with artificial neural networks. Genetic algorithms were used to optimize the structural configuration of the neural network, including the number of neurons, the number of hidden layers, and other key parameters, thereby improving model performance. The dataset was divided into training, validation, and test sets at a ratio of 70%, 15%, and 15%, so that generalization could be checked on data not used for training.
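As a rough illustration of the idea above, the sketch below splits 223 sample indices 70/15/15 and runs a small genetic algorithm over two structural parameters (number of hidden layers and neurons per layer). Everything here is an assumption made for illustration: the search space, the selection, crossover, and mutation scheme, and above all `toy_fitness`, a placeholder that stands in for "train a network with this structure and return its validation accuracy". It is not the authors' implementation.

```python
import random

def split_indices(n, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle indices 0..n-1 and cut them into train/validation/test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * fractions[0])
    n_val = round(n * fractions[1])
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Hypothetical search space for the network structure.
LAYER_CHOICES = (1, 2, 3)
NEURON_CHOICES = (4, 8, 16, 32, 64)

def random_genome(rng):
    """A genome is (hidden_layers, neurons_per_layer)."""
    return (rng.choice(LAYER_CHOICES), rng.choice(NEURON_CHOICES))

def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return (rng.choice((a[0], b[0])), rng.choice((a[1], b[1])))

def mutate(genome, rng, rate=0.3):
    """Resample each gene with probability `rate`."""
    layers, neurons = genome
    if rng.random() < rate:
        layers = rng.choice(LAYER_CHOICES)
    if rng.random() < rate:
        neurons = rng.choice(NEURON_CHOICES)
    return (layers, neurons)

def evolve(fitness, pop_size=12, generations=15, elite=2, seed=0):
    """Truncation selection with elitism; returns the best genome found."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]   # keep the fitter half as parents
        children = ranked[:elite]          # elitism: best genomes survive unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return max(population, key=fitness)

def toy_fitness(genome):
    """Placeholder score peaking at (2, 16); a real run would train and validate a network."""
    layers, neurons = genome
    return -abs(layers - 2) - abs(neurons - 16) / 16

if __name__ == "__main__":
    train, val, test = split_indices(223)
    print(len(train), len(val), len(test))
    print("best structure found:", evolve(toy_fitness))
```

Because the fitness is evaluated on the validation part only, the test part stays untouched until the final structure is chosen, which is the usual reason for a three-way split.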

The neural network was trained to classify blastocyst expansion (BE), inner cell mass (ICM), and trophectoderm (TE). Evaluation metrics included accuracy, the Kappa coefficient (for assessing consistency), and the area under the ROC curve (AUC), which together characterized the model’s performance on each assessment task.
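For readers less familiar with these three metrics, here is a minimal sketch of how each can be computed. It assumes unweighted Cohen's Kappa between two raters and a binary AUC computed by pairwise ranking; the summary does not say whether the paper used weighted Kappa or how AUC was extended to multi-class grades, so treat those choices as assumptions. The grades and scores in the demo are made up.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples where the predicted grade matches the reference grade."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa: agreement between two raters beyond chance."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently.
    p_expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)

def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    # Made-up ICM grades from a hypothetical embryologist and a hypothetical model.
    embryologist = ["A", "A", "B", "B"]
    model = ["A", "A", "B", "A"]
    print("accuracy:", accuracy(embryologist, model))
    print("kappa:", cohen_kappa(embryologist, model))
    # Made-up binary task: is the grade "A"? Scores are model probabilities.
    print("auc:", binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In the demo the two raters agree on 3 of 4 embryos (75% raw agreement), yet Kappa is only 0.5 because half of that agreement would be expected by chance; this is why Kappa, rather than raw agreement, is the usual measure of scoring consistency.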

Main Findings

The study showed that the AI model achieved assessment accuracies of 81.5% (BE), 78.8% (ICM), and 78.3% (TE) on the test set, demonstrating good classification ability. Compared with the assessments of five different embryologists, the AI model achieved a significant improvement in assessment consistency. In the assessment of BE and ICM in particular, the Kappa coefficient rose from 0.3-0.4 (manual assessment) to 0.7 with the AI model. The Kappa coefficient of the AI model in TE assessment was 0.4, lower than in BE and ICM assessment, but this result still showed a significant improvement over the assessment consistency of human embryologists.

In terms of performance, the AUC values of the AI model in BE, ICM, and TE assessment were 0.956, 0.854, and 0.769, respectively, indicating that the model performed best in identifying blastocyst expansion. It is worth noting that all model outputs strictly followed the Gardner scoring criteria, ensuring high compatibility of assessment results with the existing clinical diagnosis and treatment system.

The model’s performance in TE assessment remains limited, which may be partly related to biological characteristics such as the blurred boundaries of TE cells. Even so, its overall consistency was a significant improvement over that of human embryologists.

Research Innovations

This study combines genetic algorithms with artificial neural networks, optimizing the structure of the neural network through evolutionary computation to improve the accuracy of embryo assessment. It also adopts a highly interpretable feature engineering approach, extracting texture and geometric features of blastocysts so that the model’s decision-making process is more transparent. The method improves assessment consistency and reduces the subjectivity of traditional assessment, showing broad prospects for clinical application.

Research Limitations

This study has certain limitations. First, as a retrospective study, its results may be affected by data selection bias. Second, it analyzed only static images and did not make full use of the dynamic information available from time-lapse imaging, which limits comprehensive assessment of the embryo development process. In addition, the sample comprised 223 blastocysts, and verification of the model’s generalization ability remains insufficient, especially in TE assessment, where the model’s performance still has room for improvement.

Clinical Significance and Prospects

This study provides new ideas for the standardization of embryo assessment. Future work should include multicenter prospective studies to verify the correlation between AI scores and clinical pregnancy outcomes. Integrating dynamic developmental data from time-lapse imaging should further improve accuracy and add a temporal dimension to the assessment. AI assessment technology could be applied not only to embryo assessment but also extended to clinical applications such as preimplantation genetic testing (PGT), helping to improve treatment outcomes for infertile patients. As the technology continues to advance, AI tools are expected to become an important component of assisted reproductive technology, making embryo selection and treatment protocols more precise.

Reference

Toschi M, Bori L, Rocha JC, Hickman C, Gouveia Nogueira MF, Satoshi Ferreira A, Costa Maffeis M, Malmsten J, Zhan Q, Zaninovic N, Meseguer M. A Combination of Artificial Intelligence with Genetic Algorithms on Static Time-Lapse Images Improves Consistency in Blastocyst Assessment, An Interpretable Tool to Automate Human Embryo Evaluation: A Retrospective Cohort Study. Int J Fertil Steril. 2024 Oct 30;18(4):378-383. doi: 10.22074/ijfs.2024.2008339.1510. PMID: 39564830; PMCID: PMC11589968.