Artificial Intelligence Assists in Azoospermia Treatment: A Prospective Comparative Experiment of AI Assistance

This paper introduces a deep learning-based sperm detection model designed to speed up the search for sperm in post-surgical samples from azoospermia patients. Experimental results show that the system detects sperm faster and with a higher recall rate than manual screening, reducing missed detections and human error and preserving more fertility opportunities for patients. However, the model has so far been validated only on immotile sperm, the sample size is limited, and the experiments did not fully reproduce every operational step of a real clinical workflow. Expanding the dataset, extending the model’s functions, and conducting multi-center clinical studies are expected to support wider application of this technology in azoospermia diagnosis and treatment.

Goss DM, Vasilescu SA, Vasilescu PA, Cooke S, Kim SH, Sacks GP, Gardner DK, Warkiani ME. Evaluation of an artificial intelligence-facilitated sperm detection tool in azoospermic samples for use in ICSI. Reprod Biomed Online. 2024 Jul;49(1):103910. doi: 10.1016/j.rbmo.2024.103910. Epub 2024 Feb 22. PMID: 38652944.

I. Research Background: Core Pain Points in Azoospermia Diagnosis and Treatment and the Necessity of AI Intervention

Global male sperm counts are reported to have dropped by about 50% over the past 50 years, making male infertility an increasingly prominent issue. Azoospermia, the most severe type of male infertility, is defined as the absence of sperm in centrifuged semen on at least two consecutive examinations. It affects 10%-20% of infertile men and about 1% of the general male population. Non-obstructive azoospermia (NOA) accounts for about 60% of azoospermia cases. It results from impaired sperm production due to testicular failure of primary, secondary, or unknown cause, and it is both a key focus and a major difficulty in clinical diagnosis and treatment.

Currently, the gold standard treatment for NOA patients is microdissection testicular sperm extraction (mTESE). However, manual screening for sperm in the post-surgical testicular tissue suspension has significant limitations: embryologists must search field by field under a microscope, and a single search typically takes 1 to 6 hours or even longer. During such a lengthy search, viable sperm are easily missed because of the high concentration of other cells in the sample, and human error can arise from differences in staff experience or from fatigue after prolonged work. In milder cases, such errors can lead to unnecessary surgery for patients. In severe cases, NOA patients who could have achieved fertility through intracytoplasmic sperm injection (ICSI) may be misdiagnosed as absolutely infertile. There is therefore an urgent clinical need for a more efficient, higher-throughput method of locating and isolating sperm, and the strengths of machine learning and artificial intelligence (AI) in image analysis offer a feasible way to address this pain point.

II. Research Methods: From Sample Preparation to AI Training and Experimental Design

In the sample preparation phase, the study first used specially prepared samples for initial AI model training. These samples combined donor sperm with red blood cells from finger-prick blood, white blood cells, epithelial cells, and other cell types to simulate the heavy contamination with non-sperm cells commonly found in mTESE samples. Clinical testicular tissue samples from 8 azoospermia patients (6 with NOA and 2 with obstructive azoospermia) were then added to bring the training data closer to real clinical conditions.

The experiment was divided into two cohorts. Cohort 1 used 512 static images from 4 NOA patients (containing 2660 sperm to be detected) to compare per-field detection time, recall rate, precision, and total number of detected sperm between AI and embryologists, with sperm labels confirmed by two independent scientists serving as the ground truth. Cohort 2 simulated clinical scenarios: using testicular tissue samples from 4 NOA patients on ICSI microscope equipment, embryologists searched for sperm under “AI-assisted” and “non-AI-assisted” conditions, and the detection time and number of detected sperm per drop of sample were compared. To reduce subjective bias, the Petri dishes were randomly rearranged so that embryologists were unaware of the grouping.
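
The recall and precision compared in Cohort 1 follow the standard definitions: recall is the share of true sperm that were found, and precision is the share of detections that were truly sperm. The sketch below shows how the two are computed from detection counts. The class name `DetectionCounts` and the example counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Detection outcomes for one observer (hypothetical container)."""

    true_positives: int   # sperm correctly detected
    false_positives: int  # non-sperm objects flagged as sperm
    false_negatives: int  # sperm present in the image but missed

    def recall(self) -> float:
        # Share of all true sperm that the observer found.
        total_sperm = self.true_positives + self.false_negatives
        return self.true_positives / total_sperm if total_sperm else 0.0

    def precision(self) -> float:
        # Share of the observer's detections that were truly sperm.
        total_detections = self.true_positives + self.false_positives
        return self.true_positives / total_detections if total_detections else 0.0


# Illustrative counts only, NOT taken from the study.
example = DetectionCounts(true_positives=90, false_positives=5, false_negatives=10)
print(f"recall = {example.recall():.2%}, precision = {example.precision():.2%}")
```

A missed sperm lowers recall, while a false alarm lowers precision, which is why the two metrics can move in opposite directions for the same observer.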

III. Research Results: AI Demonstrates Significant Advantages

Static image detection in Cohort 1 showed that AI outperformed embryologists in speed and recall, although not in precision. AI took an average of only 0.02 seconds to identify sperm in each field, compared with 36.10 seconds for embryologists, a reduction in detection time of approximately 99.95% that was highly statistically significant. AI reached a recall rate of 91.95%, higher than the embryologists’ 86.52%, indicating fewer missed detections. The embryologists’ precision (98.18%) was significantly higher than that of AI (89.58%), but recall is the key indicator here because false negatives carry a high clinical cost in sperm retrieval for NOA patients. In total, AI identified 1997 sperm, compared with 1937 detected by embryologists, which is consistent with its advantage in reducing missed detections.

The simulated clinical detection results in Cohort 2 were consistent with those in Cohort 1. With AI assistance, embryologists took an average of 98.9 seconds to complete sperm detection per drop of sample, significantly less than the 168.7 seconds without AI assistance. The AI-assisted group detected a total of 1396 sperm, compared with 1274 in the non-AI-assisted group. The difference in the number of sperm detected per drop did not reach statistical significance, but the trend favored AI assistance, supporting the practical value of AI in clinical scenarios.
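
The time savings in both cohorts can be checked with simple arithmetic on the mean times quoted above. The following is a minimal sketch; the function names are illustrative, and because the inputs are the rounded means given in this article, the last digit may differ slightly from percentages derived from unrounded timings.

```python
def relative_reduction(baseline_s: float, new_s: float) -> float:
    """Fraction of the baseline time that was saved."""
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline_s - new_s) / baseline_s


def speedup(baseline_s: float, new_s: float) -> float:
    """How many times faster the new time is than the baseline."""
    if new_s <= 0:
        raise ValueError("new time must be positive")
    return baseline_s / new_s


# Mean times in seconds as quoted in the text (rounded values).
cohort1 = relative_reduction(baseline_s=36.10, new_s=0.02)   # per field, AI vs embryologist
cohort2 = relative_reduction(baseline_s=168.7, new_s=98.9)   # per drop, with vs without AI

print(f"Cohort 1 time reduction: {cohort1:.2%}")
print(f"Cohort 2 time reduction: {cohort2:.2%}")
print(f"Cohort 2 speed-up factor: {speedup(168.7, 98.9):.2f}x")
```

With these inputs, the Cohort 2 saving works out to roughly 41% per drop, a much smaller gain than in Cohort 1 because the embryologist still performs the search and confirms each AI suggestion.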

IV. Research Innovations and Limitations: Breakthrough Value of AI and Future Optimization Directions

The core innovation of this study is what it presents as the first application of machine learning-based AI to surgical sperm retrieval in complex testicular tissue samples, an area where AI had not previously been applied. Earlier studies mostly focused on sperm in clean preparations or on embryo selection. This study instead developed an AI model for heavily contaminated clinical testicular tissue samples with multiple interfering factors, achieved sperm recognition on a real-time video stream, and integrated the model into ICSI microscope equipment. Training that simulated differences between microscopes, lighting conditions, and camera equipment improved the model’s adaptability, and the model can identify sperm in different viability states, which broadens its application scope.

However, the study has limitations. Validation has so far been conducted only on immotile sperm. Although supplementary experiments showed that AI also performs well in recognizing motile sperm, this has not yet been validated in real clinical workflows. The simulated clinical scenario tests did not include the time needed to confirm sperm position while the microscope field is being moved, which may slightly reduce AI’s time advantage in practice. In addition, the current study included samples from only 4 NOA patients, so the dataset is small. Multi-center, large-sample clinical studies are needed to further verify the robustness of the model.

V. Conclusions and Clinical Significance: AI Brings New Breakthroughs in Azoospermia Diagnosis and Treatment

This study offers a new technical approach to making post-surgical sperm detection in azoospermia patients faster and more accurate. The technology can help avoid the decline in sperm viability caused by prolonged searching and reduce the risk of misdiagnosis due to human error, preserving more fertility opportunities for NOA patients. In clinical use, the AI model is intended to assist rather than replace embryologists: by directing their attention to suspected sperm regions, it helps them overcome human limitations such as visual fatigue, forming an efficient collaborative model of “AI assistance + human judgment.” Further development of sperm viability and morphology evaluation modules and of sperm twitch detection is expected to support a standardized and efficient diagnosis and treatment workflow. This would improve access to treatment for azoospermia patients, reduce the working time of medical staff, expand the coverage of sample screening, and provide important technical support for the clinical diagnosis and treatment of severe male factor infertility.