This paper addresses the over-reliance on individual experience when choosing FSH dosage and trigger timing during IVF ovarian stimulation. In a 12-month real-world clinical observation at a reproductive center in Michigan, covering 584 cycles (292 AI-assisted and 292 matched historical controls), it evaluated the effect of the AI decision support system Stim Assist on starting and total FSH dosage, number of oocytes retrieved, and incidence of OHSS. However, follow-up ends at oocyte retrieval, so downstream outcomes still need to be validated in future studies.
Research Background
In vitro fertilization-embryo transfer (IVF-ET) already accounts for 2.3% of annual births in the United States, and demand continues to rise. The success of ovarian stimulation depends on the dosage of follicle-stimulating hormone (FSH) and the timing of the trigger. Traditional practice relies heavily on the individual physician’s experience: too low a dose leads to poor follicle growth and fewer oocytes retrieved; too high a dose increases the risk of ovarian hyperstimulation syndrome (OHSS); and triggering too early or too late reduces the yield of mature oocytes. FSH preparations are also expensive, so wasted medication directly increases the financial burden on patients and is a significant barrier to equitable access to assisted reproduction. Earlier research on artificial intelligence in reproductive medicine has mostly been limited to retrospective modeling, and prospective evidence from clinical practice is still lacking.
AI System
This study used the Stim Assist clinical decision support tool. Its “starting dosage tool” was trained on 18,591 cycles from three U.S. reproductive centers between 2014 and 2020. In operation, it first uses the k-nearest neighbors (KNN) algorithm to select the 100 most similar cases from the database based on the current patient’s AMH, AFC, and BMI, and then fits a dose-response curve relating FSH starting dosage to the expected number of mature (metaphase II, MII) oocytes, so that doctors can view the predicted results at different dosages before making their own decision. Its “trigger timing tool” is based on an interpretable linear regression model trained on 30,278 cycles. It reads the current day’s E2 level and follicle diameters in real time, and outputs the expected number of retrievable MII oocytes, together with the E2 levels for the next three days, if the trigger is given today, tomorrow, or the day after tomorrow. All calculations complete in less than 1 second, require no additional manual entry, and do not interfere with the clinical workflow.
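The two steps of the starting dosage tool (neighbor selection, then a curve fit over the neighbors) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not Stim Assist’s actual implementation: the z-score scaling, the Euclidean distance, the quadratic form of the curve, the function names, and the synthetic data are all hypothetical.

```python
import math
import random

def zscore_params(rows):
    """Per-column mean and standard deviation, so AMH, AFC and BMI share a scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def nearest_cycles(history, patient, k=100):
    """Return the k historical cycles closest to the patient (assumed: Euclidean, z-scored)."""
    means, stds = zscore_params([h["features"] for h in history])

    def scale(features):
        return [(v - m) / s for v, m, s in zip(features, means, stds)]

    target = scale(patient)
    return sorted(history, key=lambda h: math.dist(scale(h["features"]), target))[:k]

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via normal equations and Gauss-Jordan."""
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    A = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return tuple(A[i][3] / A[i][i] for i in range(3))

def dose_response(history, patient, doses, k=100):
    """Predicted MII oocytes at each candidate dose, fitted on the k nearest cycles only."""
    neighbors = nearest_cycles(history, patient, k)
    xs = [n["dose_iu"] / 100 for n in neighbors]  # rescale doses for numerical stability
    ys = [n["mii"] for n in neighbors]
    a, b, c = fit_quadratic(xs, ys)
    return {d: a + b * (d / 100) + c * (d / 100) ** 2 for d in doses}

# Synthetic stand-in for the historical database (NOT real patient data).
random.seed(0)
history = []
for _ in range(2000):
    amh, afc, bmi = random.uniform(0.3, 6.0), random.randint(3, 30), random.uniform(18, 38)
    dose = random.choice([150, 225, 300, 375, 450])
    mii = max(0.0, 0.5 * afc + 0.01 * dose - 0.00001 * dose ** 2 + random.gauss(0, 2))
    history.append({"features": (amh, afc, bmi), "dose_iu": dose, "mii": mii})

# Hypothetical patient: AMH 2.1 ng/mL, AFC 14, BMI 24.
curve = dose_response(history, patient=(2.1, 14, 24.0), doses=[150, 225, 300, 375, 450])
for dose, mii in curve.items():
    print(f"{dose} IU -> predicted MII oocytes: {mii:.1f}")
```

Fitting the curve only on the selected neighbors is what makes the prediction patient-specific: the same candidate doses yield a different curve for a patient with a different AMH, AFC, or BMI.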
Experimental Methods
This study conducted a post-marketing observation at a large private reproductive center in Michigan, USA. It included 292 patients undergoing conventional autologous IVF from December 2022 to December 2023 as the AI group, and 292 historical patients treated by the same group of senior physicians from May 2019 to May 2022 as the 1:1 matched control group. Matching variables included age, BMI, AMH, and AFC, with missing values imputed by KNN. The primary outcomes observed were the starting FSH dosage, total FSH dosage, and number of retrieved MII oocytes, while the secondary outcomes were the E2 level on the trigger day and the incidence of OHSS.
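The 1:1 matching on age, BMI, AMH, and AFC can be sketched as follows. The source does not state which matching algorithm was used, so this is an assumed approach (greedy nearest-neighbor matching without replacement on z-scored covariates); the function names and the example records are hypothetical.

```python
import math

COVARIATES = ("age", "bmi", "amh", "afc")

def zscore_params(records):
    """Mean and standard deviation of each matching covariate across all records."""
    params = {}
    for name in COVARIATES:
        values = [r[name] for r in records]
        mean = sum(values) / len(values)
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
        params[name] = (mean, sd)
    return params

def to_vector(record, params):
    """Z-scored covariate vector for one patient."""
    return [(record[n] - params[n][0]) / params[n][1] for n in COVARIATES]

def match_one_to_one(treated, controls):
    """Greedy matching: each treated patient takes the closest control not yet used."""
    params = zscore_params(treated + controls)
    available = {i: to_vector(c, params) for i, c in enumerate(controls)}
    pairs = []
    for t_idx, patient in enumerate(treated):
        if not available:
            break  # more treated patients than controls
        vec = to_vector(patient, params)
        best = min(available, key=lambda i: math.dist(vec, available[i]))
        pairs.append((t_idx, best))
        del available[best]  # without replacement: a control is matched at most once
    return pairs

# Made-up records for illustration only.
ai_group = [
    {"age": 32, "bmi": 24.0, "amh": 3.1, "afc": 15},
    {"age": 39, "bmi": 29.5, "amh": 1.2, "afc": 7},
]
historical = [
    {"age": 39, "bmi": 29.5, "amh": 1.2, "afc": 7},
    {"age": 44, "bmi": 22.0, "amh": 0.5, "afc": 4},
    {"age": 32, "bmi": 24.0, "amh": 3.1, "afc": 15},
    {"age": 28, "bmi": 35.0, "amh": 6.0, "afc": 28},
]
pairs = match_one_to_one(ai_group, historical)
print(pairs)
```

Greedy matching depends on the order in which treated patients are processed; optimal matching or propensity-score matching are common alternatives.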

Research Results
Overall, the starting FSH dosage was lower in the AI group than in the control group (397 IU vs. 444 IU), as was the cumulative dosage (4182 IU vs. 4655 IU), a decrease of just over 10% in both cases (P<0.01), while the number of mature oocytes was almost the same (11.2 vs. 11.3, no statistically significant difference). The age-stratified results showed the same pattern. Among patients under 35 years old, the starting and total dosages were 351 IU and 3566 IU in the AI group versus 388 IU and 4012 IU in the control group, with no significant difference in the number of oocytes (13.8 vs. 15.3). In the 35-40 age subgroup, the reduction in starting dosage was larger (425 IU vs. 489 IU; total 4638 IU vs. 5159 IU), and the number of oocytes rose from 8.3 to 9.3. For patients aged ≥40 years, there was a slight decrease in dosage, and the number of oocytes also showed a mild positive trend, rising from 6.6 to 7.8. In terms of safety, neither group experienced moderate to severe OHSS, and there was no statistically significant difference in the E2 level on the trigger day. The conclusions remained robust in a sensitivity analysis that excluded cycles with missing data.
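The relative reductions follow directly from the group means reported above. A short check (the helper name `pct_reduction` is only illustrative):

```python
def pct_reduction(control, ai):
    """Relative reduction of the AI-group mean versus the control-group mean, in percent."""
    return (control - ai) / control * 100

# (control mean, AI-group mean) in IU, as reported in the results.
reported = {
    "overall starting dose": (444, 397),
    "overall total dose": (4655, 4182),
    "<35 starting dose": (388, 351),
    "<35 total dose": (4012, 3566),
    "35-40 starting dose": (489, 425),
    "35-40 total dose": (5159, 4638),
}

reductions = {label: pct_reduction(c, a) for label, (c, a) in reported.items()}
for label, value in reductions.items():
    print(f"{label}: {value:.1f}% lower in the AI group")
```

The overall figures come out at about 10.6% for the starting dose and 10.2% for the total dose, and the absolute difference in total dose is 4655 - 4182 = 473 IU per cycle.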
Research Innovations
This study moved AI from the laboratory into daily clinical practice through a 12-month real-world evaluation, with 584 cycles analyzed in total. Doctors retained prescribing authority, and the system only provided interpretable dosage curves and trigger-timing predictions, achieving genuine human-machine collaboration. The results showed for the first time that the algorithm could reduce the overall FSH dosage by about 10% without sacrificing the number of oocytes. The 35-40 age group, a traditionally low-response population, even retrieved one more oocyte on average (9.3 vs. 8.3), and patients aged ≥40 years also saw more oocytes at a lower dosage. The study did not exclude patients at low, medium, or high risk, so the conclusions map directly onto daily practice. The absence of moderate to severe OHSS during the study period supports the tool’s safety. By examining hard endpoints and hard costs together, the study addresses whether drug usage can be reduced without compromising efficacy. The multi-center training set and the stable performance at a single center lay the foundation for subsequent medical insurance cost control and clinical pathway standardization.

Research Limitations
Although confounding factors were controlled through rigorous matching, the single-center sample, treated mostly by senior physicians, limits extrapolation of the results to regions with scarcer resources or to novice physicians. Follow-up ends at oocyte retrieval; the fertilization rate, blastocyst rate, euploidy rate, and live birth rate need to be assessed in subsequent studies. The training data come from three U.S. centers between 2014 and 2020, and differences in ethnicity and clinical practice may weaken the generalizability of the model. The “grey box” nature of the KNN algorithm remains opaque to some users and calls for further improvements in visualization.
Clinical Significance
FSH accounts for the largest share of IVF drug costs. A reduction of about 470 IU per cycle (4655 IU vs. 4182 IU) is equivalent to saving patients the cost of one to two imported preparations, which is particularly important for families who need multiple stimulation cycles or have limited insurance coverage. The algorithm individualizes dosages based on AMH, AFC, and BMI, narrowing the experience gap between physicians and shortening the learning curve for young doctors. The absence of moderate to severe OHSS at a reduced drug dosage provides real-world evidence for an “efficient and low-risk” ovarian stimulation strategy. If subsequent multi-center randomized controlled trials confirm a benefit for live birth, AI-based clinical decision support systems (AI-CDSS) are expected to become an important tool for medical insurance cost control and clinical pathway standardization, moving assisted reproductive technology towards greater equity and accessibility.
Reference
Cameron J. Bixby, Bradley Miller. Real-world use of an artificial intelligence–powered clinical decision support tool for ovarian stimulation[J]. F&S Reports, 2025, 6(2): 140-146. DOI: 10.1016/j.xfre.2025.01.015.