-
A hybrid artificial intelligence model leverages multi-centric clinical data to improve fetal heart rate pregnancy prediction across time-lapse systems.
Can artificial intelligence (AI) algorithms developed to assist embryologists in evaluating embryo morphokinetics be enriched with multi-centric clinical data to better predict clinical pregnancy outcome?
Training algorithms on multi-centric clinical data significantly increased AUC compared to algorithms that only analyzed the time-lapse system (TLS) videos.
Several AI-based algorithms have been developed to predict pregnancy, most of them based only on analysis of the time-lapse recording of embryo development. It remains unclear, however, whether considering numerous clinical features can improve the predictive performances of time-lapse based embryo evaluation.
A dataset of 9986 embryos (95.60% known clinical pregnancy outcome, 32.47% frozen transfers) from 5226 patients from 14 European fertility centers (in two countries) recorded with three different TLS was used to train and validate the algorithms. A total of 31 clinical factors were collected. A separate test set (447 videos) was used to compare performances between embryologists and the algorithm.
Clinical pregnancy (defined as a pregnancy leading to a fetal heartbeat) outcome was first predicted using a 3D convolutional neural network that analyzed videos of the embryonic development up to 2 or 3 days of development (33% of the database) or up to 5 or 6 days of development (67% of the database). The output video score was then fed as input alongside clinical features to a gradient boosting algorithm that generated a second score corresponding to the hybrid model. AUC was computed across the 7 folds of the validation dataset for both models. These predictions were compared to those of 13 senior embryologists made on the test dataset.
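The per-fold AUC evaluation described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names `roc_auc` and `mean_auc_over_folds` are hypothetical, and the AUC is computed in its rank form, as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5.

    labels: iterable of 0/1 outcomes (1 = fetal heartbeat, by assumption).
    scores: iterable of model scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc_over_folds(folds):
    """Average the AUC over validation folds.

    folds: list of (labels, scores) pairs, one pair per fold.
    """
    aucs = [roc_auc(labels, scores) for labels, scores in folds]
    return sum(aucs) / len(aucs)
```

Under these assumptions, the video model and the hybrid model would each be scored with `mean_auc_over_folds` on the same folds, so that the per-fold values can be compared pairwise.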
The average AUC of the hybrid model across all 7 folds was significantly higher than that of the video model (0.727 versus 0.684, respectively, P = 0.015; Wilcoxon test). A SHapley Additive exPlanations (SHAP) analysis of the hybrid model showed that the six most important features to predict pregnancy were morphokinetics of the embryo (video score), oocyte age, total gonadotrophin dose intake, number of embryos generated, number of oocytes retrieved, and endometrium thickness. The hybrid model was shown to be superior to embryologists with respect to different metrics, including the balanced accuracy (P ≤ 0.003; Wilcoxon test). The likelihood of pregnancy was linearly linked to the hybrid score, with increasing odds ratio (maximum P-value = 0.001), demonstrating the ranking capacity of the model. Training individual hybrid models did not improve predictive performance. A clinic hold-out experiment was conducted and resulted in AUCs ranging between 0.63 and 0.73. Performance of the hybrid model did not vary between TLS or between subgroups of embryos transferred at different days of embryonic development. The hybrid model did fare better for patients older than 35 years (P < 0.001; Mann-Whitney test), and for fresh transfers (P < 0.001; Mann-Whitney test).
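The abstract names only a "Wilcoxon test" for the fold-wise AUC comparison. Assuming this was the paired signed-rank variant applied to the per-fold AUCs of the two models, the exact two-sided p-value can be sketched as below. The function name is hypothetical and the per-fold AUCs are not given in the abstract, so any inputs are illustrative.

```python
from itertools import product


def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (W_plus, p_value). Zero differences are dropped and tied
    absolute differences receive averaged ranks. Enumerating all sign
    patterns is only practical for small n, such as a handful of folds.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null hypothesis every sign pattern is equally likely.
    null = [
        sum(r for r, s in zip(ranks, signs) if s)
        for signs in product((0, 1), repeat=n)
    ]
    lower = sum(1 for w in null if w <= w_plus) / len(null)
    upper = sum(1 for w in null if w >= w_plus) / len(null)
    return w_plus, min(1.0, 2 * min(lower, upper))
```

With seven pairs, the smallest two-sided p-value this exact test can return is 2/2^7 ≈ 0.0156, reached when one model is better on every fold.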
Participant centers were located in two countries, thus limiting the generalization of our conclusion to wider subpopulations of patients. Not all clinical features were available for all embryos, thus limiting the performances of the hybrid model in some instances.
Our study suggests that considering clinical data improves pregnancy prediction performance and that there is no need to retrain algorithms at the clinic level unless clinics follow strikingly different practices. This study characterizes a versatile AI algorithm with similar performance on different time-lapse microscopes and on embryos transferred at different development stages. It can also assist across patient ages and treatment protocols, although with varying performance, presumably because predicting fetal heartbeat becomes easier or harder depending on the clinical context. This AI model can be made widely available and can help embryologists in a wide range of clinical scenarios to standardize their practices.
Funding for the study was provided by ImVitro with grant funding received in part from BPIFrance (Bourse French Tech Emergence (DOS0106572/00), Paris Innovation Amorçage (DOS0132841/00), and Aide au Développement DeepTech (DOS0152872/00)). A.B.-C. is a co-owner of, and holds stocks in, ImVitro SAS. A.B.-C. and F.D.M. hold a patent for 'Devices and processes for machine learning prediction of in vitro fertilization' (EP20305914.2). A.D., N.D., M.M.F., and F.D.M. are or have been employees of ImVitro and have been granted stock options. X.P.-V. has been paid as a consultant to ImVitro and has been granted stock options of ImVitro. L.C.-D. and C.G.-S. have undertaken paid consultancy for ImVitro SAS. The remaining authors have no conflicts to declare.
N/A.
Duval A, Nogueira D, Dissler N, Maskani Filali M, Delestro Matos F, Chansel-Debordeaux L, Ferrer-Buitrago M, Ferrer E, Antequera V, Ruiz-Jorro M, Papaxanthos A, Ouchchane H, Keppi B, Prima PY, Regnier-Vigouroux G, Trebesses L, Geoffroy-Siraudin C, Zaragoza S, Scalici E, Sanguinet P, Cassagnard N, Ozanon C, De La Fuente A, Gómez E, Gervoise Boyer M, Boyer P, Ricciarelli E, Pollet-Villard X, Boussommier-Calleja A
... -
《-》
-
Development of an artificial intelligence-based assessment model for prediction of embryo viability using static images captured by optical light microscopy during IVF.
Can an artificial intelligence (AI)-based model predict human embryo viability using images captured by optical light microscopy?
We have combined computer vision image processing methods and deep learning techniques to create the non-invasive Life Whisperer AI model for robust prediction of embryo viability, as measured by clinical pregnancy outcome, using single static images of Day 5 blastocysts obtained from standard optical light microscope systems.
Embryo selection following IVF is a critical factor in determining the success of ensuing pregnancy. Traditional morphokinetic grading by trained embryologists can be subjective and variable, and other complementary techniques, such as time-lapse imaging, require costly equipment and have not reliably demonstrated predictive ability for the endpoint of clinical pregnancy. AI methods are being investigated as a promising means for improving embryo selection and predicting implantation and pregnancy outcomes.
These studies involved analysis of retrospectively collected data including standard optical light microscope images and clinical outcomes of 8886 embryos from 11 different IVF clinics, across three different countries, between 2011 and 2018.
The AI-based model was trained using static two-dimensional optical light microscope images with known clinical pregnancy outcome as measured by fetal heartbeat to provide a confidence score for prediction of pregnancy. Predictive accuracy was determined by evaluating sensitivity, specificity and overall weighted accuracy, and was visualized using histograms of the distributions of predictions. Comparison to embryologists' predictive accuracy was performed using a binary classification approach and a 5-band ranking comparison.
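The accuracy measures named above follow directly from the confusion matrix of a binary viable/non-viable call. The sketch below is illustrative only, and the function names are hypothetical. The abstract does not define how the "overall weighted accuracy" is weighted; one reading, assumed here, is the prevalence-weighted mean of sensitivity and specificity, which equals plain accuracy when the prevalence is taken from the same data.

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for 0/1 labels and predictions.

    1 = viable (fetal heartbeat), 0 = non-viable, by assumption.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one viable and one non-viable case")
    return {
        "sensitivity": tp / (tp + fn),   # viable embryos correctly called viable
        "specificity": tn / (tn + fp),   # non-viable embryos correctly called non-viable
        "accuracy": (tp + tn) / len(pairs),
    }


def weighted_accuracy(sensitivity, specificity, prevalence):
    """Prevalence-weighted mean of sensitivity and specificity (assumed definition)."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity
```

A weight of 0.5 instead of the observed prevalence would give the balanced accuracy, which is less sensitive to class imbalance between viable and non-viable embryos.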
The Life Whisperer AI model showed a sensitivity of 70.1% for viable embryos while maintaining a specificity of 60.5% for non-viable embryos across three independent blind test sets from different clinics. The weighted overall accuracy in each blind test set was >63%, with a combined accuracy of 64.3% across both viable and non-viable embryos, demonstrating model robustness and generalizability beyond the result expected from chance. Distributions of predictions showed clear separation of correctly and incorrectly classified embryos. Binary comparison of viable/non-viable embryo classification demonstrated an improvement of 24.7% over embryologists' accuracy (P = 0.047, n = 2, Student's t test), and 5-band ranking comparison demonstrated an improvement of 42.0% over embryologists (P = 0.028, n = 2, Student's t test).
The AI model developed here is limited to analysis of Day 5 embryos; therefore, further evaluation or modification of the model is needed to incorporate information from different time points. The endpoint described is clinical pregnancy as measured by fetal heartbeat, and this does not indicate the probability of live birth. The current investigation was performed with retrospectively collected data, and hence it will be of importance to collect data prospectively to assess real-world use of the AI model.
These studies demonstrated an improved predictive ability for evaluation of embryo viability when compared with embryologists' traditional morphokinetic grading methods. The superior accuracy of the Life Whisperer AI model could lead to improved pregnancy success rates in IVF when used in a clinical setting. It could also potentially assist in standardization of embryo selection methods across multiple clinical environments, while eliminating the need for complex time-lapse imaging equipment. Finally, the cloud-based software application used to apply the Life Whisperer AI model in clinical practice makes it broadly applicable and globally scalable to IVF clinics worldwide.
Life Whisperer Diagnostics, Pty Ltd is a wholly owned subsidiary of the parent company, Presagen Pty Ltd. Funding for the study was provided by Presagen with grant funding received from the South Australian Government: Research, Commercialisation and Startup Fund (RCSF). 'In kind' support and embryology expertise to guide algorithm development were provided by Ovation Fertility. J.M.M.H., D.P. and M.P. are co-owners of Life Whisperer and Presagen. Presagen has filed a provisional patent for the technology described in this manuscript (52985P pending). A.P.M. owns stock in Life Whisperer, and S.M.D., A.J., T.N. and A.P.M. are employees of Life Whisperer.
VerMilyea M, Hall JMM, Diakiw SM, Johnston A, Nguyen T, Perugini D, Miller A, Picou A, Murphy AP, Perugini M
... -
《-》
-
Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?
What is the accuracy and agreement of embryologists when assessing the implantation probability of blastocysts using time-lapse imaging (TLI), and can it be improved with a data-driven algorithm?
The overall interobserver agreement of a large panel of embryologists was moderate and prediction accuracy was modest, while the purpose-built artificial intelligence model generally resulted in higher performance metrics.
Previous studies have demonstrated significant interobserver variability amongst embryologists when assessing embryo quality. However, data concerning embryologists' ability to predict implantation probability using TLI is still lacking. Emerging technologies based on data-driven tools have shown great promise for improving embryo selection and predicting clinical outcomes.
TLI video files of 136 embryos with known implantation data were retrospectively collected from two clinical sites between 2018 and 2019 for the performance assessment of 36 embryologists and comparison with a deep neural network (DNN).
We recruited 39 embryologists from 13 different countries. All participants were blinded to clinical outcomes. A total of 136 TLI videos of embryos that reached the blastocyst stage were used for this experiment. Each embryo's likelihood of successfully implanting was assessed by 36 embryologists, providing implantation probability grades (IPGs) from 1 to 5, where 1 indicates a very low likelihood of implantation and 5 indicates a very high likelihood. Subsequently, three embryologists with over 5 years of experience provided Gardner scores. All 136 blastocysts were categorized into three quality groups based on their Gardner scores. Embryologist predictions were then converted into predictions of implantation (IPG ≥ 3) and no implantation (IPG ≤ 2). Embryologists' performance and agreement were assessed using Fleiss kappa coefficient. A 10-fold cross-validation DNN was developed to provide IPGs for TLI video files. The model's performance was compared to that of the embryologists.
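The two steps described above, converting IPGs into binary implantation calls and measuring agreement with Fleiss' kappa, can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes every embryo is rated by the same number of embryologists and does not reproduce the study's data.

```python
def binarize_ipg(grade):
    """Map an implantation probability grade (1-5) to a binary call.

    IPG >= 3 is read as 'implantation' (1), IPG <= 2 as 'no implantation' (0).
    """
    return 1 if grade >= 3 else 0


def fleiss_kappa(ratings, categories=(0, 1)):
    """Fleiss' kappa for a list of per-subject rating lists.

    ratings[i] holds the category assigned to subject i by each rater;
    every subject must have the same number of ratings.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    # counts[i][j]: number of raters placing subject i in category j.
    counts = [[row.count(c) for c in categories] for row in ratings]
    # Observed agreement per subject, then averaged.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall category proportions.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_e) / (1.0 - p_e)
```

For example, `fleiss_kappa([[binarize_ipg(g) for g in row] for row in ipg_table])` would give the agreement on the binarized calls, where `ipg_table` is a hypothetical list of per-embryo grade lists.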
Logistic regression was employed for the following confounding variables: country of residence, academic level, embryo scoring system, log years of experience and experience using TLI. None were found to have a statistically significant impact on embryologist performance at α = 0.05. The average implantation prediction accuracy for the embryologists was 51.9% for all embryos (N = 136). The average accuracy of the embryologists when assessing top quality and poor quality embryos (according to the Gardner score categorizations) was 57.5% and 57.4%, respectively, and 44.6% for fair quality embryos. Overall interobserver agreement was moderate (κ = 0.56, N = 136). The best agreement was achieved in the poor + top quality group (κ = 0.65, N = 77), while the agreement in the fair quality group was lower (κ = 0.25, N = 59). The DNN showed an overall accuracy rate of 62.5%, with accuracies of 62.2%, 61% and 65.6% for the poor, fair and top quality groups, respectively. The AUC for the DNN was higher than that of the embryologists overall (0.70 DNN vs 0.61 embryologists) as well as in all of the Gardner groups (DNN vs embryologists-Poor: 0.69 vs 0.62; Fair: 0.67 vs 0.53; Top: 0.77 vs 0.54).
Blastocyst assessment was performed using video files acquired from time-lapse incubators, where each video contained data from a single focal plane. Clinical data regarding the underlying cause of infertility and endometrial thickness before the transfer were not available, yet may explain implantation failure and lower accuracy of IPGs. Implantation was defined as the presence of a gestational sac, whereas the detection of fetal heartbeat is a more robust marker of embryo viability. The raw data were anonymized to the extent that it was not possible to quantify the number of unique patients and cycles included in the study, potentially masking the effect of bias from a limited patient pool. Furthermore, the lack of demographic data makes it difficult to draw conclusions on how representative the dataset was of the wider population. Finally, embryologists were required to assess the implantation potential, not embryo quality. Although this is not the traditional approach to embryo evaluation, morphology/morphokinetics as a means of assessing embryo quality is believed to be strongly correlated with viability and, for some methods, implantation potential.
Embryo selection is a key element in IVF success and continues to be a challenge. Improving the predictive ability could assist in optimizing implantation success rates and other clinical outcomes and could minimize the financial and emotional burden on the patient. This study demonstrates moderate agreement rates between embryologists, likely due to the subjective nature of embryo assessment. In particular, we found that average embryologist accuracy and agreement were significantly lower for fair quality embryos when compared with that for top and poor quality embryos. Using data-driven algorithms as an assistive tool may help IVF professionals increase success rates and promote much needed standardization in the IVF clinic. Our results indicate a need for further research regarding technological advancement in this field.
Embryonics Ltd is an Israel-based company. Funding for the study was partially provided by the Israeli Innovation Authority, grant #74556.
N/A.
Fordham DE, Rosentraub D, Polsky AL, Aviram T, Wolf Y, Perl O, Devir A, Rosentraub S, Silver DH, Gold Zamir Y, Bronstein AM, Lara Lara M, Ben Nagi J, Alvarez A, Munné S
... -
《-》
-
Testing an artificial intelligence algorithm to predict fetal heartbeat of vitrified-warmed blastocysts from a single image: predictive ability in different settings.
Could an artificial intelligence (AI) algorithm predict fetal heartbeat from images of vitrified-warmed embryos?
Applying AI to vitrified-warmed blastocysts may help predict which ones will result in implantation failure early enough to thaw another.
The application of AI in the field of embryology has already proven effective in assessing the quality of fresh embryos. Therefore, it could also be useful to predict the outcome of frozen embryo transfers, some of which do not recover their pre-vitrification volume, collapse, or degenerate after warming without prior evidence.
This retrospective cohort study included 1109 embryos from 792 patients. Of these, 568 were vitrified blastocysts cultured in time-lapse systems in the period between warming and transfer, from February 2022 to July 2023. The other 541 were fresh-transferred blastocysts serving as controls.
Four types of time-lapse images were collected: last frame of development of 541 fresh-transferred blastocysts (FTi), last frame of 467 blastocysts to be vitrified (PVi), first frame post-warming of 568 vitrified embryos (PW1i), and last frame post-warming of 568 vitrified embryos (PW2i). After providing the images to the AI algorithm, the returned scores were compared with the conventional morphology and fetal heartbeat outcomes of the transferred embryos (n = 1098). The contribution of the AI score to fetal heartbeat was analyzed by multivariate logistic regression in different patient populations, and the predictive ability of the models was measured by calculating the area under the receiver-operating characteristic curve (ROC-AUC).
Fetal heartbeat rate was related to AI score from FTi (P < 0.001), PW1i (P < 0.05), and PW2i (P < 0.001) images. The contribution of AI score to fetal heartbeat was significant in the oocyte donation program for PW2i (odds ratio (OR)=1.13; 95% CI [1.04-1.23]; P < 0.01), and in cycles with autologous oocytes for PW1i (OR = 1.18; 95% CI [1.01-1.38]; P < 0.05) and PW2i (OR = 1.15; 95% CI [1.02-1.30]; P < 0.05), but was not significantly associated with fetal heartbeat in genetically analyzed embryos. AI scores from the four groups of images varied according to morphological category (P < 0.001). The PW2i score differed in collapsed, non-re-expanded, or non-viable embryos compared to normal/viable embryos (P < 0.001). The predictability of the AI score was optimal at a post-warming incubation time of 3.3-4 h (AUC = 0.673).
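The odds ratios above come from logistic regression, where a coefficient β and its standard error map to OR = exp(β) with a Wald 95% CI of exp(β ± 1.96·SE). The sketch below shows that mapping and its inverse. The function names are hypothetical, and the inverse assumes the reported intervals are Wald intervals, which the abstract does not state.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_from_coef(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_odds_ratio(odds_ratio, lower, upper, z=Z_95):
    """Recover (beta, se) from a reported OR and its Wald confidence interval."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


def wald_p_value(beta, se):
    """Two-sided p-value of the Wald z statistic beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))
```

Applied to the rounded figures in the abstract, for instance `wald_p_value(*coef_from_odds_ratio(1.13, 1.04, 1.23))`, this gives approximate p-values that fall under the reported thresholds. Rounding of the published OR and CI limits the precision of such a back-calculation.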
The algorithm was designed to assess fresh embryos prior to vitrification, but not thawed ones, so this study should be considered an external trial.
The application of predictive software in the management of frozen embryo transfers may be a useful tool for embryologists, reducing the cancellation rates of cycles in which the blastocyst does not recover from vitrification. Specifically, the algorithm tested in this research could be used to evaluate thawed embryos both in clinics with time-lapse systems and in those with conventional incubators only, as just a single photo is required.
This study was supported by the Regional Ministry of Innovation, Universities, Science and Digital Society of the Valencian Community (CIACIF/2021/019) and by Instituto de Salud Carlos III (PI21/00283), and co-funded by European Union (ERDF, 'A way to make Europe'). M.M. received personal fees in the last 5 years as honoraria for lectures from Merck, Vitrolife, MSD, Ferring, AIVF, Theramex, Gedeon Richter, Genea Biomedx, and Life Whisperer. There are no other competing interests.
N/A.
Conversa L, Bori L, Insua F, Marqueño S, Cobo A, Meseguer M
... -
《-》
-
Deep learning as a predictive tool for fetal heart pregnancy following time-lapse incubation and blastocyst transfer.
Can a deep learning model predict the probability of pregnancy with fetal heart (FH) from time-lapse videos?
We created a deep learning model named IVY, which was an objective and fully automated system that predicts the probability of FH pregnancy directly from raw time-lapse videos without the need for any manual morphokinetic annotation or blastocyst morphology assessment.
The contribution of time-lapse imaging in effective embryo selection is promising. Existing algorithms for the analysis of time-lapse imaging are based on morphology and morphokinetic parameters that require subjective human annotation and thus have intrinsic inter-reader and intra-reader variability. Deep learning offers promise for the automation and standardization of embryo selection.
A retrospective analysis of time-lapse videos and clinical outcomes of 10 638 embryos from eight different IVF clinics, across four different countries, between January 2014 and December 2018.
The deep learning model was trained using time-lapse videos with known FH pregnancy outcome to perform a binary classification task of predicting the probability of pregnancy with FH given time-lapse video sequence. The predictive power of the model was measured using the average area under the curve (AUC) of the receiver operating characteristic curve over 5-fold stratified cross-validation.
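Stratified cross-validation keeps the proportion of positive and negative outcomes roughly equal in every fold. A minimal sketch of the splitting step is shown below. The function name, the seed, and the round-robin assignment are assumptions for illustration and are not taken from the study.

```python
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions.

    labels: list of class labels (e.g. 1 = FH pregnancy, 0 = no FH pregnancy).
    Returns a list of k sorted index lists; each fold serves once as the
    validation set while the remaining folds are used for training.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return [sorted(fold) for fold in folds]
```

The reported AUC would then be the mean of the per-fold validation AUCs. Whether embryos from the same patient were kept within one fold is not stated in the abstract, and this sketch does not enforce it.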
The deep learning model was able to predict FH pregnancy from time-lapse videos with an AUC of 0.93 [95% CI 0.92-0.94] in 5-fold stratified cross-validation. A hold-out validation test across eight laboratories showed that the AUC was reproducible, ranging from 0.95 to 0.90 across different laboratories with different culture and laboratory processes.
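A hold-out validation across laboratories can be organized as a leave-one-group-out loop, in which each laboratory in turn is excluded from training and used only for testing. The sketch below shows the index bookkeeping only. The function name and the group labels are hypothetical, and the abstract does not detail the exact protocol used.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each group.

    groups: list with one group label per sample (e.g. a laboratory ID).
    """
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train, test
```

Computing one AUC per held-out laboratory in this way yields a range of values rather than a single pooled figure.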
This study is a retrospective analysis demonstrating that the deep learning model has a high level of predictability of the likelihood that an embryo will implant. The clinical impacts of these findings are still uncertain. Further studies, including prospective randomized controlled trials, are required to evaluate the clinical significance of this deep learning model. The time-lapse videos collected for training and validation are Day 5 embryos; hence, additional adjustment would need to be made for the model to be used in the context of Day 3 transfer.
The high predictive value for embryo implantation obtained by the deep learning model may improve the effectiveness of previous approaches used for time-lapse imaging in embryo selection. This may improve the prioritization of the most viable embryo for a single embryo transfer. The deep learning model may also prove to be useful in providing the optimal order for subsequent transfers of cryopreserved embryos.
D.T. is the co-owner of Harrison AI that has patented this methodology in association with Virtus Health. P.I. is a shareholder in Virtus Health. S.C., P.I. and D.G. are all either employees or contracted with Virtus Health. D.G. has received grant support from Vitrolife, the manufacturer of the Embryoscope time-lapse imaging used in this study. The equipment and time for this study have been jointly provided by Harrison AI and Virtus Health.
Tran D, Cooke S, Illingworth PJ, Gardner DK
... -
《-》