Tuesday, September 6, 2022

How we applied AI to prevent sepsis in preterm babies

A case study on using XGBoost for time series forecasting to predict the onset of sepsis in preterm infants within a 12-hour prediction horizon.

Improving the chances of preterm-born infants

About 20% of the preterm-born infants admitted to the newborn intensive care unit (NICU) develop sepsis, which is associated with higher mortality and adverse long-term effects. In the AI for Health - Sepsis Prevention project, we applied machine learning to predict whether a preterm baby is going to develop sepsis. Sepsis is a reaction to an infection and can be life-threatening. Early prediction of the onset gives doctors the necessary time to apply preventative measures.

Healthcare professionals meet data scientists

Our team of 4 data scientists worked for 20 weeks to address the problem. We worked in very close collaboration with the hospital. Intensive care is an extremely sensitive topic in medicine, and we had to establish a good basic understanding of the application area. Since we were all located in the Netherlands, we even got a chance to visit the premises of the NICU in person for some hands-on experience.

AI for Health engineers at the NICU of UMC Utrecht

We decided to split the team and work on two solutions. One was to improve the existing logistic regression model that UMC Utrecht developed. The other was to create a new XGBoost model that would outperform the improved existing model.

Existing logistic regression model got an upgrade

We started with improving the already existing logistic regression model UMC Utrecht used. We applied two methods:

hyperparameter optimization and
feature engineering

For the hyperparameter optimization we used random search and grid search with cross-validation. We did this on the original set of features and on only a portion of the patients due to computational limitations. We still managed to improve the ROC AUC score from 0.56 to 0.67.

The goal of the second approach was to create new features based on the minimum, mean, variance, peaks, and drops of those original features which have a low number of null values. We computed their feature importance by using a simple regression model and gradient boosting model to measure each feature’s influence on the outcome of the model.

We computed a correlation matrix, which gave us the opportunity to manually explore possible feature combinations for our logistic regression model. Finally, we used a sequential feature selector to find the best 5 to 8 features for our model. We found that the best combination of features was:

HF mean (2h) - heart frequency mean over a 2 hour interval;
SpO2 drops (2h) - how many oxygen saturation drops occur in a 2 hour interval;
HF variance (2h) - heart frequency variance over a 2 hour interval;
Bradycardia (2h) - how often a too slow heart rate occurs in a 2 hour interval;
AdemF mean - breathing frequency.
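The window features in the list above can be sketched roughly as follows. This is an illustration, not the project's code: the function names, the thresholds (100 bpm for bradycardia, 85% for an SpO2 drop) and the rule that an episode is a run of consecutive below-threshold samples are all assumptions made for the example, not clinical definitions from the hospital.

```python
from statistics import mean, pvariance


def count_episodes(values, threshold):
    """Count runs of consecutive samples below `threshold`.

    A missing sample (None) ends the current run.
    """
    episodes = 0
    in_episode = False
    for v in values:
        below = v is not None and v < threshold
        if below and not in_episode:
            episodes += 1
        in_episode = below
    return episodes


def window_features(heart_rate, spo2, hr_threshold=100, spo2_threshold=85):
    """Aggregate one window of samples into a handful of features.

    Thresholds are hypothetical placeholders, not clinical values.
    """
    hr = [v for v in heart_rate if v is not None]
    return {
        "hf_mean": mean(hr),
        "hf_variance": pvariance(hr),  # population variance; an assumption
        "bradycardia": count_episodes(heart_rate, hr_threshold),
        "spo2_drops": count_episodes(spo2, spo2_threshold),
    }


# A short made-up window; a real 2-hour window would hold many more samples.
heart_rate = [150, 148, 95, 90, None, 152, 98, 151]
spo2 = [96, 84, 83, 95, 80, 97]
features = window_features(heart_rate, spo2)
print(features)
```

Counting episodes rather than individual low samples keeps one long drop from being counted many times; whether the original features were defined this way is not stated in the project write-up.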

Despite improving the ROC AUC score from 0.56 to 0.67, we were still operating at near chance level. The change had to be more fundamental for a usable outcome. We had to reformulate the problem. We realized quite early that:

The logistic regression model didn’t cater to the problem - it just flagged an off-the-chart value (or combination of values) when something was already happening. This might not give the medical staff the time window they need to maximize the probability of a successful intervention.
We would have to move away from real-time analysis of a data stream and towards predicting future events from historical chunks of data. A classic time series forecasting problem!

“We needed a model to predict which patient will develop sepsis within the next 12 hours. A traffic police officer that doesn’t just direct & flag real-time data traffic, but one that knows at which intersection important stuff will happen. Show me what traffic crosses the intersection right now and I’ll tell you what happens in the next 12 hours - that’s the goal.” - Kamal Elsayed, AI for Health engineer

Developing a new XGBoost time series forecasting model

The goal here was to create an XGBoost time series forecasting classification model that predicts the onset of sepsis in preterm infants within a 12-hour prediction horizon.

The model was trained on these features:

arterial blood pressure
diastole arterial blood
pressure systole
incubator measured temperature
monitor temperature
heart rate pulse
heart rate pleth
monitor heart rate
respiratory rate
O2 saturation
gestation age
gender

Data Pre-processing

The aforementioned features are minute-by-minute time-series data streams which were recorded from a set of invasive and non-invasive medical instruments in the incubator. Each feature data stream could extend to multiple days. An event feature shows per timestamp if a notable medical intervention or an administrative event occurred. Notable events include: admission, discharge, death, negative or positive blood cultures. A positive blood culture confirms a positive sepsis case, and the corresponding timestamp is marked as sepsis onset (t_sepsis).

For every patient, each of the 10 physiological markers was subset to the 12 hours of data that directly preceded a sepsis timestamp in a case patient, or a control timestamp in a control patient. To maintain a balanced dataset, the number of positive patients equaled the number of negative patients: 398 control patients were pseudo-randomly drawn from the pool of 2196 control patients. The selection was constrained to match the distribution of the gestation age.
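One simple way to draw controls so that their gestation age distribution matches the cases is to pick, for every case, one unused control with the same gestation age. The sketch below does exactly that; matching on whole weeks, the function name and the toy ages are assumptions for illustration, since the article does not describe the actual matching procedure.

```python
import random
from collections import defaultdict


def draw_matched_controls(case_ages, control_ages, seed=0):
    """Return one control index per case, matched on gestation age.

    Ages are assumed to be whole weeks. Controls are drawn without
    replacement using a seeded (pseudo-random) generator.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for idx, age in enumerate(control_ages):
        pool[age].append(idx)
    for bucket in pool.values():
        rng.shuffle(bucket)

    chosen = []
    for age in case_ages:
        if not pool[age]:
            raise ValueError(f"no unused control left for gestation age {age}")
        chosen.append(pool[age].pop())
    return chosen


# Toy data: 4 cases and a pool of 10 controls (ages in weeks, made up).
case_ages = [26, 26, 28, 30]
control_ages = [25, 26, 26, 26, 27, 28, 28, 30, 30, 31]
chosen = draw_matched_controls(case_ages, control_ages)
print(sorted(control_ages[i] for i in chosen))
```

Fixing the seed makes the draw reproducible, which helps when the same balanced subset has to be rebuilt for later experiments.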

Over this extracted 12-hour segment, a window of 3 hours was run over every physiological marker to aggregate a set of 8 statistical features per window. This created a total of 320 training features per patient (10 physiological markers x 8 statistics x 4 intervals of 3 hours). Additionally, we added the gestation age and gender as features.
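The arithmetic behind the 320 features can be made concrete with a small sketch. It assumes minute-by-minute data (720 samples per marker per 12-hour segment), non-overlapping 3-hour windows, and a hypothetical set of 8 statistics; the article names minimum, mean and median, while the other five used here are placeholders. Missing values are not handled.

```python
import statistics

WINDOW = 180    # 3 hours of minute-by-minute samples
N_WINDOWS = 4   # a 12-hour segment split into 3-hour intervals


def window_stats(values):
    """Eight summary statistics for one window (the exact set is assumed)."""
    q25, _, q75 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),
        "range": max(values) - min(values),
        "q25": q25,
        "q75": q75,
    }


def segment_features(segment):
    """Flatten {marker: 720 samples} into one feature dict per patient."""
    features = {}
    for marker, series in segment.items():
        if len(series) != WINDOW * N_WINDOWS:
            raise ValueError(f"{marker}: expected {WINDOW * N_WINDOWS} samples")
        for w in range(N_WINDOWS):
            chunk = series[w * WINDOW:(w + 1) * WINDOW]
            for stat, value in window_stats(chunk).items():
                features[f"{marker}_{stat}_int{w + 1}"] = value
    return features


# Synthetic segment: 10 made-up markers, 720 samples each.
segment = {
    f"marker_{i}": [float((i + t) % 50) for t in range(720)]
    for i in range(10)
}
features = segment_features(segment)
print(len(features))
```

Naming each feature by marker, statistic and interval keeps the later interpretability analysis readable, because every column can be traced back to a measurement and a time window.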

The targets used in model training were derived from the event feature. A 12-hour segment from a patient was assigned the positive class if it directly preceded a sepsis timestamp. If no sepsis event followed, the segment was assigned the negative class.

Model Validation

This model was trained and evaluated using a repeated nested cross-validation procedure to simultaneously search for the optimal parameters and evaluate the test scores. Both the inner and outer cross-validation loops used k = 4 and each loop was repeated 10 times. The inner loop used a random search over a set of probability distributions.
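A repeated nested cross-validation of this shape can be sketched with scikit-learn. Several things here are assumptions rather than the project's setup: scikit-learn's GradientBoostingClassifier stands in for XGBoost so that only scikit-learn and SciPy are needed, the search distributions are invented, average precision is used as the inner selection metric, and the data is synthetic. The demo call also uses fewer repeats and search iterations than the 10 repeats described above, purely to keep it quick.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import (
    RandomizedSearchCV,
    RepeatedStratifiedKFold,
    cross_val_score,
)


def nested_cv_scores(X, y, n_repeats=10, n_iter=20, seed=0):
    """Outer loop estimates test performance; inner loop tunes parameters."""
    # Hypothetical search space: each entry is a distribution to sample from.
    param_distributions = {
        "n_estimators": randint(20, 60),
        "max_depth": randint(2, 5),
        "learning_rate": uniform(0.01, 0.3),
    }
    inner = RepeatedStratifiedKFold(
        n_splits=4, n_repeats=n_repeats, random_state=seed
    )
    search = RandomizedSearchCV(
        GradientBoostingClassifier(random_state=seed),  # stand-in for XGBoost
        param_distributions,
        n_iter=n_iter,
        scoring="average_precision",
        cv=inner,
        random_state=seed,
    )
    outer = RepeatedStratifiedKFold(
        n_splits=4, n_repeats=n_repeats, random_state=seed + 1
    )
    # Each outer fold refits the whole search on its own training part.
    return cross_val_score(search, X, y, scoring="average_precision", cv=outer)


X, y = make_classification(n_samples=120, n_features=10, random_state=0)
scores = nested_cv_scores(X, y, n_repeats=2, n_iter=3)
print(len(scores), round(float(scores.mean()), 3))
```

Because the parameter search only ever sees the training part of each outer fold, the outer scores are not biased by the tuning. With the xgboost package installed, its scikit-learn compatible XGBClassifier could replace the stand-in estimator, with the search space adjusted to its parameters.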

The model reached an average precision of 0.90 on the test set, as seen in the graph below. This shows promise for actual implementation of the model. Further testing on new, unseen data is needed to establish generalizability and clinical feasibility.
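For readers unfamiliar with the metric: average precision summarizes the precision-recall curve by averaging the precision obtained at the rank of each true positive. On a balanced dataset like this one, a classifier that guesses at random scores roughly 0.5. The sketch below is a plain-Python illustration on made-up labels and scores, and it assumes there are no tied scores.

```python
def average_precision(y_true, y_score):
    """Mean of precision@k taken at the rank k of every positive sample.

    Assumes binary labels (0/1) and no tied scores.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / n_pos


# Made-up example: positives land at ranks 1, 3 and 4.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
ap = average_precision(y_true, y_score)
print(round(ap, 3))  # (1/1 + 2/3 + 3/4) / 3 -> 0.806
```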

Explainable AI in action

We developed a prediction interpretability analysis of the XGBoost model using SHAP. SHAP is a model-agnostic technique that explains individual predictions by attributing them to the input features using so-called Shapley values.
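To give an intuition for what a Shapley value is, the sketch below computes them exactly by brute force for a tiny made-up scoring function, then ranks features by their mean absolute value across samples. It does not use the SHAP library or the project's model. The toy score, its coefficients, the feature names and the baseline are all hypothetical, and "removing" a feature is simplified to replacing it with a fixed baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`, relative to `baseline`."""
    n = len(x)

    def value(subset):
        # Features outside the subset are replaced by their baseline value.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk_score(f):
    """Made-up score with one interaction term; not a clinical model."""
    temp_min, hr_mean, spo2_mean = f
    cold = 36.5 - temp_min
    fast = hr_mean - 150.0
    return 0.8 * cold + 0.02 * fast + 0.01 * (95.0 - spo2_mean) + 0.01 * cold * fast


names = ["incubator_temp_min", "heart_rate_mean", "spo2_mean"]
baseline = [36.5, 150.0, 95.0]
samples = [[35.9, 170.0, 93.0], [36.4, 155.0, 96.0], [36.0, 150.0, 90.0]]

all_phi = [shapley_values(toy_risk_score, x, baseline) for x in samples]
mean_abs = [
    sum(abs(phi[i]) for phi in all_phi) / len(all_phi) for i in range(len(names))
]
for name, score in sorted(zip(names, mean_abs), key=lambda p: p[1], reverse=True):
    print(f"{name}: {score:.4f}")
```

For every sample the values add up to the difference between the prediction and the baseline prediction, which is what makes them usable as a per-patient explanation. The brute-force loop grows exponentially with the number of features, which is why practical tools rely on faster, model-specific algorithms.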

Top 6 features, SHAP of individual predictions

The most important findings from this model were:

Minimum incubator measured temperature (interval 3) had the highest average absolute SHAP value.
Both mean and median heart frequency ranked among the 20 most impactful features.
Setting aside which statistic was applied, the most prominent measurements were incubator measured temperature, arterial blood pressure systole and heart frequency.
Unlike in the original logistic regression model, O2 saturation was far less dominant in our XGBoost model. It wasn’t present in the top 10 features.

Due to the different pre-processing of the dataset, we couldn’t compare the performance of the 2 models and the influence of measured features one-to-one. The advantage of the XGBoost model lies in its prediction capability - it gives the hospital enough advance time to pay attention to specific patients. Our suggestion to the hospital team was to further validate the model accuracy on patient data in clinical practice.

Missing data held back more advanced techniques

Our team tried to include a more advanced filtering technique, the fast Fourier transform, to build new features from the heart rate variable. But this technique proved difficult to implement due to the large amount of missing data. Dealing with the missing data and working in a virtual environment with limited memory proved to be the biggest hurdles throughout the Challenge.

We know that it’s extremely difficult to do measurements on preterm babies consistently, but it would be of great benefit. Measuring patient data without interruption would ensure a low number of missing values for crucial features like heart rate or O2 saturation, which would improve the prediction models considerably.

What I learnt about applying ML in real life (and about the need for Explainable AI)

What surprised me the most in this Challenge was how many factors come into play when you design a machine learning solution for a particular purpose. It is not just about which algorithm performs the best. In this case it was very important that the outcome of the algorithm was explainable. Doctors have to make medical decisions based on these outputs and therefore can’t just trust them blindly.

UMC Utrecht considered our results a success and is already planning similar initiatives to deploy AI for clinical purposes. Both sides learned a lot from each other; our team of data scientists got seasoned in medical AI and the hospital got valuable machine learning models as well as a blueprint for similar projects.

AI for Health - Preventing Sepsis team of engineers

I’d love to give a shout-out to the entire AI against Sepsis team. We did good and learned on the way. By improving the existing model and creating a new one, we hope that sepsis in more preterm babies can be signaled early on. When the babies receive their treatment earlier, severe consequences can be prevented, and this might even save some lives.

Laura Didden

AI for Health Engineer

AI for Health - Predicting Sepsis Team: Kamal Elsayed, Simon Sukup, Simona Stoyanova, Laura Didden
