Wednesday, September 14, 2022
AI for Earth
Time Series Forecasting
Challenge results

AI-based early warning system for river floods

Forecasting flash floods with LSTM, ARIMA and Prophet using time series data from hydrological sensors monitoring French rivers.

Predicting propagation of floods in time to save lives   

Flash floods have been a major cause of disaster for years, costing lives and taking away people's homes and livelihoods. The climate-change-fuelled extreme events of the last two years gave us a sneak peek at how early warning systems might become one of the most critical applications in every corner of the planet.

The AI for Earth engineers tackled the unpredictability of flash floods in the AI for Earth - Inland Floods Prediction Challenge. The goal was to accurately predict a river flood event from water level data measured upstream. It would serve as input for early warning systems and help mitigate the impact of floods across a river’s course.

Water level data from hydrological sensors  

The data the machine learning models were to be trained and tested with came from a network of hydrological stations monitoring river estuaries in France. We used two different data sources:

  • Data from sensors installed by Vortex.io - a real-time hydrological data service building prediction systems. Although a variety of monitoring data was available (weather, wind, etc.), we decided to focus on time series of water levels (river height) sampled roughly every 5 minutes.
  • Open data from Vigicrues - the French national information service on the flooding risk of the main rivers in France. These measurements also came from sensors recording water levels.

The data required different preprocessing steps. First, the water-height measurements had to be extracted from the raw exports. The two sources varied in format, and the heights were reported on different scales. We homogenized the data, applied consistent naming conventions and units (converting all water levels to meters), and resampled the time series to one data point per hour so the pipeline would be data-source independent.
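The homogenization step can be sketched with pandas. The column name and `unit` parameter below are hypothetical, since the raw export formats of the two sources weren't published:

```python
import pandas as pd

def homogenize(df, height_col, unit="m"):
    """Standardize a raw sensor export into an hourly water-level series in
    meters. `height_col` and `unit` are hypothetical parameters: the real
    exports differed between Vortex.io and Vigicrues."""
    out = df[[height_col]].rename(columns={height_col: "water_level_m"})
    scale = {"m": 1.0, "cm": 100.0, "mm": 1000.0}[unit]
    out["water_level_m"] = out["water_level_m"] / scale
    # resample the ~5-minute readings to one point per hour (mean),
    # making downstream code data-source independent
    return out.resample("1h").mean()
```

Each source would get its own call, e.g. `homogenize(raw, "hauteur_mm", unit="mm")`, after which all stations share the same schema and frequency.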

A significant number of measurements was missing in both datasets (NaN). Sensor stations with more than 20% missing data were omitted; for the rest, we imputed the missing values using interpolation.

The battle of 3 models for greater prediction accuracy

Following data processing and analysis, the teams started testing the model types best suited to the flood prediction use case.

Accuracy was the most important property of the final model. When a model predicts a flood event, a warning system is triggered to notify people to evacuate the area. People should neither be falsely told to evacuate nor left unwarned of an upcoming flood. We also had to balance the model's accuracy against how many hours ahead the prediction reaches: people need enough time to see the warning and evacuate the area.

The model had to fulfil certain requirements: 

  • Regressive forecast
  • Multivariate
  • Handle seasonality

This yielded several options for experimentation:

  • ARIMA (Autoregressive Integrated Moving Average) - a classic linear framework for time series prediction
  • Prophet - a traditional additive regression model with a piecewise linear or logistic growth curve trend
  • LSTMs (Long Short-Term Memory) - a powerful recurrent neural network

We split the group into two teams working in parallel, to come up with several possible avenues of usable prediction models and to test assumptions on real-life data.

Since the main goal of the project was to accurately predict floods (and, to a lesser extent, droughts), we considered it most important to inspect whether those events were accurately predicted. We wanted a prediction as accurate as possible for a window within 24 hours.

TEAM 1: Linear vs. Deep Learning prediction models

Our Inland Floods Prediction Team 1 split into three sub-teams to experiment with all three models. To compare the models' results, we used the same training and test sets from two stations, selecting datasets that included flood events.

The following metrics were used to evaluate the models:

  • Root Mean Squared Error (RMSE)
  • Mean Absolute Error (MAE)
  • Mean Absolute Percentage Error (MAPE)
  • Correlation (Corr)

Data from the Marmande station

Below are the three models' 6-hour-ahead predictions on the 'Marmande' test set. To produce these plots, we used a week of data to predict the next 6 hours, and repeated this iteratively until the whole test set was predicted.

ARIMA predictions

LSTM predictions

Prophet predictions

The ARIMA model managed to outperform the other two on all the evaluation metrics, achieving an average Mean Absolute Error of less than 9 centimeters. Even though its long-term capabilities were quite limited, within the 6-hour prediction timeframe it produced the best results in the team's experimentation.

“This was a surprising moment for us. The recurrent neural network LSTM was expected to do a better job at time series prediction. It's a newer, way more complex algorithm. Moreover, both deep learning models were multivariate, meaning they also used the water level data of all the stations upstream of the forecasted station. This was the biggest takeaway for us - that a newer, more advanced neural network fed with a lot of data can still be outperformed by a linear univariate forecasting model like ARIMA.” - Georgios Gkenios, AI for Earth engineer

ARIMA fine-tuned and a repeatable forecast pipeline created

We went on to fine-tune the ARIMA model further, tweaking its predictive code to achieve predictions accurate within a 10 cm range on average, up to 6 hours in advance, with an error margin of less than 10%.

With one week of the Challenge left, we decided to focus on the model's real-world feasibility: ease of deployment and scalability. We wanted a defined, clear sequence of steps that starts with the raw data and ends with the forecast: homogenizing the data, imputing missing values, fixing the time series frequency, loading the model, and making the prediction.

We delivered the blueprint of an ARIMA-based model that can predict with high accuracy the likelihood of flash floods in the short term (within 6 hours). The predictive model has the potential to deliver valuable short term predictions, especially for populations in small towns that lack the necessary alarm infrastructure. It can be easily generalized to other rivers, as it only uses data from hydrological sensors measuring water levels and does not require other data sources (e.g. weather forecasts, satellite imagery or hydrological data).

TEAM 2: Getting Prophet to do its magic for long-term forecasts

After a short stint experimenting with both LSTMs and Prophet, our Inland Floods Prediction Team 2 settled on the latter, which performed better. We decided to squeeze the best results out of the Prophet algorithm.

Below, you can see the results for two sensor stations:

  • Marmande, where significant floods occurred
  • Cadillac, which is close enough to the coast that tidal effects came into play

We tested multiple prediction lengths (e.g. 2 h, 8 h and 24 h). In the final analysis we focused on the 24-hour prediction, since this is where Team 1's performance degraded while Prophet still achieved reasonable results.

We experimented with 3 approaches:

  • UV - univariate; forecast in the future using only the measurements you’ve obtained at the present sensor station
  • MV - multivariate; forecast in the future using also data from upstream sensor stations
  • MV+ - rolling forecast, multivariate; forecast using upstream sensor data; the model is retrained after every forecast, using the new bit of information.

The multivariate rolling forecast outperformed all others and traced the actual measurements very well. However, at flooding events it lagged the actual measurements by a couple of hours.

Forecast on the Marmande station test data
Forecast on the Cadillac station test data

At Cadillac, MV+ and MV are close. The model is able to recognise the seasonalities in the measurements; however, performance is worse than at Marmande (more deviations).

The multivariate rolling forecast was best at 24hr advance predictions

The multivariate model gives great forecasts. On the Marmande dataset, performance could be improved by including more events, a feature of the Prophet model. An example event: if the water has risen 1 m within a given timeframe, the event is triggered. This could improve performance at flooding events.
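One way to encode such events is Prophet's holidays mechanism: detect rapid rises in the training data and pass them in as a holidays frame. The 1 m threshold comes from the example above; the 6-hour window and the rest of the sketch are assumptions:

```python
import pandas as pd

def rise_events(df, threshold_m=1.0, window_h=6):
    """Build a Prophet 'holidays' frame from timestamps where the water level
    rose by at least `threshold_m` within `window_h` hours. The window length
    is an assumption; the text only fixes the 1 m rise."""
    rise = df["y"] - df["y"].shift(window_h)
    hits = df.loc[rise >= threshold_m, "ds"]
    return pd.DataFrame({
        "holiday": "rapid_rise",
        "ds": hits.dt.normalize(),  # Prophet holidays are dated by day
        "lower_window": 0,
        "upper_window": 1,          # let the effect carry into the next day
    }).drop_duplicates("ds")
```

The resulting frame would then be supplied as `Prophet(holidays=rise_events(train))`.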

At Cadillac, the sensor station close to the coast, performance could be improved by fine-tuning the seasonality components. Seasonality in this case means the influence of tidal ebb and flow on the exposed river estuary. Encoding this expert knowledge as extra seasonalities in the model would make it easier for the model to make accurate predictions.

“Prophet MV+ model excelled when forecasting 24 hours in the future, even compared to the outcomes of Team 1.  While ARIMA had comparable performance to Prophet at short-term (up to 6 hrs) it didn’t keep up with its accuracy for the 24 hour time span.” - Flin Verdaasdonk, AI for Earth engineer 

For further follow-up steps, we'd suggest tweaking the Prophet method by:

  • Providing a more thorough quantitative analysis, including evaluation metrics
  • Adding models trained to reduce the residual error
  • Finding events which improve flooding predictions
  • Inspecting whether coastal predictions can be improved using additional seasonalities

It would also be interesting to experiment with forecasts (with any model) that include weather data and satellite image analysis.

Managing teams through vacation season 

The AI for Good Challenges require a lot of coordination and mentorship (weekly meetings, hurdle analysis and progress reports) so participants don't disconnect from the project. Vacation season proved challenging: we lost anywhere from several weeks of engineering capacity to entire teams.

But for us who stayed to the end, the Challenge felt like building a new, small data team within a startup. We had to set roles, methodologies, communication strategies; and work as a completely autonomous team. This involved picking up responsibility, project managing, scripting and laughs. One shared emotion of our team was the gratitude for the purpose of this Challenge and for the FruitPunch AI team, who always gave us positive reinforcement and helped us understand the value of group collaboration. 

Georgios Gkenios & Flin Verdaasdonk

AI for Earth II engineers

Inland Floods Prediction Team 1:  Agustin Iniguez Rabago, Georgios Gkenios, Kiki van Rongen, Pavlos Skevofylax, Sabrina Wirjopawiro, Samantha Biegel

Inland Floods Prediction Team 2: Flin Verdaasdonk, Sabelo Mcebo Makhanya
