BirdVoxDetect: Large-Scale Detection and Classification of Flight Calls for Bird Migration Monitoring

Articles dans une revue

Auteurs : Vincent Lostanlen, Aurora Cramer, Justin Salamon, Andrew Farnsworth, Benjamin M. Van Doren, Steve Kelling, Juan Pablo Bello.

Publié dans : IEEE/ACM Transactions on Audio, Speech and Language Processing

Date de publication : 2024

Acoustic signal detectionAudio databasesDeep learningEcosystemsPhylogeny
Lien vers le dépot HAL

Abstract


Sound event classification has the potential to advance our understanding of bird migration. Although it is long known that migratory species have a vocal signature of their own, previous work on automatic flight call classification has been limited in robustness and scope: e.g., covering few recording sites, short acquisition segments, and simplified biological taxonomies. In this paper, we present BirdVoxDetect (BVD), the first full-fledged solution to bird migration monitoring from acoustic sensor network data. As an open-source software, BVD integrates an original pipeline of three machine learning modules. The first module is a random forest classifier of sensor faults, trained with human-in-the-loop active learning. The second module is a deep convolutional neural network for sound event detection with per-channel energy normalization (PCEN). The third module is a multitask convolutional neural network which predicts the family, genus, and species of flight calls from passerines (Passeriformes) of North America. We evaluate BVD on a new dataset (296 hours from nine locations, the largest to date for this task) and discuss the main sources of estimation error in a real-world deployment: mechanical sensor failures, sensitivity to background noise, misdetection, and taxonomic confusion. Then, we deploy BVD to an unprecedented scale: 6672 hours of audio (approximately one terabyte), corresponding to a full season of bird migration. Running BVD in parallel over the full-season dataset yields 1.6 billion FFT's, 480 million neural network predictions, and over six petabytes of throughput. With this method, our main finding is that deep learning and bioacoustic sensor networks are ready to complement radar observations and crowdsourced surveys for bird migration monitoring, thus benefiting conservation ecology and land-use planning at large.