Fourier-Based Action Recognition for Wildlife Behavior Quantification with Event Cameras
Hamann F., Ghosh S., Juárez Martínez I., Hart T., Kacelnik A., Gallego G.
Event cameras are novel bioinspired vision sensors that measure pixel-wise brightness changes asynchronously instead of capturing images at a fixed frame rate. They offer promising advantages, namely high dynamic range, low latency, and minimal motion blur. Modern computer vision algorithms often rely on artificial neural networks, which require image-like representations of the data and cannot fully exploit the characteristics of event data. Herein, approaches to action recognition based on the Fourier transform are proposed. The approaches are intended to recognize oscillating motion patterns commonly present in nature. In particular, they are applied to a recent dataset of breeding penguins annotated for “ecstatic display,” a behavior in which the observed penguins flap their wings at a certain frequency. The approaches are found to be both simple and effective, achieving results slightly below those of a deep neural network (DNN) while using five orders of magnitude fewer parameters. They work well despite the uncontrolled, diverse data in the dataset. It is hoped that this work opens a new perspective on event-based processing and action recognition.
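To make the general idea concrete, the following is a minimal sketch of how a Fourier transform can expose an oscillating motion pattern in event data. It is not the method of this work: it assumes that events are reduced to a list of timestamps (in seconds), that the event rate is binned into a one-dimensional signal, and that a detection score is the share of spectral power inside a frequency band. The function names, the bin width, the 2 to 6 Hz band, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def band_power_ratio(timestamps, duration, bin_width=0.01, band=(2.0, 6.0)):
    """Share of non-DC spectral power of the event rate that lies in `band` (Hz).

    `bin_width` and `band` are illustrative placeholders, not values
    taken from the paper.
    """
    n_bins = int(round(duration / bin_width))
    # Event-rate signal: number of events per time bin.
    counts, _ = np.histogram(timestamps, bins=n_bins, range=(0.0, duration))
    signal = counts - counts.mean()  # remove the constant (DC) component
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(n_bins, d=bin_width)
    total = spectrum[1:].sum()
    if total == 0:
        return 0.0
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(spectrum[in_band].sum() / total)


def synthetic_events(rng, duration, n_candidates, freq=None):
    """Hypothetical test data: event timestamps with an optional periodic rate."""
    t = rng.uniform(0.0, duration, size=n_candidates)
    if freq is None:
        return np.sort(t)  # constant event rate, no oscillation
    # Keep each candidate with a probability that oscillates at `freq` Hz.
    keep_prob = 0.5 * (1.0 + np.sin(2.0 * np.pi * freq * t))
    return np.sort(t[rng.uniform(size=n_candidates) < keep_prob])


rng = np.random.default_rng(0)
oscillating = synthetic_events(rng, duration=4.0, n_candidates=20000, freq=4.0)
steady = synthetic_events(rng, duration=4.0, n_candidates=20000)
print("oscillating:", band_power_ratio(oscillating, duration=4.0))
print("steady:     ", band_power_ratio(steady, duration=4.0))
```

In this sketch the score for the periodic event stream should be clearly higher than for the steady one, so a single threshold on the score would act as a detector with very few parameters. A real pipeline would also have to deal with spatial structure, camera noise, and background motion, none of which are modeled here.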