Paper title
Compact representation of temporal processes in echosounder time series via matrix decomposition
Paper authors
Paper abstract
The recent explosion in the availability of echosounder data from diverse ocean platforms has created unprecedented opportunities to observe marine ecosystems at broad scales. However, the critical lack of methods capable of automatically discovering and summarizing prominent spatio-temporal echogram structures has limited the effective and wider use of these rich datasets. To address this challenge, we develop a data-driven methodology based on matrix decomposition that builds a compact representation of long-term echosounder time series using intrinsic features in the data. In a two-stage approach, we first remove noisy outliers from the data by Principal Component Pursuit, then employ a temporally smooth Nonnegative Matrix Factorization to automatically discover a small number of distinct daily echogram patterns, whose time-varying linear combination (activation) reconstructs the dominant echogram structures. This low-rank representation provides biological information that is more tractable and interpretable than the original data, and is suitable for visualization and systematic analysis with other ocean variables. Unlike existing methods that rely on fixed, handcrafted rules, our unsupervised machine learning approach is well-suited for extracting information from data collected from unfamiliar or rapidly changing ecosystems. This work forms the basis for constructing robust time series analytics for large-scale, acoustics-based biological observation in the ocean.
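The two-stage approach described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a generic ADMM solver for Principal Component Pursuit with commonly used default parameters, and a simple multiplicative-update NMF with a quadratic penalty on day-to-day changes in the activations as a stand-in for "temporally smooth" NMF. The function names (`pcp`, `smooth_nmf`), the penalty weight `beta`, the clipping of the low-rank part to nonnegative values, and the synthetic data are all hypothetical choices made for illustration. Each column of the data matrix is assumed to hold one day's flattened echogram.

```python
import numpy as np

def shrink(X, tau):
    """Elementwise soft-thresholding (proximal operator of the L1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Soft-threshold the singular values (proximal operator of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Stage 1 (assumed solver): split M into low-rank L plus sparse S via ADMM."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = M.size / (4.0 * np.abs(M).sum()) if mu is None else mu
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint L + S = M
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)
        S = shrink(M - L + Y / mu, lam / mu)
        R = M - L - S
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S

def smooth_nmf(X, rank, beta=0.1, n_iter=300, eps=1e-9, seed=0):
    """Stage 2 (assumed form): NMF X ~ W @ H with the penalty
    beta * sum_t ||H[:, t] - H[:, t-1]||^2, using multiplicative updates.
    Expects at least two columns (time steps) in X."""
    rng = np.random.default_rng(seed)
    n_feat, n_time = X.shape
    W = rng.random((n_feat, rank)) + eps  # columns: daily echogram patterns
    H = rng.random((rank, n_time)) + eps  # rows: activations over days
    degree = np.full(n_time, 2.0)         # number of temporal neighbours per day
    degree[[0, -1]] = 1.0
    for _ in range(n_iter):
        neighbors = np.zeros_like(H)      # sum of previous and next activations
        neighbors[:, 1:] += H[:, :-1]
        neighbors[:, :-1] += H[:, 1:]
        H *= (W.T @ X + beta * neighbors) / (W.T @ W @ H + beta * H * degree + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H

# Synthetic stand-in for an echosounder time series: 40 features x 60 days.
rng = np.random.default_rng(1)
days = np.arange(60)
H_true = np.vstack([1 + np.sin(days / 10.0), 1 + np.cos(days / 7.0)])
M = rng.random((40, 2)) @ H_true
mask = rng.random(M.shape) < 0.05         # 5% of entries get outlier noise
M[mask] += rng.normal(0.0, 5.0, mask.sum())

L, S = pcp(M)                             # remove sparse outliers
W, H = smooth_nmf(np.maximum(L, 0.0), rank=2)  # clip to keep the NMF input nonnegative
```

In this sketch `W` plays the role of the daily echogram patterns and `H` the time-varying activations. Larger values of `beta` favor smoother activations at the cost of reconstruction accuracy, and the rank is a user choice rather than something the sketch estimates.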
