ESAPCA: enabling the analysis of extremely large data sets by scalable and hardware-accelerated PCA and DMD

Running

Early technology development

Implementation progress

05 March 2024

Duration: 18 months

Objective

Singular Value Decomposition (SVD) is indispensable and ubiquitous in data science and engineering: it is either a core part of important tools (PCA, POD, DMD, etc.) or used for dimensionality reduction in pre- or post-processing. However, for very large data sets, as they now arise in many disciplines inside and outside the space sector, SVD becomes computationally challenging because runtime and memory footprint usually grow superlinearly with data size. Specific use cases at ESA include, but are not limited to, long-term thermospheric density data, Earth observation SAR and optical imaging data, and in situ measurements of powder bed solidification. The goal of this project is to develop a parallel, GPU-accelerated implementation of SVD and related techniques, optimized for scalability on high-performance computing (HPC) systems, with a focus on interoperability within the Python (NumPy/SciPy/scikit-learn) data science ecosystem. To this end, we will build on the existing infrastructure for multi-node array computing in the Heat research software library (Refs. [1-3]). With this project, we want to close the gap between the ease of use of the Python NumPy/SciPy/scikit-learn ecosystem and the need for highly efficient, hardware-accelerated matrix decomposition in space science and engineering.
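To illustrate why SVD-based PCA can be split across nodes at all, the following is a minimal sketch in plain NumPy of one well-known approach for tall-and-skinny data: each row block is reduced to a small triangular factor by a local QR decomposition, and only those small factors are combined in a final SVD. This is an illustrative assumption, not the algorithm or API of this project or of the Heat library; the function name `blockwise_svd`, the block count, and the random test data are hypothetical.

```python
import numpy as np


def blockwise_svd(blocks):
    """Singular values and right singular vectors of the row-stacked blocks.

    Hypothetical helper for illustration only: each block stands in for the
    part of the data matrix held by one process.
    """
    # Local step: reduce every (rows_i x n) block to its (n x n) factor R_i.
    r_factors = [np.linalg.qr(block, mode="r") for block in blocks]
    # Global step: the stacked R_i share the singular values and the right
    # singular vectors of the full matrix, but have only (num_blocks * n) rows.
    _, s, vt = np.linalg.svd(np.vstack(r_factors), full_matrices=False)
    return s, vt


# Synthetic tall-and-skinny data set (assumed sizes, not project data).
rng = np.random.default_rng(0)
data = rng.standard_normal((10_000, 20))
data -= data.mean(axis=0)  # PCA requires column-centred data

blocks = np.array_split(data, 8, axis=0)  # pretend these live on 8 processes
s, vt = blockwise_svd(blocks)

# PCA quantities: rows of vt are the principal axes, and the explained
# variance follows from the singular values.
explained_variance = s**2 / (data.shape[0] - 1)
```

In this sketch only the small triangular factors need to be communicated, which is what makes the approach attractive when the number of samples far exceeds the number of features. In a truly distributed setting the column means would also have to be computed globally before the local step.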

Contract number

4000144045

Programme

Discovery

OSIP Idea Id

I-2023-00566

Related OSIP Campaign

Open Discovery Ideas Channel

Main application area

Generic for multiple space applications

Budget

174996€

Resilience, Crisis and Security

Germany

DLR - German Aerospace Center

TEC-MSP
