Whenever the European Space Agency (ESA) launches a new satellite carrying one or more innovative instruments to monitor our planet’s surface and atmosphere, these instruments begin collecting data, which is sent down via a telecommunications link to selected ground stations on Earth. Once the data has been acquired at the station, further processing transforms the raw stream of bits into a coherent image that follows defined file structure and interface standards. The processed files are then shared with scientific and operational users across Europe for use in value-adding and research activities.
As the mission progresses, the volume of data collected by the instruments grows with it. Meanwhile, the scientific community continues to expand its knowledge of which algorithms to use when processing the data sets to extract the relevant measurement parameters. When a new algorithm has been developed, tested and validated, the community wants it applied when processing the instrument data. However, for time series analyses and backwards compatibility, it is also important to re-process the existing, already collected data archive.
As time passes, this data archive may grow considerably, and re-processing the entire archive with a new algorithm and corresponding data processor becomes an exercise in big data processing.
Re-processing the vast number of satellite data files found in an entire mission archive efficiently requires dedicated technology components to be developed and used.
This is the background for the DSI (Data Service Initiative) project, led by Serco SpA. The project takes a strategic, service-oriented approach to providing ESA with an efficient system for re-processing and bulk-processing satellite data, to help meet the needs of the user community. The DSI project is delivered by the X-PReSS consortium (Serco, CEMS, Engineering, IFremer, INTA, S&T, Magellium, Sistema).
The project provides services to ESA for data collection, consolidation, processor integration and bulk re-processing, and support for data repatriation, information and configuration management.
The underlying technology consists of a framework for tasking, scheduling, archiving, tracking and configuring complex processing chains on a network of processing nodes and remote distributed servers. This framework has been developed by S&T.
The system developed provides its users with:
- Enhanced control, knowledge and accounting of the status, attributes, location, provenance, lineage, quality, completeness and other key attributes of the data;
- Awareness of the relationships and dependencies between data, documents, processing and transformation systems, etc.;
- Enhanced ability to control and track the evolution and exploitation of the data.
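To make the ideas of provenance, lineage and dependency tracking more concrete, the sketch below shows one way such bookkeeping could work in principle. It is an illustration only and does not represent the DSI framework: the `Product` and `Catalogue` classes, the processor names and the product identifiers are all hypothetical. Each product records which processor version created it and which products it was derived from, so that a new processor version can be translated into a list of products to re-process.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Product:
    """One archived data file plus a record of how it was produced (hypothetical model)."""
    product_id: str
    processor: str            # name of the processor that created the file
    processor_version: str    # version of the algorithm that was used
    inputs: tuple = ()        # ids of the products this one was derived from


@dataclass
class Catalogue:
    """In-memory stand-in for a data catalogue; a real system would use a database."""
    products: dict = field(default_factory=dict)

    def register(self, product):
        self.products[product.product_id] = product

    def lineage(self, product_id):
        """Return the ids of all ancestors of a product, nearest first."""
        seen = []
        queue = list(self.products[product_id].inputs)
        while queue:
            pid = queue.pop(0)
            if pid in seen:
                continue
            seen.append(pid)
            parent = self.products.get(pid)
            if parent is not None:
                queue.extend(parent.inputs)
        return seen

    def stale(self, processor, current_version):
        """Ids of products made by an outdated version of `processor`,
        plus every product derived from them."""
        outdated = {
            p.product_id
            for p in self.products.values()
            if p.processor == processor and p.processor_version != current_version
        }
        changed = True
        while changed:  # propagate staleness down the dependency chain
            changed = False
            for p in self.products.values():
                if p.product_id not in outdated and outdated.intersection(p.inputs):
                    outdated.add(p.product_id)
                    changed = True
        return sorted(outdated)


# Hypothetical archive: one raw acquisition, two level-1 files, one level-2 file.
catalogue = Catalogue()
catalogue.register(Product("raw-001", "acquisition", "1.0"))
catalogue.register(Product("l1-001", "l1_processor", "1.0", ("raw-001",)))
catalogue.register(Product("l1-002", "l1_processor", "1.1", ("raw-001",)))
catalogue.register(Product("l2-001", "l2_processor", "2.0", ("l1-001",)))

print(catalogue.lineage("l2-001"))             # ['l1-001', 'raw-001']
print(catalogue.stale("l1_processor", "1.1"))  # ['l1-001', 'l2-001']
```

In this toy archive, releasing version 1.1 of the level-1 processor marks not only the old level-1 file as stale but also the level-2 file derived from it, which is the kind of dependency awareness a bulk re-processing campaign relies on.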