By Dr James Grogan, Katie O’Connor, Dr Buket Benek Gursoy @ICHEC
The IO-SEA project is concerned with challenges in large-scale storage for Exascale computing. Hierarchical Storage Management (HSM) is a paradigm where diverse storage media, such as tape or SSDs, are combined in a single system and data is routed to them in a way that balances storage costs with capacity and bandwidth needs. Exploiting the relative strengths of different storage media is believed to be key to supporting Exascale needs.
An ‘HSM API’ has been developed in the IO-SEA project to provide a single entry point to long-term storage infrastructure, as shown in Figure 1. The ‘public’ elements of the API allow consumers to add and retrieve data and metadata within the system via standard object-store semantics. A ‘private’ element allows system integrations to enact optimal data placement by routing data to the most suitable storage media.
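The split between ‘public’ and ‘private’ operations can be illustrated with a minimal in-memory sketch. This is not the HSM API or Hestia's interface: the class ToyHsmStore, its method names (put, get, get_metadata, move, tier_of) and the tier names are all hypothetical, chosen only to show consumers working with objects while a separate operation changes where an object lives.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: payload, user metadata, and the index of the tier holding it."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    tier: int = 0


class ToyHsmStore:
    """Hypothetical in-memory stand-in for a tiered object store (not Hestia's API)."""

    def __init__(self, tier_names):
        self.tier_names = list(tier_names)  # assumption: index 0 is the fastest tier
        self._objects = {}

    # 'Public' operations: consumers add and retrieve data and metadata.
    def put(self, object_id, data, metadata=None, tier=0):
        self._check_tier(tier)
        self._objects[object_id] = StoredObject(data, dict(metadata or {}), tier)

    def get(self, object_id):
        return self._objects[object_id].data

    def get_metadata(self, object_id):
        return dict(self._objects[object_id].metadata)

    # 'Private' operation: system integrations decide where an object lives.
    def move(self, object_id, target_tier):
        self._check_tier(target_tier)
        self._objects[object_id].tier = target_tier

    def tier_of(self, object_id):
        return self.tier_names[self._objects[object_id].tier]

    def _check_tier(self, tier):
        if not 0 <= tier < len(self.tier_names):
            raise ValueError(f"unknown tier index: {tier}")


store = ToyHsmStore(["ssd", "disk", "tape"])
store.put("run-001", b"simulation output", {"owner": "alice"})
store.move("run-001", 2)  # placement changes; the consumer's view of the data does not
print(store.tier_of("run-001"), store.get("run-001"))
```

In this sketch the consumer never needs to know which tier holds the object: get returns the same bytes before and after move, which is the property that lets placement be optimised behind a single entry point.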
The Hestia software, developed by the Irish Centre for High-End Computing (ICHEC), provides an implementation of the HSM API, including a Command Line Interface (CLI), language bindings and web endpoints. The software implements HSM by acting as a middleware layer between consumers and different storage media via object-store semantics.
An important aspect of HSM is the ability to automatically schedule data movement based on the current system state and consumer-provided hints. Hestia enables this by providing a streaming ‘Event Feed’ of object data and metadata updates. This feed is used by other system software, namely a ‘Policy Engine’, to trigger the movement of data between storage media via the ‘private’ HSM API.
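The interaction between an event feed and a policy engine can be sketched as follows. Everything here is an assumption made for illustration: the event dictionary layout, the "hint" metadata key, the tier names and the placement rule are hypothetical and do not reflect Hestia's actual event format or the IO-SEA Policy Engine. A plain dictionary stands in for the calls a real policy engine would make to the ‘private’ HSM API.

```python
from collections import deque

TIERS = ["ssd", "disk", "tape"]  # assumed tier names, fastest first


class ToyEventFeed:
    """Hypothetical in-memory event feed: producers publish, consumers drain."""

    def __init__(self):
        self._events = deque()

    def publish(self, event):
        self._events.append(event)

    def drain(self):
        # Yield pending events in arrival order, removing each as it is consumed.
        while self._events:
            yield self._events.popleft()


def policy_engine(feed, placement):
    """Apply a simple placement rule to each event and return the moves made.

    Illustrative rule: hint == "archive" sends an object to the slowest tier,
    hint == "hot" sends it to the fastest tier, anything else is left alone.
    """
    moves = []
    for event in feed.drain():
        hint = event["metadata"].get("hint")
        if hint == "archive":
            target = TIERS[-1]
        elif hint == "hot":
            target = TIERS[0]
        else:
            continue
        object_id = event["object_id"]
        if placement.get(object_id) != target:
            placement[object_id] = target  # stand-in for a 'private' API move
            moves.append((object_id, target))
    return moves


feed = ToyEventFeed()
placement = {"run-001": "ssd", "run-002": "disk"}
feed.publish({"type": "update", "object_id": "run-001", "metadata": {"hint": "archive"}})
feed.publish({"type": "update", "object_id": "run-002", "metadata": {}})
moves = policy_engine(feed, placement)
print(moves)
```

Keeping the rule inside the policy engine, rather than in the store, mirrors the separation described above: the store only reports what changed, and a separate component decides whether that change should cause data to move.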
Hestia is currently in active development and is being integrated with other elements of the IO-SEA system. It is available as open-source software on the project GitLab (https://git.ichec.ie/io-sea-internal/hestia), with versioned RPMs for RHEL 8 available at https://git.ichec.ie/io-sea-internal/hestia/-/releases.