HPCMALL 2023

2nd International Workshop on Malleability Techniques Applications in High-Performance Computing

DATE: Thursday, May 25th. 2:00 PM – 6:00 PM.

https://www.isc-hpc.com/agenda-2023.html

PROGRAM

Session 1.

2:00-2:10 pm. Workshop Presentation. Slides

2:10-2:30 pm. Hugo Taboada, Romain Pereira, Julien Jaeger and Jean-Baptiste Besnard. “Achieving Transparent Malleability Thanks to MPI Process Virtualization.” Slides

2:30-2:50 pm. Dominik Huber, Martin Schreiber and Martin Schulz. “A Case Study on PMIx-usage for Dynamic Resource Management.” Slides

2:50-3:10 pm. Isaias Alberto Compres Urena, Eishi Arima and Martin Schulz. “Probabilistic Job History Conversion and Performance Model Generation for Malleable Scheduling Simulations.” Slides

3:10-3:30 pm. Jean-Baptiste Besnard, Ahmad Tarraf, Clément Barthelemy, Emmanuel Jeannot, Sameer Shende, Felix Wolf and Alberto Cascajo. “Smart Schedulers: Molding Jobs into the Right Shape via Monitoring and Modeling.” Slides

3:30-3:50 pm. Javier Garcia Blas, Genaro Juan Sanchez Gallegos, Cosmin Octavian Petre and Jesus Carretero. “Malleable and adaptive ad-hoc file system for data intensive workloads in HPC applications.” Slides

Coffee break. 4:00-4:30 pm

Session 2.

4:30-4:50 pm. Leandro Ariel Libutti, Francisco D. Igual, Luis Piñuel, Laura De Giusti and Marcelo Naiouf. “Scheduling Elastic Tensorflow Containers on Multi-core Servers.” Slides

4:50-5:10 pm. Alberto Cascajo, David E. Singh and Jesus Carretero. “Malleable techniques and resource scheduling to improve energy efficiency in parallel applications.” Slides

5:10-5:30 pm. Marc-André Vef, Alberto Miranda, Ramon Nou and André Brinkmann. “From Static to Malleable: Improving Flexibility and Compatibility in Burst Buffer File Systems.” Slides

5:30-6:00 pm. Open discussion and workshop closing.

Workshop Chairs

Prof. Jesus Carretero, University Carlos III of Madrid, Spain.
Prof. Martin Schulz, Technical University of Munich, Germany.
Prof. Estela Suarez, Juelich Supercomputing Centre, Forschungszentrum Juelich GmbH and University of Bonn, Germany.

Program committee

  • Fabio Affinito. Cineca. Italy
  • Alexander Antonov. Moscow State University. Russia
  • Jean-Baptiste Besnard. ParaTools SAS. France
  • Andre Brinkmann. Johannes Gutenberg-Universität Mainz. Germany
  • Iacopo Colonnelli. University of Torino. Italy.
  • Norbert Eicker. JSC and Univ. Wuppertal. Germany.
  • Hamid Mohammadi Fard. Technical University of Darmstadt. Germany
  • Javier Garcia Blas. Carlos III University. Spain
  • Michael Gerndt. Technical University of Munich. Germany.
  • Balazs Gerofi. RIKEN. Japan.
  • Emmanuel Jeannot. INRIA. France.
  • Michael Klemm. AMD. Germany.
  • Masaki Kondo. Keio University. Japan.
  • Erwin Laure. MPCDF. Germany.
  • Stefano Markidis. KTH. Sweden.
  • Ramon Nou. Universitat Politècnica de Catalunya. Spain
  • Ariel Oleksiak. Poznan Supercomputing and Networking Center. Poland.
  • David E. Singh. Universidad Carlos III de Madrid. Spain
  • Martin Schreiber. University of Grenoble-Alpes. France
  • Sameer Shende. ParaTools SAS. USA.
  • Miwako Tsuji. RIKEN AICS. Japan.
  • Marc-André Vef. Johannes Gutenberg-Universität Mainz. Germany.
  • Carlos A. Varela. Rensselaer Polytechnic Institute. USA.
  • Vladimir Voevodin. Moscow State University. Russia.
  • Mohamed Wahib. AIST/TokyoTech OIL. Japan.
  • Josef Weidendorfer. Technical University of Munich. Germany
  • Roman Wyrzykowski. Czestochowa University of Technology. Poland.
  • Vladislav Kashanskii. Eteronix. Austria.

Description of the workshop

Motivation and Objectives:

The current static usage model of HPC systems is becoming increasingly inefficient. This is driven by the continuously growing complexity and heterogeneity of system architectures, in combination with the increased usage of coupled applications, the need for strong scaling with extreme scale parallelism, and the increasing reliance on complex and dynamic workflows.

As a consequence, we see a rise in research on malleable systems, middleware and applications, which can adjust their resource usage dynamically in order to extract maximum efficiency. By providing intelligent global coordination of resource usage, through runtime scheduling of computation, network usage and I/O across all components of the system architecture, malleable HPC systems can maximize the exploitation of their resources while at the same time minimizing the makespan of applications in many, if not most, cases.

Of particular concern is the emerging class of data-intensive applications and their interaction with classic simulation workloads, driven by the growing need to process extremely large data sets. However, uncoordinated file access in combination with limited bandwidth makes the I/O system a serious bottleneck. Emerging multi-tier storage hierarchies have the potential to remove this barrier, but maximizing performance still requires careful control to avoid congestion. Malleability allows systems to dynamically adjust the computation and storage needs of applications, on the one hand, and of the global system, on the other.

Such malleable systems, however, face a series of fundamental research challenges, including: who initiates changes in resource availability or usage? How is it communicated? How to compute the optimal usage? How can applications cope with dynamically changing resources? What should malleable programming models and abstractions look like? How to design resource management frameworks for malleable systems? Which resources benefit from malleability and which (if any) should still be managed statically?

In order to address these challenges, the HPCMALL workshop will bring together researchers from diverse areas of HPC that are impacted by or actively pursuing malleability concepts, from application developers to system architects, from programming model to system software researchers. The workshop will provide a lively discussion forum for researchers working in HPC and pursuing the concepts of and around malleability.

Topics:

We are looking for original high-quality research and position papers on applications, services, and system software for malleable high-performance computing systems. Topics of interest include, but are not limited to:

  • System and system architecture considerations in designing malleable architectures.
  • Emerging software designs to achieve malleability in high-performance computing.
  • High-level parallel programming models and programmability techniques to improve application malleability.
  • Run-time techniques to provide malleable execution models for computation, communication and I/O.
  • Resource management frameworks and interfaces supporting malleable scheduling, resource allocations and application execution.
  • Computing and I/O scheduling algorithms providing and/or exploiting static or dynamic malleability.
  • Use of AI and ML techniques to steer malleability in systems and applications.
  • Ad-hoc storage systems and I/O scheduling techniques helping I/O malleability.
  • Support for malleable execution of applications in performance, debugging and correctness tools.
  • Energy efficiency and malleability (applications, systems over-provisioned with respect to power/energy, storage systems, etc.).
  • Experiences and use cases applying malleability to HPC applications.

Format

  • Technical paper sessions. We will give priority to presentation of high-quality technical papers. Accepted papers will be selected after a careful review process. Each paper submitted to HPCMALL will be reviewed by at least four expert reviewers.
  • Keynote speaker. The HPCMALL program will feature at least one invited presentation by a well-known speaker in the HPC community related to malleability in HPC systems and applications.
  • Panel discussion. To close the workshop, we will organize a panel of experts who will debate current and future challenges in achieving malleability in HPC systems and applications.

Publication

Papers will be published together with the ISC proceedings.

  • The workshop proceedings (minimum 6 pages, maximum 12 pages per workshop paper) will be published after the conference. For the camera-ready version, authors are automatically granted two extra pages to incorporate reviewer comments (14 pages maximum for the final version of the paper).

A journal special issue will be published in a reference journal. The special issue will have an open CFP, but extended versions of the best papers accepted at HPCMALL 2022 will be invited for publication. All papers will undergo the usual peer-review process of the selected journal.

Submission Guidelines:

Paper submissions are required to be formatted using LNCS style (see Springer’s website).

Tentative Dates

  • Abstract submission: March 7th, 2023
  • Preliminary version of papers: March 15th, 2023
  • Notification: April 6th, 2023
  • Final version of the paper: April 21st, 2023