Causally Consistent Reversible Debugger for MPI Applications
Date
2022
Publisher
Tartu Ülikool
Abstract
Writing programs for parallel computation is significantly more difficult
than writing programs for sequential execution. Debuggers are useful in multiple
stages of software development, including implementation, analysis and maintenance.
Some sophisticated debuggers offer – in complement to generic debugging commands
– reversible debugging commands, providing the ability to progress backwards in the
program execution in some form. MPI (Message Passing Interface) is a widely used
standard for developing parallel programs. This thesis presents the implementation of a
causally consistent debugger for MPI applications, offering reversible debugging commands
that maintain causal consistency. The debugger utilises a
distributed independent checkpointing mechanism to record the execution of the MPI
application and a coordinated restore mechanism to support reversible debugging of the
MPI application. To the best of the author’s knowledge, this is the first debugger for MPI
implementing this kind of checkpointing mechanism to enable reversible debugging. The
produced tool demonstrates the viability of this checkpoint-restore mechanism to enable
reversible debugging for parallel computation.
Keywords
Reverse debugging, MPI, distributed debugging, checkpointing, parallel programming, reversibility