April 24-27, 2017
Theme: Perspectives on HPC's Current Cambrian Explosion
The 2017 Salishan Conference on High-Speed Computing will explore the explosion of diverse computing models that has arisen in recent years. The traditional evolution of HPC will be examined in sessions on many-core computing and heterogeneous node computing. We will also examine computing models characterized as Beyond Moore Computing, including two very different models of computing, in sessions on neuro-inspired computing and quantum computing. We are presently in an era of tremendous HPC architectural diversity, analogous to the Cambrian Period, ~490-540 million years ago, when there was a major explosion in the diversity of life on Earth. This was when simple cell-based life in the primordial soup gave way to an explosion of life forms.
Shifting to near-term history, the Salishan Conference was established in 1981 to serve as a forum for our community to share lessons learned and technology challenges associated with the advent of Cray vector supercomputers. This was followed by a period of relative stability in supercomputing, as Cray’s custom vector supercomputers became the de facto standard. In the 1990s, the Attack of the Killer Micros hit our community, and there was a period of disruption in supercomputer architectures as performance increases in Cray supercomputers stalled. Many approaches to parallel architectures were explored with DOE and DOD support, and from this architectural diversity, the ASCI program helped establish massively parallel processor systems with explicit message passing as the dominant supercomputing model for the next 15+ years. This was a golden age for performance improvement, when Moore’s Law and Dennard Scaling allowed for easy application performance increases from the doubling of transistor counts and attendant increases in processor frequencies.
The end of Dennard Scaling has led us to multi-core processors, many-core processors, and GPUs. But increases in peak performance from these processor architectures are often decoupled from performance increases in our real applications due to data movement bottlenecks. New types of data analytic applications have also arisen, with very different models of computing from our traditional scientific and engineering modeling and simulation drivers. Both of these factors have given rise to the increase in HPC architectural diversity that is the focus of this year’s Salishan Conference, where we want to explore the key technical issues and tradeoffs that arise among the Application, Algorithm, and Architecture perspectives for this diverse collection of computing models.
Session 1: Many-Core Computing - Application Challenges and Trinity Specifics
The realities of the utilization wall combined with the emergence of stringent application constraints, particularly those linked to energy consumption, have necessitated new system architectural strategies (e.g., many-core and heterogeneous HPC systems) and real-time operational adaptability approaches. Such complex systems require new and powerful design and programming methods to ensure optimal and reliable operation. The Trinity and Cori computing systems are two examples. Both utilize the Intel Knights Landing processor, a self-hosted, many-core processor with on-package high-bandwidth memory that delivers more than 3 teraFLOPS of double-precision peak performance per single-socket node, higher intra-node parallelism, and longer hardware vector lengths. While these enhanced features provide opportunities for significant performance improvements, fully capitalizing on the power of many-core architectures comes with increased burden and complexity. This session will focus discussion on the key challenges facing application developers on many-core architectures. It will examine the implications of increasing degrees of parallelism at the compute node level, as well as increasing memory-level complexity (e.g., on-package multi-channel dynamic random access memory (MCDRAM) and off-package DIMMs). The driving question is: What are the key technical challenges facing application developers, and how can these challenges be addressed?
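As a concrete illustration of the memory-level complexity discussed above, the following is a minimal sketch, assuming the memkind library's hbwmalloc interface and an OpenMP-capable compiler, of how an application might place bandwidth-critical arrays in on-package MCDRAM while exposing both thread- and vector-level parallelism. The array sizes, kernel, and fallback policy are arbitrary and purely illustrative, not taken from any Trinity or Cori application.

```cpp
// Minimal sketch: place streamed arrays in on-package MCDRAM via the memkind
// library's hbwmalloc interface, falling back to ordinary DDR when
// high-bandwidth memory is unavailable. Illustrative only.
#include <hbwmalloc.h>   // memkind high-bandwidth-memory allocator
#include <cstdio>
#include <cstdlib>

int main() {
    const size_t n = 1 << 26;                       // ~64M doubles per array
    const bool have_hbw = (hbw_check_available() == 0);

    // Hot, bandwidth-bound arrays go to MCDRAM if present; otherwise DDR.
    double *a = have_hbw ? static_cast<double*>(hbw_malloc(n * sizeof(double)))
                         : static_cast<double*>(std::malloc(n * sizeof(double)));
    double *b = have_hbw ? static_cast<double*>(hbw_malloc(n * sizeof(double)))
                         : static_cast<double*>(std::malloc(n * sizeof(double)));

    // Expose both thread-level and vector-level parallelism, the two
    // dimensions a many-core node such as Knights Landing rewards.
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; ++i) {
        b[i] = static_cast<double>(i);
        a[i] = 2.5 * b[i] + 1.0;
    }

    std::printf("used %s memory, a[1] = %f\n",
                have_hbw ? "high-bandwidth" : "DDR", a[1]);
    if (have_hbw) { hbw_free(a); hbw_free(b); }
    else          { std::free(a); std::free(b); }
    return 0;
}
```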
Session 2: Heterogeneous Computing - Application Challenges and Sierra/Summit Specifics
The move to heterogeneous compute architectures is driven by the need to model complex systems at finer levels of resolution and accuracy while minimizing the total cost of ownership of the massive machines required to run these important problems. The challenge of fully realizing the processing power improvements promised by these hybrid architectures lies in our ability to successfully adapt our codes, algorithms, and tools to fully exploit the advantages of the GPU without increasing data movement or transferring so much complexity to the programmer that productivity losses outstrip the gains in computing power. Sierra and Summit, planned to be running at LLNL and ORNL in late 2018, are DOE’s first foray into computing platforms of over 100 petaflops using a heterogeneous architecture. This session will describe specific efforts to adapt codes and algorithms and to develop tools that increase the efficiency of the codes running on Sierra and Summit and of the programmers who write them. Further, this session will also describe future specialized processors being developed to continue our quest for increased simulation capability at sustainable costs. The driving questions behind this session are: 1) What are the key problems facing developers, and what is being done to address them? and 2) What future specialized processors can help us continue to meet the demands of our most pressing problems?
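To make the data-movement concern concrete, here is a minimal sketch, assuming a compiler with OpenMP 4.5 target offload support, of keeping arrays resident on an accelerator across repeated kernel launches rather than copying them on every iteration. The kernel, sizes, and step count are arbitrary and illustrative, not drawn from any Sierra or Summit code.

```cpp
// Minimal sketch of offloading a kernel while controlling host-device data
// movement with OpenMP 4.5 target directives; the enclosing data region keeps
// the arrays on the device across all kernel launches. Illustrative only.
#include <cstdio>
#include <vector>

int main() {
    const int n = 1 << 20, steps = 100;
    std::vector<double> u(n, 1.0), f(n, 0.01);
    double *up = u.data(), *fp = f.data();

    // Map the data once and reuse it for every kernel launch inside the region.
    #pragma omp target data map(tofrom: up[0:n]) map(to: fp[0:n])
    {
        for (int s = 0; s < steps; ++s) {
            #pragma omp target teams distribute parallel for
            for (int i = 0; i < n; ++i)
                up[i] += fp[i];          // cheap kernel; data stays on device
        }
    }

    std::printf("u[0] = %f\n", up[0]);
    return 0;
}
```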
Session 3: Neuro-Inspired Computing
This session will explore the theory and applications of neuro-inspired computing to solve traditionally complex problems and extend past the current limits of von Neumann architectures. Although computational models of neural networks, or artificial neural networks, have existed for over 70 years, the last decade has seen rapid growth in applications, from credit card fraud detection to image and video analysis. The use of artificial neural networks and, more generally, of learning algorithms is a game-changer in modern algorithms, yet how to harness this power is not always evident. New neuro-inspired hardware, like IBM’s TrueNorth chip, now gives us a platform to expand even further beyond previous constraints of energy usage and parallelism on a large scale. With the expansion of interest in neuro-inspired computing beyond academia, and the development of larger-scale platforms, DOE laboratories are evaluating applications of this hardware to meet mission needs. We face many challenges in this goal, from evaluating the programmability and integration potential of these architectures to the paradigm shift of recognizing the usefulness of inexact solutions for certain applications. This session will address the following questions: What influences have advancements in machine learning had on neuro-inspired architectures? How do current neuro-inspired architectures meet or deviate from theoretical models? How can we use neuro-inspired computing to solve our challenges? How can this technology affect the way we design computational algorithms?
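For readers unfamiliar with the spiking, event-driven style of computation that hardware such as TrueNorth embodies, the following is a minimal sketch of a single leaky integrate-and-fire neuron. The weights, leak factor, threshold, and input spike trains are arbitrary and purely illustrative; real neuromorphic platforms differ in many details.

```cpp
// Minimal sketch of a leaky integrate-and-fire neuron: it accumulates weighted
// input spikes, its potential decays each tick, and it emits a spike when the
// potential crosses a threshold. Parameters are arbitrary and illustrative.
#include <cstdio>
#include <vector>

int main() {
    const double leak = 0.9, threshold = 1.0;
    const std::vector<double> weights = {0.3, 0.5, 0.2};
    // One row of presynaptic spikes (0/1) per time step.
    const std::vector<std::vector<int>> spikes = {
        {1, 0, 1}, {0, 1, 0}, {1, 1, 1}, {0, 0, 0}, {1, 1, 0}};

    double v = 0.0;                           // membrane potential
    for (size_t t = 0; t < spikes.size(); ++t) {
        v *= leak;                            // passive decay each tick
        for (size_t j = 0; j < weights.size(); ++j)
            v += weights[j] * spikes[t][j];   // integrate weighted inputs
        if (v >= threshold) {                 // fire and reset
            std::printf("spike at t=%zu\n", t);
            v = 0.0;
        }
    }
    return 0;
}
```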
Session 4: Quantum Computing
Quantum computing has recently been receiving a lot of attention due to the efforts by IBM, Google, Microsoft, Intel, D-Wave, Rigetti, and various academic institutions and national laboratories to realize quantum computers and make them practical to use. Some of these efforts concern "real" quantum computers, while others merely exploit quantum effects to solve classical problems. In all cases, new algorithms and programming techniques must be developed, because existing ones do not port directly to these fundamentally different machines. Naturally, questions arise about the classes of algorithms, and hence the types of problems, that offer improvements over classical computers. This session will touch on the different physical implementations of quantum computers as well as the new algorithm developments needed to support them. Last, but not least, early investigations into using quantum hardware will be shown. Questions: What are the trade-offs when moving beyond traditional HPC systems to this new model of computing? Will the gate model or the annealing model dominate the future? When will quantum computing be practical for HPC? How do we move today's programmers to the new paradigm?
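As a small illustration of how the gate model differs from classical programming, the following sketch classically simulates a single qubit: the state is a complex 2-vector, a gate is a unitary matrix, and here a Hadamard gate applied to |0> yields equal measurement probabilities for 0 and 1. This is purely illustrative and says nothing about any particular hardware implementation.

```cpp
// Minimal sketch of the gate-model view: a qubit state is a complex 2-vector
// and a gate is a unitary matrix applied to it. Classical simulation only.
#include <cmath>
#include <complex>
#include <cstdio>

int main() {
    using cd = std::complex<double>;
    const double s = 1.0 / std::sqrt(2.0);

    cd state[2] = {1.0, 0.0};              // the |0> state
    const cd H[2][2] = {{s, s}, {s, -s}};  // Hadamard gate

    // Apply the gate: out = H * state.
    cd out[2] = {H[0][0] * state[0] + H[0][1] * state[1],
                 H[1][0] * state[0] + H[1][1] * state[1]};

    // Measurement probabilities are the squared amplitudes.
    std::printf("P(0) = %.3f, P(1) = %.3f\n",
                std::norm(out[0]), std::norm(out[1]));
    return 0;
}
```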
Session 5: Crosscutting Fragments
New adaptations and increased complexity are two important characteristics associated with the Cambrian explosion, and both are key considerations related to the increase in HPC architectural diversity as well. This session will touch on these cross-cutting areas through a loosely coupled set of topics. New adaptations in computing technology may require alternative models of computation, such as approximate and probabilistic computing approaches, in order to fully exploit new hardware capabilities. Memory technology will also need to adapt to keep pace with advancements in processing technology. As large-scale systems evolve to be composed of more diverse hardware organisms, and as applications begin to explore and embrace these new computational models, complexity increases significantly, and achieving the desired levels of performance will likely require advancements in methods and tools for performance analysis. This session will address the following questions: How will applications and application development be impacted by these new models of computing and by extended memory addressing capability? What algorithms are most appropriate for probabilistic and approximate computing models? Can integrated performance tools reduce the complexity of tuning and adapting codes to new node and system architectures?
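As one illustration of the exactness-for-cost trade that probabilistic computing models embrace, the following is a minimal sketch of a Monte Carlo estimator whose error shrinks only as the square root of the sample count, so accuracy is something the caller explicitly pays for. It is illustrative only and not tied to any particular hardware or application discussed in this session.

```cpp
// Minimal sketch of accepting an inexact answer in exchange for tunable cost:
// estimate pi by sampling the unit square and counting points inside the
// quarter circle. More samples buy more accuracy. Illustrative only.
#include <cmath>
#include <cstdio>
#include <random>

double approx_pi(std::size_t samples, unsigned seed = 42) {
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::size_t inside = 0;
    for (std::size_t i = 0; i < samples; ++i) {
        const double x = u(rng), y = u(rng);
        if (x * x + y * y <= 1.0) ++inside;   // point lands in quarter circle
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}

int main() {
    const double pi = std::acos(-1.0);
    // The caller chooses the accuracy/cost trade-off via the sample count.
    for (std::size_t n : {1000u, 100000u, 10000000u}) {
        const double est = approx_pi(n);
        std::printf("samples=%9zu  estimate=%.5f  |error|=%.5f\n",
                    n, est, std::fabs(est - pi));
    }
    return 0;
}
```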