ASCR Co-Design Centers
Co-design refers to a computer system design process in which scientific problem requirements influence architecture design and technology, and constraints inform the formulation and design of algorithms and software. To ensure that future architectures are well-suited for DOE target applications and that major DOE scientific problems can take advantage of the emerging computer architectures, major ongoing research and development centers of computational science need to be formally engaged in the hardware, software, numerical methods, algorithms, and applications co-design process. Co-design methodology requires the combined expertise of vendors, hardware architects, system software developers, domain scientists, computer scientists, and applied mathematicians working together to make informed decisions about features and tradeoffs in the design of the hardware, software, and underlying algorithms.
The Three Co-Design Centers:
Timothy Germann (LANL), Center Director; email@example.com
Institutions: LANL, LLNL, ORNL, SNL, Stanford, and CalTech.
The objective of the Exascale Co-design Center for Materials in Extreme Environments (ExMatEx) is to establish the interrelationship among algorithms, system software, and hardware required to develop a multiphysics exascale simulation framework for modeling materials subjected to extreme mechanical and radiation environments. Such a simulation capability will play a key role in solving many of today’s most pressing problems, including producing clean energy, extending nuclear reactor lifetimes, and certifying the aging nuclear stockpile.
This will be accomplished via a focused effort in four primary areas:
- Scale-bridging algorithms: ExMatEx will employ a science strategy that is an uncertainty quantification (UQ)-driven adaptive physics refinement, in which coarse-scale simulations dynamically spawn tightly coupled and self-consistent fine-scale simulations as needed.
- Programming models: A multi-program, multi-data (MPMD) task-based scale-bridging approach leverages the extensive concurrency and heterogeneity expected at exascale while enabling novel data models, power management, and fault tolerance strategies within applications. The programming models (e.g., domain specific languages (DSLs) that raise the application level of abstraction) and the approaches developed to achieve this will be broadly applicable to a variety of multiscale, multiphysics applications.
- Proxy applications: Proxy apps and kernels are the main mechanism for exploring the algorithm design space and communicating the application workload to the hardware architects and system software developers. Proxy apps for single-scale, single-program, multi-data (SPMD) applications (e.g., molecular dynamics) will be used to assess node-level data structures and memory and power management strategies, while system-level data movement, fault management, and load balancing techniques will be evaluated via the asynchronous task-based MPMD scale-bridging proxy apps.
- Co-design analysis and optimization: ExMatEx will establish and execute a continuous modeling, evaluation, optimization, and synthesis loop, including optimization of algorithms and architectures for performance, memory and data movement, power, and resiliency. Proxy applications and performance models/simulators will be used to introduce a realistic domain workload into the exascale hardware and software stack development process at an early stage, and together with scalable analysis tools will inform a co-optimization loop to address the challenges of power, resiliency, concurrency, and heterogeneity that will characterize exascale platforms.
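The adaptive physics refinement described in the first bullet can be illustrated with a minimal sketch. It is not ExMatEx code: the names `fine_scale_model`, `CoarseSurrogate`, and `evaluate` are hypothetical, the fine-scale physics is a placeholder function, and the "uncertainty" is a crude distance-to-nearest-sample proxy rather than a real UQ estimate. The sketch only shows the control flow in which a coarse-scale loop calls an expensive fine-scale model when its own estimate is not trusted.

```python
import math
from typing import List, Tuple


def fine_scale_model(strain: float) -> float:
    """Hypothetical expensive fine-scale model (a placeholder for, e.g., molecular dynamics)."""
    return math.tanh(3.0 * strain)


class CoarseSurrogate:
    """Hypothetical coarse-scale model: nearest-neighbour lookup over earlier fine-scale results."""

    def __init__(self) -> None:
        self.samples: List[Tuple[float, float]] = []

    def predict(self, strain: float) -> Tuple[float, float]:
        """Return (estimate, uncertainty); uncertainty is the distance to the nearest sample."""
        if not self.samples:
            return 0.0, math.inf
        x, y = min(self.samples, key=lambda s: abs(s[0] - strain))
        return y, abs(x - strain)

    def add(self, strain: float, stress: float) -> None:
        self.samples.append((strain, stress))


def evaluate(strains: List[float], surrogate: CoarseSurrogate,
             tolerance: float) -> Tuple[List[float], int]:
    """Evaluate every strain, calling the fine-scale model only where the surrogate is uncertain."""
    results: List[float] = []
    fine_calls = 0
    for strain in strains:
        estimate, uncertainty = surrogate.predict(strain)
        if uncertainty > tolerance:
            # Coarse estimate is not trusted here: run a fine-scale simulation
            # and feed its result back into the coarse model.
            estimate = fine_scale_model(strain)
            surrogate.add(strain, estimate)
            fine_calls += 1
        results.append(estimate)
    return results, fine_calls


if __name__ == "__main__":
    values, calls = evaluate([0.0, 0.01, 0.02, 0.5, 0.51, 0.0], CoarseSurrogate(), tolerance=0.05)
    print(f"{calls} fine-scale calls for {len(values)} evaluations")
```

In a task-based MPMD setting, each fine-scale call would be launched as an independent asynchronous task rather than run inline as it is here.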
The team has long and extensive experience working with the most advanced HPC technology. Livermore has had an extensive partnership with IBM, including ASCI Blue, White and Purple and the Blue Gene series of machines, the latter of which was awarded the National Medal of Technology and Innovation by President Obama last October. Los Alamos and IBM co-designed and built Roadrunner as an advanced architecture that addressed a number of the critical issues on the road to exascale; Roadrunner is the machine that broke the petaflops barrier in 2008, ushering in the era of heterogeneous supercomputers. Oak Ridge has partnered with Cray, including co-design of a GPU-based system scheduled for delivery in 2012. The experience of the team suggests that co-design requires a rich interaction among all elements of the large-scale simulation, including materials models, methods and codes, programming models and system software, elements of the fundamental hardware and supporting I/O and network infrastructure, and the agility to respond to changing elements of each. The team’s proven ability to work with cutting-edge computing technologies is reflected in an unusually large number of Gordon Bell Prize winners and finalists among its senior personnel, including the Center Director himself.
Robert Rosner (firstname.lastname@example.org, 630-252-2480), ANL/University of Chicago
Ewing Lusk (email@example.com, 630-252-7852),
Andrew Siegel (firstname.lastname@example.org, 630-252-1758),
Robert Hill (email@example.com, 630-252-4865), ANL;
Kord Smith (firstname.lastname@example.org, 617-965-7455), Studsvik Scandpower, Inc.
M. Anitescu, P. Fischer, P. Hovland, T. Peterka, R. Ross, M. Stan, T. Tautges, Argonne National Laboratory; R. Ferencz, M. Schulz, Lawrence Livermore National Laboratory; J. Ahrens, Los Alamos National Laboratory; M. Adams, J. Morel, Texas A&M University; J. Vetter, Oak Ridge National Laboratory; A. Hoisie, D. Kerbyson, Pacific Northwest National Laboratory; R. Wisniewski, IBM; T. Warburton, Rice University; P. Murray, AREVA, Inc.; T. Hilsabeck, General Atomics, Inc.; J. Gileland, TerraPower, LLC.
ANL/UChicago, Studsvik, Inc., LLNL, LANL, ORNL, PNNL, IBM, Texas A&M, Rice, AREVA, GA, TerraPower, LLC.
The Center for Exascale Simulation of Advanced Reactors (CESAR) aims to enable an exascale-capable integrated simulation tool for simulating a new generation of advanced nuclear reactor designs. Existing reactor analysis codes are highly tuned and calibrated for commercial light-water reactors, but they lack the physics fidelity to carry over seamlessly to new classes of reactors with significantly different design characteristics, for example Small Modular Reactors (SMRs). Without vastly improved modeling capabilities, assessing the economic and safety characteristics of these and other novel systems will require tremendous investments of time and money in full-scale testing facilities.
With the advent of exascale computing, the problem domain that will be accessible to direct numerical simulations in the advanced nuclear reactor domain will be vastly expanded, allowing detailed physics-based modeling of significant reactor subsystems. This advance will then greatly improve the predictive capabilities of full-system modeling tools, especially for a broad range of transient phenomena that are currently very poorly modeled. The specific goal of CESAR is thus to use the co-design philosophy and process to enable a predictive modeling and simulation tool for advanced nuclear reactor design at the exascale -- a tool that has the potential to fundamentally alter the design, optimization, and licensing processes for future generations of advanced nuclear reactors, based on a vastly expanded reach of high-fidelity physics-based modeling over what is presently possible. This predictive high-fidelity capability is a key element in any strategy for making the U.S. nuclear industry once again competitive in the world.
Three general aspects of nuclear reactor design are likely to benefit tremendously from extreme-scale computing: (1) modeling of full-vessel, coupled neutronics/thermal-hydraulics for systems in natural convection conditions, (2) accurate fuel depletion modeling using the coupled, highly detailed neutronics and thermal-hydraulics modeling needed in breed/burn concepts, and (3) detailed structural mechanics coupled to both neutronics and thermal-hydraulics to assess core reactivity feedback and fuel assembly structural integrity. This co-design effort focuses on integrating three distinct software components -- weakly compressible hydrodynamics, particle (e.g., neutron) transport, and structural mechanics -- into an integrated code environment, TRIDENT, capable of attacking difficult simulation problems such as transient analysis of reactor cores for a variety of postulated system failures.
TRIDENT is intended to be an open-source research code; however, advanced components of TRIDENT are expected to be adopted and incorporated by industrial partners into their own user tool suites. Partners in the nuclear reactor industry include Studsvik, home of the team’s Chief Scientist and the world’s leading vendor of reactor analysis software, whose products are used by two-thirds of the world’s utilities, as well as AREVA, General Atomics, and TerraPower, reactor vendors focused on advanced reactor designs.
This co-design effort will advance the role of simulation and modeling across a broad span of U.S. industry, not just the nuclear power industry. The capability to simulate the dynamical interaction of fluids, transport, and mechanical structures has long been sought in industries as disparate as the airframe, turbine, and power generation sectors. A successful co-design effort to implement TRIDENT on the coming generations of multi-petaflop and exaflop computers will therefore have a profound impact on design innovation and optimization in important sectors of U.S. high-technology industry.
Pat Hanrahan (Stanford), Hanrahan@cs.stanford.edu; 650-723-8530
Karsten Schwan (GA Tech)
Manish Parashar (Rutgers)
SNL, LLNL, LBNL, ORNL, LANL, NREL, UT Austin, Utah, Stanford, GATech, Rutgers.
Combustion is an excellent candidate for exascale computing because efforts to reduce our petroleum usage by 25 percent by 2020 have opened consideration of a wide and evolving set of new fuels and placed additional requirements for fuel flexibility on new combustion systems. At present, we do not have an adequate science base for designing new combustion systems. We are unable to accurately model the behavior of turbulent flames at the high-pressure conditions characteristic of modern reciprocating and turbine engines. Simulations lack the fidelity to distinguish the flame behavior and emission characteristics of traditional, bio-derived, and other evolving fuels, and so cannot yet guide industrial design and national policy decisions.
Current combustion simulation codes, running on petascale architectures, are now able to model realistic laboratory-scale flames with simple hydrogen and hydrocarbon fuels using detailed models for chemical kinetics and transport. In spite of these successes, the requirements of computing at the exascale argue that a new code designed specifically for the exascale is needed. Exascale computing will enable current combustion research to make the critical transition from simple fuels at laboratory conditions to complex fuels in the high-pressure environments associated with realistic engines and gas turbines for power generation. Combustion researchers will then be able to differentiate the properties of different fuels and capture their emissions characteristics at thermochemical conditions found in engines. This type of capability addresses a critical need to advance the science base for the development of non-petroleum-based fuels.
Combustion research is of critical importance to DOE; however, this proposal's research work also fundamentally supports other HPC application areas that combine CFD with other scientific disciplines (e.g., chemistry and biology), leverages linear solvers, and supports multi-scale physics. Thus, what we learn, the tools and analysis techniques we develop, and the fundamental results from this proposal will have broad applicability to the more general HPC exascale hardware/software co-design.
Today's DOE HPC software is not ready for exascale machine architectures and, equally important, such architectures must be designed (tailored) to support future applications within DOE's critical application areas. Realizing the potential of exascale computing requires a significant change in hardware architecture and software design. Key issues include:
- Exponential growth in explicit parallelism (million-way to billion-way parallelism) requiring algorithms with higher concurrency.
- Reduced memory per core and an increase in the cost of data movement in terms of both performance and power relative to floating point operations, requiring algorithms with reduced memory footprints and data motion.
- A growth in machine complexity to the point that fault tolerance must be an essential component of the software stack.
- An increased disparity between I/O speed and compute speed, requiring that the bulk of the analysis of the simulation be performed in situ along with the simulation.
- Uncertainty quantification will be an integral part of the simulation design as improved computational performance moves us toward a predictive simulation capability.
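The in situ analysis mentioned in the list above can be sketched as follows. This is an illustrative example only, not ExaCT code: `simulate_step`, `RunningStats`, and `run` are hypothetical names, and the "solver" is a random placeholder field. The point is that summary statistics are updated while each time step's data is still in memory (here with Welford's one-pass algorithm), so the full field never has to be written to disk for later analysis.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class RunningStats:
    """One-pass (Welford) accumulator for count, mean, population variance, and maximum."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean
    maximum: float = float("-inf")

    def update(self, value: float) -> None:
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.maximum = max(self.maximum, value)

    @property
    def variance(self) -> float:
        return self.m2 / self.count if self.count else 0.0


def simulate_step(step: int, n_cells: int, rng: random.Random) -> List[float]:
    """Hypothetical stand-in for one solver time step; returns a temperature field."""
    return [300.0 + 10.0 * step + rng.uniform(-1.0, 1.0) for _ in range(n_cells)]


def run(n_steps: int, n_cells: int, seed: int = 0) -> RunningStats:
    """Advance the placeholder simulation and analyse each field in situ."""
    rng = random.Random(seed)
    stats = RunningStats()
    for step in range(n_steps):
        field = simulate_step(step, n_cells, rng)
        for value in field:
            stats.update(value)
        # The field is discarded here; only the compact statistics persist.
    return stats


if __name__ == "__main__":
    summary = run(n_steps=5, n_cells=1000)
    print(summary.count, round(summary.mean, 2), round(summary.maximum, 2))
```

The accumulator keeps a fixed, small amount of state regardless of how many steps or cells are processed, which is the property that makes this style of analysis attractive when I/O is the bottleneck.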
The proposed Center for Exascale Simulation of Combustion in Turbulence (ExaCT) will perform the multidisciplinary research needed to simultaneously redesign all aspects of the simulation process from algorithms to programming models to hardware architecture in order to make exascale combustion simulation a reality. This center will combine the talents of combustion scientists, mathematicians, computer scientists, and hardware architects working in integrated teams to address critical themes in the co-design process. The goal of this work is to develop a combustion modeling capability that combines simulation and analysis, develop the necessary computer science tools to facilitate the development of these applications, and quantify hardware constraints for an effective exascale system.