Ultra-Scale Computation and Scientific Discovery
Dr. Raymond L. Orbach
Director
Office of Science
U.S. Department of Energy
SuperComputing 2002 Conference (SC02)
Baltimore, Maryland
November 20, 2002
[Slide #2]
Abstract
Ultra-Scale scientific computation
adds a third pillar supporting scientific discovery
to those of experiment and theory. Simulations
generate insight into the laws of nature for
systems too complex for direct calculation or
in circumstances where descriptive laws are
absent. The required high-sustained speeds have
led to a new sociology for computation. Rather
than simply scaling up existing computer systems,
communities with common computational interests
will join together with applied mathematicians,
computer scientists, and chip and interconnect
manufacturers to tailor machines to their scientific
needs. The scale of operation will require large
fractions of massive computational systems or
platforms, changing the nature of interaction
to resemble that of user communities: group
applications, peer review, and blocks of assigned
time. Scientific interest will develop because
of the promise of discovery, while commercial
interest will develop because of the need for
“virtual prototypes.” Together,
these interests will generate substantial demand
for machine usage, creating a sustainable market
for ultra-scale computational facilities.
Introduction
It is my great pleasure to be
with you this morning to speak of Scientific
Discovery through Ultra-Scale Computation. This
opportunity is the highest priority of the Office
of Science, U.S. Department of Energy. It is
a natural extension of the long and highly successful
history of Office of Science support for the
creation of mathematical and scientific software
used worldwide to focus the power of high-performance
state-of-the-art computers on advancing the
frontiers of science. Since the early 1970s,
robust high-performance numerical libraries
have been developed with support from the Office
of Science, including EISPACK (for eigenvalue
and singular value problems) and then LINPACK
(for linear equations and linear least squares
problems), which together evolved into LAPACK
(a broad suite of numerical linear algebra algorithms).
These software libraries set the high standard
by which all mathematical software has come
to be assessed. The need to improve efficiency
of algorithms on computers with memory hierarchies
led to the development of the BLAS (basic linear
algebra subroutines), while the need to develop
scalable parallel versions of critical numerical
libraries led to ScaLAPACK (scalable LAPACK).
Parallel programming models including PVM (parallel
virtual machine), MPI (message passing interface),
and Global Arrays were also developed under
DOE auspices and have been adopted as essential
standards by the scientific community. We expect
to add to this list of accomplishments through
our SciDAC program, Scientific Discovery through
Advanced Computing, currently funded at a level
of $60M per year. We are committed to working
with you, the computational community, to advance
the scientific opportunities made possible by
your accomplishments.
The tools for scientific discovery
in this, the 21st century, have changed. Previously,
science had been limited to experiment and theory
as the two pillars for investigation of the
laws of nature. With the advent of what many
refer to as “Ultra-Scale” computation,
a third pillar has been added to the foundation
of scientific discovery. Modern computational
methods are developing at such a rapid rate
that computational simulation is possible on
a scale that is comparable in importance with
experiment and theory. The remarkable power
of these facilities is opening new vistas for
science and technology.
Tradition has it that scientific
discovery is based on experiment, buttressed
by theory. Sometimes the order is reversed:
theory leads to concepts that are tested and
sometimes confirmed by experiment. But most
often, experiment provides evidence that drives
theoretical reasoning. Thus, Dr. Samuel Johnson,
in his Preface to Shakespeare, writes: “Every
cold empirick, when his heart is expanded by
a successful experiment,
swells into a theorist.”
Many times, scientific discovery
is counter-intuitive, running against conventional
wisdom. Probably the most vivid current example
is the experiment that demonstrated that the
expansion of our Universe is accelerating, rather
than in steady state or contracting. We have
yet to understand the theoretical origins for
this surprise, other than to note that Einstein
represented a static universe by introducing
a “cosmological constant,” which
he then discarded when presented with Hubble’s
observation of the expanding universe.
During my scientific career, computers
have developed from the now “creaky”
IBM 701, upon which I did my thesis research
in the wee hours of the morning, to the so-called
massively parallel processors or MPP machines,
that fill rooms the size of football fields,
and use as much power and cooling as a small
city.
The astonishing speeds of these
machines, especially the Earth Simulator in
Yokohama, Japan, allow Ultra-Scale computation
to inform our approach to science and, I believe,
to the social sciences and the humanities. We are now
able to contemplate exploration of worlds never
before accessible to mankind. Previously, we
have used computers to solve sets of equations,
physical laws too complicated to solve analytically.
Now we can simulate systems to discover physical
laws for which there are no known predictive
equations. We can model physical or social structures
with hundreds of thousands, or maybe even millions,
of “actors,” interacting with one
another in a complex fashion. The speed of our
new computational environment allows us to test
different inter-actor (or inter-personal) relations
to see what macroscopic behaviors can ensue.
Simulations can determine the nature of the
fundamental “forces” or interactions
between “actors.”
Thus, computer simulation is
now a major force for discovery in its own right.
Much of this advance has been enabled by the
development of massively parallel processor
(MPP) computation. All of the scientific simulations
I shall exhibit today were done on this class
of computers. These MPP machines, based on a
strategy of interconnecting systems that were
designed for the desktop or server markets,
are efficient for some classes of applications,
but inefficient for many problems of importance
to the Office of Science. Their sustained speeds
on some problems are as much as 60% of peak,
while on other important scientific problems
the efficiencies are less than 10%. Discovery
through simulation requires sustained speeds
of order 50 – 100 TeraFLOPS for problems
in accelerator science and technology, astrophysics,
biology, chemistry and catalysis, climate prediction,
combustion, computational fluid dynamics, computational
structural and systems biology, environmental
molecular science, fusion energy science, geosciences,
groundwater protection, high energy physics,
materials science and nanoscience, nuclear physics,
soot birth and growth, and more [www.ultrasim.info/doe_docs/].
[Slide #3]
Instead, for some of these applications,
today’s U.S. computers available for open
scientific research deliver 2 TeraFLOPS, but
for most applications, ten times less.
[Slide #4]
Compare these speeds with the
Earth Simulator, whose arrival was announced
in April of this year, and which reaches sustained
speeds over 12 TeraFLOPS for computational fluid
dynamics and fusion, and (remarkably) over 26
TeraFLOPS for geosciences. The consequences
of this disparity are seen most vividly through
the example of climate modeling.
[Slide #5]
The best scale available to scientists
in the United States is a computational grid
100 km X 100 km. Over those lengths, mountains,
hurricanes, and coastlines are averaged out.
We know from complex systems that the output
depends critically on the detailed nature of
the input, on fluctuations caused by geographical
features. If these are averaged out, what confidence
can we have in our long-term climate forecasts?
The Earth Simulator has produced climate models
on a 10 km X 10 km grid. On that scale, most
earthly features can be represented. The accuracy
and reliability of their long-term climate forecasts
were demonstrated vividly when Professor Sato,
the head of the Earth Simulator, displayed a
typhoon developing from their climate modeling
at a recent lecture in the Department of Energy.
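The gap between the two grids can be put in numbers with a minimal sketch.
The function name is illustrative and not from the talk, and the estimate
counts horizontal grid cells only. It deliberately ignores vertical levels
and any change in time step, both of which would raise the cost further.

```python
def horizontal_cell_ratio(coarse_km, fine_km):
    """How many fine grid cells fit into one coarse cell (horizontal only)."""
    return (coarse_km / fine_km) ** 2

# Grid spacings quoted in the talk: 100 km (U.S.) versus 10 km (Earth Simulator).
ratio = horizontal_cell_ratio(100, 10)
print(f"A 10 km grid has {ratio:.0f} times as many horizontal cells as a 100 km grid")
```

Under these assumptions, each 100 km X 100 km cell is replaced by 100 cells
at the finer spacing.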
In addition to climate prediction,
consider other important scientific questions.
[Slide #6]
“Autoignition and Control
of ‘Flameless’ Combustion”:
Autoignition is the process that lights a combustible
mixture by the mere application of heat, but
without a flame or spark. For example, autoignition
lights the combustion process every time a diesel
truck engine cylinder fires. Autoignition also
limits the efficiency of most automobile engines,
and produces undesirable ‘flashback’
in low-emission gas turbine combustors that
are used to generate electricity. A major scientific
question is: how does autoignition progress
in fluctuating and incompletely mixed gases,
and how might we control the process?
Our present understanding is primarily
from experimental data and simulations limited
to zero or one-dimensional studies. Most of
this work assumes perfectly mixed gases with
no spatial variations.
The direct 2-dimensional numerical
simulations shown in the figure [Slide #6] are
limited by currently available computational
resources. The scientific requirement for 3-dimensional
runs would require 3 x 10^18 operations, or about
10 hours at a sustained rate of 100 TeraFLOPS.
With increased code efficiency (current S3D
codes at NERSC run at 7% of peak efficiency)
and/or more optimal computer architectures,
such a project could be carried out on a 40-50
TeraFLOPS machine over reasonable time scales.
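The run-time estimate above is simple division, and a short sketch makes it
easy to check. The helper name is hypothetical. The only inputs are the
operation count (3 x 10^18) and the sustained rates quoted in the talk.

```python
TERA = 1e12  # floating point operations per second in one TeraFLOPS

def run_time_hours(total_operations, sustained_teraflops):
    """Wall-clock hours to complete total_operations at a sustained rate."""
    seconds = total_operations / (sustained_teraflops * TERA)
    return seconds / 3600.0

autoignition_ops = 3e18  # 3-dimensional autoignition run, as quoted in the talk

for rate in (100, 50, 40):
    hours = run_time_hours(autoignition_ops, rate)
    print(f"{rate:>3} TeraFLOPS sustained: {hours:.1f} hours")
```

At a sustained 100 TeraFLOPS this gives a little over 8 hours, consistent
with the "about 10 hours" quoted. At 40 to 50 TeraFLOPS the same run takes
roughly 17 to 21 hours.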
This simulation would provide
the first realistic simulation of a moderately
complex, but realistic, autoignition process
revealing its topologies and propagation dynamics.
The data would provide a fundamental understanding
of the effect of mixing on the dynamics of autoignition,
and would ultimately stimulate new strategies
for mixing fuel and air to achieve the desired
operating flexibility and control while maintaining
high efficiency and low emissions. This research
would also help unveil the complex fundamental
relationships between useful energy output and
undesired emissions (NOx and soot) in combustion
devices, giving the Nation a stronger basis
for decisions on policy and choices among programs
such as fuel cell development, homogeneous charge
compression ignition engines, or increased CAFE
standards for cars and trucks.
[Slide #7]
“Supernova Simulations”:
Supernovae were the origin of the heavy elements
of life, from the oxygen we breathe to the iron
in our blood. Using their light as signposts
we can now measure the size and age of the universe,
its rate of expansion, and its long-term future.
Nothing since the Big Bang surpasses the raw
power of supernova explosions - - over 10^30
megatons/sec in neutrinos for several seconds,
as much instantaneous power as all the rest
of the luminous, visible universe combined.
These explosions also give birth to the most
exotic states of matter known - - neutron stars
and black holes.
Over the past decade, observations
of a particular type of supernovae (Type Ia)
have shown that these are excellent ‘standard
candles.’ This means that the light they
emit is a known quantity, and that by comparing
their observed brightness to this value, we
can infer their distance. The inferred distances
lead to a startling result: the expansion of
the universe is currently accelerating.
Supernovae are inherently multi-dimensional
objects in which convection, hydrodynamic instabilities,
and radiative transfer play central roles. Recent
observations show gross asphericities in the
material they eject, highlighting the importance
of their three-dimensionality. Neutrino and
radiation transport in systems far from thermal
equilibrium are critical both to the explosion
mechanism and to supernova diagnostics.
However, current computer simulations
of supernovae have only been able to follow
the calculations in two spatial dimensions.
To make the jump to three dimensions while maintaining
all the important physics, supercomputers will
need to increase in speed by at least two orders
of magnitude. Currently, two-dimensional models
of the core-collapse event with simplified neutrino
transport require approximately 10^15 floating
point operations, a PetaFLOP. Increasing the
complexity of the physics will boost the calculations
by two orders of magnitude, while adding the
third dimension will increase the requirements
by a factor of 500. This leads to 10^20 floating
point operations, 100 ExaFLOPs, or sustained
speeds of 10 to 20 TeraFLOPS for 1.5 months.
Improvements on similar scales will have to
be made in both the total amount and speed of
I/O, memory bandwidth, processor communication
bandwidth, and particularly memory and I/O latency.
A machine such as this will allow us for the
first time to successfully explode a supernova
on a computer while not glossing over any of
the important physics, enabling us to understand
the nature of these amazing explosions that
are at the heart of so many of our most pressing
questions in physics and astronomy today.
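The scaling argument for the three-dimensional supernova calculation can be
followed step by step in a minimal sketch. Variable and function names are
illustrative. The factors are the ones quoted in the talk, and the sketch
uses their unrounded product rather than the rounded total of 10^20.

```python
SECONDS_PER_DAY = 86400.0

base_ops = 1e15         # two-dimensional model, simplified neutrino transport
physics_factor = 100    # "two orders of magnitude" for more complete physics
dimension_factor = 500  # cost of adding the third spatial dimension

# Unrounded product: 5 x 10^19, which the talk rounds to 10^20.
total_ops = base_ops * physics_factor * dimension_factor

def days_at(total_operations, sustained_teraflops):
    """Days of wall-clock time at a given sustained rate in TeraFLOPS."""
    return total_operations / (sustained_teraflops * 1e12) / SECONDS_PER_DAY

for rate in (10, 20):
    print(f"{rate} TeraFLOPS sustained: {days_at(total_ops, rate):.0f} days")
```

With the unrounded product, sustained speeds of 10 to 20 TeraFLOPS
correspond to roughly 58 to 29 days, which brackets the 1.5 months quoted.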
I have discussed three areas
where ultra-scale computation is essential if
we are to fully simulate, and thus discover,
the physics of climate change, combustion, and
supernova collapse. There are other examples
that can be found at www.ultrasim.info/doe_docs/,
the results of “virtual workshops”
over this past summer, with more to come. See
also the NSF workshop titled “Computation
As a Tool for Discovery in Physics” at
www.nsf.gov/pubs/2002/nsf02176/start.htm.
The market for high-end computation
extends beyond science, into applications, creating
a commercial market for ultra-scale computers.
The science and technology important to industry
can generate opportunities measured in billions
of dollars.
For example, at General Motors:
[Slide #8]
“General Motors currently
saves hundreds of millions of dollars by using
its in-house high performance computing capability
of more than 3.5 TeraFLOPS in several areas
of its new vehicle design and development processes.
These include vehicle crash simulation, safety
models, vehicle aerodynamics, thermal and combustion
analyses, and new materials research. The savings
are realized through reductions in the costs
of prototyping and materials used.
However, the growing need to meet
higher safety standards, greater fuel efficiency,
and lighter but stronger materials, demands
a steady yearly growth rate of 30 to 50% in
computational capabilities but will not be met
by existing architectures and technologies…A
computing architecture and capability on the
order of 100 TeraFLOPS for example would have
quite an economic impact, on the order of billions
of dollars, in the commercial sector in its
product design, development, and marketing.”
And from General Electric:
[Slide #9]
“Our ability to model, analyze
and validate complex systems is a critical part
of the creation of many of our products and
design. Today we make extensive use of high-performance
computing based technologies to design and develop
products ranging from power systems and aircraft
engines to medical imaging equipment. Much of
what we would like to achieve with these predictive
models is out of reach due to limitations in
current generation computing capabilities. Increasing
the fidelity of these models demands substantial
increases in high-performance computing system
performance. We have a vital interest in seeing
such improvements in the enabling high-performance
computing technologies…In order to stay
competitive in the global marketplace, it is
of vital importance that GE can leverage advances
in high-performance computing capability in
the design of its product lines. Leadership
in high-performance computing technologies and
enabling infrastructure is vital to GE if we
wish to maintain our technology leadership.”
As an example, consider the comparison
between simulations and prototyping for GE jet
engines.
[Slide #10]
For evaluation of a design alternative
for the purpose of optimization of a jet engine
design, GE would require 3.1 x 10^18 floating
point operations, or 3.6 days of sustained speeds
of 10 TeraFLOPS. And, of course, 100 TeraFLOPS
of sustained speed would require “only”
8.6 hours. This is to be compared with millions
of dollars, several years, and designs and re-designs
for physical prototyping.
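The two run times quoted for the GE design evaluation follow directly from
the operation count. A minimal sketch, with a hypothetical helper name:

```python
def run_time_seconds(total_operations, sustained_teraflops):
    """Seconds needed to complete total_operations at a sustained rate."""
    return total_operations / (sustained_teraflops * 1e12)

ge_ops = 3.1e18  # one jet engine design alternative, as quoted in the talk

days_at_10 = run_time_seconds(ge_ops, 10) / 86400.0
hours_at_100 = run_time_seconds(ge_ops, 100) / 3600.0

print(f"10 TeraFLOPS sustained:  {days_at_10:.1f} days")
print(f"100 TeraFLOPS sustained: {hours_at_100:.1f} hours")
```

This reproduces the 3.6 days and 8.6 hours given above.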
Opportunities abound in other
fields such as pharmaceuticals, oil and gas
exploration, and aircraft design. Given the
size and complexity of the machines required
for sustained speeds in the 50 to 100 TeraFLOPS
regime, the “sociology” of high-end
computation will probably have to change. One
can think of the usage of ultra-scale computers
as akin to that of our current light sources:
large machines used by groups of users on a
shared basis. Following the leadership of our
SciDAC program, interdisciplinary teams and
collaborators will develop the necessary state-of-the-art
mathematical algorithms and software, supported
by appropriate hardware and middleware infrastructure,
to use Terascale computers effectively to advance
fundamental research in science. These teams
will associate on the basis of the mathematical
infrastructure of problems of mutual interest,
working with efficient, balanced computational
architectures.
The large amount of data, the
high sustained speeds, and the cost will probably
lead to concentration of computing power in
only a few sites, with networking useful for
communication and data processing, but not for
core computation at terascale speeds. Peer review
of proposals will be used to allocate machine
time. Industry will be welcome to participate,
as has happened in our light sources. Teams
will make use of the facilities as user groups,
using significant portions (or all) of the machine,
depending on the nature of their computational
requirements. Large blocks of time will enable
scientific discovery of major magnitude, justifying
the large investment ultra-scale computation
will require.
Visualization of these complex
results will require new paradigms. To give
you an idea of the importance of this aspect,
consider astrophysics and fusion.
[CD: Colliding Black Holes]
Our Astrophysics virtual workshop
group, Julian Borrill, Peter Nugent, John Shalf,
Martin White, and Stan Woosley, with editing
by John Hules, write “Computational astrophysics
has an essential role to play in providing the
point of contact between theory and observation.
From the detailed theoretical predictions made
possible by complex simulations, to the precise
reference points obtained from painstaking analyses
of the new observations, the development of
astrophysics in the new millennium will be regulated
by our computational capability.”
Dr. Ed Seidel of the Max-Planck-Institut
für Gravitationsphysik, Albert-Einstein-Institut,
Potsdam, Germany, has simulated the gravitational
waves resulting from a collision of equally
massive black holes using our Office of Science
computational facility NERSC. The black surfaces
are good approximations to the locations of
the event horizons of two black holes. If one
is inside the event horizon, one is forever
trapped. The black holes are going around each
other in a plunging-in spiral, and merge after
about 1/2 orbit. The wispy purple/red/orange
colors represent the gravitational waves emitted
during the process. The orbital motion and final
plunge of the black holes are responsible for the
burst of gravitational waves, first predicted
by Einstein early last century, but never before
detected. These waves may be seen for the first
time in the next few years. Black hole collisions
are considered among the most likely sources
to be detected first!
[CD: Colliding Black Holes Movie]
This is the most advanced calculation to date
of a binary black hole plunging-in spiral, and
one of the largest ever, requiring
two million hours on NERSC. It is one of the
first to compute the evolution of black hole
initial data representing two black holes in
orbit about each other. But even with this usage,
equivalent to thirteen days of use of the full
machine, the calculation shows that the black
holes coalesce very soon, much sooner than expected,
showing that the initial data do not really
represent two black holes in orbit. The initial
data need improvement to better represent the
astrophysical case. Calculations like these,
but much more advanced, will be needed to interpret
gravitational waves expected to be seen with
LIGO, GEO, VIRGO, and LISA detectors.
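The two usage figures quoted for this calculation can be related with a
short sketch. It assumes that the "two million hours" are processor-hours,
which the talk does not state explicitly. The machine size it derives is
therefore an implied figure, not one given in the talk.

```python
total_hours = 2_000_000  # quoted usage; ASSUMED here to mean processor-hours
full_machine_days = 13   # quoted equivalent on the full machine

wall_clock_hours = full_machine_days * 24

# If the assumption holds, this is the processor count implied by the two figures.
implied_processors = total_hours / wall_clock_hours

print(f"Wall-clock time on the full machine: {wall_clock_hours} hours")
print(f"Implied processors in use: about {implied_processors:.0f}")
```

Under that assumption, the two figures are consistent with a machine of
several thousand processors running the calculation for about two weeks.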
[CD: A Simulated “reconnection
event” in the National Spherical Torus
Experiment]
Perhaps no area of science is
more central to the mission of the Department
of Energy than fusion, and five projects were
launched under SciDAC auspices in FY 2001 to
develop and improve the physics models needed
for integrated simulations of plasma systems
to advance fusion energy science. Appropriately
funded interdisciplinary teams, focusing on
a full-scale integrated program, can successfully
deliver a greatly enhanced simulation capability
to U.S. fusion science. Such a capability is
absolutely essential for realizing our nation’s
goal of commercially viable fusion power in
a realistic timeframe.
Our Fusion Energy virtual workshop
group, D. Batchelor, L. Berry, A. Bhattacharjee,
J. Candy, P. Catto, V. Chan, B. Cohen, R. Cohen,
J. Dahlburg, A. Friedman, A. Glasser, S. Jardin,
S. Krasheninnikov, J-N. Leboeuf, W. Nevins,
D. Schnack, C. Sovinec, R. Stephens, and W.
Tang write “The challenge to unravel the
mystery of the complex behavior of strongly
nonlinear, non-equilibrium plasma systems, including
interactions with their external environments
is clearly the next frontier of computational
fusion research…An increase of 50-100
in computing power, along with a modest increase
in human resources to support partnerships between
fusion physicists, applied mathematicians and
computer scientists, will enable fusion researchers
to make a major advance in resolving the spatial
and temporal complexity in simulations of individual
phenomena as well as to begin to develop fully
integrated simulations of fusion systems. Such
an integrated simulation capability would dramatically
enhance the utilization of a burning fusion
device in particular and the optimization of
fusion energy development in general, and would
serve as an intellectual integrator of physics
phenomena ranging from advanced tokamaks to
innovative confinement concepts.”
Wonchull Park of the Princeton
Plasma Physics Laboratory, Princeton, New Jersey,
U.S.A., has shown how simulation can demonstrate
instabilities in plasmas, an essential understanding
in the development of magnetic fusion as a practical
energy source. The present focus of Dr. Park’s
research is to understand how this reconnection
event will manifest itself in the next generation
of fusion experiments with higher temperature
plasma and in the presence of high-energy alpha-particles
that are produced by the fusion reaction. Under
certain conditions that he is working to quantify,
this event can couple with other modes in the
plasma to produce a catastrophic disruption.
A better quantitative understanding of this
process is considered essential for the development
of magnetic fusion into a practical energy source.
The movie shows a simulated "reconnection
event" in the National Spherical Torus
Experiment (NSTX), a fusion experiment at the
Princeton Plasma Physics Laboratory. Red and
green iso-surfaces of constant temperature are
shown. Some frames also show selected magnetic
field lines before, during, and after the reconnection
process. The initial (red) high temperature
region has been expelled from the center and
replaced by a (green) lower temperature region
in this spontaneous, self-regulating event.
Qualitative agreement between the simulation
and experimental measurements is obtained. (Here,
temperature and pressure can be used interchangeably,
because the density variation is small.)
[CD: Reconnection Movie]
This reconnection simulation used
dissipation parameters about 1000 times larger
than in the actual experiment, to make it numerically
tractable. Although the general qualitative
behavior of the evolution is probably captured
correctly, it should be noted that the time
rate at which it will evolve, and the quantitative
criterion for when the instability will actually
start, are not being answered accurately. In
order to be able to answer these questions for
actual experimental parameters, the reconnection
layer region must be resolved roughly 30
times more finely. This would require roughly 100
times more computing power than presently available
in the US.
I hope through these movies and
charts I have given you a sense of the excitement
Ultra-Scale computation brings. I have focused
on only astrophysics and fusion, but computation
and simulation will drive discovery in all areas
of science. The Office of Science is now developing
the paths forward to develop the algorithms
and architectures required for discovery across
the full spectrum of science opportunities.
We believe the opportunities made available
by Ultra-Scale computation are truly wonderful.