Proposal
Pacific Rim Application and Grid Middleware Assembly:
A proposal to initiate a sustainable collaboration
In the 21st century advances in science and engineering
(S&E) will, to a large measure, determine economic growth,
quality of life, and the health of our planet. The conduct of
science, intrinsically global, has become increasingly important
to addressing critical global issues. At the same time, awareness
of the importance of investing in S&E has grown throughout
the world. Our ability, as a Nation, to work effectively within
the international framework is highly dependent on the contributions
of S&E both to policy deliberations and to problem solving.
Our participation in international S&E collaborations and
partnerships is increasingly important as a means of keeping
abreast of important new insights and discoveries in science
and engineering. (Toward a More Effective NSF Role in International
Science and Engineering, National Science Board Interim Report,
NSB-00-217).
Science: An Intrinsically Global Activity
Over the last decade we have seen an increase in international efforts
to address global problems, as well as the increasing impact of information
technology on the conduct of science. Three examples of international
efforts are the Global Biodiversity Information Facility (http://www.gbif.org)
(see Edwards et al.), which aims to make biodiversity data freely available;
the International Geosphere-Biosphere Programme (http://www.igbp.kva.se/cgi-bin/php/frameset.php),
whose mission is to deliver scientific knowledge to help human societies
develop in harmony with Earth's environment; and the Global Terrestrial
Observing System (http://www.fao.org/gtos),
whose mission is to provide the scientific and policy-making community
with access to the data necessary to manage the change in the capacity
of terrestrial ecosystems to support sustainable development. Other
inter-related national efforts include the national virtual observatories
(in the United States at http://www.srl.caltech.edu/nvo/,
and at http://www.research-councils.ac.uk/escience/
in the United Kingdom) and the high-energy physics community (see
http://www.eu-datagrid.org/
for the European Data Grid effort, and http://www.griphyn.org
for a United States based effort). These latter efforts illustrate
most clearly how the interrelationship between science and technology
is enhancing both. They also illustrate how new types of international
collaborations are being formed by tackling large problems that
require resources from around the globe. These examples are directly
tied to the development of the Grid.
The Grid: Transforming Computing and Collaborating
"Grid
is a new Information Technology (IT) concept of "super Internet"
for high-performance computing: worldwide collections of high-end
resources such as supercomputers, storage, advanced instruments
and immersive environments. These resources and their users are
often separated by great distances and connected by high-speed networks.
The Grid is expected to bring together geographically and organisationally
dispersed computational resources, such as CPUs, storage systems,
communication systems, real-time data sources and instruments, human
collaborators." (from http://www.aei.mpg.de/~manuela/GridWeb/info/grid.html,
also Foster and Kesselman).
Many experts believe that "Grids will Transform Computing"
(see Irving Wladawsky-Berger, IBM Server Group Vice President of
Technology and Strategy, http://www.ibm.com/servers/events/grid.html).
Dr. Wladawsky-Berger notes:
"Each
stage of the Internet's evolution has been cumulative.
Where the Internet today is a vast repository of content that
enabled e-business, the next major stage will leverage Grid
computing, turning the Internet itself into a computing
platform. Think back to 1994-1995. The Web was on the horizon
and clients were looking for focused projects to get their feet
wet. This is the same type of opportunity."
"Grid
computing is in some ways like the World Wide Web. The Web provides
access to a world of content over the Internet through open
standards that let the casual user connect without having to
know where the resource is located.
Just as the user
looks at the Internet and sees content via the World Wide Web,
the user looking at a Grid sees essentially one, large virtual
computer built on open protocols with everything shared:
applications, data, processing power, storage, etc. All through
the Internet."
In response to the needs of scientists, and heeding the advice of individuals
like Dr. Wladawsky-Berger, the scientific community and
nations alike have begun to establish standards groups and to make national
investments. "The Global Grid Forum (GGF, http://www.globalgridforum.org)
is a community-initiated forum of individual researchers and practitioners
working on distributed computing, or "grid" technologies. GGF is
the result of a merger of the Grid Forum in the United
States, the eGrid European Grid Forum, and the Grid community
in Asia-Pacific.
The GGF mission is to focus on the
promotion and development of Grid technologies and applications
via the development and documentation of "best practices," implementation
guidelines, and standards with an emphasis on "rough consensus and
running code". The Asia-Pacific Grid (http://www.apgrid.org/default.htm)
is a consortium with a goal to provide "Grid environments
around Asia-Pacific region. APGrid is a meeting point for
all Asia-Pacific High Performance Computing and Networking researchers.
It acts as a communication channel to the Global
Grid Forum, and other Grid communities."
As has been noted already, the European Union has invested in the establishment
of an EU-Data Grid (see http://www.eu-datagrid.org/),
driven by the needs of strategic science investments in areas of
high-energy physics, biology and medical image processing, and
earth observations. Similarly, the United Kingdom has invested in
its e-Science Programme (http://www.research-councils.ac.uk/escience/), and
the United States has invested in this infrastructure via several
funding initiatives, such as NASA's Information Power Grid
(http://www.ipg.nasa.gov/)
and NSF's most recent award, to its two Partnerships for Advanced
Computational Infrastructure (PACI) Programs, for the TeraGrid (http://www.teragrid.org/).
The Problem: Currently the grid is too difficult to use
Even with all of these efforts, there are still some critical needs that
must be addressed to realize the full potential of the Grid. The
first and foremost need is to make the Grid usable on a daily basis
by the vast array of scientists. Current application efforts are
focused only on very large application consortiums. The barriers
to daily Grid use for single-PI and small-PI groups are enormous,
essentially excluding a large fraction of scientists
from the Grid. While the large consortiums have provided the needed
voice and impulse to take the Grid from the lab, it is essential to address
the problems that keep the Grid from being commonplace for a more diverse
set of application groups. Just as research funding agencies have a diverse
portfolio of project sizes, Grid-enabled resources need a similar
diversity.
We have experienced this in our attempts to make telescience a reality
between the United States and Japan, where two online telemicroscopy
systems, one at NCMIR and one at Osaka, use international research
networks to provide interactive, remote control of high-power microscopes
(see http://www.transpac.org,
or http://www-ncmir.ucsd.edu/CMDA
or http://www.uhvem.osaka-u.ac.jp/official/news.html).
While such experiments are possible, they are far from routine
and very tedious, both in scheduling and tuning the network and
in handling the data. There is too much human intervention
needed to make this exciting use of the Grid routinely possible.
For the Grid to work for a wider variety of groups, more automation,
more favorable use policies for allocation and scheduling, and increased
collaboration are needed.
This example illustrates one goal, using resources to collaborate on
science, as well as the difficulty of making the various
components of the Grid (the hardware, both computers and networking,
the software, and the applications) work as one.
In summary, the types of problems scientists address increasingly take on global
proportions. The science Grid, where information technology (computers,
storage, networks) meets applications and scientific instruments,
is exploding in both size and scope, and holds the potential to
address many global science issues. However, much work needs to
be done to make the Grid usable.
Proposal
We propose to establish the Pacific Rim Application and Grid Middleware Assembly
(PRAGMA). PRAGMA is being formed as a structure in which Pacific
Rim institutions can co-develop grid-enabled applications more formally
and deploy the needed infrastructure to allow data, computing, and
other resource sharing throughout the Pacific Region. This activity
is based on current collaborations and will enhance these collaborations
and connections among individual investigators by including visiting
scholars' and engineers' programs, building new collaborations,
formalizing resource-sharing agreements, and continuing trans-Pacific
network deployment. PRAGMA member institutions would work together
routinely to address applications and infrastructure research of
common interest to them.
PRAGMA recognizes that the countries and institutions that surround the
Pacific Rim, including (but not limited to) the United States, Japan,
Korea, China, Singapore, Thailand, Australia, and New Zealand, have
a well-known history of innovation in information technology.
Furthermore, it recognizes that individual researchers have formed
collaborative ties across the region. For these existing collaborations,
PRAGMA can serve as a mechanism through which information and resources
can be exchanged more easily. PRAGMA resource-sharing agreements
will allow scientists and infrastructure researchers to concentrate
on problem solutions without having to perform ad hoc resource
collection, installation, and testing.
There are two overall goals of the PRAGMA activity aimed at the Asia-Pacific
Region. First, to bring together a community of researchers and technologists
that will accelerate daily use of the Grid for advancing
science through: developing the software; addressing scheduling
and allocation issues across institutional and international boundaries;
running applications on the infrastructure to significantly influence
its buildout; and working with standards bodies (such as the Global
Grid Forum (GGF) or the Internet Engineering Task Force (IETF))
to expand the impact of our experiences and ensure longevity and
interoperability. Second, to build sustained collaborations among
the various stakeholders, namely builders and developers of the
Grid, scientists and researchers of the Grid, and graduate students
of both of these groups, to have a lasting influence on international
collaborations.
Current Request: First Steps
Because our focus is on applications and grid middleware, we wish to bring
together researchers and technologists in a series of meetings to
collaboratively expedite application use of the Grid, and to use
specific applications as our guide to making the Grid live up to its
potential as a single computing platform.
We are requesting support
for the following specific activities to launch PRAGMA:
- Host a first workshop, to be held in San Diego, 11-12 March 2002.
- Support travel of US-based scientists to subsequent workshops in this series. At this stage the next workshops are planned for 10-12 July 2002, to be hosted by the Korea Institute for Science and Technology Information (KISTI), and for Fall or Winter 2002/2003 in Japan.
- Support travel of US-based scientists to the iGRID meeting, 24-26 September 2002, to demonstrate progress of PRAGMA applications.
- Participate in and lead efforts between meetings, such as establishing web sites and continuing to increase involvement and resource (e.g., computer) commitments from various groups.
We feel that this series of meetings will build strong collaborations
and provide enough regular contact to make progress.
Agenda for the First Meeting
In addition to the overall goals of PRAGMA, namely to establish a community
of researchers and technologists that will accelerate daily
use of the Grid for advancing science and to build sustained collaborations,
we have specific goals for this first meeting. By the end of the first
meeting we plan to have produced a gap analysis of applications
running on the Grid; that is, we want to understand concretely what
roadblocks (technical, institutional, national) lie between
our current state of affairs and what routine use of each application
would look like in the Grid environment. Furthermore, we would develop
a plan to address those barriers over the course of the subsequent
year.
To bring our experiences to a broader audience, and to motivate progress
in addressing some of these issues, we plan to use the iGRID 2002
meeting in Amsterdam in September 2002 (http://www.startap.net/starlight/iGRID2002),
with its theme of what you can do with a 2.5 gigabit lambda,
as a milestone for having made progress on one or more applications,
and to use PRAGMA to focus a Pacific Rim response to the iGRID challenge.
Since this will be the
first meeting of the group, we expect to have a mixture of background
talks (for all of the participants to understand the resources available,
the various software projects, and the possible applications to
drive the progress of PRAGMA) and discussions (barriers to progress).
Governance of PRAGMA
As envisioned, PRAGMA is an organization open to all organizations in
the Pacific Rim that align with its goals. To maintain
continuity between meetings and to help maintain interest and focus
for the group, we will explore a steering structure with some of
the following attributes:
Each meeting will have co-Program Chairs, responsible for the agenda
for the meeting as well as the local arrangements. We feel, for the
sake of continuity, that for any given meeting one co-chair should
be from the host site (as part of the host site's commitment to PRAGMA)
and the other co-chair should be from the institution that has agreed
to host the subsequent meeting. Thus, the first Program Co-Chairs
are Phil Papadopoulos (UCSD) and a representative from KISTI. The
remaining program committee members would be selected from the broader PRAGMA
participants to reflect the agenda.
To provide institutional oversight and commitment, we will have a steering
committee that will approve the final agenda, help in the selection
of the sites, set priorities for building PRAGMA (see list of
possible activities below), and help overcome institutional
and national barriers to making the applications successful. We
will discuss details of this at the first meeting, but we anticipate
that each institution that has agreed to host a
meeting would have one or two representatives on the steering committee.
The steering committee might be rounded out by experts or individuals
from application areas. In the case of UCSD, two initial members of the committee
would be Peter Arzberger and Philip Papadopoulos. Other initial
members will be from KISTI, TITECH, TACC, CAS, NCHPC, APAC and Singapore.
The initiators envision a series of workshops (currently 3 in 2002, 1 in 2003,
and 1 in 2004 are planned) to address such issues as:
1. A common set of grid applications and mechanisms to co-develop, share, and support these applications
2. Formalized agreements to exchange computing and other resource cycles among institutions and computing centers
3. Common network deployment activities for trans-Pacific communication
4. Grid infrastructure deployment, including
   a. Security/Certificate trust relationships
   b. Resource discovery/reporting
   c. Co-reservation of resources
5. Scholars and Professionals exchange programs
6. Structure and membership of PRAGMA
Possible Technical Topics for our Meetings:
Cluster computing
Federating databases
Grid portals
Knowledge integration in various domains
Mirroring databases in biology
Telemicroscopy
Network engineering and operations
Measurement and analysis of grid performance
Impact of wireless on expanding the range of the grid
Summary
The time is ripe to launch this initiative to bring together researchers,
scientists and technical experts to build tools for applications
to run on the growing grid environment. The plan we have proposed
will leverage other activities around the globe, focus
on and contribute to key scientific applications, ensure ongoing
dialog, and thereby build, through these and associated interactions,
sustainable collaborations in the Pacific Rim arena. Through these
collaborations, we will be able to address larger problems of global
concern, and build the necessary human infrastructure and networks,
the ultimate science infrastructure.