Towards a Global
10 January 2005
1. Scientific Background:
Water has played a critical role in society for thousands of years and will increasingly constrain human economic and social development. Aquatic systems can play important roles as conduits of inorganic carbon from terrestrial systems to the atmosphere and as mineralization sites of organic carbon (Cole et al. 1994). The carbon cycle in lakes is key to understanding land-water interactions, the trophic state of lakes, and water clarity. Lake metabolism, described as the balance between the complementary processes of gross primary production (GPP) and respiration (R) (Hanson et al. 2003), is a fundamental lake characteristic that helps describe the source of carbon incorporated into all trophic levels of the ecosystem[1].
Lake metabolism can be measured using high-frequency (~0.002 Hz) observations of dissolved oxygen or carbon dioxide concentrations in the surface waters of lakes (Cole et al. 1994; Hanson et al. 2003). These measurements are made using sensors deployed on a lake, with data sent wirelessly to a base station and incorporated into an easily accessible database. Metabolism can then be calculated from the diel data. In addition, measurements such as wind speed, wind direction, water temperature profiles, and three-dimensional water circulation patterns (measured by an acoustic Doppler current profiler) are desirable to refine the GPP and R calculations. While these data are sufficient to calculate metabolism variables, additional information is needed to interpret the values. Ancillary information on weather (precipitation, photosynthetically active radiation), changes in bacterial, algal and zooplankton abundance, and nutrient status is often necessary to understand patterns of lake metabolism. These data are typically not collected on the instrumented buoy, but many are available after the fact in NTL LTER or other databases.
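To make the diel calculation concrete, the following is a minimal sketch of one common free-water "bookkeeping" approach; it is not necessarily the method used at NTL or YYL. A sampling rate of ~0.002 Hz corresponds to roughly one reading every 500 seconds. The sketch assumes that the hourly respiration rate measured at night also holds during the day, that gas exchange can be described by a single piston velocity, and that the mixed layer has a fixed depth. All names (Sample, diel_metabolism, trophic_state, k_m_per_h, z_mix_m) are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class Sample:
    hour: float         # time of reading, hours since midnight
    do_mg_l: float      # measured dissolved oxygen (mg O2 / L)
    do_sat_mg_l: float  # saturation concentration at the water temperature
    is_day: bool        # True if the interval starting at this reading is in daylight


def diel_metabolism(samples: List[Sample], k_m_per_h: float,
                    z_mix_m: float) -> Tuple[float, float, float]:
    """Return (GPP, R, NEP) in mg O2 / L / day for one diel cycle.

    Assumptions (not from the source text): k_m_per_h is a constant gas
    piston velocity, z_mix_m is the mixed-layer depth, and the night-time
    respiration rate applies over the full 24 hours.
    """
    day_rates, night_rates = [], []
    day_hours = 0.0
    for prev, cur in zip(samples, samples[1:]):
        dt = cur.hour - prev.hour
        observed = (cur.do_mg_l - prev.do_mg_l) / dt
        # Atmospheric exchange: oxygen enters the lake when it is undersaturated.
        exchange = k_m_per_h * (prev.do_sat_mg_l - prev.do_mg_l) / z_mix_m
        biological = observed - exchange
        if prev.is_day:
            day_rates.append(biological)
            day_hours += dt
        else:
            night_rates.append(biological)
    r_hourly = -mean(night_rates)              # respiration as a positive rate
    gpp = (mean(day_rates) + r_hourly) * day_hours
    r = r_hourly * 24.0
    return gpp, r, gpp - r


def trophic_state(gpp: float, r: float) -> str:
    """Classify the lake following the GPP versus R comparison in footnote [1]."""
    if gpp > r:
        return "autotrophic"
    if gpp < r:
        return "heterotrophic"
    return "balanced"
```

The function needs at least one daytime and one night-time interval; a real deployment would also have to handle sensor gaps, drift, and a wind-dependent exchange coefficient, which is one reason the wind and temperature-profile measurements mentioned above are desirable.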
To understand the role of lakes in the global carbon balance, a system of buoys with sensors has been put on lakes in
This first step towards a global network has already produced understanding of lake dynamics, especially of the role of episodic events that influence the physics, chemistry and biology of the lake, by using moored meteorological instruments, thermistor chains and oxygen sensors. The additional data enabled calculation of gross primary production, respiration, and net ecosystem production. These sensors also showed the response of
Because lake metabolism is a property of fundamental interest to limnologists, resource managers and the public, and because it can be estimated using field-deployed sensors such as those at NTL or YYL, lake metabolism is an excellent choice as a science driver of biological database and informatics research.
We envision a global network of thousands of lake metabolism buoys, deployed strategically around the globe in disparate lakes, to understand at local, regional, continental and global scales such issues as the direction and rate of change of lake metabolism; the factors controlling daily, seasonal, and among-year variability of lake metabolism; and the reciprocal interactions between human use of lakes and lake metabolism. A global network of automated lake observatories, each collecting and transferring data in near real time, is within our grasp in the next decade. But this vision, as well as those of other large-scale projects, cannot come to fruition without real and meaningful partnerships between ecological scientists, developers of middleware, and information management specialists to solve key information technology and database issues.
In the next section we describe the activities that have led to our current accomplishments and lay out some design principles for a global network of lakes. The philosophy we have followed, making data available to scientists as quickly as possible and in a sustained way, has allowed us to see the limitations of our system while demonstrating the value of our approach for gaining new scientific insights.
2. Architectural Considerations and Conceptual Design:
To realize our vision of a global lake observatory network, much more work needs to be done. The pieces of the infrastructure that will be developed and implemented with current or pending support (see Appendix) will put some critical components in place and will provide stepping stones to the next level of infrastructure. But more technology needs to be developed and deployed. In particular, we envision a system that will allow a researcher to obtain not just the sensor data but also the ancillary data needed to interpret lake metabolism calculations. We envision a system that will allow a researcher to launch a calculation (e.g. lake metabolism, lake evaporation) and have the system return the result. We envision a system in which a researcher, with permission, will be able to adjust any of the sensors in the network to capture data from unique events, perhaps automatically, as the event unfolds.
At this stage of development, we feel the next critical deployment issue is to redesign our working prototype, which links two lakes together, so that it can be extended to other lakes (see section 3 for a list of individuals invited to the workshop and their lakes).
We will continue to follow our overarching philosophy of making data available to scientists as quickly as possible and in a sustained way throughout this project.
2.a. Architectural Design of Global
To prepare for this next stage of expansion several key design and implementation strategies are proposed:
· Each lake or lake system is a separate (autonomous) administrative domain, which is responsible for the deployment and maintenance of equipment (from sensor to database) and curation of data (including providing access to data and creation of metadata).
· For lakes participating in the network, a core set of sensor data is openly available as soon as the data pass from the sensor/data logger to the database. We assume that the data will “flow” in near real time, automatically from sensor to database.
· The system must be easily extensible to new lakes. Our approach will be to design an interface that allows each lake to register its data; the system will then allow others to see those data. We are using a service-oriented architecture (i.e. one based on web services) to implement such a prototype.
· We will standardize core data (e.g., terminology, language) as far as possible, using ontologies (or a lookup table) for other data.
· The network design must automatically detect and reflect changes within a lake administrative domain that are manifest in the lake database, e.g., changes in sensor location or in the number of sensors at a site. Note: within an administrative domain, the responsibility to update the database will lie with the lake administrator (the automated updating of the lake database given changes in sensors or sensor locations is part of the NSF BDI proposal – see Appendix). For the global design linking many lakes, we are planning that once a lake is registered, the system will automatically check all databases and update the interface, reflecting the data that are available.
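The registration and change-detection principles listed above can be illustrated with a small in-memory sketch. It is only an illustration under stated assumptions: the class names (LakeDomain, LakeRegistry), the endpoint URLs, and the variable names are hypothetical, and in a real system the set of reported variables would come from a call to each lake's web service rather than being passed in directly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class LakeDomain:
    """One autonomous lake administrative domain (hypothetical structure)."""
    name: str
    endpoint: str                                   # placeholder web-service URL
    variables: Set[str] = field(default_factory=set)


class LakeRegistry:
    """Keeps track of registered lakes and the sensor variables each exposes."""

    def __init__(self) -> None:
        self._lakes: Dict[str, LakeDomain] = {}

    def register(self, lake: LakeDomain) -> None:
        """Add a lake to the network, or replace its earlier registration."""
        self._lakes[lake.name] = lake

    def refresh(self, name: str, reported: Set[str]) -> Dict[str, Set[str]]:
        """Compare what the lake database now reports with the stored view.

        Returns the variables that appeared and disappeared, then updates
        the registry so the shared interface reflects the current state.
        """
        lake = self._lakes[name]
        changes = {
            "added": reported - lake.variables,
            "removed": lake.variables - reported,
        }
        lake.variables = set(reported)
        return changes

    def lakes_with(self, variable: str) -> List[str]:
        """List the lakes that currently offer a given variable."""
        return sorted(n for n, lake in self._lakes.items()
                      if variable in lake.variables)


registry = LakeRegistry()
registry.register(LakeDomain("NTL", "https://example.org/ntl",
                             {"dissolved_oxygen", "wind_speed"}))
registry.register(LakeDomain("YYL", "https://example.org/yyl",
                             {"dissolved_oxygen"}))
# A sensor is added at YYL; the next refresh picks up the change.
yyl_changes = registry.refresh("YYL", {"dissolved_oxygen", "water_temperature"})
```

Keeping the comparison inside the registry means each lake administrator only maintains the local database, which matches the division of responsibility described in the note above.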
Lake administrative domains: Currently we have been working with two lake administrative domains:
This is the simplest version of a lake infrastructure. More complicated versions would include one or more of the following:
· Different sensor data (e.g. acoustic Doppler current profiles) (to be tested with GBMF funding)
· Sensor data from more than one lake in a lake system (as is being implemented with funds from GBMF)
· Ancillary data from lake samples, to give information on dissolved organic carbon, dissolved inorganic carbon, total phosphorus, total nitrogen, chlorophyll concentration, bacteria counts and types, phytoplankton, and major ions. Many of these data are available online from NTL, but not yet from YYL.
· Sensors will be added, moved, and removed at the will of the researchers. Thus, any local design must propagate these changes to the database.
· See Figure XX for a schematic of what a complex but single administrative domain would address.
Global (Networked) Architecture: A global (networked) architecture would provide access to data from several lake administrative domains. A prototype, supported by PRAGMA, Ecogrid, and the YYL projects, currently connects two lakes through a single interface. This prototype uses JDBC connections directly into the databases at both NTL and YYL (Phase 0 sites). It has proven a very useful demonstration: scientists have already seen the impact of extreme weather (a typhoon) on YYL and the lake's response. This work will result in a paper in 2005.
However, that system is not extensible without additional manual labor. Furthermore, we expect that not all databases will offer JDBC connections. We are redesigning this system, with both PRAGMA and GBMF funding, to use web services to access data. However, we will limit the prototype to sensor data. At the March 2005 meeting, supported by GBMF and other NSF funds, we will involve the community to review the new prototype, establish a core set of sensor data to be collected by all lakes and a set of ancillary data that could be collected, and discuss ways to extend that system to more lakes. The version of the system with just sensor data is illustrated in Figure YY.
In short,
· Phase 0: We established a prototype, using JDBC connections to allow query of two lakes.
· Phase 1: We are redesigning the interface for two or more lakes to provide registration of lakes into the larger system and to allow access to the data in each database via web services. We focus exclusively on data from sensors. We will have a prototype of this system by March 2005.
· Phase 2: We want to move the Phase 1 prototype into use and extend it to three other lakes by the end of 2005. How to do this will be a key discussion at the March meeting, and will entail understanding the physical infrastructure at various lakes, but also what data are currently being collected, what terms are used to refer to the data, what units and other metadata are used, and how data can be accessed.
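The question of which terms and units each lake uses can be approached with the lookup-table idea mentioned among the design principles. The sketch below is a hedged illustration only: the local column names, the core vocabulary, and the conversions are invented for the example and do not describe the actual NTL or YYL databases.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup table: local term -> (core term, scale, offset).
# A value is converted as: core_value = local_value * scale + offset.
LOOKUP: Dict[str, Tuple[str, float, float]] = {
    "do_mg_l":  ("dissolved_oxygen_mg_l", 1.0, 0.0),
    "wind_kmh": ("wind_speed_m_s", 1.0 / 3.6, 0.0),
    "wtemp_f":  ("water_temperature_c", 5.0 / 9.0, -32.0 * 5.0 / 9.0),
}


def to_core(local_term: str, value: float) -> Optional[Tuple[str, float]]:
    """Translate one local reading into the shared vocabulary and units.

    Returns None for terms that are not in the table, so that unmapped
    data can be flagged for review instead of being silently mislabeled.
    """
    entry = LOOKUP.get(local_term)
    if entry is None:
        return None
    core_term, scale, offset = entry
    return core_term, value * scale + offset


wind = to_core("wind_kmh", 36.0)    # 36 km/h expressed in m/s
temp = to_core("wtemp_f", 68.0)     # 68 degrees F expressed in degrees C
unknown = to_core("secchi_ft", 10.0)
```

A flat table like this is easy for each lake administrator to maintain; an ontology would be the natural next step once the relationships between terms become richer than one-to-one renaming and unit conversion.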
There are many other issues we need to deal with at the global, networked level:
· Extending the system to other types of sensor data, such as Acoustic Doppler Current Profile data, or visual data
· Extending the system beyond sensor data to other ancillary data (e.g. data that come from water samples analyzed in the lab, such as the amount of total phosphorus)
· Instituting security features to protect the data from malicious attack.
· Integrating data with computational or presentational tools
· Developing cross-site query tools.
· Linking data to other data, e.g. remote sensing data.
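As a rough illustration of what a cross-site query tool from the list above might look like, the sketch below hides each lake's access method (JDBC, web service, or anything else) behind a common fetch function and merges the results. Everything here is an assumption made for illustration: the Fetcher signature, the in-memory stand-in for a lake database, and the sample readings are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Reading = Tuple[str, float]                          # (ISO 8601 timestamp, value)
Fetcher = Callable[[str, str, str], List[Reading]]   # (variable, start, end)


def make_in_memory_source(data: Dict[str, List[Reading]]) -> Fetcher:
    """Stand-in for a lake's web service, backed by a dictionary."""
    def fetch(variable: str, start: str, end: str) -> List[Reading]:
        # ISO 8601 timestamps in one fixed format sort lexicographically.
        return [(t, v) for t, v in data.get(variable, []) if start <= t < end]
    return fetch


def cross_site_query(sources: Dict[str, Fetcher], variable: str,
                     start: str, end: str) -> List[Tuple[str, str, float]]:
    """Collect one variable from every lake as (timestamp, lake, value) rows."""
    rows = []
    for lake, fetch in sources.items():
        for timestamp, value in fetch(variable, start, end):
            rows.append((timestamp, lake, value))
    return sorted(rows)


# Illustrative readings only; these are not real observations.
sources = {
    "NTL": make_in_memory_source({"dissolved_oxygen": [
        ("2005-01-10T00:00", 9.1), ("2005-01-10T01:00", 9.0)]}),
    "YYL": make_in_memory_source({"dissolved_oxygen": [
        ("2005-01-10T00:30", 7.4), ("2005-01-11T00:00", 7.2)]}),
}
rows = cross_site_query(sources, "dissolved_oxygen",
                        "2005-01-10T00:00", "2005-01-11T00:00")
```

Because the query function only depends on the fetch signature, a new lake can join by supplying its own fetcher, which is the kind of extensibility the redesign is aiming for.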
3. Lakes Participating in the Meeting
| Lake | Project or Institution | Location | Contact |
|---|---|---|---|
| NTL (Trout Bog) | NTL – LTER U Wisconsin | Wisconsin | Tim Kratz, Paul Hanson |
| YYL | TFRI, AS, NCHC | | Hen-biau King, Fang-Pang Lin |
| Several | Lammi Bio. Station | | Lauri Arvola, Marko Jarvinen |
| | CBER, U | | David Hamilton, Eloise Ryan |
| | KNU | | Bomchul Kim |
| | | | Toshio Iwakuma |
| | | | Ami Nishri |
| | | | Andrew Patterson |
| | Nanjing Inst of Geography & Limnology | | Boqiang Qin |
| | | | Jason Antenucci |
| | | | Glen George |
Appendix: Projects contributing to a global lake observatory system:
The core of our joint progress to date leveraged efforts in the North Temperate Lakes LTER, the International LTER, and PRAGMA. From August 2003 through April 2004, we took the first step along the path from concept to implementation of a global network of lakes measuring and sharing lake metabolism data by creating a prototype network of lakes in
NTL: Comparative Study of a Suite of Lakes in Wisconsin -- North Temperate Lakes Long-Term Ecological Research (NTL-LTER), S.R. Carpenter, T.K. Kratz, B.J. Benson, and 16 co-PIs. DEB 0217533, 10/15/02 – 10/15/08, $6.72M, http://lter.limnology.wisc.edu. The goals of the NTL-LTER program, established in 1981, are to detect long-term change in lakes and surrounding landscapes; understand physical, chemical, and biological linkages at lake, landscape, and regional scales; and understand feedbacks between lake and human processes. Seven lakes in northern
YYL: Yuan Yang Lake. T. Kratz, http://lakemetabolism.org. A recent supplemental award has allowed the NTL team to partner with the PRAGMA program as well as with colleagues at the Taiwan Forestry Research Institute, Academia Sinica, and the
PRAGMA:
PRIME:
The Gordon and Betty Moore Foundation (GBMF): “Toward a Distributed Information System for Marine Biology and Limnology.” P. Arzberger, A. Gupta, K. Stocks, T. Kratz. $1,762,421, 15 Oct 2004 – 14 Oct 2006. The Biomedical Informatics Research Network (BIRN), a National Institutes of Health (NIH) initiative, has constructed an infrastructure that allows researchers nationwide to share and analyze biomedical data, with data mediation as a key technology. The GBMF award will extend that infrastructure, designed for the biomedical community, to handle queries on spatial and temporal data. The test cases for the oceans include OBIS and overfishing of seamounts. The funding for the lakes component will provide new equipment for an entire lake system, allowing first-of-its-kind analysis of an entire lake system, and tools to make those sensor data accessible.
In addition, the GBMF funding will initiate a process, via a workshop, to organize the international community to build a global network of lake sensors as well as the linkage between coral reef sites. The GBMF project will extend a tool that will be useful to the global environmental community, establish a prototype lake system observatory, and initiate a global lake observatory network.
Biological Databases and Informatics (BDI): Automating Scaling and Data Processing in a Network of Sensors: Towards a Global Network for
EcoGrid (ecogrid.nchc.org.tw),
[1] If GPP is greater than R, then the lake is autotrophic and internally produces reduced carbon sufficient to fuel higher trophic levels. Alternatively, if GPP is less than R, then the lake is heterotrophic and must receive an external source of reduced carbon to fuel higher trophic levels.