Calls for Interoperability to Advance Science and Address Data Integration Needs: CCT Workshop

Multi-Scale Integration of Human Health and Environmental Data Is Emerging Standard of Best Practice

By Annie M. Jarabek and Glenn Suter, National Center for Environmental Assessment, US Environmental Protection Agency

Disclaimer:  The views expressed in this article are those of the authors and do not necessarily represent the views or policies of the US Environmental Protection Agency.

An SOT Contemporary Concepts in Toxicology (CCT) workshop entitled “Multi-scale Integration of Human Health and Environmental Data,” co-sponsored by the US Environmental Protection Agency (USEPA), was held at the USEPA campus in Research Triangle Park, North Carolina, on May 8–11, 2012. The Society of Environmental Toxicology and Chemistry (SETAC), the International Society of Exposure Science (ISES), the Society for Risk Analysis (SRA), and the International Environmental Modelling and Software Society (iEMSs) also co-sponsored the workshop, along with several other governmental agencies, including the US Food and Drug Administration (USFDA), US Army Engineer Research and Development Center (ERDC), US Geological Survey (USGS), US Department of Agriculture (USDA), Pacific Northwest National Laboratory (PNNL), and US Nuclear Regulatory Commission, and private sector supporters, including the American Chemistry Council, Environ, TERA, OpenMI, Open Geospatial Consortium, and ICF.

CCT workshops are intended as in-depth meetings of significant duration to explore cutting-edge topics. The objective of this meeting was to convene scientists from different sectors (government, industry, NGO, and academia) and from across the exposure-dose-response-analysis continuum, for both ecological and health endpoints, to discuss the timely topics of data integration, data management, and model interface needs with software developers, software engineers, database architects and administrators, and data analysts. Software and database experts were included so that they could hear these needs firsthand and then speak to the technology and design issues that transcend the scientific disciplines. The aim was to ensure that the workshop's recommendations describe a computational infrastructure that supports all of these endeavors, especially data integration.

The workshop consisted of plenary presentations and thematic breakout sessions for five different disciplines: (1) Exposure, transport, and transformation; (2) Ecological risk, ecosystem services, and climate change; (3) Dose-response, Tox21, and risk; (4) Life-cycle/multi-criteria assessment and cost-benefit analysis; and (5) Information technology.

Plenary Presentations

The first day of the workshop was devoted to plenary talks from each of the sectors, and from different disciplines within each, to introduce the range of issues and perspectives. Virtually all of the talks emphasized the need to make data and models from diverse sources more available and more useful through interoperability. Both real-time exposure monitoring and new assays in molecular toxicology are creating huge data sets that must be integrated across exposure durations and different receptors. One speaker noted that we are in the era of a highly technical “knowing generation” that expects data to be easily discovered electronically, asserting that if data cannot be located via Google, then they essentially do not exist.

One participant felt that databases and computational tools must be maintained “live,” with curation and annotation updated as data or models are used in various applications. Speakers also noted the need to extrapolate across steps in the development process (bench to bedside) and across scales, from the levels of organization within an organism to different geographic locations (gene to globe). The ability to visualize and display data was considered a tool of great utility to convey content and aid inferences. Semantics was identified as a critical issue for interoperability when exchanging information across the disciplines. For example, a vein is not the same thing in a leaf, a fly, or a mammal, and “species” in different modeling arenas may refer to a reaction molecule or to a rat.

An example of interoperability in the environmental arena was provided by Daniel Ames of Idaho State University. He described the open Hydrologic Information System (HIS) of the Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI), including HydroServer, HydroDesktop, and HIS Central. Together, they provide a complete platform for storing and organizing hydrologic and water quality data and then extracting, organizing, plotting, mapping, and linking the data to models. HIS achieves consistency, interoperability, and transparency through standards and open licensing. It was noted that such comprehensive environmental descriptions will need to be linked to human health, toxicological, and life-cycle or benefit assessment models to characterize sustainability for environmental decision making.

The speakers from NGOs emphasized the need for openness both for their own projects and for the public. The information technology speakers described efforts to standardize data and model management in ways that enhance interoperability. The Open Modeling Interface Standard (OpenMI) and the Open Geospatial Consortium were presented as efforts that have achieved integrated dynamic environmental modeling through the use of international standards for spatial data and interfaces. The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications originally designed as a metadata data model. It has come to be used as a general method for the conceptual description or modeling of information in web resources, and it can be written in a variety of syntax formats.
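
To make the RDF data model mentioned above more concrete, the following is a minimal sketch in Python, not something presented at the workshop. It assumes only the core RDF idea that every statement is a subject-predicate-object triple whose terms are identified by IRIs. All of the example.org names are hypothetical placeholders rather than real vocabularies. The sketch shows how two disciplines can each use the word “species” without colliding, because each term lives in its own namespace.

```python
# Hypothetical namespaces: these example.org IRIs are placeholders,
# not real published vocabularies.
TOX = "http://example.org/toxicology#"
CHEM = "http://example.org/reaction-model#"

# Each RDF statement is a (subject, predicate, object) triple of IRIs.
triples = [
    ("http://example.org/data/rat-study-1",
     TOX + "species",
     "http://example.org/taxa/Rattus_norvegicus"),
    ("http://example.org/data/kinetic-model-7",
     CHEM + "species",
     "http://example.org/molecules/ozone"),
]


def to_ntriples(statements):
    """Serialize IRI-only triples in the line-based N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in statements)


def objects_of(statements, predicate):
    """Return the objects of every triple that uses the given predicate."""
    return [o for _, p, o in statements if p == predicate]


if __name__ == "__main__":
    print(to_ntriples(triples))
    # Asking for the toxicology sense of "species" never returns a molecule.
    print(objects_of(triples, TOX + "species"))
```

Because the two “species” predicates have different full IRIs, a query for one discipline's meaning cannot silently pick up the other's, which is the kind of semantic disambiguation the speakers called for.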

Disciplinary Breakout Sessions and Thematic “Ambassadors”

Experts in each of the five theme areas served as invited participants and joined other workshop attendees to articulate best practices and recommendations that would facilitate data integration both within and across the disciplines. “Ambassadors” from other disciplines joined the discussions to foster cross-fertilization and stimulate the development of interfaces and data integration. Feedback indicated that this was a particularly useful construct: each theme learned a great deal about the needs and challenges in the other disciplines, some of which resonated with their own and others of which differed. Consistent messages were heard across the disciplines on the need for data discovery and for modular “plug and play” capabilities to facilitate comparisons and transparency in given derivations or decisions. The need for maintenance and curation to ensure the quality of databases was also a prominent recommendation.

Next Steps—Stay Tuned for Publications

Participants in each session are developing state-of-the-science manuscripts that describe perspectives on best practices and summarize the information technology needs to advance that discipline. A separate synthesis manuscript will articulate a set of recommendations for standards on interoperability and computational systems to support data integration across the disciplines.

SOT CCT Group Shot

