Challenges for EVO

EVO has identified the following five issues as the main challenges that must be met if effective, secure cloud services for the environmental research community are to be developed.

Common data and information standards
To enable the free flow of data between environmental web services and applications it is essential that common data and information standards based on shared syntax are available. As with many other areas of science, data standards are available for the environmental sciences. EVO is reviewing the various approaches. The challenge is to find the most effective standards in the context of the development of cloud applications.

Resources are requested and managed using web service calls. The web service APIs are defined using REST, SOAP, or both. The environmental models are implemented using the OGC (Open Geospatial Consortium) WPS (Web Processing Service) standard, which specifies how the inputs and outputs of geospatial processing services should be described and exchanged.
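As an illustration of what such a web service call can look like, the sketch below builds a WPS 1.0.0 Execute request in key-value-pair form. The endpoint URL, the process identifier and the input names are hypothetical placeholders, not actual EVO services; only the query parameter names (service, version, request, identifier, DataInputs) come from the WPS 1.0.0 standard.

```python
from urllib.parse import urlencode


def build_wps_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0 key-value-pair Execute request URL.

    `inputs` maps input names to literal values; WPS joins them as
    name=value pairs separated by semicolons in the DataInputs parameter.
    """
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "DataInputs": data_inputs,  # percent-encoded by urlencode
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process and inputs, for illustration only.
url = build_wps_execute_url(
    "https://example.org/wps",
    "rainfall_runoff",
    {"catchment": "eden", "start": "2012-01-01"},
)
print(url)
```

A real client would send this URL with an HTTP GET and parse the XML response that the WPS server returns.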

Data access, management and security
Cloud computing offers a variety of potential benefits, including bringing together data from disparate sources and a variety of web-enabled data management tools. However, the challenge is to make these services flexible and responsive to user needs while managing appropriate levels of access to data and models.

“Separation of concerns” is at the heart of data and model management in the EVO. This foundational principle is applied at several levels. For example, models are only accessed through a secured proxy. This ensures that the data provided by data producers is not directly accessible to the EVO’s users, whilst still allowing them to execute modelling workflows against the data sets. The delegation technique that enables this comes from our collaboration with the NERC MashMyData project.
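The idea of running models through a proxy can be sketched in a few lines. This is a minimal illustration of the principle only, not the EVO implementation and not the MashMyData delegation technique; the class, the model, the data set and the user names are all hypothetical.

```python
class ModelProxy:
    """Runs models against data sets on a user's behalf.

    Users receive model outputs only; the proxy offers no method
    that returns the underlying data.
    """

    def __init__(self, datasets, models, permissions):
        self._datasets = datasets        # dataset id -> raw data
        self._models = models            # model name -> callable
        self._permissions = permissions  # user -> set of allowed model names

    def run(self, user, model_name, dataset_id):
        if model_name not in self._permissions.get(user, set()):
            raise PermissionError(f"{user} may not run {model_name}")
        model = self._models[model_name]
        return model(self._datasets[dataset_id])


def mean_flow(values):
    """Hypothetical 'model': the mean of a series of flow readings."""
    return sum(values) / len(values)


proxy = ModelProxy(
    datasets={"river_gauge": [2.0, 4.0, 6.0]},
    models={"mean_flow": mean_flow},
    permissions={"alice": {"mean_flow"}},
)
result = proxy.run("alice", "mean_flow", "river_gauge")
print(result)
```

In a deployed service the same separation would be enforced at the network level, with the data store reachable only from the proxy.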

Furthermore, virtualisation allows us to create fully customisable but independent computing environments. This compartmentalisation is essential to maintain privacy through full isolation as well as assisting with efficient resource management and fault tolerance.

Searching for information in the cloud
Searching for information in the cloud is a big challenge for anyone who wants to use cloud services. This is particularly the case where there isn’t yet a mature user community, such as in the environmental science sector. What are the lessons to be learned from other disciplines and sectors?

The EVO portal provides several means of finding information. First, a meta-search engine assists users in finding pages of interest. Search results are returned along with any related resources that can be either on-site or off-site. This is achieved through harnessing the relationship between web resources, a concept known as Linked Data. Second, the portal front page hosts a revolving selection of current content and functionality. Furthermore, there is a myriad of info pages that introduce the functionality and data hosted on the portal along with associated links, FAQs and tutorials. Finally, the portal includes typical navigation menus and a site map.
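The way search results can be returned together with related resources may be sketched as follows. The pages, the link types and the off-site URL are hypothetical, and a real Linked Data system would typically use RDF and a triple store rather than in-memory Python lists; the sketch only shows how links between resources enrich a plain keyword match.

```python
# Hypothetical portal pages: path -> title.
PAGES = {
    "/models/flood": "Flood inundation model",
    "/data/rainfall": "Rainfall observations",
    "/tutorials/flood": "Tutorial: running the flood model",
}

# Hypothetical links between resources as (subject, predicate, object)
# triples. Objects may be on-site paths or off-site URLs.
LINKS = [
    ("/models/flood", "usesData", "/data/rainfall"),
    ("/models/flood", "hasTutorial", "/tutorials/flood"),
    ("/data/rainfall", "seeAlso", "https://example.org/rainfall-archive"),
]


def related(resource):
    """Return (predicate, object) pairs for links that start at `resource`."""
    return [(p, o) for s, p, o in LINKS if s == resource]


def search(term):
    """Match `term` against page titles and attach each hit's related resources."""
    term = term.lower()
    return {
        path: related(path)
        for path, title in PAGES.items()
        if term in title.lower()
    }


results = search("flood")
for path, links in results.items():
    print(path, links)
```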

Visualisation of complex data and models
The cloud offers the opportunity to develop bespoke visualisation tools linked to data and modelling services, and there are already a host of visualisation engines and services available. How should scientists best use the opportunity to visualise environmental data and models in the cloud?

The EVO hosts a range of models working on different inputs and producing different outputs. For each model, a bespoke visualisation is developed to suit the particular factors in question. In general terms, models generate one of two types of output: geospatial and time series. Geospatial data is visualised using interactive layers superimposed over maps. (Google Maps is used due to its wealth of data and features, and the general public's familiarity with it.) The interactive nature of the geospatial layers allows us to expand the visualisation to include time series graphs over specific map locations.
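One way to combine the two output types is to attach a time series to each map location. The sketch below does this with GeoJSON, a widely supported format for map layers; the station name, coordinates, values and the "timeSeries" property name are hypothetical, and the source does not state which data format the EVO layers actually use.

```python
import json


def station_feature(name, lon, lat, series):
    """Build a GeoJSON point feature carrying a time series in its properties.

    GeoJSON coordinates are ordered [longitude, latitude].
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "name": name,
            # Hypothetical property name; a map client would read this
            # to draw a graph when the location is clicked.
            "timeSeries": [{"time": t, "value": v} for t, v in series],
        },
    }


layer = {
    "type": "FeatureCollection",
    "features": [
        station_feature(
            "Gauge A", -2.75, 54.65,
            [("2012-01-01", 1.2), ("2012-01-02", 1.8)],
        ),
    ],
}

geojson_text = json.dumps(layer, indent=2)
print(geojson_text)
```

A mapping library could load this text as an interactive layer and use the embedded series to plot a graph for the selected location.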

Dynamic interaction with models
Hydrologists and environmental scientists make extensive use of process-based and other models to understand the terrestrial water cycle and to predict future trends and events, but to date few of these models have been implemented for the cloud. What are the major challenges and how will such cloud-enabled environmental models be used in the future?