New information technologies make it easier to access, reuse and add value to data. They create opportunities for collaboration, support innovation and foster the development of new research initiatives. Disseminating and sharing scientific data are key to realizing that value: they help democratize data access and give scientific research greater exposure. These benefits motivate governments, organizations and scientists across the world and encourage the adoption of data management best practices throughout the data life cycle.
The Government of Canada has established a series of open data principles 1. These can easily apply to any dataset:
- Completeness: datasets should be as complete as possible and should reflect the entirety of what is recorded about a particular subject, including metadata explaining the raw data as well as details about calculation methods.
- Primacy: datasets should come from a primary source, including original data collected and details about how data was collected in order to allow users to verify that data was collected and recorded properly and accurately.
- Timeliness: data should be made available to users in a timely fashion without delays.
- Ease of Physical and Electronic Access: datasets should be as accessible as possible, without complicated access conditions or requirements to use complex technologies.
- Machine Readability: datasets should be stored in widely-used file formats that easily lend themselves to machine processing (e.g. CSV, XML). These files should be accompanied by documentation related to the format and how to use it in relation to the data.
- Non-discrimination: anyone should be able to access datasets at any time, without obstacles and without having to identify themselves or justify their request.
- Use of Commonly Owned Standards: datasets should be in freely available file formats as often as possible; users should not need to get a particular application to read the data.
- Licensing: the use of an open licence increases openness and minimizes restrictions on the use of the data.
- Permanence: for optimal use, online information should remain online, with appropriate version-tracking and archiving over time.
- Usage Costs: open data is free of charge.
The US Government has a similar approach. The US White House Project Open Data 2 states that data will be documented, complete, reusable, public and timely. An action plan on governmental open data was published in 2014. 3
On the international scene, the Global Earth Observation System of Systems (GEOSS) has also adopted such principles, recognizing that data sharing is essential to supporting societal benefits 4. Complete, timely and open access to metadata and data is a key element of this strategy. The International Oceanographic Data and Information Exchange (IODE) of the Intergovernmental Oceanographic Commission (IOC) of UNESCO is a good example of an oceanographic data and information exchange program. 5
The large number of international initiatives has prompted the Group on Earth Observations (GEO) to set up a working group tasked with examining data production and management worldwide. Among its conclusions, the group has identified a need to harmonize the various approaches, terminologies and definitions. 6