Growing efforts to improve data access and sharing raise issues related to integrating [1] multiple sources of data. In a perfect world, all data producers would adopt international standards and use accessible, interoperable computing environments. But what is the situation in reality?
Everyone uses some form of calendar on a daily basis. Yet a look at how people write dates makes it clear that heterogeneous procedures, formats, syntaxes and systems are major obstacles to data integration. For example, although the international standard ISO 8601 defines YYYY-MM-DD as the date format, 2015-02-12 can be written in a wide variety of ways, as shown in the table below:
| Feb. 12/15 | February 12, 2015 | 12 fév. 2015 | 12 février 2015 |
| 12.02.2015 | 2015.02.12 | 12 de febrero | etc. |
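To make the point concrete, here is a minimal Python sketch that normalizes the variants from the table to the ISO 8601 form. The function name `to_iso`, the `MONTHS` lookup and the `default_year` parameter are illustrative assumptions, not part of any standard or library. The sketch also assumes that numeric dates not starting with a four-digit year are written day first, and that two-digit years belong to the 2000s; real datasets would need these conventions confirmed with the data producer.

```python
import re
import unicodedata
from datetime import date

# Illustrative lookup only: a few English, French and Spanish month names.
# A real integration job would need a complete, agreed-upon list.
MONTHS = {
    "jan": 1, "january": 1, "janv": 1, "janvier": 1, "enero": 1,
    "feb": 2, "february": 2, "fév": 2, "février": 2, "febrero": 2,
    "mar": 3, "march": 3, "mars": 3, "marzo": 3,
}


def to_iso(text, default_year=None):
    """Return the date written in `text` as an ISO 8601 string (YYYY-MM-DD)."""
    # Normalize accents so that "é" is always a single character.
    s = unicodedata.normalize("NFC", text).strip().lower()

    # Purely numeric forms such as 2015.02.12 or 12.02.2015.
    m = re.fullmatch(r"(\d{1,4})[./-](\d{1,2})[./-](\d{1,4})", s)
    if m:
        a, b, c = (int(g) for g in m.groups())
        if len(m.group(1)) == 4:
            year, month, day = a, b, c   # year first
        else:
            day, month, year = a, b, c   # ASSUMPTION: day first
        return date(year, month, day).isoformat()

    # Forms with a month name: the first number is taken as the day,
    # the second (if any) as the year.
    words = re.findall(r"[^\W\d_]+", s)
    numbers = [int(n) for n in re.findall(r"\d+", s)]
    month = next((MONTHS[w] for w in words if w in MONTHS), None)
    if month is None or not numbers:
        raise ValueError(f"unrecognized date: {text!r}")
    day = numbers[0]
    year = numbers[1] if len(numbers) > 1 else default_year
    if year is None:
        raise ValueError(f"no year given in {text!r}")
    if year < 100:
        year += 2000                     # ASSUMPTION: two-digit years are 20xx
    return date(year, month, day).isoformat()


variants = ["Feb. 12/15", "February 12, 2015", "12 fév. 2015",
            "12 février 2015", "12.02.2015", "2015.02.12"]
for v in variants:
    print(v, "->", to_iso(v))
print("12 de febrero ->", to_iso("12 de febrero", default_year=2015))
```

Note that "12 de febrero" carries no year at all, so the missing information has to be supplied from outside the data. The amount of guessing encoded in this short function is a fair measure of what is lost when a standard format is not used at the source.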
Generally speaking, we can see that combining datasets where variables are represented in different formats can cause problems. The same goes for the units, measurement precision or cartographic projections used. Rigor and consistency are therefore essential.
ASSIMILATION IN MODELS
The work of meteorologists is a good illustration of how models are used: weather experts feed a variety of environmental parameters into climate models in order to produce the best possible forecasts. The data assimilation cycle in this process incorporates in situ observations into the models as a way to fine-tune the forecasts. For instance, this is how the coupled water-atmosphere model developed by Saucier et al. [2, 3] can produce surface current forecasts for the Estuary and Gulf of St. Lawrence [4].
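The idea behind an assimilation cycle can be sketched with a deliberately simple scalar example in Python. This is a textbook-style update that weights a forecast and an observation by their error variances; it is not the scheme used in the model cited above. The function names, the persistence "model", the `growth` value and all numbers are hypothetical and chosen only for illustration.

```python
def assimilate(forecast, forecast_var, observation, observation_var):
    """Blend one forecast value with one observation of the same quantity.

    The gain is the share of the forecast-observation gap that is corrected:
    close to 1 when the observation is much more certain than the forecast,
    close to 0 in the opposite case.
    """
    gain = forecast_var / (forecast_var + observation_var)
    analysis = forecast + gain * (observation - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


def toy_model_step(state, state_var, growth=0.02):
    """HYPOTHETICAL model: the state persists and its uncertainty grows."""
    return state, state_var + growth


# Made-up surface current speeds (m/s) observed in situ at successive times.
observations = [0.30, 0.35, 0.32]
observation_var = 0.01

state, state_var = 0.50, 0.04   # initial forecast and its error variance
for obs in observations:
    state, state_var = toy_model_step(state, state_var)                      # forecast
    state, state_var = assimilate(state, state_var, obs, observation_var)    # correct
    print(f"analysis = {state:.3f} m/s, variance = {state_var:.4f}")
```

Each pass through the loop is one cycle: the model produces a forecast, an observation arrives, and the corrected state becomes the starting point of the next forecast. Operational systems apply the same principle to very large state vectors and many observations at once.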