
3. Data Life Cycle

3.7.2 Data and Information Integration

Growing efforts to improve data access and sharing raise the problem of integrating [1] multiple data sources. In a perfect world, all data producers would adopt international standards and use accessible, interoperable computing environments. But what is the situation in practice?

HETEROGENEITY
Everyone uses some form of calendar every day. Yet the many ways people write a date show that heterogeneous procedures, formats, syntax and systems are major obstacles to data integration. For example, although the international standard ISO 8601 defines YYYY-MM-DD as the calendar date format, 2015-02-12 can be written in a wide variety of ways, as shown in the table below:

12/02/15      15/02/12            02/12/15        15/12/02
12-02-15      15-02-12            02-12-15        15-12-02
12-02-2015    2015-02-12          02-12-2015      2015-12-02
Feb. 12/15    February 12, 2015   12 fév. 2015    12 février 2015
12.02.2015    2015.02.12          12 de febrero   etc.

Sample ABC-009: collected 06/05 at 7 h. Was it May 6 or June 5? AM or PM?
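The ambiguity can be made concrete with a short Python sketch. It tries a few candidate layouts on the same string and reports every ISO 8601 date that the string could stand for. The function names and the list of layouts are illustrative assumptions, not part of any standard; a real integration step would take the layout from each dataset's metadata rather than guess it.

```python
from datetime import datetime

# Candidate layouts a data integrator might encounter (illustrative, not exhaustive).
CANDIDATE_FORMATS = {
    "DD/MM/YY": "%d/%m/%y",
    "MM/DD/YY": "%m/%d/%y",
    "YY/MM/DD": "%y/%m/%d",
}


def possible_dates(raw):
    """Return every ISO 8601 date that `raw` could represent, keyed by layout."""
    readings = {}
    for label, fmt in CANDIDATE_FORMATS.items():
        try:
            readings[label] = datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            # This layout cannot produce a valid calendar date for `raw`.
            pass
    return readings


def to_iso(raw, fmt):
    """Convert `raw` to ISO 8601 once the source layout is known from metadata."""
    return datetime.strptime(raw, fmt).date().isoformat()


if __name__ == "__main__":
    # One string, three different valid dates.
    print(possible_dates("12/02/15"))
    # With the layout documented, the conversion is unambiguous.
    print(to_iso("12/02/15", "%d/%m/%y"))
```

Here "12/02/15" is a valid date under all three layouts, which is why the layout has to be documented alongside the data rather than inferred from the values.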

More generally, combining datasets in which the same variable is represented in different formats causes problems. The same applies to units, measurement precision and cartographic projections. Rigor and consistency are therefore essential.
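The same point applies to units. As a minimal sketch, the following Python function converts temperature records reported in different units to degrees Celsius before they are combined. The unit labels ("degC", "degF", "K"), the function name and the 0.01 rounding precision are assumptions chosen for illustration.

```python
# Conversions to a common unit (degrees Celsius); the unit labels are assumed.
TO_CELSIUS = {
    "degC": lambda v: v,
    "degF": lambda v: (v - 32.0) * 5.0 / 9.0,
    "K": lambda v: v - 273.15,
}


def harmonize(records):
    """Convert (value, unit) temperature records to Celsius; reject unknown units."""
    converted = []
    for value, unit in records:
        if unit not in TO_CELSIUS:
            # Failing loudly is safer than silently mixing units.
            raise ValueError(f"unknown unit: {unit!r}")
        # Round to an assumed common reporting precision of 0.01 degree.
        converted.append(round(TO_CELSIUS[unit](value), 2))
    return converted


if __name__ == "__main__":
    print(harmonize([(10.0, "degC"), (50.0, "degF"), (283.15, "K")]))
```

Rejecting an unrecognized unit instead of passing the value through is a deliberate choice: an undetected unit mismatch is harder to find once the datasets have been merged.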

ASSIMILATION IN MODELS
The work of meteorologists is a good illustration of how models are used: weather experts feed a variety of environmental parameters into numerical models to produce the best possible forecasts. In this process, the data assimilation cycle incorporates in situ observations into the models to fine-tune the forecasts. This is how, for instance, the coupled water-atmosphere model developed by Saucier et al. [2, 3] can produce surface current forecasts for the Estuary and Gulf of St. Lawrence [4].
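The idea behind an assimilation step can be sketched for a single scalar quantity. The Python function below is the textbook variance-weighted update, in which the forecast is nudged toward the observation by a gain that depends on how uncertain each one is. It is not the scheme used in the model of Saucier et al., and the numbers in the usage example are hypothetical.

```python
def assimilate(forecast, forecast_var, observation, observation_var):
    """Blend a model forecast with an observation, weighting each by its error variance.

    Returns the corrected value (the "analysis") and its error variance.
    """
    # Gain near 1: trust the observation. Gain near 0: trust the forecast.
    gain = forecast_var / (forecast_var + observation_var)
    analysis = forecast + gain * (observation - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


if __name__ == "__main__":
    # Hypothetical surface current speeds in m/s, with error variances in (m/s)^2.
    value, variance = assimilate(
        forecast=0.50, forecast_var=0.04,
        observation=0.62, observation_var=0.01,
    )
    print(value, variance)
```

Because the observation is assumed to be more accurate than the forecast in this example, the analysis lands closer to the observed value, and its variance is smaller than either input variance.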

  1. Ludäscher, B., K. Lin, S. Bowers, E. Jaeger-Frank, B. Brodaric and C. Baru. 2005. Managing Scientific Data: From Data Integration to Scientific Workflows. 21 p.
    http://users.sdsc.edu/~ludaesch/Paper/gsa-sms.pdf
  2. Saucier, F.J., F. Roy, S. Senneville, G. Smith, D. Lefaivre, B. Zakardjian and J.-F. Dumais. 2009. Modélisation de la circulation dans l'estuaire et le golfe du Saint-Laurent en réponse aux variations du débit d'eau douce et des vents. Revue des sciences de l'eau / Journal of Water Science, vol. 22, no. 2, p. 159-176.
    http://www.ismer.ca/IMG/pdf/Saucier_et_al_2009_RSE.pdf
  3. Gouvernement du Canada. Environnement Canada, Modélisation.
    https://meteo.gc.ca/model_forecast/model_f.html
  4. St. Lawrence Global Observatory (SLGO). Ocean Forecasts.
    https://slgo.ca/ocean