Wednesday, October 15, 2014

EU Tools for Open Data harmonisation all over Europe

Last week I was in Brussels, invited by the Homer Project (a very interesting European project that aims to harmonise the Open Data offerings of several Mediterranean regions).

My presentation was about European Union tools and cases that address a critical requirement: we need to harmonise our Open Data offerings. Otherwise we will not have a European Open Data market, only a lot of isolated Open Data portals.

Here are my presentation and the full text of my speech.

1.- Introduction
Hello, good morning, I’m Marc Garriga.

I’m the owner of desideDatum Data Company and I’m one of the advisors of the EPSI Platform.

First, I want to thank the Homer Project for giving me the opportunity to be here to present tools and cases of Open Data harmonisation all over Europe.

I have only 20 minutes, so let's get to the point.

2.- Why Open Data harmonisation?
Last Saturday I was in Granollers, a medium-sized city close to Barcelona, Spain.

I gave a talk about the importance of Open Data and its benefits to society (including business benefits). I used this definition of Open Data.

It was an introductory talk, not an in-depth one, so I only presented the rosy side of Open Data.

But, as you know, the reality is quite hard.

3.- Why Open Data harmonisation?
The hard reality is: Building open data services is, still, a pain.

Nowadays there are more than 130 Open Data portals in Europe; this is a lot of open data available (even though reusers want more open data).

There is a lot of data available, but building Open Data services is very difficult.

Especially services that use data from different Open Data portals.

Let’s do a test: which services (that use data from different portals) do you know?

[waiting for answers from the audience]

There are services that use data from different Open Data portals, but they are still few.

One of them is EuroAlert, a service that alerts you about relevant tenders from European governments.

The manager of this company (José Luis Marín) told me that it is very difficult to build a pan-European service with data from different portals: there are different languages, different data structures, different ways to tender, different cultures... these are big problems that are hard to solve.

But, we can solve (or try to solve) these problems: we "only" need to harmonize the different Open Data offerings.


This list is the result of a discussion held at a conference about the reasons for the low reuse of Open Data. As you can see, some of them are related to harmonization.

If you want more info, here you have it (in Spanish).

4.- The Open Data España Decalogue
One of the first initiatives to harmonise Open Data offerings was the Open Data España Decalogue.

Open Data España was a group of Spanish governments (national, regional and local) that had an Open Data service. Unfortunately, this group has since been dissolved.

This is a decalogue with eleven points, instead of ten (!?) (What is the word for a list of eleven points? I don't know it in Spanish or in English) ;-)

This is a decalogue of the most important issues that every government should keep in mind to offer a good Open Data service.

The first point is very clear; we need to harmonize the Open Data offerings among all Spanish Open Data Portals.

This was the main reason for this group's existence: to harmonize the offerings. But, as I said before, the group failed to reach this goal.

There are other groups in Europe with this main goal, for instance, Open Data France.

5.- Open Data France
Open Data France is a group of French governments that have an Open Data service. This group is active and has several projects to harmonise the Open Data offerings among French portals.

A few of them concern setting common taxonomies among different portals. For instance, the agenda of the next meeting of this group contains several discussions about the use of standard vocabularies in these topics:
•    Public spending (following the Open Spending initiative).
•    Mapping public amenities in the cities.
•    Cultural data (especially calendar data).
•    Waste data.

Reaching agreements on vocabularies is an excellent way to harmonize.

6.- The most successful tool: DCAT
I think the most successful tool to harmonise Open Data offerings is DCAT (it stands for Data Catalog Vocabulary).

As you know, every Open Data Portal has its own catalog.

And every catalog has a huge number (I hope so!) of datasets.

And every dataset can have one or more distributions (that is, different formats in which the data is provided).

We need to describe these elements (and the relationships among them) in a formal way, in a semantic way.

The main goal of DCAT is to propose a unified format for publishing the contents of (open) data catalogues.
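
The catalog, dataset and distribution hierarchy described above can be sketched in a few lines of Python. This is only a minimal illustration, not an official DCAT implementation: the class names, the example portal and the URLs are invented, and the output is a simplified JSON-LD-style document that uses the real DCAT and Dublin Core term names (dcat:Catalog, dcat:Dataset, dcat:Distribution, dct:title and so on).

```python
import json
from dataclasses import dataclass, field
from typing import List

# Namespace prefixes used by DCAT and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


@dataclass
class Distribution:
    """One concrete way of obtaining a dataset (for example a CSV download)."""
    access_url: str
    media_type: str

    def to_jsonld(self) -> dict:
        return {
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": self.access_url},
            "dcat:mediaType": self.media_type,
        }


@dataclass
class Dataset:
    """A collection of data that can have one or more distributions."""
    title: str
    description: str
    distributions: List[Distribution] = field(default_factory=list)

    def to_jsonld(self) -> dict:
        return {
            "@type": "dcat:Dataset",
            "dct:title": self.title,
            "dct:description": self.description,
            "dcat:distribution": [d.to_jsonld() for d in self.distributions],
        }


@dataclass
class Catalog:
    """The catalog of one Open Data portal."""
    title: str
    datasets: List[Dataset] = field(default_factory=list)

    def to_jsonld(self) -> dict:
        return {
            "@context": CONTEXT,
            "@type": "dcat:Catalog",
            "dct:title": self.title,
            "dcat:dataset": [ds.to_jsonld() for ds in self.datasets],
        }


# Hypothetical example data: the portal, dataset and URLs are invented.
catalog = Catalog(
    title="Example City Open Data Portal",
    datasets=[
        Dataset(
            title="Public libraries",
            description="Location and opening hours of public libraries.",
            distributions=[
                Distribution("https://data.example.org/libraries.csv", "text/csv"),
                Distribution("https://data.example.org/libraries.json", "application/json"),
            ],
        )
    ],
)

print(json.dumps(catalog.to_jsonld(), indent=2))
```

If every portal published its catalog with the same terms, a reuser could read all of them with the same code, which is the point of a shared vocabulary.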

7.- The European solution: DCAT-AP (1/2)
Thanks to an initiative of DG-Connect, the EU Publications Office and the ISA Programme, there is a European specification of DCAT to describe public sector datasets in Europe: the DCAT-AP.

This is an Application Profile of DCAT: a European specification that reuses the terms of DCAT and adds the characteristics of European portals.

8.- The European solution: DCAT-AP (2/2)
This is the UML Diagram of the classes and properties included in the DCAT-AP.

Essentially, DCAT-AP defines three kinds of meta information: mandatory, recommended and optional.

For instance, for datasets, it defines as mandatory: title, description, theme and organization. As recommended: label(s), licence and contact. As optional: date of last update, frequency of update, geographic scope, etc.

Then, there is other meta information for distributions.

Of course, the ideal situation is to have all the meta information, but you need to balance the benefits of having more meta information against the cost of keeping it updated.
The minimum level is the mandatory meta information.
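
The three levels can be turned into a simple check that a portal could run on each dataset description. The sketch below is only an illustration: the field names are simplified labels taken from the lists in this talk, not the exact property names of the DCAT-AP specification (which should be consulted for the authoritative lists), and the example dataset is invented.

```python
# Field names follow the lists given in this talk; they are simplified labels,
# not the exact property names of the DCAT-AP specification.
MANDATORY = ("title", "description", "theme", "organization")
RECOMMENDED = ("labels", "licence", "contact")
OPTIONAL = ("last_update", "update_frequency", "geographic_scope")


def missing_fields(metadata: dict, wanted: tuple) -> list:
    """Return the fields in `wanted` that are absent or empty in `metadata`."""
    return [name for name in wanted if not metadata.get(name)]


def check_dataset(metadata: dict) -> dict:
    """Report which meta information is missing at each of the three levels."""
    missing_mandatory = missing_fields(metadata, MANDATORY)
    return {
        # A description is acceptable only if no mandatory field is missing.
        "valid": not missing_mandatory,
        "missing_mandatory": missing_mandatory,
        "missing_recommended": missing_fields(metadata, RECOMMENDED),
        "missing_optional": missing_fields(metadata, OPTIONAL),
    }


# Hypothetical dataset description, invented for illustration.
dataset = {
    "title": "Public libraries",
    "description": "Location and opening hours of public libraries.",
    "theme": "Culture",
    "organization": "Example City Council",
    "licence": "CC BY 4.0",
}

report = check_dataset(dataset)
print(report)
```

In this example the dataset reaches the mandatory level, but the report still lists the recommended labels and contact as missing, so the publisher can decide whether the extra effort is worth it.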

9.- The Spanish Solution: NTI
The Technical Standard for Interoperability (in Spanish "Norma Técnica de Interoperabilidad", NTI) provides the set of basic rules for reusing information resources produced or held by the public sector.

It is another Application Profile of DCAT (it predates DCAT-AP), and it is the recommended technical reference for implementing open data initiatives in any Spanish public administration or public entity.

Common conditions of selection, identification, description, format, terms of use and provision of data produced or held by the public sector are set by the NTI.

Its main problem is that it was created after most Spanish Open Data portals were launched, so it is currently used in only a few portals. Nevertheless, it is a very useful tool.

There are other Application Profiles across Europe: for instance, there is one in Austria, another one in Germany, etc.

10.- A Spanish case of Federated Open Data Portals
Thanks to the NTI, the Spanish National Government started a project to federate all Open Data portals in Spain. You should know that nowadays there are almost 30 Spanish Open Data portals; therefore, it would be very helpful to have a federation service to avoid having 30 Open Data silos.

This project makes that a reality.

It's not easy to federate Open Data portals... there are technical issues and, especially, political issues.

That’s why there are currently only 14 federated governments. You can search any data from these entities at a common point: the Spanish national Open Data portal.

I think this is a "little Europe" case, so the challenges faced here will be more or less the same as those of the future federation of all European Open Data portals.
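
The idea of federation can be illustrated with a small sketch: the catalogs of several portals are merged into one list that can be searched from a single point. This is not how the Spanish national portal is actually implemented; the portal names and datasets are invented, and the catalogs are plain in-memory lists instead of being harvested over the network. The sketch only works because every catalog uses the same "title" and "description" keys, which is exactly what a common profile such as the NTI is meant to guarantee.

```python
def harvest(portals: dict) -> list:
    """Merge the catalogs of several portals, remembering where each dataset came from."""
    federated = []
    for portal_name, datasets in portals.items():
        for ds in datasets:
            federated.append({**ds, "source_portal": portal_name})
    return federated


def search(federated: list, term: str) -> list:
    """Case-insensitive search over the title and description of every dataset."""
    needle = term.lower()
    return [
        ds for ds in federated
        if needle in ds["title"].lower() or needle in ds["description"].lower()
    ]


# Hypothetical catalogs, invented for illustration.
portals = {
    "Region A": [
        {"title": "Bus stops", "description": "Location of bus stops."},
    ],
    "City B": [
        {"title": "Libraries", "description": "Public libraries."},
        {"title": "Bus timetables", "description": "Timetables of urban bus lines."},
    ],
}

federated = harvest(portals)
for hit in search(federated, "bus"):
    print(hit["source_portal"], "-", hit["title"])
```

A single search for "bus" returns datasets from both portals, each one labelled with the portal that publishes it.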

11.- European Projects: Homer and CitySDK
Regarding European Projects, there are some projects about harmonisation.

Obviously, the most specific project about harmonisation is the Homer Project. After my presentation, Luca Guerretta will go through this project in depth.

The other project in this slide is the CitySDK project. The main goal of this project is to harmonize application programming interfaces (APIs) across eight European cities in three main topics: “citizen participation”, “mobility” and “tourism”.

The results are three definitions of APIs that these cities decided to harmonise among them.

This is not strictly Open Data, but it's very close to it.

12.- European Projects: iCity
iCity is another European project. Its main goal is to facilitate the opening of Public Information Systems, in order to offer functionalities (not only data) of these information systems to external organizations.

This project develops a platform that provides harmonised APIs (a few of them from the CitySDK project) to access many information systems of several European cities.

For instance, this is a real case of what you can do via iCity: an enterprise can develop a service (in this case an app about incidents and complaints) once and run it for several cities at the same time.
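
Why a harmonised API lets you develop only once can be shown with a short sketch. It does not use the real iCity or CitySDK APIs: the "/issues" path, the payload keys and the city URLs are all hypothetical, and the code only builds the requests instead of sending them. The point is that the same function serves every city, and only the base URL changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """An incident or complaint reported by a citizen."""
    category: str
    description: str
    latitude: float
    longitude: float


def build_report_request(base_url: str, issue: Issue) -> dict:
    """Build the HTTP request that reports an issue to one city.

    The "/issues" path and the payload keys are hypothetical; with a
    harmonised API they would be identical for every participating city.
    """
    return {
        "method": "POST",
        "url": base_url.rstrip("/") + "/issues",
        "json": {
            "category": issue.category,
            "description": issue.description,
            "location": {"lat": issue.latitude, "lon": issue.longitude},
        },
    }


# Hypothetical base URLs: only this table changes from city to city.
CITY_ENDPOINTS = {
    "city-a": "https://api.city-a.example/v1/",
    "city-b": "https://api.city-b.example/v1",
}

issue = Issue("street lighting", "Broken street lamp", 41.39, 2.16)
for city, base_url in CITY_ENDPOINTS.items():
    request = build_report_request(base_url, issue)
    print(city, request["method"], request["url"])
```

Without harmonisation, each city would need its own version of build_report_request, with its own paths and field names.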

This project is usually defined as the next step of Open Data: Open Information Systems.

I encourage you to visit the website of this project.

13.- Legal Harmonisation
Another aspect to harmonize is the legal one.

You can see me in this picture taken two years ago in Budapest; I was there at a LAPSI seminar to explain an initiative to have a single license for Open Data across Europe.

At that time, there were more or less 50 Open Data portals in Europe... and almost 50 different licenses to use this data. Therefore, developing a service that uses data from more than one portal was very difficult from a legal point of view.

Three months ago, the Official Journal of the European Union published these guidelines regarding licenses. I think it's a good step in the right direction.

14.- Last Slide
There are other initiatives that help to achieve a better level of Open Data harmonisation: taxonomies, guidelines, projects, etc.

But I think I've given a fairly complete view of the harmonisation situation in Europe. My conclusion is that we have come a long way, but there is still a long way to go…


Please, if you have any comments, don't hesitate to write them here.

I want to thank Míriam Alvarado, Claire Gallon and Sergi Amigó, who helped me with this presentation.

The image of this post was made by Miguel González-Sancho.
