In the first of a series of discussions with leading figures in the world of libraries and metadata, BDS talks to Dr Lars G. Svensson, a world expert on linked data and the Semantic Web, and asks him about his vision of the future for libraries…
BDS: Can we start this discussion by asking you, Lars, about your background?
Lars Svensson: That’s an extremely diverse question because I have an extremely diverse background. After school I had the choice between studying sacred music or mathematics, and I went for mathematics, which proved too dry. So I took time out and worked for two years in an oil refinery. After that I moved from my home in Sweden to Germany, deciding I wanted to do something completely different. I started a PhD in library history with the goal of eventually becoming a librarian, but I wasn’t accepted for library training. Instead I was retrained as a software developer and worked for two years writing e-commerce applications, but then came the dot-com crash and I lost my job. However, at exactly the same time, a position at the German National Library (DNB) became available that required someone between a librarian and a developer. I applied, got the job, and the initial two-year project was extended into a long-term position.
During those early years I became interested in something I had read about called the Semantic Web, which I couldn’t make sense of at all. It wasn’t until I came into a library, really working with metadata, that I saw that we have the titles here and the authorities there, and they are linked together with typed relations. I realised I had read about this… this is called the Semantic Web. I asked, “Why aren’t libraries contributing to the Semantic Web? The data is predestined for it.” I went on to suggest that we should try it. So in 2010 we ventured into our first linked data project here in the DNB, starting with the authorities, which are much easier than bibliographic data. Since then we have built it up, revamping the service three or four times, taking a great deal of inspiration and encouragement from the Swedish National Library, which in 2008 was the first to publish its entire union catalogue as linked data.
BDS: We hear the term “linked data” with ever-increasing regularity, so much so that it has become a buzzword, but what exactly is linked data?
Lars Svensson: Linked data is a design principle. It is a set of technologies working together. It goes back to the famous article by Tim Berners-Lee, James Hendler and Ora Lassila written in 2001, where they first coined the term “Semantic Web.” This was something that never really took off because it was seen as too rigid and formal, and no one ever showed that the large data sets made sense outside of the specific communities that had created them, to a large extent because the data resided in silos. So, at that time, it never went in the direction of a data-driven infrastructure, which was what Berners-Lee really wanted to have. After a few years, it was realised that what was needed was to start linking things together, and so they asked people to publish data on the Web using the four linked data principles. These define how to identify things and how to provide useful information about them; the fourth, in particular, is to supply links to other data sets so that a person or a machine can see how data sets in the world link together. That is the key point of linked data: providing links to the outside world.
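The principles described above can be illustrated with a minimal sketch. It is not the DNB's implementation; the data set prefixes (`example.org`, `example.net`), the records, and the helper names are all hypothetical, and plain strings stand in for a real RDF model. It shows things named with HTTP URIs, a lookup that returns statements about a URI, and the fourth principle: finding the links that point outside the local data set.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, typed relation (predicate), object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical data set prefixes, used for illustration only.
LOCAL = "http://example.org/"   # the library's own data set
OTHER = "http://example.net/"   # some external data set

# Widely used vocabulary terms for the typed relations.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DCT_CREATOR = "http://purl.org/dc/terms/creator"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Invented records: a title, its authority record, and one external link.
TRIPLES = [
    Triple(LOCAL + "title/1", RDFS_LABEL, "An Example Title"),
    Triple(LOCAL + "title/1", DCT_CREATOR, LOCAL + "authority/42"),
    Triple(LOCAL + "authority/42", RDFS_LABEL, "Example, Author"),
    Triple(LOCAL + "authority/42", OWL_SAME_AS, OTHER + "person/7"),
]


def is_http_uri(value: str) -> bool:
    """Simplification: treat any string starting with http(s):// as a URI."""
    return value.startswith(("http://", "https://"))


def describe(uri: str, triples: list[Triple]) -> list[Triple]:
    """Return the statements about a URI, as a lookup of that URI might."""
    return [t for t in triples if t.subject == uri]


def outbound_links(triples: list[Triple], local_prefix: str) -> list[Triple]:
    """Return statements whose object is a URI outside the local data set."""
    return [
        t for t in triples
        if is_http_uri(t.obj) and not t.obj.startswith(local_prefix)
    ]


if __name__ == "__main__":
    for link in outbound_links(TRIPLES, LOCAL):
        print(link.subject, "->", link.obj)
```

In this sketch only the `owl:sameAs` statement counts as a link to the outside world; the title-to-authority relation stays inside the local data set.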
BDS: So the links have to exist to constitute linked data, but how are they created?
Lars Svensson: The one we are most used to in libraries is the manual creation of links. Links to a work, from the publication to an author, to topical subject headings, to events, whatever. Serials are also linked together; this is manual work. There are also projects like MACS (Multilingual Access to Subjects), which aimed to link together subject headings from different thesauri in order to provide multilingual search options. Then there are machine-generated links, which require some sort of algorithm that can decide “how do I match those things together?” This requires controls around what maps onto what and to what extent