THOR’s last hurrah

Project THOR is coming to a close. Our final event was held in Italy on 15 November at ‘La Sapienza’ – the University of Rome – just a stone’s throw from Michelangelo’s impressive sculpture of Moses in the church of San Pietro in Vincoli. The day combined a retrospective review of THOR’s achievements and impact with a forward-looking perspective on the wider persistent identifier (PID) landscape.

With help and advice from CINECA’s Paola Gargiulo, the event attracted an audience of Italian researchers from a wide range of disciplines, who joined delegates from THOR partner organisations. Adam Farquhar of the British Library followed Paola’s welcome with an overview of THOR’s mission to embed PIDs within the heart of scholarly communications.

Key notes

Herbert van de Sompel, one of our two keynote speakers and Digital Library Research & Prototyping Team lead at Los Alamos National Laboratory, presented important research on maintaining the integrity of hyperlinks to managed collections such as journal repositories. ‘Link rot’ – broken links – and ‘content drift’ – where a link works but takes you to something other than the original content – are both problems, albeit less so than for the web at large. Solutions suggested for the scholarly web included using metadata that describes the relationships between links, drawing on resources such as signposting.org.

Photo: Herbert van de Sompel giving keynote on achieving link integrity

Our second keynote talk was given by Fiona Murphy, an independent research data and publishing consultant and excellent THOR Ambassador. She has been considering the question of what scholarship would look like had it been digital from inception, with PIDs integral to the process. The Matrix of the Commons – found at www.scholarlycommons.org – is the result of work on how to get us from here to there, via a guiding set of decision trees on how to treat a range of digital entities appropriately and consistently. Important take-home messages were that we should all assume 1) that our tools will be used in conjunction with other systems and 2) that they should be built for drivers, not mechanics.

THOR round-up

Between keynote talks, THOR partners presented highlights from the project deliverables over the course of two sessions. In the first, Maaike Duine (ORCID) summarised THOR’s communications activities and Ambassador programme. Elizabeth Hull presented DRYAD’s perspective on linking data and publications. Martin Fenner (DataCite) looked at claiming workflows to ORCID. Robin Dasler spoke both about analysis preservation at CERN and THOR’s analyses of PID service adoption. Robert Petryszak (EMBL-EBI) talked on how to realise the full potential of PIDs in the biosciences, and the session closed with a demo of dynamic data identification in the earth and environmental sciences from Markus Stocker (PANGAEA).

Image: PID service adoption study

After lunch, Tom Demeranville (ORCID) gave updates on developing tools for federated identity management and better use of organisation identifiers. Martin Fenner returned to share outcomes from a joint THOR-OpenAIRE workshop on article-data linking. Angela Dappert (British Library) considered the challenges of embedding persistent identifier services within the humanities. Sünje Dallmeier-Tiessen (CERN) fed back from a high-energy physics community workshop. Lastly, Maaike Duine reported on insights gathered from a focus group meeting to envision the ‘ideal PID world’ for publishing workflows.

Points for discussion

It wouldn’t have been a THOR event without the opportunity for everyone to participate, and the mid-afternoon panel session provided a dedicated forum. Panellists Hannah Hope (Wellcome Trust), Clifford Tatum (CWTS), Andres Mori (Digital Science), Erika Bilicsi (Library and Information Centre of the Hungarian Academy of Sciences) and chair Adam Farquhar addressed comments and questions from the room.

Photo: Discussion panel on PID use in different communities

Discussion points included the relative underuse of funder IDs; the current difficulty of capturing full information for works involving multiple funders was identified as a possible stumbling block. A question on how best to approach differences of opinion, e.g. between co-authors faced with article or data retraction, provided useful food for thought. Changing our general view of retraction was suggested as a potential way forward, given that not all retractions are equal and the reasons may well be mundane rather than issues of capability. Also covered were the concept of PIDs as a compromise between the readability needs of humans and machines, and the importance of working out which features of an object are the most critical to identify.

Important take-homes from the day were that the PID space is developing at an active and encouraging pace and that we should stay alert to the idea that, in terms of policy, one size may not fit all.

Finally, Simon Lambert (STFC) looked ahead to the FREYA project, which will follow on from THOR, starting on 1 December 2017. One of its key strands is to achieve long-term sustainability for the outstanding progress and development that began with ODIN – THOR’s predecessor – and advanced significantly during the 30 months of THOR.

Want to know more?

Presentation slides from the day are available for download:

PIDs in Poland: let’s link research!

The ongoing drive within the THOR project to identify and connect the research landscape reached Warsaw on Monday 24 April. Organised in collaboration with Crossref, the workshop focused on the ways in which persistent identifier (PID) services, such as those provided by members of the THOR consortium and Crossref, can represent ‘much more than infrastructure’ by ‘working together to connect research’. Hosted by the Digital Humanities Centre at the prestigious Institute of Literary Research of the Polish Academy of Sciences, the workshop brought together a packed audience of publishers, data managers, researchers, librarians and administrators for a day of knowledge-sharing and discussion centred on increasing access to research output.

Professor Łukasz Szumowski (Under Secretary of State with the Ministry of Science and Higher Education) opened the event with a recognition of just how quickly the digital world is changing. He stressed the need for developing new mechanisms in bibliometrics to enable objective evaluation that can guide public funding of research.

Professor Paweł Rowiński also extended a welcome as Vice President of the Polish Academy of Sciences, home to 69 institutions spanning a multitude of disciplines. Introducing a thread that ran throughout the day, Rowiński highlighted the fact that persistent identifiers not only make research more accessible, they can provide an incentive for scientists to share data, safe in the knowledge that their achievements will be more visible and attributed to them.

In the morning sessions, Rachael Lammey (Crossref), Ginny Hendricks (Crossref), Josh Brown (ORCID), Laura Rueda (DataCite), Rachael Kotarski (British Library) and I (Ade Deane-Pratt, ORCID EU) gave an overview of the persistent identifier landscape and the services that are being developed to support them as this landscape evolves. Crossref recently gained 26 new members from Poland alone, making Poland one of its fastest-growing countries.

Some common themes emerged from the presentations and discussion: achieving persistence is a process, one that involves constant evaluation and adaptation. The challenges can be highly domain specific, and a number of questions also remain unresolved.

But permissions and privacy are key. Services such as EThOS, the British Library’s repository of doctoral theses, can make it easier to track career paths, but at the same time throw up the challenge of claims for legacy theses.

And more broadly: is it possible to enact a cultural shift away from the citation of physical objects to their digital representations? When is it appropriate to do so?

The afternoon sessions saw some interesting case studies from Polish industry and academia, and some robust discussion, with contributions from Dr Eng. Jakub Koperwas (Warsaw University of Technology), Marcin Werla (Poznań Supercomputing and Networking Centre) and Dr Marta Hoffman-Sommer (Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, RepOD Repository for Open Data, OpenAIRE NOAD for Poland).

We heard about the effort to build from scratch a university knowledge base with clear and consistent metadata that semantically links the full spectrum of academic activity, encompassing conception and funding, the research process, publications, implementations, practical applications, patents and results. The motivation was that it should be possible, for example, as a researcher, manager, funder, administrator or librarian, to interrogate the system to find experts in a field. We also heard about the work and challenges involved in providing an infrastructure – the PIONIER Network in this case – to support research via PID uptake. An outstanding question is how to prevent the duplication of DOIs assigned to the same object.

During the discussion that closed the day, we heard from both panel and audience on what the future of research communication should look like. With the event coming hot on the heels of the deadline for contributions to a new Polish national research evaluation exercise, the topics of making research communication more effective, and capturing and sharing information were naturally of real significance to the room. One thing was abundantly clear: persistent identifiers are integral to that future.

This timely meeting was just one strand of ongoing work to improve scholarly infrastructure and make the research landscape fit for purpose in the 21st century. You can download slides from the day here: https://zenodo.org/communities/2017-24-04-warsawmeeting/?page=1&size=20. And you can keep abreast of future events at our website.