Hasta la Vista, THOR Bootcamp

With local support from THOR ambassador Eva Mendez, the first edition of the THOR Bootcamp was successfully carried out in Madrid, at Universidad Carlos III de Madrid on November 16-18. The Bootcamp is part of THOR’s outreach effort to engage and train local scholarly communication communities to further adoption of PID services. The full set of slides used can be found on the THOR Knowledge Hub.

THOR colleagues from different partner organizations and guest speakers from local research organizations joined forces to present a full curriculum on PID topics, from existing tools and services to technical and policy implementation. The event attracted more than 130 registrants in total and yielded valuable experience for both the attendees and the THOR project.

The Bootcamp consisted of three modules, each tailored to a different audience. The first half-day was organized as an integral part of research training for Ph.D. students and other young researchers at UC3M, focusing on Open Science recommendations and the incorporation of PIDs into existing research workflows. Students came from different disciplinary backgrounds and brought distinct questions with them, so the Bootcamp gave us a great opportunity to engage the community of young researchers and address their concerns directly.

“I consider the instruments presented along with the seminar an extremely powerful way to collect, share, exploit and advertise the work of a researcher in a way which is mostly new and free from older constraints. The value of the research itself is so, enhanced and collaborations are made way easier in benefits of the results.”

— Rocco Bombardieri, Ph.D. student at UC3M

Ph.D. Students attending THOR Bootcamp at UC3M

The second day was reserved for local information professionals and research data service stakeholders (librarians, researchers, research administrators and policy makers). They followed an intense schedule of talks and a mini panel in which ORCID, DataCite and CERN shared their experience of service implementation.

Local information professionals at THOR Bootcamp General Day at UC3M

The final half-day offered a more technical tutorial. The self-contained programming module enabled participants to build a metrics dashboard that visualized data interactively, based on the technology used in the THOR dashboard. The hands-on session was designed for non-technical and technically savvy attendees alike, and it was great to see how people from a variety of technical backgrounds approached the tutorial and contributed to the ensuing discussion.

Instructors of the Hands-on Day, Ioannis Tsanaktsidis (left) and Kristian Garza (right).

We aim to establish ties with research organizations and institutions by providing tailored PID content via the Bootcamp series — two more Bootcamps will be held in March and May next year (2017). Stay tuned to find out if we are coming to your neighborhood soon! Or better yet, if you want to organize your own Bootcamp, sign up to be an ambassador and we will provide all the materials that are ready to be reused, plus event planning tips for bringing your local community up to speed with PIDs.

THOR at PIDapalooza

If November taught us anything, it’s that open identifiers clearly do deserve their own festival. On 9th and 10th November 2016, people from all over the world gathered in Reykjavik to share PID stories, demos, use cases, victories, horror stories, and new frontiers at PIDapalooza, the first conference dedicated to PIDs. The THOR team travelled to the country of glaciers and volcanoes to talk about project identifiers, persistent identifiers for instruments, PIDagogy and measuring PID adoption.

PIDs for Projects

Martin Fenner (DataCite) and Tom Demeranville (ORCID) presented their work on project identifiers to a full house. They proposed that project IDs should be used to link participants, outputs and funding. But the most suitable identifiers to describe projects? That was left open for discussion – a discussion that quickly turned heated. What, even, is the exact definition of a project? What would persist if the project ends? Would researchers be willing to share the information needed for the project ID? How would we describe the metadata, given that a project does not have a publication date? Clearly more research needs to be done to answer these important questions. Keep an eye out for the announcement of a THOR webinar on project identifiers, to be held in early 2017, in which we will resume this discussion.

Tom Demeranville leading the discussion on PIDs for projects

Persistent Identification of Instruments

Markus Stocker (PANGAEA) continued to explore new frontiers with a presentation on PIDs for instruments, instrument platforms and their deployments. Beyond enabling the unambiguous identification of these entities as well as reference to them in articles and other research artefacts, Markus suggested that preserving metadata about these entities is critical for researchers to judge the fitness of observation data for reuse. He presented two examples of systems that already assign DOIs to deployments and platforms. A key challenge for the community is to decide on the required metadata for preservation.

Twitter activity during Markus Stocker’s presentation on PIDs for instruments

The Human Perspective

Building the technical infrastructure for open research was a clear theme at the conference, but how do we move from infrastructure to adoption? How do you teach, learn, persuade, discuss and grow the uptake of PIDs in everyday research practice? My presentation showcased the contribution that the THOR ambassador network is making to the human infrastructure around PIDs. By organising training activities within their own communities and sharing training materials, THOR ambassadors are helping to overcome the cultural barriers to PID adoption. These forms of collaboration are not only critical between THOR partners and ambassadors, but need to extend to other organisations and projects in order to integrate PIDagogy within the Research Data Science Curriculum. The importance of communication was also reiterated in other sessions on PIDagogy, in which participants designed infographics to promote and explain PIDs to different stakeholder groups. These materials will be developed further and made available for the community to (re-)use.

PIDapalooza crowd developing videos, infographics and quizzes for PID adoption

Challenges of Measuring PID Adoption

Salvatore Mele (CERN) discussed the challenges of measuring PID adoption. THOR has already developed a comprehensive dashboard, which shows ORCID and DOI uptake over time. But the ways in which we evaluate and interpret the results remain open for discussion. Salvatore explained that it is difficult not to get philosophical when talking about measurement of PID uptake. What information is missing? What do we not (yet) know? And what further steps can we take to know the unknowable?

Salvatore Mele explaining the THOR Dashboard

PIDapalooza definitely generated as many questions for THOR as we brought to the table. Participating and presenting at this event was a great opportunity for the team to discuss ideas and generate more thought for further research and future collaboration, complementing the PID frontiers already being explored by other organisations. And yes, THOR definitely believes identifiers deserve their own festival and is looking forward to PIDapalooza 2017!

Want to know more about PIDapalooza?

Identifying Interpretation

Scientific research infrastructures collect large quantities of values. Values are typically numbers that result from observation, experiment, or computing activities. For instance, plant scientists collect values that result from observing fluxes of carbon dioxide on the leaf-atmosphere boundary; high-energy physicists collect values that result from observing collisions of atomic particles; social scientists observe the interactions of human populations and individuals, collecting values both qualitative and quantitative.

The interpretation of values is central to research investigations. With interpretation, values are given meaning in the context of investigations. The result of interpretation activities is information, and research infrastructures integrate information into existing bodies of knowledge. Therefore, research infrastructures are knowledge infrastructures or “robust networks of people, artifacts, and institutions that generate, share, and maintain specific knowledge about the human and natural worlds” (Borgman, 2015).

At the International Workshop on Reproducible Science, we presented the possibility of aggregating machine readable information in Research Objects. We have proposed to extend a Research Object Model (Belhajjame et al., 2012) with a new Resource called Interpretation. Existing Resource types include Dataset, Software, and Paper. In our proposal, machine readable interpretations are additional research artefacts created in scientific investigations. Research Objects thus also capture the interpretations given to observational, experimental, or computational values in research investigations.
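As a rough illustration of the proposal, the sketch below models a Research Object that aggregates typed Resources, with Interpretation added alongside Dataset, Software and Paper. It is not the Research Object Model itself: the class names, the `statement` field and the example values are all assumptions made up for this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ResourceType(Enum):
    # Dataset, Software and Paper are the existing types named in the text;
    # Interpretation is the proposed addition.
    DATASET = "Dataset"
    SOFTWARE = "Software"
    PAPER = "Paper"
    INTERPRETATION = "Interpretation"


@dataclass
class Resource:
    resource_type: ResourceType
    title: str
    # Hypothetical field: a machine readable statement, used here only
    # for Interpretation resources.
    statement: Optional[dict] = None


@dataclass
class ResearchObject:
    title: str
    resources: List[Resource] = field(default_factory=list)

    def aggregate(self, resource: Resource) -> None:
        """Add a resource to this Research Object."""
        self.resources.append(resource)

    def interpretations(self) -> List[Resource]:
        """Return only the aggregated Interpretation resources."""
        return [r for r in self.resources
                if r.resource_type is ResourceType.INTERPRETATION]


def build_example() -> ResearchObject:
    """Build a small Research Object with invented, illustrative content."""
    ro = ResearchObject(title="Leaf-level carbon dioxide flux study")
    ro.aggregate(Resource(ResourceType.DATASET, "Observed flux values"))
    ro.aggregate(Resource(ResourceType.SOFTWARE, "Analysis scripts"))
    ro.aggregate(Resource(
        ResourceType.INTERPRETATION,
        "Detected uptake event",
        statement={"event": "uptake", "duration_hours": 3.5},
    ))
    return ro
```

Calling `build_example().interpretations()` returns the single Interpretation resource, which is the part of the investigation that would otherwise live only in prose.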

Just as with other research artefacts, interpretations could be unambiguously and persistently identified in a global way. DataCite digital object identifiers (DOIs) could be used to enable unambiguous reference to interpretations, and the resolution to human and machine readable interpretation descriptions. The approach would also enable the citation of interpretations, and thus the recognition of contributions toward interpretations. Cross-linking of interpretations with the ORCID iD of contributors would enable unambiguous attribution.
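A minimal sketch of what such cross-linking could look like in practice: it builds the resolver URL for a DOI, checks the ORCID iD check character (ISO 7064 MOD 11-2, as documented by ORCID), and pairs the two in a simple record. The record layout and function names are assumptions for illustration. The DOI is a placeholder, and the ORCID iD is the one ORCID uses for its fictitious example researcher.

```python
from typing import Dict, List


def doi_url(doi: str) -> str:
    """Return the resolver URL for a DOI via the doi.org proxy."""
    return "https://doi.org/" + doi.strip()


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, digits and check character of a hyphenated ORCID iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    base = compact[:15]
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == compact[15].upper()


def attribution_record(doi: str, orcids: List[str]) -> Dict[str, object]:
    """Link an interpretation DOI to its contributors (hypothetical layout)."""
    invalid = [o for o in orcids if not is_valid_orcid(o)]
    if invalid:
        raise ValueError("invalid ORCID iD(s): " + ", ".join(invalid))
    return {
        "interpretation": doi_url(doi),
        "contributors": ["https://orcid.org/" + o for o in orcids],
    }


# Placeholder DOI, not a registered identifier.
EXAMPLE = attribution_record("10.5072/example-interpretation",
                             ["0000-0002-1825-0097"])
```

Validating the check character before linking catches mistyped iDs early, so that an attribution never points at the wrong person.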

Between the numerical values and the abstract high-level information reported in scientific articles, the primary information obtained by interpretation is generally refined into secondary and tertiary information. For example, primary information about individual events occurring in the environment, such as event date, location and duration, may be refined into secondary information about the seasonal mean event duration, and into tertiary information about the statistical significance in difference of seasonal mean duration. Curating such information and its provenance arguably supports the reproducibility of scientific investigations, from numerical values to the natural language text in scientific articles. Advanced knowledge infrastructure may increasingly capture, curate, and provide access to such information, in standardised and unambiguously identified form.
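The refinement chain described above can be sketched in a few lines. Primary information is a list of event records, secondary information is the mean event duration per season, and tertiary information is a statistic on the difference between seasonal means. The event values are invented for illustration. Welch's t statistic stands in for a full significance test here, which would also need degrees of freedom and a p-value.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance
from typing import Dict, List

# Primary information: one record per observed event (invented values).
EVENTS = [
    {"date": "2016-01-10", "season": "winter", "duration_h": 2.0},
    {"date": "2016-01-24", "season": "winter", "duration_h": 3.0},
    {"date": "2016-02-07", "season": "winter", "duration_h": 4.0},
    {"date": "2016-07-03", "season": "summer", "duration_h": 6.0},
    {"date": "2016-07-17", "season": "summer", "duration_h": 7.0},
    {"date": "2016-08-01", "season": "summer", "duration_h": 8.0},
]


def durations_by_season(events: List[dict]) -> Dict[str, List[float]]:
    """Group event durations by season."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for event in events:
        grouped[event["season"]].append(event["duration_h"])
    return dict(grouped)


def seasonal_means(events: List[dict]) -> Dict[str, float]:
    """Secondary information: mean event duration per season."""
    return {season: mean(values)
            for season, values in durations_by_season(events).items()}


def welch_t_statistic(a: List[float], b: List[float]) -> float:
    """Tertiary information: Welch's t statistic for the difference in means."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error
```

Keeping each step as a separate function mirrors the provenance argument: every derived value can be traced back to the records it came from.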

References
Belhajjame, K., et al. (2012). Workflow-Centric Research Objects: A First Class Citizen in the Scholarly Discourse. In Proceedings of the ESWC2012 Workshop on the Future of Scholarly Communication in the Semantic Web (SePublica2012), Heraklion, Greece.
Borgman, C.L. (2015). Big Data, Little Data, No Data: Scholarship in the Networked World. Cambridge, MA: MIT Press. ISBN 978-0-262-02856-1

Persistent Identifier Services for the Humanities

Persistent identifiers (PIDs) are increasingly embedded in the services that researchers use every day, enabling unambiguous attribution of the full range of scholarly outputs. This makes it easier for data producers and researchers to get credit for their contributions; for data centres, universities and funders to track the impact of the research they facilitate; for publishers to incorporate data into scholarly writing; and for researchers to discover and cite data through clear provenance of information and ideas. In short, they support an entirely new research infrastructure.

Within THOR we are working to realise this vision by improving interoperability and integration of PID services, and addressing the cultural barriers to adoption. Now over a year into the project, we have found that uptake in the humanities, in particular, lags behind other disciplines. In response to this, we will be running a series of workshops through which we hope to better understand the potential for persistent identifier services in the humanities, identifying requirements for and barriers to uptake, and creating a roadmap to guide future development.

The first workshop will take place at the British Library on Friday 9 December 2016, in which we will have a focused discussion around the role of PIDs in research using historical sources – fields in which digital data has taken on an increasingly important role. The workshop is by invitation only. However, we’re especially keen to hear from humanities researchers who are working with research data products. If you’re making data available or reusing historical data and are interested in attending, please contact us at events@project-thor.eu for more information.

THOR and the EC Catalogue of Services Framework

In November 2015 the “eInfrastructure” Unit at the European Commission Directorate General for Communication Networks, Content and Technology asked several e-Infrastructure providers to develop a framework for a service portfolio to describe services developed with funding from the directorate. The THOR project participated in the definition of the concepts underlying such a portfolio. The resulting framework can be found here.

One of the goals of THOR is to ensure access to the scholarly record. Research support services are an important component in the overall production of research outputs. They should be preserved, cited, credited, reused and validated just like the other pieces of the research landscape. A service portfolio can play an important role in this.

A central or distributed shared service portfolio can also:

  • assist users by
    • making services easier to discover and compare;
    • making it possible to determine the services’ relevance; and
    • identifying overlapping efforts or gaps in the catalogued service landscape. This is particularly true as the portfolio is to be linked to Key Performance Indicators (KPIs) that enable some evaluation of the services.
  • enable funding bodies and commercial providers to
    • understand needs for and availability, quality and impact of tools;
    • improve the visibility of their investments; and
    • improve their uptake.
  • assist service providers, such as THOR partners, by
    • providing a common interoperable language for our own service descriptions to be shared with others, and, in turn,
    • offering a competitive advantage by being able to showcase our products and services together with other EC-funded service providers.

Together with EGI, EUDAT, GEANT, OpenAIRE, and BlueBRIDGE, we have organised two workshops at which we presented the framework. Our workshop at the EGI annual conference in April 2016 was aimed at sharing our current practices, discussing how to harmonise them, and how they and our framework fit with the FitSM standard for IT service management. At DI4R 2016 in September, we continued the discussion by gathering current user experience and requirements for future portfolio development from different communities. This resulted in a set of recommendations to help shape future activities. The workshop at DI4R also enabled us to explore synergies with the MERIL project, which aims to develop a catalogue of openly accessible European research infrastructures (RIs) across disciplines and countries, and tools to analyse the described resources.

The Catalogue of Services framework can feed into the newly funded eInfraCentral H2020 project. eInfraCentral will develop an implementation of a common service catalogue, not just aimed at researchers, but also at industry, government, educators, and citizens; develop access and monitoring tools; and draw policy lessons.

Science today is “Open Science” − a global collaboration across institutions, borders and disciplines, underpinned by sharing scientific artefacts and resources at a scale hitherto inconceivable. Shared digital services are crucial to its success. They amount to a huge investment which must be responsibly developed. A service portfolio will be an important tool in improving the effectiveness and efficiency of service development and uptake.

Want to know more?
The Catalogue of Services can be found here: https://doi.org/10.5281/zenodo.165467
Example uses are also available here: https://doi.org/10.5281/zenodo.166513

Presentations from DI4R can be found here:
DI4R-JSC4R-2-Dappert-V2.pdf
DI4R-JSC4R-3-MERIL-V4.pdf
DI4R-JSC4R-4-eInfraCentral.pptx

THOR Ambassador Update

On October 13, we held a webinar for our ambassadors to update them on recent THOR activities. Tom Demeranville explained more about ORCID integrations at the THOR disciplinary partners, ORCID Work Identifier types and other recent technological developments, such as DataCite’s Event Data and ORCID Auto Update. A preview of what THOR is planning for the remainder of 2016 and in 2017 was given by Josh Brown. Plans include:

  • Further integration of PIDs in production services
  • Improvement of data citation
  • Continuing the research into the best solutions for missing PIDs.

The webinar slides can be found on the THOR Knowledge Hub.

We rely on our ambassadors to help facilitate and spread discussion about recent PID developments. Some of our ambassadors are very active on Twitter and increase THOR’s visibility by (re-)tweeting. Others spread the word about PIDs amongst their networks in person, promoting their benefits at conferences and workshops. And we are also organising our first bootcamp with one of our ambassadors in Spain. In return, we help you keep up-to-date with recent PID developments through email, webinars and newsletters.

It’s great to see how our ambassadors are contributing to achieving THOR’s mission: connecting people, places and things. And the number of ambassadors is still growing, not just in Europe but on other continents as well. This week, we welcomed another ambassador in Australia. Click here for an overview of our ambassadors. If you’d like to be part of this community, please get in touch. We’ll be organising an informal get-together over lunch on Thursday 10 November at PIDapalooza. If you’ll be attending and would like to find out more about our ambassador programme, please join us!

 

THOR at Digital Infrastructures for Research

In the last week of September 2016, several THOR partners headed to the city of churches, Krakow, to participate in the Digital Infrastructures for Research conference (DI4R). DI4R was organised by Europe’s leading e-infrastructures, EGI, EUDAT, GÉANT, OpenAIRE and the Research Data Alliance (RDA) Europe. Researchers, developers and service providers came together to brainstorm, discuss the adoption of digital infrastructure services and promote user-driven innovation. Adam Farquhar (British Library), Josh Brown (ORCID), Robin Dasler (CERN) and I, Kristian Garza (DataCite), closed the first day of activities with a talk emphasising that PIDs are a set of tools and systems to be integrated and promoted in infrastructures and services for researchers.

Our session was divided into short presentations that showcased how ORCID iDs and DataCite DOIs are integrated into research systems and connected with other platforms. After that, we presented CERN as a case study of PID integration, which showed how PIDs enabled linking, attribution, claiming and citation of contributors and datasets.

The session was followed by a discussion on nationwide ORCID use cases and the need to improve compliance in capturing DOI metadata. Finally, the DI4R audience put a provocative series of questions to the THOR panel. For example:

    – “How should we deal with credit attribution of collections of datasets? When in some areas data collections are created by a contributor but each item in the collection has a different producer.”  

    – “Do we need PIDs for machines and instrumentation?”

    – “What about PIDs for projects?”

Certainly, some of those questions need further thought and exploration by the THOR members and the community at large. Join us at PIDapalooza if you want to be part of this discussion.

Overall, the THOR session at DI4R highlighted the project’s work (specifically DataCite’s Event Data and ORCID’s Auto Update) and ended with a good discussion about future lines of work to be developed.

 

THOR Bootcamp in Madrid

THOR Bootcamp in Madrid (English announcement, followed by a translation of the Spanish version)

Want to learn more about Persistent Identifiers (PIDs) and how to harness their potential to advance your work? THOR is launching its first Bootcamp in Madrid at UC3M on 17-18 November. We will look at topics like PID service integration, research data management, and research output compliance and impact tracking – bring your questions and let’s crack them together.

THOR is an EC-funded project that sets out to investigate and push the interoperability of scholarly infrastructure through PIDs. By bringing together PID stakeholders from all sides – PID issuers, research organizations, data centers, and researchers – THOR leads the development of PID solutions and gains first-hand experience of local integration through early adopters.

The bootcamp aims to transfer this knowledge to the wider community – funders, publishers, librarians, tool builders, researchers, etc. – and fuel PID-related strategic planning and technical integration with practical guidance. The two-day event will include both talks on current developments in hot PID topics and a hands-on tool-building component. We will take the community response to our pre-event questionnaire into consideration, and tailor the content for you – prepare to advance your PID agenda by the end of the bootcamp!

Registration for the bootcamp is open. Let us know what you most want to learn about PIDs during the registration process, and see you in Madrid!


Want to know more about persistent identifiers (PIDs) and how to harness their potential? THOR is launching the first in a series of international Bootcamps in Madrid this coming November. We will explore identifiers, systems integration, research data management, standards compliance and impact tracking. Bring your questions and we will look for answers together.

THOR is a project funded by the European Commission, designed to investigate interoperability needs and drive new developments that consolidate scholarly and research infrastructures through the use of persistent identifiers. THOR brings together all parties – service providers, research organizations, data centers and researchers – to develop a robust, shared infrastructure that builds experience and supports new integrators.

This Bootcamp aims to share THOR’s experience with a wider community – libraries, research services, publishers, evaluation bodies and researchers – and to encourage strategic planning and fuel the development of new services in Spain, particularly those related to research data. The two-day event will include informative talks as well as discussions and hands-on work. We will use attendees’ preferences to shape an agenda that is adapted to the community’s needs and lets everyone get the most out of the session.

Registration for the first THOR Bootcamp in Madrid is now open! Fill in the form and tell us your interests so that we can finalize the agenda. See you in November!

Assessing the PID Landscape: Where is THOR in Context?

Part of knowing how well THOR is doing is knowing how our work fits into the overall context of persistent identifiers (PIDs) at large. This is why we began the project with an eye toward sustainability and also why we developed the metrics dashboard in the early days of the project. (That report is on Zenodo, if you’d like to read it again.)

Now that THOR has celebrated its first birthday, it’s time to pause and see what the PID landscape looks like now compared to when we first started. Assessing these changes now will help THOR tweak our roadmap for the future, making sure we stay on track for the remainder of the project. All of these assessment and evaluation efforts will eventually turn into a formal report at the end of the project, but we know how hard it is to wait. To tide you over, we’ve released a white paper based on our internal midtrack assessment.

Your feedback, questions, and comments are always welcome at info@project-thor.eu.

Persistent IDs and Theses: ETD2016, Lille

The International Symposium on Electronic Theses and Dissertations (ETD) is an annual get-together exploring all things thesis and PhD research. I was there to present a poster on how THOR is developing improved support for identifiers in the British Library’s thesis service, EThOS.

We have previously engaged with UK universities to see how they are already applying identifiers to their theses and data, and what could be done to move that along; we then added support in the EThOS metadata for author identifiers and thesis DOIs. Now as part of THOR, we are planning to push that further by facilitating the development work necessary to enable users to claim their EThOS theses in their ORCID records. This will also enable EThOS to look at completing the round trip, pulling ORCID iDs from those claims into EThOS.

Currently, anyone with a thesis in EThOS can only add it to their ORCID record manually. This process is prone to errors. Enabling a claim button on EThOS records will make it quicker and easier for researchers to add their thesis to ORCID. If we can then retrieve claims information from ORCID, we can add links for users to find more works by that author.

There is still one link in this process that is slow to appear in the UK: persistent identifiers for the thesis itself. Many have Handle identifiers from their host repository, but we want to encourage further use of persistent identifiers for theses to make them more easily discoverable and accessible, especially where they are being cited. Having a link for the thesis and its data will also help to maintain the link between the two. We hope this will encourage students to think of their data as a separate, valuable output from their years of hard work, and implant the seed of good data management and sharing right at the start of their careers. So as well as the technical work to develop EThOS, we are working with universities to encourage them to apply persistent identifiers to their theses – and the data from the thesis.

ETD was a great venue to talk to the other repository managers who were interested in applying this work within their own repositories, and a welcome opportunity to answer their questions about the advantages to their institutions – and their students – of our planned approach.

A couple of recurring questions arose:

1. I have Handles in my repository for items already. Will they do?
Technically, yes. Having Handles on your theses and related data will certainly enable you to take advantage of consistent linking and citation of the theses. But we do see two additional advantages in the use of DOIs: 1) recognition by researchers; and 2) the additional governance of DOIs, which provides a safety net in terms of long-term persistence.

2. When should our students get ORCIDs? How can we encourage them?
Your students should make sure they have an ORCID iD as soon as they are ready to publish their first output, whether that be a paper, a dataset, a poster or a conference proceeding. The first thing institutions can do to encourage them is to practise what they preach: demonstrate how you, as repository staff, can bring together your own publications and outputs, and the advantages this has for you!

The poster, which outlines our aims, challenges and potential solutions, can be found online at: https://zenodo.org/record/61176#.V8V8bvkrJpg.