A project of the IFLA section “Library Theory and Research” (LTR) analyzed the profession of data curators. Dr Anna Maria Tammaro and Dr Krystyna Matusiak answered our interview questions about the project findings.
What was the aim of the study?
The project of the IFLA section Library Theory and Research (LTR), entitled “Data curator: who is s/he?”, aimed to identify the roles, responsibilities, activities, and concerns of practicing data curators around the world. The main objectives of the project were:
- To prepare a vocabulary (a list of terms in a hierarchical structure) and possibly an ontology, using a formal representation of a set of concepts.
- To understand roles and responsibilities of the data curator profile.
The term “data curator” was introduced 25 years ago and has been used in the context of digital preservation. The data-intensive research environment, funders’ requirements, and the movement towards open science have expanded the understanding of the concept beyond archiving and shifted attention to other aspects of data management and reuse. These developments also created demand for information professionals with a broader set of responsibilities and competencies. While the term “data curator” is still used, especially in the US, new position titles have emerged as librarians and other data experts began to work with researchers at different points of the data lifecycle. Research data management (RDM) emerged as a new field, although it is still a challenge to identify a comprehensive set of practices that would define it as a coherent discipline. The main purpose of the IFLA LTR project was to identify the characteristics of the roles and responsibilities of data curators and RDM professionals in international and interdisciplinary contexts. The study also focused on the terminology used to describe the emerging practices and new professional roles.
How was the study conducted?
The study used a mixed-methods approach combining quantitative and qualitative strategies. It examined the terminology, analyzed position descriptions, and focused on the value added by the data curator in the research lifecycle. The study was conducted from September 2015 through June 2017 and consisted of three stages:
- Literature review and vocabulary analysis
- Content analysis of position announcements
- Interviews with professionals working in data curation and research data management
First, the project team conducted a broad, comprehensive, and systematic survey of the literature on data curation. To extract “key concepts” from the 120 papers collected, we used a free tool called KD (Keyphrases Digger), developed at the University of Trento. Over the course of the research, the team assembled six different data corpora and ran KD on each corpus several times.
The quantitative content analysis concentrated on job announcements drawn from a variety of library and information science job posting sites, including the International Association for Social Science Information Services and Technology (IASSIST) and Code4Lib. The goal of the content analysis was to examine the titles, roles, responsibilities, qualifications, and competencies listed in the advertised positions. The data set included 441 job advertisements. Most of the analyzed positions (73.6%) were based in the United States. However, the data set also had some international coverage, with postings from 32 countries. The widest distribution came from Europe, with 17 European countries in the sample.
In the qualitative phase, semi-structured interviews were conducted with professionals working as data librarians, data experts, data curators, or research data managers. The goal of the interviews was to gain insight into the practice of research data management and to examine the services from the perspective of the professionals working in the field. The interviews were conducted with 26 professionals from Australia, Canada, the U.S., and six countries in Western Europe.
What are your major findings on research data management?
The study found limited agreement on vocabulary and titles for people who are involved in providing RDM services. The variability of titles and infrequent use of the term “data curator” were found both in the content analysis of job announcements and in the qualitative data. The positions were frequently advertised under a wide variety of titles, often with additional data-related responsibilities, such as data science or data reference services. The differences in terminology were primarily between an understanding of the data curator as one overseeing the entire data management cycle and a narrower definition focused on technical aspects of archiving in the final stages of the data lifecycle. The use of the term “data curator” and its broad understanding was quite prevalent among the US professionals, while Australian and European participants made a clear distinction between data curators and data managers and did not use the terms interchangeably.
Despite the differences in vocabulary, the study found a sense of shared purpose or even mission among the participants. The professionals across institutional and national settings emphasized that their primary roles and responsibilities involved assisting researchers in meeting funder requirements, improving data management practices, and ultimately contributing to a more efficient research process and better-quality data. The work of RDM professionals in improving data management practices and advocating open access occurs on multiple levels, starting with individual researchers and their teams, building networks at their institutions, and then expanding to regional, national, and international communities. The theme of shared values and changing research culture was discussed by participants from multiple countries, pointing to the emerging international character of the RDM profession.
All professionals participating in this study were engaged in consultative services, outreach, and open access advocacy. A smaller number of participants assisted researchers with technical aspects of depositing data in repositories and archival storage. Many participants described their roles and responsibilities in the context of the data lifecycle. Consultative and training services would usually take place at the beginning of the lifecycle and focus on developing data management plans (DMPs) and practices in sharing and archiving data. Technical services were offered on a limited scale at the end of the lifecycle and often involved data cleaning and verification, metadata creation and documentation, ingesting into repository systems, data publishing, and archiving.
The findings of the study indicate that RDM is an evolving socio-technical practice that involves not only technical systems and services structured around the research data lifecycle but also a range of social activities and policy initiatives. This study finds common themes in the social aspects of RDM, especially around efforts to raise awareness of open data, foster a culture of data sharing, and support the needs of researchers in the data-intensive environment.
Did you find any specific characteristics in interdisciplinary contexts?
The twenty-six participants recruited for the study worked at university libraries, campus-wide research data services, data archives, and research centers. Two were embedded in departments or directly in research projects. They worked with researchers in multiple disciplines across all three major domains: sciences and engineering, social sciences, and humanities. There were some differences between disciplines in the way RDM services were structured and practiced, but the study data have not yet been analyzed sufficiently around interdisciplinary aspects to discuss any patterns. The study participants emphasized the importance of domain expertise and knowledge of the research process for conducting advanced data curation activities.
What should be done to foster research data management and open data, especially by libraries?
The role of academic libraries in leading and developing RDM services emerges as an important theme in this study. Librarians offer instructional experience and unique expertise in information organization, metadata, and archiving. However, this study also found that many positions in practice were held by non-library professionals who were hired specifically because of their research background, knowledge of research methods, and domain expertise. This finding has implications for the education of future data experts, who may need to combine technical skills, expertise in metadata and information organization standards, and knowledge of the research process and research methods.
Anna Maria Tammaro
Contract professor in the International Master in Digital Library Learning (DILL), a joint international Master of Tallinn University and the University of Parma. She has been the Chair of the IFLA section “Library Theory and Research” (2014-2017) and of the “Education and Training” section (2007-2009; 2011-2013); she also served twice on the IFLA Governing Board (2007-2009; 2011-2013). From 2016 to 2017 she coordinated the IFLA project “Data curator: who is s/he?”.
Dr Tammaro is currently collaborating on the European projects ROMOR (Research Output Management through Open Access Institutional Repositories in Palestinian Higher Education) and NAVIGATE (Information Literacy: A Game-based learning approach for Avoiding Fake Content). Her main teaching and research interests are digital libraries and open education.
Krystyna Matusiak
Associate Professor, Library & Information Science Program (LIS), University of Denver. She received her PhD from the University of Wisconsin-Milwaukee. She also served as a digitization consultant for projects funded by the Endangered Archive Programme at the British Library and assisted digital library projects at the Press Institute of Mongolia in Ulan Bator, Mongolia, and the Al-Aqsa Mosque Library in East Jerusalem. She has strong interests in international librarianship and serves as Secretary of the IFLA Library Theory and Research Standing Committee. Her research interests include digital library development and evaluation, indexing and retrieval of digital images, usability, and information seeking behavior.