Data Management

Management of Research Data


Introduction

In this module, you will explore the following aspects of responsible data management:

  1. Responsibility for data management in research and scholarly activity at the University of New Hampshire (UNH)
  2. Issues in recording and reporting research data
  3. Requirements for access to research data and for their sharing, storage, and retention

Introduction (cont.)

The module consists of:

Case Study

Click on the image below to read the case study.

What is Data Management?

"Research data management concerns the organisation of data, from its entry to the research cycle through to the dissemination and archiving of valuable results. It aims to ensure reliable verification of results, and permits new and innovative research built on existing information" (Whyte, 2011).

Data management is an integral part of good research practice and comprises activities pertinent to all stages of the research data lifecycle. Data management involves establishing strategies for the creation, use, secure storage, and ongoing accessibility of research data. Data management activities contribute to organizing, documenting, finding, archiving, and re-using your data.

What is Data Management? (cont.)

Responsible data management:

Data Management Responsibility

Collection and generation of data are integral aspects of research and scholarly activity. In such activities, data:

The integrity of research and scholarly activity depends on accurate, detailed, organized, complete, and accessible data.

Data Management Responsibility (cont.)

UNH is responsible for stewardship of data generated or collected as part of funded research and scholarly activity that is conducted:

This responsibility derives from the Office of Management and Budget (OMB) Circular A-110, Subpart C.53. Further, an increasing number of sponsors (e.g., National Science Foundation [NSF], National Institutes of Health [NIH]) have requirements for sharing research data and disseminating research results.

Data Management Responsibility (cont.)

UNH’s stewardship responsibilities include, but are not limited to:

In conducting research as part of the UNH community, investigators are obligated to assist UNH in fulfilling its responsibilities, including the management and sharing of research data.

Data Management Responsibility (cont.)

To fulfill its stewardship responsibilities, UNH adopted a Policy on Ownership, Management, and Sharing of Research Data. The UNH policy delineates rights to and responsibilities for research data, as well as required data management practices that ensure:

The UNH policy applies to all members of the UNH community, including, but not limited to, faculty, staff, and graduate and undergraduate students. It applies to research supported by federal and non-federal funds as well as to unfunded research activities undertaken at UNH.

Data Management Responsibility (cont.)

The federal government normally assigns ownership of data generated in activities that it funds to a grantee institution (i.e., UNH), not to an individual (i.e., principal investigator).

In most situations, UNH owns research data generated by its faculty, staff, and students (see Section 5 of the policy for exceptions). UNH assigns custody of the research data that it owns to the faculty or staff principal investigator, or sponsoring principal investigator in the case of students, who is expected to discharge his/her custodial responsibilities in accordance with responsible data management practices.

For purposes of the UNH policy, custodianship is the physical possession of and direct responsibility for protecting research data, including accurate recording and proper retention, maintenance, access, sharing, and disposition of the data.

Data Management Responsibility (cont.)

Researchers and scholars are responsible for recording, management, and retention of data in activities that they conduct or oversee.

At UNH, the Dimond Library has created a Data Management Toolkit (DMT) to assist researchers and scholars with their data management responsibilities. Further, the UNH Research Computing Center provides the UNH research community with information technology services, including secure data storage and large dataset processing.

In exercising their responsibilities, researchers and scholars should adopt an orderly system of data organization for each project or activity, and should communicate the chosen system to all individuals who participate in the project/activity. This system of data organization should include, for example, a file naming convention and folder structure, version control, plan for file sharing, and procedures for recording and documenting data.
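As an illustration only, a file naming convention and folder structure of the kind described above could be written down in a short script so that everyone on a project applies it the same way. The folder names, the name pattern (project, description, date, version), and the function names below are hypothetical choices, not UNH requirements; adapt them to the standards of your discipline.

```python
import re
from datetime import date
from pathlib import Path

# Hypothetical folder layout for one project; not prescribed by UNH policy.
PROJECT_FOLDERS = ["raw_data", "processed_data", "analysis", "documentation"]


def build_filename(project: str, description: str, collected: date,
                   version: int, ext: str) -> str:
    """Build a name of the form project_description_YYYY-MM-DD_vNN.ext."""
    def clean(text: str) -> str:
        # Lowercase, then replace runs of other characters with one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{clean(project)}_{clean(description)}_"
            f"{collected.isoformat()}_v{version:02d}.{ext.lstrip('.')}")


def create_project_folders(root: Path, project: str) -> list[Path]:
    """Create the agreed folder structure under root/project."""
    created = []
    for name in PROJECT_FOLDERS:
        folder = root / project / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, build_filename("SoilStudy", "pH readings", date(2024, 3, 15), 2, "csv") gives "soilstudy_ph-readings_2024-03-15_v02.csv". Putting the date and a zero-padded version number in every name keeps files sortable and makes it clear which version of a file is current.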

Research Data Lifecycle

A research data lifecycle depicts the different activities involving data in a research project. They include:

The Data Documentation Initiative (DDI) was one of the first organizations to conceptualize a data lifecycle (Corti et al., 2014). Since then, others have adapted it, including the U.S. Geological Survey.

Research Data Management Planning

Research data management planning is essential to ensure efficiency in research projects, and to ensure long-term management of data. A data management plan helps to organize all data-related activities, and to ensure that all individuals involved in a project are aware of their responsibilities. Federal agencies are increasingly requiring data management plans as part of funding applications.

A data management checklist is a useful tool to employ at the start of a research project to ensure that all pertinent issues have been identified and addressed, and that an individual has been assigned to each.

To ensure success of a research project, Corti et al. (2014) state "It is crucial that roles and responsibilities are assigned and not just presumed" (p. 29). This is especially important in collaborative research projects.

Research Data

Research data are the recorded factual material commonly accepted in the scientific community as necessary to validate research findings. Research data are not limited to raw experimental results and instrument outputs; they encompass associated protocols, numbers, graphs, tables, and charts used to collect and reconstruct the data. They also include field notes or observations; procedures for data analysis and/or reduction; data obtained from interviews or surveys; computer files and databases; research notebooks or laboratory journals; slides; audio/video recordings; and photographs.

Research data may be in hard-copy form (including research notes, laboratory notebooks, or photographs) or in electronic form, such as computer software, computer storage/backup, or digital images.

Research Data (cont.)

Research data do not generally include: unreported preliminary analyses of data, drafts of scientific papers, future research plans, peer reviews, or communications with colleagues; trade secrets, commercial information, or materials necessary to be held confidential until they are published, or similar information protected by law; personnel, medical, or similar information, the disclosure of which would constitute a clearly unwarranted invasion of personal privacy.

Research materials are tangible physical objects from which data are obtained, such as environmental samples, biological specimens, cell lines, derived reagents, drilling core samples, or genetically altered microorganisms. While these are not considered to be research data, they should be retained consistent with disciplinary standards.

Research Data (cont.)

Two characteristics of data are authenticity and integrity (Macrina, 2000).

Authentic data are those that honestly and accurately represent the work conducted. Data collection methods should be designed to minimize mistakes/errors.

Integrity of data refers to the appropriateness and proper execution of the collection method. Researchers should ensure that the study design and methods are appropriate, and that technology involved is reliable. They should ensure that their design and methods are free from bias (e.g., selecting a method in order to obtain results that favor a specific conclusion). Further, they should ensure that they have the requisite authority/permission to collect the needed data (e.g., human subjects, vertebrate animals, biological material).

Research Data: Recording

All data should be recorded truthfully, accurately, and contemporaneously with their production or observation.

Researchers, scholars, and students should ensure that their data records:

Investigators should record research data consistent with the standard practices of their discipline. In the absence of such standards, UNH’s minimum standard is that research records are written/recorded, dated, and identified by the project title and name(s) of the individual(s) conducting the activity, experiment(s), or other investigation(s).

Recording Data (cont.)

To ensure the authenticity and integrity of data when recording, researchers, scholars, and students should strive to:

Recording Data (cont.)

To ensure the authenticity and integrity of data when recording, researchers, scholars, and students should strive to (cont.):

For more record-keeping best practices, see pages 8-12 of Clinical Tools, Inc., Guidelines for Responsible Data Management in Scientific Research.

Recording Data (cont.)

Originals of other related data products and descriptions (including calibration data for the instrumentation, information on the operational conditions and any transformations applied to the data to make them useful) must be kept with the corresponding data.

A detailed description must contain exact information on the original data formats and what has been done to the data.

The source code of any software that was used to prepare the data must also be documented and archived.

Whatever the organizational system adopted, researchers and scholars should ensure that all personnel involved with the project/activity, including any key administrative personnel, understand and adhere to the system.

Research Data: Selective Data Reporting

All relevant observed data should be included in an analysis unless there are valid reasons for excluding or modifying data. Reasons for excluding or modifying data, especially when such exclusions or modifications could alter the findings or interpretation of results, should be recorded and disclosed (The National Academy of Sciences, 1992).

Acceptable reasons for excluding data include instrument malfunction, subject attrition, specimen instability, accidental disruption of procedures, or irrelevance to the hypothesis.

Data imputation is a procedure used to estimate data for missing values. All imputed data should be identified, accompanied by an explanation of the imputation method.
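As a minimal sketch of how imputed values can be kept identifiable, the example below fills missing values and returns a parallel list of flags marking which values were imputed. Mean substitution is used only because it is simple to show; it is an assumption for illustration, not a recommended method, and the function and variable names are hypothetical.

```python
from statistics import mean
from typing import Optional

# Record the method alongside the data so it can be disclosed with the results.
IMPUTATION_METHOD = "mean of observed values"


def impute_with_mean(
    values: list[Optional[float]],
) -> tuple[list[float], list[bool]]:
    """Fill missing values (None) with the mean of the observed values.

    Returns the filled series and a flag per value: True where imputed.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    filled = [fill if v is None else v for v in values]
    imputed_flags = [v is None for v in values]
    return filled, imputed_flags


readings = [4.0, None, 6.0, 8.0]
filled, flags = impute_with_mean(readings)
# filled -> [4.0, 6.0, 6.0, 8.0]; flags -> [False, True, False, False]
```

Storing the flags and the method description with the dataset lets a reader separate observed from estimated values and see how the estimates were produced.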

Selective Data Reporting (cont.)

Data manipulation refers to changing, excluding, or creating data, usually without disclosure. The following methods of data manipulation are considered unethical as they distort or misrepresent the true results of the research or scholarly activity.

At UNH, fabrication and falsification are regarded as acts of scholarly misconduct.

Research Data: Data Access & Sharing

Researchers and scholars should provide reasonable access to data by any individual who has participated in the project/activity in which the data were collected or generated.

Prior to starting an activity, researchers and scholars should delineate in writing for each member of the group (or for each student or other personnel involved in the activity) which parts of the data may be copied and taken by the individual if he/she leaves the group (or ceases the activity).

Data Access & Sharing (cont.)

The federal government has the authority to access any data collected or generated in activities that it has funded.

UNH may need to access data and in certain situations take custody of data, such as in patent disputes, allegations of data misuse, subpoena, or Freedom of Information Act/Right to Know Law requests. This responsibility lies with the UNH Senior Vice Provost for Research (SVPR).

If a copyright or patent application might emerge from a study, researchers and scholars should promptly contact the office of UNHInnovation for information and assistance.

Researchers, scholars, and students must adhere to any limitations on access to data stipulated by a relevant agreement, policy, or regulation. For instance, provisions to protect the privacy of human subjects and the confidentiality of data collected from them may prohibit making raw data accessible except to authorized individuals.

Data Access & Sharing (cont.)

Researchers, scholars, and students are expected to share data, as well as unique research materials essential to the replication or extension of reported findings, consistent with the standards of their discipline, where such sharing is not limited by a relevant agreement, policy, or regulation (The National Academy of Sciences, 1992).

Benefits of sharing data include:

Data Access & Sharing (cont.)

In 2013, the federal Office of Science and Technology Policy (OSTP) directed each federal agency with over $100 million in annual research and development expenditures to develop a plan to increase public access to the results of federally funded research, namely digital scientific data and scientific publications.

As a result, most federal agencies require that data and unique materials, such as cell lines and DNA sequences, generated or collected in activities supported by their funds be shared with others in a timely manner after the associated research results have been published or provided to the sponsoring agency.

For example, NASA requires that articles be made publicly available within 12 months of publication, and that data be made publicly available at the time of publication or within a reasonable time period after publication, with that period stipulated in the data access plan.

Data Access & Sharing (cont.)

Publicly available data (open data) are part of the open science movement. Open data are "online, free of cost, accessible data that can be used, reused, and distributed provided that the data source is attributed and shared alike" (FOSTER).

Some journals also require sharing of materials and/or data described in articles they publish.

Data Access & Sharing (cont.)

Most federal agencies also require researchers to develop data management plans that are submitted as part of proposals for funding. These plans describe the data that will be created in the research, how those data will be managed, and how those data will be preserved and archived, and made accessible. Such plans generally should describe the following components:

Data Access & Sharing (cont.)

Tangible research materials should be shared with individuals who are not affiliated with UNH only by specific written agreement, such as a Material Transfer Agreement.

UNH retains a non-exclusive, irrevocable, royalty-free license to use all research data for purposes of internal research, education, and/or protection of intellectual property when the data are generated at or under the auspices of UNH.

Data Access & Sharing (cont.)

Principal investigators who leave UNH may normally take UNH-owned original research data for which they are the custodian. They have an obligation to hold the data in trust for UNH for the required retention period, and return the data to UNH upon request.

The required retention period is the time period for which researchers and students should retain data securely. It is generally at least three years from the date of data collection, termination of a sponsored agreement under which the data were collected, or publication based on the data, whichever is longer (controlling period).

UNH-owned original research data must remain at UNH when data are:


Students and other study personnel may normally take a copy of UNH-owned research data related to their research when they leave UNH.

Research Data: Storage and Retention

Principal investigators/custodians are responsible for the physical storage and security of research data during collection and retention periods, consistent with the standard practices of their discipline and/or the terms of a sponsored agreement. 

Of particular importance are issues involving confidentiality and general management of data obtained from human subjects, security of research data against theft or loss, and maintenance of backup or archival copies of research data, as well as any software, that may be needed in the event of a disaster. Storage methods should be secure yet allow access by authorized individuals.

Storage & Retention (cont.)

Research data and associated materials/correspondence must be retained in sufficient detail and duration to allow appropriate response to questions about research accuracy, authenticity, primacy, and compliance with laws and regulations governing the conduct of research.

The recordkeeping systems/practices used by researchers should allow unmediated access by UNH over their entire retention period. Of particular importance are instances in which a principal investigator leaves UNH.

Storage & Retention (cont.)

The UNH policy requires that research data be retained securely for at least three years from the date of data collection, termination of a sponsored agreement by which the data were collected, or publication based on the data, whichever is longer (controlling period).
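The arithmetic behind the controlling period can be sketched as follows. This is an illustrative reading of the rule, not an official UNH tool: it takes "whichever is longer" to mean the latest of the applicable dates, adds three years, and treats the result as the earliest date on which disposal could be considered. The function and parameter names are hypothetical, and longer retention may be required in some situations.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 3  # minimum stated in the policy text above


def earliest_disposal_date(collected: date,
                           agreement_ended: Optional[date] = None,
                           published: Optional[date] = None) -> date:
    """Return the end of the minimum retention period.

    The period runs from the latest of: data collection, termination of
    the sponsored agreement, or publication based on the data.
    """
    events = [d for d in (collected, agreement_ended, published)
              if d is not None]
    latest = max(events)
    try:
        return latest.replace(year=latest.year + RETENTION_YEARS)
    except ValueError:
        # 29 February has no counterpart three years later; use 1 March
        # so the period is never shorter than three years.
        return date(latest.year + RETENTION_YEARS, 3, 1)
```

For data collected on 1 May 2020 under an agreement that ended on 31 August 2021, with a publication on 15 January 2022, the function returns 15 January 2025, because publication is the latest of the three events.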

If custodians permanently leave UNH and take research data with them, they should notify their Dean/Director of the location of such data. They are obligated to hold the data in trust for UNH and return the data if requested to do so. The data must not be disposed of within the required retention period without written permission of UNH’s SVPR.

Storage & Retention (cont.)

Examples of situations in which data may be required to be retained beyond the controlling period include, but are not limited to:

Appeal of Ownership Determination

UNH faculty and staff investigators may appeal a determination of UNH ownership of research data to the SVPR (see Section 12 of the UNH data management policy for details).

UNH graduate student investigators may appeal determinations of UNH ownership of research data to the Dean of the Graduate School. UNH undergraduate student investigators should direct an appeal to the undergraduate college/school Dean (see the UNH data management policy for details).

Review Scenario 1 - Questions 1 & 2

Scenario: A graduate student at UNH has successfully defended her dissertation and is preparing to leave UNH. In the course of her research, she recorded data and experimental procedures in her lab notebook. She also performed analyses using computer software that resulted in digital recordings of results. Funding for the project she worked on has ended, and the results of the research have been published. Which of the following statements are true?

1. The student may take all original data, including lab notebooks and digital files, as long as she leaves a copy with her faculty advisor (Sponsoring PI).

Answer: False.

2. The student may take a copy of all data but the original data, including lab notebooks and digital files, must remain with her faculty advisor (Sponsoring PI).

Answer: True.

Review Scenario 1 - Questions 3 & 4

Which of the following statements are true?

3. If the student was not paid in the form of salary or wages by UNH, then she owns the data she generated.

Answer: False.

4. The student’s faculty advisor (Sponsoring PI) must retain only the raw data; recordings of experimental procedures and digital analyses are not research data.

Answer: False.

Review Scenario 1 - Question 5

Is the following statement true?

5. A student may appeal ownership of her data by writing to the UNH Graduate School Dean (if a graduate student) or the undergraduate college/school dean if an undergraduate.

Answer: True.

Review Scenario 2

Scenario: A UNH faculty member has a large international research program that involves international graduate and undergraduate students. The faculty member simultaneously has several active federal grants that fund much of his work. Because he travels and publishes extensively, he has little time in the lab. At the beginning of each Fall semester, he reviews with incoming students the lab’s data management practices, including procedures for how to keep lab notebooks, but otherwise leaves senior graduate students to oversee projects. Early one Fall semester, the faculty member is reviewing with several new undergraduate students the preliminary work on a new, unfunded project started the year before by one of the graduate students who just graduated. After retrieving the lab notebooks for the project, an undergraduate explains to the faculty member that she cannot understand them because they are written in Japanese, not in English as is the standard lab procedure. Which of the following statements are correct (check all that apply)?

Review Scenario 3 - Questions 1 & 2

Scenario: A graduate student has obtained his master’s degree at UNH. His master’s work was conducted as part of a research group that involved UNH faculty, and graduate and undergraduate students as well as faculty and students at a collaborating institution. After graduating, the student leaves this research group and enters a doctoral program at UNH in a different department. He plans, however, to publish at least one paper from his master’s work. He contacts his now former faculty advisor at UNH and asks for a copy of the data so that he can work on a paper. The faculty member tells him that he cannot have a copy of the data, nor access to the data, as he is no longer part of the research group. Which of the following statements are true?

1. The graduate student owns the research data so he should have access to the data.

Answer: False.

2. The student has no right to access the data as he has left the research group.

Answer: False.

Review Scenario 3 - Questions 3 & 4

Which of the following statements are true?

3. The faculty advisor should have entered into a written agreement with each member of the research group at the start of the project delineating which data each individual could access and/or copy and take if s/he left UNH or the group.

Answer: True.

4. As the student participated in the project where the data were collected, he should have reasonable access to the data, even though he is no longer a member of the research group.

Answer: True.

Review Scenario 3 - Question 5

Is the following statement true?

5. As soon as the student graduated from UNH, according to UNH policy, he should have been eligible to receive a copy of the data from the project.

Answer: True.

Review Scenario 4

Scenario: Sally, a doctoral student at UNH, successfully defends her dissertation and obtains a tenure-track position at another institution. During the time that she is moving her things from UNH and setting up her office at her new institution, she is working on a paper with her now former UNH faculty advisor resulting from another project in the faculty advisor’s lab. Sally and her advisor disagree on the order of authors on the paper: the faculty advisor claims the position of first author, and wants to list another graduate student and undergraduate student in the lab as authors whereas Sally thinks she should be first author (as she did the majority of the work) and that the other students should be listed in the acknowledgements section of the paper, not as authors. As leverage in the argument, Sally takes the original data (completed paper surveys) with her to her new institution along with copies of digital data files. When her former faculty advisor demands that she return the original paper surveys before he will discuss the authorship issue any further, Sally contacts the department chair at UNH for advice. She explains that she has taken the original data because she is having an authorship disagreement with her former advisor. What should the department chair tell Sally (check all that apply)?

Case Study Review

Click on the image below to read the case study that was presented at the beginning of this module.

Congratulations!

Once you have finished all of the review questions click ’Certify Completion’.
