The Office of Science mission is to deliver the scientific discoveries and major scientific tools that transform our understanding of nature and advance the energy, economic, and national security of the United States. The Office of Science Statement on Digital Data Management has been developed with input from a variety of stakeholders in this mission.¹
Here, data management involves all stages of the digital data life cycle including capture, analysis, sharing, and preservation. The focus of this statement is sharing and preservation of digital research data.
The Office of Science affirms that the following principles related to the management of digital research data directly support fulfillment of its mission.
- Effective data management has the potential to increase the pace of scientific discovery and promote more efficient and effective use of government funding and resources. Data management planning should be an integral part of research planning.
- Sharing and preserving data are central to protecting the integrity of science by facilitating validation of results and to advancing science by broadening the value of research data to disciplines other than the originating one and to society at large. To the greatest extent and with the fewest constraints possible, and consistent with the requirements and other principles of this Statement, data sharing should make digital research data available to and useful for the scientific community, industry, and the public.
- Not all data need to be shared or preserved. The costs and benefits of doing so should be considered in data management planning.
To integrate data management planning into the overall research plan, the following requirements will apply to all Office of Science research solicitations and invitations for new, renewal, and some supplemental funding issued on or after October 1, 2014. These requirements apply to proposals from all organizations including academic institutions, DOE National Laboratories, and others. These requirements do not apply to applications to use Office of Science user facilities.
All proposals submitted to the Office of Science for research funding must include a Data Management Plan (DMP) that addresses the following requirements:
- DMPs should describe whether and how data generated in the course of the proposed research will be shared and preserved. If the plan is not to share and/or preserve certain data, then the plan must explain the basis of the decision (for example, cost/benefit considerations, other parameters of feasibility, scientific appropriateness, or the limitations discussed in the fourth requirement below). At a minimum, DMPs must describe how data sharing and preservation will enable validation of results, or how results could be validated if data are not shared or preserved.
- DMPs should provide a plan for making all research data displayed in publications resulting from the proposed research open, machine-readable, and digitally accessible to the public at the time of publication. This includes data that are displayed in charts, figures, images, etc. In addition, the underlying digital research data used to generate the displayed data should be made as accessible as possible to the public in accordance with the principles stated above. This requirement could be met by including the data as supplementary information to the published article, or through other means. The published article should indicate how these data can be accessed.
- DMPs should consult and reference available information about data management resources to be used in the course of the proposed research. In particular, DMPs that explicitly or implicitly commit data management resources at a facility beyond what is conventionally made available to approved users should be accompanied by written approval from that facility. In determining the resources available for data management at Office of Science User Facilities, researchers should consult the published description of data management resources and practices at that facility and reference it in the DMP. Information about other Office of Science facilities can be found in the additional guidance from the sponsoring program.
- DMPs must protect confidentiality, personal privacy, Personally Identifiable Information, and U.S. national, homeland, and economic security; recognize proprietary interests, business confidential information, and intellectual property rights; avoid significant negative impact on innovation and U.S. competitiveness; and otherwise be consistent with all applicable laws, regulations, and DOE orders and policies. There is no requirement to share proprietary data.
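The second requirement above asks that data displayed in charts, figures, and images be open and machine-readable. One way a research group might approach this is sketched below in Python. It is an illustration only and not part of this Statement: the function name `export_figure_data`, the CSV-plus-JSON layout, and the file naming are all hypothetical choices.

```python
import csv
import json
from pathlib import Path


def export_figure_data(out_dir, figure_id, columns, rows, units):
    """Write the values plotted in one figure as CSV plus a JSON description.

    Hypothetical helper: the argument names and the file layout are
    assumptions made for this sketch, not a format required by any policy.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # The plotted values themselves, one row per data point.
    csv_path = out_dir / f"{figure_id}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        writer.writerows(rows)

    # A small sidecar file describing the columns and their units.
    meta_path = out_dir / f"{figure_id}.json"
    metadata = {
        "figure": figure_id,
        "data_file": csv_path.name,
        "columns": [{"name": c, "unit": units.get(c, "")} for c in columns],
        "row_count": len(rows),
    }
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return csv_path, meta_path
```

Files produced this way could accompany an article as supplementary information or be deposited in a repository, with the article indicating where they can be found.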
DMPs will be reviewed as part of the overall Office of Science research proposal merit review process. Additional requirements and review criteria for the DMP may be identified by the sponsoring program or sub-program, or in the solicitation.
- The Principal Investigator should determine which data should be the subject of the DMP and, in the DMP, propose which data should be shared and/or preserved in accordance with the Requirements.
- In determining which data should be shared and preserved, researchers must consider the data needed to validate research findings as described in the Requirements, and are encouraged to consider the potential benefits of their data to their own fields of research, fields other than their own, and society at large.
- DMPs should reflect relevant standards and community best practices for data and metadata, and make use of community accepted repositories whenever practicable.
- Costs associated with the scope of work and resources articulated in a DMP may be included in the proposed research budget as permitted by the applicable cost principles.
- To improve the discoverability of and attribution for datasets created and used in the course of research, the Office of Science encourages the citation of publicly available datasets within the reference section of publications, and the identification of datasets with persistent identifiers such as Digital Object Identifiers (DOIs). In most cases, the Office of Science can provide DOIs free of charge for data resulting from DOE-funded research through its Office of Scientific and Technical Information (OSTI) DataID Service.
- View a list of suggested elements for a DMP.
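The guidance above on citing datasets and identifying them with persistent identifiers can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the `DatasetReference` class, its field names, and the citation layout are hypothetical and do not represent an Office of Science or OSTI format, and the DOI shown in the usage note is a placeholder. The only external fact relied on is that a DOI can be resolved by appending it to `https://doi.org/`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetReference:
    """Hypothetical record for a dataset cited in a publication's reference list."""

    creators: tuple
    year: int
    title: str
    repository: str
    doi: str  # a placeholder value is used in the example, not a real dataset

    def doi_url(self) -> str:
        # DOIs resolve through the doi.org proxy.
        return f"https://doi.org/{self.doi}"

    def citation(self) -> str:
        # Illustrative layout only; follow the target journal's own style.
        names = "; ".join(self.creators)
        return (
            f"{names} ({self.year}). {self.title} [Data set]. "
            f"{self.repository}. {self.doi_url()}"
        )
```

For example, `DatasetReference(("Doe, J.",), 2015, "Example measurements", "Example Repository", "10.5555/example").citation()` would produce a single reference-list entry ending in the resolvable DOI link.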
Additional Requirements and Guidance from Office of Science Program Offices
Information about Data Management Resources at Office of Science User Facilities
View information about the data management resources available at the Office of Science User Facilities.
Data preservation means providing for the usability of data beyond the lifetime of the research activity that generated them.
Data sharing means making data available to people other than those who have generated them. Examples of data sharing range from bilateral communications with colleagues to providing free, unrestricted access to the public through, for example, a web-based platform.
Digital Research Data:
The term digital data encompasses a wide variety of information stored in digital form including: experimental, observational, and simulation data; codes, software and algorithms; text; numeric information; images; video; audio; and associated metadata. It also encompasses information in a variety of different forms including raw, processed, and analyzed data, published and archived data.
This statement focuses on digital research data, which are research data that can be stored digitally and accessed electronically. Research data are defined in regulation (2 CFR 200.315 (e), continuing the definition from 2 CFR 215 (OMB Circular A-110)) as follows:
“Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues. This 'recorded' material excludes physical objects (e.g., laboratory samples). Research data also do not include:
(A) Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law; and
(B) Personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study.”
In the context of this statement, validate means to support, corroborate, verify, or otherwise determine the legitimacy of the research findings. Validation of research findings could be accomplished by reproducing the original experiment or analyses; comparing and contrasting the results against those of a new experiment or analyses; or by some other means.
View Digital Data Management Frequently Asked Questions.
Federal Advisory Committee Reports on the Dissemination of Research Results (2011)
- Advanced Scientific Computing Advisory Committee: Charge, Report
- Basic Energy Sciences Advisory Committee: Charge, Report
- Biological and Environmental Research Advisory Committee: Charge, Report
- Fusion Energy Sciences Advisory Committee: Charge, Report
- High Energy Physics Advisory Panel: Charge, Report
- Nuclear Science Advisory Committee: Charge, Report