Several funding agencies require or encourage the development of data management plans for research. Starting in FY 2011, a number of U.S. agencies began requiring a Data Management Plan as part of a grant application. Review the specific data management planning guidelines of the funding agency with which you are working. While agencies may state their requirements somewhat differently, the elements are generally similar. Below is a framework that can be used as an outline when assembling a data management plan. Prospective PIs should always follow the funder's requirements; in the absence of such guidance, this framework is a general starting point.
An extremely useful tool for developing agency-specific (and, in the case of NSF, directorate-specific) plans is the DMPTool (https://dmp.cdlib.org/institutional_login), developed by the University of California system and several other research institutions. PSU has become an affiliated institution. In the "Select Your Institution" pull-down menu, choose "Portland State University" and follow the steps to create your new account.
- Data Types and Structure
- Data Acquisition, Integrity and Quality
- Intellectual Property Rights
- Dissemination Policy
Provide a brief description of the information to be gathered. Describe the nature, scope and scale of the data that will be generated or collected.
- Types of data to be produced in the course of the project include survey data, conversation data, and video recordings of what a sample of zoo visitors look at during a zoo visit.
In addition to describing the types of data, you should also describe the formats of the files in which the data will be stored, maintained, and made available. Whenever possible, non-proprietary formats should be used. Data should be converted to open, shareable formats, especially when archiving data or depositing it in a data center or repository.
- Example 1:
- Digital video data files generated will be processed and submitted to the [repository] in MPEG-4 (.mp4) format.
- Standards and Formats
- Example 2:
- Textual data will be processed and submitted to the [repository] as plain text, ASCII (.txt).
- Example 3:
- Quantitative survey data generated will be processed and submitted to the [repository] as an SPSS system file. Qualitative data will be in Word files.
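Before deposit, it can help to check which project files are already in open formats and which still need conversion. The following is a minimal sketch of such a check; the `PREFERRED_FORMATS` list and the file names are illustrative assumptions, not a repository requirement, and some repositories do accept formats (such as SPSS system files) that this list would flag.

```python
from pathlib import PurePath

# Assumed, illustrative list of open formats; replace it with the
# formats your funder or repository actually recommends.
PREFERRED_FORMATS = {
    ".mp4": "MPEG-4 video",
    ".txt": "plain text",
    ".csv": "comma-separated values",
    ".xml": "XML",
}


def check_formats(filenames, preferred=PREFERRED_FORMATS):
    """Split filenames into (already_open, needs_review) lists by extension."""
    already_open, needs_review = [], []
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        if suffix in preferred:
            already_open.append(name)
        else:
            needs_review.append(name)
    return already_open, needs_review


# Hypothetical project files.
already_open, needs_review = check_formats(
    ["visit01.mp4", "survey.sav", "notes.DOCX", "transcript.txt"]
)
```

Files in `needs_review` are candidates for conversion, or for a note in the plan explaining why the original format is retained.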
Metadata is important because it can help others locate, understand, and interpret your data. It can be useful during the research process, and is a critical component in publicizing and sharing your data. Metadata can include descriptive, technical and administrative information about your research. There are many metadata standards, including some that are discipline-specific. Structured or tagged metadata, like the XML format of the Data Documentation Initiative (DDI) standard, are optimal because XML offers flexibility in display and is also preservation-ready and machine-actionable. Feel free to contact us for help.
- Example 1:
- The clinical data collected from this project will be documented using Clinical Data Interchange Standards Consortium (CDISC) metadata standards.
- Example 2:
- PDXScholar [repository] uses Qualified Dublin Core for descriptive metadata, and any technical metadata schemas will be incorporated into the record in accordance with current standards. The PI will be responsible for retaining the required metadata. The Digital Initiatives Coordinator will organize and format the metadata and work with the PI to ensure its completeness and accuracy.
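To illustrate what structured, machine-actionable metadata can look like, here is a minimal sketch that builds a small XML record using the simple Dublin Core element set. It is only an illustration: the `metadata` wrapper element and all field values are hypothetical, it uses the 15-element Dublin Core namespace rather than Qualified Dublin Core, and it is not the record format PDXScholar itself produces.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a wrapper element with one dc:* child per (name, value) pair."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# All values below are made up for illustration.
record = dublin_core_record([
    ("title", "Zoo visitor attention study: survey data"),
    ("creator", "Example, Pat"),
    ("date", "2012-06-30"),
    ("format", "text/csv"),
    ("rights", "CC BY 4.0"),
])
xml_text = ET.tostring(record, encoding="unicode")
```

Because each field is tagged, the same record can be displayed, indexed, or converted to another schema without re-keying the information.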
Describe the storage methods and backup procedures for the data, including the physical and cyber resources and facilities to be used for preservation and storage of the research data. Some common methods include incremental backup of data, MD5 checksums to detect corruption, and storage on RAID arrays to protect against hardware failure. For additional information, contact Will Garrick, Manager of OIT - ARC, at email@example.com or by phone at 725-3235.
- All electronic data generated by proposed research will be redundantly archived. Portland State University has a secure server on which all information is stored. The server hard drives are set up in a RAID that is capable of full recovery even in the case of multiple simultaneous disk failures.
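As an illustration of the checksum approach, the following minimal sketch records an MD5 checksum for every file in a data directory and later reports files whose contents no longer match. The function names are hypothetical, and the sketch only detects changes; recovering a damaged file still depends on having a backup copy.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its MD5 checksum."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(directory, manifest):
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(directory)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A typical use is to call `build_manifest` when the data are first archived, store the result alongside the data, and run `changed_files` against it on a regular schedule.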
Address any privacy issues here. If your data contains personal information or data of a sensitive nature, you need to describe how you will protect your data and prevent others from gaining access to that information.
- During data analysis, the data will be accessible only by certified members of the project team. The research project will remove any direct identifiers in the data before deposit with [repository].
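Removing direct identifiers before deposit can be as simple as dropping the relevant columns from a tabular file. The sketch below assumes CSV data and a hypothetical set of identifier column names (`name`, `email`, `phone`); real projects should base the list on their own IRB protocol. Dropping direct identifiers does not by itself guarantee anonymity, since combinations of remaining fields may still identify a participant.

```python
import csv
import io

# Hypothetical column names treated as direct identifiers.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def remove_direct_identifiers(csv_text, identifiers=DIRECT_IDENTIFIERS):
    """Return csv_text with the identifier columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [c for c in (reader.fieldnames or []) if c.lower() not in identifiers]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up example record.
raw = "id,name,email,age\n1,Ann,ann@example.org,34\n"
cleaned = remove_direct_identifiers(raw)
```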
State what rights you wish to maintain over your data. Consider whether you wish to be attributed (cited) and whether you will allow commercial use of your data. You can also embargo your data for a set period to allow time to publish from it. Creative Commons licenses are an easy and effective way to manage rights associated with your research.
If you are using someone else's data, you need to identify its source and how it was used or modified.
NSF and other agencies are emphasizing open access to data. It is best to deposit your research data in data centers or repositories that help facilitate access.
- The principal investigators on the project and their institutions will hold the intellectual property rights for the research data they generate but will grant redistribution rights to [repository] for purposes of data sharing.
Describe how you will make your data available.
Many research communities have established data archives. If your research community has such an archive, it should be an important component of your data archiving plans.
A comprehensive discipline-specific list is available via the Open Access Directory list of Open Data Repositories.
Portland State University Library provides a mechanism for permanently archiving your data, PDXScholar, that can be used in addition to external repositories.
Plans should include a timeline proposing how long the data are to be preserved, outlining any changes in access anticipated during the preservation timeline, and documenting the resources and capabilities (e.g., equipment, connections, systems, expertise) needed to meet the preservation goals.
The previously common "Available Upon Request" is not likely to be sufficient. Data may be submitted to a disciplinary archive such as RePEc or PubChem, if one is available. Some researchers make their data available through the journals in which they publish or on personal web sites. We have an institutional repository, PDXScholar, that can serve as a destination for your data.
- The data will be archived in perpetuity at Portland State University's institutional repository, PDXScholar (http://pdxscholar.library.pdx.edu/). The data will be available [upon creation/upon conclusion of the grant/after some embargo period] in accordance with the rights policies outlined above. Primary responsibility for curating and preparing the data for archiving rests with the Digital Initiatives Coordinator at Portland State University.
Describe the audience for the data you will produce. The audience for the data may influence how the data are managed and shared—for example, when audiences beyond the academic community may use the research data.
- In addition to the research community, we expect these data will be used by practitioners and policymakers.