Preservation and archiving include service-level procedures that ensure data can be maintained for the future. Backup and storage are components of preservation planning, but on their own they are not sufficient to archive and preserve your data.
Before deciding on a local solution for archiving and preservation, we recommend investigating subject- or discipline-based repositories for archiving your data. Please contact Karen Bjork if you are unsure whether there is a repository for your discipline. If you select a repository, make sure that it provides preservation services.
Portland State University Library supports an institutional repository, PDXScholar, which can be used for data preservation. When you use this repository, your data will be preserved according to the digital preservation standards enacted by the Library. PDXScholar will allow multiple types of files to be uploaded, but they will have to be downloaded for re-use.
PDXScholar offers a repository of digital research and educational materials created and used by the University community and its strategic collaborators. The goal of PDXScholar is to advance research and learning at Portland State, to foster interdisciplinary collaboration, and to contribute to the development of new knowledge through the archiving, preservation, and presentation of digital resources. Original research products including data and publications will be permanently preserved and made accessible. Because all PDXScholar content is completely open access, authors should retain all rights of copyright for any deposited data.
If you are interested in depositing your data into PDXScholar, please contact Karen Bjork to discuss your research prior to submitting your proposal. It is important that the PSU Library works with you to consider the specifics of your long-term archiving needs and PDXScholar policies.
A big advantage of depositing your data in an archive or repository is that it will be preserved - even for your own future use!
Preservation Best Practices and Things to Mention in Your Plan
- Back-up intervals
- Data loss strategies
- File format migration plans and schedules (including software required to view files)
- Bit-integrity checks/check-sums
- Multiple copies
- Storage media (e.g. tape, online/local, and online/cloud)
- Data security and access issues
- Version control
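As an illustration of the bit-integrity item above, the following is a minimal sketch of a checksum manifest in Python. It is not a PDXScholar tool; the function names (sha256_of, build_manifest, changed_files) and the choice of SHA-256 are assumptions made for this example. The idea is to record a checksum for every file when the data are deposited, then recompute the checksums later and compare.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir):
    """Map each file's path (relative to data_dir) to its checksum."""
    root = Path(data_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir, manifest):
    """List manifest entries that are now missing or no longer match."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built at deposit time can be stored alongside the data (for example as a text file), and changed_files can be run at whatever interval your plan specifies. An empty result means every recorded file still matches its original checksum.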
Fragility of Data
Digital data - made up of bits and bytes - are in many ways more fragile than paper records for a number of reasons. Depending on the type of media on which the data are stored (magnetic, optical, and so forth), over time they are subject to different forms of 'bit rot' or decay, in which the electrical charge representing a bit disperses.
This gradually introduces minor or major errors into the data and reduces the ability of computer software to read them. Two basic strategies guard against this:
- Refreshment - move data files onto new storage media well within the projected lifespan of the media.
- Replication - by keeping more than one copy of a data file, the risk of losing a readable copy over time is reduced.
These strategies apply to both online and offline storage media. Where data are kept on a server, the backup procedures and disaster recovery planning for that server may already cover refreshment and replication. Ask your system administrator about their procedures and tests.
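The replication strategy can be sketched in a few lines of Python. This is a hypothetical example, not a procedure prescribed by the Library: the function names (file_digest, replicate) and the directory layout are assumptions. It copies a file to several locations and confirms by checksum that each copy is identical to the source.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source, replica_dirs):
    """Copy source into each replica directory and verify every copy."""
    source = Path(source)
    expected = file_digest(source)
    verified = []
    for directory in replica_dirs:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)  # copies content and timestamps
        if file_digest(target) != expected:
            raise IOError(f"Copy at {target} does not match the source")
        verified.append(target)
    return verified
```

In practice the replica directories would sit on different media or in different locations (for example a local disk and a mounted network share), so that a single failure cannot destroy every copy.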
Offline storage media include optical discs such as compact discs (CDs) and digital video discs (DVDs). Depending on the quality, these may need to be refreshed every ten years or less. Portable flash drives can be useful for short-term backup and portability but are not reliable for preservation purposes.
Another threat to long-term accessibility of datasets is software obsolescence. When a new version of a software product is unable to render a file created in an older version, or when a software company retires a product, goes bankrupt, etc., there may be no available version of the software to be used on newer operating system platforms. Strategies for addressing software obsolescence include:
- Migration - when a new software version has become established, the data file is converted or 'migrated' to the new software version or package.
- Emulation - a specialised strategy to recreate the functionality of the obsolete software package on a new operating system, or, for example, on a Java Virtual Machine system.
- Format conversion - the most pro-active method is to select a format that is most easily imported into a number of suitable software programs, or that is based on a universal standard.
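As one possible illustration of format conversion, the sketch below rewrites a delimited text file from a legacy encoding into comma-separated UTF-8, a plain-text format that many programs can import. The function name convert_to_utf8_csv, the default source encoding (Latin-1), and the default tab delimiter are assumptions for this example; your own source files may need different settings or a different target format.

```python
import csv


def convert_to_utf8_csv(src_path, dst_path, src_encoding="latin-1", src_delimiter="\t"):
    """Rewrite a delimited text file as comma-separated UTF-8.

    Returns the number of rows written, header row included.
    """
    rows = 0
    with open(src_path, newline="", encoding=src_encoding) as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter=src_delimiter):
            writer.writerow(row)
            rows += 1
    return rows
```

Comparing the returned row count with the row count of the original file is a quick sanity check that nothing was dropped during conversion. Keep the original file as well until the converted copy has been checked.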