Digital Archiving for Beginners: Tools and Best Practices

Whether you’re a seasoned historian, an undergraduate researcher, or an oral history enthusiast, preserving historical material in digital formats is no longer optional—it’s essential. Digital archiving ensures that documents, interviews, photographs, and other artifacts are accessible, searchable, and protected against physical degradation. However, for those new to the practice, digital archiving can seem overwhelming. What format should you use? Where should files be stored? How do you ensure long-term access?

Why Digital Archiving Matters

Digitization does more than preserve—it democratizes access. Archives that were once only available in physical locations can now be explored from any corner of the world. This enhances historical scholarship, enables interdisciplinary research, and empowers community storytelling.

Key Benefits of Digital Archiving:

Preservation: Protects fragile or deteriorating materials from further damage.

Access: Enables remote collaboration and global availability.

Searchability: Metadata and indexing make finding information faster and more efficient.

Redundancy: Digital backups reduce the risk of data loss due to fire, flood, or human error.

Understanding the Basics: What to Archive and Why

Before diving into tools and workflows, it’s important to define what materials you should archive.

Common Materials to Digitize

Documents: Letters, journals, transcripts, research notes

Images: Photos, scans of historical maps, posters

Audio: Oral history recordings, interviews, lectures

Video: Recorded testimony, public events, presentations

Metadata: Descriptive, technical, and administrative information about each file, recorded alongside the digitized items

Always consider: Will this be useful for future research, teaching, or public engagement?

Step-by-Step Guide to Starting a Digital Archive

Step 1 – Organize Physical and Digital Materials

Start by taking inventory of what you have. Use a spreadsheet or database to categorize materials by:

  • Type (document, photo, audio, etc.)
  • Source (who created or collected it)
  • Date
  • Subject or theme

This pre-planning helps avoid redundant work and ensures consistency in your archiving strategy.
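The inventory described above can be sketched as a small CSV file. The column names, the identifier scheme, and the sample rows below are illustrative assumptions, not a required schema; a spreadsheet with the same columns would work just as well.

```python
import csv
import io

# Hypothetical column names mirroring the categories listed above.
FIELDS = ["identifier", "type", "source", "date", "subject"]


def write_inventory(rows, handle):
    """Write inventory rows (dicts keyed by FIELDS) as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def count_by_type(rows):
    """Return how many items of each material type the inventory holds."""
    counts = {}
    for row in rows:
        counts[row["type"]] = counts.get(row["type"], 0) + 1
    return counts


# Made-up sample entries, for illustration only.
sample_rows = [
    {"identifier": "item-001", "type": "audio", "source": "Interviewer A",
     "date": "2021-05-14", "subject": "migration"},
    {"identifier": "item-002", "type": "photo", "source": "Family collection",
     "date": "1968", "subject": "family life"},
    {"identifier": "item-003", "type": "audio", "source": "Interviewer A",
     "date": "2021-06-02", "subject": "work"},
]

# Write to an in-memory buffer so the example runs without touching disk.
buffer = io.StringIO()
write_inventory(sample_rows, buffer)
print(buffer.getvalue())
print(count_by_type(sample_rows))
```

Counting items by type gives a quick sense of how much scanning versus audio work lies ahead before any digitization starts.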

Step 2 – Choose the Right File Formats

Use open, non-proprietary formats wherever possible to ensure future accessibility. Below are recommended formats for each type of material:

Material Type | Recommended Format       | Notes
Text          | PDF/A, TXT, or XML       | PDF/A is an ISO archival standard
Image         | TIFF (preferred), PNG    | Avoid JPEG (lossy) for preservation
Audio         | WAV (preferred), FLAC    | High fidelity; WAV is uncompressed, FLAC is losslessly compressed
Video         | MP4 (with H.264 codec)   | Good balance between quality and size
Metadata      | Dublin Core XML, JSON-LD | Enables discoverability
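As a rough sketch, the format recommendations above can be turned into a quick check that flags files whose extensions fall outside the list. The extension mapping and the sample file names are assumptions made for illustration. An extension is only a hint: a `.pdf` file is not necessarily PDF/A, so flagged and unflagged files alike may still need a closer look.

```python
from pathlib import Path

# Extensions derived from the recommended formats above. The mapping itself
# is an illustrative assumption, not an official standard.
PRESERVATION_EXTENSIONS = {
    ".pdf": "text", ".txt": "text", ".xml": "text or metadata",
    ".tif": "image", ".tiff": "image", ".png": "image",
    ".wav": "audio", ".flac": "audio",
    ".mp4": "video",
    ".jsonld": "metadata",
}


def classify(filename):
    """Return the material type for a recommended extension, or None otherwise."""
    return PRESERVATION_EXTENSIONS.get(Path(filename).suffix.lower())


def needs_review(filenames):
    """List the files whose extension is not among the recommended formats."""
    return [name for name in filenames if classify(name) is None]


# Hypothetical file names, for illustration only.
files = ["letter_1952.TIF", "portrait.jpg", "interview_01.wav", "notes.docx"]
print(needs_review(files))
```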

Step 3 – Digitize with Quality Standards

If you’re scanning materials:

  • Use 600 dpi for images and 300 dpi for text
  • Scan in color, even if originals are black-and-white, to capture more tonal detail
  • Use flatbed scanners, not phone apps, for long-term projects
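To get a feel for what these resolutions mean in practice, the arithmetic is simply inches multiplied by dots per inch. The sketch below assumes a letter-size page and a 4 x 6 inch photograph as examples, and 24-bit color (3 bytes per pixel) for the uncompressed size estimate. Actual TIFF files may differ slightly because of headers and optional lossless compression.

```python
def scan_dimensions(width_in, height_in, dpi):
    """Pixel dimensions of a scan: physical size in inches times resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_px, height_px, bytes_per_pixel=3):
    """Approximate raw image size; 3 bytes per pixel assumes 24-bit color."""
    return width_px * height_px * bytes_per_pixel


# Example sizes are assumptions chosen for illustration.
letter_page = scan_dimensions(8.5, 11, 300)   # text document at 300 dpi
photo = scan_dimensions(4, 6, 600)            # photograph at 600 dpi

print("Letter page:", letter_page, "pixels")
print("Photo:", photo, "pixels")
print("Photo, uncompressed:", uncompressed_bytes(*photo) / 1_000_000, "MB")
```

Estimates like this help when budgeting storage for a collection before scanning begins.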

For audio/video:

  • Record in uncompressed or lossless formats (e.g., WAV or FLAC)
  • Use external microphones for interviews
  • Transcribe audio for accessibility and indexing
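Uncompressed audio takes up predictable space, which makes storage planning straightforward. The sketch below estimates the size of a recording and reads back the technical parameters of a WAV file using Python's standard `wave` module. The sample rates and bit depths are example settings assumed here, not recommendations from this guide, and the one-second silent file is generated in memory purely so the example runs without a real recording.

```python
import io
import wave


def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of the raw audio data in an uncompressed PCM recording."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


def describe_wav(handle):
    """Read basic technical parameters from a WAV path or file-like object."""
    with wave.open(handle, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "bit_depth": wav.getsampwidth() * 8,
            "seconds": wav.getnframes() / wav.getframerate(),
        }


# One hour of 48 kHz, 24-bit stereo audio (example settings, assumed here).
one_hour = pcm_size_bytes(48_000, 24, 2, 3600)
print(f"{one_hour / 1_000_000:.1f} MB per hour")

# Build one second of 16-bit mono silence in memory for demonstration.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)          # 2 bytes per sample = 16-bit
    out.setframerate(44_100)
    out.writeframes(b"\x00\x00" * 44_100)
buffer.seek(0)

info = describe_wav(buffer)
print(info)
```

The values returned by `describe_wav` are the kind of technical metadata worth recording for each interview file.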

Step 4 – Add Descriptive Metadata

Metadata is what turns a digital folder into a usable archive. Use consistent fields:

  • Title
  • Creator
  • Date
  • Description
  • Subject/keywords
  • Rights/license

For oral histories, include context like interview location, interviewer name, and summary of themes.
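As a minimal sketch, the fields above map directly onto elements of the Dublin Core element set and can be written out as XML with Python's standard library. The `record` wrapper element and all sample values are hypothetical. A repository will usually specify its own wrapper and required fields, so treat this as an illustration of the idea rather than a submission-ready format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Serialize a dict of Dublin Core element names and values as XML.

    The <record> wrapper is a hypothetical container chosen for this sketch.
    """
    root = ET.Element("record")
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample values, for illustration only.
xml_text = dublin_core_record({
    "title": "Interview with a first-generation immigrant family",
    "creator": "Example Interviewer",
    "date": "2021-05-14",
    "description": "Recorded at the family home; themes include arrival and work.",
    "subject": "oral history; migration",
    "rights": "CC BY-NC 4.0",
})
print(xml_text)
```

Keeping the same element names across every item is what makes the collection searchable later.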

Tools and Software for Digital Archiving

Here are beginner-friendly tools used by historians, libraries, and archivists.

File Management and Metadata

Tropy: Ideal for organizing photos and adding metadata

Excel or Airtable: Useful for inventory lists and tagging systems

ExifTool: Command-line tool for managing metadata in multimedia files

Scanning and Image Processing

VueScan: Advanced scanner software with archival options

GIMP: Open-source photo editing tool for cleaning scanned images

Audio/Video

Audacity: Free audio editor for cleaning and segmenting interviews

Otter.ai or Descript: Transcription services for spoken-word material

HandBrake: Open-source video converter for transcoding files into standard formats

Storage and Backup

External hard drives: Use for local, offline storage

Cloud services: Dropbox, Google Drive (with encryption if needed)

LOCKSS (“Lots of Copies Keep Stuff Safe”): Academic model for distributed backups

Best Practices for Long-Term Preservation

Digital archiving is not a one-time task. Here’s how to build sustainable habits.

1. Follow the 3-2-1 Rule

  • 3 copies of your files
  • Stored on 2 different media types (e.g., cloud + external drive)
  • 1 copy stored offsite (or in the cloud)
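The rule is easy to check mechanically. The sketch below models each copy with a hypothetical `Copy` record and tests the three conditions in turn. The locations and media labels are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of the archive (hypothetical record for this sketch)."""
    location: str
    media_type: str
    offsite: bool


def satisfies_3_2_1(copies):
    """True if there are 3+ copies, on 2+ media types, with 1+ copy offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.media_type for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite


# Example setups, assumed for illustration.
good_setup = [
    Copy("laptop", "internal drive", offsite=False),
    Copy("desk drawer", "external drive", offsite=False),
    Copy("cloud account", "cloud", offsite=True),
]
weak_setup = [
    Copy("desk drawer", "external drive", offsite=False),
    Copy("bookshelf", "external drive", offsite=False),
]

print(satisfies_3_2_1(good_setup))
print(satisfies_3_2_1(weak_setup))
```

The second setup fails on all three counts: two copies, one media type, and nothing stored away from the original location.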

2. Document Your Process

Create a README file or “archivist’s note” for each collection explaining:

  • Scope of the archive
  • Tools used
  • Naming conventions
  • Metadata scheme

This is especially useful if the archive will be inherited by another researcher or deposited in a library.
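One way to keep such notes consistent across collections is to generate them from a small template. The sketch below is only an assumption about how a README might be laid out: the section order follows the four points above, and the collection title, scope, and naming pattern are hypothetical.

```python
def render_readme(title, scope, tools, naming_convention, metadata_scheme):
    """Return plain-text README content covering the four points listed above."""
    lines = [
        title,
        "",
        "Scope of the archive:",
        f"  {scope}",
        "",
        "Tools used:",
    ]
    lines += [f"  - {tool}" for tool in tools]
    lines += [
        "",
        "Naming conventions:",
        f"  {naming_convention}",
        "",
        "Metadata scheme:",
        f"  {metadata_scheme}",
    ]
    return "\n".join(lines) + "\n"


# Sample values are made up for illustration.
readme_text = render_readme(
    title="Example Oral History Collection",
    scope="25 interviews with accompanying family photographs.",
    tools=["Tropy", "Audacity", "Otter.ai"],
    naming_convention="YYYY-MM-DD_lastname_type.ext (hypothetical pattern)",
    metadata_scheme="Dublin Core (title, creator, date, description, subject, rights)",
)
print(readme_text)
```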

3. Update Your Formats and Tools

Technology changes. Plan to check and update your archive every few years:

  • Convert obsolete formats (e.g., WordPerfect) to modern standards
  • Migrate files from aging hard drives
  • Re-test access to cloud repositories
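When migrating files from an aging drive, it helps to confirm that every file arrived intact. One common approach, sketched here as an assumption rather than something this guide prescribes, is to compare SHA-256 checksums of the source and target copies. The folder and file names are hypothetical, and the demonstration runs in a temporary directory so nothing real is touched.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-256 checksum, reading in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_migration(source_dir, target_dir):
    """Return relative paths of files that are missing or differ in the target."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    problems = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(source_dir)
        target = target_dir / relative
        if not target.is_file() or sha256_of(source) != sha256_of(target):
            problems.append(relative.as_posix())
    return problems


# Demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    old_drive = Path(workdir) / "old_drive"
    new_drive = Path(workdir) / "new_drive"
    old_drive.mkdir()
    (old_drive / "interview_01.wav").write_bytes(b"example audio bytes")
    (old_drive / "notes.txt").write_text("example notes", encoding="utf-8")

    shutil.copytree(old_drive, new_drive)
    clean_result = verify_migration(old_drive, new_drive)

    # Simulate a damaged copy to show what a failed check looks like.
    (new_drive / "notes.txt").write_text("damaged", encoding="utf-8")
    damaged_result = verify_migration(old_drive, new_drive)

print(clean_result)
print(damaged_result)
```

An empty list means every source file has an identical counterpart in the target. Any listed path should be copied again before the old drive is retired.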

Real-World Example: An Oral History Archive

A graduate student at the University of Toronto conducted 25 interviews with first-generation immigrant families. Using a Zoom recorder and a backup app on their phone, they captured high-quality WAV files. They:

  • Scanned family photographs using a flatbed scanner at 600 dpi
  • Uploaded all files into Tropy with Dublin Core metadata
  • Used Otter.ai for transcriptions and Descript for editing
  • Backed up to two external drives and Google Drive
  • Wrote a README explaining the purpose and scope of the collection

This project was later accepted into the university’s digital repository with minimal adjustment.

Where to Deposit Your Digital Archive

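If your project has long-term value, consider depositing it in a trusted repository or publishing it through an established platform:

  • Institutional repositories (university libraries)
  • Omeka (open-source publishing platform for archives and exhibits)
  • DPLA (Digital Public Library of America), which aggregates collections through partner hubs rather than accepting direct deposits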
  • Internet Archive
  • StoryCorps Archive for oral history interviews

Always check their format and metadata requirements before submitting.

Conclusion

Digital archiving may seem technical, but it is ultimately an extension of good historical practice: preserving, organizing, and sharing knowledge responsibly. With careful planning, the right tools, and attention to metadata and storage, even beginners can build a robust and lasting archive. In doing so, you’re not just protecting data—you’re preserving memory, culture, and stories for generations to come.