About the SDR

The SDR is Stanford Libraries’ open access repository solution for scholarly content of enduring value. We build systems and provide services that make digital information resources available for generations of future scholars.

Read on to learn more about the repository contents, technology, and staffing.

In production since 2005, the SDR currently manages over 1 petabyte of unique content: 5.2 million digital objects in nearly 2,900 collections.

A wide variety of content types are represented in the SDR: articles, books, data, software, theses, dissertations, reports, manuscripts, maps, photographs, oral histories, music, video, web archives, 3D models, and so forth.

Technology

The SDR is designed, developed, and maintained by Stanford Libraries’ Digital Library Systems and Services department. The system uses a combination of open source technologies. Open source is a key element of our overall technology strategy: it gives us more control over technology features and implementation, and helps us deliver on our commitment to provide transparent services for the content entrusted to our long-term care.

Postgres

Postgres serves as a robust, scalable repository back end for storing and managing digital object metadata. Its relational design lends itself well to establishing relationships between objects in the digital library.

Solr

Solr indexes repository information and offers highly performant search of the repository contents. Its flexibility can accommodate a diversity of data sources.

Blacklight

Blacklight builds on the Solr index to provide faceted searching, browsing, and tailored views of objects.
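To make the idea of faceted searching over a Solr index more concrete, here is a minimal sketch of how a faceted query URL could be assembled. It uses only Solr's standard select parameters (`q`, `rows`, `wt`, `facet`, `facet.field`, `fq`). The host, the core name `example_core`, and the field names `content_type` and `collection` are assumptions made for illustration; they are not the SDR's actual configuration, and the helper `build_facet_query` is hypothetical.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, text, facet_fields, filters=None, rows=10):
    """Build a Solr select URL with faceting enabled.

    base_url, the facet field names, and the filter fields are all
    hypothetical; substitute the values of a real Solr core.
    """
    params = [
        ("q", text),          # the user's search terms
        ("rows", str(rows)),  # number of results to return
        ("wt", "json"),       # ask Solr for a JSON response
        ("facet", "true"),    # turn faceting on
    ]
    # One facet.field parameter per field to count values for.
    for field in facet_fields:
        params.append(("facet.field", field))
    # Each selected facet value becomes a filter query (fq).
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    return f"{base_url}/select?{urlencode(params)}"


url = build_facet_query(
    "http://localhost:8983/solr/example_core",  # assumed local Solr core
    "oral histories",
    facet_fields=["content_type", "collection"],  # assumed field names
    filters={"content_type": "video"},
)
print(url)
```

A discovery layer such as Blacklight issues queries of this general kind on the user's behalf and renders the returned facet counts as clickable filters.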

In addition to these three technologies, the SDR incorporates two locally developed and implemented components:

  • Cocina, a digital object data model that, together with Postgres, succeeds Fedora 3, the SDR’s original object store.
  • Moab, for versioning archived digital objects. Details about this novel approach are available in a 2013 article originally published in the Code4Lib Journal. Stanford is now actively involved in the specification of the Oxford Common File Layout (OCFL), which is based on Moab. We plan to implement OCFL in the future.

Security

Maintaining a highly secure environment for SDR content is a top priority to ensure the ongoing integrity of the information stored. We employ enterprise-based authenticated access to the SDR core system for content storage and management. The system is designed for storing "moderate risk" data, per the risk classifications established by Stanford's Information Security Office. For user access to information published through the SDR, the system has provisions for Stanford-only access and other license-based restrictions as necessary.

Storage

In 2020 we completed a three-year effort to design, plan, and execute a full migration of our preservation storage system from NetApp to Ceph, an open source, distributed storage system. In addition to the access copies readily available online to users, one "preservation copy" of SDR content is stored locally on spinning disk. We have twelve local storage servers running Ceph, yielding approximately 2.2 PB of usable storage.

In addition, all SDR content is replicated to cloud storage services in three geographic zones operated by two different service providers. The aim is many copies, geographically dispersed, with multiple vendors and no single point of failure.
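The replication goal described above (three geographic zones, two service providers) can be expressed as a simple check. The following is a minimal sketch, not the SDR's actual tooling: the `CloudCopy` type, the `meets_replication_policy` function, and the provider and zone labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudCopy:
    """One cloud replica of an object (hypothetical record type)."""
    provider: str
    zone: str


def meets_replication_policy(copies, min_zones=3, min_providers=2):
    """Return True if the copies span enough distinct zones and providers.

    The defaults mirror the stated goal of three geographic zones
    across two different service providers.
    """
    zones = {copy.zone for copy in copies}
    providers = {copy.provider for copy in copies}
    return len(zones) >= min_zones and len(providers) >= min_providers


# Placeholder labels; real provider and zone names are not given in the text.
copies = [
    CloudCopy("provider-a", "zone-1"),
    CloudCopy("provider-a", "zone-2"),
    CloudCopy("provider-b", "zone-3"),
]
print(meets_replication_policy(copies))
```

Counting distinct zones and distinct providers separately captures both halves of the goal: geographic dispersal and vendor independence.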

 

The SDR Team

With a supporting team of service managers, specialist liaisons, product owners, metadata specialists, preservation experts, software engineers, and system administrators, the SDR plays a vital role in the services, functions, and daily operations of the Stanford Libraries and the Stanford community more broadly.