CTO Michael Della Bitta shares updates on DPLA technology projects

By Michael Della Bitta, February 10, 2022.

Several times a year, DPLA Chief Technology Officer Michael Della Bitta shares updates on our tech team’s projects, processes, and works-in-progress. This is his first update of 2022. 

Thanks to funding from the Wikimedia Foundation, the DPLA tech team has begun updating the processes we use to bring DPLA items into Wikimedia Commons. We’re working out how to sync metadata record updates to the media files we’ve already brought into the Commons. That way, those files can benefit from the richer description that comes from continued reharvests by DPLA and our network, and from the ongoing work our contributing institutions put into improving their metadata.

We’re also migrating as much metadata as possible to Structured Data Statements, a new way of describing files on Wikimedia Commons. Previously, the main way to describe a file in Wikimedia Commons was to embed metadata in Wikitext, which loses information about data types and leaves the metadata far less machine readable.

The new Structured Data Statements bring the descriptive power of systems like Wikidata to individual files in Wikimedia Commons. Field types and values can be represented by Wikidata URIs and are more controlled, queryable, and visible to automated processes, which will translate into improved findability and reuse.

You can see an example of this here. (Just click on the “Structured data” tab.)  
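To make the contrast with Wikitext a bit more concrete, here is a minimal sketch of what a single statement looks like when it is expressed in the Wikibase JSON shape that structured data on Commons is built on. This is an illustration only, not the code our ingest pipeline runs. The helper name `build_item_statement` is hypothetical, and the property and item IDs in the usage line (P275 for "copyright license", Q6938433 for CC0) are assumptions that should be checked against Wikidata before being relied on.

```python
import json


def build_item_statement(property_id: str, item_id: str) -> dict:
    """Build one Wikibase-style statement whose value is a Wikidata item.

    Hypothetical helper for illustration; it only assembles a dictionary
    and does not talk to Wikimedia Commons or Wikidata.
    """
    if not (property_id.startswith("P") and property_id[1:].isdigit()):
        raise ValueError(f"not a property ID: {property_id!r}")
    if not (item_id.startswith("Q") and item_id[1:].isdigit()):
        raise ValueError(f"not an item ID: {item_id!r}")

    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {
            "snaktype": "value",
            # The field itself is identified by a Wikidata property.
            "property": property_id,
            "datavalue": {
                # The value is typed as an entity reference, not free text.
                "type": "wikibase-entityid",
                "value": {
                    "entity-type": "item",
                    "numeric-id": int(item_id[1:]),
                    "id": item_id,
                },
            },
        },
    }


# Assumed IDs for illustration: P275 ("copyright license"), Q6938433 (CC0).
statement = build_item_statement("P275", "Q6938433")
print(json.dumps(statement, indent=2))
```

Because both the field and its value are identifiers rather than strings embedded in a template, an automated process can filter or query on them without parsing Wikitext.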

In partnership with Protocol Labs, and with the collaboration of NARA, we conducted an experiment to test a workflow in which DPLA provides a conduit for openly licensed content to IPFS and the Filecoin storage network. As I shared at the most recent Network Council meeting, we are investigating how these distributed web technologies could help make preservation easier and more efficient for a diverse range of organizations. We’re particularly thinking about how we might use these technologies to work with underrepresented communities who have not been equitably afforded the opportunity to participate in cultural memory efforts. (Those of you at our most recent DPLAfest may have been part of a discussion along these lines.) Some of you have already shared ideas, questions and feedback; if you or your Hub would like to be part of developing this project, please let me know. We are still learning about this work and will share more as things progress throughout the year. (Relatedly, our Executive Director John Bracken will be speaking on this topic at SXSW next month.)

DPLA is expanding our collection of permissively licensed ebooks, and we are working on an additional search and discovery experience and API for them. This collection is a superset of the Palace Bookshelf (formerly known as Open Bookshelf) collection that we build with the DPLA Curation Corps; we are casting a wide net to bring these materials together under one roof. Palace Bookshelf, meanwhile, is an effort to surface and catalog the best content for inclusion in experiences like the Palace App.

We imagine the collection we’re currently building as a place that might include anything freely available, including government documents and whitepapers. We think it will be a useful resource and a step toward making our cultural heritage aggregation and open access ebooks collection as accessible and easy to navigate as possible.

Please stay tuned for more updates on these and other projects in the coming months.