The City of Liberty and Liberated Data

By DPLA, January 30, 2014.

I just returned from several days in Philadelphia where I attended the American Library Association Midwinter conference. About 7,000 folks attended Midwinter, but that’s nothing compared to ALA Annual, which typically brings upwards of 30,000 people together to think and talk about libraries, open access, privacy, maker spaces, technology, and information provision and consumption. Always fascinating, the Philly conference was no different.

There were two events, in particular, that spotlighted the work that Information Professionals do around open access and freedom of/to information. These are topics near and dear to the hearts of DPLA staff. The first event—LibHack—was organized by Zach Coble, Emily Flynn, Jesse Saunders, and Chris Strauber, and hosted by the University of Pennsylvania’s Van Pelt Library. They did a remarkable job bringing 50 or so hackers and hacker newbies together to play with and develop apps and services from the APIs of the DPLA and OCLC. The setting was stunning, and the cool stuff developed by those who hacked the DPLA API made the day all the more remarkable.

WikipeDPLA, designed and developed at LibHack 2014 in Philadelphia.

Hackfests like this are possible because the DPLA Hubs and contributing institutions “freed their metadata” before they contributed it. This means that they either agreed that their metadata is, by default, in the public domain, or they dedicated their metadata to the public domain with a CC0 dedication, thereby waiving all rights to it. Whatever your thoughts about the rights status of metadata in general, without our partners’ belief in open and free access to their metadata, events like LibHack simply couldn’t happen. And it’s events like this one, where new apps and services create more access points to bring to light the great holdings of big institutions and small (like the Girl Scouts of Minnesota and Wisconsin River Valley), that get me jazzed about the work we do.

Here’s a shout out to some of the attendees and their great work:

  • @HistoricalCats, by Adam Malantonio, is a DPLA Twitterbot that serves up images of feline friends found in DPLA.
  • Create-your-own DPLA Twitterbot, from Simon Mai, Coral Sheldon-Hess, and Tessa Fallon, responds to queries on the #askDPLA hashtag with an item from the DPLA. (In development)
  • DPLA extension to VuDL, by Chris Hallberg and David Lacy.  VuDL is an open-source digital library package developed by Villanova University. The extension takes a user’s search query along with any filters they’ve applied to their search and pulls in results from the DPLA API. The records are displayed in a sidebar next to the main search results. The DPLA extension is only available on Villanova’s site at the moment. Stay tuned…
  • Exhibit Master 2000, created by Chad Fennell, Chad Nelson, and Nabil Kashyap, is a work in progress. The “Exhibit Master 2000” project will provide end-users with an easy-to-use interface for constructing online exhibits from DPLA resources. Their plan is to continue development of this app at GLAMhack Philly on Friday, February 1st.
  • WikipeDPLA, from Jake Orlowitz and Eric Phetteplace, is a script that finds relevant items in the DPLA and posts them automatically at the top of Wikipedia pages. You have to be a registered user and install a script to make it work. Luckily there are instructions and a helpful video.
  • A few intrepid teams even worked on data analysis on our behalf. Francis Kayiwa tackled subject terms and Thomas Dukleth, Jennifer (Yana) D. Miller, and Roxanne Shirazi took on copyright statements.
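Most of the projects above share one basic move: send a search query to the DPLA API and do something with the records that come back. Here is a minimal Python sketch of that step. It is not the code any of the teams wrote. The endpoint, the `q`, `page_size`, and `api_key` parameters, and the `docs` / `sourceResource.title` response fields reflect my understanding of version 2 of the DPLA API, so check them against the current API documentation. The function names, the `YOUR_API_KEY` placeholder, and the sample response are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed DPLA API v2 items endpoint; confirm against the API documentation.
DPLA_ITEMS_ENDPOINT = "https://api.dp.la/v2/items"


def build_search_url(query, api_key, filters=None, page_size=5):
    """Build a search URL from a keyword query plus optional field filters.

    `filters` maps field names (for example "sourceResource.type") to values,
    much as a search interface might pass along the facets a user selected.
    """
    params = {"q": query, "page_size": page_size, "api_key": api_key}
    params.update(filters or {})
    return DPLA_ITEMS_ENDPOINT + "?" + urlencode(params)


def extract_titles(response):
    """Pull one title per record out of a decoded JSON response."""
    titles = []
    for doc in response.get("docs", []):
        title = doc.get("sourceResource", {}).get("title", "(untitled)")
        # A title may arrive as a single string or as a list of strings.
        if isinstance(title, list):
            title = title[0] if title else "(untitled)"
        titles.append(title)
    return titles


# Hypothetical response, shaped the way this sketch assumes the API replies.
sample_response = {
    "count": 2,
    "docs": [
        {"sourceResource": {"title": "Cat on a porch"}},
        {"sourceResource": {"title": ["Kittens", "Alternate title"]}},
    ],
}

print(build_search_url("cats", "YOUR_API_KEY", {"sourceResource.type": "image"}))
print(extract_titles(sample_response))
```

To run this against the live API, you would fetch the URL (with `urllib.request.urlopen`, say), decode the JSON body, and pass the result to `extract_titles`. The sketch leaves the network call out so that it runs without an API key.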

The second event devoted to the concept of “freeing your data” was hosted by the ASCLA Collaborative Digitization Interest Group. Judith Ahronheim, from the University of Michigan Library, presented on the HathiTrust’s Copyright Review Management System and reminded me how challenging the determination of the copyright status of text-based material can be. For many reasons but mostly because US copyright law is so complex, those of us working with cultural heritage materials sometimes generalize copyright and public domain statuses. (I plead guilty.) It’s just easier to say that anything published before 1923 is in the public domain and anything published after that year is not, even though we know better.

Historical Cats (@historicalcats), a Twitter bot that automatically tweets cat images from DPLA’s collection, designed and developed at LibHack 2014 in Philadelphia.

Sponsored by IMLS, the focus of the CRMS project was to “increase the reliability of copyright status determinations of books published in the United States from 1923 to 1963 in the HathiTrust Digital Library, and to help create a point of collaboration with other institutions.” The exciting bit for me was that through a collaboration of four Midwestern universities and a cohort of graduate students, 90,000 volumes with ambiguous copyright statuses were determined to be in the US public domain. The full texts of these publications are now freely available through the HathiTrust Digital Library.

It’s always inspiring to me to see the hard work and resources that individuals and institutions are investing to free our data. But what really hit home for me at ALA was how the reach and impact of our independent work, when collectivized, is so much greater. Case in point: because HathiTrust is a DPLA partner, the metadata describing those 90,000 volumes is searchable through the DPLA portal and API. And more exciting still is that the HathiTrust data exposed through the API can now be explored in new ways using one of those cool, new LibHack apps. That it all came together in the city in which our country’s liberty was declared over 200 years ago wasn’t lost on this gal.

All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.