Proof is in the Shipping: An Update from the DPLA Tech Team
Working on software is great fun, but there’s nothing more rewarding to a software developer than the moment one ships software that enables users to do new things. The DPLA Tech Team has been hard at work this summer on a number of new projects, and I wanted to take a second to call out a few things we are shipping this month. You’ll hear more about these projects on our blog at DPLA News over the coming months, but here’s a quick look:
The original DPLA website included a feature that allowed users to build lists of their favorite items, but because of challenges maintaining that tool, it was not included as part of the redesigned website we launched in March. Instead, we have just launched our new take on list-making functionality. The Lists feature is built right into DPLA search results, topic browse lists, and item pages and allows you to create lists of items from across DPLA collections. Save your customized lists privately in your device’s browser or download them as spreadsheets to use on a different device or to share with others. View last week’s announcement and helpful guide to getting started with your lists.
New Search Index
Our search index software, Elasticsearch, powers the DPLA search experience so that when you search for “puppy in shoe,” you are treated to this adorable image matching your keywords. This summer, we upgraded our search index software from Elasticsearch 0.90, which was released in early 2013, to a recent build of Elasticsearch 6. While the project was a huge lift, this upgrade has allowed us to increase performance while shaving 28% off our hosting bills for some of our most costly infrastructure. Plus, now we have a strong base on which to build new features like autocomplete and spell check for keyword searches.
Being locked into a data architecture is a bad position to be in, and growing record counts only make the problem worse. DPLA used to rely on an incremental-update ingestion process that kept us from easily adding new fields to our search index or changing data once it was live. Over the past few months, we built a mass-reindexing system based on Amazon S3, Apache Spark, and Jenkins that lets us rebuild our search index from scratch multiple times a week. Now, we’re set to evolve our search capabilities, experiment with new ideas, and recover our search infrastructure from nothing within two hours.
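To make the idea of rebuilding an index from scratch a little more concrete, here is a minimal Python sketch of one small piece of such a pipeline: naming a fresh index and serializing records into the newline-delimited JSON body that Elasticsearch's bulk API accepts. It is not DPLA's actual code. The function names, the `dpla-items` prefix, the `item` document type, and the sample record are all hypothetical, and the Spark, S3, and Jenkins parts of the system are not shown.

```python
import json
from datetime import datetime, timezone


def new_index_name(prefix, now):
    """Return a fresh, timestamped index name so each rebuild gets its own index."""
    return f"{prefix}-{now.strftime('%Y%m%d-%H%M%S')}"


def build_bulk_payload(records, index_name, doc_type="item"):
    """Serialize records into the newline-delimited JSON body of a _bulk request.

    Each record becomes two lines: an action line naming the target index,
    followed by the document itself. The "item" type name is an assumption;
    Elasticsearch 6 still expects a document type on indexed documents.
    """
    lines = []
    for record in records:
        action = {"index": {"_index": index_name, "_type": doc_type, "_id": record["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(record))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    name = new_index_name("dpla-items", datetime.now(timezone.utc))
    sample = [{"id": "abc123", "title": "Puppy in shoe"}]  # hypothetical record
    print(build_bulk_payload(sample, name), end="")
```

Writing every rebuild into a newly named index, rather than updating the live one in place, is one common way to make adding fields or changing mappings safe, since the old index stays untouched until the new one is ready.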
DPLA has been providing our Hub Network with usage statistics via Google Analytics dashboards for quite a while, but this month, we’re launching a new web application, custom-built for our Members, that presents an easy-to-use view of multifaceted usage analytics. Using these new dashboards, our Members will have access to new information about usage of our API and metadata quality metrics alongside data from Google Analytics. If you’re a Member Hub, look forward to hearing more about this soon!