The content on this wiki is being preserved for historical purposes, but is not being maintained and is probably no longer accurate.

For current information about DPLA Development, see the Development Portal

For latest API documentation, see the API documentation

Item API

From DPLA Dev Wiki

Note on Usage

Item metadata contributed by Harvard University is offered under a Creative Commons 0 license. Harvard requests the DPLA post the following: "While the data is freely available for use, the Harvard Library community norms request attribution and that if others improve this data, they make those improvements equally freely available. In addition, for data that originated in WorldCat, at OCLC’s request, we are asking users to observe the WorldCat community norms. We believe that observing these community norms will help promote good practices, foster trust among partners, and encourage growth of the open metadata community."

Note that the API is not intended to be used to acquire DPLA data sets in their entirety. Not only is that an inefficient way to gather the data, it can also have an adverse effect on the performance of the API for others. The DPLA prototype platform limits requests from a single source to 3 per second. Note also that the data sets are available for bulk download here: http://openmetadata.lib.harvard.edu/bibdata.
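One way a client might stay under the request limit is to space its calls out before sending them. The following is a minimal sketch, not part of the DPLA platform: the Throttle class and its parameters are hypothetical names introduced for illustration, and only the figure of 3 requests per second comes from the note above.

```python
import time


class Throttle:
    """Spaces out calls so that at most `max_per_second` happen per second.

    Hypothetical client-side helper; the API itself does not provide it.
    """

    def __init__(self, max_per_second=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was released

    def wait(self):
        """Pause until the minimum interval has elapsed; return the delay applied."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


if __name__ == "__main__":
    throttle = Throttle()
    for term in ["internet", "libraries"]:
        throttle.wait()
        # Issue the request for `term` here.
        print("ready to query", term)
```

The clock and sleep functions are injectable so the spacing logic can be checked without real delays.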

Base URI for the Item type

http://api.dp.la/v0.03/item/

Basic Query

A basic query to DPLA consists of the field you want to search against and your query term, joined by a colon and passed in the filter parameter:

Parameter name  Parameter description
filter          The field and query


A basic query might look something like this:

http://api.dp.la/v0.03/item/?filter=dpla.keyword:internet
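The same URL can be assembled in code. This is a small sketch using only the Python standard library; the function name build_basic_query is an illustrative choice, and percent-encoding the term is an assumption about what the server accepts, made so that multi-word terms still produce a valid URL.

```python
from urllib.parse import quote

BASE_URI = "http://api.dp.la/v0.03/item/"


def build_basic_query(field, term):
    """Join field and term with a colon and pass them as the filter parameter."""
    # Percent-encode the term so spaces and reserved characters are URL-safe.
    return f"{BASE_URI}?filter={field}:{quote(term)}"


print(build_basic_query("dpla.keyword", "internet"))
# http://api.dp.la/v0.03/item/?filter=dpla.keyword:internet
```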


Return Type

Currently, all data is returned as JSON.


Query Terms: Well-formedness

The rules for creating well-formed query terms differ according to whether the search type performs exact or keyword matching.

Keywords

  • Case insensitive
  • Truncation only at word boundaries allowed

Exact Searches

  • Case sensitive
  • Only full values accepted


Base Fields: Mapping to a set of common terms

Field name                Field description
dpla.keyword              Almost all of a record's fields get copied to this field.
dpla.title                The title and/or subtitle of the item. Exact matching.
dpla.title_keyword        The title and/or subtitle of the item. Keyword matching.
dpla.creator              The creator(s), contributor(s), editor(s), etc. of the item. Exact matching.
dpla.creator_keyword      The creator(s), contributor(s), editor(s), etc. of the item. Keyword matching.
dpla.date                 The item's date of publication.
dpla.description          The item's description. This often includes the item's Table of Contents. Exact matching.
dpla.description_keyword  The item's description. This often includes the item's Table of Contents. Keyword matching.
dpla.subject              A catchall for subject information. LCSH, Dewey, and other tag related fields are copied to this field. Exact matching.
dpla.subject_keyword      A catchall for subject information. LCSH, Dewey, and other tag related fields are copied to this field. Keyword matching.
dpla.publisher            The name of the publisher. Exact matching.
dpla.language             The primary language of the item. Exact matching.
dpla.isbn                 The item's ISBN. Exact matching.
dpla.oclc                 The item's OCLC identifier. Exact matching.
dpla.lccn                 The item's LCCN. Exact matching.
dpla.call_num             The item's call number. Exact matching.
dpla.content_link         A link to the item's content. Exact matching.
dpla.contributor          The contributing partner. Exact matching.
dpla.resource_type        The resource's type. Common values include item and collection. Exact matching.


A search for subjects containing the term computer networks might look something like this:

http://api.dp.la/v0.03/item/?filter=dpla.subject_keyword:computer networks


Local Data: The original, supplied data

We map incoming data to the terms listed in the Base Fields section above, but we realize that you might have cause to deal with the original data. We've done our best to index the data as it was given to us, and you'll find it mixed in with the mapped fields. If a field name doesn't start with dpla., it's original data:

dpla.publisher: "University of Toronto Press,",
610a: [
    "Queen's University (Kingston, Ont.)",
    "Queen's University (Kingston, Ont.)"
],
650a: [
    "College registrars",
    "Registraires d'université"
],
245b: "Jean Royce and the shaping of Queen's University /",
245a: "Setting the agenda :",
dpla.description: "Includes bibliographical references and index",
055b: [
    "R69 2002",
    "R69 2002"
],
055a: [
    "LE3*",
    "LE3 Q318",
    "LE3 Q318"
],
260a: "Toronto :",
260c: "c2002.",
260b: "University of Toronto Press,",
dpla.contributor: "harvard_edu",
245c: "Roberta Hamilton.",
dpla.subject: [
    "Royce, Jean, 1904-1982.",
    "Royce, Jean, 1904-1982",
    "Queen\'s University (Kingston, Ont.) Biography.",
    "Queen\'s University (Kingston, Ont.) Biographies",
    "College registrars Canada Biography.",
    "Registraires d\'université Canada Biographies"
],
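Because mapped and original fields arrive in the same record, a client may want to separate them by the dpla. prefix. The sketch below assumes a record has already been decoded from JSON into a flat Python dict; the function name split_record is hypothetical, and the sample values are copied from the excerpt above.

```python
def split_record(record):
    """Return (mapped, local): dpla.-prefixed fields and original supplied fields."""
    mapped = {k: v for k, v in record.items() if k.startswith("dpla.")}
    local = {k: v for k, v in record.items() if not k.startswith("dpla.")}
    return mapped, local


# A few fields from the sample record shown above.
record = {
    "dpla.publisher": "University of Toronto Press,",
    "260a": "Toronto :",
    "260b": "University of Toronto Press,",
    "260c": "c2002.",
    "dpla.contributor": "harvard_edu",
    "245a": "Setting the agenda :",
}

mapped, local = split_record(record)
print(sorted(mapped))  # ['dpla.contributor', 'dpla.publisher']
print(sorted(local))   # ['245a', '260a', '260b', '260c']
```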

Fields common to all DPLA records

A number of handy system-related fields exist in every DPLA record:

Field name        Field description
dpla.id           Each item in DPLA is given a unique identifier.
dpla.contributor  The unique name of the partner that supplied the record. These field values usually take the form of the partner's domain name, for example example_edu or library_example_org.
dpla.dataset_id   Each record loaded into DPLA is loaded in a batch. This field is a unique identifier common to all records loaded in the batch.


A search for all items contributed by harvard_edu might look like this:

http://api.dp.la/v0.03/item/?filter=dpla.contributor:harvard_edu

Faceting and filtering

The API allows filtering and faceting on almost all fields:

Parameter name         Parameter description
facet                  The field you want to facet on
facet_limit_fieldname  The default facet return set size is 100. You can limit that on a per field basis by using this parameter. (Substitute fieldname for the name of your field)
filter                 You can narrow queries by using filters. Syntax looks like this: fieldname:filter (example: language:English)


A search for items containing the term internet, faceting on subject, might look like this:

http://api.dp.la/v0.03/item/?filter=dpla.keyword:internet&facet=dpla.subject


Let's give filtering a go:

A search for items containing the term internet, faceting on subject and limited to the records supplied by harvard_edu, might look like this:

http://api.dp.la/v0.03/item/?filter=dpla.keyword:internet&facet=dpla.subject&filter=dpla.contributor:harvard_edu
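Queries that combine several filters and facets are easier to assemble programmatically. The following is a sketch under stated assumptions: build_query is a hypothetical helper, parameter order is assumed not to matter to the server, and the facet limit key is assumed to use the full field name (including the dpla. prefix) in place of fieldname.

```python
from urllib.parse import urlencode

BASE_URI = "http://api.dp.la/v0.03/item/"


def build_query(filters, facets=(), facet_limits=None):
    """Build an Item API URL from (field, term) filters, facet fields, and limits."""
    # Each filter becomes its own filter=field:term parameter.
    params = [("filter", f"{field}:{term}") for field, term in filters]
    params += [("facet", field) for field in facets]
    # Assumption: the limit parameter is facet_limit_ plus the full field name.
    for field, size in (facet_limits or {}).items():
        params.append((f"facet_limit_{field}", size))
    # Keep the colon readable; everything else is encoded as usual.
    return BASE_URI + "?" + urlencode(params, safe=":")


url = build_query(
    [("dpla.keyword", "internet"), ("dpla.contributor", "harvard_edu")],
    facets=["dpla.subject"],
    facet_limits={"dpla.subject": 10},
)
print(url)
```

Note that urlencode writes spaces as +, so a multi-word term such as computer networks is sent as computer+networks.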

Controls

Common API controls are available:

Parameter name  Parameter description
limit           Number of records to return
start           The starting point in the result set


A search for items containing the term internet, limiting the return to 10 records and starting at record 30:

http://api.dp.la/v0.03/item/?filter=dpla.keyword:internet&limit=10&start=30
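The limit and start controls can be combined to walk through a result set one page at a time. This sketch only generates the URLs and sends nothing; page_urls is a hypothetical helper, and it assumes start is a record offset that advances by limit for each page.

```python
from urllib.parse import urlencode

BASE_URI = "http://api.dp.la/v0.03/item/"


def page_urls(field, term, page_size, pages, first_start=0):
    """Yield one URL per page, advancing start by page_size each time."""
    for page in range(pages):
        params = [
            ("filter", f"{field}:{term}"),
            ("limit", page_size),
            ("start", first_start + page * page_size),
        ]
        yield BASE_URI + "?" + urlencode(params, safe=":")


for url in page_urls("dpla.keyword", "internet", page_size=10, pages=2, first_start=30):
    print(url)
# http://api.dp.la/v0.03/item/?filter=dpla.keyword:internet&limit=10&start=30
# http://api.dp.la/v0.03/item/?filter=dpla.keyword:internet&limit=10&start=40
```

When paging through many results, keep the request rate within the limit described under Note on Usage.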

Known Issues

There are two known issues with the data in the local MARC-schema representation in the bibliographic datasets from Harvard and Library of Congress (the dpla-schema representation of these datasets is not affected by these issues):

  • UTF-8 corruption: characters above the 7-bit ASCII range do not currently render properly. We have applied the code fix and are in the process of re-ingesting these records.
  • Local-data ingestion not stable: some fields in the original MARC are not being captured.