The content on this wiki is being preserved for historical purposes, but is not being maintained and is probably no longer accurate.
For current information about DPLA, see the DPLA main site.
Data schema
The schema is also available in PDF form, which still needs to be wikified: MetadataSchema.pdf.
Items
Items are the individual resources held in collections. Their records will be resolved to a slightly extended Dublin Core (DC) schema. Values that do not fit that schema will be preserved as "dark" data accessible through the API. The "dark" schema itself will also be accessible through the API.
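As a rough illustration of the core/"dark" split described above, the following minimal sketch resolves a provider record into core values and preserved dark values. The mapping `LOCAL_TO_CORE`, the local field names, and the function `resolve_record` are hypothetical and chosen only for illustration; they are not part of the DPLA API.

```python
from typing import Any, Dict, Tuple

# Hypothetical mapping from one provider's local field names to core names.
# A real mapping would be defined per local metadata schema.
LOCAL_TO_CORE = {"local_title": "title", "local_author": "creator"}


def resolve_record(local: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a local record into (core, dark) dictionaries.

    Values whose local field maps to a core field are copied under the core
    name; every other value is kept unchanged as "dark" data.
    """
    core: Dict[str, Any] = {}
    dark: Dict[str, Any] = {}
    for name, value in local.items():
        if name in LOCAL_TO_CORE:
            core[LOCAL_TO_CORE[name]] = value
        else:
            dark[name] = value
    return core, dark
```

In this sketch nothing is discarded: any field without a core equivalent ends up in the dark dictionary under its original local name.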
1. Technical Specification
| Specification | Notes | Phase | Resource |
| Tracks all item-level and collection-level metadata | Collection-level metadata is provisionally included in this schema as well, following the practice of adapting schemas designed primarily for individual resources to describe collection-level objects (MARC, OAI-PMH's Dublin Core) | 1 | Paul |
| Simple Dublin Core-based core schema | | | |
| Supplemental "dark" schema representing all local metadata | | | |
| No restrictions placed on allowed local metadata schemas | | | |
| Registry of local metadata schemas to be maintained to aid discovery by users | | | |

2. Core schema
| Field | Description | Notes | Repeatability | Required | Type |
| id | DPLA UUID | | non-repeatable | required | char |
| id_inst | Local internal tracking ID of data provider | In lieu of DC's "identifier" (which seems less descriptive); needs to be locally unique over time | non-repeatable | required | char |
| id_isbn | ISBN number | | non-repeatable | | char |
| id_lccn | Library of Congress control number (not to be confused with Library of Congress classification or call number) | | non-repeatable | | char |
| id_oclc | OCLC number | | non-repeatable | | char |
| title | Title of resource | DC | non-repeatable | required | char |
| title_sort | Title of resource minus initial stop words (articles, etc.) | Non-filing characters feature supported by MARC | non-repeatable | | char |
| creator | Author, editor, translator, or contributor of resource | Combines DC's "contributor" and "creator" | repeatable | | char |
| publisher | An entity responsible for making the resource available | DC | repeatable | | char |
| date | A point or period of time associated with an event in the lifecycle of the resource | DC | repeatable | | date |
| format | File format, physical medium, or dimensions of the resource | DC | non-repeatable | | char |
| language | A language of the resource | DC | repeatable | | char |
| page_count | Number of pages of work | DPLA further specification for physical textual objects | non-repeatable | | char |
| height | Physical height of work | DPLA further specification for physical textual objects | non-repeatable | | char |
| subject | Subject representation via keywords, key phrases, or classification codes; recommended best practice is to use a controlled vocabulary | DC | repeatable | | text |
| description | An account of the resource | DC; may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource | repeatable | | text |
| call_num | Any classification number assigned to work | | repeatable | | char |
| content_link | URL to digital representation of work | | repeatable | | char |
| relation | A related resource | DC | repeatable | | text |
| rights | Information about rights held in and over the resource | DC | repeatable | | text |
| checkouts | Aggregated number of checkouts/views of resource | | repeatable | | int |
| data_source | Data provider of resource | In lieu of DC's "source" (which seems less descriptive) | non-repeatable | required | char |
| dataset_tag | Unique ID for dataset within which given record was provided | | non-repeatable | required | char |
| collection_parent | Any parent collection of this collection | | repeatable | | char |
| resource_type | The nature or genre of the resource | In lieu of DC's "type" (which seems less descriptive) | non-repeatable | | char |

3. Dark, local schema
| Full, raw local schema will be retained; e.g., in the case of MARC21, structure would be retained at the tag and subfield levels |
| No data refinement or normalization |
| Supplements core metadata |
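
The "required" and "non-repeatable" constraints in the core schema can be checked mechanically. The sketch below is one possible approach, not a DPLA implementation. It assumes a record is a plain dictionary, that repeatable fields hold lists, and that non-repeatable fields hold single values. The function names and the English-only article list used for `title_sort` are hypothetical.

```python
from typing import Any, Dict, List

# Field names taken from the core schema table.
REQUIRED_FIELDS = ("id", "id_inst", "title", "data_source", "dataset_tag")
NON_REPEATABLE_FIELDS = (
    "id", "id_inst", "id_isbn", "id_lccn", "id_oclc", "title", "title_sort",
    "format", "page_count", "height", "data_source", "dataset_tag",
    "resource_type",
)

# Assumption: only English articles are stripped; MARC non-filing
# indicators would be the authoritative source when available.
LEADING_ARTICLES = ("a ", "an ", "the ")


def validate_item(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not record.get(name):
            problems.append(f"missing required field: {name}")
    for name in NON_REPEATABLE_FIELDS:
        if isinstance(record.get(name), (list, tuple)):
            problems.append(f"non-repeatable field has multiple values: {name}")
    return problems


def make_title_sort(title: str) -> str:
    """Drop one leading article so the title files under its first real word."""
    lowered = title.lower()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            return title[len(article):]
    return title
```

Returning a list of problems rather than raising on the first error lets a data provider see every issue with a record in one pass.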
Events
Events record occurrences involving items, such as checkouts, check-ins, and being placed on reserve.
1. Technical Specification
| Specification | Phase | Resource |
| Tracks a single event documenting use or evaluation of item(s) or collection(s) | n/a | Paul |

2. Core schema
| Field | Description | Notes | Repeatability | Required | Type |
| id | DPLA UUID | | non-repeatable | required | char |
| object_id | DPLA UUID of item(s) or collection(s) attached to this event | | repeatable | required | char |
| object_id_inst | | | | | char |
| event_type | Type of event | May include a growing list of controlled terms: checkout, reserve, recall, view, course text, acquisition, extra copy, review, plus one, etc. | non-repeatable | required | char |
| date | Year in which event occurred | | non-repeatable | required | date |
| agent | Institutional, group, or individual performer of event | | non-repeatable | required | char |
| data_provider | | | non-repeatable | required | char |
| dataset_tag | | | non-repeatable | required | char |
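
To show how event records might be used, the sketch below checks an event for its required fields and tallies usage events per object. It assumes events are plain dictionaries and that the repeatable `object_id` field holds a list of DPLA UUIDs. The function names are hypothetical. Treating "checkout" and "view" events as the source of the item-level `checkouts` count is also an assumption, since the schema does not say how that figure is derived.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple

# Required field names taken from the event core schema table.
EVENT_REQUIRED_FIELDS = (
    "id", "object_id", "event_type", "date", "agent",
    "data_provider", "dataset_tag",
)


def missing_event_fields(event: Dict[str, Any]) -> List[str]:
    """Return the names of required fields that are absent or empty."""
    return [name for name in EVENT_REQUIRED_FIELDS if not event.get(name)]


def count_usage(
    events: Iterable[Dict[str, Any]],
    usage_types: Tuple[str, ...] = ("checkout", "view"),
) -> Counter:
    """Count usage events per object UUID.

    Assumption: one event adds one to the count of every object it lists.
    """
    counts: Counter = Counter()
    for event in events:
        if event.get("event_type") in usage_types:
            for object_id in event.get("object_id", []):
                counts[object_id] += 1
    return counts
```

Because `object_id` is repeatable, a single event can contribute to the totals of several items or collections.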