The content on this wiki is being preserved for historical purposes, but is not being maintained and is probably no longer accurate.

For current information about DPLA, see the DPLA main site.

Data schema

From Digital Library of America Project

See the schema in PDF form, which still needs to be wikified: MetadataSchema.pdf.

Items

Items are the individual resources held in collections. Their records will be resolved to a slightly extended Dublin Core (DC) schema. Values not covered by that schema will be preserved as "dark" data accessible through the API. The "dark" schema will also be accessible through the API.

1. Technical Specification
Tracks all item-level and collection-level metadata. We are provisionally including collection-level metadata within this schema as well, following the practice of adapting what are primarily individual-resource schemas to encompass description of collection-level objects (MARC, OAI-PMH's Dublin Core). (Phase: 1; Resource: Paul)
Simple Dublin Core-based core schema
Supplemental "dark" schema representing all local metadata
No restrictions placed on allowed local metadata schemas
Registry of local metadata schemas to be maintained to aid discovery by user
2. Core schema
Field | Description | Notes | Repeatable | Required | Type
id | DPLA UUID | | non-repeatable | required | char
id_inst | Local internal tracking ID of data provider | In lieu of DC's "identifier" (which seems less descriptive); needs to be locally unique over time | non-repeatable | required | char
id_isbn | ISBN number | | non-repeatable | | char
id_lccn | Library of Congress control number (not to be confused with Library of Congress classification or call number) | | non-repeatable | | char
id_oclc | OCLC number | | non-repeatable | | char
title | Title of resource | DC | non-repeatable | required | char
title_sort | Title of resource minus initial stop words (articles, etc.) | Non-filing chars feature supported by MARC | non-repeatable | | char
creator | Author, editor, translator, contributor of resource | Combines DC's "contributor" and "creator" | repeatable | | char
publisher | An entity responsible for making the resource available | DC | repeatable | | char
date | A point or period of time associated with an event in the lifecycle of the resource | DC | repeatable | | date
format | File format, physical medium, or dimensions of the resource | DC | non-repeatable | | char
language | A language of the resource | DC | repeatable | | char
page_count | Number of pages of work | DPLA further specification for physical textual objects | non-repeatable | | char
height | Physical height of work | DPLA further specification for physical textual objects | non-repeatable | | char
subject | Subject representation via keywords, key phrases, or classification codes; recommended best practice is to use a controlled vocabulary | DC | repeatable | | text
description | An account of the resource | DC; may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource | repeatable | | text
call_num | Any classification number assigned to work | | repeatable | | char
content_link | URL to digital representation of work | | repeatable | | char
relation | A related resource | DC | repeatable | | text
rights | Information about rights held in and over the resource | DC | repeatable | | text
checkouts | Aggregated number of checkouts/views of resource | | repeatable | | int
data_source | Data provider of resource | In lieu of DC's "source" (which seems less descriptive) | non-repeatable | required | char
dataset_tag | Unique ID for dataset within which given record was provided | | non-repeatable | required | char
collection_parent | Any parent collection of this collection | | repeatable | | char
resource_type | The nature or genre of the resource | In lieu of DC's "type" (which seems less descriptive) | non-repeatable | | char
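
The repeatability and required markers in the item core schema can be checked mechanically. The following is a minimal sketch in Python, not part of the DPLA specification: the field names are taken from the schema, while the function name `validate_item` and the choice to represent a record as a dict (with lists for repeated values) are assumptions made for illustration.

```python
# Hypothetical validator for the item core schema.
# Field names come from the schema; the dict-based record layout
# and the function name are assumptions, not part of the spec.

REQUIRED = {"id", "id_inst", "title", "data_source", "dataset_tag"}

REPEATABLE = {
    "creator", "publisher", "date", "language", "subject", "description",
    "call_num", "content_link", "relation", "rights", "checkouts",
    "collection_parent",
}

NON_REPEATABLE = {
    "id", "id_inst", "id_isbn", "id_lccn", "id_oclc", "title", "title_sort",
    "format", "page_count", "height", "data_source", "dataset_tag",
    "resource_type",
}


def validate_item(record):
    """Return a list of problems found in a core-schema item record."""
    problems = []
    # Required fields must be present and non-empty.
    for field in sorted(REQUIRED):
        if not record.get(field):
            problems.append(f"missing required field: {field}")
    for field, value in record.items():
        if field not in REPEATABLE and field not in NON_REPEATABLE:
            # Unknown fields belong in the dark schema, not the core one.
            problems.append(f"not a core field: {field}")
        elif field in NON_REPEATABLE and isinstance(value, (list, tuple)):
            problems.append(f"field is non-repeatable: {field}")
    return problems
```

A record that passes returns an empty list. Fields reported as "not a core field" would be candidates for the dark schema rather than errors in the source data.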
3. Dark, local schema
Full, raw local schema will be retained; e.g., in the case of MARC21, structure would be retained at the tag and subfield levels
No data refinement or normalization
Supplements core metadata
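
To illustrate how a core record and its dark counterpart could sit side by side, here is a rough sketch in Python. Everything about the representation is an assumption: a MARC-like record is modeled as a list of (tag, subfields) pairs, and the `MARC_TO_CORE` mapping and the function `to_dpla_record` are hypothetical names, not part of any DPLA API. Only the principle comes from the text above: the dark copy keeps tag and subfield structure and is not normalized.

```python
# Sketch only: the mapping table and function name are hypothetical.
# A MARC-like record is modeled as a list of (tag, {subfield: value}) pairs.

# Assumed, deliberately tiny mapping from (MARC tag, subfield) to core field.
MARC_TO_CORE = {
    ("245", "a"): "title",     # title statement
    ("100", "a"): "creator",   # main entry, personal name
    ("020", "a"): "id_isbn",   # ISBN
}


def to_dpla_record(marc_fields):
    """Build a core record plus an untouched dark copy of the local data."""
    core = {}
    for tag, subfields in marc_fields:
        for code, value in subfields.items():
            core_field = MARC_TO_CORE.get((tag, code))
            if core_field == "creator":
                # creator is repeatable, so collect values in a list
                core.setdefault("creator", []).append(value)
            elif core_field is not None:
                # non-repeatable fields keep the first value seen
                core.setdefault(core_field, value)
    # Dark copy: every tag and subfield retained, values not normalized.
    dark = [(tag, dict(subfields)) for tag, subfields in marc_fields]
    return {"core": core, "dark": dark}
```

In this sketch the dark copy holds the whole local record, including values that were also mapped into the core, so it supplements the core metadata rather than storing only the leftovers.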


Events

Events are records of occurrences involving items, including checkouts, check-ins, being put on reserve, etc.

1. Technical Specification
Tracks a single event documenting use or evaluation of item(s) or collection(s) (Phase: n/a; Resource: Paul)
2. Core schema
Field | Description | Notes | Repeatable | Required | Type
id | DPLA UUID | | non-repeatable | required | char
object_id | DPLA UUID of item(s) or collection(s) attached to this event | | repeatable | required | char
object_id_inst | | | | | char
event_type | Type of event | Might include growing list of controlled terms: checkout, reserve, recall, view, course text, acquisition, extra copy, review, plus one, etc. | non-repeatable | required | char
date | Year in which event occurred | | non-repeatable | required | date
agent | Institutional, group or individual performer of event | | non-repeatable | required | char
data_provider | | | non-repeatable | required | char
dataset_tag | | | non-repeatable | required | char
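
As a rough illustration of the event core schema, the sketch below models one event in Python. The class name `Event`, storing `date` as an integer year, and generating the `id` with `uuid.uuid4()` are all assumptions; the schema only states that `id` is a DPLA UUID and that `date` is the year of the event. The `object_id_inst` field is left out because the schema gives no description for it.

```python
import uuid
from dataclasses import dataclass, field
from typing import List

# Terms named in the event_type row. The source describes the list as
# growing, so this set is illustrative rather than exhaustive.
KNOWN_EVENT_TYPES = {
    "checkout", "reserve", "recall", "view", "course text",
    "acquisition", "extra copy", "review", "plus one",
}


@dataclass
class Event:
    object_id: List[str]   # repeatable: UUIDs of items or collections
    event_type: str
    date: int              # assumption: year stored as an integer
    agent: str
    data_provider: str
    dataset_tag: str
    # assumption: a random UUID stands in for the DPLA UUID
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self):
        # Every field in the event schema is marked required.
        if not self.object_id:
            raise ValueError("object_id needs at least one UUID")
        for name in ("event_type", "agent", "data_provider", "dataset_tag"):
            if not getattr(self, name):
                raise ValueError(f"missing required field: {name}")

    def has_known_type(self):
        """True if event_type is one of the terms listed in the schema."""
        return self.event_type in KNOWN_EVENT_TYPES
```

Unknown event types are reported by `has_known_type` rather than rejected, since the schema expects the controlled list to grow.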