Suggestions for the DPLA
Suggestions from Terry Fisher
March 2, 2011
I found the discussion at the workshop yesterday very interesting. My conclusions, after listening to the many thoughtful contributions, are that we should strive to adhere to the following guidelines when building the Digital Public Library of America.
The DPLA should have a distributed, not centralized, structure. For that reason, it's essential to pay close attention to the protocols that will ensure that all of the components of the system are interoperable. (That seems the primary job of the “technical” working group.) The system would be less expensive, more robust, and more durable if it avoided all proprietary formats.
Everyone (in the US and elsewhere in the world) should have access to all items in the collection, and no user should be charged a fee. Requiring anyone to pay a fee for anything is the camel's nose under the edge of the tent.
This principle of universal free access is important to serve the various constituencies who are legitimately interested in the venture. As suggested in point 8, below, I think that it’s also important for symbolic reasons – to sustain and celebrate an ideal of universal access to knowledge.
This is not to suggest, of course, that all forms of information or entertainment should be free. I teach Intellectual Property law, so I spend much of my time thinking about reasons why that’s not true. But there are plenty of institutions that gather and license materials for fees; the DPLA need not.
The principle of universal free access doesn't mean that none of the contributors to the collection could get paid. On the contrary, documents might get into the system in a variety of ways. They include:
- Aggregated public domain materials
- Out-of-copyright materials that are scanned, with the resulting copies donated
- In-copyright materials, licenses to which are purchased by the DPLA or by foundations
- In-copyright materials donated by publishers on the condition that the metadata contains links to sources from which hard copies can be purchased.
- The most complex of the ways in which materials could get into the collection would be for a foundation (or, better yet, the federal government) to create a pot of money to be distributed, at the end of each year, to the creators and contributors of a specified subset of in-copyright documents. The pot would be divided among the contributors in proportion to frequency with which their works were used. The DPLA could facilitate the emergence of this model by gathering anonymized usage data.
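The last model above comes down to simple proportional arithmetic, which can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_pot`, the use of whole cents, and the rule for handing out leftover cents are all hypothetical choices, not part of the proposal itself.

```python
def split_pot(pot_cents: int, usage_counts: dict[str, int]) -> dict[str, int]:
    """Divide a pot (in cents) among contributors in proportion to usage.

    Hypothetical sketch: `usage_counts` maps a contributor to the number of
    times their works were used, as derived from anonymized usage data.
    """
    total_uses = sum(usage_counts.values())
    if total_uses == 0:
        # Nothing was used, so nothing is paid out.
        return {name: 0 for name in usage_counts}

    # Each contributor first receives the rounded-down proportional share.
    shares = {
        name: pot_cents * uses // total_uses
        for name, uses in usage_counts.items()
    }

    # Rounding down can leave a few cents undistributed. Assumption: give
    # them out one at a time, largest fractional remainder first, with ties
    # broken alphabetically so the result is deterministic.
    leftover = pot_cents - sum(shares.values())
    by_remainder = sorted(
        usage_counts,
        key=lambda name: (-(pot_cents * usage_counts[name] % total_uses), name),
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    # A $1,000 pot and three contributors with 600, 300, and 100 uses.
    print(split_pot(100_000, {"press_a": 600, "author_b": 300, "archive_c": 100}))
```

Working in whole cents keeps the payouts summing exactly to the pot, which a floating-point division would not guarantee.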
Contributions to the collection going forward are as important as mechanisms for digitizing and collecting documents produced in the past. For example, the DPLA could and should coordinate with open access academic publishers. In addition, the project pioneered by Stuart Shieber here at Harvard – under which the university acquires a nonexclusive license to (almost) all articles written by Harvard faculty and then makes those articles publicly available – should be generalized. The DPLA could help coordinate and implement the similar systems that are rapidly emerging at other universities.
Within the legal world, all legislatures, courts, and administrative agencies in the United States could be asked to contribute their official output to the DPLA – and persuaded to use a common format to ease the resultant administrative burden. The success of AustLII (the Australasian Legal Information Institute) can be inspirational in this regard. It’s inexcusable that the only comprehensive digital collections of American legal materials are proprietary and expensive.
The PLALA program, funded by the Mellon Foundation, is another promising source of materials. (See http://www.drclas.harvard.edu/programs/plala.)
It would be impractical and unnecessary for the DPLA to try to certify that items in its collection are trustworthy (meaning reliable or accurate). However, one of the functions of the DPLA might be to certify attribution and authenticity. In other words, it might devise mechanisms (a) to inform users concerning who created each of the documents in the collection and (b) to verify that particular copies of a document are unmodified replicas of the original.
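One common way to approach part (b), offered here only as a sketch, is to record a cryptographic fingerprint of each original and compare copies against it. The choice of SHA-256 and the function names below are assumptions for illustration. Part (a), attribution, would need something more, such as digital signatures, and is not covered.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a document's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified_replica(copy: bytes, recorded_fingerprint: str) -> bool:
    """Check a copy against the fingerprint recorded for the original.

    Hypothetical helper: assumes the library stored `recorded_fingerprint`
    in the document's metadata when the original was ingested.
    """
    return fingerprint(copy) == recorded_fingerprint


if __name__ == "__main__":
    original = b"We the People of the United States..."
    recorded = fingerprint(original)

    print(is_unmodified_replica(original, recorded))         # identical copy
    print(is_unmodified_replica(original + b" ", recorded))  # altered copy
```

A matching fingerprint shows only that the bytes are identical to what was recorded. It says nothing about whether the content is accurate, which is consistent with the distinction drawn above.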
When digital copies are derived from a unique artifact, the DPLA could and should provide information about where that artifact is located and how it could be accessed.
As David Weinberger suggested at the meeting, all sorts of things will be built upon the foundation of the DPLA’s collection. They include:
- Teaching materials
- Derivative works
- Commercial and noncommercial distribution and organizational systems – like Apple’s iTunes model.
The DPLA need not and should not strive to supply all those things. Rather, the DPLA’s collection and technology should be designed so as to enable and invite others to offer complementary services. Open APIs seem essential.
It would seem risky to turn the project over to a single government agency. Rather, the DPLA should be an independent entity, partly public and partly private in character, with several members/sponsors. One of those sponsors could and should be Harvard.
What John Palfrey described as the “yellow zone” (midway between the “green zone” of public-domain materials and the “red zone” of in-copyright, in-print materials) is very large. Legal reforms that would increase access to those materials would be enormously beneficial – to the DPLA and to the public at large. Identifying those reforms would seem to be the job of the “legal issues” working group. The public participants in the DPLA project – for example, the Library of Congress and the new Register of Copyrights – could help enormously in moving such reforms forward.
The DPLA should not try to be all things to all people – should not replicate or swallow the relevant portions of Google, Amazon, JSTOR, HarperCollins, etc. The narrow way of defending this stance is to say that the DPLA should only undertake tasks for which it has a comparative advantage. A more affirmative way is to say that the DPLA should have a clear identity. It should stand for a principle: that knowledge should be free and universally accessible.
Fidelity to that principle is crucial to improving formal education in the United States – at all levels. Wider implementation of that principle would also facilitate the spread of distance education and life-long learning, activities that are currently hobbled by the TEACH Act. Finally, the principle has the merit of being fundable, as well as right. It would help explain why this venture should be paid for, at least in part, out of the public fisc. It would help to give substance to the two modifiers in the tentative title for the organization: “Public” and “of America”.
All of these suggestions are tentative, subject to revision as this remarkable project unfolds.
I take two of Terry's points as a fundamental guide for the planning process and add two of my own warnings.
1. Begin with one fundamental principle – the principle of universal and free access to all knowledge.
2. Distinguish between contributors of content to the DPLA (e.g. authors, collectors, publishers, libraries, private donors, etc.) and consumers of content (e.g. scholars, citizens, anyone who wants access). Consumers should be able to access all DPLA content for free, while contributors can be remunerated in many different ways. The same distinction should guide the approach to open versus restricted (in-copyright or orphan) materials: multiple remuneration models for contributors, with the materials always free to users.
Everything else should follow from this principle and this distinction: the governance, sustainability and business models, technology decisions, legal issues, selection of content, funding decisions, etc.
1. Do not start planning the future DPLA by replicating the existing system and types of libraries and the conceptual framework they impose on our thinking (e.g. public versus scholarly, state versus community-based libraries, K12 versus academic research libraries, research libraries versus others, etc.). These distinctions are less relevant in the world of digital information and will hamper the development of the DPLA.
2. Do not waste time trying to predict all possible users and uses of DPLA content; this is a futile and counterproductive effort.
I have similar feelings. I think that for this project to be "for everyone" we should not try to imagine an infinite number of "use cases" for individual users of content; we should instead curate with a broad view in mind.
At the same time I'd hope that a digital product could be accessible to as many people as possible. I'd like us to emulate, or even create, best practices for accessibility and usability starting with the alpha and beta versions, rather than rolling in ad hoc accessibility features later.
It's easy to get tangled in the weeds with some of these issues, but I think we've seen curatorial examples [Flickr Commons, Open Library, some of Public Resource's work] that show that projects can be huge in scope and still be usable and accessible from an interface perspective.