The content on this wiki is being preserved for historical purposes, but is not being maintained and is probably no longer accurate.

For current information about DPLA, see the DPLA main site.

December 2011 Technical Workshop/Draft Notes

From Digital Library of America Project

Full notes are available at December 2011 Technical Workshop/Notes. These are draft notes. Feel free to contribute directly, fix typos, add any notes you captured that were missed. To edit this wiki, please create an account.



Morning welcome and introduction

JP - <quickly summarizes the current status of DPLA, and our understanding of the platform involved>


Principles

Additional comments from Martin:

Metadata
think about other existing projects
Code
make most code reusable
Content
work closely with partners; we would get most of it from existing partners.
Also, content and services should not come with any [new] restrictions on reuse; we should not add any.
Participation
hoping for tech engagement from as many groups as possible. Side meetings at existing museum, library, other events -- bringing those people together would help cross-pollinate.

discussion

Consider including non-library meetings to gather participation... historical society meetings as well...

S - Does this include people as content creators? active support for a community of creators as well as a community of developers? We could change 4.3 to

actively support the community of developers that want to re-use and extend content, data, and metadata; and the community of creators that want to create and share new content, data, and metadata.

K - if people are meant to be at the center of the experience, have them at the center of the discussion from the beginning.

J - an inventory of state-level archives and groups who could partner with us would be interesting

M - we might think of content as a black box for our discussions at this point.

J - there are some libraries out there who have asked to help somehow and aren't part of a workstream yet, we could frame this as a research idea they could contribute to.

There is a spectrum of ideas about what DPLA could do with content - on one end we don't store content itself but deal with metadata and connection; where we fit re: access and preservation. As CH mentioned earlier, the more onerous you make it to participate, the less you get in. If you just let people share metadata and point back to them, you will get more, people will see immediate results. At a minimum we are happy to send people to where content is, we don't want to hog that or [push for our own traffic.] On the other hand, having a copy is useful. LOCKSS, scripted manipulation of a large database, providing multiple formats...

You could have tiered participation; some could share only metadata, some could share more. There is a policy decision there.

S - does the community have a clear idea of the difference between metadata and content?

D - Even if everything is just metadata, there would still be high value ratings and walkthroughs and collections. It is conceptually impossible to separate the two.

C - It would be difficult in the example just described, it is easy to imagine a separation in the existing digital content. That data exists, we might not duplicate it, but want to import metadata for aggregating use.

D - this raises a principles question. if we draw in lots of metadata, and this generates useful material that is indistinguishable from content, and the platform aggregates the userbase content being generated, does 3.1 apply? is it true that all of that will be made available without restriction, since the users presumably own it?

QUESTION: Will we require user contributed material have to have certain licenses? What about ratings, collections, &c?

J - I think that yes we may want to include material with some restrictions. But this may be controversial.

D - What about the use case where someone comes to the DPLA and contributes some material directly via a DPLA service - do we require licensing?

S - While DPLA might set a license for any core services, this seems like a parallel question to "what content providers do we work with"? just as we might work with a content provider that imposes some restrictions, but will not impose new ones, we could work with a service that sets whatever default license it wants for its users. (We have similar flexibility in each case, as noted in 3.1)

J - if we see ourselves as an intermediary for existing libraries - allowing people who don't have access to a larger library to have access to and the ability to [impact, use, update?, comment on?] that metadata - that is a profitable view.

C - at Smithsonian, some orgs have their own system and we directed people to them. some didn't have a system, and we took their metadata and held it ourselves. we had a tiered system... for groups in a small town who don't have resources we can consider, we hosted.

J - is there consensus that dpla may not have to resolve this?

N - in Europe there was a decision not to hold items themselves, and I don't think it would be feasible to do it... the amount of content for a TV archiving effort was 35k videos. the whole process of reencoding and digitizing... it took 2?3? yrs and only 10k videos are up there now. there is a point at which you can't actually manage the whole procedure. the same issue arises with metadata. say you want 10M records. then you need a procedure to be able to operate afterward, and to include many providers as they come in... still after 4 yrs the organization is struggling with it. the registration mechanism and ops workflow are hard. scaling that is an issue. this has to do with what you can expect from content providers.

there is effort to redirect work to aggregation projects in europeana. ideal scenario: a nat'l aggregator in each country? are there such regional options here?

C - The above is also our experience.

A - from Google; trying to use the lightest weight metadata available isn't a good idea.

[QUESTION: what do we want DPLA to guarantee provision of? What will only it be tracking, offering, backing up?

there are many things that could be made available as a nice feature; included wherever possible... but hopefully a few clear ones that we must provide and ensure lasting provision of, if DPLA ever changed form.]


Aside on layers of [meta]knowledge [sj]

- existence
- location (status)
- metadata
- authority file (merge/split, deduping)
- summaries (abstracts)
- sources, data, interconnection
- content
- reviews, analysis

Supporting current and future sprints

We want to support existing and potential future beta sprints. How can we best do this? What should the interface be?

D - I have a list of fears to help spur discussion. We have a question of who the clients are - people who will use this platform. That is hugely important, that's how this stuff we will do will spread most rapidly throughout the web. As Kara brought up there are users. but in being an open platform, we don't know what people will use it for. there is a sense in which we just don't know. on the other hand there's a sense in which you have to start with a sense of user.

Personally I think we want to have things out as quickly as possible, so even in the first few months we can see what could be done with the platform to see how it spreads. Do we want a toolset as part of this platform? obviously APIs. ways for developers to contribute back in? Do we provide an App store?

If this is to be an additive platform that gets richer, should it be something that supports user generated work, like reviews and analyses and ratings? And as a set of shared services, I think we want to pull in metadata at least generated by devs building on top of it. The fact that someone has used Zeega/Extramuros to build a tour and say items are related, that is valuable information. the platform might anticipate pulling that back in to make it [more] reusable.

What policy questions should we build into the core tech? What standards should we all uphold?


SP - I read a lot about platform, including in the documentation. The web is a platform. anything you do should accept that. don't build anything new on top of that. the community could add value in the definition of media formats, types, standards in terms of how domain-specific information is represented.


S - Many beta sprinters here are collaborating with other sprints or projects. Please be sure any data sharing or communication you do is through a public API.

S - Think about other APIs you might provide (from your own work) or use (as a centralized service).
S - Imagine launching a developer's program tomorrow. What would you include in a developer kit?

K - SJ talked last night about bots as users -- do we want to consider that as part of the platform?

S - think about how to make scriptable updates happen, like applying an API change or content change to a few million records.
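A minimal sketch of the scripted-update idea S raises, assuming records are plain dictionaries and the change is a function applied to each one. The names `batched`, `apply_in_batches`, and `rename_field`, and the "creator"/"author" fields, are hypothetical illustrations, not part of any DPLA interface.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

Record = dict


def batched(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def apply_in_batches(
    records: Iterable[Record],
    change: Callable[[Record], Record],
    size: int = 1000,
) -> Iterator[Tuple[int, List[Record]]]:
    """Apply `change` to a copy of every record, one batch at a time.

    Yields (batch_number, updated_batch) so a caller can persist each batch
    and resume from a known batch number after a failure.
    """
    for number, chunk in enumerate(batched(records, size)):
        yield number, [change(dict(r)) for r in chunk]


def rename_field(old: str, new: str) -> Callable[[Record], Record]:
    """Build a change that renames one field (a hypothetical schema change)."""
    def change(record: Record) -> Record:
        if old in record:
            record[new] = record.pop(old)
        return record
    return change


# Example: rename "creator" to "author" across 2,500 synthetic records.
source = ({"id": i, "creator": f"person-{i}"} for i in range(2500))
results = list(apply_in_batches(source, rename_field("creator", "author"), size=1000))
```

Working from a generator keeps memory use bounded by the batch size, which is the property that matters when the input is a few million records rather than a few thousand.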

N - Think about non-technical services. The technical services might be hard to use without non[software-tech] ones like help with digitization

Think about versioning. when you put out a new version, make sure they can find updates. [talking about changing an API or schema]

R - there is a whole set of repository services we could provide.

D - we could use some of the tools from OS environments like Debian. you can see how those communities develop, and where they find maintainers for individual packages... in an open extensible way.

SP - is the goal to provide some middleware that can be used by others, or to [connect things others are using?]

A - thinking about developing a platform from the beginning will be helpful to the extent that does happen. Making updates to APIs and data visible. When you talk about developer tools, common libraries in common languages for communicating with the 'platform' would be useful. that way you could have developer relations folks responsible for creating those libraries.

SD - talking about standards... you mentioned fostering user community that would bring people in. and then mentioned developers, who might take advantage of common tools to bring in more functionality to bring more users. and then standards - "should this platform succeed, we should define a standard!" -- it might be more responsible to set standards up front, so there are few disparate bits, and something comprehensible to use. limit technical options.

SP - code / tools / applications aren't a way to create a platform. the web is a good example. the web wasn't created with a set of libraries to process jpegs, or a browser. It began with the HTTP/HTML/JPEG standards. then developers built tools and libraries to interpret those formats. the ecosystem grew around that. no one said 'here is a library of java code'. they defined formats and protocols, not any code -- everything else grew around that.

DC - I think we have the advantage of using both...

SP - to be able to say 'here is a platform using MySQL, Java, etc' -- that to me, from my experience, from being part of many middleware attempts, is the wrong approach.

SD - above that there may be a place where we can suggest other standards.

CH - if we develop more fundamental services that deliver content / data in whatever format, and let application developers take that ... we don't need to articulate what language or toolset people use. if we articulate more fundamental work and formats, implementers can take whatever they feel is appropriate.

J - an analogy is resonating for me: if you look at the flourishing of text online, you had http. a limited # of people could participate. then a broader platform: covering a suite of OS packages - wordpress, mediawiki: off the shelf platforms that let people engage with certain kinds of data. lightweight standards allowing connection. RSS/Atom. Are those useful categories to think about? we're talking about kinds of content: Collections, Interfaces to them, Protocols enabling sharing that data.

JP - one reaction we might have to saying 'the Web is the platform' might be 'ok, then we go home'. So what is it that is worth doing on top of the web? many people have reacted to this by saying 'the web is the dpla!' so figuring out what's appropriate above the web is useful. josh is helpfully creating the idea of a stack. are we building something that looks like wordpress? something more specific? or something more horizontally oriented...

JG - if Beta Sprints left us with the equivalent of a wordpress/mediawiki/other interfaces... what can help interconnect these, the gaps between them, the equivalent of rss/trackback/[spamfilters]?

SP - why not use RSS/Atom? those are the web technologies... there are a plethora of tools that understand those formats.

D - I think we all agree we should use existing standards whenever possible. that will be the vast majority of the time. we have no discussion of replacing standards.

SP - I would focus on agreeing on formats useful for the web, with domain specific information... since no one else has addressed that problem.

N - so for instance you could use rss/atom as a wrapper - both have facilities to support collections and iterating through them. you can define the format of the item? ok. but where I think this becomes tricky is, if you let people define their own metadata inside rss, mapping between formats will be more work than necessary.
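To make the "Atom as a wrapper" idea concrete, here is a rough sketch using only the Python standard library. It assumes item metadata is expressed with a few Dublin Core elements inside each Atom entry; that choice, the function name `collection_feed`, and the `urn:example:` identifiers are all illustrative assumptions, not something agreed at the workshop. The output is not a fully validated Atom feed (for example, it omits author elements).

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
DC = "http://purl.org/dc/elements/1.1/"

# Serialize Atom as the default namespace and Dublin Core with a "dc:" prefix.
ET.register_namespace("", ATOM)
ET.register_namespace("dc", DC)


def collection_feed(feed_id: str, title: str, updated: str, items: list) -> str:
    """Wrap a collection as an Atom feed, one entry per item.

    Each item is a dict with "id" and "title", plus optional Dublin Core
    fields ("creator", "date", "type"), which are an assumed shared vocabulary.
    """
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}id").text = feed_id
    ET.SubElement(feed, f"{{{ATOM}}}title").text = title
    ET.SubElement(feed, f"{{{ATOM}}}updated").text = updated
    for item in items:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}id").text = item["id"]
        ET.SubElement(entry, f"{{{ATOM}}}title").text = item["title"]
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = item.get("updated", updated)
        for name in ("creator", "date", "type"):
            if name in item:
                ET.SubElement(entry, f"{{{DC}}}{name}").text = item[name]
    return ET.tostring(feed, encoding="unicode")


example_xml = collection_feed(
    "urn:example:collection:1",
    "Example collection",
    "2011-12-01T00:00:00Z",
    [{"id": "urn:example:item:1", "title": "A photograph",
      "creator": "Unknown", "type": "Image"}],
)
```

N's caution applies directly: if every provider put its own element names inside the entry instead of a shared set, each consumer would need a per-provider mapping.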

JP - this will be an ongoing question. if we are completely loose on lots of fronts; tech specs, standards for formats, schemas -- how do we want to be on specific v. general?

N - I would stay on the data layer; how will we store this, how do we offer access to that storage? this all seems like a step ahead of ourselves.

A - we were talking about DPLA being a web for content... we should think about discovery services. there is some way to find a collection. something like a search engine for less structured searches. individual modules might be useful, but only if you know they exist.

J - how do we make it discoverable on the open web? standards around microformats, for instance?

JG - there is a strategic question about how we work through existing structures. is the goal for dpla to gain market share back, via local or other portals; or to work through (Existing web services).

J - if we could work with bing, google to expand content on the web to be more universal - then we have something that works broadly.

SP - my team is responsible for schema.org? on behalf of MS. I can't emphasize enough how the whole web is moving in that direction. with communities with expertise in a particular domain representing their info in a structured way. so services can use them more clearly via domain-specific tools

J - in the US, google worked with schema.org to ensure there was better data available about jobs... we could do something similar with our web collections.

K - will dpla have a default interface?

JP - I suspect "lots of them". If we think about lots of individual services built around a core platform, I suspect the bulk of things will say nowhere on them "DPLA." groups working with e.g. the smithsonian and dpla will be middleware essentially. we should move for that and cry from the top of the mountains success! if that works out. if we are successful we are unlikely to be the primary way most people access knowledge passing through us.

D - in this discussion, there is a traditional discussion we are leaving out. i want to make sure we do it on purpose. that is building a catalog. search engines don't know when they have duplicates of an item. dpla has the chance to collect metadata about items and objects, do traditional work of libraries: identifying what they are and their relationships, and improving metadata over time.

to jon's point i think this is important - when another service wants to use dpla data or use it as a platform - we could offer to them traditional style catalog data. that is an offering that we can make to the web.

J - one service we dont have now that would enable that to happen is a crawler service, aggregating data. there is a built-in notion of identifiers for things...

SP - when it comes to semantics, formats, presentations; i dont see any big search engines automatically understand and work with various formats. this is where communities can play a key role.

? - I work for the group that maintains freebase.net (@ google). defining relationships between items. this might let us link book content and movie content, &c.

k - other things we make use of in libraries: authority control, classified access to information and subject thesauri.

jg - 2 points. iirc OL is important to think about re: getting people to help clean up data. freebase is a good project. one thing that will be essential is the ability to log usage, not just enabling discovery: circ stats, etc, and being able to tell the individual contributor usage data from the central platform. that lets institutions report success of their mission, incentivizes participation, [and helps improve effective knowledge sharing!]

N - consider vocabularies - many are commercial. there are more valid vocabularies... then it is an implementation. using vocabularies is a two-way discussion.


Post-break

Where do we provide central services, where do we enable distributed services?

D - If you offer a service that shows where things are, rather than making a decision about them, that could be central. But what if there is a mistake and you correct it? we want to support that too.

K - If we can use semantic data and let people publish locally, that's better than a central service, I would see this being done anywhere on the web; all out there, no centralized decision making. To me the service is helping users make connections. The connections made live on the web.

CH - I think it's necessary, if we want usable faceted filtering, to have [a shared vocabulary] to provide various services.

K - I agree with that, but it's done on the web, without central decision-making.

L - people make mistakes. if you dont have a way to correct it, you can't move forward. you need some sort of editorial effort outside of the site that assigned the label.

N - while to the user it may appear as though it is on the web, someone needs to host it. don't require that each organization hosts its own mapping. you have to play a part of a web node, and ensure that it contains and serves what is linked to dpla.

K - there are metadata registries out there. it doesn't have to be hosted by dpla.

J - there is a function that needs to happen: vocabulary control. if that is within dpla, or something we can rely on other providers for, we dont need to resolve today.

M - if it's a centralized service, that can be a value to contributors.

[S - support user choice! let users choose among divergent options / vocabs when they exist.]

J - what does it mean to not be a mothership?

D - consider Lucene: that has both a standard and a reference implementation in Java, but others have implemented it in other languages.

JG - this would differ from lucene, say, in that Lucene doesn't consider aggregating an index of all things using lucene. there is that sort of desire for aggregate services here.

think about centralized services where necessary rather than just where convenient.

S - so think about this as a scaffold?

D - I like the scaffold metametaphor. You might only have centralized services where needed, or also where it lowers hurdles for developers. Things that don't have to be there; you just build them to make it easier, but don't require people to use [the central services].

C - I agree with that assessment - this is what we found at Smithsonian. We build central tools and then provided central services. Our widgets, dumbed down things people can use, don't have to reside centrally. There might be a hosted playground where people can contribute to them; it makes it easier for anyone to use the data.

[J - we are not focusing on that (user engagement) right now, but not b/c we dont think that is very important]

J - from David's question earlier, does it make sense to everyone to talk about url-based APIs?

D - Wikipedia is actually a mothership (P - With Jimbo as the Queen) -- and it is useful. How does that relate?

SJ - There are definite advantages to being a centralized source of knowledge. People often express that saying "we must centralize things in order to offer certain services to others." But you don't need to force the central service; you can make it easy for people to offer their own version of that service. Wikipedia made a point of making its core services easy to clone and copy and change; then it provided a central service but made it easy (in its case through licensing) for anyone to take the same material at any point and start something different.

For DPLA I imagine the 'platform' being something like 'a way for people to expose their own authority file' and aggregation of same, using that aggregation as a default way to enhance other services, but not simply providing a central authority service [rather than a channel for others]. Wikipedia spent time developing an easy-to-clone set of tools, and was careful not to be too proud in what it claimed to do - it was just going to use those tools to make something that might be useful, and could at least be clonable by others who wanted to build on it later. that drove everyone to work together on the shared platform -- and so there ended up being only one after a year or so. any imposition of 'rival' decision making encourages people to fork [counterintuitively, barriers to forking limit collaboration]
If we distinguish rival from non-rival platform development, make sure the non-rival part is very good. then you can build an optional-use rival part which informs default interfaces or supports core tools. And capturing the alternate views of people who want their own variation is valuable; I'm glad that Conservapedia was easy to fork, sometimes you want to reference that variant of an encyclopedia article [and it would be valuable to have more like it, just as it is valuable to have a single trunk that most people collaborate on]

JR - WP also has lots of editors, page locking, etc -- it has a lot of controls in place.

SJ - those didn't exist for the first many years, until the non-rival piece was very well developed, and the site was very popular. we don't have to resolve most of those until DPLA gets there -- and being too popular is a good problem to have.

N - There is a difference here: we want to capture and reflect everyone's version; we don't want people to clone and implement their own elsewhere, we want to capture how they reference it within dpla. So we generally want a multi-perspective view / voice towards cultural heritage.

S - agreed

JR - I don't know if anyone else has an example of all the different types of content and doing it well... we want mixed media to be somehow discoverable, but you build a platform so as not to be all things to all people. I see the mothership problem as trying to offer a single interface to something complex.

JG - I think we had a failed experiment on the user interface side [reflecting on sth - what?]

N - how is this different from a single collection with many content types?

SP - the power of the web comes in here; you could enable experiences that lead to other experiences.

K - this has some corollary in physical space...

KC - this is tied to the question of having just metadata or having full content; you can provide so many more services if you have all content.

JP - we could restate here the complementary nature of what we want to do.

J - you might offer a native interface to do something... but also provide [synthetic interfaces?]

J' - Hathi shares Solr indexes to their content, not the raw texts, for instance. [ditto for N-gram data]

Beta Sprint support

JP - to Beta Sprinters, what do your sprints need to be successful?

B - one example for our work is that we could be part of core DPLA services. that would require some local storage of data, scripts running, &c. Another option would be for us to just have access to metadata and ways to find article and other data. Facilitating that sort of discovery is probably the more important goal.

SJ - For WikiCite for instance and potential future sprints, it would be good to have support in dpla hosting for sprints, and help finding longer-term funding; matching projects that have become successful to partners interested in their type of knowledge-work

JB - notions of collections that have been successful, or tied to particular groups of users. that could be helpful for filtering search results, experience, &c.

D - I agree, shelflife would love this - it might not need to be core.

JP - there could be core and something 'close to core' [inner and outer core? mantle?]

DW - "Unique IDs" and "Unique IDs for works" (clustered). Shelflife needs this service, we're currently providing it within shelfLife... following library cataloging preferences... but!
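As a rough illustration of "Unique IDs for works (clustered)", the sketch below groups record IDs under a shared work key built from a normalized author and title. This is a deliberately naive matching rule chosen for illustration, not ShelfLife's method or library cataloging practice; the function names and sample records are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse punctuation to single spaces."""
    text = unicodedata.normalize("NFKD", text).lower()
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text).strip()


def work_key(title: str, author: str) -> str:
    """Build a naive clustering key; real work matching needs far more care."""
    return f"{normalize(author)}|{normalize(title)}"


def cluster_by_work(records: list) -> dict:
    """Map each work key to the list of item IDs that share it."""
    clusters = defaultdict(list)
    for rec in records:
        clusters[work_key(rec["title"], rec["author"])].append(rec["id"])
    return dict(clusters)


sample_records = [
    {"id": "a1", "title": "Moby-Dick", "author": "Melville, Herman"},
    {"id": "b7", "title": "Moby Dick", "author": "MELVILLE, HERMAN"},
    {"id": "c3", "title": "Typee", "author": "Melville, Herman"},
]
work_clusters = cluster_by_work(sample_records)
```

Here the two Moby-Dick records fall into one cluster and Typee into another. Translations, subtitles, and variant author forms would all defeat a key this simple, which is where authority files come in.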

C - Object hierarchy: given that we're only limiting individual items in the archival world, we often have archival collections: say 10 boxes, each with individual folders, each with photos, correspondence, etc. [vertical files, say]. This isn't just the idea of a one-level cluster: we need to support hierarchical relationships of different objects. We don't want archival organizations to feel left out given how they collect and identify things.
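The collection / box / folder / item nesting C describes can be sketched as a simple tree. The `Node` class, the `kind` labels, and the identifiers below are assumptions made for illustration; an actual archival model (and the other relationship types raised in discussion) would need more than parent-child links.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """One object in an archival hierarchy."""
    identifier: str
    kind: str  # e.g. "collection", "box", "folder", "item"
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child and return it, so nesting can be built step by step."""
        self.children.append(child)
        return child


def walk(node: Node, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], Node]]:
    """Depth-first traversal yielding (path of identifiers, node)."""
    here = path + (node.identifier,)
    yield here, node
    for child in node.children:
        yield from walk(child, here)


collection = Node("coll-1", "collection")
box = collection.add(Node("box-3", "box"))
folder = box.add(Node("folder-12", "folder"))
folder.add(Node("photo-0042", "item"))
folder.add(Node("letter-0007", "item"))

# Full paths for leaf items, which preserve their archival context.
item_paths = ["/".join(p) for p, n in walk(collection) if n.kind == "item"]
```

Keeping the full path with each item means a photo found through search can still be shown in the context of its folder and box.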

K - there are lots of other types of relationships too.

D - there might be other kinds of relationships here. maybe 'ID' is a kind of relationship? collection..

JR - maybe you can start with general info about collections out there and their sites. do they have a sitemap? do they have id markup on a page? that could help scaffold what other services to provide once that data is out there. work in conjunction with a crawler service to collect data on pages.

a gamification piece could help with standards adoption: what standards do we want to support, how do we find emerging standards? certain communities might want to promote certain standards... Achievements could help discover collections meeting various standards... and we could make this user edited. have volunteers identify collections, maintain that metadata.

RS - imagine a document repository of requirements and design ideas? [check with Ruth for a recap]

QUESTION: how to design such a repository?

JP - Some sprints may see themselves as part of the core, others may see themselves as things that will benefit from whatever the core has. others still are existing projects with known synergy, like WP, two motherships potentially moving side by side and connecting.

Lunch!

Kara - sometimes we need to gather media from a source and want higher resolution access than the source's APIs offer. (for instance Vimeo lets us embed a clip starting at any time up to a very small resolution; for youtube it is only down to 1 second resolution. so it would be useful to have access to a rehosted v. of the youtube clip that allows more precise cutting)

S - this feels like providing access to different formats; different {format + API}s.

K - also a caching system, and a notification service that notifies people when something referenced within their collection is taken down.
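A minimal sketch of the takedown-notification idea, assuming each collection is a list of URLs keyed by its owner and that a missing item shows up as HTTP 404 or 410. The status lookup is passed in as a function so the logic can be exercised without network access; all names and URLs here are hypothetical.

```python
from typing import Callable, Dict, List

# HTTP statuses treated as "taken down": 404 Not Found and 410 Gone.
GONE_STATUSES = {404, 410}


def find_removed(urls: List[str], get_status: Callable[[str], int]) -> List[str]:
    """Return the URLs whose current status indicates removal."""
    return [url for url in urls if get_status(url) in GONE_STATUSES]


def takedown_notices(
    collections: Dict[str, List[str]],
    get_status: Callable[[str], int],
) -> Dict[str, List[str]]:
    """Map each collection owner to the referenced URLs that have disappeared.

    Owners with nothing missing are left out, so the result can be used
    directly as a list of notifications to send.
    """
    notices = {}
    for owner, urls in collections.items():
        removed = find_removed(urls, get_status)
        if removed:
            notices[owner] = removed
    return notices


# Stand-in for real HTTP checks: a fixed table of statuses.
fake_statuses = {
    "http://example.org/clip/1": 200,
    "http://example.org/clip/2": 404,
    "http://example.org/page/9": 200,
}
example_collections = {
    "curator-a": ["http://example.org/clip/1", "http://example.org/clip/2"],
    "curator-b": ["http://example.org/page/9"],
}
notices = takedown_notices(example_collections, lambda url: fake_statuses.get(url, 200))
```

A caching service would pair naturally with this: the same pass that detects a removal could point the owner at a cached copy.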

S - webcaching is a great and subtle service that could use a better central system. Lots of blogs and local news articles disappear; WP gets another 10k a month? [And don't forget the BBC losing a decade of video...]


Questions people have

What the platform is/should have
  • What makes DPLA different from existing projects (Hathi, BHL, Open Library, Aquifer, Bibsoup)
  • Are we providing standards for delivery of digital content -- and will we include overcoming nonstandard material as an alternative format that can be found/generated via a DPLA service?
  • Are we thinking about webcaching, and general maintenance / updating that tracks when material disappears?
  • Can a little bit of technology help humans solve big knowledge problems? Are there different levels of abstraction we can piece apart?
Differentiators
  • Vocabulary control, metadata, ...
  • Gather metadata from more libraries and other institutions, not currently available
  • Scale down to meet the needs of smaller institutions, facilitate their work.
  • Scale all the way down to individuals and small community groups, facilitate their work.
  • Add value to services local libraries currently offer
  • Make other products and services even better (via aggregation and integration) -- ex: [LibThing,] Wikipedia, &c. Widely adopted services.
  • Proudly handling multimedia, not just texts.
Aside: if we took 30 of us aside, we could come up with a long list of things we all want that don't exist. If we built a Venn diagram, we would find what is common across them.
This list also helps list success criteria that will help us determine whether we're being successful.
Standards for delivery?
  • First, make clear what you can get from assets
    Second, reflect back to the source the desire for new formats
    Third, suggest standards
    Fourth, provide reformatting services into standard formats
  • Governance track is considering what the levels of membership are. At some level you may have to commit to supporting shared standards.


"make it mungable"

Dev Process

  • Agile, many partners - which means longer sprints.
  • Specwriting tiger teams; 3-4 people for 4-6 wk sprints.
  • Getting in more use cases: connect w/ audience and participation. Developer use cases. Have coordinators / a "use case czar"
  • Integration of the overall system -- how do we ensure people can hook up along the way?
  • Platform development
    • An inventory of existing tools and a gap analysis of what needs to be developed
      R - We worked out what we were working towards and what tools were available.
    • An open development process
    JR - the question of what money can be spent on may guide what development happens. JP - we have ~1M+ to cover all dev costs, including the platform and some initial elements and beta sprinters. If someone tells us 'we desperately need X to free up Y' we might support that.
    S - Integration / open source question: how do we facilitate integration from others?
    R - making sure that continuous integration happens... even knowing we wanted this on a recent project it took 1.5 yrs to get it done.
    C - at Smithsonian we tend to develop a 1st phase internally, and make that available for contribution in the 2nd phase. Speed of acceptance of committers and management of it are key. fostering discussion on interest, questions, contribution -- is key for long term success. Don't have "a tiger starting and then a little mouse going away"

What should the first step be?

SP - I think it should not be 'code'. Maybe take a small collection somewhere that needs exposure, aggregation, search. Define the format for a collection description.

N - what's the process for getting in touch, deciding who the constituents are, what tools exist, how we are working with partner institutions, who has time to collaborate, &c.

We've been gathering requirements for a long time. see this as a fresh start in part.

Have that core team, a few other people, throw them into a room together, have a straw man to throw darts at.

SJ - have a daily release of an entire 'DPLA' noting that the first few hundred iterations will be 'rebuttable propositions' for what a DPLA might be; ready to be thrown away. This requires a repository and maintainer. If I see a problem statement and share a solution I have, and it takes 6 weeks for it to be reviewed and included -- I might get bored and leave. If I get a friendly response and inclusion within 6 hours, I'm likely to draw my friends in to start contributing as well.

C - Define a standard for data schemas for us to use...

JG - I have an alarm go off whenever someone discusses setting a schema standard... two sprints I am most familiar with have been working on similar issues for 1-3 years. Thanks to those sprints and related intro work in the world, I feel that is not a new process, and a gap analysis could be very quick.
K - if we have to predetermine standards we'll never get anything done. assume we will have heterogeneous data. this isn't a standardization project... [so I would prefer.]

JR - start taking in lots of data to find these problems and be able to identify / classify them practically as we encounter them?
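
The "take in lots of data and classify problems as we encounter them" idea can be sketched roughly as follows. The expected field names and the problem labels are assumptions chosen for illustration; they are not a proposed schema, and the sample records are made up.

```python
from collections import Counter

# Hypothetical fields we would hope to find; not a proposed standard.
EXPECTED_FIELDS = ("title", "creator", "date")


def classify_problems(record: dict) -> list:
    """Return a list of problem labels for one incoming record."""
    problems = []
    for name in EXPECTED_FIELDS:
        value = record.get(name)
        if value is None:
            problems.append(f"missing:{name}")
        elif not isinstance(value, str):
            problems.append(f"non-string:{name}")
        elif not value.strip():
            problems.append(f"empty:{name}")
    return problems


def tally(records) -> Counter:
    """Count how often each kind of problem shows up across records."""
    counts = Counter()
    for record in records:
        counts.update(classify_problems(record))
    return counts


# Made-up heterogeneous records standing in for partner data.
sample = [
    {"title": "Map of Boston", "creator": "Unknown", "date": "1850"},
    {"title": "Letter", "date": ["1861", "1862"]},
    {"title": "  ", "creator": "A. Smith"},
]
for label, count in tally(sample).most_common():
    print(label, count)
```

The point of the tally is that the most common problems surface from real data, so decisions about normalization can follow what is actually observed rather than a predetermined standard.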

C - Figuring out how to integrate our millions (DCC). We can make our data ready once there is a platform to use it. [indicating the LOC as well]

[pun on Brewster's Millions?]

S - I'd like to see code.

J - Something more than the general idea that we will have well documented open code before we are done?
S - I'd like to see code available to download by tonight. I am backdating my desire. Either a link to 'download the DPLA v0.0001' or a note that there is a platform in development, and a link to code from individual partner projects.
RJ - I will have some code up on GitHub in 5-7 days. Very rough, no promises...
S - Fantastic.

D - Can we think about where we want to be in 3 months?

R - that can be expensive. it's a good way to start dialogue; if you are willing to throw it away after 3 months... ["design to throw it away"]

Aside on workstream work

M - any thoughts on where the tech stream should be heading?
K - does each workstream have its own mailing list? [yes] that's part of the problem...
M - should we have a smaller list for chairs of all groups?

Communication and summarization

There's a call for more wiki updates; making that the work place by default - from Karen. Martin - I usually do that, but am used to having to summarize that in email for others. I agree that makes sense as a central place.

A fair amount of human capital is going into communication and summarization... so we are certainly there in spirit! Summaries of each workstream every month.

DW - could we get a summary of the summaries? (good idea!)

M - the core team would be a staff team working @ Berkman, with the workstream as advisory; pulling in a group of 6 ppl for certain tasks perhaps.

JG - from the view of the core team that have to get something done, there should be crystal clear explanation of their relationship with these other pieces. I hear you saying the relationship is that the tech advisory group is a resource to be called on.

M - I see a prog manager / tech director who reports to the secretariat day to day.
JP - conveners are also pointing out to other communities [on the workstream]

JG - to be successful, they have to get the word out to practitioners of those tools... we need to figure out the mechanism for them to report out. not passing a hurdle of the advisors, but with the idea that it will be relayed to other communities. "we need more input from museums. we need more metadata input..."

Sharing what is needed across all workstreams would help whenever someone is going into a space where they could reach out and connect with new communities.

Who is missing from the room?
National archives! (invited, couldn't make it)
More tech architects?

Final thoughts?

<Everyone asked to write down, privately, one thing they want from dpla, and one thing they hope to offer to others through it.>

Two concerns - long-term sustainability, and management and coordination of development.

I'd like to see simplicity around the first round of prototyping. To let us come up with a minimal viable set of requirements and get it out there.

I'd like to fill out and detail the diagram at the end of the summary, and put beta sprints and institutions into their boxes, think about what it would look like as a complete vision -- find out where we are lacking examples in our current streams. This helps visualize what we're trying to do with a broad audience. [dp.la - platform - content]

Code speaks! The faster we start realizing ideas the better. Considering what we're working on at MSFT you could play a key role in the future of structured data, info representation, knowledge management, etc.

We're going to share some APIs and code, you can access some through our beta sprint prototype. (U.Illinois)

If there's some way to share key nuggets across our workstreams, that would be important.

(NB: which workstreams are not represented here? we have Tech, Gov, Audience... )

The discussions about data models are interesting - does it make sense to coordinate with discussions already planned?

I'm excited to get a tech prototype out there soon. The sooner we do, the sooner we can get examples of public interfaces that we can say are built on top of the dpla platform. That would demonstrate why this matters to people not just interested in theory, but those who want to see it in reality.

Chris - (from skype!) You have a fan here on my couch... my dog has given this rapt attention

offer dog lincoln as the mascot of the tech workstream :-)

<Martin - Thanks to the organizers; followups online!>
