2014 DLF Forum: Full Schedule


2:30pm EDT

Piloting a Peer-to-Peer Process for Becoming a Trusted Digital Repository + The Future of Fedora: Update on Fedora 4

Two project updates:

Piloting a Peer-to-Peer Process for Becoming a Trusted Digital Repository
In this presentation, representatives from UF and UNT will share their work on collaboratively creating a pilot peer-to-peer review process for TRAC as a step toward becoming Trusted Digital Repositories. They will also discuss how the process supports other concerns, including the need for different types and scales of collaboration in achieving TRAC goals. Peer-to-peer review of TRAC self-audits offers an important option for building capacity locally and as a community.

In 2014 the University of Florida (UF) and the University of North Texas (UNT) began a collaborative process to each complete a full self-audit using the Trusted Repository Audit Checklist (TRAC) for both institutions' digital repositories. In addition to the self-audit, each institution agreed to participate in a peer review process evaluating and scoring each other's self-audit and supplied documentation.

The goals of the project are as follows:

Document the current repository services and systems, technical and human infrastructures, and overall operations following the TRAC process
Demonstrate the maturity of repository services, infrastructure and governance at both institutions
Share information and knowledge to support increasing the collaboration between project teams at UF and UNT
Pilot a peer review option that aims to offer more rigor and external feedback than a self-audit, but does not carry the same financial requirements as a full external certification by a third party
Leverage the process internally at each institution to share information and knowledge to support increasing collaboration among different internal and external groups, including Research Computing and High Performance Computing groups at each institution

Session Leaders
Laurie Taylor, University of Florida
Chelsea Dinsmore, University of Florida
Suchi Yellapantula, University of Florida
Mark Phillips, University of North Texas

AND

The Future of Fedora: Update on Fedora 4
Over the past eighteen months, the Fedora community has come together to redesign and rebuild Fedora as a robust repository platform for the next decade. This new version of the software, Fedora 4, introduces a number of sought-after features, including performance improvements, support for large files, and native linked data capabilities. The codebase has also been revitalized to take advantage of modern, best-practice coding standards, including rigorous testing and documentation. The first official release, Fedora 4.0, launched as a beta at Open Repositories in July of 2014, and the full release will be available later in the year.

This presentation will provide an update on Fedora 4, both in terms of community support and technical development. Attendees will learn about the new Fedora 4.0 feature set, as well as use cases and strategies for migrating from Fedora 3.x to Fedora 4.

Session Leader
Mike Durbin, University of Virginia

Presenters

Chelsea Dinsmore

Curator for Digital Collections, University of Florida

Mark Phillips

Associate Dean for Digital Libraries, UNT Libraries

Mark Phillips is the Associate Dean for Digital Libraries at the UNT Libraries. His areas of interest include: workflows for digitized and born-digital content, digital preservation systems, Web archives, and metadata quality.

Laurie Taylor

Senior Director for Library Technology & Digital Strategies, University of Florida

Suchi Yellapantula

University of Florida

Monday October 27, 2014 2:30pm - 3:30pm EDT
Salons 1,2,3 Georgia Tech Hotel and Conference Center

Project Updates

2:30pm EDT

Researcher Identifiers—What's in a Name (or URI)? + SHARE: An Update on the SHared Access Research Ecosystem

Two project updates:

Researcher Identifiers—What's in a Name (or URI)?
A number of approaches to providing authoritative researcher identifiers have emerged, but they tend to be limited by discipline, affiliation or publisher. The rise of bibliometrics and its extension, altmetrics—the attempt to measure the impact of a work including mentions in social media and news media—strengthens the need to uniquely identify researchers and correctly associate them with their scholarly output. Both institutions and researchers have a stake in ensuring their scholarly output is accurately represented across academia and the web. It is time for universities to transition from watchful waiting to engagement.

It is difficult to uniquely identify researchers who have not authored monographs but write primarily journal articles, and who are therefore not represented in national name authority files. An OCLC Research Task Group comprising specialists from the US, the UK, and the Netherlands (see http://www.oclc.org/research/activities/registering-researchers.html#taskgroup) developed eighteen use-case scenarios around different stakeholders, generated a list of functional requirements derived from these scenarios, and profiled 20 research networking systems. A researcher ID information flow diagram illustrates the complexity of the current ecosystem. The same information about a specific researcher may be represented in multiple databases, and only a subset of those databases interoperate with one another.

This presentation will summarize emerging adoption trends and focus on three identifiers—ISNI, ORCID and VIAF. Participants will be asked to comment on the recommendations targeted to librarians, researchers and university administrators and share their experiences with or plans for researcher identifiers at their institutions.

Session Leader
Karen Smith-Yoshimura, OCLC Research

AND

SHARE: An Update on the SHared Access Research Ecosystem
An update on the latest developments with SHARE, a higher education and research community initiative to facilitate the preservation of, access to, and reuse of research outputs. Learn the status of the project’s first undertaking, the SHARE Notification Service, which aims to notify interested stakeholders when research release events occur. Currently in prototype development, the SHARE Notification Service is working with funding agencies, sponsored research offices, institutional repositories, disciplinary repositories, publishers, data archives, and other interested parties to provide a timely, structured, and comprehensive communication channel. The presentation will describe how the SHARE Notification Service can be used by researchers to keep interested parties apprised of their scholarly output; by universities to facilitate the work of the sponsored research office and tenure and promotion committees, and to oversee open access policies; by funding agencies to track grant compliance; and by libraries to help populate their institutional repositories.

The presentation will also touch on SHARE’s larger vision of a coordinated repository infrastructure that will give campus-driven research outputs their widest exposure, and facilitate their broad reuse. In its fully realized state, SHARE will provide a registry of what is available within publicly accessible repositories and facilitate discovery of, and access to, content across these repositories. SHARE will expose this content so that the community can reuse, mine, and build services on top of the corpus. We look forward to detailing this vision and getting your critical input as we pursue this community-driven project.

Session Leader
Eric Celeste, SHARE

Presenters

Eric Celeste

Consultant, Tenseg

I have been working with REA on its website and other technology since 2010. My son, Alex, and I run a technology consulting company, Tenseg LLC, together.

Karen Smith-Yoshimura

OCLC Research

Monday October 27, 2014 2:30pm - 3:30pm EDT
Salons 4,5,6 Georgia Tech Hotel and Conference Center

Project Updates

3:45pm EDT

Managing the Digitization of Large Press Archives + Audio and Video at Scale: Indiana University's Media Digitization and Preservation Initiative + Building a Ten-Campus Digital Library Collection at the University of California

Three project updates:

Managing the Digitization of Large Press Archives
Managing the digitization of press material is quite a challenge, not only in terms of quantity but also in terms of text and material quality, designing the workflow system that organizes the operations, and handling the metadata. This challenge has been the focus of the Bibliotheca Alexandrina's digitization work during the past year in the course of its partnership with the Center for Economic, Judicial, and Social Study and Documentation (CEDEJ). Having more than 800,000 pages of press articles to digitally preserve and make publicly accessible created a need for a workflow that can manage such a massive collection and handle its attributes proficiently. The endeavor required simultaneous work on four main aspects: analyzing the collection's data, developing a digitization workflow for the collection at hand, implementing and installing the necessary software tools for metadata entry, and finally, publishing the digital archive online for researchers and the public.

The presentation will demonstrate the workflow system being implemented to manage this massive press collection, which has yielded more than 400,000 pages to date. It will shed some light on the BA's Digital Assets Factory (DAF), the nucleus upon which the digitization process for the CEDEJ collection has been built. Additionally, the presentation will discuss the tools implemented for ingesting data into the digitization process, starting from indexing through the creation of batches that are ingested into the system. The outflow will also be discussed in terms of organizing and grouping multipart press clips, in addition to the reviewing, validation, and correction of the output. Light will also be shed on the challenges encountered in associating the accessible online archive with a powerful search engine that supports multidimensional search while maintaining a user-friendly navigation experience.

Session Leaders
Bassem Elsayed, Bibliotheca Alexandrina
Ahmed Samir, Bibliotheca Alexandrina

AND

Audio and Video at Scale: Indiana University's Media Digitization and Preservation Initiative
In 2013, Indiana University (IU) launched a five-year project, known as the Media Digitization and Preservation Initiative (MDPI: http://mdpi.iu.edu/), to digitize and preserve over 300,000 audio and video assets of value from across the university. Among academic institutions, IU has an unusually rich collection of rare and unique time-based media that document subjects of enduring value to the university, State of Indiana, and the world. Pieces range from wax cylinder sound recordings of Native American music to performances by notable graduates of its Jacobs School of Music to media from the collections of IU's Kinsey Institute for Research in Sex, Gender, and Reproduction.

The project is co-led by IU's Vice President for Information Technology and Dean of University Libraries. IU is partnering with a commercial vendor, Memnon Archiving Services of Belgium, to set up a facility in Bloomington, Indiana to digitize these materials, in a workflow that will produce as much as 12 terabytes per day of digital data to be preserved beginning in summer 2014.

MDPI was planned out of recognition by IU leadership that large portions of IU's media holdings were becoming seriously endangered due to media degradation and/or format obsolescence. A 2008-2009 survey of holdings at IU Bloomington (http://www.indiana.edu/~medpres/documents/iub_media_preservation_survey_FINALwww.pdf) uncovered over 569,000 audiovisual items on 51 different physical formats held in collections of 80 different organizational units across the campus, with significant quantities of rare and unique items in danger of becoming inaccessible within 5-15 years due to degradation or obsolescence.

In this presentation, we will outline the goals and history of MDPI, describe the workflows that we are establishing to feed content into the digitization process and manage content coming out of the process, and discuss planned strategies for preservation storage, access, and metadata.

Session Leaders
Juliet Hardesty, Indiana University
Jon Dunn, Indiana University

AND

Building a Ten-Campus Digital Library Collection at the University of California
The University of California (UC) Libraries and the California Digital Library are nearing the conclusion of an ambitious project to build a shared system for creating, managing, and providing access to unique digital resources across ten campuses (see http://bit.ly/UCLDC).

The platform we are creating will have three major components: 1) a shared digital asset management system for librarians to centrally add and edit digital files and metadata, 2) a metadata harvest for digital resources hosted on external platforms, and 3) an integrated public interface so end-users can seamlessly search across these disparate resources. Together, these components will provide critical infrastructure for the UC Libraries to more efficiently, economically, and collaboratively manage and surface digital content. We will also be leveraging this platform to participate in the Digital Public Library of America (DPLA), and we are investigating the possibility of extending it to facilitate participation in DPLA by additional libraries, archives, and museums throughout California.

This session will build on a "Community Idea Exchange" poster presentation from the 2013 Forum—at which point we had just begun the project—to describe in more depth the components of the platform and the technologies employed, as well as challenges to and changes in our approach since we embarked. One of the more interesting aspects of our technology stack is that we have opted to license and customize a vendor product for the digital asset management system with which the digital library community may not have much familiarity (Nuxeo, http://www.nuxeo.com/), and in this session we will discuss our experiences with it. We will also describe how our project and our platform will connect with other initiatives, most notably the DPLA, and may provide a piece of the technical infrastructure needed for institutions across California to share their respective digital resources.

Session Leaders
Sherri Berger, California Digital Library
Brian Tingle, California Digital Library

Presenters

Sherri Berger

Product Manager for Special Collections, California Digital Library

Sherri Berger is Product Manager for Special Collections at the California Digital Library. She leads and participates in collaborative projects that provide greater access to digital collections throughout California.

Jon Dunn

Assistant Dean for Library Technologies, Indiana University Bloomington Libraries

Bassem Elsayed

Project Manager, Bibliotheca Alexandrina

Ahmed Samir

Bibliotheca Alexandrina

Brian Tingle

Technical Lead for Digital Special Collections, California Digital Library

wandered into the library 20 years ago and never left

Monday October 27, 2014 3:45pm - 5:15pm EDT
Salons 4,5,6 Georgia Tech Hotel and Conference Center

Project Updates

9:00am EDT

Placing the IR within the User's Workflow: Connecting Hydra-based Repositories with Zotero + Redesigning Electronic Record Processing and Preservation at NARA

Two project updates:

Placing the IR within the User's Workflow: Connecting Hydra-based Repositories with Zotero
This session presents an update on the research conducted in an Andrew W. Mellon Foundation-funded project at Penn State University. The first phase of the project (2012-13) explored the scholarly workflow of Penn State faculty across the sciences, humanities, and social sciences, focusing on the integration of digital technologies at all stages of the research lifecycle, from collecting and analyzing data, through managing and storing research materials, to writing up and sharing research findings. The current phase of the study (2014-2016) centers on developing a digital research tool for humanities scholarship using Zotero as a test platform, in collaboration with George Mason University. Based on the results of the first phase of our study, we will focus on unifying several phases of the research workflow and on better integrating finding and archiving into the scholar's online path. Specifically, we aim to connect Zotero with Penn State's Hydra-based institutional repository, ScholarSphere. Penn State Zotero users will be able to claim their publications and then seamlessly log in and store copies within ScholarSphere. This project aims to place self-archiving within a tool that already has good traction within the Humanities (Zotero), and to increase the visibility of an institutional repository within the workflow of digital scholars. Preliminary development as well as additional details on the technology, including options for other Hydra-based repositories to adopt this workflow, will be shared during this research update. We will also discuss the specific needs of Humanities scholars found in the first iteration of this study, and how those findings are addressed in the Zotero / ScholarSphere software integration.

Session Leaders
Dawn Childress, Penn State University
Patricia Hswe, Penn State University
Ellysa Cahoy, Penn State University

AND

Redesigning Electronic Record Processing and Preservation at NARA
The US National Archives and Records Administration (NARA) is in the process of refactoring its infrastructure for the processing and preservation of electronic records.

In gathering requirements to enhance the tool suite at NARA, a number of needs were identified. The key need was for a flexible processing environment with an expandable set of software tools to verify and process a significant volume and variety of electronic records. Existing systems lacked support for non-Federal digital materials (e.g., digital surrogate masters, Legislative, Donated, Supreme Court, etc.) or classified digital materials. And given highly successful partnerships with other types of organizations, there is a growing need for storage for digital surrogates and for more efficient workflows to provide public access.

This new infrastructure is described as the Optimized Ingest Framework (OIF). This framework includes a new model for managing the receipt and processing of digital materials for preservation and access; a modular approach to systems managing digital materials; a departure from the model of a single, monolithic system; the refactoring and evolution of existing systems; the establishment of an environment to provide necessary processing flexibility and tools for a wide variety of digital materials; and a more automated and robust solution for digital preservation with reduced complexity.

This refactoring comprises three modular systems: a Digital Processing Environment (DPE) that encompasses a suite of tools for processing including validation, characterization and transformation of files; a Business Object Management system to create and manage workflows for transfer and ingest; and an enhanced Digital Object Repository for the management and preservation of records and surrogates.

This project is just getting underway at NARA with its first iteration DPE prototype currently scheduled for early 2015.

Session Leader
Leslie Johnston, National Archives and Records Administration

Presenters

Ellysa Cahoy

Education Librarian, Penn State University

Dawn Childress

Librarian, Digital Collections and Scholarship, UCLA

Patricia Hswe

Digital Content Strategist, The Pennsylvania State University

I manage Penn State's repository service, ScholarSphere (https://scholarsphere.psu.edu/), which promotes open, persistent access to the research outputs, including data sets, of Penn State's faculty, students, and staff. I'd love to talk about promotion of student scholarship through...

Leslie Johnston

Director of Digital Preservation, National Archives and Records Administration (NARA)

Wednesday October 29, 2014 9:00am - 10:00am EDT
Salons 4,5,6 Georgia Tech Hotel and Conference Center

Project Updates

9:00am EDT

Running Up That Hill: The Academic Preservation Trust: A Community Based Approach to Digital Preservation + Accessing Digital Art: Emulation and Preservation of Complex Digital Art Objects

Two project updates:

Running Up That Hill: The Academic Preservation Trust: A Community Based Approach to Digital Preservation
The Academic Preservation Trust (APTrust), a consortium of 17 institutions, was formed two and a half years ago when a small group of academic library deans agreed to take a community approach in building and managing a repository that would provide long-term preservation of the scholarly record. The repository also aims to aggregate content, to provide for disaster recovery, to leverage economies of scale, and to explore access and other services. From its beginning, APTrust has been a layered collaboration of deans, technology experts, content/preservation specialists, and a small APTrust staff located at the University of Virginia.

The growth of the consortium has been bumpy at times, with differences of opinion regarding technology decisions and, inside the University of Virginia, in building awareness that an entrepreneurial program requires quick responses from the infrastructure. APTrust remains repository and format agnostic by using the BagIt specification for content submission. Metadata is managed by Fedora, with pointers to content preserved in Amazon S3 and Glacier, and administrative functions are built using Hydra and Blacklight. The repository is scheduled to go live in July and will become a DPN node. A panel of APTrust partners and UVA staff will describe the interplay in decision making among deans, technologists, and content experts and will discuss the evolving nature of an effort that is approaching full production, including questions of governance, business modeling, certification goals, and the consortium's evolving approach to the complex issues related to digital preservation.
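
For readers unfamiliar with the packaging format named in this abstract, here is a minimal sketch of what a bag looks like on disk: a `bagit.txt` declaration, a `data/` payload directory, and a checksum manifest. This is an illustration only, not APTrust's submission code; the function names `make_minimal_bag` and `verify_bag`, the dict-of-bytes input, the SHA-256 algorithm, and the 0.97 version string are all assumptions chosen for the example.

```python
import hashlib
from pathlib import Path


def make_minimal_bag(bag_dir, payload):
    """Write a minimal BagIt-style bag (hypothetical helper).

    `payload` maps relative file names to bytes; this input shape is an
    assumption made for the sketch.
    """
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for name, content in sorted(payload.items()):
        target = data / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Each manifest line pairs a checksum with a path relative to the bag root.
        manifest_lines.append(f"{digest}  data/{name}")

    # The bag declaration names the format version and tag-file encoding.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag


def verify_bag(bag_dir):
    """Return True if every payload file matches its manifest checksum."""
    bag = Path(bag_dir)
    manifest = (bag / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True
```

Because the manifest only records paths and checksums, the same package can carry any file format and be produced by any repository system, which is the sense in which a BagIt-based submission stays repository and format agnostic.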

Session Leaders
Bradley Daigle, University of Virginia
Scott Turnbull, APTrust
Laura Capell, University of Miami
Stephen Davis, Columbia University
Elisabeth Long, University of Chicago
Nathan Tallman, University of Cincinnati

AND

Accessing Digital Art: Emulation and Preservation of Complex Digital Art Objects
We will provide an overview of the strategies and desired outcomes of PAFDAO: Preservation and Access Frameworks for Digital Art Objects, a two-year Research and Development, NEH-funded project. We will describe technical challenges in general as well as those that are idiosyncratic to the content at hand, and outline strategies we employ to address them. The talk will focus primarily on technical components of the project: disc imaging, the metadata framework and the organization of the PAFDAO deposit to the Cornell University Library Archival Repository (CULAR). The requirements specific to this project for imaging, metadata and organization of deposit are more complex than typical digital preservation projects due to these works' interdependencies with emulation environments and concerns over fidelity of experience in an emulated environment.

We will share our processes and encourage discussion with participants concerning digital preservation of complex media.

Project Background:

In February 2013, the Rose Goldsen Archive of New Media Art, part of Cornell University Library's Division of Rare and Manuscript Collections, received a $300,000 grant from the National Endowment for the Humanities to develop PAFDAO: preservation and access frameworks for complex digital media art objects: http://www.neh.gov/files/grants/cornell_universitypreservation_and_access_framework_for_digital_art_objects.pdf.

PAFDAO's test collection includes more than 300 interactive born-digital artworks created for CD-ROM, DVD-ROM, and web distribution, many of which date back to the early 1990s. Though vitally important to understanding the development of media art and aesthetics over the past two decades, these materials are at serious risk of degradation and are unreadable without obsolete computers and software.

Our goal is to create a scalable preservation workflow to ensure the best feasible access to these materials for decades to come, and also contribute to the development of coherent best practices in the area of preserving complex media collections.

Session Leaders
Jason Kovari, Cornell University
Dianne Dietrich, Cornell University
Michelle A. Paolillo, Cornell University

Presenters

Laura Capell

Head of Digital Production, University of Miami

Laura is the Head of Digital Production at the University of Miami, where she manages digital projects for special collections materials.

Bradley Daigle

Strategic and Content Expert/Chair NDSA Leadership, University of Virginia

University of Virginia

Stephen Davis

Director, Libraries Digital Program, Columbia University Libraries

Dianne Dietrich

Cornell University

Jason Kovari

Director, Cataloging & Metadata Services, Cornell University

Elisabeth Long

Associate University Librarian, University of Chicago

Michelle Paolillo

Digital Lifecycle Services Lead, Cornell University

Michelle is Cornell University Library's Lead for Digital Lifecycle Services. She is invested in the practical logistics of digital preservation (harmonizing workflows, preservation storage, interoperability, systems design, etc.). She also serves as Cornell's HathiTrust coordinator...

Nathan Tallman

Digital Content Strategist, Assistant Librarian, University of Cincinnati

Scott Turnbull

Lead Engineer, APTrust - University of Virginia

I'm the lead engineer for APTrust, providing preservation and aggregate repository services for 16+ universities nationwide. I'm also passionate about digital humanities and academic computing.

Wednesday October 29, 2014 9:00am - 10:00am EDT
Salons 1,2,3 Georgia Tech Hotel and Conference Center

Project Updates