CSR - Directed Study - Summer 2019
Project Description
Building on course content learned in LS 566 Metadata, this directed study will use SLIS’s Wikibase instance (https://wikibase.slis.ua.edu/) to investigate the feasibility of using linked data to organize collections of time-based media. The collection of interest is the National Educational Television (NET) Collection Catalog Project, a project of the American Archive of Public Broadcasting. The NET Catalog Project is a national catalog of programs distributed by NET (1952-1972), which comprise the earliest public television content. For the NET Catalog Project, transcripts for each program were created and marked up in HTML using a method that separates each transcript into segments. The goal of this study is to scrape this HTML-encoded transcript data and treat the segments as time-coded linked data entities in the SLIS Wikibase, in order to demonstrate how these entities can serve as the basis for systematically organizing the collection at a more granular level than is presently the case. We will also investigate the potential for enhancing each linked data segment with external metadata downloaded from the linked data cloud.
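As a rough illustration of the scraping step, the sketch below pulls time-coded segments out of a transcript page using only Python's standard library. The actual NET transcript markup is not described here, so the `<div class="segment" data-start="...">` structure, the `Segment` class, and the `parse_segments` function are all hypothetical stand-ins that would need to be adapted to the real HTML.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Segment:
    start: str  # time code exactly as written in the markup
    text: str   # transcript text of the segment, whitespace-normalized


class SegmentParser(HTMLParser):
    """Collects segments from ASSUMED markup:
    <div class="segment" data-start="HH:MM:SS"> ... </div>
    """

    def __init__(self):
        super().__init__()
        self.segments = []
        self._start = None  # time code of the segment being read, if any
        self._chunks = []   # text pieces collected for the current segment
        self._depth = 0     # nesting depth of <div> tags inside the segment

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._start is None:
            classes = (attrs.get("class") or "").split()
            if tag == "div" and "segment" in classes:
                self._start = attrs.get("data-start") or ""
                self._chunks = []
                self._depth = 1
        elif tag == "div":
            self._depth += 1

    def handle_endtag(self, tag):
        if self._start is not None and tag == "div":
            self._depth -= 1
            if self._depth == 0:
                # Join the text pieces and collapse runs of whitespace.
                text = " ".join("".join(self._chunks).split())
                self.segments.append(Segment(self._start, text))
                self._start = None

    def handle_data(self, data):
        if self._start is not None:
            self._chunks.append(data)


def parse_segments(html):
    """Return the list of Segment objects found in an HTML string."""
    parser = SegmentParser()
    parser.feed(html)
    parser.close()
    return parser.segments
```

Each resulting `Segment` pairs a time code with its text, which is the minimum needed to treat a segment as its own entity later on.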
The student will gain experience in conducting feasibility studies that investigate the potential of linked data applications in cultural heritage institutions. Additionally, the student will conduct an extensive literature review to aggregate previous work on linked data applications and to situate this feasibility study within the current state of linked data research, both building on previous research and charting new territory in the field. Part of the experimentation process will be to refine an HTML-parsing methodology and to index the parsed HTML within the Wikibase software to demonstrate navigation with linked data. The directed study will conclude with a formal write-up of study results, including experience gained in using the Wikibase software, parsing tools, and linked data indexing methods.
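To make the indexing step more concrete, the following sketch shows one way a parsed segment might be expressed as an item in the general shape of the Wikibase JSON entity format, with a time-code statement and a link back to its program. The property IDs (`P1`, `P2`), the program item ID, and the helper names are hypothetical; the real values would depend on how the SLIS Wikibase is configured, and this is only one possible modeling choice.

```python
import json


def string_claim(property_id, value):
    """Build a statement whose value is a plain string."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": property_id,
            "datavalue": {"value": value, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }


def item_claim(property_id, item_id):
    """Build a statement whose value points at another item, e.g. "Q42"."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": property_id,
            "datavalue": {
                "value": {
                    "entity-type": "item",
                    "numeric-id": int(item_id[1:]),
                    "id": item_id,
                },
                "type": "wikibase-entityid",
            },
        },
        "type": "statement",
        "rank": "normal",
    }


def segment_entity(program_title, program_item, start,
                   start_prop="P1", part_of_prop="P2"):
    """Describe one transcript segment as a Wikibase-style item.

    start_prop and part_of_prop are HYPOTHETICAL property IDs standing in
    for "start time" and "part of program".
    """
    return {
        "labels": {
            "en": {"language": "en", "value": f"{program_title} [{start}]"}
        },
        "descriptions": {
            "en": {"language": "en", "value": "Transcript segment"}
        },
        "claims": {
            start_prop: [string_claim(start_prop, start)],
            part_of_prop: [item_claim(part_of_prop, program_item)],
        },
    }


if __name__ == "__main__":
    # Hypothetical program title and item ID, for illustration only.
    entity = segment_entity("Example Program", "Q42", "00:01:15")
    print(json.dumps(entity, indent=2))
```

Linking each segment to its program through an item-valued statement, rather than a text field, is what would allow navigation between segments and programs within the Wikibase.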
Background Links
- Putting the Pieces Together: Creating a National Educational Television Catalog (slidedeck)
- Artificial intelligence meets public broadcasting’s archives (article)
Links to NET Collection Catalog