Main Page

From Wikibase.slis.ua.edu
Revision as of 15:09, 15 June 2021 by Smaccall (talk | contribs)

Welcome to the Linked Data Research Group!

Steven L. MacCall, PhD
Associate Professor
School of Library and Information Studies
University of Alabama

Selected Reading Lists
Our mission is applied linked data research investigating the semantic indexing of text and time-based media collections to facilitate the precision search of those collections via SPARQL queries. We are deploying methods for deriving properties as well as named entities and features from collection items that have been either manually created (e.g., book indexes or video logs) or computationally extracted (e.g., by Microsoft Azure Video Analyzer for Media, among other tools). The results of our semantic indexing method are SPARQL-queryable RDF knowledge graphs that are hosted on a local Wikibase instance managed by the Office of Information Technology (OIT) at the University of Alabama.
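
To make the idea of a SPARQL-queryable knowledge graph more concrete, here is a minimal sketch in plain Python. It stores a few invented triples and matches a pair of triple patterns against them, which is roughly what a SPARQL SELECT over a basic graph pattern does. All identifiers (ex:clip1, ex:depicts, ex:label) and the helper functions are hypothetical; they do not reflect the group's data model or any Wikibase API.

```python
# Hypothetical triples; every identifier below is invented for illustration.
TRIPLES = {
    ("ex:clip1", "ex:depicts", "ex:player12"),
    ("ex:clip1", "ex:startsAtSecond", "95"),
    ("ex:clip2", "ex:depicts", "ex:player7"),
    ("ex:player12", "ex:label", "Player 12"),
}


def is_var(term):
    """Terms starting with '?' are treated as variables, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Return an extended copy of `binding` if `pattern` matches `triple`, else None."""
    new_binding = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its earlier binding.
            if new_binding.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None  # a constant term does not match
    return new_binding


def query(patterns, triples=TRIPLES):
    """Return every variable binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Roughly: SELECT ?clip ?name WHERE { ?clip ex:depicts ?p . ?p ex:label ?name }
results = query([
    ("?clip", "ex:depicts", "?p"),
    ("?p", "ex:label", "?name"),
])
```

A real triplestore adds indexing, datatypes, and the full SPARQL algebra; the sketch only shows why a graph of triples supports more precise questions than a flat list of keywords.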

There are two overarching empirical research questions that we are pursuing:

  1. Queryability of semantic indexing method: The semantic indexing of both text and time-based media collections facilitates precision queries, much as a book index supports more precise lookup than a table of contents. When RDF triples are loaded to a knowledge graph, the resulting triplestore can be queried with precision using SPARQL. But are these precise queries useful?
  2. Scalability of semantic indexing method:

1st project: Sports DAM Knowledge Graph:

  1. We are applying linked data technologies to semantically index still images and video clips that document game action in sports by incorporating play-by-play datasets into the indexing process by way of a semantic data model and ETL pipeline process. The resulting knowledge graph can be queried using SPARQL, which allows for precision searching based on queries that incorporate game situation variables. Additional information:
    1. Video of presentation by Dr. MacCall to the 2020 Linked Data for Libraries (LD4L) conference
    2. Chronology: Data-driven Sports Image Indexing Research, which documents our work up until now.
  2. Basic research on philology graphs: We are investigating an ontology that would serve to integrate texts in library collections extending the work of the Collections as Data research community.
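
As a rough illustration of the sports indexing process described in item 1 above, the following sketch shows what a "transform" step might look like: it reads play-by-play rows and emits N-Triples lines that link a clip to the game situation it documents. The column names (play_id, quarter, down, clip_id), the example.org namespace, and the property names are all assumptions made for this example; they are not the group's semantic data model or pipeline code.

```python
import csv
import io

BASE = "http://example.org/"  # placeholder namespace, not the group's Wikibase

# Invented play-by-play sample; the column names are assumptions.
PLAY_BY_PLAY_CSV = """play_id,quarter,down,clip_id
p001,1,3,clip17
p002,4,1,clip42
"""


def iri(local_name):
    """Wrap a local name in the placeholder namespace as an N-Triples IRI."""
    return f"<{BASE}{local_name}>"


def literal(value):
    """Format a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def transform(csv_text):
    """Turn each play-by-play row into N-Triples lines for one clip and one play."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        play = iri(f"play/{row['play_id']}")
        clip = iri(f"clip/{row['clip_id']}")
        lines.append(f"{clip} {iri('documents')} {play} .")
        lines.append(f"{play} {iri('quarter')} {literal(row['quarter'])} .")
        lines.append(f"{play} {iri('down')} {literal(row['down'])} .")
    return lines


ntriples = transform(PLAY_BY_PLAY_CSV)
print("\n".join(ntriples))
```

Once triples like these are loaded into a triplestore, a query can select clips by game situation, for example all clips documenting a play in a given quarter and down.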

Current researchers (Spring 2021):

  1. Dr. Steven L. MacCall
  2. Huapu Liu, CIS doctoral student
  3. Nicole Lewis, SLIS MLIS student: For her Fall 2020 directed research study, Nicole is applying what she learned in the Linked Data course from this past summer to extend her philology graph work from scientific article publishing to book publishing using the HathiTrust Research Center's Extracted Features Datasets. She is developing the "transform" portion of an ETL pipeline using back-of-the-book indexes as a source for named entities for chapter-level subject access and will evaluate her work with a set of SPARQL queries against a set of transformed books on the topic of cataloging.


Affiliate researchers:

  1. Dr. Greg Bott in UA Department of Information Systems, Statistics and Management Science for database design and Python programming guidance
    1. Austin Herriott, Dr. Bott's undergraduate student, provided Python programming for the first iteration of our sports ETL pipeline funded with RGC grant monies in fall 2019
  2. Dr. Yu Gan in UA Department of Electrical and Computer Engineering for digital image processing
    1. Alexander Ramey, Dr. Gan's master's student, is assisting in developing an algorithm for extracting player numbers visible in YouTube game video, which will be added to our knowledge graph as named entity data.

Previous SLIS students:

  1. C. Melissa Anderson, SLIS MLIS graduate
  2. Christine Schultz-Richert, MLIS graduate: For her Summer 2019 directed research study, Christine investigated the semantic enhancement of transcripts from a 1957 National Educational Television (NET) program (A Look at the Indian's Future) using linked data methods.
  3. Jessica Camano, SLIS MLIS graduate: For her Fall 2020 directed research study, Jessica is investigating a "red letter" problem that emerged from our work in the summer Linked Data course in which our subject terms remain without MediaWiki sitepages. Resolution of this problem involved researching the National Library of Medicine's linked data service (Medical Subject Headings RDF) to download metadata about MeSH entries as RDF triples.
  4. David Roby, SLIS MLIS graduate: For his Fall 2020 directed research study, David is helping us investigate a research problem related to scaling the data management methods and ETL pipeline for our ongoing research into the semantic indexing of digital images and video clips documenting game action in sports.

Special thanks to David J. McMillan, Executive Director for Enterprise Development & Application Support in the UA Office of Information Technology (OIT)