2018-19 Academic Year Research Report

From Wikibase.slis.ua.edu

Data-driven Semantic Indexing of Football Images from the Entire 2017 Alabama Crimson Tide Football Season:
An Experiment in Linked Data Using a Local Wikibase Instance


Linked Data Research Group
School of Library and Information Studies
College of Communication and Information Sciences
University of Alabama

Our aim with this experiment was to demonstrate how statistical play-by-play data generated during a football game could be incorporated into a semantic indexing and identification process for the photos and video clips captured during those games. We chose to deploy a linked data approach using Wikibase as our software.

Outline of research accomplishments resulting from this experiment:

  1. Progress Report: RGC grant-funded research accomplishments thus far. We were able to successfully transform Web-accessible JSON-encoded statistical play-by-play datasets for all 14 games in the 2017 Alabama Crimson Tide football season into a linked data application using Wikibase. One can now navigate from play to play, drive to drive, and game to game within the 2017 football season using data drawn from existing statistical play-by-play datasets incorporated into the application by way of linked data methods. Highlights:
    1. Property list for the ontology developed for this application.
    2. Examples of three types of entities ("play," "drive," and "game") in our linked data application. Each example links to a Mediawiki site page and to the corresponding Wikibase item page. Each item page contains the metadata statements about its entity (each statement a "triple"), using properties drawn from our ontology:
      1. Mediawiki site page for a typical play (here is this play's corresponding Wikibase item page)
      2. Mediawiki site page for a typical drive (here is this drive's corresponding Wikibase item page)
      3. Mediawiki site page for a typical game (here is this game's corresponding Wikibase item page)
    3. Infobox templates deployed to generate infoboxes on each Mediawiki page:
      1. For a play
      2. For a drive
      3. For a game
    4. Example SPARQL queries for retrieving plays from the 2017 Alabama Crimson Tide football season that meet a variety of query criteria (PLEASE NOTE: To run a query, follow its link below, then click the blue arrow icon in the lower left portion of the screen):
      1. All rushing touchdowns that went for over 50 yards during the 2017 Alabama Crimson Tide football season (limited to those plays for which there are video clips).
      2. All Jalen Hurts touchdown passes that went for over 25 yards during the 2017 Alabama Crimson Tide football season (limited to those plays for which there are video clips).
      3. All touchdown passes that went for over 10 yards during the 2017 Alabama Crimson Tide football season (limited to those plays for which there are video clips).
  2. Applied research results (i.e., outside of the RGC grant context): The use of linked data to provide access to multimedia assets. One important result of our research during this academic year is a demonstration that linked data can provide access to the multimedia assets that document individual plays. As shown above, individual plays can be discovered by navigating the linked data application or by using SPARQL to query the triplestore. Each example below includes a video clip of the play, demonstrating how linked data navigation or SPARQL querying can lead to the multimedia assets that document it:
    1. Tua Tagovailoa pass complete to DeVonta Smith for 27 yds for a TD
    2. Da'Ron Payne 1 Yd pass from Jalen Hurts
    3. Damien Harris run for 75 yds for a TD
    4. Calvin Ridley 12 Yd pass from Jalen Hurts
    5. See also: Plays with Example UA Images May 2019
  3. Collaborator information. Important collaborators contributing to the research reported here:
    1. Dr. Greg Bott, Assistant Professor, UA Culverhouse School of Business. Dr. Bott is co-PI on the RGC grant and provides data management expertise, focusing on optimizing the efficiency of data wrangling methods using Python scripting.
    2. David McMillan, IT Team Leader, Enterprise Development & Application Support, UA Office of Information Technology (OIT). David has been a collaborator since the late 1990s, when he was Systems Admin in the School of Library and Information Studies, and we were co-authors on a UA patent. In the current research project, David has contributed crucial support in the installation, optimization, and ongoing management of the Mediawiki/Wikibase instance hosted by UA OIT.
    3. Huapu Liu, Graduate Research Assistant and MLIS student. Huapu served as my graduate research assistant for the entire 2018-19 academic year and was an indispensable collaborator in developing our understanding of Wikibase and linked data, which was essentially a long series of trial-and-error steps. Huapu helped me compose over 25 pages of data wrangling procedures needed to transform the JSON-encoded statistical play-by-play datasets into linked data in Wikibase. He also did more than his share of data wrangling! (To run the SPARQL query, follow the link, then click the blue arrow icon in the lower left portion of the screen.)
    4. Christine Schultz-Richert, MLIS student. Christine joined the research team in early April after hearing about this project in a presentation I gave in our LS 566 Metadata course. In that short time, Christine accomplished quite a bit of data wrangling for us.
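
Illustrative sketch. The progress report in item 1 of the outline above describes transforming JSON-encoded play-by-play data into statements ("triples") and then retrieving plays that meet query criteria. The Python sketch below illustrates that general idea on a single play. It is not the project's actual data wrangling procedure: the JSON field names, the identifiers ("play-001", "drive-01", "game-01", "clip-001.mp4"), and the property labels are all hypothetical, and the filter function only mimics the first example query in memory rather than running SPARQL against Wikibase. Only the play description is taken from the examples listed above.

```python
import json

# Hypothetical play record. The field names and identifiers are illustrative
# assumptions, not the schema of the actual play-by-play datasets.
SAMPLE_PLAY_JSON = """
{
  "id": "play-001",
  "drive_id": "drive-01",
  "game_id": "game-01",
  "type": "Rushing Touchdown",
  "yards": 75,
  "text": "Damien Harris run for 75 yds for a TD",
  "video_clip": "clip-001.mp4"
}
"""

# Hypothetical mapping from JSON field names to ontology property labels.
FIELD_TO_PROPERTY = {
    "drive_id": "part of drive",
    "game_id": "part of game",
    "type": "play type",
    "yards": "yards gained",
    "text": "description",
    "video_clip": "video clip",
}


def play_to_triples(play):
    """Return one (subject, property, value) triple per mapped field that is present."""
    subject = play["id"]
    triples = []
    for field, prop in FIELD_TO_PROPERTY.items():
        value = play.get(field)
        if value is not None:
            triples.append((subject, prop, value))
    return triples


def long_rushing_tds_with_video(triples, min_yards=50):
    """Return subjects that are rushing touchdowns of more than min_yards and have a video clip."""
    # Group the statements by subject so each play can be tested as a whole.
    by_subject = {}
    for subject, prop, value in triples:
        by_subject.setdefault(subject, {})[prop] = value
    return [
        subject
        for subject, props in by_subject.items()
        if props.get("play type") == "Rushing Touchdown"
        and props.get("yards gained", 0) > min_yards
        and "video clip" in props
    ]


play = json.loads(SAMPLE_PLAY_JSON)
triples = play_to_triples(play)
for triple in triples:
    print(triple)
print(long_rushing_tds_with_video(triples))
```

In a Wikibase instance, each property label would correspond to a property in the ontology and each subject to an item page, so the same pattern matching would be expressed as a SPARQL query against the triplestore.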