Difference between revisions of "Data Preparation Procedures"
Latest revision as of 20:05, 28 January 2019
This page describes how to prepare data for upload with QuickStatements.
Procedures vary based on the year that a game occurred.
IMPORTANT: Before creating an Item page, be sure to search to confirm that it doesn't already exist!
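The existence check above can also be scripted. The following is a minimal sketch, assuming the Wikibase instance exposes the standard `wbsearchentities` API action; the endpoint URL is a placeholder, and the helper names are hypothetical rather than part of any documented procedure on this wiki. It only builds the search URL and reads item IDs out of an already-decoded response, so it makes no network calls itself.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace with the api.php URL of the actual Wikibase instance.
API_URL = "https://wikibase.example.org/w/api.php"


def build_search_url(label, language="en", limit=5):
    """Return a wbsearchentities URL that searches for items matching `label`."""
    params = {
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": "item",
        "limit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def existing_item_ids(response):
    """Extract item IDs from a decoded wbsearchentities JSON response."""
    return [hit["id"] for hit in response.get("search", [])]
```

An empty list from `existing_item_ids` suggests that no matching item exists yet, though a manual search is still a sensible double check because labels can differ slightly.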
PRELIMINARY STEPS
- Make sure a season page exists for each team participating in the game: Indexing a Football Team Season
- Create an item page for the game: Indexing a Football Game
PROCEDURES
The spreadsheet preparation procedures cover downloading or otherwise acquiring spreadsheets and then preparing them; which procedure applies depends on the time period in which the game occurred. The procedures include cleaning the data and uploading the cleaned data to our Wikibase instance using our QuickStatements tool.
Category 1 Games (with available JSON-encoded play-by-play data sources - 2001 to present)
Football games in this category occurred from 2001 to the present and have JSON-encoded play-by-play data sources. The category is further subdivided by whether wall clock data is available for each play (games occurring 2014 to present).
IMPORTANT: First download and install a JSON to CSV converter from https://json-csv.en.softonic.com/download
- Games 2016 to present (JSON files with wall clock data):
  - Downloading and Initial Prepping of Spreadsheets for Category 1 Games 2014 to Present
  - Preparing Drive Creation Spreadsheets for Category 1 Games 2014 to Present
  - Preparing Play Creation Spreadsheets for Category 1 Games 2014 to Present
  - Preparing Drive Data Spreadsheets for Category 1 Games 2014 to Present
  - Preparing Play-by-play Data Spreadsheets for Category 1 Games 2014 to Present
  - Further preparation procedures to come later, including for player participation data (draft version available: Player Indexing)
- Games 2001 to 2013 (JSON files without wall clock data):
  - Downloading and Initial Prepping of Spreadsheets for Category 1 Games 2001 to 2013
  - Preparing Drive Creation Spreadsheets for Category 1 Games 2001 to 2013
  - Preparing Drive Data Spreadsheets for Category 1 Games 2001 to 2013
  - Preparing Play Creation Spreadsheets for Category 1 Games 2001 to 2013
  - Preparing Play-by-play Data Spreadsheets for Category 1 Games 2001 to 2013
  - Further preparation procedures to come later, including for player participation data (draft version available: Player Indexing)
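For readers who want to see what the JSON to CSV conversion step for Category 1 games amounts to, here is a minimal sketch using only the Python standard library. It is not the converter named above and is not a replacement for the linked procedures. The layout of the source JSON is an assumption: the sketch expects a top-level key (here `plays`, a hypothetical name) holding a list of flat play records, and real play-by-play files may be nested differently.

```python
import csv
import io
import json


def plays_to_csv(json_text, plays_key="plays"):
    """Convert a JSON document holding a list of flat play records into CSV text.

    `plays_key` is a hypothetical key name; adjust it to the real file layout.
    """
    data = json.loads(json_text)
    plays = data[plays_key]

    # Collect every field name in first-seen order so no column is dropped
    # when later records carry extra fields (for example wall clock data).
    fieldnames = []
    for play in plays:
        for key in play:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(plays)  # missing fields are written as empty cells
    return buffer.getvalue()


# Made-up sample data, for illustration only.
sample = (
    '{"plays": ['
    '{"drive": 1, "down": 1, "text": "Rush for 3 yards"}, '
    '{"drive": 1, "down": 2, "text": "Pass incomplete", "clock": "14:21"}'
    ']}'
)
print(plays_to_csv(sample))
```

Records that lack a field, such as a play without wall clock data, simply get an empty cell in that column, which keeps the spreadsheet rectangular for later cleaning.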
Category 2 Games (with play-by-play data sources requiring transcribing)
Football games in this category occurred prior to 2001 and have paper-based play-by-play data sources that require transcribing to spreadsheets.
- Transcribing Steps and Initial Prepping of Spreadsheets for Category 2 Games
- Preparing Drive Creation Spreadsheets for Category 2 Games
- Preparing Drive Data Spreadsheets for Category 2 Games
- Preparing Play Creation Spreadsheets for Category 2 Games
- Preparing Play-by-play Data Spreadsheets for Category 2 Games
- Further preparation procedures to come later, including for player participation data (draft version available: Player Indexing)
Category 3 Games (with play-by-play data requiring reconstruction from newspaper accounts)
Football games in this category occurred prior to 2001 and do not have any play-by-play data sources other than newspaper game accounts that require transcribing to spreadsheets.
- Data Gathering, Transcribing Steps, and Initial Prepping of Spreadsheets for Category 3 Games
- Preparing Drive Creation Spreadsheets for Category 3 Games
- Preparing Drive Data Spreadsheets for Category 3 Games
- Preparing Play Creation Spreadsheets for Category 3 Games
- Preparing Play-by-play Data Spreadsheets for Category 3 Games
- Further preparation procedures to come later, including for player participation data (draft version available: Player Indexing)
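The choice among the procedure groups above can be summarized as a small decision function. This is only a sketch: the function name and the `has_paper_play_by_play` flag are hypothetical, and it assumes the 2014 cutoff that appears in the linked page titles for wall clock data (adjust the threshold if the actual cutoff differs). It also assumes that every game from 2001 onward has a JSON-encoded source, as the Category 1 description states.

```python
def procedure_group(year, has_paper_play_by_play=False):
    """Return the name of the procedure group that applies to a game.

    `has_paper_play_by_play` only matters for games before 2001: it says
    whether a paper-based play-by-play source exists for the game.
    """
    if year >= 2014:  # assumed wall clock cutoff, taken from the linked page titles
        return "Category 1 Games 2014 to Present"
    if year >= 2001:
        return "Category 1 Games 2001 to 2013"
    if has_paper_play_by_play:
        return "Category 2 Games"
    # Only newspaper game accounts are available.
    return "Category 3 Games"


print(procedure_group(1975, has_paper_play_by_play=True))
```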
Data preparation for creating and populating each drive's item page using QuickStatements is a two-step process:
- Create a spreadsheet that derives Q numbers for each drive and incorporates "instance of" statements: Drive Creation Procedure
- Create a spreadsheet to populate data on each drive's (now existing) item page: Drive Data Preparation
- Play-by-play Data Preparation and Upload Procedures
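The two-step drive process described above can be illustrated with a short sketch that generates QuickStatements commands from spreadsheet rows. It assumes the QuickStatements tool on the Wikibase instance accepts the version 1 tab-separated command format (`CREATE` followed by `LAST` lines). All property and item IDs in the example (`P1`, `P2`, `Q100`, `Q77`, `Q501`) are hypothetical placeholders, not IDs from this Wikibase, and the function names are illustrative only.

```python
def drive_creation_commands(labels, instance_of_prop, drive_class_item):
    """Step 1: create one item per drive and add its "instance of" statement."""
    lines = []
    for label in labels:
        lines.append("CREATE")
        lines.append(f'LAST\tLen\t"{label}"')  # English label for the new item
        lines.append(f"LAST\t{instance_of_prop}\t{drive_class_item}")
    return "\n".join(lines)


def drive_data_commands(rows):
    """Step 2: add statements to items that now exist.

    `rows` holds (item_id, property_id, value) tuples whose values are
    already written in QuickStatements form.
    """
    return "\n".join("\t".join(row) for row in rows)


# Hypothetical IDs, for illustration only.
print(drive_creation_commands(["Drive 1 (example game)"], "P1", "Q100"))
print(drive_data_commands([("Q501", "P2", "Q77")]))
```

Splitting the work this way mirrors the procedure: the Q numbers assigned in the first batch must be known before the second batch can reference them.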