pga tour data warehouse team members frank paladino aravind yeluripiti
TRANSCRIPT
![Page 1: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/1.jpg)
PGA Tour Data Warehouse
Team MembersFrank Paladino
Aravind Yeluripiti
![Page 2: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/2.jpg)
Project Goals• The Professional Golfers Association (PGA) is the
organizer of the men's professional golf tour primarily in the United States and North America.
• It organizes most of the events on the flagship annual series of tournaments also known as the PGA Tour.
• The goal of this project is to build a data warehouse using data collected from the PGA Tour website.
![Page 3: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/3.jpg)
Task Delegation and work flow
![Page 4: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/4.jpg)
Source Databases
• The source databases contain data related to PGA Players, Tournaments, and Results.
• For each subject area, research was done to determine the appropriate data model for database.
• After completing the data model, data was loaded into the source databases.
• In order to obtain relevant data for our project, we extracted data from the PGA Tour website into comma delimited files and imported them into our source databases.
• The “LOAD DATA INFILE ‘[filename]’ FIELDS TERMINATED BY ‘,’ command was used to load the data from files into the source databases.
![Page 5: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/5.jpg)
Source Databases
• Player Database– The player database contains data related to the
players profile and demographic information.• Tournament Database– The Tournament database contains detailed data
related to tournaments, venues, and results of tournaments.
• Statistics Database– The statistics database contains data that measures a
players tour performance.
![Page 6: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/6.jpg)
Raw data - sample• Dustin johnson
– Height: 6 ft, 4 in– Weight: 190 lbs– Birthday: 06/22/1984– College:Coastal Carolina University– Turned Pro:2007– Birthplace:Columbia, South Carolina– Residence:Myrtle Beach, South Carolina
• Steven stricker– Height:6 ft, 0 in– Weight:190 lbs– Birthday:02/23/1967– College:University of Illinois– Turned Pro:1990– Birthplace:Edgerton, Wisconsin– Residence:Madison, Wisconsin
![Page 7: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/7.jpg)
Players Database
![Page 8: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/8.jpg)
Players Database Snapshots
Locations TablePlayers Table
![Page 9: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/9.jpg)
Players Database Snapshots
PlayerPersonalInfo Table
![Page 10: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/10.jpg)
Players Database snapshots
![Page 11: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/11.jpg)
Statistics Database
![Page 12: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/12.jpg)
Statistics Database SnapshotsAdvancedStats TableBasicStats Table
![Page 13: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/13.jpg)
Statistics Database Snapshots
TourStats Table
![Page 14: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/14.jpg)
Tournament Database Schema
![Page 15: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/15.jpg)
Tournament Database SnapshotsCourseDetails Table PlayerTourney Table
![Page 16: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/16.jpg)
Tournament Database Snapshots
Venues Table
Tourneys Table
![Page 17: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/17.jpg)
Tournament Database SnapshotsScorecard Table
![Page 18: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/18.jpg)
PGA Tour Warehouse Database
The PGA Tour Warehouse integrates data from the three source databases to a single schema that can be used to query player statistical, tournament results, and player performance.
Our approach separated source data into:– Qualitative data – dimensions – data used to
qualify or filter the data– Quantitative data - (facts/measures) – data used
to measure performance
![Page 19: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/19.jpg)
Implementation of Data WarehouseThe SQL implementation of data warehouse dimension and fact tables used a similar approach to that described below. For each target table to be loaded, a stored procedure was used to extract source data into a database cursor, compare the key fields to the target table, and if the row does not already exists, inserts the data into the target table.
Example shown below was used to load the LocationDimension table
BEGINDECLARE done INT DEFAULT 0;DECLARE id varchar(10);DECLARE city1 varchar(45);DECLARE state1 varchar(2);DECLARE country1 varchar (45);DECLARE cur1 CURSOR FOR SELECT DISTINCT idLocation, city, state, country FROM tournament.venue ;DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
OPEN cur1; -- open database cursor
cur1_loop: LOOP FETCH cur1 INTO id, city1, state1, country1; -- MySQL Bug 28227 (variable names cannot be same as column names)
IF (done=1) THEN LEAVE cur1_loop; END IF;
IF NOT EXISTS (SELECT * FROM pgatourwarehouse.locationdimension WHERE idLocation = id) THENINSERT INTO pgatourwarehouse.locationdimension VALUES (id, city1, state1, country1, CURDATE());
END IF;
END LOOP;
SELECT 1 FROM tournament.venue LIMIT 1; -- MySQL Bug 60840 (need a statement using a table to avoid throwing warning 1329)
CLOSE cur1; -- close the database cursorEND
![Page 20: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/20.jpg)
Implementation of Data Warehouse
Last, a call wrapper.sql script was used to call stored procedures in specific order so that referential integrity constraints were honored.
1. CALL loadLocationDimension();2. CALL loadLocationDimension2();3. CALL loadVenueDimension();4. CALL loadPlayerDimension();5. CALL loadTournamentDimension();6. CALL loadPlayerTourneyRoundFacts();7. CALL loadPlayerFacts();8. CALL loadPlayerTourneyFacts();
![Page 21: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/21.jpg)
PGA tour Warehouse
Fact constellation
![Page 22: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/22.jpg)
Data Warehouse SnapshotsLocationDimension Table
![Page 23: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/23.jpg)
Data Warehouse Snapshots
PlayerDimension Table
![Page 24: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/24.jpg)
Data Warehouse Snapshots
TournamentDimension Table
VenueDimension Table
![Page 25: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/25.jpg)
Data Warehouse SnapshotsPlayerFacts Table
![Page 26: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/26.jpg)
Data Warehouse SnapshotsPlayerTourneyFacts Table
![Page 27: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/27.jpg)
Data Warehouse SnapshotsPlayerTourneyRoundFacts Table
![Page 28: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/28.jpg)
Is PGA tour, a warehouse?
• “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.”—W. H. Inmon
• Subject-oriented– Players, tournaments
• Integrated – raw data -> databases -> warehouse
• Time variant – implicit: seasons (2012,2011)– explicit: each table time stamped
• Nonvolatile– Initial loading and access of data
![Page 29: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/29.jpg)
Is PGA tour, a warehouse?...contd
• Management decision making (querying)– How many players are from so-so
city/state/country?– How many tournaments are being held in certain
state in a specific month?– Etc..
![Page 30: PGA Tour Data Warehouse Team Members Frank Paladino Aravind Yeluripiti](https://reader036.vdocuments.us/reader036/viewer/2022062313/56649d1c5503460f949f1b55/html5/thumbnails/30.jpg)