TotalETL:infoServer
Chris Fournier, Nathan Clark, Scott Longley, Cyril Shilnikov
MQP Project 2005
Sponsored by TotalETL Inc.
TotalETL
• Small ETL Company
• ETL (Extract, Transform, Load)
  – Used in large companies
  – Multimillion-dollar business
• Existing product is infoSight, a desktop solution
infoSight
infoSight Current Features
• GUI Project creation
• Library of Transformers
• Works with multiple input types
• Single machine
• Single user
• One project at a time
MQP Project goals
• Prototype the client-server version of infoSight
  – Distributed
  – Multi-user
  – Database-centric
  – Extensible
  – Alpha-level code
  – Focus on back-end design
Project Methodology
• Met with TotalETL team on-site
• Design requirements
• Refine and discuss requirements as needed
• Build core modules; demo at end of first term
• Build additional modules; final demo
General design overview
[Diagram components: Thin Clients, Thick Clients, Distributed Server System, Repository]
Actual design overview
[Diagram components: Client, Session Manager, Security Manager, Project Manager, Version Manager, Job Manager, Scheduling Manager, Event & Log Manager, Repository Manager, DB]
Repository Manager
• System core
• Store all information about
  – System operation
  – Security
  – Projects
• XML Parser to store Projects
• JDBC to connect to databases
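The XML step above can be illustrated with a minimal sketch using the JDK's built-in DOM parser. The `EtlProject` class, the `ProjectXml` helper, and the `project`/`transformer` element names are assumptions made for illustration; the slides do not show infoServer's actual project schema. The resulting string could then be written to a repository table over JDBC, which is not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical in-memory project: a name plus an ordered list of transformer names. */
class EtlProject {
    final String name;
    final List<String> transformers = new ArrayList<>();

    EtlProject(String name) { this.name = name; }
}

/** Converts projects to and from an XML string (element names are assumptions). */
class ProjectXml {
    static String toXml(EtlProject project) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement("project");
            root.setAttribute("name", project.name);
            doc.appendChild(root);
            for (String t : project.transformers) {
                Element e = doc.createElement("transformer");
                e.setTextContent(t);
                root.appendChild(e);
            }
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("could not serialize project", ex);
        }
    }

    static EtlProject fromXml(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            EtlProject project = new EtlProject(root.getAttribute("name"));
            NodeList nodes = root.getElementsByTagName("transformer");
            for (int i = 0; i < nodes.getLength(); i++) {
                project.transformers.add(nodes.item(i).getTextContent());
            }
            return project;
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse project XML", ex);
        }
    }
}
```

Checked exceptions are wrapped so callers can round-trip a project with `ProjectXml.fromXml(ProjectXml.toXml(p))` without extra handling.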
Repository Table Design
Project Manager & Version Control
• Storage and Retrieval of Projects
  – In-memory Object -> XML File -> Repository
• Version Control
  – Per user locking
  – Version tracking
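Per-user locking and version tracking could look roughly like the sketch below. The `VersionStore` class and its `checkOut`/`checkIn`/`load` methods are hypothetical names, and the history is kept in memory here, whereas the slides indicate the real system stores projects in the repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical version store: one lock holder per project, append-only version history. */
class VersionStore {
    private final Map<String, String> lockHolder = new HashMap<>();
    private final Map<String, List<String>> history = new HashMap<>();

    /** Grants the lock if the project is free or already held by the same user. */
    synchronized boolean checkOut(String project, String user) {
        String holder = lockHolder.get(project);
        if (holder != null && !holder.equals(user)) {
            return false;
        }
        lockHolder.put(project, user);
        return true;
    }

    /** Stores a new version and releases the lock; returns the 1-based version number. */
    synchronized int checkIn(String project, String user, String xml) {
        if (!user.equals(lockHolder.get(project))) {
            throw new IllegalStateException(user + " does not hold the lock on " + project);
        }
        List<String> versions = history.computeIfAbsent(project, k -> new ArrayList<>());
        versions.add(xml);
        lockHolder.remove(project);
        return versions.size();
    }

    /** Returns the stored content of a given 1-based version. */
    synchronized String load(String project, int version) {
        List<String> versions = history.get(project);
        if (versions == null || version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("no such version");
        }
        return versions.get(version - 1);
    }
}
```

A second user's `checkOut` fails until the first user checks in, and every check-in adds a new version rather than overwriting the previous one.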
Job Manager
• Combine Projects into Jobs
• Set interdependencies
• Running Jobs
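One way to turn interdependencies into a run order is a topological sort. The sketch below uses Kahn's algorithm; this is an illustrative choice, not necessarily what the Job Manager does, and the `JobPlan` class and its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical job: a set of named projects plus "must run after" edges between them. */
class JobPlan {
    // project -> projects it depends on (insertion order kept for stable output)
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    void addProject(String project) {
        dependsOn.computeIfAbsent(project, k -> new ArrayList<>());
    }

    void addDependency(String project, String prerequisite) {
        addProject(prerequisite);
        addProject(project);
        dependsOn.get(project).add(prerequisite);
    }

    /** Kahn's algorithm: returns a run order, or throws if dependencies form a cycle. */
    List<String> runOrder() {
        Map<String, Integer> remaining = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            remaining.put(e.getKey(), e.getValue().size());
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String done = ready.remove();
            order.add(done);
            // Each finished project unblocks the projects that were waiting on it.
            for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
                for (String prerequisite : e.getValue()) {
                    if (prerequisite.equals(done)
                            && remaining.merge(e.getKey(), -1, Integer::sum) == 0) {
                        ready.add(e.getKey());
                    }
                }
            }
        }
        if (order.size() != dependsOn.size()) {
            throw new IllegalStateException("cyclic dependency between projects");
        }
        return order;
    }
}
```

Rejecting cycles when the order is computed keeps a job with contradictory dependencies from ever starting.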
Schedule Manager
• Schedule Jobs
  – On request
  – Per schedule
• Multiple scheduling strategies
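Multiple scheduling strategies suggest a pluggable policy. The sketch below assumes a strategy is simply an ordering over the jobs that are currently due; `QueuedJob`, `SchedulingStrategy`, the two example strategies, and `JobScheduler` are all hypothetical names, and the slides do not say which strategies infoServer provides.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical queued job: a name, a priority, and the time it is due (epoch millis). */
class QueuedJob {
    final String name;
    final int priority;
    final long dueAtMillis;

    QueuedJob(String name, int priority, long dueAtMillis) {
        this.name = name;
        this.priority = priority;
        this.dueAtMillis = dueAtMillis;
    }
}

/** A pluggable policy deciding the order in which due jobs run. */
interface SchedulingStrategy {
    Comparator<QueuedJob> order();
}

class EarliestDueFirst implements SchedulingStrategy {
    public Comparator<QueuedJob> order() {
        return Comparator.comparingLong((QueuedJob j) -> j.dueAtMillis);
    }
}

class HighestPriorityFirst implements SchedulingStrategy {
    public Comparator<QueuedJob> order() {
        return Comparator.comparingInt((QueuedJob j) -> j.priority).reversed();
    }
}

class JobScheduler {
    private final List<QueuedJob> queue = new ArrayList<>();
    private final SchedulingStrategy strategy;

    JobScheduler(SchedulingStrategy strategy) { this.strategy = strategy; }

    /** "Per schedule": the job becomes eligible at its dueAtMillis. */
    void schedule(QueuedJob job) { queue.add(job); }

    /** "On request": the job is eligible immediately. */
    void runNow(String name, int priority, long nowMillis) {
        queue.add(new QueuedJob(name, priority, nowMillis));
    }

    /** Removes and returns the names of all jobs due at nowMillis, in strategy order. */
    List<String> takeDue(long nowMillis) {
        List<QueuedJob> due = new ArrayList<>();
        for (QueuedJob job : queue) {
            if (job.dueAtMillis <= nowMillis) due.add(job);
        }
        queue.removeAll(due);
        due.sort(strategy.order());
        List<String> names = new ArrayList<>();
        for (QueuedJob job : due) names.add(job.name);
        return names;
    }
}
```

Because the scheduler only depends on the interface, a more advanced algorithm can be added as a new strategy class without changing the scheduler.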
Session Manager
• Establish and maintain client connections
• RMI
  – Simple, robust, built into Java
• Front end for all functions in server
• Security checking
  – Authentication of users
  – Authorization of commands
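The shape of an RMI front end with authentication and per-command authorization might resemble the following sketch. The `SessionService` interface, its method names, and the in-memory user table are assumptions; only the `java.rmi` types are real. The plain-text password map is there purely to keep the example short.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical remote interface; RMI requires extending Remote and declaring RemoteException. */
interface SessionService extends Remote {
    String login(String user, String password) throws RemoteException;
    String execute(String sessionId, String command) throws RemoteException;
}

/** In-memory sketch of the server side; a real server would export this object via RMI. */
class InMemorySessionService implements SessionService {
    private final Map<String, String> passwords = new HashMap<>();    // user -> password (sketch only)
    private final Map<String, Set<String>> allowed = new HashMap<>(); // user -> permitted commands
    private final Map<String, String> sessions = new HashMap<>();     // session id -> user

    synchronized void addUser(String user, String password, Set<String> commands) {
        passwords.put(user, password);
        allowed.put(user, new HashSet<>(commands));
    }

    /** Authentication: returns a new session id, or throws on bad credentials. */
    public synchronized String login(String user, String password) {
        if (password == null || !password.equals(passwords.get(user))) {
            throw new SecurityException("login failed for " + user);
        }
        String sessionId = UUID.randomUUID().toString();
        sessions.put(sessionId, user);
        return sessionId;
    }

    /** Authorization: every command is checked against the session's user before dispatch. */
    public synchronized String execute(String sessionId, String command) {
        String user = sessions.get(sessionId);
        if (user == null) {
            throw new SecurityException("unknown session");
        }
        if (!allowed.get(user).contains(command)) {
            throw new SecurityException(user + " may not run " + command);
        }
        return "dispatched " + command + " for " + user;
    }
}
```

To serve remote clients, the object would typically be exported with `UnicastRemoteObject.exportObject(service, 0)` and bound in a registry created with `LocateRegistry.createRegistry`; that wiring is omitted so the sketch runs without network setup.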
Security Manager
• Determine user’s privileges
• Control access to Projects, Jobs, etc.
• Custom Security Model
  – Role-based ACLs
  – Read, Write, Execute (Projects and Jobs)
  – Read, Create, Modify (System Configuration)
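A role-based ACL check for Projects and Jobs can be sketched as below. The three permissions come from the slide; the `AccessControl` class, its method names, and the default-deny rule are assumptions. The Read, Create, Modify set for system configuration would follow the same pattern with a second enum.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Permissions named on the slide for Projects and Jobs. */
enum Permission { READ, WRITE, EXECUTE }

/** Hypothetical role-based ACL: users hold roles, and each resource grants permissions per role. */
class AccessControl {
    private final Map<String, Set<String>> rolesOfUser = new HashMap<>();
    private final Map<String, Map<String, EnumSet<Permission>>> aclOfResource = new HashMap<>();

    void assignRole(String user, String role) {
        rolesOfUser.computeIfAbsent(user, k -> new HashSet<>()).add(role);
    }

    void grant(String resource, String role, Permission first, Permission... rest) {
        aclOfResource.computeIfAbsent(resource, k -> new HashMap<>())
                .computeIfAbsent(role, k -> EnumSet.noneOf(Permission.class))
                .addAll(EnumSet.of(first, rest));
    }

    /** A user is allowed if any of their roles carries the permission on the resource. */
    boolean isAllowed(String user, String resource, Permission permission) {
        Map<String, EnumSet<Permission>> acl = aclOfResource.get(resource);
        Set<String> roles = rolesOfUser.get(user);
        if (acl == null || roles == null) {
            return false; // default deny (an assumption of this sketch)
        }
        for (String role : roles) {
            EnumSet<Permission> granted = acl.get(role);
            if (granted != null && granted.contains(permission)) {
                return true;
            }
        }
        return false;
    }
}
```

Granting permissions to roles rather than to individual users means adding a user only requires assigning roles, not editing each project's ACL.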
Event Manager & Logger
• Useful for future expansion
• Complex Hierarchy of Events
• All Events Logged
  – Log4J format
Event Hierarchy
InfoserverEvent
  – UserEvent
    – UserLoginFailedEvent
    – UserLoginEvent
    – UserLogoutEvent
  – ProjectEvent
  – (Other level-2 events)
    – (Other level-3 events)
Listeners
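The event hierarchy and its listeners map naturally onto a Java class hierarchy. In the sketch below the class names follow the slide, while the fields, constructors, the `InfoserverListener` interface, and the `EventManager` API are assumptions. An in-memory list stands in for the Log4J logger so the example needs no external library.

```java
import java.util.ArrayList;
import java.util.List;

// Class names follow the slide; fields and constructors are assumptions.
class InfoserverEvent {
    final String message;
    InfoserverEvent(String message) { this.message = message; }
}
class UserEvent extends InfoserverEvent {
    final String user;
    UserEvent(String user, String message) { super(message); this.user = user; }
}
class UserLoginEvent extends UserEvent {
    UserLoginEvent(String user) { super(user, "login"); }
}
class UserLoginFailedEvent extends UserEvent {
    UserLoginFailedEvent(String user) { super(user, "login failed"); }
}
class UserLogoutEvent extends UserEvent {
    UserLogoutEvent(String user) { super(user, "logout"); }
}
class ProjectEvent extends InfoserverEvent {
    final String project;
    ProjectEvent(String project, String message) { super(message); this.project = project; }
}

/** Hypothetical listener interface, typed by the event class it wants to receive. */
interface InfoserverListener<E extends InfoserverEvent> {
    void onEvent(E event);
}

class EventManager {
    /** Pairs a listener with the event type it subscribed to. */
    private static final class Registration<E extends InfoserverEvent> {
        private final Class<E> type;
        private final InfoserverListener<? super E> listener;

        Registration(Class<E> type, InfoserverListener<? super E> listener) {
            this.type = type;
            this.listener = listener;
        }

        void deliver(InfoserverEvent event) {
            // Subclass events also match, so a UserEvent listener sees every login event.
            if (type.isInstance(event)) {
                listener.onEvent(type.cast(event));
            }
        }
    }

    private final List<Registration<?>> registrations = new ArrayList<>();
    final List<String> log = new ArrayList<>(); // stand-in for a Log4J logger

    <E extends InfoserverEvent> void subscribe(Class<E> type, InfoserverListener<? super E> listener) {
        registrations.add(new Registration<E>(type, listener));
    }

    /** Logs every event, then delivers it to the listeners whose type matches. */
    void publish(InfoserverEvent event) {
        log.add(event.getClass().getSimpleName() + ": " + event.message);
        for (Registration<?> r : registrations) {
            r.deliver(event);
        }
    }
}
```

Subscribing at a higher level of the hierarchy, for example `UserEvent.class`, receives all of its subclasses, so listeners can choose how specific they want to be.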
Saving and Loading Projects
[Diagram components: Client, Session Manager, Security Manager, Project Manager, Version Manager, Repository Manager, Event & Log Manager, DB]
Creating Jobs from Projects
[Diagram components: Client, Session Manager, Security Manager, Project Manager, Job Manager, Repository Manager, Event & Log Manager, DB]
Scheduling Jobs to Run
[Diagram components: Client, Session Manager, Security Manager, Scheduling Manager, Job Manager, Repository Manager, Event & Log Manager, DB]
Project Summary
• Relational Database storage
  – Projects
  – Operational Information
• Job Scheduling
• Tailored Security Model
• Version control
• Logging
Future work
• Distributed servers
• Clients, thick and thin
• Support for more databases
• More advanced scheduling algorithms
Thanks
• Professor E. A. Rundensteiner
• Arun Shastry
• Greg Goldberg
• Rest of the TotalETL Team
Questions?