postgresql materialized views with active record
DESCRIPTION
How to use PostgreSQL Materialized Views with to simplify code and improve performance of your Ruby on Rails applicationTRANSCRIPT
![Page 1: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/1.jpg)
PostgreSQL Materialized Views
and Active Record
David Roberts
![Page 2: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/2.jpg)
The ProblemHow do you quickly filter data
represented by multiple ActiveRecord associations and calculations?
![Page 3: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/3.jpg)
Data ModelCity%
&%Philly%&%Boston%
Technology%&%Ruby%&%Python%
Club%&%Philly.rb%
Talk%&%PostgreSQL%Materialized%Views%
Feedback%&%“Super%rad%talk!!”%&%“The%whole%world%is%now%dumber”%
Author%&%David%Roberts%
![Page 4: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/4.jpg)
View all Commentsclass Feedback < ActiveRecord::Base belongs_to :talk INVALID_COMMENTS = ['', 'NA', 'N/A', 'not applicable']
scope :filled_out, -> { where.not(comment: INVALID_COMMENTS) }end
Feedback.filled_out
Feedback Load (2409.6ms) SELECT "feedbacks".* FROM "feedbacks" WHERE ("feedbacks"."comment" NOT IN ('', 'NA', 'N/A', 'not applicable'))
![Page 5: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/5.jpg)
Highest Scoring Talk with Valid Comments
Feedback.filled_out \ .select('talk_id, avg(score) as overall_score') \ .group('talk_id').order('overall_score desc') \ .limit(10)
# 665msSELECT talk_id, avg(score) as overall_score FROM "feedbacks" WHERE ("feedbacks"."comment" NOT IN ('', 'NA', 'N/A', 'not applicable')) GROUP BY talk_id ORDER BY overall_score desc LIMIT 10;
![Page 6: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/6.jpg)
Highest Scoring Talk with Valid Comments
results = Feedback.filled_out \ .select('talk_id, avg(score) as overall_score') \ .group('talk_id').order('overall_score desc') \ .limit(10)
results.first.inspect
=> "#<Feedback id: nil, talk_id: 24510>"
![Page 7: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/7.jpg)
Highest Scoring Talks in PA by Authors named Parker
Feedback.filled_out.joins(talk: [:author, { club: :city }] ) \ .select('feedbacks.talk_id, avg(feedbacks.score) as overall_score') \ .where("cities.state_abbr = ?", 'PA') \ .where("authors.name LIKE '%?%'", 'Parker') \ .group('feedbacks.talk_id') \ .order('overall_score desc') \ .limit(10)
# 665msSELECT feedbacks.talk_id, avg(feedbacks.score) as overall_score FROM "feedbacks" INNER JOIN "talks" ON "talks"."id" = "feedbacks"."talk_id" INNER JOIN "authors" ON "authors"."id" = "talks"."author_id" INNER JOIN "clubs" ON "clubs"."id" = "talks"."club_id" INNER JOIN "cities" ON "cities"."id" = "clubs"."city_id" WHERE ("feedbacks"."comment" NOT IN ('', 'NA', 'N/A', 'not applicable')) AND (cities.state_abbr = 'PA') AND (authors.name LIKE '%Parker%') GROUP BY feedbacks.talk_id ORDER BY overall_score desc LIMIT 10;
![Page 8: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/8.jpg)
What’s Wrong with these Examples?
• Long ugly queries
• Slow queries are bad for Web Applications
• Fighting ActiveRecord framework
• SQL aggregates are difficult to access
• Returned object no longer corresponds to model
• Repetitive code to setup joins and Filter invalid comments
![Page 9: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/9.jpg)
Using Database Views
to clean up ActiveRecord models
![Page 10: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/10.jpg)
Views• Uses a stored / pre-defined Query
• Uses live data from corresponding tables when queried
• Can reference data from many tables
• Great for hiding complex SQL statements
• Allows you to push functionality to database
![Page 11: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/11.jpg)
But this is a Ruby talk!
![Page 12: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/12.jpg)
class CreateTalkView < ActiveRecord::Migration def up connection.execute <<-SQL CREATE VIEW v_talks_report AS SELECT cities.id as city_id, cities.name as city_name, cities.state_abbr as state_abbr, technologies.id as technology_id, clubs.id as club_id, clubs.name as club_name, talks.id as talk_id, talks.name as talk_name, authors.id as author_id, authors.name as author_name, feedback_agg.overall_score as overall_score FROM ( SELECT talk_id, avg(score) as overall_score FROM feedbacks WHERE feedbacks.comment NOT IN ('', 'NA', 'N/A', 'not applicable') GROUP BY talk_id ) as feedback_agg INNER JOIN talks ON feedback_agg.talk_id = talks.id INNER JOIN authors ON talks.author_id = authors.id INNER JOIN clubs ON talks.club_id = clubs.id INNER JOIN cities ON clubs.city_id = cities.id INNER JOIN technologies ON clubs.technology_id = technologies.id SQL end
def down connection.execute 'DROP VIEW IF EXISTS v_talks_report' endend
![Page 13: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/13.jpg)
Encapsulate in ActiveRecord Modelclass TalkReport < ActiveRecord::Base # Use associations just like any other ActiveRecord object belongs_to :author belongs_to :talk belongs_to :club belongs_to :city belongs_to :technology # take advantage of talks has_many relationship delegate :feedbacks, to: :talk
self.table_name = 'v_talks_report'
# views cannot be changed since they are virtual def readonly true endend
![Page 14: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/14.jpg)
Highest Scoring Talks with Valid Comments
results = TalkReport.order(overall_score: :desc) \ .limit(10)
results.first.inspect => "#<TalkReport city_id: 16, city_name: \"Burlington\", state_abbr: \"VT\", technology_id: 2, club_id: 73, club_name: \"Python - Burlington\", talk_id: 6508, talk_name: \"The SQL protocol is down, reboot the optical syste...\", author_id: 100, author_name: \"Jerrell Gleichner\", overall_score: #<BigDecimal:7ff3f80f1678,'0.4E1',9(27)>>"
![Page 15: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/15.jpg)
Highest Scoring Talks in PA by Authors named Parker
TalkReport.where(state_abbr: 'PA') \ .where("author_name LIKE '%Parker%'") \ .order(overall_score: :desc).limit(10)
![Page 16: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/16.jpg)
Using Materialized Views
to Improve Performance
![Page 17: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/17.jpg)
Materialized Views• Acts similar to a Database View, but results persist for
future queries
• Creates a table on disk with the Result set
• Can be indexed
• Ideal for capturing frequently used joins and aggregations
• Allows optimization of tables for updating and Materialized Views for reporting
• Must be refreshed to be updated with most recent data
![Page 18: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/18.jpg)
class CreateTalkReportMv < ActiveRecord::Migration def up connection.execute <<-SQL CREATE MATERIALIZED VIEW mv_talks_report AS SELECT cities.id as city_id, cities.name as city_name, cities.state_abbr as state_abbr, technologies.id as technology_id, clubs.id as club_id, clubs.name as club_name, talks.id as talk_id, talks.name as talk_name, authors.id as author_id, authors.name as author_name, feedback_agg.overall_score as overall_score FROM ( SELECT talk_id, avg(score) as overall_score FROM feedbacks WHERE feedbacks.comment NOT IN ('', 'NA', 'N/A', 'not applicable') GROUP BY talk_id ) as feedback_agg INNER JOIN talks ON feedback_agg.talk_id = talks.id INNER JOIN authors ON talks.author_id = authors.id INNER JOIN clubs ON talks.club_id = clubs.id INNER JOIN cities ON clubs.city_id = cities.id INNER JOIN technologies ON clubs.technology_id = technologies.id; CREATE INDEX ON mv_talks_report (overall_score); SQL end
def down connection.execute 'DROP MATERIALIZED VIEW IF EXISTS mv_talks_report' endend
![Page 19: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/19.jpg)
ActiveRecord Model Changeclass TalkReport < ActiveRecord::Base # No changes to associations belongs_to …
self.table_name = 'mv_talks_report'
def self.repopulate connection.execute("REFRESH MATERIALIZED VIEW #{table_name}") end # Materialized Views cannot be modified def readonly true endend
![Page 20: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/20.jpg)
Highest Scoring Talks99% reduction in runtime
577msFeedback.filled_out \ .select('talk_id, avg(score) as score') \ .group('talk_id').order('score desc').limit(10)
1msTalkReport.order(overall_score: :desc).limit(10)
![Page 21: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/21.jpg)
Highest Scoring Talks in PA by Authors named “Parker”
400msFeedback.filled_out.joins(talk: [:author, { club: :city }] ) \ .select('feedbacks.talk_id, avg(feedbacks.score) as overall_score') \ .where("cities.state_abbr = ?", 'PA') \ .where("authors.name LIKE '%?%'", 'Parker') \ .group('feedbacks.talk_id') \ .order('overall_score desc') \ .limit(10)
19msTalkReport.where(state_abbr: 'PA') \ .where("author_name LIKE '%?%'", 'Parker') \ .order(overall_score: :desc).limit(10)
95% reduction in runtime
![Page 22: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/22.jpg)
Why Use Materialized Views in your Rails application?
• ActiveRecord models allow for easy representation in Rails
• Capture commonly used joins / filters
• Allows for fast, live filtering and sorting of complex associations or calculated fields
• Push data intensive processing out of Ruby to Database
• Make use of advanced Database functions
• Can Optimize Indexes for reporting only
• When Performance is more important than Storage
![Page 23: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/23.jpg)
Downsides
• Requires PostgreSQL 9.3
• Entire Materialized View must be refreshed to update
• Bad when Live Data is required
• For this use case, roll your own Materialized View using standard tables
![Page 24: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/24.jpg)
Downsides• Migrations are painful!
• Recommend writing in SQL, so no using scopes
• Entire Materialized View must be dropped and redefined for any changes to the View or referring tables
• Hard to read and track what changed
![Page 25: PostgreSQL Materialized Views with Active Record](https://reader033.vdocuments.us/reader033/viewer/2022042516/557632fed8b42a015c8b4c7f/html5/thumbnails/25.jpg)
Resources
• Source Code used in talk
• https://github.com/droberts84/materialized-view-demo
• PostgreSQL Documentation
• https://wiki.postgresql.org/wiki/Materialized_Views