henrietta dombrovskaya - a new approach to resolve object-relational impedance mismatch @ postgres...

72
A New Approach to Resolve Object-Relational Impedance Mismatch Henrietta Dombrovskaya Enova Chicago IL [email protected]

Upload: postgresopen

Post on 30-May-2015

777 views

Category:

Technology


1 download

TRANSCRIPT

Page 1: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

A New Approach to Resolve Object-Relational Impedance Mismatch

Henrietta Dombrovskaya Enova Chicago IL [email protected]

Page 2: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

What is Object-Relational Impedance Mismatch and why It is

bad? Why should we care?

We should care because we care

about..

2013 Henrietta Dombrovskaya – Enova 2  

Page 3: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Database Application Performance

Everybody wants their application to function efficiently, right? After all – why choose to use databases in the first place? Because The DBMS is specialized software designed to manage data in the most efficient way. Nevertheless, the most common complaint of application developers is …. 2013 Henrietta Dombrovskaya – Enova 3  

Page 4: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

THE DATABASE IS

SLOW

WHY??? 2013 Henrietta Dombrovskaya – Enova 4  

Page 5: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Let’s see…

Where do we usually start to look, if we want

to see, why the database performance is bad?...

2013 Henrietta Dombrovskaya – Enova 5  

Page 6: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Our plan of actions

Look at the most long-running queries and try to optimize them

Look at the queries, which are executed most often, and make sure all of them are tuned and/or running fast enough

Everything looks good, why is the app still slow?

2013 Henrietta Dombrovskaya – Enova 6

Page 7: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Helpful reports:

2013 Henrietta Dombrovskaya – Enova 7

Our performance monitoring tool (pgBadger) produces several classes of reports, among them: Longest-running queries (slowest queries)

Most frequently running queries Queries, which take up the most time

(top offenders).

Page 8: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Based on these reports - is a database

really slow? The top 50 most offensive queries, i.e. the queries

which eat up the most of the execution time, almost exclusively are the queries, which are very short (fast), but executed too many times For example

SELECT  *  FROM  loans  WHERE  id=?  

  is executed 8,500,000 times during the day, each time taking several milliseconds, with total execution time about 2.5 hours.

2013 Henrietta Dombrovskaya – Enova 8  

Page 9: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

You may say – that’s how many times the

screen is refreshed, but…

UK: account home controller as accessed about 50,000 times during the day, and the customer support tool even less

However, the pgBadger log shows total almost 1.5 million executions of different versions of the

SELECT sum_accounts     statement and similar number of  SELECT*  FROM  loans. Statistics show that some application controllers produce

over 1,000 database calls for each screen refresh.

2013 Henrietta Dombrovskaya – Enova

Page 10: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

How could this possibly be happening?! Let’s take a step back…

2013 Henrietta Dombrovskaya – Enova 10

Page 11: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

First came a program…

•  Once upon a time there was a program… •  Until mid-60’s – only sequential access storage was available; a program was

considered a stable object, which may process different input data, and produce different output

Input Output •  Direct access storage appeared in the early 70s, allowing concurrent data access and

selective updates. At the same time developers started to face rising complexity and costs of software development, concurrency control and data integrity problems, and a need of specialized data-management software became apparent.

•  DBMS emerged as specialized programs for centralized data management

2013 Henrietta Dombrovskaya – Enova

Program

11  

DBMS

Page 12: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Since Then We Have…

-Imperative programming languages, which tells, how to do things

for  (i:=1,  i++,  n)  do  …  end;  

and -  Declarative data manipulating languages, which define

what to do: SELECT    first_name,  last_name  FROM  people        WHERE  id=101      

2013 Henrietta Dombrovskaya – Enova 12

Page 13: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Database provides: •  Independent data storage •  Complex data structures •  Data integrity •  Concurrent access and data consistency Application provides: •  User interface •  Application control flow 2013 Henrietta Dombrovskaya – Enova 13  

Page 14: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

So… Now everybody is happy, right? …not really…

2013 Henrietta Dombrovskaya – Enova 14  

Page 15: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Why? Because…

Both imperative programming languages and declarative query languages work perfectly to accomplish the tasks the were designed to accomplish. The problems start, when we try to make them to work together.

2013 Henrietta Dombrovskaya – Enova 15  

Page 16: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

What is Impedance Mismatch?

These tools were designed to operate with objects of different size (granularity): variables vs. tables And different means of identification: address vs. set of attribute values More precisely: •  Object incompatibility - database objects are “too big” for the

imperative programming language •  This means that classes and methods for tables and sets should be

integrated with the database to work effectively (methods should be executed by the DB engine)

•  But most of the architectures do not allow such integration, which leads to reimplementation of database operations at the application layer

2013 Henrietta Dombrovskaya – Enova 16  

Page 17: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

When We Have an OO Application…

•  During one screen rendering: objects may belong to different classes with different access methods

•  Object-Relational Mapping (ORM): maps a database object to the in-memory application object: –  solves a problem of abstraction from details of data storing –  does not provide effective means of manipulation with data sets

This particular case of impedance mismatch is called ORIM – object-relational impedance mismatch

2013 Henrietta Dombrovskaya – Enova 17  

Page 18: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

ORIM Definition

“A set of conceptual and technical difficulties that is encountered when a relational database management system (RDBMS) is being used by a program written in an object-oriented programming language or style, particularly apparent when objects or class definitions are mapped in a straightforward way to database tables or relational schema”

Ambler, S., Agile Database Techniques: Effective Strategies for the Agile Software Developer

2013 Henrietta Dombrovskaya – Enova 18  

Page 19: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Development Standards & Performance

Internal company standards and/or industry standards may prevent from creating efficient database application.

Typical OO design recommendations - clearly separate the following layers of classes:

•  End-user interface •  Business logic •  Relationships •  Interaction with the database The standard allows limited interaction between adjacent

levels, therefore equivalents of database operations may be implemented within several different processes, sometimes on different machines.

2013 Henrietta Dombrovskaya – Enova 19  

Page 20: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Translation:

•  As a database developer you would assume, that to get all payments for all loans for a given customer, you need to join a loans table with the payments table

•  But for an application developer loans and payments belong to different classes, therefore they are accessed by different sets of methods

•  Which means, that the join operation is replicated “manually”..

Let’s look at some examples…

2013 Henrietta Dombrovskaya – Enova 20  

Page 21: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

What We Have at Enova?

Enova is a Ruby on Rails shop that uses the ActiveRecord Object Relational Mapping (ORM) library to communicate with a Postgres database. With ActiveRecord a database table or view is mapped into a class and an object instance is tied to a single row in the table. ActiveRecord is the default “model” component of the model-view-controller web-application framework Ruby on Rails, and is also a stand-alone ORM package for other Ruby applications.

2013 Henrietta Dombrovskaya – Enova 21  

Page 22: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

How ActiveRecord Works

2013 Henrietta Dombrovskaya – Enova 22  

Page 23: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

What This Means for Application/

Database Interaction Due to the lack of awareness of the underlying database interaction on the part of the object methods, one controller performs multiple trips to the database For example…

2013 Henrietta Dombrovskaya – Enova 23  

Page 24: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Account Presenter

def initialize(customer) @customer = customer @customer_extra = customer.customer_extra @person = customer.person @address = customer.person.try(:address) @company = customer.person.try(:company) @bank_account = customer.bank_account(true) @debit_card = customer.debit_card @customer_paydate = customer.customer_paydate(true) @paydate_schedule = customer.customer_paydate.try(:paydate_schedule) @customer_source = customer.customer_source end

2013 Henrietta Dombrovskaya – Enova 24  

Page 25: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Corresponding Application Log

SELECT  *  FROM  customers                WHERE  (customers.id  =  12470535)      SELECT  *  FROM  people            WHERE  (people.id=  61657007  AND      (type  =  'CustomerPerson'))  AND  (  (people.type  =  'CustomerPerson'  )  );    SELECT  addresses.*,  people_addresses.serial_number  FROM  addresses                  INNER  JOIN  people_addresses  ON  addresses.id  =                                    people_addresses.address_id            

 WHERE  (people_addresses.person_id  =  61657007                                    AND  (eff_end_date  is  NULL));    2013 Henrietta Dombrovskaya – Enova 25  

Page 26: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

But Wait, There’s More!

SELECT  *  FROM  approvals  WHERE  (customer_id  =  12470535)  ORDER  BY  processed_on  desc  LIMIT  1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  SELECT  *  FROM  customers  WHERE  (customers.id  =  12470535)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    SELECT  *  FROM  loans  WHERE  (loans.id  =  25563928)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        SELECT  *  FROM  loans  WHERE  (customer_id  =  12470535  and  status_cd  in  ('applied','approved','on_hold'))  ORDER  BY  funding_date  DESC  LIMIT  1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            SELECT  *  FROM  loans  WHERE  (customer_id  =  12470535  and  status_cd  in  ('applied','approved',E  'on_hold'))  ORDER  BY  funding_date  DESC  LIMIT  1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      SELECT  *  FROM  loans  WHERE  (customer_id  =  12470535  and  status_cd  in  (‘issued','issued_pmt_proc'))  ORDER  BY  funding_date  DESC  LIMIT  1    SELECT  *  FROM  loans  WHERE  (customer_id  =  12470535  and  status_cd  in  ('applied','approved',E'on_hold'))  ORDER  BY  funding_date  DESC  LIMIT  1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      SELECT  *  FROM  customers  WHERE  (customers.person_id  =  61657007)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          SELECT  count(*)  AS  count_all  FROM  loans  WHERE  (loans.customer_id  =  12470535  AND  (status_cd  in  ('applied','approved','on_hold','issued','issued_pmt_proc')  and  loan_type_cd  =  'installment'))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    SELECT  *  FROM  loans  WHERE  (customer_id  =  12470535  and  status_cd  in  ('applied','approved','on_hold'))  ORDER  BY  funding_date  DESC  LIMIT  1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          SELECT  *  FROM  loans  WHERE  (customer_id  =  12470535  and  status_cd  in  ('applied','approved','on_hold'))  ORDER  BY  funding_date  DESC  LIMIT  1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          2013 Henrietta Dombrovskaya – Enova 26  

Page 27: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

What We Should See Instead…

2013 Henrietta Dombrovskaya – Enova

SELECT  *    FROM  customers    WHERE  (customers."id"  =  12470535)      SELECT  *    FROM  people    WHERE  (people."id"  =  61657007    AND  (type  =  'CustomerPerson'))    AND  (  (people."type"    =  'CustomerPerson'  )  );  

SELECT  addresses.*,  people_addresses.serial_number    FROM  addresses    INNER  JOIN  people_addresses    ON  addresses.id  =  people_addresses.address_id    WHERE  (people_addresses.person_id  =  61657007    AND  (eff_end_date  is  NULL));      

27  

Page 28: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Why Does This Happen?

… because the app is written in an imperative way:

“Fist we need to select all loans, then for each loan we need to select…..”

2013 Henrietta Dombrovskaya – Enova 28  

Page 29: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

There Are Some Ways To Improve

Eager loading generates almost correct SQL: SELECT customers."id”

,customers."created_by” <..>,customers."person_id" <..>,customers_extra."estimated_monthly_living_expense" <..>,people."type"

<...>,addresses."id"<...> ,companies."id" <...>

FROM customers LEFT OUTER JOIN customers_extra ON customers_extra.customer_id = customers.id LEFT OUTER JOIN people ON people.id = customers.person_id AND people."type" = 'CustomerPerson’ LEFT OUTER JOIN people_addresses ON people_addresses.person_id = people.id LEFT OUTER JOIN addresses ON addresses.id = <…>

WHERE (customers."id" = 17674188)

2013 Henrietta Dombrovskaya – Enova 29  

Page 30: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

But Then We Apply the Next Method

has_many :bank_accounts, :foreign_key => 'person_id' do def default_for_customer(customer, reload = false) @bank_account_cache = {} if @bank_account_cache.nil? || reload @bank_account_cache[customer.brand.code] ||= find(:first, :joins => "INNER JOIN payment_instruments.defaults pid ON pid.payment_instrument_id = payment_instruments.bank_accounts.payment_instrument_id", :conditions => ["pid.customer_id = ?", customer], :readonly => false) end And we execute SELECT again! 2013 Henrietta Dombrovskaya – Enova 30  

Page 31: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

If We Continue With Existing Frameworks

… this problem (ORIM) will never be solved, and we will continue to loose money on timeouts!

2013 Henrietta Dombrovskaya – Enova 31

Page 32: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

New Approach to Resolve Object-Relational

Impedance Mismatch Our new approach allows drastically reduce the number of calls to the database, and get exactly the data we need. An application view can be rendered using one or two calls to the database instead of 500-900 call in the old app. In addition, it allows to optimize queries execution without any changes to the application and/or models.

2013 Henrietta Dombrovskaya – Enova 32  

Page 33: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

The old methods are unaware of the data which is already available – make them

data-aware!

Isn’t this contrary to the standard OO approach? Yes, but…

This is the only way to improve the App/DB interaction.

2013 Henrietta Dombrovskaya – Enova 33

Page 34: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Logic Split Methodology

Disassemble method into atomic steps,

Identify ones, which require data retrieval

Using the knowledge about database objects relationships, construct a single query

Execute

Use retrieved data in other steps

2013 Henrietta Dombrovskaya – Enova 34  

Page 35: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Example: Amount_Outstanding

2013 Henrietta Dombrovskaya – Enova

Page 36: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Definitions

AccountsOutstanding= AccountsUncollected + FeesOutstanding + InterestOutstanding + PrincipalAccounts + AccountsDue In turn: •  AccountsUncollected =

uncollected_principal + uncollected_installment_principal

2013 Henrietta Dombrovskaya – Enova Financial 36  

Page 37: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Under the Hood: Database Calls

SELECT      vl.value  AS  account                                              ,SUM(CASE  vl.value  WHEN  pt.debit_account_cd                                            THEN  pt.amount  ELSE  0  END)                                                  -­‐  SUM(CASE  vl.value  WHEN  pt.credit_account_cd                                                THEN  pt.amount  ELSE  0  END)  AS  sum                              FROM  payment_transactions  pt  JOIN  valuelists  vl  ON  vl.type_cd  ='transaction_account'          AND  vl.value  IN  (pt.debit_account_cd,  pt.credit_account_cd)  AND  loan_id=?  

… and then a value for specific account is selected. 2013 Henrietta Dombrovskaya – Enova 37  

Page 38: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Drawbacks

The method itself would allow retrieving all the information related to one loan “in one shot”. However, because the application developers are unaware of the underlying layers, there appears to be no difference:

–  Whether we obtain the values of all account balances one-by-one by following normal Object-Oriented method logic, in an imperative way

–  Or if we obtain them all, simultaneously

Now for some results:

2013 Henrietta Dombrovskaya – Enova 38  

Page 39: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

PGBadger Log

2013 Henrietta Dombrovskaya – Enova 39  

Page 40: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Modified Method

2013 Henrietta Dombrovskaya – Enova 40  

Page 41: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Under the Hood: Database Calls

SELECT    loan_id                                  ,  sum(CASE  WHEN  debit_account_cd  =  'uncollected_principal'                                            THEN  pt.amount  ELSE  0    END        -­‐CASE  WHEN  credit_account_cd='uncollected_principal'                                    THEN  pt.amount  ELSE  0    END)  AS  uncollected_principal          <...>  ,  sum(CASE  WHEN  debit_account_cd  =  'uncollected_nsf_fees'                                  THEN  pt.amount  ELSE  0    END    -­‐  CASE  WHEN  credit_account_cd  =  'uncollected_nsf_fees'                    THEN  pt.amount  ELSE  0    END)  AS  uncollected_nsf_fees                ,  sum(CASE  WHEN  debit_account_cd  =  'installment_principal'                                          THEN  pt.amount  ELSE  0    END      -­‐  CASE  WHEN  credit_account_cd  =  'installment_principal'                THEN  pt.amount  ELSE  0    END)  AS  installment_principal      FROM  payment_transactions_committed  pt                            INNER  JOIN  loans  l  ON  l.id=pt.loan_id      WHERE  customer_id={?}      GROUP  BY  loan_id      

2013 Henrietta Dombrovskaya – Enova 41  

Page 42: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Execution Statistics From Dark Testing: Old

2013 Henrietta Dombrovskaya – Enova 42  

Page 43: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

… and New

2013 Henrietta Dombrovskaya – Enova 43  

Page 44: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

So… now everybody happy?... …not really… What do application developers say at this point?.... Wait! What about the business logic?! 2013 Henrietta Dombrovskaya – Enova 44  

Page 45: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

What About the Business Logic?

We need some business logic to know how to execute joins and selects – just enough to minimize the number of db calls.

Selected results transformations and manipulations do not have to be executed on the database side.

2013 Henrietta Dombrovskaya – Enova 45  

Page 46: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

One More Time

ü Disassemble method into atomic steps, ü Identify ones which require data retrieval ü Using knowledge about database objects

relationships, construct a single query ü Execute ü Use retrieved data in other steps

2013 Henrietta Dombrovskaya – Enova 46  

Page 47: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Let’s review another example – account balance calculation for OEC loans.

2013 Henrietta Dombrovskaya – Enova 47  

Page 48: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Account_Balance Method for OEC

Ø Obtain account principal balance Ø Obtain outstanding fees and interest as of

next payment due date Ø Calculate the interest credit (unearned

interest) for the number of days left before the payment due date

Ø Obtain existing customer balance Ø Calculate the total account balance using the

values obtained on steps 1-4.

2013 Henrietta Dombrovskaya – Enova 48  

Page 49: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Traditional Object-Oriented Approach

Create Account_Balance method which would call the following methods:

Ø Principal_Balance, Ø Interest_Amount, Ø Fees_Amount, Ø Customer_Balance Ø  Interest_Credit

Each of these methods would interact with a database independently.

2013 Henrietta Dombrovskaya – Enova 49  

Page 50: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Drilling Down Into Each Of the Steps 1.  Principal balance as described for account_outstanding requires a

single database call. 2.  Outstanding interest and fees require one database call each. 3.  Interest credit calculation:

3.1. Obtain the daily interest rate for this customer 3.2. Obtain base amount, which is used to calculate the total interest 3.3. Obtain the number of days, for which the interest should be credited:

3.3.1. Obtain the next payment due date 3.3.2. Calculate number of days based on obtained date and todays’ date

3.4. Calculate amount of credit, based on results from the previous three steps

4.  Customer balance can be obtained using one database call, same as steps 1-3.

2013 Henrietta Dombrovskaya – Enova 50  

Page 51: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Combining Steps With Data Retrieval

Those steps are: 1,2,3, 3.1, 3.2, 3.3.1, 4, which means that the database accessing method should execute the following task: For a given loan, retrieve payment transactions, which show principal balance, current interest, fees and customer balance, also retrieve loan’s daily interest rate and next payment due date. 2013 Henrietta Dombrovskaya – Enova 51  

Page 52: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

SELECT statement

SELECT l.id AS loan_id ,sum( CASE WHEN debit_account_cd = ‘principal’ AND t.acct_date<= v_current_date THEN t.amount ELSE 0 END - CASE WHEN credit_account_cd =’principal’ AND t.acct_date<= v_current_date THEN t.amount ELSE 0 END ) AS amount_payable ,sum (CASE WHEN t.debit_account_cd=’fees_provisional’ THEN t.amount ELSE 0 END - CASE WHEN t.credit_account_cd= ‘fees_provisional’ THEN t.amount ELSE 0 END ) AS fees_provisional ,sum (CASE WHEN t.debit_account_cd=’ interest_provisional’ THEN t.amount ELSE 0 END - CASE WHEN t.credit_account_cd =’interest_provisional’ THEN t.amount ELSE 0 END ) AS interest_provisional ,st.end_date AS next_closing_date ,l.daily_rate AS interest_rate ,sum (CASE WHEN t.debit_account_cd ='customer_balance'

THEN -t.amount ELSE 0 END - CASE WHEN t.credit_account_cd ='customer_balance' THEN -t.amount ELSE 0 END ) AS customer_balance FROM loans l LEFT OUTER JOIN payment_transactions_committed t ON l.id=t.loan_id LEFT OUTER JOIN statements st ON l.id=st.loan_id WHERE l.id={?} GROUP BY l.id ,l.daily_rate ,st.end_date

2013 Henrietta Dombrovskaya – Enova 52  

Page 53: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Ruby Method: account_balance

def account_balance amount_payable = amount_outstanding + amount_charged_off amt = amount_payable - customer_balance provisional_interest = interest_provisional provisional_fees = fees_provisional if provisional_interest > 0.0 amt = amt + provisional_fees + [ provisional_interest - [unearned_interest, 0].max, 0 ].max else amt = amt + provisional_fees end amt = amt.round_near return [0, amt].max end

2013 Henrietta Dombrovskaya – Enova 53  

Page 54: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Ruby Method: unearned_interest

def unearned_interest amt = -1 * (principal_amount + [(customer_balance - interest_provisional - fees_provisional), 0].max) next_closing_date = Date.parse(self.next_closing_date) return 0 unless next_closing_date interest_period = [next_closing_date - Date.today + 1, 0].max interest_rate = oec_daily_rate.to_f interest = amt * interest_period * interest_rate interest = interest.round_down interest < 0.0 ? 0 : interest end

2013 Henrietta Dombrovskaya – Enova 54  

Page 55: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

How It Works

This methods is used as a callback in a more complex method, which returns Loan_Summary where most of the values can be selected directly from the database. For compliance reasons all calculations are executed in the Ruby model. We need a “virtual column” as one of the fields of the data output; the Ruby callback can be inserted into the data set here. The callback produces required value, and the whole data set is passed to the upper-level method for future processing and then to the web application. 2013 Henrietta Dombrovskaya – Enova 55  

Page 56: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

2013 Henrietta Dombrovskaya – Enova 56

LOAN_ID, FUNDING_DATE< FUNDING_AMOUNT FEE STATE APR LOAN_STATUS AMOUNT_DUE ACCOUNT_BALANCE PRINCIPAL_AMOUNT CUSTOMER_BALANCE INTEREST_PROVISIONAL <…> LANGUAGE_CD

ACCOUNT_BALANCE METHOD

Note, that all data elements can still be retrieved with a single select statement, which won’t be possible within the standard ORM framework.

Page 57: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

On a Larger Scale

In our new application a single Postgres function retrieves summaries for all customer loans. Here is some execution statistics:

2013 Henrietta Dombrovskaya – Enova Financial 57

Page 58: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Number of Database Calls/Customer

2013 Henrietta Dombrovskaya – Enova Financial 58  

1 2 3 4 5 6 7 8 9 Old App avg_calls 121 196 373 449 632 765 819 923 1129 New App avg_calls 4 4 4 4 4 4 4 4 4

0

200

400

600

800

1000

1200

Old App avg_calls New App avg_calls

Page 59: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Execution Time/Customer

2013 Henrietta Dombrovskaya – Enova Financial 59  

0

1

2

3

4

5

6

7

8

9

10

1 2 3 4 5 6 7 8 9

New App avg_time Old App avg_time

Page 60: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Now

•  We save both time

•  And money!

2013 Henrietta Dombrovskaya – Enova 60

Page 61: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Were we the first to notice the problem? Definitely NOT!

2013 Henrietta Dombrovskaya – Enova 61

Page 62: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Related Work

The problem of object-relational impedance mismatch is a constant discussion topic, when it comes to developing an efficient application. In recent years multiple attempts were made to try to resolve this issue with no significant outcome. 2013 Henrietta Dombrovskaya – Enova Financial 62  

Page 63: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

First Came HORA

HORA (Hybrid Object-Relational Architecture) approach was first introduced in 1993 and became a foundation of multiple ORM-based systems, such as:

•  Java Persistence •  ActiveJDBC •  ADO.NET •  Ruby on Rails ActiveRecord •  Many others.

All implementations have the same ORIM issues. 2013 Henrietta Dombrovskaya – Enova Financial 63  

Page 64: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Hybernate

Hybernate is a high-performance Object/Relational persistence and query service, considered one of the most flexible and powerful Object/Relational solutions on the market. •  Mapping from Java classes to database tables

and from Java data types to SQL data types. •  Significantly reducing development time •  Allowing the application developers concentrate

on the business side of the project. 2013 Henrietta Dombrovskaya – Enova Financial 64  

Page 65: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Hybernate Developers Claim

•  In contrast to other solutions it does not hide “the power of SQL” from developers.

•  This claim is true in some sense, since the

solution, indeed allows us to write queries similar to SQL queries.

2013 Henrietta Dombrovskaya – Enova Financial 65  

Page 66: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

The Caveat

•  Creation of complex queries using Hybernate is

not an easy task •  Similar to other ORM systems, Hybernate

prompts for solutions, which seem more natural for application developers

2013 Henrietta Dombrovskaya – Enova Financial 66  

Page 67: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Newer Version of Active Record

•  Allows Eager loading and some customer queries, but has the same limitations as Hybernate

•  Eager loading may cause an excessive application memory usage

2013 Henrietta Dombrovskaya – Enova Financial 67  

Page 68: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Agile Technology

•  acknowledges the existence of ORIM ; •  not only acknowledges the technical impedance

mismatch, but also a cultural impedance mismatch:

“The object-oriented paradigm is based on proven software engineering principles. The relational paradigm, however, is based on proven mathematical principles”.

•  while raising awareness of the problem, the solutions, proposed by Agile data technology refers mostly to the database schema changes and/or more careful design and data refactoring, but it offers little in terms of dealing with inefficient queries.

2013 Henrietta Dombrovskaya – Enova Financial 68  

Page 69: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

What’s next?

2013 Henrietta Dombrovskaya – Enova 69

Page 70: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Future Work

Rewriting a whole application using our new methodology of splitting logic between an application and a database; we are in the middle of the process. Problems:

–  Large amount of the legacy code, where most of the business logic is embedded in Ruby classes.

–  Virtually no business specifications, which means that we have to extract the business logic from the existing code.

–  The legacy application is evolving and existing models are being modified all the time.

–  The human factor

Because of this….

2013 Henrietta Dombrovskaya – Enova Financial 70  

Page 71: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Moving In Two Directions:

ü Continue to rewrite the larger parts of our applications, shooting for having only a couple of database queries per screen rendering;

ü While doing this, clarifying our technology, making it more transparent and easier to use by application developers.

2013 Henrietta Dombrovskaya – Enova Financial 71  

Page 72: Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedance Mismatch @ Postgres Open

Aknowledgements

Many thanks to: •  Enova CIO/CTO Fred Lee •  Team Raven, and everybody who worked on the Aperture

project during it’s lifetime:

•  Sheldon Strauch and the entire DB Dev team •  Chad Slaughter and Kurt Stephens •  Ben Heilman and Donny Jekels 2013 Henrietta Dombrovskaya – Enova Financial 72  

–  Cheetana Yogeesh –  Allan Zoppa –  Ana Lebon –  Kevin Glowacz –  Neha Bhardwaj

–  Jef Jonjevic –  Marc Groulx –  Richard Lee –  Preeti Dhiman –  Luda Baklanova