Formal Models for Web Navigations with Session Control and Browser Cache Jessica Chen & Xiaoshan Zhao Univ. of Windsor, Canada

Post on 21-Dec-2015



Motivation

• advances of emerging web technologies
  – examples:
    • dynamic web pages
    • session/cookie techniques
    • web services
  – provide better performance, transparency, expressiveness, etc.
  – enable diversity and intensive use of web systems

Motivation

• advances of emerging web technologies
  – also raise important issues and pose additional difficulties for testing/verification of web applications built on them

Motivation

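• example: re-submit phenomenon

[Figure: re-submit example. Page A shows "account balance: 2,330.35" and a "pay bill" form. Submit leads from Page A to Page B; Back returns to Page A, where the browser asks to confirm ReSubmit.]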

Motivation

• example: re-submit phenomenon
  – this phenomenon does not occur when the web browser is properly configured
    • the user goes through a sequence of given links to reach A again
    • requests for some pages should always be sent to the server side, and the pages re-generated
    • along the navigated pages, the user should normally be able to observe updated information, e.g. account balance

Motivation

• source of problem:
  – the users can interact not only with the web pages but also with the web browser itself
  – user's actions through the interface of the web browser may affect the overall navigation of the web pages, which can be quite sensitive to the security of the information they carry.

Motivation

• what it suggests: our formal model, used either for static analysis or dynamic testing, should subsume the abstract behavior of the supporting environment of web applications.
  – example: user's possible interactions with the web browser should also be modeled and reflected in design spec.
• behavior of the web browsers depends on
  – how web browser is implemented, configured
  – cookie enabled? etc.
• web developers have access to the configurations
• but too demanding for integrated and abstract model

Motivation

• what we can do
  – formal modeling
  – automated translation into certain formal languages
• what is required
  – in-depth knowledge about web server/browser
• what is discussed here
  – dynamic web page generation
  – browser cache
  – session/cookie mechanism

Dynamic Web Page Generation

• web applications
  – procedures to generate the web pages
    • static page template: predefines layout, format, and static content
    • some code to generate dynamic content
      – dynamic links
      – dynamic texts: symbolic and finite output to be checked.
  – page template: a finite set of predefined pages
  – navigation among pages: determined by requested URL
    • a hyperlink user has chosen
    • some possible input.

Modeling of Navigations

• assume page navigation diagrams
  – show all desired/possible navigations among web pages.
  – fixed number of hyperlinks.
    • only closed systems: hyperlinks all point to pages within the same web application.
    • an open system can be modeled similarly by adding one page that represents all the internet web pages beyond those in the application under consideration.
  – fixed actions: user's input + hyperlink determine the page to be generated
    • the same hyperlink with different inputs (such as correct/incorrect password) as search parameters may lead to different pages
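As a rough illustration of the page navigation diagram described above, the sketch below stores it as a mapping from (page, action) to the next page, where an action stands for a hyperlink together with the user's input. The page ids and action names are hypothetical and are not taken from the slides.

```python
from typing import Dict, Set, Tuple

# Hypothetical diagram: the same "submit" hyperlink leads to different pages
# depending on the input (correct or incorrect password).
NAV: Dict[Tuple[str, str], str] = {
    ("login", "submit_ok"): "account",
    ("login", "submit_bad"): "login_failed",
    ("login_failed", "retry"): "login",
    ("account", "pay_bill"): "receipt",
}


def next_page(page: str, action: str) -> str:
    """Return the page generated when `action` is taken on `page`."""
    try:
        return NAV[(page, action)]
    except KeyError:
        raise ValueError(f"no link {action!r} on page {page!r}") from None


def is_closed(nav: Dict[Tuple[str, str], str], pages: Set[str]) -> bool:
    """Closed system: every link starts and ends inside the application's pages."""
    return all(src in pages and dst in pages for (src, _), dst in nav.items())
```

An open system could be handled in the same structure by adding one extra page id that stands for everything outside the application.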

Modeling of Navigation

• assume unique page id
• assume entry pages identified in design spec.:
  – pages uniquely determined by host name + path name, no search parameters
  – direct access by typing in host + path names, without going through hyperlinks
  – example of entry pages: home page of a web site
  – to access non-entry pages: follow page navigation diagram

Abstraction of Web Browser

• source of abstraction:
  – implementations of existing commercial web browsers, e.g. Microsoft's Internet Explorer, Netscape's Navigator
  – some online documents

Web Browser Architecture

[Figure: web browser architecture. Components inside the Web Browser: User Interface, HTML Parser, Object Model, Presentation Module, History stack, Browser Cache, Cache Management, HTTP Module, Network Interface. The browser sends a Request (URL) over the Internet to Web Server 1, Web Server 2, ..., Web Server N and receives a Response.]

History Stack

[Figure: history stack holding page ids (A, B, C, D), with a top pointer, a current pointer, and a bottom pointer, shown in three situations: currently in A, back to B, navigate to D.]
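The history stack can be sketched as a list with a current pointer. This is a minimal illustration, not the authors' definition: the class and method names are hypothetical, and the choice to drop all forward entries when a new page is visited is an assumption based on how common browsers behave.

```python
from typing import List, Optional


class HistoryStack:
    """List of page ids; index 0 is the bottom, the last index is the top."""

    def __init__(self, first_page: str) -> None:
        self.pages: List[str] = [first_page]
        self.current: int = 0  # current pointer

    def navigate(self, page: str) -> None:
        """Visit a new page: drop entries above the current one (assumed), then push."""
        del self.pages[self.current + 1:]
        self.pages.append(page)
        self.current += 1

    def back(self) -> Optional[str]:
        """Move the current pointer down; None if already at the bottom."""
        if self.current == 0:
            return None
        self.current -= 1
        return self.pages[self.current]

    def forward(self) -> Optional[str]:
        """Move the current pointer up; None if already at the top."""
        if self.current == len(self.pages) - 1:
            return None
        self.current += 1
        return self.pages[self.current]
```

For example, after visiting A, B, C, going back once and then navigating to D leaves the stack as A, B, D, with no forward entry left.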

Web Local Cache

• cache settings (inside browser)
  – examples:
    • automatically: all cacheable web pages are valid
    • per session: in current session, all cacheable web pages are valid
• HTTP cache controls (associated with received web pages)
  – a web page
    • cacheable?
    • valid period?
  – expire time
    • in header part of HTTP response message
    • in web page itself with META tag.
  – uncacheable: its expire time is the same as the time it is created.

Modeling of Caching Settings

• assume the given setting is "automatically"
• a web page is cacheable if the cacheable setting is included in the header part of the HTTP message containing the web page.
• for HTTP cache control, expire time is not modeled:
  – if a page is cacheable, the page is always fresh.
• each page is associated with an attribute for cacheability.
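Under these assumptions (no expire time, a cacheable page is always fresh once stored), the cache reduces to a set of page ids. The following is a small sketch of that idea; the function name `fetch` and the page ids in the usage note are hypothetical.

```python
from typing import FrozenSet, Tuple


def fetch(page: str, cacheable: FrozenSet[str],
          cached: FrozenSet[str]) -> Tuple[str, FrozenSet[str]]:
    """Return (source, new cache contents) for a request of `page`.

    source is "cache" if the page is served locally, "fresh" if it comes
    from the origin server. Expire times are deliberately not modeled.
    """
    if page in cached:
        return "cache", cached
    if page in cacheable:
        # Served fresh now, stored for later requests.
        return "fresh", cached | {page}
    # Uncacheable pages are always requested from the server.
    return "fresh", cached
```

A first request for a cacheable page such as "home" returns "fresh" and stores it; a second request returns "cache". An uncacheable page returns "fresh" every time and leaves the cache unchanged.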

Session Control

• most web applications need to maintain communication sessions with their client browsers, and monitor each client's individual status and activities.
  – example: an online banking system should maintain a communication session with a specific user during the time the user has logged in (and not yet logged out).
  – HTTP is stateless, no functionality on session control.
  – cookie technique: a solution

Modeling of Session Control

• only sessions for authentication
• no consideration of the relationship between cookie and dynamic content/link.
• assume it is given whether a page is secure or not.
  – secure page: should always be accessed with authentication session open
  – all entry pages are by nature not secure.
• two special actions SignIn and SignOut: session will remain open for the consecutive accesses to secure pages until SignOut is performed.
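The session assumptions above can be sketched with a single boolean flag. This is an illustration only: the function name `visit` and the page ids are hypothetical, and the sketch assumes that the SignIn action stands for a successful sign-in.

```python
from typing import FrozenSet, Tuple

ERR = "err"  # special error page


def visit(target: str, action: str, secure_pages: FrozenSet[str],
          session_open: bool) -> Tuple[str, bool]:
    """Return (page shown, session flag) after following `action` to `target`."""
    if action == "SignIn":
        session_open = True   # assumption: SignIn means a successful sign-in
    elif action == "SignOut":
        session_open = False
    # A secure page may only be shown while the authentication session is open.
    if target in secure_pages and not session_open:
        return ERR, session_open
    return target, session_open
```

With "account" as the only secure page, reaching it through SignIn opens the session, later accesses succeed while the flag stays true, and any access after SignOut ends on the error page.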

The Integrated Model

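• given a design specification $(P, EP, SP, CP, A, \rightarrow)$
  – $P$, $EP$, $SP$, $CP$: finite sets of ids (all pages, entry pages, secure pages, cacheable pages)
  – $A$: finite set of symbols for user's actions, including SignIn and SignOut
  – $\rightarrow \subseteq P \times A \times P$: navigation relation
• construct an integrated model as an l.t.s. (labeled transition system)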

The Integrated Model: states

• a page id to denote the current page
  – an additional error page err: reached, for example, when attempting to access a secure page without an open session.
• a history stack variable for the current status of the URLs contained in the history stack.
  – since in our case there is a one-to-one relationship between a URL and a page id, it is a stack of page ids
• a variable holding a set of page ids to denote the current status of the locally cached pages.
• a boolean variable to denote whether the authentication session is currently open.

The Integrated Model: labels

$L = \{(l, f) \mid l \in A \cup \{\mathit{back}, \mathit{forward}, \mathit{entry}, \mathit{err}\},\ f \in \{\mathit{fresh}, \mathit{cache}\}\}$

  – entry: user types in the URL
  – back, forward, entry are from browser's interface
  – err: navigation is directed to a special error page err
  – fresh/cache: whether the accessed page is from the origin server or from the local cache.
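Read as a set of pairs, the label set combines each user or browser action with the source of the page. A short sketch, with a hypothetical action set A:

```python
from itertools import product
from typing import Iterable, Set, Tuple

BROWSER_ACTIONS = {"back", "forward", "entry", "err"}
SOURCES = ("fresh", "cache")


def labels(user_actions: Iterable[str]) -> Set[Tuple[str, str]]:
    """Pair every action in A or the browser actions with fresh or cache."""
    return set(product(set(user_actions) | BROWSER_ACTIONS, SOURCES))


A = {"SignIn", "SignOut", "pay_bill"}  # hypothetical action symbols
L = labels(A)
```

The size of the label set is twice the number of distinct actions, so it stays small for a finite A.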

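The Integrated Model: structural rules

• 20 structural rules; example rule for (back, fresh):

$$
\frac{\mathit{previous}(hs) = q \ \wedge\ q \notin CP \ \wedge\ \big(q \notin SP \ \vee\ (q \in SP \wedge \mathit{guard} = \mathit{true})\big)}
     {\langle p, hs, lc, \mathit{guard} \rangle \xrightarrow{(\mathit{back},\ \mathit{fresh})} \langle q, \mathit{moveBack}(hs), lc, \mathit{guard} \rangle}
$$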
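One possible reading of this rule, sketched in code: going back to the previous page q yields a fresh copy when q is not cacheable and is either not secure or secure with the session open; the cache and the session flag are unchanged. The encoding of the history stack as a tuple plus an index, and all function names, are assumptions made for illustration.

```python
from typing import FrozenSet, NamedTuple, Optional, Tuple

History = Tuple[Tuple[str, ...], int]  # (page ids bottom-to-top, current index)


class State(NamedTuple):
    page: str
    hs: History
    lc: FrozenSet[str]   # locally cached page ids
    guard: bool          # authentication session open?


def previous(hs: History) -> Optional[str]:
    pages, cur = hs
    return pages[cur - 1] if cur > 0 else None


def move_back(hs: History) -> History:
    pages, cur = hs
    return pages, cur - 1


def back_fresh(s: State, cp: FrozenSet[str], sp: FrozenSet[str]):
    """Apply the (back, fresh) rule; return (label, next state) or None."""
    q = previous(s.hs)
    if q is None or q in cp:        # premise: q exists and is not cacheable
        return None
    if q in sp and not s.guard:     # premise: secure pages need an open session
        return None
    return ("back", "fresh"), State(q, move_back(s.hs), s.lc, s.guard)
```

Cases where the premises fail would be covered by other rules of the model, which this sketch does not attempt to reproduce.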

The Integrated Model

• the obtained l.t.s. is deterministic.
  – the given page navigation diagram is deterministic: from each page, the input link uniquely determines the next page.
  – for each state and each given action, there is exactly one rule to apply to derive the next state.
• no nondeterminism involved, so trace equivalence can be applied for reduction.
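The determinism assumption on the page navigation diagram can be checked mechanically. The sketch below takes the navigation relation as a collection of (page, action, next page) triples and reports any (page, action) pair with more than one target; the function names and sample triples are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (page, action, next page)


def conflicting_links(relation: Iterable[Triple]) -> Dict[Tuple[str, str], Set[str]]:
    """Return the (page, action) pairs that lead to more than one page."""
    targets: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for page, action, nxt in relation:
        targets[(page, action)].add(nxt)
    return {key: dests for key, dests in targets.items() if len(dests) > 1}


def is_deterministic(relation: Iterable[Triple]) -> bool:
    """True if every (page, action) pair determines a unique next page."""
    return not conflicting_links(relation)
```

A diagram that fails this check would need its actions refined, for instance by splitting a link according to the user's input, before the model described here applies.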

Recent & Related Work

• what we can do
  – formal modeling
  – automated translation into certain formal languages
    • integrated model in Extended FSMs and Promela
• related work
  – some modeling/translators to use SPIN and SMV [Sciascio03, Haydar04]
  – baseline testing strategy [Lucca03]
    • for each navigation path, create a testing model with back/forward, not a complete model
    • no session control, no detailed modeling of caching

Final Remarks

• to adopt formal methods into a specific app. domain
  – a deep understanding of the domain itself
  – a careful analysis of specific problems
  – a proper selection of aspects to be modeled
• state space problem: keep the models reasonable for analysis by abstracting away as many unrelated details as possible.
• only considered navigation behavior influenced by session control and browser caching mechanism.
• the model can be modified as needed or extended to cover more features
  – web caching model based on a certain setting of browser cache: serves as an example (not much effort to tailor it to other settings)
  – information carried on l.t.s. labels can be defined in different ways, according to the properties to be checked.
  – currently no consideration of dynamic links

Thank You

Formal Model

• based on info. in the page navigation diagram, we define an l.t.s.
  – provides a formal basis for better understanding of the web system itself
  – lays the groundwork for both model checking and specification-based testing of web applications, where we take into account the effect of the internal session control and caching mechanism on correct web page navigation.

Motivation

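• example: unauthorized access phenomenon

[Figure: unauthorized access example. Page A (insecure page): User Name, Password, Sign In. Page B (secure page): "credit card: ...", Sign Out. Page C (insecure page). Transitions between the pages are labeled Submit, Submit, and Back.]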

Motivation

• example: unauthorized access phenomenon
  – this phenomenon can be avoided, for example, if page B is defined as un-cacheable.
  – it does not appear in all applications
    • appears in the U Windsor Web Mail System
    • does not appear in Microsoft's Hotmail.