modernizing legacy applications in php (tutorial)

Post on 23-Dec-2014

1.222 Views

Category:

Internet

27 Downloads

Preview:

Click to see full reader

DESCRIPTION

 

TRANSCRIPT

Modernizing LegacyApplications In PHP

Paul M. JonesZendCon 2014

1

Read These

2

About Me

• 8 years USAF Intelligence

• BASIC in 1983, PHP since 1999

• Jr. Developer, VP Engineering

• Aura project, Zend_DB, Zend_View

• PHP-FIG, PSR-1, PSR-2, PSR-4

• MLAPHP (http://mlaphp.com)

3

Overview• Legacy Applications

• Prerequisites

• Implement an Autoloader

• Consolidate Classes& Functions

• Replace globals with DI

• Replace new with DI

• Write Tests

• Extract SQL Statements

• Extract Domain Logic (Model)

• Extract Presentation Logic (View)

• Extract Action Logic (Controller)

• Replace remaining include

• Separate public resources

• Decouple URLs from page scripts

• Remove repeated logic in page scripts

• Replace page scripts with DI container

4

Legacy Applications

5

What Does Legacy Mean?• Page scripts directly in the document root

• Special index files to prevent directory access

• Special logic at the top of some files to `die()` or `exit()`

• Include-oriented instead of class-oriented or object-oriented

• Relatively few classes

• Poor class structure

• Relies more heavily on functions than on class methods

• Page scripts, classes, and functions combine MVC concerns

• One or more incomplete attempts at a rewrite

• No automated test suite6

Typical PHP Application Structurebin/ # command-line tools cache/ # cache files common/ # commonly-used include files classes/ # custom classes Image.php # Template.php # functions/ # custom functions db.php # log.php # cache.php # setup.php # configuration and setup css/ # stylesheets img/ # images index.php # home page script

7

Typical PHP Application Structurejs/ # JavaScript lib/ # third-party libraries log/ # log files page1.php # other page scripts page2.php # page3.php # sql/ # schema migrations sub/ # sub-page scripts index.php # subpage1.php # subpage2.php # theme/ # site theme files header.php # a header template footer.php # a footer template nav.php # a navigation template 8

Typical PHP Application Code001.php

9

Maintenance Headaches• include statements to execute logic

• inline function definitions

• global variables

• MVC logic combined in a single script

• trusting user input

• possible SQL injection vulnerabilities

• possible XSS vulnerabilities

• unquoted array keys generating notices

• if blocks not wrapped in braces

• copy-and-paste repetition

10

Technical Debt

• A metaphor referring to the eventual consequences of poor or evolving software architecture and software development within a codebase.

• As a change is started on a codebase, there is often the need to make other coordinated changes at the same time in other parts of the codebase.

• http://en.wikipedia.org/wiki/Technical_debt

11

Rewrite From Scratch?• Sexy! Tempting! Do it right the first time!

• Expend effort while not earning revenue

• Old devs on new project? New devs on new project?

• Takes longer than you think

• End up with different bad architecture12

Incremental Refactoring• Pay off smallest debt first (build inertia and raise spirits)

• Small changes across codebase

• Build on previous small changes

• Goal is not “perfection”

• Goal is to “improve over time”13

Refactor To A Framework?

• Essentially a rewrite. If system could be swapped to a framework …

• … we would not actually need to.

• At the end of this process, swapping out is more possible …

• … but might not want to at that point.

14

Review

• What is legacy code?

• Why to refactor instead of rewrite

15

Prerequisites

16

Basic Tools

• Revision control system

• PHP 5.0 or later

• Editor/IDE with multi-file search and replace

• Style guide

• Test suite

• Attitude and approach

17

Revision Control

• SVN, Git, Hg, etc

• Change logs, commits, reverts, push/share changes

• Even if you are the only one on the project

• Install and initialize before continuing

18

PHP 5.0 Or Later

• Yes, this is ancient, but it’s legacy code

• 4.x is a non-starter (autoloading)

• Later is better (5.4/5/6)

• Upgrade to 5.x before continuing (itself a detailed process)

19

Editor/IDE With Multi-File Search And Replace

• Emacs

• PHPStorm

• SublimeText

• TextMate

• Vim

• Zend Studio

20

Style Guide• “This code looks fine” said no developer ever

• Mixed or idiosyncratic styles lead to cognitive friction

• If project has a consistent style, keep it (ugly or not)

• If not, pick a *public* standard (PSR-1 and PSR-2)

• Resist the urge to reformat in toto — single files only21

Test Suite

• If we had unit tests already, we’d be a lot better off

• Characterization tests best

• QA testing is OK-ish

• “Spot check” is worst

• Short-term incentives lean toward “spot check” at first

• Improved discipline with time

22

Attitude And Approach

• Make one small change

• “Spot check” it

• Commit it

• Push it

• Notify QA

• Make the next small change

23

Attitude And Approach

• No big changes

• Revert and redo as needed

• Add as little as possible

• Keep the application running

• Each incremental change is a quality improvement

• Stop at any point and start again later

24

Review

• Basic prerequisites for refactoring

• Attitude and approach to refactoring

25

Implement an Autoloader

26

Why Autoload?• Instead of “include ClassName.php” …

• … PHP will load the ClassName on request

• Remove “definition” includes/require statements

• Reveal the execution structure of the application

• Fewer includes means less to wade through27

Autoloading Standards

• PSR-0 class-to-file mapping

• “\Foo_Bar\Baz\Dib_Zim” maps to “…/Foo_Bar/Baz/Dib/Zim.php”

• PSR-4 only if no underscores in classes, and full class-to-file mapping

28

Single Central Class Location

• Need a base, known directory for the autoloader

• Create a directory in the application for *all* classes (lib, src, classes)

• May already exist: only class files? mixed structures?

• Later, we will move all classes to this location

29

Register Autoloader In Setup

• Create an autoloader function or class

• Register it with SPL early in setup code

• Must be before any definitional “include” statements

• 002.php

30

Followup

• Spot check

• Commit

• Push

• Notify QA

31

What If I Already Have An Autoloader?

• Use it as-is if already PSR-0

• Modify to add PSR-0 behavior

• Register added PSR-0 autoloader

• Composer:$loader = require '/path/to/app/vendor/autoload.php';$loader->add('', '/path/to/app/classes');

32

Performance Implications?

• Some reason to think autoloading is slower than include

• Optimize on the right resources

• Any drag from autoloading likely overshadowed by other issues

• If this is the main bottleneck, you’re in great shape

33

Review

• What is autoloading?

• Why use autoloading?

• Basic implementation

34

Consolidate ClassesAnd Functions

35

Why Consolidate?

• Remove include calls in favor of autoloading

• Remaining includes reveal logic paths

• Provide structure for class hierarchy

36

Process (1/2)

• Find an include statement that pulls in a class definition file.

• Move that class definition file to our central class directory location, making sure that it is placed in a sub-path matching the PSR-0 rules.

• In the original file *and in all other files in the codebase* where an `include` pulls in that class definition, remove that `include` statement.

37

Process (2/2)

• Spot check to make sure that all the files now autoload that class by browsing to them or otherwise running them.

• Commit, push, and notify QA.

• Repeat until there are no more `include` calls that pull in class definitions.

38

Initial Class Locationclasses/ # our central class directory location Mlaphp/ Autoloader.php # A hypothetical autoloader class foo/ bar/ baz.php # a page script includes/ # a common "includes" directory setup.php # setup code index.php # a page script lib/ # a directory with some classes in it sub/ Auth.php # class Auth { ... } Role.php # class Role { ... } User.php # class User { ... }

39

Find A Candidate Include

<?php require 'includes/setup.php'; require_once 'lib/sub/User.php';

// ... $user = new User(); // ... ?>

40

Move The Class Fileclasses/ # our central class directory location Mlaphp/ Autoloader.php # A hypothetical autoloader class User.php # class User { ... } foo/ bar/ baz.php # a page script includes/ # a common "includes" directory setup.php # setup code db_functions.php # a function definition file index.php # a page script lib/ # a directory with some classes in it sub/ Auth.php # class Auth { ... } Role.php # class Role { ... }

41

Remove First Include Call

<?php require 'includes/setup.php';

// ... // the User class is now autoloaded $user = new User(); // ... ?>

42

Remove Other Include Calls

• The class file is likely included in other locations

• Use project-wide search facility to find other includes of the class

• Remove those include calls as well

• Keep a list of changed files

^[ \t]*(include|include_once|require|require_once).*User\.php

43

Beware Mismatches

• Inspect “by eye” the search results when removing

• Make sure the include call is to the same class file

include_once("/usr/local/php/lib/User.php");

44

Spot Check

• “Spot check” changed files

• Best to work with error reporting at E_ALL or E_WARNING

• Look for “file not found” and “class not defined” errors

45

Followup

• Commit, push, notify QA

• Continue to next class

• Done when there are no more class-related includes

46

Consolidate FunctionsInto Classes

• Might be more procedural than class- or object-oriented

• Want to leverage autoloading, but only works with classes

• Convert function files into classes and call them as methods

47

Process (1/3)• Find an `include` statement that pulls in a function definition file.

• Convert that function definition file into a class file of static methods; we need to pick a unique name for the class, and we may need to rename the functions to more suitable method names.

• In the original file *and in all other files in the codebase* where any functions from that file are used, change calls to those functions into static method calls.

48

Process (2/3)

• Spot check to see if the new static method calls work by browsing to or otherwise invoking the affected files.

• Move the class file to the central class directory location.

• In the original file *and in all other files in the codebase* where an `include` pulls in that class definition, remove the relevant `include` statement.

49

Process (3/3)

• Spot check again to make sure that all the files now autoload that class by browsing to them or otherwise running them.

• Commit, push, and notify QA.

• Repeat until there are no more `include` calls that pull in function definition files.

50

Find A Candidate Include

<?php require 'includes/setup.php'; require_once 'includes/db_functions.php';

// ... $result = db_query('SELECT * FROM table_name'); // ... ?>

51

Convert Function FileTo Class File

• 003.php

• The include call stays in place

• The file also stays in place

52

Change Function Callsto Static Method Calls

• Find old call, replace with new call

• Do so for each old function converted to a static methodFind:

db_query\s*\(

Replace:

Db::query(53

Spot Check #1

• Check to make sure the new static calls work

• The file has not moved yet

• The include has not been removed yet

54

Move The Class File classes/ # our central class directory location Db.php # class Db { ... } Mlaphp/ Autoloader.php # A hypothetical autoloader class User.php # class User { ... } foo/ bar/ baz.php # a page script includes/ # a common "includes" directory setup.php # setup code index.php # a page script lib/ # a directory with some classes in it sub/ Auth.php # class Auth { ... } Role.php # class Role { ... }

55

Followup

• Remove all include calls to the moved file

• Spot check the files with changes

• Commit, push, notify QA

• Continue until all function files are converted to class files

56

How To Pick Candidate Include Files?

• Manually traverse the entire codebase and work file-by-file.

• Generate a list of class and function definition files, and then generate a list of files that `include` those files.

• Search for every `include` call and look at the related file to see if it has class or function definitions.

57

What If An Include Defines More Than One Class?

• Split the single file into multiple files.

• Create one file per class, and name each file for the class it contains per the PSR-0 naming and autoloading expectations.

58

Isn’t “One Class Per File” Wasteful?

• Optimize on the right resources

• Any performance drag is small compared to other issues

• Consistency above all: if some have one, and others have 2+ … ?

• Reasonable to group subclasses in subdirectories

59

What About Inline Definitions?

• Remove the inline definition to its own file.

60

What If Definition File Executes Logic?

• Move the definition and leave the logic

• Do not remove the include call, so the logic is executed

• 004.php

61

What If Two Classes Have The Same Name?

• Rename one or both of the classes when moving them.

62

What About Third-Party Libraries?• Don’t want to move 3rd-party libs (makes upgrading harder)

• If it uses an autoloader, add to stack; if not:

• Modify our autoloader with the 3rd-party lib in mind

• Write an additional autoloader for the 3rd-party lib

• Consolidate to their own location if possible (move entire package)63

What About System-Wide Libraries?

• Installed outside the legacy application for system-wide use

• If Horde/PEAR, modify PSR-0 autoloader to look in multiple dirs

• 005.php

64

Use Instance Methods Instead Of Static Methods For Functions?

• Instance methods are better in the long run

• More trouble to convert at first, but do so if you can

• Easier to convert to static first, then to instance, but more time

• 006.php

65

Can This Be Automated?

• Not that I know of.

66

Review• Autoloading all classes from central location

• Autoloading all functions as classes from central location

• Definitional includes now removed

• Remaining includes execute logic

• More class-oriented, less include-oriented67

Replace Globals With Dependency Injection

68

Why Replace Globals?

• Classes still have globals in them

• “Spooky action at a distance” — change “here” affects “there”

• 007.php

69

Inject Dependencies Instead

• Create dependency outside the class

• Pass it into the class at construction time

• Then we can see dependencies from the outside

70

Process (1/2)

• Find a global variable in one of our classes.

• Move all global variables in that class to the constructor and retain their values as properties, and use the properties instead of the globals.

• Spot check that the class still works.

71

Process (2/2)

• Convert the global calls in the constructor to constructor parameters.

• Convert all instantiations of the class to pass the dependencies.

• Spot check, commit, push, and notify QA.

• Repeat with the next global call in our class files, until none remain.

72

First Steps

• Search for “global $” in the central class directory

• Move all globals in class to constructor (008.php)

• Spot check

• Convert globals to properties and constructor params (009.php)

73

CHANGE INSTANTIATIONS

• Find instantiations of the class (imperfect but pretty good):

• new\s+ClassName\W

• use\s+ClassName\W

• Change instantiations to use constructor params (010.php)

74

Followup

• Spot check, commit, push, notify QA

• Continue until there are no more uses of global keyword

75

What About Globals In Static Methods?

• Convert to an instance method and pass dependency

• Pass dependency as a method parameter

76

What About Superglobals?• Special case of globals: automatically global, like it or not

• Must replace them since they are global

• Cannot find with search for “global”

• Often need 2-3 superglobals at a time

• Could pass copies, but $_SESSION may not work that way77

Superglobal Solution: Pass A Request Object• Use the Request data structure from http://mlaphp.com/code

• Inject into all classes using superglobals

• Classes then use the Request for copies of superglobal data

• Request handles the special case of $_SESSION

• 011.php78

What About $GLOBALS?

• Remove all uses of $GLOBALS the same as with the global keyword.

79

Review

• Removed all global keywords and replaced with injection

• Removed all superglobals and replaced with injection

80

Replace new with Dependency Injection

81

Why Replace New?

• Makes dependencies explicit

• Decouples object *creation* from object *use*

• Allows more control for object creation from outside

• 012.php

82

Parallel Solutions

• Replace “early” creation with dependency injection

• Replace “late” creation by injecting a Factory object

• Factory’s only purpose is to create and return objects

83

Process (1/3)• Find a non-Factory class with the new keyword in it.

• For each “early” creation in the class ...

• Extract each instantiation to a constructor parameter.

• Assign the constructor parameter to a property.

• Remove any constructor parameters and class properties that are used only for the `new` call.

84

Process (2/3)

• For each “late” creation in the class ...

• Extract each block of creation code to a new Factory class.

• Create a constructor parameter for each Factory and assign it to a property.

• Modify the previous creation logic in the class to use the Factory.

85

Process (3/3)

• Change all instantiation calls for the modified class to inject dependencies.

• Spot check, commit, push, and notify QA.

• Repeat with the next new keyword that in a non-Factory object.

86

Find A Use Of New

• Search the project for “new” instantiations

• If in constructor method, inject directly (013.php)

• If in a non-constructor method, extract and inject Factory (014.php)

87

Change Instantiations

• Search for new\s+ItemsGateway\(

• Create dependencies and pass to the instantiation (015.php)

88

Followup

• Spot check, commit, push, notify QA

• Repeat with next new keyword in non-Factory class

89

Isn’t This A Lot Of Code?

• Yes, it’s true — there are more lines of code to suit the same end.

• The real issue is not “more code”.

• The real issues are “more testable, more clear, less coupled”

• 016.php

90

What About SPL ClassesAnd Exception Classes?

• These are reasonable exceptions to the “no new outside of Factory” rule, since they are provided by PHP directly and relatively simple.

• Includes ArrayObject, ArrayIterator, Exceptions of all sorts

• Be careful about other native PHP classes like PDO with complex setups, outside configuration, etc.

• Err on the side of dependency injection instead of inline creation.91

What About Intermediary Dependencies?

• Sometimes dependencies have dependencies

• These are injected into one class so that another class can use them

• Construct completed object and pass that instead

• 017.php

92

Can A Factory Create Collections?

• Factories are for object-creation logic

• Object-creation logic may be simple or complex

• As long as it is limited to object creation and return, all is well

• 018.php

93

Can We Automate All These Injections?

• “Manual” injection so far — create dependencies “by hand” and inject

• Tedious and repetitive, even with Factory objects

• DI (aka IOC) Container is tempting, but resist for now

• Too radical a change at this point; later will be more natural fit

94

Review

• Removed all in-class instantiations

• Direct injection and Factory classes

• Page scripts might be longer but dependencies now explicit

95

Write Tests

96

Why Tests?

• Previously, changes to existing code filled us with dread

• Tests help to assure us that changes do not break functionality

97

Test Resistance

• Feels like make-work after such great progress

• No feeling of joy from writing code that just tests other code

• Does not itself feel like an improvement to the codebase

98

Testing Dogma

• "Do not interact with the file system; build a virtual file system instead.”

• "Do not interact with the database; build a set of data fixtures instead.”

• "Rewrite your classes to use interfaces instead of concrete classes, and write test doubles for all the dependencies.”

• Sounds insurmountable so skip it and move on … ?

99

Overcoming Test Resistance

• Without tests, condemning ourselves to more suffering later

• Enabling a continued awfulness to our work life

• Uncertainty that a change does not break something

• Writing tests sucks, but having written tests is awesome

100

The Way of Testivus• http://www.agitar.com/downloads/TheWayOfTestivus.pdf

• "More testing karma, less testing dogma.”

• "The best time to test is when the code is fresh.”

• "Write the test that needs to be written.”

• "An imperfect test today is better than a perfect test someday.”

• "Write the test you can today."101

Write The Best Test You Can

• Characterization that checks output

• Functional/integration that interacts with network/database/files

• Loose unit test that combines concrete classes

• Strict unit tests that isolate the unit with test doubles

102

Build Inertia Towards Testing• Imperfection is irrelevant; the test is more important than the unit.

• Imperfect tests can be perfected; nonexistent tests cannot.

• Writing *no* tests leads to *never* writing tests

• Writing *any* tests leads to writing *more* tests.

• Writing better tests leads to writing better units.103

Caveats

• Testing is relatively specialized

• The full scope of testing is to much for this talk

• Read “The Grumpy Programmer’s PHPUnit Cookook” (Hartjes)

104

Process (1/2)

• Install PHPUnit

• Create a tests directory and test config files

105

Process (2/2)• Pick a public class method

• Write a test for it

• Run the test; rejoice if it passes, rejoice if it fails

• Commit/push when the test passes

• Pick another public method and write a test for it

• Continue until all public methods are tested106

Install PHPUnit

• composer global require phpunit/phpunit=4.*

• pear config-set auto_discover 1pear install pear.phpunit.de/PHPUnit

• Download phpunit.phar

107

Create A Tests DirectoryAnd Config Files

• Create a top-level tests directory

• Create subdirectory for test classes in tests directory

• Add phpunit.xml and bootstrap.php files

• 019.php

108

Pick A Public Class Method

• Pick something simple at the start

• A class that is easy to construct (no dependencies)

• A public method that returns a result (no echo/print)

• Want an easy win; small wins build confidence and motivation

109

Run The Empty Test

• Watch it fail, and rejoice!

• The suite works, there’s just no testing code in it.

tests $ phpunit PHPUnit 3.7.30 by Sebastian Bergmann.

Configuration read from tests/phpunit.xml

F

Time: 45 ms, Memory: 2.75Mb

There was 1 failure:

1) Warning No tests found in class "FooTest".

FAILURES! Tests: 1, Assertions: 0, Failures: 1. tests $

110

Add A Test Method And Run Again• 021.php

• Rejoice when it passes!

• If it ever fails again, something is broken.

• Rejoice when it fails! You know you need to fix it.

tests $ phpunit PHPUnit 3.7.30 by Sebastian Bergmann.

Configuration read from tests/phpunit.xml

.

Time: 30 ms, Memory: 2.75Mb

OK (1 test, 1 assertion) tests $

111

Followup

• Commit the passing test, push

• Probably no need to notify QA *unless* code changed

• Pick another public method to test

• Continue until all public methods have tests

112

Can We Skip This And Do It Later?

• No.

113

Come On, Really, Can We Do This Later?

• You still have high test resistance, which is understandable.

• If you won’t write all your tests at once, at least write them in stages.

114

Options For Writing Tests In Stages

• One complete test class per day.

• Each time you use a method, check for a test, and write it if needed.

• Each feature or bug, keep list of classes or methods, and write tests for them as part of of the completion process.

• Each new method, write a test for it.

115

Delegate Writing Of Tests

• Junior dev!

• But soon the junior will know more about the system than you.

• Empower the junior to berate the authors to write testable code.

116

Testing Is Not Optional

• Writing a test suite is a non-negotiable aspect of modernizing

• If you don’t start now, you never will

• Write them now, or write them as we go, sooner rather than later

117

What About Hard-To-Test Classes?

• Working Effectively With Legacy Code (Feathers)

• Refactoring (Fowler et al.)

• https://github.com/adamculp/refactoring101

118

What About Our Earlier Characterization Tests?

• Keep them

• Basis of acceptance tests for QA team

119

Should We Test ProtectedAnd Private Methods?

• Tests that inspect the inner workings of a class are too brittle

• Public methods expose the results of the inner workings

120

Can We Change A Test?

• If method behavior changes on purpose, test may have to change too

• Run the entire suite, not just the one test

121

Do We Need To Test Third-Party Libraries?

• If they already have tests, no.

• Otherwise, our priorities will dictate.

• Dependency on the library acting a certain way? Test just that.

• Remember that our legacy application is the main concern.

122

What About Code Coverage?

• A report from PHPUnit telling us how many lines were tested

• Uncovered code is untested code; more coverage is better

• 100% is a good goal but likely unfeasible early on

123

Review

• We have at least started writing tests in The Way Of Testivus

• We are committed to adding new tests as part of normal work

• Each new test means more warnings about breaking changes

• Rejoice when they pass! Rejoice when they fail!

124

Extract SQL Statements

125

Why Extract SQL?

• Begin to separate the various concerns of the application

• Leads to easier maintainability and testability

• Data source work becomes separate from data manipulation work

• “SQL” is a stand-in for any data source

126

Embedded SQL

<?php $db = new Db($db_host, $db_user, $db_pass); $post_id = $_GET['post_id']; $stm = "SELECT * FROM comments WHERE post_id = $post_id"; $rows = $db->query($stm); foreach ($rows as $row) { // output each row } ?>

127

Goals For Extracting SQL

• Test the SQL interactions in isolation from the rest of the code

• Reduce the number of repeated SQL strings throughout the codebase

• Collect related SQL commands for generalization and reuse

• Isolate and remove security flaws such as SQL injection

128

Extract And Replace With Gateway Classes

• Only responsibility is to move data to and from data sources

• Probably “Table Data Gateways”

• Use a Gateway appropriate to the data source

129

Process (1/2)

• Search entire codebase for SQL statements.

• For each statement not already in a Gateway, move the statement and relevant logic to a related Gateway class method.

• Write a test for the new Gateway method.

130

Process (2/2)

• Replace the statement and relevant logic in the original file with a call to the Gateway class method.

• Spot check, commit, push, and notify QA.

• Repeat with the next SQL statement that is outside a Gateway class.

131

Search For SQL Statements

• Search for (SELECT|INSERT|UPDATE|DELETE)

• Easier if statement case is consistent (either upper or lower)

• Mixed-case leads to false positives

<?php $stm = "SELECT * FROM comments WHERE post_id = $post_id"; $rows = $db->query($stm); ?>

132

Move SQL To Gateways• Tedious and detail-oriented, even in simple cases

• What to name the class and method?

• How to deal with query parameters?

• How to avoid security flaws?

• What is the proper return value?133

Namespace And Class Names• Organize by layer, or by entity?

• By layer : Gateway\* or DataSource\Gateway\*

• By entity: Comments\* or Domain\Comments\*

• If organization in place, follow it

• Otherwise, by entity: Domain\Comments\CommentsGateway134

Method Names• Look to existing code for a convention; stay consistent if possible

• If no consistent pattern, try select*(), insert*(), etc.

• Try indicating the result and criteria: selectOneById(), etc.

• Use good professional judgment; take hints from the existing code

• Stay consistent above all135

Initial Gateway Class And Method• 022.php

• Essentially exact copy of existing logic

• Wrapped in class with dependency injection and method param

• No changes to logic, including security problems

• Need to fix SQL injection when we see it136

SQL Injection

137

SQL Injection

• “Little Bobby Tables” http://xkcd.com/327/

• Never assume parameter values are safe, even if you “know” they are

• Old “safe” values become unsafe with changes to codebase

• Each Gateway method should prevent SQL injection *itself*

138

Defeating SQL Injection• Best: Prepared statements and parameter binding, or cast to int/float

• OK: Quote-and-escape each parameter before interpolating into the query string

• Better than nothing: Escape each input parameter before interpolating into the query string (still susceptible to charset problems)

• 023.php139

Write A Test• Remember The Way Of Testivus!

• Write the test you can, when you can, while the code is fresh

• It will be imperfect, but it can be perfected

• It will likely touch the database, so have a test database

• 024.php140

Replace The Original SQL Code

• 025.php

• No changes to operational logic in original location

141

Followup

• Spot check, commit, push, notify QA

• Find the next SQL statement in a non-Gateway location

142

What About INSERT, UPDATE, DELETE ?• Example showed only SELECT

• Other query types also go in Gateways

• INSERT/UPDATE/DELETE harder because of params

• Consider using a data array for these

• 026.php143

What About Repetitive SQL Strings?

• We are likely to see lots of repetition with variation

• SELECT * FROM comments WHERE post_id = $post_id LIMIT 10

• Professional judgment: combine where simple, avoid where complex

• Don’t over-imagine possible scenarios! Combine only what you see.

• Always runs tests after changes144

What About Complex Query Strings?

• Simple queries as examples so far, likely to see more complexity

• Extract query-building logic to Gateway from surrounding logic

• Requires care, caution, and high attention to detail

• 027.php vs 028.php

145

What About QueriesIn Non-Gateway Classes?

• Examples have all been page scripts

• Other classes likely to have SQL embedded as well

• Follow same extraction process

• Inject Gateway into originating class

• 029.php146

Can We Extend From A Base Gateway Class?

• Once you have enough extractions to see patterns, yes.

• Be careful not to preemptively create functionality

• 030.php

147

What About Complex QueriesAnd Complex Result Structures?

• Examples have been with single tables

• Likely to need multiple tables with joins

• Likely to have nested-array result structures

148

How To Extract Complex QueriesAnd Result Strctures

• Identify how to split up the queries into separate Gateways (if at all)

• Determine which Gateway should receive the “main” logic, even if across multiple tables

• 031.php

• Note we still have an N+1 problem

149

What If There Is NoDatabase Abstraction Class?

• Examples assume a “Db” or “Pdo” type of object

• Legacy codebases often rely on mysql_*() functions direction

• Upgrade to PDO if you can, but this can be difficult

• Could move mysql calls to Gateways, but global state (link identifier)

150

Option: Proxy Calls Through MysqlDatabase

• http://mlaphp.com/code

• Class itself contains the link identifier, removing global state

• Connects only on queries, not on creation

151

Replacing mysql_*() with MysqlDatabase• Search the entire codebase for the mysql_ prefix on function calls.

• Where there are function calls with the mysql_ function prefix …

• Create or inject an instance of MysqlDatabase.

• Replace each mysql_ function prefix with the MysqlDatabase object variable and a single arrow operator (->)

• Spot check, commit, push, and notify QA.

• 032.php152

After Replacing mysql_*()

• The plan a migration away from the deprecated mysql_*() to PDO

• Easier now that all mysql_*() calls are in the Gateways

153

Review

• We have extracted all SQL statement to Gateways

• Now have tests for all Gateways, and thus all SQL interactions

• Possibly moved away from mysql_*() to PDO or MysqlDatabase

154

Extract Domain Logic (Model)

155

Why Extract Domain Logic?

• Data source work extracted to encapsulate database interactions

• Domain logic co-mingled with presentation and other logic

• Validation, modification, calculation, etc.

• Extract so we can reuse and test independent of other concerns

156

Domain Logic Patterns (1/2)

• Transaction Script "... organizes [domain] logic primarily as a single procedure, making calls directly to the database or through a thin database wrapper. Each transaction will have its own Transaction Script, although common subtasks can be broken into subprocedures."

• Domain Model "... creates a web of interconnected objects, where each object represents some meaningful individual, whether as large as a corporation or as small as a single line on an order form."

157

Domain Logic Patterns (2/2)

• Table Module "... organizes domain logic with one class per table in the database, and a single instance of a class contains the various procedures that will act on the data. ... if you have many orders, a Domain Model will have one order object per order while a Table Module will have one object to handle all orders.”

• Service Layer "... defines an application's boundary and its set of available operations from the perspective of interfacing client layers. It encapsulates the application's business logic, controlling transactions and coordinating responses in the implementation of its operations."

158

Which To Use?

• Service Layer implies sophistication that is not likely present

• Domain Model implies well-defined set of business entity objects

• Table Module seems OK; probably not working with single tables

• If one of these already implemented, use it! Otherwise …

159

Transaction Script

• Dump existing logic into a single class method and call it

• Simple, boring, lowest-common-denominator, but works anywhere

• Remember: refactor as-it-is, not rewrite as-we-want-it-to-be

• After extracting we can think about more sophisticated patterns

160

WARNING• This is the most difficult part of modernization

• Domain logic is the true core of our application

• Difficult to give examples as domain logic is different in each app

• Success depends on familiarity and deep knowledge of application

• Care, attention to detail, time consuming, very demanding161

Process (1/2)

• Search entire codebase for Gateway uses outside Transactions classes

• Examine code surrounding Gateway operations for domain logic

• Extract domain logic to one or more Transactions classes

• Modify the original code to use the Transactions class

162

Process (2/2)

• Spot check to make sure the original code still works properly

• Write tests for the extracted Transactions logic

• Commit, push, notify QA

• Search again for uses of Gateways outside of Transactions

163

Search For Gateway Uses

• Search for new .*Gateway and track through code

• Might be injected into a class that uses it for domain logic

• Might be passed along into other methods after initial use

• Code surrounding the usage is primary candidate for examination

164

Discover And Extract Domain Logic• WARNING: Follow all previous lessons when extracting!

• No use of globals or superglobals (use Request object)

• No use of new keyword outside Factory objects

• Use dependency injection

• Consistency above all165

Candidate Logic For Extraction• normalizing, filtering, sanitizing, and validating of data

• calculation, modification, creation, and manipulation of data

• sequential or concurrent operations and actions using the data

• retention of messages from those operations and actions

• retention of values and variables for later inputs and outputs166

Extraction Activities

• Break up or reorganize the extracted domain logic into support methods

• Break up or reorganize the original code to wrap the new Transaction calls

• Retain, return, or report data needed by the original code

• Add, change, or remove variables in original code related to extracted logic

• Create and inject dependencies for the Transactions classes and methods

167

This Is A Learning Exercise• Picking apart the app is a process of discovery

• Do not be afraid to make multiple attempts at extraction

• Going to fail at it over and over

• Treat these failures as knowledge-building exercises

• Revision control is your best friend168

Example Extraction

• 033.php and 034.php

169

Extraction Notes (1/3)• Two separate transactions: submit new article, update existing article

• Each operated on the user's credit counts in the database

• Various data normalizing and support operations

• Extracted to an ArticleTransactions class and two separate methods

• Named the methods for the domain logic being performed, not for the underlying technical operations

170

Extraction Notes (2/3)• Input filtering encapsulated as a support method in the

ArticleTransactions class for reuse

• ArticleTransactions receives ArticlesGateway and UsersGateway dependencies

• Variables related only to domain logic removed from the page script

• Placed into the Transactions class as properties171

Extraction Notes (3/3)

• Code in the original page script has been greatly reduced

• Is now essentially an object creation and injection mechanism

• Passing user inputs to the domain layer and getting back data to output later

• Original code can no longer see the $failure variable

• Must now get failure information from the object for later presentation

172

Followup• Spot check original code, modify/revert as needed

• Write Transactions tests, modify/revert as needed

• Spot check *again*, modify/revert/retest as needed

• Commit, push, notify QA

• Search again for uses of Gateways outside Transactions173

Are These SQL Transactions?• No, “Transaction Script” pattern not same as SQL transaction

• Does not have to wrap an actual SQL transaction

• However, thinking in terms of what a transaction should be may help define what the Transaction Script should contain

• The singular purpose of the Transaction Script may help determine boundaries of the domain logic to be extracted

174

What About Repeated Domain Logic?• As with Gateway extractions, we may find repeated domain logic

• Use professional judgment: if identical, or nearly, then combine

• If there are recognizable support elements, extract them separately

• Err on the side of caution: small steps are better than big ones

• Test, test, test175

Are Printing And EchoingPart Of Domain Logic?

• No; those are part of presentation logic

• Transactions classes should not contain print or echo

• Extract so it remains outside of domain logic

176

Can A Transaction Be A Class By Itself?

• Some logic complex enough to warrant a single class

• Or standardize of on single classes for Transactions by name

• Use __invoke() method or another method indicating single purpose

177

What About Domain Logic In Gateways?

• Possible that we moved domain logic with data source logic

• Extract from Gateway into Transaction

• Update relevant tests

178

What About Domain LogicIn Non-Domain Classes?

• Examples showed domain logic in page scripts

• Might be in other classes as well

• Follow same extraction process, then inject Transaction into class

• Make calls to the injected Transaction class

179

Review• Extracted core of our application (domain) to Transactions classes

• Mostly shuffled logic around to a new layer

• Domain logic independently testable

• Page scripts now much shorter

• Basis for further refactoring: Service Layer, Domain Model, DDD, etc180

Extract Presentation Logic (View)

181

Why Extract Presentation Logic?

• We have extracted a Model system (domain logic)

• Other concerns combined in the codebase (View, Controller)

• Separate the presentation logic into a View layer

• Independently testable from controller/action logic

182

What Is Presentation Logic?

• Generally, anything that sends output to the client

• Specifically, everything in the HTTP response (headers included!)

• echo, print(), printf(), sprintf()

• header(), setcookie(), setrawcookie()

183

The Response Is The Presentation

• Extract presentation logic to a View file

• The View file will be executed inside a Response object

• http://mlaphp.com/code/

184

Process (1/2)

• Find a page script that contains presentation logic mixed in

• Rearrange the code to collect all presentation logic into a single block *after* all the other logic, and spot check

• Extract the block to a view file to be delivered via a _Response_, and spot check again

185

Process (2/2)

• Add proper escaping logic and spot check again

• Write a test for the extracted logic

• Commit, push, notify QA

• Look for more presentation code mixed in with other code

186

Search For Presentation Logic

• At this point it should be easy; page scripts greatly reduced

• echo, print(), printf(), sprintf()

• header(), setcookie(), setrawcookie()

187

Rearrange Page Script And Spot Check

• Collect all presentation logic after a big noticeable comment

• Remember previous rules: no globals/superglobals, no new, etc.

• Spot check to make sure the page script still works

• 035.php and 036.php

188

Rearrangement Notes

• Moved variables not used by the business logic down to presentation block

• Moved header.php include down to presentation block

• Moved logic acting only on presentation variables to presentation block

• Replaced $_SERVER['PHP_SELF'] with an $action variable

• Replaced $_GET['id'] with an $id variable

189

Extract Presentation To A View File

• Create a top-level views directory

• Pick a view file name; match to page script path

• /foo/bar/baz.php maps to /views/foo/bar/baz.php

• Append with .html.php, .json.php, etc. if desired

190

Move Presentation Block To View File• Cut from page script, paste into view file

• Replace with a Response object that references the view file

• Always use the same variable name for the Response!

• Spot check to find missing variables (E_ALL will help)

• 036.php (recap) and 037.php191

Add Proper Escaping

• Escaping helps prevent Cross-Site Scripting vulnerabilities

• Always escape all variables all the time

• Escape for the output context: HTML body, HTML attribute, CSS, JS

• Requires diligence and attention to detail

192

htmlspecialchars()<form action="<?php echo $request->server['PHP_SELF']; ?>" method="POST">

<form action="<?php echo htmlspecialchars( $request->server['PHP_SELF'], ENT_QUOTES, 'UTF-8' ); ?>" method="POST">

193

Response::esc()

<form action="<?php echo $this->esc($request->server['PHP_SELF']); ?>" method="POST">

194

Escape For Proper Context

• Zend\Escaper for CSS, JS, HTML attributes

• Aura\Html for escapers and other methods

195

Write A View File Test

• Presents unique challenges

• A file, not a class

• Slightly different testing structure

196

Testing Structure

• Add tests/views/ directory

• Modify phpunit.xml to add views/ directory

<phpunit bootstrap="./bootstrap.php"> <testsuites> <testsuite> <directory>./classes</directory> <directory>./views</directory> </testsuite> </testsuites> </phpunit>

197

The View File Test• Even though testing a file, the test itself is a class

• Name for the view file being tested

• /views/foo/bar/baz.php => /tests/view/foo/bar/BazTest.php

• Mimic the Response-calling code in page script

• 038.php198

Rejoice When It Fails

• The expectation is empty output ($expect == ‘’);

• If it passes, we have a problem ;-)

• Now we need to assert correctness of output

199

Asserting Correctness Of Output

• Probably not testing entirety of output

• Naive check for key or critical strings with strpos()

• 039.php

200

More Robust Assertions

• PHPUnit assertSelect*() methods (CSS selectors)

• Zend\Dom\Query for finer examination of DOM tree

• Start simple, move toward more complex as needed

201

Followup

• Commit, push, notify QA

• Continue looking for mixed-in presentation logic

202

What About Headers And Cookies?

• Presentation work includes headers and cookies

• header() and setcookie() send output right away, regardless of ob_*()

• Difficult to test when unbuffered

203

Response Object Buffering Methods• Use Response::header(), setCookie(), and setRawcookie()

• Captures calls into array structure properties

• Inspect using Response::getHeaders()

• Response::send() calls the real header() etc. methods

• 040.php204

What If We Already HaveA Template System?

• Might use template in page script instead of Response

• If there are headers being sent, mixing logic in page script

• Future proof by moving template calls to view file

• Have page script invoke the view file via the Response

• 041.php205

What About Streaming Content?• Don’t want to read large files into memory

• Instead, want to stream file a few kb at a time

• Use Response::setLastCall() to invoke a callable after send()

• The callable can then stream out content

• 042.php206

What If We Have Lots Of Presentation Variables?

• Examples show only a few variables; likely to have 10, 20, more

• Presentation may have several include files needing those vars

• Collect commonly-used presentation vars to their own object

• Pass that object to the presentation

• 043.php207

What About Class MethodsThat Generate Output?

• Classes will often use echo, print, etc. instead of returning values

• Convert them to return the output instead of echoing it

• Remember to escape properly!

• Then echo the method call

• 044.php208

What About Business LogicMixed Into The Presentation?

• May discover Gateway or Transaction calls inside presentation code

• Move the calls outside the presentation code and retain in variable

• Pass variable to presentation

• May result in extra loops, but goal is clean separation

• 045.php209

What If A Page ContainsOnly Presentation Logic?

• Page script may be only presentation with no other code

• Use a Response object and view file anyway

• Need consistency in page scripts for later steps

210

Review

• All presentation logic extracted to view files

• All view files being executed through a Response object

• Views are testable independent of model and controller logic

• Page scripts getting shorter and shorter

211

Extract Action Logic (Controller)

212

Why Extract Action Logic?• Page scripts have only two kinds of logic now:

• Dependency-building logic uses app settings to create objects

• Action logic uses the created objects to generate Response

• Move action logic to its own Controller layer to separate concerns

• Independent testability213

Process (1/2)

• Find a page script where action logic is still mixed in

• Collect action logic to its own central block and spot check

• Extract the action logic to a new Controller class

• Modify the page script to use the new Controller and spot check

214

Process (2/2)

• Write a unit test for the new Controller class and spot check again.

• Commit, push, notify QA

• Find another page script with embedded action logic

215

Search For Embedded Action Logic

• Every page script ;-)

216

Collect Action Logic And Spot Check• Move all setup and dependency-creation code to the top

• Move all code using the created objects to the end

• Leave $response->send() as very last line

• Spot check when finished

• 046.php and 047.php217

Extract To Controller Class

• Pick a class name

• Likely in the Controller\* namespace

• Should reflect page script hierarchy

• /foo/bar/baz.php => Controller\Foo\Bar\BazPage.php

218

Move The Action Logic To Controller Class• Create a skeleton Controller class

• Use consistent invocation method name; e.g., __invoke()

• Cut the central action logic block, paste into the Controller

• Spot check and modify Controller to pass dependencies as params

• 048.php219

Convert Controller to DI and Spot Check• Move method params to constructor and retain as properties

• Modify method to use properties instead of params

• Modify page script to inject dependencies in setup and spot check

• 049.php

• Consistency in calling the controller will help in later steps220

Write A Controller Test• Name the test for the controller class

• Build all the controller dependencies in setup()

• If we can use test doubles (fakes, mocks, etc) so much the better

• Make assertions against the returned Response object

• getView(), getVars(), getHeaders(), getLastCall()221

Followup

• Commit, push, notify QA

• Move on to next page script with embedded action logic

222

Can We Pass Parameters To The Controller Method?

• Better not to do so, but only for reasons of consistency

• Most consistent is to pass nothing at all

• In later steps we will need Controller objects to be constructed ready-to-go with no added information needed

• All “extra” information should go in via constructor

223

Can A Controller Have Multiple Actions?

• Better not to, but sometimes easier when converting complex page scripts

• If necessary, split into separate action methods within the class

• Put switching logic in __invoke()

• Do not call them from the page script (consistency)

• Keep testing!

224

What If The Controller Contains Include Calls?

• Page scripts might still have include calls in them

• Artifact of include-oriented architecture

• These need to be extracted, but not yet

• Leave them in place for now

• The next step will remove them225

Review

• All action logic now extracted from page scripts

• Controllers are now independently testable

• Page scripts are now very short: create controller dependencies, create and invoke controller, get back Response and send it

• Complete Model-View-Controller system

226

Replace Remaining Includes

227

Why Replace Includes? (1/2)• Achieved MVC separation, but classes may still have includes

• Artifact of legacy “include-oriented” architecture

• Executes logic at include-time

• Mixes scopes and obscures logic flow; difficult to test

• 050.php228

Why Replace Includes? (2/2)• Method code and include code tightly coupled in non-obvious ways

• $user variable must pre-exist in scope

• Introduces $user_messages and $user_is_valid into scope

• May overwrite pre-existing variables in scope

• Solution: convert to a class method, then call that method229

WARNING

• If only a few includes, not so tough

• If many/complex/interdependent, very tedious and time consuming

• Possibly more difficult than Domain extraction due to spaghetti nature of include files

230

Process (1/5)

• Search the classes/ directory for an include call in a class.

• Search entire codebase to find how many times the file is included.

• Only one time? Or many times?

231

Process (2/5)

• If the included file is used only once (3 steps) …

• Copy included file code as-is directly over the `include` call.

• Test the modified class, and delete the include file.

• Refactor the copied code so that it follows all our existing rules (globals, new, DI, return not output, no includes).

232

Process (3/5)

• If used more than once (7 steps) …

• Copy contents of included file as-is to new class method.

• Replace include with inline instantiation and invocation.

• Spot check to find missing variables; add to new method by reference.

233

Process (4/5)• Search entire codebase for include calls to same file; replace each with inline

instantiation and invocation; spot check and and unit test.

• Delete original include file; unit test and spot check the entire legacy application.

• Unit test new class; refactor to follow all our existing rules (globals, new, DI, return not output, no includes).

• Replace each inline instantiation of the new class in each of our class files with dependency injection, testing along the way.

234

Process (5/5)

• Final unit test and spot check

• Commit, push, notify QA

• Look for the next include call in classes

235

Search For Includes (1/2)• Search only classes for

^[ \t]*(include|include_once|require|require_once)

• Returns list of candidate include calls; find easiest one

• Then search again for that specific include in the entire codebase

• Found require ‘foo/bar/baz.php’; …

• ^[ \t]*(include|include_once|require|require_once).*baz\.php236

Search For Includes (2/2)

• Found require ‘foo/bar/baz.php’; …

• ^[ \t]*(include|include_once|require|require_once).*baz\.php

• Use only file name; relative paths (examine each for false positives)

• Count number of true positives leading to that include file

• Used once, or used more than once?237

Replace Single Include

• Copy include code directly over the include call in the class

• Run the unit tests for the class

• Delete the original include file and run *all* unit tests

• Refactor and retest: no globals, no new, use DI, use return not output, no includes

238

Replace Multiple Include

• Copy include code to new method (new or existing class)

• Pick an appropriate name for the method (and class if needed)

• If only method in new class, __invoke() is a good choice

239

Replace Original Include Call

• Find original include call

• Replace with inline instantiation of class and invoke the new method

• Violates rule about “new” keyword; we will address this later

• 051.php

240

Discover Missing Variables• Scope now separated so code breaks: variables not present

• Spot-check or unit test to find missing variables (E_ALL)

• Add to method by reference (exact same vars, reduces changes)

• Pass to the method in the calling code

• Let the tests guide you!

• 052.php241

Replace Other Includes

• Now replace other includes of the same file (use search results)

• Replace in the same way: inline instantiation and invocation

• Might be in classes or non-class files (e.g. view files)

• Spot check and unit test each replacement

242

Delete Include File Itself

• After replacing all calls to the file, delete the file

• Spot check and unit test again

243

Write Test For New Class MethodAnd Refactor

• Write unit test for include code extracted to class method

• Refactor: globals, new, DI, output not return, no includes

• Continuously run unit tests to ensure no breaks

244

Convert Inline Instantiations To Injection

• Find each inline instantiation that replaced the include

• Replace with dependency injection

• Unit test and spot check

245

Followup

• Final spot check and unit test

• Commit, push, notify QA

• Find next include call in classes

246

Can We Extract Many IncludesTo The Same Class In Different Methods?

• Example shows extraction to a single method in a new class

• Reasonable to extract related includes to the same class

• Be sure to use appropriate method names

247

What About Includes In Non-Class Files?

• Not so concerned about includes outside of classes

• View files might have lots of includes, but that’s their nature

248

Review

• Extracted all includes from class files

• Removed one of the last artifacts of legacy architecture

• The internal architecture now looks very good

• Time to look at overall end-to-end architecture elements

249

Separate Public Resources

250

Why Separate?• MVC separation, but entire application still in document rood

• Need special protections on private files

• Errors in web server configuration can expose the application

• Separate public and private resources

• Sets up structure for later refactoring to front controller251

Degenerate Front Controller

• Web server itself acting as router/dispatcher

• Page scripts are mapped directly onto file system

• Web server document root is base of mapping

• Fine for public resources, not for private ones (like config files)

252

Process• Coordinate with operations personnel to communicate intentions

• Create new document root directory in app with temp index file

• Reconfigure server to point to new document root and spot check

• Move public resources to new document root and spot check

• Commit, push, coordinate with operations for QA testing253

Coordinate With Operations Personnel

• The single most important part of the process

• Never make changes that affect server config without doing so

• Tell them our intent (i.e., “changing the document root”)

• Feedback from ops will tell us what we need to do

254

Create A Document Root Directory

• Create it in the existing application directory structure

• Call it something sensible; e.g. “public” or “docroot”

• Add an “index.html” file for spot checking

• 053.php

255

Reconfigure Server• Probably local development server or virtual instance server

• Apache-related .conf file uses DocumentRoot directive

• From DocumentRoot "/path/to/app" …

• … to DocumentRoot "/path/to/app/docroot"

• Reload or restart the server and browse to index.html256

Move Public Resources• Move public resources to docroot/ (page scripts, HTML, CSS, JS, etc)

• Do not move config files, class files, view files, CLI scripts, etc.

• If you should not be able to browse to it, do not make it public

• Modify include paths as necessary as this will change relative paths

• 054.php257

Followup

• Spot check as you move files, and complete spot check at end

• Commit

• Push

• Coordinate with operations for QA deployment and testing

258

Is This Really Necessary?

• Foundational element for the next step in modernization (front controller).

259

Review

• Beginning to refactor the overall architecture

• Created a separation between public and private resources

• New public document root that keeps the application private

260

Decouple URLs From Page Scripts

261

Why Decouple URLs?• Web server acting as combined router and dispatcher to pages

• Routes to page scripts are the document root file system

• Changes to file system mean changes to URLs, and vice versa

• Cannot intercept incoming request before dispatch to page script

• Add a Front Controller with Router as single point of entry262

Process• Coordinate with operations to communicate our intentions.

• Create Front Controller script in document root.

• Create pages/ directory for page scripts, and “not found” controller.

• Reconfigure the web server for URL rewriting and spot check.

• Move page scripts from docroot/ to pages/ and spot check.

• Commit, push, coordinate with operations for QA testing.263

Coordinate With Operations

• The single most important part of the process

• Never make changes that affect server config without doing so

• Tell them our intent (i.e., “enable URL rewriting”)

• Feedback from ops will tell us what we need to do

264

Add A Front Controller Script

• Add script as docroot/front.php

• Uses a “Router” to auto-map public URLs to page script paths

• Router is at http://mlaphp.com/code

• 055.php

265

Create Pages DirectoryAnd Page-Not-Found Script

• Front Controller refers to $pages_dir

• All page scripts will go there instead of document root

• Create top-level pages/ directory to receive the page scripts

• Create pages/not-found.php with related Controller and View

• 056.php266

Reconfigure The Server• Enable URL rewriting via appropriate config file for your web server

• In Apache on Ubuntu, this might be sudo 2enmod rewrite

• Instruct server to point all incoming requests to front controller

• 057.conf

• Reload or restart the server and spot check (not found!)267

Move Page Scripts• Move each page script from docroot/ to pages/ (works in either)

• Move only the page scripts, not other public resources

• Spot check after each move

• May need to change include-path values again as a result

• 058.txt268

Followup

• Commit

• Push

• Coordinate with operations for QA deployment and testing

269

Did We Really Decouple From The File System?

• Just moved from docroot/ to pages/

• URL path still maps to subdirectory path

• Even so, yes, now decoupled

• Intercessory Router can point any URL path to any script

• 059.php270

Review

• Moved page scripts out of document root

• Use front controller to route URL paths to page scripts

• Only 2 steps remain!

271

Remove Repeated Logic In Page Scripts

272

Why Remove Repetition?

• Each of our page scripts now highly repetitive

• Load setup, instantiate dependencies for page controller, invoke page controller, send response (060.php)

• Only part that is different is the individual dependencies

• Move surrounding repeated logic to Front Controller

273

Process

• Modify Front Controller to add setup, controller invocation, and response sending work.

• Modify each page script to remove the setup, controller invocation, and response sending work.

• Spot check, commit, push, and notify QA.

274

Modify Front Controller

• 061.php

• Replaced the “require Router” line with the setup line(s)

• Added lines to invoke controller and send the response

• Having used the same variable names everywhere makes this easy

275

Modify Each Page Script

• Remove the setup, controller invocation, and response sending

• Must apply to all page scripts

• 062.php

276

Followup

• Spot check

• Commit

• Push

• Notify QA

277

What If Setup Is Inconsistent?

• Some page scripts might have different or varying setup work

• Setup has to be identical for extraction to Front Controller

• *Force* the setup work to be the same if needed

• If cannot force it, move all shared setup to Front Controller and leave special case setup in the page script (degenerate but necessary)

278

What If Variable Names Are Inconsistent?

• We have emphasized consistent naming of variables throughout

• However, we all make mistakes and oversights

• Change names as needed to make them consistent with Front Controller expectations

279

Review

• Page scripts now reduced to bare core of dependency creation

• Front Controller handles all surrounding logic

280

Replace Page Scripts With DI Container

281

Why A DI Container?• Page scripts are at a bare core of dependency creation functionality

• Still including page scripts into Front Controller Scope

• Dependency creation functionality is itself somewhat repetitive

• Move all dependency creation logic to its own layer (Container)

• Custom container from http://mlaphp.com/code 282

Process (1/3)

• Add a services.php file to create the Container and its services

• Define a “router” service in the Container

• Modify the Front Controller to include the services.php file and use the “router” service

283

Process (2/3)

• Extract creation logic from each page script to the Container.

• Define a service named for the page script.

• Copy the page script logic into that service, renaming variables as needed.

• Route the URL path for the page script to the container service.

• Spot check, commit, and move on to the next page script.

284

Process (3/3)

• Remove the empty pages/ directory.

• Spot check and unit test

• Commit, push, notify QA

285

Add A DI Container File

• Add a file named “services.php” in the setup files location

• Yes, this is another include, but having a separate file will be organizationally clean

• Populates the container with all existing global variables as properties

$di = new \Mlaphp\Di($GLOBALS);286

Add A Router Service

• 063.php

287

Creating Service Definitions• Convention: use all lower-case names for shared services,

use fully-qualified class names for new instance services

• Service definitions are callables (typically closures) to create objects

• Create *and return* the object from the service definition

• Services are lazy-loaded (created only when called)

• $di->get() for a shared object, $di->newInstance() for a new instance288

Modify The Front ControllerTo Use The Router Service

• 064.php

• If the Container has a service that matches the returned Route name, use the service object

• Otherwise treat the Route name as a pages/ path

• Either way, we get a $controller variable

• Spot check and commit289

Extract Each Page ScriptTo A Container Service

• In services.php, add a $di->set() line with the fully-qualified class name of the page script controller, and an empty closure

• Copy the page script logic into the closure for the service definition

• Change global variables to $di properties

• 065.php

290

Route URL Path To Container Service

• 066.php

• Returns a Container service name now instead of a pages path

291

Spot Check• Browse to the converted URL path and see if it works

• Common errors:

• Failure to replace $var with $di->var

• Failure to return the object from the service definition

• Mismatch between service name and mapped route name292

Followup• Commit

• Replace next page script

• When all are deleted, remove pages/ dir

• Spot-check overall

• Push, notify QA293

How Can We Refine Our Service Definitions?

• When extraction is complete, service definitions long and repetitive

• Can further reduce creation logic by extracting shared services and separating class instance creation logic

• 067.php

294

What If There Are IncludesIn The Page Script?

• Hold your nose and copy them over to the container service

• Look for commonalities in setup and extract them to classes

• The classes themselves can become services if needed

295

How Can We Reduce The Size Of The Services File?

• Container definitions may reach tens or hundreds of definitions

• No way to reduce volume in total, only to organize more effectively

• Reasonable to split definitions across multiple files, and make services.php include those files

296

How Can We Reduce The Size Of The Router Service

• The router contains one entry for each URL in the application

• Reasonable to create a routes.php file that returns an array for the router definition

• 068.php

• Again, no way to reduce volume, only to organize more effectively

297

Review

• Added a Container for creation logic and removed page scripts

• No longer page-based; modernization is complete

• Celebrate!

• But new work still awaits :-)

298

Conclusion

299

From Legacy …

• Page-based, include-oriented, spaghetti mess of globals

• No clear separation of concerns (SQL, MVC, etc)

• No tests

• All resources in the public document root

300

… To Modern• Autoloaded (no more includes)

• Dependency Injected (no more globals or new)

• Unit Tested

• Layer Separated (SQL, MVC)

• Front Controlled (private resources, router, container)301

Still Many Opportunities For Improvement

• Just because it’s modern, doesn’t mean we’re done

• The application will always have need for greater perfection

• The modernization process has revealed many opportunities for us

302

Data Source Layer

• Currently consists of Gateways

• Consider converting to Data Mappers for cleaner object interaction

303

Models

• Currently Transaction Scripts

• Consider Domain Model and/or Service Layer

• Consider DDD to further refine the logic and structure

304

Views

• Still relatively monolithic and include-oriented

• Consider Two Step View for layouts separated from central elements

305

Controllers

• May be handling unrelated actions

• May contain too much logic

• Consider extracting to Model or View layers

306

Model-View-Controlleror Action-Domain-Responder?

• Model-View-Controller itself might not be sufficient

• Consider refactoring further to Action-Domain-Responder

• http://pmjones.io/adr

307

Routing System

• Currently a transitional/temporary state

• Prefer path-param-based URLs instead of query-string-based

• Consider replacing with more powerful router

• http://github.com/auraphp/Aura.Router

308

Dispatcher System

• Front Controller handles dispatching of routes

• Consider separating dispatch from routing

• http://github.com/auraphp/Aura.Dispatcher

309

Dependency Injection Container

• DI Container is still very “manual” in nature

• Consider more powerful Container system

• http://github.com/auraphp/Aura.Di

310

A Better Class Of Problems• We still have problems, but they are *high quality* problems

• We can address them in a much better way than in a legacy system

• We can use Refactoring (Fowler et al) more effectively

• Better separation of code allows us to address independent areas

• Unit tests will keep us from introducing breaks311

Conversion To Framework?

• With improved architecture, can consider moving to a framework

• Easier to determine which parts can go, and which can stay

• Consider the tradeoffs (public support vs higher discipline)

• Cannot judge for you; you will have to determine for yourself

312

Our Own Growth

• We now have a better quality of professional life

• No longer dread working with the codebase

• More productive time, less time frustrated and demoralized

• Our programming practices have improved as well

313

Go Tell Others!

• Continue improving our code and our practices

• Tell others that they need not suffer with legacy codebases

• As code gets better, the more lives get better

• Move on to more interesting and challenging issues

314

Thanks!

• http://leanpub.com/mlaphp/c/zendcon2014

• http://paul-m-jones.com/

• @pmjones

315

top related