4. Security Assessment and Testing

Course 1: Overview of Secure Programming, Section 4
Pascal Meunier, Ph.D., M.Sc., CISSP
May 2004; updated August 12, 2004
Developed thanks to support and contributions from Symantec Corporation, support from the NSF SFS Capacity Building Program (Award Number 0113725) and the Purdue e-Enterprise Center
Copyright (2004) Purdue Research Foundation. All rights reserved.

Course 1 Learning Plan

Security overview and patching
Public vulnerability databases and resources
Secure software engineering
Security assessment and testing
Shell and environment
Resource management
Trust management

Learning objectives

Understand the challenges and methods for assessing software security.

Security Assessment and Testing: Outline

Security Assessment
– Architecture and Assurance
– Code analysis
    Definitions
    Reviews
    Automated checkers
– Code complexity
– Style
– History of vulnerabilities and patches

Testing

Architecture

If there happened to be a vulnerability in a given part of the code, what would be the consequences?

Design and architecture can mitigate the consequences of a vulnerability
– e.g., correct application of compartmentalization and the principle of least privilege
– Example: chroot in UNIX limits the resources (files and libraries) accessible by a process, so that even if a vulnerability is exploited, damage can be limited
– Use of stored procedures in SQL databases, instead of generating all SQL code dynamically, may limit how the database can be attacked (see code injection vulnerabilities in another unit)
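The last point can be illustrated with a closely related technique. SQLite, used here only because it ships with Python, has no stored procedures, so this sketch substitutes a parameterized query, which similarly keeps user input out of the SQL text. The table, its columns and the `find_user_*` helpers are hypothetical.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [("alice", "a-secret"), ("bob", "b-secret")])
    return conn

def find_user_dynamic(conn, name):
    # UNSAFE: the SQL text is generated dynamically from user input.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def find_user_parameterized(conn, name):
    # The SQL text is fixed; the input is passed separately as data.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]

conn = make_db()
attack = "nobody' OR '1'='1"
dynamic_rows = find_user_dynamic(conn, attack)        # matches every row
safe_rows = find_user_parameterized(conn, attack)     # matches nothing
print(sorted(dynamic_rows), safe_rows)
```

With the dynamically generated statement the attacker's quote characters change the structure of the query; with the fixed statement the same input is only ever compared as a string.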

Question

One of the roles of architecture in secure software engineering is to structure the application or system so that it:

a) Is easier to implement, test and maintain
b) Gives the best performance and code reuse
c) Mitigates exploits

Question

One of the roles of architecture in secure software engineering is to structure the application or system so that it:

a) Is easier to implement, test and maintain
b) Gives the best performance and code reuse
c) Mitigates exploits

Answer: c). (Answer "a" can also help security inasmuch as complexity makes security issues more likely.)

Discussion

Give an example of something else architecture can do to improve the security of applications and systems.

(Instructor: Limit this discussion to less than 10 minutes)

Discussion examples

Organize the code into layers and units (libraries, kernels, headers, objects, etc...) whose security properties are more understandable, systematically guaranteed, or can even be proven

Choose languages that support declarative security restrictions (Java security manager; mechanism external to the code) instead of programmatic restrictions (C/C++)
– That is, if you will actually use those features!

Does the application you are assessing do that?

Assurance

Correcting security problems after implementation has known security limitations
– Cost and time
    Some corrections would require rewriting everything
    A superficial patch is usually applied instead
– "Products claiming security that are created from previous versions without security cannot achieve high trust because they lack the fundamental and structural concepts required for high assurance" (Sullivan 2003)

Other Assurance Considerations

Has security been hacked into the artifact being evaluated, or was it part of the original design and architecture?

Has the artifact been certified with the Common Criteria?

Was it architected and designed using a valid threat model?






Code Analysis

Static: examining the code without executing it
– Examples:
    Design and code reviews
    Syntax and style checkers
    Most automated security auditors
– finds "stupid" coding mistakes

Model Checking: run code within model checker
– check if rules and invariants ("properties") are obeyed
    e.g., all allocated memory must be freed

Security auditors using static analysis or model checking tend to find different types of bugs

Reference: Engler and Musuvathi 2003
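As a minimal sketch of static analysis, the checker below inspects source text with Python's standard `ast` module and never executes it. The rule set (`RISKY_CALLS`), the function name `find_risky_calls` and the sample code are assumptions made up for illustration; real auditors apply far richer rules.

```python
import ast

RISKY_CALLS = {"eval", "exec"}  # assumption: an illustrative rule set

def find_risky_calls(source):
    """Return (line, name) for each direct call to a risky built-in."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in RISKY_CALLS):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)

SAMPLE = '''
def run(user_text):
    return eval(user_text)

def also_run(user_text):
    f = eval
    return f(user_text)
'''

findings = find_risky_calls(SAMPLE)
print(findings)
```

The checker reports the direct call in `run` but misses the aliased call in `also_run`, a small instance of the false negatives defined on the next slide.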

Code Analysis: Definitions

A false positive occurs when an alert is given in the absence of what we want to detect.
– false positives incur the cost of activating response processes

A false negative occurs when no alert is given in the presence of what we want to detect.
– false negatives allow a threat to become real

Question

Given busy locations where the great majority of people are honest citizens, what makes computerized face-recognition systems unusable?

a) False negatives, because criminals can get past them

b) False positives, because the number of alerts (and their cost) is unmanageable

c) They work well enough
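The effect of a mostly honest population on alert volume can be checked with a little arithmetic. The figures used below (100,000 passers-by, 10 people of interest, 99% detection rate, 1% false-alarm rate) are assumptions chosen for illustration, not measurements of any real system.

```python
def alert_breakdown(population, targets, detection_rate, false_positive_rate):
    """Return (true alerts, false alerts, fraction of alerts that are real)."""
    true_alerts = targets * detection_rate
    false_alerts = (population - targets) * false_positive_rate
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision

# Assumed scenario: all four numbers are hypothetical.
true_alerts, false_alerts, precision = alert_breakdown(
    population=100_000, targets=10,
    detection_rate=0.99, false_positive_rate=0.01)

print(f"true alerts:  {true_alerts:.1f}")
print(f"false alerts: {false_alerts:.1f}")
print(f"share of alerts that are real: {precision:.2%}")
```

Under these assumptions roughly a thousand alerts are raised for about ten real matches, so fewer than one alert in a hundred points at a person of interest.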

Code Analysis: Code and Design Reviews

Dependent on skills of reviewers
– Best to have critical, unintimidated reviewers with different skills and perspectives than the authors

With current technology they produce
– Fewest false positives
– Fewest false negatives

Can transfer skills (learning opportunity)
– Starting code reviews early may be cheaper than later

Most effective... (according to Microsoft)
– In 90-minute sessions
– With no more than three people

Costs of Reviews

Can be hard on developer morale and self-esteem

Expensive
– Time consuming
    Focusing on risky code, as determined by a threat analysis, can yield a higher return
– Need setup and followup
– Reviewers may need training

Risks of
– Group-think (reviewers become "yes-men")
– Same-skills reviews ("inbred" code reviews)
– Stale reviews (loss of lateral thinking capability)

Question

What can you do to avoid the dangers of "inbred" code or design reviews?

a) Invite an external participant with different skills

b) Study the genealogy of the code

c) Provide a constructive atmosphere during the review that welcomes and rewards criticism


Code Analysis: Automated Code Checkers

What targets?
– Binaries
– Source code

Users
– Developers
    Most code checkers are designed to be used by developers, and some require annotations in the source code
– Third parties
    Third-party verification is difficult

Theory: Undecidable problem
– Program analysis is undecidable in the general case
– Which approximations were made to make the problem tractable?

Automated Code Checkers

Hot area
– Before 2003, their usefulness for third-party security auditing was very low (Heffley and Meunier 2003)
– Many new players in 2003-2004
– Significant advances in lowering false positives and false negatives
– Becoming a good way to catch security bugs
    Some vulnerabilities will never be caught because they result from interactions that can't be deduced from the code or model

Ultimately all results from automated checkers require manual review
– but much cheaper than full code reviews






Code Complexity

Premise: the more complex the code (loops, branches, entry points, etc.), the greater the chance of vulnerabilities

Many complexity metrics are available and used for software quality

Practice: vulnerabilities correlate with defects, as security bugs are defects, but it is difficult to find metrics that specifically relate to security bugs
– Indirect relationship: causality of each metric vs vulnerabilities is unproven
    Little-explored research area (one Ph.D. student at CERIAS is working on this problem)
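To make "complexity metric" concrete, here is a rough sketch that counts branching constructs per function, an approximation in the spirit of cyclomatic complexity. The node list `BRANCH_NODES`, the function `branch_complexity` and the sample code are assumptions for illustration; the score says nothing by itself about whether a function is vulnerable.

```python
import ast

# Assumption: these node types are treated as "one more path" each.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def branch_complexity(source):
    """Map each function name to 1 + the number of branching constructs in it."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(child, BRANCH_NODES)
                           for child in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores

SAMPLE = '''
def simple(x):
    return x + 1

def tangled(items, limit):
    total = 0
    for item in items:
        if item > limit and item % 2 == 0:
            total += item
        elif item < 0:
            break
    return total
'''

scores = branch_complexity(SAMPLE)
print(scores)
```

A reviewer with limited time might use such a score only to decide which functions to read first.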






Style: Cognitive Science

All human dialects can be proper languages

However, people using different dialects have difficulty understanding each other and tend to judge the other as "wrong"

Is everyone in your team or company using the same dialect?

Style

There isn’t an absolutely wrong or right coding style; however, to reduce the occurrence of bugs, code should be produced to be easily understandable:
– by as many people as possible
– as quickly as possible
– with the least chance of misunderstandings

Maintaining a consistent coding style throughout a project also speeds up code reviews.

Ambiguous Coding Style

In PHP, some functions mean two completely different things when returning false or zero
– strpos
    returns the position of a substring in a string
    returns false if the substring is not found!
– How do you know if the return value is 0 or false?
    "===" checks type
– A "C" programmer starting to code in PHP might use constructs such as:
    if (!$a)
– and get unexpected results!

Ambiguous Coding Style

if (!$a) ... Did the author mean:
– if ($a === false)
– if ($a === 0)
– if ($a === NULL)

if ($b == "") ... Did the author mean:
– if ($b === "")
– if ($b === NULL)

Are there ambiguities in the code you are assessing?
– Ambiguities lead to vulnerabilities
– "mean what you write and write what you mean"
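The same kind of ambiguity exists outside PHP. As a hedged analogy in Python, a bare truthiness test conflates several distinct values, while an explicit comparison states which one the author meant. The two helper functions are hypothetical.

```python
def is_missing_ambiguous(value):
    # Truthiness test: true for None, False, 0 and "" alike.
    return not value

def is_missing_explicit(value):
    # States the intent: only None means "no value was supplied".
    return value is None

candidates = [None, False, 0, ""]
ambiguous = [is_missing_ambiguous(v) for v in candidates]
explicit = [is_missing_explicit(v) for v in candidates]

for value, a, e in zip(candidates, ambiguous, explicit):
    print(f"{value!r:>6}  not value: {a!s:<5}  value is None: {e}")
```

When assessing code, each truthiness test on a value that can legitimately be zero or empty is worth a second look.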

Exercise

In Linux pppoe.c:
    po = po->next;
    if (!po) {
        int hash = hash_item(po->pppoe_pa.sid, ...)

Patch 2.5.72:
    if (po->next)
        po = po->next;
    else {
        int hash = hash_item(po->pppoe_pa.sid, ...)

The author meant "if the new po is NULL, use the hash on the old po" but the old po had been replaced by a NULL!

How could you have written the original statement to make the error more obvious?

(exercise inspired by Hallem, Park and Engler 2003)

Exercise Sample Answer

One way to read "!po" involves a double negation
– "if not not null"
– double negations are notorious sources of confusion
    usually one of the negations is missed and the code would be read, "if not null"

For many people, the following code is clearer (and highlights the error):
    if (po == NULL) {
        int hash = hash_item(po->pppoe_pa.sid, ...);

Exercise Alternate Answer

"This is a case of variable reuse in the wrong context"

The code could have been written like this, which would have avoided the error altogether:
    p2 = po->next;
    if (!p2) {
        int hash = hash_item(po->pppoe_pa.sid, ...);

Style: Conclusion

It may be difficult to assess someone else's code due to different styles
– Differences may lead to an undeserved perception of sloppiness
– Look for ambiguities and assumptions

Adopt a style as standard and stick to it consistently throughout a project

A new individual may need training for your coding style






History of Vulnerabilities and Patches

Premise: The development process that produced an artifact with vulnerabilities is similar to the one that produced other artifacts from the same company or unit.
– If one product has vulnerabilities, it is likely that others do
– If a product has vulnerabilities, it is likely that it contains more
    Patches were narrowly developed, or were developed using the same processes that let the vulnerabilities through originally
– Statistical hypothesis: the likelihood that an artifact has many vulnerabilities is higher if some have already been found
    "Bugs cluster"




Testing

Testing is a costly part of development efforts

How are these testing efforts used in secure programming?
– Scenario testing
– Specification testing
– In-line testing
– Unit testing
– Integration testing
– Human interaction testing
– Final testing
– Acceptance testing

Scenario Testing

Used to ensure that requirements capture is complete and consistent

Validation effort

Example: Abuse and Misuse cases (UML)
– If I were a hacker, what would I try?
    Identify trust relationships and attack them
    – see unit on trust management
    Which resources are open to me? How can they be abused?
    – see unit on resource management
    Are there insecure defaults or things that could easily be misconfigured or badly coded?
    – see unit on operations management and best practices

Question

Scenario testing requires from the tester:

a) A solid understanding of the programming language
b) An imagination and inventiveness for possible risks
c) Experience being a hacker

Question

Scenario testing requires from the tester:

a) A solid understanding of the programming language
b) An imagination and inventiveness for possible risks
c) Experience being a cracker and "owning" hosts

Specification Testing

Prove the completeness and correctness of the specification
– Formal proof
– Symbolic execution

This is usually done only for high-assurance efforts and is out of the scope of this tutorial

Many networking protocols lacked specification testing for security and are riddled with vulnerabilities
– See track 3, "Network security in depth"
– Making your own (secure) communication protocols is not trivial!

In-line Testing

Execution testing
– e.g., ASSERT

Verify that the internal state of the software is:
– allowed (security policies)
– expected
– self-consistent

Verify invariants and rules
– "Manual" model checking
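A minimal sketch of in-line testing follows, assuming a hypothetical `BoundedBuffer` class. An `assert` verifies that the internal state stays self-consistent, while an ordinary runtime check enforces the policy on input length.

```python
class BoundedBuffer:
    """Fixed-capacity byte buffer (hypothetical example class)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = bytearray()
        self._check_invariants()

    def _check_invariants(self):
        # In-line test: the stored length must never exceed the capacity.
        assert 0 <= len(self.data) <= self.capacity, "buffer length out of bounds"

    def append(self, chunk):
        # Policy check: compare the input length with the space left.
        if len(chunk) > self.capacity - len(self.data):
            raise ValueError("chunk does not fit in buffer")
        self.data.extend(chunk)
        self._check_invariants()

buf = BoundedBuffer(capacity=8)
buf.append(b"12345")
try:
    buf.append(b"6789")          # 5 + 4 bytes would exceed the capacity of 8
    overflow_rejected = False
except ValueError:
    overflow_rejected = True
print(overflow_rejected, bytes(buf.data))
```

The design choice here is to keep the security-relevant length check as a normal `if`, so that it survives even when assertions are disabled, and to reserve `assert` for conditions that should be impossible.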

Disadvantages of Inline Testing

Usually turned off in deployed applications (e.g., Java)
– Verifying invariants can incur significant overhead
– Errors may not be user-friendly
– Data integrity could be compromised

Need to cover as much code as possible
– "No run == no bug" (D. Engler)
– Manual placement of ASSERTs and tests may miss violations

Some rules and assumptions are difficult to test

Won't help if execution is transferred successfully to malicious code
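The first disadvantage can be demonstrated in Python, used here as an analogy to the Java case: compiling with an optimization level of 1 (what `python -O` does) removes `assert` statements. The `withdraw` function and the `run` helper are hypothetical.

```python
SOURCE = '''
def withdraw(balance, amount):
    assert amount <= balance, "overdraft"
    return balance - amount

result = withdraw(10, 25)
'''

def run(source, optimize):
    """Execute source compiled at the given optimization level.

    Returns the resulting namespace, or the AssertionError if one is raised.
    """
    namespace = {}
    try:
        exec(compile(source, "<inline-test-demo>", "exec", optimize=optimize),
             namespace)
    except AssertionError as error:
        return error
    return namespace

checked = run(SOURCE, optimize=0)     # assertions kept
unchecked = run(SOURCE, optimize=1)   # assertions stripped, as with "python -O"

print(type(checked).__name__)
print(unchecked["result"])
```

With assertions stripped, the overdraft goes through silently, which is why a check that enforces a security policy should not live only in an assertion.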

Question

Inline testing is useful for security because it:

a) Allows the verification of invariants
b) Is usually turned off in the final product
c) Happens earlier than other forms of testing

Question

Inline testing is useful for security because it:

a) Allows the verification of invariants
b) Is usually turned off in the final product
c) Happens earlier than other forms of testing

Comment: c) is indirectly true in the sense that it's better to catch bugs early. However the primary reason is a).

Exercises

Can you give examples of security-related inline tests?

Can you give an example of an invariant that is difficult to code into an inline test?

Exercises

Can you give examples of security-related inline tests?
– Verify input length or index variable vs buffer length

Can you give an example of an invariant that is difficult to code into an inline test?
– That all allocated memory has been freed

Unit Testing

Local testing of functions, etc.

Good time to test against buffer overflows and other low-level coding mistakes
– see track 2, "Secure programming practices and vulnerabilities"
    C programming language issues
    Code injection, canonicalization issues
    Securing the file system
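As a sketch of unit testing aimed at boundary mistakes, the tests below probe a hypothetical `read_field` function at and just past its limits, using Python's standard `unittest` module. The function, the packet contents and the test names are assumptions for illustration.

```python
import unittest

def read_field(packet, offset, length):
    """Return packet[offset:offset+length], rejecting out-of-range requests."""
    # Without this check a Python slice would silently return short data.
    if offset < 0 or length < 0 or offset + length > len(packet):
        raise ValueError("field outside packet bounds")
    return packet[offset:offset + length]

class ReadFieldBoundaryTests(unittest.TestCase):
    PACKET = b"ABCDEFGH"

    def test_exact_fit_is_allowed(self):
        self.assertEqual(read_field(self.PACKET, 4, 4), b"EFGH")

    def test_one_past_the_end_is_rejected(self):
        with self.assertRaises(ValueError):
            read_field(self.PACKET, 4, 5)

    def test_negative_offset_is_rejected(self):
        with self.assertRaises(ValueError):
            read_field(self.PACKET, -1, 2)

    def test_empty_field_at_end_is_allowed(self):
        self.assertEqual(read_field(self.PACKET, 8, 0), b"")

def run_tests():
    """Run the boundary tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReadFieldBoundaryTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes the four cases; the pairing of "exact fit" with "one past the end" is the part most likely to expose an off-by-one error.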

Human Interaction Testing

Does the UI lead users to make the correct security decisions?

Does the UI properly and effectively convey security information to the user?

Are error messages informative and pertinent?
– Click-through syndrome

Resilience to User Interface Abuse
– Phishing, social engineering
– Colored text so human-readable message is different from machine interpretation
– Obscured links, file names

Integration Testing

Interfaces, linking & loading

Test for discrepancies in assumptions and security models
– who was supposed to do authentication and access control?

Security testing
– Ensure that all operations that should be denied, are denied
– Verify that partial accesses can only access what they should
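One way to check that everything that should be denied is denied is to enumerate every role and operation pair and compare the outcome against an explicit policy. The sketch below assumes a hypothetical policy table (`ALLOWED`) and a hypothetical access check (`is_permitted`) standing in for the integrated system.

```python
from itertools import product

# Hypothetical policy: the only (role, operation) pairs that should succeed.
ALLOWED = {("admin", "read"), ("admin", "write"), ("admin", "delete"),
           ("editor", "read"), ("editor", "write"),
           ("guest", "read")}
ROLES = ["admin", "editor", "guest", "anonymous"]
OPERATIONS = ["read", "write", "delete"]

def is_permitted(role, operation):
    """System under test (hypothetical): default-deny access check."""
    permissions = {
        "admin": {"read", "write", "delete"},
        "editor": {"read", "write"},
        "guest": {"read"},
    }
    return operation in permissions.get(role, set())

def policy_violations():
    """Try every role/operation pair; report disagreements with the policy."""
    violations = []
    for role, operation in product(ROLES, OPERATIONS):
        expected = (role, operation) in ALLOWED
        if is_permitted(role, operation) != expected:
            violations.append((role, operation, expected))
    return violations

violations = policy_violations()
print(violations)
```

Testing the full cross product, including a role the system has never heard of, covers the denied cases that tests written from the feature list tend to skip.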

Acceptance Testing

Customer validation step (ET - external test in beta)

Should include tests for likely threats
– Also, flaw-hypothesis testing
    Make hypotheses as to where flaws are likely and try to expose them
– Customer may not possess necessary skills!
    Customer trusts reviewer
    – Reviewer responsibility to find flaws
    – Is the reviewer hamstrung by an "obligation" to be nice to the vendor providing the software for review?

Supposedly well-intentioned researchers or hackers might do so "for the public's sake"

Final Testing

Test cases run against entire project

Fault injection
– Random input
    What happens if input is nonsense?
    – Denial of service (crashes), etc.
        » may indicate possible compromises (buffer overflows)
    Bug isolation can be difficult
    – Binary search:
        » Bug is in X sequence of inputs, but not in first half nor second half
        » Depends on internal state of project
        » State of system unknown
– Grammar-generated input
    Exercise all "legal" kinds of inputs
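Random-input fault injection can be sketched in a few lines. The parser below is hypothetical and contains a deliberate bug (it trusts a declared length); the harness feeds it pseudo-random bytes from a seeded generator and records every failure other than a clean rejection.

```python
import random

def parse_record(data):
    """Hypothetical parser: first byte declares the payload length."""
    if len(data) < 1:
        raise ValueError("empty record")      # clean rejection
    declared = data[0]
    payload = data[1:]
    checksum = 0
    for i in range(declared):  # BUG: declared is never compared with len(payload)
        checksum ^= payload[i]
    return checksum

def fuzz(target, runs, seed, max_len=8):
    """Call target with pseudo-random byte strings; collect unexpected errors."""
    rng = random.Random(seed)                 # seeded so a run can be replayed
    crashes = []
    for _ in range(runs):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass                              # input was rejected cleanly
        except Exception as error:            # anything else counts as a crash
            crashes.append((data, type(error).__name__))
    return crashes

crashes = fuzz(parse_record, runs=200, seed=1234)
print(len(crashes), "crashing inputs found")
```

In a language without bounds checking the same bug would read past the end of the buffer instead of raising an exception, which is why crashes under nonsense input deserve a security review.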

Question

Describe limitations of using random input for final (security) testing.

Question Sample Answers

Describe limitations of using random input for final (security) testing.
– Not all execution paths are tested
– Pseudo-random input is not completely random
– It may be difficult to identify and locate the bug, because random input long before a crash may have produced a special internal state of the software
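The last limitation can be made concrete with a hypothetical stateful target in which an early "arm" command sets hidden state and a much later "fire" command crashes. Neither half of the recorded input reproduces the crash, so a plain binary search fails; the sketch instead uses a simplified reduction loop, in the spirit of delta debugging, that drops chunks of input as long as the crash still reproduces.

```python
def replay_crashes(sequence):
    """Hypothetical target: returns True if replaying the sequence crashes."""
    armed = False
    for command in sequence:
        if command == "arm":
            armed = True                  # hidden internal state
        elif command == "reset":
            armed = False
        elif command == "fire" and armed:
            return True                   # crash depends on the earlier state
    return False

def reduce_inputs(inputs, fails):
    """Shrink a failing input sequence while fails(sequence) stays true."""
    current = list(inputs)
    chunk = len(current) // 2
    while chunk >= 1:
        start = 0
        while start < len(current):
            candidate = current[:start] + current[start + chunk:]
            if candidate and fails(candidate):
                current = candidate       # chunk was irrelevant; drop it
            else:
                start += chunk            # chunk is needed; keep it, move on
        chunk //= 2
    return current

recorded = ["noop", "arm", "noop", "noop", "noop", "noop", "fire", "noop"]
first_half, second_half = recorded[:4], recorded[4:]
minimal = reduce_inputs(recorded, replay_crashes)
print(replay_crashes(first_half), replay_crashes(second_half), minimal)
```

The reduction only works because the whole sequence can be replayed from a known starting state; when the state of the system is unknown, even this is not possible.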

Question

In black box testing, the tester has no knowledge of the internal workings of the compiled, final artifact. Which kind of testing CAN'T you do in black box testing?

a) Final testing
b) Acceptance testing
c) Inline testing

About These Slides

You are free to copy, distribute, display, and perform the work; and to make derivative works, under the following conditions.
– You must give the original author and other contributors credit
– The work will be used for personal or non-commercial educational uses only, and not for commercial activities and purposes
– For any reuse or distribution, you must make clear to others the terms of use for this work
– Derivative works must retain and be subject to the same conditions, and contain a note identifying the new contributor(s) and date of modification
– For other uses please contact the Purdue Office of Technology Commercialization.

Developed thanks to the support of Symantec Corporation

Pascal [email protected]:Jared Robinson, Alan Krassowski, Craig Ozancin, Tim Brown, Wes Higaki, Melissa Dark, Chris Clifton, Gustavo Rodriguez-Rivera
