www.ischool.drexel.edu
INFO 631 Prof. Glenn Booker
Week 3 – Complexity Metrics and Models
1INFO631 Week 3
Origin
• Complexity metrics were developed by computer scientists and software engineers
• Strongly based on empirical (real world) measurement, with little theory
• Primarily broken into internal and external measures
Internal versus External
• Internal measures describe the complexity within a module (number of decisions, loops, calculations, etc.)
• External measures describe relationships among modules (program or function calls, external file activities, input/output, etc.)
Internal Measures
Internal Product Attributes
• Size measures
  – Input to prediction models
  – Normalizing factor for cost, productivity, etc.
  – Progress during development
• Typically use lines of code (LOC) or function point counts
  – LOC is a better measure for predicting cost and schedule
Lines of Code
• Simple complexity metric, often based on number of executable statements or instruction statements
  – Highest defect rates often occur in small modules
  – Larger modules have a smaller defect rate (if they exist at all) - until too cumbersome
  – Optimum module size ~ 250 lines
Function Points
• Function points help avoid biases due to the programming language(s) used
• Provide a more “fair” basis for comparing different environments
• Focus on how much work the program accomplishes, not how concisely it is expressed
Halstead Metrics
• Also known as Software Science, 1977
• Examine program as compilable “tokens”
• Tokens are either operators (+, -) or operands (variables)
• Derive metrics such as Vocabulary, Length, Volume, Difficulty, etc.
• Not widely used
Data Structure (Halstead)
• Halstead’s η2 - number of distinct operands in a module
  – Operands include: number of variables, number of unique constants, and number of labels
• Operand usage (OU)
  – OU = η2/N2, where N2 is the total number of operand references
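As a rough illustration of the Halstead quantities named above, the sketch below computes vocabulary, length, volume, difficulty, and operand usage from two token lists. The formulas for volume and difficulty are the usual textbook ones (volume = length × log2(vocabulary), difficulty = (η1/2) × (N2/η2)); the function name and the tokenization of the sample statement are assumptions made for this example, not part of the lecture.

```python
import math

def halstead_measures(operators, operands):
    """Return basic Halstead measures for two non-empty token lists."""
    eta1 = len(set(operators))  # distinct operators
    eta2 = len(set(operands))   # distinct operands
    n1 = len(operators)         # total operator occurrences (N1)
    n2 = len(operands)          # total operand references (N2)
    vocabulary = eta1 + eta2
    length = n1 + n2
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": length * math.log2(vocabulary),
        "difficulty": (eta1 / 2) * (n2 / eta2),
        "operand_usage": eta2 / n2,  # OU = eta2 / N2, as defined on the slide
    }

# Hypothetical tokenization of the statement: z = x + x * y
measures = halstead_measures(["=", "+", "*"], ["z", "x", "x", "y"])
print(measures)
```

Here three distinct operands are referenced four times, so OU is 0.75. How a real tool splits source text into operators and operands is language-specific and is not settled by this sketch.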
Software Complexity
• Is a characteristic that influences the resources needed to build and maintain it
• Many different characteristics of software relate to complexity
• These complexity characteristics revolve around the structure of the software
Types of Structural Measures
• Control flow
  – Addresses sequence in which instructions are executed
  – Iteration and looping
• Data flow
  – Follows trail of data as it is created and handled
  – Depicts behavior of data as it interacts with the program
Types of Structural Measures
• Data structure
  – Concerned with organization of data itself
  – Provides information about difficulties in handling data and in defining test cases
Control Flow
• Modeled by directed graphs (control flow graphs)
  – Each node corresponds to a single program statement
  – Arcs (directed edges) indicate flow of control from one statement to another
Control Flow
• Control flow graphs are useful for:
  – Analysis (estimating number of defects)
  – Expressing complexity by a single value
  – Assessing testability and test coverage
Basic Control Constructs
[Figure: control flow graphs for the basic constructs: If A then X; If A then X else Y; While A do X; Repeat X until A; Case A of a1: X1 ... an: Xn. Arcs are labeled t (true) and f (false).]
Cyclomatic Complexity
• McCabe, 1976
• Based on a program’s control flow graph
• Related to number of separate graphable areas, or number of linearly independent paths in the program
• Complexity MC = edges - nodes + 2*(# of connected components, i.e. separate unconnected parts of the graph)
Cyclomatic Complexity
• Complexity under 10 generally desired
• Can also find MC as number of binary decisions (yes/no) plus one
  – Multiple choice decisions with ‘n’ choices count as (n-1) binary decisions
• Ignores differences among specific types of control structures
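The two ways of obtaining cyclomatic complexity can be sketched as follows: one from the graph counts (edges, nodes, connected components), the other from the decision-count shortcut in its standard form (binary decisions plus one). The function names and the small if-then-else example are assumptions made for illustration.

```python
def cyclomatic_from_graph(num_edges, num_nodes, num_components=1):
    """MC = edges - nodes + 2 * (number of connected components)."""
    return num_edges - num_nodes + 2 * num_components

def cyclomatic_from_decisions(choices_per_decision):
    """MC = (binary decisions) + 1; an n-way decision counts as n - 1."""
    binary_decisions = sum(n - 1 for n in choices_per_decision)
    return binary_decisions + 1

# Hypothetical if-then-else: nodes A, X, Y, join; edges A->X, A->Y, X->join, Y->join
mc_graph = cyclomatic_from_graph(num_edges=4, num_nodes=4)
mc_decisions = cyclomatic_from_decisions([2])  # one two-way decision
print(mc_graph, mc_decisions)  # both give 2
```

For a single connected control flow graph the two counts should agree, which makes the shortcut a convenient cross-check.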
Cyclomatic Complexity
• Uses of complexity metric:
  – Identify complex modules needing detailed inspection or redesign
  – Identify simple modules needing minimal inspection and/or testing
  – Estimate programming, testing and maintenance effort
  – Identify potentially troublesome code
Control Flow Representation of Programs
• Software programs can be represented by linear directed segments combined with the basic control flow constructs
• Control flow constructs may be nested, e.g. an IF statement can be inside of a WHILE loop
Control Flow Representation of Programs
• Example: [Figure: control flow graph with elements numbered 1 to 14]
• McCabe cyclomatic complexity (MC) counts the number of linearly independent paths through a program
• MC = # of edges - # of nodes + 2
• Linearly independent paths for the example: <2, 11>, <2, 10, 12, 14>, <2, 10, 12, 13, 12, 14>, <1, 3, 5, 6, 9>, <1, 4, 6, 9>, <1, 4, 6, 7, 8, 9>
Control Flow--Linearly Independent Paths
[Figure: control flow graph with nodes a to g and edges numbered 1 to 10]
MC = edges - nodes + 2 = 10 - 7 + 2 = 5
Set of linearly independent paths:
  b1: abcg
  b2: abcbcg
  b3: abefg
  b4: adefg
  b5: adfg
Any arbitrary path is equal to a linear combination of the linearly independent paths listed above.
For example, path abcbefg is equal to b2 + b3 - b1.
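The linear-combination claim can be checked by treating each path as a vector of edge traversal counts. In the sketch below the edge set is not read from the figure; it is inferred from the five listed paths, which happens to yield 10 edges and 7 nodes, matching the slide. All names are assumptions made for this example.

```python
from collections import Counter

def edge_vector(path):
    """Count how many times a path such as 'abcg' traverses each edge."""
    return Counter(zip(path, path[1:]))

# Basis paths as listed on the slide
basis = {"b1": "abcg", "b2": "abcbcg", "b3": "abefg", "b4": "adefg", "b5": "adfg"}

# Edges and nodes inferred from the basis paths (an assumption, see above)
edges = set().union(*(edge_vector(p) for p in basis.values()))
nodes = set("".join(basis.values()))
mc = len(edges) - len(nodes) + 2

def combine(coefficients):
    """Form a signed sum of basis edge vectors, dropping zero entries."""
    total = Counter()
    for name, coeff in coefficients.items():
        for edge, count in edge_vector(basis[name]).items():
            total[edge] += coeff * count
    return {edge: n for edge, n in total.items() if n != 0}

target = dict(edge_vector("abcbefg"))
matches = combine({"b2": 1, "b3": 1, "b1": -1}) == target
print(mc, matches)
```

The subtraction of b1 cancels the extra a-b, b-c, and c-g traversals contributed by b2, leaving exactly the edges of abcbefg.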
Knots - Control Flow Crossovers
• Knot measure -- total number of points at which control flow lines cross
      IF (TIME) 30,30,10
10    CALL TEMP1
      IF (X1) 20,20,40
20    Y1=Y+1
      Y2=0
      CALL TEMP2
      GO TO 50
30    Z1=1
40    CALL TEMP3
      Z2=Z2+1
50    CALL TEMP4
How many are here?
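One common way to count knots mechanically, assumed here since the slide does not give a formal rule, is to list each transfer of control as a (source line, target line) pair and count the pairs of jumps whose line spans interleave. The jump list below is hypothetical and is not meant to answer the exercise above.

```python
from itertools import combinations

def count_knots(jumps):
    """Count crossing pairs among (source_line, target_line) jumps."""
    def crosses(j, k):
        a, b = sorted(j)
        c, d = sorted(k)
        # Spans interleave: exactly one endpoint of one lies inside the other
        return a < c < b < d or c < a < d < b
    return sum(crosses(j, k) for j, k in combinations(jumps, 2))

# Hypothetical jumps: (1 -> 5) and (3 -> 8) cross; (6 -> 7) is nested in (3 -> 8)
knots = count_knots([(1, 5), (3, 8), (6, 7)])
print(knots)  # 1
```

Nested or disjoint jumps contribute nothing, which is why well-structured code tends toward a knot count of zero.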
Syntactic Constructs
• Examine effect of using specific control structures on defect rate
• Is, by definition, language-specific
• Can result in statistically significant relationships
  – e.g. Lo used this approach to show that DO WHILE should be avoided in COBOL
External Measures
Computational Complexity
• Examines algorithmic efficiency and use of machine resources (memory, I/O, storage)
• Studies quantitative aspects of solutions to computational problems
• Examples may include sorting efficiency for a database, managing I/O constraints across a large scale network, etc.
Psychological Complexity
• Concerned with characteristics of software that affect human performance
  – Injection of defects (when and why does a programmer make errors?)
  – Ease of building the software (effort required)
  – Ease of maintenance (effort required)
Data Structure (Database)
• Database size per program size (DBSPPS)
  – DBSPPS = DBS/PS
    • Where DBS is database size in bytes or characters
    • PS is program size in source instructions
  – Used in COCOMO model as a cost driver
    • Ordinal scale measure derived from DBSPPS
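A minimal sketch of the ratio and of an ordinal rating derived from it is shown below. The thresholds (10, 100, 1000) are the ones commonly quoted for the COCOMO DATA cost driver and should be treated as an assumption here, since the slide does not list them; the function names and sample sizes are also hypothetical.

```python
def dbspps(db_size_bytes, program_size_instructions):
    """Database size per program size: DBS / PS."""
    return db_size_bytes / program_size_instructions

def data_rating(ratio):
    """Map the ratio to an ordinal rating (thresholds are an assumption)."""
    if ratio < 10:
        return "low"
    if ratio < 100:
        return "nominal"
    if ratio < 1000:
        return "high"
    return "very high"

# Hypothetical project: 2,000,000 bytes of data, 10,000 source instructions
ratio = dbspps(2_000_000, 10_000)
print(ratio, data_rating(ratio))
```

The point of the ordinal scale is that only the order of magnitude of the ratio matters to the cost model, not its exact value.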
Fan-in and Fan-out
• Focus is the interaction among code modules
  – Fan-in = # of modules which call a given module
  – Fan-out = # of modules which are called by a given module
• Or, more formally...
Fan-in and Fan-out
• Fan-in of a module is the number of local flows terminating at the module, plus the number of data structures from which info is retrieved by the module
• Fan-out of a module is the number of local flows that emanate from the module, plus the number of data structures (tables, arrays) that are updated by the module
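The simpler call-count version of fan-in and fan-out can be sketched from a list of (caller, callee) pairs. This example deliberately ignores the data-structure terms of the formal definition, and the module names and call list are hypothetical.

```python
from collections import defaultdict

def fan_in_out(calls):
    """Return {module: (fan_in, fan_out)} from (caller, callee) pairs."""
    callers = defaultdict(set)  # callee -> set of modules that call it
    callees = defaultdict(set)  # caller -> set of modules it calls
    for caller, callee in calls:
        callees[caller].add(callee)
        callers[callee].add(caller)
    modules = set(callers) | set(callees)
    return {m: (len(callers[m]), len(callees[m])) for m in modules}

# Hypothetical call list
calls = [
    ("main", "read"), ("main", "compute"), ("main", "report"),
    ("compute", "lookup"), ("report", "lookup"),
]
result = fan_in_out(calls)
print(result["main"], result["lookup"])  # (0, 3) (2, 0)
```

Using sets means repeated calls between the same pair of modules are counted once, which matches the "number of modules" wording of the informal definition.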
Fan-in and Fan-out
• Do fan-in and fan-out affect software quality?
  – Large fan-in modules may be interpolation or look-up routines - no defect correlation
  – Large fan-out often relates to high defect rate - has a high defect correlation
• Are large fan-in and fan-out bad?
Fan-in and Fan-out
• Information flow complexity
  – Henry and Kafura: Size * (fan-in * fan-out)^2
  – Shepperd: (fan-in * fan-out)^2
• Henry and Kafura measure helps predict the number of software maintenance problems
Henry, S. and D. Kafura. 1981. IEEE Transactions on Software Engineering SE-7(5), pp. 510-518.
Shepperd, M. 1990. Software Engineering Journal 5(1) (January), pp. 3-10.
Structure Metrics
• Shepperd measure correlates with software development time
• Information flow metric (Henry & Selig): HC = C * (fan-in * fan-out)^2
  – where C is the cyclomatic complexity
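The three information-flow formulas differ only in the weight placed in front of the squared fan-in times fan-out term, which the sketch below makes explicit. Function names and the sample module values are assumptions for illustration.

```python
def shepperd(fan_in, fan_out):
    """Shepperd: (fan-in * fan-out)^2."""
    return (fan_in * fan_out) ** 2

def henry_kafura(size, fan_in, fan_out):
    """Henry and Kafura: Size * (fan-in * fan-out)^2."""
    return size * shepperd(fan_in, fan_out)

def henry_selig(cyclomatic, fan_in, fan_out):
    """Henry and Selig: HC = C * (fan-in * fan-out)^2."""
    return cyclomatic * shepperd(fan_in, fan_out)

# Hypothetical module: 100 LOC, cyclomatic complexity 4, fan-in 3, fan-out 2
print(shepperd(3, 2), henry_kafura(100, 3, 2), henry_selig(4, 3, 2))
# 36 3600 144
```

Because the flow term is squared, a module with either fan-in or fan-out of zero scores zero on all three measures regardless of its size or internal complexity.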
Structure Metrics
• System complexity (Card & Glass)
  – Based on structural complexity (average fan-out squared) and data complexity (based on number of I/O variables and fan-out)
  – Quantified effect of complexity on error rate
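A hedged sketch of the system complexity calculation follows. It assumes the form usually attributed to Card and Glass: structural complexity is the mean of fan-out squared, data complexity is the mean of (I/O variables) / (fan-out + 1), and system complexity is their sum. The slide gives only the verbal description, so the exact formula and the sample data are assumptions.

```python
def card_glass(modules):
    """modules: non-empty list of (fan_out, io_variables) pairs.

    Returns (structural, data, system) complexity under the assumed formulas.
    """
    n = len(modules)
    structural = sum(f ** 2 for f, _ in modules) / n
    data = sum(v / (f + 1) for f, v in modules) / n
    return structural, data, structural + data

# Hypothetical system of three modules
structural, data, system = card_glass([(2, 6), (0, 4), (1, 2)])
print(structural, data, system)
```

Dividing the I/O variable count by fan-out plus one reflects the idea that a module which delegates more work handles less of its data directly.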
Module Call Graph
• Module - a contiguous sequence of program statements, bounded by boundary elements, having an aggregate identifier
  – Or, a distinct, named group of LOC
• The module call graph shows which modules call each other, and what key information is passed among them
Module Call Graph example
[Figure: module call graph with modules Main, Read_Scores, Find_Ave, Average, and Print_Ave; arcs are labeled with the data passed (scores, average, eof)]
Module Coupling Measures
• Average number of calls per module (ANCPM)
• Fraction of modules that make calls (FMC)
ANCPM = (Number of Interconnections) / (Number of Modules)
FMC = (Number of Modules that call) / (Number of Modules)
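Both coupling ratios can be computed from a module list and a list of (caller, callee) pairs, as in this sketch. It assumes that each distinct caller-callee pair counts as one interconnection; the module names and call list are hypothetical.

```python
def coupling_measures(modules, calls):
    """Return (ANCPM, FMC) for a non-empty module list and its call pairs."""
    interconnections = len(set(calls))            # distinct caller-callee links
    calling_modules = {caller for caller, _ in calls}
    ancpm = interconnections / len(modules)
    fmc = len(calling_modules) / len(modules)
    return ancpm, fmc

# Hypothetical system: 5 modules, 5 call links, 3 modules that make calls
modules = ["main", "read", "compute", "report", "lookup"]
calls = [
    ("main", "read"), ("main", "compute"), ("main", "report"),
    ("compute", "lookup"), ("report", "lookup"),
]
ancpm, fmc = coupling_measures(modules, calls)
print(ancpm, fmc)
```

The module list is passed separately so that modules which neither call nor are called still count in the denominator.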
Information Flow Measures
• Types of information flows
  – Local direct flow
    • Module invokes a 2nd module & passes info to it
    • Invoked module returns result to the caller
  – Local indirect flow
    • Invoked module returns info that is subsequently passed to a second invoked module
  – Global flow
    • Info flows from one module to another via a global data structure
IEEE-STD-982
• Number of Entries and Exits per Module, ‘m’
  – Like fan-in and fan-out
  – m = entries + exits
• Software Science measures
IEEE-STD-982
• Graph-Theoretic Complexity
  – Static Complexity: C = Edges - Nodes + 1
  – Generalized Static Complexity: based on summing resources needed for each module (e.g. storage, access time, etc.)
  – Dynamic Complexity: complexity as it changes over time across a network
IEEE-STD-982
• Cyclomatic complexity
• Minimal Unit Test Case Determination
  – Determine number of independent paths through a module, to get minimum number of test cases for unit testing
• Data or information flow complexity
  – Fan-in and fan-out of variables
IEEE-STD-982
• Design Structure adds weighted (%) average of six parameters:
  1. Whether designed top down (Y/N)
  2. Module inter-dependence
  3. Module dependence on prior processing
  4. Database size (# of elements)
  5. Database compartmentalization
  6. Module single entrance and exit (Y/N)
– Weighting is chosen to meet project needs
Other Measures
• Compiler measures
  – Size (bytes of compiled code)
  – Number of symbols and variables
  – Cross-reference of all labels
  – Statement count
Other Measures
• Configuration Management Library Measures
  – Number of code modules
  – Number of versions of each module
  – History of change dates of each module
  – Module size
  – Number of related documents for each module
Availability Metrics
• Most information systems are critical to day-to-day operations
  – Witness that Google or Blackberry being offline for mere minutes is news
• Availability depends on 1) how often the system goes down, and 2) how long it takes to restore it after a crash
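The two factors above are often combined in the standard steady-state formula availability = MTTF / (MTTF + MTTR), where MTTF is mean time to failure and MTTR is mean time to restore. That formula is not stated on the slide and is used here as an assumption; the sample figures are hypothetical.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability from mean time to failure and to restore."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical system: fails on average every 999 hours, takes 1 hour to restore
a = availability(999, 1)
print(f"{a:.3%}")  # 99.900%
```

The formula shows why both levers matter: halving restore time improves availability about as much as doubling the time between failures.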
Availability Metrics
• Perfect availability (100%) is nice to dream of, but realistically, higher reliability is more expensive
• Often measure availability by the number of 9’s in the desired level of availability
  – Two nines is 99%, three nines is 99.9%, four nines is 99.99%, etc.
  – How many nines can you afford?
Availability Metrics
| No. of 9’s | Availability | Down time per year |
|---|---|---|
| 2 | 99% | 87.6 hours |
| 3 | 99.9% | 8.8 hours |
| 4 | 99.99% | 53 minutes |
| 5 | 99.999% | 5.3 minutes |
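The down time figures follow directly from the availability level: the unavailable fraction multiplied by the hours in a year. The sketch below assumes a 365-day year (8,760 hours); the function name is made up for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

def downtime_hours_per_year(nines):
    """Yearly down time in hours for an availability with the given number of 9's."""
    unavailable_fraction = 10 ** (-nines)  # e.g. 3 nines -> 0.001
    return unavailable_fraction * HOURS_PER_YEAR

for nines in (2, 3, 4, 5):
    hours = downtime_hours_per_year(nines)
    print(nines, round(hours, 2), "hours =", round(hours * 60, 1), "minutes")
```

Each additional nine cuts the allowed down time by a factor of ten, which is why the cost of each step tends to rise sharply.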
Achieving High Availability
• Many techniques are used to help ensure that high levels of availability are possible
  – Duplicate systems (clustering)
  – RAID data duplication
  – Duplicate power supplies
  – Independent power supplies
  – Uninterruptible power supplies (UPS)
Availability and Code Quality
• Capers Jones demonstrated a clear connection between code quality (defect rate) and the corresponding mean time to failure (MTTF), which is a key aspect of availability
  – Consistent methods for measurement and definitions of terms are needed for further refinement
Customer Outage Data
• In order to determine availability, the actual customer-visible system outage time needs to be collected
  – In order to get this data, the customer must place a very high priority on availability
  – This data could be used to identify software components which most reduce availability
Availability
• We also expect that availability for a new system should increase over the first couple of years of its use
• Defect causal analysis can help eliminate the root causes of defects, thereby improving availability