Components are like a box of chocolate
A Literature Survey of
Ways to “Know What’s Inside”
a Component
Literature Survey
• Introduction
• Central Topic: Software Credentials
• 4 Related Bodies of Research:
– Property Estimators
– Interaction Effects
– Relevant Trust Research
– Combining Credentials
Motivation: Choosing a Component
• Suppose you need an HTML parser for that snazzy new browser you’re building.
• You have many choices:
– HotSax
– JavaCC
– Jericho
– JTidy
– Microsoft
– MIL
– Mozilla
– TagSoup
Motivation: Choosing a Component
• What information would give you confidence about a component to reuse?
– Knowing that the component creator validated the component against a formal specification?
– Knowing who created the component?
– Knowing how many other people have successfully used the component before?
Motivation: Choosing a Component
• Different people value different information about the product or the people/process who created it.
• This talk is a review of literature about such sources of information.
Motivation: Design as Selection
• Component-selection choices abound:
– Choosing an RDBMS
– Choosing which web services to use
– Choosing whose JavaScript to use
– Choosing whose spreadsheets to use
– …
• Those 80 million end users have this problem when picking applications, too!
A World where Formal Specifications are Few and Far Between
• The burden of formal specs on designers:
– Formal specs require $$$/time.
– Formal specs require skills.
– Formal specs require anticipation.
• Incorporating other people’s knowledge…
– Means that specs must be evolvable
– And they must allow heterogeneous notation
– And allow incompleteness & inconsistency.
Credentials: Central Topic of this Literature Survey
• In 1996, Mary Shaw proposed that components should carry “credentials” (Sha96).
• These credentials would state important facts about extra-functional properties.
• Examples:– “This component is compatible with JDK1.5+.”– “This interface has average latency 100ms.”
Credentials: Format (A Notional Representation)
• Credentials are organized in tables, with one property per row.
Property Name            | Property Value | Confidence | Provenance
Platform                 | JDK1.5+        | High       | Asserted
Avg Latency: IHtmlParser | 100 ms ± 10 ms |            | Verified: JUnit
Code Memory              | 1 MB ± 100 KB  |            | Verified: JUnit
DOM Compliance           | Level 2        | Moderate   | Asserted
Thread-safety            | Apartment      | High       | Verified: Fluid
SLOC                     | 73220          | High       | Verified: Eclipse
...                      | ...            | ...        | ...
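As a concrete illustration, a notional credential table like the one above maps naturally onto a small record type. This is only a sketch; the class and field names below are invented for illustration and are not part of any proposed credential standard.

```python
from dataclasses import dataclass

@dataclass
class Credential:
    """One row of a notional credential table (names are illustrative)."""
    property_name: str
    value: str
    confidence: str   # e.g. "High", "Moderate", or empty if unstated
    provenance: str   # e.g. "Asserted", "Verified: JUnit"

credentials = [
    Credential("Platform", "JDK1.5+", "High", "Asserted"),
    Credential("Thread-safety", "Apartment", "High", "Verified: Fluid"),
    Credential("SLOC", "73220", "High", "Verified: Eclipse"),
]

# A consumer might filter for independently verified facts only,
# ignoring properties that were merely asserted by the vendor.
verified = [c for c in credentials if c.provenance.startswith("Verified")]
```

A tabular encoding like this makes the provenance column queryable, which is the point of the representation: different consumers can weight asserted versus verified rows differently.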
Credentials: 4 Related Bodies of Research
Software Credentials
Property Estimators
Interaction Effects
Trust Research
Combining Credentials
Property Estimators
• Some are estimators (or not-quite-estimators) of correctness:
Estimation Method                              | Value Scale | References
Model checking                                 | Boolean     |
Software auto-test                             | Boolean     | Mar01
Code walk-through                              | Ordinal?    | Bar95, My79
Forms of manual testing / testability analysis | Ordinal?    | My79, Voa95
Mutation-driven testing                        | Ordinal?    | Bar95, Dm93
Model-based qualitative analysis               | Ordinal?    | Reu03
Code metrics                                   | Ratio       | Fen99
OO design metrics                              | Ratio       | Abr96, Bas95, Chi94
Reliability growth models                      | Ratio       | Bar95, Lap90, Pig03
COQUALMO & process metrics                     | Ratio       | Cha99, Har00, Kho00
...                                            | ...         | ...
Property Estimators
• Some methods might make good estimators if we “insert an observer” who could convert the method’s output into a credential table entry.
• More examples:

Estimation Method            | Quality attribute | Value Scale | References
Penetration analysis         | Security          | ?           | Bar95
Covert-channel analysis      | Security          | Ratio       | Bar95
Software fault tree analysis | Safety            | ?           | Bar95, Lev95
Event tree analysis          | Safety            | ?           | Bar95, Lev95
Fault modes/effects analysis | Safety            | ?           | Bar95, Mai95
...                          | ...               | ...         | ...
Property Estimators
• Others are estimators of other quality attributes:

Estimation Method          | Quality attribute | Value Scale | References
Rate monotonic analysis    | Performance       | Ratio       | Kle93
Queuing theory             | Performance       | Ratio       |
Simulation                 | Performance       | Ratio       | My79
Timed logic analysis       | Performance       | Ratio       | Bar95, Jah86
Algorithmic analysis       | Performance       | Ordinal-ish | Cor01
GOMS                       | Usability         | Ratio       | Joh96
Heuristic evaluation       | Usability         | ?           | Nie90
Function point sizing?     | Size              | Ratio       | Pre01
Eval. against maint. cases | Maintainability   | Ratio       | Ben99
ATAM                       | (several)         | (several)   | Kaz00
...                        | ...               | ...         | ...
Credentials: Interaction Effects
Interaction Constraints: Ensembles
• “Constraints” annotate connectors.
– They may specify policies that components must obey when they interact (Dep03).
– They may specify promises that will be true if connectors are used in a certain way (Dep03).
– They may specify components that must be used together in some way as an “ensemble” (Wal01, Sha96).
Interaction Effects: Context
• Many estimators are context-dependent.
– Example: Latency and throughput depend on how code is deployed (e.g.: RAM available).
– Sta01 proposed a component “dossier” that includes test harness code + credentials.
• Many estimators lack confidence bounds.
– The “dossier” approach addresses this, but does not completely solve it.
Credentials: Trust Research
Trust Research: It’s Who You Know…
• Sometimes, the component author’s identity is the best information you have:
– When no relevant estimators exist.
– When you need a “tie-breaker.”
– When convincing business folks.
– When you need a low-cost decision method.
• You also might ask, “Can I trust the person who generated this property estimate?”
Trust Research: Sources of Trust
• Trust derives from several sources:

Source                     | Example                                                                   | References
Existing relationship/role | You trust your company’s lawyer because he’s the one assigned to you.     | Huy04, Jos05, Wil93
Prior performance          | He’s never let you down in the past.                                      | Sab03
Reputation                 | “Everybody” knows he’s a good lawyer.                                     | Huy04, Jos05
References                 | He said to ask Judy if he is good, and Judy says he’s great.              | Huy04
Group membership           | And he’s a member of the Rotary Club.                                     | Jos05, Tad03
Certification              | And the Bar Association said he meets the minimum criteria for lawyering. |
Models of motivations      | He cares about your success… so he can get paid.                          | Ram04
Social context             | Besides, you can always appeal if the lawyer loses the case.              | Jos05
Trust Research: Sources of Trust
• How do you know which software to trust?

Source                     | Example
Existing relationship/role | You trust some HTML parser because it’s the one your company makes.
Prior performance          | The HTML parser component has never let you down in the past.
Reputation                 | “Everybody” knows it’s a great parser… online reviews say so.
References                 | The parser’s advertising literature comes with references, all of whom say it’s a great parser.
Group membership           | And the parser is part of the Mozilla family of tools.
Certification              | And Symantec has certified it as not being malware, so you don’t have to be afraid to install and execute it.
Models of motivations      | Hm… do components have motivation? Well, their authors do, anyway.
Social context             | Hm… do we have software insurance yet? Paul better get on that.
Research on How to Get Trustworthy Data about Trust
• “Reputation”… how do you make sure people aren’t lying about experiences?
– Rater ratings (e.g. Slashdot, Amazon) (Jos05)
• How do you get them to give ratings at all?
– Pay them to rate (e.g. Epinions, About) (Jos05)
• How do you tap into deep, natural language sources of information?
– Mine online reviews (e.g. Amazon) (Hu05)
Credentials: Combining Credentials
Combining Credentials: Ratio Scale
• Simple case:
– Estimate X1, normal error σ1
– Estimate X2, normal error σ2
– Minimum-variance combined estimate:
  X = (X1/σ1² + X2/σ2²) / (1/σ1² + 1/σ2²)
– Combined variance:
  1/σ² = 1/σ1² + 1/σ2²
• Very few estimators have a ratio scale and normally distributed error (e.g.: latency).
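The inverse-variance combination above is easy to sketch in code. The function below assumes two independent, normally distributed estimates; the function name and the sample latency numbers are invented for illustration.

```python
def combine_estimates(x1, s1, x2, s2):
    """Minimum-variance combination of two independent normal estimates.

    x1, x2: the estimates; s1, s2: their standard deviations.
    Returns (combined estimate, combined standard deviation).
    """
    w1, w2 = 1.0 / s1**2, 1.0 / s2**2       # inverse-variance weights
    x = (x1 * w1 + x2 * w2) / (w1 + w2)     # weighted mean
    var = 1.0 / (w1 + w2)                   # 1/sigma^2 = 1/s1^2 + 1/s2^2
    return x, var ** 0.5

# Two latency credentials for the same interface: 100 ± 10 ms and 110 ± 20 ms.
x, s = combine_estimates(100.0, 10.0, 110.0, 20.0)
# The combined estimate is pulled toward the more precise measurement,
# and the combined error is smaller than either input error.
```

Note that the combined variance is always smaller than either input variance, which is why pooling independent credentials is attractive when it is valid.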
Combining Credentials: Weaker Scales
• When scales are ratio and error isn’t normal, you can “invent” a reasonable weight for combining estimators (Huy04).
• More generally, if values have a definite probability distribution, you can use Bayesian methods (Jos05).
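As one concrete Bayesian scheme, the beta reputation model covered in the Jøsang survey (Jos05) treats r positive and s negative past outcomes under a uniform Beta(1, 1) prior, giving expected reliability (r + 1) / (r + s + 2). A minimal sketch (the function name is invented):

```python
def beta_reputation(positive, negative):
    """Expected reliability under a uniform Beta(1, 1) prior.

    With r positive and s negative observed outcomes, the posterior
    mean is (r + 1) / (r + s + 2) -- the beta reputation estimate.
    """
    return (positive + 1) / (positive + negative + 2)

# With no evidence the estimate is 0.5; evidence shifts it toward
# the observed success rate without ever reaching 0 or 1 exactly.
prior = beta_reputation(0, 0)
after = beta_reputation(9, 1)
```

The appeal for credentials is that the same machinery updates smoothly as new field reports about a component arrive.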
Combining Credentials: Very Weak Scales
• Values on interval scales can be averaged, without weighting (Jos05).
  (20°F + 30°F)/2 = 25°F
• Values on an ordinal scale can be summarized with a median.
  {High, Medium, Medium, Low} → Medium
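Both summaries are a few lines with the standard library. This is a sketch; the particular rank ordering assigned to the ordinal labels is an assumption of the example.

```python
import statistics

# Interval scale: an unweighted mean is meaningful.
temps_f = [20.0, 30.0]
mean_temp = statistics.mean(temps_f)

# Ordinal scale: map labels to ranks and take a median over the ranks
# (the upper median is used when the count is even).
ORDER = {"Low": 0, "Medium": 1, "High": 2}
ratings = ["High", "Medium", "Medium", "Low"]
ranks = sorted(ORDER[r] for r in ratings)
median_rank = ranks[len(ranks) // 2]
summary = [name for name, rank in ORDER.items() if rank == median_rank][0]
```

Averaging the ordinal ranks themselves would be statistically dubious, which is exactly why the slide recommends the median for such scales.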
Combining Credentials: Very Weak Scales
• Values on a Boolean scale can be combined using deductive logic.
  (True ∧ ¬False) ∨ (False ∧ True)
• Values on nominal scales can be combined using Dempster’s rule (Jos05).
  {Lots of A, Some B} → A
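Dempster’s rule combines two basic mass assignments over a frame of hypotheses, discarding the mass assigned to conflicting (empty-intersection) pairs and renormalizing. A minimal sketch; the mass values are invented for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule of combination for two basic mass assignments.

    m1, m2 map frozensets of hypotheses to masses that each sum to 1.
    Mass on conflicting (disjoint) pairs is dropped, and the rest is
    renormalized so the result again sums to 1.
    """
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Two sources each report "lots of A, some B" as a mass assignment.
A, B = frozenset({"A"}), frozenset({"B"})
m1 = {A: 0.7, B: 0.3}
m2 = {A: 0.8, B: 0.2}
out = dempster_combine(m1, m2)  # mass concentrates on A
```

When both sources lean the same way, the combined mass concentrates on that hypothesis, matching the {Lots of A, Some B} → A intuition above.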
Combining Credentials: Challenges
• Estimators’ semantics vary… what does it really mean to “combine” them?
– Not a new problem: the FDA needs to “combine” the results of clinical trials to yield assessments.
• Many estimators lack confidence bounds.
• (We’re getting a little ahead of ourselves, anyway, since we don’t yet have widely deployed credentials to combine.)
Conclusion
• We can’t expect specifications to always be available for components.
• “Credentials” offer a way of representing important extra-functional properties.
• Estimators exist for many properties, but estimators with error bounds are lacking.
• We should consider combining analytic properties with person-related information.
References

Abr96 F. Abreu and W. Melo. Evaluating the Impact of Object-Oriented Design on Software Quality. METRICS '96: Proceedings of the 3rd International Symposium on Software Metrics, IEEE Computer Society, 1996, p. 90.
Bar95 M. Barbacci, M. Klein, T. Longstaff, and C. Weinstock. Quality Attributes. Technical Report CMU/SEI-95-TR-021, Software Engineering Institute, Pittsburgh, Pennsylvania 15213, December 1995. See also M. Barbacci, M. Klein, and C. Weinstock. Principles for Evaluating the Quality Attributes of a Software Architecture. Technical Report CMU/SEI-96-TR-036, Software Engineering Institute, Pittsburgh, Pennsylvania 15213, May 1997.
Bas95 V. Basili, L. Briand, and W. Melo. A Validation of Object-Oriented Design Metrics As Quality Indicators. IEEE Transactions on Softw. Eng., Vol. 22, No. 10, 1996, pp. 751-761.
Ben99 O. Bengtsson and J. Bosch. Architecture Level Prediction of Software Maintenance. CSMR '99: Proceedings of the Third European Conference on Software Maintenance and Reengineering, IEEE Computer Society, 1999, p. 139.
Car03 M. Carbone, M. Nielsen, and V. Sassone. A Formal Model for Trust in Dynamic Networks. First International Conference on Software Engineering and Formal Methods (SEFM'03), 2003, p. 54.
Cha99 S. Chulani. COQUALMO (COnstructive QUALity MOdel): A Software Defect Density Prediction Model. Project Control for Software Quality, Shaker Publishing, 1999.
Chi94 S. Chidamber and C. Kemerer. A Metrics Suite for Object Oriented Design. IEEE Transactions on Softw. Eng., Vol. 20, No. 6, 1994, pp. 476-493.
Cor01 T. Cormen, C. Stein, R. Rivest, and C. Leiserson. Introduction To Algorithms. McGraw-Hill Higher Education, 2001.
Dep03 W. DePrince Jr. and C. Hofmeister. Usage Policies for Components. Proceedings of the 6th ICSE Workshop on Component-Based Software Engineering, 2003.
Dm93 R. DeMillo and A. Offutt. Experimental Results From an Automatic Test Case Generator. ACM Transactions on Softw. Eng. Methodol., Vol. 2, No. 2, 1993, pp. 109-127.
Fen99 N. Fenton and M. Neil. A Critique of Software Defect Prediction Models. IEEE Transactions on Softw. Eng., Vol. 25, No. 5, 1999, pp. 675-689.
Gra00 T. Grandison and M. Sloman. A Survey of Trust in Internet Applications. IEEE Communications Surveys and Tutorials, Vol. 3, No. 4, 2000.
Har00 D. Harter, M. Krishnan, and S. Slaughter. Effects of Process Maturity on Quality, Cycle Time, and Effort in Software Product Development. Management Science, Vol. 46, No. 4, April 2000, pp. 451-466.
Hu05 M. Hu and B. Liu. Mining and Summarizing Customer Reviews. KDD '04: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM Press, 2004, pp. 168-177.
Huy04 T. Huynh, N. Jennings, and N. Shadbolt. Developing an Integrated Trust and Reputation Model for Open Multi-Agent Systems. Proceedings of the 7th International Workshop on Trust in Agent Societies, 2004, pp. 65-74.
Jah86 F. Jahanian and A. K. Mok. Safety Analysis of Timing Properties in Real-Time Systems. IEEE Transactions on Softw. Eng., Vol. 12, No. 9, 1986, pp. 890-904.
Joh96 B. John and D. Kieras. The GOMS Family of User Interface Analysis Techniques: Comparison and Contrast. ACM Transactions on Comput.-Hum. Interact., Vol. 3, No. 4, 1996, pp. 320-351.
Jos05 A. Jøsang, R. Ismail, and C. Boyd. A Survey of Trust and Reputation Systems for Online Service Provision. Decision Support Systems, 2005.
Kaz00 R. Kazman, M. Klein, and P. Clements. ATAM: Method for Architecture Evaluation. Technical Report CMU/SEI-2000-TR-004, Software Engineering Institute, Pittsburgh, PA 15213, August 2000.
Kho00 T. Khoshgoftaar, R. Shan, and E. Allen. Using Product, Process, and Execution Metrics To Predict Fault-Prone Software Modules with Classification Trees. High Assurance Systems Engineering, 2000.
Kle93 M. Klein et al. A Practitioners' Handbook for Real-Time Analysis: Guide To Rate Monotonic Analysis for Real-Time Systems, Kluwer Academic Publishers, 1993.
Lap90 J. Laprie, K. Kanoun, C. Beounes, and M. Kaaniche. The KAT (Knowledge-Action-Transformation) Approach To the Modeling and Evaluation of Reliability and Availability Growth. IEEE Transactions on Softw. Eng., Vol. 17, No. 4, April 1991, pp. 370-382.
Lev95 N. Leveson. Safeware: System Safety and Computers, Addison-Wesley, 1995.
Mai95 T. Maier. FMEA and FTA To Support Safe Design of Embedded Software in Safety-Critical Systems. CSR 12th Annual Workshop on Safety and Reliability of Software Based Systems, 1995, pp. 353-367.
Mar01 E. Martins, C. Maria Toyota, and R. Lie Yanagawa. Constructing Self-Testable Software Components. DSN '01: Proceedings of the 2001 International Conference on Dependable Systems and Networks (formerly: FTCS), IEEE Computer Society, 2001, pp. 151-160.
My79 G. Myers, C. Sandler, T. Badgett, and T. Thomas. The Art of Software Testing, John Wiley and Sons, 1979.
Nei96 M. Neil and N. Fenton. Predicting Software Quality Using Bayesian Belief Networks. Proceedings of the 21st Annual Software Engineering Workshop at NASA Goddard Space Flight Centre, December 1996.
Nie90 J. Nielsen and R. Molich. Heuristic Evaluation of User Interfaces. CHI '90: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM Press, 1990, pp. 249-256.
Pig03 M. Pighin and A. Marzona. An Empirical Analysis of Fault Persistence Through Software Releases. Proceedings of the 2003 International Symposium on Empirical Software Engineering, 2003.
Pre01 R. Pressman. Software Engineering: A Practitioner's Approach, McGraw-Hill, New York, NY, 2001.
Ram04 S. Ramchurn, D. Huynh, and N. Jennings. Trust in Multi-Agent Systems. Knowl. Eng. Rev., Vol. 19, No. 1, 2004, pp. 1-25.
Reu03 R. Reussner, H. Schmidt, and I. Poernomo. Reliability Prediction for Component-Based Software Architectures. Journal of Syst. Softw., Vol. 66, No. 3, 2003, pp. 241-252.
Sab03 J. Sabater. Trust and Reputation for Agent Societies. PhD thesis, Universitat Autònoma de Barcelona, 2003.
Sha96 M. Shaw. Truth Vs Knowledge: The Difference Between What a Component Does and What We Know It Does. IWSSD '96: Proceedings of the 8th International Workshop on Software Specification and Design, IEEE Computer Society, 1996, p. 181.
Sin02 M. Singh. Trustworthy Service Composition: Challenges and Research Questions. Proceedings of the Autonomous Agents and Multi-Agent Systems Workshop on Deception, Fraud and Trust in Agent Societies, 2002.
Sta01 J. Stafford and K. Wallnau. Is Third Party Certification Necessary? Proceedings of the 4th ICSE Workshop on Component-Based Software Engineering, 2001.
Tad03 S. Tadelis. Firm Reputation with Hidden Information. Economic Theory, Vol. 21, No. 2-3, March 2003, pp. 635-651.
Voa95 J. Voas and K. Miller. Software Testability: The New Verification. IEEE Softw., Vol. 12, No. 3, 1995, pp. 17-28.
Wal01 K. Wallnau and J. Stafford. Ensembles: Abstractions for a New Class of Design Problem. 27th Euromicro Conference, 2001, p. 48.