structuresubmit step by step tutorial - ec.europa.euec.europa.eu/eurostat/cros/system/files/final...

57
BNSI Bulgarian National Statistical Institute ESSNET on SDMX II WP4 “Harmonization of existing toolsSDMX RI Source Code Analysis – Java Modules: SDMX Information Model Query Parser Version 1.3

Upload: lyhuong

Post on 20-Sep-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

BNSI Bulgarian National Statistical Institute

ESSNET on SDMX II

WP4 “Harmonization of existing tools”

SDMX RI Source Code Analysis – JavaModules:

SDMX Information ModelQuery Parser

Version 1.3

September 2012

Type of Document Project deliverable

Reference: SDMX RI Source Code Analysis – Java, version 1.3

Issue: Revision: 6 Status: Final

Created by: Emil Hurmuzov (EH) Date: 15 Nov 2011

Updated by: Emil Hurmuzov (EH) 10 May 2012

Updated by: Emil Hurmuzov (EH) 10 Sept 2012

Approved by:

Document Change RecordIssue/

RevisionDate Change

1.1 15 Nov 2011 Draft first version

1.2 10 May 2012 Updated version, submitted to ESTAT for review

1.3 19 Sep 2012 Final version

Table of contents Page

1 Introduction................................................................................................................................... 51.1 Purpose........................................................................................................................................ 51.2 Scope........................................................................................................................................... 51.3 References................................................................................................................................... 5

2 Code Quality Analysis................................................................................................................62.1 Introduction................................................................................................................................... 62.2 Quality characteristics.................................................................................................................. 62.3 Automated Code Analysis............................................................................................................6

2.3.1 Static Code Analysis...........................................................................................................62.3.2 Tools................................................................................................................................... 82.3.3 Rules.................................................................................................................................. 92.3.4 Results – SDMX Information Model..................................................................................112.3.5 Results – Query Parser....................................................................................................242.3.6 Conclusions and Recommendations (and impact of Refactoring)....................................41

2.4 Manual Code Analysis................................................................................................................412.4.1 Results – SDMX Information Model..................................................................................412.4.1.1 Wrong import class (com.agilis.sdmx.model.structure.CodeRefBean.java, line 11)....412.4.1.2 Local Maven repository in the pom.xml (pom.xml, line 129)........................................412.4.1.3 Non-UTF-8 symbols in the header of all java class files..............................................422.4.2 Results – Query Parser....................................................................................................422.4.2.1 Using “main” class for unit testing (QueryParser.java, line 464)..................................422.4.2.2 Missing test data (QueryParser.java, line 468)............................................................432.4.2.3 Incorrect Utility class (QueryDefs.java)........................................................................432.4.2.4 Direct use of String values (Magic values) (QueryParser.java)...................................442.4.3 Conclusions and Recommendations (and impact of Refactoring)....................................44

3 Final Words................................................................................................................................ 45

3

1 Introduction

1.1 PurposeThe goal of this document is to provide information about the quality of the two modules from SDMX Reference Infrastructure – SDMX Information Model version 1.4.1 and Query Parser 1.0.1 and to provide some recommendation on the way of refactoring particular problems in the source code.

This information could be used to increase the quality of the code and reducing the risk of potential errors when these modules are using in a production deployments in National Statistics.

1.2 ScopeThe document provides analysis information only for two modules from SDMX Reference Infrastructure - SDMX Information Model version 1.4.1 and Query Parser 1.0.1.

The analysis is focused only on improvement of the existing implementation of two modules but not to suggest another approach for design or environment.

1.3 References

Reference Document/Resource Name

SDMX Standard http://sdmx.org

Java Programming Style Java Programming Style Guidelines

NSIWebServices Java http://circa.europa.eu/Public/irc/dsis/stne/library?l=/x-dis/tools/reference_architecture/nsi_web_service/java/2011/nsiwebservice_20110715zi/_EN_1.0_&a=d

SDMX Information Model version 1.4.1

sdmx_model module from above application

SDMX Query Parser http://circa.europa.eu/Public/irc/dsis/stne/library?l=/x-dis/tools/reference_architecture/old_versions/infrastructure_providing_1/2010/20100927/downloadzip/_EN_1.0_&a=d

4

2 Code Quality Analysis

2.1 IntroductionTwo general methods are used in analysing the quality of the code for selected modules.

The first method is automated check by using a specialized software tool (Sonar in this case). The results are summarized in a table and later for each major type of errors/alerts are provided an explanation of the problem and suggested correction. This can be a good starting point for refactoring the code.

The second method is “manually” looking into the code, trying to build, deploy and test the NSIWebService 2.3.0 application which contains the two selected modules. The results of this test can be useful for improving the distribution of modules to National Statistics or third parties of developers.

2.2 Quality characteristicsThe both modules - SDMX Information Model version 1.4.1 and Query Parser 1.0.1 are pretty simple and very specialized.

SDMX Information Model is a simple POJO representation of the SDMX Model from SDMX Standard. This module does not provide any functionality and is used only to transfer SDMX objects between other SDMX modules. That’s why there are no way to test it for scalability, reliability and performance.

Query Parser is a just “one-class” module and its functionality is to parse the incoming XML messages. It uses a SAX based parser which is stream based parsing and provides a better resource utilization than the DOM based parsers. The simplicity of the module does not allow it to influence the performance of the application.

2.3 Automated Code Analysis

2.3.1 Static Code AnalysisThe term “static analysis” has indeed been around for decades. The primary purpose of

the first-generation tools such as lint was to do stronger type checking, but they could

also do some style checking and lightweight bug-finding. Nowadays there are three

classes of tool:

◦ Lightweight tools like lint, and others that find violations of coding standards

are in one class. They typically use superficial techniques that examine the

syntax of the program to find issues. A common complaint about them is their

excessive rate of false alarms and that the warnings they issue do not correlate

very well with real defects.

◦ Advanced static-analysis tools such as Sonar are bug hunters—their primary

purpose is to find serious programming defects.

5

◦ Tools that support formal methods can prove the absence of certain classes of

error. These are heavyweight tools that require programs conform to a strict

subset of the language, so have limited applicability and are not widely used.

There is of course some overlap between these classes. The lightweight tools can be

useful if there is a requirement for the code to conform to coding standards; they can

also find some kinds of bugs, albeit not nearly as effectively as the other classes of

tools. Similarly the advanced tools can also be used to check lightweight properties.

Advanced static-analysis tools, or bug hunters are those whose primary purpose is to

find serious bugs. It is wrong to state that these tools are little different from those that

came before.

They are simultaneously inter procedural, flow sensitive, context sensitive, whole

program, and path sensitive. They scale to tens of millions of lines of code and yield

results with a low level of false positives.

These techniques give them the capability to find deep semantic errors in huge code

bases. There is no contradiction in observing that they can also subsume the

capabilities of the earlier generation of tools. The commercial success and wide

adoption of these tools indicates that many users find that these tools deliver real

value.

Testing is very good at finding ways in which the software does not meet its

requirements. However testing is only as good as the test cases, and it is expensive to

create test cases and time-consuming to execute them. It is not uncommon for half of a

software development effort to be consumed by testing activities.

The reason advanced static-analysis tools are successful is that they change the

economics for several reasons:

◦ They can find bugs early in the development cycle, and bugs found earlier are less expensive to fix.

◦ They do not require program inputs, so bugs can be found and eliminated without incurring the expense of developing test cases.

◦ They can make it easier and less expensive to develop dynamic test cases. The consequence is they can eliminate more bugs for less expense.

Static analysis can be used as soon as the code can be compiled. The code does not

have to be complete, or integrated with the rest of the program in order for the

technique to be able to begin to find bugs. It can be run on a single file at a time, or a

6

complete codebase, or anything in between. The results are better when the entire

program is analyzed, but an analysis of small parts can also be useful. Thus developers

can get very quick feedback on their code quality.

2.3.2 ToolsSonar is an open platform to manage code quality. As such, it covers the 7 axes of code

quality:

Java language is built in. Open Source and commercial plugins enable to cover C, C#,

Flex, Natural, PHP, PL/SQL, Cobol and Visual Basic 6.

An open source code quality management software, combines the expertise of Check

Style, Find Bugs and PMD as well as provides a graphical way of analyzing and

reporting code quality.

Link to installation instructions:

http://docs.codehaus.org/display/SONAR/Install+Sonar

There are several ways Sonar analysis can be performed:

7

▪ using the Maven plugin

▪ using the Ant Task

▪ using the Java Runner

▪ using a CI engine

We’ll start with using the Maven Plugin. The final goal is to build a complete

development environment – Version control (Subversion), Continuous Integration

Server (Hudson / Jenkins), Code Analysis (Sonar).

2.3.3 RulesThe rules are classified according to the type of irregularity in the code – conventions

rules (Checkstyle), bad practices (PMD) and potential bugs (FindBugs). The Sonar

way rule set checks for java language conventions and bad coding practices. Every

rule has a severity level according to a classification of violations – blocker, critical,

major, minor, info.

The following 30 conventions rules were used in the source code analysis:

Anon Inner Length Checks for long anonymous inner classes.Boolean Expression ComplexityRestricts nested boolean operators (&&, || and ^) to a specified depth (default = 3). Constant Name Checks that constant names conform to the specified format Cyclomatic Complexity Checks cyclomatic complexity of methods against a specified limit. The complexity is measured by the number of if, while, do, for, ?:, catch, switch, case statements, and operators && and || (plus one) in the body of a constructor, method, static initializer, or instance initializer. It is a measure of the minimum number of possible paths through the source and therefore the number of required tests. Generally 1-4 is considered good, 5-7 ok, 8-10 consider re-factoring, and 11+ re-factor now ! Default Comes Last Check that the default is after all the cases in a switch statement. Double Checked Locking Detect the double-checked locking idiom, a technique that tries to avoid synchronization overhead but is incorrect because of subtle artifacts of the java memory model. Empty Statement Detects empty statements (standalone ';'). Equals Hash Code Checks that classes that override equals() also override hashCode(). Final Class Checks that class which has only private constructors is declared as final. Hidden Field Checks that a local variable or a parameter does not shadow a field that is defined in the same class. Hide Utility Class Constructor

8

Make sure that utility classes (classes that contain only static methods) do not have a public constructor. Illegal Throws Throwing java.lang.Error or java.lang.RuntimeException is almost never acceptable. Inner Assignment Checks for assignments in subexpressions, such as in String s = Integer.toString(i = 2);.Local Final Variable Name Checks that local final variable names, including catch parameters, conform to the specified format Local Variable Name Checks that local, non-final variable names conform to the specified format Magic Number Checks for magic numbers. Member name Checks that name of non-static fields conform to the specified format Method Name Checks that method names conform to the specified format Modifier Order Checks that the order of modifiers conforms to the suggestions in the Java Language specification, sections 8.1.1, 8.3.1 and 8.4.3. The correct order is : public, protected, private, abstract, static, final, transient, volatile, synchronized, native, strictfp. Package name Checks that package names conform to the specified format. The default value of format has been chosen to match the requirements in the Java Language specification and the Sun coding conventions. However both underscores and uppercase letters are rather uncommon, so most configurations should probably assign value ^[a-z]+(\.[a-z][a-z0-9]*)*$ to format Parameter Assignment Disallow assignment of parameters. Parameter Name Checks that parameter names conform to the specified format Redundant Modifier Checks for redundant modifiers in interface and annotation definitions. Redundant Throws Checks for redundant exceptions declared in throws clause such as duplicates, unchecked exceptions or subclasses of another declared exception. Simplify Boolean Expression Checks for overly complicated boolean expressions.Simplify Boolean Return Checks for overly complicated boolean return statements. Static Variable Name Checks that static, non-final fields conform to the specified format String Literal Equality Checks that string literals are not used with == or !=. Unused Imports Checks for unused import statements. Visibility Modifier Checks visibility of class members. Only static final members may be public; other class members must be private unless property protectedAllowed or packageAllowed is set.

9

2.3.4 Results – SDMX Information Model

Violated Rules detected by software

Statistics of the project.

Complexity of the project.

10

Violations / Compliance of the project.

Severity Rule Description ViolationsMajor Visibility Modifier Checks visibility of class members. Only

static final members may be public; other class members must be private.

79

Major If Stmts Must Use Braces Avoid using if statements without using curly braces.

13

Major Loose coupling Avoid using implementation types (i.e., HashSet); use the interface (i.e, Set) instead

12

Major System Println System.(out|err).print is used, consider using a logger.

12

Major Avoid Throwing Raw Exception Types

Avoid throwing certain exception types. Rather than throw a raw RuntimeException, Throwable, Exception, or Error, use a subclassed exception or error instead.

5

Major If Else Stmts Must Use Braces

Avoid using if..else statements without using curly braces.

4

Major Signature Declare Throws Exception

It is unclear which exceptions that can be thrown from the methods. It might be difficult to document and understand the vague interfaces. Use either a class derived from RuntimeException or a checked exception.

4

Major Member name Checks that name of non-static fields conform to the specified format

3

Major Illegal Throws Throwing java.lang.Error or java.lang.RuntimeException is almost never acceptable.

1

Major Parameter Name Checks that parameter names conform to the specified format

1

Major Empty Finalizer If the finalize() method is empty, then it does not need to exist.

1

Major Hide Utility Class Constructor

Make sure that utility classes (classes that contain only static methods) do not have a public constructor.

1

Major Simplify Boolean Return Checks for overly complicated boolean return statements.

1

11

Info Unused Imports Checks for unused import statements. 1

Rule: Visibility ModifierDefining a class member without modifier (default modifier) makes this member direct accessible by all classes in the same package.If other programmers use your class, you want to ensure that errors from misuse cannot happen. Access levels can help you do this.

Use the most restrictive access level that makes sense for a particular member. Use private unless you have a good reason not to.

Avoid public fields except for constants. (Many of the examples in tutorials use public fields. This may help to illustrate some points concisely, but is not recommended for production code.) Public fields tend to link you to a particular implementation and limit your flexibility in changing your code.

List of classes with violations

Class Name No Rows with violationsRegistryInterfaceBean 11 15, 16, 17, 18, 19, 20, 21, 22, 23,

24, 25

StructureBean 10 21, 22, 24, 25, 26, 28, 29, 30, 31, 32

KeyFamilyBean 10 21, 22, 24, 25, 26, 28, 29, 30, 31, 32

QueryProvisioningResponseBean 5 17, 18, 19, 20, 22

RefBean 4 13, 14, 15, 16

QueryProvisioningRequestBean 4 17, 18, 19, 20, 22

RefBean 4 12, 13, 14, 15

QueryStructureRequestBean 3 17, 25, 26

AssociationBean 3 12, 13, 14

CodelistMapBean 3 15, 16, 17

QueryStructureResponseBean 2 25, 26

CrossSectionalMeasureBean 2 10, 11

GroupBean 2 15, 16

QueryBean 2 15, 16

SubmissionResultBean 2 14, 15

SubmitProvisioningRequestBean 1 19

MessageBean 1 10

12

FullTargetIdentifierBean 1 15

SubmitSubscriptionRequestBean 1 19

SubmitProvisioningResponseBean 1 17

ItemSchemeBean 1 24

StructureSetBean 1 15

MetadataAttributeBean 1 14

SubmitStructureResponseBean 1 17

SubmitStructureRequestBean 1 22

ProvisioningStatusBean 1 18

PartialTargetIdentifierBean 1 15

Source code with violation

List codelistMaps = new ArrayList();

Suggested Correctionprivate List codelistMaps = new ArrayList();

/** Only if access from other classes is required */public getCodelistMaps() {

return (codelistMaps);}

Rule: If Statement Must Use BracesIt is good programming style to always write curly braces {}, although they are not needed if the clause contains only a single statement. There are two reasons this is good:

Reliability: When code is modified, the indentation is such a strong indicator of structure that the programmer may not notice that the addition of a statement at the "correct" indentation level really isn't included in the scope of the if statement. This is a surprisingly common error.

Readability: It is faster to read code with the braces because the reader doesn't have to keep in mind whether they are dealing with an un-braced single statement or a braced block.

List of classes with violations

Class Name No Rows with violationsKeyFamilyBean 4 156, 239, 241, 243

QueryBean 2 48, 50

13

CategorySchemeBean 2 38, 43

IdentifierComponentBean 1 79

ReportStructureBean 1 77

DataMessage 1 62

MetadataStructureDefinitionBean 1 53

StatusMessageBean 1 32

Source code with violationcom.agilis.sdmx.model.structure.CategorySchemeBean – row 41

while (cit.hasNext()) {CategoryBean category = (CategoryBean) cit.next();if (parentCategory != null)

category.setParentId(parentCategory.getId());categories.add(category);iterateDeapthFirst(category, categories, category.getCategories());

}

Suggested Correction

while (cit.hasNext()) {CategoryBean category = (CategoryBean) cit.next();if (parentCategory != null) {

category.setParentId(parentCategory.getId());}categories.add(category);iterateDeapthFirst(category, categories, category.getCategories());

}

Rule: Loose couplingA common programming philosophy is to declare variables with a type that makes the least assumptions. This allows flexibility to make changes in the underlying type if this latter becomes necessary. For example, the most obvious way to declare x is ArrayList x = new ArrayList(); // OK

But if x only uses the methods defined in the List interface, it would be better to do the following. List x = new ArrayList(); // More flexible

This guarantees that x doesn't use any methods that aren't in the List interface. The advantage of this is that later the underlying implementation can be changed. List x = new LinkedList(); // Changed underlying class

14

Why change the underlying class?The two List classes, ArrayList and LinkedList both implement the same List interface methods and can be manipulated by the methods in the Collections class, but the performance of different operations may be very different, eg, insertion in the middle of an LinkedList is more efficient, but accessing a random element is more efficient using a ArrayList. If it becomes necessary to change the underyling class for performance reasons, it is easy to do if variables are declared by their interfaces (List, Map, and Set for the Collections classes).

List of classes with violations

Class Name No Rows with violationsKey 3 16, 21, 25

XSSection 3 17, 23, 27

AttachableArtefact 3 16, 21, 25

XSGroup 3 17, 23, 27

Source code with violationcom.agilis.sdmx.model.data.cross.XSSection – row 17

private ArrayList<XSObservation> observations = new ArrayList<XSObservation>();

Suggested Correction

private List<XSObservation> observations = new ArrayList<XSObservation>();

Rule: System PrintlnIn practice console input is rarely used, but programmers often use System.out.println for output during the debugging phase. The logger gives to ability to define different levels of importance of the logged messages and the ability to use different sink for the output - the console, a file, etc.Also it's easy to enable or disable only some type of message when using a logger - for example you don't want to see every debug message in production.

List of classes with violations

Class Name No Rows with violations

15

OperatorBean 9 77, 79, 83, 89, 91, 93, 95, 97, 103

DataWhereBean 1 69

DataMessage 1 49

MetaDataWhereBean 1 64

Source code with violationcom.agilis.sdmx.model.data.DataMessage – row 49

} catch (IOException ioe) {System.out.println("Exception loading SDMX_namespaces.props file " +

ioe.getMessage());}

Suggested Correction

Use logger.

Rule: Avoid Throwing Raw Exception TypesAvoid throwing certain exception types. Rather than throw a raw RuntimeException, Throwable, Exception, or Error, use a subclassed exception or error instead.

List of classes with violations

Class Name No Rows with violationsQueryBean 1 53

MetaDataWhereBean 1 65

DataWhereBean 1 70

KeyFamilyBean 1 40

OperatorBean 1 107

Source code with violationcom.agilis.sdmx.model.query.QueryBean – row 53

} catch (Exception e) {

16

throw new Exception("Error printing Query", e);}

Suggested Correction

Define a new Exception type. (SDMXModelException)

} catch (SomeException e) {throw new SDMXModelException("Error printing Query", e);

}

Rule: If Else Statement Must Use BracesIt is good programming style to always write curly braces {}, although they are not needed if the clause contains only a single statement. There are two reasons this is good:Reliability: When code is modified, the indentation is such a strong indicator of structure that the programmer may not notice that the addition of a statement at the "correct" indentation level really isn't included in the scope of the if statement. This is a surprisingly common error. Readability: It is faster to read code with the braces because the reader doesn't have to keep in mind whether they are dealing with an un-braced single statement or a braced block.

List of classes with violations

Class Name No Rows with violationsKeyFamilyBean 3 151, 153, 155

OperatorBean 1 97

Source code with violationcom.agilis.sdmx.model.structure.KeyFamilyBean – row 152

else if (attribute.getAttachmentLevel().trim().equals("Series"))addSeriesAttribute(attribute);

Suggested Correction

else if (attribute.getAttachmentLevel().trim().equals("Series")) {addSeriesAttribute(attribute);

}

17

Rule: Signature Declare Throws ExceptionAvoid throwing certain exception types. Rather than throw a raw RuntimeException, Throwable, Exception, or Error, use a subclassed exception or error instead.

List of classes with violations

Class Name No Rows with violationsQueryBean 1 46

MetaDataWhereBean 1 57

DataWhereBean 1 57

OperatorBean 1 73

Source code with violationcom.agilis.sdmx.model.query.QueryBean – row 46

public static void print(QueryBean queryBean) throws Exception {

Suggested Correction

Define a new Exception type. (SDMXModelException)

public static void print(QueryBean queryBean) throws SDMXModelException {

Rule: Member nameVariable names must be in mixed case with a lower-case first letter. (Java Code Convention – September 12, 1997)

List of classes with violations

Class Name No Rows with violationsAttributeBean 2 26, 27

18

ProvisioningStatusBean 1 18

Source code with violationcom.agilis.sdmx.model.registry.ProvisioningStatusBean – row 18

StatusMessageBean StatusMessage;

Suggested Correction

StatusMessageBean statusMessage;

Rule: Illegal ThrowsThrowing java.lang.Error or java.lang.RuntimeException is almost never acceptable.Never throw unchecked exceptions in your methods just because it clutters the method signature. There are some scenarios where this is good (For e.g. EJB Interface/Implementations, where unchecked exceptions alter the bean behavior in terms of transaction commit and rollback), but otherwise this is not a good practice.

List of classes with violations

Class Name No Rows with violationsMetadataflowRefBean 1 22

Source code with violationcom.agilis.sdmx.model.registry.MetadataflowRefBean – row 22

public void finalize() throws Throwable {}

Suggested Correction

Define a new Exception type. (SDMXModelException)

public void finalize() throws SDMXModelException {}

19

Rule: Parameter NameParameter names must be in mixed case with a lower-case first letter. (Java Code Convention – September 12, 1997)

List of classes with violations

Class Name No Rows with violationsMetaDataWhereBean 1 57

Source code with violation

com.agilis.sdmx.model.query.MetaDataWhereBean – row 57

public static void print(MetaDataWhereBean MetaDataWhereBean) throws Exception {

}

Suggested Correction

public static void print(MetaDataWhereBean metaDataWhereBean) throws Exception {}

Rule: Empty FinalizerIf the finalize() method is empty, then it does not need to exist.

List of classes with violations

Class Name No Rows with violationsMetadataflowRefBean 1 22

Source code with violationcom.agilis.sdmx.model.registry.MetadataflowRefBean – row 22

20

public void finalize() throws Throwable {}

Suggested Correction

Remove the method or add a business logic code.

Rule: Hide Utility Class ConstructorMake sure that utility classes (classes that contain only static methods) do not have a public constructor. This will prevent users of your class to instantiate the class in this way:Common c = new Common();The usual way to prevent this is to declare an empty private constructor of the class:private Common() {}

List of classes with violations

Class Name No Rows with violationsCommon 1 8

Source code with violationcom.agilis.sdmx.model.base.Common – row 8

public class Common {// ...

}

Suggested Correction

public class Common {

private Common() {}}

21

Rule: Simplify Boolean ReturnsWhen writing methods that return booleans, many people make their code much more complicated than it needs to be.It is alway possible to replace the pattern if (E) {return true;} else {return false;} by the pattern return E; for any boolean expression E.Similarly, it is always possible to replace the pattern if (E) {return false;} else {return true;} by the pattern return !E;.

List of classes with violations

Class Name No Rows with violationsCommon 1 18

Source code with violationcom.agilis.sdmx.model.base.Common – row 18

public static boolean evaluateXMLBooleanLiteral(String value) {if (value.equalsIgnoreCase("yes") || value.equals("1")) {

return true;} else {

return false;}

}

Suggested Correction

public static boolean evaluateXMLBooleanLiteral(String value) {

return (value.equalsIgnoreCase("yes") || value.equals("1"));}

Rule: Unused ImportsUnused Import statements.

List of classes with violations

22

Class Name No Rows with violationsAnnotableArtefactBean 1 8

Source code with violationcom.agilis.sdmx.model.base.AnnotableArtefactBean – row 8

import java.net.URI;

Suggested Correction

Remove unused import statement.

2.3.5 Results – Query Parser

Violated Rules detected by software

23

Statistics of the project.

Complexity

Violations / Compliance of the project.

24

Violations Drilldown.

Severity Rule Description ViolationsCritical Avoid Catching Throwable This is dangerous because it casts too

wide a net; it can catch things like OutOfMemoryError.

1

Major Visibility Modifier Checks visibility of class members. Only static final members may be public; other class members must be private.

21

Major Naming – Suspicious constant field name

A field name is all in uppercase characters, which in Sun's Java naming conventions indicate a constant. However, the field is not final.

14

Major Static Variable Name Checks that static, non-final fields conform to the specified format

13

Major System Println System.(out|err).print is used, consider using a logger.

8

Major Avoid Print Stack Trace Avoid printStackTrace(); use a logger call instead.

8

Major Cyclomatic Complexity Checks cyclomatic complexity of methods against a specified limit. The complexity is measured by the number of if, while, do, for, ?:, catch, switch, case statements, and operators && and || (plus one) in the body of a constructor, method, static initializer, or instance initializer. It is a measure of the minimum number of possible paths

2

25

through the source and therefore the number of required tests. Generally 1-4 is considered good, 5-7 ok, 8-10 consider re-factoring, and 11+ re-factor now !

Major Loose coupling Avoid using implementation types (i.e., HashSet); use the interface (i.e, Set) instead

2

Major Member Name Checks that name of non-static fields conform to the specified format

1

Major Use ArrayList instead of Vector

ArrayList is a much better Collection implementation than Vector.

1

Major Unused local variable Detects when a local variable is declared and/or assigned, but not used.

1

Major Preserve Stack Trace Throwing a new exception from a catch block without passing the original exception into the new Exception will cause the true stack trace to be lost, and can make it difficult to debug effectively.

1

Major Ncss Method count This rule uses the NCSS (Non Commenting Source Statements) algorithm to determine the number of lines of code for a given method. NCSS ignores comments, and counts actual statements. Using this algorithm, lines of code that are split are counted as one.

1

Major Hide Utility Class constructor

Make sure that utility classes (classes that contain only static methods) do not have a public constructor.

1

Major Replace Vector with List Consider replacing Vector usages with the newer java.util.ArrayList if expensive threadsafe operation is not required.

1

Major Avoid throwing NullPointerException

Avoid throwing a NullPointerException - it's confusing because most people will assume that the virtual machine threw it. Consider using an IllegalArgumentException instead; this will be clearly seen as a programmer-initiated exception.

1

Major Avoid Catching NPE Code should never throw NPE under normal circumstances. A catch block may hide the original error, causing other more subtle errors in its wake.

1

Minor Redundant Throws Checks for redundant exceptions declared in throws clause such as duplicates, unchecked exceptions or subclasses of another declared exception.

1

26

Rule: Avoid Catching ThrowableThis is dangerous because it casts too wide a net; it can catch things like OutOfMemoryError.

List of classes with violations

Class Name No Rows with violationsQueryParser 1 211

Source code with violation

} catch (Throwable t) {...

Suggested Correction

Define a new Exception type.

Rule: Visibility ModifierDefining a class member without modifier (default modifier) makes this member direct accessible by all classes in the same package.If other programmers use your class, you want to ensure that errors from misuse cannot happen. Access levels can help you do this.

Use the most restrictive access level that makes sense for a particular member. Use private unless you have a good reason not to.

Avoid public fields except for constants. (Many of the examples in tutorials use public fields. This may help to illustrate some points concisely, but is not recommended for production code.) Public fields tend to link you to a particular implementation and limit your flexibility in changing your code.

List of classes with violations

Class Name No Rows with violationsQueryDefs 13 13, 14, 15, 17, 18, 19, 20, 21, 22,

23, 24, 25

QueryParser 8 111, 116, 119, 122, 128, 131, 133, 136

27

Source code with violationorg.estat.query_parser.QueryParser – row 111

StringBuffer textBuffer;

Suggested Correctionprivate StringBuffer textBuffer;

Rule: Naming – Suspicious constant field nameA field name is all in uppercase characters, which in Sun's Java naming conventions indicate a constant. However, the field is not final.

List of classes with violations

Class Name No Rows with violationsQueryDefs 13 13, 14, 15, 16, 17, 18, 19, 20, 21,

22, 23, 24, 25

QueryParser 1 139

Source code with violationorg.estat.query_parser.QueryDefs – row 13

public static String QUERY_ELEMENT = "Query"

Suggested Correction

public final static String QUERY_ELEMENT = "Query"

28

Rule: Static Variable NameChecks that static, non-final fields conform to the specified format.

List of classes with violations

Class Name No Rows with violationsQueryDefs 13 13, 14, 15, 16, 17, 18, 19, 20, 21,

22, 23, 24, 25

Source code with violationorg.estat.query_parser.QueryDefs – row 13

public static String QUERY_ELEMENT = "Query"

Suggested Correction

If this is a non-final field its name should be “queryElement”.

Rule: If Statement Must Use BracesIt is good programming style to always write curly braces {}, although they are not needed if the clause contains only a single statement. There are two reasons this is good:Reliability: When code is modified, the indentation is such a strong indicator of structure that the programmer may not notice that the addition of a statement at the "correct" indentation level really isn't included in the scope of the if statement. This is a surprisingly common error. Readability: It is faster to read code with the braces because the reader doesn't have to keep in mind whether they are dealing with an un-braced single statement or a braced block.

List of classes with violations

Class Name No Rows with violationsQueryParser 10 209, 261, 291, 341, 346, 350,

352, 357, 365, 407

Source code with violationorg.estat.query_parser.QueryParser – row 261

29

if ("".equals(eName)) eName = qName; // not namespaceAware

Suggested Correction

if ("".equals(eName)) {eName = qName; // not namespaceAware

}

Rule: System PrintlnIn practice console input is rarely used, but programmers often use System.out.println for output during the debugging phase. The logger gives to ability to define different levels of importance of the logged messages and the ability to use different sink for the output - the console, a file, etc.Also it's easy to enable or disable only some type of message when using a logger - for example you don't want to see every debug message in production.

List of classes with violations

Class Name No Rows with violationsQueryParser 8 210, 342, 347, 351, 358, 366,

489, 493

Source code with violationorg.estat.query_parser.QueryParser – row 358

if (operatorObjs.size() != 1)System.err.println("Stack with more than 1 elements: " + operatorObjs.size());

Suggested Correction

Use logger.

30

Rule: Avoid Print Stack TraceAvoid printStackTrace(); use a logger call instead.

List of classes with violations

Class Name No Rows with violationsQueryParser 7 297, 431, 466, 486, 495, 498,

500

Source code with violationorg.estat.query_parser.QueryParser – row 492

} catch (ParserException e1) {System.out.println(e1.getMessage());e1.getCause().printStackTrace();e1.printStackTrace();

}

Suggested Correction

Use logger.

Rule: If Else Statement Must Use BracesIt is good programming style to always write curly braces {}, although they are not needed if the clause contains only a single statement. There are two reasons this is good:Reliability: When code is modified, the indentation is such a strong indicator of structure that the programmer may not notice that the addition of a statement at the "correct" indentation level really isn't included in the scope of the if statement. This is a surprisingly common error. Readability: It is faster to read code with the braces because the reader doesn't have to keep in mind whether they are dealing with an un-braced single statement or a braced block.

List of classes with violations

31

Class Name No Rows with violationsQueryParser 2 327, 329

Source code with violation

org.estat.query_parser.QueryParser – row 327

if (textBuffer == null)elmValues.put(name, null);

Suggested Correction

if (textBuffer == null) {elmValues.put(name, null);

}

Rule: Loose coupling

A common programming philosophy is to declare variables with a type that makes the least assumptions. This allows flexibility to make changes in the underlying type if this latter becomes necessary. For example, the most obvious way to declare x is

ArrayList x = new ArrayList(); // OK

But if x only uses the methods defined in the List interface, it would be better to do the following.

List x = new ArrayList(); // More flexible

This guarantees that x doesn't use any methods that aren't in the List interface. The advantage of this is that later the underlying implementation can be changed.

List x = new LinkedList(); // Changed underlying class

Why change the underlying class?The two List classes, ArrayList and LinkedList both implement the same List interface methods and can be manipulated by the methods in the Collections class, but the performance of different operations may be very different, eg, insertion in the middle of an LinkedList is more efficient, but accessing a random element is more efficient using a ArrayList. If it becomes necessary to change the underyling class for performance reasons, it is easy to do if variables are declared by their interfaces (List, Map, and Set for the Collections classes).

List of classes with violations

Class Name No Rows with violations

32

QueryParser 2 116,119

Source code with violationorg.estat.query_parser.QueryParser – row 119

HashMap<String, String> elmValues;

Suggested Correction

Map<String, String> elmValues;

Rule: Cyclomatic Complexity

Checks cyclomatic complexity of methods against a specified limit. The complexity is measured by the number of if, while, do, for, ?:, catch, switch, case statements, and operators && and || (plus one) in the body of a constructor, method, static initializer, or instance initializer. It is a measure of the minimum number of possible paths through the source and therefore the number of required tests. Generally 1-4 is considered good, 5-7 ok, 8-10 consider re-factoring, and 11+ re-factor now !

List of classes with violations

Class Name No Rows with violationsQueryParser 1 320

Source code with violation

private void processElement(String name) throws ParserException {

Suggested Correction

Refactoring required!

33

Rule: Member nameVariable names must be in mixed case with a lower-case first letter. (Java Code Convention – September 12, 1997)

List of classes with violations

Class Name No Rows with violationsQueryParser 1 139

Source code with violationorg.estat.query_parser.QueryParser – row 139

private boolean DEBUG = true;

Suggested Correction

If this is non-static field the name should be “debug”

Rule: Unused local variableA local variable is declared and/or assigned, but not used.

List of classes with violations

Class Name No Rows with violationsQueryParser 1 250

Source code with violationorg.estat.query_parser.QueryParser – row 250

public void endDocument() throws SAXException {int i = 0;

}

Suggested Correction

Useless method;

34

Rule: Use Array List instead of Vector ArrayList is a much better Collection implementation than Vector.

List of classes with violations

Class Name No Rows with violationsQueryParser 1 239

Source code with violationorg.estat.query_parser.QueryParser – row 239

openTags = new Vector<String>();

Suggested Correction

openTags = new ArrayList<String>();

Rule: Ncss Method countThis rule uses the NCSS (Non Commenting Source Statements) algorithm to determine the number of lines of code for a given method. NCSS ignores comments, and counts actual statements. Using this algorithm, lines of code that are split are counted as one.

List of classes with violations

Class Name No Rows with violationsQueryParser 1 320

Source code with violationorg.estat.query_parser.QueryParser – row 320

private void processElement(String name) throws ParserException {

35

Suggested Correction

The method requires refactoring.

Rule: Hide Utility Class ConstructorMake sure that utility classes (classes that contain only static methods) do not have a public constructor. This will prevent users of your class to instantiate the class in this way:Common c = new Common();The usual way to prevent this is to declare an empty private constructor of the class:private Common() {}

List of classes with violations

Class Name No Rows with violationsQueryDefs 1 11

Source code with violationorg.estat.query_parser.QueryDefs – row 11

public class QueryDefs {

// ...}

Suggested Correction

public class QueryDefs {

private QueryDefs() {}}

Rule: Replace Vector With List

36

Consider replacing Vector usages with the newer java.util.ArrayList if expensive threadsafe operation is not required.

List of classes with violations

Class Name No Rows with violationsQueryDefs 1 116

Source code with violationorg.estat.query_parser.QueryDefs – row 116

/** keeps all opened XML tag */Vector<String> openTags;

Suggested Correction

/** keeps all opened XML tag */List<String> openTags;

Rule: Avoid Throwing Null Pointer ExceptionAvoid throwing a NullPointerException - it's confusing because most people will assume that the virtual machine threw it. Consider using an IllegalArgumentException instead; this will be clearly seen as a programmer-initiated exception.

List of classes with violations

Class Name No Rows with violationsQueryDefs 1 194

Source code with violationorg.estat.query_parser.QueryDefs – row 194

throw new NullPointerException("No \"Message.xsd\" SDMX-ML Schema has been set.");

Suggested Correction

37

throw new IllegalArgumentException("No \"Message.xsd\" SDMX-ML Schema has been set.");

Rule: Preserve Stack TraceThrowing a new exception from a catch block without passing the original exception into the new Exception will cause the true stack trace to be lost, and can make it difficult to debug effectively.

List of classes with violations

Class Name No Rows with violationsQueryDefs 1 298

Source code with violationorg.estat.query_parser.QueryDefs – row 298

} catch (ParserException le) {le.printStackTrace();throw new SAXException(le.getMessage());

}

Suggested Correction

} catch (ParserException le) {le.printStackTrace();throw new SAXException(le.getMessage(), le);

}

Rule: Avoid Catching NPECode should never throw NPE under normal circumstances. A catch block may hide the original error, causing other more subtle errors in its wake.

List of classes with violations

38

Class Name No Rows with violationsQueryDefs 1 499

Source code with violationorg.estat.query_parser.QueryDefs – row 499

} catch (NullPointerException e) {e.printStackTrace();

}

Suggested Correction

Avoid catching NullPointerException.

Rule: Redundant ThrowsChecks for redundant exceptions declared in throws clause such as duplicates, unchecked exceptions or subclasses of another declared exception.

List of classes with violations

Class Name No Rows with violationsQueryDefs 1 175

Source code with violationorg.estat.query_parser.QueryDefs – row 175

public QueryBean read(InputStream is) throws ParserException, NullPointerException {

Suggested Correction

public QueryBean read(InputStream is) throws ParserException {

39

2.3.6 Conclusions and Recommendations (and impact of Refactoring)

Even the number of problems in modules is large the correction is not very difficult

and will not affect functionality of the modules and communication between these and

other modules of SDMX Reference Infrastructure. The focus should be the

encapsulating the functionality (for example – visibility of attributes) especially if

these modules will be provided to third party developers. Some attention should be

given to not using “magic numbers” (static strings) in the code but to create a separate

“dictionary” for defining all static text and allow easy changing of this dictionary by

developers without browsing all the code of modules.

2.4 Manual Code Analysis

2.4.1 Results – SDMX Information Model

2.4.1.1 Wrong import class (com.agilis.sdmx.model.structure.CodeRefBean.java, line 11)import com.agilis.sdmx.model.registry.RefBean;

/** * @author gpa Updated: 5.10.2007 */public class CodeRefBean extends RefBean {…

The imported class (RefBean) does not match the XSD schema of the SDMX

Standard. This class should be from the Structure package instead of Registry package.

Correct class to extend is com.agilis sdmx.model.structure.RefBean and because it is

from the same package as the CodeRefBean.java the import statement is not required.

2.4.1.2 Local Maven repository in the pom.xml (pom.xml, line 129)The Maven build file (pom.xml) contains reference to a local maven repository that

cannot be accessed from other developers and prevents direct building of the module.

This part should be commented to be able to build the module:<repository>

<id>agilis.repo</id>

40

<name>Agilis Repository</name>

<url>http://192.168.0.48:8081/nexus/content/repositories/agilis.repo</url><layout>default</layout>

</repository>

2.4.1.3 Non-UTF-8 symbols in the header of all java class files

Every attempt to open a Java class with the Netbeans IDE on Linux produces this

warning message. After confirm and open a file there are special symbols (probably

from Greek alphabet) that cannot be rendered appropriately.

2.4.2 Results – Query Parser

2.4.2.1 Using “main” class for unit testing (QueryParser.java, line 464)Using java “main” method for unit testing is not a good practice, especially when this

41

is a module for reuse by other developers. It provides an improper entry point to the

module and may cause a security issues.

The suggested method for unit testing is to use specialized frameworks as JUnit. This

will guarantee that test functionality will be available only on test tasks and do not

exist in the deployment environment.

2.4.2.2 Missing test data (QueryParser.java, line 468)The module contains functionality for testing its proper work but the test data are

missing.

QueryParser.java – line 468:sdmxmlQueryFile = new FileInputStream(“Y:/Documents/Registry Samples/nsi/query.xml”);

The test data should be included in the provided package for use, change and extend

by external developers.

2.4.2.3 Incorrect Utility class (QueryDefs.java)The class QueryDefs.java contains only constants. All attributes should be declared

final – not allowing to be changed in runtime. The additional protection is to declare a

private constructor – not allowing the class to be initialized by the default empty

constructor:

Original file:public class QueryDefs {

public static String QUERY_ELEMENT = “Query”;public static String DATAWHERE_ELEMENT = “DataWhere”;public static String METADATAWHERE_ELEMENT = “MetadataWhere”;public static String DIMENSION_ELEMENT = “Dimension”;public static String ATTRIBUTE_ELEMENT = “Attribute”;…}

Suggested changes:public class QueryDefs {

private QueryDefs() {}

public static final String QUERY_ELEMENT = “Query”;public static final String DATAWHERE_ELEMENT = “DataWhere”;public static final String METADATAWHERE_ELEMENT = “MetadataWhere”;public static final String DIMENSION_ELEMENT = “Dimension”;

42

public static final String ATTRIBUTE_ELEMENT = “Attribute”;…}

2.4.2.4 Direct use of String values (Magic values) (QueryParser.java)It is a good practice not to use Strings or other constants directly in the code. These

should be declared as an attribute and then used in the code.

Example – QueryParser.java – Line 421:if (aname.equals("id")) {

at.setId(attrs.getValue(i));} else if (aname.equals("attachmentLevel")) {

at.setAttachmentLevel(attrs.getValue(i));}

Suggested change:private statis final String TAG_ID = “id”;private statis final String TAG_ATTACHMENTLEVEL = “attachmentLevel”;

if (aname.equals(TAG_ID)) { at.setId(attrs.getValue(i));

} else if (aname.equals(TAG_ATTACHMENTLEVEL)) { at.setAttachmentLevel(attrs.getValue(i));

}

2.4.3 Conclusions and Recommendations (and impact of Refactoring)

A special attention should be paid to the smooth process of build/deploy and test of the

applications. Every simple error in this process can disappoint users (especially if they

are not senior) and stop them in using the SDMX Reference Infrastructure.

43

3 Final WordsA big part of problems are created by refactoring and extending functionalities of modules and missing coding rules and discipline.

Simplicity of modules allows easy correction without affecting their functionality and communication with other SDMX modules.

The refactoring of the code should pay attention on the age of the code – there are some really old-fashioned approaches (like Vector class for example).

44