COLLECTION OF RAW DATA

TASK FORCE

28 MARCH 2001

Doc. CoRD 039

An XML version of the INSTAT message

For information

An XML version of the INSTAT message CoRD039


Abstract

EDI standardisation is one of the six main areas of the raw data collection strategy which the CoRD Task Force monitors and for which it acts as the steering committee. This paper gives a progress report on the development, by Working Group 5 of EBES EG6, of an XML version of the INSTAT message used for Intrastat.

The Intrastat business domain has been modelled as a UML class diagram, and a separate UML model for the INSTAT message has been based on it. Separate models are necessary to allow the introduction of new classes and relationships for implementation purposes. An XML schema for the message has been derived from the model, using rules and guidelines developed in the project.

The XML message has been successfully trialled in a prototype XML message generator built into a version of IDEP, and in a prototype XML message analyser which can receive the data and import it into a database.

This work is in the vanguard of standardisation of XML messages for raw data collection, and an important aspect of the project has been the formulation of a standardisation process for XML EDI messages.


AN XML VERSION OF THE INSTAT MESSAGE

CONTEXT

This study was developed within the framework of the EDICOM project for the Intrastat system, which covers statistics on the trading of goods between Member States.

The INSTAT message is the electronic format of the Intrastat declaration. Member States and Eurostat studied this message within EEG6/Working Group 5 (European Board for EDI Standardisation / Expert Group 6 – Statistics / Working Group 5 – Foreign Trade statistics).

The first implementation is the EDIFACT message CUSDEC/INSTAT, used by more than 45,000 traders in the European Union.

NEED OF THE XML VERSION OF INSTAT

Several National Administrations (CNAs) have already implemented web forms for Intrastat in the Member States. Generating Intrastat declarations in XML is an important technical extension of web forms because XML is both a browser language on the declarant side and a format supported by more and more database interfaces on the CNA side.

Software developers are requesting a common XML format for the Intrastat declaration that they can implement in their commercial packages.

ADOPTED APPROACH

EEG6/WG5 began this study by gaining knowledge of XML and related techniques through prototyping Intrastat declarations in XML and XSL.

Then the group studied the business case of "Intrastat and XML".

In September 2000, EEG6/WG5 studied the first XML schema of INSTAT, based on the EDIFACT CUSDEC/INSTAT Message Implementation Guideline, but met problems with the definition of the information. The group therefore decided:

• to build a stable, syntax-neutral data model of Intrastat, in which everyone involved in the specification could share the same view of the Intrastat system;

• to derive any data exchange specification from this model, in particular the Intrastat declaration in XML, called INSTAT XML.

CLASS DIAGRAMS OF INTRASTAT IN UML

Two models

In February 2001, EEG6/WG5 validated and approved two UML class diagram models of Intrastat:


• the business world of Intrastat, called "Intrastat declaration reference model"; this model is the result of validation between Member States and Eurostat and provides a reference for the development of any information system in that domain; for implementation purposes, new models can be derived from this model by adding or ignoring elements as necessary; this class diagram is presented in Annex 1;

• the INSTAT message implementation, called "INSTAT message, implementation model"; this class diagram is derived from the "Intrastat declaration reference model"; the purpose of this model is to provide a basis for the definition of a DTD or an XML schema of the Intrastat declaration; this model is the basis of the implementation of the prototype of the INSTAT XML schema in IDEP/CN8; this class diagram is presented in Annex 2.

Details of the differences between the two models

Some of the relationships and one class in the reference model are not needed for the implementation model:

• "TDP-PSI", "Collecting Centre-TDP", "Collecting Centre-PSI" relationships,

• "CNA" class.

The implementation model requires a class and relationships which are not in the reference model:

• "Party" super-class of the "Collecting Centre", "PSI" and "TDP" classes; Party means the entities involved in the exchange of the envelope containing the Intrastat declarations, which are:

    • the Collecting Centre which will receive the envelope,

    • either the PSI sending the envelope,

    • or the TDP sending the envelope and 1 to n PSIs responsible for the declarations contained in the envelope,

• relationships between "Collecting Centre, PSI, TDP" and "Address, Contact Person", which are now simplified by the relationships between "Party" and "Address, Contact Person".
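The effect of introducing "Party" can be illustrated in code. The following is a minimal sketch in Python, not part of the INSTAT specification: the attribute names follow the implementation model in Annex 2, while the reduced attribute sets, the default values and the `describe` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    # Reduced, illustrative subset of the Address attributes
    streetName: str = ""
    cityName: str = ""
    countryName: str = ""


@dataclass
class ContactPerson:
    contactPersonId: str = ""
    contactPersonName: str = ""


@dataclass
class Party:
    """Super-class: Address and ContactPerson are attached once, here."""
    partyID: str
    partyName: str
    address: Optional[Address] = None
    contact: Optional[ContactPerson] = None


@dataclass
class CollectingCentre(Party):
    pass


@dataclass
class PSI(Party):
    interchangeAgreementId: str = ""
    password: str = ""


@dataclass
class TDP(Party):
    interchangeAgreementId: str = ""
    password: str = ""


def describe(party: Party) -> str:
    # Hypothetical helper: works for any Party subclass because the
    # address relationship is defined on the super-class only
    city = party.address.cityName if party.address else "unknown"
    return f"{type(party).__name__} {party.partyID} ({city})"
```

Because the address and contact person relationships live on `Party`, code such as `describe` needs no separate branch for the Collecting Centre, the PSI and the TDP.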

Further studies

Symbols indicating the attributes generated by national rules will be added to the diagrams.

Whether the class diagram needs to be modified to take account of security issues must be studied.

INSTAT XML

Derivation

INSTAT XML, the basis of the prototypes implemented for IDEP/CN8, has been derived from the implementation model in UML. The derivation of an XML schema from a UML class diagram is not an automated but a "handcrafted" process. EEG6/WG5 examined the derivation step by step, between each class, attribute and relationship of the UML model and each element and attribute of the XML schema. A set of rules and guidelines to be followed when deriving has been developed. The INSTAT XML diagram, representing the structure of the XML document, is shown in Annex 3.

Further studies

A mapping between the XML INSTAT (Intrastat declaration in XML) and the EDIFACT CUSDEC/INSTAT message is needed, both as another form of validation and to help the PSIs and administrations who want to migrate to XML with their implementation.

The most appropriate form of user documentation must be investigated.

XML PROTOTYPES DEVELOPED FOR IDEP/CN8

IDEP/CN8 is the Intrastat Data Entry Package developed by Eurostat, and set up and distributed by the CNAs to the declarants. The INSTAT XML specifications are the basis of the XML prototypes developed for IDEP/CN8.

Two prototypes, for Intrastat XML generation and analysis, have been developed. They are represented in Figure 1. These prototypes are:

• the generator, producing well-formed and valid INSTAT XML messages,

• the analyser, parsing INSTAT XML and storing the data in a database.

Figure 1: XML prototypes developed for IDEP/CN8

XML GENERATOR IN IDEP

The INSTAT XML Generator is integrated into IDEP for Windows (V4.0.0) and works in parallel with the EDIFACT generator. A new MS Parameter will make it possible to choose between XML and EDIFACT generation. For the user of IDEP for Windows this is transparent.

[Figure 1 components: IDEP/CN8 with XML generator, XML declaration, XML Analyzer, Database]


Some new rules for generating the INSTAT interchange have been implemented. These rules simplify the generation and analysis process:

• If a field does not contain any data, it is not inserted in the message (e.g. Goods Code Description may be left empty).

• If a "Yes/No" (Boolean) field has the value "No" (False), it is not inserted in the message (e.g. if Acknowledgement is "Off", no information about acknowledgement is inserted in the message).
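The two generation rules above can be sketched as follows. This is an illustration in Python using the standard `xml.etree.ElementTree` module, not the IDEP implementation; the element names are borrowed from the UML attribute names, and the `add_field` and `build_item` helpers, as well as writing a Boolean "Yes" as the text `true`, are assumptions.

```python
import xml.etree.ElementTree as ET


def add_field(parent, tag, value):
    """Append <tag> to parent unless the value is empty or Boolean False."""
    # Rule 1: empty fields are left out; rule 2: "No" (False) is left out
    if value is None or value == "" or value is False:
        return None
    child = ET.SubElement(parent, tag)
    # Assumption: a Boolean "Yes" is written as the text "true"
    child.text = "true" if value is True else str(value)
    return child


def build_item(fields):
    """Build an <Item> element from a dict of field name -> value."""
    item = ET.Element("Item")
    for tag, value in fields.items():
        add_field(item, tag, value)
    return item


item = build_item({
    "itemNumber": 1,
    "goodsDescription": "",   # empty: omitted
    "netMass": 0,             # zero is data, so it is kept
    "testIndicator": False,   # "No": omitted
})
print(ET.tostring(item, encoding="unicode"))
```

Note that a numeric zero is treated as data and kept; only missing values, empty strings and Boolean False are dropped. An analyser reading such a message has to treat an absent Boolean element as "No".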

The prototype has been implemented quickly but with full-scale integration into IDEP for Windows. In other words, the prototype has the full functionality of a production version as far as can be predicted at the moment.

Prototype XML Analyser

The INSTAT XML Analyser Prototype is intended to illustrate the feasibility and ease of implementation of an XML analyser for the CNA. The analyser prototype is a standalone tool that is capable of analysing an INSTAT XML message and storing the data it contains in an MS-Access database.

The prototype consists of:

1. A DTD-independent parser that verifies the XML message and stores the data in a database. The XML parser is based on SAX.

2. A reference table describing where XML elements are stored in the database (and how they are stored). The above-mentioned parser uses this table.

3. A Viewer that browses the resulting database. This module is not linked to the other two modules. It is simply there to verify the results visually.
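The combination of a SAX parser and a reference table can be sketched as below. This is a minimal illustration in Python, not the prototype itself: it uses the standard `xml.sax` module and an in-memory SQLite database in place of MS-Access, and the element names, the table layout and the contents of `REFERENCE_TABLE` are all assumptions.

```python
import sqlite3
import xml.sax

# Reference table: XML element name -> database column (illustrative)
REFERENCE_TABLE = {
    "itemNumber": "item_number",
    "netMass": "net_mass",
}
ROW_ELEMENT = "Item"  # assumed element that delimits one database row
COLUMNS = list(REFERENCE_TABLE.values())


class InstatHandler(xml.sax.ContentHandler):
    """Collects mapped elements and writes one row per ROW_ELEMENT."""

    def __init__(self, connection):
        super().__init__()
        self.connection = connection
        self.row = {}
        self.text = []

    def startElement(self, name, attrs):
        self.text = []
        if name == ROW_ELEMENT:
            self.row = {}

    def characters(self, content):
        self.text.append(content)

    def endElement(self, name):
        if name in REFERENCE_TABLE:
            self.row[REFERENCE_TABLE[name]] = "".join(self.text).strip()
        elif name == ROW_ELEMENT:
            placeholders = ", ".join("?" for _ in COLUMNS)
            # Elements absent from the message are stored as NULL
            self.connection.execute(
                f"INSERT INTO item ({', '.join(COLUMNS)}) VALUES ({placeholders})",
                [self.row.get(column) for column in COLUMNS],
            )


def analyse(xml_text, connection):
    """Parse a message and store the mapped elements in the database."""
    column_defs = ", ".join(f"{column} TEXT" for column in COLUMNS)
    connection.execute(f"CREATE TABLE IF NOT EXISTS item ({column_defs})")
    xml.sax.parseString(xml_text.encode("utf-8"), InstatHandler(connection))
    connection.commit()
```

Since the handler consults only the reference table, supporting a further element in this sketch means adding one entry to `REFERENCE_TABLE` rather than changing the parser.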

The first results of the prototype suggest very high performance in processing the XML message. This performance gain must be weighed against the size of the XML message, which is much larger than that of an equivalent EDIFACT message.

STANDARDISATION PROCESS

ebXML is the Electronic Business XML Working Group. UN/CEFACT and OASIS joined forces in this group to develop a worldwide project and standardise XML business specifications. The ebXML Working Group has to develop a technical framework that enables XML to be used in a consistent manner for exchanging all electronic business data. The first deliverables of ebXML, in May 2001, will be Core Components and Business Processes catalogues.

XML documents developed in EEG6 must be ebXML compliant. To meet this requirement, the following procedure for standardising INSTAT XML was agreed during the last EEG6/XML Task Force:

• Submit UML model, DTD, example to EEG6 for approval,

• EEG6/WG5 will send the documents to the EEG6 Program Office and Chairman by end of March,


• WG5 will ask EEG6 to approve the documents via a written procedure,

• WG5 will assist in the review of the ebXML Core Components and Business Processes catalogues,

• WG5 will check that the XML INSTAT document is compliant with ebXML Core Components and Business Processes and submit change requests to ebXML, if necessary.


ANNEXES

Annex 1: Intrastat declaration, reference model, version 1.0

This class diagram represents the business world of Intrastat and provides a reference for the development of any information system in that domain.

The model has been designed, validated and approved by delegates of Member States and Eurostat within the framework of the EEG6/WG5.

For implementation purposes, new models can be derived from this model by adding or ignoring elements as necessary. One such model is the "INSTAT message, implementation model".

The context is: a PSI or TDP creates one or several Intrastat declarations and sends them to a Collecting Centre located in one Member State of the European Union.

Operations are not yet defined for the classes.


[Class diagram: classes and attributes as shown in the diagram; multiplicity labels (0..1, 0..*, 1, 1..*) are not reproduced.]

ModeOfTransport
    modeOfTransportCode : enum = {1,2,3,4,5,7,8,9}
    modeOfTransportName : string

NatureOfTransactionB
    natureTransactionBCode : enum = {1,2,3,4,5}
    natureTransactionBName : string

AdditionalGoodsCode
    additionalGoodsCode : list
    additionalGoodsCodeDescription : string

LocationID
    locationCode : enum = {1,2,3}
    locationName : string

CNA
    CNAId : string
    CNAName : string

ContactPerson
    contactPersonId : string
    contactPersonName : string
    contactPersonRole : string

Address
    streetName : string
    streetNumber : string
    postalCode : string
    cityName : string
    countryName : string
    phoneNumber : string
    faxNumber : string
    e-mail : string
    URL : URI

CollectingCentre
    CCId : string
    CCName : string

TDP
    TDPId : string
    TDPName : string
    interchangeAgreementId : string
    password : string

Flow
    flowCode : enum = {A, D}
    flowName : string

Function
    functionCode : enum = {O,R,D,M,N}
    functionName : string

DeclarationCurrency

DeclarationType
    declarationTypeCode : enum = {F,1,2,3}
    declarationTypeName : string

PSI
    PSIId : string
    PSIName : string
    interchangeAgreementId : string
    password : string

Envelope
    envelopeId : string
    dateTimeOfEnvelope : date
    acknowledgementRequest : boolean
    / authentication : string
    testIndicator : boolean
    / numberOfDeclarations : integer
    applicationReference : string
    softwareUsed : string

CountryOfOrigin
    ISOCountryCode : list
    GeonomCountryCode : list
    countryName : string

NatureOfTransactionA
    natureTransationACode : list
    natureTransactionAName : string

MemberStateConsDest
    ISOCountryCode : list
    GeonomCountryCode : list
    countryName : string

Partner
    partnerId : list
    partnerName : string

CN8
    CN8Code : list
    CN8Description : string
    SUCode : string

StatisticalProcedure
    statisticalProcedureCode : list
    statProcedureName : string

Region
    regionCode : list
    regionName : string

Port/airport/inlandport
    port/airport/inlandportCode : list
    port/airport/inlandportName : string

Currency
    currencyCode : list
    currencyName : string

DeliveryTerms
    TODCode : enum
    TODDescription : string

Declaration
    declarationId : string
    referencePeriod : string
    dateTimeOfDeclaration : date
    firstLast : enum = {F,L}
    / totalInvoicedAmount : number
    / totalNetMass : integer
    / totalStatisticalValue : number
    / totalNumberLines : integer
    previousDeclarationId : string

Item
    itemNumber : integer
    goodsDescription : string
    netMass : integer
    quantityInSU : integer
    numberOfConsignments : integer
    invoicedAmount : number
    statisticalValue : number
    invoicedAmountInCurrency : number
    invoiceNumber : string
    TODDetails : string
    TODPlace : string

Named associations and constraints:
    is registered {A TDP or a PSI is registered by a Collecting Center.}
    represents
    sends {An envelope is sent by a TDP or a PSI.}
    receives
    is completed
    corrects
    provides
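In UML notation, the "/" prefix on attributes such as totalNumberLines and totalNetMass marks them as derived. The sketch below, in Python, shows one way such attributes could be computed instead of stored. It is an illustration only: the assumption that each total is obtained by counting or summing over the items of the declaration is not stated in the model, and the reduced attribute sets are chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    # Reduced, illustrative subset of the Item attributes
    itemNumber: int
    netMass: int = 0
    invoicedAmount: float = 0.0
    statisticalValue: float = 0.0


@dataclass
class Declaration:
    declarationId: str
    items: List[Item] = field(default_factory=list)

    # Derived attributes: computed on demand from the items (assumed rule)
    @property
    def totalNumberLines(self) -> int:
        return len(self.items)

    @property
    def totalNetMass(self) -> int:
        return sum(item.netMass for item in self.items)

    @property
    def totalInvoicedAmount(self) -> float:
        return sum(item.invoicedAmount for item in self.items)

    @property
    def totalStatisticalValue(self) -> float:
        return sum(item.statisticalValue for item in self.items)
```

Computing the totals from the items keeps them consistent by construction, which is the usual reason for marking an attribute as derived.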


Annex 2: INSTAT message, implementation model, version 3.2

This class diagram is derived from the "Intrastat declaration reference model". The purpose of this model is to provide a basis for the definition of a DTD or an XML schema of the Intrastat declaration.

The model has been designed, validated and approved by delegates of Member States and Eurostat within the framework of the EEG6/WG5.

The context of the model is: a PSI or TDP creates one or several Intrastat declarations and sends them to a Collecting Centre located in one Member State of the European Union.

The main difference from the "Intrastat declaration reference model" is the introduction of the class "Party" for implementation reasons. This was done to simplify the relationships between the classes "TDP", "PSI", "Collecting Centre", "Address" and "ContactPerson".

Operations are not yet defined for the classes.


[Class diagram: classes and attributes as shown in the diagram; multiplicity labels (0..1, 0..*, 1, 1..*) are not reproduced.]

CollectingCentre

NatureOfTransactionB
    natureTransactionBCode : enum = {1,2,3,4,5}
    natureTransactionBName : string

AdditionalGoodsCode
    additionalGoodsCode : list
    additionalGoodsCodeDescription : string

LocationID
    locationCode : enum = {1,2,3}
    locationName : string

CountryOfOrigin
    ISOCountryCode : list
    GeonomCountryCode : list
    countryName : string

NatureOfTransactionA
    natureTransationACode : list
    natureTransactionAName : string

MemberStateConsDest
    ISOCountryCode : list
    GeonomCountryCode : list
    countryName : string

Partner
    partnerId : list
    partnerName : string

CN8
    CN8Code : list
    CN8Description : string
    SUCode : string

StatisticalProcedure
    statisticalProcedureCode : list
    statProcedureName : string

Region
    regionCode : list
    regionName : string

ModeOfTransport
    modeOfTransportCode : enum = {1,2,3,4,5,7,8,9}
    modeOfTransportName : string

Port/airport/inlandport
    port/airport/inlandportCode : list
    port/airport/inlandportName : string

Currency
    currencyCode : list
    currencyName : string

DeliveryTerms
    TODCode : enum
    TODDescription : string

ContactPerson
    contactPersonId : string
    contactPersonName : string
    contactPersonRole : string

Address
    streetName : string
    streetNumber : string
    postalCode : string
    cityName : string
    countryName : string
    phoneNumber : string
    faxNumber : string
    e-mail : string
    URL : URI

Party
    partyID : string
    partyName : string

Flow
    flowCode : enum = {A, D}
    flowName : string

Function
    functionCode : enum = {O,R,D,M,N}
    functionName : string

DeclarationCurrency

Item
    itemNumber : integer
    goodsDescription : string
    netMass : integer
    quantityInSU : integer
    numberOfConsignments : integer
    invoicedAmount : number
    statisticalValue : number
    invoicedAmountInCurrency : number
    invoiceNumber : string
    TODDetails : string
    TODPlace : string

DeclarationType
    declarationTypeCode : enum = {F,1,2,3}
    declarationTypeName : string

Declaration
    declarationId : string
    referencePeriod : string
    dateTimeOfDeclaration : date
    firstLast : enum = {F,L}
    / totalInvoicedAmount : number
    / totalNetMass : integer
    / totalStatisticalValue : number
    / totalNumberLines : integer
    previousDeclarationId : string

PSI
    interchangeAgreementId : string
    password : string

TDP
    interchangeAgreementId : string
    password : string

Envelope
    envelopeId : string
    dateTimeOfEnvelope : date
    acknowledgementRequest : boolean
    / authentication : string
    testIndicator : boolean
    / numberOfDeclarations : integer
    applicationReference : string
    softwareUsed : string

Named associations and constraints:
    is completed by
    corrects
    provides
    sends {An envelope is sent by a TDP or a PSI.}
    receives


Annex 3: INSTAT XML, version 3.0

INSTAT XML is derived from the UML class diagram called "INSTAT message, implementation model, version 3.2".

The diagram describes the structure of the elements and attributes of the document. INSTAT XML is specified with the "XML Authority" tool.

By defining what elements may be found within what elements, a structure for the document is established. This structure can be thought of as a tree where the "root" is the encompassing element and its branches are the elements and attributes that may be contained within it (as defined by the content model). In turn, each branch may have branches defined by its content model.

Document conventions

The following example gives the conventions used to represent the elements, attributes, types and occurrences.

Occurrences of elements and attributes are represented by:

• "nothing" meaning one and one time only,

• ? meaning zero or one time,

• * meaning zero or more times,

• + meaning one or more times.

Elements contain other element(s) or, like attributes, can have the types "id", "string", "boolean", "date", "time", "integer", "URI", "number".
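The occurrence symbols follow the usual DTD content-model notation (no symbol, ?, * and +). As a small illustration in Python, the hypothetical helper `occurrence_ok` below checks whether a given number of occurrences satisfies an indicator; the function name and the use of an empty string for "nothing" are assumptions.

```python
def occurrence_ok(indicator: str, count: int) -> bool:
    """Return True if `count` occurrences satisfy the occurrence indicator."""
    bounds = {
        "": (1, 1),      # no symbol: one and one time only
        "?": (0, 1),     # zero or one time
        "*": (0, None),  # zero or more times
        "+": (1, None),  # one or more times
    }
    low, high = bounds[indicator]
    return count >= low and (high is None or count <= high)
```

For example, `occurrence_ok("+", 0)` is False, which corresponds to the rule that an envelope must contain 1 to n declarations, while `occurrence_ok("*", 0)` is True, matching the 0 to n items allowed in a declaration.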

Diagram

To avoid shrinking the XML INSTAT diagram onto a single page, it is presented in three parts:

• First part: Details related to the envelope. The envelope contains an identification (envelopeId), a date and time of preparation (DateTime), the different parties involved in the exchange of the envelope, other information such as acknowledgement request, software used, etc., 1 to n Intrastat declarations and the number of declarations contained in the envelope.

• Second part: Details related to the declarations. Each declaration contains an identification (declarationId), different elements defining it, 0 to n statistical items and the total number of items (totalNumberLines).

• Third part: Details related to the items of a declaration. Each item contains an identification (itemNumber) and its own elements.
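The nesting of envelope, declarations and items, together with the two counters, lends itself to a simple consistency check on a received message. The sketch below, in Python with the standard `xml.etree.ElementTree` module, is an illustration only: the element names `Envelope`, `Declaration` and `Item` and the layout of `SAMPLE` are assumptions modelled on the UML names, not the actual INSTAT XML schema.

```python
import xml.etree.ElementTree as ET


def check_counts(envelope_xml: str) -> list:
    """Return a list of mismatches between stated and actual counts."""
    root = ET.fromstring(envelope_xml.strip())
    problems = []

    # Envelope level: numberOfDeclarations against the declarations present
    declarations = root.findall("Declaration")
    stated = root.findtext("numberOfDeclarations")
    if stated is None or int(stated) != len(declarations):
        problems.append(
            f"envelope: numberOfDeclarations={stated}, found {len(declarations)}"
        )

    # Declaration level: totalNumberLines against the items present
    for declaration in declarations:
        items = declaration.findall("Item")
        stated = declaration.findtext("totalNumberLines")
        if stated is None or int(stated) != len(items):
            ident = declaration.findtext("declarationId")
            problems.append(
                f"declaration {ident}: totalNumberLines={stated}, found {len(items)}"
            )
    return problems


SAMPLE = """
<Envelope>
  <envelopeId>ENV1</envelopeId>
  <Declaration>
    <declarationId>D1</declarationId>
    <Item><itemNumber>1</itemNumber></Item>
    <Item><itemNumber>2</itemNumber></Item>
    <totalNumberLines>2</totalNumberLines>
  </Declaration>
  <numberOfDeclarations>1</numberOfDeclarations>
</Envelope>
"""
```

Calling `check_counts(SAMPLE)` returns an empty list for this consistent message; changing one of the counters produces one entry per mismatch.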


First part: Details related to the envelope


Second part: Details related to the declaration


Third part: Details related to the item