
NIMBLE Collaboration Network for Industry, Manufacturing, Business and Logistics in Europe

© D3.2 Catalogue Ingestion and Semantic Annotation Page 1 of 42

Collaborative Network for Industry, Manufacturing, Business and Logistics in Europe

D3.2 Catalogue Ingestion and Semantic Annotation

Project Acronym NIMBLE

Project Title Collaboration Network for Industry, Manufacturing, Business and Logistics in Europe

Project Number 723810

Work Package WP3 Core Business Services for the NIMBLE Platform

Lead Beneficiary SRDC

Editor Suat Gönül SRDC

Reviewers Benny Mandler, Dietmar Glachs, Gianluca D’Acosta, Wernher Behrendt

IBM, SRFG, ENEA

Contributors

Dissemination Level PU

Contractual Delivery Date 30/06/2017

Actual Delivery Date 07/07/2017

Version 1.0


Abstract

This document provides details about product catalogue publishing, which is the way for companies to make themselves discoverable on NIMBLE. Aiming for a better search experience for users, we introduce semantic annotation for enriching an object of interest for NIMBLE in terms of machine-processable metadata. We follow a methodology similar to the one used by existing eCommerce platforms and let users select a product category so that the Catalogue Registry can recommend product-specific metadata during the publishing process. We give a short state of the art on product taxonomies and explain our reasoning for selecting eClass as the built-in taxonomy of NIMBLE.

We use Universal Business Language (UBL) as the common data model of the Catalogue Registry because it contains appropriate data elements for catalogue/product management that are also connected to broader data elements related to supply chain operations such as ordering, fulfilment, delivery and invoicing. We present how we modified the initial UBL schema by cropping several irrelevant data elements and by adding some new elements in response to the data representation requirements of our use cases.

Inspired by the ebXML Registry standard, which provides a specification for sharing data and metadata between organizational entities in a federated environment, NIMBLE makes a distinction between the repository and registry concepts, in which actual data and metadata are stored respectively. Currently, we use a global metadata registry, which is populated with metadata that could be obtained from catalogues in varying formats and stored in disparate databases accordingly. We introduce an adapter concept called Feature Extractor, which runs on dedicated catalogue/product definition formats, extracts semantic metadata from the data represented in those formats and populates the metadata registry. The metadata holds references to the repositories where the actual data items are stored.

NIMBLE in a Nutshell

NIMBLE is the collaboration Network for Industry, Manufacturing, Business and Logistics in Europe. It will develop the infrastructure for a cloud-based, Industry 4.0, Internet-of-Things-enabled B2B platform on which European manufacturing firms can register, publish machine-readable catalogues for products and services, search for suitable supply chain partners, negotiate contracts and supply logistics. Participating companies can establish private and secure B2B and M2M information exchange channels to optimise business workflows. The infrastructure will be developed as open source software under an Apache-type, permissive license. The governance model is a federation of platforms for multi-sided trade, with mandatory interoperation functions and optional added-value business functions that can be provided by third parties. This will foster the growth of a net-centric business ecosystem for sustainable innovation and fair competition as envisaged by the Digital Agenda 2020. Prospective NIMBLE providers can take the open source infrastructure and bundle it with sectorial, regional or functional added value services and launch a new platform in the federation. The project started in October 2016 and will last for 36 months.


Document History

Version Date Comments

V0.1 05/06/2017 Initial structure

V0.2 20/06/2017 Input for Furniture Ontology

V0.3 20/06/2017 Input for Catalogue Ontology

V0.4 29/06/2017 Pre-final version for review

V0.5 30/06/2017 Updates based on feedback from Dietmar Glachs, Benny Mandler and Gianluca d’Acosta

V0.6 02/07/2017 Final Review: Wernher Behrendt

V1.0 07/07/2017 Submission of final version


Table of Contents

Abstract

NIMBLE in a Nutshell

1 Introduction

2 Semantic Annotation

2.1 Product Category Taxonomies

2.1.1 eClass

2.1.2 Domain-Specific Taxonomies

2.1.2.1 Furniture Taxonomy

2.1.2.2 Textile Taxonomy

2.2 Data Representation

2.2.1 Basic Elements

2.2.2 Universal Business Language (UBL)

2.2.3 Ontology Schema

2.3 Linking the Taxonomies with the Data Model

3 Ingestion Design

3.1 Persistence Modalities

3.2 Ingestion Modalities

4 API Design

4.1 REST API

4.2 Java API

5 User Interfaces

5.1 Mock-ups

5.2 User Interfaces

6 Future Work

7 Summary

8 References

9 Appendices

9.1 Annex A - Excerpt of Entities Involved in the MICUNA Use Case

9.2 Annex B – Initial Alignment of the Data Elements of the Micuna Use Case with the UBL Elements

9.3 Annex C – Mock-ups for Catalogue Registry

9.4 Annex D – Snapshots from the Catalogue Registry UI

List of Figures

Figure 1 – (Left) Some of eClass root categories, (Right) Category hierarchy for logistics services

Figure 2 Properties associated with the "Train transport" category

Figure 3 - Structural elements of eClass

Figure 4 Relationship between main concept categories

Figure 5 UBL Graphical representation for CatalogueLine schema

Figure 6 High-Level View of the Catalogue ontology schema

Figure 7 Integration of eClass resources into the UBL data model

Figure 8 Storing product data and metadata

Figure 9 Ingestion flow for product data and metadata

Figure 10 Implementation status of feature extractors and database setups

Figure 11 Search for product categories with a keyword

Figure 12 Categories listed for the provided keyword

Figure 13 Category selection by traversing the taxonomy hierarchy

Figure 14 Publishing a single product via the Catalogue Registry UI

Figure 15 Downloading and uploading category templates

Figure 16 Keyword-based category search

Figure 17 Category search results

Figure 18 Product properties recommended for the selected category

Figure 19 Downloading and uploading category templates

List of Tables

Table 1: Acronyms table

Table 2 - Statistics about taxonomies [1]

Table 3 ProductCategoryController services

Table 4 CatalogueController services

Table 5 CatalogueService methods


Acronyms

Table 1: Acronyms table

Acronym Meaning

ABIE Aggregate business information entity

API Application Programming Interface

B2B Business to business

BBIE Basic business information entity

BFO Basic Formal Ontology

CRUD Create – read – update – delete

CSV Comma-separated values

ebXML Electronic Business XML

GUI Graphical User Interface

IEC International Electrotechnical Commission

ISO International Organization for Standardization

JSON JavaScript Object Notation

LDPath Linked data path

M2M Machine to machine

NIMBLE Collaboration Network for Industry, Manufacturing, Business and Logistics in Europe

RDF Resource description framework

REST Representational State Transfer

SDK Software Development Kit

SPARQL SPARQL Protocol and RDF Query Language

UBL Universal Business Language

XML Extensible Markup Language

XSD XML Schema Definition


1 Introduction

The ultimate aim of NIMBLE is to optimize companies’ B2B operations throughout the supply chain by orchestrating and automating data exchange among the participants of the chain in different phases such as sub-contracted manufacturing, transportation, etc. The orchestration of data exchange towards that automation is realized through business processes involving at least two parties with specific roles feeding the process with relevant data.

Partner discovery is the initial step prior to performing B2B operations on NIMBLE unless the user, initiating a business process, already knows which company to look for. Catalogue Registry is the main enabler of the partner discovery phase as it enables companies to introduce themselves to the NIMBLE platform with the products they supply and the services they provide.

In order to enable users to find what they are looking for quickly, Catalogue Registry offers publishing products with semantically relevant annotations. The registry makes use of generic and sector-specific taxonomies as knowledge bases from which relevant annotations can be obtained automatically given a particular product category. This document presents our approach aiming to ease the process for users to publish their products where the publishing process is supported with these taxonomies. We give details about their usage from the users’ perspective.

We manage data and metadata of products in different ways. While metadata are kept in a global registry, raw data, which could have varying formats, are kept in disparate repositories. Maintaining all the metadata in a single registry enables querying across products whose raw definitions have heterogeneous structures. Once a product is identified, its complete, structured definition can be fetched from the respective repository.
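The split between the global registry and the disparate repositories can be illustrated with a minimal Python sketch. It is not the Catalogue Registry implementation: the record fields, the repository names and the sample contents are all hypothetical and only show the lookup pattern (query the metadata once, then follow the stored reference to the raw definition).

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """One registry entry; every field name here is an illustrative assumption."""
    product_id: str
    category: str
    properties: dict
    repository: str  # name of the repository that holds the raw definition


# Hypothetical repositories, each keeping raw data in its own format.
REPOSITORIES = {
    "ubl-store": {"p1": "<Catalogue>...</Catalogue>"},
    "csv-store": {"p2": "id;name\np2;Oak table"},
}

# The single global registry holds only metadata plus a repository reference.
REGISTRY = [
    MetadataRecord("p1", "Train transport", {"quality level": "high"}, "ubl-store"),
    MetadataRecord("p2", "Table", {"material": "oak"}, "csv-store"),
]


def find(category):
    """Query the registry uniformly, regardless of the raw data format."""
    return [record for record in REGISTRY if record.category == category]


def fetch_definition(record):
    """Follow the registry reference to the raw definition in its repository."""
    return REPOSITORIES[record.repository][record.product_id]
```

Calling `find("Table")` returns the one matching record, and `fetch_definition` on that record returns the raw CSV text from the hypothetical `csv-store`.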

Based on the technical design considerations for dealing with the heterogeneity of catalogue/product data formats along with the storage mechanisms for management of data and metadata separately, we provide details about the current status of the API and relevant implementation. We intend to maintain this document beyond the current state, for documentation of future improvements of the Catalogue Registry component.

Not only physical goods can be the subject of a business process but also intangible services like painting, transportation, etc. We will be using the term “product” in the rest of the document referring to both of these concepts.


2 Semantic Annotation

Describing a product with a paragraph of text is not a mechanism on which user-friendly, effective search mechanisms can be built apart from keyword-based search. In contrast to a purely text-based approach, semantic annotation is the method for describing products with a set of machine-readable properties in addition to the free text descriptions. Such annotations can easily be processed, indexed and used for providing advanced search capabilities. They make the discovery process faster and more effective through the use of faceted search, which is the de facto search method used by eCommerce platforms. Utilizing the relations among the product categories as well as relations with other products set via the properties, NIMBLE also provides semantic search on the persisted data.

Semantic search provides several ways of discovery. For example, users can perform exploratory search by broadening or narrowing the search scope by using the categories residing in varying levels of the hierarchy as filtering criteria. Considering the logistics service hierarchy depicted in Figure 1, users can broaden the search scope by selecting a higher-level category, e.g. logistics, and narrow it by selecting a lower-level category, e.g. train transport. Furthermore, specific properties of categories can be used to narrow the search scope further, e.g. the quality level property of the train transport category. Structured metadata enables widening the search scope with linguistic extensions as well, e.g. synonyms of original category names. From the opposite point of view, such additional information can be indexed as metadata of products, and afterwards the products could be discovered for any keyword matching the indexed metadata. Continuing from the same example, searching with the logistics keyword might yield the rail transport service as a result.
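The indexing idea described above can be sketched in a few lines of Python. The hierarchy excerpt, the synonym table and the product identifiers are made up for illustration and do not reproduce eClass content; the sketch only shows how indexing a product under its category, the ancestors of that category and their synonyms lets a broad keyword find a narrowly categorised service.

```python
# Hypothetical child -> parent excerpt of a category hierarchy.
PARENT = {
    "Train transport": "Transport",
    "Road transport": "Transport",
    "Transport": "Logistics",
    "Logistics": None,
}

# Hypothetical synonyms indexed as additional metadata.
SYNONYMS = {"Train transport": {"rail transport"}}

# Hypothetical products, each annotated with one category.
PRODUCTS = {
    "rail-service-1": "Train transport",
    "truck-service-1": "Road transport",
}


def ancestors(category):
    """Return the category followed by all of its ancestors up to the root."""
    chain = []
    while category is not None:
        chain.append(category)
        category = PARENT[category]
    return chain


def index_terms(category):
    """Lower-cased keywords under which a product of this category is indexed."""
    terms = set()
    for name in ancestors(category):
        terms.add(name.lower())
        terms |= SYNONYMS.get(name, set())
    return terms


def search(keyword):
    """Return the identifiers of all products indexed under the keyword."""
    keyword = keyword.lower()
    return sorted(p for p, c in PRODUCTS.items() if keyword in index_terms(c))
```

With this data, `search("logistics")` returns both services (broad scope), while `search("rail transport")` returns only the rail service through the synonym.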

2.1 Product Category Taxonomies

As already stated, product category taxonomies are knowledge bases structured as a hierarchy of categories. In addition to enriching product descriptions with machine-processable semantic metadata, such taxonomies provide a common terminology for search processes where single concepts can be represented in multiple languages.

We analyzed a set of available category taxonomies in order to identify the most appropriate one to be used as the default taxonomy in NIMBLE. [1] provides the following table about the coverage of the available taxonomies called classification systems. As can be seen from the table, eClass is the taxonomy with the highest number of classes (i.e. categories) and properties.


Table 2 - Statistics about taxonomies [1]

As shown in Annex A - Excerpt of Entities Involved in the MICUNA Use Case, we also compared the taxonomies with respect to the coverage of the concepts used in the Micuna use case. Again, eClass outperforms the other taxonomies. As a result, we decided to use eClass for the generic NIMBLE instance. In the subsequent sections, we provide details about eClass and the Furniture taxonomy, which is the first sector-specific taxonomy that we have integrated into NIMBLE so far.

2.1.1 eClass

eClass is an ISO/IEC compliant industry standard for cross-industry product and service classification. Although we use its English version, it is an international standard with translations to several other languages. The latest version (10.0.1) of the taxonomy was released in February 2017 with 41,000 product categories and 17,000 properties¹. In line with NIMBLE’s objectives, the eClass taxonomy contains resources for both tangible products like furniture fittings and intangible services like rail transportation. The complete hierarchy can be browsed at their website².

1 http://wiki.eclass.eu/wiki/Category:Products

2 http://www.eclasscontent.com/


Figure 1 – (Left) Some of eClass root categories, (Right) Category hierarchy for logistics services

eClass has a category hierarchy with 4 levels. Figure 1 shows the root level categories on the left; and on the right is the expanded sub-hierarchy for the root logistics category. Figure 2 shows the properties associated with the train transport category along with some related keywords.


Figure 2 Properties associated with the "Train transport" category

As shown in Figure 3, only leaf level categories are associated with properties. eClass also constrains the value sets for some properties. Furthermore, some of the properties might have different value sets depending on the category they are assigned to.
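These three structural rules (properties only on leaf categories, constrained value sets, and value sets that vary per category) can be mimicked with a small lookup structure. The Python sketch below is only an assumption about how such a structure could look; the category names, property names and values are invented and are not taken from the eClass release.

```python
# Hypothetical excerpt: only leaf categories carry properties.
LEAF_PROPERTIES = {
    "Train transport": ["quality level", "maximum load"],
    "Road transport": ["quality level"],
}

# Default value set per property; properties missing here are unconstrained.
DEFAULT_VALUES = {"quality level": {"low", "medium", "high"}}

# Category-specific overrides of a property's value set.
CATEGORY_VALUES = {("Train transport", "quality level"): {"standard", "premium"}}


def allowed_values(category, prop):
    """Return the constrained value set, or None if the property is unconstrained."""
    if prop not in LEAF_PROPERTIES.get(category, []):
        raise KeyError(f"{prop!r} is not a property of leaf category {category!r}")
    return CATEGORY_VALUES.get((category, prop), DEFAULT_VALUES.get(prop))


def is_valid(category, prop, value):
    """Check a value against the value set that applies for this category."""
    allowed = allowed_values(category, prop)
    return allowed is None or value in allowed
```

Here the same property, `quality level`, accepts `premium` for one category but not for the other, and `maximum load` accepts any value because no value set is defined for it.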


Figure 3 - Structural elements of eClass³

We requested the eClass resources, a set of CSV files, from the organization maintaining the taxonomy, transformed them into a machine-processable format and, in the first version, persisted them in a relational database. In more recent versions, we have also kept the taxonomy in the triple store Marmotta⁴, which is an open source Apache project. The reason for doing so is to better utilize the semantic relations among the categories of the taxonomy.
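To give an idea of what moving a CSV-based taxonomy into a triple store involves, the following Python sketch turns category rows into N-Triples lines. It is a simplified assumption, not the transformation used in NIMBLE: the column names (`id`, `label`, `parent_id`), the semicolon delimiter and the `http://example.org/category/` namespace are placeholders, and the choice of the SKOS `prefLabel` and `broader` predicates is ours.

```python
import csv
import io

BASE = "http://example.org/category/"  # placeholder namespace, not an eClass IRI
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical CSV layout; the real eClass files are structured differently.
SAMPLE_CSV = "id;label;parent_id\n1;Logistics;\n2;Train transport;1\n"


def escape(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(csv_text):
    """Emit one label triple per category and one broader triple per child."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text), delimiter=";"):
        subject = f"<{BASE}{row['id']}>"
        triples.append(f'{subject} <{SKOS}prefLabel> "{escape(row["label"])}"@en .')
        if row["parent_id"]:
            triples.append(f"{subject} <{SKOS}broader> <{BASE}{row['parent_id']}> .")
    return triples
```

Once the hierarchy is expressed as `broader` links, a triple store can answer questions such as "all categories below Logistics" with a path query, which is the kind of semantic relation a flat relational table makes harder to exploit.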

2.1.2 Domain-Specific Taxonomies

One of the ways in which NIMBLE can be specialized for specific sectors is to let the sector players be visible on NIMBLE with their sector-specific, structured information. Thanks to the extensible mechanism for additional classification taxonomies, companies can annotate their products with the information available in these taxonomies. It is worth highlighting that an extension of NIMBLE is not a work to be performed by the parties trading on the platform but is an option for platform providers serving NIMBLE to a specific sector. It is also possible that third parties would offer such stronger semantic services on a platform.

2.1.2.1 Furniture Taxonomy

The furniture taxonomy has been updated from a previous version of the taxonomy of processes in the furniture industry and the furniture ontology, both managed by AIDIMME.

As a result, the merged taxonomy includes the following main resources focused on the furniture sector:

• Processes and techniques: activities organised by main production stages, from first wood processing until product delivery, as well as related techniques.

• Machinery: machines for furniture manufacturing including manual, automatic and semiautomatic devices.

• Equipment: tools, protective and safety equipment used by workers.

• Materials, substances and textiles: categorisation of most used components in furniture manufacturing.

• Assembly hardware: fitting elements used in pieces of furniture.

• Product catalogue: categorisation of furniture products considering finished pieces of furniture and furniture complements.

3 http://wiki.eclass.eu/wiki/Category:Structure_and_structural_elements

4 http://marmotta.apache.org/

The product catalogue branch has its origins in the Furniture Ontology, which is based on the international standard for data exchange ISO 10303-236⁵ (Industrial automation systems and integration -- Product data representation and exchange -- Part 236: Application protocol: Furniture catalogue and interior design), also known as funStep, focused on the furniture and wood cluster. This was developed to define a common vocabulary expressed in a formal way and capturing the most used concepts inside the furniture and furniture-related industry, including properties and relationships between them.

2.1.2.2 Textile Taxonomy

The textile taxonomy has been derived from the ontology based on the eBIZ/Moda-ML business standard, which is generated by the CEN initiative.

The eBIZ/Moda-ML standard is mainly focused on the collaboration aspect of the textile/clothing production and covers the processes that involve two or more companies or actors. For this reason, the elements that can be derived from this standard are more related to these collaboration aspects than the classification/categorization of goods.

Thus, the OntoMODA ontology, derived from the standard, is mainly based on the description and categorization of the different business processes, and is poor on the categorization of the products. However, the richness of different types of data in the Moda-ML dictionary related to textile/clothing products can be used to further develop the catalogue by including also the categorization of goods. Hence, a taxonomy of concepts can be derived from the following resources:

• Business Processes: activities organised by different collaboration steps linked with relevant documents for the specific interaction;

• Product catalogues: collection of yarn, fabric, garment and accessory products.

5 https://www.iso.org/standard/42340.html


At the moment, the catalogues are plain lists of goods (for example a list of yarns) with different properties (both commercial information and technical data sheet relevant for all the commercial aspects). These properties are described in a very detailed way and the analysis process has taken into account almost all the properties that characterize a textile/clothing product.

On the other hand, a real product catalogue in the textile/clothing sector is a very complex object containing, for example, not only the images of the yarns, but also real samples that differ in colour or in production processes. In the textile/clothing sector, buyers want to hold the product in their hands, look at the colour in a very precise way and select the goods directly.

For this reason, the use of the eBIZ/Moda-ML catalogues has been restricted to a few cases for demonstration purposes.

However, the realization of electronic textile catalogues for yarn and fabrics may be boosted by the producers in a future NIMBLE exploitation.

2.2 Data Representation

Management of catalogue data is a challenging task in NIMBLE. Firstly, NIMBLE has to deal with, on the one hand, static data about companies and, on the other hand, relatively dynamic data for the products they provide and also different trading conditions related to the products. All this information must be kept in a flexible way enabling the development of the envisioned search capabilities.

2.2.1 Basic Elements

We start with an overview of the most important conceptual elements in the domain of product catalogues. A high-level visualization of the ontology schema is presented in section 2.2.3 Ontology Schema.

The core aspects within the product catalogue domain are Business Entity, Resource Entity and Dependent Entity. When there is a need for more detailed domain-specific descriptions for the Resource Entities, the core concepts can be extended with additional taxonomies such as the eClass taxonomy or a furniture domain-specific taxonomy using linked data principles. The basic relationship between the entities is depicted in Figure 4. A business entity owns resource entities, and can offer some of the resource entities to others. The business entity as well as the resource entities are described in the dependent entity with properties such as contact and price. In addition, the resource entities can be detailed and further specified with extensions. In the following, these four concept categories are described:


Figure 4 Relationship between main concept categories BusinessEntityConcept

ABusinessEntityisalegalagentthatparticipatesinsomeactivitiesinasupplychain.ItcanbealegalPartyoraPerson,whichhasanaturallydetaileddescriptionsuchasnameandcontacts.ABusinessEntitycanoffersomeResources,ortakepartinsub-processesinasupplychain,forexample,forresourcesacquisition.WhenaBusinessEntityperformssomeactivitiesinasupplychain, it takesusuallyaBusinessRole.TypicalBusinessRoles ina supply chainareEndUser,Manufacturer,Supplier,Retaileretc.

ResourceEntityConcept

AResourceEntityisakindofproductorservicethataBusinessEntityholds.ItcanbeanItemor Aggregated Resource such as Catalogue. The entity Item is arranged based on thetaxonomies in the Basic Formal Ontology (BFO) [2]: Continuant Resource Entity, OccurrentResource Entity. Continuant Resource Entity is a kind of resource that continues to existthroughtime,forexample,somePhysicalProductorDigitalResources.Bycontrast,OccurrentResource Entity is a kindof resource that exists for a certain timeperiod, such as a LogisticService. Each Resource Entity has some resource specific technical characteristics orcommercial properties such as price and specification. These kinds of specifications aredetailedintheDependentEntitydescription.

DependentEntityConcept

The next major category of concepts is the Dependent Entity, which is mainly dependent upon the Business Entity and Resource Entity. The concepts in the Dependent Entity include entities that are used to describe the details of the Business Entity and Resource Entity. The concepts dependent upon a Business Entity can be, for example, Contact, Financial Account and Business Role. The concepts for specifying a Resource Entity can be, for example, Price, Commodity Classification, Identifier, etc. Based on the concept Commodity Classification, a Resource Entity can be linked to other extensions such as the eClass taxonomy (footnote 6) or other product ontologies.

6 http://www.eclasscontent.com/index.php?language=en


Extensions

The Extension is a very special category of concepts used in the NIMBLE platform in order to keep the data model extensible and usable for diverse use cases. In different use cases or supply chains, diverse resources are exchanged. Each resource has its own domain-specific properties. It is typically hard to manage every single property for all different resources in various domains in a single ontology. There are usually domain-specific product ontologies for the description of resources in a specific domain. In the NIMBLE platform, we intend to reuse this kind of domain-specific resources to detail the Resource Entity. For example, the eClass taxonomy can be applied to classify a Resource Entity, and the respective properties in an eClass category are able to specify the Resource Entity. In the same way, other product domain taxonomies such as the Furniture Taxonomy in the furniture domain or the MODA-ML ontology in the textile domain can be used as extensions. Details on the extension will be given in deliverable D2.2 (Semantic Modelling of Manufacturing Collaboration Assets).
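The relationships among the four concept categories can be illustrated with a minimal Python sketch. It is not part of the NIMBLE data model: the class and field names (`owns`, `offers`, `dependent`, `extensions`), the example URI and the example values are assumptions chosen only to mirror the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResourceEntity:
    """A product or service held by a business entity."""
    name: str
    # Dependent-entity details such as price or identifier.
    dependent: Dict[str, str] = field(default_factory=dict)
    # Extensions: links to external taxonomy categories (hypothetical URIs).
    extensions: List[str] = field(default_factory=list)


@dataclass
class BusinessEntity:
    """A legal agent (party or person) taking part in a supply chain."""
    name: str
    # Dependent-entity details such as contact or business role.
    dependent: Dict[str, str] = field(default_factory=dict)
    owns: List[ResourceEntity] = field(default_factory=list)
    offers: List[ResourceEntity] = field(default_factory=list)

    def offer(self, resource: ResourceEntity) -> None:
        """Offer a resource to others; only owned resources can be offered."""
        if resource not in self.owns:
            raise ValueError("a business entity can only offer resources it owns")
        if resource not in self.offers:
            self.offers.append(resource)


# Hypothetical example data, not taken from a NIMBLE catalogue.
tile = ResourceEntity(
    name="Wall Tile",
    dependent={"price": "12.50 EUR"},
    extensions=["http://example.org/taxonomy/tiles"],
)
supplier = BusinessEntity(name="Example Supplier", dependent={"role": "Supplier"})
supplier.owns.append(tile)
supplier.offer(tile)
```

In this sketch the dependent entities are reduced to plain key-value pairs; in the ontology they are concepts of their own, such as Contact or Price.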

2.2.2 Universal Business Language (UBL)

Universal Business Language (UBL) is a world-wide standard providing a royalty-free library of standard electronic XML business documents that are commonly used in supply chain operations. UBL also defines common business processes as well-defined data exchange flows among trading partners. Cataloguing is one of these processes and UBL defines a document schema (footnote 7) for the representation of catalogues.

We use UBL as the common data model of our Catalogue Registry, as it contains appropriate data elements for catalogue/product management that are also connected to broader data elements related to supply chain operations such as ordering, fulfilment, delivery and invoicing. UBL has good coverage in terms of the data elements required in our use cases. We mapped the data elements required in the Micuna use case to data elements defined in the UBL data model. As can be seen in Annex B, which shows the initial mapping effort, most of the Micuna data elements had corresponding elements in the UBL data model.

In addition to document schemas that are rather complex, UBL provides a set of reusable building blocks for constructing these documents. UBL provides two main groups of reusable elements. Each element in the first group is named a Basic Business Information Entity (BBIE). As the name implies, such elements can only have one value, with varying data types. While data types can be built-in data types such as integer, string, date and so on, they can also be associated with additional attributes such as the measurement unit for a quantity, the MIME type for a binary object, etc.

7 http://docs.oasis-open.org/ubl/os-UBL-2.1/UBL-2.1.html#T-CATALOGUE


Entities in the second group are called Aggregate Business Information Entities (ABIE). These are collections of BBIEs and can also have references to other ABIEs. Document schemas such as Catalogue carry specific semantics but have the same structure as ABIEs.
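The distinction between the two groups of building blocks can be sketched as a small data structure. This is an illustrative Python sketch, not the UBL schema itself: the classes `BBIE` and `ABIE`, their helper methods and the `unit` attribute are assumptions, while the element names `cbc:Name`, `cbc:Value` and `cac:AdditionalItemProperty` follow the prefixes used in this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class BBIE:
    """Basic Business Information Entity: a single value plus optional attributes."""
    name: str
    value: Union[str, int, float, bytes]
    attributes: Dict[str, str] = field(default_factory=dict)


@dataclass
class ABIE:
    """Aggregate Business Information Entity: BBIEs and references to other ABIEs."""
    name: str
    children: List[Union[BBIE, "ABIE"]] = field(default_factory=list)

    def basic(self, name: str) -> List[BBIE]:
        """Return the direct BBIE children with the given element name."""
        return [c for c in self.children if isinstance(c, BBIE) and c.name == name]

    def aggregates(self, name: str) -> List["ABIE"]:
        """Return the direct ABIE children with the given element name."""
        return [c for c in self.children if isinstance(c, ABIE) and c.name == name]


# An item (ABIE) holding a name (BBIE) and a nested item property (ABIE).
item = ABIE("cac:Item", [
    BBIE("cbc:Name", "Super Bathroom"),
    ABIE("cac:AdditionalItemProperty", [
        BBIE("cbc:Name", "Wall Tile"),
        BBIE("cbc:Value", "Green tiles"),
    ]),
])

# A BBIE whose data type carries an additional attribute (hypothetical unit).
quantity = BBIE("cbc:Quantity", 20, {"unit": "piece"})
```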

The UBL data model includes the main concepts that are required in the scope of the Catalogue Registry component. In addition to the high-level Catalogue document concept, it includes concepts for representing companies, persons, catalogues, products, product properties, delivery terms, trading terms and so on.

Figure 5 UBL Graphical representation for Catalogue Line schema


UBL defines the Catalogue document schema such that it contains a set of Catalogue Lines corresponding to products traded by a party. Figure 5 shows a graphical representation of a portion of the Catalogue Line data schema. The elements referred to with a “cbc” prefix are BBIEs defined in the common BBIE library provided by UBL. Likewise, the ones referred to with a “cac” prefix are ABIEs from the common ABIE library. Although we largely preserved the original schema provided by UBL, we made some modifications based on the requirements of our use cases. First, in order to start with a relatively simple data model, we removed elements that are not required in our use cases, mainly the Micuna use case. Second, we also added some elements. In Annex B, we had initially marked with TBD the Micuna use case data elements that do not have a corresponding element in UBL. Those are mainly the elements for which we customized the original UBL data model. For example, we added concepts related to payment and delivery conditions to the company concept. The current version of the data model for the Catalogue Registry can be found in the repository given in footnote 8.

2.2.3 Ontology Schema

This sub-section provides a high-level visualization of the ontology schema. It follows the key concepts listed in sub-section 2.2.1 Basic Elements and the definitions in sub-section 2.2.2 Universal Business Language. A high-level visualization of the catalogue domain ontology is illustrated in Figure 6. For readability, only the most important concepts are shown in the figure, and the object properties as well as data type properties are not detailed. More detailed information can be found in the separate deliverable D2.2 Semantic Modelling of Manufacturing Collaboration Assets.

Figure 6 High-Level View of the Catalogue ontology schema

8 https://github.com/nimble-platform/catalog-service-srdc/tree/master/business/ubl-data-model/src/main/schema/NIMBLE-UBL-2.1-Catalog-Subset


As depicted in the figure above, the key concepts described in sub-section 2.2.1 are included. In a high-level view, a Business Entity can offer some Resource Entities. The Business Entity as well as Resource Entities can have properties, which are detailed in Dependent Entities. In addition, the resource entity can be extended with other resources such as the eClass ontology, the Furniture taxonomy in the furniture domain or MODA-ML in the textile domain for detailed domain-specific resource specification. The extension is made possible mainly by linked data mechanisms.

Most of the concepts in the ontology schema are derived from the UBL standard data model, which is described in sub-section 2.2.2. The catalogue-related concepts and properties in the UBL data model are tailored and transformed into ontology classes and object/data type properties in the catalogue ontology schema shown above. For example, the UBL concepts of Catalogue and Item are transformed into the ontology classes Catalogue and Item. In order to describe commercial conditions for the resource entity, the payment, price, warranty and delivery related concepts from UBL, such as PaymentTerms, are included. Many other concepts derived from the UBL data model, such as Period and Address, are not listed in the above diagram for readability. For the same reason, the relationships, which take the form of ontology object properties and data type properties, are not detailed in the diagram. For example, a Catalogue can have many Catalogue Lines. Each Catalogue Line can have its own Warranty Information, Warranty Party and Warranty Validity Period. In this example, Warranty Information is a data type property in the ontology, whereas Warranty Party and Warranty Validity Period are both object properties with links to the respective ontology classes Party and Period.
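The difference between data type properties and object properties in the warranty example can be made concrete with a few triples. The following Python sketch uses plain tuples rather than an RDF library; the `Ref` marker class, the helper functions and the identifiers `line-1`, `party-1` and `period-1` are assumptions for illustration only.

```python
from typing import List, NamedTuple, Union


class Ref(str):
    """A reference to another resource (the object of an object property)."""


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: Union[str, Ref]


def describe_catalogue_line(line: str, warranty_text: str,
                            party: str, period: str) -> List[Triple]:
    """Describe one Catalogue Line with its warranty-related properties."""
    return [
        Triple(line, "rdf:type", Ref("CatalogueLine")),
        # Data type property: the object is a literal value.
        Triple(line, "WarrantyInformation", warranty_text),
        # Object properties: the objects are resources typed Party and Period.
        Triple(line, "WarrantyParty", Ref(party)),
        Triple(party, "rdf:type", Ref("Party")),
        Triple(line, "WarrantyValidityPeriod", Ref(period)),
        Triple(period, "rdf:type", Ref("Period")),
    ]


def object_properties(triples: List[Triple]) -> List[str]:
    """Predicates (other than rdf:type) whose objects are resource references."""
    return sorted({t.predicate for t in triples
                   if isinstance(t.obj, Ref) and t.predicate != "rdf:type"})


def datatype_properties(triples: List[Triple]) -> List[str]:
    """Predicates whose objects are literal values."""
    return sorted({t.predicate for t in triples if not isinstance(t.obj, Ref)})


triples = describe_catalogue_line("line-1", "Two-year warranty", "party-1", "period-1")
```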

After the catalogue-related concepts are derived from the standard UBL data model, these concepts are re-arranged based on the top-level ontology BFO as well as the key concept categories described in sub-section 2.2.1.

2.3 Linking the Taxonomies with the Data Model

The UBL data model, specifically the ItemType concept, which is used to annotate a product, has an appropriate sub-concept to link the product with an external category. This sub-concept is a common ABIE called CommodityClassificationType. It contains BBIEs to provide the coded value of the category along with a URI reference to the category definition in the target taxonomy.

Once an ItemType is associated with a category, the properties defined by the category are added to the ItemType as ItemProperties. ItemProperty is another common ABIE that can be used to express diverse types of primitive values such as text, numbers or binary objects. This association is realized either on the UI, when users publish products one by one, or when a catalogue template is uploaded to NIMBLE. Ingestion modalities are presented in detail in the next section. Figure 7 shows how the knowledge obtained from the taxonomies is integrated into the UBL data model.
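The linking step described above can be sketched as a small function. This is a simplified illustration, not the Catalogue Registry code: the dictionary layout, the keys `code`, `uri`, `name` and `qualifier`, and the category `EXAMPLE-0001` are hypothetical, and the qualifier values are borrowed from Listing 1.

```python
import copy
from typing import Any, Dict


def classify_item(item: Dict[str, Any], category: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the item linked to the category and extended with its properties."""
    linked = copy.deepcopy(item)
    # Link to the external category: coded value plus URI of the definition.
    linked.setdefault("CommodityClassification", []).append(
        {"code": category["code"], "uri": category["uri"]}
    )
    # Add each category property as an (initially empty) item property.
    existing = {p["Name"] for p in linked.setdefault("AdditionalItemProperty", [])}
    for prop in category["properties"]:
        if prop["name"] not in existing:
            linked["AdditionalItemProperty"].append(
                {"Name": prop["name"], "ValueQualifier": prop["qualifier"], "Value": None}
            )
    return linked


# Hypothetical category; the code and URI are not real taxonomy entries.
category = {
    "code": "EXAMPLE-0001",
    "uri": "http://example.org/taxonomy/EXAMPLE-0001",
    "properties": [
        {"name": "Colour", "qualifier": "STRING"},
        {"name": "Width", "qualifier": "REAL_MEASURE"},
    ],
}
item = {"Name": "Super Bathroom"}
linked = classify_item(item, category)
```

The values of the added properties stay empty until the user fills them in, either on the UI or through an uploaded template.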


Figure 7 Integration of eClass resources into the UBL data model

3 Ingestion Design

We follow the approach introduced by the ebXML Registry standard for managing the information about product data. An ebXML Registry manages any content type and the standardized metadata that describes it [3]. While the former concept is named RegistryItem, the latter is called RegistryObject. While RegistryItems might differ in the type of content, such as image, video, XML document, etc., RegistryObjects have a standard representation. In order to enable operations that work on heterogeneous RegistryItems, ebXML introduces two storage concepts dealing with these two types of data: repository and registry.

Similarly, NIMBLE needs to work with product definitions that have heterogeneous formats and structures. Thus, the Catalogue Registry manages two types of information about products:

1. Product data: Such data contain any information about the product, including any type of binary attachments. The structure of the product data might differ from sector to sector or from company to company. There might even be no structured representation of products at all.

2. Metadata: Metadata are extracted from the product data and stored as key-value pairs.


Figure 8 Storing product data and metadata

Figure 8 depicts the process for storing product data and metadata. The top-level elements indicate that the initial product data might have different structures that are compliant with different standards or that are proprietary. NIMBLE allows providers to use custom Feature Extractors to extract metadata from product data with different structures. For the time being, while the product data might reside in different repositories, extracted properties are kept in a single repository to be able to provide a unified search over the stored products. However, in the future, we might adopt an approach where the metadata is also stored in a distributed way, considering the requirements of NIMBLE federations. See the federation-related future work in Section 6.

As depicted in Figure 7, categories are represented in the taxonomy model before being integrated into the product data model. For the time being, we have wrapped eClass with the taxonomy model, and as future work we will develop wrappers for the sector-specific taxonomies.

3.1 Persistence Modalities

We have two means of storage as an implementation of the design we just introduced. The first is for the data and the second for the metadata extracted from product information. Once a product is published to NIMBLE, first the original content is stored in a database with all associated information. Then, the feature extractors run on the published data and extract metadata as key-value pairs from heterogeneous product data.
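The two-step flow (store the original content, then extract and index metadata) can be sketched with in-memory dictionaries. This is a minimal sketch under stated assumptions: the class `RegistrySketch`, its method names, the format labels `ubl` and `proprietary-csv` and the sample products are all hypothetical, and real storage back-ends are replaced by dictionaries.

```python
from typing import Any, Callable, Dict, List

# A feature extractor turns product data of one format into key-value metadata.
FeatureExtractor = Callable[[Any], Dict[str, Any]]


class RegistrySketch:
    def __init__(self) -> None:
        # One repository per data format; product data may be heterogeneous.
        self.repositories: Dict[str, Dict[str, Any]] = {}
        # A single metadata store enables unified search over all products.
        self.metadata: Dict[str, Dict[str, Any]] = {}
        self.extractors: Dict[str, FeatureExtractor] = {}

    def register_extractor(self, data_format: str, extractor: FeatureExtractor) -> None:
        self.extractors[data_format] = extractor

    def publish(self, product_id: str, data_format: str, data: Any) -> None:
        extractor = self.extractors[data_format]  # KeyError for unknown formats
        # Step 1: store the original content in the format-specific repository.
        self.repositories.setdefault(data_format, {})[product_id] = data
        # Step 2: run the feature extractor and keep the metadata centrally.
        self.metadata[product_id] = {"_format": data_format, **extractor(data)}

    def search(self, key: str, value: Any) -> List[str]:
        """Discover products by metadata only."""
        return [pid for pid, meta in self.metadata.items() if meta.get(key) == value]

    def fetch(self, product_id: str) -> Any:
        """Retrieve the full product data from the relevant repository."""
        data_format = self.metadata[product_id]["_format"]
        return self.repositories[data_format][product_id]


registry = RegistrySketch()
registry.register_extractor("ubl", lambda d: {"item_name": d["Name"]})
registry.register_extractor("proprietary-csv", lambda line: {"item_name": line.split(";")[0]})
registry.publish("p1", "ubl", {"Name": "Super Bathroom", "Description": "Complete bathroom set"})
registry.publish("p2", "proprietary-csv", "Oak Cradle;wood;120")
```

A search runs against the metadata only, after which `fetch` retrieves the full record from whichever repository holds it.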


Metadata are used to enable the discovery process of products on NIMBLE. After the initial discovery, the details about the product can be retrieved from the relevant repository. Nevertheless, NIMBLE also offers alternative search capabilities, as will be described in D3.3 (Product and Service Search Engine and Search Mediator).

Having UBL as the common data model for the representation of product data, the default repository of NIMBLE is a UBL-compliant relational database keeping the product data in a structured way.

On the other hand, we use Apache Marmotta (footnote 9) to store both the actual content and metadata of products. Marmotta uses a triple store to persist its data. The triple store manages the data as RDF triples [4]. To be able to transform UBL-compliant product data into RDF, we use a tool called Ontmalizer (footnote 10). Ontmalizer is able to transform any XML document to RDF as long as the XML document is compliant with an XSD schema, which is the case for UBL-based product data. Once the data is in RDF format, it is sent to Marmotta to be stored.

In addition to providing a triple store, Marmotta has an add-on module using Apache Solr (footnote 11), a.k.a. Solr, for text-based indexing of the data. A Solr index includes a set of index field definitions, each of which corresponds to a different piece of metadata about the submitted document. So, the way to index a document into a Solr index is to extract information for each index field from the original document and put the extracted information into the relevant index field along with a reference to the original document. It is also possible to define a specific index field containing the entire text of the document.

Marmotta supports a query language called LDPath (footnote 12) to query the RDF data stored in the triple store and populate a Solr index with the results of the query. An LDPath expression contains a set of (relative) RDF paths, each of which corresponds to an index field. This means that LDPath is the default Feature Extractor for the UBL-based product data.

9 http://marmotta.apache.org/

10 https://github.com/srdc/ontmalizer

11 http://lucene.apache.org/solr/

12 http://marmotta.apache.org/ldpath/language.html

@prefix cat: <urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2#> ;

@prefix cac: <urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2#> ;

@prefix cbc: <urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2#> ;

@filter rdf:type is cac:ItemType ;

item_name = cbc:Name :: xsd:string ;

item_description = cbc:Description :: xsd:string ;

item_complete_images = cbc:ItemConfigurationImages :: xsd:string ;

df_d = cac:AdditionalItemProperty[cbc:ValueQualifier is "REAL_MEASURE"] / fn:dynamic(cbc:Name, cbc:Value) :: lmf:dynamic (dynamicField="*_d", solrType="double") ;

df_s = cac:AdditionalItemProperty[cbc:ValueQualifier is "STRING"] / fn:dynamic(cbc:Name, cbc:Value) :: lmf:dynamic (dynamicField="*_s", solrType="string") ;

df_st = cac:AdditionalItemProperty[cbc:ValueQualifier is "STRING_TRANSLATABLE"] / fn:dynamic(cbc:Name, cbc:Value) :: lmf:dynamic (dynamicField="*_st", solrType="string") ;

Listing 1 LDPath instance for extracting properties from UBL-based product data

Listing 1 is the LDPath instance being used to extract properties from UBL-based documents. The instance contains a @filter command to select the RDF resources of type cac:ItemType. This class is the ontology class used to annotate a resource as a product. The resources selected by this command are the root resources, whose details are obtained via the RDF path definitions following the filter command. Each RDF path definition specifies the name of the index field and the values assigned to it, which are identified by following the subsequent (relative) path statements starting from the root resource. The expected data type of the values concludes each statement. For example, the statement item_name = cbc:Name :: xsd:string ; (see Listing 1) will return the result “Super Bathroom” on the RDF data given in Listing 2.

<rdf:Description rdf:about="urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2#INS9709254_ItemType_1">

<rdf:type rdf:resource="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2#ItemType"/>

<ns3:Name>Super Bathroom</ns3:Name>

<ns2:AdditionalItemProperty rdf:resource="urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2#INS9709254_ItemPropertyType_1"/>

</rdf:Description>

<rdf:Description rdf:about="urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2#INS9709254_ItemPropertyType_1">

<rdf:type rdf:resource="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2#ItemPropertyType"/>

<ns3:Name>Wall Tile</ns3:Name>

<ns3:ValueQualifier>Text</ns3:ValueQualifier>

<ns3:Value>Green tiles</ns3:Value>

</rdf:Description>


Listing 2 An example product data in RDF format

Normally, the index fields for a Solr index are defined at the beginning and the documents to be indexed are processed for those fields. However, in NIMBLE, products have distinctive properties to be represented as index fields, which can be retrieved either from product property taxonomies or specified by the users themselves. So, it is not feasible to create an index field for each distinct property beforehand. Therefore, we extended the initial LDPath implementation with new parsing functions to be able to create dynamic index fields, which are supported by Solr. For example, the cac:AdditionalItemProperty[cbc:ValueQualifier is "STRING"] / fn:dynamic(cbc:Name, cbc:Value) :: lmf:dynamic (dynamicField="*_s", solrType="string") command in the example LDPath instance creates a dynamic index field of type string if the value qualifier of an additional item property is STRING. Figure 9 depicts the ingestion flow elaborated in this section.
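The effect of the filter, the item_name path and the dynamic fields can be imitated in a few lines of Python. This sketch does not use LDPath or Marmotta; it parses RDF/XML shaped like Listing 2 with the standard library. Several details are assumptions: the `rdf:RDF` wrapper and the namespace bindings for `ns2` and `ns3` are added to make the sample parseable, the resource identifiers are shortened, the value qualifier is set to STRING so that a dynamic field is produced, and the way the field name is composed from the property name and the suffix is a guess at what the `*_s` pattern implies.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CAC = "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2#"
CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2#"
CAT = "urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2#"

# Suffixes of the dynamic fields declared in Listing 1.
SUFFIX_BY_QUALIFIER = {"REAL_MEASURE": "_d", "STRING": "_s", "STRING_TRANSLATABLE": "_st"}

SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:ns2="{CAC}" xmlns:ns3="{CBC}">
  <rdf:Description rdf:about="{CAT}ItemType_1">
    <rdf:type rdf:resource="{CAC}ItemType"/>
    <ns3:Name>Super Bathroom</ns3:Name>
    <ns2:AdditionalItemProperty rdf:resource="{CAT}ItemPropertyType_1"/>
  </rdf:Description>
  <rdf:Description rdf:about="{CAT}ItemPropertyType_1">
    <rdf:type rdf:resource="{CAC}ItemPropertyType"/>
    <ns3:Name>Wall Tile</ns3:Name>
    <ns3:ValueQualifier>STRING</ns3:ValueQualifier>
    <ns3:Value>Green tiles</ns3:Value>
  </rdf:Description>
</rdf:RDF>"""


def children(node, namespace, local_name):
    """Direct child elements of node with the given qualified name."""
    return [c for c in node if c.tag == f"{{{namespace}}}{local_name}"]


def first_text(node, namespace, local_name):
    found = children(node, namespace, local_name)
    return found[0].text if found else None


def extract_index_fields(rdf_xml):
    """Build one index document per resource of type cac:ItemType."""
    root = ET.fromstring(rdf_xml)
    by_uri = {d.get(f"{{{RDF}}}about"): d for d in root}
    documents = []
    for uri, description in by_uri.items():
        types = [t.get(f"{{{RDF}}}resource") for t in children(description, RDF, "type")]
        if CAC + "ItemType" not in types:  # corresponds to the @filter command
            continue
        fields = {"id": uri}
        name = first_text(description, CBC, "Name")
        if name is not None:
            fields["item_name"] = name
        # Follow each AdditionalItemProperty link and create a dynamic field.
        for link in children(description, CAC, "AdditionalItemProperty"):
            prop = by_uri.get(link.get(f"{{{RDF}}}resource"))
            if prop is None:
                continue
            suffix = SUFFIX_BY_QUALIFIER.get(first_text(prop, CBC, "ValueQualifier"))
            if suffix is None:
                continue
            value = first_text(prop, CBC, "Value")
            if suffix == "_d":
                value = float(value)
            fields[first_text(prop, CBC, "Name") + suffix] = value
        documents.append(fields)
    return documents


documents = extract_index_fields(SAMPLE)
```

With the sample above, `documents` holds a single entry for the item, carrying the `item_name` field and one string-typed dynamic field derived from the "Wall Tile" property.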

Figure 9 Ingestion flow for product data and metadata

By using Marmotta and Solr together, we are able to store semantic metadata of products in a free-text engine that combines the faceted search paradigm with the semantic search paradigm.

3.2 Ingestion Modalities

As the Catalogue Registry has capabilities similar to those of available eCommerce platforms, we investigated how the existing platforms enable their users to publish product information. Specifically, we investigated both international platforms (Alibaba and Amazon) and a local one (Hepsiburada, a Turkish platform). The common process among all these platforms is to ask users to:

1. Choose a product category first

2. Download a spreadsheet-based template containing product properties for the selected category

3. Fill in the template for distinct products and provide alternative values for particular properties, e.g. different colour alternatives for a tile represented as a single product

4. Upload the template


This approach enables users to publish products in bulk. On the other hand, some of the platforms allow publishing products one by one, but still after asking users to select a category. Users then specify values for the pre-defined properties proposed for the selected category. In either case, users are able to specify custom properties.
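The template-based bulk flow can be sketched with CSV text standing in for the spreadsheet. Everything specific here is an assumption made for illustration: the two-row header layout, the `product_name` column, the `|` separator for alternative values and the sample category are not taken from the NIMBLE template format.

```python
import csv
import io
from typing import Dict, List

ALTERNATIVE_SEPARATOR = "|"  # assumed convention for alternative property values


def build_template(category: str, properties: List[str]) -> str:
    """Generate an empty template for the selected category."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["category", category])
    writer.writerow(["product_name"] + properties)
    return buffer.getvalue()


def parse_template(text: str) -> List[Dict[str, object]]:
    """Read a filled-in template and return one record per product row."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    category = rows[0][1]
    header = rows[1]
    products = []
    for row in rows[2:]:
        if not any(cell.strip() for cell in row):
            continue  # skip empty rows
        cells = dict(zip(header, row))
        properties = {
            name: [v.strip() for v in value.split(ALTERNATIVE_SEPARATOR)]
            for name, value in cells.items()
            if name != "product_name" and value.strip()
        }
        products.append(
            {"category": category, "name": cells["product_name"], "properties": properties}
        )
    return products


template = build_template("Wall tiles", ["colour", "width_cm"])
# The user fills in one product with two colour alternatives and uploads it.
filled = template + "Tile A,Green|Blue,20\r\n"
products = parse_template(filled)
```

Custom properties could be supported in the same layout by letting users append extra columns to the header row.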

From the user experience point of view, new applications/platforms are recommended to offer ways of doing things similar to those of existing applications/platforms with similar aims, because users develop usage patterns on such platforms and would like to reuse the same patterns instead of learning something from scratch. Therefore, we also make use of the aforementioned ingestion modalities.

4 API Design

In this section, we provide details about the current API of the Catalogue Registry. There are two main APIs for this component, namely a REST API and a Java API. The REST API enables the consumption of the Catalogue Registry capabilities by executing HTTP-based queries. The REST API delegates the task to the Java API to realize the job expected from the service. In this deliverable, we report the current status of the API. The latest status will always be available on GitHub (footnote 13).

4.1 REST API

Within the Catalogue Registry itself, there are two sets of REST services for the time being: ProductCategoryController and CatalogueController.

ProductCategoryController enables the retrieval of different kinds of information about the product category taxonomies.

Table 3 ProductCategoryController services

| Service Name | Service Path | Service Description |
|---|---|---|
| getCategoryById | /catalogue/category/{categoryId} (PathParam: categoryId) | Returns a category class with all details |
| getDefaultCatalogue | /catalogue/{partyId}/default (PathParam: partyId) | Returns the default catalogue for the given party |
| getCategoriesByName | /catalogue/category (QueryParam: categoryName) | Returns a list of categories for a given keyword |
| getSubCategories | /catalogue/category/{parentCategoryId}/subcategories (PathParam: parentCategoryId) | Returns the child category classes for the specified parent class |

13 https://github.com/nimble-platform/catalog-service-srdc

Although the current set of services included in ProductCategoryController works only on the built-in eClass-based taxonomy, we will soon add support for multiple taxonomies.

CatalogueController includes services for the management of products through CRUD (Create-Read-Update-Delete) functionalities.

Table 4 CatalogueController services

| Service Name | Service Path | Service Description |
|---|---|---|
| getCatalogueByUUID | /catalogue/{uuid} (PathParam: uuid) | Returns the catalogue identified by the given identifier |
| addCatalogue | /catalogue (RequestBody: catalogueJson) | Stores the given catalogue |
| updateCatalogue | /catalogue (RequestBody: catalogueJson) | Updates a catalogue with the new data |
| deleteCatalogue | /catalogue/{uuid} | Deletes the catalogue with the given identifier |
| downloadTemplate | /catalogue/template (RequestParam: categoryId) | Generates a template for the product category specified by the given identifier |
| uploadTemplate | /catalogue/template/upload (RequestParam: file) | Uploads the template |

For the time being, CatalogueController provides CRUD capabilities at the catalogue level. In the future, we will also include CRUD capabilities at the product level so that users will be able to manage individual products.
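As a rough illustration of how a client could address these services, the following Python sketch builds (but does not send) HTTP requests for a few of the paths listed in Tables 3 and 4. The paths and parameter names come from the tables; the base URL, the HTTP methods (GET, POST, DELETE) and the JSON content type are assumptions, since the tables do not state them.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical deployment address


def category_by_id(category_id: str) -> Request:
    """getCategoryById: the category identifier is a path parameter."""
    return Request(f"{BASE_URL}/catalogue/category/{quote(category_id, safe='')}", method="GET")


def categories_by_name(keyword: str) -> Request:
    """getCategoriesByName: the keyword is passed as the categoryName query parameter."""
    query = urlencode({"categoryName": keyword})
    return Request(f"{BASE_URL}/catalogue/category?{query}", method="GET")


def add_catalogue(catalogue_json: str) -> Request:
    """addCatalogue: the catalogue is sent as the request body (POST is assumed)."""
    return Request(
        f"{BASE_URL}/catalogue",
        data=catalogue_json.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_catalogue(uuid: str) -> Request:
    """deleteCatalogue: the catalogue UUID is a path parameter (DELETE is assumed)."""
    return Request(f"{BASE_URL}/catalogue/{quote(uuid, safe='')}", method="DELETE")


search_request = categories_by_name("wall tile")
create_request = add_catalogue('{"id": "c1"}')
```

Passing one of these objects to `urllib.request.urlopen` would execute the call against a running service.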

In addition to the REST services developed in the scope of the Catalogue Registry implementation, we are also able to make use of the REST services provided by Marmotta and Solr. Marmotta provides CRUD services for RDF-based data management. In addition to CRUD services on Solr-specific documents, Solr also provides advanced search capabilities. These external services, however, are mainly used by the internal NIMBLE implementation.


4.2 Java API

Similar to the REST API, the current Java API has two main modules, namely CatalogueService and ProductCategoryService.

ProductCategoryService includes methods with a one-to-one mapping to the services included in ProductCategoryController. As future work, we will improve the category API for extension of the base taxonomy through crowd-sourcing. In this way, on the one hand, users will be able to recommend new categories and new properties for existing categories. On the other hand, authorized parties will be able to check the requests of users, and once a request is approved, the recommendation will be integrated into the NIMBLE taxonomy permanently.

CatalogueService has more capabilities than its REST counterpart, i.e. CatalogueController, as it includes CRUD methods for both UBL-based catalogues and other custom formats.

Table 5 CatalogueService methods

| Method Name | Method Parameters | Method Description |
|---|---|---|
| addCatalogue | CatalogueType catalogue | Stores the given catalogue in both the default repository and the triple store; extracts the metadata and stores it in the dedicated Solr index |
| addCatalogue | String catalogueXML | Stores the given catalogue in UBL-compliant XML format in both the default repository and the triple store; extracts the metadata and stores it in the dedicated Solr index |
| getCatalogue | String uuid | Returns the catalogue identified by the given identifier |
| getCatalogue | String id, String partyId | Returns the catalogue specified by id for the party specified by partyId |
| updateCatalogue | CatalogueType catalogue | Updates a catalogue with the new data |
| deleteCatalogue | String uuid | Deletes the catalogue with the given identifier |
| addCatalogue | String catalogueXML, Standard standard | Adds the catalogue in XML format compliant with the specified standard into the relevant repository |
| addCatalogue | T catalogue, Standard standard | Adds the catalogue compliant with the given standard into the relevant repository |
| getCatalogue | String uuid, Standard standard | Returns the catalogue compliant with the specified standard and with the given identifier |
| getCatalogue | String id, String partyId, Standard standard | Returns the catalogue compliant with the specified standard and with the given identifier for the specified party |
| deleteCatalogue | String uuid, Standard standard | Deletes the catalogue compliant with the given standard for the given identifier |
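The standard-aware variants of the methods in Table 5 can be pictured as a dispatch to one repository per standard. The Python sketch below only mimics that shape and is not the Java API: the class name, the snake_case method names, the `"UBL"` default, the hypothetical `"CUSTOM"` standard and the dictionary-based catalogues with `uuid`, `id` and `partyId` keys are all assumptions.

```python
from typing import Any, Dict, Optional

DEFAULT_STANDARD = "UBL"


class CatalogueServiceSketch:
    def __init__(self) -> None:
        # One in-memory repository per standard, keyed by catalogue UUID.
        self._repositories: Dict[str, Dict[str, Dict[str, Any]]] = {}

    def add_catalogue(self, catalogue: Dict[str, Any],
                      standard: str = DEFAULT_STANDARD) -> str:
        """Store the catalogue in the repository of its standard."""
        self._repositories.setdefault(standard, {})[catalogue["uuid"]] = catalogue
        return catalogue["uuid"]

    def get_catalogue(self, uuid: str,
                      standard: str = DEFAULT_STANDARD) -> Optional[Dict[str, Any]]:
        """Look up a catalogue by UUID within one standard."""
        return self._repositories.get(standard, {}).get(uuid)

    def get_catalogue_for_party(self, catalogue_id: str, party_id: str,
                                standard: str = DEFAULT_STANDARD) -> Optional[Dict[str, Any]]:
        """Look up a catalogue by its own id and the owning party."""
        for catalogue in self._repositories.get(standard, {}).values():
            if catalogue["id"] == catalogue_id and catalogue["partyId"] == party_id:
                return catalogue
        return None

    def delete_catalogue(self, uuid: str, standard: str = DEFAULT_STANDARD) -> bool:
        """Delete a catalogue; returns False if it was not stored."""
        return self._repositories.get(standard, {}).pop(uuid, None) is not None


service = CatalogueServiceSketch()
service.add_catalogue({"uuid": "u-1", "id": "default", "partyId": "party-1"})
service.add_catalogue({"uuid": "u-2", "id": "default", "partyId": "party-2"}, standard="CUSTOM")
```

Omitting the standard corresponds to the UBL-based methods in the upper part of the table.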

5 User Interfaces

5.1 Mock-ups

For the user interfaces, we followed an iterative approach by first presenting mock-ups to end users. We prepared interactive mock-ups with the Invision tool (footnote 14). In addition to interacting with the mock-ups, users were also able to comment on specific parts of them to give feedback. Annex C includes the mock-ups for the Catalogue Registry. The complete mock-up package is available online (footnote 15).

5.2 User Interfaces

After the initial mock-up phase, we developed the initial UI of the Catalogue Registry. For the time being, we developed views for product publishing. As future work, we will also develop views for the management of products after the initial publishing. Annex D shows mock-ups for category search and product publishing.

6 Future Work

As future work, we will continue elaborating the API in the scope of T2.3 and work on the implementation of the API as stated in Section 4 and of the user interfaces in Section 5. We will adopt Swagger (footnote 16) for improved documentation and testing of the REST API. A central management of Feature Extractors and Storage Configurations is still needed. Figure 10 shows the implementation status of the various modules: green arrows indicate that the connection between the components is already established, e.g. eClass is wrapped with the taxonomy model and integrated into the UBL product representations, whereas a wrapper is still needed for the furniture ontology.

14 https://www.invisionapp.com/

15 https://invis.io/7W9UWPTV4

Figure 10 Implementation status of feature extractors and database setups

We will also work on federation-related improvements of the Catalogue Registry. We will identify the federation-related requirements and the corresponding system design. We will decide how a federation will be formed, in which ways federation members will be aware of each other, and how the federation members will be synchronized, if at all.

Lastly, we will improve the Catalogue Registry with the security enablers to be developed in WP6. Enablers such as REST service security, resource access control mechanisms and automatic authentication will be integrated.

7 Summary

In this deliverable, we report on the design decisions about the catalogue ingestion mechanism in NIMBLE. We elaborate on the mechanism for annotating the products with semantically relevant metadata. In this respect, we present product category taxonomies as collections of product categories associated with a set of category-specific properties. After presenting the data representation, we show how we utilize these taxonomies and link them with the catalogue data model.

16 http://swagger.io/

From an architectural point of view, we make a distinction between data and metadata, and provide details about the ways and means to deal with those two kinds of information. Lastly, we provide details about the latest version of the Catalogue Registry API along with some screenshots of the user interfaces.

8 References

[1] Stolz, A., Rodriguez-Castro, B., Radinger, A., & Hepp, M. (2014, May). PCS2OWL: A generic approach for deriving web ontologies from product classification systems. In European Semantic Web Conference (pp. 644-658). Springer International Publishing.

[2] Smith, B., Almeida, M., et al. (2015). Basic Formal Ontology 2.0 SPECIFICATION AND USER’S GUIDE. http://purl.obolibrary.org/obo/bfo/Reference. Last visited 2017-06-30.

[3] OASIS ebXML RegRep Version 4.0 Part 0: Overview Document. 25 January 2012. OASIS Standard. Retrieved from http://docs.oasis-open.org/regrep/regrep-core/v4.0/os/regrep-core-overview-v4.0-os.html, 26 June 2017.

[4] Klyne, Graham, Jeremy J. Carroll, and Brian McBride. "RDF 1.1 Concepts and Abstract Syntax." RDF 1.1 Concepts and Abstract Syntax. 25 Feb. 2014. Retrieved from https://www.w3.org/TR/rdf11-concepts/, 28 June 2017.

9 Appendices

9.1 Annex A - Excerpt of Entities Involved in the MICUNA Use Case

The table below includes some relevant terms used in the scenarios of the MICUNA use case. Many of these terms will be common to all use cases. They have been drawn from domains that can be differentiated: (i) products, (ii) components/materials, (iii) production operations/phases and (iv) trade concepts.

X: means the concept is registered in that taxonomy with the same meaning.

(X): means the concept is in the taxonomy, but there are only specific instances related to the term instead of a general class representing it, or the meaning is not exactly the same as the one in the MICUNA domain.

-: means the concept is missing or it has a different meaning than the one in the MICUNA domain.

It should be noted that the taxonomies considered differ considerably in their number of levels; here, the only aspect assessed is whether each term of an example collection is present or not.

| Term | eClassOWL 5.1.4 | CPV 2008 | ETIM 6.0 | Google Prod Tax |
|---|---|---|---|---|
| adhesive | X | X | X | X |
| assembly | X | X | X | - |
| balance | X | X | X | (X) |
| board | (X) | X | - | (X) |
| business | X | - | - | - |
| carrier | (X) | - | (X) | - |
| catalogue | X | (X) | X | - |
| certification | X | (X) | - | - |
| composition | X | - | - | - |
| cradle | (X) | X | - | X |
| crystal | X | X | X | X |
| delay | X | - | (X) | - |
| delivery | X | (X) | X | - |
| din | X | - | - | - |
| discount | - | - | - | - |
| donation | (X) | - | - | - |
| environment | (X) | (X) | - | - |
| finishing | X | X | - | X |
| fitting | X | - | (X) | (X) |
| flexible | X | - | X | - |
| guarantee | X | - | - | - |
| handcrafted | - | (X) | - | (X) |
| installation | X | X | X | - |
| insurance | X | (X) | - | X |
| iso | (X) | - | (X) | - |
| labelling | (X) | (X) | (X) | (X) |
| law | X | - | - | - |
| machining | X | (X) | - | - |
| negotiation | X | - | - | - |
| norm | X | X | - | - |
| order | X | - | - | - |
| outsourcing | (X) | - | - | - |
| packaging | X | (X) | X | - |
| paint | X | X | (X) | X |
| payment | X | - | - | - |
| piece | X | (X) | (X) | (X) |
| plastic | X | X | X | - |
| polishing | X | - | (X) | - |
| price | X | - | X | - |
| quality | X | X | (X) | - |
| quantity | X | X | X | - |
| regulation | X | - | - | - |
| renovation | X | X | - | - |
| repayment | - | - | - | - |
| safety | (X) | (X) | X | (X) |
| sanding | X | (X) | - | (X) |
| sensor | X | X | X | X |
| shaping | - | - | - | |
| stock | X | - | (X) | - |
| storage | X | (X) | X | (X) |
| supplier | X | (X) | - | - |
| technology | X | X | (X) | - |
| textile | X | X | X | (X) |
| tool | X | X | X | (X) |
| transport | X | X | (X) | - |
| unit | X | - | X | - |
| varnishing | X | - | - | X |
| warehouse | X | (X) | (X) | (X) |
| wheel | X | X | X | X |
| wood | X | X | (X) | X |
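The legend above lends itself to a simple per-taxonomy tally. The following is a minimal Python sketch, not part of the NIMBLE code base, that counts exact (X), partial ((X)) and missing (-) matches for a handful of rows copied from the table. The names `TAXONOMIES`, `ROWS` and `coverage` are illustrative assumptions, and only a subset of the terms is included.

```python
from collections import Counter

# Column order as in the table above.
TAXONOMIES = ["eClassOWL 5.1.4", "CPV 2008", "ETIM 6.0", "Google Prod Tax"]

# A few rows copied from the table; the remaining terms follow the same shape.
ROWS = {
    "adhesive": ["X", "X", "X", "X"],
    "board": ["(X)", "X", "-", "(X)"],
    "business": ["X", "-", "-", "-"],
    "cradle": ["(X)", "X", "-", "X"],
    "discount": ["-", "-", "-", "-"],
}


def coverage(rows):
    """Count how often each mark occurs per taxonomy."""
    result = {name: Counter() for name in TAXONOMIES}
    for marks in rows.values():
        for name, mark in zip(TAXONOMIES, marks):
            result[name][mark] += 1
    return result


summary = coverage(ROWS)
for name in TAXONOMIES:
    counts = summary[name]
    print(f"{name}: exact={counts['X']} partial={counts['(X)']} missing={counts['-']}")
```

Extending `ROWS` to the full table would give the overall coverage of each taxonomy for the MICUNA terms; as noted above, such a tally says nothing about the depth of the taxonomies.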

9.2 Annex B – Initial Alignment of the Data Elements of the Micuna Use Case with the UBL Elements

PRODUCT SUPPLIER DATA SHEET

GENERAL INFORMATION

| Data element | UBL element |
|---|---|
| Name | cac:Party/cac:PartyName/cbc:Name |
| Code | cac:Party/cbc:IndustryClassificationCode |
| NIF/VAT | cac:Party/cac:PartyTaxScheme/cac:TaxScheme |
| Address | cac:Party/cac:PostalAddress |
| Zip code | cac:Party/cac:PostalAddress/cbc:PostalZone |
| Town | cac:Party/cac:PostalAddress/cbc:CityName |
| Region | cac:Party/cac:PostalAddress/cbc:Region |
| Contact person | cac:Party/cac:Person |
| Position of contact person | cac:Party/cac:Person/cbc:JobTitle |
| Email | cac:Party/cac:Contact/cbc:ElectronicMail |
| Telephone | cac:Party/cac:Contact/cbc:Telephone |
| Fax | cac:Party/cac:Contact/cbc:Telefax |
| Classification | cac:Party/cbc:IndustryClassificationCode |
| Supplier facilities | cac:Party/cac:PhysicalLocation |
| ISO compliance | cac:Party/cac:PartyLegalEntity |
| OHSAS certification | cac:Party/cac:PartyLegalEntity |
| EFQM implementation | cac:Party/cac:PartyLegalEntity |
| Awards received | TBD |
| Penalty clauses for late delivery | cac:DeliveryTerms/cbc:Amount |
| Quality indicators | TBD |
| Technical advice | cac:ItemSpecificationDocumentReference |

TRADE CONDITIONS

| Data element | UBL element |
|---|---|
| Method of payment | cac:PaymentMeans |
| Payment conditions | cac:PaymentTerms |
| Payment days | cac:PaymentTerms/cac:SettlementPeriod/cbc:DurationMeasure |
| Bank entity | cac:FinancialAccount/cac:FinancialInstitutionBranch/cac:FinancialInstitution |
| Bank account number | cac:FinancialAccount |
| Discounts | cac:AllowanceCharge/cbc:Amount |
| Trade discounts | cac:AllowanceCharge/cbc:AllowanceChargeReasonCode |
| Rappel (variable) | cac:AllowanceCharge/cbc:AllowanceChargeReasonCode |
| Cash discount | cac:AllowanceCharge/cbc:AllowanceChargeReasonCode |

Delivery conditions

| Data element | UBL element |
|---|---|
| General comments | cac:DeliveryTerms/cbc:SpecialTerms |
| Delivery period | cac:Delivery/cac:PromisedDeliveryPeriod |
| Freight type | cac:Shipment/cac:ShipmentStage/cbc:TransportModeCode |
| Insurance | cac:Delivery/cbc:InsuranceValueAmount |
| Services | cac:Shipment/cbc:HandlingInstructions or cbc:DeliveryInstructions |
| Other fees | cac:Shipment/cac:DeclaredForCarriageValueAmount |
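The alignment above can be read as a lookup from a data sheet field to a UBL element path. The following is a minimal Python sketch of that idea, not the NIMBLE implementation: it takes a few of the paths listed above and materializes them as nested XML elements. The `Supplier` wrapper element, the function names and the sample values are hypothetical, and the sketch does not enforce the element order required by the UBL schemas.

```python
import xml.etree.ElementTree as ET

# Standard UBL 2.x common component namespaces.
NS = {
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
}

# Subset of the alignment, copied from the product supplier data sheet.
MAPPING = {
    "Name": "cac:Party/cac:PartyName/cbc:Name",
    "Zip code": "cac:Party/cac:PostalAddress/cbc:PostalZone",
    "Town": "cac:Party/cac:PostalAddress/cbc:CityName",
    "Email": "cac:Party/cac:Contact/cbc:ElectronicMail",
}


def qname(step):
    """Turn a prefixed step such as 'cbc:Name' into '{namespace}Name'."""
    prefix, local = step.split(":")
    return f"{{{NS[prefix]}}}{local}"


def set_path(root, path, value):
    """Create the elements along the path (reusing existing ones) and set the leaf text."""
    node = root
    for step in path.split("/"):
        tag = qname(step)
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        node = child
    node.text = value
    return node


def build(record):
    """Build an XML fragment for a data sheet record (field name -> value)."""
    root = ET.Element("Supplier")  # hypothetical wrapper, not a UBL element
    for field, value in record.items():
        set_path(root, MAPPING[field], value)
    return root


# Hypothetical sample values, for illustration only.
record = {
    "Name": "Example Supplier",
    "Zip code": "00000",
    "Town": "Example Town",
    "Email": "contact@example.org",
}

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)
root = build(record)
print(ET.tostring(root, encoding="unicode"))
```

Because `set_path` reuses elements that already exist, the zip code and the town end up under the same `cac:PostalAddress`, which mirrors how the paths in the table share prefixes.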

LOGISTICS SUPPLIER DATA SHEET

GENERAL INFORMATION

| Data element | UBL element |
|---|---|
| Name | Same as above |
| Code | Same as above |
| NIF/VAT | Same as above |
| Address | Same as above |
| Zip code | Same as above |
| Town | Same as above |
| Region | Same as above |
| Contact person | Same as above |
| Position of contact person | Same as above |
| Email | Same as above |
| Telephone | Same as above |
| Fax | Same as above |
| Classification | Same as above |
| Supplier facilities | Same as above |
| ISO compliance | Same as above |
| Safety stock | cac:ItemManagementProfile/cbc:MinimumInventoryQuantity |
| Penalty clauses for late delivery | Same as above |
| Quality indicators | Same as above |
| Technical advice | Same as above |

TRADE CONDITIONS

| Data element | UBL element |
|---|---|
| Method of payment | Same as above |
| Payment conditions | Same as above |
| Payment days | Same as above |
| Bank entity | Same as above |
| Bank account number | Same as above |
| Discounts | Same as above |
| Trade discounts | Same as above |
| Rappel | Same as above |
| Cash discount | Same as above |

Service conditions

| Data element | UBL element |
|---|---|
| General comments | Same as above |
| Cargo days | cac:Delivery/cac:PromisedDeliveryPeriod/cbc:DurationMeasure |
| Freight type | Same as above |
| Insurance | Same as above |
| Services | Same as above |
| Other fees | Same as above |
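Most logistics supplier fields reuse the product supplier alignment ("Same as above") and only a few carry their own mapping. One way to express this fallback, offered here as a sketch under assumptions rather than as the project's implementation, is a layered lookup with Python's `collections.ChainMap`. The dictionary and function names are hypothetical, only a subset of fields is shown, and `None` is used to stand for entries marked TBD.

```python
from collections import ChainMap

# Subset of the product supplier alignment; None stands for "TBD".
PRODUCT_SUPPLIER = {
    "Name": "cac:Party/cac:PartyName/cbc:Name",
    "Email": "cac:Party/cac:Contact/cbc:ElectronicMail",
    "Method of payment": "cac:PaymentMeans",
    "Quality indicators": None,
}

# Fields for which the logistics supplier data sheet has its own mapping.
LOGISTICS_OWN = {
    "Safety stock": "cac:ItemManagementProfile/cbc:MinimumInventoryQuantity",
    "Cargo days": "cac:Delivery/cac:PromisedDeliveryPeriod/cbc:DurationMeasure",
}

# Lookups try the logistics-specific entries first, then fall back
# to the product supplier alignment ("Same as above").
LOGISTICS_SUPPLIER = ChainMap(LOGISTICS_OWN, PRODUCT_SUPPLIER)


def ubl_path(field):
    """Return the UBL path for a logistics field, or None if it is still TBD."""
    return LOGISTICS_SUPPLIER[field]


def unmapped(mapping):
    """List the fields whose alignment is still open."""
    return sorted(field for field, path in mapping.items() if path is None)


print(ubl_path("Name"))
print(ubl_path("Cargo days"))
print(unmapped(LOGISTICS_SUPPLIER))
```

Keeping the shared entries in one place means that a later decision on a TBD field, such as the quality indicators, only has to be recorded once.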

SERVICE SPECIFICATIONS

| Data element | UBL element |
|---|---|
| Delivery method | cac:Shipment/cac:ShipmentStage/cbc:TransportModeCode |
| Destination | cac:DeliveryCustomerParty |
| Insurance | cac:Shipment/cbc:InsuranceValueAmount |
| Rate | cac:Shipment/cac:DeclaredForCarriageValueAmount |
| Quality indicator | TBD |
| Certifications | cac:Item/cac:Certificate |
| General features | cac:Shipment/cbc:Information |

PRODUCT SPECIFICATIONS

| Data element | UBL element |
|---|---|
| Name | cac:Item/cbc:Name |
| Code | cac:Item/cac:CommodityClassification |
| Price | cac:Price/cbc:PriceAmount |
| Product features | cac:Item/cac:AdditionalItemProperty |
| Minimum order quantity | cac:CatalogueLine/cbc:MinimumOrderQuantity |
| Materials | cac:Item/cac:ItemSpecificationDocumentReference |
| Certifications | cac:Item/cac:Certificate |
| Technical safety data sheet | cac:Item/cac:ItemSpecificationDocumentReference |
| Tolerance / Allowed measurement deviations | cac:Item/cac:Dimension/cbc:MinimumMeasure, cbc:MaximumMeasure |
| Packaging type | cac:GoodsItem/cac:ContainingPackage/cbc:PackagingTypeCode |
| Product image | cac:Item/cac:ItemSpecificationDocumentReference |
| Usual delivery period | cac:Delivery/cac:EstimatedDeliveryPeriod |
| Delivery period in non-conformity claims | TBD |
| Graduated prices by quantity | TBD |
| First product sample free of charge | cac:InvoiceLine/cbc:FreeOfChargeIndicator |
| Quantity per package | cac:GoodsItem/cac:ContainingPackage/cbc:Quantity |
| Technical product data sheet | cac:Item/cac:ItemSpecificationDocumentReference |
| Product guarantee | cac:CatalogueLine/cbc:WarrantyInformation |
| Safety stock | cac:ItemManagementProfile/cbc:MinimumInventoryQuantity |

9.3 Annex C – Mock-ups for Catalogue Registry

Figure 11 Search for product categories with a keyword

Figure 12 Categories listed for the provided keyword

Figure 13 Category selection by traversing the taxonomy hierarchy

Figure 14 Publishing a single product via the Catalogue Registry UI

Figure 15 Downloading and uploading category templates

9.4 Annex D – Snapshots from the Catalogue Registry UI

Figure 16 Keyword-based category search

Figure 17 Category search results

Figure 18 Product properties recommended for the selected category

Figure 19 Downloading and uploading category templates