essnet big data ii · 2020-06-02 · erable, essnet big data pilot 2 (ref to the quality...

47
ESSnet Big Data II Grant Agreement Number: 847375-2018-NL-BIGDATA https://webgate.ec.europa.eu/fpfis/mwikis/essnetbigdata https://ec.europa.eu/eurostat/cros/content/essnetbigdata_en Workpackage I Mobile Network Data Deliverable I.5 (Methodology) First proposed standards and metadata for the production of official statistics with mobile network data Draft version, 28 May, 2020 Workpackage Leader: David Salgado (INE, Spain) [email protected] telephone : +34 91 5813151 mobile phone : N/A Prepared by: Roberta Radini (ISTAT, Italy) - Tiziana Tuoto (ISTAT, Italy) - Sandra Hadam (Destatis, Germany) - Fabrizio de Fausti(ISTAT, Italy) - Sandra Barragán (INE, Spain) - Luca Valentino (ISTAT, Italy) - David Salgado (INE, Spain) - Raffaella M. Aracri (ISTAT, Italy)

Upload: others

Post on 07-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

ESSnet B ig Data I I

G r a n t A g r e e m e n t N u m b e r : 8 4 7 37 5 - 2 0 1 8 - N L - B IG D A T A h t t p s : / / w e b g a t e . e c . e u r o p a . e u / f p f i s / m w i k i s / e s s n e t b i g d a t a

h t t p s : / / e c . e u r o p a . e u / e u r o s t a t / c r o s / c o n t e n t / e s s n e t b i g d a t a _ e n

Wor kp ack age I

Mob i le Ne twor k Da ta

D e l ive rab le I . 5 (Me tho do logy)

F irs t prop osed st and ards a nd me ta da ta f or the pr odu ct ion of o ff i c ia l s ta t is t ics w i th mob i le n etw ork

d at a Draft version, 28 May, 2020

Workpackage Leader:

David Salgado (INE, Spain) [email protected]

telephone : +34 91 5813151 mobile phone : N/A

Prepared by:

Roberta Radini (ISTAT, Italy)

- Tiziana Tuoto (ISTAT, Italy) - Sandra Hadam (Destatis, Germany) - Fabrizio de Fausti(ISTAT, Italy) - Sandra Barragán (INE, Spain) - Luca Valentino (ISTAT, Italy) - David Salgado (INE, Spain) - Raffaella M. Aracri (ISTAT, Italy)

Page 2: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

Contents

1 General Introduction and Motivation 1

2 The modular structure of the Process 5

3 The modular structure of the Metadata and Standard 7

4 The Source Metadata 13

5 The Intermediate Metadata 17

6 The output metadata 19

Bibliography 21

GLOSSARY 23

II

Page 3: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

1

General Introduction and Motivation

The availability of a rigorous, understandable and transparent description of theinformation contained in the Mobile Network data (MND) as well as their structures andthe relationship with the statistical concepts is a crucial point for a full understandingof the statistical analyses and experiments based on MND, particularly for official sta-tistical purposes. As well-known, MND is not designed to produce outputs for officialstatistics; Mobile Network Operators (MNOs) collect and often store MND to trackand monitor the operation of the network, to guarantee device connectivity (like thesignaling data), to manage the billing of services provided to customers (personal dataof the customers as well as data related to the contract, CDRs, DDRs). In addition, theMobile Network systems that generate these data, being proprietary, have often specificcharacteristics that might introduce differences among the physical data collected bydifferent MNOs. Other MND specificities are related to the several technologies thathave been introduced in telco environment. All these factors highlight the need to sharean accurate description of metadata between NSIs and MNOs from a conceptual ratherthan a logical/physical point of view.

Furthermore, even when involving big data sources, like it is the case with MND,the proposed standard statistical production follows the principles of the ESS ReferenceMethodological Framework (ESS RMF) for MND, comprising:

1. an input phase, which mainly handles data from MNOs. This is also called datalayer;

2. a throughput phase where data are processed, transformed and elaborated. This isalso called convergence layer;

3. an output phase, with the production of statistical outputs. This is also calledstatistics layer.

In accordance, it is also useful to distinguish three different types of data and corre-sponding metadata, as follows:

1

Page 4: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

1 General Introduction and Motivation

1. raw source data, as they appear in the big data source, in this case the MNOdatabases;

2. Intermediate statistical data, i.e. the transformed raw data that make statisticalprocessing possible,

3. usual statistical outputs.

It is worthwhile noting that in the Quality Guidelines provided by the WPK Deliv-erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phasehas been split into two phases, i.e. sub-phase 1, Deriving Statistical Data from RawData of a Big Data Source, and sub-phase 2, Usage of the Derived Statistical Data forthe Production of Statistical Output. This subdivision is particularly useful with MND,given that there are some specific operation/transformation/pre-elaboration that areneeded to derive statistical data from MND raw data.

In this deliverable we will focus on the conceptual description of the three types ofdata and in particular the metadata that describe them. Input data are described mainlyby providing a glossary for the source data, section GLOSSARY. The glossary uses adescriptive perspective, we avoid to be too technical and we privilege a level of detailsthat allow the non-telco-expert readers to understand the informative content of thedata source. However, quite often the terminology and acronyms are those used by thestandard technical language 3GPP and other specialised technical literature, so to createa bridge between the two words, official statisticians and telco experts. This glossary isdesigned to be useful to define in detail the information requirements of the MND to beused to produce a statistical product and to define a common language for statisticiansand telco experts that does not lead to misunderstandings.

The description of the data and metadata related to the first step of the throughputphase is provided in section 3 of this deliverable. Details are provided to clarify whythis first step deserve a specific attention, being characterised by a set of operations thatare in common to almost all the statistical output that can be derived by MND. For thedescription of the other two types of metadata, the former related to the sub-phase 2of the throughput phase and the latter related to the output phase, we provide someexamples in sections 4 and 5. They are most closely related to the production processof the specific usages, so we can’t assume to list a complete set of metadata. We limitourselves to show examples from the use case assumed by this WPI, as purpose ofillustration, mainly because the definition in terms of data and metadata of the outputphase should help in understanding the acceptable requirements for data and metadatafrom the input phase and going on.

In this document, we adopt the perspective of the GSIM Referential Metadata Objects(REF), where possible. GSIM (Generic Statistical Information Model) is an internationallyrecognized reference framework for modelling statistical information; in particular, itallows the definition, management and use of data and metadata throughout a statistical

2

Page 5: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

1 General Introduction and Motivation

production process. This framework, in recent years, has been increasingly adopted bynational statistical offices as a conceptual reference model.

3

Page 6: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving
Page 7: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

2

The modular structure of the Process

This document does not report the flow of processes and data, nor the detail ofprocesses and sub-processes, but the identification of the conceptual model of the dataand a first typing of the information content, which we divide into three classes, asspecified before: the source raw data, the throughput data, the output data. Recently,some studies have been underway to verify the applicability of the well-known standardprocess model for statistical production, the GSBPM, into the analysis and productionprocesses of the big data (Kuonen, Ricciato). To this regard, the project ESSNET BIGDATA Pilot II, Implementation component, has dedicated the entire WPF (Process andArchitecture) to the definition of a new model able to describe the production processwith big data sources. Deliverable WPI.6 on Quality will analyze and align the proposalof the model described by WPF for the usage of MFD.

Independently of the process model that we apply, the crucial step for any kindof analysis and production is related to the analysis of the information sources: DataUnderstanding. This is fundamental both in a cognitive approach to a new source, i.e. ina top-down approach where starting from a cognitive need we try to verify how a newsource can meet these needs; and in a similar way it works in an exploratory/miningapproach, i.e. in a bottom-up approach, where we need information on the new sourceto understand what kind of knowledge is there. In both approaches, and in a mixedapproach as well, it is necessary to know/understand the data and therefore it is ex-tremely important to document them through the source metadata. These should beused for the selection of the information required to satisfy the knowledge needs, whichcan be also defined after an analytical and testing phase. In a privacy-by-design ap-proach, only the information strictly required for the project analysis can be selected,therefore the source metadata should be used to determine the transformation of theraw source data into the data prepared for further processing. At this stage, informationgeneralisation methods can be applied as one of the techniques to minimise privacy risks.

Figure 2.1 shows a light schema of macro processes and metadata typification whenusing MND. The access of data from MNO systems should be accompanied by the sourcemetadata, for which a basic Glossary of Terms is provided in section GLOSSARY. The

5

Page 8: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

2 The modular structure of the Process

raw data selection and archiving phase is followed by a first phase of data preparationthat uses source metadata in input to generate intermediate metadata; they are betweenthe initial raw source metadata and the final production metadata. At this point, asexplained in the Introduction, we apply the logic proposed in the WPK of split thethroughput phase in two. In the first phase of the elaboration, these metadata have thecharacteristic of being a hybrid that brings the raw data closer to statistical information.This transformation can be managed with multiple processes that should be controlledand defined by the NSIs, but can be implemented/executed by MNOs on their ownpremises on trusted analytical systems. Input privacy processes should be introduced atthis stage.

This intermediate data and metadata represents the biggest challenge, since it hasto be defined jointly by MNOs and NSIs and may involve a classification of concepts,i.e. some general concepts, such as location, might be common to almost all analyses,others specific concepts can be related to the specific use cases. The general conceptswill be discussed in section 2 and some examples from the specific concepts related tothe methodologies developed in deliverable WPI.3 “A proposed production frameworkwith mobile network data” will be introduced as well.

Finally, production metadata are the traditional metadata of statistical production,which might be also related to new products if the analyses are devoted to new phe-nomena, not investigated before. In addition, they might represent a deepening or agreater detail of investigation fields already explored in the statistical production of NSIs.

Figure 2.1: Schema of macro processes and typification of metadata

6

Page 9: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

3

The modular structure of the Metadata and Standard

The general representation of the analysis and production process of statisticaloutputs involves the subdivision of data and metadata into: source metadata (Input),metadata (Throughput), and statistical production (Output). These are divided intoMicro metadata, if referring to the variables of the individual records, and Macro, ifreferring to the analysis dimensions of the aggregated data.

The formalization of the metadata follows the GSIM standard, in particular theconcept of Information Resource [8] is used for modeling the information on the datasource, and consists of: the name of the source, the description, the owner (i.e. the MNOfor raw source data) and location (optional). The scheme of the information resource isshown in the following table 3.1:

Table 3.1: Schema of the Information Resource for data sources

Name A human-readable identifier for the objectDescription A human-readable description of the objectOwner Identification of the person, institution or group which owns the infor-

mation resourceLocation A description of the location where the data resource can be found, it

could be a physical address or a logical address (like an URI)

In the case that the data supplies come from multiple providers (MNO) or the processinvolves the integration of different sources, not just phone data, a table informationresource should be defined for each source.

Moreover, the representation of the metadata for each source describes:

Data Resource, i.e. collections of data that are used by a statistical activity to produceinformation [9];

Referential Metadata Resource, i.e. collections of structured information that may beused by a statistical activity to produce information. We formalize the referential

7

Page 10: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

3 The modular structure of the Metadata and Standard

metadata resource for MND by means of the glossary of terms, reported in sectionGLOSSARY.

Figure 3.1: GSIM - Information Resource Concept

In this document, we do not address all metadata modeling according to the GSIMmodel, we rather focalize on some aspects, highlighted with circles in figure 3.2, whichwe believe are relevant for modeling MND.

The modeling is carried out on two different layers: a conceptual one aimed at model-ing the information contained in the data and a logical-physical one referred to the dataand its structures.

If the logical-physical modelling is purely that of the data and it is shared and de-fined by the MNO, the conceptual modelling is an operation carried out by the NSIthat defines the units and the populations of interest. Therefore, the modeling refers tothe GSIM concepts of Data Structure and DataSet for logical-physical layer, and to theconcepts of Unit Type, Unit, Population for conceptual layer.

In our view, these elements of the GSIM model are sufficient for data and informationmodeling.

8

Page 11: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

3 The modular structure of the Metadata and Standard

Figure 3.2: Abstract of GSIM Concept

The logical-physical level of the data is described by:

The Data Structure represents the data structure description, it is composed of:Identifiers, Measures and Attributes and can be defined for both Micro and Macrodata.

The DataSet is a collection of Data Resource and is structured by a Data Structure.

The conceptual layer of the data is described by:

The identification of the analysis units (Unit type) that are represented in theDataSet. A Unit Type is used to describe a class or group of Units based on a singlecharacteristic, but with no specification of time and geography these contribute tothe definition of the population of reference. It is the statistical unit.

The individual units that can be extracted from the MNO data supplies, referringto a specific period and an area, represent the Units. For example, the calling SIMsin the case of a supply / extraction of CDR.

The population is made up of a set of units that have homogeneous characteristics.For example, a population is made up of SIM subscribers.

GSIM concepts can also be represented through ontology. Below is an excerpt fromthe ontological modeling of GSIM concepts in Graphol [11].

9

Page 12: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

3 The modular structure of the Metadata and Standard

Figure 3.3: The ontology of some GSIM Concepts

In this approach, we follow the idea of using semantics for making data integration,preparation, and governance more powerful. As illustrated in Lenzerini (2011), usingsemantics means conceiving information systems where the semantics of data is explic-itly specified and is taken into account for devising all the functionalities of the system.Over the past two decades, this idea has become increasingly crucial for a wide varietyof information-processing applications and has received much attention in the ArtificialIntelligence, Database, Web, and Data Mining communities (Noy, Doan and Halevy2005). In particular, we concentrate on a specific paradigm, called Ontology-Based DataManagement (OBDM), introduced about a decade ago as a new way for modeling andinteracting with a collection of data sources (Calvanese et al. 2007; Poggi et al. 2008; Lenz-erini 2011). According to such paradigm, the client of the information system is freedfrom being aware of how data are structured in concrete resources (databases, softwareprograms, services, etc.), and interacts with the system by expressing her queries andgoals in terms of a conceptual representation of the domain of interest, called ontology.More precisely, an OBDM system is an information management system maintainedand used by a given organization (or, a community of users), whose architecture has thesame structure of a typical data integration system, with the following components: anontology, a set of data sources, and the mapping between the two.

10

Page 13: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

3 The modular structure of the Metadata and Standard

The ontology is a conceptual and formal description of the domain of interest of theorganization, expressed in terms of concepts, concept attributes, and relationships be-tween concepts and logical assertions that formally describe the domain knowledge. Thedata sources are the repositories accessible by the organization in which the domain dataare stored. Mapping is a precise and formal definition of the correspondence betweenthe data contained in the data sources and the elements of ontology. Where elementmeans any concept, attribute or relation. These three levels constitute a sophisticatedknowledge representation system that can be managed and reasoned with the help ofautomated reasoning techniques. Furthermore, the OBDM supports the checking andmonitoring of the consistency, accuracy and completeness of data sources.

11

Page 14: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving
Page 15: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

4

The Source Metadata

We can distinguish the raw source metadata into three types of Data Resources:phone data, network data, and business data. The phone data are generated by themobile devices directly due to their activities , e.g. calling, receiving calling, sendingand receiving text messages, connecting to the internet, as well as indirectly, due to thesimple connection to the telco network, even when the mobile devices are inactive, e.g.signaling data.

The network data allow the MNOs to operate the telco networks, they are related tothe characteristics of the telco network, these are the most technical information referringto the kind of technologies, the technicalities of the antenna and network, most of themare familiar for telco experts and are reported in the glossary of terms, due to the factthat a proper understanding of these data is fundamental for using the phone data inthe best ways for statistical purposes.

Finally, the business data are related to the business of the MNOs, they representthe number of devices to which the phone data refer, they include info on customercontracts, MNO’s market share, and penetration rate at different territorial levels. TheInformation Resource describes the data source used in the analyzes of interest. Inour case, the source data are Phone Data, CDR, DDR and/or signaling data, as wellas Network Data and Business Data. Moreover, the analyses might involve other datasources, i.e. auxiliary data sources in this case where the focus is on MPD, those otherdata can be already in the NSI’s data repository or they can be provided by other dataowners, e.g. resident population counts at a certain domain, the Land Use, orographyfor a given territory.

The source data are described with:

the Glossary, according to the schematic of the Referential Metadata Resource:Acronym, Lemma and Description;

the Data Resource, classified in 3 types of data: ”Phone Data”, ”Network Data”and ”Business Data”. Each source data has its data structure.

13

Page 16: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

4 The Source Metadata

The Data Structure represents the logical level of the data and it is made up of: Identifiers,Measures and Attributes. A DataSet is a collection of data that corresponds to a DataStructure.

For a correct use of the data source it is necessary to define the dataset according tothe components of the Data Structure, and also to associate the description of the contentto each variable through the glossary. Once the description of the Data Set has beencompleted, the information model is defined, and the second step is defined the UnitType and Population of interest.

For example, table 4.1 reports some information from a CDR data set related to thedataset descriptors the data structure and the referential metadata.

Table 4.1: Example of data set descriptors, data structure and referential metadata forCDRs

Date set descriptorsDataStructure

ReferentialMetadataResouce

caller’s phone identifier (IMEI) (transformed into ID SIM);receiving phone identifier (IMEI) (transformed into ID SIM);cell locked by the caller (Cell ID) at the start of the call;cell locked by the caller (Cell ID) at the end of the call

Identifiercomponent

IMEIID SIMCell ID

Call start date;Call start time;Call End date;Call End time

Attributecomponent

Call durationMeasureCompo-nent

It is worthwhile noting that the SIM ID is an identifier in the Phone data, while thecell ID is an identifier in the Network data and allow us to connect phone activities andthe telco network, a crucial step for assessing the location of the phone.

In the case of CDRs we can identify multiple units of analysis (Unit Type), forexample:

1. the calling SIMs that identifies the active devices;

2. the call events for each SIM;

3. the relationship between the caller and called.

The Unit Types characterize a set of Units, this is an example of how the same data set,i.e. the CDRs, allow analysis on different aspects distributed over time and space.

14

Page 17: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

4 The Source Metadata

A. the calling SIMs that represent the “population of active devices”;

B. the call event represents telephone traffic;

C. the relations between the caller and called represent the network of telephonecontacts.

In particular, in example 1, the Unit Type is “the SIM that makes the call and identifiesan active device”, while the “Units” are the SIM (IMEI). The set of “Units” analyzed overtime and space represents a Population, that is, all SIMs active in a certain place and time.This is the typical input data for the density of people analysis in a certain place and time.

With these examples we want to demonstrate how starting from a set of data bymeans of a large and in-depth analysis of the information structure it is possible to addsemantics to the data by promoting the extraction of knowledge and allowing to infernew knowledge.

15

Page 18: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving
Page 19: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

5

The Intermediate Metadata

In this paragraph we would like to describe the concepts and metadata related to:

Data preparation, i.e. the selection and possible transformations of the data. Forexample, i.e. the data not to the antenna / sector, but to the BSA, or generalize thetemporal information by transforming the start time of the call from hours/minutesinto hours, anonymize the SIM identifier. These processes can be agreed with theMNO and often constitute a constraint for input privacy.

Data elaboration and estimation, i.e. prior/posterior location, estimation densityof population by number SIM.

Intermediate metadata corresponds to those production tasks embedded in thethroughput phase of the ESS RMF. These metadata are closely linked to the methodologyadopted to process, transform, and prepare data for statistical purposes. We will notprovide here mathematical or technical details about the statistical methodology of thisphase. We refer the reader to the deliverable WPI.3 “A proposed production frameworkwith mobile network data”. However, we briefly describe in generic terms the approachto motivate the Intermediate Metadata and intro duce the context of the terms includedin the glossary.

The bottom line of the throughput phase in the ESS RMF aims at detaching theunderlying complex technological layer behind the process of generation of MND fromthe statistical analysis driving us to the final statistical products. MND constitutes arich source of information, specifically about geolocation, Internet traffic, and socialinteractions. So far, the ESS RMF focuses only on geolocation information. To detach thedata layer from the statistics layer, the core idea is to compute the probability of locationof each mobile device at each tile of a given reference grid. Source data and metadata areused to carry out the computation of these probabilities so that the information comingfrom this data source is condensed in this set of so-called location probabilities for everymobile device anonymously identified. The location probabilities will constitute thebasis for any subsequent data processing and modelling exercise, so that source data

17

Page 20: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

5 The Intermediate Metadata

and metadata are not necessary any more. It is important to clarify and clearly underlinethat we do not mean that location probabilities, already independent from source dataand metadata, can be openly disseminated. They are still highly sensitive data, thus allsafeguards regarding privacy and confidentiality must still be applied on them.

Also, to describe the Intermediate Metadata related to the throughput phase we shallfollow the same conventions used above for the Source Metadata. In this sense, thestructure of the glossary for this production phase is the same. Also, the description ofthe data sets for the location probabilities runs along similar lines in terms of descriptors,data structure (identifier, attribute, measure) and referential metadata resource (seetable 5.1).

Table 5.1: Example of data set descriptors, data structure and referential metadata forlocation probabilities.

Date set descriptorsDataStructure

ReferentialMetadataResouce

caller’s phone identifier (IMEI) (transformed into ID SIM);tile ID

Identifiercomponent

IMEIID SIMTile ID(ReferenceGrid)

location reference time periodAttributecomponent

location probabilityMeasureCompo-nent

Regarding the Unit Types, what we stated for Source Metadata remains valid, sincethe throughput phase amounts to computing and assigning measure components (loca-tion probabilities and device multiplicity probabilities) for the same units of analysis.

18

Page 21: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

6

The output metadata

The Output Metadata are clearly product-oriented, thus intimately related to thestatistical domain at stake. From a general perspective, thus, it is impossible to providea minimal comprehensive list of terms comprising all statistical domains of applicationsof MND. However, not completely novel terms are needed in this respect, since therealready exists a wealth of metadata related to many statistical products obtained withtraditional data sources. For example, in tourism statistics rigorous definitions of do-mestic, inbound, and outbound tourists exist so that regarding MND only a connectionbetween these concepts and the output from the MND-based statistical process needs tobe provided.

In our view, this connection must be mostly operational, and only a disruption inthe concepts should be introduced if definitively necessary. The focus should be on thealgorithmic operationalization of these concepts. In the traditional production setting,concepts and definitions in the metadata system are introduced in the production pro-cess mainly through the questionnaire design prior to data collection. Now, data alreadyexists before data acquisition by NSIs and a new problem arises in which those conceptsand definitions must be identified through some algorithmic procedure among theseexisting data.

Let us consider the example of inbound tourism, which can be defined as compris-ing the activities of non-residents travelling to a given country that is outside theirusual environment, and staying there no longer than 12 consecutive months for leisure,business or other purpose. This is a conceptual definition which can be formalized interms of GSIM as usual. Regarding MND, we now need to operationalize it, i.e. weneed to provide a parametrizable algorithm upon MND producing an identification ofinbound tourists in our mobile network dataset. Notice that this is intimately linked tothe development of the methodology, which is in construction.

In summary, the Output Metadata construction should concentrate on the algorithmicoperationalization of traditional statistical concepts and definitions.

19

Page 22: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving
Page 23: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

Bibliography

[1] List of variables in Mobile Network Operator signalling records that are of potentialinterest for Official Statistics applications — DRAFT version 1.1 (29 October 2019)Ricciato.

[2] Handbook on the use of Mobile Phone data for Official Statistics ; UN Global WorkingGroup on Big Data for Official Statistics, 2017.

[3] https://statswiki.unece.org/display/bigdata/Classification+of+Types+of+Big+Data classificazione delle fonti.

[4] United Nations Global Working Group on Big Data for Official Statistics Task Teamon Cross-Cutting Issues Deliverable 2: Revision and Further Development of theClassification of Big Data.

[5] https://www.3gpp.org/about-3gpp/about-3gpp

[6] https://statswiki.unece.org/display/GSIMclick/Information+Resource

[7] CRISP-DM: Towards a Standard Process Model for Data Mining http://www.cs.unibo.it/˜danilo.montesi/CBD/Beatriz/10.1.1.198.5133.pdf

[8] https://statswiki.unece.org/display/GSIMclick/Information+Resource

[9] https://statswiki.unece.org/display/clickablegsim/Data+Resource

[10] https://statswiki.unece.org/display/clickablegsim/Referential+Metadata+Resource

[11] Console, M., Lembo, D., Santarelli, V., and Savo, D. F. (2014), Graphol: OntologyRepresentation through Diagrams, In Proc. of DL 2014.

21

Page 24: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving
Page 25: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Reference:

[1] Handbook on the use of Mobile Phone data for Official Statistics; UN GlobalWorking Group on Big Data for Official Statistics, 2017

[2] Guidelines for the access to mobile phone data within the ESS (WP5);

[3] From GSM to LTE: An Introduction to Mobile Networks and Mobile Broadband;Martin Sauter, Wiley

[4] List of variables in Mobile Network Operator signalling records that are of potentialinterest for Official Statistics applications

[5] Deliverable WPI.3 of ESSnet on Big Data II

23

Page 26: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Dat

aR

esou

rce:

Phon

eD

ata.

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Cal

l-D

etai

lR

ecor

dC

DR

Cal

ldet

ailr

ecor

ds

(CD

Rs)

cap

ture

info

rmat

ion

onca

llsm

ade

onte

lep

hone

syst

ems,

incl

udin

gw

hom

ade

the

call

(nam

ean

dnu

mbe

r),w

how

asca

lled

(nam

eif

avai

labl

e,an

dnu

mbe

r),t

heda

tean

dtim

eth

eca

llw

asm

ade,

the

dura

tion

ofth

eca

ll,an

dty

pica

llydo

zens

ofus

age

and

diag

nost

icin

form

atio

nel

emen

ts(f

orex

ampl

e,fe

atur

esus

edan

dre

ason

for

call

term

inat

ion)

.CD

Rs

are

colle

cted

ona

regu

lar

basi

sfo

rpr

oces

sing

into

usag

e,ca

paci

ty,p

erfo

rman

cean

ddi

agno

stic

repo

rts.

With

such

info

rmat

ion,

itis

easi

erto

spot

exce

ptio

nsto

regu

lar

calli

ngpa

tter

ns,s

uch

asou

t-of

-hou

rsca

lling

,int

erna

tiona

lca

lls,s

igni

fican

tvar

ianc

esfr

ompr

evio

usre

port

ing

peri

ods

and

call

des

tina

tion

sth

atdo

notr

eflec

tnor

mal

calli

ngpa

tter

nsfo

rth

een

terp

rise

.In

orde

rto

beab

leto

prod

uce

abi

llfo

rea

chsu

bscr

iber

,the

MN

One

eds

tom

aint

ain

char

ging

mec

hani

sms

that

are

resp

onsi

ble

for

pro

du

cing

and

com

bini

ngC

all

Det

ail

Rec

ord

s(C

DR

)an

dIn

tern

etPr

otoc

olD

etai

lRec

ords

(IPD

R),

whi

char

esu

bjec

tto

billi

ngin

form

atio

nge

nera

tion

(e.g

.ap

plyi

ngse

rvic

era

tes)

.

Dom

esti

cda

ta

Any

loca

tion

even

tsoc

curr

ing

wit

hin

the

netw

ork

ofth

esp

ecifi

cM

NO

.The

sear

eC

DR

’sor

othe

rev

ents

whe

reho

me

subs

crib

er(a

cust

omer

)of

the

spec

ific

MN

Ois

invo

lved

.Alo

cals

ubsc

ribe

rca

lling

anot

her

loca

lsub

scri

ber

inth

esa

me

MN

One

twor

kw

illge

nera

teat

leas

ttw

odo

mes

tic

even

ts(o

neca

llin

itia

tion

,one

call

rece

ivin

g).

2

24

Page 27: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Eve

ntd

ata

add

itio

nal

attr

ibut

es

Ap

art

from

subs

crib

erid

enti

ty,t

ime

and

geol

ocat

ion

attr

ibu

tes,

ther

eex

ist

mor

eat

-tr

ibut

esto

bepo

tent

ially

incl

ude

inth

em

obile

phon

eda

tase

ts.I

nter

natio

nals

peci

fica-

tion

sfo

rch

argi

ngdi

vide

them

into

four

cate

gori

es:

1.M

anda

tory

attr

ibut

es.

2.C

ondi

tion

alat

trib

utes

depe

ndin

gon

the

fulfi

llmen

tofc

erta

inco

ndit

ions

.3.

Ope

rato

r-pr

ovis

iona

ble

man

dato

ryat

trib

utes

whi

chM

NO

sha

vepr

ovis

ione

dto

beal

way

sin

clud

ed.

4.O

pera

tor-

prov

isio

nabl

eco

ndit

iona

latt

ribu

tes

whi

chM

NO

sha

vepr

ovis

ione

dto

beal

way

sin

clu

ded

ifce

rtai

nco

ndit

ions

are

met

.Man

yof

thes

eat

trib

ute

sex

hau

stiv

ely

incl

ud

edin

the

spec

ifica

tion

ssh

owlit

tle

valu

efo

rst

atis

tica

lpu

rpos

es,b

ut

afe

wof

them

are

bein

gco

mm

only

cons

ider

ed:

Rec

ord

type

–he

lps

todi

ffer

entia

tebe

twee

nda

taty

pes

(CD

Ran

dIP

DR

),se

rvic

ety

pes

(e.g

.,SM

S,M

MS,

outg

oing

call,

etc.

)an

dd

isti

ngui

shbe

twee

nev

ents

that

are

nots

ubsc

ribe

rge

nera

ted

(e.g

.,lo

cati

onup

date

s).

Cal

ldur

atio

nor

data

volu

me

for

anal

ysin

gm

obile

phon

eus

age

patt

erns

.

Subs

crib

eror

equi

pmen

tide

ntit

yfo

rre

ceiv

ing/

send

ing

part

ies.

Fore

nign

vis-

ited

netw

ork

Thi

sis

only

rele

vant

for

outb

ound

roam

ers.

The

hom

eop

erat

orsh

ould

beab

leto

obse

rve

the

MC

C/

MN

Cof

the

fore

ign

visi

ted

netw

ork.

Inso

me

case

s,th

eho

me

oper

ator

may

beab

leto

lear

nth

ep

osit

ion

ofth

eou

tbou

ndro

amer

sat

sub-

cou

ntry

leve

l(e.

g.M

SCar

eaor

SGSN

area

).If

such

info

rmat

ion

isav

aila

ble,

itm

ight

behe

lpfu

lfo

ran

alys

ing

mob

ility

patt

erns

ofou

tbou

ndro

amer

sat

abe

tter

degr

eeof

spat

iald

etai

l(E

.g.,

regi

onor

city

ofth

evi

site

dfo

reig

nco

untr

y).

25

Page 28: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Inbo

und

roam

-in

gda

ta

any

loca

tion

even

tby

afo

reig

nM

NO

subs

crib

er.T

hese

data

usua

llyre

pres

entf

orei

gnsu

bscr

iber

su

sing

alo

calr

oam

ing

serv

ice.

Thi

sd

ata

can

also

incl

ud

ed

omes

tic

sub-

scri

bers

from

anot

her

MN

Ous

ing

aro

amin

gse

rvic

ebe

caus

eth

ere

isno

rece

ptio

nby

thei

row

nM

NO

.

Inte

rnat

iona

lM

obile

Stat

ion

Equ

ipm

ent

Iden

tity

IMEI

Eve

rym

obile

dev

ice

isu

niqu

ely

iden

tifi

able

wit

hei

ther

its

IME

Ian

dIM

EIS

Vis

aun

ique

num

ber

used

toid

entif

ya

mob

ilest

atio

n(M

S).I

dent

ifica

tion

isus

edto

iden

tify

valid

devi

ces

and

ban

devi

ces

that

are

lost

orst

olen

from

the

netw

ork.

The

IMEI

is15

of16

deci

mal

digi

tslo

ng.I

tcon

sist

sof

thre

epa

rts.

The

first

8di

gits

,the

Type

App

rova

lC

ode,

are

use

dto

iden

tify

the

mod

el.

The

next

5d

igit

sar

ea

uni

aue

uni

que

seri

alnu

mbe

rfo

rth

isty

pe

ofd

evic

e.T

his

isfo

llow

edby

eith

era

1d

igit

cont

rolc

ode

ora

2d

igit

soft

war

eve

rsio

nnu

mbe

r.The

IME

Iwas

intr

oduc

edw

ith

GSM

.As

ofG

SMth

ein

det

ifica

tion

ofth

ed

evic

ean

dth

esu

bscr

iber

are

sep

arat

ed.

ASI

Mca

rdis

use

dto

iden

tify

the

subs

crib

erIn

tern

atio

nal

Mob

ileSt

atio

nE

quip

men

tId

enti

tyan

dSo

ftw

are

Ver

-si

onnu

mbe

r

IMEI

SVE

very

mob

iled

evic

eis

uni

quel

yid

enti

fiab

lew

ith

eith

erit

sIM

EI

and

IME

ISV

isa

uniq

uenu

mbe

rus

edto

iden

tify

am

obile

stat

ion

(MS)

2

Inte

rnat

iona

lM

obile

Sub-

scri

ber

Iden

-ti

ty

IMSI

As

long

asa

subs

crib

erha

sno

tcha

nged

his

SIM

card

,the

IMSI

will

rem

ain

the

Sam

e.Is

inte

rnat

iona

llyst

anda

rdiz

edun

ique

num

ber

toid

entif

ya

mob

ilesu

bscr

iber

.The

IMSI

isd

efine

din

ITU

-TR

ecom

men

dat

ion

E.2

12.

The

IMSI

cons

ists

ofa

Mob

ileC

ount

ryC

ode

(MC

C),

aM

obile

Net

wor

kC

ode

(MN

C)

and

aM

obile

Stat

ion

Iden

tifi

cati

onN

umbe

r(M

SIN

).W

hen

focu

sing

ona

sing

leco

untr

y,th

eM

CC

can

bedr

oppe

dou

tand

the

com

bina

tion

ofth

eM

NC

and

the

MSI

N,w

hich

isa

9-d

igit

cod

eid

enti

fyin

gth

esu

bscr

iber

,is

usua

llyre

ferr

edas

the

Nat

iona

lMob

ileSu

bscr

iber

Iden

tity

(NM

SI).

26

Page 29: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

IPD

etai

lR

ecor

dIP

DR

Pro

vid

esin

form

atio

nab

out

Inte

rnet

Pro

toco

l(IP

)-ba

sed

serv

ice

usa

gean

dot

her

ac-

tivi

ties

that

can

beus

edby

oper

atio

nssu

ppor

tsys

tem

s(O

SSes

)and

busi

ness

supp

ort

syst

ems

(BSS

es).

LAI/

TAI

The

LAI/

TAIc

onsi

sts

ofth

ree

elem

ents

whe

reth

efir

sttw

opa

rts

ofth

eco

dear

eal

way

sth

esa

me

for

the

MN

O(s

eead

ded

lem

mas

ofLA

Iand

TAI)

2

Mob

ileC

oun-

try

Cod

ean

dal

soM

obile

Net

wor

kC

ode.

MC

Cor

MN

C

MC

Cis

the

cod

eof

the

hom

eco

unt

ryof

the

inbo

und

roam

ers.

Itis

ath

ree-

dig

itid

entifi

catio

nof

the

coun

try

whe

reth

eLA

islo

cate

d.Th

eM

CC

isde

fined

byth

eIT

U-T

Rec

omm

enda

tion

E.21

2.Th

eM

NC

(hom

eop

erat

or)m

ight

help

toas

sess

the

sele

ctiv

itybi

asth

atis

inpl

ace

due

topr

efer

entia

lroa

min

gag

reem

ents

betw

een

MN

Os.

As

for

hom

esu

bscr

iber

s,Ie

xpec

tth

ata

sing

leM

NO

mig

htha

vecu

stom

ers

with

diff

eren

tMN

Cas

the

resu

ltof

prev

ious

mer

gers

and

acqu

isiti

ons.

Inth

isca

se,M

NC

info

rmat

ion

mig

httu

rnus

eful

tose

gmen

td

iffe

rent

cust

omer

grou

ps.F

orth

isre

ason

,itm

ight

beus

eful

tore

tain

the

MN

Cal

sofo

rho

me

subs

crib

ers.

mob

ilede

vice

MD

Mob

iled

evic

esar

ep

orta

ble

com

pu

ting

dev

ices

equ

ipp

edw

ith

rad

iotr

ansm

itte

rsen

ablin

gth

emto

conn

ectt

oa

tele

com

mun

icat

ion

netw

ork.

Add

ition

ally

,the

sede

vice

sar

ein

crea

sing

lyeq

uipp

edw

itha

num

ber

ofse

nsor

sof

diff

eren

tnat

ure

(acc

eler

omet

er,

gyro

scop

e,di

gita

lcom

pass

,GPS

radi

otr

ansm

itter

,...

).D

epen

ding

gyro

scop

e,di

gita

lco

mpa

ss,G

PSra

dio

tran

smit

ter,

...)

.Dep

endi

ngon

the

amou

ntof

tech

nolo

gy(t

hese

sens

ors)

and

thei

rsi

ze,t

hese

dev

ices

rece

ive

div

erse

nam

es(m

obile

pho

nes,

feat

ure

phon

es,s

mar

tpho

nes,

tabl

ets,

...)

.m

obile

pho

neac

tivi

tyT

his

ind

ivid

ual

dat

ain

clu

des

both

the

pas

sive

sign

alin

gan

dm

obile

pho

neac

tivi

ty(c

alls

,mes

sage

s,et

c.)

Ou

tbou

ndro

amin

gda

ta

any

loca

tion

even

tby

alo

calM

NO

subs

crib

erco

nduc

ted

inan

othe

rne

twor

k(u

sual

lya

fore

ign

MN

Oro

amin

gse

rvic

e).

The

sed

ata

usu

ally

rep

rese

ntth

elo

cals

ubs

crib

ers

usin

gm

obile

phon

esw

hile

trav

ellin

gin

fore

ign

coun

trie

s.Pr

obes

(Son

de)

(Pas

sive

)C

alla

ctiv

itie

san

dha

ndov

erlo

gs;p

erso

nalf

eatu

res

from

the

oper

ator

27

Page 30: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

reco

rdty

pehe

lps

tod

iffe

rent

iate

betw

een

dat

aty

pes

(CD

Ran

dIP

DR

),se

rvic

ety

pes

(e.g

.,SM

S,M

MS,

outg

oing

call,

etc.

)an

dd

isti

ngu

ish

betw

een

even

tsth

atar

eno

tsu

bscr

iber

gene

rate

d(e

.g.,

loca

tion

upda

tes)

;1

Rel

ease

tim

eIt

isth

etim

ew

hen

seiz

edre

sour

ces

are

rele

ased

agai

n.R

elea

setim

eis

anop

tiona

lfiel

d1

Shor

tMes

sage

Serv

ice

SMS

Isa

pag

ing

serv

ice

tose

ndsh

ort

mes

sage

sto

mob

ilete

lep

hone

s.SM

Sst

arte

das

ase

rvic

ew

ithi

nth

eG

SMne

twor

k.N

owad

ays,

anu

mbe

rof

othe

rsy

stem

sfo

rm

obile

com

mu

nica

tion

sof

fer

the

sam

ese

rvic

e.A

Shor

tM

essa

geSe

rvic

eha

sa

leng

thof

1120

bits

.Thi

sgi

ves

the

pos

sibi

lity

tose

ndan

dre

ceiv

ete

xtm

essa

ges

ofat

mos

t160

char

acte

rsor

dat

am

essa

ges

ofat

mos

t140

byte

s.H

owev

er,i

tis

poss

ible

tojo

ina

few

mes

sage

sto

one

larg

erm

essa

ge.I

san

evol

uti

onof

the

Shor

tMes

sage

Serv

ice

(SM

S),

exte

ndin

gth

ete

xtco

nten

twit

hca

pabi

litie

sto

tran

smit

mul

tim

edia

mes

sage

sto

othe

rm

obile

use

rs.

The

mes

sage

sca

nbe

any

com

bina

tion

ofim

ages

,ani

mat

ions

,au

dio

(voi

ce,m

usic

),vi

deo

and

text

.MM

Ssu

ppor

tsm

ostc

omm

onco

mpr

essi

onte

chni

ques

,su

chas

JPE

Gan

dG

IFfo

rp

ictu

res,

MP

EG

-4fo

rvi

deo

and

MP

3,W

AV

and

mid

ifor

audi

o.

2

Sign

alin

gda

ta

Sign

alin

gda

tais

gene

rally

refe

rred

toob

tain

ing

tran

smis

sion

sign

als

from

the

radi

oac

-ce

ssne

twor

k(R

AN

)dir

ectly

and

stor

ing

itto

data

base

.The

yar

eve

ryvo

lum

inou

s,an

dca

nbe

limit

edto

inbo

und

roam

ing

and

dom

esti

cd

ata,

excl

udin

gou

tbou

ndro

amin

gda

ta.T

hey

may

cont

ain

som

epa

ram

eter

valu

esre

late

dto

the

radi

och

anne

lbet

wee

nth

ean

tenn

aan

dth

em

obile

term

inal

that

are

usef

ulto

narr

owdo

wn

the

poss

ible

loca

tion

ofth

ela

tter

.

3

28

Page 31: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Subs

crib

erId

enti

tyM

od-

ule

SIM

The

Subs

crib

erId

entit

yM

odul

eis

asm

artc

ard

that

isne

cess

ary

tom

ake

use

ofa

mob

ileph

one.

The

SIM

isth

eke

yus

edto

iden

tify

and

auth

entic

ate

the

mob

ilesu

bscr

iber

.On

the

SIM

isal

som

emor

yav

aila

ble

for

pers

onal

ised

data

,suc

has

ate

leph

one

book

and

mes

sage

s.Th

esu

bscr

iber

isid

entifi

edw

ithan

IMSI

,Int

erna

tiona

lMob

ileSu

bscr

iber

Iden

tity,

and

ate

leph

one

num

ber.

The

SIM

mad

ea

clea

rse

par

atio

nbe

twee

na

mob

ilep

hone

and

asu

bscr

iber

pos

sibl

e.T

hesu

bscr

iber

can

mak

eu

seof

any

mob

ilep

hone

und

erhi

sow

nac

cou

ntif

the

SIM

card

ispu

tin

the

phon

e.Th

eus

eof

aSI

Mca

nbe

guar

ded

wit

ha

PIN

code

from

4to

8di

gits

.If4

times

the

wro

ngPI

Nco

deis

type

d,th

eSI

Mw

illbe

bloc

ked.

Toun

bloc

kth

eSI

Mth

ePU

K(P

INU

nblo

ckin

gK

ey)i

sne

eded

.Thi

sis

an8

dig

its

cod

eth

atis

know

nby

the

auth

oris

edus

eran

dgi

ven

byth

ese

rvic

epr

ovid

er.

The

SIM

card

ises

sent

ially

anin

tern

alin

tegr

ated

circ

uitc

ard

(IC

C)p

rovi

ding

dive

rse

func

tiona

litie

sto

esta

blis

han

auth

entic

ated

secu

reco

mm

unic

atio

nbe

twee

nth

em

obile

devi

cean

dth

ene

twor

k.

2

subs

crib

eror

equ

ipm

ent

iden

tity

rece

ivin

g/se

ndin

gpa

rtie

s2

Tran

sfer

red

Acc

ount

Proc

edur

eTA

PTh

etr

ansf

erre

dac

coun

tpro

cedu

re(T

AP)

incl

udes

CD

Rs

and

IPD

Rs

that

are

sent

toth

ePL

MN

(out

goin

gda

ta)o

fthe

roam

ing

subs

crib

ers.

2

Type

App

rova

lC

ode

TAC

this

isth

epr

efix

ofth

eIn

tern

atio

nalM

obile

Equi

pmen

tIde

ntifi

er(I

MEI

)tha

tenc

odes

the

type

ofd

evic

e.W

hile

the

full

IME

Iis

tobe

cons

ider

edpe

rson

alin

form

atio

n,an

dsh

ould

notb

ere

tain

ed,t

heTA

Cco

desh

ould

notb

eco

nsid

ered

sens

itive

from

apr

ivac

ype

rspe

ctiv

e.Th

eTA

Cco

dem

ight

beus

eful

,am

ong

othe

rthi

ngs,

toim

prov

eth

efil

teri

ngof

IoT/

M2M

devi

ces

cont

rolle

r

2

29

Page 32: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Type

ofEv

ent

ToE

Type

ofEv

ent(

ToE)

.We

are

mos

tlyin

tere

stto

sing

leou

teve

nts

ofty

peEx

plic

itD

etac

h(E

D)f

rom

allo

ther

type

sof

even

t,th

eref

ore

the

ToE

coul

dbe

enco

ded

with

asi

ngle

-bit

flag.

Infa

ct,t

heED

even

tsar

ere

leva

ntfo

rth

ein

terp

reta

tion

ofBA

.2

Use

rps

eudo

nym

Thi

sis

typ

ical

lyob

tain

edby

hash

ing

the

Inte

rnat

iona

lMob

ileSu

bscr

iber

Iden

tifi

er(I

MSI

)or

part

ther

eof(

typi

cally

,the

part

that

follo

ws

the

MC

C/M

NC

prefi

x).

2

Vis

itor

Loc

a-ti

onR

egis

ter

VLR

The

Vis

itor

Loca

tion

Reg

iste

r(V

LR)i

sa

data

base

ina

mob

ileco

mm

unic

atio

nsne

twor

kas

soci

ated

toa

Mob

ileSw

itchi

ngC

entr

e(M

SC).

The

VLR

cont

ains

the

exac

tloc

atio

nof

allm

obile

subs

crib

ers

curr

ently

pres

enti

nth

ese

rvic

ear

eaof

the

MSC

.Thi

sin

form

atio

nis

nece

ssar

yto

rout

ea

call

toth

eri

ghtb

ase

stat

ion.

The

data

base

entr

yof

the

subs

crib

eris

dele

ted

whe

nth

esu

bscr

iber

leav

esTh

eV

isito

rLo

catio

nR

egis

ter

(VLR

)is

ada

taba

sein

am

obile

com

mun

icat

ions

netw

ork

asso

ciat

edto

aM

obile

Swit

chin

gC

entr

e(M

SC).

The

VL

Rco

ntai

nsth

eex

actl

ocat

ion

ofal

lmob

ilesu

bscr

iber

scu

rren

tly

pres

enti

nth

ese

rvic

ear

eaof

the

MSC

.Thi

sin

form

atio

nis

nece

ssar

yto

rout

ea

call

toth

eri

ghtb

ase

stat

ion.

The

data

base

entr

yof

the

subs

crib

eris

dele

ted

whe

nth

esu

bscr

iber

leav

esth

ese

rvic

ear

ea.

2

Dat

aR

esou

rce:

Net

wor

kD

ata.

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

3rd

Gen

erat

ion

Par

tner

ship

Proj

ect’s

3GPP

3rd

Gen

erat

ion

Part

ners

hip

Proj

ect.

The

join

tsta

ndar

diza

tion

part

ners

hip

resp

onsi

ble

for

stan

dard

izin

gU

MTS

(Uni

vers

alM

obile

Tele

com

mun

icat

ion

Syst

em),

HSP

A(H

igh

Spee

dPa

cket

Acc

ess)

and

LTE

(Lon

gTe

rmEv

olut

ion)

30

Page 33: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Acc

ess

Poi

ntN

ame

APN

Thi

sis

alo

gica

lid

enti

fier

,for

mat

ted

asa

char

acte

rst

ring

,tha

tis

use

din

3Gan

d4G

.It

isso

mew

hatr

elat

edto

cont

ract

ual

arra

ngem

ents

and

typ

eof

subs

crip

tion

:ala

rge

com

pany

purc

hasi

nga

stoc

kof

mul

tiple

mob

ilesu

bscr

iptio

nsfo

rits

empl

oyee

sw

illbe

typi

cally

assi

gned

asp

ecia

lAPN

.Mai

ntai

ning

info

rmat

ion

abou

tthe

APN

mig

httu

rnus

eful

for

the

inte

rpre

tati

onof

spat

io-t

empo

ralp

atte

rns

that

are

spec

ific

topa

rtic

ular

grou

ps

and

/or

for

the

sele

ctio

nof

par

ticu

lar

use

rgr

oup

s.I

exp

ect

that

the

MN

Ow

illu

seA

PN

valu

esto

geth

erw

ith

TAC

cod

es,I

MSI

rang

esan

dIM

EI

rang

esto

pre

-fi

lter

IoT

/M

2Md

evic

es,b

ut

such

filt

erin

gm

ight

not

bep

erfe

ct.

Mai

ntai

ning

AP

Nin

form

atio

nco

uld

beus

eful

inde

tect

ing

resi

dual

IoT/

M2M

devi

ces

inth

eda

tase

t.I

don

’tex

pect

that

the

MN

Ow

illbe

will

ing

tosh

are

clea

rtex

tA

PN

valu

esw

ith

the

Stat

istic

alO

ffice

s,as

this

info

rmat

ion

ishi

ghly

sens

itive

from

abu

sine

sspo

into

fvie

w.

Inth

ebe

stca

seth

eyw

illp

seu

don

ymis

eth

eA

PN

(e.g

.by

som

eha

shin

gfu

ncti

on,a

sdo

new

ith

the

IMSI

)but

this

isno

tapr

oble

mfo

rst

atis

tica

lapp

licat

ions

.

Ans

wer

tim

eIt

isth

etim

ew

hen

aca

llis

answ

ered

–th

eco

nnec

tion

was

succ

essf

ul.A

nsw

ertim

eis

am

anda

tory

field

for

succ

essf

ulca

lls.

Aut

hent

icat

ion

Cen

tre

AuC

The

Aut

hent

icat

ion

Cen

tre

impl

emen

tsth

efu

nctio

nto

auth

entic

ate

each

SIM

card

that

atte

mpt

sto

conn

ectt

oth

eco

rene

twor

k(t

ypic

ally

whe

nth

em

obile

devi

ceis

pow

ered

on).

Onc

eau

then

tica

ted

,the

HL

Rbe

gins

tom

anag

eth

eSI

Man

dth

eco

rres

pon

din

gse

rvic

es.

Bas

eSt

atio

nC

ontr

olle

rBS

C

The

base

stat

ion

pro

vid

esth

eco

ntro

love

rea

chB

TS

asw

ella

sth

eco

nnec

tion

toth

eco

rene

twor

k.W

hile

the

base

stat

ion

isth

ein

terf

ace

elem

entt

hatc

onne

cts

the

mob

iled

evic

esw

ith

the

netw

ork,

the

BSC

isre

spon

sibl

efo

rth

ees

tabl

ishm

ent,

rele

ase

and

mai

nten

ance

ofal

lcon

nect

ions

ofce

llsth

atar

eco

nnec

ted

toit

.

31

Page 34: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Bas

eTr

ansc

eive

rSt

atio

nsBT

S

Base

stat

ions

,whi

char

eal

soca

lled

Base

Tran

scei

verS

tatio

ns(B

TSs)

,are

the

mos

tvis

ible

netw

ork

elem

ents

ofa

GSM

syst

em.C

ompa

red

tofix

ed-li

nene

twor

ks,t

heba

sest

atio

nsre

plac

eth

ew

ired

conn

ectio

nto

the

subs

crib

erw

itha

wir

eles

sco

nnec

tion,

whi

chis

also

refe

rred

toas

the

airi

nter

face

.The

base

stat

ions

are

also

the

mos

tnum

erou

scom

pone

nts

ofa

mob

ilene

twor

kan

dco

mp

rise

sd

iver

seel

emen

ts,a

par

tfr

omth

ean

tenn

ait

self

,w

ith

div

erse

func

tion

alit

ies

(enc

ryp

ting

and

dec

ryp

ting

com

mu

nica

tion

s,sp

ectr

um

filte

ring

tool

s,..

.)

2

Billi

ngC

entr

eBC

The

Billi

ngC

entr

eis

resp

onsi

ble

for

gath

erin

gne

twor

kus

age

data

for

ever

ysu

bscr

iber

.T

hed

ata

isp

ulle

dfr

omth

eM

SC(S

GSN

and

GG

SNar

eal

soin

volv

edin

gene

rati

ngbi

lling

data

).

Bou

ndin

gA

rea

BA

The

BAin

form

atio

nis

pote

ntia

llyus

eful

whe

nla

rge

tem

pora

lgap

sex

istb

etw

een

two

cons

ecu

tive

even

trec

ord

sfo

rth

esa

me

mob

iled

evic

e:it

pro

vid

esa

cons

trai

ntof

the

poss

ible

posi

tion

ofth

em

obile

devi

cebe

twee

ntw

oco

nsec

utiv

eob

serv

atio

ns.I

nor

der

tore

fer

colle

ctiv

ely

toth

een

titi

eslis

ted

belo

w-L

ocat

ion

Are

a(L

A)i

n2G

(and

inth

eC

ircu

itw

itch

ed(C

S)se

ctio

nof

3G).

Rou

ting

Are

a(R

A)i

n3G

(spe

cific

ally

,the

Pack

etSw

itch

ed(P

S)se

ctio

nth

ereo

f).

Trac

king

Are

a(T

A)o

r,if

enab

led,

Trac

king

Are

aLis

t(TA

L)in

4G.

Inot

her

wor

ds,

dep

end

ing

onth

eR

adio

Acc

ess

Tech

nolo

gy(R

AT

)—2G

,3G

or4G

—th

eBA

will

map

toLA

,RA

orTA

.As

for

3G,a

gene

ric

RA

isal

way

sen

tirel

yco

ntai

ned

with

ina

LA:i

fbot

hR

Aan

dLA

are

avai

labl

ein

the

sam

eev

ent,

the

BAw

illm

apto

the

smal

lest

ofth

etw

o,i.e

.,R

Ace

llgl

obal

iden

tific

atio

nC

GI

CG

Iis

the

uniq

ueid

enti

fier

ofa

sing

lece

ll

32

Page 35: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Cel

lIde

ntifi

erC

ellI

D

Thi

sin

form

atio

nis

avai

labl

efo

rho

me

subs

crib

ers

and

inbo

und

roam

ers.

The

Cel

lID

can

beei

ther

inth

efo

rmof

aM

NO

-spe

cific

iden

tifier

(acc

ordi

ngto

alo

calc

elln

amin

gco

nven

tion

)or

inth

efo

rmof

the

stan

dar

dC

ellG

loba

lId

enti

fier

(CG

I).B

oth

opti

ons

are

acce

ptab

le.T

hece

llID

shou

ldbe

inte

rpre

ted

asa

poin

ter

toth

ece

llco

vera

gear

ea.

This

iden

tity

isun

ique

only

insi

deth

elo

cati

onar

eace

llsi

tes

colle

ctio

nof

tran

smit

ters

aton

esp

ecifi

clo

cati

once

llto

wer

colle

ctio

nof

tran

smit

ters

aton

esp

ecifi

clo

cati

onth

atca

nbe

refe

rred

toas

ace

llsi

te

Cen

tral

Stor

-ag

eSy

stem

s

The

cent

rals

tora

gesy

stem

sar

eco

mpo

sed

by:

The

billi

ngdo

mai

n(B

D)s

tore

sC

DR

san

dIP

DR

sfo

rch

argi

ngpu

rpos

es.D

ata

will

best

ored

here

afte

ra

succ

essf

ulch

argi

ngre

cord

gene

rati

onpr

oced

ure.

Cus

tom

erda

taba

ses

cont

ain

info

rmat

ion

abou

tuse

rs,w

hich

can

add

extr

ava

lue

toC

DR

sor

IPD

Rs

data

–e.

g.,s

ocio

-dem

ogra

phic

san

dpl

ace

ofre

side

nce

(whe

reap

plic

able

).

Dat

aw

areh

ouse

sho

stco

llect

ions

ofd

ata

from

dif

fere

ntso

urc

esbu

tmig

htno

tal

way

sbe

easi

lyac

cess

ible

due

toth

ehu

geam

ount

ofda

tast

ored

ther

e.

Dat

abas

esin

the

billi

ngd

omai

nar

ege

nera

llyco

nsid

ered

mos

teas

ilyac

cess

ible

.Dat

ast

ored

inth

eB

Dis

gath

ered

bya

so-c

alle

dm

edia

tion

syst

emfr

omd

iffe

rent

netw

ork

enti

ties

resp

onsi

ble

for

prov

idin

gva

riou

sty

pes

ofse

rvic

es.

33

Page 36: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Cir

cuit

Swit

ched

CS

Cir

cuit

swit

chin

gis

am

etho

dw

here

bya

ded

icat

edph

ysic

alpa

th,o

rci

rcu

it,i

ses

tab-

lishe

dan

dm

aint

aine

dbe

twee

ntw

ono

des

orlo

catio

nsfo

rth

edu

ratio

nof

aco

nnec

tion.

Cir

cuit

swit

ched

netw

orks

are

ofte

nre

ferr

edto

asco

nnec

tion

-ori

ente

dne

twor

ksbe

-ca

use

the

dedi

cate

dci

rcui

tmus

tbe

esta

lishe

dfir

st,o

r“n

aile

dup

”,be

fore

info

rmat

ion

can

bese

nt.

Tele

pho

nene

twor

ksar

ety

pic

ally

circ

uit

swit

ched

,bec

ause

voic

etr

affi

cre

quir

esth

eco

nsis

tent

tim

ing

ofa

sing

le,d

edic

ated

phys

ical

path

toke

epa

cons

tant

del

ayon

the

circ

uit

.T

hep

lain

old

tele

pho

nesy

stem

(PO

TS)

isth

ela

rges

tci

rcu

itsw

itche

dne

twor

k.Th

eor

igin

alG

SMne

twor

kis

also

circ

uits

witc

hed.

Alth

ough

GPR

Sin

trod

uced

pack

etsw

itch

ing

inth

eG

SMne

twor

k.

Cor

eN

etw

ork

CN

Aco

rene

twor

kis

ate

leco

mm

uni

cati

onne

twor

k’s

core

par

t,w

hich

offe

rsnu

mer

ous

serv

ices

toth

ecu

stom

ers

who

are

inte

rcon

nect

edby

the

acce

ssne

twor

k.It

ske

yfu

ncti

onis

tod

irec

tte

lep

hone

calls

over

the

pu

blic

-sw

itch

edte

lep

hone

netw

ork.

Inge

nera

l,th

iste

rmsi

gnifi

esth

ehi

ghly

func

tion

alco

mm

uni

cati

onfa

cilit

ies

that

inte

rcon

nect

prim

ary

node

s.Th

eco

rene

twor

kde

liver

sro

utes

toex

chan

gein

form

atio

nam

ong

vari

ous

sub-

netw

orks

.Whe

nit

com

esto

ente

rpri

sene

twor

ksth

atse

rve

asi

ngle

orga

niza

tion

,the

term

back

bone

isof

ten

used

inst

ead

ofco

rene

twor

k,w

here

asw

hen

used

wit

hse

rvic

epr

ovid

ers

the

term

core

netw

ork

ispr

omin

ent.

This

term

isal

sokn

own

asne

twor

kco

reor

back

bone

netw

ork.

Gat

eway

GPR

SSu

ppor

tN

ode

GG

SNT

heG

atew

ayG

PR

SSu

pp

ort

Nod

e(G

GSN

)ac

tsas

anex

tens

ion

for

the

SGSN

(see

desc

ript

ion

for

MSC

)in

GPR

Sne

twor

ksto

conn

ecta

GPR

Sne

twor

kto

anex

tern

alda

tane

twor

k(e

.g.,

Inte

rnet

).G

ener

alPa

cket

Rad

ioSe

rvic

eG

PRS

The

Gen

eral

Pack

etR

adio

Serv

ice,

enha

nced

the

GSM

stan

dard

totr

ansp

ortd

ata

inan

effic

ient

man

ner

and

enab

led

wir

eles

sde

vice

sto

acce

ssth

eIn

tern

et.

Glo

balS

yste

mfo

rM

obile

Com

mu

nica

-ti

ons

GSM

The

Glo

balS

yste

mfo

rM

obile

Com

mun

icat

ions

isal

sokn

own

as2G

,the

pred

eces

sor

ofth

e3G

netw

ork.

34

Page 37: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

GSM

-lik

em

o-bi

lete

leco

m-

mu

nica

tion

netw

ork

AG

SM-l

ike

mob

ilete

leco

mm

unic

atio

nne

twor

kis

aco

llect

ion

ofth

ree

nest

edty

pes

ofsu

bsys

tem

s:a)

The

radi

one

twor

kor

Base

Stat

ion

Subs

yste

m(B

SS),

prov

idin

gal

lele

men

tsto

conn

ect

the

mob

ilede

vice

sto

the

netw

ork

over

the

radi

oin

terf

ace

(als

okn

own

asai

rin

terf

ace)

.It

ishe

rew

here

the

conn

ecti

onbe

twee

nm

obile

devi

ces

and

ante

nnas

take

spl

ace.

b)T

heco

rene

twor

kor

Net

wor

kSw

itch

ing

Subs

yste

ms

(NSS

),pr

ovid

ing

alle

lem

ents

for

switc

hing

ofca

lls,f

orsu

bscr

iber

man

agem

enta

ndm

obili

tym

anag

emen

t.Th

eco

rene

twor

km

aybe

optio

nally

com

plem

ente

dw

ithth

eIn

telli

gent

Net

wor

k(I

N)s

ubsy

stem

,pr

ovid

ing

optio

nalf

unct

iona

litie

sto

the

netw

ork

(as

the

prep

aid

serv

ices

,to

nam

eth

em

osti

mpo

rtan

t).

c)Th

eN

etw

ork

Man

agem

entS

yste

m(N

MS)

,mon

itori

ngan

dm

anag

ing

dive

rse

aspe

cts

ofth

ene

twor

ksu

chas

mai

nten

ance

wor

ks(s

oftw

are

upgr

adin

g,co

llect

ion

ofst

atis

tics

abou

tper

fom

ance

,cus

tom

erbi

lling

,...

).

3

Loca

tion

area

LAA

djac

entc

ells

,typ

ical

ly30

or40

,are

grou

ped

toge

ther

into

one

loca

tion

area

.Agr

oup

ofce

llsfo

rma

loca

tion

area

Loc

atio

nA

rea

Cod

eLA

C

The

serv

edar

eaof

ace

llula

rra

dio

netw

ork

isu

sual

lyd

ivid

edin

tolo

cati

onar

eas.

Loc

atio

nar

eas

are

com

pris

edof

one

orse

vera

lrad

ioce

lls.E

ach

loca

tion

area

isgi

ven

anu

niqu

enu

mbe

rw

ithi

nth

ene

twor

k,th

eL

ocat

ion

Are

aC

ode

(LA

C).

Thi

sco

de

isus

edas

aun

ique

refe

renc

efo

rthe

loca

tion

ofa

mob

ilesu

bscr

iber

.Thi

sco

deis

nece

ssar

yto

addr

ess

the

subs

crib

erin

the

case

ofan

inco

min

gca

ll.T

heL

AC

form

sp

arto

fth

eL

ocat

ion

Are

aId

enti

fier

(LA

I)an

dis

broa

dca

sted

onth

eBr

oadc

astC

ontr

olC

hann

el(B

CC

H).

35

Page 38: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Loc

atio

nA

rea

Cod

e(L

AC

)/Tr

acki

ngA

rea

Cod

e(T

AC

)

LAC

/TA

C

The

Loc

atio

nA

rea

Cod

e(L

AC

)/Tr

acki

ngA

rea

Cod

e(T

AC

)is

anid

enti

fier

ofth

elo

catio

nar

eaw

ithin

anM

NO

’sne

twor

k.Th

ispa

rtof

the

code

can

bere

pres

ente

dus

ing

hexa

dec

imal

valu

esw

ith

ale

ngth

oftw

ooc

tets

.T

hese

rved

area

ofa

cellu

lar

rad

ione

twor

kis

usua

llydi

vide

din

tolo

cati

onar

eas.

Loca

tion

area

sar

eco

mpr

ised

ofon

eor

seve

ralr

adio

cells

.Eac

hlo

cati

onar

eais

give

nan

uniq

uenu

mbe

rw

ithi

nth

ene

twor

k,th

eLo

catio

nA

rea

Cod

e(L

AC

).Th

isco

deis

used

asa

uniq

uere

fere

nce

for

the

loca

tion

ofa

mob

ilesu

bscr

iber

.Thi

sco

deis

nece

ssar

yto

addr

ess

the

subs

crib

erin

the

case

ofan

inco

min

gca

ll.

Loc

atio

nA

rea

Iden

tity

LAI

LAIc

onsi

sts

ofth

ree

elem

ents

whe

reth

efir

sttw

opa

rts

ofth

eco

dear

eal

way

sth

esa

me

for

the

MN

O.F

orre

gist

ratio

nin

the

netw

ork,

the

Mob

ility

Man

agem

entE

ntity

(MM

E)ha

sto

info

rmth

eM

SCof

the

2G/3

GLo

catio

nA

rea

Iden

tity

(LA

I)in

whi

chth

em

obile

dev

ice

iscu

rren

tly

’the

oret

ical

ly’l

ocat

ed.S

ince

this

ison

lya

theo

reti

calv

alu

e,it

has

tobe

com

pu

ted

outo

fthe

Trac

king

Are

aId

enti

ty(T

AI)

,whi

chis

the

corr

esp

ond

ing

iden

tifie

rin

LTE

.In

prac

tice

,thi

scr

eate

sa

dep

end

ency

betw

een

the

TAIa

ndth

eL

AI,

that

is,t

helo

catio

nar

eas

that

desc

ribe

agr

oup

ofba

sest

atio

nsin

2G/3

Gan

dLT

Em

ust

beco

nfigu

red

ina

geog

raph

ical

lysi

mila

rw

ayfo

rth

efa

llbac

kto

wor

kla

ter

on.

Loc

atio

nBa

sed

Syst

emLB

S

The

pro

pri

etar

ym

onit

orin

gsy

stem

inp

lace

atth

eM

NO

mig

htbe

alre

ady

usi

ngth

era

dio

par

amet

ers

(TA

,rec

eive

dst

reng

than

dot

hers

)to

narr

owd

own

the

pos

itio

nof

the

mob

ilete

rmin

al.

Gen

eral

lysp

eaki

ng,i

fL

BS

dat

aar

eav

aila

ble,

they

shou

ldbe

defin

itel

yre

port

edin

addi

tion

to(n

otin

repl

acem

ento

f)of

Cel

lID

and

TA.

Loc

atio

nid

en-

tifie

r

The

LAI/

TAI/

RA

Iis

intu

rna

com

poun

dco

defo

rmed

with

the

Mob

ileC

ount

ryC

ode

(MC

C),

the

Mob

ileN

etw

ork

Cod

e(M

NC

)an

da

Loc

atio

n/Tr

acki

ng/

Rou

ting

Are

aC

ode

(LA

C/T

AC

/RA

C).

Lon

gTe

rmEv

olut

ion

LTE

The

Long

Term

Evol

utio

nis

also

know

nas

4G.

2

36

Page 39: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Mob

ileP

o-si

tion

ing

(Att

ive)

Sys-

tem

Easy

for

smal

lsam

ples

,usu

ally

requ

ires

opt-

inA

ccur

ate

posi

tion

ing,

cust

omfr

eque

n-ci

es;q

uest

ionn

aire

wit

hth

ere

spon

dent

poss

ible

Mob

ileSu

b-sc

ribe

rIn

-te

grat

edSe

rvic

esD

ig-

ital

Net

wor

knu

mbe

r

MSI

SDN

The

MSI

SDN

isth

em

obile

pho

nenu

mbe

r.It

isco

mp

osed

ofth

eco

unt

ryco

de,

the

nati

onal

des

tina

tion

cod

ean

dth

esu

bscr

iber

num

ber.

As

long

assu

bscr

iber

has

not

chan

ged

his/

her

oper

ator

thus

also

chan

ging

the

SIM

card

,the

MSI

SDN

will

rem

ain

the

sam

e,du

eto

port

abili

tyof

mob

ileph

one

num

bers

.

Mob

ileSw

itch

-in

gC

entr

eM

SCTh

eM

obile

Switc

hing

Cen

tre

(MSC

)is

ate

leph

one

exch

ange

that

mak

esth

eco

nnec

tion

betw

een

mob

ileu

sers

wit

hin

the

netw

ork,

from

mob

ileu

sers

toth

ep

ubl

icsw

itch

edte

leph

one

netw

ork

and

from

mob

ileus

ers

toot

her

mob

ilene

twor

ks.

Mu

ltim

edia

Mes

sagi

ngSe

rvic

eM

MS

MM

Sis

ast

ore

and

forw

ard

mes

sagi

ngse

rvic

e.Th

ism

eans

that

ifth

ere

cipi

entp

hone

isno

tsw

itch

edon

,the

mes

sage

will

best

ored

inth

ene

twor

kan

dse

ntto

the

reci

pien

tas

soon

asth

eph

one

issw

itche

don

.Diff

eren

tpro

toco

lsca

nbe

used

asM

MS

tran

spor

tm

echa

nism

,suc

has

WA

P,H

TTP

orSe

ssio

nIn

itiat

ion

Prot

ocol

(SIP

).Th

em

ostc

omm

onap

proa

chus

edno

wad

ays

isW

AP.

N-B

estS

erve

rsA

reas

N-B

SAIt

isab

leto

com

pute

and

repo

rt,f

orea

chpo

inti

nsp

ace

(typ

ical

lyra

ster

ised

,e.g

.att

hegr

anul

arit

yof

50m

x50

m),

the

pred

icte

dsi

gnal

leve

loft

heN

stro

nges

trad

ioce

lls.

Thes

em

aps

are

calle

dN

-Bes

tSer

vers

Are

as(N

-BSA

)ne

twor

kan

ten-

nas

loca

tion

the

min

imum

geog

raph

ical

info

rmat

ion

prov

ided

byth

eM

NO

san

dit

corr

espo

nds

toth

eco

ordi

nate

pair

ofth

ean

tenn

apo

int

Net

wor

kce

llE

very

cell

can

bed

escr

ibed

bya

num

ber

ofat

trib

ute

ssu

chas

azim

uth

,sec

tor

angl

e,sh

ape

and

size

ofth

eco

vera

gear

ea,t

ype

ofan

tenn

aan

dlo

cati

on.

2

Net

wor

kd

ata

add

itio

nal

at-

trib

utes

The

OSS

usu

ally

stor

esad

dit

iona

lin

form

atio

nfo

rm

aint

enan

cean

dp

erfo

rman

cean

alys

ispu

rpos

es.T

his

info

rmat

ion

can

behi

ghly

valu

able

fori

mpr

ovin

gth

epr

eced

ing

attr

ibut

es,i

npa

rtic

ular

,att

ribu

tes

ofth

ece

llsca

nbe

ofgr

eatv

alue

.

37

Page 40: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Net

wor

km

an-

agem

ent

sub-

syst

emN

MS

The

purp

ose

ofth

eN

MS

isto

mon

itor

and

tom

anag

eva

riou

sas

pect

sof

the

netw

ork.

The

func

tion

sof

the

NM

Sca

nbe

div

ided

into

thre

eca

tego

ries

:fa

ult

man

agem

ent,

confi

gura

tion

man

agem

enta

ndpe

rfor

man

cem

anag

emen

t.

Op

erat

ion

and

Sup

por

tSu

bsys

tem

OSS

The

Op

erat

ion

and

Sup

por

tSu

bsys

tem

(OSS

)ta

kes

resp

onsi

bilit

yfo

rm

aint

enan

cew

orks

such

asso

ftw

are

upgr

adin

gan

dth

eco

llect

ion

ofst

atis

tics

abou

tnet

wor

kpe

rfor

-m

ance

.The

OSS

also

cont

ains

data

base

sth

atho

ldin

form

atio

nab

outn

etw

ork

elem

ents

,su

chas

loca

tions

ofce

llsi

tes,

i.e.o

fBTS

s.Th

isis

are

leva

ntfe

atur

epo

ssib

lyim

ping

ing

onth

ege

oloc

atio

ndi

men

sion

ofth

ere

ques

ted

data

toM

NO

s.P

ubl

icL

and

Mob

ileN

et-

wor

kPL

MN

The

Publ

icLa

ndM

obile

Net

wor

kis

the

entir

ene

twor

kof

the

oper

ator

,whe

reea

chBT

Sis

asso

ciat

edw

ith

age

ogra

phic

alce

llco

veri

ngth

iste

rrit

ory

ofth

ePL

MN

.

Rad

ioA

cces

sN

etw

ork

RA

N

Ara

dio

acce

ssne

twor

kis

ate

chno

logy

that

conn

ects

indi

vidu

alde

vice

sto

othe

rpar

tsof

ane

twor

kth

roug

hra

dio

conn

ectio

ns.I

tis

am

ajor

part

ofm

oder

nte

leco

mm

unic

atio

ns,

with

3Gan

d4G

netw

ork

conn

ectio

nsfo

rm

obile

phon

esbe

ing

exam

ples

ofra

dio

acce

ssne

twor

ks.

Rad

ioA

cces

sTe

cnol

ogy

RA

T2G

,3G

or4G

1,4

Rou

ting

area

RA

Itis

the

cou

nter

par

tof

LA

inp

acke

t-sw

itch

ed(P

A)

netw

orks

.T

heR

Ais

usu

ally

asm

alle

rar

eaco

mpa

red

toth

eL

Abe

cau

seu

sing

mu

ltim

edia

serv

ices

requ

ires

mor

efr

eque

ntpa

ging

mes

sage

s.R

educ

ing

the

area

that

need

sto

bepa

ged

help

sto

low

erth

enu

mbe

rof

pagi

ngm

essa

ges

that

are

sent

out.

2

Rou

ting

Are

aC

ode

RA

CR

outi

ngA

rea

Cod

e(R

AC

),w

hich

isa

one

octe

tlon

gco

de1

Rou

ting

Are

aId

enti

tyR

AI

For

pac

ket-

swit

ched

netw

orks

,Rou

ting

Are

aId

enti

ty(R

AI)

pla

ysth

esa

me

role

asLA

Is/T

AIs

.1

Seiz

ure

tim

eIt

isth

eti

me

whe

nre

sour

ces

are

seiz

edto

prov

ide

serv

ice

toth

esu

bscr

iber

.Thi

sfie

ldis

man

dato

ryon

lyfo

rca

llsth

atw

ere

unsu

cces

sful

1

38

Page 41: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Serv

ing

GP

RS

Supp

ortN

ode

SGSN

Serv

ing

GPR

SSu

ppor

tNod

e(S

GSN

)is

am

ain

com

pone

ntof

the

GPR

Sne

twor

k,w

hich

hand

les

allp

acke

tsw

itch

edd

ata

wit

hin

the

netw

ork,

e.g.

the

mob

ility

man

agem

ent

and

auth

entic

atio

nof

the

user

s.Th

eSG

SNpe

rfor

ms

the

sam

efu

nctio

nsas

the

MSC

for

voic

etr

affic

.The

SGSN

and

the

MSC

are

ofte

nco

-loc

ated

.

1

Tim

eA

dvan

ceTA

Ifsi

gnal

ling

data

are

extr

acte

dfr

omth

eR

adio

Acc

ess

Net

wor

k(R

AN

)the

ym

ayco

ntai

nso

me

para

met

erva

lues

rela

ted

toth

era

dio

chan

nelb

etw

een

the

ante

nna

and

the

mob

ilete

rmin

alth

atar

eu

sefu

lto

narr

owd

own

the

pos

sibl

elo

cati

onof

the

latt

er.

Tim

ing

Adv

ance

(TA

)tha

tis

rela

ted

toth

ero

und

trip

prop

agat

ion

dela

ybe

twee

nth

ean

tenn

aan

dth

em

obile

term

inal

atth

etim

eth

em

essa

gew

asge

nera

ted,

and

ther

efor

eis

linke

dto

the

(qua

ntis

edva

lue

ofth

e)ph

ysic

aldi

stan

cebe

twee

nan

tenn

aan

dm

obile

devi

ce.

2

Trac

king

Are

aTA

The

trac

king

area

isth

eLT

E(L

ong-

Term

Evo

luti

on)c

ount

erp

arto

fthe

loca

tion

area

and

rout

ing

area

.Atr

acki

ngar

eais

ase

tofc

ells

.Tra

ckin

gar

eas

can

begr

oupe

din

tolis

tsof

trac

king

area

s(T

Alis

ts),

whi

chca

nbe

confi

gure

don

the

Use

rEq

uipm

ent(

UE)

.Tr

acki

ngar

eaup

date

sar

epe

rfor

med

peri

odic

ally

orw

hen

the

UE

mov

esto

atr

acki

ngar

eath

atis

noti

nclu

ded

init

sTA

list.

Ope

rato

rsca

nal

loca

tedi

ffer

entT

Alis

tsto

diff

eren

tUEs

.Thi

sca

nav

oid

sign

alin

gpe

aks

inso

me

cond

itio

ns:

for

inst

ance

,the

UE

sof

pas

seng

ers

ofa

trai

nm

ayno

tp

erfo

rmtr

acki

ngar

eaup

date

ssi

mul

tane

ousl

y.

2

Trac

king

Are

aId

enti

tyTA

I

For

regi

stra

tion

inth

ene

twor

k,th

eM

obili

tyM

anag

emen

tEnt

ity(M

ME)

has

toin

form

the

MSC

ofth

e2G

/3G

Loc

atio

nA

rea

Iden

tity

(LA

I)in

whi

chth

em

obile

dev

ice

iscu

rren

tly

“the

oret

ical

ly”

loca

ted

.Si

nce

this

ison

lya

theo

reti

cal

valu

e,it

has

tobe

com

pute

dou

toft

heTr

acki

ngA

rea

Iden

tity

(TA

I),w

hich

isth

eco

rres

pond

ing

iden

tifier

inLT

E.I

np

ract

ice,

this

crea

tes

ad

epen

den

cybe

twee

nth

eTA

Ian

dth

eL

AI,

that

is,

the

loca

tion

area

sth

atd

escr

ibe

agr

oup

ofba

sest

atio

nsin

2G/

3Gan

dLT

Em

ust

beco

nfigu

red

ina

geog

raph

ical

lysi

mila

rw

ayfo

rth

efa

llbac

kto

wor

kla

ter

on.

2

39

Page 42: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Uni

vers

alM

o-bi

leTe

leco

m-

mu

nica

tion

Syst

em

UM

TSU

MTS

,als

okn

own

as3G

and

the

mos

tpre

vale

ntm

obile

com

mun

icat

ion

tech

nolo

gies

,fo

llow

the

3rd

Gen

erat

ion

Part

ners

hip

Proj

ect’s

(3G

PP)t

echn

ical

spec

ifica

tion

s.

Dat

aR

esou

rce:

Busi

ness

Dat

a.

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

cont

ract

type

Poss

ible

cont

ract

type

sar

epr

e-pa

id,p

ost-

paid

SIM

,mac

hine

-to-

mac

hine

SIM

.

Cus

tom

erda

taC

RM

data

Cu

stom

erd

atab

ases

cont

ain

info

rmat

ion

abou

tu

sers

,whi

chca

nad

dex

tra

valu

eto

CD

Ror

IPD

Rda

ta–

e.g.

,soc

io-d

emog

raph

ics

and

plac

eof

resi

denc

e.

Cu

stom

erR

elat

ions

hip

Man

agem

ent

CR

M

CR

Mis

anin

tegr

ated

man

agem

enti

nfor

mat

ion

syst

emth

atis

used

tosc

hed

ule,

plan

and

cont

rolt

hesa

les

and

pre

-sal

esac

tivi

ties

inan

orga

niza

tion

.C

RM

syst

ems

com

-pr

ise

ofha

rdw

are,

soft

war

ean

dne

twor

king

tool

sto

impr

ove

cust

omer

trac

king

and

com

mun

icat

ion.

Hom

eL

oca-

tion

Reg

iste

rH

LR

The

Hom

eL

ocat

ion

Reg

iste

ris

ad

atab

ase

from

am

obile

netw

ork

inw

hich

info

rma-

tion

from

allm

obile

subs

crib

ers

isst

ored

.T

heH

LR

cont

ains

info

rmat

ion

abou

tth

esu

bscr

iber

’sid

enti

ty,h

iste

leph

one

num

ber,

the

asso

ciat

edse

rvic

esan

dge

nera

linf

or-

mat

ion

abou

tthe

loca

tion

ofth

esu

bscr

iber

.The

exac

tloc

atio

nof

the

subs

crib

eris

kept

ina

Vis

itor

Loca

tion

Reg

iste

r.

40

Page 43: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Mob

ileN

et-

wor

kO

pera

tor

MN

O

Am

obile

netw

ork

oper

ator

(MN

O)i

sa

tele

com

mun

icat

ions

serv

ice

prov

ider

orga

niza

-ti

onth

atp

rovi

des

wir

eles

svo

ice

and

dat

aco

mm

uni

cati

onfo

rit

ssu

bscr

ibed

mob

ileus

ers.

Mob

ilene

twor

kop

erat

ors

are

inde

pend

entc

omm

unic

atio

nse

rvic

epr

ovid

ers

that

own

the

com

plet

ete

leco

min

fras

truc

ture

for

host

ing

and

man

agin

gm

obile

com

mun

icat

ions

betw

een

the

subs

crib

edm

obile

user

sw

ithus

ers

inth

esa

me

and

exte

rnal

wir

eles

san

dw

ired

tele

com

netw

orks

.M

obile

netw

ork

oper

ator

sar

eal

sokn

own

asca

rrie

rse

rvic

epr

ovid

ers,

mob

ileph

one

oper

ator

and

mob

ilene

twor

kca

rrie

rs.

soci

o-d

emog

rap

hic

attr

ibut

es

The

soci

o-d

emog

raph

icat

trib

utes

ofth

esu

bscr

iber

sar

eus

ually

colle

cted

inth

eC

RM

syst

emof

the

MN

Oan

dre

late

dto

the

cust

omer

’sp

rofi

ling

syst

em.

The

soci

o-d

emog

rap

hic

attr

ibu

tes

are

mos

tly

colle

cted

for

pos

t-p

aid

pri

vate

(non

-com

mer

cial

)cu

stom

ers,

asth

ere

isof

ten

noin

form

atio

non

the

profi

les

ofpr

e-pa

idcu

stom

ers

orth

ein

form

atio

nis

limite

d(u

nles

sth

eyar

ere

quir

edto

regi

ster

upon

the

purc

hase

ofth

eSI

Mca

rd,a

ndev

enth

enit

isno

talw

ays

avai

labl

ein

the

CR

M).

Forc

omm

erci

alco

ntra

cts

itis

ofte

nno

tpos

sibl

eto

know

whi

chsp

ecifi

cpe

rson

isus

ing

the

phon

e(a

ndth

eref

ore

the

attr

ibut

esof

the

pers

onar

eun

know

n).W

ith

post

-pai

dpr

ivat

ecu

stom

ers,

dat

ais

also

ofte

nbi

ased

inca

seof

fam

ilyp

lans

–th

eso

ciod

emog

rap

hics

Wit

hp

ost-

pai

dp

riva

tecu

stom

ers,

data

isal

soof

ten

bias

edin

case

offa

mily

plan

s–

the

soci

odem

ogra

phic

sof

the

pers

onw

hosi

gned

the

cont

ract

(fat

her,

mot

her)

are

exte

nded

toth

ew

hole

grou

p(m

eani

ngth

eag

ean

dge

nder

ofth

ech

ildre

nus

ing

the

phon

ear

eby

exte

nsio

nth

ose

ofth

efa

ther

’s).

2,4

41

Page 44: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Subs

crib

ers’

add

itio

nal

attr

ibut

es

Thi

sis

rele

vant

for

the

MN

Os

stor

ein

form

atio

nab

outt

heir

cust

omer

sin

subs

crib

erd

atab

ases

(for

dom

esti

can

dou

tbou

ndcu

stom

ers)

.The

info

rmat

ion

that

isco

llect

edan

dst

ored

inth

eC

usto

mer

Rel

atio

nshi

pM

anag

emen

t(C

RM

)sys

tem

ishi

ghly

MN

O-

spec

ific

typi

cally

incl

udes

soci

o-de

mog

raph

icch

arac

teri

stic

sof

the

subs

crib

er(o

wne

rof

the

cont

ract

)tha

tmig

htbe

age,

gend

er,p

refe

rred

lang

uage

,etc

.as

wel

las

deta

ilson

the

cont

ract

and

serv

ice

such

aspr

ivat

eor

busi

ness

clie

nt,i

nvoi

cead

dres

s,av

erag

eco

stof

the

serv

ice

orco

ntra

ctty

pe(p

re-p

aid,

post

-pai

dSI

M,m

achi

ne-t

o-m

achi

neSI

M).

This

info

rmat

ion

shou

ldbe

used

with

grea

tcau

tion,

asit

ishi

ghly

sens

itive

and

nota

lway

sac

cura

te(p

hone

user

may

diff

erfr

omth

eco

ntra

ctho

lder

).Th

eva

lue

ofth

ese

attr

ibut

eslie

sin

the

pos

sibi

lity

toad

dex

tra

dim

ensi

ons

tost

atis

tica

lana

lysi

s(e

.g.,

gend

er),

asw

ella

sin

the

optio

nof

perf

orm

ing

data

clea

ning

proc

esse

s(e

.g.,

rem

oval

ofM

2Mda

ta).

2

Mob

ilep

hone

Mar

ketS

hare

The

per

cent

age

ofth

eto

tal

num

ber

ofco

ntra

ctsu

bscr

ipti

ons

(sal

es)

that

isea

rned

bya

par

ticu

lar

tele

pho

neco

mp

any

over

asp

ecifi

edp

erio

dof

tim

e.M

arke

tsh

are

isca

lcul

ated

byta

king

the

com

pany

’ssa

les

inth

epe

riod

and

divi

ding

them

byth

eto

tal

sect

orsa

les

inth

esa

me

peri

od.T

his

met

ric

isus

edto

give

age

nera

lide

aof

the

size

ofa

tele

phon

eco

mpa

nyto

its

mar

keta

ndco

mpe

tito

rs.

Mob

ilep

hone

pen

etra

tion

rate

Mob

ilep

hone

pen

etra

tion

rate

,is

ofte

nu

sed

tom

ean

the

num

ber

ofac

tive

mob

ilep

hone

use

rsp

er10

0p

eop

lew

ithi

na

spec

ific

pop

ula

tion

,whi

chis

tech

nica

llyno

ta

pen

etra

tion

rate

asit

doe

sno

tacc

ount

for

use

rsha

ving

mu

ltip

lem

obile

pho

nes

and

henc

eca

nex

ceed

100%

due

todo

uble

coun

ting

.

42

Page 45: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARYD

ata

Res

ourc

e:ES

SR

MF

data

.

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Con

dit

iona

lL

ocat

ion

Prob

abili

ty

Pro

babi

lity

oflo

cati

onof

ane

twor

kev

ent

cond

itio

ned

onan

othe

rne

twor

kev

ent

imm

edia

tely

befo

re.

5

Con

verg

ence

laye

rC

-lay

erIn

term

edia

tela

yer

ofth

eE

SSR

efer

ence

Met

hod

olog

ical

Fram

ewor

kfo

cuse

don

the

tran

sfor

mat

ion

and

prep

arat

ion

ofda

tafo

rst

atis

tica

lpur

pose

s.5

Dat

ala

yer

D-l

ayer

Firs

tlay

erof

the

ESS

Ref

eren

ceM

etho

dolo

gica

lFra

mew

ork

focu

sed

onth

eac

cess

and

prep

roce

ssin

gof

raw

telc

oda

tain

prep

arat

ion

for

the

stat

isti

calp

roce

ssin

g.5

Ded

up

licat

edp

enet

rati

onra

tePe

netr

atio

nra

tead

just

edby

ade

dupl

icat

ion

fact

or.

5

Ded

up

licat

ion

fact

orN

umer

icfa

ctor

adju

stin

gfo

rth

ede

vice

mul

tipl

icit

yci

rcum

stan

ce.

5

Dev

ice

du

plic

-it

yC

ircu

mst

ance

inw

hich

agi

ven

ind

ivid

ualc

arri

esex

actl

ytw

od

evic

esd

urin

ghi

s/he

rdi

spla

cem

ent.

5

Dev

ice

du

plic

-it

ypr

obab

ility

Prob

abili

tyth

ata

give

nde

vice

isca

rrie

dto

geth

erw

ithan

othe

ron

e(a

ndon

lyon

em

ore)

bya

give

nin

divi

dual

duri

nghi

s/he

rdi

spla

cem

ent.

5

Dev

ice

mu

lti-

plic

ity

Cir

cum

stan

cein

whi

cha

give

nin

div

idu

alca

rrie

sm

ult

iple

dev

ices

du

ring

his/

her

disp

lace

men

t.5

Dev

ice

mu

lti-

plic

ity

pro

ba-

bilit

y

Prob

abili

tyth

ata

give

nde

vice

isca

rrie

dto

geth

erw

ithan

othe

ron

esby

agi

ven

indi

vid-

uald

urin

ghi

s/he

rdi

spla

cem

ent.

5

Dyn

amic

alA

p-pr

oach

App

roac

hto

com

pute

loca

tion

prob

abili

ties

whe

rea

dis

plac

emen

tpat

tern

ofm

obile

devi

ces

ism

odel

led.

5

(HM

M)

Em

is-

sion

Mod

elPr

obab

ility

mod

elfo

rth

eem

issi

onpr

obab

iliti

es.

5

43

Page 46: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

(HM

M)

Em

is-

sion

Pro

babi

l-it

y

Ina

dyna

mic

alap

proa

chba

sed

ona

hidd

enM

arko

vm

odel

,pro

babi

lity

ofob

serv

atio

nof

netw

ork

even

tdat

aco

ndit

ione

don

the

stat

eof

the

mob

ilede

vice

.5

ESS

Ref

eren

ceM

etho

dol

ogi-

calF

ram

ewor

k

ESS

RM

F

Prod

uctio

nfr

amew

ork

base

don

mod

ular

ityan

dev

olva

bilit

ypr

inci

ples

toin

corp

orat

em

obile

netw

ork

dat

ain

the

pro

du

ctio

nof

offi

cial

stat

isti

cs.I

tcom

pri

ses

thre

ela

yers

nam

edd

ata

laye

r,co

nver

genc

ela

yer,

and

stat

stit

ics

laye

rw

hose

mai

nbu

sine

ssfu

nc-

tion

sar

eto

acce

ssan

dp

rep

roce

ssra

wte

lco

dat

a,to

tran

sfor

man

dp

rep

are

dat

afo

rst

atis

tica

lpur

pose

s,an

dto

build

stat

isti

calp

rod

ucts

for

dif

fere

ntst

atis

tica

ldom

ains

,re

spec

tive

ly.

5

Eve

ntlo

cati

onpr

obab

ility

(See

even

tloc

atio

nlik

elih

ood)

.Geo

loca

tion

Gen

eric

assi

gnm

ento

fcoa

rsed

spat

iote

m-

pora

lcoo

rdin

ates

.E.g

.ass

ignm

ento

faci

tydi

stri

ctto

ane

twor

kev

ent.

5

Geo

posi

tion

Gen

eric

assi

gnm

ent

ofsp

atio

tem

por

alco

ord

inat

es.

E.g

.as

sign

men

tof

lati

tud

ean

dlo

ngit

udpa

ram

eter

sto

ane

twor

kev

ent.

5

Join

tL

ocat

ion

Prob

abili

tyP

roba

bilit

yof

loca

tion

oftw

oco

nsec

uti

vene

twor

kev

ents

for

agi

ven

mob

iled

evic

e.Lo

cati

onG

eolo

cati

onof

ane

twor

kev

entt

oa

tile

ofth

ere

fere

nce

grid

.5

(Eve

nt)

Lo-

cati

onL

ikel

i-ho

odPr

obab

ility

ofne

twor

kev

entd

ata

cond

itio

ned

ona

give

nlo

cati

on.

5

Loc

atio

np

rob-

abili

tyPr

obab

ility

ofth

elo

cati

onof

ane

twor

kev

ent.

5

Pos

teri

orlo

ca-

tion

pro

babi

l-it

yPr

obab

ility

ofth

elo

cati

onof

ane

twor

kev

entc

ondi

tion

edon

netw

ork

even

tdat

a.5

Pri

orlo

cati

onpr

obab

ility

Prob

abili

tyof

the

loca

tion

ofa

netw

ork

even

tnot

cond

itio

ned

onne

twor

kev

entd

ata.

5

Ref

eren

cegr

idG

rid

ofre

fere

nce

inth

eES

SR

efer

ence

Met

hodo

logi

calF

ram

ewor

kup

onw

hich

geol

o-ca

tion

will

bere

ferr

edto

.5

Reg

ion

Uni

toft

erri

tori

aldi

visi

onup

onw

hich

final

esti

mat

esw

illbe

prod

uced

.

44

Page 47: ESSnet Big Data II · 2020-06-02 · erable, Essnet Big Data pilot 2 (REF to the Quality Guidelines) the throughput phase has been split into two phases, i.e. sub-phase 1, Deriving

GLOSSARY

Lem

ma

Shor

tor

acro

nim

Defi

niti

onR

ef.

Stat

icA

p-

proa

chA

ppro

ach

toco

mpu

telo

cati

onpr

obab

iliti

esw

here

nodi

spla

cem

entp

atte

rnof

mob

ilede

vice

sis

mod

elle

d.5

Stat

isti

cal

filte

ring

Proc

edur

eto

iden

tify

indi

vidu

als

ofa

targ

etpo

pula

tion

with

ina

give

nm

obile

netw

ork

data

set.

5

Stat

isti

csla

yer

S-la

yer

Upp

erla

yer

ofth

eES

SR

efer

ence

Met

hodo

logi

calF

ram

ewor

kfo

cuse

don

the

cons

truc

-ti

onof

stat

isti

calp

rodu

cts

for

diff

eren

tsta

tist

ical

dom

ains

.5

Tele

com

mun

icat

ion

netw

ork

Tech

onol

ogic

alin

fras

truc

ture

dist

ribu

ted

accr

oss

age

ogra

phic

alte

rrit

ory

prov

idin

ga

tele

com

mun

icat

ion

serv

ice,

i.e.a

com

mun

icat

ion

serv

ice

whi

lein

mov

emen

t.5

Tile

Pixe

loft

here

fere

nce

grid

.(H

MM

)Tra

nsiti

onM

odel

Prob

abili

tym

odel

fort

hetr

ansi

tion

prob

abili

ties

.5

(HM

M)T

rans

i-ti

onP

roba

bil-

ity

Ina

dyn

amic

alap

proa

chba

sed

ona

hid

den

Mar

kov

mod

el,p

roba

bilit

yof

tran

siti

onbe

twee

nco

nsec

utiv

est

ates

.5

45