Characterization and Monitoring of Critical Systems on National Internet
João Miguel Marques Domingos
Dissertation submitted to obtain the Master Degree in
Telecommunications and Informatics Engineering
Supervisors: Prof. Rui Jorge Morais Tomaz Valadas
Eng. Paulo Pereira
Examination Committee
Chairperson: Prof. Paulo Jorge Pires Ferreira
Supervisor: Prof. Rui Jorge Morais Tomaz Valadas
Member of the Committee: Prof. Fernando Henrique Côrte-Real Mira da Silva
May 2017
Acknowledgments
What a journey! I would like to thank my advisers, Professor Rui Valadas and Professor Jose Brazio, for
all the knowledge they gave me and for all the time spent helping me during this work. They motivated me
from the beginning and always helped with their tremendous knowledge whenever they could. Engineer
Paulo Pereira also deserves all my thanks for his important advice throughout this work.
To my parents and closest friends, thank you for always supporting me during the good and bad
times. You always supported my decisions and helped me achieve my dreams! If this thesis is finished,
it is because of all of you. Thank you for everything.
For my parents.
I am what I am because of you.
Abstract
The Internet has become a critical communications infrastructure for today’s society. Besides supporting
interpersonal communication and entertainment services, the Internet also allows communication with
sites that are critical to the good functioning of our society, such that denial of access to the information
or services they provide can cause society to malfunction. In the following, these sites are designated
eLife critical sites. This thesis develops the capability to evaluate and monitor the resilience of sites that
support critical public services used by Portuguese citizens, based on publicly available information.
This work addresses a series of questions: (i) the identification and inventory of the most important sites
and algorithms; (ii) the characterization of the IP resources associated with each of these sites; (iii) the
interconnection system that supports the connections between those sites and the Internet, namely the
characterization of the Autonomous Systems (AS) that support their connectivity. The techniques and
algorithms described above are integrated in the monitoring system developed, Crisys, which
dynamically obtains information and resilience indicators for the systems considered.
Keywords: eLife Critical Sites, Autonomous Systems, Crisys, National Internet, Automatic Monitoring,
Communications Resilience.
Resumo
The Internet has become a critical communications infrastructure for today’s society. Besides supporting
interpersonal communication and entertainment services, the Internet also allows communication with
sites that are critical to the good functioning of our society, where denial of access to the information or
services they provide can cause society to malfunction. These sites are defined as eLife critical sites.
This thesis develops the capability to evaluate and monitor the resilience of sites that support critical
services used by Portuguese citizens, using only publicly available information. Carrying out this work
required addressing a series of questions: (i) identification and inventory of the most important sites and
algorithms; (ii) characterization of the IP resources associated with each of these sites; (iii) analysis of
the interconnection system that supports the connections between these sites and the Internet, namely
the characterization of Autonomous Systems (AS). The techniques and algorithms described above are
integrated in the monitoring system developed, named Crisys, which dynamically obtains information
and resilience indicators for the systems indicated.
Keywords: eLife critical sites, Autonomous Systems, Crisys, National Internet, Automatic Monitoring,
Communications Resilience.
Contents
Acknowledgments
Abstract
Resumo
List of Figures
List of Tables
Acronyms
1 Introduction
    1.1 Goals
    1.2 Outline
2 Technical Background
    2.1 Critical Infrastructures & Sites
    2.2 Autonomous Systems
    2.3 DNS
    2.4 BGP
    2.5 Assignment of IP Address Blocks
    2.6 Web Scraping
    2.7 Summary
3 Related Work
    3.1 Dutch Study
        3.1.1 Autonomous Systems Identification
        3.1.2 Autonomous Systems Characterization
        3.1.3 Data Representation and Analysis
    3.2 French Study
        3.2.1 Autonomous Systems Identification
        3.2.2 Autonomous Systems Characterization
        3.2.3 Data Representation and Analysis
        3.2.4 DNS Resilience
        3.2.5 BGP Resilience
    3.3 German Study
        3.3.1 Autonomous Systems Identification
        3.3.2 Autonomous Systems Characterization
        3.3.3 Data Representation and Analysis
    3.4 Lithuanian Study
        3.4.1 Critical Sites & Autonomous Systems Identification
        3.4.2 Monitoring
    3.5 Summary
4 Identification and Communications Resilience Assessment
    4.1 Identification and Communications Resilience Assessment
        4.1.1 Methodology
        4.1.2 Results
    4.2 Characterization of Critical Sites
        4.2.1 Sources of Information
        4.2.2 Methodology
        4.2.3 Results
    4.3 Identification of ASes
        4.3.1 Methodology
    4.4 ASes Interconnection System
        4.4.1 Paths Identification
        4.4.2 Valid Paths Identification
        4.4.3 Paths Characterizations
    4.5 Characterization of ASes
        4.5.1 Methodology
        4.5.2 Results
    4.6 Chapter Summary
5 The Crisys Monitoring System
    5.1 Functionalities
        5.1.1 Sites Monitoring
        5.1.2 ASes Monitoring
        5.1.3 Sites and ASes availability correlation
    5.2 Architecture
    5.3 Implementation
        5.3.1 Database
        5.3.2 Monitoring Sites
        5.3.3 Monitoring ASes
        5.3.4 GUI Interface
    5.4 Chapter Summary
6 Evaluation
    6.1 Test Description
    6.2 Critical Systems Network
        6.2.1 eLife Critical Sites
        6.2.2 Autonomous Systems
    6.3 Crisys System Reliability
        6.3.1 Identification of Site unavailability
        6.3.2 Possible Paths
        6.3.3 AS Rank
    6.4 Crisys System Performance
        6.4.1 Database Populate
        6.4.2 Sites Update
        6.4.3 ASes Update
        6.4.4 Sites Monitoring
    6.5 Chapter Summary
7 Conclusions and Future Work
    7.1 Future Work
A Identified eLife Critical Sites
Bibliography
List of Figures
2.1 Technical background organizational view
3.1 Lithuanian Monitoring System
4.1 AS identification on RIB example
4.2 Valley-free example
4.3 ASes paths JSON file example. Paths to AS2860
5.1 Summary of the system architecture
5.2 Sites page
5.3 Individual Site page
5.4 ASes page
5.5 AS Individual page - part 1
5.6 AS Individual page - part 2
5.7 Interactive Graph
6.1 No. of DNS Servers per Site
6.2 No. of IP Addresses per Site
6.3 RTT per Site
6.4 Visual traceroute to sns.gov.pt
6.5 Site down for maintenance
6.6 System log confirming that the site is down
6.7 Our AS rank compared with CAIDA AS rank
6.8 Database populate chart
6.9 Sites update chart
6.10 ASes update chart
6.11 Sites monitoring chart
List of Tables
3.1 Comparison between what we need with past works
4.1 Critical eLife sites identification summary
4.2 Characterization example
4.3 AS characterization example
6.1 Test Environment Configuration
6.2 Sites metrics evaluation
6.3 Sites security evaluation
6.4 Sites affected by the shutdown of all foreign Transit ASes
6.5 Number of hops per path from ISP to Services ASes
6.6 Number of possible paths from ISP to Services ASes
6.7 Number of National and Foreign Services ASes
6.8 Traceroute comparison with possible paths identified and classified
A.1 eLife critical sites
List of Acronyms
AFRINIC African Network Information Center
ARIN American Registry for Internet Numbers
APNIC Asia-Pacific Network Information Centre
AS Autonomous System
ASN Autonomous System Number
ARCEP Autorité de régulation des communications électroniques et des postes
BGP Border Gateway Protocol
BSI Bundesamt für Sicherheit in der Informationstechnik
CAIDA Center for Applied Internet Data Analysis
DoS Denial of Service
DDoS Distributed Denial of Service
DNS Domain Name System
ENISA European Network and Information Security Agency
ICT Information and Communications Technology
IANA Internet Assigned Numbers Authority
IX Internet Exchange Point
ISP Internet Service Provider
LACNIC Latin America and Caribbean Network Information Centre
LIR Local Internet Registry
RIR Regional Internet Registry
RIPE Réseaux IP Européens
RPKI Resource Public Key Infrastructure
RIPE NCC RIPE Network Coordination Centre
ROA Route Origin Authorization
RIB Routing Information Base
RIS Routing Information Service
TLD Top Level Domain
UCLA University of California, Los Angeles
Chapter 1
Introduction
In its beginning, the Internet was used mainly to support simple applications such as email, chat and
news readers. Nowadays the Internet supports a very large number of highly complex services, in
particular services that replace the need for physical presence, for example a bank transaction or the
purchase of a book. Some of these services are critical to the good functioning of our society, namely
those related to health, finance and government. To access these services, people interact with
sites that can also be considered critical.
We define a critical site as an Internet site whose malfunction affects a large number of people in
activities that are fundamental to the operation of society. We call these sites eLife critical sites.
The resilience of communications with eLife critical sites is an important asset of modern societies.
Evaluating and monitoring this resilience is important for taking preventive action, and it involves
several aspects.
Communication with eLife critical sites takes place over the IP protocol suite, and several of its aspects
should be considered, for example the IP addresses, the DNS servers, the responsible entities and
whether the location is national or international. The global IP connectivity of sites is supported by
Autonomous Systems (AS), and the failure of one such system can affect this connectivity. Physical
redundancy can reduce the risk of losing connectivity to a site when the connectivity of an intermediate
AS is disrupted. The location of an AS is another important aspect, as an indicator of the degree of
control a country has over the ASes critical to its national connectivity. However, several ASes are
present in more than one country.
A study of the national critical sites would provide data that can be applied to improve the security
and reliability of our national electronic services.
The goal of this work is to identify Portuguese eLife critical sites, to define methodologies to monitor
the resilience of the communications with these sites, and to develop a platform that implements these
methodologies. To develop our work, previous studies that map and monitor the critical sites of different
countries were analyzed.
To identify Portuguese eLife critical sites, we start by identifying national critical sectors,
using classifications already available, and then identify eLife critical sites in those sectors using public
information from entities or other sources.
To monitor the resilience of communications with eLife critical sites, we need to characterize these
communications. For this purpose, each site identified as critical will be characterized, using information
derived from public sources, in terms of the associated IP address space, names, DNS and
BGP routes. In addition, the AS interconnection system will be studied, in particular to identify the
supporting ASes considered critical. With all critical ASes characterized, it is possible to draw a topology
of the AS interconnection system and evaluate its resilience. In this last step, it is essential to consider
ASes that provide transit between critical ASes.
Using the methodologies explained above, the developed platform will be capable of performing a static
characterization followed by dynamic monitoring of those critical systems, in terms of connectivity and
of changes in important technical parameters, for example the IP address of a site or the BGP
route of a critical AS.
1.1 Goals
This thesis has the following goals:
• Development of criteria for the identification of Portuguese critical sites.
• Development of criteria and methodologies to assess the communications with critical sites, using
only information publicly available.
• Development of a platform for monitoring the communications with critical sites.
1.2 Outline
The present thesis is structured in the following way:
• Chapter 1 - Presents the motivation and goals of this thesis and succinctly describes the work
performed.
• Chapter 2 - Presents information about technical aspects related to the present thesis.
• Chapter 3 - Presents previous works that contain available tools and methodologies useful for this
thesis.
• Chapter 4 - Describes the methodologies developed and results obtained in the identification and
characterization of eLife national critical systems.
• Chapter 5 - Describes the architecture and implementation aspects of the tool developed to
monitor national critical systems.
• Chapter 6 - Presents the evaluation of the developed tool in terms of performance and information
reliability, and the evaluation of national connectivity resilience.
• Chapter 7 - Summarizes the work performed and describes possible future work.
Chapter 2
Technical Background
This chapter provides background information relevant for the work developed in this thesis. Section
2.1 gives the definition of a critical infrastructure and the different critical sectors defined by ENISA.
Section 2.2 explains the different types of Autonomous Systems and their interconnections. Section 2.3
discusses DNS vulnerabilities and some countermeasures, namely DNSSEC. Section 2.4 discusses
BGP vulnerabilities and countermeasures. Section 2.5 describes the allocation of IP address blocks
and the different entities involved in this process. Section 2.6 presents some technologies for
extracting public information from the Internet.
Figure 2.1 shows these elements and how they are organized. An organization can be
responsible for sites, naming domains, DNS servers and IP address blocks. The IP address blocks are
allocated by a Regional Internet Registry. The organization can be supported by an AS owned by an
ISP, or it can have its own AS. An organization can also be an ISP. The ISP contains ASes that contain
routes to other ASes. The routes are stored in the regional registry.
2.1 Critical Infrastructures & Sites
A critical infrastructure is a system that is essential for the maintenance of vital societal functions,
which means that its failure may have a significant negative impact on the security and well-being of
citizens [1].
Critical Infrastructures can be classified according to the critical sectors of activity they belong to.
Several definitions of critical sectors exist, one example being that provided by the European Network
and Information Security Agency (ENISA) [2]. ENISA defines the following critical sectors:
• Energy
• Nuclear industry
• Information and Communication Technologies (ICT)
• Water
• Food
• Health
• Financial
• Transport
• Chemical industry
• Space
• Research facilities
Figure 2.1: Technical background organizational view
In [2], ENISA also defines how critical assets can be identified in terms of the severity of
consequences in case of a failure:
• Public effects (percentage of population affected);
• Economic effects (significance of economic loss and/or degradation of products or services);
• Environmental effects;
• Political effects;
• Psychological effects.
This methodology is applied to critical infrastructures, which are different from eLife critical sites. An
eLife critical site can be seen as a platform to access resources or services provided by a critical
infrastructure. For example, an online banking site is a platform to access services provided by banks,
which are considered critical infrastructures. We assume that an eLife critical site can also be classified
according to these critical sectors of activity, as the failure of an eLife critical site also has a significant
impact on the good operation of society.
2.2 Autonomous Systems
An Autonomous System (AS) is a set of routers and communication resources under a single technical
administration, using interior gateway protocols and metrics to route packets within the AS, and using
BGP to route packets to other Autonomous Systems (ASes) [3].
There are three main types of ASes [4]: stub, multi-homed and multi-homed transit. A stub AS has
only one connection to another AS. A multi-homed AS has more than one connection to other ASes
but does not allow transit traffic. A multi-homed transit AS has more than one connection to other
ASes and allows transit traffic.
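The three types above can be sketched as a small classification rule. This is a minimal illustration, not the method used in the thesis: the `AutonomousSystem` record, its field names and the AS numbers (taken from the range reserved for documentation) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutonomousSystem:
    """Hypothetical record describing one AS and its direct neighbours."""
    asn: int
    neighbours: frozenset      # AS numbers of directly connected ASes
    allows_transit: bool       # True if traffic between neighbours may cross this AS


def classify(as_: AutonomousSystem) -> str:
    """Return 'stub', 'multi-homed' or 'multi-homed transit'."""
    if len(as_.neighbours) <= 1:
        return "stub"
    return "multi-homed transit" if as_.allows_transit else "multi-homed"


# Illustrative ASes using documentation AS numbers (64496-64511).
stub = AutonomousSystem(64500, frozenset({64501}), False)
multi = AutonomousSystem(64502, frozenset({64501, 64503}), False)
transit = AutonomousSystem(64501, frozenset({64500, 64502, 64503}), True)

for item in (stub, multi, transit):
    print(item.asn, classify(item))
```

In practice the transit flag is not directly observable and would have to be inferred, for example from whether the AS appears in the middle of observed AS paths.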
ASes can also be classified following a definition by CAIDA [5] [6], as transit, content and access.
Transit ASes form the backbone of the Internet by carrying traffic to other destinations. Content ASes
provide videos, Web pages and other contents. Access ASes are used by smaller organizations or
individuals to access the Internet.
In a relationship between two ASes, an AS can be classified as provider, customer or peer. An AS is a
provider when the organization that owns it is paid to carry the traffic of another AS; conversely,
an AS is a customer when its owner pays another organization to carry its traffic. Two
ASes are peers when their owners have a mutual agreement to exchange the traffic of their customers
without payment.
2.3 DNS
The Domain Name System (DNS) can be described as a simple query-response protocol, in which clients
query servers to resolve domain names into IP addresses [7]. It has security vulnerabilities that can
compromise the reliability of the name resolution process, namely:
• Integrity of Server Hardware and Software: The usual countermeasure is redundancy; for
example, each primary server has a list of secondary servers that hold a copy of the
primary DNS records database.
• Tampering of DNS responses: An attacker can intercept communications (man-in-the-middle)
and generate a false response to the user's query to the DNS server, so that the user is directed to
malicious systems, for example a Web page that harvests credentials. The best countermeasure
for these attacks is DNSSEC, a set of extensions to the DNS that ensures the integrity of
responses by validating them with digital signatures included in the responses. These digital
signatures are contained in DNSSEC-related resource records that are generated and added to
the zone during zone signing. Each DNS zone is verified from child to parent up to the root zone,
creating a chain of trust. This verification guarantees the integrity of DNS records. Many registrars
do not offer this technology because of its cost and its non-mandatory status.
• Distributed Denial of Service: A DDoS attack aims to overload servers, services
or infrastructure resources until they become unavailable. One specific DDoS attack is the
reflection/amplification attack, which sends requests to open resolvers with a spoofed (tampered)
source IP address, so that the responses go to that address, generating a high volume of traffic
and thus causing a denial of service. The usual countermeasures are firewalls, IP filtering, DNS
Dampening and Response Rate Limiting.
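The child-to-parent order in which DNSSEC builds its chain of trust can be illustrated by listing the names that are visited on the way to the root. This is only a sketch: the function name is an assumption, `example.pt` is a placeholder domain, and the code enumerates candidate zone names without performing any signature validation (not every label necessarily corresponds to a separately delegated zone).

```python
def chain_of_trust(zone: str) -> list[str]:
    """List candidate zone names from the given zone up to the root zone.

    Only the order of the names is produced; no DNS query or
    cryptographic check is performed.
    """
    labels = [label for label in zone.rstrip(".").split(".") if label]
    chain = [".".join(labels[i:]) + "." for i in range(len(labels))]
    chain.append(".")  # the root zone closes the chain
    return chain


print(chain_of_trust("www.example.pt"))
```

A real validator would, for each step, check the signatures of the child zone against the key material published by its parent.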
2.4 BGP
The Border Gateway Protocol (BGP) is an inter-Autonomous System routing protocol whose primary
function is to exchange network reachability information with other BGP systems [8]. This
exchange of information is supported by the following types of messages:
• Open - For the creation of a BGP session between routers.
• Update - For the exchange of network reachability information.
• Keepalive - For checking that the BGP peer is still reachable and keeping the session alive.
• Notification - For the information of error conditions.
• Route-Refresh - Optional message that has the objective of requesting dynamic, inbound, BGP
route updates from BGP peers or of sending outbound route updates to a BGP peer.
Routing information exchanged between BGP routers is stored in a database called the BGP Routing
Information Base (RIB), which stores routing information for each learned prefix, for example the
corresponding AS_PATH.
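A RIB can be pictured as a mapping from prefixes to AS_PATHs. The sketch below is an illustration under stated assumptions: the `RIB` dictionary, the function names, the prefixes (documentation ranges) and the AS numbers (documentation range) are all made up for the example. It finds the most specific prefix covering an address and reads the origin AS, which is the last AS in the AS_PATH.

```python
import ipaddress

# Hypothetical RIB: prefix -> AS_PATH (nearest AS first, origin AS last).
RIB = {
    "192.0.2.0/24": [64500, 64510, 64496],
    "192.0.2.128/25": [64501, 64497],
    "198.51.100.0/24": [64500, 64498],
}


def lookup(rib, address):
    """Return (network, as_path) for the longest prefix covering address, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, as_path in rib.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, as_path)
    return best


def origin_as(rib, address):
    """Return the AS that originates the best matching prefix, or None."""
    match = lookup(rib, address)
    return match[1][-1] if match else None


print(origin_as(RIB, "192.0.2.200"))
```

A linear scan keeps the example short; a real RIB dump holds hundreds of thousands of prefixes and would call for a prefix tree.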
BGP has security vulnerabilities that can compromise the routing system of the Internet, namely:
• Human Errors: A bad routing configuration can lead to global instability. For example, when
Pakistan wanted to prevent access to YouTube in the country, the traffic was redirected to a
black hole inside Pakistan, but because of a configuration error that rule was exported outside
Pakistan, leading to a global denial of service to those servers. To prevent human errors, the IETF
defined a Routing Policy Specification Language for expressing policies, which enables the creation
of tools that automate router configuration. However, an automatic configuration can still contain
bad policies.
• Software Failures: For example, faulty router software that can cause network failures and
compromise Internet connections.
• Malicious Attacks: BGP can suffer some malicious attacks, for example DoS attacks to systems,
control of border routers, route flap injections and falsification of routing information.
Countermeasures for the protection of the routing system of the Internet include the creation of quick
reaction response teams, the use of authentication mechanisms in BGP packets, verification that the
first AS in the AS_PATH is identical to the AS from which the prefix was learned, and the announcement
of disaggregated prefixes. Techniques such as BGP ingress and egress filtering and the
maximum-prefix feature also improve the resilience of BGP.
2.5 Assignment of IP Address Blocks
IPv4 was created first, providing $2^{32}$ ($\approx 4 \times 10^{9}$) IP addresses, but with the growing
number of devices it became necessary to create IPv6, which provides $2^{128}$ ($\approx 3.4 \times 10^{38}$)
IP addresses.
The global address spaces, IPv4 and IPv6, are managed by a hierarchical structure of organizations.
The top organization is the Internet Assigned Numbers Authority (IANA). IANA allocates a number of
address blocks to each one of the five Regional Internet Registries (RIR) (RIPE NCC, APNIC, LACNIC,
AFRINIC and ARIN). Normally, each RIR is responsible for a specific region of the globe; for example,
RIPE NCC is responsible for Europe, the Middle East and parts of Central Asia. Each RIR subsequently
allocates its IP blocks to Local Internet Registries (LIR), also called members, which can be Internet
Service Providers (ISP), companies or academic institutions. Some specific IP address ranges are
reserved for special use, as described in RFC1918 [9] and RFC5735 [10]; for example, 10.0.0.0/8 is
used in private networks.
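The sizes of the two address spaces and the private-use check can be reproduced with Python's standard `ipaddress` module. This is a small sketch; the helper name `is_rfc1918` and the sample addresses are chosen for the example.

```python
import ipaddress

# Size of each address space.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128
print(f"IPv4: {ipv4_total:.2e} addresses")   # IPv4: 4.29e+09 addresses
print(f"IPv6: {ipv6_total:.2e} addresses")   # IPv6: 3.40e+38 addresses

# The three private-use blocks defined in RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """True if the IPv4 address falls inside one of the RFC 1918 blocks."""
    addr = ipaddress.ip_address(address)
    return any(addr in block for block in RFC1918_BLOCKS)


print(is_rfc1918("10.1.2.3"))
```

An address that resolves into one of these blocks cannot be the public address of a site, so a check like this is a cheap sanity filter when characterizing IP resources.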
RIR databases, for IPv4 and IPv6, are structured around well-defined objects. For IPv4 one such
object is inetnum [11], which contains information (attributes) about the allocation and assignment of
IPv4 addresses. The following attributes are directly related to our work:
• inetnum: Range of IPv4 addresses that the inetnum object describes.
• country: Identifies a country, which can be the location of the owner organization or of the server.
• status: The inetnum object has 11 possible states, the most common being:
    ALLOCATED UNSPECIFIED: State of the blocks of addresses for which RIPE NCC is
    administratively responsible.
    ALLOCATED PA: Allocated to members by RIPE NCC.
    ASSIGNED PA: Assigned from a member to an end user.
The IPv6 database follows the same pattern: it contains the inet6num object, which has the same
attributes as an inetnum object.
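To make the attributes above concrete, the following sketch parses the text form of an inetnum object into a dictionary and computes the size of the range. The sample object is invented for illustration (a documentation address range and a made-up netname), the function names are assumptions, and the parser is deliberately simplified: it ignores continuation lines and handles one object at a time.

```python
import ipaddress

# Invented inetnum object in "attribute: value" form, for illustration only.
SAMPLE = """\
inetnum:        192.0.2.0 - 192.0.2.255
netname:        EXAMPLE-NET
country:        PT
status:         ASSIGNED PA
"""


def parse_object(text):
    """Parse 'attribute: value' lines into a dict of attribute -> list of values."""
    attrs = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(("%", "#")):
            continue  # skip blank lines and comment lines
        key, _, value = line.partition(":")
        attrs.setdefault(key.strip(), []).append(value.strip())
    return attrs


def address_range(attrs):
    """Return (first, last, size) for the range in the inetnum attribute."""
    first, last = (ipaddress.ip_address(part.strip())
                   for part in attrs["inetnum"][0].split("-"))
    return first, last, int(last) - int(first) + 1


obj = parse_object(SAMPLE)
print(obj["country"], obj["status"], address_range(obj)[2])
```

Values are kept as lists because an attribute may appear more than once in the same object.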
RIPE NCC is the RIR that supports the infrastructure of the Internet in Europe, the Middle East and
parts of Central Asia. As part of its work, it maintains a public registry that contains information about
that infrastructure, namely the RIPE Database [12] and the RIPE Routing Registry [13]. For example,
the RIPE Database can be queried publicly by search term (IP address, ASN, etc.) to extract the
associated information.
2.6 Web Scraping
Web Scraping is the technique of extracting data from websites and indexing it. It is achieved by means
of a computer, for example a bot or a Web crawler. It is employed mostly to search through volumes
of information and extract only the relevant information. The extracted information can be considered
public, but some websites have terms of use that forbid automatic extraction and take legal action
against it.
There are many libraries available in different programming languages to extract information. One
of those libraries, in Python, is Beautiful Soup [14], which can parse websites into trees and search the
information based on tags, id’s, types, classes, etc.
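The idea of searching a parsed page by tag and class can be sketched as follows. To keep the example self-contained, it uses the html.parser module of the Python standard library as a stand-in for Beautiful Soup (which offers the same kind of search through calls such as find_all). The HTML fragment and the class name "peer" are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download the HTML first.
SAMPLE_HTML = """
<ul>
  <li class="peer">AS64500 Example Network</li>
  <li class="peer">AS64501 Another Network</li>
  <li class="note">Last updated today</li>
</ul>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text of every <li> element that carries a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.inside = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and self.wanted_class in classes:
            self.inside = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.items[-1] += data.strip()


parser = ClassTextExtractor("peer")
parser.feed(SAMPLE_HTML)
print(parser.items)
```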
2.7 Summary
In this chapter we provided background information relevant for this thesis, namely about the relation
between critical sectors and critical eLife sites and the respective activity sectors, DNS and BGP security
vulnerabilities, ASes classification, assignment of IP address blocks and web scraping.
Chapter 3
Related Work
Some of the issues of the present thesis have been previously addressed by several international studies,
namely in France [15] [16], the Netherlands [17], Germany [18] and Lithuania [19] [20] [21].
These studies will be referred to in the following sections as French study, Dutch study, German study
and Lithuanian study, respectively.
3.1 Dutch Study
This study focuses on the discovery and mapping of the Internet entities corresponding to the Dutch
national critical infrastructure. It describes the methodology used to discover the relevant AS numbers and
to determine the relationships between them. It also describes data visualization techniques to create AS
relation graphs.
3.1.1 Autonomous Systems Identification
Two approaches were used to identify ASes using only public information:
• Bottom-up - From the list of all AS numbers allocated to organizations registered in the Nether-
lands, retrieve the ones that are active in critical sectors.
• Top-down - From prominent organizations active in critical sectors, find the respective AS. Native
AS if the organization is the AS owner, Proxy AS if not.
The Bottom-up approach consists of retrieving a list of ASNs from the RIPE NCC stats file, which
is updated every day and contains all the ASNs that were allocated to organizations registered in the
region managed by RIPE. The "NL" and "EU" keywords were used in the search, by comparing them with
the value of the country attribute of each ASN. In the case of the "EU" value of the country attribute, an
additional search was performed on the description and address attributes in order to find further
organizations related to the Netherlands. From the search results, ASes that belong to critical sectors
defined by the Dutch government were identified. Google, RIPEstat [22] and the Dutch Chamber of
Commerce were used to find the organization responsible for each AS. Each organization was then
labeled based on the 12 sectors and 31 goods and services defined by the Dutch Government.
The Top-down approach consists of selecting national organizations that are part of the critical sectors
from multiple sources, for example Wikipedia and Google. For each organization, the corresponding
domain was identified using the KvK website [23]. The A, AAAA and MX records were retrieved from
the DNS records of each organization domain. The dig tool was used to find the IP addresses
corresponding to the DNS records, and RIPEstat was then used to find the corresponding IP prefix and the
originating AS. Finally, the results gathered from the two approaches were combined manually.
3.1.2 Autonomous Systems Characterization
To characterize the ASes, this study determined the AS type and the type of relationship that exists
between them. If a critical organization has its own AS, it is considered a native AS; if the AS
is provided by another organization, it is called a proxy AS. To characterize the AS interconnections,
different information sources were used, namely the Routing Information Service (RIS) [24], Route Views
[25], Route Servers [26] [27] and Looking Glasses [28]. Some services that aggregate the retrieved
information are the University of Washington’s iPlane [29], CAIDA, and UCLA’s Internet Research Lab
[30]. The last two are more related to AS relationships, whereas iPlane is focused on measuring link
performance, for example latency and bandwidth. For this reason, the use of iPlane was dropped. The
UCLA information was chosen because it was more recent. Each connection between ASes was obtained
using UCLA data files, and the transit providers between those connections were also added to obtain a
full mesh relationship. Those providers are called transit ASes.
3.1.3 Data Representation and Analysis
The retrieved data was represented using node graphs for each sector. To build those graphs,
all the AS information was first structured into two JSON files. The first one contains all AS attributes,
for example their type (native or proxy), ASN, country, etc. The second one contains the links between
ASes, each with two attributes: "source" and "target" ASN. jQuery was used to parse those files, and the
Sigma.js [31] library, which provides functionalities such as zoom-in, was used to draw the graphs.
To achieve a good representation, node centrality and distance between nodes were defined. Two
semi circles were created, to differentiate foreign ASes from national ASes. The node position was
calculated by:
x = cos(a) ∗ r (3.1)
y = sin(a) ∗ r (3.2)
To calculate r and a, two different lists of ASes were used, one for each semi-circle, in which ASes are
ordered by the number of direct connections. r is the AS index in the respective list, and a is the angle,
in radians, calculated from r. The parameter a ranges over [π/2, 3π/2] or [3π/2, 5π/2], depending on the
list, so nodes from different lists do not overlap each other. For better visualization, nodes and links are
colored to differentiate types of ASes and connections, and each node shows the flag of the country
where it is located.
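The placement rule of equations 3.1 and 3.2 can be sketched as follows. The Dutch study's exact formula for deriving a from r is not given here, so this sketch ASSUMES that the angle is spread evenly over the semi-circle interval by list position and that the index r starts at 1. The AS numbers are hypothetical placeholders.

```python
import math


def node_positions(ases, start, end):
    """Place ASes (already ordered by number of direct connections).

    r is the 1-based index of the AS in its list (assumed), and the angle a
    is ASSUMED to be spread evenly over [start, end] by list position.
    """
    positions = {}
    count = len(ases)
    for r, asn in enumerate(ases, start=1):
        fraction = (r - 1) / (count - 1) if count > 1 else 0.5
        a = start + fraction * (end - start)
        positions[asn] = (math.cos(a) * r, math.sin(a) * r)  # eq. 3.1 and 3.2
    return positions


# Hypothetical AS numbers, one list per semi-circle.
first_list = node_positions([64500, 64501, 64502], math.pi / 2, 3 * math.pi / 2)
second_list = node_positions([64510, 64511, 64512], 3 * math.pi / 2, 5 * math.pi / 2)
print(first_list)
print(second_list)
```

With these intervals, the first list falls on the half-plane x ≤ 0 and the second on x ≥ 0, which is what keeps the two groups apart.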
Graphs were analyzed for each sector according to different factors, for example the percentage of
foreign and national ASes, which ones were central organizations, etc. The choices to exclude some
ASes were explained, and also generic information and specific aspects about each sector were re-
vealed.
3.2 French Study
This study provides analyses on the implementation of the BGP and DNS protocols from resilience
indicators, which includes ASes identification and characterization.
3.2.1 Autonomous Systems Identification
The methodology used consists in the identification of ASes, that fulfill the requisites to be considered a
French AS, using the following criteria:
• The description of an AS in whois from the RIPE NCC has keywords related to France;
• More than 75% of IP Addresses allocated to the AS are localized in France by GeoIP;
• The description of an AS in the RIPE NCC database contains keywords related to operators that
are in a list declared by the ”Autorite de regulation des communications electroniques et des
postes” (ARCEP);
• The AS organization, in the RIPE NCC database, has an address located in France;
• The AS administrators, in the RIPE NCC database, have an address located in France;
• The AS is part of the 34 French ASes manually identified by members of the observatory;
• The AS is directly connected to one of the 34 French ASes manually identified by the members of the
observatory;
• Its ASN is assigned by RIPE NCC.
Four extractions were made during 2013 (March, June, September and December), and their results
were combined to obtain a list of French ASes.
3.2.2 Autonomous Systems Characterization
Only the connections between ASes were characterized, assuming that there is a direct connection
between two ASes if they appear successively in AS_PATH records. That connection can be
defined as peering or transit. The information was retrieved from RIS and Route Views in the first 5 days
of March, June, September and December of 2013.
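The adjacency assumption above can be sketched in a few lines. The sample paths and AS numbers are hypothetical, and the sketch assumes AS paths written as plain space-separated ASNs (no AS sets); repeated ASNs caused by path prepending are ignored.

```python
def links_from_as_paths(as_paths):
    """Return the set of undirected AS links implied by successive ASes."""
    links = set()
    for path in as_paths:
        asns = [int(token) for token in path.split()]
        for left, right in zip(asns, asns[1:]):
            if left != right:  # a repeated ASN is prepending, not a link
                links.add(frozenset((left, right)))
    return links


# Hypothetical AS paths; the second one contains prepending (64501 twice).
SAMPLE_PATHS = ["64500 64501 64502", "64510 64501 64501 64502"]
links = links_from_as_paths(SAMPLE_PATHS)
print(sorted(sorted(link) for link in links))
```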
3.2.3 Data Representation and Analysis
The main focus was the ASes interconnections and their connectivity status. Graphs were made with
the objective of summarizing the following topics:
• Number of French ASes that were in BGP files during 2013.
• The number of French ASes and foreign ASes important to French Internet, on IPv4 and IPv6.
• French IPv4 connectivity between French ASes, French critical ASes and foreign critical ASes.
• The number of French critical ASes and foreign critical ASes, on IPv4 and IPv6.
• The impact of the disappearance of critical ASes, on IPv4 and IPv6.
3.2.4 DNS Resilience
To measure the DNS resilience, the French study used a tool called DNSwitness [32], which is divided
into two modules: DNSdelve, which takes a list of domains as input and queries various items, for
example all the IP addresses of the name servers, and DNSmezzo, which runs in a DNS probe and parses
all the data that passes through the name server. The DNS resilience indicators were the dispersion of
authoritative DNS servers and inbound email relays, the percentage of zones delegated by the .fr top-level
domain (TLD) that use DNSSEC, and the percentage of zones delegated by the .fr TLD that have at least
one server with IPv6 compatibility. These tools require internal access to DNS servers.
3.2.5 BGP Resilience
In the French study, BGP resilience of the French Internet was measured by studying the development
of national ASes and foreign ASes in the convex hull of the French Internet, both critical and
non-critical, and the impact that their disappearance would have on the connectivity of other ASes.
Since the summer of 2014, the Observatory has analyzed real-time BGP advertisements to identify
conflicts that may indicate traffic redirection. When there is a conflict, traceroute is executed from a node
of the RIPE Atlas project to the conflicting IP prefix. The IPs resulting from traceroute are mapped to ASNs
and then compared with the AS_PATH of the conflicting BGP advertisement. The resulting information was
subsequently aggregated into an annual report containing the number of conflicts, prefix hijacks, and
the objectives of those attacks.
In this study, it is claimed that, for each prefix advertised on the Internet, a route object, which
contains routing information for IP address space resources, should be declared, because it improves BGP
resilience. The study also obtained statistics about the number of unused route objects, prefixes
covered by route objects, and prefixes not covered by route objects. Finally, it calculated the
percentage of IPv4 and IPv6 addresses managed by French ASes that are covered by Route Origin
Authorizations (ROA) declared in the RIPE NCC repository of the Resource Public Key Infrastructure (RPKI).
The declaration of ROAs in the RPKI guarantees the authenticity of IP resources.
3.3 German Study
This study describes a methodology to derive a country-centric view of the Internet structure, which
was applied to Germany with the objective of generating, visualizing, and analyzing the structure of
communication flows between relevant public and business sectors through the identification of ASes with
national relevance. It also describes an implementation of AS graph visualization and its evaluation using
common graph metrics.
3.3.1 Autonomous Systems Identification
The identification of ASes that had a relevant role in Germany was done, first, by extracting all inetnum
objects from the RIPE DB whose country attribute has the value "DE" or "EU". For the objects
with the "EU" value, the address data of the associated admin-c and org objects were also retrieved and
compared with a list that contains keywords such as country codes, local city names, and international
dialing codes that could be related to Germany. For each IP block allocated to German organizations, the
longest covering IP prefix was determined by querying the RIPE DB and looking for the route
object. Using the RIPE DB, data from Team Cymru [33] and RRC12 [34] of the RIPE RIS, the prefixes
were then mapped to their origin ASN through the route object. Various sources of information were
used because, with the RIPE DB alone, several mappings would remain unresolved.
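The prefix-to-origin mapping can be illustrated with the ipaddress module of the Python standard library. This is only a sketch: the route table is hypothetical (documentation address ranges and private-use ASNs), and "longest covering prefix" is interpreted here as the most specific registered route that fully contains the block.

```python
import ipaddress

# Hypothetical route objects: prefix -> origin ASN.
ROUTES = {
    "198.51.100.0/24": 64500,
    "198.51.100.0/25": 64501,
    "203.0.113.0/24": 64502,
}


def longest_covering_route(block, routes):
    """Return (prefix, origin ASN) of the most specific route covering block."""
    target = ipaddress.ip_network(block)
    best = None
    for prefix, origin in routes.items():
        network = ipaddress.ip_network(prefix)
        if network.version == target.version and target.subnet_of(network):
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, origin)
    return best  # None when no route covers the block


print(longest_covering_route("198.51.100.64/26", ROUTES))
print(longest_covering_route("198.51.100.128/26", ROUTES))
```

A block with no covering route returns None, which corresponds to the unresolved mappings that motivated the use of several data sources.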
3.3.2 Autonomous Systems Characterization
ASes were classified in a topological hierarchy (tier1, large and small ISP, and stub) and by their role in
different sectors, defined by the Federal Office for Information Security (BSI), relevant in Germany. ASes
were divided into sectors using an optimized and manual keyword search over the names, descriptions
and address fields of the respective ASes. Afterwards, interconnections were found based on a
weighted next hop matrix from the NECLab topology project [35], which uses updated measurements of
UCLA data. The last step also made it possible to find other ASes that were relevant to the country as
transit nodes between important ASes.
3.3.3 Data Representation and Analysis
Three different types of metrics (Betweenness, degree distribution and distance distribution) and the
peering behavior were explained.
Betweenness quantifies the number of shortest paths passing through a node m. The equation to
calculate the betweenness is:
$$B(m) = \sum_{i \neq m \neq j,\; i \neq j} \frac{B(i, m, j)}{B(i, j)} \qquad (3.3)$$
where B(i, j) is the total number of shortest paths between i and j, and B(i, m, j) is the number of
those paths that also pass through m. Using this calculation, it is possible to infer the
importance of the node m in the connection between ASes.
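To make the betweenness of equation 3.3 concrete, the sketch below evaluates it by brute force on a small unweighted graph. It is not the German study's implementation: the adjacency-list graph is hypothetical, and the sum is ASSUMED to run over ordered pairs (i, j), so each unordered pair is counted twice.

```python
from collections import deque
from itertools import permutations


def shortest_path_counts(graph, source):
    """BFS from source: hop distance and number of shortest paths per node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                count[neighbor] = 0
                queue.append(neighbor)
            if dist[neighbor] == dist[node] + 1:
                count[neighbor] += count[node]
    return dist, count


def betweenness(graph, m):
    """Sum of B(i, m, j) / B(i, j) over ordered pairs (i, j) that exclude m."""
    info = {node: shortest_path_counts(graph, node) for node in graph}
    dist_m, count_m = info[m]
    total = 0.0
    for i, j in permutations(graph, 2):
        if m in (i, j):
            continue
        dist_i, count_i = info[i]
        if j not in dist_i or m not in dist_i or j not in dist_m:
            continue  # i, m and j are not all mutually reachable
        if dist_i[m] + dist_m[j] == dist_i[j]:  # m lies on a shortest path
            total += count_i[m] * count_m[j] / count_i[j]
    return total


# Hypothetical 4-node ring A-B-C-D-A.
SQUARE = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "A"]}
print(betweenness(SQUARE, "B"))
```

In the ring, half of the shortest paths between A and C pass through B, in each direction, which is the only contribution to B's betweenness.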
Degree distribution is the number of one-hop neighbors of each node. With this measure, it was
possible to draw graphs that compare the in-degree distribution of sectoral ASes with the in-degree of
the full German Internet.
Distance distribution describes routing performance and frequently follows a Gaussian law truncated
to sensitive values. It measures the probability that two nodes of a network, randomly selected and
using shortest paths, are k distant from each other.
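Both distributions can be computed with a few lines of standard-library Python. The sketch below is only an illustration on a hypothetical undirected chain of four ASes; the in-degree variant mentioned above would need a directed graph, which is not modeled here.

```python
from collections import Counter, deque


def degree_distribution(graph):
    """Map each degree value to the fraction of nodes that have it."""
    degrees = Counter(len(neighbors) for neighbors in graph.values())
    return {degree: n / len(graph) for degree, n in sorted(degrees.items())}


def distance_distribution(graph):
    """Map each hop distance k to the fraction of connected node pairs at k."""
    distances = Counter()
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest hop counts
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        distances.update(d for d in dist.values() if d > 0)
    total = sum(distances.values())
    return {k: n / total for k, n in sorted(distances.items())}


# Hypothetical chain of four ASes: 64500 - 64501 - 64502 - 64503.
CHAIN = {64500: [64501], 64501: [64500, 64502], 64502: [64501, 64503], 64503: [64502]}
degree_dist = degree_distribution(CHAIN)
distance_dist = distance_distribution(CHAIN)
print(degree_dist, distance_dist)
```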
The peering behavior was analyzed to discover how likely a member of a specific sector is to choose
its upstream peer depending on the sector it communicates with. To analyze this behavior, the number
of different upstream peers was counted relative to the overall number of paths towards members
of different sectors, and the relative frequency of the corresponding diversity classes over all members of
a sector was also quantified.
3.4 Lithuanian Study
This study focuses on the creation of criteria and a methodology for the identification and emulation of
critical objects of the Lithuanian Internet infrastructure, with the objective of monitoring the operational
reliability of critical objects.
3.4.1 Critical Sites & Autonomous Systems Identification
In this study, critical sites are divided into sectors of activity, but how they were identified is not
explained. To identify the ASes, a query with the URL is first made to DNS servers and the IP is obtained
from the answer; then, using RIPE, the route object is obtained and analyzed to find the responsible AS.
The upstream AS of each AS responsible for a critical site is also discovered using BGP route information.
3.4.2 Monitoring
The monitoring of critical sites and ASes is essential for a quick response to failures and errors
in connections; it also makes it possible to acquire data for statistical studies.
A monitoring system was developed in [19] [20] [21] with the objective of creating new criteria and a
methodology for the identification and emulation of critical objects of the Lithuanian Internet
infrastructure, as well as a monitoring methodology and a theoretical model for the operational reliability
of critical objects. The system collects data automatically and performs real-time monitoring of
Lithuania’s critical Internet domains. To collect data, every day, for each domain, the IP was found, and
afterwards data from the RIPE DB was retrieved and stored in a local DB. The attributes stored were the
following:
• Inetnum
• Route, Route Path
• ASN
• DNS NS
Figure 3.1: Lithuanian Monitoring System
The next step is the live monitoring; for this purpose, the study used information based on the BGP
and ICMP protocols and on HTTP response status codes. To get updates about BGP
routes, BGP information was accessed from inside the network, which means it is not public information.
If a critical route was updated, an alarm was triggered in the monitor interface. This part is outside
the scope of our work because only public information will be used. Currently the real-time monitoring
system is called MAPPI [21] and consists of a visual network topology analyzer that collects data
automatically, generates reports and alarms, gathers statistics and compares data, and helps in network
troubleshooting. This work represents what we want in terms of monitoring critical sites and ASes, the
only difference being that we can only use public information. Figure 3.1 shows the structure of this
system.
3.5 Summary
Table 3.1 summarizes the coverage of the issues addressed in the proposed thesis by the studies
considered.
                                                      Dutch Study   German Study   French report   Lithuanian Study
Methodology to identify critical sites/organizations  ✗             ✗              ✗               ✗
Critical sectors                                      ✓             ✓              ✗               ✗
Methodology to identify ASes                          ✓             ✓              ✓               ✗
Use RIPE tools                                        ✓             ✓              ✓               ✓
Other tools                                           ✓             ✓              ✓               ✓
ASes topology representation                          ✓             ✓              ✓               ✓
Equation to calculate distance between ASes           ✓             ✓              ✗               ✗
Live Monitoring                                       ✗             ✗              ✓               ✓
Use only public information                           ✓             ✗              ✗               ✗
Table 3.1: Comparison between what we need with past works
Chapter 4
Identification and Communications
Resilience Assessment
This chapter describes the methodologies and results obtained in the identification and
characterization of critical systems. Sections 4.1 and 4.2 describe the methodologies and results obtained
in the identification and classification of critical eLife sites, respectively. Sections 4.3, 4.4 and 4.5 describe
the methodologies and results obtained in the identification and classification of critical ASes.
4.1 Identification and Communications Resilience Assessment
This section describes the methodologies and results obtained in the identification of critical eLife sites.
4.1.1 Methodology
We are interested in identifying sites that provide services to the general public that in case of disruption
affect the good functioning of society, i.e., critical eLife sites. We further classify the critical eLife sites
according to the sector of activity they relate to. For this purpose, we used the classification of activity
sectors of ENISA described in section 2.1.
In order to build a list of the main Portuguese critical eLife sites, we searched the Internet
for the main Portuguese organizations in each activity sector. We searched for entities using search
engines with keywords related to each activity sector, for example, the keyword ".gov" for the civil
administration sector, or the keyword "banco" for the financial sector. Also, to complement our search, we used
documents from official agencies that contain lists of entities and their respective sites, for example,
documents from ANACOM [36] and ERSE [37] [38].
To finish the identification of critical eLife sites, we analyzed the sites of every identified organization
to assess the importance of the services offered to the user, and then retained only the most important
ones. For example, we retained sites that contain client portals that replace the need to visit
a physical site, or that contain important information for citizens, and discarded sites that do not offer
critical services, even if they belong to companies that offer critical services through other means, for
example, a public water company that is responsible for providing water to the population but whose
site only contains non-critical information.
Sectors                   Number of identified critical eLife sites
ICT                       6
Transport                 8
Energy                    4
Water                     1
Health                    7
Food                      1
Financial Services        25
Public order and safety   5
Industry                  0
Civil Administration      22
Civil Protection          3
Environment               1
Total                     83
Table 4.1: Critical eLife sites identification summary
4.1.2 Results
Table 4.1 summarizes the results. As can be seen, some sectors do not contain any critical eLife
sites. This occurs because, as explained in section 2.1 these sectors were used to assess critical
infrastructures, which means, some of them do not make sense in our work. Appendix A contains all the
identified sites.
4.2 Characterization of Critical Sites
This section describes the sources of information available, the methodology used, and the results
obtained in the characterization of critical eLife sites.
4.2.1 Sources of Information
The public information used in the characterization of a site can be retrieved by a variety of means. We
found the following sources of public information:
• Networking Tools - Tools such as dig to retrieve information on DNS servers, or nslookup to
obtain the IP addresses of a site.
• Python Libraries - For example the socket library contains functions to test open ports, for in-
stance, port 443 to check for the use of https.
• Authoritative Sites - The different RIRs, namely RIPE, maintain databases that can be queried
by URL and return answers in JSON format, containing, for example, the ASN responsible for a given
IP.
• Non-Authoritative Sites - The least trustworthy option, as we cannot be sure that the information
is truthful. It is also the most difficult option from which to retrieve information, because
most of these sites do not offer an API for querying.
We decided to use a combination of all the options, as we could not retrieve all information using
a single one. Obviously, the non-authoritative sites option is the one that we try to avoid, in order to
guarantee information authenticity.
4.2.2 Methodology
One of our goals is to characterize the identified sites in terms of connectivity using only public
information. To achieve this goal, the following parameters were selected:
• IP Address List - List of IPv4 or IPv6 addresses associated to the site in DNS records. This list is
obtained using nslookup, which is a networking tool that queries DNS servers to obtain information.
• IP Prefix List - List of registered IP prefixes that contain the addresses of IP Address List. This
list is obtained from a query to the RIPE-DB [12].
• Responsible Organizations - Organizations that own the IP addresses in the IP Address List.
This usually is only one organization. The organizations are obtained from a query to the RIPE-
DB.
• ASN - The ASN is the identification of the AS to which the site is attached. The ASN is obtained
from a query to the RIPE-DB.
• HTTPS/HTTP - Indicates if a site uses HTTP or HTTPS. Note that sometimes the initial page of
a site uses HTTP and the portal inside it uses HTTPS. This information is obtained via a Python
library, socket, that tries to connect to port 443, which is assigned to HTTPS. The use of HTTPS
is important to protect the information exchanged between the user and the site, for example, in a
bank account information exchange. On the contrary, if it uses HTTP, the information exchanged
is not encrypted and can be used in a malicious way.
• HTTP Response Status Code - HTTP code returned after probing the site. This code is important
to check the state of the site, for example if the site redirects the user to another page, or if the site
is down. There are 5 standard classes:
– 1xx - Informational
– 2xx - Success
– 3xx - Redirection
– 4xx - Client Error
– 5xx - Server Error
This probe is done using a Python library, httplib, which allows requesting the HEADER of the
initial page of the site, and from this response the status code is obtained.
• DNS Servers - List of DNS servers of the domain to which the site belongs. Attacking these
servers may compromise the connectivity with the site. To obtain the DNS Servers we use the tool
dig, which is a networking tool that queries DNS Servers to obtain information.
• DNSSEC - Indicates if the site domain uses DNSSEC. The DNSSEC system provides a more
secure resolution of names, because it uses a PKI and digital signatures to verify DNS query
response integrity. The information about the use of DNSSEC is obtained from the ViewDNS [39] site.
4.2.3 Results
Using the method explained above, we present in Table 4.2 some real examples. The importance of the
table entries and the method to retrieve them are explained in section 4.2.2.
Site                      sns.gov.pt                  nos.pt
IP List                   217.172.189.86              212.113.183.252
                          188.138.57.39
IP prefixes               217.172.189.0/24            212.113.160.0/19
                          188.138.57.0/24
ASN                       8972                        2860
Responsible Organization  PLUSSERVER-AS               NOS COMUNICACOES
DNS Servers               ns2.nameserverservice.de.   ns2.novis.pt.
                          ns1.nameserverservice.de.   ns1.novis.pt.
DNSSEC                    False                       False
HTTP/HTTPS                HTTPS                       HTTPS
HTTP Code                 301                         301
Table 4.2: Characterization example
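The HTTPS and status-code probes described in section 4.2.2 can be sketched as follows. This is not the script used in this work: it is a minimal Python 3 illustration in which http.client plays the role of httplib (the Python 2 name of the same module), and the function names and timeout value are our own assumptions. The two probing functions need network access, so only the pure classification helper is exercised here.

```python
import http.client
import socket

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map an HTTP status code to one of the five standard classes."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


def accepts_https(host, timeout=5.0):
    """Return True if a TCP connection to port 443 of host succeeds."""
    try:
        with socket.create_connection((host, 443), timeout=timeout):
            return True
    except OSError:
        return False


def head_status(host, use_https, timeout=5.0):
    """Send a HEAD request for the initial page and return the status code."""
    if use_https:
        connection = http.client.HTTPSConnection(host, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(host, timeout=timeout)
    try:
        connection.request("HEAD", "/")
        return connection.getresponse().status
    finally:
        connection.close()


print(status_class(301))
```

For a given site, calling head_status(site, accepts_https(site)) would yield the values reported in the HTTP/HTTPS and HTTP Code rows of the table.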
4.3 Identification of ASes
This section describes the methodologies and results obtained in the identification of critical ASes.
4.3.1 Methodology
The ASes supporting the connectivity to eLife critical sites were divided into three classes:
• Service ASes - This is the AS where the critical eLife site is located. This AS is identified using
the methodology described in section 4.2.2.
• ISP ASes - These are the ASes where the users of the critical eLife sites are located, i.e., those
of the ISPs of the Portuguese Internet users. We chose to analyze MEO, NOS and Vodafone, as
they provide mobile and fixed access to 95% of the national Internet subscribers [40].
• Transit ASes - These ASes provide the connectivity between the two types of ASes above when there
is no direct connection. The identification of these ASes is explained in section 4.4.
Figure 4.1: AS identification on RIB example
Figure 4.2: Valley-free example
4.4 ASes Interconnection System
For the purpose of evaluating the resilience of connectivity between users and critical eLife sites, we will
analyze the paths between Service and ISP ASes in the graph that represents the adjacencies between
Autonomous Systems in the Internet.
4.4.1 Paths Identification
To identify paths between ASes, our first approach was to analyze BGP updates from public routers that
are directly connected to higher tier ASes, in order to have a global view of the connectivity between
ASes. This information is obtained from the RouteViews [25] project, which was conceived for Internet
operators to obtain real-time information about the global routing system and which provides files with
live raw data.
The project provides two types of files: RIB files, which are updated every 2 hours, and BGP files, which
are updated every 15 minutes. Both types of files contain BGP updates; the difference is that BGP files
contain only the updates within the file's temporal window, whereas RIB files contain the most recent BGP
update for every prefix learned by the selected router.
We concluded that it was more efficient to use the RIB file information, as it contains all advertised
paths to the learned prefixes and if a path is not updated in a very long time, we would never see it using
BGP files.
For each pair of (Service AS, ISP AS) and (Service AS, Service AS) we identified the paths described
in terms of Transit ASes. To identify these paths, a Python script was developed to analyze the RIBs data
from the six main routers available in RouteViews, and filter the paths that contained the pairs described
before. For each path the intermediary ASes, or by our definition, Transit ASes were identified. Figure
4.1 represents an example of a RIB entry.
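The filtering step can be illustrated with the following sketch, which is not the script developed in this work. It ASSUMES that the AS path of a RIB entry is given as space-separated ASNs ordered from the collector's peer (left) to the AS that originates the prefix (right), and that the Service AS is the originating AS. The AS numbers are hypothetical.

```python
def transit_between(as_path, source_as, service_as):
    """Return the ASes between source_as and the origin service_as, or None.

    None means that the path does not connect the requested pair.
    """
    asns = []
    for token in as_path.split():
        asn = int(token)
        if not asns or asns[-1] != asn:  # collapse AS-path prepending
            asns.append(asn)
    if not asns or asns[-1] != service_as or source_as not in asns[:-1]:
        return None
    start = asns.index(source_as)
    return asns[start + 1:-1]


# Hypothetical RIB path: peer 64496, ISP AS 64500, one transit AS, origin 64502.
print(transit_between("64496 64500 64501 64502 64502", 64500, 64502))
```

An empty list means that the two ASes are directly connected in that path.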
This method allows identification of current paths. However, as we also want to identify possible
paths that can be used as alternative in case of connectivity disruption, this method was not suitable for
our work.
4.4.2 Valid Paths Identification
To discover all alternative paths to the ones currently used, we needed to build an AS network graph
and define a valid path pattern, as without path validation, there would be infinite available paths.
ASes Dataset
To create an AS network graph, we needed an AS dataset that contains information on all ASes and
on the connections between them. As described in section 3.1.2, CAIDA provides datasets [41] that
contain all direct connections between ASes and the respective relationships. There are two types of
datasets, both updated monthly:
• Serial-1 - Contains inferred relationships from RouteViews BGP tables snapshots taken at 8-hour
interval for a 5-day period for each month.
• Serial-2 - Contains Serial-1 data, plus inferred relationships from BGP communities collected from
IX Looking Glass servers that contain global routing information and from traceroute output col-
lected from CAIDA’s ark monitors, both collected on the same day.
We chose to use Serial-2 data as it is more complete. The inferred relationships are not 100% certain,
as the dataset can contain false relationships, but this was the only way that we could obtain AS
relationships using only public information.
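Reading such a dataset into a relationship dictionary can be sketched as follows. The sketch assumes the pipe-separated layout described in the header comments of the CAIDA files, in which a relationship code of -1 means that the first AS is a provider of the second and 0 means that the two ASes are peers. The sample lines, AS numbers and the labels c2p, p2c and p2p are illustrative choices.

```python
def read_relationships(lines):
    """Build {asn: {neighbor: relation}}, relation in 'c2p', 'p2c', 'p2p'."""
    relations = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("|")
        as1, as2, code = int(fields[0]), int(fields[1]), int(fields[2])
        if code == -1:  # as1 is a provider of as2
            relations.setdefault(as1, {})[as2] = "p2c"
            relations.setdefault(as2, {})[as1] = "c2p"
        elif code == 0:  # as1 and as2 are peers
            relations.setdefault(as1, {})[as2] = "p2p"
            relations.setdefault(as2, {})[as1] = "p2p"
    return relations


# Hypothetical sample in the assumed layout (fourth field: inference source).
SAMPLE_LINES = [
    "# sample header comment",
    "64500|64501|-1|bgp",
    "64501|64502|0|bgp",
]
relations = read_relationships(SAMPLE_LINES)
print(relations)
```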
From the chosen dataset, we also verified that there was not any relationship between Portuguese
ISP ASes. However, if we did a traceroute, from one to another, there was a direct connection and this
implies the existence of a relationship between them. This happens because the Internet Exchange
Point (IX) in Portugal, GigaPIX, provides a peering connection between certain ASes, and these
connections are not announced in BGP updates. There are many IXs around the world, but we only take
GigaPIX into account.
We added this IX to the network graph to have a complete view of the network. We consider the links
that pass through GigaPIX as neutral links without any kind of relationship, as they are not announced
in BGP updates. To retrieve this information we parse a site [42] that contains all GigaPIX peers.
Valid Path Pattern
To find valid paths in the network graph we used the valley-free rule. The valley-free rule defines routing
paths patterns that allow ASes to minimize their routing monetary cost.
The validation of paths is made according to the relationships between ASes that are directly con-
nected. It considers money transfers between customers and providers, which means that for every
transit provider there is a payee. Without this rule, every path in the network graph would be valid, which
is unrealistic. A valid path must follow the following pattern:
• 0 or more customer to provider links, followed by
• 0 or 1 peering links, followed by
• 0 or more provider to customer links.
Figure 4.2 gives an example of valid and invalid paths. The arrow direction in the figure defines a
customer to provider relation. Connections without arrows are peering relations.
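The valley-free pattern can be checked with a small state machine, sketched below. This is an illustration rather than the validation code of this work: the relationship dictionary and AS numbers are hypothetical, and links with no known relationship are ASSUMED to make a path invalid.

```python
def is_valley_free(path, relations):
    """Check the pattern: c2p links, then at most one p2p link, then p2c links."""
    stage = 0  # 0: climbing (c2p), 1: after the peering link, 2: descending (p2c)
    for left, right in zip(path, path[1:]):
        relation = relations.get(left, {}).get(right)
        if relation == "c2p":
            if stage != 0:
                return False  # climbing again after a peer or downhill link
        elif relation == "p2p":
            if stage != 0:
                return False  # second peering link, or peering after downhill
            stage = 1
        elif relation == "p2c":
            stage = 2
        else:
            return False  # unknown link (assumption: treated as invalid)
    return True


# Hypothetical relationships: relations[a][b] is the role of link a -> b.
RELATIONS = {
    64500: {64510: "c2p", 64512: "c2p"},
    64510: {64500: "p2c", 64511: "p2p"},
    64511: {64510: "p2p", 64501: "p2c"},
    64501: {64511: "c2p"},
    64512: {64500: "p2c"},
}
print(is_valley_free([64500, 64510, 64511, 64501], RELATIONS))
print(is_valley_free([64510, 64500, 64512], RELATIONS))
```

The first path climbs, crosses one peering link and descends, so it is valid; the second descends to a customer and then climbs again, which forms a valley.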
Paths Identification and Validation Algorithm
To validate paths, we developed a Python script that reads the data from the CAIDA dataset and builds a
graph, using the NetworkX library, where it is possible to identify paths between ASes and then validate
those paths.
We implemented two standard algorithms, Dijkstra’s algorithm and Yen’s k shortest paths algorithm
[43], for the discovery of paths between ASes. The latter seemed more reasonable, as
Dijkstra’s algorithm returns all possible paths, and the validation of all the paths would require a higher
computational time. The shortcoming of the k shortest paths algorithm is that we need to choose a
high enough k so that we do not miss any valid path.
Doing some more research about these algorithms, we found that the NetworkX library already has
algorithms to find paths between nodes. We compared our algorithms to the ones from NetworkX, and
the ones from NetworkX were faster, so we ended up using them. The algorithm to identify and validate
paths can be described in the following steps:
1. Create a list with Service ASes and ISP ASes.
2. Build a dictionary, from the ASes relationships dataset, where each AS entry contains its neighbors
and their corresponding relationship. This includes all ASes, not only Service and ISP ASes.
3. Build the network graph according to the dictionary.
4. For each Service AS, discover the k=100 shortest paths from ISP ASes and Service ASes. We chose k=100 because, from our research, we consider it a good trade-off between computational complexity and the number of valid paths identified.
5. For each path discovered check its validity according to Valley-Free rule using the dictionary from
step 2.
6. For each valid path, identify the Transit ASes and add them to the list of ASes created in step 1, which contains Service and ISP ASes.
7. Save the resulting ASes and respective paths in a JSON file, as shown in Figure 4.3.
The algorithm pseudocode is stated in Algorithm 1.
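Steps 2 to 5 can be sketched with NetworkX as follows. This is an illustrative sketch under stated assumptions rather than the script used in this work: the relationship records are made up and use AS numbers reserved for documentation, and the helper names are hypothetical. The only library call relied on is nx.shortest_simple_paths, which yields simple paths from shortest to longest.

```python
from itertools import islice

import networkx as nx

# Hypothetical records: (AS a, AS b, relationship of a towards b).
RELATIONS = [
    (64500, 64501, "c2p"),  # ISP 64500 is a customer of 64501
    (64501, 64502, "p2p"),
    (64502, 64503, "p2c"),  # 64502 is a provider of Service AS 64503
    (64504, 64500, "c2p"),  # 64504 is a customer of both end points
    (64504, 64503, "c2p"),
]
REVERSE = {"c2p": "p2c", "p2c": "c2p", "p2p": "p2p"}
ORDER = {"c2p": 0, "p2p": 1, "p2c": 2}


def build_graph(relations):
    """Build the undirected AS graph and a lookup of directed relationships."""
    graph, rel = nx.Graph(), {}
    for a, b, kind in relations:
        graph.add_edge(a, b)
        rel[(a, b)] = kind
        rel[(b, a)] = REVERSE[kind]
    return graph, rel


def valley_free(path, rel):
    """Check the valley-free pattern along consecutive ASes of a path."""
    phase, peering_links = 0, 0
    for a, b in zip(path, path[1:]):
        kind = rel[(a, b)]
        if ORDER[kind] < phase:
            return False
        if kind == "p2p":
            peering_links += 1
            if peering_links > 1:
                return False
        phase = ORDER[kind]
    return True


def valid_paths(graph, rel, source, target, k=100):
    """Take the k shortest simple paths and keep the valley-free ones."""
    candidates = islice(nx.shortest_simple_paths(graph, source, target), k)
    return [path for path in candidates if valley_free(path, rel)]
```

In this toy graph, the two-hop path through 64504 is discarded because it descends to a customer and then climbs again, so only the path through 64501 and 64502 remains.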
4.4.3 Paths Characterizations
Algorithm 1 Path identification and validation algorithm
procedure MAIN
    network_graph ← read_relations_for_all_ases()
    ases ← join(isp_ases, service_ases)
    dict_of_paths ← new dict()
    for AS1 in ases do
        for AS2 in ases do
            paths ← k_shortest_path(AS1, AS2, network_graph, 100)
            for path in paths do
                valid ← validate_path(path)
                if valid then
                    append path to dict_of_paths[AS1][AS2]
    return dict_of_paths

procedure VALIDATE_PATH(path)
    status ← 0    ▷ 0 if in c2p, 1 if in p2p, 2 if in p2c
    n_p2p ← 0
    valid ← True
    for connection in path do
        type ← get_connection_type(asn_from, asn_to)
        if (status > 0 and type == c2p) or (status > 1 and type == p2p) or (type == p2p and n_p2p > 0) then
            valid ← False
        if type == p2p then
            status ← 1
            n_p2p += 1
        else if type == p2c then
            status ← 2
    return valid

Figure 4.3: ASes paths JSON file example. Paths to AS2860

As only public information is used, we could not obtain connection characteristics, such as speed or capacity. We decided to focus on theoretical parameters that could be inferred from our identified ASes and their respective paths, as indicators of connectivity resilience. The following characteristics were analyzed:
• Number of paths between ASes - The more disjoint paths there are between ASes, the more
resilient to Transit AS failure the connection will be.
• Number of hops in each path between ASes - The fewer hops there are in a path between ASes, the faster we assume the connectivity to be.
• Customer Cone - The customer cone is the number of ASes an AS can reach using only customer links. We assume that paths which contain Transit ASes with a larger customer cone are more reliable, as an AS with a higher number of customers has a higher number of possible valid paths.
CAIDA has a site where it ranks ASes according to their customer cone; however, there is no API available to retrieve this information. We tried to retrieve it by parsing the site, but this was not possible because the site limits the number of requests. We also tried to connect to the site through different public proxies, but at a certain point the site stopped responding.
Finally, without other resources, we followed the definition of customer cone and calculated it recursively using the AS relationship dataset retrieved before. Comparing our results with those of CAIDA, there were some discrepancies in the ranking, but nothing problematic, as the ASes differ by one or two places, higher or lower. After ranking all Transit ASes, we created a directed weighted graph that contains Service, ISP and Transit ASes, and all the paths between Service and ISP ASes, and among Service ASes. The weight of each connection is the rank of the destination AS.
Using the weighted graph it is possible to discover the shortest path between Service and ISP
ASes, relative to both the number of hops, and the AS rank metrics.
The algorithm pseudocode is stated in Algorithm 2.
Algorithm 2 Customer Cone calculation
customer_ases ← dict()
customer_ases_size ← dict()
procedure MAIN
    for AS in ases do
        getP2C(AS, AS)
        customer_ases_size[AS] ← customer_ases[AS].size()
    return customer_ases_size

procedure GETP2C(AS_origin, AS)
    connections ← getConnections(AS)
    for AS_to in connections do
        if (connections[AS_to] == p2c) and (AS_to not in customer_ases[AS_origin]) then
            append AS_to to customer_ases[AS_origin]
            getP2C(AS_origin, AS_to)
• Betweenness Centrality - Indicates the node centrality in a graph based on shortest paths, in our case, the graph that contains all our identified ASes and their paths. We assume that paths which contain Transit ASes with a higher betweenness centrality are more reliable, as an AS with a higher betweenness centrality has a higher number of possible valid paths.
We created a graph with the identified ASes and the corresponding paths, and calculated the betweenness centrality of each AS, which takes values between 0 and 1. Afterwards, a directed weighted graph was created, where the weight of each connection is calculated with the following equation:
$$\text{weight} = \bigl(\text{betweenness\_centrality}(\text{destination node}) - 1\bigr) \times (-1000) \tag{4.1}$$
As nodes with a higher betweenness centrality are preferred, we created a decreasing function
so the nodes with a higher betweenness centrality have less weight. From the created graph, we
retrieved the shortest path between Service and ISP ASes, that considers not only the number of
hops, but also the betweenness centrality of Transit ASes.
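The two metrics above can be illustrated with a short sketch. It is a simplified illustration, not the code used in this work: the customer cone is computed with an explicit stack instead of recursion, the input mapping from each AS to its direct customers is a hypothetical data layout, and the function names are assumptions. The centrality value passed to the weight function is assumed to have been computed beforehand, for example with a graph library.

```python
def customer_cone(customers_of, asn):
    """Return the set of ASes reachable from `asn` using only p2c links."""
    cone = set()
    stack = [asn]
    while stack:
        current = stack.pop()
        for customer in customers_of.get(current, ()):
            if customer not in cone:  # also protects against relationship cycles
                cone.add(customer)
                stack.append(customer)
    return cone


def centrality_weight(betweenness):
    """Equation 4.1: a centrality in [0, 1] becomes a weight in [0, 1000]."""
    return (betweenness - 1) * (-1000)
```

With Equation 4.1, a centrality of 0 gives the maximum weight of 1000 and a centrality of 1 gives a weight of 0, so a shortest-path search on the weighted graph favours connections towards more central ASes.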
4.5 Characterization of ASes
This section describes the methodologies and results obtained in the characterization of critical ASes.
4.5.1 Methodology
To characterize an AS in terms of connectivity using only public information, we analyzed the following
parameters:
• ASN - Number that identifies the AS;
• Type - Service, ISP or Transit AS;
• Holder - Entity that is responsible for the AS. This information is retrieved from the RIPE DB;
• Holder Country - Country where the AS holder is registered. This information is retrieved from
the RIPE DB;
• Connections - All valid paths from ISP to Service ASes, and between Service ASes;
• Customer Cone - Number of ASes the AS can reach using only customer links;
• Customer Cone Connection - Shortest path that considers the Transit ASes customer cone;
• Betweenness Centrality - Value between 0 and 1 that indicates the betweenness centrality of the
AS;
• Betweenness Centrality Connection - Shortest path that considers the Transit ASes between-
ness centrality.
ASN | 8972 | 8657
Type | Service | Transit
Holder | PLUSSERVER-AS | MEO-INTERNACIONAL
Holder Location | DE | PT
Connections | "3243": [[{"type": "c2p","asn": "3243"},{"type": "c2p","asn": "8657"},{"type": "p2p","asn": "3257"},{"type": "p2c","asn": "3320"}], ... | []
Customer Cone | 4 | 276
Customer Cone Connections | "3243": [{"type": "c2p","asn": "3243"},{"type": "c2p","asn": "8657"},{"type": "p2c","asn": "174"}], ... | []
Betweenness Centrality | 0 | 0.137
Betweenness Centrality Connections | "3243": [{"type": "c2p","asn": "3243"},{"type": "c2p","asn": "8657"},{"type": "p2c","asn": "174"}], ... | []
Table 4.3: AS characterization example
4.5.2 Results
Table 4.3 gives an example of the characterization of two different types of ASes.
4.6 Chapter Summary
The identification of critical eLife sites was based on the identification of online services offered by
entities from different sectors of activity, that were critical to the good functioning of society. These
entities were identified from public lists and from Google search using keywords related to each sector.
The critical eLife sites were characterized in order to assess the user connectivity to the site. A
Python script was developed to retrieve all the information, using OS tools, libraries and web scraping,
from authoritative and non-authoritative sites.
From the characterization of the critical eLife sites it was possible to obtain the ASN responsible for each one. We designate the responsible ASes as Service ASes. To guarantee connectivity from the user, we also identified the ASes that are responsible for most of the users' connectivity to the Internet, which we designate as ISP ASes.
After the identification of Service and ISP ASes we identified the possible paths between Service and ISP ASes and between Service ASes. To identify those paths, we used the CAIDA dataset to obtain the relationships, and the Valley-Free rule to validate the paths. From those paths we identified more critical ASes, the Transit ASes, that are responsible for routing traffic between the other ASes. As our goal is to characterize the connectivity of critical eLife sites, we had to find resilience indicators for these paths using only public information. We characterized every connection between ASes by the number of paths and the number of hops of each path, and discovered the best path according to customer cone and betweenness centrality.
Finally, we characterized the ASes using information obtained from the path identification and information from the RIPE DB.
Chapter 5
The Crisys Monitoring System
This chapter describes the architecture of our monitoring system, called Crisys (Critical Systems). Note that this system can monitor not only critical eLife sites, but any site.
5.1 Functionalities
To develop a system that monitors the critical elements defined in chapter 4, the following types of
functionalities were stipulated:
• Sites monitoring;
• ASes monitoring;
• Sites and ASes availability correlation.
Our system is supposed to be automatic, where the only human interactions are the input of a list of
sites, and the visualization of data retrieved by the system.
5.1.1 Sites Monitoring
In terms of sites monitoring, our system is supposed to:
• Retrieve, process and store information from each site automatically and periodically;
• Evaluate the site connectivity, in terms of RTT, routing, and availability;
• Provide visual information regarding the site connectivity;
• Log any information on site changes.
5.1.2 ASes Monitoring
In terms of ASes monitoring, our system is supposed to:
• Retrieve, process and store information from public sources automatically and periodically;
• Evaluate the AS-level connectivity between pairs of (Service AS, ISP AS) and (Service AS, Service AS) using stipulated metrics;
• Provide visual information regarding the AS-level connectivity;
• Log any information on changes in AS-level connectivity.
Figure 5.1: Summary of the system architecture
5.1.3 Sites and ASes availability correlation
The developed system must have the capability to correlate the monitored sites availability, with the
disruption of the monitored ASes, in order to evaluate the network resilience.
5.2 Architecture
According to the functionalities described in section 5.1, we divided the system into the following modules:
• Sites Monitoring - Retrieves site information from the Internet, processes it and stores it in the Database.
• ASes Monitoring - Retrieves AS information from the Internet, processes it and stores it in the Database.
• Database - Contains all information related to sites and ASes, and the logs with sites and ASes
parameters changes.
• GUI Interface - User interface that allows the display of information contained in the database and
provides useful tools.
Figure 5.1 summarizes our system architecture.
5.3 Implementation
The system architecture comprises a back-end, running on a Linux machine to which users can connect, and a front-end accessed from any web browser. The monitoring system was implemented in Python, for the back-end, and in JavaScript, for the front-end.
Python was chosen for the back-end because it contains libraries that facilitate the extraction of
information from the web, and allows the use of networking tools built in the operating system. The
back-end is developed for Linux. JavaScript was chosen for the front-end because of the portability
between different operating systems as it is used in a web browser. To connect the front-end to the
back-end we used the Django [44] framework which allows the execution of back-end functions from
front-end actions.
The Django framework also implements a database model where entries are Python classes, which
facilitates the interaction between the back-end and the database. We used a SQLite database because
this is the Django standard, and it was sufficient for our purposes. In Django, there are 3 files essential
for the system:
• urls.py - Contains all the URLs that can be used to access the different pages of the front-end. From the URL it is possible to pass parameters to the views.py file. Throughout this section, we will refer to it as the URL file.
• views.py - Contains the back-end code that is triggered by access to the URLs. From here it is possible to access the database or use network tools. Throughout this section, we will refer to it as the Views file.
• models.py - Contains the models defined for the database. Throughout this section, we will refer to it as the Models file.
5.3.1 Database
The database contains three different models that are specified in the Models file, which contain the
following parameters:
• Site (name, domain, ip addresses, asn, holder, http, dns, dnssec, prefixes, rtt, status, http code)
• AS (asn, country, holder, connections, asn type, customer cone, betweenness centrality, cc connections,
bc connections)
• Log (date, time, message)
Aside from Log, most of the parameters of the Site and AS models are described in sections 4.2.2 and 4.5.1, respectively. The parameters of the Site model that are not explained there are status, which states whether the site is up or down, and rtt, which states the round-trip time to the site.
The database is structured in a non-relational way, because an AS can be a Transit AS that is not
related to a certain Site, and also because the Logs are supposed to be permanent, which means that
in case of AS or Site removal, the Log must stay.
To populate the database, we retrieved each parameter of the Site and AS models using the methods described in sections 4.2.2 and 4.5.1, respectively. The Django framework makes it easy to create and access database entries by calling save() on a model instance and objects.get() on a model class. For example, Log(message="new ip").save() inserts a new entry in the database, and Log.objects.get(message="new ip") retrieves that entry from the database.
5.3.2 Monitoring Sites
We divide site monitoring into two types: active monitoring of dynamic parameters and passive monitoring of semi-static parameters.
Monitoring of dynamic parameters
This is done via traceroute, to check if the site is up and to obtain the average round-trip time. If a site
changes its state, down or up, it creates a new log entry. This task is done periodically for every site.
The pseudocode is stated in Algorithm 3.
Algorithm 3 Active Monitoring
procedure CHECK_SITE
    for each site:
        http_status ← get_http_status()
        if http_status in range(400-599) then
            site_status ← down
            log("site down")
        site.rtt ← tcp_traceroute.rtt()
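The logic of Algorithm 3 can be sketched in Python as shown below. This is a hedged illustration, not the Crisys code: the Site class and function names are hypothetical, the HTTP status code and RTT are passed in as already measured values so that the sketch needs no network access, and treating a missing response as "down" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    domain: str
    status: str = "up"
    rtt_ms: Optional[float] = None


def status_from_http(code: Optional[int]) -> str:
    """Map an HTTP status code to a site state (4xx and 5xx mean down)."""
    # Assumption: no response at all (None) is also treated as down.
    if code is None or 400 <= code <= 599:
        return "down"
    return "up"


def check_site(site: Site, http_code: Optional[int],
               rtt_ms: Optional[float], log: List[str]) -> None:
    """Update one site and log only when its state changes."""
    new_status = status_from_http(http_code)
    if new_status != site.status:
        log.append(f"{site.domain} is {new_status}")
    site.status = new_status
    site.rtt_ms = rtt_ms
```

Logging only on a state change matches the behaviour described above, where a new log entry is created when a site goes from up to down or from down to up.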
Monitoring of semi-static parameters
This type of monitoring consists of checking every parameter of every site and, if any change is detected, updating the parameter value in the database. The methods are the same as those used for the initial population of the database, explained in section 4.2. This task is done once a day, as it is network intensive and most of the parameters do not change.
5.3.3 Monitoring ASes
This type of monitoring consists of applying the methods used to populate the database, explained in sections 4.4 and 4.5, to verify changes in AS parameters and create new log entries in case of any change. This task is done once a month, as we are limited by the relationship files from CAIDA. However, the system checks once a day whether there is a new file, because CAIDA does not update its files on the same day of each month.
5.3.4 GUI Interface
The GUI interface is accessed through a web browser. To design the page, we use HTML combined with the Bootstrap library to get a minimalist interface. The implementation is done in JavaScript with the help of some libraries, namely jQuery to make data requests to the back-end, and vis.js to draw dynamic and interactive graphs. The Django framework divides the interface elements into two folders, the static and the templates folder. The static folder contains images, scripts, CSS, and static pages such as the header of every page. The templates folder contains every HTML page that changes its contents according to the query made to the system through the URL file.
Figure 5.2: Sites page
Sites
The Sites page contains a table with every site and the respective parameters. This table is dynamically
updated in case of a parameter change. It is also possible to add or remove sites. From this page, it is
possible to access an individual page of a Site or an AS. Figure 5.2 represents the sites page.
Site individual page
This page contains all site parameters together with some tools to assess the site connectivity, namely:
• Visual traceroute - Consists of a query to the Views file, where the system executes a traceroute using TCP, so that it is possible to identify the IP address of each hop and, for every IP address, find the responsible AS and the location. The back-end returns the list of ASes and the list of locations. From those lists the front-end draws two different paths: one in the form of a graph that contains the ASes involved, and another one laid out on a map with every location and the respective connections between locations;
• Console - Consists of a console where it is possible to execute three commands: Dig, Ping and TCP traceroute. For every command, a query is made to the Views file, which executes the command and returns the system output.
Figure 5.3 represents the individual site page.
Figure 5.3: Individual Site page
Figure 5.4: ASes page
ASes
The ASes page contains a table with every AS and the respective responsible organization and country,
divided by Service and Transit ASes. From this page it is possible to access an AS individual page.
AS individual page
Apart from all the AS parameters, this page contains the list of sites that the AS is responsible for, resilience indicators, and four graphs that represent the connections between the ISP ASes and the respective Service AS. It is also possible to see the connections between other Service ASes and the respective Service AS. The four types of graphs are the following:
• All Connections - Contains all connections between the selected ASes;
• Betweenness Centrality - Contains the shortest path connection between the selected ASes according to the betweenness centrality metric;
• Customer Cone - Contains the shortest path connection between the selected ASes according to the customer cone metric;
• Shortest Path - Contains the shortest path connection between the selected ASes.
These metrics are explained in section 4.4.3. Figures 5.5 and 5.6 represent the AS individual page.
Figure 5.5: AS Individual page - part 1
Figure 5.6: AS Individual page - part 2
Figure 5.7: Interactive Graph
Interactive Graph
This page contains a graph with all connections from ISP ASes to Service ASes, possessing the follow-
ing functionalities:
• Remove Transit AS;
• Remove all foreign Transit ASes;
• Step back;
• Step forward;
• Restart.
It also contains a table with the number of connections between ISP and Service ASes. With the
combination of the graph and the table, it is possible to assess the impact caused by the disruption of a
Transit AS. Figure 5.7 represents the interactive graph page.
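The effect of removing Transit ASes on the table of connections can be sketched as follows. This is an illustration under stated assumptions, not the front-end code: the paths, AS numbers (taken from the range reserved for documentation) and country mapping are made up, and the function names are hypothetical.

```python
def path_counts(paths_by_pair, removed=()):
    """Count, per (ISP AS, Service AS) pair, the paths that avoid removed ASes."""
    removed = set(removed)
    return {
        pair: sum(1 for path in paths if removed.isdisjoint(path[1:-1]))
        for pair, paths in paths_by_pair.items()
    }


def foreign_transit_ases(paths_by_pair, country_of, home="PT"):
    """Collect Transit ASes (interior nodes of paths) not registered in `home`."""
    transit = {
        asn
        for paths in paths_by_pair.values()
        for path in paths
        for asn in path[1:-1]
    }
    # Assumption: an AS with unknown country is treated as foreign.
    return {asn for asn in transit if country_of.get(asn) != home}


# Hypothetical data: two paths from ISP AS 64500 to Service AS 64503.
PATHS = {(64500, 64503): [[64500, 64501, 64503], [64500, 64502, 64503]]}
COUNTRY = {64501: "PT", 64502: "DE"}
```

Comparing path_counts(PATHS) with path_counts(PATHS, foreign_transit_ases(PATHS, COUNTRY)) gives the number of paths before and after the removal of all foreign Transit ASes for each pair.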
Logs
The logs are shown in a box that is present on every page and contains all the log entries. In case of a new log entry, a notification is displayed to the user. The log is updated dynamically, which means that the user does not need to refresh or change the page to see new entries.
5.4 Chapter Summary
To achieve the goal of monitoring and evaluating the connectivity to critical systems, we stipulated the
functionalities required for our system, and designed the system architecture according to these re-
quirements. From the designed architecture and the work already done in section 4.2.2 and 4.5.1, we
implemented a system with the back-end written in Python and the front-end written in JavaScript, using
the Django framework to connect the two.
Chapter 6
Evaluation
In this chapter, we will evaluate the eLife critical sites and ASes identified using resilience indicators,
and also evaluate the developed system in terms of reliability and performance. All the tests were made
several times to ensure a good level of confidence.
6.1 Test Description
To evaluate the resilience of the critical systems network and the reliability of the Crisys system, we
used information retrieved by our tool. The information was based on 83 eLife critical sites. To evaluate
Crisys system performance we developed our own Python scripts. All the performance tests were made
for 20, 40, 60 and 80 sites. The information used in the tests was retrieved using the method explained
in chapter 4.
All the tests were made on the same machine and network. Table 6.1 contains the test environment configuration.
6.2 Critical Systems Network
In this section, we will evaluate the information retrieved about the critical eLife sites and ASes identified in chapter 4.
OS | elementaryOS Loki (Ubuntu 16.04)
CPU | Intel® Core™ i5-2410M @ 2.30GHz
RAM | 8 GB
Local AS | 1930
Internet Speed | 100 Mbps
Browser | Firefox 52.0.1
Table 6.1: Test Environment Configuration
Figure 6.1: No. of DNS Servers per Site
Figure 6.2: No. of IP Addresses per Site
Metric | Minimum | Maximum | Average
No. of DNS servers | 1 | 6 | 3
No. of IP addresses | 1 | 7 | 1
RTT | 4.068 ms | 97.853 ms | 32.589 ms
Table 6.2: Sites metrics evaluation
6.2.1 eLife Critical Sites
Sites evaluation is stated in Figures 6.1, 6.2 and 6.3, and in Tables 6.2, 6.3 and 6.4.
DNS Servers
Figure 6.1 shows a histogram of the number of DNS servers per site. Most of the identified critical eLife sites have more than one DNS server, which is a good resilience indicator in case a DNS server is compromised or fails. Table 6.2 contains the minimum, maximum and average number of DNS servers for the critical eLife sites identified.
IP Addresses
Figure 6.2 shows a histogram of the number of IP addresses per site. Most of the identified critical eLife sites are associated with only one IP address, which can be a bottleneck in case of a denial of service (DoS) attack without proper site configuration. If the site is distributed among several IP addresses, a DoS attack will be more difficult to execute. Also, a higher number of IP addresses provides more availability and redundancy. Table 6.2 contains the minimum, maximum and average number of IP addresses for the critical eLife sites identified.
RTT
Figure 6.3 shows a histogram of the RTT per site. More than half of the identified critical eLife sites have an RTT lower than 25 ms. This is a good indicator, as a critical eLife site must respond quickly to user actions. However, a few sites have a high RTT, which can be caused by:
• The physical type of transmission medium;
• The physical distance between source and destination;
• The number of nodes between source and destination;
• Network congestion.
Table 6.2 contains the minimum, maximum and average RTT for the critical eLife sites identified.
Feature | Using | Not using
DNSSEC | 2 | 81
HTTPS | 70 | 13
Table 6.3: Sites security evaluation
Figure 6.3: RTT per Site
DNSSEC
Table 6.3 shows that, of the 83 sites analyzed, only two use DNSSEC. The small number of sites using DNSSEC was expected, as DNSSEC is not mandatory and depends on a chain of trust, which means that all servers in the chain need to have DNSSEC implemented.
HTTPS
Table 6.3 shows that most of the analyzed sites use HTTPS. These numbers may be inaccurate, since our measurement method is to test whether port 443 is open, which can lead to false positives when the same IP address hosts more than one site. However, with this method it is possible to verify whether sites use HTTPS in their client portals, even if they do not use it on the initial page of the site.
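The measurement method described above, testing whether port 443 accepts connections, can be sketched with the Python standard library. This is a minimal sketch and not necessarily how the check was implemented in this work; the function name and the timeout value are assumptions.

```python
import socket


def port_open(host, port=443, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

As noted in the text, an open port only shows that something is listening on the IP address, so this check cannot tell which of several sites hosted on the same address actually serves HTTPS.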
Critical eLife sites affected by foreign Transit ASes
Table 6.4 shows the critical eLife sites that, in case of a shutdown of all foreign Transit ASes, have at least one connection to an ISP disrupted. Each cell gives the number of paths from the Service AS that supports the site to the ISP AS, after and before the shutdown of the foreign Transit ASes (after / before). Some of the affected sites lose the connection with all ISPs. In case of disruption of all communications with foreign countries, these sites can be seriously affected. This test was done using our interactive graph.
Site | Meo | Vodafone | Nos
citius | 0 / 8 | 0 / 27 | 0 / 21
endesa | 0 / 6 | 0 / 19 | 0 / 14
banco bic | 0 / 5 | 5 / 19 | 5 / 13
bni europa | 0 / 3 | 1 / 9 | 1 / 6
banco popular | 0 / 10 | 2 / 30 | 0 / 21
bankinter | 0 / 2 | 2 / 8 | 2 / 7
bnp paribas | 0 / 12 | 6 / 49 | 5 / 24
deutsche-bank | 0 / 2 | 0 / 17 | 0 / 16
sns | 0 / 8 | 0 / 44 | 0 / 15
policia judiciaria | 0 / 8 | 0 / 27 | 0 / 21
ana | 0 / 6 | 4 / 24 | 3 / 11
carris | 0 / 1 | 5 / 18 | 5 / 12
Table 6.4: Sites affected by the shutdown of all foreign Transit ASes
ISP | Minimum | Maximum | Average
NOS | 1 | 5 | 3
MEO | 1 | 6 | 3
VODAFONE | 1 | 6 | 3
Table 6.5: Number of hops per path from ISP to Services ASes
6.2.2 Autonomous Systems
Autonomous systems evaluation is stated in Tables 6.5, 6.6 and 6.7. This evaluation consists of analyzing the number of hops per path and the number of paths between the primary ISPs and the Service ASes, and also states the distribution of Service ASes between national and foreign entities.
Number of hops per path
Looking at Table 6.5 we can see that the number of hops per path, on average, from ISPs to Service
ASes is around three. The number of hops stated is the number of hops between ASes, not between
machines.
Number of paths
Table 6.6 shows the number of possible paths between ISPs and Service ASes. Vodafone has a higher average number of paths, as expected, because it is the ISP with the highest international presence. The Service ASes are divided more or less equally between foreign and national ASes, which means that a national ISP with a higher international presence will normally have a higher number of possible paths.
ISP | Minimum | Maximum | Average
NOS | 1 | 23 | 10
MEO | 1 | 24 | 6
VODAFONE | 8 | 39 | 17
Table 6.6: Number of possible paths from ISP to Services ASes
Service ASes | No. of ASes
National | 21
Foreign | 19
Table 6.7: Number of National and Foreign Services ASes
Figure 6.4: Visual traceroute to sns.gov.pt
National and Foreign ASes
Table 6.7 shows the number of foreign and national Service ASes. This does not mean that a site supported by a foreign AS is physically located outside national territory, as explained in section 4.5.1. However, such a possibility exists, and this can be a problem in a state of emergency in which network borders may be closed.
For example, in Figure 6.4 it is possible to observe the visual traceroute to sns.gov.pt, which is supported by a foreign AS. The traffic goes through several countries before reaching its destination.
6.3 Crisys System Reliability
In this section, we will evaluate if the developed tool monitors critical systems correctly.
6.3.1 Identification of Site unavailability
Figures 6.5 and 6.6 show an example in which Crisys correctly detected that a site was down. In this case, the site was unavailable due to maintenance, which means that our system detects not only sites that cannot be reached, but also sites that return errors on access.
6.3.2 Possible Paths
Table 6.8 shows the results of a test executed on ten random sites, where five of them were supported
by a foreign AS and the other five supported by a national AS, consisting of checking if the path obtained
from traceroute exists in our graph, and if it is equal to any of the paths obtained using our metrics.
Figure 6.5: Site down for maintenance
Figure 6.6: System log confirming that the site is down.
In Table 6.8 SP stands for Shortest Path, BC stands for Betweenness Centrality and CC stands for
Customer Cone.
Of the ten paths obtained, only two were not in the graph. This happens if the paths do not follow the valley-free rule, or if the number of paths initially considered for validation was not high enough. This number is stated in section 4.4.2.
6.3.3 AS Rank
Figure 6.7 shows the AS rank differences between our rank and CAIDA’s rank. There is a discrepancy
because they are calculated using different datasets.
CAIDA's AS rank is calculated using data observed from BGP paths, which means they only consider paths that are advertised. Our rank is calculated using the AS-level graph obtained from the relationship dataset, which means that all the possible paths are considered.
Site | Exists in graph | SP | BC | CC | AS | ASN
112.pt | Yes | No | No | No | National | 197802
anacom.pt | Yes | Yes | Yes | No | National | 3243
artelecom | Yes | Yes | Yes | No | National | 12926
mbnet.pt | Yes | Yes | Yes | No | National | 6773
dre.pt | Yes | Yes | Yes | No | National | 29673
sns.gov.pt | Yes | Yes | Yes | Yes | Foreign | 8972
bnpparibas.pt | No | No | No | No | Foreign | 20940
cncs.gov.pt | Yes | Yes | No | No | Foreign | 49941
cruzvermelha.pt | Yes | Yes | Yes | No | Foreign | 8426
ana.pt | No | No | No | No | Foreign | 16509
Table 6.8: Traceroute comparison with possible paths identified and classified
Figure 6.7: Our AS rank compared with CAIDA AS rank.
6.4 Crisys System Performance
In this section, we will evaluate the developed system in terms of the time it takes to execute tasks as a function of the number of sites monitored. The tests were executed for 20, 40, 60 and 80 sites and, for each case, the values were calculated from N=5 samples with a confidence level of 95%.
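As an illustration of how a 95% confidence interval can be obtained from N=5 samples, the sketch below uses the Student t distribution. The text does not state which interval construction was used, so this choice is an assumption, as are the function and constant names; the critical value 2.776 is the two-sided 95% value for 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of the Student t distribution, 4 degrees of freedom.
T_CRITICAL_95_DF4 = 2.776


def confidence_interval_95(samples):
    """Return (low, high) bounds of the 95% confidence interval of the mean."""
    if len(samples) != 5:
        raise ValueError("the critical value used here is only valid for N=5")
    centre = mean(samples)
    # stdev() is the sample standard deviation (N - 1 in the denominator).
    half_width = T_CRITICAL_95_DF4 * stdev(samples) / sqrt(len(samples))
    return centre - half_width, centre + half_width
```

For five execution times of 10, 12, 11, 13 and 14 seconds, the mean is 12 seconds and the half-width of the interval is approximately 1.96 seconds.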
6.4.1 Database Populate
Figure 6.8 shows that the time to populate the database, using the methods explained in section 5.3.1, grows linearly with the number of sites. This is the most time-consuming task.
6.4.2 Sites Update
Figure 6.9 shows that the time to update sites information, using methods explained in section 5.3.2,
has a linear growth with the increase of the number of sites.
6.4.3 ASes Update
Figure 6.10 shows that the time to update ASes information, using the methods explained in section 5.3.3, grows sublinearly with the number of sites, tending towards logarithmic growth. This happens because the number of ASes does not increase linearly, as some sites are supported by the same Service AS and different paths contain the same Transit ASes.
Figure 6.8: Database populate chart (time in seconds versus number of sites, for 20, 40, 60 and 80 sites).
Figure 6.9: Sites update chart (time in seconds versus number of sites, for 20, 40, 60 and 80 sites).
Figure 6.10: ASes update chart (time in seconds versus number of sites, for 20, 40, 60 and 80 sites).
6.4.4 Sites Monitoring
Figure 6.11 shows that the time to monitor the sites, using the methods explained in section 5.3.2, grows linearly with the number of sites. This is the least time-consuming task.
6.5 Chapter Summary
In this chapter, we evaluated our developed system in terms of its reliability and performance. We
also evaluated the information retrieved from our tool in terms of network resilience. All the tests were
conducted from the same machine and network.
Figure 6.11: Sites monitoring chart (time in seconds versus number of sites, for 20, 40, 60 and 80 sites).
Chapter 7
Conclusions and Future Work
The main goals of this thesis were the identification, characterization and monitoring of systems critical for the communication with eLife critical sites, i.e., Internet sites whose malfunction affects the good functioning of society. Examples of such sites are in Appendix A.
To achieve the stipulated goals, we started by analyzing the structure of these systems, and identified
which system components can be important to their availability and resilience.
Afterwards we analysed previous work related to the goals of this thesis. That work focuses more on the ASes than on the critical sites, but its methodologies were a main source of information for creating our own. These works were done in an international context, namely in the Netherlands, Germany, France and Lithuania.
With the previous goals in mind, we divided our work into four stages:
• Development of criteria to identify critical eLife sites;
• Development of methodologies for the identification and characterization of critical eLife sites;
• Development of methodologies for the identification and characterization of ASes that support
the connectivity of critical eLife sites;
• Development of a system that uses the developed methodologies to monitor the identified critical
systems.
An eLife critical site was defined as one that offers services allowing the dematerialization of services
that are essential for the good functioning of society. We discarded sites that belong to an organization
with critical infrastructures when the site itself only contains information that is not critical for
the general public.
To identify these critical eLife sites we used different sources of information, from Google searches
to the analysis of official documents. The characterization was done using several tools, depending on
the parameters of critical eLife sites we wanted to obtain.
The ASes were divided into Service, ISP and Transit ASes. Service ASes were identified directly from
critical eLife sites. ISP ASes are the primary ASes of the Portuguese ISPs that support national users’
connectivity to the Internet. Transit ASes were identified from the paths between the pairs (Service AS,
ISP AS) and (Service AS, Service AS). The characterization was done using several tools, depending
on the AS parameters we wanted to obtain.
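The identification of Transit ASes from paths can be sketched as follows. This is a simplified illustration: it assumes that each path is a list of AS numbers starting and ending at the two ASes of the pair, and that every intermediate hop counts as a Transit AS. The function name and the AS numbers (from the range reserved for documentation) are hypothetical.

```python
def transit_ases(paths):
    """Collect the ASes that appear strictly between the endpoints of each path."""
    transit = set()
    for path in paths:
        # path[0] and path[-1] are the pair (Service AS, ISP AS) or
        # (Service AS, Service AS); everything in between is transit.
        transit.update(path[1:-1])
    return transit


# Hypothetical AS paths for one (Service AS, ISP AS) pair and one
# (Service AS, Service AS) pair that is directly connected.
paths = [
    [64496, 64500, 64501, 64510],
    [64496, 64497],
]
print(sorted(transit_ases(paths)))
```

A directly connected pair contributes no Transit AS, as the second path shows.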
We emphasize that our approach to identifying alternative paths to the ones currently in use is new,
and that it provides a good resilience metric for evaluating the connectivity of critical eLife sites.
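To make the notion of alternative paths concrete, the sketch below enumerates loopless paths between two ASes in a small AS-level graph and returns the shortest ones first. It is a brute-force enumeration chosen for brevity, not Yen's K shortest loopless paths algorithm listed in the bibliography [43], and it ignores routing policy. The graph, the AS numbers and the function names are assumptions made for this example.

```python
def loopless_paths(graph, source, target):
    """Yield every loopless path from source to target (depth-first search)."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        for neighbour in graph.get(node, ()):
            if neighbour not in path:  # never visit an AS twice in one path
                stack.append((neighbour, path + [neighbour]))


def k_shortest_paths(graph, source, target, k):
    """Return up to k loopless paths, ordered by number of AS hops."""
    ordered = sorted(loopless_paths(graph, source, target), key=lambda p: (len(p), p))
    return ordered[:k]


# Hypothetical undirected AS graph given as an adjacency mapping.
graph = {
    64496: [64500, 64501],          # Service AS
    64500: [64496, 64501, 64510],   # Transit AS
    64501: [64496, 64500, 64510],   # Transit AS
    64510: [64500, 64501],          # ISP AS
}

for path in k_shortest_paths(graph, 64496, 64510, k=3):
    print(path)
```

The number of paths returned for a pair gives a simple indication of how many alternatives remain if the path currently in use fails. Exhaustive enumeration is only practical on small graphs.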
The monitoring system was developed using the methodologies from the identification and
characterization of critical systems, and is composed of a back-end in Python and a front-end in
JavaScript, bridged by the Django framework.
The monitoring system allows evaluating the resilience of the connectivity of critical systems and
visualizing the information obtained in a centralized way. The functionalities of our monitoring tool
that we want to highlight are the live monitoring of the availability of critical eLife sites, the visual
traceroute to the critical eLife sites, the graph visualization of paths for any pair (Service AS, ISP AS)
and (Service AS, Service AS), and finally the interactive graph where it is possible to simulate the
disruption of Transit ASes and verify which critical eLife sites are affected.
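The disruption simulation can be illustrated with a reachability check on an AS-level graph. The sketch assumes that a site counts as affected when its Service AS can no longer reach at least one ISP AS after the disrupted ASes are removed. This criterion, the data and the function names are hypothetical and do not describe the implementation of the tool.

```python
from collections import deque


def reachable(graph, source, removed):
    """Return the ASes reachable from source when the `removed` ASes are down."""
    if source in removed:
        return set()
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def affected_sites(graph, site_to_service_as, isp_ases, disrupted):
    """List the sites whose Service AS can no longer reach every ISP AS."""
    disrupted = set(disrupted)
    affected = []
    for site, service_as in sorted(site_to_service_as.items()):
        if not set(isp_ases) <= reachable(graph, service_as, disrupted):
            affected.append(site)
    return affected


# Hypothetical topology: site-a is dual-homed, site-b depends on one Transit AS.
graph = {
    64496: [64500, 64501],
    64497: [64500],
    64500: [64496, 64497, 64510],
    64501: [64496, 64510],
    64510: [64500, 64501],
}
sites = {"site-a.example": 64496, "site-b.example": 64497}

print(affected_sites(graph, sites, isp_ases=[64510], disrupted=[64500]))
```

Disrupting Transit AS 64500 affects only the single-homed site, while the dual-homed site remains reachable through 64501.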
From the information obtained, we conclude that most, but not all, critical eLife sites are safe from
the disruption of foreign ASes, which means that some critical online services depend on foreign
organizations.
From the evaluation of the developed monitoring system, we note that its performance could be
improved; our focus was on the tool’s reliability.
7.1 Future Work
As future work, several system components could be improved:
• Private information: If we could use private information, for example from ISPs, we could obtain
more detailed information on the resilience of systems connectivity;
• ASes datasets: Using datasets that are updated more frequently would improve the reliability of our tool;
• Metrics: Develop more and better metrics for the evaluation of paths between ASes;
• Performance: Improve the system performance, namely in terms of speed;
• Compatibility: Develop the back-end for other operating systems.
Appendix A
Identified eLife Critical Sites
Table A.1: eLife critical sites
Name Site Sector
ACT http://www.act.gov.pt Civil Administration
ANSR http://www.ansr.pt Civil Administration
Base: Contratos Publicos Online http://www.base.gov.pt Civil Administration
Bolsa de emprego publico https://www.bep.gov.pt Civil Administration
Citius http://www.citius.mj.pt Civil Administration
Dados.gov http://www.dados.gov.pt/ Civil Administration
Deco https://www.deco.proteste.pt/ Civil Administration
Diario da Republica Electronico https://dre.pt/ Civil Administration
Gestao Integrada de Acessos https://www.sgu.gov.pt Civil Administration
Governo da Madeira http://www.madeira.gov.pt Civil Administration
Governo dos Acores http://www.azores.gov.pt/ Civil Administration
IEFP - NetEmprego https://www.netemprego.gov.pt Civil Administration
Instituto Nacional de Estatistica https://www.ine.pt Civil Administration
Interoperabilidade na Administracao Publica https://www.iap.gov.pt Civil Administration
Parlamento https://www.parlamento.pt Civil Administration
Portal da Juventude https://www.juventude.gov.pt Civil Administration
Portal das Financas http://www.portaldasfinancas.gov.pt Civil Administration
Portal do Cidadao https://www.portaldocidadao.pt/ Civil Administration
Proteccao Civil http://www.prociv.pt Civil Protection
Autoridade Nacional da Aviacao Civil http://www.anac.pt Civil Administration
Direccao Geral de Recursos Naturais, Seguranca e Servicos Maritimos https://www.dgrm.mm.gov.pt Civil Administration
Direccao Geral das Actividades Economicas http://www.dgae.min-economia.pt Civil Administration
Proteccao Civil e Bombeiros Acores http://www.prociv.azores.gov.pt Civil Protection
Proteccao Civil Madeira http://www.procivmadeira.pt Civil Protection
Seguranca Social http://www.seg-social.pt Civil Administration
EDP http://www.edp.pt Energy
Endesa https://www.endesa.pt Energy
Galp http://www.galpon.pt Energy
Direccao Geral de Energia e Geologia http://www.dgeg.pt Energy
IPMA https://www.ipma.pt Environment
Activo Bank http://www.activobank.pt Financial Services
Banco Best https://www.bancobest.pt Financial Services
Banco Bic http://www.bancobic.pt Financial Services
Banco BNI http://bnieuropa.pt/ Financial Services
Banco Carregosa https://www.bancocarregosa.com/pt/ Financial Services
Banco CTT https://www.bancoctt.pt Financial Services
Banco de Investimento Global https://www.big.pt/ Financial Services
Banco de Portugal https://www.bportugal.pt/ Financial Services
Banco Finantia https://www.finantia.pt/ Financial Services
Banco Invest https://www.bancoinvest.pt/ Financial Services
Banco Popular http://www.bancopopular.pt Financial Services
Banco Primus http://www.bancoprimus.pt/ Financial Services
Bankinter https://www.bankinter.pt Financial Services
BBVA https://www.bbva.pt/ Financial Services
BNP Paribas http://www.bnpparibas.pt/ Financial Services
BPI http://www.bancobpi.pt Financial Services
BPI Net https://www.bpinet.pt/ Financial Services
Caixa Geral de Depositos https://www.cgd.pt Financial Services
Credito Agricola http://www.creditoagricola.pt/CAI Financial Services
Deutsche Bank http://www.deutsche-bank.pt/ Financial Services
MB Net https://www.mbnet.pt/ Financial Services
Millennium BCP http://millenniumbcp.pt Financial Services
Montepio http://www.montepio.pt Financial Services
Novo Banco https://www.novobanco.pt Financial Services
Santander https://www.santandertotta.pt Financial Services
ASAE http://www.asae.pt/ Food
ADSE http://www.adse.pt Health
Cruz Vermelha http://www.cruzvermelha.pt/ Health
Farmacias Portuguesas https://www.farmaciasportuguesas.pt Health
Saude 24 http://www.saude24.pt Health
Servico Nacional de Saude https://www.sns.gov.pt/ Health
112 http://www.112.pt Health
Inem http://www.inem.pt Health
Anacom http://www.anacom.pt/ ICT
AR Telecom http://www.artelecom.pt ICT
Meo https://www.meo.pt/ ICT
Nos http://www.nos.pt ICT
PT Empresas https://www.ptempresas.pt/ ICT
Vodafone http://www.vodafone.pt ICT
GNR http://www.gnr.pt/ Public order and safety
Policia Judiciaria https://www.policiajudiciaria.pt Public order and safety
PSP http://www.psp.pt Public order and safety
SIBS https://www.sibs.pt/ Public order and safety
Centro Nacional de Ciberseguranca https://www.cncs.gov.pt Public order and safety
ANA aeroportos https://www.ana.pt Transport
Carris http://www.carris.pt/ Transport
CP https://www.cp.pt Transport
CTT http://www.ctt.pt Transport
Metro Lisboa http://www.metrolisboa.pt/ Transport
Metro Porto http://www.metrodoporto.pt/ Transport
SATA http://www.sata.pt Transport
TAP https://www.flytap.com/pt-pt/ Transport
EPAL http://www.epal.pt Water
Bibliography
[1] European Commission: Critical infrastructure. https://ec.europa.eu/home-affairs/what-we-do/policies/crisis-and-terrorism/critical-infrastructure_en Accessed: 2017-03-25.
[2] ENISA: Methodologies for the identification of critical information infrastructure assets and services
(December 2014)
[3] IETF: Guidelines for creation, selection, and registration of an autonomous system (AS), RFC 1930.
Network Working Group (March 1996) https://tools.ietf.org/html/rfc1930 Accessed: 2017-05-08.
[4] ARIN: Autonomous systems and autonomous system numbers. American Registry for Internet
Numbers https://www.arin.net/knowledge/4byte_asns.pdf Accessed: 2017-05-08.
[5] CAIDA. http://www.caida.org/home/ Accessed: 2017-05-08.
[6] Huffaker, B.: Autonomous systems (AS) visualization. CAIDA (January 2016) http://www.caida.org/publications/presentations/2016/as_intro_visualization_ucsd/as_intro_visualization_ucsd.pdf, pages 29, 46.
[7] IETF: DNS terminology, RFC 7719. Network Working Group (December 2015)
https://tools.ietf.org/html/rfc7719 Accessed: 2017-02-24.
[8] IETF: A border gateway protocol 4 (BGP-4), RFC 4271. Network Working Group (January 2006)
https://tools.ietf.org/html/rfc4271 Accessed: 2017-02-24.
[9] IETF: Address allocation for private internets, RFC 1918. Network Working Group (February 1996)
https://tools.ietf.org/html/rfc1918 Accessed: 2017-05-08.
[10] IETF: Special use IPv4 addresses, RFC 5735. Network Working Group (January 2010)
https://tools.ietf.org/html/rfc5735 Accessed: 2017-02-24.
[11] RIPE: Inetnum. https://www.ripe.net/manage-ips-and-asns/db/support/documentation/ripe-database-documentation/rpsl-object-types/4-2-descriptions-of-primary-objects/4-2-4-description-of-the-inetnum-object Accessed: 2017-05-08.
[12] RIPE: RIPE DB. https://www.ripe.net/manage-ips-and-asns/db Accessed: 2017-05-08.
[13] RIPE: RIPE routing registry. https://www.ripe.net/manage-ips-and-asns/db/the-ripe-routing-registry Accessed: 2017-05-08.
[14] Beautiful Soup. https://www.crummy.com/software/BeautifulSoup/ Accessed: 2017-05-08.
[15] Contat, F., Feuillet, M., Lorinquer, P., Valadon, G., Bortzmeyer, S., M’timet, S., Souissi, M.,
Beaudouin, X.: Résilience de l’Internet français (2014)
[16] Contat, F., Feuillet, M., Lorinquer, P., Valadon, G., Bortzmeyer, S., M’timet, S., Souissi, M.,
Beaudouin, X.: Résilience de l’Internet français (December 2015)
[17] Alizadeh, F., Oprea, R.C.: Discovery and mapping of the Dutch national critical IP infrastructure.
Master’s thesis, University of Amsterdam (2013)
[18] Wählisch, M., Schmidt, T.C., de Brün, M., Häberlen, T.: Exposing a nation-centric view on the
German Internet – a change in perspective on AS-level. In: Passive and Active Measurement, Springer
(2012) 200–210
[19] Rainys, R.: Internet infrastructure topology mapping. Communications Regulatory Authority of the
Republic of Lithuania (March 2013)
[20] Razbadauskas, M.: Internet infrastructure topology mapping. Communications Regulatory Authority
of the Republic of Lithuania (November 2013)
[21] Rainys, R.: Cyber security: Lithuanian national regulatory authority expertise in monitoring national
networks resilience. Communications Regulatory Authority of the Republic of Lithuania (April 2016)
[22] RIPEstat. https://stat.ripe.net/ Accessed: 2017-05-08.
[23] Kamer van Koophandel. https://www.kvk.nl/english/ Accessed: 2017-05-08.
[24] RIPE RIS. https://www.ripe.net/analyse/internet-measurements/routing-information-service-ris Accessed: 2017-05-08.
[25] Route Views. http://www.routeviews.org/ Accessed: 2017-05-08.
[26] IPv4 Route Server List. http://www.bgp4.net/rs Accessed: 2017-02-24.
[27] Public Route Server List. http://routeserver.org Accessed: 2017-02-24.
[28] Looking Glasses. http://www.bgp4.as/looking-glasses Accessed: 2017-05-08.
[29] iPlane. http://iplane.cs.washington.edu/ Accessed: 2017-05-08.
[30] UCLA Internet AS-level Topology. http://irl.cs.ucla.edu/topology/ Accessed: 2017-05-08.
[31] Sigma.js. http://sigmajs.org/ Accessed: 2017-05-08.
[32] DNSwitness. http://www.dnswitness.net/ Accessed: 2017-05-08.
[33] Team Cymru. http://www.team-cymru.org/ Accessed: 2017-05-08.
[34] RIPE RRC12. http://data.ris.ripe.net/rrc12/ Accessed: 2017-05-08.
[35] Internet AS-level topology construction & analysis. http://uk.nec.com/en_GB/emea/about/neclab_eu/projects/topology.html Accessed: 2016-05-29.
[36] ANACOM: Telecommunications companies. https://www.anacom.pt/render.jsp?contentId=1396493 Accessed: 2017-03-25.
[37] ERSE: Energy providers - electricity. http://www.erse.pt/pt/electricidade/agentesdosector/comercializadores/Paginas/Clientesnaodomesticos.aspx Accessed: 2017-03-25.
[38] ERSE: Energy providers - natural gas. http://www.erse.pt/pt/gasnatural/agentesdosector/comercializadores/Paginas/Residenciais.aspx Accessed: 2017-03-25.
[39] ViewDNS. http://viewdns.info/dnssec/ Accessed: 2017-03-25.
[40] ANACOM: Facts and numbers. https://www.anacom.pt/streaming/FactosNumeros4T2016_infograma.pdf?contentId=1407916&field=ATTACHED_FILE Accessed: 2017-04-05.
[41] CAIDA datasets. http://www.caida.org/data/as-relationships/ Accessed: 2017-05-08.
[42] GigaPIX peers. https://www.fccn.pt/institucional/gigapix/entidades-ligadas/ Accessed: 2017-04-27.
[43] Yen, J.Y.: Finding the K shortest loopless paths in a network. Management Science (1971)
[44] Django. https://www.djangoproject.com/ Accessed: 2017-04-15.