building a usage profile for anomaly detection
BUILDING A USAGE PROFILE OF A SYSTEM FOR
DETECTING THREATS ON THE SYSTEM
October 2016
Nathanael Asaam
Abstract This paper investigates building usage profiles of a system using behavior models. Such behavior models are central to machine learning and evolutionary computing. Other methods of building usage profiles include statistical models such as time series models, univariate models, and mean and standard deviation models. The aim of building a usage profile is to detect unusual behavior on the system. This paper uses regression to determine the usage profile of a system by studying the relationship between the relevant system variables from which the profile is formulated. The dependent and independent variables for the usage profile can be determined from an audit trail. Additionally, the paper applies hidden Markov models to study the states a computer system can fall into and the transitions between those states, in order to predict unusual behavior in the system. Unusual behavior in this case may be a particular state, a transition from one state to another, or the manner in which a particular state transition occurred. The usage profile is composed of the usage profile equation, a mean and standard deviation model that captures average usage and its standard deviation, and a Markov chain model that captures the states of the system and its state transitions; together these make it possible to detect anomalies on the system. Using linear and nonlinear programming, the usage profile equation can be maximized to determine the states and points at which the system is optimal, which can help improve the system’s usage. Also, using the differential coefficient of the usage profile equation and other statistical models such as the mean and standard deviation model, a threat profile of the system can be developed. Minimizing the threat profile equation using linear and nonlinear programming can help prevent threats on the system.
The benefit of this research is its application to the development of anomaly threat detection systems and risk analysis systems that can be used for performing computer security risk assessments and analysis.
INTRODUCTION
If a usage profile of a system can be built, it will become possible to detect unusual behavior on
the system. The method for building such usage profiles involves determining the factors that
are critical to the system. These factors can be seen as critical system variables that
affect the system’s usage. The other consideration is how to
obtain an abstract representation of the usage profile. The abstract representation of the usage
profile can be achieved by the application of behavior models such as statistical models, machine
learning models and cognitive based models.
The first goal of this research paper is to investigate techniques for building a usage profile of a
computer system. The aim of building the usage profile is to have a working model
that describes the system’s behavior. The second goal of the research is to detect
unusual behavior on the system. Unusual behavior will be detected as deviation from the usage
profile model. The last goal is to build an anomaly threat detection system and a
risk analysis system that can be used for detecting threats and performing risk analysis on a
system.
RESEARCH MODEL AND METHODOLOGY
This paper investigates threat detection by applying machine learning, regression, linear
and nonlinear programming, and calculus. The research model of this paper is inspired by
the application of regression, machine learning and statistical models, and the methodology for
threat detection is inspired by linear and nonlinear programming, calculus and multithreading.
These fields of study are mainly related to computer science, discrete mathematics and
operations research.
BEHAVIOR MODELS
The behavior models that can be used for building a usage profile include machine learning
models, statistical models, cognitive based models, user intention based models and computer
immunology models. Examples of statistical models that can be used are time series models,
univariate models and the moments (mean and standard deviation) model. Machine learning
models that can be used include neural networks, Bayesian networks, hidden Markov models
and genetic algorithms.
SYSTEM THREATS
Our intended threat analysis and detection is expected to produce three types of system logs:
system errors, system threats, and usage rates, each categorized by the
magnitude and characteristics of an instance of the threat model. These logs must be
audited by a security expert to analyze changes in the computer system that fit or deviate from
the current usage profile, in order to project a more appropriate instance of the usage profile
for future use.
PROBLEM DEFINITION
If the normal usage or behavior of a computer system can be represented by an abstract model,
then this abstract model can be used to detect threats on the system. The threats on the system
can be detected as deviations from the abstract model which is the behavior of the system. The
main problems this paper seeks to investigate are listed below.
• Representing the normal usage or behavior of a system with an abstract model.
• Determining activities and occurrences on the system that are deviations from the
system’s normal behavior or usage.
• Representing these activities or deviations with an abstract model.
• Preventing such activities or occurrences on the system.
In this paper the system’s normal behavior is known as the usage profile and the deviations from
the system’s normal behavior are known as the threat profile of the system.
RESEARCH QUESTIONS
The main questions to be investigated are listed below.
• What are the best and most efficient techniques for modelling a system’s normal behavior
or usage?
• What are the best and most efficient techniques for the design and implementation of a
threat detection system?
• How can we build a risk analysis system for performing risk assessment of a computer
system?
OBJECTIVES
The main objectives of this research are as follows.
• Representing a system’s normal functioning with an abstract model.
• Design and implementation of a threat detection system.
• Design and implementation of a mobile security audit framework.
LITERATURE REVIEW
This section reviews the major topics that constitute this research and work done in these areas. The
topics reviewed are intrusion detection systems, behavior encryption, risk analysis,
information security awareness and practices, ways of mitigating risks on social networking sites
and the application of Markov chain models for anomaly detection.
INTRUSION DETECTION SYSTEMS
There are two types of intrusion detection systems. These are signature based or knowledge
based threat detection systems and anomaly based threat detection systems. Signature based
threat detection systems detect threats by directly mapping incidents against a database of
known threats. The database of known threats is called the threat signatures. Anomaly based threat
detection systems detect threats based on deviation from an observed behavior pattern of the
system for which the threat detection system has been built. They are also known as behavior
based threat detection systems. The threat signatures of a knowledge based threat detection system
must be constantly updated with newly identified threats. Since new
threats may not be in the threat signatures, the correctness of detecting threats using such threat
detection systems is sometimes compromised.
However, anomaly based threat detection systems have higher correctness of detecting such new threats,
but they sometimes give false alarms since there is no mapping of system incidents to a
database of known threats. Anomaly based threat detection systems are built using artificial
intelligence technologies. Besides this, intrusion detection systems are classified by the
purpose for which they are built and by whether they deal with
threats actively or passively. There are host based and network based threat detection systems, built for those
respective purposes. Also, active threat detection systems are configured to block or prevent attacks, while
passive intrusion detection systems are configured to monitor, detect and alert on threats.
BEHAVIOR ENCRYPTION
Behavior algorithms are applied to safeguard information on computing devices such as mobile
phones and laptops. These algorithms are the basis for building systems that study and encrypt
user behavior on computing devices in order to ensure the security of the computing device. A
study into mobile platform security reports that behavior encryption systems have been designed
and built focusing on mobile platforms. Results from this study indicate that behavior
encryption application systems are effective at ensuring mobile platform security. It must be
stated that cryptographic study into how to encrypt the usage profile of a system can fall under
behavior encryption. This will help in securing the information that embodies the usage profile.
It is necessary because if the usage profile can be known, then it is possible to launch an attack
on the system. It must be stated that the usage profile may consist of user behavior and network
behavior. As such, if such information is compromised, then a user can be impersonated or the
network can be compromised.
RISK ANALYSIS
Computer risk analysis is also called risk assessment. It involves the process of analyzing and
interpreting risk. To analyze risk, the scope and methodology have to be determined first.
Then, information is collected and analyzed before the risk analysis results are interpreted.
Determining the scope means identifying the system to be analyzed for risk and the
parts of the system that will be considered. Also, the analytical method that will be used, with its
level of detail and formality, must be planned. The boundary, scope and methodology used during risk
assessment determine the total amount of work effort that is needed in risk management,
and the type and usefulness of the assessment’s results.
Risk has many components including assets, threats, likelihood of threat occurrence,
vulnerability, safeguard and consequence. Risk management includes risk acceptance, which takes
place after several risk analyses. Normally, after risk has been analyzed and safeguards
implemented, the remaining or residual risk that allows the system to stay functional must
be accepted by management. This may be due to constraints on the system such as ease of use,
or features of the system for which strict safeguards would cause the organization operational
problems. As such, risk acceptance, like the selection of safeguards, should take into account
various factors besides those addressed in the risk assessment. In addition, risk acceptance
should take into account the limitations of the risk assessment.
INFORMATION SECURITY AWARENESS AND PRACTICES
A paper on information security awareness in Saudi Arabia discusses information security
awareness and practices. The paper is entitled “A study of information security awareness and
practices in Saudi Arabia.” This paper emphasizes the fact that information is under constant
threat from cyber vandals. According to the paper, Saudi Arabia is rated poorly in terms of information security,
which it attributes to the country’s highly suppressed, patriarchal and tribal culture.
The paper examined the level of information security awareness among the general
public in the country using an anonymous online survey based on instruments the Malaysian
Security Organization produced. In all, 633 persons responded to the survey, and the analysis
confirmed that information security awareness is low in the country, which the paper
relates to that same cultural context.
PROTOCOL FOR MITIGATING RISKS ON SOCIAL NETWORKING SITES
According to an academic paper entitled, “Protocol for mitigating the risk of hijacking social
networking sites”, hackers can hijack a user’s session on social networking sites, impersonate the
victim and take over his session. The paper deals with this risk by presenting a security
authentication protocol for mitigating the risk.
The protocol takes into account that users of social networking sites connect to the sites
using several platforms and connection speeds. To cater for mobile devices and tablets using Wi-Fi
connections, a novel Self-Configuring Repeatable Hash Chains (SCRHC) protocol was developed to
prevent the hijacking of session cookies. This protocol supports three levels of caching, making it
possible to trade storage space for enhanced performance and reduced workload.
APPLICATION OF MARKOV CHAIN MODELS FOR ANOMALY DETECTION
According to a research paper on the application of Markov chain models for anomaly detection, the
temporal behavior of a system can be modelled using Markov chain models. The technique was used
to represent the temporal profile of the normal behavior of a computer and network system.
The Markov chain model of the normal profile was learnt from historic data. The observed
behavior was analyzed to infer the probability that the Markov chain model of the normal
profile supports the observed behavior. A low probability of support indicates anomalous
behavior that may result from intrusive activities. The technique was implemented and tested
on audit data of a Sun Solaris system. The testing results show that the technique clearly
distinguished intrusive activities from normal activities. According to the paper, two primary
sources of data have been used to capture activities in a computer or network system for intrusion
detection: network traffic and audit data.
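The idea of learning a normal profile from historic data and scoring observed behavior against it can be illustrated with a minimal Python sketch. It assumes a first-order Markov chain over named system events; the function names, the smoothing constant and the probability floor for unseen events are hypothetical choices made for this illustration and are not taken from the reviewed paper.

```python
import math
from collections import defaultdict


def fit_transitions(history, smoothing=1e-3):
    """Estimate first-order transition probabilities from normal sequences."""
    states = sorted({s for seq in history for s in seq})
    counts = defaultdict(lambda: defaultdict(float))
    for seq in history:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    probs = {}
    for prev in states:
        # Additive smoothing (an assumption) keeps unseen transitions above zero.
        total = sum(counts[prev].values()) + smoothing * len(states)
        probs[prev] = {
            nxt: (counts[prev][nxt] + smoothing) / total for nxt in states
        }
    return probs


def avg_log_support(probs, observed, floor=1e-6):
    """Average log-probability per transition of an observed sequence.

    Lower values mean the normal profile gives less support to the sequence.
    Events never seen in training fall back to the floor probability.
    """
    steps = list(zip(observed, observed[1:]))
    total = 0.0
    for prev, nxt in steps:
        total += math.log(probs.get(prev, {}).get(nxt, floor))
    return total / len(steps)
```

A sequence would then be flagged when its average log support falls below a threshold chosen from the scores of known normal sequences; how that threshold is set is left open here.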
RESEARCH MODEL AND METHODOLOGY
RESEARCH MODEL
Assume that the normal usage (Y) of a system such as a smart phone, laptop or wireless network
can be represented by a mathematical function
$Y = f(X_i, C_i)$, where $X_i$ represents system variables like the number of functions or number of
authentications and $C_i$ represents system constants like the maximum or minimum number of
authentications. When a change in Y is beyond the standard deviation determined from the data
set of our usage, that change indicates a threat. To investigate this threat, machine learning
algorithms, mathematical functions and behavior based intrusion detection systems will be
studied to determine Y in terms of a number of variables that represent Y appropriately.
The expected usage model of the network to be investigated includes the following
components: Host Usage Model, Server Usage Model, Device Usage Model, Port Usage Model,
Network Usage Model, Session Usage Model, Authentication Usage Model, Memory Usage
Model, CPU Usage Model, Battery Usage Model and Program Usage Model. These components
are expected to be derived from the variables listed below.
• Average number of application software that run on the mobile system while using the system
• Average number of system processes that run on the mobile system while using the system
• Average number of authentications in the mobile system.
• Average number of user actions that happen on the mobile system
• Average time a user spends before his session expires.
• Average time the mobile facility or resource functions each day.
• Number of paired ports communicating on the network
• Average amount of memory space used on devices while the network is being operated.
• Average CPU time spent on a single device on the network
• Average life span of a single device battery on the network.
METHODOLOGY
The list below details activities or processes that will be followed to represent a computer system
with an abstract mathematical model and analyze changes in the system. It is hoped that
following these processes will arrive at design and implementation of a normal usage model, a
threat detection system and a mobile security audit framework.
• Machine Learning Algorithms & Behavior Based Intrusion Systems: Investigate machine learning
algorithms, mathematical functions, and behavior based intrusion detection systems in order to
determine the extent to which the normal usage of a mobile system can be represented by the
research model.
• Audit trails: Analyze audit trails in order to formulate a set of independent and dependent
variables and their associated data set that will help in modelling the usage model of a mobile
system.
• Normal Usage Model: Apply the knowledge gained from the machine learning algorithms and
behavior based intrusion detection systems study and the audit trails analysis to model and
represent the normal usage of a mobile system such as a smart phone, laptop or wireless network.
• Threat Modelling: Study differential equations of the normal usage model and its applications in
order to model, detect and prevent threats.
• Boolean Calculus: Apply Boolean algebra and the calculus of Boolean functions to design and
implement the hardware and software that make up the Normal Usage and Threat Detection
Systems.
• Use programming as a tool to experiment with representations of the normal usage and threat models
to aid the design and implementation of a mobile security audit framework.
• Employ a questionnaire to collect information about the usage of computers and mobile phones.
• Threat Detection Systems: Develop an anomaly based threat detection system to demonstrate
the effectiveness of the research model. The goal is to measure the effectiveness of the threat
detection system developed, at preventing threats on a computer system.
Machine Learning Algorithms & Behavior Based Intrusion Systems
Machine learning techniques and algorithms will be investigated to determine the extent to which an
expert system that learns a computer system’s usage can be built. Since the expected usage
model is a mathematical model, various mathematical modelling techniques will be applied to
determine the normal usage model.
Analyzing deviations from these mathematical models can lead to the design and
implementation of behavior based intrusion detection systems. As such, a thorough study of the
design and implementation of behavior based intrusion detection systems will be done.
Audit Trail Analysis
It is expected that computer security audit reports will be sampled and analyzed to arrive at a
set of dependent and independent variables and their data set. These variables and their
associated data set can be used to formulate the normal usage model.
Normal Usage Model
An investigation into applying the knowledge gained from the machine learning study, the
mathematical modelling study, the behavior based intrusion detection system study and the
audit trail analysis will be done. It is hoped that this will answer the question of how to
represent the normal functioning of a computer system with an abstract mathematical model.
Threat Modelling
Differential equations of the normal usage model will be investigated to know the extent to
which deviations from the normal usage models can be analyzed. An abstract mathematical
model of these deviations will be formulated. These abstract models are derivatives of the
normal usage model.
Boolean Calculus
A study into representing the normal usage model with a Boolean function will be done. It is
hoped that analyzing these Boolean functions will aid in building hardware that implements the expected
usage system. Differential equations of these Boolean functions will be studied to analyze
changes in the system that indicate deviation from the normal usage model.
Experimenting Usage and Threat Models
Programming will be used as a tool to experiment with various usage and threat models. These usage
and threat models are expected to be derived from a computer system. This experiment will lead
to design and implementation of a normal usage system, a threat detection system and a risk
analysis system. These systems are expected to be components of a mobile security audit
framework.
Computer Usage Survey
A questionnaire for obtaining information about computer and smart phone usage will be
employed. It is expected that this will give an idea about various statistics that make up a
computer or smart phone’s usage. These statistics will be a guideline for sampling experimental
data of a computer system’s usage while experimenting with the usage and threat models.
Threat Detection Systems
It is hoped that an anomaly based threat detection system will be developed to demonstrate the
effectiveness of the research model at modelling system usage and threats. The
effectiveness of the threat detection system developed at preventing threats on a computer
system will also be measured. In this project, the threat detection system that will be developed
is for ecommerce sites.
USAGE PROFILE OF A SYSTEM AND THREATS ASSOCIATED WITH THE SYSTEM
Building a usage profile of a system requires determining an abstract model that represents the
system’s usage appropriately. In this paper, we will determine the usage profile by using
statistical models and machine learning models. The statistical model that will be used is the
moments (mean and standard deviation) model. The machine learning model that will be used
is the hidden Markov model. Additionally, we will be determining the usage profile by trying to
find out the relationship between dependent and independent variables that make up the
system. As such, we will be using regression to determine the relationship between the
dependent and independent variables. A study into the application of differentiation also gives
insight into the kind of threats associated with the system.
MATHEMATICAL MODELLING TECHNIQUES
The mathematical relation that represents the usage of a system can be determined using
regression analysis. Regression analysis is a field of statistics that employs the least squares
method to determine the relationship between a dependent and one or more independent
variables given the data set for these variables. The least squares method determines the
relationship by minimizing the sum of squared errors between the observed and predicted values. Additionally,
differential equations will be used to model threats in the system given the usage equation.
SIMPLE LINEAR REGRESSION
Simple linear regression involves a dependent variable and a single independent variable. The
goal is to find a linear relationship between the two variables. The linear relationship found is
typically of the form $y = b_0 + b_1 x$, where y is the dependent variable and x is the independent variable. The slope of the line is $b_1$ and
the y-intercept is $b_0$. The relationship between the dependent and independent variables can be
determined using the least squares method. First, the sums of the dependent and the
independent variables ($\sum y$, $\sum x$) and the sum of products of the dependent and the independent
variables ($\sum xy$) are determined. Secondly, the sums of the squares of the dependent and the
independent variables ($\sum y^2$, $\sum x^2$) must be determined.
The constant that represents the slope of the line that best fits the relationship is
calculated as the product of the sample size and the sum of products of the dependent and the independent variables,
minus the product of the sums of the dependent and the independent
variables, all divided by the product of the sample size and the sum of squares of the independent
variable minus the square of the sum of the independent variable. This is given as
$b_1 = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$, where n is the sample size.
The constant that represents the y-intercept of the line determined to be the relationship
between the dependent and the independent variables is calculated as the product of the
sum of the dependent variable and the sum of squares of the independent variable, minus the
product of the sum of the independent variable and the sum of products of the dependent and
independent variables, all divided by the product of the sample size and the sum of squares of the independent variable
minus the square of the sum of the independent variable. This is given as
$b_0 = \dfrac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2}$, where n is the sample size.
Finally, the correlation coefficient of the predictive relationship is calculated as the
product of the sample size and the sum of products of the dependent and independent variables,
minus the product of the sums of the dependent and independent variables, all divided by the
square root of the following product: the sample size times the sum of the squares of the independent
variable minus the square of the sum of the independent variable, multiplied by
the sample size times the sum of the squares of the dependent variable minus the
square of the sum of the dependent variable. This is given as
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$, where n is the sample size.
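The least squares computations for the slope, the y-intercept and the correlation coefficient can be sketched in Python as follows. This is a minimal illustration that uses only the sums named in this section; the function name simple_linear_fit is hypothetical.

```python
import math


def simple_linear_fit(xs, ys):
    """Return (b0, b1, r) for the line y = b0 + b1 * x fitted by least squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")

    b1 = (n * sum_xy - sum_x * sum_y) / denom          # slope
    b0 = (sum_y * sum_xx - sum_x * sum_xy) / denom     # y-intercept

    r_denom = math.sqrt(denom * (n * sum_yy - sum_y ** 2))
    # r is undefined when y has no variation; report NaN in that case.
    r = (n * sum_xy - sum_x * sum_y) / r_denom if r_denom else float("nan")
    return b0, b1, r
```

For example, samples drawn exactly from y = 2x + 3 should return an intercept of 3, a slope of 2 and a correlation coefficient of 1.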
MULTIPLE LINEAR REGRESSION
Multiple linear regression problems involve a dependent variable and two or more independent
variables. Using the least squares method, the goal is to find the linear relationship between the
variables involved. The relationships are of the form $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$, where n is the
number of independent variables, $x_1, x_2, \dots, x_n$ are the independent variables and y is
the dependent variable.
To solve multiple linear problems, we first reduce the expected function or multiple
linear model to simple linear forms. In this form, it is easier to determine the regression
equation. To do this we determine $y = b_0 + b_1 x$ for every independent variable. That
way, the set of regression coefficients b associated with the independent variables can be
determined using the least squares method. The set b, made up of $b_1, b_2, \dots, b_n$, then
contains all the regression coefficients of the predicted regression function. Note that fitting each
independent variable separately gives the same coefficients as a joint least squares fit only when the
independent variables are uncorrelated with one another; otherwise the coefficients must be estimated jointly.
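As an illustration of estimating all coefficients at once, the sketch below solves the least squares normal equations for the full model in pure Python. This joint solution is a standard alternative to fitting one independent variable at a time and is not the reduction described in this section; the function names and the pivot tolerance are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a * z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: variables are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def multiple_linear_fit(rows, ys):
    """rows: list of [x1, ..., xk] samples. Returns [b0, b1, ..., bk]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 models the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

Solving the normal equations directly is adequate for a handful of variables; for larger or ill-conditioned data sets a numerically more stable solver would normally be preferred.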
NON LINEAR REGRESSION
Nonlinear regression problems involve finding a nonlinear relationship between a dependent
variable and one or more independent variables. Because nonlinear relationships are difficult to
analyze directly, many of them can be transformed into linear models before they are analyzed. This
makes it possible to use linear regression techniques to analyze such relationships.
One way to represent a nonlinear relationship with a linear model is to take
logs on both sides of the relationship equation. For relationships built from products, quotients and powers, this reduces the nonlinear relationship to a
linear relationship. An example is the equation $y^2 = x^2/(xy)$. To reduce this relationship to a linear
relation we take logs on both sides of the relation.
The resulting relationship is $2\log y = 2\log x - \log x - \log y$. When this relationship is simplified
the result is $\log y = (\log x)/3$. In this form, the $\log y$ term represents the dependent
variable and the $\log x$ term represents the independent variable. Let $K = \log y$ and let $P = \log x$. It
follows that $K = P/3$. This becomes the linear form of our nonlinear relation.
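The log transformation can be combined with simple linear regression as in the following Python sketch. It assumes the nonlinear relationship is a power law of the form y = a * x**b with positive x and y, which covers the worked example (a = 1, b = 1/3); the function name fit_power_law is hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on K = log y against P = log x.

    Requires every x and y to be strictly positive.
    """
    ps = [math.log(x) for x in xs]   # P = log x
    ks = [math.log(y) for y in ys]   # K = log y
    n = len(ps)
    sum_p, sum_k = sum(ps), sum(ks)
    sum_pk = sum(p * k for p, k in zip(ps, ks))
    sum_pp = sum(p * p for p in ps)

    denom = n * sum_pp - sum_p ** 2
    b = (n * sum_pk - sum_p * sum_k) / denom           # slope in log space
    log_a = (sum_k * sum_pp - sum_p * sum_pk) / denom  # intercept in log space
    return math.exp(log_a), b
```

Note that minimizing squared error in log space is not identical to minimizing it in the original units, so the fitted curve should be checked against the raw data.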
SINGLE VARIABLE CALCULUS REVIEW (DIFFERENTIATION)
Assume a system with exactly three major system variables. If sampling each of these variables
helps us to arrive at exactly one micro usage equation of our system that best represents the
behavior or functioning of that feature of our system, then we can use differential equations of
the three micro models to analyze and detect threats. Below are some examples of calculus
basics for our usage profile and threat modelling.
$Y = 2X + 3$ is a linear function that represents our first micro usage model, where X is the number of
authentications. $Y = 3X^2 + 2X + 6$ is a quadratic function that represents our second micro usage
model, where X is the number of hosts on the system’s wireless network. $Y = 40/X + 5$ is a reciprocal
function that represents our third micro usage model, where X is the number of applications on a
host on the system’s wireless network. For each micro usage model, the differential coefficient
can be computed using the laws of differentiation given below.
THEOREM 1: $\frac{d}{dX}(C) = 0$, where C is a constant. THEOREM 2: $\frac{d}{dX} f(X_i, C_i)$ is computed
term by term after simplifying $f(X_i, C_i)$ into a sum of power terms. For the first term, multiply its exponent by the
constant beside it, then multiply by the system variable $X_i$ raised to the original exponent
minus one. Repeat this step for every remaining term of $f(X_i, C_i)$ and add the results, so that
$\frac{d}{dX}\sum_k a_k X^{n_k} = \sum_k a_k n_k X^{n_k - 1}$.
From the calculus basics review above, the corresponding differential coefficients of the
three micro models are $2$, $6X + 2$ and $-40/X^2$. If the average usage and
standard deviations of our micro models are computed, then we can analyze changes in our
system by looking at values of our usage equations and their derivatives and how they relate to
the average usage, its corresponding standard deviation, and how this helps us determine
threats.
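The term-by-term rule can be checked against the three micro usage models with a short Python sketch. Representing each model as a list of (coefficient, exponent) pairs, and the names differentiate, evaluate, AUTH_MODEL, HOST_MODEL and APP_MODEL, are assumptions made for this illustration.

```python
def differentiate(terms):
    """Apply the power rule to a sum of c * X**e terms.

    terms is a list of (coefficient, exponent) pairs. Constant terms
    (exponent 0) differentiate to zero and are dropped.
    """
    return [(c * e, e - 1) for c, e in terms if e != 0]


def evaluate(terms, x):
    """Evaluate a sum of c * x**e terms at a given x."""
    return sum(c * x ** e for c, e in terms)


AUTH_MODEL = [(2, 1), (3, 0)]           # Y = 2X + 3
HOST_MODEL = [(3, 2), (2, 1), (6, 0)]   # Y = 3X^2 + 2X + 6
APP_MODEL = [(40, -1), (5, 0)]          # Y = 40/X + 5

auth_rate = differentiate(AUTH_MODEL)   # constant rate of change 2
host_rate = differentiate(HOST_MODEL)   # 6X + 2
app_rate = differentiate(APP_MODEL)     # -40 / X^2
```

Evaluating a derivative at the current value of X gives the rate at which usage is changing there, which can then be compared with the average usage and its standard deviation.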
INTEGRATION REVIEW
Assume the functions $y = 3$, $y = 4X + 2$ and $y = 9X^2 + 3$ are threat model equations. Based on these
three functions we will do an introductory review of integration, the branch of calculus
that is the reverse operation of differentiation. The integrals of the functions are computed
respectively as $3X + C$, $2X^2 + 2X + C$ and $3X^3 + 3X + C$, where C represents a system constant in the
system. Computing the integral can be tricky, so two laws are defined below to aid quick
computation of the integrals of a normal mathematical function.
THEOREM 3:
If a function is a constant such as a rational number, its integral is the product
of the variable x and that constant, plus a system constant c that is
determined from a known pair of x and y values.
THEOREM 4:
If a function is not a constant, its integral is computed term by term: the coefficient of the first
term in x is divided by the exponent of that term plus 1 and multiplied
by the variable x raised to the exponent of that term plus 1.
The same is repeated for every term in x, and the results are added together with the system constant c. This rule
applies to every exponent except -1.
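The two integration laws can be sketched in Python using a (coefficient, exponent) representation of each term. The representation, the function name integrate and the known_point parameter used to fix the system constant are assumptions made for this illustration.

```python
def integrate(terms, known_point=None):
    """Term-by-term antiderivative of a sum of c * X**e terms.

    terms is a list of (coefficient, exponent) pairs. If known_point is a
    pair (x, y) that the integral must pass through, the system constant C
    is solved from it; otherwise C is reported as 0.0.
    """
    result = []
    for c, e in terms:
        if e == -1:
            # X**-1 integrates to a logarithm, which this rule does not cover.
            raise ValueError("exponent -1 is not handled by the power rule")
        result.append((c / (e + 1), e + 1))

    constant = 0.0
    if known_point is not None:
        x, y = known_point
        constant = y - sum(c * x ** e for c, e in result)
    return result, constant
```

A constant function such as y = 3 is passed as the single term (3, 0), so the same routine covers both laws.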
BUILDING THE USAGE PROFILE
To build a usage profile, we use a mathematical model that captures the behavior of the system
and a Markov chain model that captures the various states and transitions in the system. The
mathematical model is made up of a usage equation, composed of a dependent variable and independent
variables, and a statistical model that captures average usage and its standard deviation. The
usage equation of the system can be summarized as $Y = f(X_i, C_i)$, where Y is our system’s usage,
$X_i$ are the independent variables of our system that constitute the normal usage or
behavior of the system, and $C_i$ are system constants.
In order to determine the usage equation, it is essential to keep the method simple and
the variables simple in abstraction and minimal in quantity. This makes it easy to appropriately
represent the system’s usage with a usage equation. Then we use regression to determine the
mathematical equation that describes the usage of the system. In this paper, we break down the
usage of a system into various micro usage models that describe smaller parts of the system.
Once we have determined the usage equations of these micro usage models and their
associated mean and standard deviation models, we have built the mathematical part of the usage profile
of the system. What remains is to study the various states and state transitions
in the system, which are also part of the usage profile.
AUTHENTICATION USAGE MODEL
The authentication usage model represents the usage of an authentication system. The
independent variables that must be sampled to determine the usage of an authentication system
are the average data transmitted during an authentication (x1) and the average network speed
for a single authentication (x2). The average data transmitted is the average of request and
response data for a single authentication and the average network speed is the average upload
and download speed for a single authentication. The dependent variable that must be sampled
is the time taken for an authentication (y).
The goal of modelling the dependent and independent variables is to arrive at a
mathematical relationship between y and the two independent variables x1 and x2. It is
expected that the relationship will be Y=c1(x2/x1) +c2, where c1 and c2 are system constants. In
addition to that, some system constants that will aid threat analysis must be determined. These
are the total number of valid authentications, the expected authentications within a time frame,
the minimum authentications within a time frame and the maximum authentications within a
time frame.
The mathematical relationship between y, x1 and x2 is the normal usage model of the
authentication system. After this relationship has been determined, various occurrences that
deviate from this relationship can be used to analyze threats. For instance, any occurrence that
is not equal to the average usage is a threat. Additionally, any occurrence that indicates a change
outside an acceptable threshold is a threat. The acceptable threshold is a range within which
changes in the system are deemed normal. Such a range is built from the average usage and its
standard deviation.
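The acceptable threshold described above can be sketched in Java as follows. This is a minimal illustration, assuming the range is the average usage plus or minus one standard deviation and that the population form of the standard deviation is used. The class name UsageThreshold, the method isThreat and the sample authentication times are hypothetical and do not come from the experiment.

```java
final class UsageThreshold {
    final double mean;
    final double stdDev;

    /** Builds the acceptable range from sampled usage values, e.g. authentication times. */
    UsageThreshold(double[] samples) {
        double sum = 0.0;
        for (double s : samples) sum += s;
        mean = sum / samples.length;
        double squared = 0.0;
        for (double s : samples) squared += (s - mean) * (s - mean);
        // Assumption: population standard deviation (divide by n, not n - 1).
        stdDev = Math.sqrt(squared / samples.length);
    }

    /** True when the observed value falls outside [mean - stdDev, mean + stdDev]. */
    boolean isThreat(double observed) {
        return observed < mean - stdDev || observed > mean + stdDev;
    }

    public static void main(String[] args) {
        // Made-up authentication times in seconds.
        double[] authTimes = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        UsageThreshold threshold = new UsageThreshold(authTimes);
        System.out.println("range: " + (threshold.mean - threshold.stdDev)
                + " to " + (threshold.mean + threshold.stdDev));
        System.out.println("5.5 is a threat: " + threshold.isThreat(5.5));
        System.out.println("12.0 is a threat: " + threshold.isThreat(12.0));
    }
}
```

A wider range, for example two standard deviations, would flag fewer occurrences; the width is a design choice that the threshold definition leaves open.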
SESSION USAGE MODEL
A session usage model represents a single user’s behavior before his session expires. To
determine the mathematical model for a user’s session, two main independent variables must
be sampled. These are size of session data accumulated (x1), and number of user actions (x2).
The dependent variable that must be sampled is the time spent before the session expires (y). The
session usage model is expected to be made up of two micro usage models. The mathematical
representations of the micro usage models are expected to be Y=c1x1+c2 and Y=c1x2+c2, where
c1 and c2 are system constants determined separately for each model.
In addition to the two mathematical functions, some system constants that will aid threat
analysis must be determined. These include average user actions, average size of data
accumulated, average time spent. These constants can be determined from the data set used to
determine the usage model.
The two mathematical relationships represent the session usage model. Both are linear
functions. It is expected that as user actions increase, the time spent also increases. It is also
expected that as the data accumulated increases, the time spent also increases.
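Each of the two linear micro usage models can be determined by simple least-squares regression. The sketch below is one possible way to do this, not necessarily the method used in the experiment. The class name SessionFit, the method fitLine and the sample values are assumptions made for illustration.

```java
final class SessionFit {
    /** Least-squares fit of y = c1*x + c2; returns {c1, c2}. */
    static double[] fitLine(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            sxy += x[i] * y[i];
        }
        double denom = n * sxx - sx * sx;
        if (denom == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double c1 = (n * sxy - sx * sy) / denom; // slope
        double c2 = (sy - c1 * sx) / n;          // intercept
        return new double[] {c1, c2};
    }

    public static void main(String[] args) {
        // Made-up samples: number of user actions (x2) and time spent before expiry (y).
        double[] actions = {1, 2, 3, 4};
        double[] timeSpent = {5, 7, 9, 11};
        double[] c = fitLine(actions, timeSpent);
        System.out.println("Y = " + c[0] + " * x2 + " + c[1]);
    }
}
```

The same method can be called a second time with the size of session data accumulated (x1) to obtain the other micro usage model.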
MEMORY USAGE MODEL
The memory usage model represents the usage of memory space in a system. The independent
variables that must be sampled are number of application programs running (x1), and the
number of system processes running (x2). The dependent variable that must be sampled is the
amount of memory space being used (y). The mathematical relationship between x1, x2, and y is
expected to be y=c1x1+c2x2+c3 where c1 is the average memory space for programs, c2 is the
average memory space for processes and c3 is the average memory being used when no process
or program is running.
In addition to these, some system constants that aid threat analysis must be determined.
These include the minimum and maximum memory space for programs and the minimum and
maximum memory space for processes. The mathematical relationship between x1, x2, and y is
the memory usage model. When determined, the memory usage model can be used to analyze
changes in the memory usage that indicate threats in the system.
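A relation of the form y=c1x1+c2x2+c3 can be fitted directly by multiple linear regression. The sketch below solves the normal equations with Gauss-Jordan elimination; it is an illustration under stated assumptions, not the procedure from the experiment. The class name MemoryFit and the sample values are hypothetical.

```java
final class MemoryFit {
    /** Returns {c1, c2, c3} for y = c1*x1 + c2*x2 + c3 using the normal equations. */
    static double[] fit(double[] x1, double[] x2, double[] y) {
        double[][] a = new double[3][4]; // augmented matrix [X^T X | X^T y]
        for (int i = 0; i < y.length; i++) {
            double[] row = {x1[i], x2[i], 1.0};
            for (int r = 0; r < 3; r++) {
                for (int c = 0; c < 3; c++) a[r][c] += row[r] * row[c];
                a[r][3] += row[r] * y[i];
            }
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int p = 0; p < 3; p++) {
            int best = p;
            for (int r = p + 1; r < 3; r++) {
                if (Math.abs(a[r][p]) > Math.abs(a[best][p])) best = r;
            }
            double[] tmp = a[p];
            a[p] = a[best];
            a[best] = tmp;
            if (Math.abs(a[p][p]) < 1e-12) {
                throw new IllegalArgumentException("samples do not determine a unique fit");
            }
            for (int r = 0; r < 3; r++) {
                if (r == p) continue;
                double factor = a[r][p] / a[p][p];
                for (int c = p; c < 4; c++) a[r][c] -= factor * a[p][c];
            }
        }
        return new double[] {a[0][3] / a[0][0], a[1][3] / a[1][1], a[2][3] / a[2][2]};
    }

    public static void main(String[] args) {
        // Made-up samples: programs running, processes running, memory used (MB).
        double[] programs = {1, 2, 3, 4, 2};
        double[] processes = {10, 12, 9, 15, 20};
        double[] memory = {350, 420, 440, 550, 500};
        double[] c = fit(programs, processes, memory);
        System.out.println("y = " + c[0] + "*x1 + " + c[1] + "*x2 + " + c[2]);
    }
}
```

The same sketch applies unchanged to the CPU usage model, since it has the same form.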
CPU USAGE MODEL
The CPU usage model represents CPU usage in a system. The independent variables that must
be sampled are the number of application programs running (x1), and number of system
processes running (x2). The dependent variable that must be sampled is amount of CPU power
being used (y). The mathematical relationship between x1, x2, and y is expected to be
y=c1x1+c2x2+c3 where c1 is the average CPU power being used for programs, c2 is the average
CPU power being used for processes and c3 is average CPU power being used when no process
or program is running. In addition to these, some system constants that aid threat analysis must
be determined. These include the minimum and maximum CPU power for programs and the
minimum and maximum CPU power for processes. The mathematical relationship between x1,
x2 and y is the CPU usage model. When determined, the CPU usage model can be used to analyze
changes in the CPU usage that indicate threats in the system.
PROGRAM USAGE MODEL
To determine the program usage model, the dependent and independent variables that must be
sampled are the time spent using the program (y) and the number of functions used (x). In
addition, two constants must be determined: the minimum number of functions used and the
maximum number of functions used. The relationship between y and x, determined after sampling
various x and y values, is the program usage model, denoted by y=f(x).
HOST USAGE MODEL
The host usage model is composed of four independent variables: memory usage (x1), session
usage (x2), CPU usage (x3), and program usage (x4), derived from their respective usage models.
The dependent variable that must be sampled is the time spent on the host (y). Any relationship
determined between the dependent and the independent variables is the host usage model. The
resulting host usage model is denoted y=f (x1, x2, x3, x4).
BATTERY USAGE MODEL
The battery usage model is made up of the average usage of CPU, average memory usage and
the average usage of how a session behaves in the system. These are the independent variables.
The dependent variable is the battery lifespan. The independent variables are derived from their
respective micro usage models.
DEVICE USAGE MODEL
The device usage model is made up of a battery usage model, a host usage model, and the time
spent on the device. The usage models that make up the device usage model compute the
average micro usage and try to relate that with the time spent on the device. The time spent on
the device is the dependent variable.
SERVER USAGE MODEL
The server usage model is made up of the CPU time being used, the memory space being used
and the number of processes running. These variables are used to form two different micro usage
models. As such, there are two dependent variables, CPU time and memory space. The
independent variable for both micro usage models is the number of processes running.
PORT USAGE MODEL
The port usage model is made up of the time elapsed during communication, number of
programs that use the port and the number of paired ports. The number of paired ports is the
dependent variable and the remaining variables are the independent variables.
NETWORK USAGE MODEL
The network usage model is made up of average port usage, average server usage, average host
usage, the average size of data transmitted on the network, and time spent on the network. The
first three variables are the independent variables. The remaining two are the dependent
variables. As such, two micro usage models make up the network usage model.
AGGRESSIVE USAGE DETECTOR
This model is a utility that detects aggressive behavior on a system. It is modelled just like the
various micro usage models. Various factors that determine aggressive behavior during system
usage are used to determine the mathematical representation of this utility. Aggressive behavior
includes aggressive use of major system resources, and aggressive use of system components
with limited resources.
The average aggressive behavior and its standard deviation are determined. Any system
occurrence that indicates the average aggressive behavior, or that average plus or minus its
standard deviation, is considered a threat and must be halted, raised as an alarm, or stored
for audit purposes.
FALSE ALARM DETECTOR
The false alarm detector is a utility that detects normal system usage that otherwise may be seen
as threat. Occurrences that meet the criteria for false alarms are normal usage that seems to put
the entire usage of the system into a false state of vibration or anarchy. Such usage occurrences
are as such prioritized as normal optimal usage. The remedy for the vibrations such usage
occurrences cause is delay in other normal usage occurrences in the system.
The state and magnitude of other system occurrences plus the state and magnitude of
the normal optimal usage determine the impact of the perceived anarchy. To make the system
for which this utility is developed more convenient to use, the average delay time
and its standard deviation must be determined. This utility is part of the normal usage. The utility is
modelled just like the aggressive usage detector.
SPECIAL PARAMETERS OF THE USAGE PROFILE
This section discusses special parameters of our usage profile. These parameters include the
average usage, the usage standard deviation, the minimum usage, the maximum usage and the
most frequent usage value recorded.
The average usage is the predicted average usage after the normal usage model function
has been determined. The usage standard deviation is the standard deviation of the predicted
normal usage function. The minimum and maximum usage values are the minimum and
maximum usage predicted using the usage equation. These parameters together with usage
rates, threat model constants and other usage constants are used in analyzing and detecting
threats.
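The special parameters listed above can be computed from an array of predicted usage values. The following sketch is only an illustration: the class name UsageParameters, its field names and the sample values are assumptions, and the population form of the standard deviation is assumed.

```java
import java.util.HashMap;
import java.util.Map;

final class UsageParameters {
    final double mean, stdDev, min, max, mostFrequent;

    /** Computes the special parameters from predicted usage values (must be non-empty). */
    UsageParameters(double[] predicted) {
        double sum = 0.0;
        double lo = Double.POSITIVE_INFINITY;
        double hi = Double.NEGATIVE_INFINITY;
        Map<Double, Integer> counts = new HashMap<>();
        double mode = predicted[0];
        int bestCount = 0;
        for (double v : predicted) {
            sum += v;
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
            int count = counts.merge(v, 1, Integer::sum);
            // On ties, the value that reached the highest count first is kept.
            if (count > bestCount) {
                bestCount = count;
                mode = v;
            }
        }
        mean = sum / predicted.length;
        double squared = 0.0;
        for (double v : predicted) squared += (v - mean) * (v - mean);
        stdDev = Math.sqrt(squared / predicted.length); // population form, an assumption
        min = lo;
        max = hi;
        mostFrequent = mode;
    }

    public static void main(String[] args) {
        UsageParameters p = new UsageParameters(new double[] {3, 5, 5, 7, 10});
        System.out.println("mean=" + p.mean + " sd=" + p.stdDev + " min=" + p.min
                + " max=" + p.max + " mostFrequent=" + p.mostFrequent);
    }
}
```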
BUILDING THE THREAT PROFILE OF THE SYSTEM
To build a threat profile of a system we use differential equations of the usage profile. When we
study differential equations of the usage profile, we will arrive at occurrences in the system that
deviate from our usage profile. The differential coefficient of the usage profile is known as a
threat model. Threats in the system occur as a result of changes in the usage profile that are
beyond a certain acceptable threshold called the standard deviation of the usage profile. A threat
model on the other hand is an abstract representation of this change in our system that is beyond
the acceptable threshold. In addition to that, any state of the system that is unusual or any state
transition in the system that is unusual is also a threat. We will look at how to detect and prevent
such unusual states and transitions in the system using hidden Markov models.
Integration can be performed on a threat model to determine the source of the threat.
Integration is a reverse operation for differentiation in calculus. A threat model that can perform
integration operations can be called a novel self integrating data structure. The sections that
follow will look at how to analyze and prevent threats using the usage profile. Also, how to
determine the sources of these threats using a novel self integrating threat model will be
discussed.
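For a usage equation that is a polynomial in one variable, the differentiation and integration operations described above can be sketched as below. This is an assumption-laden illustration rather than the novel self integrating data structure itself: the usage equation is stored as an array of coefficients (index i holds the coefficient of x^i), and the class name ThreatModelCalculus and its methods are hypothetical.

```java
import java.util.Arrays;

final class ThreatModelCalculus {
    /** Differential coefficient of a polynomial; coeffs[i] multiplies x^i. */
    static double[] differentiate(double[] coeffs) {
        if (coeffs.length <= 1) return new double[] {0.0};
        double[] derivative = new double[coeffs.length - 1];
        for (int i = 1; i < coeffs.length; i++) derivative[i - 1] = i * coeffs[i];
        return derivative;
    }

    /** Integral of a polynomial, with c as the constant of integration. */
    static double[] integrate(double[] coeffs, double c) {
        double[] integral = new double[coeffs.length + 1];
        integral[0] = c;
        for (int i = 0; i < coeffs.length; i++) integral[i + 1] = coeffs[i] / (i + 1);
        return integral;
    }

    public static void main(String[] args) {
        double[] usage = {4.0, 3.0, 2.0};              // y = 2x^2 + 3x + 4 (made up)
        double[] threat = differentiate(usage);        // y' = 4x + 3
        double[] recovered = integrate(threat, 4.0);   // back to the usage equation
        System.out.println(Arrays.toString(threat));
        System.out.println(Arrays.toString(recovered));
    }
}
```

Integrating the threat model recovers the usage equation only up to the constant c, so that constant has to be supplied from the known usage profile.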
THREAT ANALYSIS AND DETECTION
To do threat analysis in a system and abort processes that initiated those threats, linear and
nonlinear programming techniques can be used. The goal here is to minimize the threat
occurrence frequency and the overall impacts associated with the threat and optimize the usage
equation. In addition to these two goals, there are some constants that aid threat analysis. These
constants are associated with the usage profile and the threats in the system.
Examples of these constants may be the rate at which usage is increasing with respect to
a particular usage variable or the rate at which the threat impact and frequency increases with
respect to a particular variable in the usage profile and other special parameters associated with
the usage profile equation.
The average usage, its standard deviation and the threat model equation make up the
threat model. The average usage and standard deviation are constants in the threat model. Using
the threat model equation, the average usage and standard deviation, threat analysis can be
done using linear and nonlinear programming. The goal is to minimize threats using the threat
model equation as the objective function and the average usage and standard deviation as
constraints. Other parameters that may be used as constraints include the rate at which usage
is increasing with respect to a particular usage variable or the rate at which the threat impact
and frequency is increasing with respect to a particular usage variable.
THREAT PREDICTION
This section discusses how to predict threats in a system. The network usage model discussed in
this chapter and its associated threat model will be used to demonstrate how to predict or detect
a threat in a system. As discussed in the previous section, threats can be detected using linear and
nonlinear programming. The network usage model equation and its associated threat model
equation are the objective functions.
The constraints that will be used are the average network usage and its standard
deviation, and other parameters such as the rate at which the network threat increases with
respect to other network usage profile components such as average host usage, average server
usage, average port usage, average time the network operates, average data transmitted on the
network. The goal of the linear or nonlinear programming is to optimize the usage such that
usage is within the range of the average usage minus its standard deviation and the average
usage plus its standard deviation. These are the lower and upper bounds of our objective
function. Every combination of system variables whose usage is within this usage range
minimizes threat in the system.
Since the average port, host and server usage are derived from their corresponding usage
profile models, the linear and nonlinear programming analysis will be done independently for
these ones. When a threat is predicted in a system, the chance of it being accurate is dependent
on the usage value at that instance and whether it is within the range of the acceptable usage.
This is constructed using the average usage and its standard deviation. Any usage value that is
less than the average usage minus its standard deviation is a threat. Also, a usage value that is
greater than the average usage plus its standard deviation is a threat. That means that any
predicted threat at a point where the predicted usage is within the usage range has a high chance
of being false.
In addition to that, the actual and predicted usage values can be used to determine the
chance that the predicted threat is accurate. If the difference between them is high, there is a
chance that the predicted usage may be wrong. Since the predicted usage and the threat models
are derived from the usage profile equation, there is a chance the predicted threat is also false.
Finally, the closer the correlation coefficient of the usage profile equation is to zero, the higher
the chance that the predicted usage and its associated threat values are wrong. Usage model
functions with a correlation coefficient of 0.6 and above indicate that the predicted usage values
and predicted threat values are likely to be accurate. These values are obtained from the usage model
function and the threat model function respectively, which are modeled using the relevant system
variables that make it possible to model system usage and system threats.
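The correlation check can be sketched as follows, using the Pearson correlation coefficient. The 0.6 cut-off comes from the text above; applying it to the absolute value of the coefficient is an assumption, as are the class name FitQuality, its methods and the sample values.

```java
final class FitQuality {
    /** Pearson correlation coefficient between x and y. */
    static double correlation(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            syy += y[i] * y[i];
            sxy += x[i] * y[i];
        }
        double numerator = n * sxy - sx * sy;
        double denominator = Math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy));
        if (denominator == 0.0) {
            throw new IllegalArgumentException("x or y has no variation");
        }
        return numerator / denominator;
    }

    /** Assumption: the 0.6 cut-off is applied to the magnitude of the coefficient. */
    static boolean isReliable(double r) {
        return Math.abs(r) >= 0.6;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] strong = {2, 4, 6, 8};
        double[] weak = {1, -1, 1, -1};
        System.out.println(isReliable(correlation(x, strong)));
        System.out.println(isReliable(correlation(x, weak)));
    }
}
```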
RISK ANALYSIS IN A SYSTEM
To do risk analysis in a system, the frequency at which threats in the system occur and the impact
they have on the system must be known. When a frequency table is constructed for all threats
and their associated impacts stored, it becomes easy to analyze risks associated with a system.
When a threat is predicted, the likelihood of the threat occurring in the system can be computed
using the threat frequencies.
The impacts various threats have can also be determined based on the types of threats
and other parameters such as the number of such threats, the speed at which they occurred and
the resources they affected or damaged. Risk in a system is computed as the product of the
likelihood of threat occurrence and the impact that threat occurrence has on the system. These
concepts are the basics for developing a risk analysis system using the techniques we have
discussed so far.
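A frequency table and the risk computation can be sketched as below. The likelihood is taken as the relative frequency of a threat type among all recorded threats, which is one reasonable reading of the description above. The class name RiskTable, the threat type names and the impact scale are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RiskTable {
    private final Map<String, Integer> counts = new LinkedHashMap<>();
    private final Map<String, Double> impacts = new LinkedHashMap<>();
    private int total = 0;

    /** Records one occurrence of a threat type together with its impact score. */
    void record(String threatType, double impact) {
        counts.merge(threatType, 1, Integer::sum);
        impacts.put(threatType, impact); // assumption: keep the latest impact per type
        total++;
    }

    /** Relative frequency of the threat type among all recorded threats. */
    double likelihood(String threatType) {
        return total == 0 ? 0.0 : counts.getOrDefault(threatType, 0) / (double) total;
    }

    /** Risk = likelihood of occurrence multiplied by impact. */
    double risk(String threatType) {
        return likelihood(threatType) * impacts.getOrDefault(threatType, 0.0);
    }

    public static void main(String[] args) {
        RiskTable table = new RiskTable();
        table.record("bruteForceLogin", 8.0);
        table.record("bruteForceLogin", 8.0);
        table.record("bruteForceLogin", 8.0);
        table.record("memorySpike", 4.0);
        System.out.println("bruteForceLogin risk: " + table.risk("bruteForceLogin"));
        System.out.println("memorySpike risk: " + table.risk("memorySpike"));
    }
}
```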
PROPERTIES AND METHODS OF THE NOVEL SELF INTEGRATING STRUCTURE
The most useful properties or characteristics of the data structure that represents our threat model
include, to mention a few, the names of network software or host application software, the version
numbers of network and host software, license information that includes the date the software was
purchased or released and the number of years before renewal, and the IP address and MAC address
of a host on a network.
The methods of such a large simulation object may include a method for computing
the integral of a threat model, another for computing the differential coefficient of the predictive
usage profile equation, and a method for computing the differential equation of a network or host
threat model. These are mostly the methods needed to perform the major
calculus operations that support the novel calculus simulation on a network to detect threats
and their sources on a wireless network. Besides these, it may be necessary to implement
methods that retrieve hidden network identity such as IP and MAC addresses on a local area network.
INTERPRETATION OF THREAT MODEL INTEGRALS
Since the novel self integrating data structure is a programmed threat model, it is important to
discuss the meaning of its integrals. The integrals represent the source of the original threat.
For example, integrating the threat model may reveal the function, software, host
or network from which the threat originated. With properties like software name, version
number, IP and MAC addresses, it becomes easy to pinpoint the source of the threat.
If the integral of a threat model looks like the usage profile equation of a function of the
system under examination, then that function from the system under examination can be
predicted as the source of the threat. Similarly, if the integral is similar to the usage profile
equation of a software, host, or network that forms part of the system which is being
investigated, then that threat can be predicted to be from that software, host or network.
APPLICATION OF HIDDEN MARKOV MODELS FOR STUDYING VARIOUS STATES IN THE SYSTEM
AND THREATS THAT CAN BE DETECTED
Hidden Markov models are machine learning models that are used to model states in a system,
the sequence in which they occur and the associated probabilities for each state transition.
When a system has a set of states in which it usually falls and it can be predicted or established
that each new state is dependent on the previous states, then hidden Markov models can be
used to learn the state transitions that usually happen in the system. It must be stated that the
sequence in which states occur in a system can be characterized by a parametric random process.
Also, the probability associated with each state transition is independent of the time at which the
transition occurred in the system.
For computer systems which have occurrences that happen based on a parametric
random process, these occurrences can be seen as the set of states in the system. Some of these
occurrences may be the point at which the system is at its optimal, maximum or minimum usage,
or the point at which the system attains average usage. It must be stated that, if the various
states of the system’s usage are determined from the usage profile which is made up of the usage
equation and the mean and standard deviation model, then any state that is unusual is seen as
a threat. The usage states can be the minimum, average and maximum usage. Also the various
state transitions from one of these states to another can be determined. As such, any state which
is less than the minimum usage or greater than the maximum usage can be seen as an unusual
behavior. Also any transition from one of these states to another which is not captured as a
normal state transition can be seen as a threat.
Additionally, when the set of threat types that happen in the system is determined, it
becomes possible to study the sequence in which these threats occur in the system and the
various transitions between the threats using hidden Markov models. Also, the various usage
points including the optimal, the minimum and the average usage, and how the system moves
between them, can be studied using hidden Markov models. Because various occurrences and
threats can be studied using hidden Markov models, it becomes possible to predict the next
occurrence or threat that will happen on a host or a computer network.
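The idea of learning state transitions and predicting the next state can be illustrated with a plain Markov chain over directly observed usage states. This is a simplification: it is not a full hidden Markov model, which would also need emission probabilities and an algorithm such as Baum-Welch. The class name UsageStateChain, the state encoding (0 = minimum, 1 = average, 2 = maximum usage) and the observed sequence are assumptions.

```java
final class UsageStateChain {
    final double[][] p; // p[from][to] = estimated transition probability

    /** Estimates transition probabilities from an observed sequence of state indices. */
    UsageStateChain(int numStates, int[] observed) {
        double[][] counts = new double[numStates][numStates];
        for (int i = 0; i + 1 < observed.length; i++) {
            counts[observed[i]][observed[i + 1]]++;
        }
        p = new double[numStates][numStates];
        for (int from = 0; from < numStates; from++) {
            double rowSum = 0.0;
            for (double c : counts[from]) rowSum += c;
            if (rowSum == 0.0) continue; // state never left during learning
            for (int to = 0; to < numStates; to++) p[from][to] = counts[from][to] / rowSum;
        }
    }

    /** A transition never seen while learning is treated as unusual. */
    boolean isUnusual(int from, int to) {
        return p[from][to] == 0.0;
    }

    /** The state with the highest estimated probability of following the given state. */
    int mostLikelyNext(int from) {
        int best = 0;
        for (int to = 1; to < p[from].length; to++) {
            if (p[from][to] > p[from][best]) best = to;
        }
        return best;
    }

    public static void main(String[] args) {
        int[] observed = {0, 1, 1, 2, 1, 0, 1, 2, 1}; // made-up usage states
        UsageStateChain chain = new UsageStateChain(3, observed);
        System.out.println("next after average usage: " + chain.mostLikelyNext(1));
        System.out.println("minimum to maximum unusual: " + chain.isUnusual(0, 2));
    }
}
```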
Threat sources can also be predicted using threat models. When threat models are
integrated, they give a general idea about the source of the threat. With such knowledge and
ability, the next threat or occurrence that has a higher likelihood of happening on a host or
network can be predicted by applying hidden Markov models. As such, occurrences can
be prevented if they are estimated to be disastrous. Also, if for instance, for some reason the
optimal or minimal usage must be reached, it becomes possible to study ways of optimizing the
transition from the current state or predicted next state to the required state. This makes it
possible to move from a particular usage point to the desired usage point.
This approach to threat detection and usage optimization makes it possible to build
anomaly based intrusion detection systems that are correct, prompt and increase optimal use of
the system. The anomaly based intrusion detection systems built using these techniques are
expected to be correct because the threat models are derived from usage models that are built
using the same approaches, and the threat prediction and prevention mechanisms are designed
using the same techniques. Also, there are likely to be fewer false
alarms since the threats predicted on hosts or networks come from threat models designed from
such robust methods.
An example of a kind of cyber security threat that this approach can be used to model is
a network problem where a student is determined or predicted to be sending threatening or
socially unacceptable emails to colleagues. Typically, his identity is hidden on the network on
which he sends the emails. As such, it is difficult to determine the likelihood that he will send
such threatening emails on a particular day or hour so that he could be identified and
held to account. Using hidden Markov models, a usage profile of the email system could be
developed. This will make it possible to determine the day or hour in which he is likely to
send such threatening emails so that his identity can be found and the problem solved.
EXPERIMENTING THE USAGE PROFILE AND THREATS ASSOCIATED WITH IT
In this chapter, we discuss the experiment that was conducted to determine the usage of a
computer system. We also discuss how to simulate the threat and usage models with the hope
of developing a threat detection system. The experiment was conducted by implementing the
usage models in Java. The implementation uses an interface that captures what makes up a usage
model. Each micro usage model implements the interface. The interface is given in Appendix B.
Additionally, a multithreading object was implemented for learning the system’s usage,
monitoring the system, and determining the relationship that describes the usage model. This
object is given in Appendix A.
The interface is made up of eight functions: computeval() for computing the usage value
based on system variables, findchange() for finding any change in the usage model, learnsys(int
t) for learning the system within a time frame, findrelationship() for determining the relationship
that represents the usage model, monitor(int t) for monitoring the system, showalarm(String
info) for displaying alarms, haltprocess() for halting system processes that are threats, and
predictvals() for predicting system usage values.
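A possible shape for such an interface is sketched below. The eight method names and their parameters are taken from the description above; the return types, the interface name UsageModel and the StubUsageModel class are assumptions made for illustration and may differ from the code in Appendix B. The stub reads from a fixed array instead of a live system and uses the mean of the learnt samples as its "relationship".

```java
import java.util.ArrayList;
import java.util.List;

interface UsageModel {
    double computeval();        // current usage value from system variables
    double findchange();        // change relative to the learnt usage
    void learnsys(int t);       // learn the system over t sampling steps
    void findrelationship();    // determine the relationship for the usage model
    void monitor(int t);        // monitor the system over t sampling steps
    void showalarm(String info);
    void haltprocess();
    double[] predictvals();
}

/** Hypothetical stub: samples come from a fixed array, not from a real system. */
final class StubUsageModel implements UsageModel {
    private final double[] source;
    private final double tolerance;
    private final List<Double> samples = new ArrayList<>();
    final List<String> alarms = new ArrayList<>();
    private double mean;
    private int cursor = 0;

    StubUsageModel(double[] source, double tolerance) {
        this.source = source;
        this.tolerance = tolerance;
    }

    public double computeval() { return source[cursor % source.length]; }

    public double findchange() { return computeval() - mean; }

    public void learnsys(int t) {
        for (int i = 0; i < t; i++) {
            samples.add(computeval());
            cursor++;
        }
    }

    public void findrelationship() {
        mean = samples.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }

    public void monitor(int t) {
        for (int i = 0; i < t; i++) {
            double change = findchange();
            if (Math.abs(change) > tolerance) {
                showalarm("usage deviates from learnt mean by " + change);
                haltprocess();
            }
            cursor++;
        }
    }

    public void showalarm(String info) {
        alarms.add(info);
        System.out.println("ALARM: " + info);
    }

    public void haltprocess() { /* no-op in this stub */ }

    public double[] predictvals() { return new double[] {mean}; }
}
```

A caller would typically invoke learnsys, then findrelationship, then monitor, in that order.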
The multithreading object implements the run method of the Thread class. The run method
performs three main functions. It calls the learnsys method of the usage model
when the system’s usage needs to be learnt. It calls the findrelationship method of the usage
model when the system usage profile equation needs to be determined. It also calls the monitor
method of the usage model when the system has to be monitored for various activities.
During the experiment, the various independent and dependent variables of the usage models
were sampled. Each sample was captured for the particular usage model using random
numbers. The samples were captured within a time frame. The time was in seconds and refers
to the t variable that the learnsys method of the usage model takes. Each second, a sample was
captured for each usage model. After the samples had been captured, the relationship
that describes the usage model for the system was determined using the mathematical
modelling techniques reviewed in the previous chapter.
Also, after the usage profile had been built by learning the system and capturing the samples that
help determine the various usage models, the system was monitored within a time frame. The
time was set in seconds. This time refers to the t variable that the monitor method of the usage
model takes. This was used to analyze the system in order to have an idea about the activities
going on in the system. Based on the usage profile built, activities that deviate from the
usage profile are flagged as incidents that indicate threats in the system.
Additionally, during the experiment, processes that are independent were grouped first and
executed together. Processes that depended on other processes had to wait for those processes
to finish first. This was important because it makes it possible to determine the usage
models of micro usage models that form part of other usage models first. Later these micro usage
models can be used to determine the usage model of the parts of the system that depend on
them. This can be seen in the main method in Appendix C.
It must be stated that because the usage model for authentication was determined to be a
rational function, logs must be taken on both sides of the relation as part of the experiment in
order to reduce the relation to a linear form. The original function is Y=c1(x2/x1) +c2. The
linear form used is log Y= log c1+ log x2 – log x1 + log c2. Since log c1 and log c2
are constants, let us denote them by k1 and k2 respectively. Additionally, let B= log Y, let j1=
log x1 and let j2= log x2. Therefore, the linear form of the usage for authentication is B= j2- j1 +
k1 + k2. Since k1 + k2 is a constant, let it be represented by k. As such, B= j2- j1 + k, where B is the
dependent variable and j2 and j1 are the independent variables. When B, j2, and j1 are sampled,
Y=c1(x2/x1) +c2 can be determined. Note that the logarithm of a sum is not the sum of the
logarithms, so this linear form is exact only for the multiplicative relation Y=c1c2(x2/x1); with
an additive c2 it is an approximation. The relation Y=c1(x2/x1) +c2 is, however, already linear in
the ratio x2/x1, so Y can also be regressed on x2/x1 directly.
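Because Y=c1(x2/x1) +c2 is linear in the ratio z = x2/x1, the constants c1 and c2 can be estimated by ordinary least squares of Y on z, without taking logarithms. The sketch below shows this alternative; it is offered for illustration and is not the log-based procedure used in the experiment. The class name RatioFit and the sample values are hypothetical.

```java
final class RatioFit {
    /** Fits y = c1*(x2/x1) + c2 by least squares on z = x2/x1; returns {c1, c2}. */
    static double[] fit(double[] x1, double[] x2, double[] y) {
        int n = y.length;
        double sz = 0, sy = 0, szz = 0, szy = 0;
        for (int i = 0; i < n; i++) {
            double z = x2[i] / x1[i]; // x1 is assumed to be non-zero
            sz += z;
            sy += y[i];
            szz += z * z;
            szy += z * y[i];
        }
        double denom = n * szz - sz * sz;
        if (denom == 0.0) {
            throw new IllegalArgumentException("the ratio x2/x1 must vary across samples");
        }
        double c1 = (n * szy - sz * sy) / denom;
        double c2 = (sy - c1 * sz) / n;
        return new double[] {c1, c2};
    }

    public static void main(String[] args) {
        // Made-up samples for x1, x2 and the authentication time y.
        double[] x1 = {1, 2, 4, 5};
        double[] x2 = {2, 8, 4, 20};
        double[] y = {7, 13, 4, 13};
        double[] c = fit(x1, x2, y);
        System.out.println("Y = " + c[0] + " * (x2/x1) + " + c[1]);
    }
}
```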
The CPU and the memory usage models, on the other hand, are multiple linear relations. The original
relations are of the form y=c1x1+c2x2+c3 where x1 and x2 are the independent variables. The
original relations must be reduced to their simple linear form. To do this, determine y=b0+bx for
each independent variable. The sum of the various b0 equals c3. Each b corresponds to the
constant associated with the independent variable for which y=b0+bx was determined. For
example, the b of the y=b0+bx determined for x1 equals c1 and that for x2 equals c2. When
x1, x2, and y are sampled and the various y=b0+bx determined, y=c1x1+c2x2+c3 can be
determined completely.
CHALLENGES WITH PROJECT
This section of the report mentions the challenges encountered during the research. First of all,
it is important to state that it was not easy to get audit data so the system variables that were
enumerated in this research were based on careful selection of critical aspects of a computer
system that were seen as good to be used.
Secondly, it is important to note that during the simulation of the usage and threat
models, the multithreading object discussed earlier did not work as expected. As such, the
capturing of system incidents while learning the system’s usage was done using a method that
captures the information each second. This method uses the sleep method of the Thread class,
which is a static method. A while loop that runs for t seconds was implemented
using the sleep method, which makes the system wait for a second before capturing the next set
of sample data.
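The sleep-based sampling loop described above might look like the following sketch. The class name PeriodicSampler, the sample method and the use of a DoubleSupplier as the data source are assumptions; the interval is a parameter so that the one-second wait can be shortened when trying the code out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.DoubleSupplier;

final class PeriodicSampler {
    /** Captures one sample per interval, t times, pausing with Thread.sleep in between. */
    static List<Double> sample(DoubleSupplier source, int t, long intervalMillis) {
        List<Double> samples = new ArrayList<>();
        int elapsed = 0;
        while (elapsed < t) {
            samples.add(source.getAsDouble());
            try {
                Thread.sleep(intervalMillis); // static method of the Thread class
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
            elapsed++;
        }
        return samples;
    }

    public static void main(String[] args) {
        // Random numbers stand in for real system variables, as in the experiment.
        Random rng = new Random(42);
        List<Double> samples = sample(rng::nextDouble, 3, 1000L);
        System.out.println("captured " + samples.size() + " samples");
    }
}
```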
TOOLS AND COMPUTER PACKAGES
This chapter discusses the tools and computer packages that were used throughout this research
project. We will also look at the programming languages, database platforms and development
frameworks that can be used to develop an anomaly based intrusion detection system for ecommerce sites
using the concepts we have discussed in this paper. The simulation was implemented in
Java. It was a console based simulation. Java was chosen for its object oriented concepts such as
encapsulation, inheritance, interfaces, objects, and polymorphism.
To implement an intrusion detection system using the results of this research, the following
tools will be essential: Bootstrap, CodeIgniter, MySQL Database Management System, SQLite,
SQLyog, and Eclipse. These tools are best suited for intrusion detection systems developed for
ecommerce sites. The programming languages and platforms that will be used are PHP and Android. PHP is
for the desktops and laptops that connect to the ecommerce sites and Android is for mobile
phones that use the ecommerce sites.
Bootstrap and CodeIgniter are web development frameworks. Bootstrap is for frontend
development and CodeIgniter is a backend framework for PHP developers. For Android, Eclipse
can be used as the IDE. MySQL and SQLyog are for the database
servers that will run on the ecommerce site as part of the intrusion detection system
implementation. SQLite is for the databases that run on the Android implementations that form
part of the intrusion detection system developed for the ecommerce website.
With all these tools, frameworks and packages, developers are ready to develop
intrusion detection systems for ecommerce sites using the concepts in this research paper. It is
expected that the micro usage models discussed will be integral libraries
implemented in PHP and Android as part of an implementation for ecommerce sites or any group
of web or mobile application systems.
CONCLUSION AND DISCUSSION
To end this discussion, it is worth mentioning that the normal usage models and threat models
experimented with in this paper represent a computer system and its associated threats. These
threats can be analyzed periodically and audited as part of a computer security audit. This will
fuel development of a risk analysis system. A risk analysis system, threat detection system and
normal usage system developed from experimenting the usage and threat models will make up
a mobile security audit framework that can be used for maintaining cyber security on computer
systems. When practices and processes for maintaining this framework are drafted and adhered
to, it will make it easy to maintain cyber security on various computer systems.
Additionally, it can be established that using the differential equations technique, the
novel self integrating data structure, and the linear and nonlinear programming techniques, threats
on a system can be analyzed and detected. To halt such threats, the intrusion detection system
developed using the techniques stated above must possess certain qualities. These qualities
include correctness, promptness, and ease of use. Correctness means how well the intrusion
detection system can detect threats. This is important because correctness affects the rate at
which a predicted threat is false or true. Promptness is related to the time it takes to detect or
halt a threat, and ease of use is related to how well the intrusion detection system supports
convenient use of the computer system for which it is developed.
The techniques we have discussed make it possible to achieve correctness, promptness
and ease of use. The usage model function, with its associated average usage and standard
deviation, makes it possible to ensure correctness of the intrusion detection system. This is
because the statistical data sampled for developing the intrusion detection system lies within the
range of acceptable usage. The average usage and standard deviation are computed using
statistical models. One such model used in this research is the moments model, also called the
mean and standard deviation model. With this statistical model and the usage model equation, it
becomes possible to ensure correctness of the intrusion detection system.
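The range check implied by the mean and standard deviation model can be sketched as follows. This is only an illustration under stated assumptions: the class name UsageRange, the band width k, and the sample values are hypothetical and do not come from the paper's implementation.

```java
/** Sketch of a mean and standard deviation usage check (all names are illustrative). */
final class UsageRange {
    final double mean;   // average usage learned from the samples
    final double stdDev; // sample standard deviation of the usage
    final double k;      // assumed width of the acceptable band, in standard deviations

    UsageRange(double[] samples, double k) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double sum = 0;
        for (double s : samples) {
            sum += s;
        }
        this.mean = sum / samples.length;
        double squares = 0;
        for (double s : samples) {
            squares += (s - mean) * (s - mean);
        }
        this.stdDev = Math.sqrt(squares / (samples.length - 1));
        this.k = k;
    }

    /** A usage value is treated as normal when it lies within mean +/- k * stdDev. */
    boolean isNormal(double usage) {
        return Math.abs(usage - mean) <= k * stdDev;
    }

    public static void main(String[] args) {
        double[] learned = {40, 42, 38, 41, 39}; // hypothetical usage samples
        UsageRange range = new UsageRange(learned, 3.0);
        System.out.println("mean=" + range.mean + " stdDev=" + range.stdDev);
        System.out.println("41 normal? " + range.isNormal(41));
        System.out.println("90 normal? " + range.isNormal(90));
    }
}
```

A wider band (larger k) tolerates more variation in usage at the cost of missing smaller deviations, so the value of k would have to be chosen per system.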
To achieve promptness, multithreading is applied to analyze, predict, detect and halt
threats. All threat alarms and detection must use multithreading. Multithreading is a
programming concept that allows several threads of execution to run concurrently on the
computer. This concept makes it possible to predict multiple threats, perform multiple threat
analyses and halt or raise alarms for multiple threats in a computer system at once. This makes
the threat detection system prompt. Multithreading may be optimized to halt, prevent or raise
alarms for threats with high magnitude and impact first. Prioritizing the detection of such threats
also uses multithreading.
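One way to combine multithreading with prioritization of high-magnitude threats is a shared priority queue drained by worker threads. The sketch below is an assumption about how this could be done, not the paper's implementation: ThreatDispatcher, Threat, report and handleAll are hypothetical names, and the handling step is a placeholder for raising an alarm or halting a process.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Sketch: worker threads handle reported threats, highest magnitude first. */
final class ThreatDispatcher {
    static final class Threat {
        final String name;
        final double magnitude; // assumed numeric measure of magnitude and impact
        Threat(String name, double magnitude) {
            this.name = name;
            this.magnitude = magnitude;
        }
    }

    // The head of the queue is always the pending threat with the largest magnitude.
    private final PriorityBlockingQueue<Threat> queue = new PriorityBlockingQueue<>(
            16, Comparator.comparingDouble((Threat t) -> t.magnitude).reversed());
    private final List<String> handled = Collections.synchronizedList(new ArrayList<>());

    void report(Threat t) {
        queue.add(t);
    }

    /** Drains the queue with the given number of worker threads and returns the handling order. */
    List<String> handleAll(int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                Threat t;
                while ((t = queue.poll()) != null) {
                    handled.add(t.name); // placeholder for alarm or halt logic
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(handled);
    }

    public static void main(String[] args) {
        ThreatDispatcher dispatcher = new ThreatDispatcher();
        dispatcher.report(new Threat("port scan", 2.0));
        dispatcher.report(new Threat("session hijack", 9.0));
        dispatcher.report(new Threat("failed login burst", 5.0));
        System.out.println(dispatcher.handleAll(2));
    }
}
```

With a single worker the threats are handled strictly in order of decreasing magnitude. With several workers each one still takes the most severe pending threat, but the order in which handling completes may interleave.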
Ease of use is achieved using the mean and standard deviation model. Without that
model, there is no acceptable range of usage. The average usage and its standard deviation
therefore prevent a rigid usage model and so make usage convenient. Periodic
audits also ensure that the normal usage function and its mean and standard deviation model
are up to date. Applying machine learning techniques such as data mining also ensures that
the usage model and its associated mean and standard deviation model are up to date with
actual usage of the system.
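Keeping the mean and standard deviation up to date does not require storing every past sample. The sketch below uses Welford's online algorithm, a standard technique that the paper does not name, to fold each new observation into the running statistics. The class name RunningUsageStats and the sample values are hypothetical.

```java
/** Sketch: running mean and standard deviation, updated one usage sample at a time. */
final class RunningUsageStats {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    /** Folds one new usage observation into the statistics (Welford's update). */
    void add(double usage) {
        count++;
        double delta = usage - mean;
        mean += delta / count;
        m2 += delta * (usage - mean);
    }

    double mean() {
        return mean;
    }

    /** Sample standard deviation; zero until at least two samples have been seen. */
    double stdDev() {
        return count > 1 ? Math.sqrt(m2 / (count - 1)) : 0.0;
    }

    public static void main(String[] args) {
        RunningUsageStats stats = new RunningUsageStats();
        for (double usage : new double[]{40, 42, 38, 41, 39}) { // hypothetical samples
            stats.add(usage);
        }
        System.out.println("mean=" + stats.mean() + " stdDev=" + stats.stdDev());
    }
}
```

A periodic audit could call add for each newly observed usage value, so that the acceptable range follows actual usage of the system.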
With monitoring mechanisms, intrusion detection systems developed using the normal
usage model and its associated mean and standard deviation model help ensure ease of use. This
is because an administrator monitoring a network or computer system using an intrusion
detection system can prevent relevant threats with a single click. The administrator can also use
several configurations of the intrusion detection system to halt or prevent normal threats. All
these mechanisms, together with the mean and standard deviation model of the normal usage,
make the computer or network system being monitored easy to use.
It is expected that, using Boolean algebra and the calculus of Boolean functions, the normal
usage model can have a hardware representation. Research into how to implement this hardware
representation can draw on the same Boolean algebra and calculus of Boolean functions. These
concepts are related to concepts from computer organization and architecture, such as logic
gates, multipliers and the design of arithmetic and logic units, and to concepts from embedded
systems, such as the architecture of various embedded system implementations. These
architectures include hardware-only implementations and hardware/software implementations.
The two utilities that compose the usage model are essential for making usage more
convenient and preventing false alarms. This makes the intrusion detection system correct and
prompt at preventing threats. These utilities can be modelled mathematically as linear and
quadratic functions. For instance, the process that causes aggressive usage can be modelled as
quadratic or linear. If the process involves a download or data transfer on a network, then the
size of the data being transferred or downloaded determines whether the mathematical model is
linear or quadratic.
If the size of the data is huge, the function is quadratic. If it is small, the function is linear.
Aggressive behavior and false alarms are analogous to public road transportation. When a car
wants to overtake another on a highway, the car or truck ahead of it must give way to the
overtaking car. This prevents anarchy on the highway. At the same time, overtaking may look
like speeding and may create a false impression of anarchy on the road.
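The size-dependent choice between a linear and a quadratic model can be illustrated with a short sketch. The paper does not give a cutoff between small and huge transfers or any coefficients, so the threshold and both coefficients below are assumptions, as is the class name TransferUsageModel.

```java
/** Sketch: usage of a data transfer modelled as linear or quadratic in its size. */
final class TransferUsageModel {
    private final double sizeThreshold; // assumed cutoff between "small" and "huge" transfers
    private final double linearCoeff;   // assumed coefficient of the linear term
    private final double quadCoeff;     // assumed coefficient of the quadratic term

    TransferUsageModel(double sizeThreshold, double linearCoeff, double quadCoeff) {
        this.sizeThreshold = sizeThreshold;
        this.linearCoeff = linearCoeff;
        this.quadCoeff = quadCoeff;
    }

    /** Transfers larger than the threshold use the quadratic model. */
    boolean isQuadratic(double size) {
        return size > sizeThreshold;
    }

    /** Evaluates the linear model for small sizes and adds the quadratic term for huge ones. */
    double usage(double size) {
        double linear = linearCoeff * size;
        return isQuadratic(size) ? quadCoeff * size * size + linear : linear;
    }

    public static void main(String[] args) {
        TransferUsageModel model = new TransferUsageModel(100.0, 2.0, 0.01);
        System.out.println("usage(10)   = " + model.usage(10));
        System.out.println("usage(1000) = " + model.usage(1000));
    }
}
```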
If a process has a high chance of completing quickly and has indicated that it wants to
commence, then the state of the system usage is expected not to change. Any other
process that changes the state of the system usage so that the commencement of the new
process leads to vibrations or anarchy is aggressive and must be stopped.
The implication of this research is its application to critical systems such as medical
systems, banking systems and systems for monitoring incidents on a country’s road network.
Usage of banking and medical systems may pose social threats such as theft and increased
mortality rates. To prevent such threats, frameworks for auditing these systems must be used
periodically to secure human life and the savings and investments of citizens and organizations.
Incidents on a country’s road network can lead to prolonged court cases and health and
work hazards. Building incident monitoring systems for such road networks can reduce these
health and work hazards and the prolonged court cases in a country. The concepts discussed in
this paper may be beneficial to the design and implementation of such systems.
APPENDIX A
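import java.util.*;

// Runs one phase (learn, find relationship, or monitor) of a usage model on its own thread.
class LearnSysProcess extends Thread {
    String process_name;
    model usage; // the usage model this thread operates on
    int time;    // duration passed to learnsys
    int time2;   // duration passed to monitor
    int N;       // phase selector: 1 = learn, 2 = find relationship, 3 = monitor

    LearnSysProcess(String name, model x, int t, int n, int t2) {
        super(name); // name this thread so that start(), join() and isAlive() all refer to it
        process_name = name;
        usage = x;
        time = t;
        N = n;
        time2 = t2;
    }

    @Override
    public void run() {
        try {
            switch (N) {
                case 1: // learn system
                    usage.learnsys(time);
                    break;
                case 2: // find relationship
                    usage.findrelationship();
                    break;
                case 3: // monitor system
                    usage.monitor(time2);
                    break;
            }
        } catch (Exception e) {
            e.printStackTrace(); // report failures instead of silently discarding them
        }
    }
}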
APPENDIX B
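public interface model {
    public double computeval();         // compute the usage value of the model
    public double findchange();         // find the change in the usage value
    public void learnsys(int t);        // learn the system for a duration t
    public Object findrelationship();   // find the relationship between usage variables
    public void monitor(int t);         // monitor the system for a duration t
    public void showalarm(String info); // raise an alarm carrying the given information
    public void haltprocess();          // halt a process
    public void predictvals();          // predict usage values
}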
APPENDIX C
class normal_usage implements model {
    double usageVal;          // a system's usage value at a point in time
    network_usage net_usag;   // network behavior
    user_behavior u_bev;      // user behavior
    device_usage device_usag; // average behavior of a mobile device on the network
    memory_usage memory_usag;
    cpu_usage cpu_usag;
    program_usage program_usag;
    host_usage host_usag;
    double net_usag_time;     // how long the system functions
    usage_modeller math_modeller;

    normal_usage() {
        net_usag = new network_usage();
        u_bev = new user_behavior();
        device_usag = new device_usage();
        memory_usag = new memory_usage();
        cpu_usag = new cpu_usage();
        program_usag = new program_usage();
        host_usag = new host_usage();
        math_modeller = new usage_modeller(4); // 4 matches the number of x values sampled in main
    }
    public static void main(String args[]) throws InterruptedException {
        normal_usage usage = new normal_usage();
        int min = 1;
        int time = min * 60 * 1000; // one minute expressed in milliseconds
        LearnSysProcess p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11, p12, p13;

        // start the learners that do not depend on any other learner
        p1 = new LearnSysProcess("learn session_usage", usage.u_bev.x1, time, 1, 0);
        p1.start();
        p2 = new LearnSysProcess("learn auth_usage", usage.u_bev.x2, time, 1, 0);
        p2.start();
        p5 = new LearnSysProcess("learn memory_usage", usage.memory_usag, time, 1, 0);
        p5.start();
        p6 = new LearnSysProcess("learn cpu_usage", usage.cpu_usag, time, 1, 0);
        p6.start();
        p10 = new LearnSysProcess("learn server", usage.net_usag.server_usag, time, 1, 0);
        p10.start();
        p11 = new LearnSysProcess("learn port_usage", usage.net_usag.port_usag, time, 1, 0);
        p11.start();
        p8 = new LearnSysProcess("learn program_usage", usage.program_usag, time, 1, 0);
        p8.start();

        // wait for the independent learners to finish
        p1.join();
        p2.join();
        p5.join();
        p6.join();
        p10.join();
        p11.join();
        p8.join();
        p3 = null;
        // check if processes p1 and p2 are complete then
        if (!p1.isAlive() && !p2.isAlive()) {
            p3 = new LearnSysProcess("learn user_behavior", usage.u_bev, time, 1, 0);
            p3.start();
        }
        p4 = null;
        p7 = null;
        // check if processes p1, p5 and p6 are complete then
        if (!p1.isAlive() && !p5.isAlive() && !p6.isAlive()) {
            usage.device_usag.battery_usag.set_memory_usage(usage.memory_usag);
            usage.device_usag.battery_usag.set_session_usage(usage.u_bev.x1);
            usage.device_usag.battery_usag.set_cpu_usage(usage.cpu_usag);
            p4 = new LearnSysProcess("learn battery_usag", usage.device_usag.battery_usag, time, 1, 0);
            p4.start();
            p4.join();
            p7 = new LearnSysProcess("learn device_usag", usage.device_usag, time, 1, 0);
            p7.start();
        }
        // join only the threads that were actually created
        if (p3 != null) {
            p3.join();
        }
        if (p7 != null) {
            p7.join();
        }
        // check if process p5 is complete then
        if (!p5.isAlive()) {
            usage.host_usag.set_memory_usage(usage.memory_usag);
        }
        // check if process p6 is complete then
        if (!p6.isAlive()) {
            usage.host_usag.set_cpu_usage(usage.cpu_usag);
        }
        // check if process p8 is complete then
        if (!p8.isAlive()) {
            usage.host_usag.set_program_usage(usage.program_usag);
        }
        // check if process p1 is complete then
        if (!p1.isAlive()) {
            usage.host_usag.set_session_usage(usage.u_bev.x1);
        }
        p9 = null;
        // check if processes p1, p5, p6 and p8 are complete then
        if (!p1.isAlive() && !p5.isAlive() && !p6.isAlive() && !p8.isAlive()) {
            usage.net_usag.set_host_usage(usage.host_usag);
            p9 = new LearnSysProcess("learn host_usage", usage.net_usag.host_usag, time, 1, 0);
            p9.start();
            p9.join();
        }
        p12 = null;
        // check if processes p9, p10 and p11 are complete then
        if (p9 != null && !p9.isAlive() && !p10.isAlive() && !p11.isAlive()) {
            p12 = new LearnSysProcess("learn network_usage", usage.net_usag, time, 1, 0);
            p12.start();
            p12.join();
        }
        p13 = null;
        // check if processes p9, p10, p11, p1, p2 and p3 are complete then
        if (p9 != null && p3 != null
                && !p9.isAlive() && !p10.isAlive() && !p11.isAlive()
                && !p1.isAlive() && !p2.isAlive() && !p3.isAlive()) {
            p13 = new LearnSysProcess("learn system usage", usage, time, 1, 0);
            p13.start();
        }
        // queue two samples for the mathematical modeller
        double x_vals[] = new double[4];
        x_vals[0] = 9;
        x_vals[1] = 7;
        x_vals[2] = 54;
        x_vals[3] = 43;
        usage.math_modeller.sample_x_set(x_vals);
        usage.math_modeller.sample_y(87);
        usage.math_modeller.queue_sample();
        x_vals[0] = 9;
        x_vals[1] = 7;
        x_vals[2] = 54;
        x_vals[3] = 43;
        usage.math_modeller.sample_x_set(x_vals);
        usage.math_modeller.sample_y(87);
        usage.math_modeller.queue_sample();
        usage.math_modeller.end_modeller = true;

        // print the mean of each x variable
        usage_stats ustats = usage.math_modeller.get_usage_stats();
        x_set avg_xset = usage.math_modeller.average_x_set;
        x_vals = avg_xset.x_set;
        System.out.println("mean 1: " + x_vals[0]);
        System.out.println("mean 2: " + x_vals[1]);
        System.out.println("mean 3: " + x_vals[2]);
        System.out.println("mean 4: " + x_vals[3]);
    }
    public double computeval() {
        return 0;
    }

    public double findchange() {
        return 0;
    }

    public void learnsys(int t) {
        // note: the loop takes one step per second, while main passes t in milliseconds
        int timer = 1;
        while (t >= timer) {
            net_usag.learnsys(t);
            try {
                Thread.sleep(1000); // wait one second between learning steps
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            timer++;
            if (timer == t) {
                Thread.yield(); // hint to the scheduler before the final step
            }
        }
    }
    public Object findrelationship() {
        return null;
    }

    public void monitor(int t) {
        int timer = 0;
        while (timer < t) {
            try {
                net_usag.monitor(t);
            } catch (Exception e) {
                e.printStackTrace();
            }
            timer++; // advance inside the loop so that monitoring terminates
        }
    }

    public void showalarm(String info) {
        System.out.println(info);
    }

    public void haltprocess() {
    }

    public void predictvals() {
    }
}