computingnow - uptraining.up.nic.in/coursematerial/imp_url_imortant_topics.pdf · 9/13/2016 · on...
https://www.computer.org/web/computingnow
http://guidelines.gov.in/tools.php
http://inoc.nic.in/
https://security.nic.in/docs/procedure%20for%20cleaning%20of%20web%20sites%20from%20malicious%20files.docx
https://security.nic.in/docs/NIC_RHEL-7_Hardening_V-1.0.pdf
What are malware, viruses, spyware, and cookies, and what differentiates them?
"Malware" is short for malicious software and is used as an umbrella term for viruses, spyware, worms, etc. Malware is
designed to cause damage to a standalone computer or a networked PC. So wherever the term malware is used, it means a
program designed to damage your computer; it may be a virus, worm, or Trojan.
Worms:-
Worms are malicious programs that make copies of themselves again and again on the local drive, network shares, etc. The
only purpose of a worm is to reproduce itself again and again; it does not harm any data/files on the computer. Unlike a
virus, it does not need to attach itself to an existing program. Worms spread by exploiting vulnerabilities in operating
systems.
Examples of worm are: - W32.SillyFDC.BBY
Packed.Generic.236
W32.Troresba
Due to its replicating nature, a worm takes up a lot of space on the hard drive and consumes more CPU time, which in turn
makes the PC too slow; it also consumes more network bandwidth.
Virus:-
A virus is a program written to enter your computer and damage/alter your files/data. A virus might corrupt or delete data
on your computer. Viruses can also replicate themselves. A computer virus is more dangerous than a computer worm, as it
makes changes to or deletes your files, while a worm only replicates itself without changing your files/data.
Examples of virus are: - W32.Sfc!mod
ABAP.Rivpas.A
Accept.3773
Viruses can enter your computer as attachments to images, greetings, or audio/video files. Viruses also enter through
downloads from the Internet; they can be hidden in free/trial software or other files that you download.
So before you download anything from the Internet, be sure about it first. Almost all viruses are attached to an executable
file, which means the virus may exist on your computer but cannot actually infect it unless you run or open the malicious
program. It is important to note that a virus cannot spread without a human action, such as running an infected program.
Viruses are of different types, which are as follows.
1) File viruses
2) Macro viruses
3) Master boot record viruses
4) Boot sector viruses
5) Multipartite viruses
6) Polymorphic viruses
7) Stealth viruses
File Virus:- This type of virus normally infects program files such as .exe, .com, and .bat. Once this virus is resident in
memory, it tries to infect all programs that are loaded into memory.
Macro Virus:- This type of virus infects Word, Excel, PowerPoint, Access, and other data files. Once these files are
infected, repairing them is very difficult.
Master boot record virus:- MBR viruses are memory-resident viruses that copy themselves to the first sector of a storage
device, which is used for partition tables or OS loading programs. An MBR virus infects this particular area of the storage
device instead of normal files. The easiest way to remove an MBR virus is to clean the MBR area.
Boot sector virus:- A boot sector virus infects the boot sector of an HDD or FDD. These are also memory-resident in
nature. As soon as the computer starts, it gets infected from the boot sector.
Cleaning this type of virus is very difficult.
Multipartite virus:- A hybrid of boot and program/file viruses. They infect program files, and when the infected program
is executed, these viruses infect the boot record. When you boot the computer the next time, the virus from the boot record
loads into memory and then starts infecting other program files on disk.
Polymorphic viruses:- A virus that can encrypt its code in different ways so that it appears different in each infection.
These viruses are more difficult to detect.
Stealth viruses:- These viruses use various techniques to avoid detection. They either redirect the disk head to read
another sector instead of the one in which they reside, or they alter the infected file's size shown in the directory
listing. For example, the Whale virus adds 9216 bytes to an infected file and then subtracts the same number of bytes
(9216) from the size given in the directory.
Trojans:- A Trojan horse is not a virus. It is a destructive program that looks like a genuine application. Unlike viruses,
Trojan horses do not replicate themselves, but they can be just as destructive. Trojans also open a backdoor entry to your
computer, which gives malicious users/programs access to your system, allowing confidential and personal information to
be stolen.
Example: - JS.Debeski.Trojan
Trojan horses are classified based on how they infect systems and the damage they cause. The seven main types of Trojan
horses are:
• Remote Access Trojans
• Data Sending Trojans
• Destructive Trojans
• Proxy Trojans
• FTP Trojans
• Security software disabler Trojans
• Denial-of-service (DoS) attack Trojans
Adware:- Generically, adware is a software application that displays advertising banners while a program is running.
Adware can be downloaded to your system automatically while browsing any website and can appear through pop-up windows or
through a bar on the computer screen. Adware is used by companies for marketing purposes.
Spyware:- Spyware is a type of program that is installed, with or without your permission, on your personal computer to
collect information about users, their computers, or their browsing habits. It tracks everything that you do without your
knowledge and sends it to a remote user. It can also download other malicious programs from the Internet and install them
on the computer. Spyware works like adware but is usually a separate program that is installed unknowingly when you install
another freeware-type program or application.
Spam:- Spamming is the practice of flooding the Internet with copies of the same message. Most spam messages are
commercial advertisements sent as unwanted email to users. Spam is also known as electronic junk mail or junk newsgroup
postings. These spam mails are very annoying, as they keep coming every day and keep your mailbox full.
Tracking cookies:- A cookie is a plain text file that is stored on your computer in a cookies folder, and it stores data
about your browsing session. Cookies are used by many websites to track visitor information. A tracking cookie is a cookie
that keeps track of all your browsing information; this can be used by attackers and companies to learn personal details
like your bank account details, your credit card information, etc., which is dangerous.
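To see that a cookie really is just plain text, the sketch below uses Python's standard http.cookies module to build one and print the header a website would send; the names "session_id" and its value are invented for illustration:

```python
# A minimal sketch showing that a cookie is plain-text key=value data.
# The cookie name and value here are made up for illustration.
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "abc123"        # value stored on the visitor's machine
cookie["session_id"]["path"] = "/"     # attribute controlling where it is sent

# The header a website sends to set this cookie is readable plain text:
header = cookie.output()
print(header)   # Set-Cookie: session_id=abc123; Path=/
```

Nothing in the file is executable; a tracking cookie is dangerous only because of what the values record about your browsing, not because of any code it contains.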
Misleading applications:- Misleading applications mislead you about the security status of your computer, claiming that it
is infected by some malware and that you must download a tool to remove the threat. When you download the tool, it reports
some threats on your computer, and to remove them you have to buy the product, for which it asks for personal information
like credit card details, which is dangerous.
1. Repeater – A repeater operates at the physical layer. Its job is to regenerate the signal over the same network before
the signal becomes too weak or corrupted, so as to extend the length over which the signal can be transmitted on the same
network. An important point to note about repeaters is that they do not amplify the signal. When the signal becomes weak,
they copy it bit by bit and regenerate it at the original strength. It is a 2-port device.
2. Hub – A hub is basically a multiport repeater. A hub connects multiple wires coming from different branches, for
example the connector in a star topology which connects different stations. Hubs cannot filter data, so data packets are
sent to all connected devices. In other words, the collision domain of all hosts connected through a hub remains one.
Also, hubs do not have the intelligence to find the best path for data packets, which leads to inefficiencies and wastage.
3. Bridge – A bridge operates at the data link layer. A bridge is a repeater with the added functionality of filtering
content by reading the MAC addresses of the source and destination. It is also used for interconnecting two LANs working
on the same protocol. It has a single input and a single output port, making it a 2-port device.
4. Switch – A switch is a multiport bridge with a buffer and a design that can boost its efficiency (a large number of
ports implies less traffic) and performance. A switch is a data link layer device. A switch can perform error checking
before forwarding data, which makes it very efficient: it does not forward frames that have errors, and it forwards good
frames selectively to the correct port only. In other words, a switch divides the collision domain of hosts, but the
broadcast domain remains the same.
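The error checking described above relies on a checksum appended to each frame: the sender computes it, and the receiver recomputes it and drops the frame on mismatch. A minimal sketch of the idea, using Python's zlib.crc32 (a CRC-32, the same family of checksum Ethernet uses as its frame check sequence); the frame contents are invented for illustration:

```python
# Sketch of frame error checking: sender appends a checksum, receiver
# recomputes it and discards the frame on mismatch.
import zlib

frame = b"destination|source|payload"   # made-up frame contents
fcs = zlib.crc32(frame)                 # checksum computed by the sender

# Receiver side: recompute and compare before forwarding.
def frame_ok(data, checksum):
    return zlib.crc32(data) == checksum

print(frame_ok(frame, fcs))                           # True: forward the frame
print(frame_ok(b"destination|source|payl0ad", fcs))   # False: drop the frame
```

A single flipped bit changes the checksum, which is why the switch can refuse to forward corrupted frames.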
5. Routers – A router is a device, like a switch, that routes data packets based on their IP addresses. A router is mainly
a network layer device. Routers normally connect LANs and WANs together and have a dynamically updated routing table based
on which they make decisions on routing the data packets. Routers divide the broadcast domains of the hosts connected
through them.
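A routing table can be sketched as a mapping from destination networks to outgoing interfaces, with the router choosing the most specific (longest-prefix) match. The networks and interface names below are invented for illustration:

```python
# Longest-prefix-match lookup: the core of how a router picks a route.
import ipaddress

routing_table = {
    ipaddress.ip_network("10.0.0.0/8"):  "eth1",
    ipaddress.ip_network("10.1.0.0/16"): "eth2",
    ipaddress.ip_network("0.0.0.0/0"):   "eth0",  # default route
}

def route(destination):
    addr = ipaddress.ip_address(destination)
    matches = [net for net in routing_table if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)  # most specific wins
    return routing_table[best]

print(route("10.1.2.3"))   # eth2 (the /16 is more specific than the /8)
print(route("8.8.8.8"))    # eth0 (only the default route matches)
```

"Dynamically updating" means routing protocols add and remove entries from this table as links come and go; the lookup itself stays the same.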
6. Gateway – A gateway, as the name suggests, is a passage that connects two networks that may use different networking
models. Gateways basically work as messenger agents that take data from one system, interpret it, and transfer it to
another system. Gateways are also called protocol converters and can operate at any network layer. Gateways are generally
more complex than switches or routers.
COMPUTER NETWORKS
http://www.studytonight.com/computer-networks/osi-model-network-layer
Introduction To Computer Networks
Today the world scenario is changing. Data communication and networks have changed the way business and other daily
affairs work; they now rely on computer networks and internetworks. A set of devices, often referred to as nodes, connected
by media links is called a network. A node can be any device capable of sending or receiving data generated by other nodes
on the network, such as a computer or printer. The links connecting the devices are called communication channels.
A computer network is a telecommunication channel through which we can share data. It is also called a data network. The
best example of a computer network is the Internet. A computer network does not mean a system with one control unit and
other systems as its slaves; it is a distributed system.
A network must be able to meet certain criteria; these are mentioned below:
1. Performance
2. Reliability
3. Security
Performance
It can be measured in the following ways :
Transit time : It is the time taken to travel a message from one device to another.
Response time : It is defined as the time elapsed between enquiry and response.
Other ways to measure performance are :
1. Efficiency of software
2. Number of users
3. Capability of connected hardware
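Response time as defined above can be measured directly: record the clock at the enquiry, again at the response, and take the difference. A small sketch; the request function here is a hypothetical stand-in for any real network call:

```python
# Measuring response time: the elapsed time between enquiry and response.
import time

def measure_response_time(request):
    """Return the time between issuing a request and getting its response, in seconds."""
    start = time.perf_counter()
    request()                      # stand-in for a real network request
    return time.perf_counter() - start

# Example with a dummy workload in place of a network call:
elapsed = measure_response_time(lambda: sum(range(100_000)))
print(f"response time: {elapsed:.6f} s")
```

Transit time would be measured the same way, but between the moment a message leaves one device and the moment it arrives at another, which requires clocks at both ends.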
Reliability
It measures the frequency at which network failures take place: the more failures there are, the less reliable the network.
Security
It refers to the protection of data from unauthorised users or access. While travelling through the network, data passes
through many layers, and it can be intercepted if attempted. Hence security is also a very important characteristic of
networks.
Properties of Good Network
1. Interpersonal Communication : We can communicate with each other efficiently and easily, for example via emails, chat
rooms, video conferencing etc.
2. Resources can be shared : We can use the resources provided by network such as printers etc.
3. Sharing files, data : Authorised users are allowed to share the files on the network.
Basic Communication Model
A communication model is used to exchange data between two parties, for example communication between a computer, a server,
and a telephone (through a modem).
Source
Data to be transmitted is generated by this device, example: telephones, personal computers etc.
Transmitter
The data generated by the source system is not transmitted directly in the form in which it is generated. The transmitter
transforms and encodes the information into electromagnetic waves or signals.
Transmission System
A transmission system can be a single transmission line or a complex network connecting source and destination.
Receiver
Receiver accepts the signal from the transmission system and converts it to a form which is easily managed by the
destination device.
Destination
Destination receives the incoming data from the receiver.
Data Communication
The exchange of data between two devices through a transmission medium is data communication. The data is exchanged in the
form of 0s and 1s. A typical transmission medium is wire cable. For data communication to occur, the communicating devices
must be part of a communication system. Data communication has two types, local and remote, which are discussed below :
Local :
Local communication takes place when the communicating devices are in the same geographical area, such as the same
building, or face-to-face between individuals.
Remote :
Remote communication takes place over a distance, i.e. the devices are farther apart. The effectiveness of data
communication can be measured through the following features :
1. Delivery : Delivery should be done to the correct destination.
2. Timeliness : Delivery should be on time.
3. Accuracy : Data delivered should be accurate.
Components of Data Communication
1. Message : It is the information to be delivered.
2. Sender : Sender is the person who is sending the message.
3. Receiver : The receiver is the person to whom the message is to be delivered.
4. Medium : It is the channel through which the message is sent, for example a cable or telephone line.
5. Protocol : These are some set of rules which govern data communication.
Line Configuration in Computer Networks
A network is a connection made through connection links between two or more devices. Devices can be computers, printers,
or any other devices capable of sending and receiving data. There are two ways to connect the devices :
1. Point-to-Point connection
2. Multipoint connection
Point-To-Point Connection
It provides a dedicated communication link between two devices and is simple to establish. The most common example of a
point-to-point connection is a computer connected by a telephone line. We can connect the two devices by means of a pair
of wires or using a microwave or satellite link.
Example: Point-to-Point connection between remote control and Television for changing the channels.
MultiPoint Connection
It is also called Multidrop configuration. In this connection two or more devices share a single link.
There are two kinds of Multipoint Connections :
If the links are used simultaneously by many devices, it is a spatially shared line configuration.
If users take turns while using the link, it is a time-shared (temporal) line configuration.
Types of Network Topology
Network Topology is the schematic description of a network arrangement, connecting various nodes(sender and receiver)
through lines of connection.
BUS Topology
Bus topology is a network type in which every computer and network device is connected to a single cable. When it has
exactly two endpoints, it is called linear bus topology.
Features of Bus Topology
1. It transmits data only in one direction.
2. Every device is connected to a single cable
Advantages of Bus Topology
1. It is cost effective.
2. The cable required is the least compared to other network topologies.
3. Used in small networks.
4. It is easy to understand.
5. Easy to expand by joining two cables together.
Disadvantages of Bus Topology
1. If the cable fails, the whole network fails.
2. If network traffic is heavy or there are more nodes, the performance of the network decreases.
3. Cable has a limited length.
4. It is slower than the ring topology.
RING Topology
It is called ring topology because it forms a ring: each computer is connected to another computer, with the last one
connected to the first. Each device has exactly two neighbours.
Features of Ring Topology
1. A number of repeaters are used in a ring topology with a large number of nodes, because if someone wants to send
some data to the last node in a ring topology with 100 nodes, the data has to pass through 99 nodes to reach the
100th node. Hence, to prevent data loss, repeaters are used in the network.
2. The transmission is unidirectional, but it can be made bidirectional by having 2 connections between each network
node; this is called Dual Ring Topology.
3. In Dual Ring Topology, two ring networks are formed, and data flow is in opposite direction in them. Also, if one
ring fails, the second ring can act as a backup, to keep the network up.
4. Data is transferred in a sequential manner, that is, bit by bit. Transmitted data has to pass through each node of
the network until it reaches the destination node.
Advantages of Ring Topology
1. The network is not affected by high traffic or by adding more nodes, as only the node holding the token can
transmit data.
2. Cheap to install and expand
Disadvantages of Ring Topology
1. Troubleshooting is difficult in ring topology.
2. Adding or deleting the computers disturbs the network activity.
3. Failure of one computer disturbs the whole network.
STAR Topology
In this type of topology, all the computers are connected to a single hub through a cable. This hub is the central node,
and all other nodes are connected to it.
Features of Star Topology
1. Every node has its own dedicated connection to the hub.
2. Hub acts as a repeater for data flow.
3. Can be used with twisted pair, Optical Fibre or coaxial cable.
Advantages of Star Topology
1. Fast performance with few nodes and low network traffic.
2. Hub can be upgraded easily.
3. Easy to troubleshoot.
4. Easy to setup and modify.
5. Only the node that has failed is affected; the rest of the nodes can work smoothly.
Disadvantages of Star Topology
1. Cost of installation is high.
2. Expensive to use.
3. If the hub fails, the whole network stops, because all the nodes depend on the hub.
4. Performance depends on the hub, that is, on its capacity.
MESH Topology
In mesh topology, every node has a point-to-point connection to every other node. A mesh has n(n-1)/2 physical channels to
link n devices.
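The n(n-1)/2 figure follows from each of the n nodes linking to the other n-1 nodes, with every link otherwise counted twice. A quick check in code:

```python
def mesh_links(n):
    # Each of the n nodes links to n-1 others; dividing by 2 avoids
    # counting each link twice (A-B and B-A are the same channel).
    return n * (n - 1) // 2

print(mesh_links(2))   # 1
print(mesh_links(5))   # 10
print(mesh_links(10))  # 45
```

The quadratic growth in link count is exactly why full mesh becomes expensive to cable as n grows.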
There are two techniques to transmit data over the Mesh topology, they are :
1. Routing
2. Flooding
Routing
In routing, the nodes have routing logic, as per the network requirements, such as logic to direct the data to its
destination using the shortest distance, or logic that has information about broken links and avoids those nodes. We can
even have routing logic to reconfigure around failed nodes.
Flooding
In flooding, the same data is transmitted to all the network nodes, hence no routing logic is required. The network is
robust, and it is very unlikely to lose data, but flooding leads to unwanted load on the network.
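Flooding can be sketched as each node forwarding incoming data to all its neighbours, remembering what it has already seen so the process terminates. The four-node graph below is invented for illustration:

```python
# Sketch of flooding: forward to every neighbour, skip what was already seen.
from collections import deque

def flood(graph, source):
    """Return the set of nodes that receive the data when flooding from source."""
    received = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in received:   # only forward data not seen before
                received.add(neighbour)
                queue.append(neighbour)
    return received

# A small mesh: every node is reached without any routing logic.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(sorted(flood(graph, "A")))   # ['A', 'B', 'C', 'D']
```

Every node receives the data as long as the network is connected, which is the robustness the text describes; the cost is that every link carries a copy.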
Types of Mesh Topology
1. Partial Mesh Topology : In this topology some of the systems are connected in the same fashion as in a full mesh,
but some devices are connected to only two or three other devices.
2. Full Mesh Topology : Each and every node or device is connected to every other.
Features of Mesh Topology
1. Fully connected.
2. Robust.
3. Not flexible.
Advantages of Mesh Topology
1. Each connection can carry its own data load.
2. It is robust.
3. Fault is diagnosed easily.
4. Provides security and privacy.
Disadvantages of Mesh Topology
1. Installation and configuration is difficult.
2. Cabling cost is more.
3. Bulk wiring is required.
TREE Topology
It has a root node, and all other nodes are connected to it, forming a hierarchy. It is also called hierarchical topology.
It should have at least three levels in the hierarchy.
Features of Tree Topology
1. Ideal if workstations are located in groups.
2. Used in Wide Area Network.
Advantages of Tree Topology
1. Extension of bus and star topologies.
2. Expansion of nodes is possible and easy.
3. Easily managed and maintained.
4. Error detection is easily done.
Disadvantages of Tree Topology
1. Heavily cabled.
2. Costly.
3. If more nodes are added maintenance is difficult.
4. If the central hub fails, the network fails.
HYBRID Topology
It is a mixture of two or more different topologies. For example, if one department of an office uses a ring topology and
another uses a star topology, connecting these topologies results in a hybrid topology (ring topology and star topology).
Features of Hybrid Topology
1. It is a combination of two or more topologies
2. Inherits the advantages and disadvantages of the topologies included
Advantages of Hybrid Topology
1. Reliable, as error detection and troubleshooting are easy.
2. Effective.
3. Scalable as size can be increased easily.
4. Flexible.
Disadvantages of Hybrid Topology
1. Complex in design.
2. Costly.
Transmission Modes in Computer Networks
Transmission mode refers to the direction in which data is transferred between two devices. It is also called
communication mode. There are three types of transmission mode. They are :
Simplex Mode
Half duplex Mode
Full duplex Mode
SIMPLEX Mode
In this transmission mode, data can be sent in only one direction, i.e. communication is unidirectional. We cannot send a
message back to the sender. Unidirectional communication is done in simplex systems.
Examples of simplex mode are a loudspeaker, television broadcasting, television and remote, and keyboard and monitor.
HALF DUPLEX Mode
In a half duplex system we can send data in both directions, but only one direction at a time: while the sender is sending
data, we cannot send our own message back. At any moment, data flows in one direction only.
An example of half duplex is a walkie-talkie, in which messages are sent one at a time but in both directions.
FULL DUPLEX Mode
In a full duplex system we can send data in both directions simultaneously, as it is bidirectional. We can send and
receive data at the same time.
An example of full duplex is a telephone network, in which two persons communicate over a telephone line and both can talk
and listen at the same time.
In a full duplex system there can be two lines, one for sending data and the other for receiving data.
Transmission Mediums in Computer Networks
Data is represented by computers and other telecommunication devices using signals. Signals are transmitted in the form of
electromagnetic energy from one device to another. Electromagnetic signals travel through vacuum, air, or other
transmission media from one point to another (from source to receiver).
Electromagnetic energy (includes electrical and magnetic fields) includes power, voice, visible light, radio waves,
ultraviolet light, gamma rays etc.
The transmission medium is the means through which we send data from one place to another. The first layer (physical
layer) of the OSI seven-layer model of communication networks is dedicated to transmission media; we will study the OSI
model later.
Factors to be considered while choosing Transmission Medium
1. Transmission Rate
2. Cost and Ease of Installation
3. Resistance to Environmental Conditions
4. Distances
Bounded/Guided Transmission Media
It is transmission media in which signals are confined to a specific path using wire or cable. The types of
bounded/guided media are discussed below.
Twisted Pair Cable
This cable is the most commonly used and is cheaper than the others. It is lightweight, cheap, easily installed, and
supports many different types of networks. Some important points :
Its frequency range is 0 to 3.5 kHz.
Typical attenuation is 0.2 dB/km at 1 kHz.
Typical delay is 50 µs/km.
Repeater spacing is 2 km.
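The attenuation figure can be turned into an end-to-end loss: decibels add linearly with distance, and every 10 dB of loss means a tenfold drop in power. A quick sketch using the 0.2 dB/km figure above (the 10 km distance is an arbitrary example):

```python
# Signal loss over a twisted-pair run, using the 0.2 dB/km figure at 1 kHz.
attenuation_db_per_km = 0.2
distance_km = 10

total_loss_db = attenuation_db_per_km * distance_km   # 2.0 dB over 10 km
power_remaining = 10 ** (-total_loss_db / 10)         # fraction of power left

print(f"loss over {distance_km} km: {total_loss_db} dB")
print(f"power remaining: {power_remaining:.2%}")      # about 63%
```

This accumulating loss is why the table quotes a repeater spacing: past a certain distance, the signal must be regenerated before it becomes too weak to decode.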
Twisted Pair is of two types :
Unshielded Twisted Pair (UTP)
Shielded Twisted Pair (STP)
Unshielded Twisted Pair Cable
It is the most common type of telecommunication cable when compared with shielded twisted pair. It consists of two
conductors, usually copper, each with its own coloured plastic insulator; identification is the reason behind the coloured
plastic insulation.
UTP cables consist of 2 or 4 pairs of twisted cable. Cable with 2 pairs uses an RJ-11 connector, and 4-pair cable uses an
RJ-45 connector.
Advantages :
Installation is easy
Flexible
Cheap
It has high-speed capacity
100-metre segment limit
Higher grades of UTP are used in LAN technologies like Ethernet.
It consists of two insulated copper wires (1 mm thick). The wires are twisted together in helical form to reduce
electrical interference from similar pairs.
Disadvantages :
Bandwidth is low when compared with Coaxial Cable
Provides less protection from interference.
Shielded Twisted Pair Cable
This cable has a metal foil or braided-mesh covering that encases each pair of insulated conductors. The metal casing
prevents the penetration of electromagnetic noise. Shielding also reduces crosstalk (explained in the KEY TERMS chapter).
It has the same attenuation as unshielded twisted pair. It is faster than unshielded twisted pair and coaxial cable. It is
more expensive than coaxial and unshielded twisted pair.
Advantages :
Easy to install
Performance is adequate
Can be used for Analog or Digital transmission
Increases the signalling rate
Higher capacity than unshielded twisted pair
Reduces crosstalk
Disadvantages :
Difficult to manufacture
Heavy
Coaxial Cable
Coaxial cable gets its name because it contains two conductors that share a common axis. Copper is used as the centre
conductor, which can be a solid wire or a stranded one. It is surrounded by PVC insulation and a sheath, which is encased
in an outer conductor of metal foil, braid, or both.
Outer metallic wrapping is used as a shield against noise and as the second conductor which completes the circuit. The
outer conductor is also encased in an insulating sheath. The outermost part is the plastic cover which protects the whole
cable.
Here are the most common coaxial standards:
50-Ohm RG-7 or RG-11 : used with thick Ethernet.
50-Ohm RG-58 : used with thin Ethernet
75-Ohm RG-59 : used with cable television
93-Ohm RG-62 : used with ARCNET.
There are two types of Coaxial cables :
BaseBand
This is a 50-ohm (Ω) coaxial cable used for digital transmission. It is mostly used for LANs. Baseband transmits a single
signal at a time at very high speed. The major drawback is that it needs amplification after every 1000 feet.
BroadBand
This uses analog transmission over standard cable television cabling. It transmits several simultaneous signals using
different frequencies. It covers a large area when compared with baseband coaxial cable.
Advantages :
Bandwidth is high
Used in long distance telephone lines.
Transmits digital signals at a very high rate of 10 Mbps.
Much higher noise immunity
Data transmission without distortion.
They can span longer distances at higher speeds, as they have better shielding when compared to twisted pair cable
Disadvantages :
Single cable failure can fail the entire network.
Difficult to install and expensive when compared with twisted pair.
If the shield is imperfect, it can lead to ground loops.
Fiber Optic Cable
These are similar in construction to coaxial cable, but they use light signals instead of electric signals to transmit
data. At the centre is the glass core through which light propagates.
In multimode fibres the core is 50 microns, and in single mode fibres the thickness is 8 to 10 microns.
The core in fiber optic cable is surrounded by glass cladding with lower index of refraction as compared to core to keep all
the light in core. This is covered with a thin plastic jacket to protect the cladding. The fibers are grouped together in
bundles protected by an outer shield.
Fiber optic cable has bandwidth of more than 2 Gbps (gigabits per second)
Advantages :
Provides high quality transmission of signals at very high speed.
These are not affected by electromagnetic interference, so noise and distortion are minimal.
Used for both analog and digital signals.
Disadvantages :
It is expensive
Difficult to install.
Maintenance is expensive and difficult.
Do not allow complete routing of light signals.
UnBounded/UnGuided Transmission Media
Unguided or wireless media send data through the air (or water), where it is available to anyone who has a device capable
of receiving it. Types of unguided/unbounded media are discussed below :
Radio Transmission
MicroWave Transmission
Radio Transmission
Its frequency range is 10 kHz to 1 GHz. It is simple to install and has high attenuation. These waves are used for
multicast communications.
Types of Propagation
Radio transmission utilizes different types of propagation :
Troposphere : The lowest portion of the earth's atmosphere, extending outward approximately 30 miles from the
earth's surface. Clouds, jet planes, and wind are found here.
Ionosphere : The layer of the atmosphere above troposphere, but below space. Contains electrically charged
particles.
Microwave Transmission
It travels at higher frequencies than radio waves. It requires the sender and receiver to be in line of sight. It operates
in a system with a low gigahertz range. It is mostly used for unicast communication.
There are 2 types of Microwave Transmission :
1. Terrestrial Microwave
2. Satellite Microwave
Advantages of Microwave Transmission
Used for long distance telephone communication
Carries 1000s of voice channels at the same time
Disadvantages of Microwave Transmission
It is very costly
Terrestrial Microwave
To increase the distance served by terrestrial microwave, repeaters can be installed with each antenna. The signal
received by an antenna is converted into transmittable form and relayed to the next antenna, as shown in the figure below.
Telephone systems all over the world use this technique.
There are two types of antennas used for terrestrial microwave communication :
1. Parabolic Dish Antenna
In this antenna, every line parallel to the line of symmetry reflects off the curve at an angle such that all lines
intersect at a common point called the focus. This antenna is based on the geometry of a parabola.
2. Horn Antenna
It is like a gigantic scoop. The outgoing transmissions are broadcast up a stem and deflected outward in a series of
narrow parallel beams by the curved head.
Satellite Microwave
This is a microwave relay station placed in outer space. The satellites are launched either by rockets or carried up by
space shuttles.
These are positioned about 36,000 km above the equator with an orbital speed that exactly matches the rotation speed of
the earth. As the satellite is positioned in a geosynchronous orbit, it is stationary relative to the earth and always
stays over the same point on the ground. This allows ground stations to aim their antennas at a fixed point in the sky.
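The altitude of a geostationary orbit follows from requiring the orbital period to equal one sidereal day: Kepler's third law gives the orbital radius, and subtracting the earth's equatorial radius leaves roughly 36,000 km. A sketch of the arithmetic, using standard textbook constants:

```python
# Geostationary altitude from Kepler's third law.
import math

MU = 3.986004418e14     # Earth's gravitational parameter GM, in m^3/s^2
T = 86164.1             # one sidereal day, in seconds
EARTH_RADIUS = 6.378e6  # equatorial radius, in metres

# Kepler's third law: T^2 = 4*pi^2 * r^3 / GM  =>  r = (GM*T^2 / 4pi^2)^(1/3)
r = (MU * T**2 / (4 * math.pi**2)) ** (1 / 3)
altitude_km = (r - EARTH_RADIUS) / 1000

print(f"geostationary altitude: {altitude_km:,.0f} km")   # about 35,786 km
```

Only at this one altitude does the orbital period match the earth's rotation, which is why all geostationary satellites share the same ring above the equator.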
Features of Satellite Microwave :
Bandwidth capacity depends on the frequency used.
Deploying a satellite into orbit is difficult.
Advantages of Satellite Microwave :
The transmitting station can receive back its own transmission and check whether the satellite has transmitted the
information correctly.
A single microwave relay station is visible from any point in its coverage area.
Disadvantages of Satellite Microwave :
Satellite manufacturing cost is very high
The cost of launching a satellite is very high
Transmission depends highly on weather conditions; it can go down in bad weather
Types of Communication Networks
Local Area Network (LAN)
It is also called a LAN and is designed for small physical areas such as an office, a group of buildings or a factory.
LANs are widely used because they are easy to design and troubleshoot. Personal computers and workstations are connected
to each other through LANs. A LAN can use different types of topologies, such as Star, Ring, Bus and Tree.
A LAN can be as simple as two computers connected to share files with each other, or as complex as a network
interconnecting an entire building.
LANs are also widely used to share resources like printers, shared hard drives etc.
Applications of LAN
One of the computers in a network can become a server, serving all the remaining computers, called clients. Software
can be stored on the server and used by the remaining clients.
Connecting all the workstations in a building locally so that they can communicate with each other without any
internet access.
Sharing common resources like printers is another common application of a LAN.
Metropolitan Area Network (MAN)
It is basically a bigger version of a LAN. It is also called a MAN and uses similar technology to a LAN. It is designed to
extend over an entire city. It may connect a number of LANs into a larger network, or it may be a single cable. It is
mainly owned and operated by a single private or public organization.
Wide Area Network (WAN)
It is also called a WAN. A WAN can be a private or a public leased network. It is used for networks that cover large
distances, such as the states of a country. It is not easy to design and maintain. Communication media used by WANs are
PSTN or satellite links. WANs operate at low data rates.
Wireless Network
It is the fastest growing segment of computer networking. Wireless networks are becoming very important in our daily life
because wired connections are not possible in cars or aeroplanes. We can access the Internet at any place, avoiding
wire-related troubles. Wireless networks can also be used when the telephone system gets destroyed due to some
calamity/disaster. They are really important nowadays.
Inter Network
When we connect two or more networks, the result is called an internetwork, or internet. We can join two or more
individual networks to form an internetwork through devices like routers, gateways or bridges.
Connection Oriented and Connectionless Services
These are the two kinds of services that a layer can give to the layer above it:
1. Connection Oriented Service
2. Connectionless Service
Connection Oriented Services
There is a sequence of operations to be followed by the users of a connection oriented service:
1. Connection is established
2. Information is sent
3. Connection is released
In connection oriented service we have to establish a connection before starting the communication. When connection is
established we send the message or the information and then we release the connection.
Connection oriented service is more reliable than connectionless service. In a connection oriented service the message
can be resent if there is an error at the receiver's end. An example of a connection oriented protocol is TCP
(Transmission Control Protocol).
Connection Less Services
It is similar to the postal services, as it carries the full address where the message (letter) is to be carried. Each message is
routed independently from source to destination. The order of message sent can be different from the order received.
In a connectionless service the data is transferred in one direction from source to destination without checking whether
the destination is still there or whether it is prepared to accept the message. Authentication is not needed. An example
of a connectionless protocol is UDP (User Datagram Protocol).
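A minimal sketch of this connectionless pattern, using Python's standard socket module over the loopback interface: no connection is established, and each datagram carries the full destination address, like a letter in the post.

```python
import socket

# Receiver: a UDP socket bound to a free loopback port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))          # OS picks a free port
receiver.settimeout(5)
addr = receiver.getsockname()

# Sender: no connect(), no handshake -- just address the datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", addr)

data, src = receiver.recvfrom(1024)      # UDP itself does not guarantee delivery
print(data)                              # b'hello'
sender.close()
receiver.close()
```

Note that nothing in the exchange confirms the receiver exists; a datagram sent to a dead address is simply lost, which is exactly the behaviour described above.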
Difference between Connection oriented service and Connectionless service
1. In connection oriented service authentication is needed, while connectionless service does not need any authentication.
2. A connection oriented protocol makes a connection, checks whether the message is received, and sends it again if an error occurs; a connectionless protocol does not guarantee delivery.
3. Connection oriented service is more reliable than connectionless service.
4. The connection oriented service interface is stream based; the connectionless interface is message based.
Service Primitives
A service is specified by a set of primitives. A primitive means an operation. A user process accesses the service by
calling these primitives. The primitives are different for connection oriented service and connectionless service. There
are five types of service primitives :
1. LISTEN : When a server is ready to accept an incoming connection it executes the LISTEN primitive. It blocks, waiting for an incoming connection.
2. CONNECT : The client executes CONNECT to establish a connection with the server, then awaits the response.
3. RECEIVE : The server's RECEIVE call then blocks, waiting for the client's request.
4. SEND : The client executes SEND to transmit its request, followed by RECEIVE to get the reply.
5. DISCONNECT : This primitive is used for terminating the connection; after it no further messages can be sent. When the client sends a DISCONNECT packet, the server also sends a DISCONNECT packet to acknowledge the client. When the server's packet is received by the client, the connection is terminated.
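These primitives map naturally onto Berkeley socket calls. A sketch of one request/reply exchange over the loopback interface, with the mapping annotated in comments:

```python
import socket
import threading

def server(listener):
    conn, _ = listener.accept()            # LISTEN: block for a connection
    request = conn.recv(1024)              # RECEIVE: block for the request
    conn.sendall(b"reply:" + request)      # SEND: transmit the reply
    conn.close()                           # DISCONNECT: release the connection

listener = socket.socket()                 # TCP socket (connection oriented)
listener.bind(("127.0.0.1", 0))            # OS picks a free port
listener.listen(1)
t = threading.Thread(target=server, args=(listener,))
t.start()

client = socket.socket()
client.connect(listener.getsockname())     # CONNECT: establish the connection
client.sendall(b"ping")                    # SEND: transmit the request
reply = client.recv(1024)                  # RECEIVE: block for the reply
client.close()                             # DISCONNECT
t.join()
listener.close()
print(reply)   # b'reply:ping'
```

The three-step sequence from the previous section (establish, exchange, release) is visible in the client's connect / sendall+recv / close calls.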
Connection Oriented Service Primitives
There are 4 types of primitives for Connection Oriented Service :
CONNECT This primitive makes a connection
DATA, DATA-ACKNOWLEDGE, EXPEDITED-DATA Data and information is sent using these primitives
DISCONNECT Primitive for closing the connection
RESET Primitive for resetting the connection
Connectionless Service Primitives
The primitives for Connectionless Service are :
UNIDATA This primitive sends a packet of data
FACILITY, REPORT Primitive for enquiring about the performance of the network, like delivery statistics.
Relationship of Services to Protocol
Services
These are the operations that a layer can provide to the layer above it. A service defines the operations the layer is
ready to perform, but it does not specify anything about the implementation of these operations.
Protocols
These are sets of rules that govern the format and meaning of the frames, messages or packets that are exchanged between
the communicating peers, such as a server and a client.
Reference Models in Communication Networks
The most important reference models are :
1. OSI reference model
2. TCP/IP reference model
Introduction to ISO-OSI Model:
There are many users of computer networks, located all over the world. To ensure national and worldwide data
communication, ISO (the International Organization for Standardization) developed this model. It is called the model for
Open System Interconnection (OSI) and is normally known as the OSI model. The OSI model architecture consists of seven
layers. It defines seven layers or levels in a complete communication system. The OSI reference model is explained in
another chapter.
Introduction to TCP/IP REFERENCE Model
TCP/IP stands for Transmission Control Protocol and Internet Protocol. Protocols are sets of rules which govern every
possible communication over the internet. These protocols describe the movement of data between the host computers and
the internet, and they offer simple naming and addressing schemes.
The TCP/IP reference model is explained in detail in another chapter.
ISO/OSI Model in Communication Networks
There are a number of computer network users located all over the world. To ensure national and worldwide data
communication, systems must be developed which are compatible enough to communicate with each other. ISO, the
International Organization for Standardization, developed such a model. It is called the model for Open
System Interconnection (OSI) and is commonly known as the OSI model.
The ISO-OSI model is a seven layer architecture. It defines seven layers or levels in a complete communication system.
Feature of OSI Model :
1. The big picture of communication over a network is understandable through this OSI model.
2. It shows how hardware and software work together.
3. It helps us understand new technologies as they are developed.
4. Troubleshooting is easier because the network is separated into layers.
5. It can be used to compare basic functional relationships on different networks.
Functions of Different Layers :
Layer 1: The Physical Layer :
1. It is the lowest layer of the OSI Model.
2. It activates, maintains and deactivates the physical connection.
3. It is responsible for transmission and reception of the unstructured raw data over the network.
4. The voltages and data rates needed for transmission are defined in the physical layer.
5. It converts the digital/analog bits into electrical signal or optical signals.
6. Data encoding is also done in this layer.
Layer 2: Data Link Layer :
1. Data link layer synchronizes the information which is to be transmitted over the physical layer.
2. The main function of this layer is to make sure data transfer is error free from one node to another, over the
physical layer.
3. Transmitting and receiving data frames sequentially is managed by this layer.
4. This layer sends acknowledgements for frames received and expects acknowledgements for frames sent. Resending of
frames for which no acknowledgement is received is also handled by this layer.
5. This layer establishes a logical link between two nodes and also manages frame traffic control over the
network. It signals the transmitting node to stop when the frame buffers are full.
Layer 3: The Network Layer :
1. It routes the signal through different channels from one node to other.
2. It acts as a network controller. It manages the Subnet traffic.
3. It decides which route the data should take.
4. It divides the outgoing messages into packets and assembles the incoming packets into messages for higher levels.
Layer 4: Transport Layer :
1. It decides if data transmission should be on parallel path or single path.
2. Functions such as Multiplexing, Segmenting or Splitting on the data are done by this layer
3. It receives messages from the Session layer above it, converts the messages into smaller units and passes them on to
the Network layer.
4. Transport layer can be very complex, depending upon the network requirements.
Transport layer breaks the message (data) into small units so that they are handled more efficiently by the network layer.
Layer 5: The Session Layer :
1. The session layer manages and synchronizes the conversation between two different applications.
2. During transfer of data from source to destination, the session layer marks and resynchronizes the streams of data
properly, so that the ends of the messages are not cut prematurely and data loss is avoided.
Layer 6: The Presentation Layer :
1. Presentation layer takes care that the data is sent in such a way that the receiver will understand the information
(data) and will be able to use the data.
2. While receiving the data, presentation layer transforms the data to be ready for the application layer.
3. The languages (syntax) of the two communicating systems can be different. In this case the presentation layer
plays the role of a translator.
4. It performs data compression, data encryption, data conversion etc.
Layer 7: Application Layer :
1. It is the topmost layer.
2. Transferring files and distributing the results to the user are also done in this layer. Mail services, directory
services, network resources etc. are services provided by the application layer.
3. This layer mainly holds the application programs that act upon the received data and the data to be sent.
Merits of OSI reference model:
1. OSI model distinguishes well between the services, interfaces and protocols.
2. Protocols of OSI model are very well hidden.
3. Protocols can be replaced by new protocols as technology changes.
4. Supports connection oriented services as well as connectionless service.
Demerits of OSI reference model:
1. The model was devised before the protocols were invented.
2. Fitting the protocols into the model is a tedious task.
3. It is just used as a reference model.
PHYSICAL Layer - OSI Model
The physical layer is the lowest layer of all. It is responsible for sending bits from one computer to another. This
layer is not concerned with the meaning of the bits; it deals with the physical connection to the network and with
transmission and reception of signals.
This layer defines electrical and physical details such as how a 0 or a 1 is represented, how many pins a network
connector contains, when data can be transmitted, and how data is synchronized.
FUNCTIONS OF PHYSICAL LAYER:
1. Representation of Bits: Data in this layer consists of a stream of bits. The bits must be encoded into signals for transmission. This layer defines the type of encoding, i.e. how 0's and 1's are changed to signals.
2. Data Rate: This layer defines the rate of transmission, which is the number of bits per second.
3. Synchronization: It deals with the synchronization of the transmitter and receiver. The sender and receiver are synchronized at bit level.
4. Interface: The physical layer defines the transmission interface between devices and the transmission medium.
5. Line Configuration: This layer connects devices with the medium: point-to-point configuration and multipoint configuration.
6. Topologies: Devices can be connected using the following topologies: Mesh, Star, Ring and Bus.
7. Transmission Modes: The physical layer defines the direction of transmission between two devices: Simplex, Half Duplex, Full Duplex.
8. It deals with baseband and broadband transmission.
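To make the bit-encoding function concrete, here is a sketch of one common scheme, Manchester encoding. The polarity used (0 as a high-to-low transition, 1 as low-to-high) follows the IEEE 802.3 convention; other conventions invert it.

```python
# Sketch of Manchester encoding (IEEE 802.3 convention): each bit becomes
# a pair of signal levels with a transition in the middle of the bit
# period -- 0 -> (1, 0), 1 -> (0, 1). The guaranteed mid-bit transition
# lets the receiver recover the sender's clock from the signal itself.
def manchester_encode(bits):
    signal = []
    for b in bits:
        signal += [1, 0] if b == 0 else [0, 1]
    return signal

def manchester_decode(signal):
    # Each level pair (1, 0) decodes to 0 and (0, 1) decodes to 1.
    return [0 if pair == (1, 0) else 1
            for pair in zip(signal[::2], signal[1::2])]

data = [1, 0, 1, 1, 0]
encoded = manchester_encode(data)
assert manchester_decode(encoded) == data
print(encoded)  # [0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
```

The cost of the scheme is visible in the output: two signal elements per bit, i.e. the baud rate is double the bit rate.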
DATA LINK Layer - OSI Model
The data link layer provides reliable node-to-node delivery of data. It forms frames from the packets that are received
from the network layer and gives them to the physical layer. It also synchronizes the information which is to be
transmitted over the physical layer. Error control is done at this layer. The encoded data is then passed to the
physical layer.
Error detection bits are used by the data link layer, and it also corrects errors. Outgoing messages are assembled into
frames. Then the system waits for the acknowledgements to be received after the transmission. This makes message
delivery reliable.
FUNCTIONS OF DATA LINK LAYER:
1. Framing: The stream of bits received from the network layer is divided into manageable data units called frames. This division of the stream of bits is done by the data link layer.
2. Physical Addressing: The data link layer adds a header to the frame in order to define the physical address of the sender or receiver of the frame, if the frames are to be distributed to different systems on the network.
3. Flow Control: A flow control mechanism prevents a fast transmitter from overrunning a slow receiver by buffering the extra bits. This prevents a traffic jam at the receiver side.
4. Error Control: Error control is achieved by adding a trailer at the end of the frame. The data link layer also adds a mechanism to prevent duplication of frames.
5. Access Control: When two or more devices are connected to the same link, protocols of this layer determine which device has control over the link at any given time.
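Framing (point 1) and error control (point 4) can be sketched together: frames are delimited by a FLAG byte, payload bytes that collide with the delimiter are byte-stuffed, and a one-byte checksum serves as the trailer. Real links use CRCs rather than a modular sum; the simple checksum here just keeps the idea visible.

```python
# Sketch of data link framing with byte stuffing and a checksum trailer.
FLAG, ESC = 0x7E, 0x7D

def frame(payload: bytes) -> bytes:
    checksum = sum(payload) % 256          # trailer for error detection
    body = bytearray()
    for byte in payload + bytes([checksum]):
        if byte in (FLAG, ESC):
            body += bytes([ESC, byte ^ 0x20])  # stuff bytes that look special
        else:
            body.append(byte)
    return bytes([FLAG]) + bytes(body) + bytes([FLAG])

def deframe(data: bytes) -> bytes:
    body, i = bytearray(), 1               # skip the opening FLAG
    while data[i] != FLAG:
        if data[i] == ESC:
            i += 1
            body.append(data[i] ^ 0x20)    # undo the stuffing
        else:
            body.append(data[i])
        i += 1
    payload, checksum = bytes(body[:-1]), body[-1]
    assert sum(payload) % 256 == checksum, "frame corrupted"
    return payload

msg = bytes([0x01, 0x7E, 0x42])            # payload containing a FLAG byte
assert deframe(frame(msg)) == msg
```

The stuffing step is what lets the receiver find frame boundaries unambiguously even when the payload happens to contain the delimiter.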
Network Layer - OSI Model
The main aim of this layer is to deliver packets from source to destination across multiple links (networks). If two
computers (system) are connected on the same link then there is no need for a network layer. It routes the signal through
different channels to the other end and acts as a network controller.
It also divides the outgoing messages into packets and assembles the incoming packets into messages for higher levels.
FUNCTIONS OF NETWORK LAYER:
1. It translates logical network address into physical address. Concerned with circuit, message or packet switching.
2. Routers and gateways operate in the network layer. Mechanism is provided by Network Layer for routing the
packets to final destination.
3. Connection services are provided including network layer flow control, network layer error control and packet
sequence control.
4. Breaks larger packets into small packets.
Transport Layer - OSI Model
The main aim of the transport layer is to deliver the entire message from source to destination. The transport layer
ensures that the whole message arrives intact and in order, handling both error control and flow control at the
source-to-destination level. It decides if data transmission should be on a parallel path or a single path.
Transport layer breaks the message (data) into small units so that they are handled more efficiently by the network layer
and ensures that message arrives in order by checking error and flow control.
FUNCTIONS OF TRANSPORT LAYER:
1. Service Point Addressing : The transport layer header includes the service point address, which is the port address. This layer gets the message to the correct process on the computer, unlike the network layer, which gets each packet to the correct computer.
2. Segmentation and Reassembly : A message is divided into segments; each segment contains a sequence number, which enables this layer to reassemble the message. The message is reassembled correctly upon arrival at the destination, and packets lost in transmission are replaced.
3. Connection Control : It includes 2 types :
o Connectionless Transport Layer : Each segment is considered an independent packet and delivered to the transport layer at the destination machine.
o Connection Oriented Transport Layer : Before delivering packets, a connection is made with the transport layer at the destination machine.
4. Flow Control : In this layer, flow control is performed end to end.
5. Error Control : Error control is performed end to end in this layer to ensure that the complete message arrives at the receiving transport layer without any error. Error correction is done through retransmission.
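Segmentation and reassembly (point 2) can be sketched in a few lines: the message is split into numbered segments which may arrive out of order, and the sequence numbers let the receiver rebuild the original message.

```python
import random

# Sketch of transport-layer segmentation and reassembly.
def segment(message: bytes, size: int):
    # Tag each chunk with a sequence number.
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(segments):
    # Sort by sequence number before joining, so arrival order is irrelevant.
    return b"".join(data for _, data in sorted(segments))

msg = b"the quick brown fox jumps over the lazy dog"
segments = segment(msg, 8)
random.shuffle(segments)          # simulate out-of-order arrival
assert reassemble(segments) == msg
```

Detecting a *missing* sequence number is also how the layer knows which lost segment to request again.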
Session Layer - OSI Model
Its main aim is to establish, maintain and synchronize the interaction between communicating systems. The session layer
manages and synchronizes the conversation between two different applications. During transfer of data from source to
destination, the session layer marks and resynchronizes the streams of data properly, so that the ends of the messages
are not cut prematurely and data loss is avoided.
FUNCTIONS OF SESSION LAYER:
1. Dialog Control : This layer allows two systems to start communication with each other in half-duplex or full-duplex mode.
2. Synchronization : This layer allows a process to add checkpoints, which are considered synchronization points, into a stream of data. Example: If a system is sending a file of 800 pages, adding checkpoints after every 50 pages is recommended. This ensures that each 50-page unit is successfully received and acknowledged. This is beneficial at the time of a crash: if a crash happens at page number 110, there is no need to retransmit pages 1 to 100.
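The 800-page example works out as follows; a trivial sketch, but it shows why checkpoints bound the amount of retransmission:

```python
# Worked sketch of the checkpoint example: with a checkpoint acknowledged
# every 50 pages, a crash at page 110 means only pages 101-110 were
# unconfirmed, so transmission resumes after page 100.
def resume_point(crash_page: int, checkpoint_every: int) -> int:
    # Last checkpoint at or before the crash; pages up to it are safe.
    return (crash_page // checkpoint_every) * checkpoint_every

print(resume_point(110, 50))   # 100: retransmit from page 101, not page 1
```

Without checkpoints the worst case is retransmitting the whole file; with them it is at most one checkpoint interval.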
Presentation Layer - OSI Model
The primary goal of this layer is to take care of the syntax and semantics of the information exchanged between two
communicating systems. The presentation layer takes care that the data is sent in such a way that the receiver will
understand the information (data) and will be able to use it. The languages (syntax) of the two communicating
systems can be different. In this case the presentation layer plays the role of a translator.
FUNCTIONS OF PRESENTATION LAYER:
1. Translation : Before being transmitted, information in the form of characters and numbers is changed into bit streams. The presentation layer is responsible for interoperability between encoding methods, as different computers use different encoding methods. It translates data between the format the network requires and the format the computer uses.
2. Encryption : It carries out encryption at the transmitter and decryption at the receiver.
3. Compression : It carries out data compression to reduce the bandwidth of the data to be transmitted. The primary role of data compression is to reduce the number of bits to be transmitted. It is important in transmitting multimedia such as audio, video and text.
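Translation (point 1) and compression (point 3) can be sketched with Python's standard library; encryption would slot in between the two steps on each side.

```python
import zlib

# Sketch of presentation-layer duties on one message: translate text to a
# byte stream (UTF-8), compress it for transmission, then reverse both
# steps on the receiving side.
message = "привет means hello"            # non-ASCII text needing translation

encoded = message.encode("utf-8")          # translation: characters -> bits
compressed = zlib.compress(encoded)        # compression: fewer bits to send

# Receiving side: decompress, then decode back to characters.
received = zlib.decompress(compressed).decode("utf-8")
assert received == message
```

Both endpoints must agree on the encoding and compression scheme, which is exactly the interoperability concern this layer addresses.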
Application Layer - OSI Model
It is the topmost layer of the OSI model. Manipulation of data (information) in various ways is done in this layer, which
enables users or software to get access to the network. Some services provided by this layer include e-mail, transferring
of files, distributing the results to the user, directory services, network resources etc.
FUNCTIONS OF APPLICATION LAYER:
1. Mail Services : This layer provides the basis for E-mail forwarding and storage.
2. Network Virtual Terminal : It allows a user to log on to a remote host. The application creates a software
emulation of a terminal at the remote host. The user's computer talks to the software terminal, which in turn talks to
the host, and vice versa. The remote host believes it is communicating with one of its own terminals and allows the
user to log on.
3. Directory Services : This layer provides access for global information about various services.
4. File Transfer, Access and Management (FTAM) : It is a standard mechanism to access and manage files.
Users can access and manage files on a remote computer. They can also retrieve files from a remote computer.
The TCP/IP Reference Model
TCP/IP stands for Transmission Control Protocol and Internet Protocol. It is the network model used in the current
Internet architecture as well. Protocols are sets of rules which govern every possible communication over a network.
These protocols describe the movement of data between the source and destination over the internet, and they offer
simple naming and addressing schemes.
Overview of TCP/IP reference model
TCP/IP, that is, Transmission Control Protocol and Internet Protocol, was developed by the Department of Defence's
Advanced Research Projects Agency (ARPA, later DARPA) as part of a research project on network interconnection, to
connect remote machines.
The features that stood out during the research, which led to making the TCP/IP reference model were:
Support for a flexible architecture: adding more machines to a network was easy.
The network was robust, and connections remained intact as long as the source and destination machines were functioning.
The overall idea was to allow an application on one computer to talk to (send data packets to) another application
running on a different computer.
Description of different TCP/IP protocols
Layer 1: Host-to-network Layer
1. Lowest layer of all.
2. A protocol is used to connect to the host, so that the packets can be sent over it.
3. Varies from host to host and network to network.
Layer 2: Internet layer
1. The selection of a packet switching network based on a connectionless internetwork layer is called the internet layer.
2. It is the layer which holds the whole architecture together.
3. It helps the packets travel independently to the destination.
4. The order in which packets are received can be different from the order in which they are sent.
5. IP (Internet Protocol) is used in this layer.
Layer 3: Transport Layer
1. It decides if data transmission should be on a parallel path or a single path.
2. Functions such as multiplexing and segmenting or splitting of the data are done by the transport layer.
3. The applications can read and write to the transport layer.
4. The transport layer adds header information to the data.
5. The transport layer breaks the message (data) into small units so that they are handled more efficiently by the network layer.
6. The transport layer also arranges the packets to be sent in sequence.
Layer 4: Application Layer
The TCP/IP specifications described many applications at the top of the protocol stack, among them TELNET, FTP, SMTP and
DNS.
1. TELNET is a two-way communication protocol which allows connecting to a remote machine and running applications on it.
2. FTP (File Transfer Protocol) is a protocol that allows file transfer amongst computer users connected over a network. It is reliable, simple and efficient.
3. SMTP (Simple Mail Transfer Protocol) is a protocol used to transport electronic mail between a source and a destination, directed via a route.
4. DNS (Domain Name System) resolves a hostname (textual address) into an IP address for hosts connected over a network.
Merits of TCP/IP model
1. It operates independently.
2. It is scalable.
3. Client/server architecture.
4. It supports a number of routing protocols.
5. It can be used to establish a connection between two computers.
Demerits of TCP/IP
1. The transport layer does not guarantee delivery of packets.
2. The model cannot be used in any other application.
3. Replacing a protocol is not easy.
4. It has not clearly separated its services, interfaces and protocols.
Comparison of OSI Reference Model and TCP/IP Reference Model
Following are some major differences between OSI Reference Model and TCP/IP Reference Model, with diagrammatic
comparison below.
OSI (Open System Interconnection) vs TCP/IP (Transmission Control Protocol / Internet Protocol)
1. OSI is a generic, protocol-independent standard, acting as a communication gateway between the network and the end user. TCP/IP is based on the standard protocols around which the Internet has developed; it is a communication protocol which allows connection of hosts over a network.
2. In the OSI model the transport layer guarantees the delivery of packets. In the TCP/IP model the transport layer does not guarantee delivery of packets; still, the TCP/IP model is more reliable.
3. OSI follows a vertical approach; TCP/IP follows a horizontal approach.
4. The OSI model has a separate Presentation layer and Session layer; TCP/IP does not have a separate Presentation layer or Session layer.
5. OSI is a reference model around which networks are built and is generally used as a guidance tool; the TCP/IP model is, in a way, an implementation of the OSI model.
6. The network layer of the OSI model provides both connection oriented and connectionless service; the network layer in the TCP/IP model provides only connectionless service.
7. The OSI model has a problem of fitting the protocols into the model; the TCP/IP model does not fit any protocol.
8. Protocols are hidden in the OSI model and are easily replaced as the technology changes; in TCP/IP, replacing a protocol is not easy.
9. The OSI model defines services, interfaces and protocols very clearly and makes a clear distinction between them; it is protocol independent. In TCP/IP, services, interfaces and protocols are not clearly separated; it is also protocol dependent.
10. OSI has 7 layers; TCP/IP has 4 layers.
Diagrammatic Comparison between OSI Reference Model and TCP/IP Reference Model
KEY TERMS in Computer Networks
Following are some important terms, which are frequently used in context of Computer Networks.
Terms Definition
1. ISO The OSI model is a product of the Open Systems Interconnection project at the International
Organization for Standardization. ISO is a voluntary organization.
2. OSI Model Open System Interconnection is a model consisting of seven logical layers.
3. TCP/IP Model Transmission Control Protocol and Internet Protocol Model is based on four layer model which is based
on Protocols.
4. UTP Unshielded Twisted Pair cable is a wired/guided medium which consists of two conductors, usually
copper, each with its own coloured plastic insulator.
5. STP Shielded Twisted Pair cable is a wired/guided medium which has a metal foil or braided-mesh covering that
encases each pair of insulated conductors. Shielding also eliminates crosstalk.
6. PPP Point-to-Point connection is a protocol which is used as a communication link between two devices.
7. LAN Local Area Network is designed for small areas such as an office, group of building or a factory.
8. WAN Wide Area Network is used for networks that cover large distances, such as the states of a country.
9. MAN Metropolitan Area Network uses the similar technology as LAN. It is designed to extend over the entire
city.
10. Crosstalk
Undesired effect of one circuit on another circuit. It can occur when one line picks up some signals
travelling down another line. Example: telephone conversation when one can hear background
conversations. It can be eliminated by shielding each pair of twisted pair cable.
11. PSTN
Public Switched Telephone Network consists of telephone lines, cellular networks, satellites for
communication, fiber optic cables etc. It is the combination of world’s (national, local and regional)
circuit switched telephone network.
12. File Transfer, Access
and Management (FTAM)
Standard mechanism to access files and manages it. Users can access files in a remote computer and
manage it.
13. Analog Transmission The signal is continuously variable in amplitude and frequency. Power requirement is high when
compared with Digital Transmission.
14. Digital Transmission It is a sequence of voltage pulses. It is basically a series of discrete pulses. Security is better than Analog
Transmission.
Cloud Computing
Cloud computing is a general term for the delivery of hosted services over the internet.
Cloud computing enables companies to consume a compute resource, such as a virtual machine (VM), storage or an
application, as a utility -- just like electricity -- rather than having to build and maintain computing infrastructures in house.
Cloud computing boasts several attractive benefits for businesses and end users. Three of the main benefits of cloud
computing are:
Self-service provisioning: End users can spin up compute resources for almost any type of workload on demand. This eliminates the traditional need for IT administrators to provision and manage compute resources.
Elasticity: Companies can scale up as computing needs increase and scale down again as demands decrease. This eliminates the need for massive investments in local infrastructure which may or may not remain active.
Pay per use: Compute resources are measured at a granular level, allowing users to pay only for the resources and workloads they use.
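Pay-per-use billing is easy to see with an invented per-hour rate; the figure below is purely illustrative, not any provider's actual price.

```python
# Hypothetical pay-per-use calculation: granular (per-second) metering
# means a VM used for 90 minutes a day is billed only for that usage.
RATE_PER_HOUR = 0.10                      # assumed price, not a real quote

seconds_used = 90 * 60                    # 90 minutes of compute per day
daily_cost = RATE_PER_HOUR * seconds_used / 3600
print(f"${daily_cost:.2f} per day")       # $0.15 per day
```

Compare this with owning the hardware: the same machine sitting idle for the other 22.5 hours would still cost its full purchase and maintenance price.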
Cloud computing deployment models
Cloud computing services can be private, public or hybrid.
Private cloud services are delivered from a business' data center to internal users. This model offers versatility and
convenience, while preserving the management, control and security common to local data centers. Internal users may or
may not be billed for services through IT chargeback.
In the public cloud model, a third-party provider delivers the cloud service over the internet. Public cloud services are sold
on demand, typically by the minute or hour. Customers only pay for the CPU cycles, storage or bandwidth they consume.
Leading public cloud providers include Amazon Web Services (AWS), Microsoft Azure, IBM SoftLayer and Google
Compute Engine.
Hybrid cloud is a combination of public cloud services and on-premises private cloud -- with orchestration and automation
between the two. Companies can run mission-critical workloads or sensitive applications on the private cloud while using
the public cloud for bursting workloads that must scale on demand. The goal of hybrid cloud is to create a unified,
automated, scalable environment that takes advantage of all that a public cloud infrastructure can provide while still
maintaining control over mission-critical data.
Cloud computing service categories
Although cloud computing has changed over time, it has been divided into three broad service categories: infrastructure as
a service (IaaS), platform as a service (PaaS) and software as a service (SaaS).
IaaS providers, such as AWS, supply a virtual server instance and storage, as well as application program interfaces (APIs)
that let users migrate workloads to a virtual machine. Users have an allocated storage capacity and can start, stop, access
and configure the VM and storage as desired. IaaS providers offer small, medium, large, extra-large and memory- or
compute-optimized instances, in addition to customized instances, for various workload needs.
In the PaaS model, providers host development tools on their infrastructures. Users access these tools over the internet
using APIs, web portals or gateway software. PaaS is used for general software development, and many PaaS providers
will host the software after it's developed. Common PaaS providers include Salesforce.com's Force.com, AWS Elastic
Beanstalk and Google App Engine.
SaaS is a distribution model that delivers software applications over the internet; these applications are often called web
services. Microsoft Office 365 is a SaaS offering for productivity software and email services. Users can access SaaS
applications and services from any location using a computer or mobile device that has internet access.
Cloud computing security
Security remains a primary concern for businesses contemplating cloud adoption -- especially public cloud adoption.
Public cloud providers share their underlying hardware infrastructure between numerous customers, as the public cloud is a multi-tenant environment. This environment demands strict isolation between logical compute resources. At the same time, access to public cloud storage and compute resources is guarded by account logon credentials.
Many organizations bound by complex regulatory obligations and governance standards are still hesitant to place data or
workloads in the public cloud for fear of outages, loss or theft. However, this resistance is fading as logical isolation has
proven reliable and the addition of data encryption and various identity and access management (IAM) tools has improved
security within the public cloud.
This was last updated in October 2016
Digital Signature
The digital equivalent of a handwritten signature or stamped seal, but offering far more inherent security, a digital
signature is intended to solve the problem of tampering and impersonation in digital communications. Digital signatures
can provide added assurance of the origin, identity and status of an electronic document, transaction or
message, as well as acknowledging informed consent by the signer.
In many countries, including the United States, digital signatures have the same legal significance as the more traditional
forms of signed documents. The United States Government Printing Office publishes electronic versions of the budget,
public and private laws, and congressional bills with digital signatures.
How digital signatures work
Digital signatures are based on public key cryptography, also known as asymmetric cryptography. Using a public key
algorithm such as RSA, one can generate two keys that are mathematically linked: one private and one public. To create a
digital signature, signing software (such as an email program) creates a one-way hash of the electronic data to be signed.
The private key is then used to encrypt the hash. The encrypted hash -- along with other information, such as the hashing
algorithm -- is the digital signature. The reason for encrypting the hash instead of the entire message or document is that a
hash function converts an arbitrary input into a fixed-length value, which is usually much shorter. This saves time, since
hashing and then encrypting the short digest is much faster than encrypting the entire document.
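The hash-then-encrypt flow above can be sketched with textbook RSA and deliberately tiny primes. This is illustration only: the helper names are ours, real keys are thousands of bits long, and real signature schemes add padding:

```python
import hashlib

# Toy RSA key pair (tiny primes, for illustration only -- never use in practice)
p, q = 61, 53
n = p * q                 # public modulus, 3233
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent (modular inverse, Python 3.8+)

def sign(message: bytes) -> int:
    # 1. One-way hash of the data to be signed (reduced mod n for the toy key)
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    # 2. "Encrypt" the digest with the private key -- that is the signature
    return pow(digest, d, n)

def verify(message: bytes, signature: int) -> bool:
    # Recompute the hash and compare it with the signature decrypted
    # using the public key.
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == digest

sig = sign(b"hello")
print(verify(b"hello", sig))    # True
# Any change to the message almost certainly changes the digest,
# so verification fails: verify(b"tampered", sig)
```

Because only the short digest is exponentiated, signing stays fast regardless of document size, which is exactly the point made above.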
Esign http://cca.gov.in/cca/sites/default/files/files/ESIGNFAQFeb26022015.pdf
SMS Gateway
An SMS gateway permits your computer to send and receive Short Message Service (SMS) transmissions to or from a
telecom service provider. Most messages are routed into mobile phone networks. Many SMS gateways also support media
conversion, translating messages from email and other formats.
SMS is one of the most commonly used methods of communication in the modern business world: information is
exchanged without physical contact, in real time, quickly and cheaply. An SMS message travels through a series of
connections, and the network facility that sends and receives these messages is what telecom companies call an SMS
gateway. It supports two-way messaging that can be routed through mobile phones, computers or laptops; a phone can
connect to a PC via cable, infrared or Bluetooth.
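Aggregator-style gateways typically expose an HTTP API on top of their SMSC connection. The sketch below is hedged: the URL, parameter names and API-key scheme are invented for illustration, since every real gateway defines its own interface:

```python
import urllib.parse
import urllib.request

# Hypothetical gateway endpoint -- real providers document their own URLs
# and parameter names.
GATEWAY_URL = "https://sms-gateway.example.com/send"

def send_sms(api_key: str, to: str, text: str) -> bytes:
    """POST one message to the (hypothetical) HTTP SMS gateway endpoint."""
    data = urllib.parse.urlencode(
        {"key": api_key, "to": to, "message": text}).encode()
    with urllib.request.urlopen(GATEWAY_URL, data=data) as resp:
        return resp.read()   # gateway's delivery receipt / message id
```

From the sender's point of view the gateway hides all SMSC details: the application just makes an HTTP call and the gateway handles protocol translation and routing.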
There are several types of SMS gateway, each with its own way of transmitting SMS messages. They are as follows:
1. Direct-to-mobile gateway. This type of SMS gateway has built-in GSM (Global System for Mobile Communication)
connectivity, which permits sending and receiving SMS text messages via email, web pages and other software apps
through the use of a SIM (Subscriber Identity Module) card. A direct-to-mobile gateway differs from an SMS aggregator
in that it is installed in the organization's own network and connects to a local mobile network. It works with a SIM card
acquired from the network provider for the gateway installation.
2. Direct-to-SMSC gateway. This device also allows sending and receiving SMS text messages via email, web pages and
other software apps, translating between protocols to relay the SMS transmission. It connects directly to the operator's
SMSC (short message service center) through a leased line or the internet. The SMSC handles operation by storing
messages and routing and forwarding them to their desired endpoints; the gateway converts each SMS into SMSC format
for device compatibility. SMS aggregators use this form of gateway to deliver SMS services to customers, owing to the
great volume of messaging it can support and its direct connection to the mobile operator.
POP - Post Office Protocol
(1) POP is short for Post Office Protocol, a protocol used to retrieve e-mail from a mail server. Most e-mail applications
(sometimes called e-mail clients) use the POP protocol, although some can use the newer IMAP (Internet Message
Access Protocol).
There are two versions of POP. The first, called POP2, became a standard in the mid-1980s and requires SMTP to send
messages. The newer version, POP3, can be used with or without SMTP.
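As a sketch, POP3 retrieval can be done with Python's standard poplib module. The host and credentials below are placeholders, and the function is only defined, not executed:

```python
import poplib
from email import message_from_bytes

def fetch_latest_subject(host: str, user: str, password: str) -> str:
    """Retrieve the newest message via POP3 over SSL and return its Subject.

    POP3's model is simple: log in, see what is in the mailbox, download
    whole messages. (Compare IMAP, which can search on the server.)
    """
    conn = poplib.POP3_SSL(host)        # POP3-over-SSL usually on port 995
    try:
        conn.user(user)
        conn.pass_(password)
        count, _size = conn.stat()      # number of messages, mailbox size
        _resp, lines, _octets = conn.retr(count)   # RETR the last message
        msg = message_from_bytes(b"\r\n".join(lines))
        return msg["Subject"]
    finally:
        conn.quit()

# Usage (placeholder values):
# fetch_latest_subject("pop.example.com", "user@example.com", "secret")
```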
(2) POP is also short for point of presence, an access point to the Internet. ISPs typically have multiple POPs. A point of
presence is a physical location, either part of the facilities of a telecommunications provider that the ISP rents or a separate
location from the telecommunications provider, that houses servers, routers, ATM switches and digital/analog call
aggregators.
IMAP - Internet Message Access Protocol
Short for Internet Message Access Protocol, a protocol for retrieving e-mail messages. The latest version, IMAP4, is similar
to POP3 but supports some additional features. For example, with IMAP4 you can search through your e-mail messages
for keywords while the messages are still on the mail server, and then choose which messages to download to your
machine.
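The server-side search that IMAP4 adds can be sketched with Python's standard imaplib; again the connection details are placeholders and the function is only defined here:

```python
import imaplib

def search_message_ids(host: str, user: str, password: str, keyword: str):
    """Find message ids whose text contains keyword, WITHOUT downloading
    the mailbox -- the SEARCH command runs on the IMAP server itself."""
    conn = imaplib.IMAP4_SSL(host)      # IMAP over SSL, port 993 by default
    try:
        conn.login(user, password)
        conn.select("INBOX", readonly=True)   # messages stay on the server
        status, data = conn.search(None, "TEXT", f'"{keyword}"')
        return data[0].split() if status == "OK" else []
    finally:
        conn.logout()

# Usage (placeholder values):
# search_message_ids("imap.example.com", "user@example.com", "secret", "invoice")
```

The ids returned can then be fetched selectively, which is exactly the "choose which messages to download" behaviour described above.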
IMAP was developed at Stanford University in 1986.
e-mail client
An application that runs on a personal computer or workstation and enables you to send, receive and organize e-mail. It's
called a client because e-mail systems are based on a client-server architecture: mail is sent from many clients to a central
server, which re-routes the mail to its intended destination.
peer-to-peer architecture
Often referred to simply as peer-to-peer, or abbreviated P2P, a type of network in which each workstation has equivalent
capabilities and responsibilities. This differs from client/server architectures, in which some computers are dedicated to
serving the others. Peer-to-peer networks are generally simpler, but they usually do not offer the same performance under
heavy loads.
QR code
QR code (abbreviated from Quick Response Code) is the trademark for a type of matrix barcode (or two-dimensional
barcode) first designed for the automotive industry in Japan. A barcode is a machine-readable optical label that contains
information about the item to which it is attached.
What is a QR Code?
So you may have heard that QR Codes are set to become the 'next big thing', but are thinking to yourself: what is a QR
Code? QR or Quick Response Codes are a type of two-dimensional barcode that can be read using smartphones and
dedicated QR reading devices, and that link directly to text, emails, websites, phone numbers and more! You may even
have got to this site by scanning a QR code!
QR codes are huge in Japan and across the East, and are slowly beginning to become commonplace in the West. Soon
enough you will see QR codes on product packaging, shop displays, printed and billboard advertisements as well as in
emails and on websites. The scope of use for QR codes really is huge, particularly for the marketing and advertising of
products, brands, services and anything else you can think of.
Why should I care about QR Codes?
With as many as half of us now owning smartphones, and that number growing on a daily basis, QR Codes have the
potential to have a major impact upon society and particularly in advertising, marketing and customer service with a
wealth of product information just one scan away.
How is a QR Code different from a normal 1D UPC barcode?
Ordinarily we think of a barcode as a collection of vertical lines; 2D Barcodes or QR Codes are different in that the data is
stored in both directions and can be scanned vertically OR horizontally.
Whilst a standard 1D Barcode (UPC/EAN) stores up to 30 numbers, a QR Barcode can store up to a massive 7,089
characters! It is this massive amount of data that enables links to such things as videos, Facebook or Twitter pages or a
plethora of other website pages.
How do I scan a QR Code?
If you have a smartphone like an iPhone, Android or Blackberry then there are a number of different barcode scanner
applications, such as Red Laser, Barcode Scanner and QR Scanner, that can read and decode data from a QR code. The
majority of these are completely FREE, and all you have to do once you install one is use your phone's camera to scan
the barcode, which will then automatically load the encoded data for you.
What can be encoded into a QR Code?
In its simplest sense a QR Code is an 'image-based hypertext link' that can be used offline – any URL can be encoded into
a QR Code so essentially any webpage can be opened automatically as a result of scanning the barcode. If you want to
encourage someone to like your Facebook page – have your Facebook profile page as the URL. Want your video to go
viral – encode the URL in your QR Code. The options are endless.
In addition to website URLs a QR Code can also contain a phone number – so when it is scanned it prompts the user to call
a particular number. Similarly you can encode an SMS text message, V-card data or just plain alphanumeric text. The
smartphone or 2D barcode reading device will automatically know which application to use to open the content embedded
within the QR Code.
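The way a scanner knows "which application to use" is usually a scheme prefix in the encoded text. A minimal sketch (the prefix table reflects common de facto conventions such as `tel:` and `SMSTO:`, not a formal standard):

```python
# Map a content type to the scheme prefix that tells the scanning app
# how to handle the payload. These are common de facto conventions.
PREFIXES = {"url": "", "phone": "tel:", "sms": "SMSTO:", "email": "mailto:"}

def qr_payload(kind: str, value: str) -> str:
    """Build the text string that would be encoded into the QR Code."""
    return PREFIXES[kind] + value

print(qr_payload("phone", "+911234567890"))        # tel:+911234567890
print(qr_payload("url", "https://example.com"))    # https://example.com
```

Encoding this string into the actual barcode image is then the job of a QR generator such as those listed below.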
Where can QR Codes be placed?
The answer to this is almost anywhere! QR Code printing can be done in newspapers, magazines, brochures, leaflets and
on business cards. Further to this they can be put on product packaging or labels, or on billboards or even walls. You could
even tattoo a QR Code on your body – now that would be an interesting take on giving a girl/guy your number in a bar!
You can use QR Codes on a website but they should not generally be used as a substitute for an old-fashioned hyperlink
because obviously the user is already online and doesn't really want to fiddle around with their phone only to find a
website they could have just clicked through to in half the time.
How can I make a QR Code?
You can make your own QR Codes using designated 2D barcode generators, some of which are listed below; however you
should first consider why it is that you want a QR Code and how you will use it. See the 'QR Codes for Marketing' section
below for more information on this.
QR Code generators that are currently available include:
http://www.qrstuff.com/ http://qrcode.kaywa.com/ http://quikqr.com/
What size does a QR Code have to be?
Generally speaking, the larger the QR Code, the easier it is to scan; however, most QR reading devices are able
to scan images small enough to fit on a business card, for example. This of course assumes that the quality of the image
is good.
QR Code File Formats
You can use the following file formats when creating a QR Code:
HTML Code, PNG File, TIFF File, SVG, EPS
PNG files work particularly well as they can be resized very easily, meaning that you can easily scale the QR Code
depending on where you want to put it.
QR Codes for Marketing
If you want to use QR Codes for business or marketing purposes then you should consider that people have higher
expectations from scanning a QR Code than they do simply clicking a link on a website. You should offer something
special or unique to people that have taken the time and effort to scan the barcode. For ideas of what this could be, or just
for more information about QR Code Marketing have a look at Piranha Internet who have successfully incorporated the
use of QR Codes into several marketing strategies for their clients.
Also remember that many people won't know what a QR Code is or how to use it. Up until their use is more widespread
you will need to provide instructions about what to do with a QR Code.
Who invented the QR Code?
Denso-Wave - a subsidiary of the Toyota Group - is credited with the creation of the QR Code as far back as 1994.
Originally it was designed to track parts in the vehicle manufacturing industry, but its use has since grown
tremendously.
Other 2D Barcode Formats
QR Codes are just one type of 2D Barcode, although they are probably the most popular. Other popular 2D Barcode
formats are:
Microsoft Tag – Microsoft have their very own 2D barcode format known as a High Capacity Colour Barcode, or 'Tag'. The main benefits of this are that you can easily customise your tag – adding colour and making it match your brand. You can also "dynamically change your data source" meaning that you can change the URL that the tag directs to. The main drawback of Microsoft Tag is that they can only be read using Microsoft's own tag reader.
Data Matrix – This is probably the most similar format to the QR Code and is commonly used on small electrical components because it can be read even when only 2-3mm in size.
EZcode – This system is a little different in that the data is not actually stored within the code itself, but on the Scanbuy server. A code index is sent from a mobile device to the server, which queries a database and returns the information. The problem with such a system is that it is wholly reliant upon the Scanbuy servers.
GSM stands for Global System for Mobile Communication, and unless you live in the United States or Russia, this is
probably the technology your phone network uses, given it's the standard system for most of the world. GSM networks use
TDMA, which stands for Time Division Multiple Access. TDMA works by assigning time slots to multiple conversation
streams, alternating them in sequence and switching between each conversation in very short intervals. During these
intervals, phones can transmit their information. In order for the network to know which users are connected to the
network, each phone uses a subscriber identification module card, or SIM card.
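The time-slot rotation TDMA uses can be sketched as follows. This is a simplification: real TDMA frames also carry control channels and guard intervals:

```python
from itertools import cycle

def tdma_schedule(phones, n_slots):
    """Assign n_slots transmission slots to phones in strict rotation.

    Each phone transmits only in its own slots; switching is fast enough
    that every conversation appears continuous.
    """
    rotation = cycle(phones)
    return [(slot, next(rotation)) for slot in range(n_slots)]

print(tdma_schedule(["A", "B", "C"], 7))
# [(0, 'A'), (1, 'B'), (2, 'C'), (3, 'A'), (4, 'B'), (5, 'C'), (6, 'A')]
```

The SIM card's role, as described above, is to identify which subscriber is behind each of those slots.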
SIM cards are one of the key features of GSM networks. They house your service subscription, network identification, and
address book information. The cards are also used to assign time slots to the phone conversation, and moreover, they tell
the network what services you have access to. They store your address book, too, along with relative contact information.
They can even be used to pass information between phones, if a carrier allows it.
Read more: http://www.digitaltrends.com/mobile/cdma-vs-gsm-differences-explained/#ixzz4Y0MhzIqq
When you're looking at buying a new phone, you might find that there are way too many acronyms to choose from,
between CDMA, GSM, LTE, and WiMax, and the list goes on. Instead, it can be easier to focus simply on the differences
in these networks as they apply to you directly. The simplest explanation is that the "G" in 4G stands for generation,
because 4G is the fourth generation of mobile data technology, as defined by the radio sector of the International
Telecommunication Union (ITU-R). LTE stands for "Long Term Evolution" and applies more generally to the idea of
improving wireless broadband speeds to meet increasing demand.
What is 3G?
When 3G networks started rolling out, they replaced the 2G system, a network protocol that only allowed the most basic of
what we would now call smartphone functionality. Most 2G networks handled phone calls, basic text messaging, and small
amounts of data over a protocol called MMS. With the introduction of 3G connectivity, a number of larger data formats
became much more accessible, including standard HTML pages, videos, and music. The speeds were still pretty slow, and
mostly required pages and data specially formatted for these slower wireless connections. By 2G standards, the new
protocol was speedy, but still didn't come anywhere close to replacing a home broadband connection.
What is 4G?
The ITU-R set standards for 4G connectivity in March of 2008, requiring all services described as 4G to adhere to a set of
speed and connection standards. For mobile use, including smartphones and tablets, connection speeds need to have a peak
of at least 100 megabits per second, and for more stationary uses such as mobile hotspots, at least 1 gigabit per second.
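To make those ITU-R figures concrete, here is a quick back-of-the-envelope comparison at the two peak rates (ideal conditions only; real-world throughput is far lower):

```python
# ITU-R 4G peak rates: 100 Mbit/s for mobile use, 1 Gbit/s for stationary use.
MOBILE_PEAK = 100e6        # bits per second
STATIONARY_PEAK = 1e9      # bits per second

def seconds_to_download(size_bytes: float, bits_per_second: float) -> float:
    """Ideal transfer time: bytes -> bits, divided by the link rate."""
    return size_bytes * 8 / bits_per_second

movie = 1.5e9   # a ~1.5 GB video file
print(round(seconds_to_download(movie, MOBILE_PEAK)))      # 120 seconds
print(round(seconds_to_download(movie, STATIONARY_PEAK)))  # 12 seconds
```

Actual speeds depend on signal quality, congestion and protocol overhead, so these are upper bounds, not expectations.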
Aadhaar Seeding
The UIDAI does not collect or store any additional personal information or linking data, such as PAN number, Driver's License
numbers, details of caste, creed, religion, income level or health status, etc. UIDAI has created a seeding ecosystem, where different
partners can leverage various tools offered by UIDAI to link Aadhaar in their respective service delivery databases.
Aadhaar seeding is a process by which UIDs of residents are accurately included in the service delivery database of service providers
for enabling Aadhaar based authentication during service delivery. The seeding process is accomplished in two steps. In the first step,
Aadhaar details need to be collected from the beneficiary. The service provider or seeding agency is expected to disclose the
purpose of collecting Aadhaar details and to take informed consent from the Aadhaar number holder or beneficiary. The second
step involves verification of the collected Aadhaar details. Once verification succeeds against UIDAI's CIDR database, the
Aadhaar is linked to the beneficiary record in the domain database of the service provider.
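The two-step process above (collect with consent, then verify before linking) can be sketched with mock data. The function names and the mocked CIDR lookup are our illustrative assumptions, not UIDAI APIs:

```python
# Mocked stand-in for the CIDR verification step -- illustration only.
MOCK_CIDR = {"999941057058": {"name": "Shivshankar Choudhury"}}

def verify_with_cidr(aadhaar: str, name: str) -> bool:
    """Step 2: check the collected details against the (mock) CIDR record."""
    record = MOCK_CIDR.get(aadhaar)
    return record is not None and record["name"] == name

def seed(service_db: dict, beneficiary_id: str, aadhaar: str, name: str) -> bool:
    """Link the Aadhaar number into the service-delivery database,
    but only after verification succeeds."""
    if not verify_with_cidr(aadhaar, name):
        return False                      # verification failed; do not link
    service_db[beneficiary_id]["aadhaar"] = aadhaar
    return True

db = {"B-001": {"name": "Shivshankar Choudhury"}}
print(seed(db, "B-001", "999941057058", "Shivshankar Choudhury"))  # True
```

The key property the sketch shows is ordering: the service database is only updated after the details verify, so unverified numbers never get seeded.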
UIDAI has undertaken multiple activities to ensure Aadhaar seeding is facilitated in various scheme databases. The Aadhaar seeding
framework includes:
A Standard Protocol Covering the Approach & Process for Seeding Aadhaar in Service Delivery Databases is available on UIDAI website.
UIDAI has launched various services like DBT Seeding Data Viewer (DSDV), an authentication process to verify seeding, e-Aadhaar download, EID-UID search, demographic authentication, advanced search, etc., through the resident portal and its ecosystem partners to facilitate the seeding process.
UIDAI has conducted multiple Aadhaar seeding workshops at UIDAI HQ for ministries and departments and at UIDAI ROs for local administration in states.
Empanelled 48 seeding agencies to undertake seeding on behalf of central and state departments.
Developed classroom and computer-based training content for various stakeholders.
Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged
in order to accommodate that growth.[1] For example, it can refer to the capability of a system to increase its total output under an increased load when resources (typically hardware) are added. An analogous meaning is implied when the word is used in an economic context, where scalability of a company implies that the underlying business model offers the potential for economic growth within the company.
Scalability, as a property of systems, is generally difficult to define[2] and in any particular case it is necessary to define the specific requirements for scalability on those dimensions that are deemed important. It is a highly significant issue in electronics systems, databases, routers, and networking. A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a scalable system.
An algorithm, design, networking protocol, program, or other system is said to scale if it is suitably efficient and practical when applied to large situations (e.g. a large input data set, a large number of outputs or users, or a large number of participating nodes in the case of a distributed system). If the design or system fails when a quantity increases, it does not scale. In practice,
if there are a large number of things (n) that affect scaling, then resource requirements (for example, algorithmic
time-complexity) must grow less than n² as n increases. An example is a search engine, which must scale not only for the
number of users, but also for the number of objects it indexes. Scalability refers to the ability of a site to increase in size as demand warrants.[3]
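The sub-quadratic growth rule above can be illustrated numerically: an n log n cost stays practical as n grows, while an n² cost does not (the two cost functions are generic illustrations, not any particular system):

```python
import math

def cost_sorting(n: int) -> float:
    """e.g. a comparison sort or indexed lookup path: O(n log n) -- scales."""
    return n * math.log2(n)

def cost_pairwise(n: int) -> float:
    """e.g. comparing every item with every other: O(n^2) -- does not scale."""
    return float(n * n)

for n in (1_000, 1_000_000):
    print(f"n={n:>9}: n log n = {cost_sorting(n):.0f}, n^2 = {cost_pairwise(n):.0f}")
```

At n = 1,000 the two costs differ by a factor of about 100; at n = 1,000,000 the gap is about 50,000x, which is why the quadratic design "fails when a quantity increases".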
The concept of scalability is desirable in technology as well as business settings. The base concept is consistent – the ability for a business or technology to accept increased volume without impacting the contribution margin (= revenue − variable costs). For example, a given piece of equipment may have a capacity for 1–1000 users, while beyond 1000 users additional equipment is needed or performance will decline (variable costs will increase and reduce contribution margin).
Database scalability
A number of different approaches enable databases to grow to very large size while supporting an ever-increasing rate of transactions per second. Not to be discounted, of course, is the rapid pace of hardware advances in both the speed and capacity of mass storage devices, as well as similar advances in CPU and networking speed.
One technique supported by most of the major database management system (DBMS) products is the partitioning of large tables, based on ranges of values in a key field. In this manner, the database can be scaled out across a cluster of separate database servers. Also, with the advent of 64-bit microprocessors, multi-core CPUs, and large SMP multiprocessors, DBMS vendors have been at the forefront of supporting multi-threaded implementations that substantially scale up transaction processing capacity.
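The key-range partitioning technique described above can be sketched in a few lines; the split points and server names are illustrative, and real DBMSs handle this routing internally:

```python
import bisect

# Upper bounds (inclusive) of the key ranges owned by the first three shards;
# keys above the last split point go to the final shard.
SPLIT_POINTS = [1_000_000, 2_000_000, 3_000_000]
SHARDS = ["db-server-a", "db-server-b", "db-server-c", "db-server-d"]

def shard_for(key: int) -> str:
    """Route a row to the database server whose key range contains it."""
    return SHARDS[bisect.bisect_left(SPLIT_POINTS, key)]

print(shard_for(500_000))     # db-server-a
print(shard_for(2_500_000))   # db-server-c
print(shard_for(9_999_999))   # db-server-d
```

Because each shard holds only its slice of the table, adding servers (and split points) scales the database out horizontally.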
Network-attached storage (NAS) and Storage area networks (SANs) coupled with fast local area networks and Fibre Channel technology enable still larger, more loosely coupled configurations of databases and distributed computing power. The widely supported X/Open XA standard employs a global transaction monitor to coordinate distributed transactions among semi-autonomous XA-compliant database resources. Oracle RAC uses a different model to achieve scalability, based on a "shared-everything" architecture that relies upon high-speed connections between servers.
While DBMS vendors debate the relative merits of their favored designs, some companies and researchers question the inherent limitations of relational database management systems. GigaSpaces, for example, contends that an entirely different model of distributed data access and transaction processing, space-based architecture, is required to achieve the highest performance and scalability. On the other hand, Base One makes the case for extreme scalability without departing from mainstream relational database technology.[7] For specialized applications, NoSQL architectures such as Google's BigTable can further enhance scalability. Google's massively distributed Spanner technology, positioned as a successor to BigTable, supports general-purpose database transactions and provides a more conventional SQL-based query language.[8]
Definition - What does Data Redundancy mean?
Data redundancy is a condition created within a database or data storage technology in which the same piece of data is held in two separate places.
This can mean two different fields within a single database, or two different spots in multiple software environments or platforms. Whenever data is repeated, this basically constitutes data redundancy. This can occur by accident, but is also done deliberately for backup and recovery purposes.
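A tiny sketch of why deliberate redundancy needs care: a redundant copy kept for backup and recovery can silently drift from the primary if an update misses it. The data and field names are invented for illustration:

```python
# Primary record and a deliberately redundant copy kept for recovery.
primary = {"cust_42": "alice@example.com"}
backup = dict(primary)                    # second copy of the same data

# An update that touches only the primary leaves the copies inconsistent.
primary["cust_42"] = "alice@newmail.example"

# Detect keys whose redundant copies no longer agree.
stale = {k for k in primary if backup.get(k) != primary[k]}
print(stale)   # {'cust_42'} -- the redundant copies have diverged
```

This divergence risk is the classic downside of redundancy, and it is why databases prefer a single authoritative copy (normalization) unless the duplication is managed, as in replication or backups.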
public key infrastructure (PKI)
A public key infrastructure (PKI) is a set of roles, policies, and procedures needed to create, manage, distribute, use,
store, and revoke digital certificates and manage public-key encryption. The purpose of a PKI is to facilitate the secure
electronic transfer of information for a range of network activities such as e-commerce, internet banking and confidential
email. It is required for activities where simple passwords are an inadequate authentication method and more rigorous
proof is required to confirm the identity of the parties involved in the communication and to validate the information being
transferred.[1]
In cryptography, a PKI is an arrangement that binds public keys with respective identities of entities (like persons and
organizations). The binding is established through a process of registration and issuance of certificates at and by a
certificate authority (CA). Depending on the assurance level of the binding, this may be carried out by an automated
process or under human supervision.
The PKI role that assures valid and correct registration is called a registration authority (RA). An RA is responsible for
accepting requests for digital certificates and authenticating the entity making the request.[2]
In a Microsoft PKI, a registration authority is usually called a subordinate CA.[3]
An entity must be uniquely identifiable within each CA domain on the basis of information about that entity. A third-party
validation authority (VA) can provide this entity information on behalf of the CA.
Design
Public key cryptography is a cryptographic technique that enables entities to securely communicate on an insecure public
network, and reliably verify the identity of an entity via digital signatures.[4]
A public key infrastructure (PKI) is a system for the creation, storage, and distribution of digital certificates which are used
to verify that a particular public key belongs to a certain entity. The PKI creates digital certificates which map public keys
to entities, securely stores these certificates in a central repository and revokes them if needed.[5][6][7]
A PKI consists of:[6][8][9]
A certificate authority (CA) that stores, issues and signs the digital certificates
A registration authority which verifies the identity of entities requesting their digital certificates to be stored at the CA
A central directory -- i.e., a secure location in which to store and index keys
A certificate management system managing things like access to stored certificates or the delivery of the certificates to be issued
A certificate policy
Methods of certification
Broadly speaking, there have traditionally been three approaches to getting this trust: certificate authorities (CAs), web of
trust (WoT), and simple public key infrastructure (SPKI).[citation needed]
Certificate authorities
The primary role of the CA is to digitally sign and publish the public key bound to a given user. This is done using the
CA's own private key, so that trust in the user key relies on one's trust in the validity of the CA's key. When the CA is a
third party separate from the user and the system, then it is called the Registration Authority (RA), which may or may not
be separate from the CA.[10]
The key-to-user binding is established, depending on the level of assurance the binding has, by
software or under human supervision.
The term trusted third party (TTP) may also be used for certificate authority (CA). Moreover, PKI is itself often used as a
synonym for a CA implementation.[11]
Issuer market share
In this model of trust relationships, a CA is a trusted third party - trusted both by the subject (owner) of the certificate and
by the party relying upon the certificate.
Netcraft [2], the industry standard for monitoring active TLS certificates, states that "Although the global
[TLS] ecosystem is competitive, it is dominated by a handful of major CAs — three certificate authorities (Symantec,
Comodo, GoDaddy) account for three-quarters of all issued [TLS] certificates on public-facing web servers. The top spot
has been held by Symantec (or VeriSign before it was purchased by Symantec) ever since [our] survey began, with it
currently accounting for just under a third of all certificates. To illustrate the effect of differing methodologies, amongst the
million busiest sites Symantec issued 44% of the valid, trusted certificates in use — significantly more than its overall
market share."
Temporary certificates and single sign-on
This approach involves a server that acts as an offline certificate authority within a single sign-on system. A single sign-on
server will issue digital certificates into the client system, but never stores them. Users can execute programs, etc. with the
temporary certificate. It is common to find this solution variety with X.509-based certificates.[12]
Web of trust
Main article: Web of trust
An alternative approach to the problem of public authentication of public key information is the web-of-trust scheme,
which uses self-signed certificates and third party attestations of those certificates. The singular term "web of trust" does
not imply the existence of a single web of trust, or common point of trust, but rather one of any number of potentially
disjoint "webs of trust". Examples of implementations of this approach are PGP (Pretty Good Privacy) and GnuPG (an
implementation of OpenPGP, the standardized specification of PGP). Because PGP and other implementations allow the use of
e-mail digital signatures for self-publication of public key information, it is relatively easy to implement one's own web of
trust.[13]
One of the benefits of the web of trust, such as in PGP, is that it can interoperate with a PKI CA fully trusted by all parties
in a domain (such as an internal CA in a company) that is willing to guarantee certificates, as a trusted introducer. If the
"web of trust" is completely trusted then, because of the nature of a web of trust, trusting one certificate is granting trust to
all the certificates in that web. A PKI is only as valuable as the standards and practices that control the issuance of
certificates, and including PGP or a personally instituted web of trust can significantly degrade the trustworthiness of that
enterprise's or domain's implementation of PKI.[14]
The web of trust concept was first put forth by PGP creator Phil Zimmermann in 1992 in the manual for PGP version 2.0:
As time goes on, you will accumulate keys from other people that you may want to designate as trusted introducers.
Everyone else will each choose their own trusted introducers. And everyone will gradually accumulate and distribute with
their key a collection of certifying signatures from other people, with the expectation that anyone receiving it will trust at
least one or two of the signatures. This will cause the emergence of a decentralized fault-tolerant web of confidence for all
public keys.
Simple public key infrastructure
Another alternative, which does not deal with public authentication of public key information, is the simple public key
infrastructure (SPKI) that grew out of three independent efforts to overcome the complexities of X.509 and PGP's web of
trust. SPKI does not associate users with persons, since the key is what is trusted, rather than the person. SPKI does not use
any notion of trust, as the verifier is also the issuer. This is called an "authorization loop" in SPKI terminology, where
authorization is integral to its design.[citation needed]
Blockchain-based PKI
An emerging approach for PKI is to use the blockchain technology commonly associated with modern cryptocurrency.
Since blockchain technology aims to provide a distributed and unalterable ledger of information, it has qualities considered
highly suitable for the storage and management of public keys. Emercoin is an example of a blockchain-based
cryptocurrency that supports the storage of different public key types (SSH, GPG, RFC 2230, etc.) and provides open
source software that directly supports PKI for OpenSSH servers.[citation needed]
History
Developments in PKI occurred in the early 1970s at the British intelligence agency GCHQ, where James Ellis, Clifford
Cocks and others made important discoveries related to encryption algorithms and key distribution.[15]
However, as
developments at GCHQ are highly classified, the results of this work were kept secret and not publicly acknowledged until
the mid-1990s.
The public disclosure of both secure key exchange and asymmetric key algorithms in 1976 by Diffie, Hellman, Rivest,
Shamir, and Adleman changed secure communications entirely. With the further development of high-speed digital
electronic communications (the Internet and its predecessors), a need became evident for ways in which users could
securely communicate with each other, and as a further consequence of that, for ways in which users could be sure with
whom they were actually interacting.
Assorted cryptographic protocols were invented and analyzed within which the new cryptographic primitives could be
effectively used. With the invention of the World Wide Web and its rapid spread, the need for authentication and secure
communication became still more acute. Commercial reasons alone (e.g., e-commerce, online access to proprietary
databases from web browsers) were sufficient. Taher Elgamal and others at Netscape developed the SSL protocol ('https' in
Web URLs); it included key establishment, server authentication (prior to v3, one-way only), and so on. A PKI structure
was thus created for Web users/sites wishing secure communications.
Vendors and entrepreneurs saw the possibility of a large market, started companies (or new projects at existing
companies), and began to agitate for legal recognition and protection from liability. An American Bar Association
technology project published an extensive analysis of some of the foreseeable legal aspects of PKI operations (see ABA
digital signature guidelines), and shortly thereafter, several U.S. states (Utah being the first in 1995) and other jurisdictions
throughout the world began to enact laws and adopt regulations. Consumer groups raised questions about privacy, access,
and liability considerations, which were taken into consideration more in some jurisdictions than in others.
The enacted laws and regulations differed, there were technical and operational problems in converting PKI schemes into
successful commercial operation, and progress has been much slower than pioneers had imagined it would be.
By the first few years of the 21st century, the underlying cryptographic engineering was clearly not easy to deploy
correctly. Operating procedures (manual or automatic) were not easy to correctly design (nor even if so designed, to
execute perfectly, which the engineering required). The standards that existed were insufficient.
PKI vendors have found a market, but it is not quite the market envisioned in the mid-1990s, and it has grown both more
slowly and in somewhat different ways than were anticipated.[16]
PKIs have not solved some of the problems they were
expected to, and several major vendors have gone out of business or been acquired by others. PKI has had the most success
in government implementations; the largest PKI implementation to date is the Defense Information Systems Agency
(DISA) PKI infrastructure for the Common Access Cards program.
Uses
PKIs of one type or another, and from any of several vendors, have many uses, including providing public keys and
bindings to user identities which are used for:
- Encryption and/or sender authentication of e-mail messages (e.g., using OpenPGP or S/MIME)
- Encryption and/or authentication of documents (e.g., the XML Signature [3] or XML Encryption [4] standards if documents are encoded as XML)
- Authentication of users to applications (e.g., smart card logon, client authentication with SSL). There is experimental usage for digitally signed HTTP authentication in the Enigform and mod_openpgp projects
- Bootstrapping secure communication protocols, such as Internet Key Exchange (IKE) and SSL. In both of these, initial set-up of a secure channel (a "security association") uses asymmetric-key (public-key) methods, whereas actual communication uses faster symmetric-key (secret-key) methods
- Mobile signatures, which are electronic signatures created using a mobile device and relying on signature or certification services in a location-independent telecommunication environment[17]
Open source implementations
- OpenSSL is the simplest form of CA and tool for PKI. It is a toolkit, developed in C, that is included in all major Linux distributions, and can be used both to build your own (simple) CA and to PKI-enable applications. (Apache licensed)
- EJBCA is a full-featured, enterprise-grade CA implementation developed in Java. It can be used to set up a CA both for internal use and as a service. (LGPL licensed)
- OpenCA is a full-featured CA implementation using a number of different tools. OpenCA uses OpenSSL for the underlying PKI operations.
- XCA is a graphical interface and database. XCA uses OpenSSL for the underlying PKI operations.
- TinyCA was a graphical interface for OpenSSL. (Discontinued)
- XiPKI,[18] a CA and OCSP responder. With SHA-3 support, OSGi-based (Java).
Criticism
Some argue that purchasing certificates for securing websites by SSL and securing software by code signing is a costly
venture for small businesses.[19]
Presently Symantec holds a major share of the PKI certificate market, having sold one third of all certificates issued
globally in 2013.[20] HTTP/2, the latest version of the HTTP protocol, allows unsecured connections in theory; in practice,
major browser vendors have made it clear that they will support this state-of-the-art protocol only over a PKI-secured TLS
connection.[21] Web browser implementations of HTTP/2, including Microsoft's Edge, Google's Chrome, Mozilla's Firefox,
and Opera, support HTTP/2 only over TLS, using the ALPN extension of the TLS protocol. This means that, to get the
speed benefits of HTTP/2, website owners are forced to purchase SSL certificates controlled by corporations such as
Symantec.
Current web browsers carry pre-installed intermediary certificates issued and signed by certificate authorities. This means
browsers must trust a large number of certificate providers, increasing the risk of a key compromise.
Furthermore, governments can force certificate providers to hand over their root certificate keys, which in turn would allow
them to decrypt traffic by performing a man-in-the-middle (MITM) attack.
When a key is known to be compromised it could be fixed by revoking the certificate, but such a compromise is not easily
detectable and can be a huge security breach. Browsers have to issue a security patch to revoke intermediary certificates
issued by a compromised root certificate authority.[22]
Some practical security vulnerabilities of X.509 certificates and
known cases where keys were stolen from a major certificate authority are listed below.
HASHING
Hashing, Hash Data Structure and Hash Table
Hashing is the process of mapping a large amount of data to a smaller table with the help of a hashing function. The
essence of hashing is to enable a faster searching method compared with linear or binary search. The advantage of this
searching method is its efficiency in handling vast numbers of data items in a given collection.
Due to this hashing process, the result is a hash data structure that can store or retrieve data items in an average time that
does not depend on the collection size.
A hash table is the result of storing the hash data structure in a smaller table which incorporates the hash function within
itself. The hash function is primarily responsible for the mapping between the original data item and the smaller table. The
mapping takes place via an output integer in a consistent range, produced when a given data item (of any data type) is
supplied for storage; this output integer determines the location in the smaller table for the data item. In terms of
implementation, the hash table is constructed with the help of an array, and the indices of this array are associated with the
output integer range.
Hash Table Example :
Here, we construct a hash table for storing and retrieving data related to the citizens of a country, and the social-security
numbers of the citizens are used as the keys. Let's assume that the table size is 12; therefore the hash function is the
value modulo 12.
Hence, the Hash Function would equate to:
(sum of numeric values of the characters in the data item) %12
Note! % is the modulus operator
Let us consider the following social-security numbers and produce a hashcode:
120388113D => 1+2+0+3+8+8+1+1+3+13=40
Hence, (40)%12 => Hashcode=4
310181312E => 3+1+0+1+8+1+3+1+2+14=34
Hence, (34)%12 => Hashcode=10
041176438A => 0+4+1+1+7+6+4+3+8+10=44
Hence, (44)%12 => Hashcode=8
Therefore, the Hashtable content would be as follows:
-----------------------------------------------------
0:empty
1:empty
2:empty
3:empty
4:occupied Name:Drew Smith SSN:120388113D
5:empty
6:empty
7:empty
8:occupied Name:Andy Conn SSN:041176438A
9:empty
10:occupied Name:Igor Barton SSN:310181312E
11:empty
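The worked example above can be sketched in Python. The function name `hashcode` and the table size of 12 are taken from the text; the letter mapping (A=10, B=11, ..., so D=13 and E=14) is inferred from the worked sums, and the `int(ch, 36)` trick is just one convenient way to express it:

```python
def hashcode(ssn: str, table_size: int = 12) -> int:
    # Sum the numeric values of the characters: digits keep their value,
    # letters continue the sequence (A=10, B=11, ..., D=13, E=14),
    # which is exactly what base-36 digit values give us.
    return sum(int(ch, 36) for ch in ssn) % table_size

# Build the table from the example's three records.
table = [None] * 12
for name, ssn in [("Drew Smith", "120388113D"),
                  ("Andy Conn", "041176438A"),
                  ("Igor Barton", "310181312E")]:
    table[hashcode(ssn)] = (name, ssn)

print(hashcode("120388113D"))  # 4
print(hashcode("310181312E"))  # 10
print(hashcode("041176438A"))  # 8
```

Note that this toy function takes no care to avoid collisions; two SSNs whose digit sums agree modulo 12 would map to the same slot.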
A hash function is a mathematical function that converts a numerical input value into another compressed numerical
value. The input to the hash function is of arbitrary length but output is always of fixed length. Values returned by a hash
function are called message digest or simply hash values.
Hash functions are extremely useful and appear in almost all information security applications.
Features of Hash Functions
The typical features of hash functions are −
Fixed Length Output (Hash Value)
o Hash function converts data of arbitrary length to a fixed length. This process is often referred to as hashing
the data.
o In general, the hash is much smaller than the input data, hence hash functions are sometimes called
compression functions.
o Since a hash is a smaller representation of a larger data, it is also referred to as a digest.
o Hash function with n bit output is referred to as an n-bit hash function. Popular hash functions generate
values between 160 and 512 bits.
Efficiency of Operation
o Generally for any hash function h with input x, computation of h(x) is a fast operation.
o Computationally hash functions are much faster than a symmetric encryption.
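Both features can be observed directly with Python's standard `hashlib` module: no matter how long the input, SHA-256 always produces a 256-bit (64 hex character) digest, and computing it is fast even for large inputs.

```python
import hashlib

# Inputs of wildly different lengths...
short_digest = hashlib.sha256(b"a").hexdigest()
long_digest = hashlib.sha256(b"a" * 1_000_000).hexdigest()

# ...always yield a fixed-length, 64-hex-character (256-bit) digest.
print(len(short_digest), len(long_digest))  # 64 64
```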
Properties of Hash Functions
In order to be an effective cryptographic tool, the hash function is desired to possess following properties −
Pre-Image Resistance
o This property means that it should be computationally hard to reverse a hash function.
o In other words, if a hash function h produced a hash value z, then it should be a difficult process to find any
input value x that hashes to z.
o This property protects against an attacker who only has a hash value and is trying to find the input.
Second Pre-Image Resistance
o This property means given an input and its hash, it should be hard to find a different input with the same
hash.
o In other words, if a hash function h for an input x produces hash value h(x), then it should be difficult to
find any other input value y such that h(y) = h(x).
o This property of hash function protects against an attacker who has an input value and its hash, and wants to
substitute different value as legitimate value in place of original input value.
Collision Resistance
o This property means it should be hard to find two different inputs of any length that result in the same hash.
This property is also referred to as collision free hash function.
o In other words, for a hash function h, it is hard to find any two different inputs x and y such that h(x) = h(y).
o Since a hash function is a compressing function with fixed output length, it is impossible for it to have no
collisions. The collision-resistance property only requires that such collisions be hard to find.
o This property makes it very difficult for an attacker to find two input values with the same hash.
o Also, if a hash function is collision-resistant then it is second pre-image resistant.
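Why the output length matters can be seen by deliberately weakening a real hash. The sketch below (Python; the 16-bit truncation is purely illustrative) finds a collision in a SHA-256 digest truncated to 16 bits within a few hundred tries, whereas the full 256-bit output makes the same search infeasible:

```python
import hashlib

def h16(data: bytes) -> bytes:
    # Truncate SHA-256 to 16 bits, ONLY so that collisions become
    # findable by brute force; never do this in a real system.
    return hashlib.sha256(data).digest()[:2]

# Birthday-style search: hash successive inputs until two collide.
seen = {}
i = 0
while True:
    candidate = str(i).encode()
    digest = h16(candidate)
    if digest in seen:
        a, b = seen[digest], candidate
        break
    seen[digest] = candidate
    i += 1

# Two different inputs, same (truncated) hash.
print(a != b, h16(a) == h16(b))  # True True
```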
Design of Hashing Algorithms
At the heart of a hashing is a mathematical function that operates on two fixed-size blocks of data to create a hash code.
This hash function forms the part of the hashing algorithm.
The size of each data block varies depending on the algorithm. Typically the block sizes are from 128 bits to 512 bits.
A hashing algorithm involves rounds of the above hash function, like a block cipher. Each round takes an input of a fixed
size, typically a combination of the most recent message block and the output of the last round.
This process is repeated for as many rounds as are required to hash the entire message.
The hash value of the first message block becomes an input to the second hash operation, the output of which alters the
result of the third operation, and so on. This effect is known as the avalanche effect of hashing.
The avalanche effect results in substantially different hash values for two messages that differ by even a single bit of data.
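The avalanche effect is easy to observe with Python's `hashlib`: the inputs below, `"message"` and `"massage"`, differ in a single bit ('e' is 0x65, 'a' is 0x61), yet roughly half of the 256 output bits flip.

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    # Count differing bits between two equal-length digests.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

d1 = hashlib.sha256(b"message").digest()
d2 = hashlib.sha256(b"massage").digest()  # input differs in one bit

# Expect a value near 128 out of 256 bits.
print(bit_diff(d1, d2))
```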
It is important to distinguish between the hash function and the hashing algorithm. The hash function generates a hash
code by operating on two blocks of fixed-length binary data.
The hashing algorithm is the process that uses the hash function, specifying how the message is broken up and how the
results from previous message blocks are chained together.
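The block splitting and chaining described above can be sketched as a toy construction (Python). The compression function and constants here are invented for illustration and are NOT cryptographically secure; real designs such as MD5 or SHA-2 use far more elaborate mixing:

```python
def compress(block: int, chaining: int) -> int:
    # Toy compression function on two fixed-size 32-bit values.
    # The constants are arbitrary; this is for illustration only.
    return ((chaining * 31) ^ block) * 2654435761 % 2**32

def toy_hash(message: bytes, iv: int = 0x12345678) -> int:
    # Split the message into fixed-size 4-byte blocks (zero-padded),
    # then chain: each round mixes one block with the previous output.
    padded = message + b"\x00" * (-len(message) % 4)
    state = iv
    for i in range(0, len(padded), 4):
        block = int.from_bytes(padded[i:i + 4], "big")
        state = compress(block, state)
    return state

print(hex(toy_hash(b"hello world")))
```

Because each round's output feeds the next, a change in any block alters every subsequent chaining value, which is exactly how the avalanche effect propagates through the whole message.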
Popular Hash Functions
Let us briefly see some popular hash functions −
Message Digest (MD)
MD5 was the most popular and widely used hash function for quite some years.
The MD family comprises the hash functions MD2, MD4, MD5 and MD6. MD5 was adopted as Internet Standard RFC
1321. It is a 128-bit hash function.
MD5 digests have been widely used in the software world to provide assurance about the integrity of transferred files.
For example, file servers often provide a pre-computed MD5 checksum for their files, so that a user can compare the
checksum of the downloaded file against it.
In 2004, collisions were found in MD5. An analytical attack was reported to succeed in only an hour using a computer
cluster. This collision attack compromised MD5, and it is hence no longer recommended for use.
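The file-server checksum scenario looks like this in Python (MD5 is used here only because the text discusses it; for new systems prefer SHA-256, since MD5's collision resistance is broken):

```python
import hashlib

def md5_checksum(data: bytes) -> str:
    # Compute the hex MD5 digest, as a file server would publish it.
    return hashlib.md5(data).hexdigest()

# The server publishes a pre-computed checksum alongside the file.
published = md5_checksum(b"release-1.0 contents")

# The downloader recomputes the checksum and compares.
downloaded = b"release-1.0 contents"
print(md5_checksum(downloaded) == published)  # True

# A corrupted or tampered download fails the comparison.
tampered = b"release-1.0 contents!"
print(md5_checksum(tampered) == published)  # False
```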
Secure Hash Function (SHA)
The SHA family comprises four algorithms: SHA-0, SHA-1, SHA-2, and SHA-3. Though from the same family, they are
structurally different.
The original version, SHA-0, a 160-bit hash function, was published by the National Institute of Standards and
Technology (NIST) in 1993. It had a few weaknesses and did not become very popular. Later, in 1995, SHA-1 was
designed to correct the alleged weaknesses of SHA-0.
SHA-1 is the most widely used of the existing SHA hash functions. It is employed in several widely used
applications and protocols, including Secure Sockets Layer (SSL) security.
In 2005, a method was found for uncovering collisions for SHA-1 within a practical time frame, making the long-term
employability of SHA-1 doubtful.
The SHA-2 family has four further variants, SHA-224, SHA-256, SHA-384, and SHA-512, named for the number of bits
in their hash values. No successful attacks have yet been reported on the SHA-2 hash functions.
SHA-2 is a strong hash function, but though significantly different, its basic design still follows that of SHA-1.
Hence, NIST called for new competitive hash function designs.
In October 2012, the NIST chose the Keccak algorithm as the new SHA-3 standard. Keccak offers many benefits,
such as efficient performance and good resistance for attacks.
RIPEMD
RIPEMD is an acronym for RACE Integrity Primitives Evaluation Message Digest. This set of hash functions was
designed by the open research community and is generally known as a family of European hash functions.
The set includes RIPEMD, RIPEMD-128, and RIPEMD-160. There also exist 256- and 320-bit versions of this
algorithm.
The original RIPEMD (128 bit) is based upon the design principles used in MD4 and was found to provide questionable
security. The RIPEMD 128-bit version came as a quick-fix replacement to overcome the vulnerabilities of the original
RIPEMD.
RIPEMD-160 is an improved version and the most widely used version in the family. The 256- and 320-bit versions
reduce the chance of accidental collision, but do not have higher levels of security as compared to RIPEMD-128
and RIPEMD-160, respectively.
Whirlpool
This is a 512-bit hash function.
It is derived from a modified version of the Advanced Encryption Standard (AES). One of its designers was Vincent
Rijmen, a co-creator of the AES.
Three versions of Whirlpool have been released; namely WHIRLPOOL-0, WHIRLPOOL-T, and WHIRLPOOL.
Applications of Hash Functions
There are two direct applications of hash function based on its cryptographic properties.
Password Storage
Hash functions provide protection to password storage.
Instead of storing passwords in the clear, almost all logon processes store the hash values of passwords in a file.
The password file consists of a table of pairs of the form (user id, h(P)).
An intruder can only see the hashes of passwords, even if he accesses the password file. He can neither log on using a
hash nor derive the password from the hash value, since the hash function possesses the property of pre-image
resistance.
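A minimal sketch of hashed password storage, using Python's standard `hashlib.pbkdf2_hmac` (the iteration count and the example passwords are illustrative; real systems tune the work factor and often use dedicated schemes such as bcrypt or Argon2):

```python
import hashlib
import hmac
import os

def store_password(password: str) -> tuple[bytes, bytes]:
    # A random per-user salt prevents identical passwords from
    # producing identical records and defeats precomputed tables.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest  # this pair is what the password file stores

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(candidate, digest)

salt, digest = store_password("correct horse")
print(verify_password("correct horse", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))    # False
```

Note that PBKDF2 deliberately makes each hash slow (many iterations), so that even an attacker who steals the file cannot cheaply brute-force the passwords.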
Data Integrity Check
Data integrity checking is the most common application of hash functions. It is used to generate checksums on data files.
This application provides assurance to the user about the correctness of the data.
The integrity check helps the user detect any changes made to the original file. It does not, however, provide any assurance
about origin. An attacker, instead of modifying the file data, can replace the entire file, compute an altogether new hash,
and send it to the receiver. This integrity-check application is useful only if the user is sure about the originality of the file.
Modern Cryptography
Modern cryptography is the cornerstone of computer and communications security. Its foundation is based on various
concepts of mathematics such as number theory, computational-complexity theory, and probability theory.
Characteristics of Modern Cryptography
There are three major characteristics that separate modern cryptography from the classical approach.
- Classic cryptography manipulates traditional characters, i.e., letters and digits, directly. Modern cryptography operates on binary bit sequences.
- Classic cryptography is mainly based on 'security through obscurity': the techniques employed for coding were kept secret, and only the parties involved in communication knew about them. Modern cryptography relies on publicly known mathematical algorithms for coding the information; secrecy is obtained through a secret key which is used as the seed for the algorithms. The computational difficulty of the algorithms, the absence of the secret key, etc., make it impossible for an attacker to obtain the original information even if he knows the algorithm used for coding.
- Classic cryptography requires the entire cryptosystem for communicating confidentially. Modern cryptography requires the parties interested in secure communication to possess only the secret key.
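The first two contrasts can be illustrated with a small sketch (Python; both ciphers here are toys for illustration, not for real use). A Caesar shift manipulates letters directly and its "secret" is the method itself, whereas a keyed XOR operates on bits with a fully public algorithm and keeps only the key secret:

```python
# Classic style: shift each (uppercase) letter by a fixed amount.
# The security rests entirely on the method being kept secret.
def caesar(text: str, shift: int = 3) -> str:
    return "".join(chr((ord(c) - 65 + shift) % 26 + 65) for c in text)

# Modern style: operate on bytes with a public algorithm; here a
# simple repeating-key XOR stands in for a real cipher such as AES.
# Only the key is secret.
def xor_bytes(data: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

print(caesar("ATTACK"))                         # DWWDFN
ciphertext = xor_bytes(b"ATTACK", b"k3y")
print(xor_bytes(ciphertext, b"k3y"))            # b'ATTACK'
```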
Context of Cryptography
Cryptology, the study of cryptosystems, can be subdivided into two branches −
Cryptography Cryptanalysis
What is Cryptography?
Cryptography is the art and science of making a cryptosystem that is capable of providing information security.
Cryptography deals with the actual securing of digital data. It refers to the design of mechanisms based on mathematical
algorithms that provide fundamental information security services. You can think of cryptography as the establishment of a
large toolkit containing different techniques in security applications.
What is Cryptanalysis?
The art and science of breaking the cipher text is known as cryptanalysis.
Cryptanalysis is the sister branch of cryptography and they both co-exist. The cryptographic process results in the cipher
text for transmission or storage. It involves the study of cryptographic mechanism with the intention to break them.
Cryptanalysis is also used during the design of the new cryptographic techniques to test their security strengths.
Note − Cryptography concerns with the design of cryptosystems, while cryptanalysis studies the breaking of
cryptosystems.
Security Services of Cryptography
The primary objective of using cryptography is to provide the following four fundamental information security services.
Let us now see the possible goals intended to be fulfilled by cryptography.
Confidentiality
Confidentiality is the fundamental security service provided by cryptography. It is a security service that keeps the
information from an unauthorized person. It is sometimes referred to as privacy or secrecy.
Confidentiality can be achieved through numerous means starting from physical securing to the use of mathematical
algorithms for data encryption.
Data Integrity
It is a security service that deals with identifying any alteration to the data. The data may get modified by an unauthorized
entity intentionally or accidentally. The integrity service confirms whether data is intact or not since it was last created,
transmitted, or stored by an authorized user.
Data integrity cannot prevent the alteration of data, but provides a means for detecting whether data has been manipulated
in an unauthorized manner.
Authentication
Authentication provides the identification of the originator. It confirms to the receiver that the data received has been sent
only by an identified and verified sender.
Authentication service has two variants −
Message authentication identifies the originator of the message without any regard to the router or system that has sent
the message.
Entity authentication is assurance that data has been received from a specific entity, say a particular website.
Apart from the originator, authentication may also provide assurance about other parameters related to data such as the
date and time of creation/transmission.
Non-repudiation
It is a security service that ensures that an entity cannot refuse the ownership of a previous commitment or an action. It is
an assurance that the original creator of the data cannot deny the creation or transmission of the said data to a recipient or
third party.
Non-repudiation is a property that is most desirable in situations where there are chances of a dispute over the exchange of
data. For example, once an order is placed electronically, a purchaser cannot deny the purchase order, if non-repudiation
service was enabled in this transaction.
Cryptography Primitives
Cryptography primitives are nothing but the tools and techniques in Cryptography that can be selectively used to provide a
set of desired security services −
Encryption Hash functions Message Authentication codes (MAC) Digital Signatures
The following table shows the primitives that can achieve a particular security service on their own.
Note − Cryptographic primitives are intricately related and they are often combined to achieve a set of desired security
services from a cryptosystem.
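One of the primitives listed above, the message authentication code (MAC), can be illustrated with Python's standard `hmac` module (the key and message values here are made up):

```python
import hashlib
import hmac

key = b"shared-secret-key"
message = b"transfer 100 to account 42"

# The sender attaches a MAC computed over the message with the shared key.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

# The receiver recomputes the MAC; a match authenticates the message.
expected = hmac.new(key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, expected))  # True

# Any modification of the message produces a different tag.
forged = hmac.new(key, b"transfer 999 to account 7", hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, forged))  # False
```

This also shows how primitives combine: a MAC is built from a hash function plus a secret key, providing both integrity and message authentication in one construction.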
Cryptographic Attacks
In the present era, not only business but almost all aspects of human life are driven by information. Hence, it has
become imperative to protect useful information from malicious activities such as attacks. Let us consider the types of
attacks to which information is typically subjected.
Attacks are typically categorized based on the action performed by the attacker. An attack, thus, can be passive or active.
Passive Attacks
The main goal of a passive attack is to obtain unauthorized access to the information. For example, actions such as
intercepting and eavesdropping on the communication channel can be regarded as passive attack.
These actions are passive in nature, as they neither affect information nor disrupt the communication channel. A passive
attack is often seen as stealing information. The only difference between stealing physical goods and stealing information
is that theft of data still leaves the owner in possession of that data. Passive information attacks are thus more dangerous
than the stealing of goods, as information theft may go unnoticed by the owner.
Active Attacks
An active attack involves changing the information in some way by conducting some process on the information. For
example,
Modifying the information in an unauthorized manner.
Initiating unintended or unauthorized transmission of information.
Alteration of authentication data such as originator name or timestamp associated with information
Unauthorized deletion of data.
Denial of access to information for legitimate users (denial of service).
Cryptography provides many tools and techniques for implementing cryptosystems capable of preventing most of the
attacks described above.
Assumptions of Attacker
Let us see the prevailing environment around cryptosystems followed by the types of attacks employed to break these
systems −
Environment around Cryptosystem
While considering possible attacks on the cryptosystem, it is necessary to know the cryptosystem's environment. The
attacker's assumptions and knowledge about the environment decide his capabilities.
In cryptography, the following three assumptions are made about the security environment and the attacker's capabilities.
Details of the Encryption Scheme
The design of a cryptosystem is based on the following two cryptography algorithms −
Public Algorithms − With this option, all the details of the algorithm are in the public domain, known to everyone.
Proprietary algorithms − The details of the algorithm are only known by the system designers and users.
In the case of proprietary algorithms, security is ensured through obscurity. Private algorithms may not be the strongest,
as they are developed in-house and may not be extensively investigated for weaknesses.
Secondly, they allow communication only among a closed group. Hence they are not suitable for modern communication,
where people communicate with a large number of known or unknown entities. Also, according to Kerckhoffs's principle,
the algorithm is preferred to be public, with the strength of encryption lying in the key.
Thus, the first assumption about security environment is that the encryption algorithm is known to the attacker.
Availability of Ciphertext
We know that once the plaintext is encrypted into ciphertext, it is put on an insecure public channel (say, e-mail) for
transmission. Thus, the attacker can obviously be assumed to have access to the ciphertext generated by the
cryptosystem.
Availability of Plaintext and Ciphertext
This assumption is not as obvious as the others. However, there may be situations where an attacker can have access to
plaintext and corresponding ciphertext. Some such possible circumstances are −
The attacker influences the sender to convert plaintext of his choice and obtains the ciphertext.
The receiver may divulge the plaintext to the attacker inadvertently. The attacker has access to corresponding
ciphertext gathered from open channel.
In a public-key cryptosystem, the encryption key is in the open domain and is known to any potential attacker. Using
this key, he can generate pairs of corresponding plaintexts and ciphertexts.
Cryptographic Attacks
The basic intention of an attacker is to break a cryptosystem and to find the plaintext from the ciphertext. To obtain the
plaintext, the attacker only needs to find out the secret decryption key, as the algorithm is already in public domain.
Hence, he applies maximum effort towards finding out the secret key used in the cryptosystem. Once the attacker is able to
determine the key, the attacked system is considered as broken or compromised.
Based on the methodology used, attacks on cryptosystems are categorized as follows −
Ciphertext Only Attacks (COA) − In this method, the attacker has access to a set of ciphertext(s). He does not
have access to corresponding plaintext. COA is said to be successful when the corresponding plaintext can be
determined from a given set of ciphertext. Occasionally, the encryption key can be determined from this attack.
Modern cryptosystems are guarded against ciphertext-only attacks.
Known Plaintext Attack (KPA) − In this method, the attacker knows the plaintext for some parts of the
ciphertext. The task is to decrypt the rest of the ciphertext using this information. This may be done by determining
the key or via some other method. The best example of this attack is linear cryptanalysis against block ciphers.
Chosen Plaintext Attack (CPA) − In this method, the attacker has the text of his choice encrypted. So he has the
ciphertext-plaintext pair of his choice. This simplifies his task of determining the encryption key. An example of
this attack is differential cryptanalysis applied against block ciphers as well as hash functions. A popular public key
cryptosystem, RSA is also vulnerable to chosen-plaintext attacks.
Dictionary Attack − This attack has many variants, all of which involve compiling a 'dictionary'. In the simplest
form of this attack, the attacker builds a dictionary of ciphertexts and the corresponding plaintexts that he has learnt over
a period of time. In future, when the attacker obtains a ciphertext, he refers to the dictionary to find the corresponding
plaintext.
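The two phases of the attack can be sketched in a few lines of Python. The toy Caesar cipher, the shift value, and the observed plaintexts below are illustrative assumptions, not part of any real cryptosystem:

```python
def toy_encrypt(plaintext: str, shift: int = 3) -> str:
    """Toy Caesar cipher standing in for the victim's (unknown) cryptosystem."""
    return "".join(chr((ord(c) - 97 + shift) % 26 + 97) for c in plaintext)

# Phase 1: build the dictionary from pairs observed over a period of time
known_plaintexts = ["attack", "retreat", "hold"]
dictionary = {toy_encrypt(p): p for p in known_plaintexts}

# Phase 2: a fresh ciphertext arrives; look it up instead of breaking the cipher
intercepted = toy_encrypt("retreat")
print(dictionary.get(intercepted, "<not in dictionary>"))   # retreat
```

The lookup succeeds only for ciphertexts the attacker has already catalogued, which is exactly the limitation of this attack.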
Brute Force Attack (BFA) − In this method, the attacker tries to determine the key by attempting all possible
keys. If the key is 8 bits long, then the number of possible keys is 2^8 = 256. The attacker knows the ciphertext and
the algorithm; he now attempts all 256 keys one by one for decryption. The time to complete the attack would
be very high if the key is long.
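The exhaustive search over an 8-bit key space can be sketched as follows; the single-byte XOR cipher and the known plaintext fragment (crib) are assumptions made for illustration:

```python
def xor_encrypt(data: bytes, key: int) -> bytes:
    """Toy cipher: XOR every byte with one 8-bit key (encrypting twice decrypts)."""
    return bytes(b ^ key for b in data)

ciphertext = xor_encrypt(b"meet at dawn", 0x5A)   # the key is unknown to the attacker

# Try all 2^8 = 256 keys; recognize the right one by a known plaintext fragment (crib)
recovered_key = None
for key in range(256):
    if xor_encrypt(ciphertext, key).startswith(b"meet"):
        recovered_key = key
        break

print(hex(recovered_key))   # 0x5a
```

Doubling the key length squares the search space, which is why modern ciphers use keys of 128 bits or more.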
Birthday Attack − This attack is a variant of the brute-force technique. It is used against cryptographic hash
functions. When students in a class are asked about their birthdays, each answer is one of 365 possible dates. By
the birthday paradox, we expect to find two students who share a birthday after asking only about
1.25 × √365 ≈ 24 students.
Similarly, if the hash function produces 64-bit hash values, there are 2^64 ≈ 1.8 × 10^19 possible hash values. By
repeatedly evaluating the function for different inputs, the same output is expected to be obtained after about
1.25 × √(2^64) ≈ 5.4 × 10^9 random inputs.
If the attacker is able to find two different inputs that give the same hash value, it is a collision, and that hash
function is said to be broken.
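The collision search can be demonstrated by truncating SHA-256 to 24 bits as a stand-in for a weak hash (the truncation width and the counter-style inputs are assumptions for the demo). By the birthday bound, a collision is expected after roughly 1.25 × √(2^24) ≈ 5,100 inputs:

```python
import hashlib

def weak_hash(data: bytes, bits: int = 24) -> int:
    """SHA-256 truncated to `bits` bits -- a deliberately weak hash for the demo."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

seen = {}            # hash value -> input that produced it
collision = None
count = 0
while collision is None:
    msg = f"message-{count}".encode()
    h = weak_hash(msg)
    if h in seen:
        collision = (seen[h], msg)   # two different inputs, same hash value
    seen[h] = msg
    count += 1

print(f"collision after {count} inputs: {collision}")
```

Note that the attack needs far fewer evaluations than the 2^24 a naive preimage search would require, which is the whole point of the birthday bound.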
Man-in-the-Middle Attack (MITM) − The targets of this attack are mostly public-key cryptosystems where key
exchange is involved before communication takes place.
o Host A wants to communicate with host B, and hence requests the public key of B.
o An attacker intercepts this request and sends his own public key instead, so that A takes the attacker's key
as if it came from B.
o Thus, whatever host A sends to host B, the attacker is able to read, since it is encrypted with the attacker's key.
o In order to maintain the communication, the attacker decrypts and reads the data, re-encrypts it with
B's real public key, and forwards it to B.
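The interception can be simulated with textbook RSA on tiny primes. Every number here is a toy value chosen for illustration; in practice, certificates that bind a public key to an identity are what prevent this substitution:

```python
# Toy RSA keypairs on tiny primes -- illustration only, nowhere near secure
def make_keys(p, q, e=17):
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # modular inverse (Python 3.8+)
    return (e, n), (d, n)               # (public key, private key)

def encrypt(m, pub):
    e, n = pub
    return pow(m, e, n)

def decrypt(c, priv):
    d, n = priv
    return pow(c, d, n)

pub_b, priv_b = make_keys(61, 53)       # B's real keypair
pub_m, priv_m = make_keys(89, 97)       # the attacker's keypair

# A requests B's public key; the attacker intercepts and substitutes his own
key_received_by_a = pub_m

message = 42
c = encrypt(message, key_received_by_a)  # A unknowingly encrypts to the attacker
stolen = decrypt(c, priv_m)              # the attacker reads the plaintext
relayed = encrypt(stolen, pub_b)         # re-encrypts with B's real key and forwards
print(stolen, decrypt(relayed, priv_b))  # 42 42 -- B is unaware of the interception
```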
Side Channel Attack (SCA) − This type of attack is not against any particular type of cryptosystem or algorithm.
Instead, it is launched to exploit weaknesses in the physical implementation of the cryptosystem.
Timing Attacks − These exploit the fact that different computations take different amounts of time on a processor.
By measuring such timings, it is possible to learn which computation the processor is carrying out.
For example, if the encryption takes a longer time, it can indicate that the secret key is long.
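The leak can be shown deterministically by counting byte comparisons instead of measuring wall-clock time (the secret and the guesses below are made-up values). An early-exit comparison does more work the more leading bytes a guess gets right, which is exactly why constant-time comparisons such as Python's hmac.compare_digest exist:

```python
SECRET = b"hunter2!"   # made-up secret the attacker tries to guess

def naive_compare(guess, secret):
    """Early-exit comparison; returns (equal, number of byte comparisons made)."""
    ops = 0
    for x, y in zip(guess, secret):
        ops += 1
        if x != y:
            return False, ops
    return len(guess) == len(secret), ops

# The comparison count (a proxy for elapsed time) leaks how many leading bytes match
_, ops_bad = naive_compare(b"zzzzzzzz", SECRET)    # wrong from the first byte
_, ops_good = naive_compare(b"huntzzzz", SECRET)   # first 4 bytes correct
print(ops_bad, ops_good)   # 1 5
```

By measuring which guesses take longer, an attacker can recover the secret one byte at a time instead of brute-forcing the whole value.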
Power Analysis Attacks − These attacks are similar to timing attacks except that the amount of power
consumption is used to obtain information about the nature of the underlying computations.
Fault analysis Attacks − In these attacks, errors are induced in the cryptosystem and the attacker studies the
resulting output for useful information.
Practicality of Attacks
The attacks on cryptosystems described here are largely academic, as the majority of them come from the academic
community. In fact, many academic attacks involve quite unrealistic assumptions about the environment as well as the
capabilities of the attacker. For example, in a chosen-ciphertext attack, the attacker requires an impractically large
number of deliberately chosen plaintext-ciphertext pairs, which may not be obtainable in practice.
Nonetheless, the fact that any attack exists should be a cause of concern, particularly if the attack technique has the
potential for improvement.
Data Encryption Standard (DES)
Data Encryption Standard (DES) is a symmetric-key block cipher published by the National Institute of Standards and
Technology (NIST).
DES is an implementation of a Feistel cipher and uses a 16-round Feistel structure. The block size is 64 bits. Though the
key length is 64 bits, DES has an effective key length of 56 bits, since 8 of the 64 bits of the key are not used by the
encryption algorithm (they function as check bits only). The general structure of DES is depicted in the following illustration −
Since DES is based on the Feistel cipher, all that is required to specify DES is −
Round function
Key schedule
Any additional processing − initial and final permutation
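The Feistel structure itself can be sketched generically. The toy round function and stand-in key schedule below are assumptions (DES's real f uses the expansion, S-boxes, and permutation described next), but the encrypt/decrypt symmetry is exactly the Feistel property: decryption is the same network run with the round keys reversed.

```python
MASK32 = 0xFFFFFFFF

def round_fn(half: int, round_key: int) -> int:
    """Toy stand-in for DES's f (the real one uses expansion, S-boxes, permutation)."""
    return ((half * 0x9E3779B1) ^ round_key) & MASK32

def feistel_encrypt(block: int, round_keys) -> int:
    left, right = block >> 32, block & MASK32
    for k in round_keys:
        left, right = right, left ^ round_fn(right, k)
    return (right << 32) | left              # final swap, as in DES

def feistel_decrypt(block: int, round_keys) -> int:
    # Decryption is the same network with the round keys in reverse order
    return feistel_encrypt(block, list(reversed(round_keys)))

keys = [(0xABCD1234 * (i + 1)) & 0xFFFFFFFFFFFF for i in range(16)]  # stand-in schedule
plaintext = 0x0123456789ABCDEF
ciphertext = feistel_encrypt(plaintext, keys)
print(hex(feistel_decrypt(ciphertext, keys)))   # 0x123456789abcdef
```

Note that round_fn never needs to be invertible; the XOR structure of the network provides invertibility for free.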
Initial and Final Permutation
The initial and final permutations are straight permutation boxes (P-boxes) that are inverses of each other. They have no
cryptographic significance in DES. The initial and final permutations are shown as follows −
Round Function
The heart of this cipher is the DES function, f. The DES function applies a 48-bit key to the rightmost 32 bits to produce a
32-bit output.
Expansion Permutation Box − Since the right input is 32 bits and the round key is 48 bits, we first need to expand the
right input to 48 bits. The permutation logic is graphically depicted in the following illustration −
The graphically depicted permutation logic is generally described as a table in the DES specification, as shown
−
XOR (Whitener) − After the expansion permutation, DES performs an XOR operation on the expanded right section and
the round key. The round key is used only in this operation.
Substitution Boxes − The S-boxes carry out the real mixing (confusion). DES uses 8 S-boxes, each with a 6-bit
input and a 4-bit output. Refer to the following illustration −
The S-box rule is illustrated below −
There are a total of eight S-box tables. The output of all eight S-boxes is then combined into a 32-bit section.
Straight Permutation − The 32-bit output of the S-boxes is then subjected to the straight permutation with the rule shown
in the following illustration:
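The S-box rule (outer two bits select the row, middle four bits select the column) can be demonstrated with S1, the first of the eight tables from the DES specification (FIPS 46-3):

```python
# S-box S1 from the DES specification (FIPS 46-3); the other seven follow the same rule
S1 = [
    [14,  4, 13,  1,  2, 15, 11,  8,  3, 10,  6, 12,  5,  9,  0,  7],
    [ 0, 15,  7,  4, 14,  2, 13,  1, 10,  6, 12, 11,  9,  5,  3,  8],
    [ 4,  1, 14,  8, 13,  6,  2, 11, 15, 12,  9,  7,  3, 10,  5,  0],
    [15, 12,  8,  2,  4,  9,  1,  7,  5, 11,  3, 14, 10,  0,  6, 13],
]

def sbox_lookup(box, six_bits: int) -> int:
    """Outer two bits select the row, middle four bits select the column."""
    row = ((six_bits >> 4) & 0b10) | (six_bits & 0b01)
    col = (six_bits >> 1) & 0b1111
    return box[row][col]

# Worked example from the specification: input 011011 -> row 01, column 1101
print(f"{sbox_lookup(S1, 0b011011):04b}")   # 0101
```

The 6-to-4-bit compression is what makes the S-boxes non-linear and non-invertible, the source of DES's confusion.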
Key Generation
The round-key generator creates sixteen 48-bit keys out of a 56-bit cipher key. The process of key generation is depicted in
the following illustration −
The logic for Parity drop, shifting, and Compression P-box is given in the DES description.
DES Analysis
DES satisfies both of the desired properties of a block cipher. These two properties make the cipher very strong.
Avalanche effect − A small change in the plaintext results in a very great change in the ciphertext.
Completeness − Each bit of the ciphertext depends on many bits of the plaintext.
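The avalanche effect can be observed with any strong primitive; here SHA-256 stands in (Python's standard library has no DES), with two inputs that differ in a single bit:

```python
import hashlib

def bit_diff(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# 'e' (0x65) and 'd' (0x64) differ in exactly one bit, yet the two digests
# differ in roughly half of their 256 bits
h1 = hashlib.sha256(b"avalanche").digest()
h2 = hashlib.sha256(b"avalanchd").digest()
print(f"{bit_diff(h1, h2)} of 256 bits flipped")
```

For a primitive with a good avalanche effect, about half the output bits flip on average for any single-bit input change.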
Over the years, cryptanalysts have found some weaknesses in DES when the keys selected are weak keys. Such keys
must be avoided.
DES has proved to be a very well designed block cipher. There have been no significant cryptanalytic attacks on DES
other than exhaustive key search.
Message Authentication Code (MAC)
Another type of threat that exists for data is the lack of message authentication. In this threat, the user is not sure about the
originator of the message. Message authentication can be provided using the cryptographic techniques that use secret keys
as done in case of encryption.
Message Authentication Code (MAC)
MAC algorithm is a symmetric key cryptographic technique to provide message authentication. For establishing MAC
process, the sender and receiver share a symmetric key K.
Essentially, a MAC is an encrypted checksum generated on the underlying message that is sent along with a message to
ensure message authentication.
The process of using MAC for authentication is depicted in the following illustration −
Let us now try to understand the entire process in detail −
The sender uses some publicly known MAC algorithm, inputs the message and the secret key K and produces a
MAC value.
Similar to a hash, the MAC function also compresses an arbitrarily long input into a fixed-length output. The major
difference between a hash and a MAC is that the MAC uses a secret key during the compression.
The sender forwards the message along with the MAC. Here, we assume that the message is sent in the clear, as we
are concerned with providing message origin authentication, not confidentiality. If confidentiality is required, then the
message needs encryption.
On receipt of the message and the MAC, the receiver feeds the received message and the shared secret key K into
the MAC algorithm and re-computes the MAC value.
The receiver now checks equality of freshly computed MAC with the MAC received from the sender. If they
match, then the receiver accepts the message and assures himself that the message has been sent by the intended
sender.
If the computed MAC does not match the MAC sent by the sender, the receiver cannot determine whether it is the
message that has been altered or the origin that has been falsified. As a bottom line, the receiver safely assumes
that the message is not genuine.
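The whole exchange maps directly onto HMAC, a standard MAC construction available in Python's standard library (the key and the messages below are made-up examples):

```python
import hashlib
import hmac

key = b"shared-secret-K"                 # symmetric key K established beforehand
message = b"transfer 100 to account 42"

# Sender: compute the MAC and forward (message, tag) in the clear
tag = hmac.new(key, message, hashlib.sha256).digest()

# Receiver: recompute the MAC over what was received and compare
def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)   # constant-time comparison

print(verify(key, message, tag))                        # True: accept the message
print(verify(key, b"transfer 900 to account 13", tag))  # False: altered or forged
```

Note the use of hmac.compare_digest rather than ==, so that verification itself does not leak timing information.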
Limitations of MAC
There are two major limitations of MAC, both due to its symmetric nature of operation −
Establishment of Shared Secret
o It can provide message authentication only among pre-decided legitimate users who have a shared key.
o This requires establishment of shared secret prior to use of MAC.
Inability to Provide Non-Repudiation
o Non-repudiation is the assurance that a message originator cannot deny any previously sent messages and
commitments or actions.
o MAC technique does not provide a non-repudiation service. If the sender and receiver get involved in a
dispute over message origination, MACs cannot provide a proof that a message was indeed sent by the
sender.
o Though no third party can compute the MAC, the sender could still deny having sent the message and claim
that the receiver forged it, as it is impossible to determine which of the two parties computed the MAC.
Both these limitations can be overcome by using the public-key based digital signatures discussed in the following section.
Cryptography – Digital Signatures
Digital signatures are the public-key primitives of message authentication. In the physical world, it is common to use
handwritten signatures on handwritten or typed messages. They are used to bind the signatory to the message.
Similarly, a digital signature is a technique that binds a person/entity to digital data. This binding can be independently
verified by the receiver as well as any third party.
Digital signature is a cryptographic value that is calculated from the data and a secret key known only by the signer.
In the real world, the receiver of a message needs assurance that the message belongs to the sender, and the sender should
not be able to repudiate the origination of that message. This requirement is very crucial in business applications, since the
likelihood of a dispute over exchanged data is very high.
Model of Digital Signature
As mentioned earlier, the digital signature scheme is based on public key cryptography. The model of digital signature
scheme is depicted in the following illustration −
The following points explain the entire process in detail −
Each person adopting this scheme has a public-private key pair.
Generally, the key pairs used for encryption/decryption and signing/verifying are different. The private key used
for signing is referred to as the signature key and the public key as the verification key.
Signer feeds data to the hash function and generates hash of data.
Hash value and signature key are then fed to the signature algorithm which produces the digital signature on given
hash. Signature is appended to the data and then both are sent to the verifier.
Verifier feeds the digital signature and the verification key into the verification algorithm. The verification
algorithm gives some value as output.
Verifier also runs same hash function on received data to generate hash value.
For verification, this hash value and output of verification algorithm are compared. Based on the comparison result,
verifier decides whether the digital signature is valid.
Since the digital signature is created with the 'private' key of the signer and no one else can have this key, the signer cannot
repudiate signing the data in the future.
It should be noted that instead of signing the data directly with the signing algorithm, usually a hash of the data is created.
Since the hash of the data is a unique representation of the data, it is sufficient to sign the hash in place of the data. The most
important reason for using the hash instead of the data directly for signing is the efficiency of the scheme.
Let us assume RSA is used as the signing algorithm. As discussed in public key encryption chapter, the encryption/signing
process using RSA involves modular exponentiation.
Signing large data through modular exponentiation is computationally expensive and time consuming. The hash of the data
is a relatively small digest of the data, hence signing a hash is more efficient than signing the entire data.
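The hash-then-sign flow can be sketched with textbook RSA on tiny primes. Every number here is a toy value for illustration; real schemes use large keys and a padding scheme such as RSASSA-PSS.

```python
import hashlib

# Toy RSA signature keypair (tiny primes -- illustration only)
p, q, e = 10007, 10009, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))        # signature key, kept secret by the signer

def sign(data: bytes) -> int:
    """Hash-then-sign: only the small digest goes through modular exponentiation."""
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(h, d, n)

def verify(data: bytes, signature: int) -> bool:
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(signature, e, n) == h     # the verification key e is public

msg = b"pay 500 to Bob"
sig = sign(msg)
print(verify(msg, sig))                  # True
print(verify(b"pay 5000 to Bob", sig))   # False: the data was modified
```

Signing the fixed-size digest instead of the full message is precisely the efficiency argument made above: the modular exponentiation cost no longer grows with the size of the data.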
Importance of Digital Signature
Out of all cryptographic primitives, the digital signature using public-key cryptography is considered a very important and
useful tool for achieving information security.
Apart from ability to provide non-repudiation of message, the digital signature also provides message authentication and
data integrity. Let us briefly see how this is achieved by the digital signature −
Message authentication − When the verifier validates the digital signature using the public key of the sender, he is
assured that the signature has been created only by the sender, who possesses the corresponding secret private key, and
no one else.
Data Integrity − In case an attacker has access to the data and modifies it, the digital signature verification at the
receiver's end fails. The hash of the modified data and the output provided by the verification algorithm will not match.
Hence, the receiver can safely reject the message, assuming that data integrity has been breached.
Non-repudiation − Since it is assumed that only the signer has knowledge of the signature key, only he can
create a unique signature on given data. Thus the receiver can present the data and the digital signature to a third party
as evidence if any dispute arises in the future.
By adding public-key encryption to digital signature scheme, we can create a cryptosystem that can provide the four
essential elements of security namely − Privacy, Authentication, Integrity, and Non-repudiation.
Encryption with Digital Signature
In many digital communications, it is desirable to exchange encrypted messages rather than plaintext to achieve
confidentiality. In a public-key encryption scheme, the public (encryption) key of the receiver is available in the open
domain, and hence anyone can spoof the sender's identity and send an encrypted message to the receiver.
This makes it essential for users employing PKC for encryption to seek digital signatures along with encrypted data to be
assured of message authentication and non-repudiation.
This can be achieved by combining digital signatures with the encryption scheme. Let us briefly discuss how to achieve this
requirement. There are two possibilities: sign-then-encrypt and encrypt-then-sign.
However, the cryptosystem based on sign-then-encrypt can be exploited by the receiver to spoof the identity of the sender
and send that data to a third party. Hence, this method is not preferred. The process of encrypt-then-sign is more reliable and
widely adopted. This is depicted in the following illustration −
The receiver, after receiving the encrypted data and the signature on it, first verifies the signature using the sender's public
key. After ensuring the validity of the signature, he then retrieves the data through decryption using his private key.
Cryptography – Benefits & Drawbacks
Nowadays, networks have gone global and information has taken the digital form of bits and bytes. Critical
information now gets stored, processed, and transmitted in digital form on computer systems and open communication
channels.
Since information plays such a vital role, adversaries are targeting the computer systems and open communication
channels to either steal the sensitive information or to disrupt the critical information system.
Modern cryptography provides a robust set of techniques to ensure that the malevolent intentions of the adversary are
thwarted while ensuring the legitimate users get access to information. Here in this chapter, we will discuss the benefits
that we draw from cryptography, its limitations, as well as the future of cryptography.
Cryptography – Benefits
Cryptography is an essential information security tool. It provides the four most basic services of information security −
Confidentiality − Encryption techniques can guard information and communication from unauthorized
disclosure and access.
Authentication − The cryptographic techniques such as MAC and digital signatures can protect information
against spoofing and forgeries.
Data Integrity − The cryptographic hash functions play a vital role in assuring users of data
integrity.
Non-repudiation − The digital signature provides the non-repudiation service to guard against disputes that may
arise due to the sender's denial of having passed on a message.
All these fundamental services offered by cryptography have enabled the conduct of business over networks using
computer systems in an extremely efficient and effective manner.
Cryptography – Drawbacks
Apart from the four fundamental elements of information security, there are other issues that affect the effective use of
information −
Strongly encrypted, authentic, and digitally signed information can be difficult to access even for a legitimate
user at a crucial time of decision-making. The network or the computer system can be attacked and rendered non-
functional by an intruder.
High availability, one of the fundamental aspects of information security, cannot be ensured through the use of
cryptography. Other methods are needed to guard against the threats such as denial of service or complete
breakdown of information system.
Another fundamental need of information security, selective access control, also cannot be realized through the
use of cryptography. Administrative controls and procedures are required to be exercised for the same.
Cryptography does not guard against the vulnerabilities and threats that emerge from the poor design of
systems, protocols, and procedures. These need to be fixed through proper design and setting up of a defensive
infrastructure.
Cryptography comes at a cost, in terms of time and money −
o The addition of cryptographic techniques to information processing leads to delay.
o The use of public-key cryptography requires the setting up and maintenance of a public-key infrastructure,
which demands a substantial financial budget.
The security of cryptographic technique is based on the computational difficulty of mathematical problems. Any
breakthrough in solving such mathematical problems or increasing the computing power can render a cryptographic
technique vulnerable.
Future of Cryptography
Elliptic Curve Cryptography (ECC) has already been invented, but its advantages and disadvantages are not yet fully
understood. ECC allows encryption and decryption to be performed in drastically less time, thus allowing a higher amount
of data to be passed with equal security. However, like other methods of encryption, ECC must also be tested and proven
secure before it is accepted for governmental, commercial, and private use.
Quantum computation is a new phenomenon. While modern computers store data using a binary format called a "bit"
in which a "1" or a "0" can be stored; a quantum computer stores data using a quantum superposition of multiple states.
These multiple valued states are stored in "quantum bits" or "qubits". This allows the computation of numbers to be several
orders of magnitude faster than traditional transistor processors.
To comprehend the power of a quantum computer, consider RSA-640, a number with 193 digits, which was factored by
eighty 2.2 GHz computers over the span of 5 months; a single quantum computer could factor it in less than 17 seconds.
Numbers that would typically take billions of years to compute could take only a matter of hours or even minutes with a
fully developed quantum computer.
In view of these facts, modern cryptography will have to look for computationally harder problems or devise completely
new techniques for achieving the goals presently served by modern cryptography.
Behavioral questions will be experience-based and you need a lot of practice to be able to answer them in a satisfactory
manner.
STAR Technique
To answer Behavioral Questions, employ the STAR technique −
S = Situation − (recall an incident in your life that suits the situation)
T = Task − (recall an incident in your life that suits the task)
A = Action − (mention the course of action you opted to address the situation or task)
R = Result − (mention the result of your action and the outcome)
Q − Tell me about an incident where you worked effectively under pressure.
Remember that these are only sample interview answers meant to give a general idea on the approach to
Behavioral Interviews. You need to formulate your own answers to suit the context and scenario asked in the
question.
Sample Behavioral Interview Questions
Q1 − Describe a bad experience you had working with your ex-employer.
Q2 − Describe how you handle disagreement.
Q3 − Explain a situation when you explained a complex idea simply.
Q4 − Describe a time when you had to adapt to a change at work.
Q5 − Describe a time when you made a mistake.
Q6 − Describe a time when you delegated tasks to team-mates.
Q7 − Describe when you were blamed for somebody else's mistake.
Q8 − Describe a difficult situation that you faced and how you handled it.
Q9 − Describe a new suggestion that you had made to your supervisor.
Q10 − Describe when you had to take a judgement on a difficult decision.
It is always advisable to memorize a few keywords on the company‘s needs, problems, or goals. Make sure you visit the
company‘s website before the interview to uncover the needs of this specific job profile, instead of the generalized needs
of the industry.
Sample General Interview Questions
Q1 − Tell me about yourself.
Q2 − What are your greatest strengths?
Q3 − What are your greatest weaknesses?
Q4 − Tell me about an incident you are ashamed of speaking about.
Q5 − Why did you leave (or plan to leave) your present employer?
Q6 − The Silent Treatment.
Q7 − Why should I hire you?
Q8 − Where do you see yourself five years from now?
Q9 − Why do you want to work at our company?
Q10 − Would you lie for the company?
Q11 − Questions on confidential matters.
In Case Interviews, interviewers tend not to mention important figures and details. They want to see if you have a clear
idea of the industry and of the assumptions on which you will solve the problem. In these situations, it's okay to work with
assumed data, but it needs to be based on facts and logic.
Answering Case Interview Questions
Answering case interview questions can be tricky, especially when you don‘t get the facts right. Do use the following tips
to tackle such questions −
Listen carefully − Paraphrasing helps in understanding the question completely before answering.
Take time to think − Because of the sheer number of parameters needed to tackle the issue, candidates are
expected to take some time to ponder the scenario; however, anything more than five minutes would be
excessive.
Ask questions − Interviewers deliberately give incomplete questions to check the candidates‘ understanding of
relevant parameters, so they expect a lot of questions from you which makes the entire interview quite interactive.
Use a logical framework − Apply the principles you learned in business colleges as a framework. Examples
include Porter's Five Forces and the SWOT analysis.
Prioritize objectives − Start addressing the most important objectives and concerns and gradually move towards
relatively non-priority topics.
Try and think outside the box − Many interviewers are on the lookout for employees who can bring in creativity
to their problem-solving process.
Exhibit enthusiasm − Behaving as though you feel it's fun to tackle this kind of problem is integral to showing
how well you'd fit in as a consultant or whatever position you're interviewing for.
Standard Case Interview Questions
Market-Sizing Case Interview Questions
Market-Sizing Case Interview Questions need the candidates to guess the market size for a specific product. To answer
these questions, you need to have a close idea on the population of the country, the male-female ratio, different
demographics, among many other parameters. A few popular examples are −
Q. How many light bulbs are there in Delhi?
Q. How many people read gossip magazines in Mumbai?
Q. How many photocopies are taken in Odisha each year?
Q. How much beer is consumed in the city of Chandigarh?
Business Case Interview Questions
These questions need knowledge on the internal working of a company. Visit their website and collect as much
information as possible on their way of operations.
Q. You are working directly with <company’s name> management team. It is organizing a project designed to increase
the revenue significantly. If you were provided with data and asked to supervise the project, what steps would you take to
ensure its success?
Q. The firm has assigned you to consult <company’s name> intending to drop a product or expand into new markets in
order to increase revenue. What steps would you take to help this company achieve its objective?
Q. You have been assigned to consult <shoe retailer’s name> with stores throughout the nation. Since its revenue is
dropping, the company has proposed to sell food at its stores. How would you advise this client?
Logic Problems
Questions involving logic problems require you to be able to perform mental arithmetic quickly. The following are a few
logic problems.
Q1 − At 3:15, how many degrees are between the two hands of a clock?
Q2 − A firefighter has to get to a burning building as quickly as he can. There are three paths he can take. He can take his
fire engine over a large hill (5 miles) at 8 miles per hour. He can take his fire engine through a windy road (7 miles) at 9
miles per hour. Or he can drive his fire engine along a dirt road (8 miles) at 12 miles per hour. Which way should he
choose?
Q3 − You spend 21 dollars on vegetables at the store. You buy carrots, onions, and celery. The celery cost half the cost of
the onions. The onions cost half the cost of the carrots. How much did the onions cost?
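The arithmetic behind all three answers can be checked in a few lines:

```python
# Q1: angle between the hands at 3:15
hour_angle = (3 + 15 / 60) * 30        # hour hand moves 30 degrees per hour
minute_angle = 15 * 6                  # minute hand moves 6 degrees per minute
print(abs(hour_angle - minute_angle))  # 7.5

# Q2: time in hours for each route; the shortest time wins
routes = {"hill": 5 / 8, "windy road": 7 / 9, "dirt road": 8 / 12}
print(min(routes, key=routes.get))     # hill

# Q3: celery = onions / 2 and onions = carrots / 2, so
# carrots + onions + celery = 2*onions + onions + onions/2 = 3.5 * onions = 21
print(21 / 3.5)                        # 6.0
```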
Network Security
Network Security is the process of taking physical and software preventative measures to protect the underlying
networking infrastructure from unauthorized access, misuse, malfunction, modification, destruction, or improper
disclosure, thereby creating a secure platform for computers, users, and programs to perform.
What is network security?
Network Security is an organization‘s strategy and provisions for ensuring the security of its assets and of all network
traffic. Network security is manifested in an implementation of security hardware, and software. For the purposes of this
discussion, the following approach is adopted in an effort to view network security in its entirety:
1. Policy
2. Enforcement
3. Auditing
Policy
The IT Security Policy is the principal document for network security. Its goal is to outline the rules for ensuring the
security of organizational assets. Employees today utilize several tools and applications to conduct business productively.
Policy that is driven from the organization‘s culture supports these routines and focuses on the safe enablement of these
tools to its employees. The enforcement and auditing procedures for any regulatory compliance an organization is required
to meet must be mapped out in the policy as well.
Enforcement
Most definitions of network security are narrowed to the enforcement mechanism. Enforcement concerns analyzing all
network traffic flows and should aim to preserve the confidentiality, integrity, and availability of all systems and
information on the network. These three principles compose the CIA triad:
Confidentiality - involves the protection of assets from unauthorized entities
Integrity - ensuring the modification of assets is handled in a specified and authorized manner
Availability - a state of the system in which authorized users have continuous access to said assets
Strong enforcement strives to provide CIA to network traffic flows. This begins with a classification of traffic flows by
application, user, and content. As the vehicle for content, all applications must first be identified by the firewall regardless
of port, protocol, evasive tactic, or SSL. Proper application identification allows for full visibility of the content it carries.
Policy management can be simplified by identifying applications and mapping their use to a user identity while inspecting
the content at all times for the preservation of CIA.
The concept of defense in depth is observed as a best practice in network security, prescribing for the network to be
secured in layers. These layers apply an assortment of security controls to sift out threats trying to enter the network:
Access control
Identification
Authentication
Malware detection
Encryption
File type filtering
URL filtering
Content filtering
These layers are built through the deployment of firewalls, intrusion prevention systems (IPS), and antivirus components.
Among the components for enforcement, the firewall (an access control mechanism) is the foundation of network security.
Providing CIA for network traffic flows was difficult to accomplish with previous technologies. Traditional firewalls were
plagued by controls that relied on port/protocol to identify applications (which have since developed evasive
characteristics to bypass those controls) and by the assumption that an IP address equates to a user's identity.
The next generation firewall retains an access control mission, but reengineers the technology; it observes all traffic across
all ports, can classify applications and their content, and identifies employees as users. This enables access controls
nuanced enough to enforce the IT security policy as it applies to each employee of the organization, with no compromise
to security.
Additional services for layering network security to implement a defense in depth strategy 8have been incorporated to the
traditional model as add-on components. Intrusion prevention systems (IPS) and antivirus, for example, are effective tools
for scanning content and preventing malware attacks. However, organizations must be cautious of the complexity and cost
that additional components may add to their network security and, more importantly, must not depend on these additional
components to do the core job of the firewall.
Auditing
The auditing process of network security requires checking back on enforcement measures to determine how well they
have aligned with the security policy. Auditing encourages continuous improvement by requiring organizations to reflect
on the implementation of their policy on a consistent basis. This gives organizations the opportunity to adjust their policy
and enforcement strategy in areas of evolving need.
Introduction to Networking A basic understanding of computer networks is requisite in order to understand the principles of network security. In this section, we'll cover some of the foundations of computer networking, then move on to an overview of some popular networks. Following that, we'll take a more in-depth look at TCP/IP, the network protocol suite that is used to run the Internet and many intranets.
Once we've covered this, we'll go back and discuss some of the threats that managers and administrators of computer
networks need to confront, and then some tools that can be used to reduce the exposure to the risks of network computing.
What is a Network? A ``network'' has been defined[1] as ``any set of interlinking lines resembling a net, a network of roads || an interconnected system, a network of alliances.'' This definition suits our purpose well: a computer network is simply a system of interconnected computers. How they're connected is irrelevant, and as we'll soon see, there are a number of ways to do this.
The ISO/OSI Reference Model The International Standards Organization (ISO) Open Systems Interconnect (OSI) Reference Model defines seven layers of communications types, and the interfaces among them. (See Figure 1.) Each layer depends on the services provided by the layer below it, all the way down to the physical network hardware, such as the computer's network interface card, and the wires that connect the cards together.
An easy way to look at this is to compare this model with something we use daily: the telephone. In order for you and me to
talk when we're out of earshot, we need a device like a telephone. (In the ISO/OSI model, this is at the application layer.)
The telephones, of course, are useless unless they have the ability to translate the sound into electronic pulses that can be
transferred over wire and back again. (These functions are provided in layers below the application layer.) Finally, we get
down to the physical connection: both must be plugged into an outlet that is connected to a switch that's part of the
telephone system's network of switches.
If I place a call to you, I pick up the receiver, and dial your number. This number specifies the central office to which to
send my request, and then which phone from that central office to ring. Once you answer the phone, we begin talking, and
our session has begun. Conceptually, computer networks function exactly the same way.
It isn't important for you to memorize the ISO/OSI Reference Model's layers; but it's useful to know that they exist, and
that each layer cannot work without the services provided by the layer below it.
Figure 1: The ISO/OSI Reference Model
What are some Popular Networks? Over the last 25 years or so, a number of networks and network protocols have been defined and used. We're going to look at two of these networks, both of which are ``public'' networks. Anyone can connect to either of these networks, or they can use types of networks to connect their own hosts (computers) together, without connecting to the public networks. Each type takes a very different approach to providing network services.
UUCP
UUCP (Unix-to-Unix CoPy) was originally developed to connect Unix (surprise!) hosts together. UUCP has since been ported to many different architectures, including PCs, Macs, Amigas, Apple IIs, VMS hosts, everything else you can name, and even some things you can't. Additionally, a number of systems have been developed around the same principles as UUCP.
Batch-Oriented Processing.
UUCP and similar systems are batch-oriented systems: everything that they have to do is added to a queue, and then at some specified time, everything in the queue is processed.
Implementation Environment.
UUCP networks are commonly built using dial-up (modem) connections. This doesn't have to be the case though: UUCP can be used over any sort of connection between two computers, including an Internet connection.
Building a UUCP network is a simple matter of configuring two hosts to recognize each other, and know how to get in
touch with each other. Adding on to the network is simple; if hosts called A and B have a UUCP network between them,
and C would like to join the network, then it must be configured to talk to A and/or B. Naturally, anything that C talks to
must be made aware of C's existence before any connections will work. Now, to connect D to the network, a connection
must be established with at least one of the hosts on the network, and so on. Figure 2 shows a sample UUCP network.
Figure 2: A Sample UUCP Network
In a UUCP network, users are identified in the format host!userid. The ``!'' character (pronounced ``bang'' in networking
circles) is used to separate hosts and users. A bangpath is a string of host(s) and a userid like A!cmcurtin or
C!B!A!cmcurtin. If I am a user on host A and you are a user on host E, I might be known as A!cmcurtin and you as
E!you. Because there is no direct link between your host (E) and mine (A), in order for us to communicate, we need to do
so through a host (or hosts!) that has connectivity to both E and A. In our sample network, C has the connectivity we need.
So, to send me a file, or piece of email, you would address it to C!A!cmcurtin. Or, if you feel like taking the long way
around, you can address me as C!B!A!cmcurtin.
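The bangpath construction just described can be sketched as a shortest-path search over the host links. The link table below is an assumption standing in for Figure 2 (which is not reproduced here), chosen so that C connects A's side of the network to E, as the text describes.

```python
from collections import deque

# Hypothetical UUCP link topology standing in for Figure 2.
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "E"],
    "D": ["C"],
    "E": ["C"],
}

def bangpath(graph, src, dst, userid):
    """Breadth-first search for a route, rendered as host!...!userid.
    The sender's own host is dropped, since a UUCP address is written
    relative to the sender."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return "!".join(path[1:] + [userid])
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connectivity between src and dst
```

With these assumed links, a user on E addressing cmcurtin on A gets the shortest path C!A!cmcurtin; the longer C!B!A!cmcurtin route from the text is equally valid, just not what a shortest-path search would pick.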
The ``public'' UUCP network is simply a huge worldwide network of hosts connected to each other.
Popularity.
The public UUCP network has been shrinking in size over the years, with the rise of the availability of inexpensive Internet connections. Additionally, since UUCP connections are typically made hourly, daily, or weekly, there is a fair bit of delay in getting data from one user on a UUCP network to a user on the other end of the network. UUCP isn't very flexible, as it's used for simply copying files (which can be netnews, email, documents, etc.) Interactive protocols (that make applications such as the World Wide Web possible) have become much more the norm, and are preferred in most cases.
However, there are still many people whose needs for email and netnews are served quite well by UUCP, and its
integration into the Internet has greatly reduced the amount of cumbersome addressing that had to be accomplished in
times past.
Security.
UUCP, like any other application, has security tradeoffs. Some strong points for its security are that it is fairly limited in what it can do, and it's therefore more difficult to trick into doing something it shouldn't; it's been around a long time, and most of its bugs have been discovered, analyzed, and fixed; and because UUCP networks are made up of occasional connections to other hosts, it isn't possible for someone on host E to directly make contact with host B, and take advantage of that connection to do something naughty.
On the other hand, UUCP typically works by having a system-wide UUCP user account and password. Any system that
has a UUCP connection with another must know the appropriate password for the uucp or nuucp account. Identifying a
host beyond that point has traditionally been little more than a matter of trusting that the host is who it claims to be, and
that a connection is allowed at that time. More recently, there has been an additional layer of authentication, whereby both
hosts must have the same sequence number, that is, a number that is incremented each time a connection is made.
Hence, if I run host B, I know the uucp password on host A. If, though, I want to impersonate host C, I'll need to connect,
identify myself as C, hope that I've done so at a time that A will allow it, and try to guess the correct sequence number for
the session. While this might not be a trivial attack, it isn't considered very secure.
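A minimal sketch of that sequence-number check, under the assumed semantics above (both ends keep a counter that must match and is bumped after each successful call; the password values and peer table are invented):

```python
# Sketch of UUCP-style password-plus-sequence-number authentication.
# Passwords and counters below are hypothetical.

class HostRecord:
    def __init__(self, password, seq=0):
        self.password = password
        self.seq = seq

# Host A's table of peers it accepts calls from.
PEERS = {"B": HostRecord("uucp-secret-b", seq=41),
         "C": HostRecord("uucp-secret-c", seq=7)}

def accept_call(claimed_host, password, claimed_seq):
    """Accept only if the password AND the sequence number both match,
    then increment the stored counter for the next call."""
    rec = PEERS.get(claimed_host)
    if rec is None or password != rec.password or claimed_seq != rec.seq:
        return False
    rec.seq += 1  # both ends bump their counter after a successful call
    return True
```

An impersonator who knows the password but not the current counter is rejected, and even replaying an observed, once-valid call fails because the counter has since moved on.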
The Internet
Internet: This is a word that I've heard way too often in the last few years. Movies, books, newspapers, magazines, television programs, and practically every other sort of media imaginable has dealt with the Internet recently.
What is the Internet?
The Internet is the world's largest network of networks. When you want to access the resources offered by the Internet, you don't really connect to the Internet; you connect to a network that is eventually connected to the Internet backbone, a network of extremely fast (and incredibly overloaded!) network components. This is an important point: the Internet is a network of networks -- not a network of hosts.
A simple network can be constructed using the same protocols and such that the Internet uses without actually connecting
it to anything else. Such a basic network is shown in Figure 3.
Figure 3: A Simple Local Area Network
I might be allowed to put one of my hosts on one of my employer's networks. We have a number of networks, which are
all connected together on a backbone, that is, a network of our networks. Our backbone is then connected to other
networks, one of which belongs to an Internet Service Provider (ISP) whose backbone is connected to other networks, one of
which is the Internet backbone.
If you have a connection ``to the Internet'' through a local ISP, you are actually connecting your computer to one of their
networks, which is connected to another, and so on. To use a service from my host, such as a web server, you would tell
your web browser to connect to my host. Underlying services and protocols would send packets (small datagrams) with
your query to your ISP's network, and then a network they're connected to, and so on, until it found a path to my
employer's backbone, and to the exact network my host is on. My host would then respond appropriately, and the same
would happen in reverse: packets would traverse all of the connections until they found their way back to your computer,
and you were looking at my web page.
In Figure 4, the network shown in Figure 3 is designated ``LAN 1'' and shown in the bottom-right of the picture. This
shows how the hosts on that network are provided connectivity to other hosts on the same LAN, within the same company,
outside of the company, but in the same ISP cloud, and then from another ISP somewhere on the Internet.
Figure 4: A Wider View of Internet-connected Networks
The Internet is made up of a wide variety of hosts, from supercomputers to personal computers, including every
imaginable type of hardware and software. How do all of these computers understand each other and work together?
TCP/IP: The Language of the Internet TCP/IP (Transmission Control Protocol/Internet Protocol) is the ``language'' of the Internet. Anything that can learn to ``speak TCP/IP'' can play on the Internet. This is functionality that occurs at the Network (IP) and Transport (TCP) layers in the ISO/OSI Reference Model. Consequently, a host that has TCP/IP functionality (such as Unix, OS/2, MacOS, or Windows NT) can easily support applications (such as Netscape's Navigator) that use the network.
Open Design One of the most important features of TCP/IP isn't a technological one: The protocol is an ``open'' protocol, and anyone who wishes to implement it may do so freely. Engineers and scientists from all over the world participate in the IETF (Internet Engineering Task Force) working groups that design the protocols that make the Internet work. Their time is typically donated by their companies, and the result is work that benefits everyone.
IP As noted, IP is a ``network layer'' protocol. This is the layer that allows the hosts to actually ``talk'' to each other. Its duties include carrying datagrams, mapping Internet addresses (such as 10.2.3.4) to physical network addresses (such as 08:00:69:0a:ca:8f), and routing, which takes care of making sure that all of the devices that have Internet connectivity can find their way to each other.
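Routing, in particular, comes down to longest-prefix matching: a router forwards each datagram according to the most specific route that covers the destination address. A toy sketch using Python's `ipaddress` module (the prefixes and interface names are made up):

```python
import ipaddress

# Toy longest-prefix-match routing table; prefixes and next hops are invented.
ROUTES = [
    (ipaddress.ip_network("10.2.0.0/16"), "eth0"),
    (ipaddress.ip_network("10.2.3.0/24"), "eth1"),
    (ipaddress.ip_network("0.0.0.0/0"), "isp-uplink"),  # default route
]

def next_hop(addr):
    """Pick the most specific (longest-prefix) matching route, as IP routers do."""
    dest = ipaddress.ip_address(addr)
    matches = [(net, hop) for net, hop in ROUTES if dest in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]
```

The address 10.2.3.4 matches all three routes, but the /24 is the most specific, so it wins; an address outside every specific prefix falls through to the default route.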
Understanding IP
IP has a number of very important features which make it an extremely robust and flexible protocol. For our purposes, though, we're going to focus on the security of IP, or more specifically, the lack thereof.
Attacks Against IP
A number of attacks against IP are possible. Typically, these exploit the fact that IP does not provide a robust mechanism for authentication, that is, proving that a packet came from where it claims it did. A packet simply claims to originate from a given address, and there isn't a way to be sure that the host that sent the packet is telling the truth. This isn't necessarily a weakness, per se, but it is an important point, because it means that the facility of host authentication has to be provided at a higher layer on the ISO/OSI Reference Model. Today, applications that require strong host authentication (such as cryptographic applications) do this at the application layer.
IP Spoofing.
This is where one host claims to have the IP address of another. Since many systems (such as router access control lists) define which packets may and which packets may not pass based on the sender's IP address, this is a useful technique to an attacker: he can send packets to a host, perhaps causing it to take some sort of action.
Additionally, some applications allow login based on the IP address of the person making the request (such as the Berkeley
r-commands)[2]. These are both good examples of how trusting untrustworthy layers can provide security that is -- at best --
weak.
IP Session Hijacking.
This is a relatively sophisticated attack, first described by Steve Bellovin [3]. This is very dangerous, however, because there are now toolkits available in the underground community that allow otherwise unskilled bad-guy-wannabes to perpetrate this attack. IP Session Hijacking is an attack whereby a user's session is taken over and placed under the control of the attacker. If the user was in the middle of email, the attacker is looking at the email, and then can execute any commands he wishes as the attacked user. The attacked user simply sees his session dropped, and may simply login again, perhaps not even noticing that the attacker is still logged in and doing things.
For the description of the attack, let's return to our large network of networks in Figure 4. In this attack, a user on host A is
carrying on a session with host G. Perhaps this is a telnet session, where the user is reading his email, or using a Unix
shell account from home. Somewhere in the network between A and G sits host H which is run by a naughty person. The
naughty person on host H watches the traffic between A and G, and runs a tool which starts to impersonate A to G, and at the
same time tells A to shut up, perhaps trying to convince it that G is no longer on the net (which might happen in the event of
a crash, or major network outage). After a few seconds of this, if the attack is successful, naughty person has ``hijacked''
the session of our user. Anything that the user can do legitimately can now be done by the attacker, illegitimately. As far as
G knows, nothing has happened.
This can be solved by replacing standard telnet-type applications with encrypted versions of the same thing. In this case,
the attacker can still take over the session, but he'll see only ``gibberish'' because the session is encrypted. The attacker will
not have the needed cryptographic key(s) to decrypt the data stream from G, and will, therefore, be unable to do anything
with the session.
TCP TCP is a transport-layer protocol. It needs to sit on top of a network-layer protocol, and was designed to ride atop IP. (Just as IP was designed to carry, among other things, TCP packets.) Because TCP and IP were designed together and wherever you have one, you typically have the other, the entire suite of Internet protocols are known collectively as ``TCP/IP.'' TCP itself has a number of important features that we'll cover briefly.
Guaranteed Packet Delivery
Probably the most important is guaranteed packet delivery. Host A sending packets to host B expects to get acknowledgments back for each packet. If B does not send an acknowledgment within a specified amount of time, A will resend the packet.
Applications on host B will expect a data stream from a TCP session to be complete, and in order. As noted, if a packet is
missing, it will be resent by A, and if packets arrive out of order, B will arrange them in proper order before passing the
data to the requesting application.
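That receiver-side behavior (buffer out-of-order segments, drop retransmitted duplicates, and release only a contiguous in-order stream to the application) can be sketched as follows. This is an illustration of the idea, not of any real TCP implementation; sequence numbers here count whole segments rather than bytes.

```python
# Sketch of TCP-style receiver reassembly: segments arrive in any order,
# possibly duplicated, and are released to the application in order.

def reassemble(segments):
    """segments: iterable of (seq_number, data), numbered 0, 1, 2, ...
    Returns the data delivered to the application, in order, each once."""
    buffer = {}
    expected = 0
    delivered = []
    for seq, data in segments:
        if seq >= expected:               # drop retransmits already delivered
            buffer.setdefault(seq, data)  # drop duplicates still buffered
        while expected in buffer:         # release any contiguous run
            delivered.append(buffer.pop(expected))
            expected += 1
    return delivered
```

Notice that segment 2, arriving before segment 1, is held back until the gap is filled, and a late duplicate of segment 2 is simply discarded.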
This is suited well toward a number of applications, such as a telnet session. A user wants to be sure every keystroke is
received by the remote host, and that it gets every packet sent back, even if this means occasional slight delays in
responsiveness while a lost packet is resent, or while out-of-order packets are rearranged.
It is not suited well toward other applications, such as streaming audio or video, however. In these, it doesn't really matter
if a packet is lost (a lost packet in a stream of 100 won't be distinguishable) but it does matter if they arrive late (i.e.,
because of a host resending a packet presumed lost), since the data stream will be paused while the lost packet is being
resent. Once the lost packet is received, it will be put in the proper slot in the data stream, and then passed up to the
application.
UDP UDP (User Datagram Protocol) is a simple transport-layer protocol. It does not provide the same features as TCP, and is thus considered ``unreliable.'' Again, although this is unsuitable for some applications, it does have much more applicability in other applications than the more reliable and robust TCP.
Lower Overhead than TCP
One of the things that makes UDP nice is its simplicity. Because it doesn't need to keep track of the sequence of packets, whether they ever made it to their destination, etc., it has lower overhead than TCP. This is another reason why it's more suited to streaming-data applications: there's less screwing around that needs to be done with making sure all the packets are there, in the right order, and that sort of thing.
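The contrast shows up directly in the sockets API: a UDP exchange is just a datagram fired at an address, with no handshake, acknowledgment, or retransmission. A minimal sketch over the loopback interface (where delivery is dependable; across a real network, the datagram could simply be lost and `recvfrom` would wait forever without a timeout):

```python
import socket

def udp_roundtrip(message: bytes) -> bytes:
    """Send one UDP datagram from one socket to another on loopback."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))       # let the OS pick a free port
    addr = receiver.getsockname()
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(message, addr)          # fire and forget: no handshake
    data, _ = receiver.recvfrom(1024)     # one datagram in, one datagram out
    receiver.close()
    sender.close()
    return data
```

Compare this with TCP, where the same exchange would require `connect`/`accept` to establish a session before any data moved.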
Risk Management: The Game of Security It's very important to understand that in security, one simply cannot say ``what's the best firewall?'' There are two extremes: absolute security and absolute access. The closest we can get to an absolutely secure machine is one unplugged from the network, unplugged from its power supply, locked in a safe, and thrown to the bottom of the ocean. Unfortunately, it isn't terribly useful in this state. A machine
with absolute access is extremely convenient to use: it's simply there, and will do whatever you tell it, without questions, authorization, passwords, or any other mechanism. Unfortunately, this isn't terribly practical, either: the Internet is a bad neighborhood now, and it isn't long before some bonehead will tell the computer to do something like self-destruct, after which, it isn't terribly useful to you.
This is no different from our daily lives. We constantly make decisions about what risks we're willing to accept. When we
get in a car and drive to work, there's a certain risk that we're taking. It's possible that something completely out of control
will cause us to become part of an accident on the highway. When we get on an airplane, we're accepting the level of risk
involved as the price of convenience. However, most people have a mental picture of what an acceptable risk is, and won't
go beyond that in most circumstances. If I happen to be upstairs at home, and want to leave for work, I'm not going to
jump out the window. Yes, it would be more convenient, but the risk of injury outweighs the advantage of convenience.
Every organization needs to decide for itself where between the two extremes of total security and total access they need to
be. A policy needs to articulate this, and then define how that will be enforced with practices and such. Everything that is
done in the name of security, then, must enforce that policy uniformly.
Types And Sources Of Network Threats Now, we've covered enough background information on networking that we can actually get into the security aspects of all of this. First of all, we'll get into the types of threats there are against networked computers, and then some things that can be done to protect yourself against various threats.
Denial-of-Service DoS (Denial-of-Service) attacks are probably the nastiest, and most difficult to address. They are so nasty because they're very easy to launch, difficult (sometimes impossible) to track, and it isn't easy to refuse the requests of the attacker without also refusing legitimate requests for service.
The premise of a DoS attack is simple: send more requests to the machine than it can handle. There are toolkits available in
the underground community that make this a simple matter of running a program and telling it which host to blast with
requests. The attacker's program simply makes a connection on some service port, perhaps forging the packet's header
information that says where the packet came from, and then dropping the connection. If the host is able to answer 20
requests per second, and the attacker is sending 50 per second, obviously the host will be unable to service all of the
attacker's requests, much less any legitimate requests (hits on the web site running there, for example).
Such attacks were fairly common in late 1996 and early 1997, but are now becoming less popular.
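The 20-versus-50 arithmetic above is worth making explicit: under sustained overload, the unserved backlog grows linearly and without bound, so no amount of queueing saves the host.

```python
# Back-of-the-envelope model of the example above: 50 requests/second
# arriving at a host that can service only 20 per second.

def backlog_after(seconds, arrival_rate=50, service_rate=20):
    """Unserved requests accumulated after `seconds` of sustained overload."""
    return max(0, (arrival_rate - service_rate) * seconds)
```

After just one minute of such an attack, the host is 1,800 requests behind; legitimate requests queue behind (or are dropped along with) the attacker's.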
Some things that can be done to reduce the risk of being stung by a denial of service attack include:
Not running your visible-to-the-world servers at a level too close to capacity
Using packet filtering to prevent obviously forged packets from entering into your network address space. Obviously forged packets would include those that claim to come from your own hosts, addresses reserved for private networks as defined in RFC 1918 [4], and the loopback network (127.0.0.0).
Keeping up-to-date on security-related patches for your hosts' operating systems.
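A sketch of such an ingress filter, using Python's `ipaddress` module. The `OUR_NETWORKS` value is a placeholder (a documentation range standing in for your real address space); note that `is_private` covers RFC 1918 plus other special-use ranges, which is fine for this purpose.

```python
import ipaddress

# Placeholder for your own address space (TEST-NET-3 documentation range).
OUR_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]

def obviously_forged(src: str) -> bool:
    """True if an *inbound* packet's claimed source address could not
    legitimately originate outside: private/special-use ranges, loopback,
    or our own hosts."""
    addr = ipaddress.ip_address(src)
    if addr.is_private or addr.is_loopback:
        return True
    return any(addr in net for net in OUR_NETWORKS)
```

A border router applying this test drops packets claiming to come from 10.x.x.x, 127.x.x.x, or your own prefix before they ever reach an internal host.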
Unauthorized Access ``Unauthorized access'' is a very high-level term that can refer to a number of different sorts of attacks. The goal of these attacks is to access some resource that your machine should not provide the attacker. For example, a host might be a web server, and should provide anyone with requested web pages. However, that host should not provide command shell access without being sure that the person making such a request is someone who should get it, such as a local administrator.
Executing Commands Illicitly
It's obviously undesirable for an unknown and untrusted person to be able to execute commands on your server machines. There are two main classifications of the severity of this problem: normal user access, and administrator access. A normal user can do a number of things on a system (such as read files, mail them to other people, etc.) that an attacker should not be able to do. This might, then, be all the access that an attacker needs. On the other hand, an attacker might wish to make configuration changes to a host (perhaps changing its IP address, putting a start-up script in place to cause the machine to shut down every time it's started, or something similar). In this case, the attacker will need to gain administrator privileges on the host.
Confidentiality Breaches
We need to examine the threat model: what is it that you're trying to protect yourself against? There is certain information that could be quite damaging if it fell into the hands of a competitor, an enemy, or the public. In these cases, it's possible that compromise of a normal user's account on the machine can be enough to cause damage (perhaps in the form of PR, or obtaining information that can be used against the company, etc.)
While many of the perpetrators of these sorts of break-ins are merely thrill-seekers interested in nothing more than to see a
shell prompt for your computer on their screen, there are those who are more malicious, as we'll consider next.
(Additionally, keep in mind that it's possible that someone who is normally interested in nothing more than the thrill could
be persuaded to do more: perhaps an unscrupulous competitor is willing to hire such a person to hurt you.)
Destructive Behavior
Among the destructive sorts of break-ins and attacks, there are two major categories.
Data Diddling.
The data diddler is likely the worst sort, since the fact of a break-in might not be immediately obvious. Perhaps he's toying with the numbers in your spreadsheets, or changing the dates in your projections and plans. Maybe he's changing the account numbers for the auto-deposit of certain paychecks. In any case, rare is the case when you'll come in to work one day, and simply know that something is wrong. An accounting procedure might turn up a discrepancy in the books three or four months after the fact. Trying to track the problem down will certainly be difficult, and once that problem is discovered, how can any of your numbers from that time period be trusted? How far back do you have to go before you think that your data is safe?
Data Destruction.
Some of those who perpetrate attacks are simply twisted jerks who like to delete things. In these cases, the impact on your computing capability -- and consequently your business -- can be nothing less than if a fire or other disaster caused your computing equipment to be completely destroyed.
Where Do They Come From? How, though, does an attacker gain access to your equipment? Through any connection that you have to the outside world. This includes Internet connections, dial-up modems, and even physical access. (How do you know that one of the temps that you've brought in to help with the data entry isn't really a system cracker looking for passwords, data phone numbers, vulnerabilities and anything else that can get him access to your equipment?)
In order to be able to adequately address security, all possible avenues of entry must be identified and evaluated. The
security of that entry point must be consistent with your stated policy on acceptable risk levels.
Lessons Learned From looking at the sorts of attacks that are common, we can divine a relatively short list of high-level practices that can help prevent security disasters, and to help control the damage in the event that preventative measures were unsuccessful in warding off an attack.
Hope you have backups
This isn't just a good idea from a security point of view. Operational requirements should dictate the backup policy, and this should be closely coordinated with a disaster recovery plan, such that if an airplane crashes into your building one night, you'll be able to carry on your business from another location. Similarly, these can be useful in recovering your data in the event of an electronic disaster: a hardware failure, or a break-in that changes or otherwise damages your data.
Don't put data where it doesn't need to be
Although this should go without saying, this doesn't occur to lots of folks. As a result, information that doesn't need to be accessible from the outside world sometimes is, and this can needlessly increase the severity of a break-in dramatically.
Avoid systems with single points of failure
Any security system that can be broken by breaking through any one component isn't really very strong. In security, a degree of redundancy is good, and can help you protect your organization from a minor security breach becoming a catastrophe.
Stay current with relevant operating system patches
Be sure that someone who knows what you've got is watching the vendors' security advisories. Exploiting old bugs is still one of the most common (and most effective!) means of breaking into systems.
Watch for relevant security advisories
In addition to watching what the vendors are saying, keep a close watch on groups like CERT and CIAC. Make sure that at least one person (preferably more) is subscribed to these mailing lists.
Have someone on staff be familiar with security practices
Having at least one person who is charged with keeping abreast of security developments is a good idea. This need not be a technical wizard, but could be someone who is simply able to read advisories issued by various incident response teams, and keep track of various problems that arise. Such a person would then be a wise one to consult with on security related issues, as he'll be the one who knows if web server software version such-and-such has any known problems, etc.
This person should also know the ``dos'' and ``don'ts'' of security, from reading such things as the ``Site Security
Handbook.''[5]
Firewalls As we've seen in our discussion of the Internet and similar networks, connecting an organization to the Internet provides a two-way flow of traffic. This is clearly undesirable in many organizations, as proprietary information is often displayed freely within a corporate intranet (that is, a TCP/IP network, modeled after the Internet that only works within the organization).
In order to provide some level of separation between an organization's intranet and the Internet, firewalls have been
employed. A firewall is simply a group of components that collectively form a barrier between two networks.
A number of terms specific to firewalls and networking are going to be used throughout this section, so let's introduce
them all together.
Bastion host. A general-purpose computer used to control access between the internal (private) network (intranet) and the Internet (or any other untrusted network). Typically, these are hosts running a flavor of the Unix operating system that has been customized in order to reduce its functionality to only what is necessary to support its functions. Many of the general-purpose features have been turned off, and in many cases, completely removed, in order to improve the security of the machine.
Router.
A special purpose computer for connecting networks together. Routers also handle certain functions, such as routing, or managing the traffic on the networks they connect.
Access Control List (ACL). Many routers now have the ability to selectively perform their duties, based on a number of facts about a packet that comes to it. This includes things like origination address, destination address, destination service port, and so on. These can be employed to limit the sorts of packets that are allowed to come in and go out of a given network.
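The first-match-wins evaluation of an ACL, with the implicit deny at the end, can be sketched as follows. The rules below are invented for illustration (192.0.2.0/24 is a documentation range standing in for a DMZ).

```python
import ipaddress

# Toy router ACL: first matching rule wins; unmatched traffic is denied.
# Each rule is (action, source network, destination network, dest port),
# where a port of None means "any port". All values are hypothetical.
ACL = [
    ("permit", "0.0.0.0/0",    "192.0.2.0/24", 80),   # web traffic to DMZ
    ("deny",   "0.0.0.0/0",    "192.0.2.0/24", 23),   # no telnet from outside
    ("permit", "192.0.2.0/24", "0.0.0.0/0",    None), # DMZ hosts may talk out
]

def acl_decision(src, dst, port):
    """Evaluate rules top to bottom; return the first match's action."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    for action, src_net, dst_net, rule_port in ACL:
        if (s in ipaddress.ip_network(src_net)
                and d in ipaddress.ip_network(dst_net)
                and (rule_port is None or rule_port == port)):
            return action
    return "deny"  # implicit deny at the end of every ACL
```

Ordering matters: a packet matching an early permit never reaches a later deny, and anything the list doesn't mention at all is silently dropped.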
Demilitarized Zone (DMZ). The DMZ is a critical part of a firewall: it is a network that is neither part of the untrusted network, nor part of the trusted network. But, this is a network that connects the untrusted to the trusted. The importance of a DMZ is tremendous: someone who breaks into your network from the Internet should have to get through several layers in order to successfully do so. Those layers are provided by various components within the DMZ.
Proxy. This is the process of having one host act on behalf of another. A host that has the ability to fetch documents from the Internet might be configured as a proxy server, and hosts on the intranet might be configured as proxy clients. In this situation, when a host on the intranet wishes to fetch the <http://www.interhack.net/> web page, for example, the browser will make a connection to the proxy server, and request the given URL. The proxy server will fetch the document, and return the result to the client. In this way, all hosts on the intranet are able to access resources on the Internet without having the ability to talk directly to the Internet.
Types of Firewalls There are three basic types of firewalls, and we'll consider each of them.
Application Gateways
The first firewalls were application gateways, and are sometimes known as proxy gateways. These are made up of bastion hosts that run special software to act as a proxy server. This software runs at the Application Layer of our old friend the ISO/OSI Reference Model, hence the name. Clients behind the firewall must be proxitized (that is, must know how to use the proxy, and be configured to do so) in order to use Internet services. Traditionally, these have been the most secure, because they don't allow anything to pass by default, but need to have the programs written and turned on in order to begin passing traffic.
Figure 5: A sample application gateway
These are also typically the slowest, because more processes need to be started in order to have a request serviced. Figure 5 shows an application gateway.
Packet Filtering
Packet filtering is a technique whereby routers have ACLs (Access Control Lists) turned on. By default, a router will pass all traffic sent to it, and will do so without any sort of restrictions. Employing ACLs is a method for enforcing your security policy with regard to what sorts of access you allow the outside world to have to your internal network, and vice versa.
There is less overhead in packet filtering than with an application gateway, because the feature of access control is
performed at a lower ISO/OSI layer (typically, the transport or session layer). Due to the lower overhead and the fact that
packet filtering is done with routers, which are specialized computers optimized for tasks related to networking, a packet
filtering gateway is often much faster than its application layer cousins. Figure 6 shows a packet filtering gateway.
Because we're working at a lower level, supporting new applications either comes automatically, or is a simple matter of allowing a specific packet type to pass through the gateway. (Note that the mere possibility of something doesn't automatically make it a good idea; opening things up this way might very well compromise your level of security below what your policy allows.)
There are problems with this method, though. Remember, TCP/IP has absolutely no means of guaranteeing that the source
address is really what it claims to be. As a result, we have to use layers of packet filters in order to localize the traffic. We
can't get all the way down to the actual host, but with two layers of packet filters, we can differentiate between a packet
that came from the Internet and one that came from our internal network. We can identify which network the packet came
from with certainty, but we can't get more specific than that.
Hybrid Systems
In an attempt to marry the security of the application layer gateways with the flexibility and speed of packet filtering, some vendors have created systems that use the principles of both.
Figure 6: A sample packet filtering gateway
In some of these systems, new connections must be authenticated and approved at the application layer. Once this has been
done, the remainder of the connection is passed down to the session layer, where packet filters watch the connection to
ensure that only packets that are part of an ongoing (already authenticated and approved) conversation are being passed.
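The hybrid behavior described above can be sketched as follows. This is a conceptual Python model (the credential check and tuple layout are illustrative): a connection is approved once at the application layer, after which only cheap per-packet state lookups are needed:

```python
# Connection state for the stateful filter: tuples identifying approved conversations.
approved = set()   # (src, dst, dport)

def authenticate(src, dst, dport, credentials):
    """Stand-in for the one-time, expensive application-layer authentication."""
    if credentials == "valid-token":           # illustrative check only
        approved.add((src, dst, dport))
        return True
    return False

def pass_packet(src, dst, dport):
    """Fast path: subsequent packets only need a connection-state lookup."""
    return (src, dst, dport) in approved

authenticate("10.0.0.5", "192.0.2.7", 443, "valid-token")
print(pass_packet("10.0.0.5", "192.0.2.7", 443))   # True: part of an approved conversation
print(pass_packet("10.0.0.9", "192.0.2.7", 443))   # False: never authenticated
```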
Other possibilities include using both packet filtering and application layer proxies. The benefits here include providing a measure of protection for your machines that provide services to the Internet (such as a public web server), as well as providing the security of an application layer gateway to the internal network. Additionally, using this method, an attacker, in order to get to services on the internal network, will have to break through the access router, the bastion host, and the choke router.
So, what's best for me? Lots of options are available, and it makes sense to spend some time with an expert, either in-house, or an experienced consultant who can take the time to understand your organization's security policy, and can design and build a firewall architecture that best implements that policy. Other issues like services required, convenience, and scalability might factor in to the final design.
Some Words of Caution The business of building firewalls is in the process of becoming a commodity market. Along with commodity markets come lots of folks who are looking for a way to make a buck without necessarily knowing what they're doing. Additionally, vendors compete with each other to claim the greatest security, the easiest administration, and the least visibility to end users. In order to try to quantify the potential security of firewalls, some organizations have taken to firewall certifications. The certification of a firewall means nothing more than the fact that it can be configured in such a way that it can pass a series of tests. Similarly, claims about meeting or exceeding U.S. Department of Defense ``Orange Book'' standards, C-2, B-1, and such all simply mean that an organization was able to configure a machine to pass a series of tests. This doesn't mean that it was loaded with the vendor's software at the time, or that the machine was even usable. In fact, one vendor claiming its operating system is ``C-2 Certified'' didn't mention that the operating system only passed the C-2 tests without being connected to any sort of network devices.
Such gauges as market share, certification, and the like are no guarantees of security or quality. Taking a little bit of time
to talk to some knowledgeable folks can go a long way in providing you a comfortable level of security between your
private network and the big, bad Internet.
Additionally, it's important to note that many consultants these days have become much less the advocate of their clients,
and more of an extension of the vendor. Ask any consultants you talk to about their vendor affiliations, certifications, and
whatnot. Ask what difference it makes to them whether you choose one product over another, and vice versa. And then ask
yourself if a consultant who is certified in technology XYZ is going to provide you with competing technology ABC, even
if ABC best fits your needs.
Single Points of Failure
Many ``firewalls'' are sold as a single component: a bastion host, or some other black box that you plug your networks into and get a warm-fuzzy, feeling safe and secure. The term ``firewall'' refers to a number of components that collectively provide the security of the system. Any time there is only one component paying attention to what's going on between the internal and external networks, an attacker has only one thing to break (or fool!) in order to gain complete access to your internal networks.
See the Internet Firewalls FAQ for more details on building and maintaining firewalls.
Secure Network Devices It's important to remember that the firewall is only one entry point to your network. Modems, if you allow them to answer incoming calls, can provide an easy means for an attacker to sneak around (rather than through ) your front door (or, firewall). Just as castles weren't built with moats only in the front, your network needs to be protected at all of its entry points.
Secure Modems; Dial-Back Systems If modem access is to be provided, this should be guarded carefully. The terminal server , or network device that provides dial-up access to your network needs to be actively administered, and its logs need to be examined for strange behavior. Its passwords need to be strong -- not ones that can be guessed. Accounts that aren't actively used should be disabled. In short, it's the easiest way to get into your network from remote: guard it carefully.
There are some remote access systems that have the feature of a two-part procedure to establish a connection. The first part
is the remote user dialing into the system, and providing the correct userid and password. The system will then drop the
connection, and call the authenticated user back at a known telephone number. Once the remote user's system answers that
call, the connection is established, and the user is on the network. This works well for folks working at home, but can be
problematic for users wishing to dial in from hotel rooms and such when on business trips.
Other possibilities include one-time password schemes, where the user enters his userid, and is presented with a
``challenge,'' a string of between six and eight numbers. He types this challenge into a small device that he carries with him
that looks like a calculator. He then presses enter, and a ``response'' is displayed on the LCD screen. The user types the
response, and if all is correct, the login will proceed. These are useful devices for solving the problem of good passwords,
without requiring dial-back access. However, these have their own problems, as they require the user to carry them, and
they must be tracked, much like building and office keys.
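One common way to implement such a challenge-response token is a keyed hash shared between the server and the hand-held device: both compute a function of the challenge and a shared secret, and the login succeeds if the results match. This Python sketch assumes an HMAC-based scheme; the secret, challenge format, and response length are all illustrative:

```python
import hashlib
import hmac

# Shared secret provisioned into both the server and the hand-held device
# (illustrative value only).
SECRET = b"shared-device-secret"

def response(challenge: str, secret: bytes = SECRET) -> str:
    """What the device computes and shows on its LCD for a given challenge."""
    digest = hmac.new(secret, challenge.encode(), hashlib.sha256).hexdigest()
    return digest[:8]          # a short, typeable response

def verify(challenge: str, user_response: str) -> bool:
    """Server-side check: recompute the expected response and compare safely."""
    return hmac.compare_digest(response(challenge), user_response)

challenge = "48152342"         # the six-to-eight digit challenge shown to the user
print(verify(challenge, response(challenge)))   # True: correct device and secret
print(verify(challenge, "zzzzzzzz"))            # False: wrong device or no secret
```

Because each challenge is fresh, an eavesdropper who captures one response cannot replay it against a later challenge, which is the property that makes these schemes stronger than static passwords.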
No doubt many other schemes exist. Take a look at your options, and find out how the vendors' offerings can help you enforce your security policy effectively.
Crypto-Capable Routers A feature being built into some routers is the ability to use session encryption between specified routers. Because traffic traveling across the Internet can be seen by people in the middle who have the resources (and time) to snoop around, these are advantageous for providing secure connectivity between two sites.
See the Snake Oil FAQ [6] for a description of cryptography, ideas for evaluating cryptographic products, and how to
determine which will most likely meet your needs.
Virtual Private Networks Given the ubiquity of the Internet, and the considerable expense in private leased lines, many organizations have been building VPNs (Virtual Private Networks). Traditionally, for an organization to provide connectivity between a main office and a satellite one, an expensive data line had to be leased in order to provide direct connectivity between the two offices. Now, a solution that is often more economical is to provide both offices connectivity to the Internet. Then, using the Internet as the medium, the two offices can communicate.
The danger in doing this, of course, is that there is no privacy on this channel, and it's difficult to provide the other office
access to ``internal'' resources without providing those resources to everyone on the Internet.
VPNs provide the ability for two offices to communicate with each other in such a way that it looks like they're directly
connected over a private leased line. The session between them, although going over the Internet, is private (because the
link is encrypted), and the link is convenient, because each can see each others' internal resources without showing them
off to the entire world.
A number of firewall vendors are including the ability to build VPNs in their offerings, either directly with their base
product, or as an add-on. If you have need to connect several offices together, this might very well be the best way to do it.
Conclusions Security is a very difficult topic. Everyone has a different idea of what ``security'' is, and what levels of risk are acceptable. The key for building a secure network is to define what security means to your organization . Once that has been defined, everything that goes on with the network can be evaluated with respect to that policy. Projects and systems can then be broken down into their components, and it becomes much simpler to decide whether what is proposed will conflict with your security policies and practices.
Many people pay great amounts of lip service to security, but do not want to be bothered with it when it gets in their way.
It's important to build systems and networks in such a way that the user is not constantly reminded of the security system
around him. Users who find security policies and systems too restrictive will find ways around them. It's important to get
their feedback to understand what can be improved, and it's important to let them know why what's been done has been, the
sorts of risks that are deemed unacceptable, and what has been done to minimize the organization's exposure to them.
Security is everybody's business, and only with everyone's cooperation, an intelligent policy, and consistent practices, will
it be achievable.
Session in Java
Calling request.getSession(false) will return null if the session ID is not found or refers to an invalid session; calling request.getSession() (equivalent to request.getSession(true)) will instead create a new session in that case rather than returning null. There is a single HTTP session per visit, as Java session cookies are not stored permanently in the browser.
Tokens are the various Java program elements identified by the compiler. A token is the smallest element of a program that is meaningful to the compiler. Tokens supported in Java include keywords, identifiers, literals (constants), operators, and separators.
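The idea of a compiler splitting source text into categorized tokens can be illustrated with a toy lexer. This Python sketch uses a small made-up keyword set and a handful of token classes; it is not Java's actual lexical grammar:

```python
import re

# Illustrative subset of keywords; a real language defines many more.
KEYWORDS = {"int", "if", "else", "return"}

# One alternation per token class: numeric constant, name, single-char operator.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w+|[A-Za-z_])|([+\-*/=;(){}]))")

def tokenize(source: str):
    """Split source into (category, text) pairs, the way a lexer feeds a parser."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, name, symbol = m.groups()
        if number:
            tokens.append(("CONSTANT", number))
        elif name:
            tokens.append(("KEYWORD" if name in KEYWORDS else "IDENTIFIER", name))
        else:
            tokens.append(("OPERATOR", symbol))
        pos = m.end()
    return tokens

print(tokenize("int x = 42;"))
# [('KEYWORD', 'int'), ('IDENTIFIER', 'x'), ('OPERATOR', '='), ('CONSTANT', '42'), ('OPERATOR', ';')]
```

The same character sequence lands in different categories depending on context: "int" is a keyword because it is in the reserved set, while "x" falls through to an identifier.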
Android The Android SDK includes a mobile device emulator — a virtual mobile device that runs on your computer. The
emulator lets you develop and test Android applications without using a physical device.
Mobile application development is a term used to denote the act or process by which application software is developed
for mobile devices, such as personal digital assistants, enterprise digital assistants or mobile phones. These applications
can be pre-installed on phones during manufacturing, or delivered as web applications using server-side or
client-side processing (e.g., JavaScript) to provide an "application-like" experience within a Web browser. Application
software developers also must consider a long array of screen sizes, hardware specifications, and configurations because of
intense competition in mobile software and changes within each of the platforms.[1]
Mobile app development has been
steadily growing, in revenues and jobs created. A 2013 analyst report estimates there are 529,000 direct app economy jobs
within the EU 28 members, 60% of which are mobile app developers.[2]
As part of the development process, mobile user interface (UI) design is also essential in the creation of mobile apps.
Mobile UI considers constraints, contexts, screen, input, and mobility as outlines for design. The user is often the focus of
interaction with their device, and the interface entails components of both hardware and software. User input allows for the
users to manipulate a system, and device's output allows the system to indicate the effects of the users' manipulation.
Mobile UI design constraints include limited attention and form factors, such as a mobile device's screen size for a user's
hand(s). Mobile UI contexts signal cues from user activity, such as location and scheduling that can be shown from user
interactions within a mobile application. Overall, mobile UI design's goal is mainly for an understandable, user-friendly
interface. The UI of mobile apps should: consider users' limited attention, minimize keystrokes, and be task-oriented with a
minimum set of functions. This functionality is supported by mobile enterprise application platforms or integrated
development environments (IDEs).
Mobile UIs, or front-ends, rely on mobile back-ends to support access to enterprise systems. The mobile back-end
facilitates data routing, security, authentication, authorization, working off-line, and service orchestration. This
functionality is supported by a mix of middleware components including mobile application servers, mobile backend as a
service (MBaaS), and service-oriented architecture (SOA) infrastructure.
Criteria for selecting a development platform usually contains the target mobile platforms, existing infrastructure and
development skills. When targeting more than one platform with cross-platform development it is also important to
consider the impact of the tool on the user experience. Performance is another important criterion, as research on mobile
applications indicates a strong correlation between application performance and user satisfaction. Along with performance
and other criteria, the availability of the technology and the project's requirement may drive the development between
native and cross-platform environments. To aid the choice between native and cross-platform environments, some
guidelines and benchmarks have been published. Typically, cross-platform environments are reusable across multiple
platforms, leveraging a native container while using HTML, CSS, and JavaScript for the user interface. In contrast, native
environments are targeted at one platform for each of those environments. For example, Android development occurs in
the Eclipse IDE using Android Developer Tools (ADT) plugins, Apple iOS development occurs using Xcode IDE with
Objective-C and/or Swift, and Windows and BlackBerry each have their own development environments.
Mobile application testing
Mobile applications are first tested within the development environment using emulators and later subjected to field
testing. Emulators provide an inexpensive way to test applications on mobile phones to which developers may not have
physical access. The following are examples of tools used for testing application across the most popular mobile operating
systems.
Google Android Emulator - an Android emulator that is patched to run on a Windows PC as a standalone app, without having to download and install the complete and complex Android SDK. It can be installed and Android compatible apps can be tested on it.
The official Android SDK Emulator - a mobile device emulator which mimics all of the hardware and software features of a typical mobile device (without the calls).
MobiOne Developer - a mobile Web integrated development environment (IDE) for Windows that helps developers to code, test, debug, package and deploy mobile Web applications to devices such as iPhone, BlackBerry, Android, and the Palm Pre. MobiOne Developer was officially declared End of Life by the end of 2014.
TestiPhone - a web browser-based simulator for quickly testing iPhone web applications. This tool has been tested and works using Internet Explorer 7, Firefox 2 and Safari 3.
iPhoney - gives a pixel-accurate web browsing environment and it is powered by Safari. It can be used while developing web sites for the iPhone. It is not an iPhone simulator but instead is designed for web developers who want to create 320 by 480 (or 480 by 320) websites for use with iPhone. iPhoney will only run on OS X 10.4.7 or later.
BlackBerry Simulator - There are a variety of official BlackBerry simulators available to emulate the functionality of actual BlackBerry products and test how the device software, screen, keyboard and trackwheel will work with application.
Windows UI Automation - Testing applications that use the Microsoft UI Automation technology requires Windows Automation API 3.0. It is pre-installed on Windows 7, Windows Server 2008 R2 and later versions of Windows. On other operating systems, it can be installed using Windows Update or downloaded from the Microsoft Web site.
Tools include
eggPlant: A GUI-based automated test tool for mobile applications across all operating systems and devices.
Ranorex: Test automation tools for mobile, web and desktop apps.
Testdroid: Real mobile devices and test automation tools for testing mobile and web apps.
Front-end development tools
Front-end development tools are focused on the user interface and user experience (UI-UX) and provide the following
abilities:
UI design tools
SDKs to access device features
Cross-platform accommodations/support
Back-end servers
Back-end tools pick up where the front-end tools leave off, and provide a set of reusable services that are centrally
managed and controlled and provide the following abilities:
Integration with back-end systems
User authentication-authorization
Data services
Reusable business logic
Security add-on layers
With bring your own device (BYOD) becoming the norm within more enterprises, IT departments often need stop-gap,
tactical solutions that layer atop existing apps, phones, and platform components. Features include:
App wrapping for security
Data encryption
Client actions
Reporting and statistics
Project Management Life Cycle
The Project Management Life Cycle has four phases: Initiation, Planning, Execution and Closure. Each project life cycle
phase is described below, along with the tasks needed to complete it. You can click the links provided, to view more
detailed information on the project management life cycle.
Initiation: Develop a Business Case, Undertake a Feasibility Study, Establish the Project Charter, Appoint the Project Team, Set up the Project Office, Perform Phase Review
Planning: Create a Project Plan, Create a Resource Plan, Create a Financial Plan, Create a Quality Plan, Create a Risk Plan, Create an Acceptance Plan, Create a Communications Plan, Create a Procurement Plan, Contract the Suppliers (Define the Tender Process, Issue a Statement of Work, Issue a Request for Information, Issue a Request for Proposal, Create Supplier Contract), Perform Phase Review
Execution: Build Deliverables; Monitor and Control (Perform Time, Cost, Quality, Change, Risk, Issue, Procurement, Acceptance and Communications Management)
Closure: Perform Project Closure, Review Project Completion
The Project Management Template kit contains all of the tools and templates you need, to complete the project
management life cycle. It also contains a free Project Management Book to help you manage projects. It takes you through
the project lifecycle step-by-step, helping you to deliver projects on time and within budget.
It's also unique, because it:
Applies to all project types and industries
Is used to manage projects of any size
Gives you the complete set of project templates
Explains every step in the project lifecycle in depth!
The Project Management Kit helps:
Project Managers to deliver projects
Consultants to manage client projects
Trainers to teach project management
Students to learn how to manage projects
Project Offices to monitor and control projects
Senior Managers to improve the success of projects.
Project Initiation Phase
The Project Initiation Phase is the 1st phase in the Project Management Life Cycle, as it involves starting up a new
project. You can start a new project by defining its objectives, scope, purpose and deliverables to be produced. You'll also
hire your project team, setup the Project Office and review the project, to gain approval to begin the next phase.
Overall, there are six key steps that you need to take to properly initiate a new project. These Project Initiation steps and
their corresponding templates are shown in the following diagram. Click each link below, to learn how Method123
templates help you to initiate projects.
Activities
1.Develop a Business Case
2.Undertake a Feasibility Study
3.Establish the Project Charter
4.Appoint the Project Team
5.Set up the Project Office
6.Perform a Phase Review
Templates
Business Case
Feasibility Study
Project Charter
Job Description
Project Office Checklist
Phase Review Form
The Project Initiation Phase is the most crucial phase in the Project Life Cycle, as it's the phase in which you define your
scope and hire your team.
Project Planning Phase
The Project Planning Phase is the second phase in the project life cycle. It involves creating a set of plans to help
guide your team through the execution and closure phases of the project.
The plans created during this phase will help you to manage time, cost, quality, change, risk and issues. They will also help
you manage staff and external suppliers, to ensure that you deliver the project on time and within budget.
There are 10 Project Planning steps you need to take to complete the Project Planning Phase efficiently. These steps and
the templates needed to perform them, are shown in the following diagram.
Click each link in the diagram below, to learn how these templates will help you to plan projects efficiently.
Activities
1.Create a Project Plan
2.Create a Resource Plan
3.Create a Financial Plan
4.Create a Quality Plan
5.Create a Risk Plan
6.Create an Acceptance Plan
7.Create a Communications Plan
8.Create a Procurement Plan
9.Contract the Suppliers
10.Perform a Phase Review
Templates
Project Plan
Resource Plan
Financial Plan
Quality Plan
Risk Plan
Acceptance Plan
Communications Plan
Procurement Plan
Tender Process
Statement of Work
Request for Information
Request for Proposal
Supplier Contract
Tender Register
Phase Review Form
The Project Planning Phase is often the most challenging phase for a Project Manager, as you need to make an educated
guess of the staff, resources and equipment needed to complete your project. You may also need to plan your
communications and procurement activities, as well as contract any 3rd party suppliers.
Project Execution Phase
The Project Execution Phase is the third phase in the project life cycle. In this phase, you will build the physical project
deliverables and present them to your customer for signoff. The Project Execution Phase is usually the longest phase in the
project life cycle and it typically consumes the most energy and the most resources.
To enable you to monitor and control the project during this phase, you will need to implement a range of management
processes. These processes help you to manage time, cost, quality, change, risks and issues. They also help you to manage
procurement, customer acceptance and communications.
The project management activities and templates which help you complete them are shown in the following diagram. Click
the links below to learn how these templates help you to execute projects more efficiently than before
Activities
1.Perform Time Management
2.Perform Cost Management
3.Perform Quality Management
4.Perform Change Management
5.Perform Risk Management
6.Perform Issue Management
7.Perform Procurement Management
8.Perform Acceptance Management
9.Perform Communications Management
10.Perform a Phase Review
Templates
Time Management Process, Timesheet Form, Timesheet Register
Cost Management Process, Expense Form, Expense Register
Quality Management Process, Quality Review Form, Deliverables Register
Change Management Process, Change Request Form, Change Register
Risk Management Process, Risk Form, Risk Register
Issue Management Process, Issue Form, Issue Register
Procurement Management Process, Purchase Order Form, Procurement Register
Acceptance Management Process, Acceptance Form, Acceptance Register
Communications Management Process, Project Status Report, Communications Register
Phase Review Form
By using these templates to monitor and control the Project Execution Phase, you will improve your chances of delivering
your project on time and within budget.
Project Closure Phase
The Project Closure Phase is the fourth and last phase in the project life cycle. In this phase, you will formally close your
project and then report its overall level of success to your sponsor.
Project Closure involves handing over the deliverables to your customer, passing the documentation to the business,
cancelling supplier contracts, releasing staff and equipment, and informing stakeholders of the closure of the project.
After the project has been closed, a Post Implementation Review is completed to determine the project's success and identify the lessons learned.
The activities taken to close a project and the templates which help you to complete each activity, are shown in the
following diagram. Click the links below to learn how these templates can help you to close projects efficiently.
Activities
1.Perform Project Closure
2.Review Project Completion
Templates
Project Closure Report
Post Implementation Review
The first step taken when closing a project is to create a Project Closure Report. It is extremely important that you list
every activity required to close the project within this Project Closure report, to ensure that project closure is completed
smoothly and efficiently. Once the report has been approved by your sponsor, the closure activities stated in the report are
actioned.
Between one and three months after the project has been closed and the business has begun to experience the benefits
provided by the project, you need to complete a Post Implementation Review. This review allows the business to identify
the level of success of the project and list any lessons learned for future projects.
Data are simply facts or figures — bits of information, but not information itself. When data are processed, interpreted,
organized, structured or presented so as to make them meaningful or useful, they are called information. Information
provides context for data.
A digital signature refers to the encryption / decryption technology on which an electronic signature solution is built. Digital signature encryption secures the data associated with a signed document and helps verify the authenticity of a signed record.
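The sign-with-the-private-key, verify-with-the-public-key idea can be illustrated with textbook RSA on deliberately tiny numbers. This Python sketch is purely pedagogical: real signature schemes use large keys and padding (e.g., RSA-PSS), and the key values here are the classic toy example, not anything usable in practice:

```python
import hashlib

# Toy RSA key: classic textbook numbers. NEVER use keys this small for real.
p, q = 61, 53
n = p * q        # 3233, the public modulus
e = 17           # public exponent
d = 2753         # private exponent: (d * e) mod lcm(p-1, q-1) == 1

def digest(message: bytes) -> int:
    # Reduce the hash into the tiny modulus; a real scheme never does this.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    """Signing: 'encrypt' the digest with the PRIVATE key."""
    return pow(digest(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    """Verification: 'decrypt' with the PUBLIC key and compare digests."""
    return pow(signature, e, n) == digest(message)

sig = sign(b"signed record")
print(verify(b"signed record", sig))      # True: signature matches the data
print(verify(b"signed record", sig + 1))  # False: signature was tampered with
```

Anyone holding the public key (e, n) can check the signature, but only the holder of d could have produced it; and because the signature is bound to the digest, any change to the signed data invalidates it.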
IPv4 and IPv6 are two generations of Internet Protocols where IPv4 stands for Internet Protocol version 4 and IPv6 for Internet Protocol version 6.
IPv4 is a protocol for use on packet-switched Link Layer networks (e.g. Ethernet). It is one of the core protocols of standards-based inter-networking methods in the Internet, and was the first version deployed for production in the ARPANET in 1983. IPv4 uses 32-bit source and destination address fields which limits the address space to 4.3 billion addresses. This limitation stimulated the development of IPv6 in the 1990s.
IPv6 is more advanced and has better features compared to IPv4. It has the capability to provide an infinite number of addresses. It is replacing IPv4 to accommodate the growing number of networks worldwide and help solve the IP address exhaustion problem. IPv6 was developed by the Internet Engineering Task Force (IETF).
II.10 DIFFERENCE BETWEEN IPv4 AND IPv6
IPv6 is based on IPv4; it is an evolution of IPv4. So many things that we find with IPv6 are familiar to us. The main differences are illustrated in the table below:
Address size: an IPv4 address is 32 bits; an IPv6 address is 128 bits.
Address space: IPv4 supports 4.3×10^9 (4.3 billion) addresses, which is inadequate to give one (or more, if they possess more than one device) to every living person. IPv6 supports about 3.4×10^38 addresses, or 5×10^28 (50 octillion) for each of the roughly 6.5 billion people alive today.
Header: the IPv4 header is 20 bytes and has many fields (13); the IPv6 header is double that, at 40 bytes, but has fewer fields (8).
Addressing: IPv4 is subdivided into classes A-E; IPv6 is classless, using a prefix and an interface identifier (ID).
Subnetting: an IPv4 address uses a subnet mask; IPv6 uses a prefix length.
Security: IPv4 was never designed to be secure (it was originally designed for an isolated military network, then adapted for a public educational and research network); IPv6 has built-in security features such as encryption and authentication.
ISP support: ISPs have IPv4 connectivity, or both IPv4 and IPv6; many ISPs still don't have IPv6 connectivity.
Distribution: IPv4 addresses are unevenly distributed geographically (more than 50% in the USA); IPv6 has no geographic limitation.
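Several of these differences can be observed directly with Python's standard ipaddress module; the addresses below are documentation-range examples:

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")
print(v4.version, v4.max_prefixlen)   # 4 32   (32-bit addresses)
print(v6.version, v6.max_prefixlen)   # 6 128  (128-bit addresses)

# IPv4 networks may be written with a subnet mask; IPv6 always uses a prefix length.
net4 = ipaddress.ip_network("192.0.2.0/255.255.255.0")   # mask form, accepted for IPv4
net6 = ipaddress.ip_network("2001:db8::/64")
print(net4.prefixlen, net4.num_addresses)   # 24 256
print(net6.prefixlen, net6.num_addresses)   # 64 18446744073709551616

# Total address space: 2**32 versus 2**128 (about 3.4e38).
print(2 ** 32, float(2 ** 128))
```

Note that even a single /64 IPv6 subnet, the standard size for one LAN, contains 2^64 addresses, which is more than four billion times the entire IPv4 address space.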
API vs Web Service
API and Web service serve as a means of communication. The only difference is that a
Web service facilitates interaction between two machines over a network. An API acts as
an interface between two different applications so that they can communicate with each
other. An API is a method by which the third-party vendors can write programs that
interface easily with other programs. A Web service is designed to have an interface that
is depicted in a machine-processable format usually specified in Web Service Description
Language (WSDL). Typically, “HTTP” is the most commonly used protocol for
communication. Web service also uses SOAP, REST, and XML-RPC as a means of
communication. API may use any means of communication to initiate interaction
between applications. For example, the system calls are invoked using interrupts by
the Linux kernel API.
An API exactly defines the methods for one software program to interact with the other.
When this action involves sending data over a network, Web services come into the
picture. An API generally involves calling functions from within a software program.
In case of Web applications, the API used is web based. Desktop applications such as
spreadsheets and word documents use VBA and COM-based APIs which don’t involve
Web service. A server application such as Joomla may use a PHP-based API present
within the server which doesn’t require Web service.
A Web service is merely an API wrapped in HTTP. An API doesn’t always need to be web
based. An API consists of a complete set of rules and specifications for a software
program to follow in order to facilitate interaction. A Web service might not contain a
complete set of specifications and sometimes might not be able to perform all the tasks
that may be possible from a complete API.
The APIs can be exposed in a number of ways which include: COM objects, DLL and .H
files in C/C++ programming language, JAR files or RMI in Java, XML over HTTP, JSON
over HTTP, etc. The method used by Web service to expose the API is strictly through a
network.
Summary:
1. All Web services are APIs but all APIs are not Web services.
2. Web services might not perform all the operations that an API would perform.
3. A Web service uses only three styles of use: SOAP, REST and XML-RPC for
communication whereas API may use any style for communication.
4. A Web service always needs a network for its operation whereas an API doesn’t need
a network for its operation.
5. An API facilitates interfacing directly with an application, whereas a Web service is a means of interaction between two machines over a network.
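The point that "a Web service is merely an API wrapped in HTTP" can be sketched with only Python's standard library. The add function and its /add endpoint below are hypothetical, invented purely for illustration:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
from urllib.request import urlopen

# The plain API: a function another program can call directly.
def add(a, b):
    return a + b

# The same API "wrapped in HTTP": a minimal web service.
class AddHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Expects a path like /add?a=2&b=3
        qs = parse_qs(urlparse(self.path).query)
        result = add(int(qs["a"][0]), int(qs["b"][0]))
        body = json.dumps({"result": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

server = HTTPServer(("127.0.0.1", 0), AddHandler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

local_result = add(2, 3)  # direct API call, no network involved
url = f"http://127.0.0.1:{server.server_port}/add?a=2&b=3"
remote_result = json.loads(urlopen(url).read())["result"]  # same call over HTTP
print(local_result, remote_result)  # 5 5
server.shutdown()
```

Both calls invoke the same underlying API; only the second one needs a network, which is exactly the distinction drawn in points 4 and 5 above.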
XML vs. XSD
XML, or the Extensible Markup Language, is a standard or set of rules that governs the
encoding of documents into an electronic format. XML goes hand in hand with HTML in
internet usage. XML defines the structure of the document, but not the way the document
is displayed; this is handled by HTML. XSD stands for XML Schema Definition, and is
one of the several XML schema languages that define what can be included inside the
document. An aspect of XSD that people find to be one of its strengths is that it's written
in XML. This means that users who know XML are already familiar with XSD,
eliminating the need to learn another language.
XML does not define any elements or tags that are usable within your document. You can
create any tag to describe any element on your XML document, as long as you follow the
correct structure. An XSD defines elements that can be used in the documents, relating to the actual data with which it is to be encoded. Another positive aspect of having defined elements and data types is that the information will be properly interpreted, because the sender and the receiver know the format of the content. A good example of this is the date. A date expressed as 1/12/2010 can mean either January 12 or December 1st. Declaring a date data type in an XSD document ensures that it follows the format dictated by XSD.
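The date ambiguity above is easy to demonstrate with a short sketch:

```python
from datetime import datetime

s = "1/12/2010"
us = datetime.strptime(s, "%m/%d/%Y")  # read as month/day: January 12
eu = datetime.strptime(s, "%d/%m/%Y")  # read as day/month: December 1
print(us.strftime("%B %d"), "vs", eu.strftime("%B %d"))

# An XSD date (xs:date) is always ISO 8601 (YYYY-MM-DD), so it is unambiguous:
iso = datetime.strptime("2010-12-01", "%Y-%m-%d")
print(iso.date())  # 2010-12-01
```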
As an XSD document still follows the XML structure, it is still validated as an XML
document. In fact, you can use XML parsers to parse XSD documents, and it will perform
flawlessly, and produce the right information from the file. The reverse is not necessarily
true, as an XML document may contain elements that an XSD parser may not recognize.
XML only checks how well-formed the document is. This can be a problem, as a well-
formed document can still contain errors. XSD validating software often catches the
errors that XML validating software might miss.
Summary:
1. XSD is based and written on XML.
2. XSD defines elements and structures that can appear in the document, while XML
does not.
3. XSD ensures that the data is properly interpreted, while XML does not.
4. An XSD document is validated as XML, but the opposite may not always be true.
5. XSD is better at catching errors than XML
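Point 5 can be sketched concretely: a plain well-formedness check happily accepts nonsense content that only a schema validator would reject. The document below is hypothetical, and since Python's standard library cannot validate against an XSD, that step is only noted in a comment:

```python
import xml.etree.ElementTree as ET

# Well-formed XML with nonsense content: a plain XML parser accepts it.
doc = "<order><date>13/45/2010</date><qty>not-a-number</qty></order>"
root = ET.fromstring(doc)  # parses fine: only well-formedness is checked
print(root.find("date").text)

# Malformed XML (mismatched tags) is rejected even without a schema:
try:
    ET.fromstring("<order><date>2010-12-01</order>")
except ET.ParseError as e:
    print("not well-formed:", e)

# Checking that <date> is a real xs:date or <qty> an xs:integer requires
# schema validation, e.g. lxml.etree.XMLSchema(...) from the third-party
# lxml package -- the standard library alone does not validate against XSD.
```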
SOAP (Simple Object Access Protocol) and REST (Representational State Transfer) are popular with developers working on system-integration projects. Software architects design the application from various perspectives and also decide, based on various factors, which approach to take to expose a new API to third-party applications. As a software architect, it is good practice to involve your development team lead in the system architecture process. This article, based on my experience, discusses when to use SOAP or REST web services to expose your API to third-party clients.
Web Services Demystified
Web services are part of Service-Oriented Architecture. Web services are used as the model for process decomposition and assembly. I have been involved in discussions where there was some misconception about the difference between web services and web APIs.
The W3C defines a Web Service generally as:
A software system designed to support interoperable machine-to-machine interaction over a network.
Web API also known as Server-Side Web API is a programmatic interface to a defined request-response message system, typically expressed in JSON or XML, which is exposed
via the web – most commonly by means of an HTTP-based web server. (extracted from Wikipedia)
Based on the above definitions, one might infer when SOAP should be used instead of REST and vice versa, but it is not as simple as it looks. We can agree that web services are not the same as web APIs. Accessing an image over the web is not calling a web service, but retrieving a web resource using its Uniform Resource Identifier. HTTP has a well-defined, standard approach to serving resources to clients and does not require the use of a web service to fulfill their requests.
Why Use REST over SOAP
Developers are passionate people. Let's briefly analyze some of the reasons they mention when considering REST over SOAP:
REST is easier than SOAP
I'm not sure what developers refer to when they argue that REST is easier than SOAP. Based on my experience, depending on the requirements, developing REST services can quickly become just as complex as any other SOA project. What is your service abstracting from the client? What level of security is required? Is your service a long-running asynchronous process? These and many other requirements will increase the level of complexity.
Testability: apparently it is easier to test RESTful web services than their SOAP counterparts. This is only partially true. For simple REST services, developers only have to point their browser to the service endpoint and a result is returned in the response. But what happens once you need to add HTTP headers, pass tokens, validate parameters… This is still testable, but chances are you will require a browser plugin in order to test those features. If a plugin is required, then the ease of testing is exactly the same as using SoapUI to test SOAP-based services.
RESTful Web Services serve JSON, which is faster to parse than XML
This so-called "benefit" relates to consuming web services in a browser. RESTful web services can also serve XML and any other MIME type that you desire. This article is not focused on discussing JSON vs XML, and I wouldn't write a separate article on the topic. JSON relates to JavaScript, and as JS is very close to the web, as in providing interaction on the web alongside HTML and CSS, most developers automatically assume that it is also linked to interacting with RESTful web services. If you didn't know before, I'm sure you can guess that RESTful web services are language agnostic.
Regarding the speed of processing XML markup as opposed to JSON, a performance test conducted by David Lead, Lead Engineer at MarkLogic Inc., found this to be a myth.
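Rather than taking either side on parsing speed, you can measure it on your own payloads with the standard library; the stock-quote record below is a hypothetical example:

```python
import json
import timeit
import xml.etree.ElementTree as ET

# The same (hypothetical) stock-quote record serialized both ways.
json_doc = '{"ticker": "BARC", "price": 154.2, "currency": "GBP"}'
xml_doc = ("<quote><ticker>BARC</ticker><price>154.2</price>"
           "<currency>GBP</currency></quote>")

# Time 10,000 parses of each; results depend on payload shape and parser.
t_json = timeit.timeit(lambda: json.loads(json_doc), number=10_000)
t_xml = timeit.timeit(lambda: ET.fromstring(xml_doc), number=10_000)
print(f"JSON: {t_json:.4f}s   XML: {t_xml:.4f}s for 10,000 parses")
```

For small documents like this the difference is rarely the bottleneck; profile before choosing a format on speed alone.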
REST is built for the Web
Well, this is true according to Roy Fielding's dissertation; after all, he is credited with the creation of the REST architectural style. REST, unlike SOAP, uses the underlying technology for transport and communication between clients and servers. The architectural style is optimized for the modern web architecture. The web has outgrown its initial requirements, and this can be seen through HTML5 and WebSockets standardization. The web has become a platform in its own right, maybe a WebOS. Some applications will require server-side state saving, from financial applications to e-commerce.
Caching
When using REST over HTTP, it will utilize the features available in HTTP, such as caching, security in terms of TLS, and authentication. Architects know that dynamic resources should not be cached. Let's discuss this with an example: we have a RESTful web service that serves stock quotes when provided with a stock ticker. Stock quotes change every millisecond; if we make a request for BARC (Barclays Bank), there is a chance that the quote we received a minute ago will be different two minutes later. This shows that we cannot always use the caching features implemented in the protocol. HTTP caching can be useful for client requests of static content, but if the caching feature of HTTP is not enough for your requirements, then you should also evaluate SOAP, as you will be building your own cache either way, not relying on the protocol.
HTTP Verb Binding
HTTP verb binding is supposedly a feature worth discussing when comparing REST vs SOAP. Many public-facing APIs referred to as RESTful are really only REST-like and do not implement all HTTP verbs in the manner they are supposed to. For example, when creating new resources, most developers use POST instead of PUT; even deletions are often sent as POST requests instead of DELETE.
SOAP also defines a binding to the HTTP protocol. When bound to HTTP, all SOAP requests are sent as POST requests.
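The verb-binding difference can be sketched without sending any network traffic; the requests below are only built, never opened, and the example.com URLs are hypothetical placeholders:

```python
from urllib.request import Request

# REST maps each operation onto an HTTP verb.
create = Request("http://api.example.com/stocks/BARC", data=b"{}", method="PUT")
fetch = Request("http://api.example.com/stocks/BARC")  # GET is the default
delete = Request("http://api.example.com/stocks/BARC", method="DELETE")

# SOAP bound to HTTP sends every operation as a POST (which is also
# urllib's default whenever a request carries a body and no explicit method):
soap = Request("http://api.example.com/soap", data=b"<Envelope/>")

print(create.get_method(), fetch.get_method(),
      delete.get_method(), soap.get_method())  # PUT GET DELETE POST
```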
Security
Security is never mentioned when discussing the benefits of REST over SOAP. Two simple security measures are provided at the HTTP protocol layer: basic authentication and communication encryption through TLS. SOAP security is well standardized through WS-Security. HTTP by itself is not secure, as seen in the news all the time, so web services relying on the protocol need to implement their own rigorous security. Security goes beyond simple authentication and confidentiality; it also includes authorization and integrity. When it comes to ease of implementation, I believe SOAP is at the forefront.
Conclusion
This was meant to be a short blog post, but it seems we got too passionate about the subject.
I accept that there are many other factors to consider when choosing SOAP vs REST, but I will oversimplify it here. For machine-to-machine communications, such as business processing with BPEL, transaction security and integrity, I suggest using SOAP. SOAP binding to HTTP is possible, and XML parsing is not noticeably slower than JSON in the browser. For building a public-facing API, REST is not the undisputed champion. Consider the actual application requirements and evaluate the benefits. The argument that REST is protocol agnostic and works on anything that has a URI is beside the point. According to its creator, REST was conceived for the evolution of the web. Most so-called RESTful web services available on the internet are really only REST-like, as they do not follow the principles of the architectural style. One good thing about working with REST is that applications do not need a service contract a la SOAP (WSDL). WADL was never standardized, and I do not believe developers would implement it; I remember looking for Twitter's WADL in order to integrate with it.
I will leave you to make your own conclusion. There is so much I can write in a blog post. Feel free to leave any comments to keep the discussion going.
REST stands for Representational State Transfer. (It is sometimes spelled "ReST".) It relies on a stateless, client-server, cacheable communications protocol -- and in virtually all cases, the HTTP protocol is used. REST is an architecture style for designing networked applications.
Normalization in DBMS: 1NF, 2NF, 3NF and BCNF
Database normalization, or simply normalization, is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity.
Normalization of Database
Database normalization is a technique of organizing the data in a database. It is a systematic approach to decomposing tables to eliminate data redundancy and undesirable characteristics like insertion, update and deletion anomalies. It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables.
Normalization is used for mainly two purposes:
Eliminating redundant (useless) data.
Ensuring data dependencies make sense, i.e. data is logically stored.
Problem Without Normalization
Without normalization, it becomes difficult to handle and update the database without facing data loss. Insertion, update and deletion anomalies are very frequent if the database is not normalized. To understand these anomalies, let us take the example of a Student table.
S_id S_Name S_Address Subject_opted
401 Adam Noida Bio
402 Alex Panipat Maths
403 Stuart Jammu Maths
404 Adam Noida Physics
Update Anomaly: To update the address of a student who occurs twice or more in the table, we will have to update the S_Address column in all of those rows, else the data will become inconsistent.
Insertion Anomaly: Suppose for a new admission we have a student's id (S_id), name and address, but the student has not opted for any subjects yet; then we have to insert NULL there, leading to an insertion anomaly.
Deletion Anomaly: If (S_id) 401 has only one subject and temporarily drops it, then when we delete that row the entire student record will be deleted along with it.
Normalization Rule
Normalization rules are divided into the following normal forms:
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF
First Normal Form (1NF)
As per First Normal Form, no two rows of data may contain a repeating group of information, i.e. each set of columns must have a unique value, such that multiple columns cannot be used to fetch the same row. Each table should be organized into rows, and each row should have a primary key that distinguishes it as unique.
The Primary key is usually a single column, but sometimes more than one column can be combined to create a single primary key. For example consider a table which is not in First normal form
Student Table :
Student Age Subject
Adam 15 Biology, Maths
Alex 14 Maths
Stuart 17 Maths
In First Normal Form, no row may have a column in which more than one value is saved (e.g. values separated by commas). Instead, we must separate such data into multiple rows.
Student Table following 1NF will be :
Student Age Subject
Adam 15 Biology
Adam 15 Maths
Alex 14 Maths
Stuart 17 Maths
Under First Normal Form, data redundancy increases, as the same data appears in multiple rows, but each row as a whole is unique.
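The 1NF transformation above is mechanical; a small sketch using the same Student data:

```python
# Each unnormalized row holds a comma-separated list of subjects.
unnormalized = [
    ("Adam", 15, "Biology, Maths"),
    ("Alex", 14, "Maths"),
    ("Stuart", 17, "Maths"),
]

# 1NF: split every multi-valued Subject column into separate rows.
first_nf = [
    (student, age, subject.strip())
    for student, age, subjects in unnormalized
    for subject in subjects.split(",")
]

for row in first_nf:
    print(row)
# ('Adam', 15, 'Biology') ... ('Stuart', 17, 'Maths'): 4 rows in total
```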
Second Normal Form (2NF)
As per Second Normal Form, there must not be any partial dependency of any column on the primary key. This means that for a table with a concatenated primary key, each column that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column depends on only one part of the concatenated key, the table fails Second Normal Form.
In the First Normal Form example there are two rows for Adam, to include the multiple subjects that he has opted for. While this is searchable and follows First Normal Form, it is an inefficient use of space. Also, in the above table in First Normal Form, the candidate key is {Student, Subject}, yet Age depends only on the Student column, which violates Second Normal Form. To achieve Second Normal Form, we split the subjects out into an independent table and match them up using the student names as foreign keys.
New Student Table following 2NF will be :
Student Age
Adam 15
Alex 14
Stuart 17
In the Student table the candidate key will be the Student column, because the only other column, Age, is dependent on it.
New Subject Table introduced for 2NF will be :
Student Subject
Adam Biology
Adam Maths
Alex Maths
Stuart Maths
In the Subject table the candidate key is the {Student, Subject} combination. Now both of the above tables qualify for Second Normal Form and will never suffer from update anomalies. (There are a few complex cases in which a table in Second Normal Form still suffers update anomalies; Third Normal Form exists to handle those scenarios.)
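The 2NF decomposition can be sketched with an in-memory SQLite database, using the same data. A join reconstructs the 1NF view without storing Age redundantly, and updating Adam's age now touches exactly one row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# 2NF: Age depends only on Student, so it moves to its own table.
cur.execute("CREATE TABLE student (name TEXT PRIMARY KEY, age INTEGER)")
cur.execute("CREATE TABLE subject (student TEXT, subject TEXT, "
            "PRIMARY KEY (student, subject))")

cur.executemany("INSERT INTO student VALUES (?, ?)",
                [("Adam", 15), ("Alex", 14), ("Stuart", 17)])
cur.executemany("INSERT INTO subject VALUES (?, ?)",
                [("Adam", "Biology"), ("Adam", "Maths"),
                 ("Alex", "Maths"), ("Stuart", "Maths")])

# A join reconstructs the original 1NF view.
rows = cur.execute("""
    SELECT s.name, s.age, j.subject
    FROM student s JOIN subject j ON j.student = s.name
    ORDER BY s.name, j.subject
""").fetchall()
for row in rows:
    print(row)

# The update anomaly is gone: changing Adam's age touches one row only.
cur.execute("UPDATE student SET age = 16 WHERE name = 'Adam'")
```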
Third Normal Form (3NF)
Third Normal Form requires that every non-prime attribute of a table be dependent on the primary key directly; in other words, no non-prime attribute may be determined by another non-prime attribute. Such a transitive functional dependency should be removed from the table, and the table must also be in Second Normal Form. For example, consider a table with the following fields.
Student_Detail Table :
Student_id Student_name DOB Street city State Zip
In this table Student_id is the primary key, but Street, City and State depend upon Zip. The dependency between Zip and the other fields is called a transitive dependency. Hence, to achieve 3NF, we need to move Street, City and State to a new table, with Zip as its primary key.
New Student_Detail Table :
Student_id Student_name DOB Zip
Address Table :
Zip Street city state
The advantages of removing transitive dependency are:
The amount of data duplication is reduced.
Data integrity is achieved.
Boyce and Codd Normal Form (BCNF)
Boyce-Codd Normal Form is a stricter version of Third Normal Form. It deals with a certain type of anomaly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is already in BCNF. For a table R to be in BCNF, the following conditions must be satisfied:
R must be in Third Normal Form,
and, for each functional dependency (X -> Y), X should be a super key.
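The "X should be a super key" condition rests on functional dependencies, which can be checked mechanically: X -> Y holds when equal X values always come with equal Y values. A sketch with hypothetical Student_Detail rows, where Zip -> City is the transitive dependency from the 3NF example:

```python
# Check whether the functional dependency X -> Y holds in a set of rows.
def fd_holds(rows, x, y):
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in x)
        val = tuple(row[c] for c in y)
        # If this X value was seen before with a different Y value, FD fails.
        if seen.setdefault(key, val) != val:
            return False
    return True

student_detail = [
    {"id": 1, "name": "Adam", "zip": "201301", "city": "Noida"},
    {"id": 2, "name": "Alex", "zip": "132103", "city": "Panipat"},
    {"id": 3, "name": "Stuart", "zip": "180001", "city": "Jammu"},
    {"id": 4, "name": "Adam", "zip": "201301", "city": "Noida"},
]

print(fd_holds(student_detail, ["zip"], ["city"]))  # True: zip -> city
print(fd_holds(student_detail, ["city"], ["id"]))   # False: city repeats with different ids
```

Since zip -> city holds but zip is not a super key of this relation, the table violates BCNF (and 3NF), which is exactly why the address columns move to their own table.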
RDBMS Concepts
A Relational Database Management System (RDBMS) is a database management system based on the relational model introduced by E.F. Codd. In the relational model, data is represented in terms of tuples (rows).
An RDBMS is used to manage a relational database: an organized collection of tables from which data can be accessed easily. The relational database is the most commonly used kind of database. It consists of a number of tables, and each table has its own primary key.
What is Table ?
In a relational database, a table is a collection of data elements organised in terms of rows and columns. A table is also considered a convenient representation of relations. But a table can have duplicate tuples, while a true relation cannot. The table is the simplest form of data storage. Below is an example of an Employee table.
What is a Column ?
In a relational table, a column is a set of values of a particular type. The term attribute is also used to represent a column. For example, in the Employee table, Name is a column that represents the names of employees.
Database Keys
Keys are a very important part of a relational database. They are used to establish and identify relations between tables. They also ensure that each record within a table can be uniquely identified by a combination of one or more fields within the table.
Super Key
A super key is defined as a set of attributes within a table that uniquely identifies each record in that table. A super key is a superset of a candidate key.
Candidate Key
Candidate keys are defined as the set of fields from which the primary key can be selected. A candidate key is an attribute, or set of attributes, that can act as a primary key for a table, uniquely identifying each record in that table.
Primary Key
A primary key is the candidate key that is most appropriate to be the main key of the table. It is a key that uniquely identifies each record in the table.
Composite Key
A key that consists of two or more attributes that together uniquely identify an entity occurrence is called a composite key. No single attribute that makes up a composite key is a simple key in its own right.
Secondary or Alternative key
The candidate keys which are not selected as the primary key are known as secondary keys or alternate keys.
Non-key Attribute
Non-key attributes are attributes other than candidate key attributes in a table.
Non-prime Attribute
Non-prime attributes are attributes that are not part of any candidate key (i.e. not prime attributes).
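A sketch of how a primary key enforces unique identification, using an in-memory SQLite table with the S_id values from the earlier Student example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE student (s_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO student VALUES (401, 'Adam')")

# The primary key uniquely identifies each record, so a duplicate key
# is rejected by the database itself.
try:
    cur.execute("INSERT INTO student VALUES (401, 'Alex')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # UNIQUE constraint violated on student.s_id
```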
E-R Diagram ER-Diagram is a visual representation of data that describes how data is related to each other.
1) Entity
An Entity can be any object, place, person or class. In E-R Diagram, an entity is represented using rectangles. Consider an example of an Organisation. Employee, Manager, Department, Product and many more can be taken as entities from an Organisation.
Weak Entity
A weak entity is an entity that depends on another entity. Weak entities don't have a key attribute of their own. A double rectangle represents a weak entity.
2) Attribute
An attribute describes a property or characteristic of an entity. For example, Name, Age, Address etc. can be attributes of a Student. An attribute is represented using an ellipse.
Big data analytics is the process of examining large datasets to uncover hidden patterns, unknown
correlations, market trends, customer preferences and other useful business information
The primary goal of big data analytics is to help companies make more informed business
decisions by enabling data scientists, predictive modelers and other analytics professionals to
analyze large volumes of transaction data, as well as other forms of data that may be untapped by
conventional business intelligence (BI) programs. That could include Web server logs and
Internet clickstream data, social media content and social network activity reports, text from
customer emails and survey responses, mobile-phone call detail records and machine data
captured by sensors connected to the Internet of Things.
Semi-structured and unstructured data may not fit well in traditional data warehouses based
on relational databases. Furthermore, data warehouses may not be able to handle the processing
demands posed by sets of big data that need to be updated frequently or even continually -- for
example, real-time data on the performance of mobile applications or of oil and gas pipelines. As a
result, many organizations looking to collect, process and analyze big data have turned to a newer
class of technologies that includes Hadoop and related tools such
as YARN, MapReduce, Spark, Hive and Pig as well as NoSQL databases. Those technologies
form the core of an open source software framework that supports the processing of large and
diverse data sets across clustered systems.
What is big data technology? Big data is a term for data sets that are so large or complex that traditional data-processing applications are inadequate to deal with them. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, querying, updating and information privacy.
What are the three V's of big data? The 3Vs (volume, variety and velocity) are three defining properties or dimensions of big data. Volume refers to the amount of data, variety refers to the number of types of data, and velocity refers to the speed of data processing.
What is data science and analytics? Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, machine learning, data mining, and predictive analytics, similar to ..
50 Big Data Platforms and Big Data Analytics Software (the first few):
IBM Big Data Analytics
HP Big Data
SAP Big Data Analytics
Microsoft Big Data
Oracle Big Data Analytics
Talend Open Studio
Teradata Big Data Analytics
SAS Big Data Analytics
What are data analytics tools? Data analytics (DA) is the science of examining raw data with the purpose of drawing conclusions about that information. Data analytics is used in many industries to allow companies and organizations to make better business decisions, and in the sciences to verify or disprove existing models or theories.
Cloud Foundry is an open source cloud platform as a service (PaaS) on which developers can build, deploy, run and scale applications on public and private cloud models. VMware originally created Cloud Foundry and it is now part of Pivotal Software.
What is Cloud Foundry? Key benefits and a real use case
September 8, 2015 by Vineet Badola
Unlike most other Cloud Computing platform services, which are tied to particular cloud providers, Cloud Foundry is available as a stand-alone software package. You can, of course, deploy it on Amazon's AWS, but you can also host it yourself on your own OpenStack server, or through HP's Helion or VMware's vSphere.
First of all though, to be completely clear, just what is a Cloud Computing platform? There are,
broadly speaking, three major categories of Cloud Computing:
Infrastructure as a Service (IaaS), which provides only a base infrastructure, leaving the end
user responsible for platform and environment configuration necessary to deploy applications.
Amazon‘s AWS and Microsoft Azure are prime examples of IaaS.
Software as a Service (SaaS) like Gmail or Salesforce.com.
Platform as a Service (PaaS), which helps to reduce the development overhead (environment
configuration) by providing a ready-to-use platform. PaaS services can be hosted on top of
infrastructure provided by an IaaS.
Since it's easy to become a bit confused when thinking about cloud platforms, it's important to be able to visualize exactly which elements of the compute ecosystem are whose responsibilities. While there is no precise definition, it's reasonable to say that a platform requires only that you take care of your applications.
With that in mind, the platform layer should be able to provide:
A suitable environment to run an application.
Application life cycle management.
Self-healing capacity.
Centralized management of applications.
Distributed environment.
Easy integration.
Easy maintenance (upgrades etc).
What is Cloud Foundry
Cloud Foundry is an open source cloud computing platform originally developed in-house at
VMware. It is now owned by Pivotal Software, which is a joint venture made up of VMware, EMC,
and General Electric.
Cloud Foundry is optimized to deliver…
Fast application development and deployment.
Highly scalable and available architecture.
DevOps-friendly workflows.
Reduced chance of human error.
Multi-tenant compute efficiencies.
Not only can Cloud Foundry lighten developer workloads but, since Cloud Foundry handles so much
of an application‘s resource management, it can also greatly reduce the overhead burden on your
operations team.
Cloud Foundry's architectural structure includes components and a high-enough level of interoperability to permit…
Integration with development tools.
Application deployment.
Application lifecycle management.
Integration with various cloud providers.
Application Execution.
Although Cloud Foundry supports many languages and frameworks, including Java, Node.js, Go, PHP, Python, and Ruby, not all applications will be a good fit. As with all modern software applications, your project should attempt to follow the Twelve-Factor App standards.
Key benefits of Cloud Foundry:
Application portability.
Application auto-scaling.
Centralized platform administration.
Centralized logging.
Dynamic routing.
Application health management.
Integration with external logging components like Elasticsearch and Logstash.
Role based access for deployed applications.
Provision for vertical and horizontal scaling.
Infrastructure security.
Support for various IaaS providers.
Getting Started with Cloud Foundry
Before deciding whether Cloud Foundry is for you, you'll have to try actually deploying a real application. As I already mentioned, to set up a suitable environment, you will need an infrastructure layer. As of now, Cloud Foundry supports AWS, VMware and OpenStack. Setting up Cloud Foundry on top of VMware might not be the best choice for us, since we'd probably prefer to avoid the extra complexity. Instead, we'll work with Pivotal Web Services (PWS).
PWS provides "Cloud Foundry as a web service," deployed on top of AWS. You'll just need to create an account and you'll automatically get a sixty-day free trial. Getting started isn't a big deal at all.
Hosting Static files in Cloud Foundry
Once you've created your account and set up the command line interface tool, you'll be ready to deploy your application. We're going to use some static files, which means we'll need one folder and a few HTML files. Make sure there's an index.html file among them.
Normally, deploying static files requires a web server like Apache or Nginx. But we're not going to have to worry about that: the platform will automatically take care of any Internet-facing configuration we'll need. You only need to push your application files to the Cloud Foundry environment and everything else will be taken care of.
Now, copy the folder with your files to the machine where you've installed the CLI and log in to the CLI using this API endpoint:
You may need to provide some information
1. Username (the username you used to to log in to your PWS account).
2. Password (the PWS password you created).
3. Organization name (any name will work).
4. Space (select any space where you want your application to be deployed).
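As a rough sketch, the login-and-push flow above can be scripted from Python. The endpoint URL, app name, and folder path below are placeholders, and the choice of the staticfile buildpack is an assumption for serving plain HTML:

```python
import subprocess

API_ENDPOINT = "https://api.run.pivotal.io"  # placeholder PWS API endpoint
APP_NAME = "my-static-site"                  # placeholder app name

def cf_command(*args):
    """Build the argument list for one Cloud Foundry CLI invocation."""
    return ["cf", *args]

def deploy(folder="./static-site"):
    # Log in interactively (the CLI prompts for username, password,
    # organization, and space), then push the folder of static files.
    # The staticfile buildpack serves index.html with no web server setup.
    subprocess.run(cf_command("login", "-a", API_ENDPOINT), check=True)
    subprocess.run(cf_command("push", APP_NAME, "-p", folder,
                              "-b", "staticfile_buildpack"), check=True)
```

Calling `deploy()` on the machine where the CLI is installed runs steps 1-4 above in one go.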
Cloud servers can be configured to provide levels of performance, security and control similar to those of a dedicated server. But instead of being hosted on physical hardware that's solely dedicated to you, they reside on a shared "virtualized" environment that's managed by your cloud hosting provider.
5 Differences between Cloud and Dedicated Servers
There's been a rapid expansion in the number of businesses getting online, and the hosting industry now offers multiple solutions to help them host their data on the right server for their needs. If you are a startup, there are two major hosting options available to you: cloud servers and dedicated servers.
With a cloud server, you don't need to buy or maintain any hardware, as everything is handled by the service provider; with a dedicated server, the user rents or buys the server, software and other resources from the web hosting provider.
To decide which option is right for your business, it is essential to understand the basic differences between them:
1. Availability
Cloud servers rarely go down: in case of any issue, one of the multiple nodes automatically takes over the workload of the failed node, ensuring minimal downtime and maximum network uptime for your website and application.
With dedicated servers, there's a risk of downtime and hardware failure, as they do not have multiple nodes to share the load.
2. Scalability of resources
Increasing or decreasing allotted resources (computing cores, RAM, and storage) to match workloads is quick and simple with a cloud server. ZNetLive's cloud servers have scalable RAM, CPU and storage, plus strong technical resources to boost your website performance.
With a dedicated server, specifications are rigid, and scaling resources is a difficult and time-consuming task.
3. Safety and security
With cloud servers, you have to trust your provider to deliver the services and to take adequate security measures. Cloud service providers ensure data safety through dedicated IT support, secure and encrypted solutions, and firewalls, and they facilitate backup recoveries.
With dedicated servers, you yourself need to take the essential measures, from monitoring server resources to upgrading your dedicated server, to secure your sensitive and confidential business information.
4. Cost-efficiency
One big benefit of cloud server hosting is hourly, resource-based billing: you pay as you go, meaning you pay only for the computing resources you actually use. With cloud servers, the bandwidth, SQL storage and disk space offered are a bit expensive, whereas they are relatively cheap and abundant with dedicated servers.
Dedicated servers are generally billed monthly, and you have to pay a consistent amount irrespective of how much of the server and its resources you actually use.
5. Level of control
With a cloud server, one does not have complete control and is limited to the offerings provided by the service provider.
A dedicated server, however, offers complete control over the machine, as one can add applications, programs and performance-enhancing measures to it.
Should I go for cloud server or dedicated server?
The selection of a server depends entirely upon your business goals and objectives. A cloud server is best suited to e-commerce websites with unpredictable, fluctuating demand; its cost efficiency suits SMB websites; and it is ideal for web hosting providers and for testing new and basic websites.
But if you are aiming for high performance, resilience, reliability and full control, then dedicated servers should be your default choice.
Digital Signatures
A digital signature is a type of electronic signature that offers more security than a traditional electronic signature. When you sign a document with a digital signature, the signature links a “fingerprint” of the document to your identity. Then that information is permanently embedded into the document, and the document will show if someone comes in and tries to tamper with it after you’ve signed it.
"Digital signatures offer tamper evidence, independent verification and a strict adherence to standards, meaning our customers are not left having to rely on us being around simply to prove that signatures took place."
Strictly speaking, a digital signature refers to the encryption/decryption technology on which an electronic signature solution is built. Digital signature encryption secures the data associated with a signed document and helps verify the authenticity of a signed record. Used alone, it cannot capture a person's intent to sign a document or be legally bound to an agreement or contract.
What is an electronic signature? An electronic signature is a way of representing your signature on a computerized document, for
example a delivery slip. The term 'electronic signature' can refer to several different methods of
capturing a signature on a document or device. This includes methods such as using a tablet or mobile
app to capture an image of a handwritten signature. It can also be simply typing your name into a
signature box. A common example of an electronic signature is when you sign for a
delivery on the courier's digital device.
What is a digital signature? A digital signature is much more than an electronic signature. Digital signatures become
intrinsically linked to the content of the digital document using encryption.
Anyone digitally signing a document needs a digital certificate, which is unique to that individual. The certificate contains a public and a private key, known as a 'key pair'. Digital signature software works by performing these steps:
1. The software creates a ‘hash’ of the document content. Hashes are representations of the whole content, including images.
2. The signatory's certificate is then used to encrypt the hash. This combination of hashing and encryption creates an intrinsic connection between the document and the signatory; digital signing in this way ties the two together.
3. The document hash is checked using the public key of the certificate to make sure it can be decrypted. It can be decrypted only if the public key corresponds to the private key used to encrypt the hash.
4. When the signature is checked using the digital signing software, the original document is hashed again and both the original and signed hash are crosschecked. If there’s a difference between them, then the signature is invalidated.
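The four steps above can be sketched with a toy example. The RSA numbers below are textbook-sized and purely illustrative (a real implementation would use a 2048-bit key pair from a proper cryptography library), and the document hash is reduced modulo the toy key so it fits:

```python
import hashlib

# Toy RSA key pair: modulus n, public exponent e, private exponent d.
# Textbook-sized numbers for illustration only.
n, e, d = 3233, 17, 2753

def doc_hash(content: bytes) -> int:
    # Step 1: hash the whole document content (reduced mod n for the toy key).
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % n

def sign(content: bytes) -> int:
    # Step 2: encrypt the hash with the signatory's private key.
    return pow(doc_hash(content), d, n)

def verify(content: bytes, signature: int) -> bool:
    # Steps 3-4: decrypt with the public key and cross-check a fresh hash.
    return pow(signature, e, n) == doc_hash(content)

document = b"I agree to the terms."
sig = sign(document)
print(verify(document, sig))                  # True: the document is intact
print(verify(b"I agree to the termz.", sig))  # tamper check on altered text
```

Verification fails whenever the recomputed hash no longer matches the one encrypted in the signature, which is how changing the document content invalidates the signature.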
Because a digital signature is effectively 'wrapped up' in the content of the document, if anyone tries to change anything about that document content, the signature will also change. This effectively invalidates the signature and indicates that the document has been tampered with.
What's the difference between a digital signature and an electronic signature?
The table below shows a quick, at-a-glance view of some of the key differences between digital signatures and electronic signatures:

Digital Signature | Electronic Signature
Digital signatures are like a lock on a document. If the document changes after the signature is applied, it will show up as an invalidated signature. | Electronic signatures are open to tampering.
Digital signatures are very secure. Hashes cannot be easily undone, and encryption using a digital certificate is highly secure. | Electronic signatures are not based on standards and tend to use proprietary methods, so they are intrinsically less secure.
A digital signature is hard to deny. This is also known as non-repudiation. A digital signature is associated with an individual's private key of a digital certificate. This identifies them as being the signatory, as it is unique. | Electronic signatures are much harder to verify.
Digital signatures are nearly always time stamped. This is very useful in a court of law to tie a person to a signature at a specific day and time. | Electronic signatures can have a time and date associated with the signature, but it is held separate from the signature itself, so is open to abuse.
Digital signatures can hold logs of events, showing when each signature was applied. In advanced digital signature products like ApproveMe, this audit trail can even send out alerts if the log is tampered with. | Audit logs are not easily applied to electronic signatures.
The digital certificates representing the individual signatories give details of the person signing the document, such as full name, email address and company name; they are tied to the document signature through the certificate. | If details of the person placing an electronic signature on a device or document are required, they have to be placed separately from the signature and are not held with the signature itself, and are therefore more open to abuse.
eSign is an online electronic signature service in India that enables an Aadhaar holder to digitally sign a
document. With an Aadhaar number and OTP or biometric authentication, an Indian citizen can sign a document remotely without being physically
present.
eSign – Online Digital Signature Service
Introduction
For creating electronic signatures, the signer is required to obtain a Digital Signature Certificate (DSC) from a Certifying Authority (CA) licensed by the Controller of Certifying Authorities (CCA) under the Information Technology (IT) Act, 2000. Before a CA issues a DSC, the identity and address of the signer must be verified. The private key used for creating the electronic signature is stored in a hardware cryptographic token, which is secured with a password/PIN. This current scheme of in-person physical presence, paper-based identity and address verification, and issuance of hardware cryptographic tokens does not scale to a billion people. For offering fully paperless citizen services, mass adoption of digital signatures is necessary. A simple-to-use online service is required to give everyone the ability to digitally sign electronic documents.
eSign
eSign is an online electronic signature service which can be integrated with service delivery applications via an open API to enable an Aadhaar holder to digitally sign a document. The online electronic signature service is facilitated by authenticating the Aadhaar holder through the Aadhaar e-KYC service.
Salient Features of eSign
Save cost and time
Improve user convenience
Easily apply Digital Signature
Verifiable Signatures and Signatory
Legally recognized
Managed by Licensed CAs
Privacy concerns addressed
Simple Signature verification
Short validity certificates
Aadhaar e-KYC based authentication
Mandatory Aadhaar ID
Biometric or OTP based authentication
Flexible and fast integration with application
Suitable for individual, business and Government
API subscription model
Assured integrity with complete audit trail
Immediate destruction of keys after usage
No concerns regarding key storage and key protection
Easy and secure way to digitally sign information anywhere, anytime - eSign is an online service for electronic signatures without using a physical cryptographic token. Application service providers use the Aadhaar e-KYC service to authenticate signers and facilitate digital signing of documents.
Facilitates legally valid signatures - The eSign process includes signer consent, the Digital Signature Certificate issuance request, Digital Signature creation and affixing, as well as Digital Signature Certificate acceptance, in accordance with the provisions of the Information Technology Act. It enforces compliance through the API specification and licensing model of the APIs. A comprehensive digital audit trail, built in to confirm the validity of transactions, is also preserved.
Flexible and easy to implement - eSign provides configurable authentication options in line with the Aadhaar e-KYC service and also records the Aadhaar ID used to verify the identity of the signer. The authentication options for e-KYC include biometric (fingerprint or iris scan) or OTP (through the mobile number registered in the Aadhaar database). eSign gives millions of Aadhaar holders easy access to a legally valid Digital Signature service.
Respecting privacy - eSign ensures the privacy of the signer by requiring that only the thumbprint (hash) of the document be submitted for the signature function instead of the whole document.
Secure online service - The eSign service is governed by e-authentication guidelines. While authentication of the signer is carried out using Aadhaar e-KYC services, the signature on the document is carried out on a backend server of the eSign provider. eSign services are facilitated by trusted third-party service providers - currently Certifying Authorities (CAs) licensed under the IT Act. To enhance security and prevent misuse, Aadhaar holders' private keys are created on a Hardware Security Module (HSM) and destroyed immediately after one-time use.
http://www.cca.gov.in/cca/?q=eSign.html
Empanelled eSign Service Providers
List of Providers
eMudhra Ltd.
C-DAC
(n)Code Solutions
NSDL e-Governance Infrastructure Ltd
Careers in Emerging Technology: Databases and Data Science
By Lori Cameron
For this issue of ComputingEdge, we asked Andy Pavlo—assistant professor of databaseology in Carnegie Mellon University's Computer Science Department—about career opportunities in emerging technology fields involving databases and data science. Pavlo's research interests are database management systems—specifically main memory, nonrelational, and transaction-processing systems—and large-scale data analytics. He authored the article "Emerging Hardware Trends in Large-Scale Transaction Processing" in IEEE Internet Computing's May/June 2015 issue.
ComputingEdge: What careers in emerging technologies in your field will see the most growth in the next several years?
Pavlo: Artificial intelligence, more specifically machine learning, will continue to be the hot growth area for the foreseeable future in database- and data-science-related fields. Developers who can design high-performance systems to support complex, data-intensive applications will surely be in demand for several years.
ComputingEdge: What would you advise college students to give them an advantage over the competition?
Pavlo: No company or organization starts a new software project from scratch. Thus, it's good to have the ability to work on existing code bases with little or no guidance or documentation. The ideal employees are those who can start quickly on a project that consists of a large amount of existing code they didn't write. The best way to learn this skill is through practice.
ComputingEdge: What advice would you give people changing careers midstream?
Pavlo: You must always work hard. And you have to stay up to date with the latest database systems, machine-learning tools, and data-analysis frameworks. Luckily, we live in an era where everyone is open-sourcing their software, so it is easier for people to try things out at home. The best way to pick up new skills is to pick a hobby project and then build it out using a new piece of software that you want to learn more about.
ComputingEdge: What do you consider to be the best strategies for professional networking?
Pavlo: You need to be visible. Making a LinkedIn page isn't enough. You must advertise what you have to offer. This means you should write a blog, build out your GitHub portfolio, contribute to open source projects, attend and give talks at meet-ups, and/or volunteer for hackathons. All of this shows potential employers that you are enthusiastic about computers and technology. Every little bit helps.
ComputingEdge: What should applicants keep in mind when applying for emerging-technology jobs?
Pavlo: The field is moving fast, but having a good computer-science foundation will serve you well no matter what the current technology trend is.
Cloud computing is on the cusp of a revolution as companies need faster computing resources to process, store and distribute large amounts of data efficiently. Cloud adoption helps big data businesses deal with terabytes of data through a shared infrastructure.
As technology continues to evolve, the cloud computing market is expected to accelerate. According to Gartner, over US$1 trillion in Information Technology (IT) spending will be directly or indirectly affected by the shift to cloud in the next five years. The move from on-premise to cloud infrastructure will be aided by a number of industry developments in 2017. Here are the top five trends that will shape the cloud computing space in 2017:
Cloud security to be the top priority
Security remains a serious concern with the growing adoption of cloud-based infrastructure globally. The vast amount of data stored on remote servers poses an enormous risk and necessitates implementation of broad security policies across organizations. In the business world, a data breach can mean the loss of millions of customers, their identities, and a company's reputation. Data breach investigations can result in millions in fines and can destroy a business all at one go.
A market research firm estimates the size of the cloud security market at US$8.7 billion in 2019. With the increase in cloud-based infrastructure, cloud security will become an integral part of big data management strategy in 2017.
Cloud-based IT infrastructure to traverse further
The hardware infrastructure for cloud computing will witness considerable investment as enterprises move more workloads off-premise. In the next two years, about one-third of all organizations will be entirely based on the cloud, according to IDG's Enterprise Cloud Computing Survey, 2016. Servers and Ethernet will constitute the majority of spending on cloud-based infrastructure. On the contrary, spending on traditional IT infrastructure is expected to decline in some organizations across geographies.
IDC expects that cloud-based IT infrastructure spending will register a compound annual growth rate (CAGR) of 13.6% to reach US$60.8 billion by 2020. "Demand for cloud services will continue to drive the underlying shift in IT infrastructure spending from on-premise to off-premise deployments," said Natalya Yezhkova, Research Director, Storage Systems at IDC.
Public and Hybrid cloud computing set to grow rapidly
Public cloud is gaining traction as enterprises evince keen interest in hosting their software applications on it. Worldwide public cloud growth will come mainly from infrastructure as a service (IaaS) and software as a service (SaaS), which will account for about 17% and 74% of total workloads, respectively, by 2020. The overall spending on public cloud will be worth US$195 billion by 2020, according to IDC.
As more and more organizations move their applications to the public cloud to save cost, big data vendors will evolve to work in these IT environments. However, productivity, vendor lock-in, and security and privacy concerns will push organizations to embrace a hybrid cloud strategy. This will enable them to shuffle between private and public clouds depending on the workload and business scenario. The hybrid cloud market is expected to register a CAGR of 29% through 2019.
Internet of Things (IoT) and cloud computing to go hand in hand
The convergence of IoT and cloud is opening up new horizons in the technological landscape. Connected things will generate a large amount of data through the cloud. While vendors achieve higher economies of scale, customer costs are reduced. Companies are now able to deploy applications worldwide and save costs on data centers. Moreover, the times ahead will see an upsurge in the management of smart mobile devices in IoT topologies.
The Platform as a Service (PaaS) category, which enables the combination of these two technologies, is expected to see huge growth in a few years. Database, analytics and IoT workloads will account for 22% of total business workloads by 2020. As the industry grows, we are likely to see more strategic collaborations between companies to drive IoT-optimized infrastructure services.
Machine learning to proliferate in cloud
Machine learning databases, applications, and algorithms are becoming pervasive in the cloud platforms. As cloud computing expands, organizations are developing different tools to create intelligent applications and incorporate machine learning in their software services. Infrastructure to support machine learning workloads such as natural language processing and neural networks will be of paramount importance to IT firms in the coming time.
All these developments will spread machine learning into a wide variety of uses and drive the next generation of applications. However, this is just the beginning. Eric Schmidt, Executive Chairman of Alphabet, said "bringing machine learning to the cloud will be a game changer". The path is filled with challenges, and it will be interesting to see how these developments pan out in 2017.
The outlook
Cloud computing has become a viable and mainstream solution to store and process large amounts of data. It holds a tremendous future, with a growing number of applications running on the cloud and offering unlimited, elastic data storage capabilities. Cloud computing is helping enterprises gain efficiencies as they move their operations off-premise to better serve their customers. This transition will positively impact a large number of organizations globally over the next five years and will provide a fillip to the global economy.
References
[1] http://www.gartner.com/newsroom/id/3384720
[2] http://www.marketsandmarkets.com/PressReleases/cloud-security.asp
[3] http://www.idgenterprise.com/resource/research/2016-idg-enterprise-cloud-computing-survey/?utm_campaign=Cloud%20Computing%20Survey%202016&utm_medium=Press%20Release&utm_source=Press%20Release
[4] http://www.einnews.com/pr_news/354178469/global-hybrid-cloud-market-trends-demand-and-analysis-by-2027
[5] http://www.informationweek.com/cloud/infrastructure-as-a-service/gartner-sees-$1-trillion-shift-in-it-spending-to-cloud/d/d-id/1326372
[6] http://blogs.wsj.com/cio/2016/10/05/cloud-it-infrastructure-spending-up/
Sentiment analysis of social media data using Big Data Processing Techniques
Nov 24, 2016, 22:56
Introduction
With the extensive growth in the usage of online social media, a vast amount of data is available reflecting users' preferences regarding products, services provided by various organizations, and political issues. Microblogs and forums are also available where internet users can express their opinions. Since mobile devices can access the network easily from anywhere, social media is becoming more and more popular. The number of people using social media is increasing day by day as they share their personal feelings, and reviews are created at large scale. Opinions and reviews are expressed online every minute, and potential users rely on the reviews, opinions and feedback of other users to make decisions about purchasing an item or, in the case of an organization that provides services, about developing software. Analyzing these reviews, opinions and feedback is therefore of utmost importance.

Evaluating these reviews and opinions is not as easy as it appears to be: it requires performing sentiment analysis. Sentiment analysis greatly helps us in understanding customer behavior. The biggest challenge is processing social data that is in unstructured or semi-structured form, which older technologies fail to handle effectively. So there is a need for a highly optimized, scalable and efficient technology to process the abundant data being produced at a high rate. The Hadoop framework effectively analyzes data in unstructured and semi-structured form.

With the increasing use of Hadoop for processing huge data sets in various fields, maintaining the overall performance of Hadoop becomes inevitable. This is made possible by various open source tools supported by Hadoop, such as Spark, Hive, Flume, Oozie, Zookeeper and Sqoop, which make it even more powerful.
SENTIMENT ANALYSIS
Sentiment is defined as an expression or opinion by an author about an object or an aspect of it. Analyzing, investigating and extracting users' opinions, sentiments and preferences from subjective text is known as sentiment analysis. The main focus of sentiment analysis is parsing the text. In simple terms, sentiment analysis can be defined as detecting the polarity of a text: positive, negative or neutral. It is also referred to as opinion mining, as it derives the opinion of the user. Opinions vary from user to user, and sentiment analysis greatly helps in understanding each user's perspective. A sentiment can be:
Direct opinion: As the name suggests, the opinion about an object is given directly, and it may be either positive or negative. For example, "The video clarity of the cellphone is poor" expresses a direct opinion.
Comparison opinion: A comparative statement consisting of a comparison between two similar objects. The statement "The picture quality of camera-x is better than that of camera-y" is one possible example of a comparative opinion.
Sentiment analysis is performed at three different levels:
Sentiment analysis at sentence level identifies whether the given sentence is subjective or objective. Analysis at sentence level assumes that the sentence contains only one opinion.
Sentiment analysis at document level classifies the opinion about a particular entity. The entire document contains opinions about a single object from a single opinion holder.
Sentiment analysis at feature level extracts the features of a particular object from the reviews and determines whether the stated opinion on each feature is positive or negative. The extracted features are then grouped and a summarized report is produced.
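A sentence-level classifier in the spirit described above can be sketched with a small lexicon-count approach. The positive/negative word lists here are invented placeholders, not a real sentiment lexicon:

```python
# Minimal lexicon-based polarity detection: a simplified sketch of
# sentence-level sentiment analysis.
POSITIVE = {"good", "great", "excellent", "better", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "worse", "awful"}

def polarity(sentence: str) -> str:
    """Classify a sentence as positive, negative or neutral by counting
    lexicon hits (assumes one opinion per sentence, as noted above)."""
    words = sentence.lower().strip(".!?").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("The video clarity of the cellphone is poor"))  # negative
print(polarity("The picture quality of camera-x is better"))   # positive
```

Real systems replace the word sets with a full sentiment lexicon or a trained model, but the polarity decision itself works the same way.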
Architecture components of Big Data Ecosystem
With the explosive growth of data on the Internet and the improvement of corpora, sentiment analysis systems need big data processing techniques to complete their tasks. The term Big Data is represented by three V's: Volume, Variety and Velocity. Volume represents the amount of data used for summarization. Variety represents the different types of data (structured, semi-structured and unstructured) extracted from various sources. Velocity represents the speed of data generation on the internet. For processing large data sets in parallel across a cluster of nodes, Apache came up with an open source framework known as Hadoop.
The major components of Hadoop are the Hadoop distributed file system (HDFS) and the MapReduce programming model. Hadoop is accessible because it runs on cloud computing services or on clusters of commodity machines. It is able to handle failures efficiently even though it is intended to run on commodity hardware, which makes it robust. Any number of nodes can be added to a Hadoop cluster in order to deal with huge data in parallel. Hadoop is simple in that a user can write straightforward parallel code. Data is distributed to each and every node, and hence operations are performed in parallel across the Hadoop cluster. Hadoop tolerates hardware failure by keeping multiple copies of data. The modules of the Hadoop ecosystem are as follows:
1. Hadoop common utilities
Hadoop modules require operating-system-level and file-system-level abstractions, which are provided by Java libraries and utilities. Execution of Hadoop is carried out by the Java files and scripts facilitated by the Hadoop common utilities.
2. Hadoop Distributed file system (HDFS)
Hadoop provides its own filesystem, known as the Hadoop distributed file system, for storing huge data sets. It is based on the Google File System (GFS) and is highly fault-tolerant. HDFS follows a master/slave architecture: the master node manages the file system, while storage of the actual data is taken care of by the slave nodes. A file in the HDFS namespace is divided into several blocks, and these blocks are stored in DataNodes. The mapping of blocks to DataNodes is maintained by the NameNode. The DataNodes perform the read and write operations.
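The block-splitting and NameNode bookkeeping can be illustrated with a toy simulation. The block size, replication factor and DataNode names below are made up for the demonstration (real HDFS defaults to 128 MB blocks and 3 replicas):

```python
import itertools

BLOCK_SIZE = 4          # bytes; tiny, for demonstration only
REPLICATION = 2
DATANODES = ["dn1", "dn2", "dn3"]

def place_blocks(data: bytes):
    """Return a NameNode-style mapping: block index -> (block, replica nodes)."""
    nodes = itertools.cycle(DATANODES)   # round-robin placement, a simplification
    table = {}
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        replicas = [next(nodes) for _ in range(REPLICATION)]
        table[i // BLOCK_SIZE] = (block, replicas)
    return table

for idx, (block, replicas) in place_blocks(b"hello hadoop").items():
    print(idx, block, replicas)
```

Losing one DataNode leaves every block recoverable from its other replica, which is how HDFS survives the hardware failures mentioned above.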
3. MapReduce
MapReduce is the distributed data processing framework of Apache Hadoop. It enables applications to be written effectively and processes huge data sets in parallel. The MapReduce paradigm has two different tasks:
The Map task: The Map task takes the input data and converts it into intermediate tuples, each forming a key/value pair.
The Reduce task: The input to the Reduce task is the output from the Map task. The tuples produced by the Map task are combined to form a smaller set of tuples. The Map task is always followed by the Reduce task.
The MapReduce component of the Hadoop framework schedules and monitors the tasks, and re-executes any failed task. The MapReduce paradigm has a single JobTracker, acting as master, and one TaskTracker per cluster node, acting as slave. The master JobTracker directs the slave TaskTrackers to execute tasks, and it also manages resources and tracks resource distribution, consumption and availability. The TaskTrackers, in turn, report status information to the JobTracker.
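The two tasks can be simulated in a few lines of ordinary Python. The review labels are invented sample data; on a real cluster the same map/shuffle/reduce logic would run as a Hadoop job across many nodes:

```python
from itertools import groupby
from operator import itemgetter

reviews = ["positive", "negative", "positive", "neutral", "positive"]

# Map task: emit a (key, value) pair for each input record.
mapped = [(label, 1) for label in reviews]

# Shuffle: group the intermediate pairs by key
# (Hadoop performs this between the Map and Reduce phases).
mapped.sort(key=itemgetter(0))
grouped = groupby(mapped, key=itemgetter(0))

# Reduce task: combine each key's values into a smaller set of tuples.
counts = {key: sum(v for _, v in pairs) for key, pairs in grouped}
print(counts)  # {'negative': 1, 'neutral': 1, 'positive': 3}
```

Because each (key, value) pair is processed independently in the Map phase and each key independently in the Reduce phase, both phases parallelize naturally across cluster nodes.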
4. Hadoop Yarn framework
YARN provides the computational resources required for application execution. It enables dynamic resource utilization on the Hadoop framework, as users can run various Hadoop applications without having to worry about increasing workloads. YARN has a Resource Manager and Node Managers for scheduling jobs and managing cluster resources. The master is the Resource Manager: it schedules resources by knowing where the slaves are located and how many resources they have. The slave of the infrastructure is the Node Manager: when it starts, it announces itself to the Resource Manager and periodically sends it a heartbeat.
Hadoop proves to be a reliable framework, and it processes huge data sets in a fault-tolerant manner, which makes it efficient. To make Hadoop function methodically, various open source technologies such as Spark, Flume, Hive, Mahout, Sqoop, Oozie and Zookeeper have been developed on top of Hadoop. Collectively called the Hadoop ecosystem, they improve the overall performance of Hadoop.
Figure: Hadoop Ecosystem
Data Access Components of Hadoop Ecosystem
Data access components of Hadoop are Apache Pig and Hive. They are used for analyzing large data sets without the low level work with Map
reduce. Apache Pig is a platform for analysing large data sets . Pig‘s infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs and Pig's language layer currently consists of a textual language called Pig Latin. Hive is a data warehouse system for Hadoop for querying, Summarizing and analysis of large data sets stored in HDFS. It provides SQL like interface. The data stored in HDFS is queried by Hive with the help of HiveQL.
Data Integration Components of Hadoop Ecosystem
Data integration components of the Hadoop ecosystem are Flume and Sqoop. Flume is used for populating Hadoop with data; the collection, aggregation and movement of data is Flume's responsibility. Sqoop (alongside REST and ODBC connectivity) is a tool for moving data from non-Hadoop data stores, such as relational databases and data warehouses, into Hadoop.
Cyclomatic Complexity – 40 Years Later
The criticality and risk of software is defined by its complexity. Forty years ago, McCabe introduced his famous cyclomatic complexity (CC)
metric. Today, it is still one of the most popular and meaningful measurements for analyzing code. Read this blog about the measurement and
its value for improving code quality and maintainability...
— Christof Ebert
It is of great benefit for projects to be able to predict software components likely to have a high defect rate or which might be difficult to test and
maintain. It is of even more value having an indicator which can provide constructive guidance on how to improve the quality of code. This is
what the cyclomatic complexity (CC) metric gives us.
The CC metric is simple to calculate and intuitive to understand. It can be taught quickly. Control flows in code are analyzed by counting the
decisions, i.e., the number of linear independent paths through the code under scrutiny. Too many nested decisions make the code more difficult
to understand due to the many potential flows and possibilities of passing through it. In addition, the CC value of a module correlates directly
with the number of test cases necessary for path coverage, so even a rough indication given by the CC metric is of high value to a developer or
project manager.
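Counting decisions can be automated directly from source code. The sketch below uses Python's ast module with deliberately simplified counting rules (if/elif, loops, and/or operators, exception handlers); real tools implement McCabe's full graph-based definition and the switch/case refinements discussed later in this piece.

```python
import ast

# Simplified cyclomatic complexity: CC = number of decision points + 1.
# The counting rules here are a common approximation, not McCabe's
# complete edges - nodes + 2 graph formulation.

def cyclomatic_complexity(source):
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.ExceptHandler)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):      # each extra and/or operand
            decisions += len(node.values) - 1   # adds one decision
    return decisions + 1

src = '''
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    for i in range(3):
        if i and x:
            x -= 1
    return "positive"
'''
cc = cyclomatic_complexity(src)   # 3 ifs + 1 for + 1 'and' -> CC of 6
```

A CC of 6 here also hints at the number of test cases needed for path coverage of `classify`, which is exactly the link to testing effort made above.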
A high CC thus implies high criticality and the code will have a higher defect density (vis-à-vis code with a relatively lower CC); test effort is
higher and maintainability severely reduced. These relationships are intuitive for students as well as experts and managers and this is another
appealing feature of the CC metric. It is small wonder therefore that CC, unlike many other metrics which have been proposed over the past
decades, is still going strong and is used in almost all tools for criticality prediction and static code analysis.
CC, together with change history, past defects and a selection of design metrics (e.g., level of uninitialized data, method overriding and God
classes) can be used to build a prediction model. Based on a ranked list of module criticality used in a build, different mechanisms
namely refactoring, re-design, thorough static analysis and unit testing with different coverage schemes can then be applied. The CC metric
therefore gives us a starting point for remedial maintenance effort.
Instead of predicting the number of defects or changes (i.e., algorithmic relationships) we consider assignments to classes (e.g., "defect-prone").
While the first goal can be achieved more or less successfully with regression models or neural networks mainly in finished projects, the latter
goal seems to be adequate for predicting potential outliers in running projects, where precision is too expensive and not really necessary for
decision support.
While the benefits of CC are clear, it does need clear counting rules. These days, for instance, we do not count simple "switch" or "case"
statements as multiplicities of "if, then, else" decisions. Moreover, the initial proposal to limit CC to seven plus/minus two per entity is no longer
taken as a hard rule, because boundaries for defect-prone components are rather fuzzy and multi-factorial.
Having identified such overly critical modules, risk management must be applied. The most critical and most complex of the analyzed modules,
for instance, the top 5, are candidates for redesign. For cost reasons mitigation is not only achieved with redesign. The top 20% should have a
thorough static code analysis, and the top 80% should be at least unit tested with C0 coverage of 100%. By concentrating on these critical
components the productivity of quality assurance is increased.
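The triage just described (top 5 for redesign, top 20% for static analysis, top 80% for unit testing with C0 coverage) can be sketched as a ranking over per-module CC values. The module names, CC numbers and threshold defaults below are invented for illustration.

```python
# Rank modules by cyclomatic complexity, then assign remedial actions
# by rank, mirroring the tiered mitigation scheme described above.

def triage(cc_by_module, redesign_top=5, analyse_frac=0.2, test_frac=0.8):
    ranked = sorted(cc_by_module, key=cc_by_module.get, reverse=True)
    n = len(ranked)
    return {
        "redesign": ranked[:redesign_top],
        "static_analysis": ranked[:max(1, int(n * analyse_frac))],
        "unit_test_c0": ranked[:max(1, int(n * test_frac))],
    }

# Hypothetical build with ten modules and their measured CC values.
modules = {f"mod{i}": cc for i, cc in enumerate(
    [42, 35, 28, 27, 21, 19, 15, 12, 9, 5], start=1)}
plan = triage(modules)
```

In practice the ranking would combine CC with change history and design metrics, as the prediction-model paragraph above suggests, rather than CC alone.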
Critical modules should at least undergo a flash review and subsequent refactoring, redesign or rewriting – depending on their complexity, age
and reuse in other projects. Refactoring includes reducing size, improving modularity, balancing cohesion and coupling, and so on. For instance,
apply thorough unit testing with 100 percent C0 coverage (statement coverage) to those modules ranked most critical. Investigate the details of
the selected modules' complexity measurements to determine the redesign approach. Typically, the different complexity measurements will
indicate the approach to follow. Static control flow analysis tools incorporating CC can also find security vulnerabilities such as dead code, often
used as backdoors for hijacking software.
Our own data but also many published empirical studies demonstrate that a high decision-to-decision path coverage or C1 coverage will find
over 50% of defects, thus yielding a strong business case in favor of using CC. On the basis of the results from many of our client projects and
taking a conservative ratio of only 40 percent defects in critical components, criticality prediction can yield at least a 20 percent cost reduction for
defect correction.
The additional costs for the criticality analysis and corrections are in the range of a few person-days per module. The necessary tools, such
as Coverity, Klocwork, Lattix, Structure 101, SonarX and SourceMeter, are off the shelf and account for even less per project. These
criticality analyses provide numerous other benefits, such as the removal of specific code-related risks and defects that otherwise are hard to
identify (for example, security flaws).
CC clearly has its value for criticality prediction and thus for improving code quality and reducing technical debt. Four decades of validity and usage
is a tremendous time in software, and I congratulate McCabe for such a ground-breaking contribution.
More:
Read selected white papers on quality practices from our media-center:
http://consulting.vector.com/vc_download_en.html?product=quality
Read our full article on static code analysis technologies in IEEE Software:
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4814967
Author:
Christof Ebert is the managing director of Vector Consulting Services. He is on the IEEE Software editorial board and teaches at the University of
Stuttgart and the Sorbonne in Paris.
DevOps Practice
DevOps breaks organizational silos and thus accelerates delivery. DevOps principles apply not only for cloud and IT services but for most
industries, including critical systems. Read the blog and learn from a recent case study of using DevOps methodologies in critical domains...
— Christof Ebert
DevOps is an organizational shift where instead of distributed silo-like functions cross-functional teams work on continuous operational feature
deliveries. Teams thus deliver value in a faster and continuous way, reducing problems generated by miscommunication between team members
and enhancing a faster resolution of problems. It obviously means a culture shift towards collaboration between development, quality assurance
and operations. At Vector we have supported a number of companies on improving efficiency with DevOps and continuous delivery. Here is a brief
case study from a domain with high safety and security requirements.
A global supplier of critical infrastructure solutions faced overly long cycle time and high rework of delivered upgrades. The overall delivery
process from development to the field took 18 months for new products and up to 3 months for upgrades thus being far too long, even in this
domain. We introduced a DevOps model tailored for these specific environmental constraints. The figure below shows the eight focus areas
mapped to the v-shaped lifecycle abstraction. The key change was the enhanced requirements engineering and delivery model (numbers 1 and
2 in the picture below). By running the automated tests and static and runtime analysis with every check into automatic build management, our
client obtained the capability to discover defects early in the development cycle. Fewer changes during the feature development phase and less
rework due to quality issues directly impacted ROI. Software releases became more consistent and less painful, because tests were run early
and often. The company gained an overall end-to-end cycle time improvement towards 12 months for products and few days for small upgrades
due to better quality and fewer changes.
DevOps principles apply to different delivery models and industries, but must be tailored to the environment and product architecture. Continuous
deliveries are difficult in distributed and critical systems, such as automotive, railway or medical. Nevertheless, delivery processes can be
facilitated in a fast and reliable scheme, as software over-the-air (OTA) upgrades in these industries show. Obviously such delivery models
need dedicated architecture and hardware changes, for instance secure delivery schemes and a hot-swap controller concept, where one half is
operational and the other half builds the next update, which is swapped to active mode after in-depth security checks and verification.
DevOps for such critical systems is more challenging than for cloud and IT services due to the dependence on legacy code and
architecture, and the difficulty of fitting it into a continuous delivery approach.
Mutual understanding from requirements onwards to maintenance, service and product evolution will yield typically a cycle time improvement of
10-30% and cost reduction of up to 20%. As products and life-cycle processes vary, each company needs its own approach towards
a DevOps environment, from architecture to tools and culture.
More:
Read selected white papers on agile practices from our media-center
Directly proceed to the white papers…
Read our full article on DevOps tools and technologies in IEEE Software, May 2016
DevOps is about fast and flexible development and delivery business processes. The blog provides a brief overview on most recent DevOps
technologies and what it means for industry projects. Learn about some best practices for DevOps in this blog...
— Christof Ebert
DevOps efficiently integrates development, delivery and operations and thus facilitates a lean and fluid connection of these traditionally separated silos. It is a software practice that integrates the two worlds of development and operations with automated development, deployment and infrastructure monitoring. It‘s an organizational shift where instead of distributed silo-like functions cross-functional teams work on continuous operational feature deliveries. This integrative approach helps teams deliver value in a faster and continuous way, reducing problems generated by miscommunication between team members and enhancing a faster resolution of problems.
DevOps means a culture shift towards collaboration between development, quality assurance and operations. The generic process is indicated in the figure below. Its promise and goal is to better integrate the development, production and operations business processes with adequate technology, not settling for highly artificial process concepts that will never fly, but rather setting up a continuous delivery process with small upgrades. Companies such as Amazon and Google have led this approach, achieving cycle times of minutes. This obviously depends on the deployment model: a single cloud service is easier to facilitate than actual software deliveries to real products.
DevOps applies to these very different delivery models, but must be tailored to the environment and product architecture. Not all products facilitate continuous deliveries, for instance safety-critical systems. Nevertheless, upgrades can be planned and delivered in a fast and reliable scheme, as the recent evolution of automotive software over-the-air (OTA) upgrades shows. Aside from the highly secured cloud-based delivery model, such delivery models also need dedicated architecture and hardware changes, for instance a hot-swap controller concept, where one half is operational and the other half builds the next update, which is swapped to active mode after in-depth security checks and verification. DevOps for embedded systems is more challenging than for cloud and IT services due to the dependence on legacy code and architecture, and the difficulty of fitting these into a continuous delivery approach.
Modern tools are mandatory to implement a DevOps pipeline, and choosing the right tools for your environment or project is an important step when moving to a DevOps practice. In the build phase the tools need to support fast workflows: build tools help achieve fast iteration by reducing manual, time-consuming tasks, and continuous integration tools merge code from all developers and check for broken
code, improving software quality. During the deployment phase the most important shift is to treat infrastructure as code. With this approach, infrastructure can be shared, tested and version-controlled. A homogeneous infrastructure is shared between development and production, reducing the problems and bugs caused by differences in infrastructure configuration.
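The fail-fast behavior of such a continuous integration pipeline can be sketched as below. The stage names are illustrative and not tied to any particular CI tool; a real pipeline would invoke build systems and test runners instead of toy callables.

```python
# Toy fail-fast pipeline runner: each stage is a callable returning
# True/False; the run stops at the first broken stage, mirroring how
# a CI pipeline rejects a merge as soon as build or tests fail.

def run_pipeline(stages):
    completed = []
    for name, stage in stages:
        if not stage():
            return completed, name        # report the failed stage
        completed.append(name)
    return completed, None                # everything passed

stages = [
    ("build", lambda: True),
    ("unit-tests", lambda: True),
    ("static-analysis", lambda: False),   # simulate a failing check
    ("deploy", lambda: True),
]
done, failed = run_pipeline(stages)       # deploy is never reached
```

Stopping before "deploy" is the whole point: broken code never reaches the shared, version-controlled infrastructure described above.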
At Vector we have supported a number of companies on improving efficiency with DevOps and continuous delivery. A key learning for all companies is that the culture shift should not be underestimated. There are four major challenges which we face in all DevOps projects, namely:
• Break complex architectures and feature sets towards small chunks that can be produced and deployed independently.
• Maintain a configuration and build environment which provides visibility at all times about what is currently deployed with which versions and dependencies.
• Introduce a purpose-built development and production environment from legacy ALM/PLM environments.
• Bridge the traditional silo-type cultures of development (perceived by operations in its thoroughness as cumbersome and expensive) and operations (perceived by developers as quick and dirty).
DevOps is a paradigm shift impacting the entire software and IT industry. Building upon lean and agile practices, DevOps means end-to-end automation in software development and delivery. Hardly anybody will be able to approach it with a cookbook-style approach, but most will benefit from better connecting the previously isolated silos of development and operations. Mutual understanding from requirements onwards to maintenance, service and product evolution will typically yield a cycle time improvement of 10-30% and cost reduction of up to 20%. Major drivers are fewer requirements changes, focused testing and quality assurance, and much faster delivery cycles with feature-driven teams. As products and life-cycle processes vary, each company needs its own approach towards a DevOps environment, from architecture to tools and culture.
Contact me at [email protected] for more information or to discuss these trends.
Key differences between MySQL vs PostgreSQL
MySQL is a relational database management system (RDBMS) currently developed by Oracle with open-source code. This code is available for free under the GNU General Public License, and commercial versions of MySQL are also available under various proprietary agreements. PostgreSQL is an object-relational DBMS (ORDBMS) developed by the PostgreSQL Global Development Group. Its source code is also open, released under the permissive PostgreSQL License. The differences between MySQL and PostgreSQL include the following key categories:
Governance
Supported platforms
Access Methods
Partitioning
Replication
Governance
The governance model around MySQL and PostgreSQL is one of the more significant differences between
the two database technologies. MySQL is controlled by Oracle, whereas Postgres is available under an open-source license from the PostgreSQL Global Development Group. Both are open source, but partly because of this difference in stewardship there has been increasing interest in Postgres over the past few years.
Supported Platforms
Both MySQL and PostgreSQL can run on the Linux, OS X, Solaris and Windows operating systems (OSs). Linux is an open-source OS, OS X is developed by Apple, Solaris is developed by Oracle and Windows is developed by Microsoft. MySQL also supports the FreeBSD OS, which is open source. PostgreSQL supports the HP-UX operating system, which is developed by Hewlett Packard, and the open-source Unix OS.
Access Methods
Access methods that are common to both MySQL and PostgreSQL include ADO.NET, JDBC and ODBC. ADO.NET is a set of Application Programmer Interfaces (APIs) that programmers use to access data based on XML. JDBC is an API for the Java programming language that accesses databases, while ODBC is a standard API for accessing databases. PostgreSQL can also be accessed with routines from the platform's native C library as well as streaming APIs for large objects.
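ADO.NET, JDBC and ODBC all expose the same connect/execute/fetch pattern, and Python's DB-API does too. The sketch below uses the built-in sqlite3 module purely because it needs no server; with a PostgreSQL driver such as psycopg2 or a MySQL connector, essentially only the `connect()` call would differ. The table and data are invented.

```python
import sqlite3

# The generic database access pattern shared by ODBC/JDBC-style APIs:
# open a connection, execute parameterized statements, fetch results.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# Parameterized query: the driver handles quoting, avoiding injection.
name = conn.execute(
    "SELECT name FROM users WHERE id = ?", (1,)
).fetchone()[0]
conn.commit()
```

The parameter placeholder syntax (`?` here, `%s` in psycopg2) is one of the few visible differences between drivers following this pattern.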
Partitioning
MySQL and PostgreSQL differ significantly with respect to their partitioning methods, which determine how data is stored on different nodes of the database. MySQL uses a proprietary technology called MySQL Cluster to perform horizontal clustering, which consists of creating multiple clusters with a single cluster instance within each node. PostgreSQL doesn't implement true partitioning, although it can provide a similar capability with table inheritance. This task involves using a separate sub-table to control each "partition."
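The sub-table-per-partition idea can be sketched without a PostgreSQL server. sqlite3 has no table inheritance, so this stand-in emulates it by routing rows to per-year sub-tables and querying a UNION ALL view; in PostgreSQL the child tables would inherit from a parent and the routing would typically be done by triggers. Table names and data are invented.

```python
import sqlite3

# Emulating inheritance-style partitioning: one physical sub-table
# per "partition", plus a combined view for queries over all of them.

conn = sqlite3.connect(":memory:")
for year in (2015, 2016):
    conn.execute(f"CREATE TABLE sales_{year} (year INTEGER, amount REAL)")
conn.execute("CREATE VIEW sales AS "
             "SELECT * FROM sales_2015 UNION ALL SELECT * FROM sales_2016")

def insert_sale(year, amount):
    # Application-side routing to the right partition sub-table.
    conn.execute(f"INSERT INTO sales_{year} VALUES (?, ?)", (year, amount))

insert_sale(2015, 10.0)
insert_sale(2016, 20.0)
insert_sale(2016, 5.0)
total_2016 = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE year = 2016").fetchone()[0]
```

The payoff of the scheme is that dropping a whole partition is just `DROP TABLE sales_2015`, far cheaper than deleting rows from one big table.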
Replication
A database may use multiple methods to store redundant data across multiple nodes. MySQL uses master-master replication, in which each node can update the data. Both MySQL and PostgreSQL can perform master-slave replication, where one node controls the storage of data by the other nodes. PostgreSQL can also handle other types of replication with the implementation of third-party extensions.
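The master-slave arrangement described above can be sketched as a toy in-memory model: writes go to the master, which pushes each change to its replicas, and reads can then be served from any slave. This deliberately ignores failures, replication lag and conflict handling, all of which dominate real replication design.

```python
# Toy master-slave replication: the master owns writes and propagates
# them synchronously to every slave (for brevity; real systems are
# usually asynchronous and must handle lag and failover).

class Node:
    def __init__(self):
        self.data = {}

class Master(Node):
    def __init__(self, slaves):
        super().__init__()
        self.slaves = slaves

    def write(self, key, value):
        self.data[key] = value
        for slave in self.slaves:      # push the change to each replica
            slave.data[key] = value

slaves = [Node(), Node()]
master = Master(slaves)
master.write("config", "v2")
reads = [s.data["config"] for s in slaves]   # any slave can serve reads
```

Master-master replication would give every node a `write` path, which is exactly what forces the conflict-resolution machinery the simple scheme above avoids.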
Smartphones and tablets are still regarded as the latest craze in technology, but what we should
really turn our attention to is the Internet of Things and the devices that are shaping the future for
it. Those of you who are aware of what it means know that the Internet of Things is just around the
corner and we’re excited to see it evolve with the help of these ingenious technical products. There are
many definitions for the IoT, probably because it’s still a pioneering idea, but I’ll try my best to explain
it as plain as possible.
The Internet of Things is an environment where objects, animals or people have unique identifiers that
allow them to transfer data over a network on their own. The evolution of wireless technologies has
made it possible for the IoT to see rapid growth. Also, the huge address space of IPv6 is another catalyst in the
development of the Internet of Things.
20 Devices to show how the Internet of Things will be
Within eight years, it is expected that various organizations ranging from government to business will
create a market of almost $9 trillion which will be constituted from 212 billion “things” making up the
global Internet of Things in 2020. Another good definition for the IoT comes from Techtarget:
A thing, in the Internet of Things, can be a person with a heart monitor implant, a farm animal with
a biochip transponder, an automobile that has built-in sensors to alert the driver when tire pressure
is low — or any other natural or man-made object that can be assigned an IP address and provided
with the ability to transfer data over a network. So far, the Internet of Things has been most closely
associated with machine-to-machine (M2M) communication in manufacturing and power, oil and
gas utilities. Products built with M2M communication capabilities are often referred to as being
smart.
In our article, we won’t talk about the role of the Internet of Things in such big areas as manufacturing,
oil and gas industries; instead we'll discuss the Internet of Things devices that we come in contact
with every day, such as a smart thermometer, a smart scale, pretty much everything that makes a smart
home.
An ideal Internet of Things scenario would suggest something like this: as you wake up, a smart coffee
maker will start brewing your coffee, when you want to watch a movie, a smart light system would turn
down the lights. Before we start the list with some of the best Internet of Things devices, you
might want to have one more look at this infographic to understand just how big and important the IoT
really is.
Internet of Things Platforms and Networks
Those of you acquainted with the Internet of Things notion and devices have probably heard about Z-Wave.
To put it very simply, Z-Wave is the wireless standard for IoT devices, as most of them have Z-Wave
chips inside. It's the same case with ZigBee, which, albeit newer to this game, is betting on its
low-power approach. The ZigBee and Z-Wave standards, if you will, are the backbones behind most of
the devices present here, but are more used in platforms and networks.
SmartThings
Special attention, I think, needs to be given to the SmartThings start-up, which works like a platform
for the growing number of devices that are connected to the Internet. It works like an online
marketplace where users can buy starter kits to transform their homes into a unified IoT system. So, if
you’re looking for some cool IoT devices, you should also have a look there, where you will find lots
of Internet of Things devices, such as smart lights, switches, doors and locks. Developers will find the
right tools while users will get smart apps for the devices they buy from SmartThings.
Revolv
Revolv is another platform that brings more devices together under a single command center. Just like
the above video says, you can control the lighting in your home (such as the Philips Hue light bulb that
we'll talk about below), your Apple TV, your heating and much more! The Revolv IoT
service is first expected to ship for $299 before the end of this fall. You will get with your purchase the
Revolv Hub, Revolv App, and for a limited time, also the free lifetime service plan that comes with
GeoSense automation.
Securifi
Another IoT device that was born thanks to Kickstarter, Securifi is
somehow an IoT platform as well as a standalone device, but we've decided to put it in this
category. The Almond+ is the latest version of their touchscreen router, an Internet of Things device
that can wirelessly connect a 5,000 square foot home, and is four times faster than your average
wireless router. Also, thanks to its touchscreen, the Almond+ can be set up without having to use a PC
or a smartphone. It also works with ZigBee and Z-Wave standards so Almond+ supports hundreds of
existing sensors in the market.
Xfinity Home products
Comcast’s range of Xfinity
Home products is focused on providing users with a smart way to control their homes. Like a true
service of its kind, Comcast doesn’t sell products on a pay-once basis, but it comes with monthly plans
that vary according to your needs. The Secure and Control plan protects against fire and break-ins,
while providing automation for lights, temperature, and more. The Home Control plan, which is also
cheaper, lets you control lights, temperature, and other home features to save
money on your bills.
WeMo Belkin Home Automation
The WeMo family of IoT products from Belkin is composed of light and insight switches, motion
sensor, baby monitor. Though not included in the WeMo range of products, Belkin also has two
NetCams that will let you watch what happens inside your home. Compared to other similar home
automation systems, Belkin’s products seem to have a lower price-point. Even more, WeMo also works
with IFTTT, so you could do much more amazing stuff with Belkin’s IoT devices.
Ninja Blocks
We also talked about
Ninja Blocks when we were sharing with you some of the best weather gadgets there are in the
market. Using the Ninja Blocks IoT system, you can do many things like watching over the
temperature and humidity levels, turning on the lights when you're not at home or even sending an SMS
when someone is at your front door. It's a smart mix between IFTTT functionality and the power of the
Internet of Things. A wireless Window & Door sensor can let you know when your door is opened or
images of someone who is moving in front of your door can be stored to your Dropbox account. How
cool is that!
Internet of Things Devices
These IoT devices cover many fields, with a special attention, of course, to home automation and the
functionality of your house. Thermostats, smoke detectors, smart music systems, smart light bulbs –
we have it all here. But Internet of Things devices are also present in the interaction with the human
body, whether we’re talking about fitness trackers, smart body scales or even baby monitors.
Fitbit Aria Wi-Fi Smart Scale
We’ve talked about Fitbit’s Aria smart
scale before and we’ll do it again. Aria tracks your weight, body fat percentage, body mass index and
lets you watch these values over the long term. It wirelessly syncs and auto-uploads your stats to an
online graph that you can always access to check how you’re doing. Fitbit’s Aria smart
scale recognizes up to eight people but it does so discreetly, as all information is kept private. To
keep you motivated, you can earn badges as you go. Besides this, if you want, you can receive alerts on
your smartphone when you’re nearing your goals.
Withings Smart Body Analyzer
Withings is a company well-known for making gadgets to look after your health and help you keep fit.
The Smart Body Analyzer is a smart scale that can do much more than just letting you keep an
eye on your weight. It comes with an impressive set of features: full body knowledge, heart
measurement, weight goals and long-term progress graph and indoor air quality monitoring. This
amazing Internet of Things device can be yours for $150 from Amazon.
Nest
Nest is one of the companies that releases products with impressive but simple designs. It probably has
to do with the fact that the company was co-founded by former Apple
engineers Tony Fadell and Matt Rogers back in 2010. They have only two products, but they seem to be
very well-built and show us that this is a company that will definitely be present in the future with even
more Internet of Things devices.
Thermostat – we’ve featured the Nest Thermostat in our top with the best there are in the market, so
have a look at it if you're interested in such devices. The Nest Learning Thermostat is so called
because it can learn your schedule and program itself to increase or reduce the heating
levels, thus reducing your bills. Obviously, you can control it from your phone.
Protect smoke detector – the Nest Protect is the second
product of the company; this is a smoke and carbon monoxide detector. It is smart because it doesn't
immediately turn on a loud alarm when it detects some smoke from the toaster. The Protect will give an
early warning through flashing yellow lights and a warning spoken in a human voice. It will tell you where
the smoke or carbon monoxide is, so you can decide yourself whether it is an emergency or just a nuisance
alarm. And if so, you'll cancel the alarm just by standing under the Nest Protect and waving your arm.
Sonos Music System
Music definitely needs to be taken into consideration with the rise of IoT devices. With an appealing
design, Sonos is a system of HiFi wireless speakers and audio components that combines all your
music collection, radio or podcasts in a single app. You can choose to play what you want in different
rooms by using a dedicated wireless network. Sonos has a wide collection of products for music fans,
from speakers to sound bars, so head over to their website to choose what you like.
Philips Hue light bulb
Probably because it follows the same product principles as Apple's products, Philips' Hue smart
bulb is available to buy from the Apple Store, but also from Amazon. You can choose to buy a starter pack
that includes the Hue bridge and three bulbs for $200, or a single connected bulb for $60. A single bridge
will let you control up to 50 bulbs and you will be able to create lighting scenes based on your favorite
photos. Obviously, the lighting is controllable from your smartphone or tablet. The bulbs are said to help
you use 80% less energy than traditional bulbs. If you’re interested in this IoT device, you could have a
look at a similar smart light bulb, the LiFx.
Lockitron
Lockitron is not the only smart lock out there, but it's our favorite and definitely one of the most
innovative IoT devices. Another product that was made real thanks to the power of
crowdfunding, Lockitron ensures keyless entry inside your home using only your phone. You can also
monitor to see if the door is locked when you’re gone. And if not, it will send a notification when it is
unlocked. The single drawback is that it works only with iPhone 4S or iPhone 5, but support for iPhone
5s and 5c should be coming soon, as well. Lockitron has an intelligent power management that makes
its batteries last for up to one year.
LG Smart Thinq
LG wants to be one of the first companies to deploy the power of the Internet of Things in home
appliances. That’s why it has launched the Smart Thinq line of products. It currently contains only
four categories of products, but more will be added as consumer interest increases over time.
At the moment, you can find refrigerators, washing machines, dryers and ovens that are connected to
the Internet to help you be in better control and save money.
AirQuality egg
I have always wondered – what is the
quality of the air that I am breathing? Living in a city has always had me longing for the fresh air in the
small village I was born in. The AirQuality Egg is a sensing device that measures the air quality in
your environment and lets you share that information with an online community in real-time. The
whole system is composed of outdoor sensors that have an RF transmitter which sends the air quality
data wirelessly to an Egg-shaped base station inside. The Egg hub then sends the data to Xively which
stores it and shares it with the community.
Smart baby monitor
We think Smart baby monitors are some very useful IoT devices, especially for tech savvy parents, and
that's why we compiled, a while ago, a list with some of the best ones to use. The Mimo baby
monitor is another such useful device that lets you watch over your little one's respiration and check
the temperature in the room. Also, it can tell you if your baby is asleep or how active he is.
Through the logical address the system identifies a network (source to destination). After identifying the network, the physical
address is used to identify the host on that network. The port address is used to identify the particular application running
on the destination machine.
Logical address: The IP address of a system is called its logical address. This address is the combination of a Net ID and a Host ID. It
is used by the network layer to identify a particular network (source to destination) among the networks. Because it can be
changed by changing the host's position on the network, it is called a logical address.
Physical address: Each system has a NIC (Network Interface Card) through which two systems are physically connected to each
other by cables. The address of the NIC is called the physical or MAC address. It is assigned by the manufacturer of
the card and is used by the data link layer.
Port address: There are many applications running on a computer, and each application runs with a (logical) port number. The
port number for an application is assigned by the kernel of the OS. This port number is called the port address.
Suppose you have to reach your friend's house. You first go to the area or street where the house is, then to the house number, and finally your friend is a particular person within that house. In technical terms, the logical address defines the area or street, the physical address defines the house number, and the port address identifies your particular friend within that house.
Street = logical address (network)
House number = physical address (host)
Your friend = port address (service point address)
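The three address levels can be seen from Python's standard library. This is a small illustrative sketch, not a definitive recipe: the loopback IP is hard-coded so the snippet stays self-contained, and uuid.getnode() may fall back to a random value on machines where the MAC cannot be read.

```python
import socket
import uuid

# Physical address: the 48-bit MAC of a NIC, as reported by uuid.getnode().
mac = uuid.getnode()
mac_str = ":".join(f"{(mac >> s) & 0xff:02x}" for s in range(40, -1, -8))

# Logical address: an IP address; loopback is used here to stay self-contained.
ip = "127.0.0.1"

# Port address: bind to port 0 and let the OS kernel pick a free ephemeral
# port, exactly as it does for ordinary applications.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.bind((ip, 0))
    port = s.getsockname()[1]

print("physical (MAC) address:", mac_str)
print("logical (IP) address:", ip)
print("port address:", port)
```

Note how each level narrows the scope: the IP selects the host on a network, the MAC identifies its interface on the link, and the port selects one application on that host.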
Consequences
A memory leak reduces the performance of the computer by reducing the amount of available memory. Eventually, in the worst case, too much of the available memory may become allocated and all or part of the system or device stops working correctly, the application fails, or the system slows down vastly due to thrashing.
Memory leaks may not be serious or even detectable by normal means. In modern operating systems, normal memory used by an application is released when the application terminates. This means that a memory leak in a program that only runs for a short time may not be noticed and is rarely serious.
Much more serious leaks include those:
where the program runs for an extended time and consumes additional memory over time, such as background tasks on servers, but especially in embedded devices which may be left running for many years
where new memory is allocated frequently for one-time tasks, such as when rendering the frames of a computer game or animated video
where the program can request memory — such as shared memory — that is not released, even when the program terminates
where memory is very limited, such as in an embedded system or portable device
where the leak occurs within the operating system or memory manager
when a system device driver causes the leak
running on an operating system that does not automatically release memory on program termination.
An example of a memory leak
The following example, written in pseudocode, is intended to show how a memory leak can come about, and its effects, without needing any programming knowledge. The program in this case is part of some very simple software designed to control an elevator. This part of the program is run whenever anyone inside the elevator presses the button for a floor.
When a button is pressed:
Get some memory, which will be used to remember the floor number
Put the floor number into the memory
Are we already on the target floor?
If so, we have nothing to do: finished
Otherwise:
Wait until the lift is idle
Go to the required floor
Release the memory we used to remember the floor number
The memory leak would occur if the floor number requested is the same floor that the elevator is on; the condition for releasing the memory would be skipped. Each time this case occurs, more memory is leaked.
Cases like this wouldn't usually have any immediate effects. People do not often press the button for the floor they are already on, and in any case, the elevator might have enough spare memory that this could happen hundreds or thousands of times. However, the elevator will eventually run out of memory. This could take months or years, so it might not be discovered despite thorough testing.
The consequences would be unpleasant; at the very least, the elevator would stop responding to requests to move to another floor (like when you call the elevator or when someone is inside and presses the floor buttons). If other parts of the program need memory (a part assigned to open and close the door, for example), then someone may be trapped inside, or if no one is in, then no one would be able to use the elevator since the software cannot open the door.
The memory leak lasts until the system is reset. For example: if the elevator's power were turned off or in a power outage, the program would stop running. When power was turned on again, the program would restart and all the memory would be available again, but the slow process of memory leak would restart together with the program, eventually prejudicing the correct running of the system.
The leak in the above example can be corrected by bringing the 'release' operation outside of the conditional:
When a button is pressed:
Get some memory, which will be used to remember the floor number
Put the floor number into the memory
Are we already on the target floor?
If not:
Wait until the lift is idle
Go to the required floor
Release the memory we used to remember the floor number
Programming issues
Memory leaks are a common error in programming, especially when using languages that have no built-in automatic garbage collection, such as C and C++. Typically, a memory leak occurs because dynamically allocated memory has become unreachable. The prevalence of memory leak bugs has led to the development of a number of debugging tools to detect unreachable memory. BoundsChecker, Deleaker, IBM Rational Purify, Valgrind, Parasoft Insure++, Dr. Memory and memwatch are some of the more popular memory debuggers for C and C++ programs. "Conservative" garbage collection capabilities can be added to any programming language that lacks them as a built-in feature, and libraries for doing this are available for C and C++ programs. A conservative collector finds and reclaims most, but not all, unreachable memory.
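Python's standard-library tracemalloc works on the same snapshot-and-compare principle as the C/C++ memory debuggers named above. This sketch manufactures a deliberate leak and then locates the source line responsible for the growth:

```python
import tracemalloc

leaked = []  # module-level list standing in for allocations that are never freed

def do_work():
    # Each call keeps a 10,000-element list alive forever: a deliberate leak.
    leaked.append([0] * 10_000)

tracemalloc.start()
before = tracemalloc.take_snapshot()
for _ in range(50):
    do_work()
after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Compare the two snapshots to see which source lines grew the most.
stats = after.compare_to(before, "lineno")
top = stats[0]
print("top growth:", top.size_diff, "bytes in", top.count_diff, "new blocks")
```

The largest positive size_diff points straight at the `leaked.append(...)` line, which is exactly how unreachable-memory detectors narrow a leak down to the allocating code.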
Although the memory manager can recover unreachable memory, it cannot free memory that is still reachable and therefore potentially still useful. Modern memory managers therefore provide techniques for programmers to semantically mark memory with varying levels of usefulness, which correspond to varying levels of reachability. The memory manager does not free an object that is strongly reachable. An object is strongly reachable if it is reachable either directly by a strong reference or indirectly by a chain of strong references. (A strong reference is a reference that, unlike a weak reference, prevents an object from being garbage collected.) To prevent this, the developer is responsible for cleaning up references after use, typically by setting the reference to null once it is no longer needed and, if necessary, by deregistering any event listeners that maintain strong references to the object.
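The strong-versus-weak distinction described above maps directly onto Python's weakref module. This small sketch shows that an object survives as long as any strong reference exists and is reclaimed once the last one is cleared (the Node class is invented for illustration):

```python
import gc
import weakref

class Node:
    """Stands in for any heap object tracked by the memory manager."""

obj = Node()
strong = obj               # a second strong reference
weak = weakref.ref(obj)    # a weak reference does not keep the object alive

del obj                    # one strong reference remains, so the object survives
gc.collect()
print("alive while strongly reachable:", weak() is not None)

strong = None              # clean up the last strong reference, as the text advises
gc.collect()
print("reclaimed once only weakly reachable:", weak() is None)
```

Setting the reference to None is precisely the cleanup duty the paragraph describes: the memory manager cannot free what the program still strongly reaches.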
Drupal is free, open source software that can be used by individuals or groups of users -- even those lacking technical skills -- to easily create and manage many types of Web sites. The application includes a content management platform and a development framework.
Technology professionals look for reliability, security, and the flexibility to create the
features they want without weighty features they don’t need. They require a platform with a
strong architecture, integrating with third-party applications. Drupal provides all this and
more, conforming to their technical and business requirements, not the other way around.
Features and Benefits
Drupal is an “out of the box” web
content management tool as well as a customizable platform -- to help you
build the right tool to serve your content management strategy. Business and
technology leaders use Drupal to create real-world enterprise solutions that
empower web innovation. When assessing Drupal, it’s important to envision your
goals and ask “Can Drupal be used to build this?” The answer nearly always is
“yes”. Drupal offers limitless potential with native features and module
extensions -- it’s a platform for the next disruptive technology, without
disruption to your business.
Highly Scalable
Drupal’s scalability means it can manage the largest, most high-traffic sites in the world. Sites that experience daily
high traffic, like Weather.com, and sites that see periodic spikes in traffic, like Grammy.com and the publications of
Time, Inc. (like SI.com) all use Drupal to ensure scalability as traffic and content grows.
Mobile-First
Build responsive sites and also create web applications that deliver optimal visitor experiences, no matter what
device they’re on. Drupal supports responsive design best practices and ensures your users get a seamless content
experience every time, on every device.
Integrated Digital Applications
Drupal integrates easily with a wide ecosystem of digital marketing technology and other business applications, so
you can use the best set of tools today, and flex with new tools tomorrow. And, Drupal’s API-first focus means
connecting content to other sites and applications, making content more powerful.
Security
Drupal’s community provides countless eyes and ears to help keep Drupal sites secure. Rely on your team, but also
on the open source community to identify vulnerabilities and create/deliver patches automatically to protect your
sites and your business. And never lose a night’s sleep.
Flexible Content Architecture
Create the right content architecture using the Admin Interface or do it programmatically. Display only the content
appropriate for each context with powerful display mode tools and Views. Include a variety of media types (images,
video, pdfs, etc.). Customizable menus create a comfortable user experience, creating paths to content across
multiple devices.
Tools for Business, with No Limitations
Drupal doesn’t dictate to the business; the business dictates what it needs from Drupal. Too many CMS platforms
impose their will on your business, forcing you to conform to their way of doing things. Drupal acts the opposite
way: use Drupal to create a solution that supports your specific business needs. Drupal creates a foundation for
limitless solutions.
Easy Content Authoring
Essential tools for content creation and publishing, like a customizable WYSIWYG editor for content and marketing
pros. Authentication and permissions for managing editorial workflows as well as content. Authors, publishers, site
admins and developers all use Drupal to meet their requirements, with a workflow that offers them just enough
access to features they need.
Multisite
Manage many sites across your organization, brands, geographies and campaigns on a single platform that allows
quick, easy site creation and deployment.
Community of Talent and Experience
The worldwide Drupal community shares its secrets on how to get things done, right. If you have a question,
someone has the answer. Leverage the power of open source by building on previously-created solutions. Drupal
developers have access to worldwide community experience. When’s the last time your software provider gave you
this much support?
Content as a Service
With Drupal’s structured data model you can display content in multiple layouts for the responsive web, or export it
to any app or client with built-in REST services. Drupal’s open architecture and APIs provide developers a
framework and tools to build using Drupal and to connect to other sources of data, content, and application
functionality, including marketing technology tools. Content is decoupled from delivery: content can be presented
anywhere, any channel, in any format.
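Modern Drupal exposes content over a JSON:API-style REST interface. The payload below is a hand-written sample in that general shape (the field values and the "node--article" type are illustrative, not taken from a live site), showing how a decoupled client can consume the structured content with nothing but a JSON parser:

```python
import json

# Hand-written sample shaped like a JSON:API response; values are illustrative.
payload = """
{
  "data": [
    {"type": "node--article",
     "id": "a1b2c3",
     "attributes": {"title": "Hello from Drupal", "status": true}}
  ]
}
"""

doc = json.loads(payload)

# The client decides presentation: here we just extract the published titles.
titles = [item["attributes"]["title"]
          for item in doc["data"]
          if item["attributes"]["status"]]
print(titles)
```

Because content is decoupled from delivery, the same response could just as well feed a mobile app, a kiosk, or another site.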
Multilingual
Architect and configure Drupal to deliver sites to a global, multilingual audience as part of your localization
strategy. Drupal makes it easy to create and manage sites for different regions and geographies, and support one
to many languages across all of your sites, translating and localizing your content and experiences.
Strong Stack Foundation
Drupal lives on a modern LAMP technology stack: Linux, Apache, MySQL and PHP, which together are meeting the
needs of fast-moving, flexible, agile enterprises and brands building next generation digital platforms.
SDLC, the Software Development Life Cycle, is a process used by the software industry to design,
develop and test high-quality software. The SDLC aims to produce high-quality software that
meets or exceeds customer expectations and reaches completion within time and cost estimates.
SDLC is the acronym for Software Development Life Cycle.
It is also called the software development process.
The software development life cycle (SDLC) is a framework defining tasks performed at each step in the
software development process.
ISO/IEC 12207 is an international standard for software life-cycle processes. It aims to be the standard
that defines all the tasks required for developing and maintaining software.
What is SDLC?
SDLC is a process followed for a software project within a software organization. It consists of a detailed plan describing how to develop, maintain, replace and alter or enhance specific software. The life cycle defines a methodology for improving the quality of software and the overall development process.
The following figure is a graphical representation of the various stages of a typical SDLC.
A typical Software Development life cycle consists of the following stages:
Stage 1: Planning and Requirement Analysis
Requirement analysis is the most important and fundamental stage in SDLC. It is performed by
the senior members of the team with inputs from the customer, the sales department, market
surveys and domain experts in the industry. This information is then used to plan the basic
project approach and to conduct product feasibility study in the economical, operational, and
technical areas.
Planning for the quality assurance requirements and identification of the risks associated with
the project is also done in the planning stage. The outcome of the technical feasibility study is to
define the various technical approaches that can be followed to implement the project
successfully with minimum risks.
Stage 2: Defining Requirements
Once the requirement analysis is done, the next step is to clearly define and document the product requirements and get them approved by the customer or the market analysts. This is done through the SRS (Software Requirement Specification) document, which consists of all the product requirements to be designed and developed during the project life cycle.
Stage 3: Designing the product architecture
The SRS is the reference for product architects to come out with the best architecture for the
product to be developed. Based on the requirements specified in SRS, usually more than one
design approach for the product architecture is proposed and documented in a DDS - Design
Document Specification.
This DDS is reviewed by all the important stakeholders and, based on various parameters such as risk assessment, product robustness, design modularity, budget and time constraints, the best design approach is selected for the product.
A design approach clearly defines all the architectural modules of the product along with its
communication and data flow representation with the external and third party modules (if any).
The internal design of all the modules of the proposed architecture should be clearly defined
with the minutest of the details in DDS.
Stage 4: Building or Developing the Product
In this stage of SDLC the actual development starts and the product is built. The programming
code is generated as per DDS during this stage. If the design is performed in a detailed and
organized manner, code generation can be accomplished without much hassle.
Developers have to follow the coding guidelines defined by their organization and programming
tools like compilers, interpreters, debuggers etc are used to generate the code. Different high
level programming languages such as C, C++, Pascal, Java, and PHP are used for coding. The
programming language is chosen with respect to the type of software being developed.
Stage 5: Testing the Product
This stage is usually a subset of all the stages, as in modern SDLC models the testing activities are mostly involved in all stages of SDLC. However, this stage refers to the testing-only stage of the product, where product defects are reported, tracked, fixed and retested until the product reaches the quality standards defined in the SRS.
Stage 6: Deployment in the Market and Maintenance
Once the product is tested and ready to be deployed, it is released formally in the appropriate market. Sometimes product deployment happens in stages as per the organization's business strategy. The product may first be released in a limited segment and tested in the real business environment (UAT - User Acceptance Testing).
Then, based on the feedback, the product may be released as is or with suggested enhancements in the targeted market segment. After the product is released in the market, its maintenance is done for the existing customer base.
SDLC Models
There are various software development life cycle models defined and designed which are followed during the software development process. These models are also referred to as "Software Development Process Models". Each process model follows a series of steps unique to its type, in order to ensure success in the process of software development.
Following are the most important and popular SDLC models followed in the industry:
Waterfall Model
Iterative Model
Spiral Model
V-Model
Big Bang Model
The other related methodologies are the Agile Model, the RAD (Rapid Application Development) Model, and Prototyping Models.
The Waterfall Model was the first Process Model to be introduced. It is also referred to as a linear-sequential life cycle model. It is very simple to understand and use. In a waterfall model, each
phase must be completed before the next phase can begin and there is no overlapping in the
phases.
The Waterfall model is the earliest SDLC approach that was used for software development.
The waterfall Model illustrates the software development process in a linear sequential flow;
hence it is also referred to as a linear-sequential life cycle model. This means that any phase in
the development process begins only if the previous phase is complete. In waterfall model
phases do not overlap.
Waterfall Model design
The Waterfall approach was the first SDLC model to be used widely in software engineering to ensure the success of a project. In the Waterfall approach, the whole process of software development is divided into separate phases. In the Waterfall model, typically, the outcome of one phase acts as the input for the next phase sequentially.
Following is a diagrammatic representation of different phases of waterfall model.
The sequential phases in Waterfall model are:
Requirement Gathering and analysis: All possible requirements of the system to be developed are
captured in this phase and documented in a requirement specification doc.
System Design: The requirement specifications from first phase are studied in this phase and system
design is prepared. System Design helps in specifying hardware and system requirements and also
helps in defining overall system architecture.
Implementation: With inputs from system design, the system is first developed in small programs
called units, which are integrated in the next phase. Each unit is developed and tested for its
functionality which is referred to as Unit Testing.
Integration and Testing: All the units developed in the implementation phase are integrated into a
system after testing of each unit. Post integration the entire system is tested for any faults and failures.
Deployment of system: Once the functional and non-functional testing is done, the product is
deployed in the customer environment or released into the market.
Maintenance: There are some issues which come up in the client environment. To fix those issues
patches are released. Also to enhance the product some better versions are released. Maintenance is
done to deliver these changes in the customer environment.
All these phases are cascaded to each other in which progress is seen as flowing steadily
downwards (like a waterfall) through the phases. The next phase is started only after the
defined set of goals are achieved for previous phase and it is signed off, so the name "Waterfall
Model". In this model phases do not overlap.
Waterfall Model Application
Every software product developed is different and requires a suitable SDLC approach to be followed based on internal and external factors. Some situations where the use of the Waterfall model is most appropriate are:
Requirements are very well documented, clear and fixed.
Product definition is stable.
Technology is understood and is not dynamic.
There are no ambiguous requirements.
Ample resources with required expertise are available to support the product.
The project is short.
Waterfall Model Pros & Cons
Advantage
The advantage of waterfall development is that it allows for departmentalization and control. A
schedule can be set with deadlines for each stage of development and a product can proceed
through the development process model phases one by one.
Development moves from concept, through design, implementation, testing, installation,
troubleshooting, and ends up at operation and maintenance. Each phase of development
proceeds in strict order.
Disadvantage
The disadvantage of waterfall development is that it does not allow much reflection or revision. Once an application is in the testing stage, it is very difficult to go back and change something that was not well documented or thought through in the concept stage.
The following table lists out the pros and cons of the Waterfall model:
Pros:
- Simple and easy to understand and use.
- Easy to manage due to the rigidity of the model; each phase has specific deliverables and a review process.
- Phases are processed and completed one at a time.
- Works well for smaller projects where requirements are very well understood.
- Clearly defined stages.
- Well-understood milestones.
- Easy to arrange tasks.
- Process and results are well documented.
Cons:
- No working software is produced until late during the life cycle.
- High amounts of risk and uncertainty.
- Not a good model for complex and object-oriented projects.
- Poor model for long and ongoing projects.
- Not suitable for projects where requirements are at a moderate to high risk of changing, so risk and uncertainty are high with this process model.
- It is difficult to measure progress within stages.
- Cannot accommodate changing requirements.
- Adjusting scope during the life cycle can end a project.
- Integration is done as a "big bang" at the very end, which does not allow identifying any technological or business bottlenecks or challenges early.
In the Iterative model, the iterative process starts with a simple implementation of a small set of the software requirements and iteratively enhances the evolving versions until the complete system is implemented and ready to be deployed.
An iterative life cycle model does not attempt to start with a full specification of requirements.
Instead, development begins by specifying and implementing just part of the software, which is
then reviewed in order to identify further requirements. This process is then repeated, producing
a new version of the software at the end of each iteration of the model.
Iterative Model design
The iterative process starts with a simple implementation of a subset of the software requirements
and iteratively enhances the evolving versions until the full system is implemented. At each
iteration, design modifications are made and new functional capabilities are added. The basic
idea behind this method is to develop a system through repeated cycles (iterative) and in
smaller portions at a time (incremental).
Following is the pictorial representation of Iterative and Incremental model:
Iterative and Incremental development is a combination of both iterative design or iterative
method and incremental build model for development. "During software development, more
than one iteration of the software development cycle may be in progress at the same time." and
"This process may be described as an "evolutionary acquisition" or "incremental build"
approach."
In the incremental model, the whole requirement is divided into various builds. During each iteration,
the development module goes through the requirements, design, implementation and testing
phases. Each subsequent release of the module adds function to the previous release. The
process continues till the complete system is ready as per the requirement.
The key to successful use of an iterative software development lifecycle is rigorous validation of
requirements, and verification & testing of each version of the software against those
requirements within each cycle of the model. As the software evolves through successive cycles,
tests have to be repeated and extended to verify each version of the software.
Iterative Model Application
Like other SDLC models, Iterative and incremental development has some specific applications
in the software industry. This model is most often used in the following scenarios:
Requirements of the complete system are clearly defined and understood.
Major requirements must be defined; however, some functionalities or requested enhancements may
evolve with time.
There is a time-to-market constraint.
A new technology is being used and is being learnt by the development team while working on the
project.
Resources with needed skill set are not available and are planned to be used on contract basis for
specific iterations.
There are some high risk features and goals which may change in the future.
Iterative Model Pros and Cons
The advantage of this model is that there is a working model of the system at a very early stage of development, which makes it easier to find functional or design flaws. Finding issues at an early stage of development enables corrective measures to be taken within a limited budget.
The disadvantage with this SDLC model is that it is applicable only to large and bulky software
development projects. This is because it is hard to break a small software system into further
small serviceable increments/modules.
The following table lists out the pros and cons of the Iterative and Incremental SDLC Model:
Pros:
- Some working functionality can be developed quickly and early in the life cycle.
- Results are obtained early and periodically.
- Parallel development can be planned.
- Progress can be measured.
- Less costly to change the scope/requirements.
- Testing and debugging during a smaller iteration is easy.
- Risks are identified and resolved during iteration, and each iteration is an easily managed milestone.
- Easier to manage risk - the high-risk part is done first.
- With every increment an operational product is delivered.
- Issues, challenges and risks identified in each increment can be applied to the next increment.
- Risk analysis is better.
- It supports changing requirements.
- Initial operating time is less.
- Better suited for large and mission-critical projects.
- During the life cycle, software is produced early, which facilitates customer evaluation and feedback.
Cons:
- More resources may be required.
- Although the cost of change is lower, it is not very suitable for changing requirements.
- More management attention is required.
- System architecture or design issues may arise because not all requirements are gathered at the beginning of the entire life cycle.
- Defining increments may require definition of the complete system.
- Not suitable for smaller projects.
- Management complexity is more.
- The end of the project may not be known, which is a risk.
- Highly skilled resources are required for risk analysis.
- The project's progress is highly dependent upon the risk analysis phase.
The spiral model combines the idea of iterative development with the systematic, controlled
aspects of the waterfall model.
Spiral model is a combination of iterative development process model and sequential linear
development model i.e. waterfall model with very high emphasis on risk analysis.
It allows for incremental releases of the product, or incremental refinement through each
iteration around the spiral.
Spiral Model design
The spiral model has four phases. A software project repeatedly passes through these phases in
iterations called Spirals.
Identification: This phase starts with gathering the business requirements in the baseline spiral. In the
subsequent spirals as the product matures, identification of system requirements, subsystem
requirements and unit requirements are all done in this phase.
This also includes understanding the system requirements by continuous communication between the
customer and the system analyst. At the end of the spiral the product is deployed in the identified
market.
Design: The design phase starts with the conceptual design in the baseline spiral and involves architectural
design, logical design of modules, physical product design and final design in the subsequent spirals.
Construct or Build: The construct phase refers to production of the actual software product at every spiral.
In the baseline spiral when the product is just thought of and the design is being developed a POC
(Proof of Concept) is developed in this phase to get customer feedback.
Then in the subsequent spirals with higher clarity on requirements and design details a working model
of the software called build is produced with a version number. These builds are sent to customer for
feedback.
Evaluation and Risk Analysis: Risk analysis includes identifying, estimating, and monitoring technical
feasibility and management risks, such as schedule slippage and cost overrun. After testing the build,
at the end of first iteration, the customer evaluates the software and provides feedback.
Following is a diagrammatic representation of spiral model listing the activities in each phase:
Based on the customer evaluation, software development process enters into the next iteration
and subsequently follows the linear approach to implement the feedback suggested by the
customer. The process of iterations along the spiral continues throughout the life of the
software.
Spiral Model Application
The Spiral Model is very widely used in the software industry as it is in sync with the natural development process of any product, i.e. learning with maturity, and it involves minimum risk for the customer as well as the development firms. Following are the typical uses of the Spiral model:
When there is a budget constraint and risk evaluation is important.
For medium to high-risk projects.
Long-term project commitment because of potential changes to economic priorities as the requirements
change with time.
The customer is not sure of their requirements, which is usually the case.
Requirements are complex and need evaluation to get clarity.
New product line which should be released in phases to get enough customer feedback.
Significant changes are expected in the product during the development cycle.
Spiral Model Pros and Cons
The advantage of the spiral lifecycle model is that it allows elements of the product to be added when they become available or known. This assures that there is no conflict with previous requirements and design.
This method is consistent with approaches that have multiple software builds and releases and
allows for making an orderly transition to a maintenance activity. Another positive aspect is that
the spiral model forces early user involvement in the system development effort.
On the other side, it takes very strict management to complete such products, and there is a risk of running the spiral in an indefinite loop. So the discipline of change and the extent to which change requests are taken is very important to develop and deploy the product successfully.
The following table lists out the pros and cons of the Spiral SDLC Model:
Pros:
- Changing requirements can be accommodated.
- Allows for extensive use of prototypes.
- Requirements can be captured more accurately.
- Users see the system early.
- Development can be divided into smaller parts, and the more risky parts can be developed earlier, which helps better risk management.
Cons:
- Management is more complex.
- The end of the project may not be known early.
- Not suitable for small or low-risk projects, and could be expensive for small projects.
- The process is complex.
- The spiral may go on indefinitely.
- A large number of intermediate stages requires excessive documentation.
The V-Model is an SDLC model where the execution of processes happens in a sequential manner in
a V-shape. It is also known as the Verification and Validation model.
The V-Model is an extension of the waterfall model and is based on the association of a testing phase
with each corresponding development stage. This means that for every single phase in the
development cycle there is a directly associated testing phase. This is a highly disciplined model,
and the next phase starts only after completion of the previous phase.
V-Model Design Under the V-Model, the testing phase corresponding to each development phase is planned in parallel.
So there are Verification phases on one side of the V and Validation phases on the other side.
The Coding phase joins the two sides of the V-Model.
The below figure illustrates the different phases in V-Model of SDLC.
Verification Phases Following are the Verification phases in V-Model:
Business Requirement Analysis: This is the first phase in the development cycle where the product
requirements are understood from the customer's perspective. This phase involves detailed
communication with the customer to understand their expectations and exact requirements. This is a very
important activity and needs to be managed well, as most customers are not sure about what
exactly they need. Acceptance test design planning is done at this stage, as the business requirements
can be used as an input for acceptance testing.
System Design: Once you have clear and detailed product requirements, it is time to design the
complete system. The system design comprises understanding and detailing the complete hardware
and communication setup for the product under development. The system test plan is developed based on
the system design. Doing this at an earlier stage leaves more time for actual test execution later.
Architectural Design: Architectural specifications are understood and designed in this phase. Usually
more than one technical approach is proposed and based on the technical and financial feasibility the
final decision is taken. System design is broken down further into modules taking up different
functionality. This is also referred to as High Level Design (HLD).
The data transfer and communication between the internal modules and with the outside world (other
systems) is clearly understood and defined in this stage. With this information, integration tests can be
designed and documented during this stage.
Module Design: In this phase, the detailed internal design for all the system modules is specified,
referred to as Low Level Design (LLD). It is important that the design is compatible with the other
modules in the system architecture and the other external systems. Unit tests are an essential part of
any development process and help eliminate the maximum number of faults and errors at a very early stage.
Unit tests can be designed at this stage based on the internal module designs.
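The pairing between development phases and their planned test phases described above can be sketched as a simple lookup table (the phase names follow the text; the function itself is illustrative):

```python
# Each V-Model verification (development) phase has a validation (test)
# phase planned alongside it; the pairs below follow the phases above.
V_MODEL_PAIRS = {
    "Business Requirement Analysis": "Acceptance Testing",
    "System Design": "System Testing",
    "Architectural Design (HLD)": "Integration Testing",
    "Module Design (LLD)": "Unit Testing",
}

def planned_test_phase(dev_phase: str) -> str:
    """Return the validation phase planned during the given verification phase."""
    return V_MODEL_PAIRS[dev_phase]
```

Walking the dictionary top to bottom traces the left arm of the V; walking it bottom to top gives the order in which the tests are later executed.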
Coding Phase The actual coding of the system modules designed in the design phase is taken up in the Coding
phase. The most suitable programming language is decided based on the system and
architectural requirements. The coding is performed based on the coding guidelines and
standards. The code goes through numerous code reviews and is optimized for best performance
before the final build is checked into the repository.
Validation Phases Following are the Validation phases in V-Model:
Unit Testing: Unit tests designed in the module design phase are executed on the code during this
validation phase. Unit testing is testing at the code level and helps eliminate bugs at an early stage,
though not all defects can be uncovered by unit testing.
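As a concrete illustration of testing at the code level, the sketch below unit-tests a small hypothetical function using Python's standard `unittest` module (both the function and its tests are invented for this example):

```python
import unittest

def apply_discount(price: float, percent: float) -> float:
    """Hypothetical unit under test: apply a percentage discount to a price."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class ApplyDiscountTest(unittest.TestCase):
    # Each test checks one behaviour, so a failing test pinpoints the fault.
    def test_typical_discount(self):
        self.assertEqual(apply_discount(200.0, 10), 180.0)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)
```

Running the file with `python -m unittest` executes both tests, so a defect in `apply_discount` surfaces here, well before integration.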
Integration Testing: Integration testing is associated with the architectural design phase. Integration
tests are performed to test the coexistence and communication of the internal modules within the
system.
System Testing: System testing is directly associated with the System design phase. System tests
check the entire system functionality and the communication of the system under development with
external systems. Most of the software and hardware compatibility issues can be uncovered during
system test execution.
Acceptance Testing: Acceptance testing is associated with the business requirement analysis phase
and involves testing the product in the user environment. Acceptance tests uncover compatibility issues
with the other systems available in the user environment. They also discover non-functional issues
such as load and performance defects in the actual user environment.
V-Model Application The V-Model's application is almost the same as the waterfall model's, as both models are of the
sequential type. Requirements have to be very clear before the project starts, because it is usually
expensive to go back and make changes. This model is used in the medical development field,
as it is a strictly disciplined domain. Following are the suitable scenarios for using the V-Model:
Requirements are well defined, clearly documented and fixed.
Product definition is stable.
Technology is not dynamic and is well understood by the project team.
There are no ambiguous or undefined requirements.
The project is short.
V-Model Pros and Cons The advantage of the V-Model is that it is very easy to understand and apply. The simplicity of this
model also makes it easier to manage. The disadvantage is that the model is not flexible to
changes, and in case there is a requirement change, which is very common in today's
dynamic world, it becomes very expensive to make the change.
The following table lists the pros and cons of the V-Model:
Pros:
- This is a highly disciplined model, and phases are completed one at a time.
- Works well for smaller projects where requirements are very well understood.
- Simple and easy to understand and use.
- Easy to manage due to the rigidity of the model; each phase has specific deliverables
  and a review process.
Cons:
- High risk and uncertainty.
- Not a good model for complex and object-oriented projects.
- Poor model for long and ongoing projects.
- Not suitable for projects where requirements are at a moderate to high risk of changing.
- Once an application is in the testing stage, it is difficult to go back and change functionality.
- No working software is produced until late in the life cycle.
The Big Bang model is an SDLC model where we do not follow any specific process. The
development just starts with the required money and effort as the input, and the output is the
developed software, which may or may not be as per customer requirements.
The Big Bang Model is an SDLC model where no formal development process is followed and very little
planning is required. Even the customer is not sure about what exactly they want, and the
requirements are implemented on the fly without much analysis.
Usually this model is followed for small projects where the development teams are very small.
Big Bang Model Design and Application The Big Bang model comprises focusing all possible resources on software development and
coding, with very little or no planning. The requirements are understood and implemented as
they come. Any change required may or may not need a revamp of the complete software.
This model is ideal for small projects with one or two developers working together and is also
useful for academic or practice projects. It is an ideal model for products where requirements
are not well understood and no final release date is given.
Big Bang Model Pros and Cons The advantage of the Big Bang model is that it is very simple and requires very little or no planning.
It is easy to manage, and no formal procedures are required.
However, the Big Bang model is a very high-risk model, and changes in the requirements or
misunderstood requirements may even lead to a complete reversal or scrapping of the project. It is
ideal for repetitive or small projects with minimal risks.
The following table lists the pros and cons of the Big Bang Model:
Pros:
- This is a very simple model.
- Little or no planning is required.
- Easy to manage.
- Very few resources are required.
- Gives flexibility to developers.
- A good learning aid for newcomers or students.
Cons:
- Very high risk and uncertainty.
- Not a good model for complex and object-oriented projects.
- Poor model for long and ongoing projects.
- Can turn out to be very expensive if requirements are misunderstood.
The Agile SDLC model is a combination of iterative and incremental process models with a focus on
process adaptability and customer satisfaction through rapid delivery of a working software product.
Agile Methods break the product into small incremental builds. These builds are provided in
iterations. Each iteration typically lasts from about one to three weeks. Every iteration involves
cross functional teams working simultaneously on various areas like planning, requirements
analysis, design, coding, unit testing, and acceptance testing.
At the end of the iteration a working product is displayed to the customer and important
stakeholders.
What is Agile? The Agile model believes that every project needs to be handled differently and that existing methods
need to be tailored to best suit the project requirements. In Agile, the tasks are divided into time
boxes (small time frames) to deliver specific features for a release.
An iterative approach is taken, and a working software build is delivered after each iteration. Each
build is incremental in terms of features; the final build holds all the features required by the
customer.
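The idea that each build is incremental and the final build holds all required features can be sketched as follows (the backlog contents and timebox size are illustrative):

```python
def incremental_builds(feature_backlog, features_per_iteration=2):
    """Yield the cumulative working build after each timeboxed iteration.

    Each iteration adds a slice of the backlog, so every build is
    incremental and the last build contains every required feature.
    """
    build = []
    for start in range(0, len(feature_backlog), features_per_iteration):
        build = build + feature_backlog[start:start + features_per_iteration]
        yield list(build)  # working product shown to the customer
```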
Here is a graphical illustration of the Agile Model:
The Agile thought process started early in software development and became
popular over time due to its flexibility and adaptability.
The most popular agile methods include Rational Unified Process (1994), Scrum (1995), Crystal
Clear, Extreme Programming (1996), Adaptive Software Development, Feature Driven
Development, and Dynamic Systems Development Method (DSDM) (1995). These are now
collectively referred to as agile methodologies, after the Agile Manifesto was published in 2001.
Following are the Agile Manifesto values:
Individuals and interactions - in agile development, self-organization and motivation are important,
as are interactions like co-location and pair programming.
Working software - Demo working software is considered the best means of communication with the
customer to understand their requirement, instead of just depending on documentation.
Customer collaboration - As the requirements cannot be gathered completely in the beginning of the
project due to various factors, continuous customer interaction is very important to get proper product
requirements.
Responding to change - agile development is focused on quick responses to change and continuous
development.
Agile vs. Traditional SDLC Models Agile is based on adaptive software development methods, whereas traditional SDLC
models like the Waterfall model are based on a predictive approach.
Predictive teams in the traditional SDLC models usually work with detailed planning and have a
complete forecast of the exact tasks and features to be delivered in the next few months or
during the product life cycle. Predictive methods entirely depend on the requirement analysis
and planning done in the beginning of cycle. Any changes to be incorporated go through a strict
change control management and prioritization.
Agile uses an adaptive approach where there is no detailed planning, and there is clarity on future
tasks only with respect to what features need to be developed. There is feature-driven
development and the team adapts to the changing product requirements dynamically. The
product is tested very frequently, through the release iterations, minimizing the risk of any
major failures in future.
Customer interaction is the backbone of Agile methodology, and open communication with
minimum documentation are the typical features of Agile development environment. The agile
teams work in close collaboration with each other and are most often located in the same
geographical location.
Agile Model Pros and Cons Agile methods have recently been widely accepted in the software world; however, this method
may not always be suitable for all products. Here are some pros and cons of the Agile model.
The following table lists the pros and cons of the Agile Model:
Pros:
- A very realistic approach to software development.
- Promotes teamwork and cross-training.
- Functionality can be developed rapidly and demonstrated.
- Resource requirements are minimal.
- Suitable for fixed or changing requirements.
- Delivers early partial working solutions.
- A good model for environments that change steadily.
- Minimal rules; documentation is easily employed.
- Enables concurrent development and delivery within an overall planned context.
- Little or no planning is required.
- Easy to manage.
- Gives flexibility to developers.
Cons:
- Not suitable for handling complex dependencies.
- More risk to sustainability, maintainability, and extensibility.
- An overall plan, an agile leader, and agile PM practice are a must, without which it will not work.
- Strict delivery management dictates the scope, the functionality to be delivered, and the
  adjustments needed to meet the deadlines.
- Depends heavily on customer interaction, so if the customer is not clear, the team can be
  driven in the wrong direction.
- There is very high individual dependency, since minimal documentation is generated.
- Transfer of technology to new team members may be quite challenging due to the lack of
  documentation.
The RAD (Rapid Application Development) model is based on prototyping and iterative
development with no specific planning involved. The process of writing the software itself
involves the planning required for developing the product.
Rapid Application development focuses on gathering customer requirements through workshops
or focus groups, early testing of the prototypes by the customer using iterative concept, reuse of
the existing prototypes (components), continuous integration and rapid delivery.
What is RAD? Rapid application development (RAD) is a software development methodology that uses minimal
planning in favor of rapid prototyping. A prototype is a working model that is functionally
equivalent to a component of the product.
In RAD model the functional modules are developed in parallel as prototypes and are integrated
to make the complete product for faster product delivery.
Since there is no detailed preplanning, it is easier to incorporate changes within the
development process. RAD projects follow the iterative and incremental model and have small
teams comprising developers, domain experts, customer representatives, and other IT
resources working progressively on their component or prototype.
The most important aspect for this model to be successful is to make sure that the prototypes
developed are reusable.
RAD Model Design The RAD model distributes the analysis, design, build, and test phases into a series of short, iterative
development cycles. Following are the phases of the RAD Model:
Business Modeling: The business model for the product under development is designed in terms of
flow of information and the distribution of information between various business channels. A complete
business analysis is performed to find the vital information for business, how it can be obtained, how
and when is the information processed and what are the factors driving successful flow of information.
Data Modeling: The information gathered in the Business Modeling phase is reviewed and analyzed to
form sets of data objects vital to the business. The attributes of all data sets are identified and defined.
The relations between these data objects are established and defined in detail in relevance to the
business model.
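The data objects, attributes, and relations produced by this phase can be sketched, for instance, as simple record types; the Customer/Order entities below are invented purely for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    """Hypothetical data object with its defined attributes."""
    order_id: int
    amount: float

@dataclass
class Customer:
    """Hypothetical data object holding a one-to-many relation to Order."""
    customer_id: int
    name: str
    orders: list = field(default_factory=list)

    def total_spend(self) -> float:
        # A derived value that the Process Modeling phase could build on.
        return sum(order.amount for order in self.orders)
```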
Process Modeling: The data object sets defined in the Data Modeling phase are converted to establish
the business information flow needed to achieve specific business objectives as per the business model.
The process model for any changes or enhancements to the data object sets is defined in this phase.
Process descriptions for adding, deleting, retrieving, or modifying a data object are given.
Application Generation: The actual system is built and coding is done by using automation tools to
convert process and data models into actual prototypes.
Testing and Turnover: The overall testing time is reduced in the RAD model, as the prototypes are
independently tested during every iteration. However, the data flow and the interfaces between all the
components need to be thoroughly tested with complete test coverage. Since most of the programming
components have already been tested, the risk of any major issues is reduced.
Following image illustrates the RAD Model:
RAD Model vs. Traditional SDLC The traditional SDLC follows rigid process models with a high emphasis on requirement analysis
and gathering before the coding starts. It puts pressure on the customer to sign off the
requirements before the project starts, and the customer does not get a feel for the product, as
there is no working build available for a long time.
The customer may need some changes after actually getting to see the software; however, the
change process is quite rigid, and it may not be feasible to incorporate major changes in the
product in the traditional SDLC.
The RAD model focuses on iterative and incremental delivery of working models to the customer.
This results in rapid delivery to the customer and customer involvement during the complete
development cycle of the product, reducing the risk of non-conformance with the actual user
requirements.
RAD Model Application The RAD model can be applied successfully to projects in which clear modularization is possible.
If the project cannot be broken into modules, RAD may fail. Following are the typical scenarios
where RAD can be used:
RAD should be used only when a system can be modularized to be delivered in an incremental manner.
It should be used if there is high availability of designers for modeling.
It should be used only if the budget permits the use of automated code-generating tools.
The RAD SDLC model should be chosen only if domain experts are available with relevant business
knowledge.
It should be used where the requirements change during the course of the project and working prototypes
are to be presented to the customer in small iterations of 2-3 months.
RAD Model Pros and Cons The RAD model enables rapid delivery, as it reduces the overall development time due to the reusability
of components and parallel development.
RAD works well only if highly skilled engineers are available and the customer is also committed
to achieving the targeted prototype in the given time frame. If commitment is lacking on
either side, the model may fail.
The following table lists the pros and cons of the RAD Model:
Pros:
- Changing requirements can be accommodated.
- Progress can be measured.
- Iteration time can be short with the use of powerful RAD tools.
- Productivity with fewer people in a short time.
- Reduced development time.
- Increases the reusability of components.
- Quick initial reviews occur.
- Encourages customer feedback.
- Integration from the very beginning solves a lot of integration issues.
- Suitable for systems that are component-based and scalable.
- Suitable for projects requiring shorter development times.
Cons:
- Dependency on technically strong team members for identifying business requirements.
- Only systems that can be modularized can be built using RAD.
- Requires highly skilled developers/designers.
- High dependency on modeling skills.
- Inapplicable to cheaper projects, as the cost of modeling and automated code
  generation is very high.
- Management complexity is greater.
- Requires user involvement throughout the life cycle.
Software Prototyping refers to building software application prototypes that display the
functionality of the product under development but may not actually hold the exact logic of the
original software.
Software prototyping is becoming very popular as a software development model, as it enables
understanding customer requirements at an early stage of development. It helps get valuable
feedback from the customer and helps software designers and developers understand
exactly what is expected from the product under development.
What is Software Prototyping? A prototype is a working model of software with some limited functionality.
The prototype does not always hold the exact logic used in the actual software application, and it is an
extra effort to be considered under effort estimation.
Prototyping is used to allow users to evaluate developer proposals and try them out before
implementation.
It also helps in understanding requirements that are user-specific and may not have been considered by
the developer during product design.
Following is the stepwise approach to design a software prototype:
Basic Requirement Identification: This step involves understanding the very basic product
requirements, especially in terms of the user interface. The more intricate details of the internal design
and external aspects like performance and security can be ignored at this stage.
Developing the Initial Prototype: The initial prototype is developed in this stage, where the very
basic requirements are showcased and user interfaces are provided. These features may not work
in exactly the same manner internally in the actual software developed, and workarounds are used to
give the same look and feel to the customer in the prototype.
Review of the Prototype: The prototype developed is then presented to the customer and the other
important stakeholders in the project. The feedback is collected in an organized manner and used for
further enhancements in the product under development.
Revise and Enhance the Prototype: The feedback and the review comments are discussed during
this stage, and some negotiations happen with the customer based on factors like time and budget
constraints and the technical feasibility of the actual implementation. The changes accepted are
incorporated into the new prototype, and the cycle repeats until customer expectations are
met.
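The review-and-revise cycle above can be sketched as a loop; `review` and `revise` here are placeholder callables standing in for the human feedback steps, and the round limit stands in for time and budget constraints:

```python
def refine_prototype(prototype, review, revise, max_rounds=5):
    """Repeat the review/revise cycle until customer expectations are met.

    review(prototype) returns a list of change requests; an empty list
    means the customer accepts the prototype. revise(prototype, feedback)
    returns the enhanced prototype. Both callables are placeholders for
    the activities described in the text.
    """
    for _ in range(max_rounds):
        feedback = review(prototype)
        if not feedback:  # no change requests: customer expectations met
            return prototype
        prototype = revise(prototype, feedback)
    return prototype  # stop once the time/budget constraint is reached
```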
Prototypes can have horizontal or vertical dimensions. A horizontal prototype displays the user
interface for the product and gives a broader view of the entire system, without concentrating
on internal functions. A vertical prototype, on the other hand, is a detailed elaboration of a specific
function or subsystem in the product.
The purposes of horizontal and vertical prototypes are different. Horizontal prototypes are used
to get more information at the user interface level and on the business requirements. They can even
be presented in sales demos to get business in the market. Vertical prototypes are technical
in nature and are used to get details of the exact functioning of the subsystems, for example,
database requirements, interaction, and data processing loads in a given subsystem.
Software Prototyping Types There are different types of software prototypes used in the industry. Following are the major
software prototyping types used widely:
Throwaway/Rapid Prototyping: Throwaway prototyping is also called rapid or close-ended
prototyping. This type of prototyping uses very little effort with minimal requirement analysis to build
a prototype. Once the actual requirements are understood, the prototype is discarded, and the actual
system is developed with a much clearer understanding of the user requirements.
Evolutionary Prototyping: Evolutionary prototyping, also called breadboard prototyping, is based on
building actual functional prototypes with minimal functionality in the beginning. The prototype
developed forms the heart of the future prototypes, on top of which the entire system is built. With
evolutionary prototyping, only well-understood requirements are included in the prototype, and further
requirements are added as and when they are understood.
Incremental Prototyping: Incremental prototyping refers to building multiple functional prototypes of
the various sub systems and then integrating all the available prototypes to form a complete system.
Extreme Prototyping: Extreme prototyping is used in the web development domain. It consists of
three sequential phases. First, a basic prototype with all the existing pages is presented in HTML
format. Then the data processing is simulated using a prototype services layer. Finally, the services are
implemented and integrated into the final prototype. The process is called Extreme Prototyping to
draw attention to the second phase of the process, where a fully functional UI is developed with very
little regard to the actual services.
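The second phase's simulated services layer can be sketched as a stub that returns canned data, so the fully functional UI can be built against it before the real services exist (the class and data below are invented for illustration):

```python
class StubOrderService:
    """Phase-2 prototype services layer: simulates data processing with
    canned responses instead of a real database or remote service.
    In phase 3, the UI's calls stay the same and this stub is swapped
    for the real implementation."""

    _CANNED_ORDERS = {42: {"id": 42, "status": "shipped"}}

    def get_order(self, order_id):
        # Fixed lookup stands in for the real data-processing logic.
        return self._CANNED_ORDERS.get(
            order_id, {"id": order_id, "status": "unknown"}
        )
```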
Software Prototyping Application Software prototyping is most useful in the development of systems having a high level of user
interaction, such as online systems. Systems that need users to fill out forms or go through
various screens before data is processed can use prototyping very effectively to give the exact
look and feel even before the actual software is developed.
Software that involves too much data processing, where most of the functionality is internal with
very little user interface, does not usually benefit from prototyping. Prototype development could
be an extra overhead in such projects and may need a lot of extra effort.
Software Prototyping Pros and Cons Software prototyping is useful in specific cases, and the decision should be taken very carefully so
that the effort spent in building the prototype adds considerable value to the final software
developed. The model has its own pros and cons, discussed below.
The following table lists the pros and cons of the Prototyping Model:
Pros:
- Increased user involvement in the product even before implementation.
- Since a working model of the system is displayed, the users get a better understanding
  of the system being developed.
- Reduces time and cost, as defects can be detected much earlier.
- Quicker user feedback is available, leading to better solutions.
- Missing functionality can be identified easily.
- Confusing or difficult functions can be identified.
Cons:
- Risk of insufficient requirement analysis owing to too much dependency on the prototype.
- Users may get confused between the prototypes and the actual system.
- Practically, this methodology may increase the complexity of the system, as the scope of
  the system may expand beyond the original plans.
- Developers may try to reuse the existing prototypes to build the actual system, even
  when it is not technically feasible.
- The effort invested in building prototypes may be too much if not monitored properly.
This was an overview of the various SDLC models available and the scenarios in which these SDLC
models are used. The information in this tutorial will help project managers decide which
SDLC model would be suitable for their project, and it will also help developers and testers
understand the basics of the development model being used for their project.
We have discussed all the popular SDLC models in the industry, both traditional and modern.
This tutorial also gives you an insight into the pros and cons and the practical applications of the
SDLC models discussed.
The Waterfall and V-Models are traditional SDLC models and are of the sequential type. Sequential means
that the next phase can start only after the completion of the previous phase. Such models are suitable
for projects with very clear product requirements, where the requirements will not change
dynamically during the course of the project.
The Iterative and Spiral models are more accommodating in terms of change and are suitable for
projects where the requirements are not so well defined, or where the market requirements change
quite frequently.
The Big Bang model is a random approach to software development and is suitable for small or
academic projects.
Agile is the most popular model used in the industry. Agile introduces the concept of fast
delivery to customers using prototype approach. Agile divides the project into small iterations
with specific deliverable features. Customer interaction is the backbone of Agile methodology,
and open communication with minimum documentation are the typical features of Agile
development environment.
RAD (Rapid Application Development) and Software Prototyping are modern techniques for
understanding the requirements better early in the project cycle. These techniques work
on the concept of providing a working model to the customer and stakeholders to give the look
and feel and to collect feedback. This feedback is used in an organized manner to improve the
product.
The Useful Resources section lists some suggested books and online resources to gain further
understanding of the SDLC concepts.
SDLC Overview SDLC, the Software Development Life Cycle, is a process used by the software industry to design,
develop, and test high-quality software. The SDLC aims to produce high-quality software that
meets or exceeds customer expectations and reaches completion within time and cost estimates.
SDLC is the acronym for Software Development Life Cycle.
It is also called the software development process.
The software development life cycle (SDLC) is a framework defining tasks performed at each step in the
software development process.
ISO/IEC 12207 is an international standard for software life-cycle processes. It aims to be the standard
that defines all the tasks required for developing and maintaining software.
A typical Software Development life cycle consists of the following stages:
Stage 1: Planning and Requirement Analysis
Stage 2: Defining Requirements
Stage 3: Designing the product architecture
Stage 4: Building or Developing the Product
Stage 5: Testing the Product
Stage 6: Deployment in the Market and Maintenance
SDLC Models There are various software development life cycle models defined and designed which are
followed during the software development process. These models are also referred to as "Software
Development Process Models". Each process model follows a series of steps unique to its type,
in order to ensure success in the process of software development.
Following are the most important and popular SDLC models followed in the industry:
Waterfall Model
Iterative Model
Spiral Model
V-Model
Big Bang Model
Other related methodologies are the Agile Model, the RAD (Rapid Application Development) Model,
and the Prototyping Model.
SDLC Waterfall Model Following is a diagrammatic representation of different phases of waterfall model.
The sequential phases in Waterfall model are:
Requirement Gathering and analysis: All possible requirements of the system to be developed are
captured in this phase and documented in a requirement specification doc.
System Design: The requirement specifications from first phase are studied in this phase and system
design is prepared. System Design helps in specifying hardware and system requirements and also
helps in defining overall system architecture.
Implementation: With inputs from system design, the system is first developed in small programs
called units, which are integrated in the next phase. Each unit is developed and tested for its
functionality which is referred to as Unit Testing.
Integration and Testing: All the units developed in the implementation phase are integrated into a
system after testing of each unit. Post integration the entire system is tested for any faults and failures.
Deployment of System: Once the functional and non-functional testing is done, the product is
deployed in the customer environment or released into the market.
Maintenance: There are some issues which come up in the client environment. To fix those issues
patches are released. Also to enhance the product some better versions are released. Maintenance is
done to deliver these changes in the customer environment.
All these phases are cascaded to each other, in which progress is seen as flowing steadily
downwards (like a waterfall) through the phases. The next phase is started only after the
defined set of goals is achieved for the previous phase and it is signed off, hence the name "Waterfall
Model". In this model, phases do not overlap.
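The sign-off gating just described can be sketched as a strictly ordered pipeline; the phase names follow the list above, while `signed_off` is a placeholder for the review that closes each phase:

```python
PHASES = ["Requirements", "System Design", "Implementation",
          "Integration and Testing", "Deployment", "Maintenance"]

def run_waterfall(phases, signed_off):
    """Run phases strictly in order, stopping at the first phase whose
    goals are not signed off; phases never overlap."""
    completed = []
    for phase in phases:
        if not signed_off(phase):
            break  # cannot cascade further until this phase is signed off
        completed.append(phase)
    return completed
```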
SDLC Iterative Model Following is the pictorial representation of Iterative and Incremental model:
This model is most often used in the following scenarios:
Requirements of the complete system are clearly defined and understood.
Major requirements must be defined; however, some functionalities or requested enhancements may
evolve with time.
There is a time-to-market constraint.
A new technology is being used and is being learnt by the development team while working on the
project.
Resources with the needed skill sets are not available and are planned to be used on a contract basis for
specific iterations.
There are some high risk features and goals which may change in the future.
SDLC Spiral Model The spiral model has four phases. A software project repeatedly passes through these phases in
iterations called Spirals.
Identification: This phase starts with gathering the business requirements in the baseline spiral. In the
subsequent spirals as the product matures, identification of system requirements, subsystem
requirements and unit requirements are all done in this phase.
This also includes understanding the system requirements by continuous communication between the
customer and the system analyst. At the end of the spiral the product is deployed in the identified
market.
Design: The design phase starts with the conceptual design in the baseline spiral and involves architectural
design, logical design of modules, physical product design and final design in the subsequent spirals.
Construct or Build: The construct phase refers to production of the actual software product at every spiral.
In the baseline spiral when the product is just thought of and the design is being developed a POC
(Proof of Concept) is developed in this phase to get customer feedback.
Then in the subsequent spirals with higher clarity on requirements and design details a working model
of the software called build is produced with a version number. These builds are sent to customer for
feedback.
Evaluation and Risk Analysis: Risk analysis includes identifying, estimating, and monitoring technical
feasibility and management risks, such as schedule slippage and cost overrun. After testing the build,
at the end of first iteration, the customer evaluates the software and provides feedback.
Following is a diagrammatic representation of spiral model listing the activities in each phase:
V-Model The V-model is an SDLC model where execution of processes happens in a sequential manner in
a V-shape. It is also known as the Verification and Validation model.
V - Model is an extension of the waterfall model and is based on association of a testing phase
for each corresponding development stage. This means that for every single phase in the
development cycle there is a directly associated testing phase. This is a highly disciplined model
and next phase starts only after completion of the previous phase.
The below figure illustrates the different phases in V-Model of SDLC.
SDLC Big Bang Model The Big Bang model is an SDLC model in which no specific process is followed. The
development simply starts with the required money and effort as the input, and the output is the
developed software, which may or may not match the customer's requirements.
The Big Bang Model is an SDLC model in which no formal development process is followed and very little
planning is required. Even the customer is not sure about what exactly he wants, and the
requirements are implemented on the fly without much analysis.
Usually this model is followed for small projects where the development teams are very small.
Agile Model Here is a graphical illustration of the Agile Model:
Agile thought process had started early in the software development and started becoming
popular with time due to its flexibility and adaptability.
The most popular agile methods include Rational Unified Process (1994), Scrum (1995), Crystal
Clear, Extreme Programming (1996), Adaptive Software Development, Feature Driven
Development, and Dynamic Systems Development Method (DSDM) (1995). These are now
collectively referred to as agile methodologies, after the Agile Manifesto was published in 2001.
Following are the Agile Manifesto principles
Individuals and interactions: In agile development, self-organization and motivation are important,
as are interactions like co-location and pair programming.
Working software: Demonstrating working software is considered the best means of communication with the
customer to understand their requirements, instead of just depending on documentation.
Customer collaboration: As the requirements cannot be gathered completely at the beginning of the
project due to various factors, continuous customer interaction is very important to get proper product
requirements.
Responding to change: Agile development is focused on quick responses to change and continuous
development.
RAD Model Following image illustrates the RAD Model:
Following are the typical scenarios where RAD can be used:
RAD should be used only when a system can be modularized to be delivered in an incremental manner.
It should be used if there's high availability of designers for modeling.
It should be used only if the budget permits use of automated code-generating tools.
The RAD SDLC model should be chosen only if domain experts are available with relevant business
knowledge.
It should be used where the requirements change during the course of the project and working prototypes
are to be presented to the customer in small iterations of 2-3 months.
Software Prototyping Software prototyping refers to building software application prototypes that display the
functionality of the product under development but may not actually hold the exact logic of the
original software.
Software prototyping is becoming very popular as a software development model, as it enables
teams to understand customer requirements at an early stage of development. It helps get valuable
feedback from the customer and helps software designers and developers understand
exactly what is expected from the product under development.
Following is the stepwise approach to design a software prototype:
Basic Requirement Identification: This step involves understanding the very basic product
requirements, especially in terms of user interface. The more intricate details of the internal design and
external aspects like performance and security can be ignored at this stage.
Developing the initial Prototype: The initial Prototype is developed in this stage, where the very
basic requirements are showcased and user interfaces are provided. These features may not exactly
work in the same manner internally in the actual software developed and the workarounds are used to
give the same look and feel to the customer in the prototype developed.
Review of the Prototype:The prototype developed is then presented to the customer and the other
important stakeholders in the project. The feedback is collected in an organized manner and used for
further enhancements in the product under development.
Revise and enhance the Prototype: The feedback and the review comments are discussed during
this stage and some negotiations happen with the customer based on factors like , time and budget
constraints and technical feasibility of actual implementation. The changes accepted are again
incorporated in the new Prototype developed and the cycle repeats until customer expectations are
met.
Summary This was about the various SDLC models available and the scenarios in which these SDLC
models are used. The information in this tutorial will help the project managers decide what
SDLC model would be suitable for their project and it would also help the developers and testers
understand basics of the development model being used for their project.
We have discussed all the popular SDLC models in the industry, both traditional and modern.
This tutorial also gives you an insight into the pros and cons and the practical applications of the
SDLC models discussed.
Waterfall and V model are traditional SDLC models and are of sequential type. Sequential means
that the next phase can start only after the completion of the previous phase. Such models are suitable
for projects with very clear product requirements and where the requirements will not change
dynamically during the course of project completion.
Iterative and Spiral models are more accommodative in terms of change and are suitable for
projects where the requirements are not so well defined, or the market requirements change
quite frequently.
Big Bang model is a random approach to Software development and is suitable for small or
academic projects.
Agile is the most popular model used in the industry. Agile introduces the concept of fast
delivery to customers using a prototype approach. Agile divides the project into small iterations
with specific deliverable features. Customer interaction is the backbone of Agile methodology,
and open communication with minimum documentation are the typical features of Agile
development environment.
RAD (Rapid Application Development) and Software Prototype are modern techniques to
understand the requirements in a better way early in the project cycle. These techniques work
on the concept of providing a working model to the customer and stakeholders to give the look
and feel and collect feedback. This feedback is used in an organized manner to improve the
product.
In the world of cloud computing, one of the biggest features is the ability to scale. There are different ways to accomplish scaling, which is a transformation that enlarges or diminishes capacity. One is vertical scaling and the other is horizontal scaling. What is the difference between the two? If you look at just the definitions of vertical and horizontal, you might see the following:
• Vertical: something that stands directly upright, at a right angle to the flat ground
• Horizontal: something that is parallel to the horizon (the area where the sky seems to meet the earth)
If you are a visual kind of person, you may already be able to picture this. Let's add some technology to it and see what we get.
Vertical scaling essentially resizes your server with no change to your code. It is the ability to increase the capacity of existing hardware or software by adding resources. Vertical scaling is limited by the fact that you can only get as big as the size of the server.
Horizontal scaling affords the ability to scale wider to deal with traffic. It is the ability to connect multiple hardware or software entities, such as servers, so that they work as a single logical unit. This kind of scale cannot be implemented at a moment's notice.
Having said all that, I always like to provide an example that you might be able to visualize. Imagine an apartment building that has many rooms and floors, where people move in and out all the time. In this apartment building, 200 spaces are available, but not all are taken at any one time. So, in a sense, the apartment building scales vertically as more people come and there are rooms to accommodate them. As long as the 200-space capacity is not exceeded, life is good. The same applies to a restaurant: you have seen the signs that tell you how many people the establishment can hold. As more patrons come in, more tables may be set up and more chairs added (scaling vertically).
However, when capacity is reached, no more patrons can fit; you can only be as big as the building and patio of the restaurant. This is much like your cloud environment, where you can add more hardware to the existing machine (RAM and hard drive space) but you are limited by the capacity of the actual machine.
On the horizontal scaling side, imagine a two-lane expressway. The expressway comfortably handles the 2,000 or so vehicles that travel it. As commerce begins to expand, more buildings are constructed and more homes are built. As a result, the expressway that once handled 2,000 or so vehicles now sees 8,000, causing major traffic jams during rush hour. To alleviate the traffic jams and the increase in accidents, the expressway can be scaled horizontally by constructing more lanes and quite possibly adding an overpass. In this example the construction will take some time. Much like scaling your cloud horizontally, you add additional machines to your environment (scaling wider). This requires planning, making sure you have resources available, and making sure your architecture can handle the scalability.
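To make the distinction concrete in code, here is a minimal sketch in Python. The toy `Server` class and its capacities are hypothetical stand-ins, not a real cloud API: vertical scaling raises the capacity of one server in place, while horizontal scaling adds servers and spreads requests across them round-robin.

```python
from itertools import cycle

class Server:
    """A toy server with a fixed request capacity (hypothetical example)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.load = 0

    def handle(self, requests):
        accepted = min(requests, self.capacity - self.load)
        self.load += accepted
        return accepted  # number of requests actually served

# Vertical scaling: one server, grow its capacity in place.
big = Server(capacity=200)
big.capacity += 100          # "add RAM/CPU" -- limited by the machine's maximum

# Horizontal scaling: add more servers and rotate requests across them.
pool = [Server(capacity=100) for _ in range(3)]
servers = cycle(pool)
for _ in range(250):         # 250 incoming requests
    next(servers).handle(1)

total_served = sum(s.load for s in pool)
```

The pool of three 100-request servers absorbs all 250 requests, while the single vertically scaled server tops out at whatever its hardware allows.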
Five easy steps to improve your database performance
January 30, 2015: Based on reader feedback, section 4 "Do you have enough database connections?" has been revised.
Database access is a core feature of most applications. Based on our experience, it seems that for at least 80% of all applications we see, simple database performance tuning can speed up applications significantly. Fortunately, there isn't a lot of rocket science involved until you get really deep under the hood of database tuning. When you're ready to take database tuning to the next level, there are many great tools around for you to consider, for example from our friends at Vivid Cortex. For this post however, we will only focus on quick wins that you can easily achieve without any help from third parties.
Step 1. Is your database server healthy?
First and foremost, make sure that the host that's serving your database process has sufficient resources available. This includes CPU, memory, and disk space.
CPU
CPU will most likely not be a bottleneck, but database servers induce a continuous base load on machines. To keep the host responsive, make sure that it has at the very least two CPU cores available. I will assume that at least some of your hosts are virtualized. As a general rule of thumb, when monitoring virtual machines, also monitor the virtual host that the machines run on. CPU metrics of individual virtual machines won't show you the full picture. Numbers like CPU ready time are of particular importance.
CPU ready time is a considerable factor when assigning CPU time to virtual machines
Memory
Keep in mind that memory usage is not the only metric to keep an eye on. Memory usage does not tell you how much additional memory may be needed. The important number to look at is page faults per second.
Page faults per second is the real indicator when it comes to your host's memory requirements.
Having thousands of page faults per second indicates that your hosts are out of memory (this is when you start to hear your server's hard drive grinding away).
Disk space
Because of indices and other performance improvements, databases use up a LOT more disk space than the actual data itself requires. NoSQL databases in particular (Cassandra and MongoDB, for instance) eat up a lot more disk space than you would expect. MongoDB takes up less RAM than a common SQL database, but it's a real disk space hog.
I can't emphasize this enough: make sure you have lots of disk space available on your hard drive. Also, make sure your database runs on a dedicated hard drive, as this keeps disk fragmentation caused by other processes to a minimum.
Disk latency is an indicator of overloaded hard drives.
One number to keep an eye on is disk latency. Depending on hard drive load, disk latency will increase, leading to a decrease in database performance. What can you do about this? Firstly, try to leverage your application's and database's caching mechanisms as much as possible; there is no quicker or more cost-effective way of moving the needle. If that still does not yield the expected performance, you can always add additional hard drives. Read performance can be multiplied by simply mirroring your hard drives. Write performance really benefits from using RAID 1 or RAID 10 instead of, let's say, RAID 6. If you want to get your hands dirty on this subject, read up on disk latency and I/O issues.
Step 2. Who is accessing the database?
Once your database is residing on healthy hardware you should take a look at which applications are actually accessing the database. If one of your applications or services suffers from bad database performance, do not jump to the conclusion that you know which application or service is responsible for the bad performance.
Knowing which services access a database is vital for finding database performance bottlenecks
When talking about inferior database performance, you're really talking about two different things. On one hand, the database as a whole may be affected; on the other hand, the bad performance may be confined to a single client service.
If all of the database's clients experience bad performance, go back and check whether your host is truly healthy. Chances are that your hardware is not up to the challenge. If only a single service is suffering from bad database response times, dig deeper into that service's metrics to find out what's causing the problem.
3. Understand the load and individual response time of each service
If an individual service is having bad database performance, you should take a deeper look into the service's communication with the database. Which queries are executed? How often are the queries executed per request? How many rows do they return?
You should know which kinds of commands affect the database performance the most.
It‘s important to know that issues that materialize on the database level may be rooted elsewhere. Very often there is an issue related to the way a database is accessed.
Look at how often queries are called per request. Maybe you can reduce the number of actual database queries by improving the database cache of your service. Question everything. Is there any reason why a single query should be executed more than once per request? If there is, maybe you can unlock some potential performance by applying smart caching strategies.
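The per-request caching idea above can be sketched in a few lines of Python. The `run_query` function and its call counter are hypothetical stand-ins for a real database client; the point is simply that identical queries within one request should hit the database only once.

```python
# Minimal sketch of a per-request query cache. `run_query` is a hypothetical
# stand-in for a real database round-trip; query_count tracks those round-trips.
query_count = 0

def run_query(sql):
    global query_count
    query_count += 1          # pretend this is a round-trip to the database
    return f"rows for: {sql}"

def handle_request(queries):
    cache = {}                # lives only for the duration of this request
    results = []
    for sql in queries:
        if sql not in cache:  # only query the database on a cache miss
            cache[sql] = run_query(sql)
        results.append(cache[sql])
    return results

# One request that would naively issue 5 queries now issues only 2.
results = handle_request(["SELECT * FROM users WHERE id = 1"] * 4 +
                         ["SELECT * FROM accounts WHERE id = 7"])
```

Real frameworks offer this as first-class functionality (e.g. ORM-level query caching), but the principle is the same: question every repeated query.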
4. Do you have enough database connections?
Even if the way you query your database is perfectly fine, you may still experience inferior database performance. If this is your situation, it's time to check that your application's database connection pool is correctly sized.
Check to see if connection acquisition time comprises a large percentage of your database's response time.
When configuring a connection pool there are two things to consider:
1) What is the maximum number of connections the database can handle?
2) What is the correct size connection pool required for your application?
Why shouldn't you just set the connection pool size to the maximum? Because your application may not be the only client connected to the database. If your application takes up all the connections, the database server won't be able to perform as expected. However, if your application is the only client connected to the database, then go for it!
How to find out the maximum number of connections
You already confirmed in Step 1 that your database server is healthy. The maximum number of connections to the database is a function of the resources on the database server. So, to find the maximum number of connections, gradually increase load and the number of allowed connections to your database. While doing this, keep an eye on your database server's metrics. Once they max out (either CPU, memory, or disk performance), you know you've reached the limit. If the number of available connections you reach is not enough for your application, then it's time to consider upgrading your hardware.
Determine the correct size for your application's connection pool
The number of allowed concurrent connections to your database is equivalent to the amount of parallel load that your application applies to the database server. There are tools available to help you determine the correct number here. For Java, you might want to give log4jdbc a try.
Increasing load will lead to higher transaction response times, even if your database server is healthy. Measure the transaction response time end-to-end to see if connection acquisition time takes up increasingly more time under heavy load. If it does, then you know that your connection pool is exhausted. If it doesn't, have another look at your database server's metrics to determine the maximum number of connections that your database can handle.
By the way, a good rule of thumb to keep in mind here: a connection pool's size should be constant, not variable, so set the minimum and maximum pool sizes to the same value.
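As a rough illustration of why an exhausted pool shows up as connection acquisition time, here is a toy fixed-size pool in Python. The "connections" are dummy strings, not real database handles; production pools (HikariCP, psycopg2's pool, etc.) work on the same blocking-queue principle.

```python
import queue
import time

class FixedPool:
    """Toy fixed-size connection pool: acquire() blocks when the pool is exhausted."""
    def __init__(self, size):
        self._conns = queue.Queue()
        for i in range(size):
            self._conns.put(f"conn-{i}")   # dummy connection objects

    def acquire(self):
        start = time.monotonic()
        conn = self._conns.get()           # blocks if no connection is free
        return conn, time.monotonic() - start

    def release(self, conn):
        self._conns.put(conn)

pool = FixedPool(size=2)       # min size == max size, per the rule of thumb above
c1, wait1 = pool.acquire()     # pool not exhausted: acquisition is near-instant
c2, wait2 = pool.acquire()
pool.release(c1)               # free a connection...
c3, wait3 = pool.acquire()     # ...so the next acquire is near-instant again
```

Under heavy load with all connections checked out, `acquire()` would block, and that blocked time is exactly the "Connection Acquisition" slice you should watch in your transaction response times.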
5. Don't forget about the network
We tend to forget about the physical constraints faced by our virtualized infrastructure. Nonetheless, there are physical constraints: cables fail and routers break. Unfortunately, the line between "works" and "doesn't work" is often blurry. This is why you should keep an eye on your network metrics. If problems suddenly appear after months or even years of operating flawlessly, chances are that your infrastructure is suffering from a non-virtual, physical problem. Check your routers, check your cables, and check your network interfaces. It's best to do this as early as possible following the first sign of trouble, because this may be the point in time when you can fix a problem before it impacts your business.
Retransmissions seriously impact network performance
Very often, over-stressed processes start to drop packets due to depleted resources. Just in case your network issue is not a hardware problem, process level visibility can definitely come in handy in identifying a failing component.
Database performance wrap up
Databases are sophisticated applications, and they are not built to perform badly or to fail. Make sure your databases are securely hosted and adequately resourced so that they can perform at their best.
Here's what you'll need to optimize your database:
• Server data to check host health
• Hypervisor and virtual machine metrics to ensure that your virtualization is okay
• Application data to optimize database access
• Network data to analyze the network impact of database communication
There are many tools that can provide you with this information. I used Ruxit for my examples here because it provides all the data I need in a single tool. Though, obviously, I am a bit biased.
Give it a try! Ruxit is free to use for 30 days! The trial stops automatically; no credit card is required. Just enter your email address, choose your cloud location, and install our agent. Monitoring your database performance was never easier!
10 Steps to better postgresql performance
Christophe Pettus
PostgreSQL guy
Done PostgreSQL for over 10 years
Django for 4 years
Not going to explain why things work great, just will provide good options. Ask him later for details
http://thebuild.com/presentations/not-your-job.pdf
Note
Christophe talks super fast and I can’t keep up.
PostgreSQL features
Robust, feature-rich, fully ACID compliant database
Very high performance, can handle hundreds of terabytes
Default database with Django
PostgreSQL negatives
Configuration is hard
Installation is hard on anything but Linux
Not NoSQL
Configuration
Logging
Be generous with logging; it’s very low-impact on the system
Locations for logs
o syslog
o standard format to files
o Just paste the following:
log_destination = 'csvlog'
log_directory = 'pg_log'
TODO - get rest from Christophe
Shared_buffers
TODO - get this
work_mem
Start low: 32-64MB
Look for ‘temporary file’ lines in logs
set to 2-3x the largest temp file you see
Can cause a huge speed-up if set properly
Be careful: it can use that amount of memory per query
maintenance_work_mem
Set to 10% of system memory, up to 1GB
effective_cache_size
Set to the amount of file system cache available
If you don’t know it, set it to 50% of the available memory
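Pulled together, the settings above might look like the following in postgresql.conf. The values assume a hypothetical host with 16 GB of RAM, about half of it free for file system cache; they are starting points to be tuned against your own logs, not gospel.

```
# postgresql.conf sketch for a hypothetical 16 GB host -- starting points only
work_mem = 64MB                 # start low; raise to 2-3x the largest temp file seen in logs
maintenance_work_mem = 1GB      # ~10% of system memory, capped at 1GB
effective_cache_size = 8GB      # ~50% of RAM if the file system cache size is unknown
```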
Checkpointing
A complete flush of dirty buffers to disk
Potentially a lot of I/O
Done when the first of two thresholds is hit:
o A particular...
Note
Didn’t get any of this part of things.
Easy performance boosts
Don’t run anything else on your PostgreSQL server
If PostgreSQL is in a VM, remember all of the other VMs on the same host
Disable the Linux OOM killer
Stupid Database Tricks
Don’t put your sessions in the database
Avoid constantly-updated accumulator records.
Don’t put the task queues in the database
Don’t use the database as a filesystem
Don’t use frequently-locked singleton records
Don’t use very long-running transactions
Don’t mix transactional and data-warehouse queries on the same database
One schema trick
If one model has a constantly-updated section and a rarely-updated section
o last-seen on site field
o cut out that field into a new model
SQL Pathologies
Gigantic IN clauses (a typical Django anti-pattern) are problematic
Unanchored text queries like ‘%this%’ run slow
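As a sketch of the first pathology (table and column names here are hypothetical), a gigantic IN list can usually be rewritten as a join against a VALUES list, a temp table, or a subquery, which the planner handles far better:

```sql
-- Anti-pattern: a huge literal IN list (often generated by an ORM)
SELECT * FROM orders WHERE user_id IN (1, 2, 3 /* ... thousands more ... */);

-- Better: join against a VALUES list (or a temp table / subquery)
SELECT o.*
FROM orders o
JOIN (VALUES (1), (2), (3)) AS ids(user_id)
  ON o.user_id = ids.user_id;
```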
Indexing
A good index
o Has high selectivity on commonly-used data
o Returns a small number of records
o Is determined by analysis, not guessing
Use pg_stat_user_tables - shows sequential scans
Use pg_stat_user_indexes - shows index usage
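A quick way to act on these views (a minimal sketch; the thresholds are arbitrary) is to ask Postgres which tables are being sequentially scanned most, and which indexes are never used:

```sql
-- Tables with many sequential scans relative to index scans:
-- candidates for a new index (thresholds here are arbitrary).
SELECT relname, seq_scan, idx_scan, seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan AND seq_scan > 100
ORDER BY seq_tup_read DESC;

-- Indexes that are never used: candidates for removal.
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
```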
Vacuuming
autovacuum slowing the system down?
o increase autovacuum_vacuum_cost_limit in small increments
Or if the load is periodic
o Do manual VACUUMing instead at low-low times
o You must VACUUM on a regular basis
Analyze your vacuum
o Collect statistics on the data to help the planner choose a good plan
o Done automatically as part of autovacuum
On-going maintenance
keeping it running
monitoring
Keep track of disk space and system load
memory and I/O utilization is very handy
1 minute bnts
check_postgres.pl at bucardo.org
Backups
pg_dump
Easiest backup tool for PostgreSQL
Low impact on a running database
Makes a copy of the database
becomes impractical for large databases
Streaming replication
Best solution for large databases
Easy to set up
Maintains an exact copy of the database on a different host
Does not guard against application-level failures, however
Can be used for read-only queries
If you are getting query cancellations, bump up the relevant config setting
Is all-or-nothing
If you need partial replication, you need to use Slony or Bucardo
o Warning: partial replication is a full-time effort
WAL Archiving
Maintains a set of base backups and WAL segments on a remote server
Can be used for point-in-time recovery in case of an application (or DBA) failure
Slightly more complex to set up
Encodings
Character encoding is fixed in a database when created
The defaults are not what you want
Use UTF-8 encoding
Migrations
All modifications to a table take an exclusive lock on that table while the modification is being done.
If you add a column with a default value, the table will be rewritten
Migrating a big table
o Create the column as nullable, with no default
o Add the NOT NULL constraint later, once the field is populated
o Note
I’ve done this a lot.
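Sketching that big-table pattern for a hypothetical users table with a new status column (names are illustrative; note that on PostgreSQL 11 and later, adding a column with a constant default no longer rewrites the table, but the pattern below is still the safe default for older versions and volatile defaults):

```sql
-- Safe way to add a column to a big table (hypothetical names).
-- 1. Add the column nullable, with no default: cheap, no table rewrite.
ALTER TABLE users ADD COLUMN status text;

-- 2. Backfill in batches to keep lock time and WAL volume manageable.
UPDATE users SET status = 'active' WHERE id BETWEEN 1 AND 10000;
-- ... repeat for further id ranges ...

-- 3. Only once every row is populated, add the constraint.
ALTER TABLE users ALTER COLUMN status SET NOT NULL;
```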
Vacuum FREEZE
Once in a while PostgreSQL needs to scan every table
This can be a very big surprise
Run VACUUM manually periodically
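For example, a periodic manual pass run during off-peak hours (the table name is illustrative; schedule it from cron or similar) keeps an anti-wraparound scan from kicking in at the worst possible time:

```sql
-- Run during low-traffic windows so a forced wraparound vacuum
-- never surprises you under peak load.
VACUUM (FREEZE, ANALYZE) events;
```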
Hardware
Get lots of ECC RAM
CPU is not as vital as RAM
Use a RAID
AWS Survival Guide
Biggest instance you can afford
EBS for the data and transaction logs
Set up streaming replication
Handling Very Large Tables in Postgres Using Partitioning September 13, 2016 by Rimas Silkaitis
One of the interesting patterns that we've seen, as a result of managing one of the largest fleets of Postgres databases, is one or two tables growing at a rate that's much larger and faster than the rest of the tables in the database. In terms of absolute numbers, a table that grows sufficiently large is on the order of hundreds of gigabytes to terabytes in size. Typically, the data in this table tracks events in an application or is analogous to an application log.
Having a table of this size isn't a problem in and of itself, but it can lead to other issues: query performance can start to degrade and indexes can take much longer to update. Maintenance tasks, such as vacuum, can also become inordinately long. Depending on how you need to work with the information being stored, Postgres table partitioning can be a great way to restore query performance and deal with large volumes of data over time without having to resort to changing to a different data store.
We use pg_partman ourselves in the Postgres database that backs the control plane that maintains the fleet of Heroku Postgres, Heroku Redis, and Heroku Kafka stores. In our control plane, we have a table that tracks all of the state transitions for any individual data store. Since we don't need that information to stick around after a couple of weeks, we use table partitioning. This allows us to drop tables after the two-week window, and we can keep queries blazing fast.
To understand how to get better performance with a large dataset in Postgres, we need to understand how Postgres does inheritance, how to set up table partitions manually, and then how to use the Postgres extension, pg_partman, to ease the partitioning setup and maintenance process.
Let's Talk About Inheritance First
Postgres has basic support for table partitioning via table inheritance. Inheritance for tables in Postgres is much like inheritance in object-oriented programming. A table is said to inherit from another one when it maintains the same data definition and interface. Table inheritance for Postgres has been around for quite some time, which means the functionality has had time to mature. Let's walk through a contrived example to illustrate how inheritance works:
CREATE TABLE products (
id BIGSERIAL,
price INTEGER,
created_at TIMESTAMPTZ,
updated_at TIMESTAMPTZ
);
CREATE TABLE books (
isbn TEXT,
author TEXT,
title TEXT
) INHERITS (products);
CREATE TABLE albums (
artist TEXT,
length INTEGER,
number_of_songs INTEGER
) INHERITS (products);
In this example, both books and albums inherit from products. This means that if a record was inserted into the books table, it would have all the same characteristics of the products table plus those of the books table. If a query was issued against the products table, that query would reference information in the products table plus all of its descendants. For this example, the query would reference products, books, and albums. That's the default behavior in Postgres. But you can also issue queries against any of the child tables individually.
Setting up Partitioning Manually
Now that we have a grasp on inheritance in Postgres, we'll set up partitioning manually. The basic premise of partitioning is that a master table exists that all other children inherit from. We'll use the phrases 'child table' and 'partition' interchangeably throughout the rest of the setup process. Data should not live on the master table at all. Instead, when data gets inserted into the master table, it gets redirected to the appropriate child partition table. This redirection is usually defined by a trigger that lives in Postgres. On top of that, CHECK constraints are put on each of the child tables so that if data were to be inserted directly into a child table, only the correct information is accepted; that way, data that doesn't belong in the partition won't end up in there.
When doing table partitioning, you need to figure out what key will dictate how information is partitioned across the child tables. Let's go through the process of partitioning a very large events table in our Postgres database. For an events table, time is the key that determines how to split out information. Let's also assume that our events table gets 10 million INSERTs in any given day. This is our original events table schema:
CREATE TABLE events (
uuid text,
name text,
user_id bigint,
account_id bigint,
created_at timestamptz
);
Let's make a few more assumptions to round out the example. The aggregate queries that run against the events table only have a time frame of a single day, which means our aggregations are split up by hour for any given day. Our usage of the data in the events table only spans a couple of days; after that time, we don't query the data any more. On top of that, we have 10 million events generated a day. Given these extra assumptions, it makes sense to create daily partitions. The key that we'll use to partition the data will be the time at which the event was created (i.e. created_at):
CREATE TABLE events (
uuid text,
name text,
user_id bigint,
account_id bigint,
created_at timestamptz
);
CREATE TABLE events_20160801 (
CHECK (created_at >= '2016-08-01 00:00:00' AND created_at < '2016-08-02 00:00:00')
) INHERITS (events);
CREATE TABLE events_20160802 (
CHECK (created_at >= '2016-08-02 00:00:00' AND created_at < '2016-08-03 00:00:00')
) INHERITS (events);
Our master table has been defined as events, and we have two partitions ready to accept upcoming data: events_20160801 and events_20160802. We've also put CHECK constraints on them to make sure that only data for the matching day ends up in each partition. Now we need to create a trigger to make sure that any data entered on the master table gets directed to the correct partition:
CREATE OR REPLACE FUNCTION event_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
IF ( NEW.created_at >= '2016-08-01 00:00:00' AND
NEW.created_at < '2016-08-02 00:00:00' ) THEN
INSERT INTO events_20160801 VALUES (NEW.*);
ELSIF ( NEW.created_at >= '2016-08-02 00:00:00' AND
NEW.created_at < '2016-08-03 00:00:00' ) THEN
INSERT INTO events_20160802 VALUES (NEW.*);
ELSE
RAISE EXCEPTION 'Date out of range. Fix the event_insert_trigger() function!';
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER insert_event_trigger
BEFORE INSERT ON events
FOR EACH ROW EXECUTE PROCEDURE event_insert_trigger();
Great! The partitions have been created, the trigger function defined, and the trigger added to the events table. At this point, our application can insert data on the events table and the data will be directed to the appropriate partition. Unfortunately, table partitioning set up this way is a very manual process fraught with chances for failure: it requires us to go into the database every so often to update the partitions and the trigger, and we haven't even talked about removing old data from the database yet. This is where pg_partman comes in.
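One payoff of the CHECK constraints above is constraint exclusion: the planner can skip any partition whose constraint rules out the query's date range. A quick way to verify this against the tables defined above (a sketch; the exact plan output will vary by Postgres version):

```sql
-- Make sure the planner uses CHECK constraints to prune partitions.
-- 'partition' is the default setting in modern Postgres versions.
SET constraint_exclusion = partition;

-- The resulting plan should scan only events_20160801,
-- skipping events_20160802 entirely.
EXPLAIN SELECT count(*)
FROM events
WHERE created_at >= '2016-08-01 00:00:00'
  AND created_at <  '2016-08-02 00:00:00';
```

If the plan still shows every partition being scanned, the usual culprit is a WHERE clause that the planner cannot compare against the constraints, such as one wrapping created_at in a function.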
Implementing pg_partman

pg_partman is a partition management extension for Postgres that makes the process of creating and managing table partitions easier for both time- and serial-based partition sets. Compared to partitioning a table manually, pg_partman greatly reduces the code necessary to run partitioning outside of the database. Let's run through an example from scratch. First, let's load the extension and create our events table. If you already have a big table defined, the pg_partman documentation has guidance on how to convert that table into one that uses table partitioning.
$ heroku pg:psql -a sushi
sushi::DATABASE=> CREATE EXTENSION pg_partman;
sushi::DATABASE=> CREATE TABLE events (
id bigint,
name text,
properties jsonb,
created_at timestamptz
);
Let's reuse the assumptions we made about our event data earlier: 10 million events are created a day, and our queries aggregate on a daily basis. Because of this, we're going to create daily partitions.
sushi::DATABASE=> SELECT create_parent('public.events', 'created_at', 'time', 'daily');
This command tells pg_partman that we're going to use time-series based partitioning, that created_at is the column we'll partition on, and that we want daily partitions for our master events table. Amazingly, everything that was done to manually set up partitioning is completed in this one command. But we're not finished: we need to make sure that maintenance runs on the partitions at regular intervals so that new tables get created and old ones get removed.
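To confirm that create_parent() actually built the child tables, one option is to query the pg_inherits system catalog (a sketch; pg_partman also records its settings in its own part_config table):

```sql
-- List every partition that inherits from the master events table.
SELECT inhrelid::regclass AS partition
FROM pg_inherits
WHERE inhparent = 'public.events'::regclass
ORDER BY 1;
```

For daily time-based partitioning, this should return a set of tables named like events_p2016_08_01, covering a window of days around the current date.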
sushi::DATABASE=> SELECT run_maintenance();
The run_maintenance() command will instruct pg_partman to look through all of the tables that were partitioned and identify if new partitions should be created and old partitions destroyed. Whether or not a partition should be destroyed is determined by the retention configuration options. While this command can be run via a terminal session, we need to set this up to run on a regular basis. This is a great opportunity to use Heroku Scheduler to accomplish the task.
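The retention options mentioned above live in pg_partman's part_config table. For example, to drop partitions once their data is a few days old, something along these lines should work (the column names follow the pg_partman docs, but treat the exact settings as an assumption to verify against your installed version):

```sql
-- Keep only the last 3 days of partitions; older ones are
-- dropped (not merely detached) on the next run_maintenance() call.
UPDATE partman.part_config
SET retention = '3 days',
    retention_keep_table = false
WHERE parent_table = 'public.events';
```

Setting retention_keep_table to true instead would uninherit old partitions but leave their tables in place, which is the safer choice if the data might still need to be archived.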
Scheduling run_maintenance() to run on an hourly basis double-checks the partitions in the database. Hourly checks might be overkill in this scenario, but since Heroku Scheduler is a best-effort service, running it hourly will not cause any performance impact on the database. That's it! We've set up table partitioning in Heroku Postgres and it will run on its own with very little maintenance on our part. This setup only scratches the surface of what's possible with pg_partman; check out the extension's documentation for the details.
Should I Use Table Partitioning?

Table partitioning allows you to break one very large table into many smaller tables, dramatically increasing performance. As pointed out in the 'Setting up Partitioning Manually' section, many challenges exist when trying to create and use table partitioning on your own, but pg_partman can ease that operational burden. Even so, table partitioning shouldn't be the first solution you reach for when you run into problems. Ask the following questions to determine whether table partitioning is the right fit:
1. Do you have a sufficiently large data set stored in one table, and do you expect it to grow significantly over time?
2. Is the data immutable, that is, will it never be updated after being initially inserted?
3. Have you done as much optimization as possible on the big table with indexes?
4. Do you have data that has little value after a period of time?
5. Is there a small range of data that has to be queried to get the results needed?
6. Can data that has little value be archived to a slower, cheaper storage medium, or can the older data be stored in aggregate or “rolled up”?
http://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.2#Index-only_scans
https://www.datadoghq.com/blog/100x-faster-postgres-performance-by-changing-1-line/
https://robots.thoughtbot.com/advanced-postgres-performance-tips
http://dba.stackexchange.com/questions/42290/configuring-postgresql-for-read-performance