Introduction to Grid Computing by Gargi Shankar Verma
DESCRIPTION
An introduction to grid computing, by Gargi Shankar Verma (RCET Bhilai).
TRANSCRIPT
Grid Computing
Gargi Shankar Verma
Reader, Dept. of Information Technology
Rungta College of Engg. & Technology, Bhilai
Introduction to Grid Computing
What if the computers you own don't have enough CPU cycles to meet your needs?
What if the institution you work for doesn't have enough CPU cycles to meet your needs?
What if no single HPC center has enough CPU cycles to meet your needs?
What if you've got spare CPU cycles that aren't being used by local users?
Desktop
A desktop computer is yours to do with what you like...
Two quad-core 3 GHz CPUs, 16 GB RAM, 125 Gflops
Cluster
A cluster is a shared resource...
Computing Grids
Grids represent a different approach... Build bigger supercomputers by joining smaller ones together in a grid.
[Figure: Scalable Computing. Performance + QoS increase as administrative barriers are crossed (Individual, Group, Department, Campus, State, National, Globe). Stages shown: Personal Device, SMPs or Supercomputers, Local Cluster, Enterprise Cluster/Grid, Global Grid, Inter-Planet Grid.]
PC vs Cluster vs Grid
PC:
• Owner has total control
• Limited capabilities
Cluster:
• Used by a small number of people (e.g., department, institution)
• Preserves some locality
Grid:
• Thousands of users - large scale
• From many different places - highly distributed
• Increased problems (due to distributed nature)
What is a Grid?
A Grid is a system that coordinates resources that are not subject to centralized control, using standard, open, general-purpose protocols and interfaces, to deliver nontrivial qualities of service.
What is Grid Computing?
• Computational Grids
  – Homogeneous (e.g., clusters)
  – Heterogeneous (e.g., with one-of-a-kind instruments)
• Cousins of Grid Computing
• Methods of Grid Computing
Grid Computing
Computational Grids
A network of geographically distributed resources including computers, peripherals, switches, instruments, and data.
Each user should have a single login account to access all resources.
Resources may be owned by diverse organizations.
Computational Grids
Grids are typically managed by gridware. Gridware can be viewed as a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance, availability…).
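As a rough illustration of matching user requirements against resource attributes, the sketch below filters a list of resources by capacity, performance, and availability. All names here (`Resource`, `JobRequirements`, `match_resources`) and the chosen attribute fields are hypothetical and do not come from any real gridware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A grid resource described by capacity, performance, and availability."""
    name: str
    capacity_cores: int      # capacity
    gflops_per_core: float   # performance
    available: bool          # availability


@dataclass(frozen=True)
class JobRequirements:
    """What a user asks for (hypothetical fields)."""
    min_cores: int
    min_gflops_per_core: float


def match_resources(req, resources):
    """Return available resources that satisfy the request, fastest first."""
    candidates = [
        r for r in resources
        if r.available
        and r.capacity_cores >= req.min_cores
        and r.gflops_per_core >= req.min_gflops_per_core
    ]
    return sorted(candidates, key=lambda r: r.gflops_per_core, reverse=True)


resources = [
    Resource("campus-cluster", 64, 10.0, True),
    Resource("lab-desktop", 8, 15.0, True),       # too few cores for the request
    Resource("hpc-center", 1024, 20.0, False),    # currently unavailable
]
request = JobRequirements(min_cores=16, min_gflops_per_core=5.0)
matches = match_resources(request, resources)
print([r.name for r in matches])
```

Real gridware would also weigh policy, cost, and queue length; this sketch only shows the filtering idea.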
Cousins of Grid Computing
• Parallel Computing
• Distributed Computing
• Peer-to-Peer Computing
• Many others: Cluster Computing, Network Computing, Client/Server Computing, Internet Computing, etc.
Distributed Computing
People often ask: Is Grid Computing a fancy new name for the concept of distributed computing?
In general, the answer is “no.” Distributed Computing is most often concerned with distributing the load of a program across two or more processes.
Peer-to-Peer Computing
Sharing of computer resources and services by direct exchange between systems.
Computers can act as clients or servers depending on what role is most efficient for the network.
Peer to Peer architecture
Cluster computing
Put some PCs together and get them to communicate
Cheaper to build than a mainframe supercomputer
Different sizes of clusters
Scalable: can grow a cluster by adding more PCs
Cluster Architecture
Methods of Grid Computing
• Distributed Supercomputing
• High-Throughput Computing
• On-Demand Computing
• Data-Intensive Computing
• Collaborative Computing
• Logistical Networking
Distributed Supercomputing
Combining multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer.
Tackle problems that cannot be solved on a single system.
High-Throughput Computing
Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work.
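A minimal sketch of the high-throughput idea, assuming that threads in a local pool stand in for idle grid processors: many independent tasks are handed to whichever worker is free. The workload function `screen_compound` is a made-up placeholder, not a real screening method.

```python
from concurrent.futures import ThreadPoolExecutor


def screen_compound(compound_id):
    """Stand-in for one independent task (hypothetical workload)."""
    score = sum(i * i for i in range(compound_id))
    return compound_id, score % 7 == 0


def run_high_throughput(task_ids, workers=4):
    """Run independent tasks on whichever worker is free and collect results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(screen_compound, task_ids))


results = run_high_throughput(range(1, 21))
hits = sorted(cid for cid, ok in results.items() if ok)
print(len(results), "tasks finished;", len(hits), "hits")
```

Because the tasks share no state, the order in which workers pick them up does not affect the results.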
On-Demand Computing
Uses grid capabilities to meet short-term requirements for resources that are not locally accessible.
Models real-time computing demands.
Data-Intensive Computing
• The focus is on synthesizing new information from data that is maintained in geographically distributed repositories, digital libraries, and databases.
• Particularly useful for distributed data mining.
Collaborative Computing
Concerned primarily with enabling and enhancing human-to-human interactions.
Applications are often structured in terms of a virtual shared space.
Logistical Networking
Global scheduling and optimization of data movement.
Contrasts with traditional networking, which does not explicitly model storage resources in the network.
Called "logistical" because of the analogy it bears with the systems of warehouses, depots, and distribution channels.
Grid Topologies
Intragrid
• Local grid within an organization
• Trust based on personal contracts
Extragrid
• Resources of a consortium of organizations connected through a (Virtual) Private Network
• Trust based on business-to-business contracts
Intergrid
• Global sharing of resources through the Internet
• Trust based on certification
Who Needs Grid Computing?
A chemist may utilize hundreds of processors to screen thousands of compounds per hour.
Teams of engineers worldwide pool resources to analyze terabytes of structural data.
Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands.
Grid Architecture
[Figure: the grid protocol layers (Fabric, Connectivity, Resource, Collective, Application) shown alongside the Internet protocol layers (Link, Internet, Transport, Application).]
Grid Architecture (Layered)
Protocols, services, and APIs occur at each level.
[Figure labels: Applications; Languages/Frameworks; Collective Service APIs and SDKs; Collective Services; Collective Service Protocols; Resource APIs and SDKs; Resource Services; Resource Service Protocols; Connectivity APIs; Local Access APIs and Protocols; Fabric Layer.]
The application layer defines protocols and services that are parochial in nature, targeted towards a specific application domain or class of applications.
The grid (collective) layer defines protocols that provide system-oriented capabilities that are expected to be wide-scale in deployment and generic in function. This includes GIIS, bandwidth brokers, resource brokers, …
The resource layer defines protocols to initiate and control sharing of (local) resources. Services defined at this level are the gatekeeper and GRIS, along with some user-oriented application protocols from the Internet protocol suite, such as file transfer.
The connectivity layer defines core protocols required for Grid-specific network transactions. This layer includes the IP protocol stack (system-level application protocols [e.g., DNS, RSVP, routing], transport and internet layers), as well as core Grid security protocols for authentication and authorization.
The fabric layer includes the protocols and interfaces that provide access to the resources that are being shared, including computers, storage systems, datasets, programs, and networks. This layer is a logical view rather than a physical view. For example, the view of a cluster with a local resource manager is defined by the local resource manager, and not by the cluster hardware. Likewise, the fabric provided by a storage system is defined by the file system that is available on that system, not by the raw disk or tapes.
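The layering described above can be written down as a small lookup structure. This is only a sketch: the bottom-to-top order follows the Fabric, Connectivity, Resource, Collective, Application stack, the example components are the ones named in the layer descriptions, and treating the "Grid layer" paragraph as the collective layer is an assumption. The helper names are hypothetical.

```python
# Layers from bottom to top.
GRID_LAYERS = ["fabric", "connectivity", "resource", "collective", "application"]

# Example components taken from the layer descriptions.
LAYER_EXAMPLES = {
    "fabric": ["local resource manager", "file system"],
    "connectivity": ["IP protocol stack", "authentication", "authorization"],
    "resource": ["gatekeeper", "GRIS", "file transfer"],
    "collective": ["GIIS", "bandwidth broker", "resource broker"],
    "application": ["domain-specific protocols and services"],
}


def layers_below(layer):
    """Return the layers a given layer builds on, nearest first."""
    idx = GRID_LAYERS.index(layer)
    return GRID_LAYERS[:idx][::-1]


def layer_of(component):
    """Find which layer a named example component belongs to."""
    for layer, examples in LAYER_EXAMPLES.items():
        if component in examples:
            return layer
    raise KeyError(component)


print(layer_of("GRIS"), "builds on", layers_below("resource"))
```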
Types of Grids
• Computational Grid
• Scavenging Grid
• Data Grid
Architecture of Computational Grid
Architecture of Computational Grid (continued)
Independent clients are given access to a set of resources through a middle tier that routes information and implements the different services.
We define a computational grid as a collection of computers, online instruments, data archives, and networks that are connected by a shared set of services which, when taken together, provide users with transparent access to the entire set of resources.
Computational Grid
A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.
Dependable
The need for dependable service is fundamental. Users require assurances that they will receive predictable, sustained, and often high levels of performance from the diverse components that constitute the grid; in the absence of such assurances, applications will not be written or used. The performance characteristics that are of interest will vary widely from application to application, but may include network bandwidth, latency, jitter, computer power, software services, security, and reliability.
Consistent
The need for consistency of service is a second fundamental concern. As with electric power, we need standard services, accessible via standard interfaces, and operating within standard parameters. Without such standards, application development and pervasive use are impractical. A significant challenge when developing standards is to encapsulate heterogeneity without compromising high-performance execution.
Pervasive
Pervasive access allows us to count on services always being available, within whatever environment we expect to move. Pervasiveness does not imply that resources are everywhere or are universally accessible. We cannot access electric power in a new home until wire has been laid and an account established with the local utility; computational grids will have similarly circumscribed availability and controlled access. However, we will be able to count on universal access within the confines of whatever environment the grid is designed to support.
Inexpensive
An infrastructure must offer inexpensive (relative to income) access if it is to be broadly accepted and used. Homeowners and industrialists both make use of remote billion-dollar power plants on a daily basis because the cost to them is reasonable. A computational grid must achieve similarly attractive economics.
Approaches needed to develop a computational grid:
• Grid developers
• Tool developers
• Application developers
Grid Developers
Grid developers develop protocols and produce routine libraries. The challenge here is to produce a library of protocols that works well with many underlying technologies (e.g., different types of networks). The library must also fulfill the many different requests from the tool developers, which makes it hard to give every request the best performance while also accommodating the different underlying technologies. There will therefore be a battle between generality and performance. It is very important to standardize all protocols so that the tool developers know how they can implement their work.
Tool Developers
Tool developers concentrate on developing a system that takes care of the main things that must exist for using various applications. Security must be taken care of: authentication and confidentiality have to be implemented. They also develop methods for payment, which are very important in, for example, on-demand grids. Finally, they develop methods to find and organize resources and information, which include communication, fault detection, and many more things.
Tool Developers (continued)
Tool developers must adapt their protocols to fit the protocols developed by the grid developers, and must also keep in mind the requests from the application developers. Everything must be standardized so that the application developers can easily make use of the capabilities of the tool layer. The tool developers must also inform the application developers which implementations give higher or lower performance.
Application Developers
Application developers use the methods they need from the tool level to build specific application programs for the end users, applications that are intended to solve hard problems. The challenge for the application developers is finding algorithms that divide a task into thousands of smaller tasks that can be handled separately, and making those tasks work efficiently with the tool layer. For the end user, what matters is getting a request solved; how it works is less important.
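To make the "divide a task into thousands of smaller tasks" challenge concrete, here is a small sketch that splits one computation into independent sub-tasks and recombines the partial results. The workload (a sum of squares) and the helper names `split_range` and `subtask` are illustrative assumptions; in a real grid each sub-task would be dispatched to a different node through the tool layer.

```python
def split_range(start, stop, n_tasks):
    """Split [start, stop) into n_tasks contiguous, nearly equal sub-ranges."""
    base, extra = divmod(stop - start, n_tasks)
    chunks, lo = [], start
    for i in range(n_tasks):
        hi = lo + base + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def subtask(bounds):
    """One independent unit of work: sum of squares over a sub-range."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


chunks = split_range(0, 100_000, 1000)
partial_results = [subtask(c) for c in chunks]   # each could run on a different node
total = sum(partial_results)
print(total == sum(i * i for i in range(100_000)))
```

The split works here because the sub-ranges do not depend on one another; finding such a decomposition is the hard part for real applications.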
Applications of Computational Grids
• Government
• Health maintenance
• A material science collaboratory
• Computational market economy
Scavenging Grid
A scavenging grid is most commonly used with large numbers of desktop machines. Machines are scavenged for available CPU cycles and other resources. Owners of the desktop machines are usually given control over when their resources are available to participate in the grid.
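The scavenging idea can be sketched as a filter plus a simple assignment step. This is a toy model under stated assumptions: the `Desktop` fields, the 20% idle threshold, and the round-robin assignment are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Desktop:
    name: str
    cpu_load: float       # fraction of CPU in use by the owner, 0.0 to 1.0
    owner_allows: bool    # owner's switch: may the grid use this machine now?


def scavengeable(machines, max_load=0.2):
    """Machines the grid may borrow: owner consent and a mostly idle CPU."""
    return [m for m in machines if m.owner_allows and m.cpu_load <= max_load]


def assign(tasks, machines):
    """Hand tasks round-robin to scavengeable machines.

    Returns (plan, queued); tasks stay queued when no machine is usable.
    """
    pool = scavengeable(machines)
    if not pool:
        return {}, list(tasks)
    plan = {m.name: [] for m in pool}
    for i, task in enumerate(tasks):
        plan[pool[i % len(pool)].name].append(task)
    return plan, []


machines = [
    Desktop("pc-01", 0.05, True),
    Desktop("pc-02", 0.90, True),    # busy with the owner's work
    Desktop("pc-03", 0.10, False),   # owner has opted out for now
    Desktop("pc-04", 0.15, True),
]
plan, queued = assign(["t1", "t2", "t3"], machines)
print(plan, queued)
```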
Components of Grid
Grid Portal: The portal provides the user with an interface to launch applications.
Grid Security Infrastructure (GSI): It consists of two components, user security and node security.
User security provides single sign-on, delegation, and a run-anywhere authentication service, with support for local control over access rights.
In node security, if a processor enrolls in a dynamic rather than pre-administered manner, then an identification and authentication validation must be performed before the processor can actually participate in the grid's work. A certificate authority (CA) can be utilized to establish the identity of the processor as well as the users of the grid.
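The enrollment check for a dynamically joining processor can be sketched as follows. This is deliberately not real PKI: an HMAC over the node identity stands in for a CA-signed certificate, and the `ToyCA` and `Grid` classes are hypothetical. It only illustrates that validation happens before the node is admitted.

```python
import hashlib
import hmac


class ToyCA:
    """Stand-in for a certificate authority (an HMAC, NOT a real certificate)."""

    def __init__(self, secret: bytes):
        self._secret = secret

    def issue(self, node_id: str) -> str:
        """Return a credential bound to this node identity."""
        return hmac.new(self._secret, node_id.encode(), hashlib.sha256).hexdigest()

    def verify(self, node_id: str, credential: str) -> bool:
        """Check a presented credential using a constant-time comparison."""
        return hmac.compare_digest(self.issue(node_id), credential)


class Grid:
    def __init__(self, ca: ToyCA):
        self._ca = ca
        self.members = set()

    def enroll(self, node_id: str, credential: str) -> bool:
        """Admit a dynamically joining node only after validation succeeds."""
        if not self._ca.verify(node_id, credential):
            return False
        self.members.add(node_id)
        return True


ca = ToyCA(b"demo-secret")
grid = Grid(ca)
good = grid.enroll("node-17", ca.issue("node-17"))
bad = grid.enroll("node-99", ca.issue("node-17"))   # credential issued to another node
print(good, bad, sorted(grid.members))
```

A production grid would use X.509 certificates and public-key signatures, so that verifiers never need to hold the CA's secret.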
Broker: It provides information about available resources on the grid and working status of these resources.
Data management Block: It is responsible for moving files and data to various nodes within the grid.
Job Management Block: It is also known as the Grid Resource Allocation Manager (GRAM). Its functions include providing the services to launch a job on a particular resource, checking the job's status, and retrieving the results when the job is complete. It keeps track of the resources available to the grid and of which users are members of the grid.
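The launch, status, and retrieve cycle described above can be sketched with a small in-memory class. This is a hypothetical illustration and not the GRAM API: the class name, method names, and the local `run_pending` step (which stands in for remote execution) are all assumptions.

```python
import itertools
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    DONE = "done"


class JobManager:
    """Hypothetical in-memory job manager; not the GRAM API."""

    def __init__(self, resources, members):
        self.resources = set(resources)   # resources available to the grid
        self.members = set(members)       # users who are members of the grid
        self._jobs = {}
        self._ids = itertools.count(1)

    def submit(self, user, resource, func, *args):
        """Launch a job on a particular resource and return its job id."""
        if user not in self.members:
            raise PermissionError(f"{user} is not a grid member")
        if resource not in self.resources:
            raise LookupError(f"unknown resource {resource}")
        job_id = next(self._ids)
        self._jobs[job_id] = {"state": JobState.PENDING, "call": (func, args), "result": None}
        return job_id

    def run_pending(self):
        """Execute queued jobs locally (stands in for remote execution)."""
        for job in self._jobs.values():
            if job["state"] is JobState.PENDING:
                func, args = job["call"]
                job["result"] = func(*args)
                job["state"] = JobState.DONE

    def status(self, job_id):
        """Check the job's status."""
        return self._jobs[job_id]["state"]

    def result(self, job_id):
        """Retrieve the result once the job is complete."""
        job = self._jobs[job_id]
        if job["state"] is not JobState.DONE:
            raise RuntimeError("job not complete")
        return job["result"]


jm = JobManager(resources={"cluster-a"}, members={"alice"})
jid = jm.submit("alice", "cluster-a", pow, 2, 10)
before = jm.status(jid)
jm.run_pending()
print(before, jm.status(jid), jm.result(jid))
```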
Authentication and Authorization
• Important for allowing users to cross the administrative boundaries in a virtual organization
• System security for jobs outside the administrative domain is currently rudimentary
• Work is being done on sandboxing, better job control, and development environments
[Figure labels: User, HPC, Cluster, A&A Server (three instances)]
Resource Information Service
• Used in resource discovery
• Leverages existing technologies such as LDAP, UDDI
• Information service must be able to report very current availability and load data
• Balanced with overhead of updating data
[Figure labels: User, HPC, Cluster, GIS (three instances)]
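The trade-off between reporting very current load data and the overhead of updating it can be illustrated with a time-to-live cache. This is a sketch under assumptions: `InfoService`, its `ttl` parameter, and the probe callable are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
class InfoService:
    """Hypothetical information service that caches load reports for `ttl` seconds."""

    def __init__(self, probe, ttl):
        self._probe = probe      # callable: resource name -> current load
        self._ttl = ttl
        self._cache = {}         # name -> (timestamp, load)
        self.probes_made = 0     # rough measure of update overhead

    def load(self, name, now):
        """Return a load reading younger than ttl, probing only when stale."""
        cached = self._cache.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        self.probes_made += 1
        value = self._probe(name)
        self._cache[name] = (now, value)
        return value

    def least_loaded(self, names, now):
        """Resource discovery: pick the resource reporting the lowest load."""
        return min(names, key=lambda n: self.load(n, now))


current_load = {"hpc": 0.9, "cluster": 0.3}
gis = InfoService(probe=current_load.__getitem__, ttl=30)
first = gis.least_loaded(["hpc", "cluster"], now=0)    # probes both resources
current_load["cluster"] = 0.95                          # the real load changes
stale = gis.least_loaded(["hpc", "cluster"], now=10)   # answered from the cache
fresh = gis.least_loaded(["hpc", "cluster"], now=40)   # cache expired, probes again
print(first, stale, fresh, gis.probes_made)
```

A shorter `ttl` gives fresher answers at the cost of more probes; a longer one does the opposite, which is why the second lookup still returns the now-overloaded cluster.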
Scheduler
• Owners of systems are interested in maximizing throughput
• Users are interested in maximizing runtime performance
• Both offer challenges with crossing administrative boundaries
• Unique issues such as co-allocation and co-location
• Interesting work is being done in scheduling, such as market-based scheduling
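Co-allocation, one of the issues listed above, means reserving several resources for the same time window. A minimal sketch, assuming each resource publishes the set of whole hours in which it is free; the function name and the hour-granularity model are illustrative assumptions, not a description of any real scheduler.

```python
def earliest_common_window(free_hours, duration):
    """Earliest start hour at which every resource is free for `duration` consecutive hours.

    free_hours maps a resource name to the set of hours in which it is free.
    Returns None when no common window exists.
    """
    if not free_hours:
        return None
    common = set.intersection(*free_hours.values())
    for start in sorted(common):
        if all(start + k in common for k in range(duration)):
            return start
    return None


free = {
    "hpc":     {1, 2, 3, 6, 7, 8, 9},
    "cluster": {2, 3, 4, 7, 8, 9},
    "viz":     {0, 2, 7, 8, 9, 10},
}
print(earliest_common_window(free, 2))
```

Here hours 2, 7, 8, and 9 are free on all three resources, so a two-hour job can start at hour 7 at the earliest. Across administrative domains each owner would also have to agree to the reservation, which this sketch does not model.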