Duogui Wu
Yu Miao
Sheng Li
Peijun Shi
Ying Qu
About Proxy
Introduction
Functions of a proxy server:
-- Filtering
-- Firewalls
-- Logging
-- Caching
Our presentation: inter-cache protocols, pre-fetching technique, code study, cooperative Web browsing, anonymous surfing.
Inter-Cache Protocols
Web hierarchy & Protocols
Why Caching?
Problem:
* Client bandwidth growth is exceeding backbone capacity growth
* The congestion problem is out of our control
Benefit of caching:
* With the standard client-server model: inefficiency
* With a caching proxy server: reduced client-server traffic and improved response time
Web Transaction
Cache Proxy History
CERN HTTP Daemon (the first WWW proxy/cache software)
Harvest Project (1994, Internet Research Task Force Research Group on Resource Discovery)
Two important features:
-- Requests handled by a single, non-forking process
-- UDP-based Internet Cache Protocol
Split into two parts (1996):
-- a commercial version of the Harvest cache
-- Squid, a free version
Advantages and Disadvantages of Caching
Advantages:
-- Improves browsing response time
-- Reduces load on remote servers
-- Provides robustness in case of external network failure
Disadvantages:
-- (main) Receiving an outdated document from the cache
-- Information providers lose access counts
-- Often requires manual configuration
Evolution of Web Caching Servers
-- A single web caching server: like a proxy with caching functionality
-- A chain of web caching servers
-- A web caching hierarchy (or mesh): a group of web caches can benefit from sharing another cache in the same way that a group of Web clients benefits from sharing a cache
Single Caching Server
A Chain of Caching Servers
Web Caching Mesh
Two Caching Topologies
Distributed caching
- Contents of the cache are distributed over a number of proxy servers, all connected together in a net.
- The proxies form an array that acts as one strong caching system.
- Continues to serve well when one of the proxies fails.
Hierarchical caching (or mesh)
- The proxy servers are connected in a hierarchical way.
- Each cache establishes peering relationships with its neighbors.
- Example: NLANR Cache Hierarchy
Hierarchical caching
NLANR Cache Hierarchy
The overall NLANR caching hierarchy
ICPv2 -- Internet Cache Protocol Version 2
-- A very basic method of inter-cache communication
-- Hierarchy model
-- Implemented on top of UDP
-- Used as an object location protocol
-- Only hit/miss information is propagated between caches
-- A cache sends an ICP query to its neighbors
-- The neighbors send back ICP replies indicating a "HIT" or a "MISS"
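The query/reply exchange described for ICP can be sketched in Java, the language used in the code study part of this deck. This is a minimal sketch, assuming the 20-byte ICP v2 header layout of RFC 2186 (opcode, version, message length, request number, options, option data, sender host address) followed, for a query, by a requester address and a null-terminated URL. The class name IcpSketch and its methods are hypothetical, and the address fields are simply left at zero.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not taken from Squid or any other cache implementation.
final class IcpSketch {
    static final byte OP_QUERY = 1;   // ICP_OP_QUERY
    static final byte OP_HIT = 2;     // ICP_OP_HIT
    static final byte OP_MISS = 3;    // ICP_OP_MISS
    static final byte VERSION = 2;    // ICP version 2
    static final int HEADER_LEN = 20; // fixed header size in bytes

    /** Builds the bytes of an ICP query for the given URL. */
    static byte[] buildQuery(int requestNumber, String url) {
        byte[] urlBytes = url.getBytes(StandardCharsets.US_ASCII);
        // header + requester address (4 bytes) + URL + terminating NUL
        int length = HEADER_LEN + 4 + urlBytes.length + 1;
        ByteBuffer buf = ByteBuffer.allocate(length); // big-endian by default
        buf.put(OP_QUERY);
        buf.put(VERSION);
        buf.putShort((short) length);
        buf.putInt(requestNumber);
        buf.putInt(0);                // options (none)
        buf.putInt(0);                // option data (none)
        buf.putInt(0);                // sender host address (assumption: left zero)
        buf.putInt(0);                // requester host address (assumption: left zero)
        buf.put(urlBytes);
        buf.put((byte) 0);
        return buf.array();
    }

    /** Returns true when a neighbor's reply carries the HIT opcode. */
    static boolean isHit(byte[] reply) {
        return reply.length >= HEADER_LEN && reply[0] == OP_HIT;
    }

    public static void main(String[] args) {
        byte[] query = buildQuery(1, "http://example.com/");
        System.out.println("ICP query of " + query.length + " bytes built");
    }
}
```

In a real cache these bytes would travel in a single UDP datagram (for example through java.net.DatagramSocket) to each neighbor, and the cache would fetch the object from the first neighbor whose reply satisfies isHit.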
Animations
-- Hierarchical cache relations
-- Cache hit
-- Cache miss #1
-- Cache miss #2
CARPv1.0 -- Cache Array Routing Protocol Version 1.0
Based on:
-- A known membership list of loosely coupled proxies
-- A hash function for dividing URL space among those proxies
Allows for:
-- Distributed caching
-- Hierarchical proxy arrays
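The idea of dividing URL space over a known membership list can be illustrated with a highest-score hash. This is a minimal sketch, assuming a simple mixing function; it is not the hash function defined by CARP v1.0, and the class HashRoutingSketch and the member names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: the scoring function below is NOT the CARP v1.0 hash.
final class HashRoutingSketch {

    /** Combines the URL and the member name into one score. */
    static long score(String url, String proxy) {
        long h = (url.hashCode() * 31L) ^ (proxy.hashCode() * 0x9E3779B97F4A7C15L);
        h ^= (h >>> 33);
        h *= 0xff51afd7ed558ccdL;
        h ^= (h >>> 33);
        return h;
    }

    /** Picks the member with the highest score for this URL. */
    static String route(String url, List<String> proxies) {
        if (proxies.isEmpty()) {
            throw new IllegalArgumentException("empty membership list");
        }
        String best = null;
        long bestScore = Long.MIN_VALUE;
        for (String p : proxies) {
            long s = score(url, p);
            if (best == null || s > bestScore) {
                best = p;
                bestScore = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("proxy-a", "proxy-b", "proxy-c");
        System.out.println(route("http://example.com/index.html", members));
    }
}
```

Because every client computes the same owner from the same membership list, no query messages between caches are needed, and removing one member only reassigns the URLs that member owned.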
Squid & MS Proxy Server 2.0
Squid
-- High-performance proxy caching server
-- Supports FTP, gopher, and HTTP data objects
-- Supports numerous inter-cache communication protocols: ICP, Cache Digests, HTCP
MS Proxy Server 2.0
-- Uses CARP to provide seamless scaling and extreme efficiency
-- Allows for "queryless" distributed caching
-- Distributed caching helps relieve the network administrator
Intelligent Prefetching at a Proxy Server
What is prefetching?
-- A technique that predicts a user's likely future requests and stores the corresponding objects in the cache.
-- While one object is downloaded for a user, several other objects are prefetched behind the scenes.
Why prefetching?
-- Reduces the perceived delay for document downloading at the user's end.
-- Increases the chances of getting the document from the local cache.
[Figure: Prefetching to Personal Proxy Cache. Elements shown: clients, the proxy server with personal proxy caches, and the Internet. Labels: pre-fetching (objects are downloaded before they are requested); more bandwidth, less time for downloading; less bandwidth, more time for downloading.]
[Figure: Hierarchy for Personal Proxy Cache. Elements shown: clients with local caches, personal proxy caches, the proxy server cache at the proxy server, and the Internet, arranged in Layer 1, Layer 2, and Layer 3.]
Personal caching agent
-- Determines the objects to be prefetched.
-- The user's browsing history helps to determine possible candidates.
-- Neighborhood prefetching when visiting a well-structured website. Example: Faculty of Computer Science
Implementation
-- Every user has an associated buffer of recently visited links.
-- The buffer is updated on a FIFO basis.
-- The maximum number of simultaneous connections is controlled dynamically.
-- The buffer stores visited links and the links embedded in the pages.
-- Prefetching is done on lower-priority threads.
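The per-user buffer and the low-priority prefetch threads can be sketched as follows. This is a minimal sketch under stated assumptions: the class LinkBuffer, its capacity parameter, and the helper lowPriority are hypothetical names, and the actual download of an object is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical per-user buffer of recently visited and embedded links.
final class LinkBuffer {
    private final int capacity;
    private final Deque<String> links = new ArrayDeque<>();

    LinkBuffer(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records a link; when the buffer is full the oldest entry is dropped (FIFO). */
    synchronized void add(String url) {
        if (links.contains(url)) {
            return; // already buffered; keep its original position
        }
        if (links.size() == capacity) {
            links.removeFirst();
        }
        links.addLast(url);
    }

    /** Returns the buffered links, oldest first. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(links);
    }

    /** Wraps a prefetch task in a daemon thread with the lowest priority. */
    static Thread lowPriority(Runnable task) {
        Thread t = new Thread(task);
        t.setPriority(Thread.MIN_PRIORITY);
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) {
        LinkBuffer buffer = new LinkBuffer(2);
        buffer.add("http://example.com/a");
        buffer.add("http://example.com/b");
        buffer.add("http://example.com/c"); // evicts .../a
        System.out.println(buffer.snapshot());
    }
}
```

Running prefetches on minimum-priority threads is one way to keep them from competing with the requests the user is actually waiting for.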
A sample of the typical results
Conclusion
-- Objects are pre-fetched to the personal cache at the proxy server for faster downloading.
-- The personal cache agent increases the probability of finding objects at the proxy server.
-- Multiple layers of cache could add overhead.
Code study
By Peijun Shi
"... and I think to myself, what a browsable world..."
Overview
-- Java-based
-- Forwards HTTP requests and sends back the replies
-- Supports chained proxying
-- ACL
-- Caching
-- Remote admin
http://www.cs.technion.ac.il/
Main()
-- Initialize the proxy { set/get parameters; load cache manager; }
-- Create server socket
-- Loop { one thread per client }
// Start the admin thread
System.out.print("Creating Admin Thread...");
Admin adminThd = new Admin(config, cache);
adminThd.start();
System.out.println(" port " + config.getAdminPort() + " OK");
if (config.getIsFatherProxy()) {
    System.out.println("Using Father Proxy " +
        config.getFatherProxyHost() + ":" +
        config.getFatherProxyPort() + " .");
} else {
    System.out.println("Not Using Father Proxy .");
}
System.out.println("Proxy up and running!");
// Create main socket
System.out.print("Creating Daemon Socket...");
MainSocket = new ServerSocket(daemonPort);
System.out.println(" port " + daemonPort + " OK");
// Main loop
while (true) {
    // Listen on main socket
    Socket ClientSocket = MainSocket.accept();
    // Pass request to new proxy thread
    Proxy thd = new Proxy(ClientSocket, cache, config);
    thd.start();
}
Remote admin
-- Parse the HTTP request
-- See if it is from the administrator
-- Set up an object output stream
-- Generate the HTTP response
-- Push the admin HTML page and applet out to the administrator
// Read HTTP request from client
request.parse(ClientSocket.getInputStream());
url = new URL(request.url);
// Send Web page with applet to administrator
if (url.getFile().equalsIgnoreCase("/admin"))
// Load the admin page and replace the PORT placeholder with the admin port
File appletHtmlPage = new File(config.getAdminPath()
    + File.separator + "Admin.html");
DataInputStream in = new
    DataInputStream(new FileInputStream(appletHtmlPage));
String s = null;
while ((s = in.readLine()) != null) page += s;
page = page.substring(0, page.indexOf("PORT"))
    + config.getAdminPort()
    + page.substring(page.indexOf("PORT") + 4);
in.close();
// Send the page to the administrator
DataOutputStream out = new
    DataOutputStream(ClientSocket.getOutputStream());
out.writeBytes(page);
out.flush();
out.close();
// Send the applet class file to the administrator
HttpReplyHdr reply = new HttpReplyHdr();
File appletFile = new
    File(adminPath + File.separatorChar + className);
long length = appletFile.length();
FileInputStream in = new FileInputStream(appletFile);
DataOutputStream out = new
    DataOutputStream(ClientSocket.getOutputStream());
out.writeBytes(reply.formOk("application/octet-stream", length));
while (-1 < (count = in.read(data))) {
    out.write(data, 0, count);
}
out.flush();
in.close();
out.close();
ACL
-- Scan the blacklist to see if access to the URL is allowed
-- If OK, forward the URL
-- Else send back an HTTP deny reply
// Check if accessing the URL is allowed by administrator
String[] denied = config.getDeniedHosts();
for (int i = 0; i < denied.length; i++) {
    if (url.toString().indexOf(denied[i]) != -1) {
        System.out.println("Access not allowed...");
        DataOutputStream out =
            new DataOutputStream(ClientSocket.getOutputStream());
        out.writeBytes(reply.formNotAllowed());
        out.flush();
        ClientSocket.close();
        return;
    }
}
Cache manager
if (isCached(request)) {
    if (up-to-date) {
        increase cache-hit;
        refresh file-accessed date;
        return the file;
    } else {
        delete stale file;
    }
}
increase cache-miss;
forward request;
// Client wants a web page - let's see if we have it in cache
if (cache.IsCached(url.toString())) {
    // Client request is already cached - get it from file
    System.out.println("Hit! Getting from cache!!!");
    config.increaseHits();
    TakenFromCache = true;
    // Get FileInputStream from Cache Manager
    fileInputStream = cache.getFileInputStream(url.toString());
    OutputStream out = ClientSocket.getOutputStream();
    // Send the bits to client
    byte data[] = new byte[2000];
    int count;
    while (-1 < (count = fileInputStream.read(data)))
        out.write(data, 0, count);
    out.flush();
    fileInputStream.close();
}
// We do not have the page in cache
// Open socket to web server (or father proxy)
if (config.getIsFatherProxy()) {
    System.out.println("Miss! Forwarding to father proxy " +
        config.getFatherProxyHost() + ":" +
        config.getFatherProxyPort() + "...");
    config.increaseMisses();
    SrvrSocket = new Socket(config.getFatherProxyHost(),
        config.getFatherProxyPort());
} else {
    serverName = url.getHost();
    System.out.println("Miss! Forwarding to server " + serverName + "...");
    config.increaseMisses();
    SrvrSocket = new Socket(serverName, serverPort(request.url));
}
// Send the url to web server (or father proxy)
DataOutputStream srvOut = new
    DataOutputStream(SrvrSocket.getOutputStream());
srvOut.writeBytes(request.toString(false));
srvOut.flush();
Cache manager
parse(HTTP reply);
if (dynamic object) do not cache;
if (unsuccessful reply) do not cache;
if (!cache-allowed) do not cache;
send reply to client;
if (cachable) cache reply;
// Parse the URL for special characters.
// Convert the URL to a filename - this method parses the URL and
// generates a filename only if the URL is to be cached.
// We do not cache URLs containing '?' or "cgi-bin", or URLs on the
// do-not-cache list set by the proxy administrator.
private String getFileName(String rawUrl)
{
    // substring(7) skips the leading "http://"
    String filename = basePath + File.separatorChar +
        rawUrl.substring(7).replace('/', '@');
    if (filename.indexOf('?') != -1 ||
        filename.indexOf("cgi-bin") != -1)
    {
        return null;
    }
    return filename;
}
Cache manager (cont'd)
// Check reply headers (we must read the first line of headers for that).
DataInputStream Din = new DataInputStream(SrvrSocket.getInputStream());
DataOutputStream Dout = new DataOutputStream(ClientSocket.getOutputStream());
String str = Din.readLine();
StringTokenizer s = new StringTokenizer(str);
String retCode = s.nextToken(); // first token is HTTP protocol
retCode = s.nextToken();        // second is return code
// Return codes 200, 302, 304 are OK to cache
if (!retCode.equals("200") && !retCode.equals("302") && !retCode.equals("304"))
{
    isCachable = false;
    reasonForNotCaching = "Return Code is " + retCode;
}
Cache manager (cont'd)
// Check if URL is cache-allowed by administrator
String[] denyCache = config.getCacheMasks();
for (int i = 0; i < denyCache.length; i++)
{
    if (url.toString().indexOf(denyCache[i]) != -1)
    {
        isCachable = false;
        reasonForNotCaching = "Caching this URL is not allowed";
        break;
    }
}
// With the HTTP reply do: (1) Send it to client. (2) Cache it.
InputStream in = SrvrSocket.getInputStream();
OutputStream out = ClientSocket.getOutputStream();
byte data[] = new byte[2000];
int count;
while ((count = in.read(data)) > 0)
{
    // Send bits to client
    out.write(data, 0, count);
    if (isCachable)
    {
        // Write the same bits to the cache file
        fileOutputStream.write(data, 0, count);
        cache.DecrementFreeSpace(count, url.toString());
    }
}
// Add new entry to hash table
cache.AddToTable(url.toString());
Shared resources protection
-- Multiple threads may lead to deadlock
-- Content may become inconsistent
-- Solution: synchronized methods & objects
Cache manager (cont'd)
private synchronized void MakeFreeSpace(String rawUrl)
{
    ……
    while (config.getBytesFree() < MinFreeSpace)
    {
        // Enumerate the hash table entries to find the LRU file
        for (Enumeration keys = htable.keys(); keys.hasMoreElements();)
        {
            filename = (String) keys.nextElement();
            date = (Date) htable.get(filename);
            if (date.before(minDate))
            {
                // Remember the oldest access date seen so far
                minDate = date;
                LRUfilename = filename;
            }
        }
        // Delete LRU file
        File LRUfile = new File(LRUfilename);
        long nbytes = LRUfile.length();
        boolean result = LRUfile.delete();
        if (result)
        {
            // Delete entry in hash table
            htable.remove(LRUfilename);
            config.decreaseFilesCached();
            // Increment free space
            config.setBytesCached(config.getBytesCached() - nbytes);
        }
    }
}
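The eviction loop and the synchronized access to the shared table can be tried out without touching the file system. This is a minimal, self-contained sketch and not the proxy's own cache manager: the class LruTableSketch, its logical clock, and its byte accounting are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the cache table; no files are involved.
final class LruTableSketch {
    private static final class Entry {
        long lastAccess;
        final long size;
        Entry(long lastAccess, long size) {
            this.lastAccess = lastAccess;
            this.size = size;
        }
    }

    private final Map<String, Entry> table = new HashMap<>();
    private final long capacity;
    private long used;
    private long clock; // logical clock instead of wall-clock dates

    LruTableSketch(long capacityBytes) {
        this.capacity = capacityBytes;
    }

    /** Adds or replaces an entry and marks it as just accessed. */
    synchronized void put(String name, long size) {
        Entry old = table.remove(name);
        if (old != null) {
            used -= old.size;
        }
        table.put(name, new Entry(++clock, size));
        used += size;
    }

    /** Refreshes the access time on a cache hit. */
    synchronized void touch(String name) {
        Entry e = table.get(name);
        if (e != null) {
            e.lastAccess = ++clock;
        }
    }

    /** Evicts least recently used entries until minFree bytes are available. */
    synchronized void makeFreeSpace(long minFree) {
        while (capacity - used < minFree && !table.isEmpty()) {
            String lru = null;
            long oldest = Long.MAX_VALUE;
            for (Map.Entry<String, Entry> e : table.entrySet()) {
                if (e.getValue().lastAccess < oldest) {
                    oldest = e.getValue().lastAccess;
                    lru = e.getKey();
                }
            }
            used -= table.remove(lru).size;
        }
    }

    synchronized boolean contains(String name) {
        return table.containsKey(name);
    }

    synchronized long freeBytes() {
        return capacity - used;
    }
}
```

Every method that reads or changes the table is synchronized on the same object, so proxy threads serving different clients cannot observe a half-updated table.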
A Proxy-based Approach to Support Cooperative WWW Browsing
Presented by Yu Miao
What?
Definition: Cooperative browsing is the activity of a team that surfs the Web with the goal of finding and retrieving information on a common topic.
Functionality:
Avoid duplicate searches
Integrate a workgroup cache.
Permit communications.
Achieve portability.
Enforce security.
The Proxy Framework Architecture
A Proxy-based Approach
Filter requests from clients and results from servers.
Integrate a group cache.
Dynamic production of result HTML pages.
Define a workgroup.
Define specific security policies.
Portable (re-directing, HTTP).
The Cooperative Browsing Application
Figure 4. The framed applet inserted by the proxy
Other Features
The result pages can be modified:
Link tags are enriched with error information and color.
Links to pages in the group cache are marked with a lighter color.
Trade-off: additional information vs. readability
The Proxy Framework
The proxy-framework is implemented as a set of Java classes:
GenericServer: accepts connection requests
ProxyServer: accepts HTTP requests
MsgServer: handles virtual communication channels (applets)
ProxyConnection: handles each request; inserts control applet
MsgConnection: serves one applet
ProxyEnvironment: implements a blackboard.
Performance Evaluation
Overhead: The delay introduced by the proxy is too small to be noticeable to the user.
Group cache: significantly speeds up the browsing activity.
Anonymous Surfing using Proxy
Some Problems in Internet Security
1. Eavesdropping
Outsiders listening in on electronic conversations.
2. Traffic Analysis
Revealing who is talking to whom.
Anonymous Connection
1. Function of anonymous connection
Make it difficult for observers to learn identifying information from the connection by reading packet headers or tracking encrypted payload.
2. Characteristics of anonymous connection
Designed to be resistant to traffic analysis
Virtual connection
Connection-oriented
Bi-directional
Near real-time
Multiplexed over long-standing socket connections
Identifying information is passed as data through the anonymous connection
An onion is used to establish the anonymous connection
Onion Routing using Proxy
[Figure: Onion Routing using Proxy. Labels: initiator, responder, secure sites, proxy/onion router, onion routers, anonymous connection, permanent socket connection, socket connection.]
Onion
[Figure: the route runs from proxy/onion router W through onion routers X, Y, and Z. One layer is removed at each hop and random bits are appended in its place.]
Outermost layer (ACI=5): { To Y, key1X, key2X { To Z, key1Y, key2Y { 0, key1Z, key2Z } } }
Next layer (ACI=8): { To Z, key1Y, key2Y { 0, key1Z, key2Z } } + random bits
Innermost layer (ACI): { 0, key1Z, key2Z } + random bits + random bits
Data table in onion router X:
Input port   Input ACI   Output port   Output ACI   Forward key for data-crypt   Backward key for data-crypt
1            5           3             8            key1                         key2
ACI: anonymous connection identifier
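The data table kept by an onion router can be modeled as a lookup from (input port, input ACI) to the outgoing port, outgoing ACI, and the two data keys. This is a minimal sketch using the single row shown for router X; the class AciTableSketch and the string placeholders for the keys are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one onion router's table; keys are plain strings here.
final class AciTableSketch {
    static final class Route {
        final int outputPort;
        final int outputAci;
        final String forwardKey;
        final String backwardKey;

        Route(int outputPort, int outputAci, String forwardKey, String backwardKey) {
            this.outputPort = outputPort;
            this.outputAci = outputAci;
            this.forwardKey = forwardKey;
            this.backwardKey = backwardKey;
        }
    }

    private final Map<Long, Route> routes = new HashMap<>();

    /** Packs the port and ACI into one map key. */
    private static long key(int port, int aci) {
        return ((long) port << 32) | (aci & 0xFFFFFFFFL);
    }

    void add(int inputPort, int inputAci, Route route) {
        routes.put(key(inputPort, inputAci), route);
    }

    /** Returns the route for an incoming cell, or null if the ACI is unknown. */
    Route lookup(int inputPort, int inputAci) {
        return routes.get(key(inputPort, inputAci));
    }

    public static void main(String[] args) {
        AciTableSketch tableX = new AciTableSketch();
        tableX.add(1, 5, new Route(3, 8, "key1", "key2"));
        Route r = tableX.lookup(1, 5);
        System.out.println("forward to port " + r.outputPort + " with ACI " + r.outputAci);
    }
}
```

Data arriving on port 1 with ACI 5 is crypted with the forward key and sent out on port 3 relabeled with ACI 8; traffic in the opposite direction uses the backward key.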
Procedure of anonymous connection setup and data transfer
Participants: client, application proxy, onion proxy, onion network
Steps, in time order:
1. Establish socket connection
2. Establish socket connection
3. Standard structure
4. Onion
5. Standard structure
6. One-byte status code
7. Data transfer
Proxy
A proxy in front of the onion network can be viewed as three logical components:
1. Application proxy
Bridge between the application and the onion proxy
Makes the application independent of the onion network
2. Onion proxy
Builds the onion to establish the anonymous connection
Data encryption
3. Entry multiplex
Multiplexes anonymous connections to the onion network
Responsibilities of application proxy
1. Establish socket connection to application client
2. Request socket connection to onion proxy
3. Re-format the data stream so that the other component is application independent
4. Send a standard structure (including destination IP address and port number) to onion proxy
5. Process a 1-byte return status code
6. Filter request header, strip identifying information
7. Insulate cookies, ad banners, dynamic content
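Step 6 above, filtering the request header, can be sketched as a function that drops identifying fields before the request leaves the application proxy. This is a minimal sketch: which header fields count as identifying is an assumption here (User-Agent, Referer, Cookie, From), and the class HeaderFilterSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical filter; the list of identifying headers is an assumption.
final class HeaderFilterSketch {
    private static final Set<String> IDENTIFYING = new HashSet<>(
        Arrays.asList("user-agent", "referer", "cookie", "from"));

    /** Returns a copy of the headers without the identifying fields. */
    static Map<String, String> strip(Map<String, String> headers) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            String name = e.getKey().toLowerCase(Locale.ROOT);
            if (!IDENTIFYING.contains(name)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "example.com");
        headers.put("User-Agent", "SomeBrowser/1.0");
        headers.put("Cookie", "id=42");
        System.out.println(strip(headers).keySet());
    }
}
```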
Responsibilities of onion proxy
1. Establish socket connection to application proxy
2. Build an onion
3. Repeatedly pre-encrypt outgoing data
4. Repeatedly decrypt incoming data
5. Relay data back and forth
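The repeated pre-encryption of outgoing data can be sketched with the standard javax.crypto API: the onion proxy wraps the data once per router on the path, and each router removes exactly one layer. This is a minimal sketch under stated assumptions: AES in CBC mode is chosen only for illustration and is not claimed to be the cipher of the onion routing system, the class LayeredCryptSketch is hypothetical, and unlike the fixed-size cells implied by the random-bits padding, the message here shrinks as layers are removed.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Illustrative layering only; not a secure or complete onion routing design.
final class LayeredCryptSketch {
    private static final int IV_LEN = 16;
    private static final SecureRandom RNG = new SecureRandom();

    static SecretKey newKey() {
        try {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            return kg.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static Cipher cipher(int mode, SecretKey key, IvParameterSpec iv)
            throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(mode, key, iv);
        return c;
    }

    /** Adds one layer: a random IV followed by the ciphertext. */
    static byte[] addLayer(SecretKey key, byte[] data) {
        try {
            byte[] iv = new byte[IV_LEN];
            RNG.nextBytes(iv);
            byte[] ct = cipher(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv)).doFinal(data);
            byte[] out = Arrays.copyOf(iv, IV_LEN + ct.length);
            System.arraycopy(ct, 0, out, IV_LEN, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Removes one layer, as a single router on the path would. */
    static byte[] peelLayer(SecretKey key, byte[] layer) {
        try {
            IvParameterSpec iv = new IvParameterSpec(layer, 0, IV_LEN);
            return cipher(Cipher.DECRYPT_MODE, key, iv)
                .doFinal(layer, IV_LEN, layer.length - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Pre-encrypts data for the whole path; pathKeys[0] belongs to the first hop. */
    static byte[] wrap(SecretKey[] pathKeys, byte[] data) {
        byte[] out = data;
        for (int i = pathKeys.length - 1; i >= 0; i--) {
            out = addLayer(pathKeys[i], out);
        }
        return out;
    }
}
```

Because the last router's layer is applied first and the first router's layer last, each hop sees different bytes for the same data, which is the property the slides rely on for resisting traffic analysis.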
Security Performance
1. Anonymous communication
The identifying information is stripped by proxy
2. Anonymous connection
Each onion router can only identify its next hop
Data passed along the anonymous connection appears different at each onion router
The number of layers in onion is unchanged
Even one honest node is enough to maintain the privacy of the route
Applications
1. VPN
Long term anonymous connection acts as leased line over public network
2. Anonymous chatting
3. Anonymous cash
4. Remote login
5. Web browsing
6. Electronic mail
References
http://vms.process.com/~help/helpproxy.html
Sajid Hussain and Robert D. Mcleod, "Intelligent Prefetching at a Proxy Server", in Proceedings of IEEE 2000
http://www.cs.technion.ac.il/
Cabri, G.; Leonardi, L.; Zambonelli, F., "Supporting cooperative WWW browsing: a proxy-based approach", Parallel and Distributed Processing, 1999. PDP '99. Proceedings of the Seventh Euromicro Workshop on, 1999, pages 138-145
http://www.onion-router.net