AN APPROACH TO CATEGORIZATION OF TEXT IN WEBSITES USING PARALLEL SEARCH
BAKTAVATCHALAM.G (08MW03)
MASTER OF ENGINEERING
Branch: SOFTWARE ENGINEERING
of Anna University
May 2009
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING PSG COLLEGE OF TECHNOLOGY
(Autonomous Institution)
COIMBATORE – 641 004
PSG COLLEGE OF TECHNOLOGY (Autonomous Institution)
COIMBATORE – 641 004
AN APPROACH TO CATEGORIZATION OF TEXT IN WEBSITES USING
PARALLEL SEARCH
Bona fide record of work done by
BAKTAVATCHALAM.G (08MW03)
MASTER OF ENGINEERING
Branch: COMPUTER SCIENCE AND ENGINEERING
of Anna University, Coimbatore.
May 2009
ACKNOWLEDGEMENT
We wish to express our sincere gratitude to our respected Principal, Dr. R. Rudramoorthy, for giving us the opportunity to undertake this project.
We also wish to express our sincere thanks to Dr. S. N. Sivanandam, Professor and Head of the Department of Computer Science and Engineering, for the encouragement and support he extended towards our project work.
We extend our sincere thanks to our internal guide, Mrs. D. Indumathi, Asst. Professor, Department of Computer Science and Engineering, for her guidance and help towards the successful completion of our project.
CONTENTS
CHAPTER
Synopsis
List of Figures
List of Tables
1. INTRODUCTION
1.1 Problem Definition
1.2 Objective of the Project
1.3 Significance of the Project
1.4 Outline of the Project
2. SYSTEM STUDY
2.1 Proposed System
3. SYSTEM ANALYSIS
3.1 Requirement Analysis
3.2 Feasibility Analysis
4. SYSTEM DESIGN
5. IMPLEMENTATION
5.1 Server Module
5.2 Parser Module
6. TESTING
6.1 Unit Testing
6.2 Integration Testing
6.3 Sample Test Cases
7. SNAPSHOT
7.1 Finding Document Category
7.2 Finding Keyword Document
CONCLUSION
FUTURE ENHANCEMENTS
BIBLIOGRAPHY
SYNOPSIS
In this project, we search for a given set of keywords in categorized documents. Searching is performed once categorization is complete and the categories of the given documents are available.
The work involves two separate operations. First, we generate the categories and their related categories. We then supply the links of the websites to be categorized. The contents of each website are parsed into a list of keywords, and those keys are used to determine the corresponding category. The documents and their categories are then ready to be searched by key.
Second, we give keywords to the search engine to find the matching documents and their categories. If a keyword is composed of multiple keywords, every key is searched, and the corresponding documents and categories are retrieved. Each category holds a name, a set of keys, and a weight for each key. Categories are sorted using those weights and the key occurrences.
LIST OF FIGURES
FIGURE NO    NAME
Fig. 2.1     System Architecture
LIST OF TABLES
TABLE NO     NAME
Table 6.1    Sample Test Cases
CHAPTER 1
INTRODUCTION
This chapter provides a brief overview of the problem definition, objectives and
significance of the project and an outline of the report.
1.1 PROBLEM DEFINITION
The problem is to search for a given keyword set in a given set of websites and to categorize those websites. Given a keyword set, the system determines the documents most relevant to that keyword set and the category to which they belong.
1.2 OBJECTIVE OF THE PROJECT
Most users are interested in the parts of a website that contain the information they want, and they also want to know where that information is located. This project lets a user find where a particular piece of text occurs in a given set of websites, together with the corresponding category.
1.3 SIGNIFICANCE OF THE PROJECT
With the enormous growth in information on the Internet, there is a corresponding need for tools that enable fast and efficient searching, browsing and delivery of textual data. Executing the search concurrently on several clients is intended to reduce the time the search takes.
1.4 OUTLINE OF THE PROJECT
The rest of the report is structured as follows. Chapter 2 provides a study of the existing system and the basic ideas of the proposed system. Chapter 3 discusses the requirements for the development of the system and analyzes its feasibility. Chapter 4 presents the overall design of the system. Chapter 5 discusses the implementation details. Chapter 6 explains the testing procedures conducted on the system. Chapter 7 contains snapshots of the forms in our system. The last section summarizes the project.
CHAPTER 2
SYSTEM STUDY
This chapter elucidates the existing system and a brief description of the
proposed system.
2.1 PROPOSED SYSTEM
In our project, we search for a given set of keywords in categorized documents. Searching is performed once categorization is complete and the categories of the given documents are available. The work involves two separate operations. First, we generate the categories and their related categories. We then supply the links of the websites to be categorized. The contents of each website are parsed into a list of keywords, and those keys are used to determine the corresponding category. The documents and their categories are then ready to be searched by key. Second, we give keywords to the search engine to find the matching documents and their categories. If a keyword is composed of multiple keywords, every key is searched, and the corresponding documents and categories are retrieved. Each category holds a name, a set of keys, and a weight for each key. Categories are sorted using those weights and the key occurrences.
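The weighting step described above can be sketched as follows. This is only an illustration under stated assumptions: the report does not give the scoring formula, so the sketch assumes a category's score is the sum of (key weight × key occurrences in the page), and the names Category, CategoryRanker, countWords and rank, as well as the sample categories and weights, are hypothetical.

```java
import java.util.*;

// Hypothetical sketch: names, weights and the scoring rule are assumptions,
// not taken from the project source.
class Category {
    final String name;
    final Map<String, Double> keyWeights = new HashMap<String, Double>();

    Category(String name) { this.name = name; }

    // Register a key and its weight; returns this so calls can be chained.
    Category key(String key, double weight) {
        keyWeights.put(key.toLowerCase(), weight);
        return this;
    }

    // Assumed score: sum over the category keys of weight * occurrences.
    double score(Map<String, Integer> occurrences) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : keyWeights.entrySet()) {
            Integer count = occurrences.get(e.getKey());
            if (count != null) total += e.getValue() * count;
        }
        return total;
    }
}

class CategoryRanker {
    // Count how often each lower-cased word occurs in the text.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (word.isEmpty()) continue;
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Return the categories ordered from highest to lowest score for the text.
    static List<Category> rank(List<Category> categories, String text) {
        final Map<String, Integer> counts = countWords(text);
        List<Category> sorted = new ArrayList<Category>(categories);
        Collections.sort(sorted, new Comparator<Category>() {
            public int compare(Category a, Category b) {
                return Double.compare(b.score(counts), a.score(counts));
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<Category> cats = Arrays.asList(
            new Category("Sports").key("cricket", 2.0).key("score", 1.0),
            new Category("Education").key("college", 2.0).key("course", 1.5));
        String page = "The college offers a new course; the college cricket team won.";
        for (Category c : rank(cats, page)) {
            System.out.println(c.name + " " + c.score(countWords(page)));
        }
    }
}
```

With the sample page, "Education" scores 5.5 (two occurrences of "college" and one of "course") and "Sports" scores 2.0, so "Education" is listed first.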
Figure 2.1 System Architecture
[Figure: components Keywords, Websites, Document Finder, Categorizer and Search Keyword; output Documents + Categories]
CHAPTER 3
SYSTEM ANALYSIS
This chapter describes the hardware and software specifications for the development of the system and analyzes the feasibility of the system.
3.1 REQUIREMENT ANALYSIS
3.1.1 Software Requirements
After experimenting with various commercially available software and analyzing their pros and cons, the following were chosen:
• Operating System – Platform Independent
• Programming Language – Java 1.6+
• Front End – Java
3.1.2 Hardware Requirements
The hardware requirements of the proposed system are as follows:
• Pentium III machine or above
• RAM – 256 MB
• Hard disk with a capacity of 10 GB
3.2 FEASIBILITY ANALYSIS
Feasibility analysis examines the system step by step. The analysis showed that this project is feasible in all respects. Three kinds of feasibility are considered:
• Economic Feasibility
• Technical Feasibility
• Operational Feasibility
3.2.1 Economic Feasibility
The system is developed using only software that is already widely used, so no new software needs to be installed. Hence, the cost incurred for this project is negligible.
3.2.2 Technical Feasibility
3.2.2.1 Searching
The main aim of our project is to search for a specific set of keywords in a specific set of websites only.
3.2.2.2 Categorizing
The next important task in our project is to categorize the documents, so that we are able to search them for a specific keyword set.
3.2.3 Operational Feasibility
The functions to be performed by the system are all valid and free of conflicts. All functions and constraints specified in the requirements are completely operational. The stated requirements are realistically testable.
The requirements can adapt to changes without large-scale effects on other system requirements. The system is capable of accommodating future requirements if they arise.
CHAPTER 4
SYSTEM DESIGN
This chapter describes the functional decomposition of the system and illustrates the movement of data between external entities, the processes and the data stores within the system, with the help of data flow diagrams.
4.1 USE CASE DIAGRAM
Actors: User, Client, Server
Use cases: IP List, URL List, Keywords, Specification, Send Jobs, Process Jobs, Searching, Results
[Use case diagram: the actors User, Server and Client with the use cases listed above]
4.2 CLASS DIAGRAM
[Class diagram: classes with their attributes and operations]
ClientRead: S : Socket; dataFS()
ClientWrite: S : Socket; send()
ServerRead: S : Socket; dataFS()
ServerWrite: S : Socket; send()
ServerGUI: main()
ClientGUI: main()
ServerManager: S : Socket, kN : int, key[] : String, URL : String
ClientManager: S : Socket, kN : int, key[] : String, URL : String
    search(), parseURL(), dataFS()
4.3 SEQUENCE DIAGRAM
Participants: User, Server, Client(s)
1: IP List
2: Keywords
3: URL List
4: Init Process
5: Allocate Jobs
6: Distribute Jobs
7: Process Searching
8: Result
9: Combined Result
4.4 COLLABORATION DIAGRAM
Participants: User, Server, Client(s)
1: IP List
2: Keywords
3: URL List
4: Init Process
5: Allocate Jobs
6: Distribute Jobs
7: Process Searching
8: Result
9: Combined Result
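The "Allocate Jobs" and "Distribute Jobs" steps can be sketched as a simple rotation over the clients, which is what the loop in ServerManager._split() in the appendix does when it sends each job to the next socket in turn. The class JobAllocator and the method allocate are hypothetical names used only for this illustration; the URLs are those of the sample JOBS.TXT.

```java
import java.util.*;

// Hypothetical sketch of round-robin job allocation; names are illustrative.
class JobAllocator {
    // Job i goes to client (i mod clientCount), i.e. clients are used in rotation.
    static List<List<String>> allocate(List<String> jobs, int clientCount) {
        if (clientCount <= 0) {
            throw new IllegalArgumentException("need at least one client");
        }
        List<List<String>> perClient = new ArrayList<List<String>>();
        for (int c = 0; c < clientCount; c++) {
            perClient.add(new ArrayList<String>());
        }
        for (int i = 0; i < jobs.size(); i++) {
            perClient.get(i % clientCount).add(jobs.get(i));
        }
        return perClient;
    }

    public static void main(String[] args) {
        List<String> jobs = Arrays.asList(
            "www.google.co.in", "www.yahoo.com", "www.chennaionline.com",
            "www.psgtech.edu", "www.psgtech.edu");
        List<List<String>> plan = allocate(jobs, 2);
        for (int c = 0; c < plan.size(); c++) {
            System.out.println("client " + c + ": " + plan.get(c));
        }
    }
}
```

With two clients, the first receives the first, third and fifth URLs and the second receives the second and fourth. A rotation like this balances the number of jobs per client but not the size of the pages, so a client that draws large pages may finish later than the others.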
4.5 STATE CHART / ACTIVITY DIAGRAM
[Activity diagram with Server and Client swimlanes. Activities: Read IP List, URL List and Keywords; Send Keywords to all IPs; Send URL List to all IPs; Receive all Data; Search each keyword count in each URL; Results; Compute all keyword counts from all URLs; decision "Results Found?" (Yes / No); Display Results]
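The "Compute all keyword counts from all URLs" activity can be sketched as below. The line shape "url---key---count", optionally preceded by socket information ending in "]:", is taken from the sample server output in the appendix; the class ResultAggregator and the method totalsPerKeyword are hypothetical names, and summing per keyword is an assumption about what the combined result contains.

```java
import java.util.*;

// Hypothetical sketch: combines per-URL keyword counts into per-keyword totals.
class ResultAggregator {
    // Expected line shape: "<socket info>]:<url>---<key>---<count>".
    // Lines that do not match (for example "Is Found:true") are skipped.
    static Map<String, Integer> totalsPerKeyword(List<String> lines) {
        Map<String, Integer> totals = new TreeMap<String, Integer>();
        for (String line : lines) {
            int marker = line.indexOf("]:");
            String payload = marker >= 0 ? line.substring(marker + 2) : line;
            String[] parts = payload.split("---");
            if (parts.length != 3) continue;
            int count;
            try {
                count = Integer.parseInt(parts[2].trim());
            } catch (NumberFormatException e) {
                continue; // malformed count, ignore the line
            }
            Integer old = totals.get(parts[1]);
            totals.put(parts[1], old == null ? count : old + count);
        }
        return totals;
    }

    public static void main(String[] args) {
        String prefix = "Socket[addr=/127.0.0.1,port=4926,localport=5678]:";
        List<String> lines = Arrays.asList(
            "Is Found:true",
            prefix + "www.google.co.in---href---36",
            prefix + "www.google.co.in---www---18",
            "Is Found:true",
            prefix + "www.yahoo.com---href---48",
            prefix + "www.yahoo.com---www---5");
        System.out.println(totalsPerKeyword(lines));
    }
}
```

For the sample lines, the totals are 84 for "href" (36 + 48) and 23 for "www" (18 + 5).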
4.6 DEPLOYMENT DIAGRAM
[Deployment diagram: inputs IP List, URL List and Keywords; nodes Server and Client(s)]
CHAPTER 5
IMPLEMENTATION
This phase is divided into two parts: development and implementation. The individual system components are built during development, when programs are written and tried out by users.
During implementation, the components built during development are put into operational use.
In the development phase of our system, the following system components were built:
• Server module
• Parser module
Both the Server and Parser modules are developed in Java.
5.1 Server Module
This module contains the following sub-modules:
• Load Details
• Categorizing
• Searching
5.1.1 Load Details
This sub-module loads the categories and their related categories, the documents and their categories, and the keys of each category with their weights.
5.1.2 Categorizing
This sub-module categorizes a given document using the key set parsed from that document and the weights those keys carry in the available categories.
5.1.3 Searching
This sub-module searches for documents and their categories using a given key set.
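The Searching sub-module can be pictured with the following minimal sketch. It assumes that each categorized document keeps a count for every key found in it, that a composite keyword is split on whitespace, and that results are ordered by the total number of occurrences of the searched keys. The class DocumentSearch, its nested class Doc and the category labels are hypothetical; the "page" and "www" counts for the two sites come from the sample output in the appendix.

```java
import java.util.*;

// Hypothetical sketch of keyword search over already categorized documents.
class DocumentSearch {
    // One categorized document: URL, category and per-key occurrence counts.
    static class Doc {
        final String url;
        final String category;
        final Map<String, Integer> keyCounts = new HashMap<String, Integer>();

        Doc(String url, String category) {
            this.url = url;
            this.category = category;
        }

        Doc count(String key, int n) {
            keyCounts.put(key, n);
            return this;
        }

        // Total occurrences of all searched keys in this document.
        int hits(String[] keys) {
            int total = 0;
            for (String k : keys) {
                Integer n = keyCounts.get(k);
                if (n != null) total += n;
            }
            return total;
        }
    }

    // Return "url [category]" for every document containing at least one key,
    // ordered by descending total occurrences.
    static List<String> search(List<Doc> docs, String query) {
        final String[] keys = query.trim().split("\\s+");
        List<Doc> matches = new ArrayList<Doc>();
        for (Doc d : docs) {
            if (d.hits(keys) > 0) matches.add(d);
        }
        Collections.sort(matches, new Comparator<Doc>() {
            public int compare(Doc a, Doc b) {
                return b.hits(keys) - a.hits(keys);
            }
        });
        List<String> out = new ArrayList<String>();
        for (Doc d : matches) {
            out.add(d.url + " [" + d.category + "]");
        }
        return out;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc("www.yahoo.com", "Portal").count("page", 1).count("www", 5),
            new Doc("www.google.co.in", "Search").count("page", 1).count("www", 18),
            new Doc("www.psgtech.edu", "Education").count("course", 3));
        System.out.println(search(docs, "page www"));
    }
}
```

For the composite keyword "page www", www.google.co.in (19 occurrences) is listed before www.yahoo.com (6 occurrences), and www.psgtech.edu is omitted because it contains neither key.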
5.2 Parser Module
This module contains the following sub-modules:
• Load Module
• URL Content Grabber Module
5.2.1 Load Module
This sub-module loads the keywords from the server and then receives the URLs to be searched.
5.2.2 URL Content Grabber Module
Whenever a URL arrives from the server, the parser connects to that URL, retrieves its contents, and collects the key sets from that site so that searching can begin.
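A minimal sketch of the content grabber follows. The parser source is not part of this listing, so the details are assumptions: the class ContentGrabber and the methods fetch and countOccurrences are hypothetical names, "http://" is prepended when the job entry has no scheme (as in the sample JOBS.TXT), and matching is case-sensitive and non-overlapping.

```java
import java.io.*;
import java.net.*;

// Hypothetical sketch of the URL Content Grabber; not the project's parser code.
class ContentGrabber {
    // Download the page at the given address and return its text.
    // Assumption: entries such as "www.psgtech.edu" are fetched over http.
    static String fetch(String address) throws IOException {
        String full = address.contains("://") ? address : "http://" + address;
        StringBuilder text = new StringBuilder();
        BufferedReader in = new BufferedReader(
            new InputStreamReader(new URL(full).openStream()));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        } finally {
            in.close();
        }
        return text.toString();
    }

    // Count non-overlapping, case-sensitive occurrences of key in text.
    static int countOccurrences(String text, String key) {
        if (key.isEmpty()) return 0;
        int count = 0;
        for (int at = text.indexOf(key); at >= 0; at = text.indexOf(key, at + key.length())) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // With no argument, run offline on an inline fragment; otherwise fetch the URL.
        String page = args.length > 0
            ? fetch(args[0])
            : "<a href=\"http://www.example.com\">page</a> <a href=\"/next\">next page</a>";
        for (String key : new String[] {"page", "href", "www", "Tamil"}) {
            System.out.println(key + "---" + countOccurrences(page, key));
        }
    }
}
```

Counting on the raw page text, as done here, also counts keys that occur inside markup such as href attributes, which is consistent with the large "href" counts in the sample output. Stripping tags before counting would be a different design choice.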
CHAPTER 6
TESTING
This chapter explains the various testing procedures conducted on the system.
Testing is a process of executing a program with the intent of finding an error. A
successful test is one that uncovers an as yet undiscovered error. A testing process
cannot show the absence of defects but can only show that software errors are present.
It ensures that defined input will produce actual results that agree with the required
results. A good testing methodology should include
• Clearly define testing roles, responsibilities and procedures
• Establish consistent testing process
• Streamline testing requirements
• Overcome “requirements slow me down” mentality
• Common sense process approach
• Use some elements of existing Process
• Not an attempt to replace, rewrite or redefine Process
• To find defects early and to give good time to developers for bug fixes
• Independent perspective in testing
Some of the testing principles used in this project are:
• Unit Testing
• Integration Testing
6.1 UNIT TESTING
Unit testing is a strategy in which the individual components that make up the system are tested first, to ensure that the system works to the desired extent. It focuses the verification effort on the smallest unit of the software design, the module. The modules of the system are tested to see whether they perform their intended functions. Using the procedural design description, important control paths are tested to uncover errors within the boundary of the module. For example, the functions that accept a connection are unit tested within their respective modules. The unit test is normally a white box test (a testing method in which the control structure of the procedural design is used to derive test cases).
6.1.1 Process Objectives
To test every unit of the software in isolation before integrating it with other units.
6.1.2 Definition of Unit
A unit is a module, as identified during the size estimation process, with a size estimate that does not exceed 1000 LOC.
For GUI applications, each screen is a unit.
If the size estimate for a unit exceeds 1000 LOC and it is not feasible to break it into smaller logically independent units that can be tested in isolation, the project lead, in concurrence with the SQA, can decide to define this as a unit.
6.1.3 Entry Criteria
The entry criteria for this process are the following:
• Unit completed
• Unit peer reviewed
6.1.4 Exit Criteria
The exit criteria for this process are the following:
• Unit test cases executed
• Any defects that are identified during unit testing and not fixed before the unit enters component testing are listed in the test report and verified
• 100% statement coverage
If a unit is to be tested before its code review, this must be identified in the project plan. In such projects, the developer self-reviews (desk checks) the code before unit testing.
Where exception handling covers error conditions that are difficult to generate, making 100% statement coverage impossible, the code should be formally reviewed with this additional criterion.
6.2 INTEGRATION TESTING
Integration testing is a systematic technique for constructing the program structure while conducting tests to uncover errors associated with interfacing. It is a type of testing in which the individual modules of the system are combined and tested to see whether they work properly as a whole. The objective is to take unit-tested modules and build the program that has been dictated by the design. Integration testing can be either 'Incremental' or 'Non-Incremental'.
The objective of the integration testing is to help engineers plan and execute the
component and Integration testing for their respective projects.
Integration testing should include the following objectives:
• Performed by the product group/Dev test team after feature complete
• Determines that all product components on a list of specific platforms function
successfully together (The List specified in Master test plan)
• Performed in a basic product / platform environment (Basic environment
specified in Master test plan)
• Tests the product functionality against the specification
• Tests functionality of fake languages with sample single and double byte
languages
• Tests scaling to an acceptable minimum level as called out in the master test
plan
• Tests performance, reliability to an acceptable level as called out in the master
test plan
• Final integration tests done after all components are integrated, with the build in
production format
The tasks of the project have been integrated and the functioning of the entire
system has been found to be satisfactory. The functionality of the entire system has
been subjected to a series of tests and all the modules have been found to interoperate
properly.
Finally the integration testing was performed on the integrated system and found
to work properly.
6.3 SAMPLE TEST CASES
Some of the sample test cases employed, along with their results, are given in the table below.
Table 6.1 Sample Test Cases
Test Description                                           Result
Is the server stable when running more than one key set?   OK
Does the parser return the results properly?               OK
Is searching done correctly?                               OK
Does the server use few resources?                         OK
Are the results obtained in a short time?                  OK
CHAPTER 7
SNAPSHOT
This chapter contains the snapshot of various forms in our system.
7.1 Finding Category of given document
7.2 Finding the Document & its Category using given keyword
CONCLUSION
Thus the analysis, design and implementation of text categorization and searching have been completed successfully. The user can search for a set of keywords in a list of websites and view the count of each keyword for a particular website. This kind of search is useful for crawling websites with a particular view of specific content. Because the search runs concurrently, higher performance can be obtained.
FUTURE ENHANCEMENTS
Currently we use a flat classification scheme to find categories. In future, it will be extended to a hierarchical, tree-structured classification to reduce the time complexity and improve relevancy. Currently we supply the set of websites to be classified; in future, classification will be done by automatically parsing sites.
BIBLIOGRAPHY
• [Lorenz 1994] Lorenz, L. Kidd, J. Object Oriented Software Metrics. Prentice Hall, 1994. ISBN 0-13-179292-X.
• Saturnino Luz. Implementing a Text Categorization System: a step-by-step tutorial.
• A. McCallum and K. Nigam. A comparison of event models for naive Bayes text classification. In AAAI/ICML-98 Workshop on Learning for Text Categorization, pages 41–48. AAAI Press, 1998.
• Y. Yang and J. O. Pedersen. A comparative study on feature selection in text categorization. In D. H. Fisher, editor, Proceedings of ICML-97, 14th International Conference on Machine Learning, pages 412–420, Nashville, 1997. Morgan Kaufmann Publishers.
• Java Network Programming, Second Edition. O'Reilly & Associates, Inc.
• Herbert Schildt and Patrick Naughton. Java 2: The Complete Reference, Fourth Edition. Tata McGraw-Hill Publishing Company Limited, 2001.
Websites
http://www-2.cs.cmu.edu/afs/cs.cmu.edu/project/theo-11/www/wwkb/
http://paul.luminos.nl/documents/show_document.php?d=197
APPENDIX
SOURCE CODE LISTINGS
This chapter provides source code listings.
INPUT FILES
IP.TXT
2
127.0.0.1
127.0.0.1
127.0.0.1
JOBS.TXT
0
5
www.google.co.in
www.yahoo.com
www.chennaionline.com
www.psgtech.edu
www.psgtech.edu
KEY.TXT
4
page
href
www
Tamil
OUTPUT (In Server)
Sockets created
Keys distributed
Is Found:true
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.google.co.in---page---1
Is Found:true
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.google.co.in---href---36
Is Found:true
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.google.co.in---www---18
Is Found:true
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.google.co.in---Tamil---1
Is Found:true
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.yahoo.com---page---1
Is Found:true
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.yahoo.com---href---48
Is Found:true
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.yahoo.com---www---5
Is Found:false
Socket[addr=/127.0.0.1,port=4926,localport=5678]:www.yahoo.com---Tamil---0
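The input files shown above share one layout: a line holding a count, followed by that many entries (JOBS.TXT has an additional job-type line before the count). The sketch below reads such a list from any reader. The class CountedListReader and the method readCountedList are hypothetical names; the loaders in the server listing that follows read the files directly with FileReader instead.

```java
import java.io.*;
import java.util.*;

// Hypothetical sketch of a reader for the count-prefixed input files.
class CountedListReader {
    // Read one count line, then exactly that many entries; extra lines are ignored.
    static List<String> readCountedList(BufferedReader in) throws IOException {
        String header = in.readLine();
        if (header == null) {
            throw new IOException("missing count line");
        }
        int n = Integer.parseInt(header.trim());
        List<String> items = new ArrayList<String>();
        for (int i = 0; i < n; i++) {
            String line = in.readLine();
            if (line == null) {
                throw new IOException("expected " + n + " entries, found " + i);
            }
            items.add(line.trim());
        }
        return items;
    }

    public static void main(String[] args) throws IOException {
        // Contents of the sample IP.TXT: the count is 2 although three addresses follow.
        String ipFile = "2\n127.0.0.1\n127.0.0.1\n127.0.0.1\n";
        List<String> ips = readCountedList(new BufferedReader(new StringReader(ipFile)));
        System.out.println(ips.size() + " client(s): " + ips);
    }
}
```

Because the count line governs how many entries are read, the sample IP.TXT yields two client addresses even though it lists three, which matches the single client port that appears in the sample output only if both entries reach the same machine.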
SERVER
/*
 * ServerGUI.java
 *
 * Created on November 2, 2008, 3:09 PM
 */
import java.io.*;
import java.util.*;
import javax.swing.*;

/**
 *
 * @author SuperStar
 */
interface ServerI {
    public void setErr(String err);
    public void setInfo(String info);
}

public class ServerGUI extends javax.swing.JFrame implements ServerI {
    String[] ip;
    int ipN=0,rN=0,jN=0,jT=0,kN=0;
    String[] jobs;
    String[] rank;
    String[] key;
    ServerManager SM;

    /** Creates new form ServerGUI */
    public ServerGUI() {
        initComponents();
        this.jTextArea2.setText("Err Stream:");
        this.jList1.removeAll();
        // this.jList2.removeAll();
        this.jList3.removeAll();
        (new MessageBox("welcome To SuperStar's Network!")).setVisible(true);
    }

    /** This method is called from within the constructor to
     * initialize the form.
     * WARNING: Do NOT modify this code. The content of this method is
     * always regenerated by the Form Editor.
     */
    // <editor-fold defaultstate="collapsed" desc="Generated Code">//GEN-BEGIN:initComponents
    private void initComponents() {
        jScrollPane1 = new javax.swing.JScrollPane();
        jList1 = new javax.swing.JList();
        jLabel1 = new javax.swing.JLabel();
        jButton1 = new javax.swing.JButton();
        jScrollPane3 = new javax.swing.JScrollPane();
        jList3 = new javax.swing.JList();
        jLabel2 = new javax.swing.JLabel();
        jScrollPane2 = new javax.swing.JScrollPane();
        jTextArea1 = new javax.swing.JTextArea();
        jButton3 = new javax.swing.JButton();
        jScrollPane4 = new javax.swing.JScrollPane();
        jTextArea2 = new javax.swing.JTextArea();
        jButton2 = new javax.swing.JButton();
        jScrollPane5 = new javax.swing.JScrollPane();
        jTextArea3 = new javax.swing.JTextArea();

        setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);
        setTitle("Server");

        jList1.setModel(new javax.swing.AbstractListModel() {
            String[] strings = { "Item 1", "Item 2", "Item 3", "Item 4", "Item 5" };
            public int getSize() { return strings.length; }
            public Object getElementAt(int i) { return strings[i]; }
        });
        jScrollPane1.setViewportView(jList1);

        jLabel1.setText("Clients IP :");

        jButton1.setText("Load Details");
        jButton1.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                jButton1ActionPerformed(evt);
            }
        });

        jList3.setModel(new javax.swing.AbstractListModel() {
            String[] strings = { "Item 1", "Item 2", "Item 3", "Item 4", "Item 5" };
            public int getSize() { return strings.length; }
            public Object getElementAt(int i) { return strings[i]; }
        });
        jScrollPane3.setViewportView(jList3);

        jLabel2.setText("Clients Rank :");
        jTextArea1.setColumns(20);
        jTextArea1.setEditable(false);
        jTextArea1.setLineWrap(true);
        jTextArea1.setRows(5);
        jTextArea1.setWrapStyleWord(true);
        jTextArea1.setOpaque(false);
        jScrollPane2.setViewportView(jTextArea1);

        jButton3.setText("Exit");
        jButton3.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                jButton3ActionPerformed(evt);
            }
        });

        jTextArea2.setColumns(20);
        jTextArea2.setRows(5);
        jScrollPane4.setViewportView(jTextArea2);

        jButton2.setText("Process");
        jButton2.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                jButton2ActionPerformed(evt);
            }
        });

        jTextArea3.setColumns(20);
        jTextArea3.setRows(5);
        jScrollPane5.setViewportView(jTextArea3);

        javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
        getContentPane().setLayout(layout);
        layout.setHorizontalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(layout.createSequentialGroup()
                .addContainerGap()
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                    .addGroup(javax.swing.GroupLayout.Alignment.TRAILING, layout.createSequentialGroup()
                        .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                            .addComponent(jLabel1)
                            .addComponent(jScrollPane1, javax.swing.GroupLayout.DEFAULT_SIZE, 330, Short.MAX_VALUE))
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                            .addComponent(jLabel2)
                            .addComponent(jScrollPane3, javax.swing.GroupLayout.PREFERRED_SIZE, 333, javax.swing.GroupLayout.PREFERRED_SIZE)))
                    .addGroup(layout.createSequentialGroup()
                        .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
                            .addComponent(jScrollPane5, javax.swing.GroupLayout.PREFERRED_SIZE, 195, javax.swing.GroupLayout.PREFERRED_SIZE)
                            .addComponent(jScrollPane4, javax.swing.GroupLayout.PREFERRED_SIZE, 195, javax.swing.GroupLayout.PREFERRED_SIZE))
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addComponent(jScrollPane2, javax.swing.GroupLayout.DEFAULT_SIZE, 371, Short.MAX_VALUE)
                        .addGap(6, 6, 6)
                        .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                            .addComponent(jButton2, javax.swing.GroupLayout.DEFAULT_SIZE, 91, Short.MAX_VALUE)
                            .addGroup(layout.createSequentialGroup()
                                .addGap(10, 10, 10)
                                .addComponent(jButton3, javax.swing.GroupLayout.PREFERRED_SIZE, 60, javax.swing.GroupLayout.PREFERRED_SIZE))
                            .addComponent(jButton1, javax.swing.GroupLayout.DEFAULT_SIZE, javax.swing.GroupLayout.DEFAULT_SIZE, Short.MAX_VALUE))))
                .addContainerGap())
        );
        layout.setVerticalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(layout.createSequentialGroup()
                .addGap(11, 11, 11)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.TRAILING)
                    .addGroup(layout.createSequentialGroup()
                        .addComponent(jLabel2)
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addComponent(jScrollPane3, javax.swing.GroupLayout.PREFERRED_SIZE, 88, javax.swing.GroupLayout.PREFERRED_SIZE))
                    .addGroup(layout.createSequentialGroup()
                        .addComponent(jLabel1)
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addComponent(jScrollPane1, javax.swing.GroupLayout.PREFERRED_SIZE, 88, javax.swing.GroupLayout.PREFERRED_SIZE)))
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                .addGroup(layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
                    .addComponent(jScrollPane2, javax.swing.GroupLayout.DEFAULT_SIZE, 104, Short.MAX_VALUE)
                    .addGroup(layout.createSequentialGroup()
                        .addComponent(jButton1)
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addComponent(jButton3)
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addComponent(jButton2))
                    .addGroup(layout.createSequentialGroup()
                        .addComponent(jScrollPane4, javax.swing.GroupLayout.PREFERRED_SIZE, 49, javax.swing.GroupLayout.PREFERRED_SIZE)
                        .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                        .addComponent(jScrollPane5, javax.swing.GroupLayout.PREFERRED_SIZE, 49, javax.swing.GroupLayout.PREFERRED_SIZE)))
                .addContainerGap())
        );
        pack();
    }// </editor-fold>//GEN-END:initComponents

    private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {//GEN-FIRST:event_jButton1ActionPerformed
        // TODO add your handling code here:
        _getIPList();
        _getRankList();
        _getJobs();
        _getKeyList();
    }//GEN-LAST:event_jButton1ActionPerformed

    private void jButton3ActionPerformed(java.awt.event.ActionEvent evt) {//GEN-FIRST:event_jButton3ActionPerformed
        // TODO add your handling code here:
        this.dispose();
        System.exit(0);
    }//GEN-LAST:event_jButton3ActionPerformed

    private void jButton2ActionPerformed(java.awt.event.ActionEvent evt) {//GEN-FIRST:event_jButton2ActionPerformed
        // TODO add your handling code here:
        SM=new ServerManager(ipN,rN,jN,kN,ip,jobs,rank,key,this);
    }//GEN-LAST:event_jButton2ActionPerformed

    /**
     * @param args the command line arguments
     */
    public static void main(String args[]) {
        java.awt.EventQueue.invokeLater(new Runnable() {
            public void run() {
                new ServerGUI().setVisible(true);
            }
        });
    }

    // Variables declaration - do not modify//GEN-BEGIN:variables
    private javax.swing.JButton jButton1;
    private javax.swing.JButton jButton2;
    private javax.swing.JButton jButton3;
    private javax.swing.JLabel jLabel1;
    private javax.swing.JLabel jLabel2;
    private javax.swing.JList jList1;
    private javax.swing.JList jList3;
    private javax.swing.JScrollPane jScrollPane1;
    private javax.swing.JScrollPane jScrollPane2;
    private javax.swing.JScrollPane jScrollPane3;
    private javax.swing.JScrollPane jScrollPane4;
    private javax.swing.JScrollPane jScrollPane5;
    private javax.swing.JTextArea jTextArea1;
    private javax.swing.JTextArea jTextArea2;
    private javax.swing.JTextArea jTextArea3;
    // End of variables declaration//GEN-END:variables

    //
    public void _getIPList() {
        this.jList1.removeAll();
        try {
            BufferedReader in=new BufferedReader(new FileReader("ip.txt"));
            ipN=Integer.parseInt(in.readLine());
            ip=new String[ipN];
            for(int i=0;i<ipN;i++) {
                ip[i]=in.readLine();
            }
            in.close();
            this.jList1.setListData(ip);
            this.jButton1.setEnabled(false);
        } catch(Exception e) {
            setErr(e.getMessage());
        }
    }

    //
    public void _getRankList() {
        this.jList3.removeAll();
        try {
            BufferedReader in=new BufferedReader(new FileReader("rank.txt"));
            rN=Integer.parseInt(in.readLine());
            rank=new String[rN];
            for(int i=0;i<rN;i++) {
                rank[i]=in.readLine();
            }
            in.close();
            this.jList3.setListData(rank);
        } catch(Exception e) {
            setErr(e.getMessage());
        }
    }

    //
    public void _getKeyList() {
        this.jTextArea3.setText("");
        try {
            BufferedReader in=new BufferedReader(new FileReader("key.txt"));
            kN=Integer.parseInt(in.readLine());
            key=new String[kN];
            for(int i=0;i<kN;i++) {
                key[i]=in.readLine();
                this.jTextArea3.setText(this.jTextArea3.getText()+"\n"+key[i]);
            }
            in.close();
            //this.jList3.setListData(rank);
        } catch(Exception e) {
            setErr(e.getMessage());
        }
    }

    //
    public void _getJobs() {
        this.jTextArea2.setText("");
        try {
            BufferedReader in=new BufferedReader(new FileReader("jobs.txt"));
            jT=Integer.parseInt(in.readLine());
            this.jTextArea2.setText("Job Type:"+jT);
            switch(jT) {
                case 0:
                    jN=Integer.parseInt(in.readLine());
                    jobs=new String[jN];
                    for(int i=0;i<jN;i++) {
                        jobs[i]=in.readLine();
                        this.jTextArea2.setText(this.jTextArea2.getText()+"\n"+jobs[i]);
                    }
                    break;
            }
            in.close();
        } catch(Exception e) {
            setErr(e.getMessage());
        }
    }

    //
    public void setErr(String err) {
        this.jTextArea1.setText(this.jTextArea1.getText()+"\n"+err);
        System.out.println(err);
    }

    public void setInfo(String info) {
        setErr(info);
    }
}

/**
 *
 * @author SuperStar
 */
import java.net.*;
import java.io.*;

interface ServerIF {
    final int PORT=5678;
    public void dataFC(String data);
}
public class ServerManager extends Thread implements ServerIF {
    String IP[],R[],J[],K[];   // client addresses, ranks, jobs (URLs), keys
    int rN,ipN,jN,kN;          // number of ranks, clients, jobs, keys
    Socket[] sock;
    ServerWriteThread[] SWT;
    ServerReadThread[] SRT;
    ServerI SI=null;

    public ServerManager(int i,int r,int j,int k,String[] ip1,String[] j1,String[] r1,String[] k1,ServerI si) {
        rN=r;
        ipN=i;
        jN=j;
        kN=k;
        IP=ip1;
        J=j1;
        R=r1;
        K=k1;
        SI=si;
        start();
    }

    // Open one socket per client, attach a writer and a reader to it, then distribute the work.
    public void run() {
        try {
            sock=new Socket[ipN];
            SWT=new ServerWriteThread[ipN];
            SRT=new ServerReadThread[ipN];
            //SI.setInfo("ipn:"+ipN);
            for(int i=0;i<ipN;i++) {
                sock[i]=new Socket(IP[i],5678);
                //SI.setInfo("ip:"+IP[i]);
                SWT[i]=new ServerWriteThread(sock[i],SI,this);
                SRT[i]=new ServerReadThread(sock[i],SI,this);
                //SI.setInfo("soc:"+sock[i].toString());
            }
            SI.setInfo("Sockets created");
            _split();
        } catch(Exception e1) {
            SI.setErr("Sock Cre:"+e1.toString());
        }
    }

    // Send the key count and the keys to every client, then hand out the jobs in rotation.
    public void _split() {
        //java.util.Arrays.sort(R);
        for(int i=0;i<ipN;i++) {
            SWT[i].send(""+kN);
            //SI.setInfo(""+kN);
        }
        for(int i=0;i<ipN;i++) {
            for(int j=0;j<kN;j++) {
                SWT[i].send(K[j]);
                //SI.setInfo(K[j]);
            }
        }
        //
        SI.setInfo("Keys distributed");
        // Round-robin: job i goes to client j, and j wraps back to 0 after the last client.
        for(int i=0,j=0;i<jN;i++) {
            SWT[j].send(J[i]);
            //SI.setInfo(J[i]);
            if(j<ipN-1)
                j++;
            else
                j=0;
        }
    }

    // Callback used by the reader threads to report data received from a client.
    public void dataFC(String data) {
        SI.setInfo(data);
    }

    public void _quit() {
        //
    }
}

//////////////
class ServerWriteThread {
    Socket S;
    ServerI SI=null;
    ServerIF SIF;

    public ServerWriteThread(Socket s,ServerI si,ServerIF sif) {
        SIF=sif;
        SI=si;
        S=s;
        //SI.setInfo(s.toString());
    }

    // Write one line to the client; the PrintWriter auto-flushes on println.
    public void send(String msg) {
        try {
            //SI.setInfo(msg);
            PrintWriter out=new PrintWriter(new BufferedWriter(new OutputStreamWriter(S.getOutputStream())),true);
            out.println(msg);
        } catch(Exception e3) {
            SI.setErr(e3.getMessage());
        }
    }
}

//////////////
class ServerReadThread extends Thread {
    Socket S;
    ServerI SI = null;
    ServerIF SIF;

    public ServerReadThread(Socket s, ServerI si, ServerIF sif) {
        S = s;
        SIF = sif;
        SI = si;
        //SI.setInfo(s.toString());
        start();
    }

    public void run() {
        try {
            BufferedReader in = new BufferedReader(new InputStreamReader(S.getInputStream()));
            String line;
            // readLine() returns null once the client closes the connection;
            // stop there instead of looping forever on null
            while ((line = in.readLine()) != null) {
                SIF.dataFC(line);
            }
        } catch (Exception e2) {
            SI.setErr(e2.getMessage());
        }
    }
}

/*
 * MessageBox.java
 *
 * Created on November 2, 2008, 9:15 PM
 */

/**
 *
 * @author SuperStar
 */
public class MessageBox extends javax.swing.JFrame {
    String MSG = "SuperStar";

    /** Creates new form MessageBox */
    public MessageBox(String msg) {
        MSG = msg;
        initComponents();
        this.jTextArea1.setText(MSG);
    }

    /** This method is called from within the constructor to
     * initialize the form.
     * WARNING: Do NOT modify this code. The content of this method is
     * always regenerated by the Form Editor.
     */
    // <editor-fold defaultstate="collapsed" desc="Generated Code">//GEN-BEGIN:initComponents
    private void initComponents() {
        jButton1 = new javax.swing.JButton();
        jScrollPane1 = new javax.swing.JScrollPane();
        jTextArea1 = new javax.swing.JTextArea();

        setDefaultCloseOperation(javax.swing.WindowConstants.EXIT_ON_CLOSE);
        setTitle("MessageBox");
        setAlwaysOnTop(true);
        setBackground(new java.awt.Color(183, 226, 252));
        setForeground(new java.awt.Color(0, 0, 0));

        jButton1.setText("OK");
        jButton1.addActionListener(new java.awt.event.ActionListener() {
            public void actionPerformed(java.awt.event.ActionEvent evt) {
                jButton1ActionPerformed(evt);
            }
        });

        jTextArea1.setColumns(20);
        jTextArea1.setRows(5);
        jTextArea1.setOpaque(false);
        jScrollPane1.setViewportView(jTextArea1);

        javax.swing.GroupLayout layout = new javax.swing.GroupLayout(getContentPane());
        getContentPane().setLayout(layout);
        layout.setHorizontalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addGroup(javax.swing.GroupLayout.Alignment.TRAILING, layout.createSequentialGroup()
                .addComponent(jScrollPane1, javax.swing.GroupLayout.DEFAULT_SIZE, 315, Short.MAX_VALUE)
                .addPreferredGap(javax.swing.LayoutStyle.ComponentPlacement.RELATED)
                .addComponent(jButton1)
                .addContainerGap())
        );
        layout.setVerticalGroup(
            layout.createParallelGroup(javax.swing.GroupLayout.Alignment.LEADING)
            .addComponent(jScrollPane1, javax.swing.GroupLayout.PREFERRED_SIZE, 46, javax.swing.GroupLayout.PREFERRED_SIZE)
            .addGroup(layout.createSequentialGroup()
                .addContainerGap()
                .addComponent(jButton1))
        );

        pack();
    }// </editor-fold>//GEN-END:initComponents

    private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {//GEN-FIRST:event_jButton1ActionPerformed
        // TODO add your handling code here:
        this.dispose();
    }//GEN-LAST:event_jButton1ActionPerformed

    // Variables declaration - do not modify//GEN-BEGIN:variables
    private javax.swing.JButton jButton1;
    private javax.swing.JScrollPane jScrollPane1;
    private javax.swing.JTextArea jTextArea1;
    // End of variables declaration//GEN-END:variables
}

CLIENT

/**
 *
 * @author SuperStar
 */
import java.io.*;
import java.net.*;
import java.util.*;

public class ClientGUI {
    // Waits for a single connection from the server and hands it to a ClientManager.
    public static void main(String[] s) throws Exception {
        ServerSocket SS = new ServerSocket(5678);
        new ClientManager(SS.accept());
    }
}

/////////
interface ClientIF {
    final int PORT = 5678;

    public void dataFS(String s);

    public void setErr(String err);

    public void setInfo(String info);

    public void setKLen(int kn);

    public void setKeys(String[] k);
}

////////
class ClientManager implements ClientIF {
    Socket S;
    ClientWriteThread CWT;
    ClientReadThread CRT;
    int kN = 0;
    String[] key;
    String URL;

    public ClientManager(Socket s) {
        S = s;
        //setInfo(s.toString());
        CWT = new ClientWriteThread(S, this);
        CRT = new ClientReadThread(S, this);
    }

    // Counts the occurrences of key in src (overlapping matches included)
    // and reports the result to the server.
    public void _search(String src, String key) {
        int c = 0, j = -1;
        // An empty key would match at every index, so it is counted as zero
        if (!key.isEmpty()) {
            while ((j = src.indexOf(key, j + 1)) != -1) {
                ++c;
            }
        }
        CWT.send("Is Found:" + src.contains(key));
        CWT.send(S.toString() + "\n:" + URL + "---" + key + "---" + c);
        //setInfo(URL+"---"+key+"---"+c);
    }

    // Downloads the root page of host u over HTTP and returns it as a string
    // (empty if the download fails).
    public String _parseURL(String u) {
        // StringBuilder avoids re-copying the whole page for every character read
        StringBuilder r = new StringBuilder();
        try {
            URL url = new URL("http", u, "/");
            URLConnection con = url.openConnection();
            con.connect();
            // try-with-resources closes the stream even if a read fails
            try (InputStream in = con.getInputStream()) {
                int ch = -1;
                while ((ch = in.read()) != -1) {
                    r.append((char) ch);
                }
            }
            System.out.println(r);
        } catch (Exception e1) {
            setErr("URL Err:" + e1.toString());
        }
        //setInfo(r);
        return r.toString();
    }

    // Handles one job from the server: fetch the page once, then search it for every key.
    public void dataFS(String s) {
        URL = s;
        //setInfo(s);
        String page = _parseURL(s);
        for (int i = 0; i < kN; i++)
            _search(page, key[i]);
    }

    public void setErr(String err)
    {
        System.out.println(err);
    }

    public void setInfo(String info) {
        setErr(info);
    }

    public void setKLen(int kn) {
        kN = kn;
    }

    public void setKeys(String[] k) {
        key = k;
    }
}

///////////
class ClientWriteThread {
    Socket S;
    ClientIF CIF;

    public ClientWriteThread(Socket s, ClientIF cif) {
        CIF = cif;
        S = s;
        //CIF.setInfo(S.toString());
    }

    public void send(String msg) {
        try {
            //CIF.setInfo(msg);
            PrintWriter out = new PrintWriter(
                    new BufferedWriter(new OutputStreamWriter(S.getOutputStream())), true);
            out.println(msg);
        } catch (Exception e3) {
            CIF.setErr("Send:" + e3.getMessage());
        }
    }
}

//////////////
class ClientReadThread extends Thread {
    Socket S;
    //ServerI SI=null;
    ClientIF CIF;
    int kN = 0;
    String[] key;

    public ClientReadThread(Socket s, ClientIF cif) {
        S = s;
        CIF = cif;
        //SI=si;
        //CIF.setInfo(s.toString());
        //CIF.setInfo(""+kN);
        start();
    }

    public void run() {
        try {
            BufferedReader in = new BufferedReader(new InputStreamReader(S.getInputStream()));
            // Protocol: first the number of keys, then the keys, then one job per line
            kN = Integer.parseInt(in.readLine());
            key = new String[kN];
            //CIF.setInfo(""+kN);
            for (int i = 0; i < kN; i++) {
                key[i] = in.readLine();
                //CIF.setInfo(key[i]);
            }
            CIF.setKLen(kN);
            //CIF.setInfo(""+kN);
            CIF.setKeys(key);
            //CIF.setInfo(key.toString());
            String line;
            // readLine() returns null once the server closes the connection;
            // stop there instead of passing null to dataFS forever
            while ((line = in.readLine()) != null) {
                CIF.dataFS(line);
            }
        } catch (Exception e2) {
            CIF.setErr("Read:" + e2.getMessage());
        }
    }
}
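
The two core steps of the listings above, round-robin distribution of jobs in ServerManager._split and keyword counting in ClientManager._search, can be tried without sockets or a GUI. The following is a minimal sketch only: the class SearchSketch and its methods assignRoundRobin and countOccurrences are hypothetical names introduced for illustration and are not part of the project code, and the host names are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration; not part of the project code.
final class SearchSketch {

    // Assigns job i to client (i mod clientCount), the same order _split produces.
    static List<List<String>> assignRoundRobin(String[] jobs, int clientCount) {
        if (clientCount <= 0) {
            throw new IllegalArgumentException("clientCount must be positive");
        }
        List<List<String>> perClient = new ArrayList<>();
        for (int c = 0; c < clientCount; c++) {
            perClient.add(new ArrayList<>());
        }
        for (int i = 0; i < jobs.length; i++) {
            perClient.get(i % clientCount).add(jobs[i]);
        }
        return perClient;
    }

    // Counts occurrences of key in text, overlapping matches included.
    // An empty key is counted as zero (an assumption made for this sketch).
    static int countOccurrences(String text, String key) {
        if (key.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = -1;
        while ((from = text.indexOf(key, from + 1)) != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] jobs = {"a.example", "b.example", "c.example"}; // placeholder hosts
        System.out.println(assignRoundRobin(jobs, 2));           // [[a.example, c.example], [b.example]]
        System.out.println(countOccurrences("java and javascript", "java")); // 2
    }
}
```

With two clients and three jobs, the first client receives the first and third job and the second client receives the second, which matches the order in which _split sends lines to SWT[0] and SWT[1].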