CASE STUDY #1
KWIC
Dr Reeja S R, Professor, CSE Dept., SJEC, Vamanjoor, Mangalore
KWIC
Key Word In Context
“The KWIC index system accepts an ordered set of lines, each line is an ordered set of words, and each word is an ordered set of characters.
Any line may be “circularly shifted” by repeatedly removing the first word and appending it at the end of the line.
The KWIC index system outputs a listing of all circular shifts of all lines in alphabetical order.”
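The problem statement above can be illustrated with a minimal Python sketch. The function names `circular_shifts` and `kwic_index` are hypothetical, and the case-insensitive sort is an assumption; the statement only says "alphabetical order".

```python
def circular_shifts(line):
    """Return every circular shift of a line, starting with the line itself."""
    words = line.split()
    # Shift i moves the first i words to the end of the line.
    return [" ".join(words[i:] + words[:i]) for i in range(len(words))]


def kwic_index(lines):
    """Return all circular shifts of all lines in alphabetical order."""
    shifts = [shift for line in lines for shift in circular_shifts(line)]
    # Assumption: compare case-insensitively so capitalised words sort naturally.
    return sorted(shifts, key=str.lower)


if __name__ == "__main__":
    for entry in kwic_index(["Pipes and Filters", "Clouds"]):
        print(entry)
```

A line with n words contributes n shifts, so the index for the two lines in the usage example has four entries.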
Evaluation Criteria
1. Change in overall processing algorithm: line shifting as each line is read in, on all lines after they are read, or on demand
2. Change in data representation: store lines as character strings or linked lists
3. Enhancements: eliminate shifts that start with noise words (“a”, “the”), allow deletion of lines, make the system interactive
4. Performance: space and time
5. Reuse: to what extent may components be reused?
KWIC: Functional Decomposition
• Read input
• Create circular shifts
• Sort in alphabetical order
• Write output
Solution #1
Main program/Subroutine with Shared Data
KWIC: Solution #1
One subroutine for each of the four functions
Main program “Master control” calls each of the subroutines in turn
The subroutines operate on shared data
Solution #1 Main program/Subroutine with shared data
Input: reads the input lines and stores them in the Characters data structure.
Circular Shift: directly accesses Characters to build the Index data structure. Each entry in Index identifies the address of a circular shift in Characters.
Alphabetizer: directly accesses Characters and Index to build the Alphabetized Index, which contains an ordered set of indices into Characters.
Output: directly accesses Alphabetized Index and Characters to output the ordered, shifted titles.
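The four subroutines and their shared data might look roughly like the Python sketch below. It is an illustration under assumptions, not the original implementation: Characters is simplified to a list of word lists, an Index entry is a (line number, word offset) pair, and all function names are hypothetical.

```python
# Shared data, read and written directly by every subroutine.
characters = []          # input lines, each stored as a list of words (simplified)
index = []               # (line number, word offset) address of every shift
alphabetized_index = []  # the entries of `index`, in alphabetical order


def shifted(entry):
    """Rebuild the text of one shift from its address in `characters`."""
    line_no, offset = entry
    words = characters[line_no]
    return " ".join(words[offset:] + words[:offset])


def read_input(lines):
    characters.extend(line.split() for line in lines)


def circular_shift():
    for line_no, words in enumerate(characters):
        index.extend((line_no, offset) for offset in range(len(words)))


def alphabetize():
    # Assumption: case-insensitive ordering.
    alphabetized_index.extend(sorted(index, key=lambda e: shifted(e).lower()))


def write_output():
    return [shifted(entry) for entry in alphabetized_index]


def master_control(lines):
    """Main program: reset the shared data, then call each subroutine in turn."""
    characters.clear()
    index.clear()
    alphabetized_index.clear()
    read_input(lines)
    circular_shift()
    alphabetize()
    return write_output()
```

Only addresses are stored for the shifts, which is where the space efficiency comes from. Every subroutine, however, depends on the exact layout of `characters` and `index`.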
Strengths and weaknesses
Strengths:
• Efficient use of space (and time)
• Intuitive appeal: a natural solution
• Enhancements, e.g. removing shifts starting with noise words, can be easily accommodated
Weaknesses:
• Change of data representation affects all modules
• Not particularly supportive of reuse: tight coupling with data structures
• Changes to overall processing algorithm: depends on the nature of the change
Solution 1
5 criteria to judge each design on:
1. Change the overall algorithm?
2. Changes to data representation?
3. Extend the program functionality?
4. Efficient use of space and time?
5. Can the code be easily re-used?
Solution #2 Abstract Data Type
Strengths and weaknesses
Strengths:
• Data representations can be changed inside individual modules without affecting others.
• Reuse is better supported because modules make fewer assumptions about others (looser coupling).
Weaknesses:
• Difficult to enhance, e.g. how would you remove shifts starting with noise words? Feasible, but potentially messy.
• Perhaps more space, and access through interfaces may be slightly slower.
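A rough Python sketch of the abstract data type idea follows. The slides do not list the modules for this solution, so the class names, method names and case-insensitive sort are all assumptions made for illustration. Each module keeps its representation private and is reached only through its interface.

```python
class LineStorage:
    """Hides how lines are stored; callers use only the methods below."""

    def __init__(self):
        self._lines = []  # private representation: a list of word lists

    def add_line(self, text):
        self._lines.append(text.split())

    def line_count(self):
        return len(self._lines)

    def word_count(self, line_no):
        return len(self._lines[line_no])

    def word(self, line_no, word_no):
        return self._lines[line_no][word_no]


class CircularShifter:
    """Presents every shift through an interface, without copying the words."""

    def __init__(self, storage):
        self._storage = storage
        self._shifts = [(line_no, offset)
                        for line_no in range(storage.line_count())
                        for offset in range(storage.word_count(line_no))]

    def shift_count(self):
        return len(self._shifts)

    def shift(self, n):
        line_no, offset = self._shifts[n]
        count = self._storage.word_count(line_no)
        return " ".join(self._storage.word(line_no, (offset + k) % count)
                        for k in range(count))


class Alphabetizer:
    """Orders the shifts; knows nothing about how lines are represented."""

    def __init__(self, shifter):
        self._shifter = shifter
        # Assumption: case-insensitive ordering.
        self._order = sorted(range(shifter.shift_count()),
                             key=lambda n: shifter.shift(n).lower())

    def count(self):
        return len(self._order)

    def ith(self, i):
        return self._shifter.shift(self._order[i])
```

Replacing the list inside `LineStorage` with, say, one packed string would leave `CircularShifter` and `Alphabetizer` untouched. The extra method calls are the interface overhead noted under weaknesses.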
Solution 2
5 criteria to judge each design on:
1. Change the overall algorithm?
2. Changes to data representation?
3. Extend the program functionality?
4. Efficient use of space and time?
5. Can the code be easily re-used?
Solution #3 Implicit Invocation
LineStorage1: holds the original lines.
LineStorage2: holds all circular shifts.
Input module: reads data from a file and stores it in LineStorage1.
CircularShifter: produces circular shifts and stores them in LineStorage2.
Alphabetizer: sorts the circular shifts alphabetically.
Output: prints the sorted shifts.
Master control module: responsible for overall control of the system.
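The modules listed above could be wired together roughly as in this Python sketch. The `register`/`insert` mechanism, the function names and the case-insensitive sort are assumptions for illustration; the slides name the modules but not their interfaces. No module calls another directly: each reacts to an insert event on a line storage.

```python
class LineStorage:
    """A line buffer that announces every insert to its registered listeners."""

    def __init__(self):
        self.lines = []
        self._listeners = []

    def register(self, listener):
        self._listeners.append(listener)

    def insert(self, line):
        self.lines.append(line)
        for listener in self._listeners:  # implicit invocation
            listener(line)


def make_circular_shifter(target):
    """Listener for LineStorage1: stores every shift of a new line in `target`."""
    def on_line_inserted(line):
        words = line.split()
        for i in range(len(words)):
            target.insert(" ".join(words[i:] + words[:i]))
    return on_line_inserted


def make_alphabetizer(storage):
    """Listener for LineStorage2: keeps the shifts sorted after each insert."""
    def on_shift_inserted(_shift):
        storage.lines.sort(key=str.lower)  # assumption: case-insensitive order
    return on_shift_inserted


def master_control(input_lines):
    line_storage_1 = LineStorage()
    line_storage_2 = LineStorage()
    line_storage_1.register(make_circular_shifter(line_storage_2))
    line_storage_2.register(make_alphabetizer(line_storage_2))
    for line in input_lines:            # Input module
        line_storage_1.insert(line)
    return list(line_storage_2.lines)   # Output module
```

New behaviour is added by registering another listener rather than by editing existing modules.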
Strengths and weaknesses
Strengths:
• Can alter the overall processing algorithm by registering on different events, e.g. triggering after each line is entered or when all lines are entered.
• May enhance by adding independent functions, e.g. an Omit function that deletes ‘noisy’ shifts after a Shifted Line insert.
• Supports change in data representation.
• Reusable: modules are loosely coupled.
Weaknesses:
• More space is required, e.g. Line, Shifted Line and Sorted Line buffers.
Implicit Invocation
5 criteria to judge each design on:
1. Change the overall algorithm?
2. Changes to data representation?
3. Extend the program functionality?
4. Efficient use of space and time?
5. Can the code be easily re-used?
Solution #4 Pipes and Filters
Strengths and Weaknesses
Strengths:
• Intuitive flow of processing.
• Supports enhancements by addition of filters, e.g. an omit ‘noisy shift’ filter, or by modifying independent filters.
• Supports reuse: filters operate in isolation.
Weaknesses:
• How would you interactively delete user-selected lines?
• Data format has to be agreed between the filters.
• Inefficient use of space.
• Overall processing algorithm limited to sequential flow / batch style.
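One way to picture the pipes-and-filters design is with Python generators, as in the sketch below. The filter names, the `pipeline` helper and the case-insensitive sort are assumptions for illustration; the agreed data format between filters is simply "a stream of strings, one line or shift per item".

```python
NOISE_WORDS = {"a", "the"}  # the noise words used as examples in the slides


def read_input(lines):
    for line in lines:
        yield line.strip()


def circular_shift(lines):
    for line in lines:
        words = line.split()
        for i in range(len(words)):
            yield " ".join(words[i:] + words[:i])


def omit_noise(shifts):
    """Optional filter: drop shifts whose first word is a noise word."""
    for shift in shifts:
        if shift.split()[0].lower() not in NOISE_WORDS:
            yield shift


def alphabetize(shifts):
    # Sorting needs the whole stream, so this filter buffers all of its input.
    yield from sorted(shifts, key=str.lower)


def pipeline(source, *filters):
    """Connect the filters in order and collect the final output."""
    stream = source
    for f in filters:
        stream = f(stream)
    return list(stream)
```

For example, `pipeline(["the cat sat"], read_input, circular_shift, omit_noise, alphabetize)` adds the noise filter without changing any other filter. The strictly sequential data flow is what makes interactive use awkward.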
Pipes and Filters
5 criteria to judge each design on:
1. Change the overall algorithm?
2. Changes to data representation?
3. Extend the program functionality?
4. Efficient use of space and time?
5. Can the code be easily re-used?