03 text editor

Upload: globalhodmech

Post on 15-Oct-2015

  • 5/25/2018 03 Text Editor

    1/91

    TEXT EDITOR, MAMCET, 2008 IT

    1

    M.A.M. COLLEGE OF ENGINEERING AND TECHNOLOGY

    TRICHY -621105

    DEPARTMENT OF INFORMATION TECHNOLOGY

    ANNA UNIVERSITY PRACTICAL EXAMINATIONS, OCT2011

    RECORD NOTE BOOK

    CS1403 - SOFTWARE DEVELOPMENT LABORATORY

    Student Name :

    Register Number :

    Class & Semester : 2008-2012, B.Tech., IT, VII - Semester

    Month & Year : OCT - 2011

    M.A.M. COLLEGE OF ENGINEERING AND TECHNOLOGY

    TRICHY -621105

    DEPARTMENT OF INFORMATION TECHNOLOGY

    ANNA UNIVERSITY PRACTICAL EXAMINATIONS, OCT2011

    BONAFIDE CERTIFICATE

    This is to certify that this practical work titled CS1403 Software Development

    Laboratory is the bonafide work of ________________________________

    Register Number ______________________ during the academic year 2011-2012.

    Faculty Incharge Head of the Department

    Submitted for Anna University of Technology, Tiruchirappalli practical

    examination held on ____/___/___ at M.A.M. College of Engineering and

    Technology, Trichy 621105.

    Internal Examiner External Examiner

    ABSTRACT

    The data structure used to maintain the sequence of characters is an

    important part of a text editor. This paper investigates and evaluates the range of

    possible data structures for text sequences. The ADT interface to the text sequence

    component of a text editor is examined. Six common sequence data structures

    (array, gap, list, line pointers, fixed size buffers and piece tables) are examined and

    then a general model of sequence data structures that encompasses all six

    structures is presented. The piece table method is explained in detail and its

    advantages are presented.

    The design space of sequence data structures is examined and several

    variations on the ones listed above are presented. These sequence data structures

    are compared experimentally and evaluated based on a number of criteria. The

    experimental comparison is done by implementing each data structure in an editing

    simulator and testing it using a synthetic load of many thousands of edits. We also

    report on experiments on the sensitivity of the results to variations in the

    parameters used to generate the synthetic editing load.

    TABLE OF CONTENTS

    BONAFIDE CERTIFICATE

    ABSTRACT

    Chapter 1 INTRODUCTION

    Chapter 2 SEQUENCE INTERFACES

    2.1 The Sequence Abstract Data Type

    2.2 The C Interface

    Chapter 3 COMPARING SEQUENCE DATA STRUCTURES

    Chapter 4 DEFINITIONS

    Chapter 5 BASIC SEQUENCE DATA STRUCTURES

    5.1 The array method (one span in one buffer)

    5.2 The gap method (two spans in one buffer)

    5.3 The linked list method

    Chapter 6 RECURSIVE SEQUENCE DATA STRUCTURES

    6.1 Determination of span and buffer sizes

    6.2 The line span method (one span per line)

    6.3 Fixed size buffers

    Chapter 7 A GENERAL MODEL OF SEQUENCE DATA STRUCTURES

    7.1 The design space for sequence data structures

    Chapter 8 EXPERIMENTAL COMPARISON OF SEQUENCE DATA STRUCTURES

    8.1 Discussion of the timing results

    Chapter 9 COMPARISON OF SEQUENCE DATA STRUCTURES

    9.1 Basic sequence data structures

    9.2 Recursive sequence data structures

    Chapter 10 CONCLUSIONS AND RECOMMENDATIONS

    REFERENCES

    APPENDICES

    1 Sample Screen Shots

    2 Sample Source Code

    3 ASCII Chart

    Chapter 1

    INTRODUCTION

    The central data structure in a text editor is the one that manages the

    sequence of characters that represents the current state of the file that is being

    edited. Every text editor requires such a data structure but books on data structures

    do not cover data structures for text sequences. Articles on the design of text

    editors often discuss the data structure they use but they do not cover the area in a

    general way. This article is concerned with such data structures.

    Figure 1: Ordered sets

    Figure 1 shows where sequence data structures fit in with other data

    structures. Some ordered sets are ordered by something intrinsic in the items in the

    sets (e.g., the value of an integer, the lexicographic position of a string) and the

    position of an inserted item depends on its value and the values of the items

    already in the set. Such data structures are mainly concerned with fast searching.

    Data structures for this type of ordered set have been studied extensively.

    The other possibility is for the order to be determined by where the items are

    placed when they are inserted into the set. If insert and delete are restricted to the

    two ends of the ordering then you have a deque. For deques, the two basic data

    structures are an array (used circularly) and a linked list. Nothing beyond this is

    necessary due to the simplicity of the ADT interface to deques. If you can insert

    and delete items from anywhere in the ordering you have a sequence. An important

    subclass is sequences where reading an item in the sequence (by position number)

    is extremely localized.

    This is the case for text editors and it is this subclass that is examined in this

    paper. A linked list and an array are the two obvious data structures for a sequence.

    Neither is suitable for a general purpose text editor (a linked list takes up too much

    memory and an array is too slow because it requires too much data movement) but

    they provide useful base cases on which to build more complex sequence data

    structures. The gap method is a simple extension of an array; it is simply an array

    with a gap in the middle where characters are inserted and deleted. Many text

    editors use the gap method since it is simple and quite efficient but the demands on

    a modern text editor (multiple files, very large files, structured text, sophisticated

    undo, virtual memory, etc.) encourage the investigation of more complicated data

    structures which might handle these things more effectively.

    The more sophisticated sequence data structures keep the sequence as a

    recursive sequence of spans of text. The line span method keeps each line together

    and keeps an array or a linked list of line pointers. The fixed buffer method keeps a

    linked list of fixed size buffers each of which is partially full of text from the

    sequence. Both the line span method and the fixed buffer method have been used

    for many text editors.

    A less commonly used method is the piece table method which keeps the

    text as a sequence of "pieces" of text from either the original file or an "added

    text" file. This method has many advantages and these will become clear as the

    methods are presented in detail and analyzed. A major purpose of this paper is to

    describe the piece table method and explain why it is a good data structure for text

    sequences.

    Looking at methods in detail suggests a general model of sequence data

    structures that subsumes them all. Based on examination of this general model I

    will propose several new sequence data structures that do not appear to have been

    tried before.

    It is difficult to analyze these algorithms mathematically so an experimental

    approach was taken. I have implemented a text editor simulator and a number of

    sequence data structures. Using this as an experimental test bed I have compared

    the performance of each of these sequence data structures under a variety of

    conditions. Based on these experiments and other considerations I conclude with

    recommendations on which sequence data structures are best in various situations.

    In almost all cases, either the gap method or the piece table method is the best data

    structure.

    Chapter 2

    SEQUENCE INTERFACES

    It is useful to start out with a definition of a sequence and the interface to it.

    Since the more sophisticated text sequence data structures are recursive in that they

    require a component data structure that maintains a sequence of pointers, I will

    formulate the sequence as a general sequence of "items" rather than as a sequence

    of characters. This supports discussion of the recursive sequence data structures

    better.

    2.1 The Sequence Abstract Data Type

    A text editor maintains a sequence of characters, by implementing some

    variant of the abstract data type Sequence. One definition of the Sequence abstract

    data type is:

    Domains:

    Item: the data type that this is a sequence of (in most cases it will be an

    ASCII character).

    Sequence: an ordered set of Items.

    Position: an index into the Sequence which identifies the Items (in this case

    it will be a natural number from 0 to the length of the Sequence minus one).

    Syntax:

    Empty: -> Sequence

    Insert: Sequence x Position x Item -> Sequence

    Delete: Sequence x Position -> Sequence

    ItemAt: Sequence x Position -> Item U {EndOfFile}

    Types:

    s : Sequence; i : Item; p, p1, p2 : Position;

    Axioms:

    1. Delete(Empty, p) = Empty

    2. Delete(Insert(s, p1, i), p2) =

    if p1 < p2 then Insert(Delete(s, p2-1), p1, i)

    if p1 = p2 then s

    if p1 > p2 then Insert(Delete(s, p2), p1-1, i)

    3. ItemAt(Empty, p) = EndOfFile

    4. ItemAt(Insert(s, p1, i), p2) =

    if p1 < p2 then ItemAt(s, p2 -1)

    if p1 = p2 then i

    if p1 > p2 then ItemAt(s, p2)

    The definition of a Sequence is relatively simple. Axiom 1 says that deleting

    from an Empty Sequence is a no-op. This could be considered an error. Axiom 2

    allows the reduction of a Sequence of Inserts and Deletes to a Sequence containing

    only Inserts. This defines a canonical form of a Sequence which is a Sequence of

    Inserts on an initial Empty Sequence. Axiom 3 implies that reading outside the

    Sequence returns a special EndOfFile item. This also could have been an error.

    Axiom 4 defines the semantics of a Sequence by defining what is at each position

    of a canonical Sequence.
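
    The axioms can be checked mechanically on a toy implementation. The following is
    a minimal sketch, not part of the original text: it assumes a small fixed-capacity
    sequence passed by value so that the functional style of the axioms carries over
    directly. The names Seq, SeqEqual, SEQ_MAX and END_OF_FILE are illustrative
    assumptions.

```c
#include <assert.h>
#include <string.h>

enum { SEQ_MAX = 32, END_OF_FILE = -1 };   /* hypothetical limits and EOF marker */

/* A tiny value-type sequence: copying the struct copies the whole sequence. */
typedef struct {
    int length;
    unsigned char items[SEQ_MAX];
} Seq;

Seq Empty(void)
{
    Seq s;
    memset(&s, 0, sizeof s);
    return s;
}

/* Returns a new sequence with item i placed at position p. */
Seq Insert(Seq s, int p, unsigned char i)
{
    assert(p >= 0 && p <= s.length && s.length < SEQ_MAX);
    memmove(s.items + p + 1, s.items + p, (size_t)(s.length - p));
    s.items[p] = i;
    s.length++;
    return s;
}

/* Returns a new sequence without the item at position p.
   Out-of-range deletes are a no-op, in the spirit of Axiom 1. */
Seq Delete(Seq s, int p)
{
    if (p < 0 || p >= s.length) return s;
    memmove(s.items + p, s.items + p + 1, (size_t)(s.length - p - 1));
    s.length--;
    return s;
}

/* Axiom 3: reading outside the sequence yields END_OF_FILE. */
int ItemAt(Seq s, int p)
{
    return (p >= 0 && p < s.length) ? s.items[p] : END_OF_FILE;
}

int SeqEqual(Seq a, Seq b)
{
    return a.length == b.length &&
           memcmp(a.items, b.items, (size_t)a.length) == 0;
}
```

    With these definitions, Axiom 2 becomes an equality that can be tested for every
    pair of positions, for example SeqEqual(Delete(Insert(s, 1, 'x'), 3),
    Insert(Delete(s, 2), 1, 'x')) for a three-item sequence s.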

    2.2 The C Interface

    How would this translate into C? First some type definitions:

    typedef int ReturnCode; /* 1 for success, zero or negative for failure */

    typedef int Position; /* a position in the sequence */

    /* the first item in the sequence is at position 0 */

    typedef unsigned char Item; /* they are sequences of eight bit bytes */

    typedef struct {

    /* To be determined */

    /* whatever information we need for the data structures we choose */

    } Sequence;

    In this interface the only operations that change the Sequence are Insert and

    Delete. I am ignoring the error of inserting beyond the end of the existing

    sequence.

    Sequence Empty( );

    ReturnCode Insert( Sequence *sequence, Position position, Item ch );

    ReturnCode Delete( Sequence *sequence, Position position );

    Item ItemAt( Sequence *sequence, Position position ); - This does not

    actually require a pointer to a Sequence since no change to the sequence is

    being made, but we expect that Sequences will be large structures and we should

    not be passing them around by value. I am ignoring error returns (e.g., position

    out of range) for the purposes of this discussion. These are easily added if desired.

    ReturnCode Close( Sequence *sequence );

    Many variations are possible. The next few paragraphs discuss some of them.

    Any practical interface would allow the sequence to be initialized with the

    contents of a file. In theory this is just the Empty operation followed by an Insert

    operation for each character in the initializing file. Of course, this is too inefficient

    for a real text editor. Instead we would have a NewSequence operation:

    Sequence NewSequence( char *file_name ); - The sequence is initialized

    with the contents of the file whose name is contained in 'file_name'.

    Usually the Delete operation will delete any logically contiguous subsequence.

    ReturnCode Delete( Sequence *sequence, Position beginPosition, Position endPosition );

    Sometimes the Insert operation will insert a subsequence instead of just a single

    character.

    ReturnCode Insert( Sequence *sequence, Position position, Sequence sequenceToInsert );

    Sometimes Copy and Move are separate operations (instead of being composed of

    Inserts and Deletes).

    ReturnCode Copy( Sequence *sequence, Position fromBegin, Position fromEnd, Position toPosition );

    ReturnCode Move( Sequence *sequence, Position fromBegin, Position fromEnd, Position toPosition );

    A Replace operation that subsumes Insert and Delete is another possibility.

    ReturnCode Replace( Sequence *sequence, Position fromBegin, Position fromEnd, Sequence sequenceToReplaceItWith );

    Finally the ItemAt procedure could retrieve a subsequence. This is the method I use in

    my text editor simulator described later.

    ReturnCode SequenceAt( Sequence *sequence, Position fromBegin, Position fromEnd, Sequence *returnedSequence );

    These variations will not affect the basic character of the data structure used to

    implement the sequence or the comparisons between them that follow. Therefore I

    will assume the first interface (Empty, Insert, Delete, ItemAt, and Close).

    Chapter 3

    COMPARING SEQUENCE DATA STRUCTURES

    In order to compare sequence data structures it is necessary to know the relative

    frequency of the five basic operations in typical editing.

    The NewSequence operation is infrequent. Most sequence data structures

    require the NewSequence operation to scan the entire file and convert it to some

    internal form. This can be a problem with very large files. To look through a very

    large file, it must be read in any case but if the sequence data structure does not

    require sequence preprocessing it can interleave the file reading with text editor

    use rather than requiring the user to wait for the entire file to be read before editing

    can begin. This is an advantage and means the user does not have to worry about

    how many files are loaded or how large they are, since the cost of reading the large

    file is amortized over its use.

    The Close operation is also infrequent and not normally an important factor

    in comparing sequence data structures.

    The ItemAt operation will, of course, be very frequent, but it will also be

    very localized as well because characters in a text editor are almost always

    accessed sequentially. When the screen is drawn the characters on the screen are

    accessed sequentially beginning with the first character on the screen. If the user

    pages forward or backwards the characters are accessed sequentially. Searches

    normally begin at the current position and proceed forward or backward

    sequentially. Overall, there is almost no random access in a text editor. This means

    that, although the ItemAt operation is very frequent, its efficiency in the general

    case is not significant. Only the case where a character is requested that is

    sequentially before or after the last character accessed needs to be optimized.

    To test this hypothesis I instrumented a text editor and looked at the ItemAt

    operations in a variety of text editing situations. The number of ItemAt calls that

    were sequential (one away from the last ItemAt) was always above 97% and

    usually above 99%. The number of ItemAts that were within 20 items of the

    previous ItemAt was always over 99%. The average number of characters from

    one ItemAt to the next was typically around 2 but sometimes would go up to as

    high as 10 or 15.

    Overall these experiments verify that the positions of ItemAt calls are

    extremely localized.
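
    The kind of instrumentation described above can be sketched as a small set of
    counters updated on every ItemAt call. This is only an illustration under stated
    assumptions: the type ItemAtStats and the functions StatsInit, StatsRecord and
    StatsSequentialFraction are hypothetical names, not the instrumentation actually
    used for the measurements.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical counters describing how localized ItemAt calls are. */
typedef struct {
    long calls;       /* ItemAt calls that had a predecessor to compare with */
    long sequential;  /* calls exactly one position away from the previous call */
    long within20;    /* calls within 20 positions of the previous call */
    long distance;    /* sum of absolute distances between successive calls */
    int  last;        /* position of the previous call, -1 if there was none */
} ItemAtStats;

void StatsInit(ItemAtStats *st)
{
    st->calls = 0;
    st->sequential = 0;
    st->within20 = 0;
    st->distance = 0;
    st->last = -1;
}

/* Call this from ItemAt with the requested position. */
void StatsRecord(ItemAtStats *st, int position)
{
    assert(position >= 0);
    if (st->last >= 0) {
        long d = labs((long)position - st->last);
        st->calls++;
        st->distance += d;
        if (d == 1) st->sequential++;
        if (d <= 20) st->within20++;
    }
    st->last = position;
}

/* Fraction of calls that were sequential, 0.0 if nothing was recorded. */
double StatsSequentialFraction(const ItemAtStats *st)
{
    return st->calls ? (double)st->sequential / (double)st->calls : 0.0;
}
```

    Dividing distance by calls gives the average number of characters from one
    ItemAt to the next, the third quantity reported above.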

    Thus sequence data structures should be optimized for edits (inserts and

    deletes) and sequential character access. Since caching can be used with most data

    structures to make sequential access fast, this means that the efficiency of inserts and

    deletes is paramount. With regard to Inserts and Deletes one would assume that

    there would be more Inserts than Deletes but that there would be a mix between

    them in typical text editing.

    In all sequence data structures, ItemAt is much faster than Inserts and

    Deletes. If we consider typical text editing, the display is changed after each Insert

    and Delete and this requires some number of ItemAts, often just a few characters

    around the edit but possibly the whole rest of the window (if a newline is inserted

    or deleted).

    The criteria used for comparing sequence data structures are:

    - The time taken by each operation

    - The paging behavior of each operation

    - The amount of space used by the sequence data structure

    - How easily it fits in with typical file and IO systems

    - The complexity of (and space taken by) the implementation

    Later I will present timings comparing the basic operations for a range of

    sequence data structures. These timings will be taken from example

    implementations of the data structures and a text editor simulator that calls

    these implementations.

    Chapter 4

    DEFINITIONS

    An item is the basic element. Usually it will be a character. A sequence is an

    ordered set of items. Sequential items in a sequence are said to be logically

    contiguous. The sequence data structure will keep the items of the sequence in

    buffers. A buffer is a set of sequentially addressed memory locations. A buffer contains

    items from the sequence but not necessarily in the same order as they appear

    logically in the sequence. Sequentially addressed items in a buffer are physically

    contiguous.

    When a string of items is physically contiguous in a buffer and is also

    logically contiguous in the sequence we call them a span. A descriptor is a pointer

    to a span. In some cases the buffer is actually part of the descriptor and so no

    pointer is necessary. This variation is not important to the design of the data

    structures but is more a memory management issue.

    Sequence data structures keep spans in buffers and keep enough information

    (in terms of descriptors and sequences of descriptors) to piece together the spans to

    form the sequence. Buffers can be kept in memory but most sequence data

    structures allow buffers to get as large as necessary or allow an unlimited number

    of buffers. Thus it is necessary to keep the buffers on disk in disk files. Many

    sequence data structures use buffers of unlimited size, that is, their size is

    determined by the file contents. This requires the buffer to be a disk file. With

    enough disk block caching this can be made as fast as necessary.

    The concepts of buffers, spans and descriptors can be found in almost every

    sequence data structure. Sequence data structures vary in terms of how these

    concepts are used.

    If a sequence data structure uses a variable number of descriptors it requires

    a recursive sequence data structure to keep track of the sequence of descriptors. In

    section 5 we will look at three sequence data structures that use a fixed number of

    descriptors and in section 6 we will look at three sequence data structures that use a

    variable number of descriptors. Section 7 will present a general model of a

    sequence data structure that encompasses all these data structures.

    Chapter 5

    BASIC SEQUENCE DATA STRUCTURES

    All three of the sequence data structures in this section use a fixed number (either

    one, two or three) of external descriptors.

    5.1 The array method (one span in one buffer)

    This is the most obvious sequence data structure. In this method there is one

    span and one buffer. Two descriptors are needed: one for the span and one for the

    buffer. (See Figure 2.) The buffer contains the items of the sequence in physically

    contiguous order. Deletes are handled by moving all the items following the

    deleted item to fill in the hole left by the deleted item. Inserts are handled by

    moving all the items that will follow the item to be inserted in order to make room

    for the new item. ItemAt is an array reference. The buffer can be extended as much

    as necessary to hold the data.

    Figure 2: The array method

    Clearly this would not be an efficient data structure if a lot of editing was to

    be done on large files. It is a useful base case and is a reasonable choice in

    situations where few inserts and deletes are made (e.g., a read-only sequence) or

    the sequences are relatively small (e.g., a one-line text editor).

    This data structure is sometimes used to hold the sequence of descriptors in

    the more complex sequence data structure, for example, an array of line pointers

    (see section 6).
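
    The array method described above can be sketched in a few lines of C. This is a
    minimal illustration under assumptions of my own: the names ArraySeq, ArrayInit,
    ArrayInsert, ArrayDelete and ArrayItemAt are hypothetical, the buffer is grown by
    doubling, and -1 stands in for EndOfFile.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char Item;

/* One buffer holding one span: items 0..length-1 are the sequence. */
typedef struct {
    Item  *buffer;    /* descriptor for the buffer */
    size_t length;    /* length of the single span */
    size_t capacity;  /* allocated size of the buffer */
} ArraySeq;

int ArrayInit(ArraySeq *s)
{
    s->length = 0;
    s->capacity = 16;
    s->buffer = malloc(s->capacity);
    return s->buffer != NULL;
}

/* Insert: move every following item up by one to make room. */
int ArrayInsert(ArraySeq *s, size_t pos, Item ch)
{
    if (pos > s->length) return 0;
    if (s->length == s->capacity) {          /* extend the buffer as needed */
        Item *bigger = realloc(s->buffer, s->capacity * 2);
        if (bigger == NULL) return 0;
        s->buffer = bigger;
        s->capacity *= 2;
    }
    memmove(s->buffer + pos + 1, s->buffer + pos, s->length - pos);
    s->buffer[pos] = ch;
    s->length++;
    assert(s->length <= s->capacity);
    return 1;
}

/* Delete: move every following item down by one to fill the hole. */
int ArrayDelete(ArraySeq *s, size_t pos)
{
    if (pos >= s->length) return 0;
    memmove(s->buffer + pos, s->buffer + pos + 1, s->length - pos - 1);
    s->length--;
    return 1;
}

/* ItemAt is a plain array reference; -1 stands in for EndOfFile. */
int ArrayItemAt(const ArraySeq *s, size_t pos)
{
    return pos < s->length ? s->buffer[pos] : -1;
}
```

    The two memmove calls are the data movement that makes this method slow for
    edits near the start of a large sequence.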

    5.2 The gap method (two spans in one buffer)

    The gap method is only a little more complex than the array method but it is much

    more efficient. In this method you have one large buffer that holds all the text but

    there is a gap in the middle of the buffer that does not contain valid text. (See

    Figure 3.) Three descriptors are needed: one for each span and one for the gap. The

    gap will be at the text "cursor" or "point" where all text editing operations take

    place. Inserts are handled by using up one of the places in the gap and incrementing

    the length of the first descriptor (or decrementing the begin pointer of the second

    descriptor). Deletes are handled by decrementing the length of the first descriptor (or

    incrementing the begin pointer of the second descriptor). ItemAt is a test (first or

    second span?) and an array reference.

    Figure 3: The gap (or two span) method

    When the cursor moves the gap is also moved so if the cursor moves 100

    items forward then 100 items have to be moved from the second span to the first

    span (and the descriptors adjusted). Since most cursor moves are local, not that

    many items have to be moved in most cases.

    Actually the gap does not need to move every time the cursor is moved.

    When an editing operation is requested then the gap is moved. This way moving

    the cursor around the file while paging or searching will not cause unnecessary gap

    moves.

    If the gap fills up, the second span is moved in the buffer to create a new

    gap. There must be an algorithm to determine the new gap size. As in the array

    case, the buffer can be extended to any length. In practice, it is usually increased in

    size by some fixed increment or by some fixed percentage of the current buffer

    size and this becomes the size of the new gap. With virtual memory we can make

    the buffer so large that it is unlikely to fill up. And with some help from the

    operating system, we can expand the gap without actually moving any data. This

    method does use up large quantities of virtual address space, however.

    This method is simple and surprisingly efficient. The gap method is also called the

    split buffer method and is discussed in [9].
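
    A minimal sketch of the gap method follows, with the gap moved only when an edit
    is requested, as described above. The names GapSeq, GapInit, GapMove, GapInsert,
    GapDelete and GapItemAt are illustrative assumptions, -1 stands in for EndOfFile,
    and growing the buffer when the gap fills up is left out to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* First span is buffer[0..gapStart), second span is buffer[gapEnd..size). */
typedef struct {
    unsigned char *buffer;
    size_t size;      /* total buffer size */
    size_t gapStart;  /* also the length of the first span */
    size_t gapEnd;    /* begin pointer of the second span */
} GapSeq;

int GapInit(GapSeq *g, size_t size)
{
    g->buffer = malloc(size);
    g->size = size;
    g->gapStart = 0;
    g->gapEnd = size;
    return g->buffer != NULL;
}

size_t GapLength(const GapSeq *g)
{
    return g->size - (g->gapEnd - g->gapStart);
}

/* Move the gap so that it begins at logical position pos. */
static void GapMove(GapSeq *g, size_t pos)
{
    assert(pos <= GapLength(g));
    if (pos < g->gapStart) {            /* shift items from first span to second */
        size_t n = g->gapStart - pos;
        memmove(g->buffer + g->gapEnd - n, g->buffer + pos, n);
        g->gapStart -= n;
        g->gapEnd -= n;
    } else if (pos > g->gapStart) {     /* shift items from second span to first */
        size_t n = pos - g->gapStart;
        memmove(g->buffer + g->gapStart, g->buffer + g->gapEnd, n);
        g->gapStart += n;
        g->gapEnd += n;
    }
}

int GapInsert(GapSeq *g, size_t pos, unsigned char ch)
{
    if (pos > GapLength(g) || g->gapStart == g->gapEnd) return 0; /* no growth here */
    GapMove(g, pos);
    g->buffer[g->gapStart++] = ch;      /* use up one place in the gap */
    return 1;
}

int GapDelete(GapSeq *g, size_t pos)
{
    if (pos >= GapLength(g)) return 0;
    GapMove(g, pos);                    /* item to delete now starts the second span */
    g->gapEnd++;                        /* increment the second span's begin pointer */
    return 1;
}

/* A test (first or second span?) and an array reference. */
int GapItemAt(const GapSeq *g, size_t pos)
{
    if (pos >= GapLength(g)) return -1;
    return pos < g->gapStart ? g->buffer[pos]
                             : g->buffer[pos + (g->gapEnd - g->gapStart)];
}
```

    Repeated inserts at the same position never call memmove after the first one,
    which is why typing at the cursor is cheap with this method.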

    5.3 The linked list method

    At the other extreme is the linked list method which has a span for each item

    in the sequence and the span is kept in the descriptor. This requires a large (and file

    content dependent) number of descriptors which must be sequenced. If we link the

    descriptors together into a linked list then we really only have to remember one

    descriptor, the head of the list. (See Figure 4)

    Inserts and Deletes are fast and easy (just move some pointers) but ItemAt is

    more expensive than the array or gap methods since several pointers may have to

    be followed.

    Figure 4: The linked list method

    The linked list method uses a lot of extra space and so is not appropriate for

    a large sequence but is frequently used as a way of implementing a sequence of

    descriptors required in the more complex sequence data structures. In fact, it is the

    most common method for that. One could think of the array method as a special

    case of the linked list method. An array is really a linked list with the links

    implicit, that is, the link is computed to be the next physically sequential address

    after the current item. In this view, the linked list method is the only primitive

    sequence data type. The array method is a special case of the linked list method and the

    gap method is a variation on the array method.
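
    The linked list method can be sketched as follows, with each node acting as a
    descriptor that holds its one-item span directly. The names Node, ListSeq,
    ListLinkAt, ListInsert, ListDelete and ListItemAt are assumptions made for
    illustration, and -1 stands in for EndOfFile.

```c
#include <assert.h>
#include <stdlib.h>

/* One descriptor per item; the span (a single item) lives in the descriptor. */
typedef struct Node {
    unsigned char item;
    struct Node *next;
} Node;

/* Only one descriptor has to be remembered: the head of the list. */
typedef struct {
    Node *head;
} ListSeq;

/* Follow pos links and return the address of the link that points at
   position pos, or NULL if pos is beyond the end of the list. */
static Node **ListLinkAt(ListSeq *s, size_t pos)
{
    assert(s != NULL);
    Node **link = &s->head;
    while (pos > 0 && *link != NULL) {
        link = &(*link)->next;
        pos--;
    }
    return pos == 0 ? link : NULL;
}

/* Insert is just pointer movement once the link has been found. */
int ListInsert(ListSeq *s, size_t pos, unsigned char ch)
{
    Node **link = ListLinkAt(s, pos);
    if (link == NULL) return 0;
    Node *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->item = ch;
    n->next = *link;
    *link = n;
    return 1;
}

int ListDelete(ListSeq *s, size_t pos)
{
    Node **link = ListLinkAt(s, pos);
    if (link == NULL || *link == NULL) return 0;
    Node *dead = *link;
    *link = dead->next;
    free(dead);
    return 1;
}

/* ItemAt has to follow pos pointers, unlike the array and gap methods. */
int ListItemAt(ListSeq *s, size_t pos)
{
    Node **link = ListLinkAt(s, pos);
    return (link != NULL && *link != NULL) ? (*link)->item : -1;
}
```

    On a typical 64-bit system each one-byte item costs a 16-byte node in this
    sketch, which illustrates the space overhead noted above.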

    Chapter 6

    RECURSIVE SEQUENCE DATA STRUCTURES

    In this section three more sequence data structures are presented. Each of

    these methods requires a variable number of descriptors and so must recursively

    use a (usually simpler) sequence data structure to implement the sequence of

    descriptors.

    6.1 Determination of span and buffer sizes

    The basic cases either used a fixed number of spans (one for the array

    method and two for the gap method) or spans of size 1 (the linked list method).

    The recursive methods described in this section use a variable number of spans.

    How is the span size determined? There are two possibilities: either the spans are

    determined by the contents of the file or by the text editor independent of the

    sequence contents. The main example of content determined spans is to have one

    span per logical line of text.

    If the editor is allowed to determine span sizes to its own best advantage the

    second issue is the size of the buffers. Again there are two possibilities: fixed size

    or variable size. Fixed size buffers are easier to manage and are often chosen to be

    some (low) multiple of the page size or the disk block size. A fixed size buffer

    implies a maximum span size (that is, the buffer size).

    This table shows the possibilities:

                                Fixed size buffers      Variable size buffers

    Content determined spans    Line size is limited    Line spans

    Editor determined spans     Fixed size buffers      Piece tables

    These are the three methods that will be examined in this section.

    6.2 The line span method (one span per line)

    Since most text editors do many operations on lines it seems reasonable to use a

    data structure that is line oriented so line operations can be handled efficiently. The

    line span method uses a descriptor for each line. Often one large buffer is used to

    hold all the line spans. New lines are appended to the end of the used part of the

    buffer. See Figure 5.

    Figure 5: The line spans method

    Line deletes are handled by deleting the line descriptor. Deleting characters

    within a line involves moving the rest of the characters in the line up to fill the gap.

    Since any one line is probably not that long this is reasonably efficient.

    Line inserts are handled by adding a line descriptor. Inserting characters in a

    line involves copying the initial part (before the insert) of the line buffer to new

    space allocated for the line, adding the characters to be inserted, copying the rest of

    the line and pointing the descriptor at the new line. Multiple line inserts and deletes

    are combinations of these operations. Caching can make this all fairly efficient.

    Usually new space is allocated at the end of the buffer and the space occupied by

    deleted or changed lines is not reused since the effort of dynamic memory

    allocation is not worth the trouble. A disk file that continues to grow at the end can

    be handled quite efficiently by most file systems.

    This method uses as many descriptors as there are lines in the file, that is, a

    variable number of descriptors hence there is a recursive problem of keeping these

    descriptors in a sequence. Typically one of the basic methods described in section

    5 is used to maintain the sequence of line descriptors.

    These simpler methods are acceptable since the number of line descriptors is

    much smaller than the number of characters in the file being edited. The linked list

    method allows efficient insertions and deletions but requires the management of

    list nodes. This method is acceptable for a line oriented text editor but is not as

    common these days since strict line orientation is seen as too restrictive. It does

    require preprocessing of the entire buffer before you can begin editing since the

    line descriptors have to be set up.
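
    The line span operations described above can be sketched as follows. This is an
    illustration under assumptions of my own: the buffer is an in-memory array with
    arbitrary limits rather than a disk file, the sequence of line descriptors uses
    the array method, and the names LineDesc, LineSeq, LineInit, LineAppend,
    LineInsertChar and LineDelete are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum { TEXT_MAX = 4096, LINES_MAX = 64 };   /* arbitrary limits for the sketch */

/* Descriptor for one line span inside text[]. */
typedef struct {
    size_t start;
    size_t length;
} LineDesc;

typedef struct {
    char     text[TEXT_MAX];    /* one large buffer; space is only ever appended */
    size_t   used;              /* end of the used part of the buffer */
    LineDesc lines[LINES_MAX];  /* sequence of descriptors (array method) */
    size_t   count;
} LineSeq;

void LineInit(LineSeq *s)
{
    s->used = 0;
    s->count = 0;
}

/* Add a new line at the end of the sequence; its text goes at the end of the buffer. */
int LineAppend(LineSeq *s, const char *line)
{
    size_t n = strlen(line);
    if (s->count == LINES_MAX || s->used + n > TEXT_MAX) return 0;
    memcpy(s->text + s->used, line, n);
    s->lines[s->count].start = s->used;
    s->lines[s->count].length = n;
    s->count++;
    s->used += n;
    return 1;
}

/* Insert a character in a line: copy the initial part, the new character and
   the rest of the line to fresh space, then point the descriptor at the copy. */
int LineInsertChar(LineSeq *s, size_t line, size_t col, char ch)
{
    if (line >= s->count) return 0;
    LineDesc old = s->lines[line];
    if (col > old.length || s->used + old.length + 1 > TEXT_MAX) return 0;
    char *dst = s->text + s->used;
    memcpy(dst, s->text + old.start, col);
    dst[col] = ch;
    memcpy(dst + col + 1, s->text + old.start + col, old.length - col);
    s->lines[line].start = s->used;
    s->lines[line].length = old.length + 1;
    s->used += old.length + 1;          /* the old copy is not reused */
    assert(s->used <= TEXT_MAX);
    return 1;
}

/* Delete a whole line by deleting its descriptor; the text stays in the buffer. */
int LineDelete(LineSeq *s, size_t line)
{
    if (line >= s->count) return 0;
    memmove(&s->lines[line], &s->lines[line + 1],
            (s->count - line - 1) * sizeof(LineDesc));
    s->count--;
    return 1;
}
```

    Note that deleting a line touches only the descriptor array, while inserting a
    single character copies the whole line, which matches the costs discussed above.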

    6.3 Fixed size buffers

    In the line spans method the partitioning of the sequence into spans is

    determined by the contents of the text file, in particular the line structure. Another

    approach is for the text editor to decide the partitioning of the sequence into spans

    based on the efficiency of buffer use. Since fixed size blocks are more efficient to

    deal with, the file can be divided into a sequence of fixed size buffers. If the buffers

    were required to be always full, a great deal of time would be spent rearranging

    data in the buffers. Therefore, each block has a maximum size but will usually

    contain fewer actual items from the sequence so there is room for inserts and

    deletes, which will usually affect only one buffer. (See Figure 6.)

    Figure 6: Fixed size buffers

    The disk block size (or some multiple of the disk block size) is usually the

    most efficient choice for the fixed size buffers since then the editor can do its own

    disk management more easily and not depend on the virtual memory system or the

    file system for efficient use of the disk.

    Usually a lower bound on the number of items in a buffer is set (half the

    buffer size is a common choice). This requires moving items between buffers and

    occasionally merging two buffers to prevent the accumulation of large numbers of

    buffers. There are four problems with letting too many buffers accumulate:

    space is wasted in the buffers, the recursive sequence of descriptors gets too large, the probability that an edit will be confined to one buffer is reduced, and ItemAt caching is less effective.

    As an example of fixed size buffers, suppose disk blocks are 4K bytes long.

    Each buffer will be 4K bytes long and will contain a span of length from 2K to 4K

    bytes. Each buffer is handled using the array method, that is, inserts and deletes are

    done by moving the items up or down in the buffer. Typically an edit will only

    affect one buffer but if a buffer fills up it is split into two buffers and if the span in

    a buffer falls below 2K bytes then items are moved into it from an adjacent buffer

    or it is coalesced with an adjacent buffer.

    Each descriptor points to a buffer and contains the length of the span in the

    buffer. The fixed size buffer method also requires a recursive sequence for the

    descriptors. This could be any of the basic methods but most examples from the

    literature use a linked list of descriptors. The loose packing allows small changes

    to be made within the buffers and the fact that the buffers are linked makes it easy

    to add and delete buffers.
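
    The splitting behaviour described above can be sketched in C. This is a minimal sketch under several assumptions, not an implementation from the literature: the names FsbNode, fsb_insert and fsb_item_at are hypothetical, each descriptor is stored in the same node as its buffer, the capacity is 8 items instead of 4K so the split is easy to follow, and the moving and coalescing of underfull buffers is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_CAP 8   /* tiny for illustration; the example above uses 4K */

/* Hypothetical node: one fixed size buffer plus its descriptor fields. */
typedef struct FsbNode {
    char data[BUF_CAP];
    size_t used;              /* length of the span in this buffer */
    struct FsbNode *next;
} FsbNode;

/* Allocate an empty buffer; returns NULL on allocation failure. */
FsbNode *fsb_new(void)
{
    return calloc(1, sizeof(FsbNode));
}

/* Insert ch at logical position pos (0 <= pos <= sequence length).
   Returns 0 on success, -1 on a bad position or allocation failure. */
int fsb_insert(FsbNode *head, size_t pos, char ch)
{
    FsbNode *n = head;

    assert(head != NULL);
    /* Find the buffer that holds pos; stop at the last buffer. */
    while (pos > n->used && n->next != NULL) {
        pos -= n->used;
        n = n->next;
    }
    if (pos > n->used)
        return -1;

    if (n->used == BUF_CAP) {
        /* The buffer is full: split it into two half-full buffers. */
        FsbNode *right = fsb_new();
        size_t half = BUF_CAP / 2;

        if (right == NULL)
            return -1;
        memcpy(right->data, n->data + half, BUF_CAP - half);
        right->used = BUF_CAP - half;
        n->used = half;
        right->next = n->next;
        n->next = right;
        if (pos > half) {
            pos -= half;
            n = right;
        }
    }
    /* Array method inside one buffer: shift the tail up by one item. */
    memmove(n->data + pos + 1, n->data + pos, n->used - pos);
    n->data[pos] = ch;
    n->used++;
    return 0;
}

/* Return the item at logical position pos, or '\0' if out of range. */
char fsb_item_at(const FsbNode *head, size_t pos)
{
    const FsbNode *n = head;

    while (n != NULL && pos >= n->used) {
        pos -= n->used;
        n = n->next;
    }
    return n != NULL ? n->data[pos] : '\0';
}
```

    Only the buffer that receives the item is touched, and at most BUF_CAP items move per insert regardless of how long the whole sequence is.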

    6.4 The piece table method

    In the piece table method the sizes of the spans are as large as possible but

    are split as a result of the editing that is done on the sequence. The sequence starts

    out as one big span and that gets divided up as insertions and deletions are made.

    We call each span a piece (of the sequence) and its descriptor is called a piece

    descriptor. The sequence of piece descriptors is called the piece table.

    The piece table method uses two buffers. The first (the file buffer) contains

    the original contents of the file being edited. It is of fixed size and is read-only. The second (the add buffer) is another buffer that can grow without bound and is

    append-only. All new items are placed in this buffer.

    Each piece descriptor points to a span in the file buffer or in the add buffer.

    Thus a descriptor must contain three pieces of information: which buffer (a

    boolean), an offset into that buffer (a non-negative integer) and a length (a positive

    integer; see footnote 5). Figure 7 shows a piece table structure. The file consists of five pieces.
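
    A piece descriptor with these three fields, together with a straightforward ItemAt, might look like the following sketch. The names Piece and piece_item_at are hypothetical, and the piece table is assumed to be kept in a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical piece descriptor holding the three fields named above. */
typedef struct {
    bool   in_add;   /* false: file buffer, true: add buffer */
    size_t offset;   /* offset into that buffer */
    size_t length;   /* number of items in the piece (> 0) */
} Piece;

/* ItemAt by walking the piece table and subtracting piece lengths.
   Returns '\0' if pos is past the end of the sequence. */
char piece_item_at(const Piece *table, size_t npieces,
                   const char *file_buf, const char *add_buf, size_t pos)
{
    size_t i;

    assert(table != NULL || npieces == 0);
    for (i = 0; i < npieces; i++) {
        if (pos < table[i].length) {
            const char *buf = table[i].in_add ? add_buf : file_buf;
            return buf[table[i].offset + pos];
        }
        pos -= table[i].length;
    }
    return '\0';
}
```

    The walk is linear in the number of pieces, which is why ItemAt on a piece table benefits from caching.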

    Initially there is only one piece descriptor which points to the entire file

    buffer. A delete is handled by splitting a piece into two pieces. One of these pieces

    points to the items in the old piece before the deleted item and the other piece

    points to the items after the deleted item. A special case occurs when the deleted

    item is at the beginning or end of the piece in which case we simply adjust the

    pointer or the piece length.

    An insert is handled by splitting the piece into three pieces. The first piece

    points to the items of the old piece before the inserted item. The third piece points

    to the items of the old piece after the inserted item. The inserted item is appended

    to the end of the add file and the second piece points to this item. Again there are

    special cases for an insertion at the beginning or end of a piece.
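
    The delete and insert rules can be sketched in C as follows. This is a minimal sketch, not the code of any particular editor: the names PieceTable, pt_split, pt_insert, pt_delete and pt_item_at are hypothetical, the piece table is kept in a small fixed array, and the add buffer has an arbitrary fixed capacity. Both operations first make sure a piece boundary exists at the edit position, which covers the special cases at the beginning or end of a piece without separate code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_PIECES 64    /* arbitrary limits for this sketch */
#define ADD_CAP    256

typedef struct { bool in_add; size_t offset, length; } Piece;

typedef struct {
    const char *file_buf;       /* original text, read-only */
    char   add_buf[ADD_CAP];    /* new text, append-only */
    size_t add_len;
    Piece  pieces[MAX_PIECES];
    size_t npieces;
} PieceTable;

void pt_init(PieceTable *pt, const char *file_buf, size_t file_len)
{
    pt->file_buf = file_buf;
    pt->add_len = 0;
    pt->npieces = 0;
    if (file_len > 0)
        pt->pieces[pt->npieces++] = (Piece){ false, 0, file_len };
}

/* Ensure a piece boundary at logical position pos; return the index of
   the piece that starts there (npieces if pos is the end). */
static size_t pt_split(PieceTable *pt, size_t pos)
{
    size_t i;

    for (i = 0; i < pt->npieces && pos > 0; i++) {
        Piece *p = &pt->pieces[i];
        if (pos < p->length) {
            assert(pt->npieces < MAX_PIECES);
            memmove(p + 2, p + 1, (pt->npieces - i - 1) * sizeof *p);
            p[1] = (Piece){ p->in_add, p->offset + pos, p->length - pos };
            p->length = pos;
            pt->npieces++;
            return i + 1;
        }
        pos -= p->length;
    }
    assert(pos == 0);            /* pos must not be past the end */
    return i;
}

/* Insert ch before logical position pos. */
void pt_insert(PieceTable *pt, size_t pos, char ch)
{
    size_t i = pt_split(pt, pos);
    Piece *prev = i > 0 ? &pt->pieces[i - 1] : NULL;

    assert(pt->add_len < ADD_CAP);
    pt->add_buf[pt->add_len] = ch;
    if (prev != NULL && prev->in_add
            && prev->offset + prev->length == pt->add_len) {
        prev->length++;          /* typing in a row: extend the piece */
    } else {
        assert(pt->npieces < MAX_PIECES);
        memmove(&pt->pieces[i + 1], &pt->pieces[i],
                (pt->npieces - i) * sizeof(Piece));
        pt->pieces[i] = (Piece){ true, pt->add_len, 1 };
        pt->npieces++;
    }
    pt->add_len++;
}

/* Delete the item at logical position pos. */
void pt_delete(PieceTable *pt, size_t pos)
{
    size_t i = pt_split(pt, pos);
    Piece *p = &pt->pieces[i];

    assert(i < pt->npieces);
    if (p->length > 1) {
        p->offset++;             /* drop the piece's first item */
        p->length--;
    } else {                     /* zero-length pieces are eliminated */
        memmove(p, p + 1, (pt->npieces - i - 1) * sizeof *p);
        pt->npieces--;
    }
}

/* Return the item at logical position pos, or '\0' past the end. */
char pt_item_at(const PieceTable *pt, size_t pos)
{
    size_t i;

    for (i = 0; i < pt->npieces; i++) {
        const Piece *p = &pt->pieces[i];
        if (pos < p->length)
            return (p->in_add ? pt->add_buf : pt->file_buf)[p->offset + pos];
        pos -= p->length;
    }
    return '\0';
}
```

    Neither operation copies or overwrites any item already stored in the file buffer or the add buffer; only descriptors change.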

    If several items are inserted in a row, the inserted items are combined into a

    single piece rather than using a separate piece for each item inserted. Figures 8, 9

    and 10 show the effect of a delete and an insert in a piece table. Figure 8 shows the

    piece table after the file is read in initially. This is a very short file containing only

    20 characters. Figure 9 shows the piece table after the word "large " has been

    deleted. Figure 10 shows the piece table after the word "English " has been added.

    Notice that, in general, a delete increases the number of pieces in the piece table by one and an insert increases the number of pieces in the piece table by two.

    Footnote 5: Normally zero length pieces are eliminated.

    Figure 7: The piece table method

    Figure 8: A piece table before any edits

    Figure 9: A piece table after a delete

    Let us look at another example. Suppose we start with a new file that is 1000 bytes

    long and make the following edits.

    Figure 10: A piece table after a delete and an insert

    1. Six characters inserted (typed in) after character 900.

    2. Character 600 deleted.

    3. Five characters inserted (typed in) after character 500.

    The piece table after these edits will look like this:

    file    start   length   logical offset
    orig        0      500                0
    add         6        5              500
    orig      500      100              505
    orig      601      300              605
    add         0        6              905
    orig      901      100              911

    The "logical offset" column does not actually exist in the piece table but can be

    computed from it (it is the running total of the lengths). These logical offsets are

    not kept in the piece table because they would all have to be updated after each

    edit.
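
    Computing the logical offsets as a running total takes only a few lines. The sketch below uses hypothetical names (Piece, logical_offsets) and assumes the piece table is held in an array.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int in_add; size_t start, length; } Piece;

/* Fill out[i] with the logical offset of piece i, that is, the running
   total of the lengths of the pieces before it. Returns the total
   length of the sequence. */
size_t logical_offsets(const Piece *table, size_t n, size_t *out)
{
    size_t total = 0, i;

    assert((table != NULL && out != NULL) || n == 0);
    for (i = 0; i < n; i++) {
        out[i] = total;
        total += table[i].length;
    }
    return total;
}
```

    Applied to the start and length columns of the table above, it yields the logical offsets 0, 500, 505, 605, 905 and 911.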

    The piece table method has several advantages.

    * The original file is never changed so it can be a read-only file. This is advantageous for caching systems since the data never changes.

    * The add file is append-only and so, once written, it never changes either. Items never move once they have been written into a buffer so they can be pointed to by other data structures working together with the piece table.

    * Undo is made much easier by the fact that items are never written over. It is never necessary to save deleted or changed items. Undo is just a matter of keeping the right piece descriptors around. Unlimited undoes can be easily supported.

    * No file preprocessing is required. The initial piece can be set up only knowing the length of the original file, information that can be quickly and easily obtained from the file system. Thus the size of the file does not affect the startup time.

    * The amount of memory used is a function of the number of edits not the size of the file. Thus edits on very large files will be quite efficient.

    The above description implies that a sequence must start out as one big piece

    and only inserts and deletes can add pieces. Following this rule keeps the number

    of pieces at a minimum and the fewer pieces there are the more efficient the

    ItemAt operations are. But the text editor is free to split pieces at other times to suit

    its purposes.

    For example, a word processor needs to keep formatting information as well

    as the text sequence itself. This formatting information can be kept as a tree where

    the leaves of the tree are pieces. A word in bold face would be kept as a separate

    piece so it could be pointed to by a "bold" format node in the format tree. The text

    editor Lara [8] uses piece tables this way.

    As another example, suppose the text editor implements hypertext links between any two spans of text in the sequence. The span at each end of the link can

    be isolated in a separate piece and the link data structure would point to these two

    pieces. This technique is used in the Pastiche text editor [5] for fine-grained

    hypertext.

    These techniques work because piece table descriptors and items do not

    move when edits occur and so these tree structures will be maintained with little

    extra work even if the sequence is edited heavily.

    Overall, the piece table is an excellent data structure for sequences and is

    normally the data structure of choice. Caching can be used to speed up this data

    structure so it is competitive with other data structures for sequences.

    Chapter 7

    A GENERAL MODEL OF SEQUENCE DATA STRUCTURES

    The discussions in the previous two sections suggest that it is possible to

    characterize all sequence data structures given the following assumptions:

    1. The computer has main memory with sequentially addressed cells and has disk

    memory with sequentially addressed blocks of sequentially addressed cells.

    2. Items have a fixed size in memory and on disk.

    3. Items are stored directly in the memory or on disk, that is, they are not encoded.

    Hence every item must exist somewhere in memory or on disk.

    4. The main memory is of limited size and hence cannot hold all the items in large

    sequences. Even if virtual memory is provided there will be an upper bound on it

    in any actual system configuration. In addition, most sophisticated sequence data

    structures do not rely on virtual memory to efficiently shuttle sequence data

    between main memory and the disk. Usually the program can do better since it

    understands exactly how the data is accessed.

    5. The environment provides dynamic memory allocation (although some sequence

    data structures will do their own and not use the environment's dynamic memory

    allocation).

    6. The environment provides a reasonably efficient file system for item storage that

    provides files of (for all practical purposes) unlimited size.

    The following concepts are used in the model.

    * An item is the basic component of our sequences. In most cases it will be a character or byte (but it might be a descriptor in a recursive sequence data structure).

    * A sequence is an ordered set of items. During editing, items will be inserted into and deleted from the sequence. The items in the sequence are logically contiguous.

    * A buffer is some contiguous space in main memory or on disk that can contain items (Assumption 4). All items in the sequence are kept in buffers (Assumption 3). Consecutive items in a buffer are physically contiguous.

    * A span is one or more items that are logically contiguous in the sequence and are also physically contiguous in the buffer (Assumption 1).

    * A descriptor is a data structure that represents a span. Usually the descriptor contains a pointer to the span but it is also possible for the descriptor to contain the buffer that contains the span.

    A sequence data structure is either

    * A basic sequence data structure which is one of:

    - An array.

    - An array with a gap.

    - A linked list of items.

    - A more complex linked structure of items.

    * A recursive sequence data structure which comprises:

    - Zero or more buffers each of which contains zero or more spans.

    - A (recursive) sequence data structure of zero or more descriptors to spans

    in the buffers.

    This model is recursive in that to implement a sequence of items it is

    necessary to implement a sequence of descriptors. This recursion is usually only

    one step, that is, the sequence of descriptors is implemented with a basic sequence

    data structure. The deficiencies of the basic sequence data structures for

    implementing character sequences are less critical for sequences of descriptors

    since there are usually far fewer descriptors and so sophisticated methods are not

    required.
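
    The one-step recursion can be made concrete with a small sketch. The names Descriptor and flatten are hypothetical; the sequence of descriptors is kept in a plain array (a basic sequence data structure) and each descriptor points to a span that may lie anywhere in any buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: a pointer to a span plus the span's length. */
typedef struct {
    const char *span;
    size_t      length;
} Descriptor;

/* Copy the logical sequence described by descs[0..n-1] into out.
   Returns the number of items copied, never more than cap. */
size_t flatten(const Descriptor *descs, size_t n, char *out, size_t cap)
{
    size_t total = 0, i;

    assert(out != NULL);
    for (i = 0; i < n; i++) {
        size_t take = descs[i].length;

        if (take > cap - total)
            take = cap - total;      /* truncate at the output capacity */
        memcpy(out + total, descs[i].span, take);
        total += take;
    }
    return total;
}
```

    The logical order of the items is given entirely by the order of the descriptors, so spans that are adjacent in the sequence need not be adjacent, or even in the same order, in the buffers.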

    7.1 The design space for sequence data structures

    This model can be used to examine the design space for sequence data

    structures. The four basic methods seem to cover most of the useful cases. A basic

    method is one where the number of descriptors is fixed. The array method uses two

    descriptors and the gap method uses three. We could generalize this to a two gap

    method using four descriptors and so on but it is not clear that there is any

    advantage in doing that. The linked list method is basic even though it uses a

    descriptor for every item in the sequence. The items are linked together so one

    descriptor serves to define the sequence. As soon as we try to put two or more items into a descriptor it becomes an instance of the recursive fixed size buffer

    method.

    Using a more complex linked structure than a linked list will make reading

    items (ItemAt) more efficient but makes Insert and Delete less efficient. Since

    ItemAt access is nearly always sequential this tradeoff is not advantageous.

    The recursive methods divide into two types. The first uses fixed size buffers

    and the second uses variable sized buffers. There are two issues in the fixed size

    buffer method. The first is the method used in maintaining the items in each fixed

    size buffer and the second is the method of maintaining the sequence of buffers.

    The items in a single (fixed size) buffer are a sequence. The typical

    implementation keeps them as an array at the beginning of the buffer. In general

    the gap method is superior to the array method, thus it might be useful to keep

    characters in a single buffer using the gap method where the gap is kept at the last

    edit point. This should halve the expected number of bytes moved for one edit

    at little additional program cost. If the edits exhibit locality (as they typically do)

    the advantage will be greater.
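
    A gap kept inside a single buffer can be sketched as below. This is an illustration under assumptions, not code from the sample implementations: the names GapBuf, gap_move, gap_insert, gap_delete and gap_item_at are hypothetical and the capacity is tiny so the movement is easy to trace. The point to notice is that gap_move copies only the items lying between the old and the new edit position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GAP_CAP 16   /* tiny for illustration */

typedef struct {
    char   data[GAP_CAP];
    size_t gap_start;   /* first free cell */
    size_t gap_end;     /* one past the last free cell */
} GapBuf;

void gap_init(GapBuf *g)
{
    g->gap_start = 0;
    g->gap_end = GAP_CAP;
}

/* Number of items currently stored in the buffer. */
size_t gap_length(const GapBuf *g)
{
    return GAP_CAP - (g->gap_end - g->gap_start);
}

/* Move the gap so that it starts at logical position pos. */
void gap_move(GapBuf *g, size_t pos)
{
    assert(pos <= gap_length(g));
    if (pos < g->gap_start) {
        /* Slide the items between pos and the gap up to the gap's end. */
        size_t n = g->gap_start - pos;
        memmove(g->data + g->gap_end - n, g->data + pos, n);
        g->gap_start -= n;
        g->gap_end -= n;
    } else if (pos > g->gap_start) {
        /* Slide the items just after the gap down to the gap's start. */
        size_t n = pos - g->gap_start;
        memmove(g->data + g->gap_start, g->data + g->gap_end, n);
        g->gap_start += n;
        g->gap_end += n;
    }
}

/* Insert ch at logical position pos; returns -1 if the buffer is full. */
int gap_insert(GapBuf *g, size_t pos, char ch)
{
    if (g->gap_start == g->gap_end)
        return -1;
    gap_move(g, pos);
    g->data[g->gap_start++] = ch;
    return 0;
}

/* Delete the item at logical position pos by widening the gap. */
void gap_delete(GapBuf *g, size_t pos)
{
    assert(pos < gap_length(g));
    gap_move(g, pos);
    g->gap_end++;
}

/* Return the item at logical position pos, skipping over the gap. */
char gap_item_at(const GapBuf *g, size_t pos)
{
    assert(pos < gap_length(g));
    return pos < g->gap_start ? g->data[pos]
                              : g->data[pos + (g->gap_end - g->gap_start)];
}
```

    Consecutive edits at the same position move no items at all, which is where the benefit of locality comes from.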

    A linked list is usually used to implement the sequence of descriptors (buffer

    pointers) and this is generally a good method because of the ease of inserting and

    deleting descriptors (which are frequent operations). One problem is the extra

    space used by the links but this is only a problem if there are lots of descriptors.

    With 1K blocks and editing 5M of files, this is still only 5K descriptors with 10K

    links. Thus this problem does not seem to be significant in practical cases.

    Another issue is virtual memory performance. Linked lists do not exhibit good locality. If the descriptors were kept using the gap method, locality would be

    improved considerably. Assuming a descriptor takes four words (one for the

    pointer to the span, one for the length of the span and two for the links) the 5K

    descriptors would consume 20K words or 80K bytes (assuming four bytes per

    word). Again this is small enough that virtual memory performance would

    probably not be a significant problem, but if it were, the gap method could improve

    things.
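
    The arithmetic in the last two paragraphs can be checked with a short calculation. The function name descriptor_cost and the DescCost structure are hypothetical; the sketch assumes, as the estimate above does, one descriptor per full block and two links per descriptor.

```c
#include <assert.h>

typedef struct {
    unsigned long descriptors;
    unsigned long links;
    unsigned long words;
    unsigned long bytes;
} DescCost;

/* Space consumed by a linked list of buffer descriptors. */
DescCost descriptor_cost(unsigned long text_bytes, unsigned long block_bytes,
                         unsigned long words_per_desc,
                         unsigned long bytes_per_word)
{
    DescCost c;

    assert(block_bytes > 0);
    /* One descriptor per block, rounding up; blocks assumed full. */
    c.descriptors = (text_bytes + block_bytes - 1) / block_bytes;
    c.links = 2 * c.descriptors;               /* two links each */
    c.words = words_per_desc * c.descriptors;
    c.bytes = bytes_per_word * c.words;
    return c;
}
```

    With 5M of text, 1K blocks, four words per descriptor and four bytes per word this gives 5120 descriptors, 10240 links, 20480 words and 81920 bytes, that is, the 5K, 10K, 20K and 80K quoted above. If the buffers are only half full the figures double.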

    The piece table method uses fewer descriptors than the fixed buffer method

    initially (before any editing) but heavy editing can create numerous pieces. There

    are advantages to maintaining the pieces since it allows easy implementation of

    unlimited undo. In addition, pieces can be used to record structures in the text (as

    described in section 6.4). As a consequence there might be many pieces. This

    means the problems presented above in the discussion of fixed size buffers

    (space consumed by links and virtual memory performance) might be

    significant here. That is, the gap method of keeping the piece table might be

    preferred.

    Another generalization would be to use two levels of recursion, that is, to

    use one of the recursive sequence data structures to implement the sequence of

    descriptors. The recursive methods are beneficial when the sequences are quite

    large so we might use a two-level recursive method if the number of descriptors

    was quite large. As we mentioned above, this might be the case with a piece table.

    So there are four new variations that we have uncovered in this analysis.

    * The fixed size buffers method using the gap method inside each buffer.

    * The fixed size buffers method using the gap method for descriptors. This might be better if virtual memory performance was a consideration.

    * The piece table method using the gap method for descriptors. This might be better if virtual memory performance was a consideration.

    * A two level recursive method that uses a recursive method to maintain the sequence of descriptors. This would be suitable if there is a very large number of descriptors.

    Chapter 8

    EXPERIMENTAL COMPARISON OF SEQUENCE DATA STRUCTURES

    In order to compare the performance of these data structures I implemented

    each of them and a simulator program that would simulate typical editing behavior.

    The simulator has a number of parameters that I will discuss in presenting the

    results. I ran these tests on several machines, under two compilers (cc and gcc)

    and with maximum optimization. The results for the different architectures and compilers were all similar.

    The measurements were made with the following parameters (except where

    one of these parameters is being experimentally varied):

    * Sequence length of 8000 characters

    * Block size of 1024 characters

    * Fixed buffer methods keep buffers at least half full (from 512 to 1024 characters)

    * The location of 98% of the edits is normally distributed around the location of the previous edit with a standard deviation of 25

    * The location of 2% of the edits is uniformly distributed over the entire sequence

    * After each edit, 25 characters on each side of the edit location are accessed

    * Every 250 edits the entire file is scanned sequentially with ItemAts
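
    The way edit locations are drawn can be illustrated with the following sketch. It is not the simulator used for the measurements: the names approx_normal and next_edit_location are hypothetical, the standard C rand() is used as the random source, and the normal deviate is approximated by summing twelve uniform deviates, a simplification chosen here so the sketch needs no math library.

```c
#include <assert.h>
#include <stdlib.h>

/* Approximate standard normal deviate: the sum of 12 uniform deviates
   on [0,1) minus 6 has mean 0 and variance 1. */
static double approx_normal(void)
{
    double s = 0.0;
    int i;

    for (i = 0; i < 12; i++)
        s += (double)rand() / ((double)RAND_MAX + 1.0);
    return s - 6.0;
}

/* Pick the next edit location in [0, seq_len). About 2% of the edits
   land anywhere in the sequence; the rest fall near the previous edit
   with a standard deviation of 25. */
long next_edit_location(long prev, long seq_len)
{
    long loc;

    assert(seq_len > 0);
    if (rand() % 100 < 2) {
        double u = (double)rand() / ((double)RAND_MAX + 1.0);
        loc = (long)(u * (double)seq_len);
    } else {
        loc = prev + (long)(25.0 * approx_normal());
    }
    /* Clamp to the valid range of positions. */
    if (loc < 0)
        loc = 0;
    if (loc >= seq_len)
        loc = seq_len - 1;
    return loc;
}
```

    A simulation loop would call next_edit_location(prev, 8000) before each edit, feed the result back in as prev, and read the 25 characters on each side of it afterwards.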

    The sequence data structures measured (and their abbreviated names) are:

    * Null - The null method that does nothing. This is for comparison since it measures the overhead of the procedure calls.

    * Arr - The array method.

    * List - The list method.

    * Gap - The gap method.

    * FsbA - The fixed size buffer method with the array method used to maintain the sequence inside each buffer.

    * FsbAOpt - The fixed size buffer method with the array method used to maintain the sequence inside each buffer and with ItemAt optimized.

    * FsbG - The fixed size buffer method with the gap method used to maintain the sequence inside each buffer.

    * Piece - The piece table method.

    * PieceOpt - The piece table method with ItemAt optimized.

    8.1 Discussion of the timing results

    Only time will be considered in the following discussion. In the next section

    these sequence data structures will be compared on a range of criteria. Remember

    that the ItemAt times assume a very high locality of reference. If the references

    were not local the ItemAt times would be much higher and the ratios would be

    different. The Arr method has the fastest ItemAt but is terrible for Insert and

    Delete. It is not a practical method.

    The Gap method is nearly as fast for ItemAt but is also quite fast for Insert

    and Delete. The Gap method is a generally fast method but has some other problems, as

    will be seen in the next section. The List method has the fastest Inserts and Deletes

    by far but a fairly slow ItemAt. The List method uses a lot of extra space (two

    pointers per item). It is useful for in-memory sequences with large items.

    The FsbA method is slow for Inserts and Deletes but its ItemAt can be made

    quite fast with some simple caching. It is possible to reduce the ItemAt time even

    further by making it an inline operation. The FsbG method reduces the Insert and

    Delete times radically but the ItemAt time is a bit higher. The equivalent ItemAt

    caching would be a little more complicated and a little slower.

    The Piece method has excellent Insert and Delete times (only slightly slower

    than the linked list method) but its ItemAt time is quite slow even with simple

    caching. More complex ItemAt caching that avoids procedure calls is necessary

    when using the Piece method. The idea of the caching is simple. Instead of

    requesting an item you request the address and length of the longest span starting

    at a particular position. Then all the items in the span can be accessed with a

    pointer dereference (and increment). This will bring the ItemAt time down to the

    level of the array and gap methods.
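
    The span request can be sketched as follows. The names Piece, span_at and count_char are hypothetical, and each piece is assumed to carry a pointer to its buffer. The caller asks once per span and then reads the items with a plain pointer, so the procedure call overhead is paid per piece rather than per item.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *buf;     /* file buffer or add buffer */
    size_t offset;
    size_t length;
} Piece;

/* Return a pointer to the item at logical position pos and store in
   *span_len how many items can be read contiguously from there.
   Returns NULL if pos is past the end of the sequence. */
const char *span_at(const Piece *table, size_t n, size_t pos,
                    size_t *span_len)
{
    size_t i;

    assert(span_len != NULL);
    for (i = 0; i < n; i++) {
        if (pos < table[i].length) {
            *span_len = table[i].length - pos;
            return table[i].buf + table[i].offset + pos;
        }
        pos -= table[i].length;
    }
    *span_len = 0;
    return NULL;
}

/* Example consumer: count occurrences of c in the whole sequence,
   calling span_at once per piece instead of ItemAt once per item. */
size_t count_char(const Piece *table, size_t n, char c)
{
    size_t pos = 0, len, count = 0;
    const char *p;

    while ((p = span_at(table, n, pos, &len)) != NULL) {
        const char *end = p + len;

        pos += len;
        while (p < end)
            if (*p++ == c)
                count++;
    }
    return count;
}
```

    A real editor would also remember which piece answered the previous request, so that the walk from the start of the piece table is avoided when accesses are local.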

    Chapter 9

    COMPARISON OF SEQUENCE DATA STRUCTURES

    9.1 Basic sequence data structures

    The array method is just too slow for Inserts and Deletes. In addition its

    paging behavior is very bad (it touches lots of pages). It might be useful for a one

    line text editor or a text editor where few edits are expected.

    The linked list method takes far too much space for long sequences. It is

    useful for short in-memory sequences where the edits and ItemAts are not

    localized. The linked list method has one great advantage and that is that the items

    never move in memory. This makes it easy to embed a list sequence in another

    data structure (such as a tree to provide fast searching). Of the basic sequence data

    structures, only the gap method can be seriously considered for a general purpose

    text editor. The gap method has one major problem and that is when the gap fills

    up. This will require lots of item movement to reestablish the gap.

                           Array            Gap              Linked List
    Time                   Slow             Fast             Fast
    Space                  Low              Low              Very high
    Ease of programming    Easy             Easy             Easy
    Size of code           Low (39 lines)   Low (59 lines)   Medium (79 lines)

    The lines of code measure was taken from the sample implementations.

    9.2 Recursive sequence data structures

    The line span method is an older method that has little to recommend it in

    modern text editors. The Fsb methods and the Piece method are both good choices

    for professional-quality text editors.

    Both methods:

    * are acceptably fast if caching is used

    * handle large files without slowing down

    * handle many files without slowing down

    * are efficient in their use of space

    * allow efficient buffer management

    * provide excellent locality of buffer use

    Overall however the piece table method seems to be better. It has a number of

    advantages:

    * All buffers except one are read-only and the other buffer is append-only. (Thus the buffers are easy to cache and work well over a network.)

    * The code is quite simple. (The code for Fsb is complicated by the need to balance buffers.)

    * Huge files load as quickly as tiny files. (No preprocessing is required for large files so they load quickly.)

    * Disk buffers are always full of data (rather than 3/4 full, on the average, as they are in the Fixed buffer method). Thus disk caching is more efficient.

    * Items never move once they are placed.

    The last point is important. If the piece sequence is kept as a list then the

    pieces never move either. This allows the sequence to be pointed to by other data

    structures that record additional information about the sequence. For example it is

    fairly easy to implement "sticky" pointers (that is, pointers that point to the content

    of the sequence rather than relative sequence positions). For example we might

    want to attach a sticky pointer to the beginning of a procedure definition.

    As another example, the text editor Lara [8] also formats its text. It keeps the

    formatting state in a tree structure where pieces are the leaves of the tree. Inserts

    and deletes require very little bookkeeping because the items and pieces never move around when you use a piece table.

                           FSB-Array                     FSB-Gap               Piece
    Time                   Fairly fast                   Fast (with caching)   Fairly fast (with caching)
    Space                  Low                           Low                   Low
    Ease of programming    Hard                          Hard                  Medium
    Size of code           Medium to large (218 lines)   Large (301 lines)     Medium (162 lines)

    Chapter 10

    CONCLUSIONS AND RECOMMENDATIONS

    This paper has two purposes:

    * to examine systematically data structures for sequences, and

    * to present the piece table method and describe its advantages.

    A review of the literature shows that there are only a few different data

    structures for text sequences that have been used in text editors. A careful

    examination of the design space showed that there really are not that many

    fundamental types of sequence data structures.

    Sequence data structures are divided into two categories: basic sequence

    data structures (array, gap and linked list) and recursive sequence data structures

    (line spans, fixed size buffers and piece tables).

    The array method is the obvious one: keep the text in an array of characters.

    The gap method is similar but it keeps a gap in the middle of the array at the text

    editor insertion point. The linked list method keeps the characters on a linked list.

    The recursive methods keep the text in a number of separate "spans" and

    keep track of a sequence of pointers to these spans. The line span method uses a

    span for each line. The fixed size buffer method keeps a sequence of fixed size

    buffers each of which contains one span. The piece table method uses spans of any

    size either in the original file or in a file for added characters.

    In examining these data structures a general model of sequence data

    structures was formulated. This model and the examples were used to

    discover several new variations for sequence data structures that might improve

    performance in some situations.

    A series of experiments was performed on these data structures to determine

    their relative performance and they were compared on a variety of criteria including time. The main conclusion is that the piece table structure is the best data

    structure for text sequences although some of the other methods might be useful in

    certain cases. The piece table method has a number of advantages and is an

    especially good method for text with additional structure. Thus it would be the best

    choice for a word processor or an editor with hypertext facilities.

    REFERENCES

    1. C. C. Charlton and P. H. Leng. Editors: two for the price of one. Software: Practice and Experience, 11:195-202, 1981.

    2. Computer System Research Group, EECS, University of California, Berkeley, CA 94720. UNIX User's Reference Manual (4.3 Berkeley Software Distribution), April 1986.

    3. Computer System Research Group, EECS, University of California, Berkeley, CA 94720. UNIX User's Supplementary Documents (4.3 Berkeley Software Distribution), April 1986.

    4. C. Crowley. The Point text editor for X. Technical Report CS91-3, University of New Mexico, 1991.

    5. C. Crowley. Using fine-grained hypertext for recording and viewing program structures. Technical Report CS91-2, University of New Mexico, 1991.

    6. B. Elliot. Design of a simple screen editor. Software: Practice and Experience, 12:375-384, 1982.

    7. C. W. Fraser and B. Krishnamurthy. Live text. Software: Practice and Experience, 20(8):851-858, August 1990.

    8. J. Gutknecht. Concepts of the text editor Lara. Communications of the ACM, 28(9):942-960, September 1985.

    9. J. Kyle. Split buffers, patched links, and half-transpositions. Computer Language, pages 67-70, December 1989.

    10. B. W. Lampson. Bravo Manual in the Alto User's Handbook. Xerox Palo Alto Research Center, Palo Alto, CA, 1976.

    Appendix 1

    SAMPLE SCREEN SHOTS

    Appendix 2

    SOURCE CODE

    Cursor.c

    /* positions cursor at the beginning of the file */
    start_file( )
    {
        int display = YES ;

        /* if first page of the file is being currently displayed */
        if ( curscr == startloc && skip == 0 )
            display = NO ;

        /* reset variables */
        curr = 2 ;
        curc = 1 ;
        logr = 1 ;
        logc = 1 ;
        skip = 0 ;
        curscr = startloc ;
        currow = startloc ;

        /* display first page of file, if necessary */
        if ( display == YES )
        {
            menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;
            displayscreen ( curscr ) ;
        }

        /* display current cursor location */
        writerow( ) ;
        writecol( ) ;
    }

    /* positions cursor at the end of the file */
    end_file( )
    {
        char *temp ;
        int i, status ;

        size ( 32, 0 ) ;  /* hide cursor */

        /* count the total number of lines in the file */
        logr = 1 ;
        curr = 2 ;
        temp = startloc ;
        while ( temp != endloc )
        {
            if ( *temp == '\n' )
                curr++ ;

            temp++ ;
        }

        /* set up current screen and current row pointers */
        curscr = endloc ;
        currow = endloc ;

        /* display the last page of the file */
        page_up ( 0 ) ;

        /* position cursor 20 lines after current row */
        for ( i = 0 ; i < 20 ; i++ )
        {
            status = down_line ( 0 ) ;

            /* if end of file is reached */
            if ( status == 1 )
                break ;
        }

        /* position cursor at the end of the last line */
        end_line( ) ;

        size ( 5, 7 ) ;  /* show cursor */

        /* display current cursor row */
        writerow( ) ;
    }

    /* positions cursor in the first row of current screen */
    top_screen( )
    {
        size ( 32, 0 ) ;

        /* go up until the first row is encountered */
        while ( logr != 1 )
            up_line ( 0 ) ;

        /* display current cursor row */
        writerow( ) ;

        size ( 5, 7 ) ;
    }

    /* positions cursor in the last row of current screen */
    bottom_screen( )
    {
        int status ;

        size ( 32, 0 ) ;

        /* go down until the last row or end of file is encountered */
        while ( logr != 21 )
        {
            status = down_line ( 0 ) ;

            /* if end of file is reached */
            if ( status == 1 )
                break ;
        }

        writerow( ) ;
        size ( 5, 7 ) ;
    }

    / * posi t i ons cur sor at t he begi nni ng of cur r ent r ow */star t _ l i ne( ){

    / * i f t her e exi st char acter s t o t he l ef t of cur r ent l y di spl ayed l i ne */i f ( ski p ! = 0 ){

    ski p = 0 ;menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;di spl ayscr een ( cur scr ) ;

    }

    l ogc = 1 ;curc = 1 ;

    / * di spl ay cur r ent cur sor col umn */wr i tecol ( ) ;

    }

    /* positions cursor at the end of current row */
    end_line( )
    {
        char *temp ;
        int count, display = YES ;

        temp = currow ;
        count = 1 ;

        /* count the number of characters in current line */
        while ( *temp != '\n' )
        {
            /* if end of file is encountered */
            if ( temp >= endloc )
                break ;

            if ( *temp == '\t' )
                count += 8 ;
            else
                count++ ;

            temp++ ;
        }

        /* backtrace across the \t's and spaces which may be present at the
           end of the line */
        while ( * ( temp - 1 ) == '\t' || * ( temp - 1 ) == ' ' )
        {
            if ( * ( temp - 1 ) == '\t' )

                count -= 8 ;
            else
                count-- ;

            temp-- ;
        }

        /* if the number of characters in the line is less than 78 */
        if ( count

            }

            temp++ ;
        }

        /* if end of file is encountered */
        if ( temp >= endloc )
            temp-- ;

        /* if characters at current cursor location and to its left are
           alphanumeric */
        condition1 = isalnum ( *temp ) && isalnum ( * ( temp - 1 ) ) ;

        /* if character at current cursor location is alphanumeric and the
           previous character is not alphanumeric */
        condition2 = isalnum ( *temp ) && ! isalnum ( * ( temp - 1 ) ) ;

        /* if character at current cursor location is not alphanumeric */
        condition3 = ! isalnum ( *temp ) ;

        if ( *temp == '\n' )
            temp-- ;

        if ( condition2 )
            temp-- ;

        if ( condition1 )
        {
            /* move left so long as alphanumeric characters are found */
            while ( isalnum ( *temp ) )
            {
                if ( temp == startloc )
                    break ;

                temp-- ;
                count++ ;
            }
        }

        if ( condition2 || condition3 )
        {
            /* move left till an alphanumeric character is found */
            while ( ! ( isalnum ( *temp ) ) )
            {
                if ( temp

                    return ;
                }

                temp-- ;
                count++ ;
            }

            /* move left till a non-alphanumeric character is found */
            while ( isalnum ( *temp ) )
            {
                if ( temp == startloc )
                    break ;

                temp-- ;
                count++ ;
            }
        }

        /* if beginning of file is encountered */
        if ( temp == startloc )
        {
            logc = 1 ;
            curc = 1 ;
        }
        else
        {
            logc -= count ;
            curc -= count ;

            /* if screen needs to be scrolled horizontally */
            if ( curc > 78 )
            {
                logc = 78 ;
                skip = curc - 78 ;
                menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;
                displayscreen ( curscr ) ;
            }
        }

        writecol( ) ;
        writerow( ) ;
    }

    /* positions cursor one word to the right */
    word_right( )
    {
        char *temp ;
        int col, count = 0 ;

        /* increment `temp' to point to character at current cursor location */
        temp = currow ;
        for ( col = 1 ; col < curc ; col++ )
        {
            /* if end of file is encountered */
            if ( temp >= endloc )
                return ;

            if ( *temp == '\t' )
                col += 7 ;

            /* if end of line is encountered before current cursor column */
            if ( *temp == '\n' )
                break ;

            temp++ ;
        }

        if ( temp >= endloc )
            return ;

        /* if cursor is at the end of current line */
        if ( *temp == '\n' )
        {
            /* continue till an alphanumeric character is found */
            while ( ! ( isalnum ( *temp ) ) )
            {
                /* if end of file is encountered */
                if ( temp >= endloc )
                    break ;

                if ( *temp == '\t' )
                    count += 7 ;

                /* if end of line is encountered */
                if ( *temp == '\n' )
                {
                    /* position cursor in the next line */
                    down_line ( 0 ) ;

                    /* position cursor at the beginning of the line */
                    start_line( ) ;
                    temp = currow ;
                    count = 0 ;
                    continue ;
                }

                temp++ ;
                count++ ;
            }
        }
        else
        {
            /* there exists a word to the right of cursor */
            count = 0 ;

            /* move right so long as alphanumeric characters are found */
            while ( isalnum ( *temp ) )
            {
                if ( temp >= endloc )
                    break ;

                temp++ ;
                count++ ;
            }

            /* move right till a non-alphanumeric character or end of line is
               met */
            while ( ! ( isalnum ( *temp ) || *temp == '\n' ) )
            {
                if ( temp >= endloc )
                    break ;

                if ( *temp == '\t' )
                    count += 7 ;

                temp++ ;
                count++ ;
            }
        }

        logc += count ;
        curc += count ;

        /* if screen needs to be scrolled horizontally */
        if ( curc > 78 )
        {
            logc = 78 ;
            skip = curc - 78 ;
            menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;
            displayscreen ( curscr ) ;
        }

        writecol( ) ;
        writerow( ) ;
    }
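The word-skipping part of word_right( ) can be looked at separately from the cursor and screen bookkeeping. The sketch below is an illustration only: next_word is a hypothetical helper (not in the listing), it works on an index into a buffer of known length rather than on the global pointers, and it treats '\n' as an ordinary separator instead of moving to the next line as the listing does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of the listing: returns the index of
   the start of the next word at or after pos. It first skips the rest
   of the current word, then skips the separators that follow it.
   The bound is tested before each character is read. */
size_t next_word(const char *text, size_t len, size_t pos)
{
    assert(text != NULL && pos <= len);

    while (pos < len && isalnum((unsigned char)text[pos]))
        pos++;
    while (pos < len && !isalnum((unsigned char)text[pos]))
        pos++;
    return pos;
}
```

Checking the bound before dereferencing is a deliberate difference from the listing, which reads *temp first and compares temp with endloc afterwards.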

    /* displays previous file page */
    page_up ( int display )
    {
        int row ;

        /* if first page is currently displayed */
        if ( curscr == startloc )
            return ;

        /* position the `curscr' pointer 20 lines before */
        for ( row = 1 ; row

                curc = 1 ;

                menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;
                displayscreen ( curscr ) ;

                if ( display )
                {
                    writecol( ) ;
                    writerow( ) ;
                }

                return ;
            }

            /* go to the beginning of previous line */
            while ( *curscr != '\n' )
            {
                if ( curscr

    }

    /* displays next file page */
    page_down( )
    {
        char *p ;
        int row = 1, i, col ;

        /* position the `curscr' pointer 20 lines hence */
        p = curscr ;
        for ( row = 1 ; row = endloc ){
                    curscr = p ;
                    return ;
                }

                curscr++ ;
            }

            if ( curscr >= endloc )
            {
                curscr = p ;
                return ;
            }

            /* go to the beginning of next line */
            curscr++ ;
        }

        /* display the next screen */
        menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;
        displayscreen ( curscr ) ;

        /* position cursor 20 lines hence */
        size ( 32, 0 ) ;

        /* continue till first row on the screen is reached */
        row = 1 ;
        while ( currow != curscr )
        {
            if ( *currow == '\n' )
            {
                curr++ ;
                row++ ;
            }
            currow++ ;
        }

        logr = 1 ;
        col = curc ;

        for ( i = row ; i = 249 ){
            curc = 249 ;
            printf ( "\a" ) ;
            return ;
        }

        /* increment `temp' to point to character at current cursor location */
        temp = currow ;
        for ( col = 1 ; col < curc ; col++ )
        {
            if ( *temp == '\t' )
                col += 7 ;

            if ( *temp == '\n' )
                break ;

            temp++ ;
        }

        /* if next character is a tab */
        if ( *temp == '\t' )
        {
            logc += 7 ;
            curc += 7 ;
        }

        curc++ ;

        /* if cursor is in the last column, scroll screen horizontally */
        if ( logc >= 78 )
        {
            skip = curc - 78 ;
            menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;
            displayscreen ( curscr ) ;
            logc = 78 ;
        }
        else
            logc++ ;

        writecol( ) ;
    }

    /* positions cursor one column to the left */
    left( )
    {
        int col = 1 ;
        char *temp ;

        /* if cursor is in the first column */
        if ( curc == 1 )
            return ;

        /* increment `temp' to point to character at current cursor location */
        temp = currow ;

        for ( col = 1 ; col < curc ; col++ )
        {
            if ( *temp == '\t' )
                col += 7 ;

            if ( *temp == '\n' )
                break ;

            temp++ ;
        }

        /* if previous character is a tab */
        if ( * ( temp - 1 ) == '\t' )
        {
            logc -= 7 ;
            curc -= 7 ;
        }

        /* if cursor is in the first column and if there exist characters to
           the left of currently displayed line, scroll screen horizontally */
        if ( logc

    /* positions cursor one line up */
    up_line ( int display )
    {
        /* if cursor is in the first line of the file */
        if ( curr == 2 )
            return ;

        /* remove spaces and tabs at the end of the line */
        del_whitespace( ) ;

        logr-- ;
        curr-- ;

        /* go to the beginning of previous line */
        currow -= 2 ;
        if ( curscr < startloc )
            curscr = startloc ;
        while ( *currow != '\n' )
        {
            if ( currow

        while ( *currow != '\n' )
        {
            /* if end of file is encountered */
            if ( currow >= endloc )
            {
                currow = p ;
                return ( 1 ) ;
            }

            currow++ ;
        }
        if ( currow == endloc )
        {
            currow = p ;
            return ( 1 ) ;
        }
        currow++ ;
        logr++ ;
        curr++ ;

        /* if vertical scrolling is required */
        if ( logr >= 22 )
        {
            logr = 21 ;
            scrollup ( 2, 1, 22, 78 ) ;
            displayline ( currow, 22 ) ;

            /* position `curscr' pointer at the beginning of current screen */
            while ( *curscr != '\n' )
                curscr++ ;

            curscr++ ;
        }

        /* position cursor in appropriate column */
        gotocol( ) ;

        /* if current cursor row and column is to be displayed */
        if ( display )
        {
            writecol( ) ;
            writerow( ) ;
        }

        return ( 0 ) ;
    }

    /* scrolls the screen contents down */
    scrolldown ( int sr, int sc, int er, int ec )
    {
        union REGS ii, oo ;

        ii.h.ah = 7 ; /* service number */
        ii.h.al = 1 ; /* number of lines to scroll */
        ii.h.ch = sr ; /* starting row */
        ii.h.cl = sc ; /* starting column */
        ii.h.dh = er ; /* ending row */
        ii.h.dl = ec ; /* ending column */
        ii.h.bh = 27 ; /* display attribute of blank line created at top */
        int86 ( 16, &ii, &oo ) ; /* issue interrupt */
    }

    /* scrolls the screen contents up */
    scrollup ( int sr, int sc, int er, int ec )
    {
        union REGS ii, oo ;

        ii.h.ah = 6 ; /* service number */
        ii.h.al = 1 ; /* number of lines to scroll */
        ii.h.ch = sr ; /* starting row */
        ii.h.cl = sc ; /* starting column */
        ii.h.dh = er ; /* ending row */
        ii.h.dl = ec ; /* ending column */
        ii.h.bh = 27 ; /* display attribute of blank line created at bottom */
        int86 ( 16, &ii, &oo ) ; /* issue interrupt */
    }
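scrollup( ) hands the work to BIOS video service 6 through interrupt 16 (0x10), so the listing itself does not show what happens to the screen cells. The following sketch emulates a one-line scroll-up on an ordinary array laid out like an 80 x 25 text screen (two bytes per cell). The function scroll_window_up and the ROWS and COLS constants are assumptions made for illustration; this is not how the BIOS is implemented.

```c
#include <assert.h>
#include <string.h>

#define COLS 80
#define ROWS 25

/* Hypothetical emulation, not part of the listing: scrolls the window
   (sr,sc)-(er,ec) up by one line inside an in-memory screen image and
   fills the freed bottom line with blanks in the given attribute.
   Each cell is two bytes: the character, then its attribute. */
void scroll_window_up(unsigned char *screen, int sr, int sc,
                      int er, int ec, unsigned char attr)
{
    int r, c;
    size_t width;

    assert(screen != NULL);
    assert(0 <= sr && sr <= er && er < ROWS);
    assert(0 <= sc && sc <= ec && ec < COLS);

    width = (size_t)(ec - sc + 1) * 2;

    /* copy every window row from the row below it */
    for (r = sr; r < er; r++)
        memcpy(screen + (r * COLS + sc) * 2,
               screen + ((r + 1) * COLS + sc) * 2, width);

    /* blank the last row of the window */
    for (c = sc; c <= ec; c++) {
        screen[(er * COLS + c) * 2] = ' ';
        screen[(er * COLS + c) * 2 + 1] = attr;
    }
}
```

With the arguments used in down_line( ), namely the window 2, 1, 22, 78 and attribute 27, the text in rows 3 to 22 moves up one row and row 22 is left blank for displayline( ) to fill.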

    /* positions cursor in appropriate column */
    gotocol( )
    {
        char *temp ;
        int col ;

        /* increment `temp' to point to character at current cursor location */
        temp = currow ;
        for ( col = 1 ; col < curc ; col++ )
        {
            if ( *temp == '\t' )
                col += 7 ;

            if ( *temp == '\n' )
                break ;

            temp++ ;
        }

        /* if the character at current cursor location is a tab */
        if ( col > curc )
        {
            /* go to the end of tab */
            logc += ( col - curc ) ;
            curc = col ;

            /* if screen needs to be scrolled horizontally */
            if ( curc > 78 )
            {
                logc = 78 ;
                skip = curc - 78 ;
                menubox ( 2, 1, 22, 78, 27, NO_SHADOW ) ;
                displayscreen ( curscr ) ;
            }
        }
    }
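Several routines above (right, left, word_right, gotocol) repeat the same loop that turns a character position into a screen column, counting each tab as 8 columns. A small sketch of that calculation is given below. The name screen_column and the TAB_WIDTH constant are hypothetical, and the function takes a character index instead of using the globals currow and curc.

```c
#include <assert.h>
#include <stddef.h>

#define TAB_WIDTH 8  /* the listing counts every tab as 8 columns */

/* Hypothetical helper, not part of the listing: returns the 1-based
   screen column of the character at index n in line. The scan stops
   early at '\n' or at the terminating '\0'. */
int screen_column(const char *line, size_t n)
{
    int col = 1;
    size_t i;

    assert(line != NULL);

    for (i = 0; i < n && line[i] != '\n' && line[i] != '\0'; i++)
        col += (line[i] == '\t') ? TAB_WIDTH : 1;
    return col;
}
```

Like the listing, this uses a fixed width per tab and does not align to tab stops.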

    Util.c

    # define CGA 1
    # define ESC 27
    # define YES 1
    # define NO 0
    # define NO_SHADOW 0
    # define HALF_SHADOW -1

    /* various menu definitions */
    /* character following ` ' is a hot key */
    char *mainmenu[ ] = {
        " Cursor Movement",
        " File",
        " Search",
        " Delete",
        " Exit"
    } ;

    char *cursormenu[ ] = {
        " Start of File",
        "End of File",
        " Top of screen",
        " Bottom of screen",
        "Start of Line",
        " End of Line",
        " Word Left",
        "Word Right",
        "Page Up",
        "Page Down",
        "Retur N"
    } ;

    char *filemenu[ ] = {
        " Load",
        " Pick",
        " New",
        " Save",
        "Save As",
        " Merge",
        " Change dir",
        " Output to printer",
        "Re Turn"
    } ;

    char *searchmenu[ ] = {
        " Find",
        "Find & Replace",
        "Repeat Last find",
        " Abort operation",
        " Go to line no",
        "Re Turn"
    } ;

    char *deletemenu[ ] = {
        " Delete line",
        "To End of line",
        "To Beginning of Line",
        "Word Right",
        "Re Turn"
    } ;

    char *exitmenu[ ] = {
        " Exit",
        " Shell",
        "Re Turn"
    } ;

    /* most recent files edited */
    char *pickfile[5] = {
        " ",
        " ",
        " ",
        " ",
        " "
    } ;

    /* buffer in which files are loaded and manipulated */
    char *buf ;

    unsigned maxsize ;
    char *startloc, *curscr, *currow, *endloc ;
    char searchstr[31], replacestr[31], filespec[30], filename[17] ;

    void interrupt ( *old1b ) ( ) ;
    void interrupt handler( ) ;
    void interrupt ( *old23 ) ( ) ;
    char *search ( char * ) ;

    int ascii, scan, pickfileno, no_tab ;
    int curr = 2, curc = 1, logc = 1, logr = 1 ;
    int skip, findflag, frflag, saved = YES, ctrl_c_flag = 0 ;

    char far *vid_mem ;
    char far *ins = ( char far * ) 0x417 ;

    /* writes a character in specified attribute */
    writechar ( int r, int c, char ch, int attb )
    {
        char far *v ;

        /* calculate address corresponding to row r and column c */
        v = vid_mem + r * 160 + c * 2 ;
        *v = ch ; /* store ascii value of character */
        v++ ;
        *v = attb ; /* store attribute of character */
    }
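The address arithmetic in writechar( ) follows from the text-mode layout: each screen cell takes two bytes (character, then attribute), so a row of 80 cells occupies 160 bytes. The sketch below applies the same formula to an ordinary array so that it can be tried without video memory. put_cell and COLS are hypothetical names introduced here for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define COLS 80

/* Hypothetical stand-in for writechar(), not part of the listing:
   writes into a plain array instead of video memory. A cell is two
   bytes (character, attribute), so one row is 80 * 2 = 160 bytes. */
void put_cell(unsigned char *screen, int r, int c,
              unsigned char ch, unsigned char attr)
{
    size_t offset;

    assert(screen != NULL && r >= 0 && c >= 0 && c < COLS);

    offset = (size_t)r * 160 + (size_t)c * 2;
    screen[offset] = ch;        /* character byte */
    screen[offset + 1] = attr;  /* attribute byte */
}
```

For row 1, column 2 the offset works out to 1 * 160 + 2 * 2 = 164, with the attribute stored in the next byte.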

    /* writes a string in specified attribute */
    writestring ( char *s, int r, int c, int attb )
    {
        while ( *s != '\0' )
        {
            /* if next character is the hot key of menu item */
            if ( *s == ' ' )
            {
                s++ ;

                /* if hot key of highlighted bar */
                if ( attb == 15 )
                    writechar ( r, c, *s, 15 ) ;
                else
                    writechar ( r, c, *s, 113 ) ;
            }
            else
            {
                /* if next character is hot key of "Yes", "No", "Cancel",
                   etc. */
                if ( *s == '$' )
                {
                    s++ ;
                    writechar ( r, c, *s, 47 ) ;
                }
                else
                    writechar ( r, c, *s, attb ) ; /* normal character */
            }

            c++ ;
            s++ ;
        }
    }

    /* saves screen contents into allocated memory */
    savevideo ( int sr, int sc, int er, int ec, char *buffer )
    {
        char far *v ;
        int i, j ;

        for ( i = sr ; i

        for ( i = sr ; i

            }
        }
    }

    /* draws a double-lined box */
    drawbox ( int sr, int sc, int er, int ec, int attr )
    {
        int i ;

        /* draw horizontal lines */
        for ( i = sc + 1 ; i < ec ; i++ )
        {
            writechar ( sr, i, 205, attr ) ;
            writechar ( er, i, 205, attr ) ;
        }

        /* draw vertical lines */
        for ( i = sr + 1 ; i < er ; i++ )
        {
            writechar ( i, sc, 186, attr ) ;
            writechar ( i, ec, 186, attr ) ;
        }

        /* display corner characters */
        writechar ( sr, sc, 201, attr ) ;
        writechar ( sr, ec, 187, attr ) ;
        writechar ( er, sc, 200, attr ) ;
        writechar ( er, ec, 188, attr ) ;
    }

    /* displays or hides cursor */
    size ( int ssl, int esl )
    {
        union REGS i, o ;

        i.h.ah = 1 ; /* service number */
        i.h.ch = ssl ; /* starting scan line */
        i.h.cl = esl ; /* ending scan line */
        i.h.bh = 0 ; /* video page number */

        /* issue interrupt for changing the size of the cursor */
        int86 ( 16, &i, &o ) ;
    }

    /* gets ascii and scan code of key pressed */
    getkey( )
    {
        union REGS i, o ;

        /* wait till a key is hit */
        while ( ! kbhit( ) )
        {
            /* if Ctrl-C has been hit */
            if ( ctrl_c_flag )
            {
                /* erase the characters C */
                displayline ( currow, logr + 1 ) ;
                gotoxy ( logc + 1, logr + 2 ) ;
                ctrl_c_flag = 0 ;
            }
        }

        i.h.ah = 0 ; /* service number */

        /* issue interrupt */
        int86 ( 22, &i, &o ) ;

        ascii = o.h.al ;
        scan = o.h.ah ;
    }

    /* pops up a menu vertically */
    popupmenuv ( char **menu, int count, int sr, int sc, char *hotkeys, int helpnumber )
    {
        int er, i, ec, l, len = 0, area, choice ;
        char *p ;

        size ( 32, 0 ) ; /* hide cursor */

        /* calculate ending row for menu */
        er = sr + count + 2 ;

        /* find longest menu item */
        for ( i = 0 ; i < count ; i++ )
        {
            l = strlen ( menu[i] ) ;
            if ( l > len )
                len = l ;
        }

        /* calculate ending column for menu */
        ec = sc + len + 5 ;

        /* calculate area required to save screen contents where menu is to
           be popped up */
        area = ( er - sr + 1 ) * ( ec - sc + 1 ) * 2 ;

        p = malloc ( area ) ; /* allocate memory */

        /* if allocation fails */
        if ( p == NULL )
            error_exit( ) ;

        /* save screen contents into allocated memory */
        savevideo ( sr, sc, er, ec, p ) ;

        /* draw filled box with shadow */
        menubox ( sr, sc + 1, er, ec, 112, 15 ) ;

        /* draw a double lined box */
        drawbox ( sr, sc + 1, er - 1, ec - 2, 112 ) ;

        /* display the menu in the filled box */
        displaymenuv ( menu, count, sr + 1, sc + 3 ) ;

        /* receive user's choice */
        choice = getresponsev ( menu, hotkeys, sr, sc + 2, count, helpnumber ) ;

        /* restore original screen contents */
        restorevideo ( sr, sc, er, ec, p ) ;

        /* free allocated memory */
        free ( p ) ;

        size ( 5, 7 ) ; /* set cursor to normal size */
        return ( choice ) ;
    }
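popupmenuv( ) reserves ( er - sr + 1 ) * ( ec - sc + 1 ) * 2 bytes because every cell in the covered rectangle needs a character byte and an attribute byte. The sketch below shows one way the save and restore steps could work on an in-memory screen image. It is not the listing's savevideo( ) or restorevideo( ); rect_bytes, save_rect, restore_rect and COLS are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define COLS 80

/* Bytes needed to hold the rectangle (sr,sc)-(er,ec), using the same
   formula as popupmenuv(): rows * columns * 2 bytes per cell. */
size_t rect_bytes(int sr, int sc, int er, int ec)
{
    assert(sr <= er && sc <= ec);
    return (size_t)(er - sr + 1) * (size_t)(ec - sc + 1) * 2;
}

/* Hypothetical: copies the rectangle out of the screen image,
   row by row, into buffer. */
void save_rect(const unsigned char *screen, int sr, int sc,
               int er, int ec, unsigned char *buffer)
{
    int r, c;

    assert(screen != NULL && buffer != NULL);
    for (r = sr; r <= er; r++) {
        for (c = sc; c <= ec; c++) {
            const unsigned char *cell = screen + ((size_t)r * COLS + c) * 2;
            *buffer++ = cell[0];  /* character */
            *buffer++ = cell[1];  /* attribute */
        }
    }
}

/* Hypothetical: writes a rectangle saved by save_rect() back. */
void restore_rect(unsigned char *screen, int sr, int sc,
                  int er, int ec, const unsigned char *buffer)
{
    int r, c;

    assert(screen != NULL && buffer != NULL);
    for (r = sr; r <= er; r++) {
        for (c = sc; c <= ec; c++) {
            unsigned char *cell = screen + ((size_t)r * COLS + c) * 2;
            cell[0] = *buffer++;
            cell[1] = *buffer++;
        }
    }
}
```

Saving and restoring in the same row-major order is what lets the popup disappear without redrawing the text underneath it.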

    /* displays menu vertically */
    displaymenuv ( char **menu, int count, int sr, int sc )
    {
        int i ;

        for ( i = 0 ; i < count ; i++ )
        {
            writestring ( menu[i], sr, sc, 112 ) ;
            sr++ ;
        }
    }

    /* receives user's choice for the vertical menu displayed */
    getresponsev ( char **menu, char *hotkeys, int sr, int sc, int count, int helpnumber )
    {
        int choice = 1, len, hotkeychoice ;

        /* calculate number of hot keys for the menu */
        len = strlen ( hotkeys ) ;

        /* highlight the first menu item */
        writestring ( menu[choice - 1], sr + choice, sc + 1, 15 ) ;

        while ( 1 )
        {
            /* receive key */
            getkey( ) ;

            /* if special key is hit */
            if ( ascii == 0 )
            {
                switch ( scan )
                {
                    case 80 : /* down arrow key */

                        /* make highlighted item normal */
                        writestring ( menu[choice - 1], sr + choice, sc + 1, 112 ) ;

                        choice++ ;
                        break ;

                    case 72 : /* up arrow key */

                        /* make highlighted item normal */
                        writestring ( menu[choice - 1], sr + choice, sc + 1, 112 ) ;

                        choice-- ;
                        break ;

                    case 77 : /* right arrow key */
                        return ( 77 ) ;

                    case 75 : /* left arrow key */
                        return ( 75 ) ;

                    case 59 : /* function key F1 for help */

                        /* if current menu is file menu */
                        if ( helpnumber == 3 )
                        {
                            /* if highlighted bar is not on return */
                            if ( choice != 9 )
                            {
                                /* call with appropriate help screen number */
                                displayhelp ( 8 + choice - 1 ) ;
                            }
                            break ;
                        }

                        /* if current menu is exit menu */
                        if ( helpnumber == 6 )
                        {
                            /* if highlighted bar is not on Return */
                            if ( choice != 3 )
                            {
                                /* call with appropriate help screen number */
                                displayhelp ( 16 + choice - 1 ) ;
                            }
                            break ;
                        }

                        /* if current menu is other than file menu or exit
                           menu */
                        displayhelp ( helpnumber ) ;
                }

                /* if highlighted bar is on first item and up arrow key is
                   hit */
                if ( choice == 0 )
                    choice = count ;

                /* if highlighted bar is on last item and down arrow key is
                   hit */
                if ( choice > count )
                    choice = 1 ;

                /* highlight the appropriate menu item */
                writestring ( menu[choice - 1], sr + choice, sc + 1, 15 ) ;
            }
            else
            {
                if ( ascii == 13 ) /* Enter key */
                    return ( choice ) ;

                if ( ascii == ESC )
                {
                    displaymenuh ( mainmenu, 5 ) ;
                    return ( ESC ) ;
                }

                hotkeychoice = 1 ;
                ascii = toupper ( ascii ) ;

                /* check whether hot key has been pressed */
                while ( *hotkeys != '\0' )
                {
                    if ( *hotkeys == ascii )
                        return ( hotkeychoice ) ;
                    else
                    {
                        hotkeys++ ;
                        hotkeychoice++ ;
                    }
                }

                /* reset variable to point to the first hot key character */
                hotkeys = hotkeys - len ;
            }
        }
    }
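Two small pieces of getresponsev( ) can be tried in isolation: matching a pressed key against the hot key string, and wrapping the highlighted choice at either end of the menu. The sketch below shows both. hotkey_choice and wrap_choice are hypothetical helpers, and unlike the listing the lookup indexes into the string instead of advancing the hotkeys pointer and resetting it afterwards.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of the listing: returns the 1-based
   position of key in hotkeys (key is converted to upper case first),
   or 0 when the key is not a hot key. */
int hotkey_choice(const char *hotkeys, int key)
{
    int i;

    assert(hotkeys != NULL);

    key = toupper((unsigned char)key);
    for (i = 0; hotkeys[i] != '\0'; i++) {
        if ((unsigned char)hotkeys[i] == key)
            return i + 1;
    }
    return 0;
}

/* Hypothetical helper, not part of the listing: keeps a 1-based menu
   choice inside 1..count, wrapping around at both ends. */
int wrap_choice(int choice, int count)
{
    assert(count > 0);

    if (choice < 1)
        return count;
    if (choice > count)
        return 1;
    return choice;
}
```

For example, with an assumed hot key string "CFSDE" for a five-item menu, pressing s would select item 3, and moving up from item 1 would wrap to item 5.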

    /* displays menu horizontally */
    displaymenuh ( char **menu, int count )
    {
        int col = 2, i ;

        size ( 32, 0 ) ;
        menubox ( 0, 0, 0, 79, 112, NO_SHADOW ) ;

        for ( i = 0 ; i < count ; i++ )
        {
            writestring ( menu[i], 0, col, 112 ) ;
            col = col + ( strlen ( menu[i] ) ) + 7 ;
        }

        size ( 5, 7 ) ;
    }

    /* receives user's choice for the horizontal menu displayed */
    getresponseh ( char **menu, char *hotkeys, int count )
    {
        int choice = 1, hotkeychoice, len, col ;

        size ( 32, 0 ) ;
        col = 2 ;

        /* calculate number of hot keys for the menu */
        len = strlen ( hotkeys ) ;

        /* highlight the first menu item */
        writestring ( menu[choice - 1], 0, col, 15 ) ;

        while ( 1 )
        {
            /* receive key */
            getkey( ) ;

    / * i f speci