The gap method (two spans in one buffer)

Next: The linked list Up: Basic Sequence Data Previous: The array method

The gap method (two spans in one buffer)

The gap method is only a little more complex than the array method but it is more much efficient. In this method you have one large buffer that holds all the text but there is a gap in the middle of the buffer that does not contain valid text. (See Figure 3)

Figure 3: The gap (or two span) method

Three descriptors are needed: one for each span and one for the gap. The gap will be at the text ``cursor'' or ``point'' where all text editing operations take place. Inserts are handled by using up one of the places in the gap and incrementing the length of the first descriptor (or decrementing the begin pointer of the second descriptor). Deletes are handled by decrementing the length of the first descriptor (or incrementing the begin pointer of the second descriptor). ItemAt is a test (first or second span?) and an array reference.

When the cursor moves the gap is also moved so if the cursor moves 100 items forward then 100 items have to be moved from the second span to the first span (and the descriptors adjusted). Since most cursor moves are local, not that many items have to be moved in most cases.

Actually the gap does not need to move every time the cursor is moved. When an editing operation is requested then the gap is moved. This way moving the cursor around the file while paging or searching will not cause unnecessary gap moves.

If the gap fills up, the second span is moved in the buffer to create a new gap. There must be an algorithm to determine the new gap size. As in the array case, the buffer can be extended to any length. In practice, it is usually increased in size by some fixed increment or by some fixed percentage of the current buffer size and this becomes the size of the new gap. With virtual memory we can make the buffer so large that it is unlikely to fill up. And with some help from the operating system, we can expand the gap without actually moving any data. This method does use up large quantities of virtual address space, however.

This method is simple and surprisingly efficient. The gap method is also called the split buffer method and is discussed in [9].

Next: The linked list Up: Basic Sequence Data Previous: The array method

Charlie Crowley
Thu Jun 27 15:36:10 MDT 1996