Programmers Workshop Series Part 3

Dim about DIM? Make the most of every byte with these memory-saving techniques.

Volume 2, Number 2, April 1984


By David Lewis

WHEN you write any sort of file program it is usual to put the data into arrays. Usually the file is divided into fields, such as NAME$(N), FSTNAME$(N), ADDRESS$(N), AGE$(N), and so on.

A typical file will want to have a hundred or so records, each divided into say 12 fields.

Now these arrays need dimensioning. You may make sure of plenty of space by writing DIM NAME$(200), FSTNAME$(200), etc. Then you run the program, and start entering actual records.

All is going well. You have 60 records in the memory and you are just thinking of saving the file before breaking for a well-earned cup of coffee.

One more record goes in - and suddenly the dreaded message "No room" appears.

Disaster! Not only is all your work wasted, but you can't understand why your BBC Micro has closed up on you so soon.

You had put in about 60 records of 120 characters each. That took about 7k of memory. Even if you had a 12k program in Mode 7 that should still have left at least 10k of memory unused.

So the memory just can't be full... can it?

Oh yes, it can. You see, you used up 10k of memory when you wrote those "make sure" DIMs - and two thirds of the space you used was wasted.

Here is the way it works. When you write DIM NAME$(200), that doesn't reserve space for the name strings. It couldn't, because no one knows the length of each string. What the computer does is reserve space for the addresses of the name strings.

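Each completed element in the array will have an address, taking two bytes, and two numbers, taking one byte each: the number of bytes set aside for the string and the number of characters it currently holds.

As an example, the 8 bytes of the address section from 4100 hex to 4107 hex (enough for two elements), when filled, might look like Figure I.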

Figure I: String storage (in hexadecimal)

The address space was empty at the DIM stage, as neither the addresses nor the length of the strings were known. DIM NAME$(200) reserved 800 bytes of memory (200 x 4) in the address section.
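The four-byte entry can be modelled in a few lines of Python. This is only a sketch: the byte order (address low byte first, then the space set aside, then the current length) is an assumption, and the names `pack_entry` and `unpack_entry`, like the sample values, are invented for illustration.

```python
import struct

# Assumed layout of one array entry: a 2-byte address (low byte first),
# 1 byte for the space set aside, 1 byte for the current string length.
ENTRY = struct.Struct("<HBB")


def pack_entry(address, reserved, length):
    """Return the four bytes that would describe one string."""
    return ENTRY.pack(address, reserved, length)


def unpack_entry(raw):
    """Split four bytes back into (address, reserved, length)."""
    return ENTRY.unpack(raw)


# Hypothetical values: a 6-character string held in a 16-byte slot.
entry = pack_entry(0x5A70, 16, 6)
print(" ".join(f"{byte:02X}" for byte in entry))  # the entry in hexadecimal
print(ENTRY.size, "bytes per element")
```

An empty element would simply hold zeros in all four bytes until a string is assigned to it.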

You used 12 fields, so the total space reserved for addresses took 12 x 800 bytes, nearly 10k of memory.
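The same sum can be checked with a short Python sketch. The function name `address_space` is made up for this example. The optional flag covers the fact that BBC BASIC arrays also have an element 0, so DIM NAME$(200) really holds 201 elements; any small header the array itself may carry is ignored here.

```python
BYTES_PER_ENTRY = 4  # 2-byte address plus two 1-byte numbers


def address_space(highest_subscript, fields, count_element_zero=False):
    """Bytes reserved for string addresses by DIMs of this size."""
    elements = highest_subscript + (1 if count_element_zero else 0)
    return elements * BYTES_PER_ENTRY * fields


total = address_space(200, 12)
print(total, "bytes, or about", round(total / 1024, 1), "k")
print(address_space(200, 12, count_element_zero=True), "bytes counting element 0")
```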

The computer also allows some working space, immediately above the Basic program. So the memory map looks like Figure II.

Figure II

The actual file data in our example started at 6500 hex. HIMEM is at 7C00 hex, so you had less than 6k of memory available for the data.

No wonder that it hit the roof when you had only about 60 records stored. Now at this point if you examine the memory you will find only one third of the address space has been used, leaving 6k of empty, wasted space, between 4E00 and 6500 hex.
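The hexadecimal addresses are easier to compare once they are turned into byte counts. The sketch below does this in Python, taking HIMEM as 7C00 hex, which is the 31744 used in the worked sum further on (an assumption here); the variable names are invented for the example.

```python
ADDRESS_USED_END = 0x4E00  # end of the filled part of the address space
DATA_START = 0x6500        # where the actual file data began
HIMEM = 0x7C00             # assumed: 31744, as in the later calculation

wasted = DATA_START - ADDRESS_USED_END  # empty address entries
data_room = HIMEM - DATA_START          # space left for the strings

print("wasted address space:", wasted, "bytes")
print("room for file data:  ", data_room, "bytes")
```

On these figures the unused address entries take up as much memory as the whole of the space left for data.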

You have saved enough address space for 200 records, but you only have enough data space for 60 records. What can be done about this waste? The answer is to calculate the maximum number of records very carefully. Then you use it to make sure that you reserve the absolute minimum of address space, by a minimum dimension statement.

Work out the average or maximum number of characters in each record. For example, NAME (15 characters), FSTNAME (15), ADDRESS (40), AGE (2), etc. Let us say that this comes to 120 characters.

Add the number of fields, say 12, multiplied by 4 (the four address bytes for each field), which means add 48. This gives a new total of 168 bytes for each record.

Find HIMEM and subtract TOP. (If your program is loaded, you can find these numbers by PRINT HIMEM, and PRINT TOP.) Example:

31744 - 14286 = 17458

Subtract 2,500, for working space above the program:

17458 - 2500 = 14958

Divide by 168:

14958/168 = 89

That is the maximum number of records, so write in your program: DIM NAME$(89), FSTNAME$(89), etc.

You can get the computer to do the whole calculation, of course. You enter T for the total characters in all fields, and NUM for the number of fields.


MAX = (HIMEM - TOP - 2500) / (T + 4 * NUM)
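For readers who want to try different figures away from the machine, the formula can be sketched in Python. The function name `max_records` and its parameter names are invented for this example; the 2500-byte working space and the four bytes per field are the article's figures, and the result is rounded down to a whole record.

```python
def max_records(himem, top, t, num, workspace=2500, bytes_per_entry=4):
    """Whole number of records that fit between TOP and HIMEM.

    t   -- total characters allowed for in one record
    num -- number of fields in a record
    """
    free = himem - top - workspace
    per_record = t + bytes_per_entry * num
    return free // per_record


# The worked example: HIMEM 31744, TOP 14286, 120 characters, 12 fields.
print(max_records(31744, 14286, t=120, num=12))
```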



This set of notes should enable you to store the maximum number of records, but it leaves one question unanswered: what are we to use for T?

The maximum number of characters in any field may be very different from the average in the actual file.

There are many people with short names, such as Joe Bloggs of 5 West Street, and a few people with long names, such as Marmaduke Standingworthy of The Larches, Fotheringham Avenue.

If, when you have built your file, you are only going to search through it, display it or print parts of it, you can take T as the average number of characters in a record.

In this case a typical file, with its records divided into 12 fields, may have maximum numbers of characters up to 120, while the average comes out at about 50 characters.

By using T = 60 in the above formula, which leaves a small margin over that average, you allow space for about 138 records (14958/108), instead of the 89 calculated using the maximum characters.
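Both candidate values of T can be measured from a sample of real records. The Python sketch below uses a hypothetical two-record, three-field sample built from the two people mentioned above; the function names are invented for the example. The average is taken over whole records, while the maximum adds up the longest entry found in each field.

```python
# Hypothetical sample: surname, first name, address.
SAMPLE = [
    ["BLOGGS", "JOE", "5 WEST STREET"],
    ["STANDINGWORTHY", "MARMADUKE", "THE LARCHES, FOTHERINGHAM AVENUE"],
]


def average_t(records):
    """Average number of characters in a whole record."""
    return sum(len(field) for rec in records for field in rec) / len(records)


def maximum_t(records):
    """Sum of the longest entry in each field."""
    fields = len(records[0])
    return sum(max(len(rec[i]) for rec in records) for i in range(fields))


print("average T:", average_t(SAMPLE))
print("maximum T:", maximum_t(SAMPLE))
```

A larger sample gives a more trustworthy average, but a single unusually long entry is enough to push the maximum up.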

But what else do you want to do with the file? Do you want to constantly edit it, changing the positions of the long records, again and again? Or do you want to sort it into alphabetical order, or into date of birth order?

In that case, your friend Marmaduke may start at the top of the list, and be passed down through nearly every array element in turn until the SORT routine finds him his proper place near the bottom of the list.

Each array element must, in this case, be long enough to contain the longest string, or else the computer will have to make extra space for the long records, wasting the space already given to the short records.

You must reserve this array space before you establish the file:

Add together the maximum number of characters in each field, to determine T.

Use T to find MAX: write the DIM statements, using MAX.

Immediately after the DIM statements, write:

FOR X=1 TO MAX
NAME$(X)=STRING$(MN," "):NAME$(X)=""
AGE$(X)=STRING$(MA," "):AGE$(X)=""
NEXT

and so on, where MN, MF, MA are the maximum numbers of characters in the respective fields.

The eight address bytes for Bloggs, Joe will appear as in Figure III.

Figure III

The 16 bytes reserved between 5A70 and 5A80 will allow STANDINGWORTHY to fit in, during the sort process, without the computer having to find a new place for the long name, wasting the old space completely.
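The effect can be seen in a toy Python model. It is a deliberate simplification and not a description of how the machine manages its memory in detail: it assumes only that a string too long for its reserved space is given a fresh block and that the old block is never handed back. The class `StringStore`, the function `sink_long_name` and the short name LEE are all invented for the example.

```python
class StringStore:
    """Toy model: string space is only ever taken, never handed back."""

    def __init__(self, elements):
        self.reserved = [0] * elements  # bytes set aside for each element
        self.taken = 0                  # total bytes taken from free memory

    def assign(self, index, text):
        if len(text) > self.reserved[index]:
            # Does not fit: take a fresh block and abandon the old one.
            self.taken += len(text)
            self.reserved[index] = len(text)

    def abandoned(self):
        """Bytes taken from memory but no longer owned by any element."""
        return self.taken - sum(self.reserved)


def sink_long_name(store, names):
    """Load the names, then swap the first one down to the last place."""
    names = list(names)
    for i, name in enumerate(names):
        store.assign(i, name)
    for i in range(len(names) - 1):
        names[i], names[i + 1] = names[i + 1], names[i]
        store.assign(i, names[i])
        store.assign(i + 1, names[i + 1])
    return names


NAMES = ["STANDINGWORTHY", "BLOGGS", "LEE"]

plain = StringStore(len(NAMES))
sink_long_name(plain, NAMES)

padded = StringStore(len(NAMES))
for i in range(len(NAMES)):
    padded.assign(i, " " * 16)  # reserve 16 bytes first, as with STRING$
sink_long_name(padded, NAMES)

print("no reservation:", plain.taken, "taken,", plain.abandoned(), "abandoned")
print("16 reserved:   ", padded.taken, "taken,", padded.abandoned(), "abandoned")
```

In the model, every element the long name passes through ends up with a full-length block anyway, and the short blocks it displaced are lost. Reserving the space first costs more at the start but nothing afterwards.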

Of course it is unfortunate that the total number of records will be cut drastically. But the facility to sort files at any time is one of the great advantages of a computer and should not be given up lightly.

Is there any way of lessening that four bytes of address space per record per field?

The four bytes are particularly irksome if the field itself is a one-letter code such as M or F, or a field such as AGE that uses two characters at most.

Well, you could put the whole record in one array element, RECORD$(N), and allow 128 bytes for each element.

Then you would write a sub-routine to analyse the record string, looking for dividers. So RECORD$(1) might be BLOGGS/JOE/10 WEST ST/47/, etc.
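As a rough illustration of the idea, here is a Python sketch that packs fields into one string and analyses it again. The function names and the four-field sample record are invented for the example, and it assumes the divider character never appears inside a field.

```python
DIVIDER = "/"
BYTES_PER_ENTRY = 4  # address bytes for one array element


def pack_record(fields):
    """Join the fields into one string, each followed by a divider."""
    return "".join(field + DIVIDER for field in fields)


def unpack_record(record):
    """Split a packed record back into its fields."""
    return record.split(DIVIDER)[:-1]  # drop the empty piece after the last divider


def overhead(num_fields, packed):
    """Bytes per record spent on housekeeping rather than data."""
    if packed:
        return BYTES_PER_ENTRY + num_fields  # one entry plus one divider per field
    return BYTES_PER_ENTRY * num_fields      # one entry per field


record = pack_record(["BLOGGS", "JOE", "5 WEST STREET", "47"])
print(record)
print(unpack_record(record))
print("separate arrays:", overhead(12, packed=False), "bytes per record")
print("one packed array:", overhead(12, packed=True), "bytes per record")
```

The price is the time spent searching for dividers whenever a single field is wanted.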

This might be the subject of another article, but for the moment I hope that I have shed a little more light into the memory system and the storage of array variables in our favourite computer.