
3.6.12 Use Read as a Pattern for Write

Unless this is your first week on the job, I'll bet you've witnessed this scenario several times. Data from an important test had to be recorded in real time and analyzed immediately after the test. The project engineers carefully devised a format for recording the data on tape. The test was performed and data was recorded using that format. The tapes were taken to the data processing people, and project management anxiously awaited the results. Weeks later nobody had seen any reduced data. The data processing people still had not figured out how to read the tapes! The project engineers blamed the data processing section. The head of the data processing department defended his people and tried to put the blame on the project engineers. Why does this happen?

It is easy to see how this could have happened with the Read procedure. The first thing it does is read the number of Field_arrays from the first line of the file and create a discriminated record the proper size to hold the data. Suppose I hadn't written the number of Field_arrays to the first line of the file. I could have thought that it isn't necessary to waste a line on writing that information because it is possible to find the number of fields by reading the file and counting the number of times the "-- data --" header line appears. If I had done that, I would have had to read the whole file to find out how many field specifications it contains, create a form the proper size, and then read the file again to put the data in the form. (Or I could save all the data in memory from the first pass through the file, and then transfer it to the form if there is enough memory space.) If I had written the TEXT in each field before writing the FIRST and LAST column numbers, it would have been much more difficult to read the form from the file. If I had made either of these stupid decisions when I defined the file format, the data wouldn't have been lost. I still could have read the file with a little extra effort.

Writing a file is easy. You can write anything in any order without any problems. Reading a file can be difficult because you often need to know certain pieces of information before you can process others. So, the first rule for establishing a file format is, "Write the routine that gets the data from the file before you write the routine that puts it there." After you have done this you can use your Read routine as a pattern for your Write routine.

Look at Listing 28, which contains both the Read and Write procedures. Let's compare Read to Write. The first thing Read does is Open a file, so the first thing Write does is Create a file. The next thing Read does is to read the number of fields on the form, so Write must write the number of fields on the form before writing anything else. Read uses that number to set the limit on a loop that reads a header, NAME, LINE, FIRST, LAST, PROTECTED, and TEXT, so Write should also set up a loop that writes those things in that order. Read ends by closing the file, so Write should end that way, too. The only thing Read does that doesn't correspond to something Write does is the error checking. (That's because Read can't be sure the file it is trying to read really contains a valid form.)

The Read routine in Listing 28 is more complicated than most file reading routines because it needs the Stored_Form function to create a discriminated record. In many cases a file reading routine is a simple loop or straight line program. In those cases you can create a Write routine from a Read routine simply by using a text editor to change all occurrences of get to put and then make some other minor changes (change Open to Create, for example). But whether you copy and edit the Read source file, or just use a printed copy of the Read source code as a guide for writing the Write source code, the principle is the same. Pick a format that is easy to read, write the Read routine first, and then use that as a pattern for the Write routine.
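To make the mirroring concrete, here is a minimal sketch of a Read and Write pair built this way. It is not Listing 28. The cut-down Field record, the file name, and the sample data are all assumptions made for illustration, and it uses the function form of Get_Line, which needs an Ada 2005 or later compiler.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Read_Write_Pattern is

   -- Hypothetical, cut-down field record (not the real Field_specs).
   type Field is record
      Line, First, Last : Positive;
   end record;

   type Field_Array is array (Positive range <>) of Field;

   File_Name : constant String := "pattern_demo.txt";  -- assumed name

   Sample : constant Field_Array := ((1, 1, 20), (2, 5, 30));

   -- Written first. The count leads the file so the result can be
   -- sized before any field is read.
   function Read return Field_Array is
      F     : File_Type;
      Count : Natural;
   begin
      Open (F, In_File, File_Name);
      Count := Natural'Value (Get_Line (F));
      declare
         Result : Field_Array (1 .. Count);
      begin
         for I in Result'Range loop
            Result (I).Line  := Positive'Value (Get_Line (F));
            Result (I).First := Positive'Value (Get_Line (F));
            Result (I).Last  := Positive'Value (Get_Line (F));
         end loop;
         Close (F);
         return Result;
      end;
   end Read;

   -- Written second, statement for statement against Read.
   procedure Write (Fields : in Field_Array) is
      F : File_Type;
   begin
      Create (F, Out_File, File_Name);               -- mirrors Open
      Put_Line (F, Integer'Image (Fields'Length));   -- mirrors the count
      for I in Fields'Range loop
         Put_Line (F, Positive'Image (Fields (I).Line));
         Put_Line (F, Positive'Image (Fields (I).First));
         Put_Line (F, Positive'Image (Fields (I).Last));
      end loop;
      Close (F);                                     -- mirrors Close
   end Write;

begin
   Write (Sample);
   if Read = Sample then
      Put_Line ("round trip OK");
   else
      Put_Line ("round trip FAILED");
   end if;
end Read_Write_Pattern;
```

Because the count is the first line of the file, Read can declare an array of exactly the right size and fill it in a single pass.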

3.6.13 ASCII Data Files

You perhaps noticed that I used ASCII (rather than binary) format for the FORM files. Not only that, I wastefully put only one data item on each line. Binary files generally use less disk space and are faster than ASCII files. In this case the ASCII files are probably small enough to fit in a single disk sector with room to spare, so an ASCII file probably isn't any bigger than a binary file. The time it takes to convert those few words from binary to ASCII and back is negligible. In cases like these I always prefer ASCII to binary files because I can easily display, print, and edit them.

If I try to write a FORM to a file and then read it back, and it doesn't work, how do I know what went wrong? Did it write the file correctly and fail to read it? Or did it write the file incorrectly? It is easy to print an ASCII file and see. Of course there are utility programs to dump and patch binary files. You can examine blocks of hexadecimal listings and find out what data was written to the file, but that's not as easy as looking at ASCII files with one data item per line. So even if I know I'm eventually going to be dealing with huge files, I usually start developing the file IO routines with ASCII representations of very small files. After they are debugged I switch to binary and test the routines again with small files. Then I try them with the big files.

A computer can easily count lines to determine which numbers are associated with each variable, but I can't. I had some difficulty figuring out which numbers represented lines and columns, especially if I was interested in a field specification near the middle of the file, so I added header lines that said "-- data --" at the beginning of each field specification. They stick out like a sore thumb, and make it easy for me to see where each field specification begins.

The Read procedure could ignore the "-- data --" lines, since they convey no information. This is easily done using skip_line. I decided not to skip them, but to use them as parity checks instead. Every time I would normally have skipped the header line, I read it and make sure it really says "-- data --". If it doesn't, it means the file has been corrupted, the Read routine has gotten out of sync, or the file doesn't really contain a form. In any of those cases I don't want to continue trying to read the form, so Read raises the READ_ERROR exception and quits.
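The header check can be sketched in a few lines. This is only an illustration of the idea, not the code in Listing 28. The procedure name Check_Header, the demo file name, and the file contents are assumptions, and READ_ERROR is declared locally here rather than taken from the FORM_TERMINAL package.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Header_Check is

   READ_ERROR : exception;   -- local stand-in for the package's exception

   Header : constant String := "-- data --";

   -- Read one line and insist that it is the header.
   procedure Check_Header (F : in File_Type) is
      Text : constant String := Get_Line (F);
   begin
      if Text /= Header then
         raise READ_ERROR;
      end if;
   end Check_Header;

   F : File_Type;

begin
   -- Build a small demo file whose second "header" is damaged.
   Create (F, Out_File, "header_demo.txt");
   Put_Line (F, Header);
   Put_Line (F, " 12");
   Put_Line (F, "garbage");
   Close (F);

   Open (F, In_File, "header_demo.txt");
   Check_Header (F);                      -- first header is intact
   Put_Line ("value:" & Get_Line (F));
   Check_Header (F);                      -- raises READ_ERROR
   Close (F);

exception
   when READ_ERROR =>
      Put_Line ("file is corrupt, out of sync, or not a form");
      if Is_Open (F) then
         Close (F);
      end if;
end Header_Check;
```

The check costs one string comparison per field specification, and it stops the reader at the first sign of trouble instead of letting it fill a form with nonsense.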

3.6.14 Storing Boolean Values in a File

I've seen a message on an electronic bulletin board saying that a particular implementation of TEXT_IO has a bug in it that prevents it from properly storing boolean types when ENUMERATION_IO is instantiated for boolean types. I'm not sure if that's true or not. (Perhaps that person just wasn't using it correctly.) What I am sure of is that they were trying to use ENUMERATION_IO to store a boolean value, and I don't think that's a good idea.

The Read and Write routines store the boolean variable PROTECTED in an external file without instantiating ENUMERATION_IO. They simply use the character P to indicate protected fields and U to indicate unprotected ones. What could be easier?

If you really have your heart set on writing TRUE and FALSE, you can use the attributes boolean'IMAGE and boolean'VALUE to convert between Boolean values and text strings, just as I used integer'IMAGE and integer'VALUE to do the same for numbers. I don't see any reason to use four or five characters where one will do the job, but you may have a good reason. Consider this, however. When you look at the ASCII representation of the file, TRUE and FALSE don't tell you much. They just tell you something is true or false. If you see a TRUE in a file, does that mean the field is unprotected? You have to think about it. If I were going to use several characters instead of just one, I would write PROTECTED or UNPROTECTED to the file, not TRUE or FALSE.
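Both approaches fit in a short sketch. The function names To_Char and To_Flag are hypothetical, and READ_ERROR is declared locally for the demonstration. The parameter is called Is_Protected because later revisions of Ada made "protected" a reserved word.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Protected_Flag is

   READ_ERROR : exception;   -- local stand-in for the package's exception

   -- One character per flag: P for protected, U for unprotected.
   function To_Char (Is_Protected : in Boolean) return Character is
   begin
      if Is_Protected then
         return 'P';
      else
         return 'U';
      end if;
   end To_Char;

   -- Anything other than P or U means the file is not what we expect.
   function To_Flag (C : in Character) return Boolean is
   begin
      case C is
         when 'P'    => return True;
         when 'U'    => return False;
         when others => raise READ_ERROR;
      end case;
   end To_Flag;

begin
   Put (To_Char (True));                                -- writes P
   New_Line;
   Put_Line (Boolean'Image (To_Flag ('U')));            -- writes FALSE
   -- The attribute route, for those who insist on TRUE and FALSE:
   Put_Line (Boolean'Image (Boolean'Value ("true")));   -- writes TRUE
end Protected_Flag;
```

Neither version needs an instantiation of ENUMERATION_IO, and the one-character version gets an extra validity check for free.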

3.6.15 One Compilation Unit Per File

By now you must have noticed that I almost always put just one compilation unit in a file. I could have combined all 19 FORM_TERMINAL listings in a single file. That would have made it easier for you to compile the FORM_TERMINAL. You could just submit that one file to the compiler and go visit your coworkers at the water cooler. Ten minutes later you could return to your terminal and see if it was done yet.

There are two good reasons not to combine several compilation units in one file. The primary one is that you have to recompile a whole file at a time. If the file contains ten long compilation units, and you change one, then you waste time recompiling nine units that haven't changed.

The secondary reason is that separate files make it easier to find particular compilation units. If there is something wrong with the Display routine, it is easier to look in Listing 29 than to search a huge listing looking for it. (This reason for separating compilation units isn't as compelling as it once was because modern software engineering environments make it possible for anyone associated with the project to search any file for anything electronically, but I still think it is a good idea to try to keep files small.)

Everyone who has completed an introductory Ada course should have had it pounded into his head why package specifications should be separated from package bodies. I shouldn't need to tell you that using separate files for the package body and specification allows you to make changes to the body without making units that depend on the specification obsolete. I won't insult your intelligence by reminding you of that.

You don't need to combine compilation units in a single file to compile them all at once. Every operating system has something equivalent to a shell script (perhaps a ".BAT" file, or ".COM" file) that lets you execute a sequence of commands at once. Whenever I have a software component like the FORM_TERMINAL that is spread out over several files, I just write a script that compiles them all in the correct order.

There are times, however, when you have to break the "One unit per file" rule. Some Ada compilers require all parts of a generic package or subprogram be in a single file. On those compilers you don't have any choice but to put multiple compilation units in the file in that case. Since I know that is a potential portability problem, I always put all the components for a generic unit in one file whether the compiler I am using requires me to or not.

3.6.16 Encapsulating Details in One File

Listing 28 is another example of when to break the rule. It contains two separate subunits, Read and Write, even though they aren't generic.

Normally, I try to encapsulate design details in a single compilation unit. The format of the file containing a FORM is a design detail I would like to confine to one location. If possible, I would like to make only one compilation unit dependent upon the external file format, so any changes to that format will require me to recompile only one unit.

In this case, Read and Write both need to know the external file format. Since the file format affects both Read and Write, any changes made to the format affect both subunits. There isn't any practical way I can see to encapsulate the format in just one unit. If you change one subunit without changing the other, it will cause problems.

Ada's compilation order rules sometimes help out, but not this time. If you change the FORM_TERMINAL body, Ada will realize that Read and Write are obsolete and need to be recompiled; but since Read and Write are both subunits of the body, you can change and recompile either without making the other one obsolete. Ada won't automatically tell you that you have to change and recompile the other subunit.

Since I couldn't encapsulate the external file format in a single subunit, I did the next best thing. I encapsulated it in a single file. If I modify one of the units, I'm bound to notice the other one and remember that it has to be changed, too. This isn't foolproof. It is possible to open the file, change the format in one subunit without changing the other, and recompile the file, but it's hard for me to imagine someone who could do that accidentally.

Putting both subunits in the same file not only reminds me to make the same changes in both, it also makes it easier to use the text editor to cut and paste patches to both subprograms at once.

Encapsulating the external file format in a file with two subunits gives us the flexibility to change the external format without affecting any other part of the program. If we want to use binary external form instead of ASCII, we can change the Read and Write routines, and all our changes are confined to one source file. Whenever we compile that file, we automatically compile a matching pair of routines. We never have to worry about accidentally compiling the old ASCII format Read and the new binary format Write.

3.6.17 Formatted I/O

I've hated formatted output ever since I first encountered a FORMAT(F6.2) statement 22 years ago. You would think that after all these years it would have gotten easier, but it hasn't. It is still easy to make a mistake when counting spaces, so column headings don't line up correctly! I never seem to get it right the first time.

Laying out a two dimensional form is even more of a hassle than laying out one dimensional column headers. The FORM_TERMINAL requires you to count rows, columns, and string lengths. Everything has to be exactly right, or else CONSTRAINT_ERROR raises its ugly head.

The difficulty of formatting the display on the screen almost led me to a fatal design error. This error is so common, and so important, that it is worthwhile to devote the next subsection to it.

3.6.18 The Danger of Improvement

It's ironic, but sometimes you can improve a good product so much that it becomes useless. Several examples come quickly to mind. There was a word processing program that dominated the CP/M market in the late seventies. The manufacturer added many features to this good product, and released the new, improved version. The resulting product was so slow and difficult to use that it got terrible reviews in computer magazines, and other word processors tore the market away from it. There are two real-time operating systems that came out in the seventies that are suffering the same fate. Too many good products have failed because they've been improved too much. Some people can get upset and nasty when I criticize their products, so I'll pick on my own FORM_TERMINAL and show what almost happened to it.

The original FORM_TERMINAL consisted of six files that looked a lot like Listings 25 through 30. It did not have the capability of creating or editing forms. I used a text editor to create the external file containing the field specifications. As I pointed out, that was a nuisance, but it only had to be done once for each form I created, and I only created ten different kinds of forms. Each time I did it, it took less than an hour, so I spent less than 10 hours total time creating files with the text editor.

The FORM_TERMINAL is such a useful user interface that I wanted to be sure to include it in this book. I realized that its most serious deficiency was the laborious procedure required to create the form file. I decided to add the Create procedure, which would make this much easier. Well, after several days I got the Create procedure working, and it only increased the size of the FORM_TERMINAL package from six files to ten files. (Listings 31 through 34.)

I used the Create procedure for a while, and realized that it forced the user to start from scratch every time a new form was needed. If you wanted to correct an error in a form, or make a second form almost exactly like another form, you had to start from scratch. I needed a way to edit an existing form, so I wrote the Edit procedure.

The Edit procedure is spread out over eight files (Listings 35 through 42), and brings the total number of files in the FORM_TERMINAL to eighteen. Needless to say, it took considerable time and effort to get this feature working.

I discovered that while using the Create or Edit procedure it was possible to produce a form containing errors. I needed Error_Recovery to allow me to recursively call the Edit procedure until the form was error free. One more small file (Listing 43) brought the total to nineteen files in the FORM_TERMINAL package.

I used the Make_Form and Edit_Form programs (Listings 44 and 45) and discovered that the first call to Error_Recovery raises STORAGE_ERROR on my IBM PC AT clone. The publication deadline was getting close, and FORM_TERMINAL didn't work any more. I was panic stricken.

Finally I got the FORM_TERMINAL, as shown in Listings 25 through 43, to work on a VAX (and also on a PC if you don't make recursive errors). Most of your application programs won't use Create, Edit, or Error_Recovery. That means thirteen of the nineteen source files create dead code that will have to be removed by an optimizer (if you have one).

Looking back, I see countless hours spent writing slick utility programs that save a few minutes. I was tempted to remove Create and Edit from the package specification, then remove all the code associated with them, and never tell you about them. That's less embarrassing to me, but I'd rather have you learn from my mistake. The whole sordid package is there for you to see.

It is easy to get seduced into doing more than you should. From time to time it is a good idea to ask yourself, "Is this really worth it?" Sometimes you have to admit you made a mistake and go back to an older version.

One easy way to return FORM_TERMINAL to its original small size is to use the Edit and Create stubs in Listings 46 and 47. If you compile these two small stubs, they write error messages if you should ever try to Edit or Create a form. I don't expect any of your application programs to call these routines, so they produce dead code, but not nearly as much as the real Edit and Create routines do. The better way, of course, is to edit the package specification and body to remove all references to Create, Edit, and Error_Recovery.

If I could live my life over again, I wouldn't have written the Edit and Create procedures; but the fact is that I did write them, and there are some lessons that can be drawn from them. Let's look at them.

3.6.19 Creating a New FORM

Create takes most of the work out of designing a form. You still need to decide what the form should look like, but the Create procedure does all the counting of lines and columns for you.

I wanted to make Create an independent program outside the FORM_TERMINAL package, but it needs to know about Field_specs and the SCREEN. I would have to make those internal details visible to all programs outside the package if the Create procedure was outside the package. I don't want clever application programmers directly manipulating the Field_specs and the SCREEN. Putting the Create procedure inside the package allows me to keep those details hidden from application programs.

The Create procedure asks you if you need instructions. If you do, it gives you a screen full of explanation. When you have read this, it covers the screen with '~' characters. The wiggles are there to help you see how much space you have to work with. (They won't appear on the form you create.) You can use the arrow keys to move the cursor around wherever you want, and type whatever you like. Keep doodling around until the form looks like you want it to. If you make a mistake, just type over what you have already done.

Eventually it will look like you want it to. When it does, it is time to tell the computer to store it. In general, you do this by pointing to the beginning and end of each field with the cursor and using function keys to indicate if it is protected or not. Each time you do this, the program will ask you to give the field a name. Every field must have a unique name, and must fit on a single line. Remember there is the concept of "next field" and "previous field", so be sure to specify them in the correct order (just as you must specify enumeration types in the correct order). Usually you will want to start with the field in the upper left corner and work to the right and down, but that isn't necessary. (If you want to really baffle a user you can start at the bottom and work up!)

You point to the beginning of a field by moving the cursor to the first position in the field and pressing F1 or F2. Press F1 if this is to be a protected field the user can't modify. Press F2 if it is an area where the user is expected to enter data. Use the RIGHT arrow key to move the cursor to the last character in the field. (You can use the LEFT arrow key if you overshoot the end.) When the cursor is at the proper place, press F3. The computer stores the line number, the first and last column numbers, and the text contained in those columns. (The text could be a prompt, a default response, or blank spaces.) It also stores whether this field is protected or not. All that remains for you to do is to give it a unique name of 20 characters or less. You do this by typing the name at the prompt at the bottom of the screen and pressing the RETURN key. (Note: you may use significant embedded blanks and underlines, but all lower case letters will be converted to upper case automatically.)

After you have entered a field name the cursor returns to the end of the field you just entered, and you may enter the next field. When all the fields have been entered, press F10.

The Create procedure leaves the form in memory. You probably want to write it to a disk file. The Make_Form program (Listing 44) shows you how to do this. It doesn't do much more than call FORM_TERMINAL.Create and FORM_TERMINAL.Write.

3.6.20 Character Substitution

When I designed the SCROLL_TERMINAL I chose the question mark key as a help request. I couldn't imagine any time a user would answer a question with another question, so I decided the question mark key should always raise the NEEDS_HELP exception. I kept the same convention in the FORM_TERMINAL. There were no problems until it came time to create a form.

It is likely someone will want a form to display a prompt with a question mark in it. How can someone create a form containing a question mark when pressing the question mark key always raises the NEEDS_HELP exception? The solution (near the end of the Process_Keystrokes procedure in Listing 33) was to substitute the escape key for the question mark. I don't like to map keys to other functions, but in this case it seemed like the best way to solve the problem.

3.6.21 Long Strings

Sometimes string literals won't fit on one line. Suppose you want to print a string that is 60 or 70 characters long. The print statement might appear at a point in the program where there are several levels of indentation, and you may be using dot notation, and your word processor may insist on saving a generous right margin. There isn't room to put SCROLL_TERMINAL.put_line("70 characters here"); on one line. The text editor inserts a carriage return somewhere in the string literal, and Ada generates an error saying something about an unterminated string.

I ran into that problem in Listing 32. The help messages wouldn't fit on a single line. The simple solution was to break the messages into two strings (one string on each of two lines) and print the catenation of the two strings. You can use this trick whenever a string literal won't fit on a single line.
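Here is a minimal sketch of the trick. The message text is made up for illustration (it is not one of the help messages in Listing 32), and it uses Ada.Text_IO rather than the SCROLL_TERMINAL package so it can be compiled on its own.

```ada
with Ada.Text_IO;

procedure Long_String is
begin
   -- Two short literals joined with "&" form a single string, so
   -- neither source line has to hold the whole message.
   Ada.Text_IO.Put_Line
     ("Move the cursor to the first column of the field, " &
      "then press F1 (protected) or F2 (unprotected).");
end Long_String;
```

The message still comes out as one line of output. Only the source text is split.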

3.6.22 IN OUT Mode

I am ashamed to say that, in my desperation to try to get the complete FORM_TERMINAL package to fit on a PC, I saved space by intentionally misusing the IN OUT mode in Listing 33. That was a really bad thing to do. Let me explain why.

Lazy programmers always use IN OUT mode to avoid those annoying error messages Ada generates when you misuse an IN or OUT mode parameter. Ada warns you of those errors for your own good. Using IN OUT mode to suppress them simply prevents you from detecting the error at compile time, and makes it appear at run time, when it is much more difficult to detect.

Pardon my FORTRAN, but Figure 30 shows what can happen if you don't pay attention to parameter modes. This is a fragment of a program I wrote for a client in FORTRAN because he didn't have an Ada compiler for his computer. FORTRAN treats all parameters the way Ada treats IN OUT mode parameters. TIME is expressed in milliseconds, and I wanted to convert it to HOURS, MINUTES, and SECONDS so I could display the time in "HH:MM:SS" format. The program did strange things because the value of TIME was corrupted by the SPLIT subroutine. For example, if the value of TIME was 34,644,822 before calling SPLIT, the display correctly showed 09:37:24, but the value of TIME was changed to 24,822. It took me most of a day to figure out what went wrong. If I had written the routine in Ada, it would have looked like Figure 31. Since I thought I was just reading TIME and not changing its value, I would have declared it to have IN mode. Ada would have spotted my error at compile time. Then I would have rewritten it as shown in Figure 32. If I had used IN OUT mode for all the parameters in Figure 31, Ada would not have caught the error, and I would have had the same problem I had in FORTRAN.
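As a rough illustration of the corrected approach, here is a sketch of a Split procedure that takes TIME as an IN parameter and works on a local copy. It is not a reproduction of Figure 31 or Figure 32. The parameter names, the local variable Remaining, and the demo program around it are assumptions.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Split_Demo is

   procedure Split (Time    : in  Natural;   -- milliseconds
                    Hours   : out Natural;
                    Minutes : out Natural;
                    Seconds : out Natural) is
      Remaining : Natural;   -- the extra local variable
   begin
      -- An assignment such as "Time := Time rem 3_600_000;" would be
      -- rejected at compile time, because Time has IN mode.
      Hours     := Time / 3_600_000;
      Remaining := Time rem 3_600_000;
      Minutes   := Remaining / 60_000;
      Remaining := Remaining rem 60_000;
      Seconds   := Remaining / 1_000;
   end Split;

   T       : constant Natural := 34_644_822;
   H, M, S : Natural;

begin
   Split (T, H, M, S);
   -- Hours = 9, Minutes = 37, Seconds = 24
   Put_Line (Natural'Image (H) & Natural'Image (M) & Natural'Image (S));
   -- T is untouched; it could not even be passed as an IN OUT
   -- parameter, because it is a constant.
   Put_Line (Natural'Image (T));
end Split_Demo;
```

With IN mode the compiler, not the programmer, guarantees that the caller's value of TIME survives the call.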

Notice the solution in Figure 32 requires the declaration of an extra variable. I didn't want to do that in Listing 33 because Form_specs take up lots of space. To be brutally honest, I was using DATA as a global variable, but pretending to pass it as a parameter. I should have made DATA an OUT parameter because DATA is produced by Get_Field. Then I should have declared a local variable of type Form_specs and copied it to DATA at the end of the procedure. (In fact, that's what I did in the original version. I had to take out the extra variable because it caused STORAGE_ERROR to be raised on the PC.)

Legitimate use of IN OUT mode is rare. It should only be used in those cases where you are passing a variable to a routine and you expect that routine to somehow modify it and return the modified value back to you. If you use IN OUT mode to avoid declaring an extra variable, your program may work, but it may confuse a maintenance programmer. He may spend hours trying to figure out where the calling program created the original value (it really didn't), or where the calling program will use the transformed value (it really doesn't). You shouldn't mislead someone into thinking a routine transforms a value if it simply uses it or produces it.
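For contrast, here is a small sketch of the kind of routine where IN OUT mode is appropriate. The caller supplies a value, and the routine transforms it and hands it back. The procedure name Upper_Case and the sample field name are hypothetical.

```ada
with Ada.Text_IO;
with Ada.Characters.Handling;

procedure In_Out_Demo is

   -- The caller's string really is both consumed and produced,
   -- so IN OUT tells the maintenance programmer the truth.
   procedure Upper_Case (Name : in out String) is
   begin
      for I in Name'Range loop
         Name (I) := Ada.Characters.Handling.To_Upper (Name (I));
      end loop;
   end Upper_Case;

   Field_Name : String := "zip code";

begin
   Upper_Case (Field_Name);
   Ada.Text_IO.Put_Line (Field_Name);   -- writes ZIP CODE
end In_Out_Demo;
```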

3.6.23 Editing an Existing FORM

I sometimes became very frustrated with Create because I would almost be finished with a complicated form, and would make a little mistake. There was nothing I could do except start all over again. There was no way to edit the form.

If you have a complicated form with many fields, and you just want to add one more field, swap the position of two fields, or correct a spelling error in a prompt, you can't fix it with Create. Create will make you enter the entire form from scratch. That's a lot of unnecessary work, and it gives you too many chances to make a mistake.

The Edit procedure can be used to make changes to the Field_specs or SCREEN. The Edit_Form program, Listing 45, uses the Edit procedure to make it easy for you to make minor changes in the form.

3.6.24 null Exception Handlers

Students often make an amusing mistake when they are first exposed to exceptions. They think they need to handle every exception in every routine. If they don't know what to do, they put a do-nothing exception handler at the end of the block.

    begin
       -- some code here
    exception
       when others => null;
    end;

I generally come down pretty hard on the student because he is telling Ada, "I don't know what went wrong, so just ignore it and proceed to the next block as if everything is OK." I used to say there is never a time when a null statement is an appropriate exception handler. Now I say it is ALMOST never appropriate. I used one in Listing 45.

The FORM_TERMINAL.Read procedure will raise LAYOUT_ERROR if it reads a form into memory from a file and then discovers an error in it. A LAYOUT_ERROR should be rare, and will probably force most programs to terminate abnormally. The Edit_Form program is a special case. When it reads a form from a disk there is a good chance there is a LAYOUT_ERROR in it. (That's why we want to edit it!) If we only allow the program to read good forms, then it isn't much use to us.

Notice that the Edit_Form procedure encloses the FORM_TERMINAL.Read(FILE); statement in a block and provides a local exception handler for that block. The exception handler ignores the LAYOUT_ERROR and lets the program proceed normally. Any other exception, like READ_ERROR, is not ignored and is handled by an exception handler at the end of the program.
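The shape of that block can be sketched as follows. This is not Listing 45. Form_Stub is a hypothetical stand-in for the FORM_TERMINAL package, and its Read always raises LAYOUT_ERROR so the handler has something to ignore.

```ada
with Ada.Text_IO;

procedure Edit_Form_Sketch is

   -- Hypothetical stand-in for the real FORM_TERMINAL package.
   package Form_Stub is
      LAYOUT_ERROR : exception;
      READ_ERROR   : exception;
      procedure Read (File_Name : in String);
   end Form_Stub;

   package body Form_Stub is
      procedure Read (File_Name : in String) is
      begin
         -- Pretend the form loaded but its layout is flawed.
         Ada.Text_IO.Put_Line ("reading " & File_Name);
         raise LAYOUT_ERROR;
      end Read;
   end Form_Stub;

begin
   begin
      Form_Stub.Read ("demo.frm");   -- assumed file name
   exception
      when Form_Stub.LAYOUT_ERROR =>
         null;   -- a flawed form is exactly what we came to edit
   end;

   Ada.Text_IO.Put_Line ("form is in memory; editing can begin");

exception
   when Form_Stub.READ_ERROR =>
      Ada.Text_IO.Put_Line ("file does not contain a form");
end Edit_Form_Sketch;
```

The null handler names one specific exception and guards one specific statement. That is very different from "when others => null;" at the end of a routine, which swallows everything.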

