
Chapter 4


PROGRAMMING ISN'T SOFTWARE ENGINEERING

The difference between programming and software engineering is like the difference between gardening and farming. You could say the difference is the size of the effort, but there is really more to it than that.

Farming isn't just gardening on a large scale. You can't use the same techniques for farming that you would if you were gardening. Any farmer who tries to plant his crops using nothing more than a shovel, rake, and hoe is not going to succeed. Farming requires more powerful tools. A farmer needs a tractor.

Gardening isn't just farming on a small scale. You can't use the same techniques for gardening that you would use if you were farming. I shouldn't try to use a tractor to plant my six tomato seedlings. It would be more trouble just getting the tractor into my back yard than it would be to dig six holes with a shovel. The amount of money I save growing my own tomatoes wouldn't pay the maintenance on the tractor.

Even though there are differences between gardening and farming, there are some fundamental principles that don't change. Regardless of the size of the effort, you still need to provide the plants with adequate nourishment, water, and the right amount of sunlight. Things that you learn about soil preparation will be useful to you regardless of whether you are gardening or farming.

Software engineering isn't just programming done by more people over a longer period of time. You need different techniques for "programming in the small" and "programming in the large." In this section you will see several examples of small programming projects and one example of software engineering. Some of the techniques that work for small programming projects aren't adequate for large projects. Some of the techniques necessary for large projects are too awkward for small projects. Some basic principles (like the ones discussed in the previous sections on numeric considerations and IO utilities) hold for both programming and software engineering.

I'm sure you wouldn't try to plant a 40-acre farm with just a shovel, nor would you be foolish enough to try to use a tractor to plow a 5 x 10 foot backyard garden. Most people intuitively know when an area of land is too big to shovel or too small to plow. Unfortunately, many people lack that same intuition when it comes to software development. They have one method, and they use it regardless of the size of the project. Using software engineering techniques on a small program leads to just as much trouble as using simple programming techniques on a large project does. You will save yourself a lot of grief if you can recognize when to shovel and when to plow.

You are about to see several little software tools that are examples of programming. They took a few hours to write and debug. I didn't spend weeks planning them; I just started writing with a vague goal in mind. As I got closer to the goal my vision became clearer. I used the programs and then made minor modifications to improve them. That is an appropriate approach to take for small projects.

If the project is large, seat-of-the-pants programming just won't work. You don't just sit down one afternoon and write the operational flight program for the space shuttle. You can't just say to yourself, "I'm not really sure what this space shuttle software should do, but it will come to me if I just wing it." Big programs require software engineering. The Draw_Poker program is a small example of a big program, and it shows some of the things you have to do differently when working on a large project.

I wish I could give you a simple rule, like, "Use simple programming techniques for projects less than 1,000 lines of code, and use software engineering for larger projects," but I can't. There isn't a clear-cut boundary between big and small that can be expressed in lines of code. Even if there were, it wouldn't do you any good because you don't know how many lines of code there are in the program until it is finished, and then it's too late.

Still, there are ways to tell when a project warrants software engineering. Ask yourself, "Is this program likely to require long-term maintenance? Will there be people on salary who will be responsible for improving this program and correcting bugs? Is this a program that will take several man-years to develop?" If the answer to these questions is yes, then you should use software-engineering principles. If not, applying rigorous software-engineering discipline will simply make a small project cost as much as a large one, with little or no benefit.

4.1 The Show Tool

I was doing a job for a client, using his Alsys Ada compiler on his IBM PC AT. It was my first experience with Alsys Ada and my first experience with PC DOS. I discovered that when you compile SOMEFILE.ADA, and SOMEFILE.ADA has errors in it, Alsys Ada will write the errors to SOMEFILE.LST without displaying them on the screen. Alsys gives such complete error messages that every error gives you at least five lines of text describing the error and suggestions of how to correct it, so SOMEFILE.LST can easily contain two or three screens full of error messages.

Since I was inexperienced with PC-DOS, I tried to display SOMEFILE.LST the same way I would on VMS. I used the command TYPE SOMEFILE.LST. The DOS TYPE command does not pause when the screen fills, and text written to the screen isn't limited by a 1200 baud modem, so the whole error file flashed across the screen in a blur. I had to use CONTROL-S and CONTROL-Q to interrupt the transmission of text to the screen, and I had to have very fast reflexes.

I've had some limited experience with UNIX, so I tried MORE SOMEFILE.LST. That appeared to crash PC DOS, so I had to reboot. (It didn't really crash, but the symptoms were the same as a crash. The explanation of what really happened is best delayed for a little while.)

In desperation I read the PC-DOS documentation. The section on the TYPE command told me there were no switches I could set to display one screen at a time, and there was no cross reference to the MORE command.

I knew it was easy to write a program to display one screen of data at a time, because I had done it years ago in 8080 assembly language. I decided to rewrite it in Ada. I called the first version of that program Show, and you can find the listing for it in Figure 34. It prompts the user for the file name, opens the file, then does a loop 22 times that gets one line from the file and writes that one line to the screen. It prompts the user to "Press RETURN for the next screen," and jumps back to the loop that copies 22 lines to the screen. It does this until it hits an end of file mark.

4.1.1 Named Loops

Let's digress for a moment, and talk about named loops. I will sometimes use a comment to describe what a loop is doing, but I don't name a loop unless I'm going to use an unusual exit from the loop. The label MAIN: in Figure 34 should draw attention to that loop.

The problem is that I have nested loops. The inner loop executes 22 times, and the outer loop executes as often as necessary, until there is no more data to display. I could begin the outer loop with while not End_Of_File(FILE) loop if I were certain every file would have an exact multiple of 22 lines in it. In general, that won't be true. The end of the file will almost always be reached after a partial screen has been displayed. If I just write exit when End_Of_File(FILE);, that will get me out of the inner loop, but not out of the outer loop. The program would prompt the user to press RETURN, then go to the top of the outer loop again, where I would have to check for the end of file again.

The chicken way out is to never check end of file, let the program run until it raises an exception, and quit in the exception handler. That works, but it requires some intuition on the part of a maintenance programmer to figure out how the program ends (unless you reveal the trick in a comment). I find that solution artistically offensive because it looks sloppy and careless. Besides, exception handlers should be used for unusual error conditions. A finite length file is not unusual or erroneous.

The assembly language version handled the problem by checking for end of file at the point corresponding to the exit statement. If it found it was at the end of the file, it jumped to a statement corresponding to Close(FILE);. (JUMP is the assembly language equivalent of GOTO.) I certainly didn't want to endure the shame of using a GOTO in an Ada program, so I didn't use that solution, either.

Naming the outer loop is the clean solution to the problem. The exit MAIN when End_Of_File(FILE); statement takes us down to end loop MAIN; as soon as we run out of data. It clearly shows the maintenance programmer the condition required to leave the loop, and is considered a normal exit.
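The named-loop structure described above can be sketched roughly as follows. This is not the Figure 34 listing, only a minimal illustration of the technique. The procedure name Show_Sketch, the buffer sizes, and the prompt wording are assumptions made for the example.

```ada
with Text_IO;

procedure Show_Sketch is
   -- Lines copied to the screen before pausing, as described in the text.
   LINES_PER_SCREEN : constant := 22;

   FILE   : Text_IO.File_Type;
   NAME   : String (1 .. 80);    -- assumed maximum file name length
   TEXT   : String (1 .. 132);   -- assumed maximum line length
   LENGTH : Natural;
begin
   Text_IO.Put ("File to show? ");
   Text_IO.Get_Line (NAME, LENGTH);
   Text_IO.Open (FILE, Text_IO.In_File, NAME (1 .. LENGTH));

   MAIN:
   loop
      for I in 1 .. LINES_PER_SCREEN loop
         -- Naming the outer loop lets this exit leave both loops at once,
         -- even when the file ends in the middle of a screen.
         exit MAIN when Text_IO.End_Of_File (FILE);
         Text_IO.Get_Line (FILE, TEXT, LENGTH);
         Text_IO.Put_Line (TEXT (1 .. LENGTH));
      end loop;
      Text_IO.Put ("Press RETURN for the next screen");
      Text_IO.Skip_Line;   -- wait for the user to press RETURN
   end loop MAIN;

   Text_IO.Close (FILE);
end Show_Sketch;
```

Without the loop name, the exit statement would leave only the inner for loop, and the outer loop would need a second end-of-file test after the prompt.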

4.1.2 Command Tail

Now let's return to the story of the development of the Show tool. I wasn't happy with the program because it still wasn't as good as the assembly language version. When I used the Ada version I had to wait for the program to prompt me for the file name. The assembly language version let me type a single command, SHOW SOMEFILE.LST, and automatically extracted the file name SOMEFILE.LST from the command line without me having to enter it in response to the prompt.

The Ada LRM doesn't specify a standard way to get the rest of the command line. In all fairness to Ada I should point out that most other languages don't do that either, because it is beyond the normal scope of the language. It is really an operating system function.

I looked through the Alsys documentation and found a package called DOS that includes a function Get_Parms that fetches the command tail for you. It wasn't exactly what I wanted, because I wanted something that was an exact replacement for the get_line procedure already used in the Show program. I knew I would have portability problems if I used Get_Parms in Show and tried to move Show to another system, because Get_Parms is a special function Alsys was thoughtful enough to provide with their compiler. It almost certainly wouldn't exist in any other Ada implementation.

4.1.3 Compiling Library Procedures

I decided the best approach would be to write the procedure Get_Command_Line, shown in Figure 35. It produces a string and a length, just like get_line. All the implementation specific code is confined to one place, and does not infect all the application programs that need to read the command tail. Once this procedure is compiled and stored in the Ada library, every application program that needs to read the command line can use it. Novice Ada programmers think that only packages can be compiled and reused as library components. That's not true. This is an example of a procedure that can be compiled once and reused often.

4.1.4 Unconstrained Strings

Implementation specific routines are usually tricky, and Figure 35 is no exception. The Get_Parms function returns an unconstrained string L characters long, where L depends on the number of characters typed after the command name. I want to write LENGTH := L; TAIL(1..L) := DOS.Get_Parms;. The problem is, I don't know what L is. The incredibly clever solution is to make the function call an input parameter to the Extract function. Extract(DOS.Get_Parms, TAIL, LENGTH); associates the string returned by DOS.Get_Parms with the formal parameter S_IN. The LENGTH attribute yields the value of L. Notice that I couldn't write S_OUT(1..L) := S_IN; because L is an out parameter and can't be read. That's not a problem because I can use the LENGTH attribute as often as I want.
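A minimal sketch of the parameter trick, under some assumptions: Get_Parms here is a hypothetical local stand-in that returns a fixed string (the real DOS.Get_Parms exists only in the Alsys environment), and the 80-character buffer is an arbitrary choice. The formal parameter names S_IN, S_OUT, and L follow the description above.

```ada
with Text_IO;

procedure Extract_Sketch is
   TAIL   : String (1 .. 80);   -- assumed buffer size
   LENGTH : Natural;

   -- Hypothetical stand-in for the Alsys DOS.Get_Parms function.
   -- Like the real one, it returns a string whose length the caller
   -- does not know in advance.
   function Get_Parms return String is
   begin
      return "SOMEFILE.LST";
   end Get_Parms;

   procedure Extract (S_IN  : in  String;
                      S_OUT : out String;
                      L     : out Natural) is
   begin
      -- S_IN'LENGTH supplies the length the caller could not name.
      -- L is never read here; only the attribute is.
      -- Raises Constraint_Error if S_IN is longer than S_OUT.
      S_OUT (S_OUT'FIRST .. S_OUT'FIRST + S_IN'LENGTH - 1) := S_IN;
      L := S_IN'LENGTH;
   end Extract;

begin
   -- The function result becomes the actual parameter for S_IN.
   Extract (Get_Parms, TAIL, LENGTH);
   Text_IO.Put_Line (TAIL (1 .. LENGTH));
end Extract_Sketch;
```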

I thought that trick was pretty clever, until I read a column by Ben Brosgol. He knew an even easier way to use the Alsys DOS.Get_Parms function. (He has an unfair advantage over me. He works for Alsys!) When you declare a constant string, you don't need to declare the bounds because the bounds can be determined from the value you assign to the constant, even if that value is an unconstrained string returned by a function. I incorporated his idea into my procedure and came up with Listing 58.
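The constant-declaration idea might look something like the sketch below. It is not Listing 58 itself. Get_Parms is again a hypothetical stand-in for the Alsys function, and the buffer size is an assumption.

```ada
with Text_IO;

procedure Constant_Sketch is
   TEXT   : String (1 .. 80);   -- assumed buffer size
   LENGTH : Natural;

   -- Hypothetical stand-in for the Alsys DOS.Get_Parms function.
   function Get_Parms return String is
   begin
      return "SOMEFILE.LST";
   end Get_Parms;

   procedure Get_Command_Line (TAIL   : out String;
                               LENGTH : out Natural) is
      -- No bounds are written here; S takes them from the value
      -- returned by the function call.
      S : constant String := Get_Parms;
   begin
      -- Raises Constraint_Error if S is longer than TAIL.
      TAIL (TAIL'FIRST .. TAIL'FIRST + S'LENGTH - 1) := S;
      LENGTH := S'LENGTH;
   end Get_Command_Line;

begin
   Get_Command_Line (TEXT, LENGTH);
   Text_IO.Put_Line (TEXT (1 .. LENGTH));
end Constant_Sketch;
```

Compared with the Extract approach, the helper procedure disappears, because the constant's attributes carry the same information the formal parameter did.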

4.1.5 Using Library Procedures

The Show program was compiled in the context of the Get_Command_Line procedure, as shown in Figure 36. TEXT and LENGTH come from Get_Command_Line if the user enters a file on the command line, or from get_line if he doesn't. (I like the way VMS prompts for missing parameters, so I usually include that feature in my routines.)
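The prompt-for-missing-parameter logic can be sketched as below. This is an illustration, not Figure 36. The nested Get_Command_Line is a hypothetical stub that always reports an empty command tail so that the prompt path runs; in the real program it would be the library procedure named in a with clause.

```ada
with Text_IO;

procedure Show_Name_Sketch is
   TEXT   : String (1 .. 80);   -- assumed buffer size
   LENGTH : Natural;

   -- Hypothetical stub standing in for the library procedure.
   -- It behaves as if nothing was typed after the command name.
   procedure Get_Command_Line (TAIL   : out String;
                               LENGTH : out Natural) is
   begin
      TAIL   := (TAIL'RANGE => ' ');
      LENGTH := 0;
   end Get_Command_Line;

begin
   Get_Command_Line (TEXT, LENGTH);

   -- Nothing on the command line, so fall back to prompting.
   if LENGTH = 0 then
      Text_IO.Put ("File to show? ");
      Text_IO.Get_Line (TEXT, LENGTH);
   end if;

   Text_IO.Put_Line ("Showing " & TEXT (1 .. LENGTH));
end Show_Name_Sketch;
```

Because Get_Command_Line has the same string-and-length profile as get_line, the rest of the program does not care which of the two supplied the file name.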

4.1.6 Porting Show to Other Systems

It wasn't long until I bought an AT clone of my own, and a Meridian Adavantage Version 1.5 compiler to go with it. I decided to port the Show program to my computer. As I expected, Meridian had a utility that could fetch command line arguments, but it didn't look anything like the Alsys Get_Parms function. "No problem," I thought, "I'll just enclose it in a different version of Get_Command_Line."

My first attempt at doing this must have looked a lot like Figure 37. I compiled Get_Command_Line and then compiled Show, linked them, and tried the resulting executable code. There was something wrong with it. The main program Show was fully debugged, so the error, of course, was in Get_Command_Line. I thought I knew what was wrong with Get_Command_Line, changed it, and recompiled it. I tried to relink it with Show, but the compiler (correctly) told me that Show was obsolete. Show was compiled in the context of Get_Command_Line, and since I had changed Get_Command_Line, Ada couldn't be sure Show was still valid. I recompiled Show, linked, ran the program, and it still didn't work. I modified Get_Command_Line again, recompiled it, then had to recompile Show again, and so on. Eventually I came up with the form you see in Figure 37, but it was frustrating having to recompile Show every time.

It happened that I was evaluating the Gould APLEX Ada compiler running under the MPX-32 operating system at the time. I decided to try to transport the Show program to it. I was not very well acquainted with MPX-32 or APLEX Ada, and I knew I was going to have to recompile Get_Command_Line a million times before I got it working. That didn't bother me. I expected that. What bothered me was that I was going to have to recompile Show every time, even though I knew it was correct and never changed it.

I wished I had put Get_Command_Line inside a package. Then I could have compiled the package specification, compiled Show, and compiled the package body containing Get_Command_Line last. Then I could recompile Get_Command_Line as often as it took to get it working, and Show wouldn't need to be recompiled because it wouldn't be obsolete. (Show would depend upon the package specification, not the body.) Then I realized the value of a widely ignored Ada feature. You can separately compile a procedure specification.

4.1.6.1 Compiling Procedure Specifications.

Ada programmers tend to forget that procedures and functions have specifications and bodies just like packages and tasks do. Ada requires you to compile the specification of a package or a task before you compile its body. She lets you omit that step, however, when compiling non-generic procedures and functions. We almost always take a short cut when compiling subprograms by simply compiling the subprogram body without compiling the specification first.

Sometimes taking a short cut turns out to be longer, and porting Get_Command_Line to the Meridian environment is an example of such a situation. But I learned my lesson before porting Show to MPX-32. I compiled the procedure specification shown in Listing 59. There's not much to it, but it saved a mountain of work. After compiling Listing 59, I compiled the Show procedure and my first attempt at the Get_Command_Line body. It didn't work, of course, but when I changed and recompiled it I was delighted that I didn't have to recompile Show. It took me several iterations before I got Get_Command_Line right, but I only had to compile Show once. The correct Get_Command_Line body for APLEX Ada running on MPX-32 is shown in Figure 38.
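The compilation order can be sketched with three separate units, shown here together. None of this is Listing 59 or Figure 38. The file names in the comments and the client Tail_Demo are hypothetical. The body rests on a loud assumption: it uses the standard Ada.Command_Line package, which arrived with Ada 95 and so postdates the compilers discussed here, and it returns only the first argument rather than the whole command tail.

```ada
-- Unit 1 (for example get_command_line.ads): the specification.
-- Compile this first; it freezes the interface.
procedure Get_Command_Line (TAIL : out String; LENGTH : out Natural);

-- Unit 2 (for example get_command_line.adb): the body.
-- Recompile as often as needed; clients are not made obsolete.
-- Assumption: Ada.Command_Line stands in for the vendor-specific
-- services, and only the first argument is returned.
with Ada.Command_Line;
procedure Get_Command_Line (TAIL : out String; LENGTH : out Natural) is
begin
   TAIL := (TAIL'RANGE => ' ');
   if Ada.Command_Line.Argument_Count = 0 then
      LENGTH := 0;
   else
      declare
         S : constant String := Ada.Command_Line.Argument (1);
      begin
         -- Raises Constraint_Error if S is longer than TAIL.
         TAIL (TAIL'FIRST .. TAIL'FIRST + S'LENGTH - 1) := S;
         LENGTH := S'LENGTH;
      end;
   end if;
end Get_Command_Line;

-- Unit 3 (for example tail_demo.adb): a client.
-- It depends only on the specification in Unit 1.
with Text_IO;
with Get_Command_Line;
procedure Tail_Demo is
   TEXT   : String (1 .. 80);   -- assumed buffer size
   LENGTH : Natural;
begin
   Get_Command_Line (TEXT, LENGTH);
   Text_IO.Put_Line ("Command tail: " & TEXT (1 .. LENGTH));
end Tail_Demo;
```

The point of the arrangement is the dependency: Tail_Demo names Get_Command_Line in its with clause, but what it depends on is Unit 1, so changes confined to Unit 2 leave it valid.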

4.1.6.2 Porting Show to VAX/VMS.

Porting Get_Command_Line to the VAX/VMS environment was the most difficult. First there was the problem of finding a system service that would get the command line. That wasn't a trivial task. DOS is described in three paperback books with a total thickness of about four inches, but VMS is described in a series of big, orange, three-ring binders that take 5 or 6 feet of shelf space. That means there's a lot more haystack to find the needle in. To make matters worse, the needle was cleverly disguised. It was called Get_Foreign, which doesn't immediately suggest suitability for fetching the command line.

Once you find this service, you have to figure out how to interface with it. DEC has some special pragmas, Interface and Import, that allowed me to associate the LIB$GET_FOREIGN service in the system library with the Get_Foreign procedure specification. Listing 60 shows how this was done.

The final problem is using it. If you compile and link it with the Show procedure, it produces an executable module. The normal way to run an executable module is to type RUN SHOW. In this case, however, we want to type RUN SHOW SOMEFILE.EXT. If you do that, it complains about TOO MANY PARAMETERS. I suppose RUN calls LIB$GET_FOREIGN to find out what to run, and is expecting only one parameter. When it finds two, it generates an error message.

The magic VMS trick is to use an alias. If you type $SHOW :== $MY_DISK:[MY_DIRECTORY]SHOW.EXE (where MY_DISK and MY_DIRECTORY represent the actual path to the executable file), then you can type SHOW SOMEFILE.EXT and it will work (because you don't have to use RUN to run the program). It is convenient to put this command in your login file, so you don't have to remember to type it before you try to SHOW something.

This is an awfully brief explanation, but remember it doesn't have anything to do with Ada. These are features of VAX/VMS that are mentioned here just because they were necessary to port the Get_Command_Line procedure to VMS. If you want to know why these things work, take a course on the VMS operating system, or talk to your local VMS wizard. (I'm lucky to have Dave Dent around to find these VMS features for me.)

4.1.7 Library Procedure Summary

You can separately compile a single library procedure or function without having to put it in a package. (Most people must not know this because several times I've seen package specifications with nothing but a single subprogram specification in them.) When you do this, you freeze the interface. Then you can recompile the body over and over again, and Ada will check to make sure you have used exactly the same formal parameter list. As long as you don't change anything in the parameter list, you can make as many changes in the body as you like without making units that depend on the specification obsolete.

4.1.8 Common Command Names

Remember how I thought typing MORE SOMEFILE.LST caused DOS to crash? We are about to run into that same problem again, and this time we will see that command names can sometimes get you into trouble.

I was particularly frustrated because I was using so many different operating systems. They all used different names for deleting files. I could never remember if I should ERA, DEL, DELETE, rm, KILL, or VOLMGR. On one system LIST would type a file, on another it would display the directory. It was driving me nuts! I decided I wanted to try to standardize utility names. Since I was using UNIX then, I decided to rename the Show program to More to match the UNIX name. Since it was going to have a UNIX name, I also wanted it to work like the UNIX version.

