I recently gave a presentation on what it is like to work as a software developer to first-year engineering students at KTH taking an introductory programming course. I wanted to give my view on the main differences between professional software development and programming for a university course.
First I talked about challenges with large-scale software development. Then I listed several development practices used to cope with these challenges. I went on to present ways to become a better programmer, and ended with some fun facts from work.
CHARACTERISTICS OF PRODUCTION SOFTWARE
Programs are BIG. The main characteristic of software used in live systems is the size. For example, our main repository at work contains 1.8 million lines of Java (I used the tool cloc to find the number of lines). The sheer size of the programs complicates software development, and a lot of practices have been developed to deal with it.
Software is never done. Software that is used keeps growing and evolving. The customers find more and more uses for it, and want more and more features. Software that is not used is discontinued, but successful software is developed continuously for many years. Several developers are almost always involved. For example, I checked the subversion log of one of the main classes in our SMS application. There were 150 check-ins over 7 years, by 8 different developers (2 of whom don’t work at Symsoft anymore).
Complexity from aggregation. Most features in production software are quite simple, but because there are so many of them, you get subtle (or not so subtle) interactions between them causing bugs. The complexity of the system comes from the aggregation of many simple parts, not from any complex parts.
Reading code. A consequence of the characteristics mentioned above is that reading code is a very important skill. Before a program can be modified, you need to understand what it does, and how it does it. Only then can new functionality be added so that it fits in with the existing structure and doesn’t break anything. Reading and understanding a program can be a major effort, and one sign of a well-designed program is that it is relatively straightforward to modify.
HOW TO MANAGE
Many techniques have been developed to make developing and maintaining large programs easier. Here are the most important ones.
Modularize. This is an obvious first step, and (I believe) universally used. The software is split into subsystems, layers or modules so that smaller chunks of functionality can be dealt with at a time.
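As a rough illustration of splitting software into modules, here is a minimal sketch in Java. All the names (MessageStore, InMemoryMessageStore, MessageService) are hypothetical and made up for this example. The point is only that the higher layer talks to an interface, so the storage details can be understood and changed separately.

```java
import java.util.ArrayList;
import java.util.List;

// The boundary between two modules: callers only see this interface.
interface MessageStore {
    void save(String message);

    List<String> findAll();
}

// One possible implementation, hidden behind the interface.
final class InMemoryMessageStore implements MessageStore {
    private final List<String> messages = new ArrayList<>();

    @Override
    public void save(String message) {
        messages.add(message);
    }

    @Override
    public List<String> findAll() {
        // Return a copy so callers cannot modify the internal list.
        return new ArrayList<>(messages);
    }
}

// A higher layer that depends on the interface, not on the storage details.
final class MessageService {
    private final MessageStore store;

    MessageService(MessageStore store) {
        this.store = store;
    }

    void receive(String message) {
        store.save(message.trim());
    }

    int receivedCount() {
        return store.findAll().size();
    }
}
```

Swapping the in-memory store for, say, a database-backed one would then not require touching MessageService at all.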
Iterate. Developing software bit by bit is as helpful for small 30-line scripts as it is for systems with millions of lines of code. I like the following quote:
“A complex system that works is invariably found to have evolved from a simple system that worked.” – John Gall
Self-documenting code. The naming of the classes, methods and variables is incredibly important. The names, when chosen well, let you understand what the program does just by reading them. They also greatly reduce the need for comments in the code.
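To make the point about naming concrete, here is a small made-up example (the class and method names are hypothetical, not taken from any real code base). Both methods do exactly the same thing, but only the second one tells you what it is for.

```java
final class NamingExample {
    // Hard to read: the names say nothing about the intent.
    static boolean chk(int a, int b) {
        return a > b;
    }

    // Same logic, but the names explain what is being checked,
    // so no comment is needed at the call site.
    static boolean isMessageTooLong(int messageLength, int maxLength) {
        return messageLength > maxLength;
    }
}
```

A call like isMessageTooLong(text.length(), maxLength) reads almost like a sentence, while chk(x, y) forces the reader to go and look up the method.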
No duplication. Code duplication causes a lot of problems when you come back to modify the program (which happens all the time) – you may forget to make the intended change in all the duplicates. So instead of copy-pasting, combine the logic into one method. It makes the code more compact, and easier to modify in the future. There is an excellent article by Martin Fowler, Avoiding Repetition, on this subject. Unfortunately, I have seen a lot of duplicated code in production software over the years.
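Here is a minimal sketch of what combining the logic into one method can look like. The example is invented for illustration: the class name, the method names and the 25 percent VAT rate are all assumptions. Instead of repeating the VAT calculation in every place that needs it, the calculation lives in one method that the others call.

```java
final class PriceCalculator {
    // Hypothetical VAT rate, in percent.
    private static final long VAT_PERCENT = 25;

    // The only place that knows how VAT is added.
    // Amounts are in cents to avoid floating-point rounding.
    static long withVat(long netCents) {
        return netCents * (100 + VAT_PERCENT) / 100;
    }

    static long invoiceTotal(long netCents) {
        return withVat(netCents);
    }

    static long quoteTotal(long netCents, long discountCents) {
        return withVat(netCents - discountCents);
    }
}
```

If the VAT rules change, there is exactly one method to update, and no copy that can be forgotten.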
Unit testing. Unit testing (JUnit etc.) is useful both in small-scale and large-scale development. It’s an easy way to ensure your smallest program parts work as expected, and you get repeatable tests that can be run again and again. Making sure your code can be unit-tested also automatically improves the structure of the code – it becomes less monolithic, more de-coupled.
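To show the idea of small, repeatable tests, here is a sketch in plain Java with no test framework, so it can be run as is. The WordCounter class and the expectEquals helper are hypothetical names made up for this example. With JUnit, each check would typically be its own method marked with @Test, and the framework would do the running and reporting.

```java
// The small program part we want to test.
final class WordCounter {
    static int countWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        // Split on runs of whitespace, so repeated spaces count once.
        return trimmed.split("\\s+").length;
    }
}

// Hand-rolled tests: each check names the case it covers.
final class WordCounterTest {
    static void expectEquals(int expected, int actual, String caseName) {
        if (expected != actual) {
            throw new AssertionError(
                    caseName + ": expected " + expected + " but got " + actual);
        }
    }

    // Can be run again and again after every change to WordCounter.
    static void runAll() {
        expectEquals(0, WordCounter.countWords(""), "empty string");
        expectEquals(0, WordCounter.countWords("   "), "only spaces");
        expectEquals(1, WordCounter.countWords("hello"), "single word");
        expectEquals(2, WordCounter.countWords("  hello   world "), "extra spaces");
    }
}
```

Note that countWords is easy to test because it takes its input as a parameter and returns a value, instead of reading from a file or printing to the screen.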
Version control. Using version control (like git or subversion) is a no-brainer, and as far as I can tell it is pretty much always used in professional software development. Version control systems are used both to keep track of different working versions of the software, and for knowing exactly what code is included in each release.
Version control is also useful on an individual level, for example for the programming assignments in a university course. Any program I spend more than 10 minutes writing, I stick in a local git repository. That way, I can always go back to previous (working) versions of my program.
Write for people first, computer second. The code you write will be read many times in the future (by you, or another developer). The computer doesn’t care how the code is written, so make it as easy as possible to understand for the next person that has to read it. A corollary to this is: don’t be too clever. It’s better to be clear than to be clever. The following quote puts it another way:
“It’s OK to figure out murder mysteries, but you shouldn’t need to figure out code. You should be able to read it.” – Steve McConnell
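A small, made-up example of clear versus clever (the class and method names are hypothetical). Both methods swap two elements of an array. The XOR version avoids a temporary variable, but the reader has to work out why it works, and it has a trap: if both indexes are the same, it sets the element to zero.

```java
final class SwapExample {
    // "Clever": swaps without a temporary variable using XOR.
    // Silently zeroes the element when i == j.
    static void cleverSwap(int[] values, int i, int j) {
        values[i] ^= values[j];
        values[j] ^= values[i];
        values[i] ^= values[j];
    }

    // Clear: obvious at a glance, and correct also when i == j.
    static void clearSwap(int[] values, int i, int j) {
        int temp = values[i];
        values[i] = values[j];
        values[j] = temp;
    }
}
```

The clear version costs one extra line and one extra variable, which is a small price for code that nobody has to figure out.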
Plan for failure – logging and error handling. What do you do when the program doesn’t work as expected? You need some way of seeing what is going on. Usually this is in the form of tracing or logging. Problems will inevitably occur, so you might as well build in support for tracing and logging from the start (and make it possible to activate and deactivate while the program is running). It’s a similar story with error handling; put in place a unified way of handling errors in the program, because errors will happen.
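As one possible sketch of these ideas, here is a small example using the standard java.util.logging package. The MessageProcessor class and its methods are hypothetical, invented for this illustration. Tracing can be switched on and off while the program runs, and all failures pass through one catch block so they are logged in the same way.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class MessageProcessor {
    private static final Logger LOG =
            Logger.getLogger(MessageProcessor.class.getName());

    // Tracing can be switched on or off while the program is running.
    // Handlers have their own levels too: the default console handler
    // only prints INFO and above, so a real setup would configure it as well.
    static void setTracing(boolean enabled) {
        LOG.setLevel(enabled ? Level.FINE : Level.INFO);
    }

    static boolean isTracing() {
        return LOG.isLoggable(Level.FINE);
    }

    // One place where failures are caught and logged in the same way.
    static boolean process(String message) {
        // The lambda is only evaluated if FINE logging is enabled.
        LOG.fine(() -> "Processing message: " + message);
        try {
            handle(message);
            return true;
        } catch (RuntimeException e) {
            LOG.log(Level.WARNING, "Failed to process message: " + message, e);
            return false;
        }
    }

    // Stand-in for the real work; rejects empty input.
    private static void handle(String message) {
        if (message == null || message.isEmpty()) {
            throw new IllegalArgumentException("message must not be empty");
        }
    }
}
```

In a real system the switch would typically be driven by a configuration file or an admin command rather than a direct method call, but the principle is the same.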
Issue tracking. For any real system, it is necessary to have a way to keep track of bug reports – e-mails don’t cut it. At work we use Jira, but there are many products with similar capabilities. Often, the same system is also used to keep track of new features to be implemented.