NSWI170 Computer Systems

Guidelines to write a better C/C++ code

Unforgivable curse #1: Decomposition

The term decomposition refers to the code design strategy where the programmer divides the code into appropriate logical units (blocks, functions, classes, modules, ...). Narrowing the scope for the intents of this course, we will mainly refer to functional decomposition (splitting the code into multiple functions). Later, we will start using structures (classes) so the decomposition will extend in an object-oriented way. On the other hand, we do not expect you to divide the code into modules or multiple files since the assignments are rather small.

Achieving a good code decomposition is difficult since there are no exact rules and it requires experience. That is why this rule was introduced as the first one, so you will get as much experience as possible. Furthermore, in many situations, even experienced programmers may take significantly different approaches to decomposition, so there is no one correct way. What counts is your effort to do it.

Motivation

One logical unit at any given level (block, function, class, module, ...) should do one thing (have one purpose). It makes the code much more readable and maintainable. It also helps significantly when the coding work needs to be assigned to multiple programmers -- each programmer has clear jurisdiction and responsibility.

When the code is well divided, it also promotes re-usability as it is more likely that a function or a class with one clear purpose can be used to solve the same thing in another module or application.

Explanation

As mentioned before, we will focus mainly on dividing the code into functions. There are no exact rules, the following list constitutes hints that could help you decide whether to place a block of code into a separate function.

A few hints about when it might be a really good idea to take a piece of code and wrap it into a function:

The same non-trivial functionality is used multiple times in your code. This goes hand in hand with the DRY concept we will cover later.
You have a long piece of code doing multiple things so you want to divide it into smaller pieces.
You want to put a label on a block of code, so it is more readable.
You have a trivial piece of code (e.g., a condition in if-statement) used multiple times, but you expect it might change over time (placing it in a function will make it less error-prone as you would change it only in one place).
You wish to encapsulate local variables used in a block to prevent accidental naming collisions or even variable misuse. This concept can be logically extended to recursion.

There are also situations, where separating a piece of code into a function could do more harm than good, for instance:

When the code is too small and used only in a few places (i.e., the function would make the code longer and less readable).
When an extensive state (many variables) needs to be passed down through arguments and/or return value.
When the actual purpose of the intended function is not clear so you have trouble making up a name for it.

Another thing that might help you make the decision is the scope. Typical functions have at least a few lines of code and up to 50 lines at most (in C-like languages). However, this suggestion is slightly more ambiguous as there are times when a function with a one-line body or with a hundred-line body makes sense. On the other hand, having many 1-line functions or a thousand-line function rarely stands to reason.

When applying the decomposition, you need to find the right balance. In the context of the Arduino assignments, you will typically need to create a few functions. Placing the whole code in loop() is rarely correct, creating 20+ functions for an assignment where you need to make a LED blink is overkill (which is as bad as making no decomposition).

Example

The following example presents a scenario where data are collected every second from a thermometer sensor. The sensor outputs values in Fahrenheit, but we need them in Celsius scale. Furthermore, we need to do ongoing value smoothing (each value is averaged with the last WINDOW_SIZE values) and compute a total average at the end. The sanitized values are stored into an array (for possible subsequent processing) and printed out.

💩 Bad code 👍 Good code

constexpr int N = 1024;
constexpr int WINDOW_SIZE = 5;
float data[N];

int main() {
    float avg = 0.0f;
    for (int i = 0; i < N; ++i) {
        data[i] = (read_sensor(therm_pin) - 32.0f) * 5.0f / 9.0f;
        float sum = 0.0f;
        for (int j = std::max(i-WINDOW_SIZE+1, 0); j < i; ++j) {
            sum += data[j];
        }
        data[i] = sum / (float)std::min(i, WINDOW_SIZE);
        avg += data[i];

        printf("Time %i: ", i);
        printf("%.1f °C", data[i]);
        printf("\n");
        sleep(1);
    }

    printf("Average of all %i values: ", N);
    printf("%.1f °C", avg / (float)N);
    printf("\n");
    return 0;
}

float fahrenheit_to_celsius(float f)
{
    return (f - 32.0f) * 5.0f / 9.0f;
}

void print_temp(float temp) {
    printf("%.1f °C", temp);
}

void print_info_line(const char *pref, int pref_val, float temp) {
    printf(pref, pref_val);
    print_temp(temp);
    printf("\n");
}

float array_avg(float arr[], int n) {
    float sum = 0.0f;
    for (int i = 0; i < n; ++i) {
        sum += arr[i];
    }
    return n > 0 ? sum / (float)n : 0.0f;
}

constexpr int N = 1024;
constexpr int WINDOW_SIZE = 5;
float data[N];

int main() {
    for (int i = 0; i < N; ++i) {
        float temp_f = read_sensor(therm_pin);
        data[i] = fahrenheit_to_celsius(temp_f);
        int window_offset = std::max(i - WINDOW_SIZE + 1, 0);
        data[i] = array_avg(&data[window_offset], WINDOW_SIZE);
        print_info_line("Time %i: ", i, data[i]);
        sleep(1);
    }

    float avg = array_avg(data, N);
    print_info_line("Average of all %i values: ", N, avg);
    return 0;
}

The print_info_line() is perhaps the most obvious case. It performs a task, which is twice in the bad code. The choice of moving the output of a temperature into print_temp() was made so that temperature formatting will be in one location only (and print_temp may become useful as a standalone function should our code needs extensions).

The array_avg() function takes care of computing an average (obviously from its name). It is used both for smoothing and for final average. In the bad code, this functionality is duplicated, although it may not be apparent since the total average computation is merged into the loading loop (which makes it even more difficult to spot).

The fahrenheit_to_celsius() is there merely to label an expression, which may be difficult to understand without a comment. Furthermore, such conversion is quite common in an application that handles temperatures, so it might be easily re-used in other situations.

Please note that some parts of this example are not ideal. We were aiming at providing reasonably short code; thus, some things were simplified. Also, many things could have been done better by C++ experts (output via stream, formatting, using a vector instead of an array...), but it would complicate code readability for C++ first-timers.