Anh-Duy Vu

In this blog post, I would like to share a compilation of insights and best practices that I have acquired through my experience with the C++ programming language.

RAII technique

The concept of RAII (Resource Acquisition Is Initialization) is a fundamental C++ programming technique. It entails binding the lifecycle of a resource to the lifetime of an object, ensuring that resources are acquired and released appropriately. To implement RAII:

Encapsulate each resource within a class (or struct): the constructor acquires the resource, and the destructor releases it.
Utilize the resource through a local instance of the class, avoiding the use of pointers or static variables.

Consider the following example illustrating the potential pitfalls of not employing RAII:

void allocateAndProcessMessage() {
    char* msg = new char[128];
    // Do something with 'msg' but not delete it.

    if (mystery_function() != 0) {
        // ERROR! 
        return; // Potential memory leak as 'msg' has not been deleted yet.
    }

    // No error and do something else ...

    delete []msg; // Another potential memory leak is that an exception occurs before this 'delete' statment is invoked.
}

Applying RAII by wrapping the resource using a struct (or class), we can eliminate the risk of memory leaks:

class RAIIManagedMessage {
public:
    RAIIManagedMessage() : msg(new char[128]) {}

    ~RAIIManagedMessage() {
        if (msg != nullptr)
            delete[] msg;
    }

    char *getMessage() const {
        return msg;
    }

private:
    char *msg = nullptr;
};

void raiiManagedMessageProcessing() {
    struct RAIIManagedMessage messageWrapper;

    // Do something with 'wrapper.getMessage()'.
    if (mystery_function() != 0) {
        // ERROR! 
        return;
    }

     // No potential memory leak as the destructor of 'wrapper' will be invoked when we leave this function for any reason.
}

While RAII is effective in preventing memory leaks, it introduces some overhead due to the need for creating new wrappers using class or struct. A workaround for this overhead is to use smart pointers with custom deleters. The following example demonstrates the same functionality using smart pointers:

void uniquePtrMessageProcessing() {
    std::unique_ptr<char, void (*)(char*)> uniquePtrMessage(
        new char[128],
        [](char* ptr) {
            if(ptr != nullptr)
                delete[] ptr;
        }
    );

    // Do something with 'uniquePtrMessage'.
    if (mystery_function() != 0) {
        // ERROR! 
        return;
    }

     // No potential memory leak as the deleter of 'uniquePtrMessage' will be invoked when we leave this function for any reason.
}

void sharedPtrMessageProcessing() {
    std::shared_ptr<char> sharedPtrMessage(
        new char[128],
        [](char* ptr) {
            if(ptr != nullptr)
                delete[] ptr;
        }
    );

    // Do something with 'uniquePtrMessage'.
    if (mystery_function() != 0) {
        // ERROR! 
        return;
    }

     // No potential memory leak as the deleter of 'uniquePtrMessage' will be invoked when we leave this function for any reason.
}

Working with containers

`auto` specifier in a range-based for loop

When employing a range-based for loop, use the auto specifier for automatic type deduction in the range_declaration. Remember that auto represents a *value type*, necessitating the use of &`` to denote a reference type when modifications to the original elements are desired. See following example:

void useValueRangeBasedForLoop() {
    std::vector<std::string> strings { "first", "second", "third" };
    for(auto s : strings) {
        s.append(" new string");
    }

    // This will print: "frist", "second" and "third"
    for(auto s : strings) {
        std::cout << s << std::endl;
    }
}

void useReferenceRangeBasedForLoop() {
    std::vector<std::string> strings { "first", "second", "third" };
    for(auto &s : strings) {
        s.append(" new string");
    }

    // This will print: "frist new string", "second new string" and "third new string"
    for(auto s : strings) {
        std::cout << s << std::endl;
    }
}

In the first function (useValueRangeBasedForLoop), the auto specifier represents a value type, so modifications to s within the loop do not affect the original elements in the container. However, we use auto & in the second example (useReferenceRangeBasedForLoop). This denotes a reference type, and modifications made to s directly impact the original elements in the container.

Finding an element with std::find_if

Utilize std::find_if to efficiently locate an element satisfying a specified criterion, avoiding the need for explicit loops. For example:

#include <algorithm>

// Finding the first occurrence of a string in a set of strings.
void findFirstOccurence(const std::vector<std::string> &haystack, const std::string &needle) {
    auto str = std::find_if(haystack.begin(), haystack.end(),
                            [&](const std::string &s) { return s == needle; });
    if(str != haystack.end()) {
        std::cout << "found " << *str << " at " << std::distance(haystack.begin(), str) << std::endl;
    }


// inding the nth occurrence of a string in a set of strings.
void findNthOccurence(const std::vector<std::string> &haystack, const std::string &needle, const int &n) {
	auto count = 0;
    auto str = std::find_if(haystack.begin(), haystack.end(),
                            [&](const std::string &s) {
                                if (s == needle)
                                    count += 1;
                                return s == needle && count == n; });
    if(str != haystack.end()) {
        std::cout << "found " << *str << " at " << std::distance(haystack.begin(), str) << std::endl;
    }
}}

`class` vs. `struct`

The decision between using class or struct`` in C++ revolves around conventions rather than strict rules. The primary distinction lies in the default access of members and inheritance. If access-specifier is omitted, struct defaults to public, while class defaults to private.

struct BaseStruct {
    int structMember;   // this is a public member and can be accessed directly via an instance of BaseStruct
};

struct DerivedStruct : BaseStruct { // Equivalent to: struct DerivedStruct : public BaseStruct
    ...
};

class BaseClass {
    int classMember;  // this is a private member and can NOT be accessed via an instance of BaseClass
};

struct DerivedClass : BaseClass { // Equivalent to: struct DerivedClass : private BaseClass
    ...
};

There is no concrete rule for making the choice of using class or struct, but only conventions. To me, here are the two significantly important ones:

Use struct to bundle related data for improved code abstraction; it may have methods and base struct but typically doesn’t.
Use class if there is an invariant, a relationship between the data members that must be upheld to avoid issues. For example, an implement of MySimpleString has buf and size members, and its invariant is that the number of characters allocated for buf must match the value in size.

Rule of five

The Rule of Five dictates that if you define any of destructor, copy constructor, copy assignment operator, move constructor, or move assignment operator, then you should probably define all five. These methods handle resource management, such as deleting a pointer or closing a file. Failing to define these methods prompts the compiler to generate default ones, potentially leading to insufficient behavior.