Understanding the Underlying Principles of Polymorphism in C++

Polymorphism is divided into compile-time polymorphism and runtime polymorphism. Can you explain this in detail?

Compile-time Polymorphism (Static Binding)

Compile-time polymorphism, also known as static binding or early binding, is when the function to be called is determined during the compilation phase.

Implementation Method: Mainly achieved through function overloading and templates.

Principle: The compiler will find and bind to the correct function version at compile time based on the function name, parameter types, and the number of parameters.

Characteristics: High efficiency, as there is no need to search at runtime.

Example: Function Overloading

#include <iostream>

void print(int i) {
    std::cout << "Printing integer: " << i << std::endl;
}

void print(double f) {
    std::cout << "Printing double: " << f << std::endl;
}

void print(const char* s) {
    std::cout << "Printing string: " << s << std::endl;
}

int main() {
    print(10);       // Resolved at compile time to print(int)
    print(3.14);     // Resolved at compile time to print(double)
    print("Hello");  // Resolved at compile time to print(const char*)
    return 0;
}
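
Templates are the other compile-time mechanism mentioned above. The following is a minimal sketch, assuming a hypothetical helper named larger that does not appear in the original example; the compiler generates a separate version of it for each argument type used.

```cpp
#include <string>

// Hypothetical helper for illustration. The compiler instantiates one
// concrete function per type T that larger() is called with, and the
// choice of instantiation is fixed at compile time.
template <typename T>
T larger(T a, T b) {
    return (a < b) ? b : a;
}
```

Calling larger(3, 7) instantiates larger<int>, larger(2.5, 1.5) instantiates larger<double>, and passing two std::string values instantiates larger<std::string>. No lookup happens at runtime in any of these cases.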

Runtime Polymorphism (Dynamic Binding)

Runtime polymorphism, also known as dynamic binding or late binding, is when the function to be called can only be determined during the program execution phase.

Implementation Method: Mainly achieved through virtual functions (virtual) and base class pointers/references.

Principle: The compiler creates a virtual function table (vtable) at compile time for each class containing virtual functions. Each object of such a class has a virtual function pointer (vptr) pointing to this table. At runtime, the program uses this vptr to find the correct function address and call it.

Characteristics: Offers flexibility and extensibility, but introduces some performance overhead (an indirect call through the vtable).

Example: Virtual Functions

#include <iostream>

class Animal {
public:
    virtual ~Animal() = default;  // Virtual destructor: needed to delete safely through Animal*
    virtual void speak() {        // The virtual keyword is crucial
        std::cout << "The animal is speaking..." << std::endl;
    }
};

class Dog : public Animal {
public:
    void speak() override {
        std::cout << "Woof Woof!" << std::endl;
    }
};

class Cat : public Animal {
public:
    void speak() override {
        std::cout << "Meow Meow!" << std::endl;
    }
};

int main() {
    Animal* my_animal;

    my_animal = new Dog();
    my_animal->speak();   // Calls Dog::speak() at runtime
    delete my_animal;     // Free the Dog before reusing the pointer

    my_animal = new Cat();
    my_animal->speak();   // Calls Cat::speak() at runtime
    delete my_animal;

    return 0;
}

In this example, the declared type of my_animal is always Animal*. However, at runtime, when it points to a Dog object, calling speak() executes Dog's version; when it points to a Cat object, it executes Cat's version. The compiler cannot, in general, determine at compile time what my_animal will ultimately point to, so this decision is left to runtime.

What are the underlying principles of polymorphism?

The underlying principles of polymorphism mainly rely on two key mechanisms: the virtual function table (vtable) and the virtual pointer (vptr).

1. Virtual Function Table (vtable)

What is it: vtable is a static, read-only array generated by the compiler at compile time for each class containing virtual functions. It is a table where each entry stores a function pointer.

Function: It stores the actual addresses of all virtual functions in the class. If a derived class overrides a virtual function, the corresponding entry in the derived class's vtable holds the address of the overriding function instead.

How is it generated:

Base class Animal's vtable: It contains the address of the Animal::speak() function.

Derived class Dog's vtable: Conceptually, the compiler first copies Animal's vtable, then replaces the entry for speak() with the address of Dog::speak().
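
The copy-then-replace idea can be observed from the outside. The sketch below assumes a variant of Animal with a second virtual function, move(), which is not part of the original example, and makes both functions return strings so the result is easy to inspect. Dog overrides only speak(), so through a base reference one call reaches Dog's code and the other still reaches Animal's.

```cpp
#include <string>

// Illustrative variant of the article's classes; move() is a hypothetical
// addition, and the functions return strings instead of printing.
class Animal {
public:
    virtual ~Animal() = default;
    virtual std::string speak() const { return "Animal::speak"; }
    virtual std::string move() const { return "Animal::move"; }
};

class Dog : public Animal {
public:
    // Overridden: Dog's table entry for speak() refers to this function.
    std::string speak() const override { return "Dog::speak"; }
    // move() is not overridden: Dog's table entry still refers to Animal::move.
};

// Calls both virtual functions through a base class reference.
std::string describe(const Animal& a) {
    return a.speak() + ", " + a.move();
}
```

Passing a Dog to describe() yields "Dog::speak, Animal::move", while passing a plain Animal yields "Animal::speak, Animal::move".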

2. Virtual Pointer (vptr)

What is it: vptr is a hidden member variable that the compiler automatically adds to every object of a class that declares or inherits virtual functions.

Function: vptr stores the address of the vtable of the class to which the object belongs. It is usually located at the front of the object's memory layout.

3. Addressing Process: How Polymorphism is Achieved

When calling a virtual function through a base class pointer or reference, the underlying execution process is as follows:

For example, with Animal* p = new Dog(); p->speak();:

  1. Static Type vs Dynamic Type:

    p's static type (declared type) is Animal*.

    p's dynamic type (actual pointed-to type) is Dog.

  2. Accessing vptr through Pointer: The compiler sees p->speak() and knows that speak() is a virtual function. It does not bind the call at compile time; instead it generates code that performs the following steps at runtime.

    From the starting memory location of the object pointed to by p, find the hidden vptr.

    Since p actually points to a Dog object, this vptr stores the address of the Dog class's vtable.

  3. Finding vtable through vptr: The program follows the address stored in vptr to reach the Dog class's vtable.

  4. Looking up Function Address in vtable: The compiler determines at compile time the index of speak() in the vtable (for example, index 0 if it is the first virtual function). The program retrieves the function pointer at this index from Dog's vtable, which points to Dog::speak().

  5. Calling the Correct Function: Finally, the program calls the Dog::speak() function found in the vtable.
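
The lookup steps above can be imitated by hand with plain structs and function pointers. This is only a rough model under stated assumptions: the names VTable, AnimalObj, call_speak and the make_ helpers are hypothetical, and real compilers lay out and name these structures in their own ABI-specific way.

```cpp
#include <string>

struct AnimalObj;  // forward declaration so the table can mention it

// Hand-written stand-in for a compiler-generated vtable with one slot.
struct VTable {
    std::string (*speak)(const AnimalObj*);  // slot 0
};

// Hand-written stand-in for an object: the vptr is spelled out explicitly.
struct AnimalObj {
    const VTable* vptr;
};

std::string animal_speak(const AnimalObj*) { return "The animal is speaking..."; }
std::string dog_speak(const AnimalObj*) { return "Woof Woof!"; }

// One static, read-only table per "class".
const VTable animal_vtable{&animal_speak};
const VTable dog_vtable{&dog_speak};

// "Constructors": each one stores the address of its class's table.
AnimalObj make_animal() { return AnimalObj{&animal_vtable}; }
AnimalObj make_dog() { return AnimalObj{&dog_vtable}; }

// Roughly what p->speak() turns into: read the vptr from the object,
// fetch the function pointer from the table slot, then call it.
std::string call_speak(const AnimalObj* p) {
    return p->vptr->speak(p);
}
```

call_speak() never checks what kind of object it was given. It only follows the pointer stored inside the object, which is why the same call site can reach different functions.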

Because of the vptr, an object of a class with virtual functions is larger than an otherwise identical object without them by the size of one pointer (plus any alignment padding). If a class has multiple virtual functions, the object still grows by only one pointer, because the vptr points to the whole virtual function table; there is not one pointer per virtual function.

For example, assume a 64-bit machine where the pointer size is 8 bytes:

  1. Normal Class Object (No Virtual Functions)

    #include <iostream>

    class NormalClass {
    public:
        int a; // 4 bytes
        int b; // 4 bytes
        // No virtual functions
    };

    int main() {
        // Theoretical size of the object: 4 + 4 = 8 bytes
        // Actual size may increase due to alignment, but we focus on the increment of vptr
        std::cout << "NormalClass size: " << sizeof(NormalClass) << " bytes" << std::endl;
        return 0;
    }

  2. Virtual Class Object (With Virtual Functions)

    #include <iostream>

    class VirtualClass {
    public:
        int a; // 4 bytes
        int b; // 4 bytes
        virtual void func() {} // Introduces a virtual function
    };

    int main() {
        // Theoretical size: 4 (a) + 4 (b) + 8 (vptr) = 16 bytes
        std::cout << "VirtualClass size: " << sizeof(VirtualClass) << " bytes" << std::endl;
        return 0;
    }
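
The claim that extra virtual functions do not enlarge the object further can be checked with a small comparison. This is a sketch with hypothetical type names (Plain, OneVirtual, ThreeVirtuals) and a hypothetical helper overhead(); the exact byte counts are implementation-defined, so the code compares sizes instead of hard-coding them.

```cpp
#include <cstddef>

// Hypothetical types for illustration only.
struct Plain {
    int a;
    int b;
};

struct OneVirtual {
    int a;
    int b;
    virtual void f() {}
};

struct ThreeVirtuals {
    int a;
    int b;
    virtual void f() {}
    virtual void g() {}
    virtual void h() {}
};

// Extra bytes a type occupies compared with Plain
// (the vptr plus any alignment padding).
template <typename T>
constexpr std::size_t overhead() {
    return sizeof(T) - sizeof(Plain);
}
```

On mainstream compilers, overhead<OneVirtual>() and overhead<ThreeVirtuals>() give the same value, typically 8 on a 64-bit target, because both types carry a single vptr.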

Why can dynamic polymorphism (i.e., runtime polymorphism) only be triggered through base class pointers or references?

The core reason lies in the memory structure of the object and in preventing object slicing.

1. Ensuring the Existence and Correctness of vptr

Polymorphism depends on vptr: As mentioned earlier, dynamic polymorphism works because the object's internal virtual function pointer (vptr) points to the correct virtual function table (vtable).

Accessing through Pointer/Reference: When you use Base* p or Base& r, you are operating on the address of the object itself. The compiler knows to look for the vptr at this address, and then follows the vptr to find the correct vtable.

Directly Calling through Object (Not Pointer/Reference): If you call a virtual function directly on an object, such as Dog myDog; myDog.speak();, the compiler can decide at compile time to call Dog::speak() (it knows the exact type of myDog, so this is static binding) and does not need the dynamic lookup mechanism of vptr and vtable.

2. Preventing Object Slicing

This is the most critical reason. If you try to implement polymorphism through value passing, object slicing will occur, causing the polymorphic mechanism to fail.

Why are pointers/references dynamically bound while objects are statically bound?

1. Local Objects (Static Binding)

When you directly create an object, such as Dog myDog;, the compiler knows all the information about this object at compile time:

Static Type: Dog

Dynamic Type: Dog

Memory Size: Dog's size

Function Address: myDog.speak() corresponds to a known function address (the entry at index N in Dog's vtable).

Since the type is fixed and cannot change, the compiler can bind myDog.speak() directly to the address of Dog::speak() at compile time for efficiency, skipping the runtime query of vptr/vtable.

2. Pointers/References (Dynamic Binding)

When you use a base class pointer or reference, such as Animal* p_animal = &myDog;, the compiler only knows partial information at compile time:

Static Type: Animal* (the compiler only knows this is a pointer to Animal).

Dynamic Type: Unknown (only known at runtime whether it is Dog or Cat).

Function Address: p_animal->speak() corresponds to a function address that is uncertain.

Since the compiler does not know at compile time what type of object p_animal will ultimately point to, it cannot determine in advance whether to call Animal::speak() or Dog::speak(). Therefore, the compiler must generate code that queries vptr and vtable at runtime to achieve dynamic binding.

Summary:

Local Objects: The type is determined at compile time, so static binding is used (highest efficiency).

Pointers/References: The type is uncertain at compile time (only the base class can be determined), so dynamic binding must be used (more flexible).
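
A case where the dynamic type genuinely cannot be known at compile time is when it depends on runtime data. The sketch below assumes a hypothetical factory make_animal() and helper speak_through_reference(), neither of which appears in the original example, and uses string-returning variants of the Animal classes so the result is easy to inspect.

```cpp
#include <memory>
#include <string>

class Animal {
public:
    virtual ~Animal() = default;
    virtual std::string speak() const { return "The animal is speaking..."; }
};

class Dog : public Animal {
public:
    std::string speak() const override { return "Woof Woof!"; }
};

class Cat : public Animal {
public:
    std::string speak() const override { return "Meow Meow!"; }
};

// Hypothetical factory: the concrete type depends on a value that is
// only known while the program is running (for example, user input).
std::unique_ptr<Animal> make_animal(const std::string& kind) {
    if (kind == "dog") return std::unique_ptr<Animal>(new Dog());
    if (kind == "cat") return std::unique_ptr<Animal>(new Cat());
    return std::unique_ptr<Animal>(new Animal());
}

// The static type here is const Animal&, so the compiler cannot pick
// a single speak() in advance; the call is dispatched at runtime.
std::string speak_through_reference(const Animal& a) {
    return a.speak();
}
```

The same speak_through_reference() call produces "Woof Woof!" or "Meow Meow!" depending on the string given to make_animal(), even though nothing in its source code mentions Dog or Cat.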

What is slicing in polymorphism?

Object slicing occurs when a derived class object is assigned to a base class object or when a derived class object is passed by value to a function that accepts a base class object.

In such operations, only the base class part is copied. The data unique to the derived class is "sliced off", and the new base class object gets a vptr that points to the base class's vtable, not the derived class's.

See the following example:

#include <iostream>
#include <string>

// Base class
class Person {
protected:
    std::string name;
public:
    Person(std::string n) : name(n) {}
    virtual void introduce() const {
        std::cout << "I am " << name << std::endl;
    }
};

// Derived class
class Student : public Person {
private:
    std::string studentID;
public:
    Student(std::string n, std::string id)
        : Person(n), studentID(id) {}

    void introduce() const override {
        std::cout << "I am a student " << name << ", student ID is " << studentID << std::endl;
    }

    void study() const {
        std::cout << name << " is studying" << std::endl;
    }
};

int main() {
    Student student("Zhang San", "2023001");

    // Case A: Value assignment (slicing occurs)
    Person person = student;
    person.introduce();  // Outputs "I am Zhang San" (the derived class's studentID is truncated, calling the base class version)
    // person.study();   // Compilation error: Person has no study() method

    // Case B: Pointer/Reference (no slicing)
    Person* p = &student;
    p->introduce();  // Outputs "I am a student Zhang San, student ID is 2023001" (polymorphism works, the object is still Student)

    return 0;
}

In this example, there are two classes: Person (base class) and Student (derived class).

Class | Unique/Inherited Content | Additional Content in Derived Class (student specific)
Person (Base Class) | name (string), introduce() (virtual function) | None
Student (Derived Class) | Inherits all content from Person | studentID (string), study() (regular function)

The Student class has an additional member variable studentID and a member function study().

When a Student object is created, its memory structure contains a complete Person sub-object (including name and vptr), plus its own unique studentID member.

Why is this the output?

Case A: Object Slicing (Copying by Value)

Person person = student; // Slicing occurs
person.introduce();      // Calls Person::introduce()
// Output: I am Zhang San

Underlying Principle and Output Explanation:

  1. Slicing Occurs: The statement Person person = student; uses the student object to initialize a new Person object, person.

    The compiler copies only the part of student that belongs to Person.

    The part unique to student (studentID: 2023001) is sliced off and not copied to person. The vptr pointing to Student's vtable is not carried over either.

  2. Polymorphism Fails: The newly created person is a separate, complete Person object. Its vptr points to Person's vtable.

  3. Result: When calling person.introduce(), because person is a Person object, it calls Person::introduce(), so the output is "I am Zhang San", without showing the student ID.

Case B: Correct Polymorphism (Pointer/Reference)

Person* p = &student; // Using pointer to point to student
p->introduce();       // Calls Student::introduce() through the vtable
// Output: I am a student Zhang San, student ID is 2023001

Underlying Principle and Output Explanation:

  1. No Slicing: The statement Person* p = &student; simply stores the memory address of the student object in the pointer p. No copying or slicing occurs. The student object remains intact.

  2. Dynamic Binding:

    p's static type is Person*.

    p's dynamic type is Student.

    The compiler sees p->introduce() as a virtual function call, so the generated code checks the vptr in the student object.

    vptr points to Student's vtable, which records the address of Student::introduce().

  3. Result: The program calls the overridden introduce() function in the derived class at runtime, which outputs the complete student information and student ID.
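
The definition of slicing given earlier also covers passing a derived object by value to a function, which the example does not show. The sketch below is one way to see it, assuming two hypothetical functions, by_value() and by_reference(), and string-returning variants of Person and Student so the result can be compared directly.

```cpp
#include <string>
#include <utility>

class Person {
protected:
    std::string name;
public:
    explicit Person(std::string n) : name(std::move(n)) {}
    virtual ~Person() = default;
    virtual std::string introduce() const { return "I am " + name; }
};

class Student : public Person {
private:
    std::string studentID;
public:
    Student(std::string n, std::string id)
        : Person(std::move(n)), studentID(std::move(id)) {}
    std::string introduce() const override {
        return "I am a student " + name + ", student ID is " + studentID;
    }
};

// Parameter is a Person object: the argument is copied into a new Person,
// so a Student passed here is sliced and Person::introduce() runs.
std::string by_value(Person p) {
    return p.introduce();
}

// Parameter is a reference: no copy is made, the original object is used,
// and the call is dispatched to the dynamic type's introduce().
std::string by_reference(const Person& p) {
    return p.introduce();
}
```

Given Student student("Zhang San", "2023001"), by_value(student) returns "I am Zhang San", while by_reference(student) returns "I am a student Zhang San, student ID is 2023001".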

Summary:

  1. Polymorphism is divided into static and dynamic binding:

    • Compile-time Polymorphism (Static Binding): Determined at compile time through function overloading or templates, with the highest efficiency.

    • Runtime Polymorphism (Dynamic Binding): Determined at runtime through the virtual function (virtual) mechanism, providing flexibility and extensibility in code.

  2. The underlying principle of dynamic polymorphism is vptr and vtable:

    • Each object of a class containing virtual functions has a hidden virtual function pointer (vptr) that points to the virtual function table (vtable).

    • vtable is a static array that stores the actual addresses of all virtual functions. At runtime, the program follows vptr to the vtable and looks up the function address there.

  3. To implement dynamic polymorphism, base class pointers or references must be used:

    • Reason: Pointers or references give indirect access to the original object, preserving its complete dynamic type (i.e., the correct vptr).

    • Direct Object Calls: Result in static binding, as the compiler can determine the function address at compile time for efficiency, skipping the dynamic lookup mechanism.

  4. Object slicing is the main cause of polymorphism failure:

    • Slicing occurs when a derived class object is assigned by value or passed by value to a base class object.

    • Only the base class part is copied; the derived class's unique data is "sliced off" and its dynamic type information (the vptr to the derived vtable) is not carried over. The new object (a base class copy) therefore cannot call the derived class's overrides, and runtime polymorphism fails.
