C++ Programming: Building an Excel-Style Cell Reference System from Scratch

C++ Programming: Building an Excel-Style Cell Reference System from Scratch

Hello everyone! Today, I will guide you through implementing one of the most important features in Excel using C++—the cell reference system.

1. Basic Characteristics of Excel Cell References

Excel cell references mainly have three characteristics:

  1. Coordinate Representation: Row and column identifiers like A1, B2
  2. Relative/Absolute References: Mixed references like 1 or A$1
  3. Range References: Area references like A1:B5

To implement this functionality in C++, we need:

  • Coordinate Conversion: Converting letters and numbers into row and column indices
  • Reference Types: Handling relative and absolute references
  • Dependency Management: Tracking the reference relationships between cells

2. Basic Data Structure Design

1. Cell Coordinate Class

#include <string>
#include <utility> // for std::pair

class CellReference {
private:
    int row;        // Row number (0-based)
    int col;        // Column number (0-based)
    bool isAbsRow;  // Is it an absolute row reference
    bool isAbsCol;  // Is it an absolute column reference
    
public:
    // Parse references like "A1", "$B$2"
    CellReference(const std::string& ref) {
        size_t colEnd = ref.find_first_of("0123456789");
        std::string colPart = ref.substr(0, colEnd);
        
        // Determine if the column is an absolute reference
        if(!colPart.empty() && colPart[0] == '$') {
            isAbsCol = true;
            colPart = colPart.substr(1);
        }
        
        col = 0;
        for(char c : colPart) {
            col = col * 26 + (toupper(c) - 'A' + 1);
        }
        col--; // Convert to 0-based
        
        // Handle the row part
        std::string rowPart = ref.substr(colEnd);
        if(!rowPart.empty() && rowPart[0] == '$') {
            isAbsRow = true;
            rowPart = rowPart.substr(1);
        }
        
        row = std::stoi(rowPart) - 1; // Convert to 0-based
    }
    
    std::pair<int, int> toAbsolute(int currentRow, int currentCol) const {
        return {
            isAbsRow ? row : currentRow + row,
            isAbsCol ? col : currentCol + col
        };
    }
    
    // Convert to string representation
    std::string toString() const {
        std::string result;
        if(isAbsCol) result += "$";
        result += colToLetters(col);
        if(isAbsRow) result += "$";
        result += std::to_string(row + 1);
        return result;
    }
    
private:
    static std::string colToLetters(int col) {
        std::string result;
        while(col >= 0) {
            result = char('A' + (col % 26)) + result;
            col = col / 26 - 1;
        }
        return result;
    }
};

Tips:

  • Excel column numbers are represented as A=1, Z=26, AA=27 in base 26
  • Row and column numbers are converted to 0-based for easier processing
  • Absolute references are indicated with<span>$</span>, such as<span>$A$1</span>

3. Cell Class Implementation

1. Basic Cell Class

#include <memory>
#include <set>

class Spreadsheet; // Forward declaration

class Cell {
private:
    std::string rawValue; // Raw value or formula
    double computedValue; // Computed value
    Spreadsheet* sheet;   // Associated spreadsheet
    
    std::set<CellReference> references; // Referenced other cells
    std::set<Cell*> dependents;        // Cells that depend on this cell
    
public:
    Cell(Spreadsheet* s) : sheet(s), computedValue(0.0) {}
    
    void setValue(const std::string& value);
    double getValue() const { return computedValue; }
    
    const auto& getReferences() const { return references; }
    void addDependent(Cell* cell) { dependents.insert(cell); }
    
    void updateReferences();
    void notifyDependents();
};

2. Formula Parsing and Reference Extraction

void Cell::setValue(const std::string& value) {
    rawValue = value;
    references.clear();
    
    if(value.empty()) {
        computedValue = 0.0;
        return;
    }
    
    if(value[0] == '=') { // Formula
        // Simple parsing for formulas like SUM(A1:B2)
        size_t start = 1;
        while(start < value.size()) {
            size_t refStart = value.find_first_of("A-Za-z$", start);
            if(refStart == std::string::npos) break;
            
            size_t refEnd = value.find_first_not_of("A-Za-z0-9$", refStart);
            if(refEnd == std::string::npos) refEnd = value.size();
            
            std::string refStr = value.substr(refStart, refEnd - refStart);
            references.insert(CellReference(refStr));
            
            start = refEnd;
        }
    } else { // Regular value
        try {
            computedValue = std::stod(value);
        } catch(...) {
            computedValue = 0.0;
        }
    }
    
    updateReferences();
    notifyDependents();
}

4. Spreadsheet Class Implementation

1. Basic Spreadsheet Class

#include <vector>
#include <unordered_map>

class Spreadsheet {
private:
    std::vector<std::vector<std::unique_ptr<Cell>>> cells;
    
    // Convert coordinate string like "A1" to row and column indices
    std::pair<int, int> parseCoord(const std::string& coord) const {
        CellReference ref(coord);
        return ref.toAbsolute(0, 0); // Here 0,0 does not affect absolute references
    }
    
public:
    Spreadsheet(int rows, int cols) {
        cells.resize(rows);
        for(auto& row : cells) {
            row.resize(cols);
            for(auto& cell : row) {
                cell = std::make_unique<Cell>(this);
            }
        }
    }
    
    Cell* getCell(int row, int col) {
        return cells[row][col].get();
    }
    
    Cell* getCell(const std::string& coord) {
        auto [row, col] = parseCoord(coord);
        return getCell(row, col);
    }
    
    // Update all cell dependencies
    void rebuildDependencies() {
        // First clear all dependencies
        for(auto& row : cells) {
            for(auto& cell : row) {
                cell->clearDependents();
            }
        }
        
        // Rebuild dependencies
        for(int r = 0; r < cells.size(); ++r) {
            for(int c = 0; c < cells[r].size(); ++c) {
                cells[r][c]->updateReferences();
            }
        }
    }
};

2. Dependency Management

void Cell::updateReferences() {
    if(!sheet) return;
    
    // Notify old dependencies that they are no longer needed
    for(auto& ref : references) {
        auto [refRow, refCol] = ref.toAbsolute(/*current row and column*/);
        Cell* refCell = sheet->getCell(refRow, refCol);
        if(refCell) {
            refCell->removeDependent(this);
        }
    }
    
    // Establish new dependencies
    for(auto& ref : references) {
        auto [refRow, refCol] = ref.toAbsolute(/*current row and column*/);
        Cell* refCell = sheet->getCell(refRow, refCol);
        if(refCell) {
            refCell->addDependent(this);
        }
    }
}

void Cell::notifyDependents() {
    for(Cell* dependent : dependents) {
        dependent->recalculate();
    }
}

void Cell::recalculate() {
    if(rawValue.empty() || rawValue[0] != '=') return;
    
    // Here should implement complete formula calculation
    // Simplified version only handles simple cell references
    double sum = 0.0;
    for(auto& ref : references) {
        auto [refRow, refCol] = ref.toAbsolute(/*current row and column*/);
        Cell* refCell = sheet->getCell(refRow, refCol);
        if(refCell) {
            sum += refCell->getValue();
        }
    }
    
    computedValue = sum;
    notifyDependents(); // Notify dependent cells after value change
}

5. Range Reference Implementation

1. Range Reference Class

class CellRange {
private:
    CellReference topLeft;
    CellReference bottomRight;
    
public:
    CellRange(const std::string& range) {
        size_t colon = range.find(':');
        if(colon == std::string::npos) {
            topLeft = CellReference(range);
            bottomRight = topLeft;
        } else {
            topLeft = CellReference(range.substr(0, colon));
            bottomRight = CellReference(range.substr(colon + 1));
        }
    }
    
    // Get all cell coordinates within the range
    std::vector<std::pair<int, int>> getCoordinates(int currentRow, int currentCol) const {
        auto [startRow, startCol] = topLeft.toAbsolute(currentRow, currentCol);
        auto [endRow, endCol] = bottomRight.toAbsolute(currentRow, currentCol);
        
        std::vector<std::pair<int, int>> coords;
        for(int r = startRow; r <= endRow; ++r) {
            for(int c = startCol; c <= endCol; ++c) {
                coords.emplace_back(r, c);
            }
        }
        return coords;
    }
};

2. Range Reference Functions (e.g., SUM)

double sumRange(Spreadsheet* sheet, const std::string& range, int currentRow, int currentCol) {
    CellRange cellRange(range);
    auto coords = cellRange.getCoordinates(currentRow, currentCol);
    
    double sum = 0.0;
    for(auto [r, c] : coords) {
        Cell* cell = sheet->getCell(r, c);
        if(cell) {
            sum += cell->getValue();
        }
    }
    return sum;
}

6. Considerations

  1. Cycle Reference Detection:

    class Cell {
        bool inCalculation = false;
        
        void recalculate() {
            if(inCalculation) {
                throw std::runtime_error("Cycle reference detected");
            }
            
            inCalculation = true;
            // Calculation logic...
            inCalculation = false;
        }
    };
    
  2. Performance Optimization:

  • Use smart pointers for memory management
  • Only recalculate affected cells
  • Consider using the observer pattern to optimize notification mechanisms
  • Extension Suggestions:

    • Add cross-sheet references
    • Support named ranges
    • Implement a more complete formula parser

    7. Hands-On Practice

    Try these interesting exercises:

    1. Implement a copy-paste feature that supports relative references
    2. Add support for R1C1 style references
    3. Create a reference tracking visualization tool
    4. Implement function calls in formulas (e.g., SUM, AVERAGE)

    Conclusion

    Today we delved into:

    • The implementation principles of Excel-style cell references
    • Methods for converting row and column coordinates
    • Management of dependencies between cells
    • Techniques for handling range references

    Cell references are a core feature of spreadsheets, mastering this technology can make your program more professional. It is recommended to start with simple absolute references and gradually implement relative and range references.

    Leave a Comment