Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formula Corruption During Row/Column Insertion in Server-Generated Excel Workbook #1748

Open
chris-newsome opened this issue Dec 9, 2024 · 9 comments
Assignees
Labels
bug Something isn't working

Comments

@chris-newsome
Copy link

EPPlus usage

Commercial use (I have a commercial license)

Environment

Windows

Epplus version

7.5.2

Spreadsheet application

Excel

Description

The workbook is generated server-side using a solution built in .NET8, and the issue occurs when rows or columns are inserted into the worksheet. Specifically, this affects cells containing formulas that use dynamic range references, such as:

=IF(
    COLUMN() - COLUMN($J23) > (COUNTA('#CompaniesAndConsolidations'!$A:$A) + 1), 
    0,
    IF(
        K$5 = "TopConsolidation",
        SUMIFS(
            INDEX(23:23, 1, COLUMN() + 1) : INDEX(23:23, 1, COLUMN($L23)),
            INDEX($7:$7, 1, COLUMN() + 1) : INDEX($7:$7, 1, COLUMN($L23)),
            K$2,
            INDEX($6:$6, 1, COLUMN() + 1) : INDEX($6:$6, 1, COLUMN($L23)), 
            FALSE
        ),
        IF(
            K$5 = "SubConsolidation",
            SUMIFS(
                INDEX(23:23, 1, COLUMN() + 1) : INDEX(23:23, 1, COLUMN($L23)),
                INDEX($8:$8, 1, COLUMN() + 1) : INDEX($8:$8, 1, COLUMN($L23)),
                K$2,
                INDEX($6:$6, 1, COLUMN() + 1) : INDEX($6:$6, 1, COLUMN($L23)), 
                FALSE
            ),
            IF(
                K$5 = "DivisionalConsolidation",
                SUMIFS(
                    INDEX(23:23, 1, COLUMN() + 1) : INDEX(23:23, 1, COLUMN($L23)),
                    INDEX($9:$9, 1, COLUMN() + 1) : INDEX($9:$9, 1, COLUMN($L23)),
                    K$2,
                    INDEX($6:$6, 1, COLUMN() + 1) : INDEX($6:$6, 1, COLUMN($L23)), 
                    FALSE
                ),
                -SUMIFS(
                    '#TrialBalance_CY'!$E:$E,
                    '#TrialBalance_CY'!$A:$A,
                    K$2,
                    '#TrialBalance_CY'!$G:$G,
                    "IncomeStatement"
                )
            )
        )
    )
)

The affected formulas seem to lose critical syntax elements (e.g., the ":" in range references like +1):INDEX( during processing, resulting in broken formulas with #REF! errors in the generated file.

Observations:

  1. Cause of Corruption: The issue appears related to tokenization during row/column insertion. This likely happens when the server-side logic attempts to copy and adjust formulas dynamically.
  2. Error Details: The corrupted formulas, as seen in the XML, have malformed range references:
<c r="K23" s="80">
  <f>
    IF(
      COLUMN() - COLUMN($J23) > (COUNTA('#CompaniesAndConsolidations'!$A:$A) + 1), 
      0, 
      IF(
        K$5 = "TopConsolidation",
        SUMIFS(
          INDEX(23:23, 1, COLUMN() + 1) #REF! INDEX(23:23, 1, COLUMN($S23)),
          INDEX($7:$7, 1, COLUMN() + 1) #REF! INDEX($7:$7, 1, COLUMN($S23)),
          K$2,
          INDEX($6:$6, 1, COLUMN() + 1) #REF! INDEX($6:$6, 1, COLUMN($S23)),
          FALSE
        ),
        IF(
          K$5 = "SubConsolidation",
          SUMIFS(
            INDEX(23:23, 1, COLUMN() + 1) #REF! INDEX(23:23, 1, COLUMN($S23)),
            INDEX($8:$8, 1, COLUMN() + 1) #REF! INDEX($8:$8, 1, COLUMN($S23)),
            K$2,
            INDEX($6:$6, 1, COLUMN() + 1) #REF! INDEX($6:$6, 1, COLUMN($S23)),
            FALSE
          ),
          IF(
            K$5 = "DivisionalConsolidation",
            SUMIFS(
              INDEX(23:23, 1, COLUMN() + 1) #REF! INDEX(23:23, 1, COLUMN($S23)),
              INDEX($9:$9, 1, COLUMN() + 1) #REF! INDEX($9:$9, 1, COLUMN($S23)),
              K$2,
              INDEX($6:$6, 1, COLUMN() + 1) #REF! INDEX($6:$6, 1, COLUMN($S23)),
              FALSE
            ),
            -SUMIFS(
              '#TrialBalance_CY'!$E:$E,
              '#TrialBalance_CY'!$A:$A,
              K$2,
              '#TrialBalance_CY'!$G:$G,
              "IncomeStatement"
            )
          )
        )
      )
    )
  </f>
  <v>0</v>
</c>

Relevant Code:

This issue likely stems from the interaction of the following code sections:

  • Row Generation and Copying:
    wks.InsertRow(templateRow.Start.Row, insertCount, templateRow.Start.Row);
    This insertion logic is followed by a copy-paste of formulas, which seems to corrupt the tokenized formula.

  • Formula Replacement:
    string newFormula = oldFormula.Replace(CFWConstants.AdjustmentReferenceSheetName, replace);
    cell.Value = newFormula;

Expected Behavior:

  • Formulas should remain intact and dynamically adjust their references when rows or columns are inserted. The same implementation works correctly in .NET6 with EPPLUS 7.4.1.

Actual Behavior:

  • Formulas are corrupted during processing, leading to malformed syntax and #REF! errors

Steps to Reproduce:

  1. Generate the workbook server-side with a worksheet containing formulas similar to the one above.
  2. Insert rows or columns into the worksheet programmatically.
  3. Open the workbook in Excel and inspect the formulas in the affected cells.

Full code snippet:

private void ProcessAdjustmentsByType(IEnumerable<WorksheetAdjustment> allAdjustments, PermOrTemp adjustmentType, string rowIdentifier)
{
    // Filter adjustments for the current type
    var filteredAdjustments = allAdjustments.Where(a => a.Adjustment.PermOrTemp == adjustmentType).ToList();
    if (!filteredAdjustments.Any()) return;

    // Find the template row for the adjustment type
    var helperColumnRange = Worksheet.GetHelperColumnRange();
    var templateRow = ExcelUtils.FindCell(Worksheet, rowIdentifier, helperColumnRange.Address)?.EntireRow.Range;

    if (templateRow == null)
        throw new Exception($"Could not find template row for '{rowIdentifier}' on the Book-to-Tax Summary tab.");

    // Generate rows for the adjustments
    GenerateRowsFromTemplate(Worksheet, templateRow, filteredAdjustments.Count() - 1);

    // Process each adjustment
    int currentRow = templateRow.Start.Row;
    foreach (var adjustment in filteredAdjustments)
    {
        if (adjustment.ParentWorksheet == null) continue;

        ReplaceTemplateFormula(currentRow, adjustment);
        AddHyperlinksToAdjustment(currentRow, adjustment);

        currentRow++;
    }
}

internal ExcelRangeBase GenerateRowsFromTemplate(ExcelWorksheet wks, ExcelRangeBase templateRow, int insertCount)
{
    if (insertCount <= 0)
    {
        return templateRow;
    }

    int originalTemplateRow = templateRow.Start.Row;

    wks.InsertRow(templateRow.Start.Row, insertCount, templateRow.Start.Row);

    // Grab templateRow again after inserting columns
    templateRow = templateRow.Offset(insertCount, 0);

    // Copy data from row template to newly inserted blank columns
    for (int j = 1; j <= insertCount; j++)
    {
        var sourceRange = wks.Cells[$"{templateRow.Start.Row}:{templateRow.End.Row}"];
        var destRange = wks.Cells[$"{templateRow.Start.Row - j}:{templateRow.End.Row - j}"];

        sourceRange.Copy(destRange);

    }

    return wks.Cells[$"{originalTemplateRow}:{originalTemplateRow + insertCount}"];
}

 private void ReplaceTemplateFormula(int row, WorksheetAdjustment? adjustment)
 {
     string replace = ExcelUtils.FormatSheetNameForFormula(adjustment.ParentWorksheet.Name);
     var query = Worksheet.Cells[$"{row}:{row}"]
                 .Where(cell => cell.Text.Contains(CFWConstants.AdjustmentReferenceSheetName));

     foreach (var cell in query)
     {
         if (string.IsNullOrWhiteSpace(cell.Text))
             continue;

         try
         {
             string oldFormula = cell.Text;

             // Replace placeholder sheet name with actual sheet name
             string newFormula = oldFormula.Replace(CFWConstants.AdjustmentReferenceSheetName, replace);

             cell.Value = newFormula;
         }
         catch (Exception ex)
         {
             throw new Exception($"Error replacing formula in cell {cell.Address}: {ex.Message}");
         }
     }
 }
``
@chris-newsome chris-newsome added the bug Something isn't working label Dec 9, 2024
@chris-newsome
Copy link
Author

chris-newsome commented Dec 9, 2024

It's also occurring for cells that contain formulas such as:

=SUMIF($I$3:$L$3,1,INDEX($I:$I,ROW()):INDEX($L:$L,ROW()))

Here's the XML:

<c r="H17" s="90">
<f>SUMIF($I$3:$P$3,1,INDEX($I:$I,ROW())#REF!INDEX($P:$P,ROW()))</f>
<v>0</v>
</c>

@JanKallman
Copy link
Contributor

Thanks, we will look at it and get back to you.

@JanKallman
Copy link
Contributor

Yes, this is a bug. I will provide a fix shortly.
Here is a test to reproduce the issue:

[TestMethod]
public void i1748()
{
    using var package = new ExcelPackage();
    var formula = "SUMIF($I$3:$L$3,1,INDEX($I:$I,ROW()):INDEX($L:$L,ROW()))";
    var formulaLong = "IF(COLUMN()-COLUMN($J23)>(COUNTA('#CompaniesAndConsolidations'!$A:$A)+1),0,IF(K$5=\"TopConsolidation\",SUMIFS(INDEX(23:23,1,COLUMN()+1):INDEX(23:23,1,COLUMN($L23)),INDEX($7:$7,1,COLUMN()+1):INDEX($7:$7,1,COLUMN($L23)),K$2,INDEX($6:$6,1,COLUMN()+1):INDEX($6:$6,1,COLUMN($L23)),FALSE),IF(K$5=\"SubConsolidation\",SUMIFS(INDEX(23:23,1,COLUMN()+1):INDEX(23:23,1,COLUMN($L23)),INDEX($8:$8,1,COLUMN()+1):INDEX($8:$8,1,COLUMN($L23)),K$2,INDEX($6:$6,1,COLUMN()+1):INDEX($6:$6,1,COLUMN($L23)),FALSE),IF(K$5=\"DivisionalConsolidation\",SUMIFS(INDEX(23:23,1,COLUMN()+1):INDEX(23:23,1,COLUMN($L23)),INDEX($9:$9,1,COLUMN()+1):INDEX($9:$9,1,COLUMN($L23)),K$2,INDEX($6:$6,1,COLUMN()+1):INDEX($6:$6,1,COLUMN($L23)),FALSE),-SUMIFS('#TrialBalance_CY'!$E:$E,'#TrialBalance_CY'!$A:$A,K$2,'#TrialBalance_CY'!$G:$G,\"IncomeStatement\")))))";
    var ws = package.Workbook.Worksheets.Add("Sheet1");
    ws.Cells["A1"].Formula = formula;
    ws.Cells["A2"].Formula = formulaLong;
    ws.InsertRow(1,1);

    var formulaInserted = "SUMIF($I$4:$L$4,1,INDEX($I:$I,ROW()):INDEX($L:$L,ROW()))";
    Assert.AreEqual(formulaInserted, ws.Cells["A2"].Formula);

    var formulaLongInserted = "IF(COLUMN()-COLUMN($J24)>(COUNTA('#CompaniesAndConsolidations'!$A:$A)+1),0,IF(K$6=\"TopConsolidation\",SUMIFS(INDEX(24:24,1,COLUMN()+1):INDEX(24:24,1,COLUMN($L24)),INDEX($8:$8,1,COLUMN()+1):INDEX($8:$8,1,COLUMN($L24)),K$3,INDEX($7:$7,1,COLUMN()+1):INDEX($7:$7,1,COLUMN($L24)),FALSE),IF(K$6=\"SubConsolidation\",SUMIFS(INDEX(24:24,1,COLUMN()+1):INDEX(24:24,1,COLUMN($L24)),INDEX($9:$9,1,COLUMN()+1):INDEX($9:$9,1,COLUMN($L24)),K$3,INDEX($7:$7,1,COLUMN()+1):INDEX($7:$7,1,COLUMN($L24)),FALSE),IF(K$6=\"DivisionalConsolidation\",SUMIFS(INDEX(24:24,1,COLUMN()+1):INDEX(24:24,1,COLUMN($L24)),INDEX($10:$10,1,COLUMN()+1):INDEX($10:$10,1,COLUMN($L24)),K$3,INDEX($7:$7,1,COLUMN()+1):INDEX($7:$7,1,COLUMN($L24)),FALSE),-SUMIFS('#TrialBalance_CY'!$E:$E,'#TrialBalance_CY'!$A:$A,K$3,'#TrialBalance_CY'!$G:$G,\"IncomeStatement\")))))";
    Assert.AreEqual(formulaLongInserted, ws.Cells["A3"].Formula);
}

@JanKallman JanKallman self-assigned this Dec 10, 2024
JanKallman added a commit that referenced this issue Dec 10, 2024
JanKallman added a commit that referenced this issue Dec 10, 2024
swmal pushed a commit that referenced this issue Dec 10, 2024
* Fixes issue #1748

* Some more tweeks for #1748
@JanKallman
Copy link
Contributor

Please try the fix using our develop7 Nuget feed as described here... https://github.com/EPPlusSoftware/EPPlus/wiki/Using-our-develop-Nuget-feed

@chris-newsome
Copy link
Author

chris-newsome commented Dec 10, 2024

@JanKallman After using the NuGet feed as requested, my issue appeared resolved. However, I encountered an "Unterminated string" exception. This was addressed by a fix in #1689, implemented in version 7.5.1.

@chris-newsome
Copy link
Author

Hi, I just wanted to follow up on this. I'm still getting the unterminated string exception using that feed.

@JanKallman
Copy link
Contributor

Sorry for the delayed answer.
#1689 should be resolved in the latest develop7 branch.
The problem was the # in the worksheet name in a formula.
See test:

public void i1689()

Is this the same formula you are getting your exception on?

@chris-newsome
Copy link
Author

@JanKallman Happy new year! Thanks for the update. Has this been updated in the newest EPPLUS version? My team is planning on release to Production for our app, but this issue is the only thing holding it back.

@chris-newsome
Copy link
Author

@JanKallman Hi there! Any update?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants