Using powershell to compare excel file headers

MrFlinstone 476 Reputation points
2021-03-10T23:56:23.807+00:00

I have an XLSM file dynamically created in PowerShell using the code below.

dd-Type -AssemblyName Microsoft.Office.Interop.Excel  
$xl  = [Microsoft.Office.interop.Excel.XLFileFormat]::xlOpenXMLWorkbookMacroEnabled  
$xlsm = New-Object -ComObject Excel.Application  
$workbook = $xlsm.workbooks.add()  
$worksheetNew = $workbook.Worksheets.add()  
$sheet1 = $workbook.Worksheets.item(1)  
$sheet1.name = "Source"  
$filepath = 'C:\output\file.xlsm'  
  
  
$sheet1.cells.item(1,1) = "ID"  
$sheet1.cells.item(1,2) = "Firstname"  
$sheet1.cells.item(1,3) = "Surname"  
$sheet1.cells.item(1,4) = "Grade"  
$xlsm.ActiveWorkbook.SaveAs("$filepath",$xl)  
$xlsm.quit()  
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($xlsm) | Out-Null  

The code above creates a file with the following column headers.

ID  
Firstname  
Surname  
Grade  

What I am trying to achieve is to compare the headers generated in the code above with the headers in the template file, I want to compare column for column (including the column header name) and highlight the differences between what was created in the code and the template file. A screenshot of the template file can be seen below. The column names should be exactly the same and any difference in the column names of missing column names should be highlighted.

76454-image.png

The differences in this case are as follows

Grade Score (spelt differently compared to what is in the code)  
City (extra column header in the template file)  
School Name (extra column header in the template file)  

I am also looking to have a count of template headers and generated files to see if they match.

What I have so far can be seen below.

$ExcelFile = "C:\output\file.xlsm"  
$excel = New-Object -ComObject Excel.Application  
$excel.visible = $true  
$Workbook = $excel.workbooks.open($ExcelFile)  
$Worksheets = $Workbook.worksheets  
$Worksheet = $Workbook.worksheets.Item(1)  
  
Function Read-Headers {  
    # Parameters: Excel worksheet  
    # Returns: Hash table of the contents of each cell in the first row and the  
    #    corresponding column number  
    # Note: Processes until first blank cell  
        Param ($Worksheet)      
        $Headers =@{}  
        $column = 1  
        Do {  
            $Header = $Worksheet.cells.item(1,$column).text  
            If ($Header) {  
                $Headers.add($Header, $column)  
                $column++  
            }  
        } until (!$Header)  
        $Headers  
    }  
      
    $Headers = Read-Headers $Worksheet  
  
$Headers  

This shows me the headers and their positions.

76430-image.png

The challenge now is how do I compare it programmatically ?

Windows Server PowerShell
Windows Server PowerShell
Windows Server: A family of Microsoft server operating systems that support enterprise-level management, data storage, applications, and communications.PowerShell: A family of Microsoft task automation and configuration management frameworks consisting of a command-line shell and associated scripting language.
5,354 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Ian Xue (Shanghai Wicresoft Co., Ltd.) 29,486 Reputation points Microsoft Vendor
    2021-03-12T09:40:13.987+00:00

    Hi @MrFlinstone ,

    Does this work for you?

    Add-Type -AssemblyName Microsoft.Office.Interop.Excel  
    $Excel = New-Object -ComObject Excel.Application  
    $FilePath = 'C:\output\file.xlsm'  
    $TemplatePath = 'C:\output\template.xlsm'  
    $FileBook = $Excel.workbooks.Open($filepath)  
    $FileSheet = $FileBook.Worksheets.item(1)  
    $FileColumns = $filesheet.UsedRange.Columns.Count  
    $FileHeaders = @()  
    for($i=1;$i -le $FileColumns;$i++){  
        $FileHeaders += $FileSheet.UsedRange.Cells.Item(1,$i).Text  
    }  
    $FileBook.Close()  
    $Excel.Visible = $true  
    $TemplateBook = $Excel.workbooks.Open($TemplatePath)  
    $TemplateSheet = $TemplateBook.Worksheets.Item(1)  
    $TemplateColumns = $TemplateSheet.UsedRange.Columns.Count  
    for($j=1;$j -le $TemplateColumns;$j++){  
        if($TemplateSheet.Cells.Item(1,$j).text -notin $FileHeaders){  
         $TemplateSheet.Cells.Item(1,$j).interior.colorindex = 3    
        }  
    }  
    

    Best Regards,
    Ian Xue

    ============================================

    If the Answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    0 comments No comments