Excel to PDF comparison and marking the matching values in the PDF

Raki 486 Reputation points
2022-10-17T20:48:56.243+00:00

Hello,

I have a PDF and an Excel document that share some of the same values. I want to match the Excel data against the PDF and then mark the matching values in the PDF.

Thanks in advance!

Developer technologies | VB
Developer technologies | ASP.NET | Other
Developer technologies | C#

Accepted answer
  1. QiYou-MSFT 4,326 Reputation points Microsoft External Staff
    2022-10-18T08:11:39.44+00:00

    Hi @Raki ,
    PDFs are normally difficult to edit directly. Instead, you can use Spire.PDF for .NET to convert the PDF to Excel.
    Code:

    using Spire.Pdf;
    namespace PDFtoExcel
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Load the source PDF and save it as an Excel workbook.
                PdfDocument pdf = new PdfDocument();
                pdf.LoadFromFile("sample.pdf");
                pdf.SaveToFile("ToExcel.xlsx", FileFormat.XLSX);
            }
        }
    }
    

    Then use OLE DB to read the Excel files, treating each workbook as a data source.
    Code:

    using System.Data;
    using System.Data.OleDb;

    public DataSet ExcelToDS(string path)
    {
        // The ACE provider reads .xlsx files; the older Jet 4.0 provider only handles .xls.
        string strConn = "Provider=Microsoft.ACE.OLEDB.12.0;" +
                         "Data Source=" + path + ";" +
                         "Extended Properties=Excel 12.0 Xml;";
        string strExcel = "select * from [sheet1$]";

        using (OleDbConnection conn = new OleDbConnection(strConn))
        using (OleDbDataAdapter myCommand = new OleDbDataAdapter(strExcel, conn))
        {
            DataSet ds = new DataSet();
            // Fill opens the connection if needed and closes it again afterwards.
            myCommand.Fill(ds, "table1");
            return ds;
        }
    }
    

    Finally, compare the data from the two Excel files and mark the values that match.
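
The comparison step could look roughly like the sketch below. It is only an illustration, not part of Spire.PDF or any other library: the names SheetComparer, CellValues and CommonValues are made up here, and it assumes that "same value" means the same cell text after trimming, ignoring case and cell position.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

public static class SheetComparer
{
    // Collect every non-empty cell of a table as trimmed text.
    public static HashSet<string> CellValues(DataTable table)
    {
        var values = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (DataRow row in table.Rows)
        {
            foreach (object cell in row.ItemArray)
            {
                // DBNull converts to an empty string and is skipped below.
                string text = (Convert.ToString(cell, CultureInfo.InvariantCulture) ?? string.Empty).Trim();
                if (text.Length > 0)
                {
                    values.Add(text);
                }
            }
        }
        return values;
    }

    // Values that occur in both tables, regardless of where the cells are.
    public static HashSet<string> CommonValues(DataTable fromPdf, DataTable fromExcel)
    {
        HashSet<string> common = CellValues(fromPdf);
        common.IntersectWith(CellValues(fromExcel));
        return common;
    }
}
```

You would pass in the "table1" tables from the two data sets returned by ExcelToDS. The resulting set is the list of values to mark.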

    Best regards,
    Qi You


1 additional answer

  1. Michael Taylor 60,326 Reputation points
    2022-10-17T21:02:41.763+00:00

    This cannot be done directly. These are two entirely different formats, and PDFs are object-based files, so you'll need to convert one format to the other. Unfortunately, comparing PDFs isn't easy either: a PDF is composed of object blocks, so two identical-looking documents may be stored differently. It depends entirely on the renderer that generated the PDF.

    A better option is to go back to the source data if you have it. If it isn't available, you'll have to pick apart the PDF and find the data you care about (I assume it is a table or something similar), then extract it into a tabular form that you can compare against the Excel table. All of this depends entirely on how the PDF is structured, so you'll need a good understanding of the PDF format and, specifically, of how the PDF you are reading is laid out.
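
As a rough illustration of the "extract into a table format" step, here is a minimal sketch that turns already-extracted PDF text into rows of cells. It assumes you have obtained the page text with some PDF library, and that columns in that text are separated by a tab or by two or more whitespace characters. Both are assumptions that may not hold for your PDF, and PdfTextTable and ParseRows are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PdfTextTable
{
    // Assumption: a column gap is a run of two or more whitespace characters, or a single tab.
    private static readonly Regex ColumnGap = new Regex(@"\s{2,}|\t");

    // Split extracted text into rows, and each row into trimmed cells.
    public static List<string[]> ParseRows(string extractedText)
    {
        var rows = new List<string[]>();
        string[] lines = extractedText.Split(
            new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string line in lines)
        {
            string trimmed = line.Trim();
            if (trimmed.Length == 0)
            {
                continue; // skip lines that contain only whitespace
            }
            rows.Add(ColumnGap.Split(trimmed));
        }
        return rows;
    }
}
```

Single spaces inside a cell (for example "Widget A") are kept, so multi-word values survive. A real PDF may need positional information rather than whitespace to find the columns.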

