GZip detection

Peter Volz 1,295 Reputation points
2023-06-06T14:39:50.9666667+00:00

Hello all

I need to detect the GZip file format from a file's content, not just its extension. I was recently given this code, which does the same for CFBF:

Dim file_is_CFBF As Boolean

Using fs = File.OpenRead("path to my file...")
    Dim bytes(8 - 1) As Byte
    Dim expected_bytes As Byte() = {&HD0, &HCF, &H11, &HE0, &HA1, &HB1, &H1A, &HE1}
    fs.Read(bytes, 0, 8)
    file_is_CFBF = bytes.SequenceEqual(expected_bytes)
End Using

MsgBox(file_is_CFBF)

I have no idea how to change the code to cover GZip. I don't have its expected bytes; does anyone have them? :)


Accepted answer
  1. Bukke SanthiSwaroop Naik 390 Reputation points
    2023-06-06T14:47:48.02+00:00

    Hello Peter

    The code you provided checks for the CFBF (Compound File Binary Format) file format by comparing the first 8 bytes of a file with an expected byte sequence. To detect the GZip file format, you can modify the code as follows:

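    ' Requires: Imports System.IO and Imports System.Linq
    Dim file_is_GZip As Boolean

    Using fs = File.OpenRead("path to your file...")
        Dim bytes(2 - 1) As Byte
        ' GZip magic number: ID1 = &H1F, ID2 = &H8B
        Dim expected_bytes As Byte() = {&H1F, &H8B}
        ' Read can return fewer than 2 bytes (for example, an empty file), so check the count
        Dim bytes_read = fs.Read(bytes, 0, 2)
        file_is_GZip = (bytes_read = 2) AndAlso bytes.SequenceEqual(expected_bytes)
    End Using

    MsgBox(file_is_GZip)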

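    In the modified code, we check for the GZip file format by comparing the first 2 bytes of the file with the expected byte sequence ({&H1F, &H8B}). Every GZip file starts with these two bytes: they are the ID1 and ID2 magic bytes defined in RFC 1952. The third byte is the compression method, which is 8 (deflate) in practice.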

    Make sure to replace "path to your file..." with the actual path to the file you want to detect. After running this code, the file_is_GZip variable will indicate whether the file is in the GZip format (True if it is, False otherwise). The MsgBox will display the result.

    Please note that this method only checks the first few bytes and does not validate the rest of the file. A corrupted or truncated GZip file, or an unrelated file that happens to start with the same two bytes, would still be reported as GZip.
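
    If a stricter check is needed, one possible approach is to also test the compression-method byte and then try to decompress the start of the file. The following is only a sketch: the module name GZipCheck, the function name LooksLikeGZip, and the verifyStream parameter are made up for illustration and are not part of any library. It assumes GZipStream from System.IO.Compression raises InvalidDataException on malformed data; it reads only the first few kilobytes, so damage later in the file would go unnoticed.

    ```vb
    Imports System.IO
    Imports System.IO.Compression

    Module GZipCheck
        ' Hypothetical helper: returns True when the file starts with the GZip
        ' header bytes and, if verifyStream is True, when the beginning of the
        ' file can also be decompressed without an InvalidDataException.
        Function LooksLikeGZip(filePath As String, Optional verifyStream As Boolean = False) As Boolean
            Using fs As FileStream = File.OpenRead(filePath)
                ' 3 header bytes: ID1, ID2 and CM (compression method)
                Dim header(2) As Byte
                Dim total As Integer = 0

                ' Stream.Read may return fewer bytes than requested, so loop
                ' until the header is full or the end of the file is reached.
                While total < header.Length
                    Dim n As Integer = fs.Read(header, total, header.Length - total)
                    If n = 0 Then Exit While
                    total += n
                End While

                If total < header.Length Then Return False
                If header(0) <> &H1F OrElse header(1) <> &H8B OrElse header(2) <> 8 Then Return False
                If Not verifyStream Then Return True

                ' Optional deeper check: rewind and decompress the first chunk.
                fs.Position = 0
                Try
                    Using gz As New GZipStream(fs, CompressionMode.Decompress)
                        Dim buffer(4095) As Byte
                        gz.Read(buffer, 0, buffer.Length)
                    End Using
                    Return True
                Catch ex As InvalidDataException
                    ' The header matched, but the data is not a valid GZip stream.
                    Return False
                End Try
            End Using
        End Function
    End Module
    ```

    For example, MsgBox(LooksLikeGZip("path to your file...", True)) would show False for a file that merely begins with &H1F, &H8B, &H08 but contains no valid deflate data, assuming the decompressor rejects it as described above.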

    thanks

    santhiswaroop

