Parsing the message download history for a POP3 account
This topic describes the structure of the POP3 BLOB that represents the message download history of a POP3 account, so that you can identify the messages that have been downloaded or deleted on that account.
Why parse the message download history?
The Post Office Protocol (POP) provider for Outlook allows users to download new email messages to their local device, and then to leave those messages on the mail server or delete them from it. When the mail client checks for new messages, it has to identify and download only the messages that are new for that Inbox. The mail client does this by first using the UIDL (Unique ID Listing) command to obtain a list that maps each message in that Inbox on the server to a unique identifier (UID). The client also gets the message download history, which records the messages that have been downloaded or deleted for the Inbox on that client. Using the UID map and the download history, the client treats any message whose UID is absent from the history as new and downloads it.
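The comparison described above amounts to a set difference. The following is a minimal sketch, not the provider's actual implementation; the function name `find_new_uids` and its parameters are hypothetical, and it assumes both inputs are already decoded UID strings.

```python
def find_new_uids(server_uids, history_uids):
    """Return the UIDs reported by UIDL that are absent from the download history.

    server_uids:  UIDs from the server's UIDL response, in server order.
    history_uids: UIDs recorded in the client's message download history.
    """
    seen = set(history_uids)
    # Preserve the server's ordering so messages are fetched in listing order.
    return [uid for uid in server_uids if uid not in seen]
```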
To get the message download history for an Inbox:
Follow the steps in Locating the message download history for a POP3 account to find the PidTagAttachDataBinary property, which contains a binary large object (BLOB) that represents the message history for a POP3 account.
Read this topic, which describes the structure of the BLOB, and shows an example BLOB to identify messages that have been downloaded or deleted for the Inbox of the POP3 account.
POP BLOB structure
The POP BLOB structure, as described in Table 1, begins with two fields, Version and Count, followed by a Count number of resource tags, each of which is null-terminated.
Table 1. Structure of the BLOB that represents the message download history of a POP3 account
Field in BLOB | Size | Description |
---|---|---|
Version | 2 bytes | Must be 3 (PBLOB_VERSION_NUM). |
Count | 2 bytes | The number of resource tags in this BLOB. |
Resource tag | Variable | 0 or more null-terminated UTF-8 strings that encode the resource tags. The number of null-terminated strings must match Count. |
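As an illustration of Table 1, the following sketch splits a BLOB into its resource tag strings. The function name `split_pop_blob` is hypothetical. The table does not state the byte order of Version and Count, so the sketch assumes little-endian unsigned 16-bit integers and exposes the byte order as a parameter.

```python
import struct

PBLOB_VERSION_NUM = 3  # required Version value from Table 1


def split_pop_blob(blob, byte_order="<"):
    """Return the resource tag strings stored in a POP3 history BLOB.

    byte_order is a struct prefix: "<" (little-endian, an assumption) or ">".
    """
    if len(blob) < 4:
        raise ValueError("BLOB is too short to hold Version and Count")
    version, count = struct.unpack_from(byte_order + "HH", blob, 0)
    if version != PBLOB_VERSION_NUM:
        raise ValueError(f"unexpected BLOB version: {version}")

    tags = []
    offset = 4  # resource tags start right after the two 2-byte fields
    for _ in range(count):
        end = blob.find(b"\x00", offset)
        if end == -1:
            raise ValueError("resource tag is not null-terminated")
        tags.append(blob[offset:end].decode("utf-8"))
        offset = end + 1  # skip the null terminator
    return tags
```

The function raises `ValueError` when the number of null-terminated strings is smaller than Count, which mirrors the requirement that the two must match.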
Each resource tag specifies the operation that is applied to a message, some date-time metadata about the operation, and encodes the UID of the message. The format of a resource tag string is broken down as follows, and is further explained in Table 2.
OcyyyyMMddhhmmssuuu...
Table 2. Structure of a resource tag
Field in a resource tag | Size | Description |
---|---|---|
O | 1 character | The operation performed on the email message. The value must be "+", "-", or "&", which indicates a successful get, delete, or get-and-delete operation, respectively. |
c | 1 character | The part of the message content involved in the operation. The value must be " ", "h", or "b", which indicates the content of none, header, or body, respectively. |
yyyy | 4 characters | The four-digit year of the operation. |
MM | 2 characters | The two-digit month of the operation. |
dd | 2 characters | The two-digit day of the operation. |
hh | 2 characters | The two-digit hour of the operation. |
mm | 2 characters | The two-digit minute of the operation. |
ss | 2 characters | The two-digit second of the operation. |
uuu… | Variable length | The encoded UID of a message. |
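The fixed-width layout in Table 2 can be read with simple string slicing. The sketch below is one possible approach; the names `ResourceTag` and `parse_resource_tag` are hypothetical, and the UID is returned still encoded.

```python
from datetime import datetime
from typing import NamedTuple

OPERATIONS = {"+": "get", "-": "delete", "&": "get-and-delete"}
CONTENT_PARTS = {" ": "none", "h": "header", "b": "body"}


class ResourceTag(NamedTuple):
    operation: str       # "get", "delete", or "get-and-delete"
    content: str         # "none", "header", or "body"
    timestamp: datetime  # naive: Table 2 does not specify a time zone
    raw_uid: str         # UID as stored, still encoded


def parse_resource_tag(tag):
    """Split one resource tag string into the fields listed in Table 2."""
    # 1 (O) + 1 (c) + 14 (date and time) = 16 fixed characters before the UID.
    if len(tag) < 16:
        raise ValueError("resource tag is shorter than its 16-character prefix")
    if tag[0] not in OPERATIONS:
        raise ValueError(f"unknown operation flag: {tag[0]!r}")
    if tag[1] not in CONTENT_PARTS:
        raise ValueError(f"unknown content flag: {tag[1]!r}")
    timestamp = datetime.strptime(tag[2:16], "%Y%m%d%H%M%S")
    return ResourceTag(OPERATIONS[tag[0]], CONTENT_PARTS[tag[1]], timestamp, tag[16:])
```

Because the table does not say whether the recorded time is local or UTC, the sketch returns a naive `datetime` and leaves time zone handling to the caller.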
Example
Figure 1 shows an example of a BLOB that represents the message download history of a POP3 account.
Figure 1. Example BLOB structure for the message download history of a POP3 account
Based on the structure described in Table 1 and Table 2, this BLOB represents the download history of 23 email messages.
To parse the raw UID in each resource tag, be aware that the UID is encoded as follows: most characters in a UID are alphanumeric and are stored as-is, and each non-alphanumeric character is stored as the ASCII character "$" (0x24) followed by that character's two-digit hexadecimal ASCII code. For example, the ASCII characters $2d represent the non-alphanumeric character "-" (0x2D). Figure 2 shows an example of first converting the raw UID bytes in resource tag 1 to their ASCII representation, then decoding each "$" sequence to produce the actual UID:
0BC535DB-EA63-11E1-A75C-00215AD7BB74
Figure 2. Converting the raw UID in a resource tag to the actual message UID
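The decoding step can be sketched with a regular expression. This assumes that the two characters after each "$" are the hexadecimal ASCII code of the original character, as the $2d example suggests; the function name `decode_uid` is hypothetical.

```python
import re

# "$" followed by exactly two hexadecimal digits, for example "$2d" for "-".
_ESCAPE = re.compile(r"\$([0-9A-Fa-f]{2})")


def decode_uid(raw_uid):
    """Replace every "$hh" sequence in an encoded UID with the character it represents."""
    return _ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), raw_uid)


# The encoded form below is reconstructed by applying the rule in reverse to the
# UID shown above; it is not copied from the figure.
print(decode_uid("0BC535DB$2dEA63$2d11E1$2dA75C$2d00215AD7BB74"))
```

A "$" that is not followed by two hexadecimal digits is left unchanged, which is a choice made for this sketch rather than documented behavior.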
To interpret resource tag 1 in this BLOB: the message with the UID 0BC535DB-EA63-11E1-A75C-00215AD7BB74 was successfully retrieved on September 6, 2012, at 13:11:38.
You can similarly parse the remaining 22 resource tags for that BLOB.