InternetCanonicalizeUrl
4/8/2010
This function converts a URL to a canonical form, including the conversion of unsafe characters into escape sequences.
Syntax
BOOL WINAPI InternetCanonicalizeUrl(
  LPCTSTR lpszUrl,
  LPTSTR lpszBuffer,
  LPDWORD lpdwBufferLength,
  DWORD dwFlags
);
Parameters
- lpszUrl
[in] Pointer to the input URL to canonicalize.
- lpszBuffer
[out] Pointer to the buffer containing the canonicalized URL.
- lpdwBufferLength
[in, out] Pointer to the length, in characters, of the lpszBuffer buffer. If the function succeeds, this parameter receives the length, in characters, of the string written to lpszBuffer, excluding the terminating null. If the function fails, this parameter receives the required length, in characters, of the lpszBuffer buffer. The required length includes the terminating null.
- dwFlags
[in] Flags that control canonicalization. If no flags are specified (dwFlags = 0), the function converts all unsafe characters and meta sequences (such as \., \.., and \...) to escape sequences. The following table shows the possible values.
Value | Description |
---|---|
ICU_DECODE | Converts all %XX sequences to characters, including escape sequences, before the URL is parsed. |
ICU_ENCODE_SPACES_ONLY | Encodes spaces only. |
ICU_NO_ENCODE | Does not convert unsafe characters to escape sequences. |
ICU_NO_META | Does not remove meta sequences (such as . and ..) from the URL. |
Return Value
Returns TRUE to indicate success, and FALSE to indicate failure. To get extended error information, the application should call GetLastError. The following table describes the possible error values for GetLastError.
Value | Description |
---|---|
ERROR_BAD_PATHNAME | The URL could not be canonicalized. |
ERROR_INSUFFICIENT_BUFFER | The canonicalized URL is too large to fit in the buffer provided. The lpdwBufferLength parameter is set to the size, in characters, of the buffer required to hold the canonicalized URL, including the terminating null. |
ERROR_INTERNET_INVALID_URL | The format of the URL is invalid. |
ERROR_INVALID_PARAMETER | A string, buffer, buffer size, or flags parameter is invalid. |
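The length contract described above lends itself to a try-then-retry pattern: call once with a small buffer, and if the call fails because the buffer is too small, allocate the reported length and call again. The following is a minimal sketch of that pattern in portable C. It does not call WinINet: model_canonicalize is a hypothetical stand-in that only mimics the length semantics (it copies the URL unchanged and counts char units), and canonicalize_alloc is likewise an illustrative name. Real code would call InternetCanonicalizeUrl and check that GetLastError returns ERROR_INSUFFICIENT_BUFFER before retrying.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in with the same length contract as described above (NOT the WinINet
   function): copies url unchanged so that only the buffer handling is shown. */
int model_canonicalize(const char *url, char *buf, unsigned long *len)
{
    unsigned long needed;

    assert(url != NULL && len != NULL);
    needed = (unsigned long)strlen(url) + 1;  /* includes the terminating null */
    if (buf == NULL || *len < needed) {
        *len = needed;                         /* report the required length */
        return 0;                              /* models ERROR_INSUFFICIENT_BUFFER */
    }
    memcpy(buf, url, needed);
    *len = needed - 1;                         /* excludes the terminating null */
    return 1;
}

/* Tries a small stack buffer first, then retries with the reported length.
   Returns a malloc'd string that the caller must free, or NULL on failure. */
char *canonicalize_alloc(const char *url)
{
    char small[16];
    unsigned long len = sizeof small;
    char *result;

    if (model_canonicalize(url, small, &len)) {
        /* Success: len excludes the null, so copy len + 1 characters. */
        result = malloc(len + 1);
        if (result != NULL)
            memcpy(result, small, len + 1);
        return result;
    }

    /* Failure: len now holds the required length, including the null. */
    result = malloc(len);
    if (result != NULL && !model_canonicalize(url, result, &len)) {
        free(result);
        result = NULL;
    }
    return result;
}
```

The point to carry over to real code is the asymmetry: on success the returned length excludes the terminating null, while on failure the required length includes it, so the retry can allocate exactly the reported number of characters.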
Remarks
This function always encodes by default, even if the ICU_DECODE flag is specified. To decode without re-encoding, the application should use ICU_DECODE | ICU_NO_ENCODE. If the ICU_DECODE flag is used without ICU_NO_ENCODE, the function decodes the URL before parsing, and then re-encodes unsafe characters after parsing.
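The difference between decoding only and decoding followed by re-encoding can be modeled with two small helpers. This is a simplified sketch in portable C, not WinINet code: decode_escapes and encode_spaces are hypothetical names, and the encoder handles spaces only, whereas the real function works with a broader set of unsafe characters. Running just the first helper corresponds roughly to ICU_DECODE | ICU_NO_ENCODE, and running both in sequence corresponds roughly to ICU_DECODE alone.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: decodes every %XX sequence.
   dst must hold at least strlen(src) + 1 characters. */
void decode_escapes(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            char hex[3] = { src[1], src[2], '\0' };
            *dst++ = (char)strtol(hex, NULL, 16);
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}

/* Hypothetical helper: encodes spaces only.
   dst must hold at least 3 * strlen(src) + 1 characters. */
void encode_spaces(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; ++src) {
        if (*src == ' ') {
            memcpy(dst, "%20", 3);
            dst += 3;
        } else {
            *dst++ = *src;
        }
    }
    *dst = '\0';
}
```

In this model, decoding "http://a/x%20y" yields a string with a literal space, and passing that result through the encoder restores "%20", which is why the decoded form is only visible when re-encoding is suppressed.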
This function will handle arbitrary protocol schemes. However, to do so, it must make inferences from the unsafe character set.
The application should track the use of this function on a particular URL. If unsafe characters in a URL have already been converted to escape sequences, calling InternetCanonicalizeUrl again on the URL (with no flags) escapes those escape sequences a second time. For example, a blank space in a URL is converted to the escape sequence %20. Calling InternetCanonicalizeUrl again on the URL converts %20 to %2520, because the function replaces the percent sign (%), an unsafe character reserved for escape sequences, with the escape sequence %25.
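The double-escaping effect can be reproduced with a very small encoder. The sketch below is portable C and is not the WinINet implementation: escape_unsafe is a hypothetical helper that treats only the space and the percent sign as unsafe, which is enough to show why a second pass turns %20 into %2520.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of WinINet: escapes only ' ' and '%'.
   Returns 1 on success, or 0 if dst is too small for the escaped string. */
int escape_unsafe(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return 0;

    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        if (c == ' ' || c == '%') {
            if (out + 3 >= dst_size)      /* three characters plus the null */
                return 0;
            dst[out++] = '%';
            dst[out++] = hex[c >> 4];
            dst[out++] = hex[c & 0x0F];
        } else {
            if (out + 1 >= dst_size)      /* one character plus the null */
                return 0;
            dst[out++] = (char)c;
        }
    }
    dst[out] = '\0';
    return 1;
}
```

Applied to "a b", this helper produces "a%20b"; applied again to that result, it produces "a%2520b". An application that records whether a URL has already been canonicalized can avoid the second pass.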
Requirements
Header | wininet.h |
Library | wininet.lib |
Windows Embedded CE | Windows CE 2.0 and later |
Windows Mobile | Pocket PC 2000 and later, Smartphone 2002 and later |