Find and replace function uses the wrong Unicode code

Question

Anonymous

Hi,

I have this weird problem.

I am on Office 2010 on Win7 Ultimate (Japanese version, but set to English), if that is of any use.

In a document, when I use ALT+X to display the Unicode code of a character, it displays the right code, but the search/replace function of Word matches different codes.

I am trying to search for a character that is displayed as blank (it is not the space displayed as a dot). Since copying and pasting it into the Find box wasn't working, I tried to find it by its Unicode code, hence the ALT+X. It tells me 2028, which seems to be a line separator. That is quite possible, since these characters come from a UTF-8 text file and could indeed be line separators.

Anyway, when I search for ^u2028, I get zero results. I tried another character: for the small letter a, ALT+X correctly says 0061, but again, if I search for ^u0061, I get zero matches. Instead, the search function finds the small a when I type ^u0065, which is the Unicode code for e.

So basically, the text is properly encoded and ALT+X displays the code properly, but my search function is off. Any idea what the problem is?

EDIT: Hi all. Thanks for the replies. It's definitely a problem related to the search function. I tried ^u0065 again today to see whether it would again incorrectly find the small a (u0061), and it didn't even find it! I realized that the find parameters were different. The default option is "Sounds like (Japanese)", and it finds nothing when this is turned on. However, if I untick this option, it goes back to finding the character with the wrong code.

The suggested solutions (5-digit Unicode code, and searching with the HTML code) do not work.

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

Answer accepted by question author

9 additional answers

Answer 1

Have you tried searching for “^u8232” (without the quote marks)? If you do, Word should find the blank “line separators” in your document.

If you search for “^u97” (without the quote marks), Word should find any “Latin small letter a”.

If you search for “^u101” (without the quote marks), Word should find any “Latin small letter e”.

Etc.

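In short, use the HTML entity code numbers, that is, the decimal code points (as in &#8232;) rather than the hexadecimal values that ALT+X displays, to search for characters in a Word document.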
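
The numbers in this answer are the decimal forms of the hexadecimal values that ALT+X shows (2028 hex is 8232 decimal, 0061 hex is 97). As a minimal sketch of that conversion, assuming the ^u search code takes a decimal number as the examples above suggest, the following Python helper turns an ALT+X value into a search string. The function name `word_find_code` is made up for illustration and is not part of Word or any library.

```python
def word_find_code(alt_x_hex: str) -> str:
    """Turn the hexadecimal value shown by ALT+X into a ^u search string.

    Assumption: the ^u code in Word's Find box expects the decimal
    code point, as the ^u8232 / ^u97 / ^u101 examples suggest.
    """
    code_point = int(alt_x_hex, 16)  # ALT+X displays hexadecimal
    return f"^u{code_point}"         # formatted as plain decimal


if __name__ == "__main__":
    for shown in ("2028", "0061", "0065"):
        print(shown, "->", word_find_code(shown))
```

Run as a script, this prints `2028 -> ^u8232`, `0061 -> ^u97` and `0065 -> ^u101`, matching the three search strings given above.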

Robert

Answer 2

Anonymous

What exactly have you tried to no avail?

What I suggested above works in Word 2007.

Robert

Answer 3

Sorry, I edited my answer (I had tried the 5 digits) as well as my question to reflect what I have tried so far.

It looks like the issue is definitely with the search function: when I untick the "Sounds like (Japanese)" option, it consistently finds the wrong Unicode character, while if that option is ticked, it finds nothing.

Answer 4

Anonymous

Thanks Robert19.

This does work indeed. The weird thing, though, is that ^u97 finds the small letter a only if "Match case" is checked. If not, it finds both A and a... I wonder why.

Anyway, thanks a million.
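
One plausible reading of the earlier results, assuming ^u reads its digits as a decimal number and that an unchecked "Match case" makes the search case-insensitive: ^u0065 is decimal 65, which is the capital letter A, so it would match a small a as well, while ^u0061 is decimal 61, the equals sign. The Python sketch below only illustrates that arithmetic; `describe_search` is a hypothetical helper, and nothing here inspects Word itself.

```python
import unicodedata


def describe_search(term: str) -> str:
    """Name the character a ^u term would denote if read as decimal.

    Assumption: the digits after ^u are decimal, leading zeros ignored.
    """
    code_point = int(term[2:], 10)  # drop the "^u" prefix, parse as decimal
    ch = chr(code_point)
    return f"{term}: U+{code_point:04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for term in ("^u0061", "^u0065", "^u97"):
        print(describe_search(term))
    # With case ignored, capital A (decimal 65) and small a (decimal 97) compare equal.
    print(chr(65).lower() == chr(97))
```

Under those assumptions, ^u0065 denotes LATIN CAPITAL LETTER A and ^u0061 denotes EQUALS SIGN, which would account for both the unexpected match on a and the zero results.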

Answer 5

Try prefixing the find number with a zero, i.e. ^u02028.

This tip suggests that Unicode supports IDs of up to 5 digits, although all the other tips I've seen only specify using 4 digits (as you did).

http://www.word.mvps.org/FAQs/MacrosVBA/FindReplaceSymbols.htm

Prefixing with 0 doesn't work.

Thanks.
