Unicode References
In this article, I recommend several Unicode articles and websites for reference. Note that the list is not yet complete; I will add more entries and improve the categorization.
- My blog is a good site for collation issues in SQL Server.
- Sorting it all Out. Michael Kaplan's random stuff of dubious value is a great source for learning globalization techniques with Microsoft software.
- International Features in Microsoft SQL Server 2005. The title says it all.
- Description of storing UTF-8 data in SQL Server: this is an article you need to read if you want to store UTF-8 data in SQL Server.
- UTF-8 and Unicode FAQ for Unix/Linux. Although the title mentions Unix/Linux, this is a very good paper about UTF-8 in general. If you want to learn UTF-8, it is the best document.
- A tutorial on character code issues: a comprehensive article about character encoding. It is a good reference.
- Migrating Software to Supplementary Characters. This 57-page presentation is a complete list of things you need to consider when migrating from UCS-2 to UTF-16. Read it and you will see that it is really not a trivial task.
- Globalization issues in ASP and ASP.NET. This is a really good article for ASP and ASP.NET.
- Oracle® Database Globalization Support Guide: Programming with Unicode. I have to say Oracle’s documentation is sometimes better. This book is a pretty good reference if you want to learn Oracle’s globalization support. It can also serve as a reference for Unicode.
- Avoid treating binary data as a String. In this blog post, Shawn Steele explains that converting between binary data and strings with the Encoding class is not always safe or round-trippable, since the binary data might be a malformed UTF-16 sequence. This link describes the change in detail.
- “How to: Send and Retrieve UTF-8 Data (SQL Server 2005 Driver for PHP)”. It is a must-read document if you are developing in PHP with SQL Server.
- UTF8 Security and Whidbey Changes. In this blog post, Shawn Steele describes the behavior changes related to UTF-8 encoding functions in Windows Vista and .NET 2.0. For people using the Windows MultiByteToWideChar function or the .NET UTF8Encoding class, this is a must-read document.
- Some musings on Oracle Character Sets. A very good article about our friend Oracle’s Unicode character set support.
- Surrogates and Supplementary Characters. The MSDN document on Win32 UTF-16 support.
- Implementation of Unicode in SQL Server. This is a pretty good introductory paper about Unicode support in SQL Server.
- “Saving UTF-8 in SQL Server 2005” is a forum thread where people discuss storing UTF-8 data in SQL Server.
- The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). This is a very good article that covers the basics of Unicode in a simple way. A good introductory article.
- Unicode in XML and other Markup Languages. This is the W3C’s official document on using the Unicode Standard with XML and other markup languages.
- Forms of Unicode: this is a pretty handy document about the Unicode encoding forms and the different formats of a Unicode character. This is my favorite document.
- Unicode in Visual Studio 2003 contains good examples of C#’s Unicode support, such as file encoding, globalization configuration, and ASP.NET request and response encoding.
- The Basic of UTF-8. In this article, Marius gives a brief introduction to UTF-8 encoding, followed by a C++ code example of converting between UTF-16 and UTF-8. This is another example of writing UTF-8 in C++.
- UTF-8 Versus Windows UNICODE. In this article, Ben compares the UTF-8 and UTF-16 encodings.
- Guide to Unicode is a series of very extensive articles about Unicode. These are my favorite articles.
- Globalization Step by Step: a comprehensive guide to globalization development with Microsoft products.
- Supplementary-Aware String Manipulation Sample is a set of SQLCLR functions that perform supplementary-aware string manipulation. It works with SQL Server 2005 and 2008.
- Autotranslation of Character Data. If you work with the SQL Server varchar type and retrieve data through the ODBC driver, you might hit the issues discussed in this article when your client OS differs from your server OS.
- Oracle’s string function length semantics
- Oracle’s data type length semantics
- MySQL’s string function length semantics
- MySQL’s Unicode support
- Implement 4-byte UTF8, UTF16 and UTF32. This is MySQL’s design note for implementing different Unicode encodings in version 6.0. It is a good reference if you want to know the challenges of supporting different encodings in a database.
- Understanding Unicode and ODBC Data Access: this is a good article on the ODBC driver’s Unicode support.
- Using the Easysoft ODBC-Oracle Driver with Unicode Data: another article on the ODBC driver’s Unicode support.
- Application encoding schemes and DB2 z/OS ODBC: DB2’s Unicode support.
- MySQL 6.0 Manual: 9.1.4. Connection Character Sets and Collations: MySQL driver/protocol support for Unicode
- Easysoft ODBC for MS SQL SERVER with UTF-8 support: if you want to get UTF-8 data back from SQL Server, try this.
- SQL SERVER字符集的研究 (A Study of SQL Server Character Sets). If you can read the title, you will know what it is talking about.
- 众多字符集编码的区别 (The Differences Among the Many Character Set Encodings). A very good article in Chinese about Unicode encoding.
- The current PHP version 5 has no Unicode support, according to this web site.
- The next release, PHP version 6, will have native Unicode support; here is the description from the web site: "Upcoming PHP release will offer Unicode support".
- PHP 6.0 will switch from having a single, generic string type to having two: a Unicode string type for text data, implemented through UTF-16, and a binary type, which will include actual binary data and text data for legacy locales.
- Can the CP_ACP be UTF-8? If you are working with Visual C++ and wonder whether UTF-8 can be set as the default locale of the system or of the current thread, you might need to read this document, which says no.
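
Several of the references above keep coming back to the same few points: a supplementary character takes two UTF-16 code units, a string's "length" depends on whether you count characters, code units, or bytes, and arbitrary binary data does not survive a trip through a text encoding. Here is a minimal Python sketch of those three points. It is not taken from any of the linked articles, the function names are my own illustrative choices, and the round-trip check uses Python's UTF-8 codec as a stand-in for the .NET Encoding class discussed in Shawn Steele's post.

```python
# "a" followed by U+1F600, a supplementary character outside the BMP.
s = "a\U0001F600"


def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text: str) -> int:
    """Number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


def round_trips_as_utf8(data: bytes) -> bool:
    """Decode bytes leniently as UTF-8, re-encode, and compare with the input."""
    return data.decode("utf-8", errors="replace").encode("utf-8") == data


code_points = len(s)         # Python's len() counts code points
units = utf16_code_units(s)  # the supplementary character needs a surrogate pair
nbytes = utf8_bytes(s)       # 1 byte for "a" plus 4 bytes for U+1F600

print(code_points, units, nbytes)
print(round_trips_as_utf8("héllo".encode("utf-8")))  # well-formed text
print(round_trips_as_utf8(b"\xff\xfe\x00"))          # arbitrary binary data
```

The three counts differ for the same two-character string, which is why the length semantics of a database's string functions and data types matter. The last line shows why binary data belongs in a binary type: invalid bytes get replaced during decoding, so the original bytes cannot be recovered.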