How to: Enable Syntax Coloring
Your language service can provide syntax coloring for different types of tokens. Tokens are mapped to color classes, which are used by Visual Studio to colorize the text in the editor window.
This topic is based on the Visual Studio Language Package solution, which provides a basic implementation of the Babel package. For more information about creating one of these solutions, see Walkthrough: Creating a Language Service Package.
The lexer and the colorizer
The Visual Studio colorizer imposes some constraints on the form of the tokenization rules. The colorizer saves the state of the lexer at every line of text. Whenever the line colors change, the lexer is restarted. Each time a token is returned by the lexer, the colorizer locates the color class in the table provided by the getTokenInfo method in the Service class. All the characters scanned for this token are colored with this color class. To make this work, the colorizer makes two assumptions about the lexical specification:
A token is always returned at the end of a line.
No characters are skipped or added during scanning.
Therefore you should not simply discard white space and newlines, as many lexical definitions do. Instead, the lexical rules must return a white space token:
[ \t\r]+ { return LEX_WHITE; }
\n { return LEX_WHITE; }
Because the token mapping specifies the CharWhiteSpace character class for LEX_WHITE, only the colorizer knows about the white space. The parser ignores all white space tokens for other purposes, although you can change this behavior by overriding the Service::isGrammarToken method.
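The two assumptions can be checked mechanically. The following is a minimal C++ sketch, not Babel or lex code: the `Token` structure and the `scanLine`, `coversLine`, and `grammarTokens` functions are hypothetical names introduced for illustration. It scans a line into tokens that include white space, verifies that the tokens cover every character exactly once, and then filters out the white space in the way a parser that ignores LEX_WHITE might.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; LEX_WHITE mirrors the lexical rule above.
enum TokenKind { LEX_WHITE, IDENTIFIER, NUMBER, OTHER };

struct Token {
    TokenKind kind;
    std::size_t start;   // offset of the first character in the line
    std::size_t length;  // number of characters covered
};

static bool isSpace(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
static bool isDigit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
static bool isWord(char c)  { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; }

// Scans one line and returns a token for every character, white space included.
// Simplification: all white space, including '\n', is merged into one run.
std::vector<Token> scanLine(const std::string& line) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < line.size()) {
        const std::size_t start = i;
        TokenKind kind;
        if (isSpace(line[i])) {
            kind = LEX_WHITE;
            while (i < line.size() && isSpace(line[i])) ++i;
        } else if (isDigit(line[i])) {
            kind = NUMBER;
            while (i < line.size() && isDigit(line[i])) ++i;
        } else if (isWord(line[i])) {
            kind = IDENTIFIER;
            while (i < line.size() && isWord(line[i])) ++i;
        } else {
            kind = OTHER;
            ++i;
        }
        tokens.push_back({kind, start, i - start});
    }
    return tokens;
}

// True when the tokens tile the line exactly: no gaps and no overlaps.
bool coversLine(const std::vector<Token>& tokens, std::size_t lineLength) {
    std::size_t pos = 0;
    for (const Token& t : tokens) {
        if (t.start != pos || t.length == 0) return false;
        pos += t.length;
    }
    return pos == lineLength;
}

// What a parser that ignores white space would see.
std::vector<Token> grammarTokens(const std::vector<Token>& tokens) {
    std::vector<Token> out;
    for (const Token& t : tokens)
        if (t.kind != LEX_WHITE) out.push_back(t);
    return out;
}
```

If a rule silently discarded white space, `coversLine` would fail for any line containing a space, which is the situation the colorizer cannot handle.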
To satisfy the second constraint, every lexical rule must have an action that ensures no characters are lost. An action should end in only one of the following three ways:
return Token;
Indicates that the token has been scanned and its name has been returned. This is the typical way to end an action.
yymore();
This approach is used to avoid losing characters in yytext (the currently scanned text) while continuing to scan. Use this in situations where you want to return the comment token only at the end of a comment or line.
yyless(0);
Indicates that something has been scanned and the state has changed, but the input should be scanned again. Here, yyless(0) is used to prevent duplication of characters.
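The interaction between saved lexer state and the rule that a token is always returned at the end of a line can be illustrated outside of lex. The sketch below is a plain C++ analogue under stated assumptions, not generated scanner code and not the Babel implementation: `LexState`, `Token`, and `scanLineWithState` are hypothetical names. A comment opener changes the state and stays part of the pending token (similar in spirit to accumulating text with yymore()), and a comment that is still open at the end of the line is returned as a token anyway, with the state carried into the next line.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum LexState { StateInit, StateInComment };  // hypothetical state names
enum TokenKind { LEX_TEXT, LEX_COMMENT };

struct Token {
    TokenKind kind;
    std::size_t start;
    std::size_t length;
};

// Scans one line starting in `state` and leaves `state` set for the next line.
std::vector<Token> scanLineWithState(const std::string& line, LexState& state) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < line.size()) {
        const std::size_t start = i;
        if (state == StateInit) {
            const std::size_t open = line.find("/*", i);
            if (open != i) {
                // Plain text up to the comment opener or the end of the line.
                i = (open == std::string::npos) ? line.size() : open;
                tokens.push_back({LEX_TEXT, start, i - start});
                continue;
            }
            // Opener seen: switch state, keep "/*" in the pending token.
            state = StateInComment;
            i += 2;
        }
        // Inside a comment: extend the token to "*/" or to the end of the line.
        const std::size_t close = line.find("*/", i);
        if (close == std::string::npos) {
            i = line.size();      // still open, but a token is returned anyway
        } else {
            i = close + 2;
            state = StateInit;
        }
        tokens.push_back({LEX_COMMENT, start, i - start});
    }
    return tokens;
}
```

A caller that records the value of `state` before each line can rescan any single line later without rescanning the lines before it, which is the reason for saving the state per line.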
To define language tokens in the lexer source file
Open the default lexer.lex file. You can find it in the default Visual Studio Language Package solution, in the main project under the Source Files folder.
The default token definitions start with the following lines:
<init>if { return KWIF; }
<init>else { return KWELSE; }
<init>while { return KWWHILE; }
Add any new tokens you need. For example, to add a keyword token KWDOUBLE for the double keyword, you would add the following line:
<init>double { return KWDOUBLE;}
To define language tokens in the parser source file
The KWDOUBLE keyword also must be added to the parser file. Since the KWDOUBLE keyword has the same grammar as KWINT, that is, a double can appear in the same code contexts as an integer, you can add KWDOUBLE to the same places as KWINT.
Open the parser.y file. You can find it in the default Visual Studio Language Package solution, in the main project under the Source Files folder.
Find the first line where KWINT appears and add KWDOUBLE:
%token KWEXTERN KWSTATIC KWAUTO KWINT KWDOUBLE KWVOID
Since KWDOUBLE is a type like KWINT, you should also add it to the Type category:
Type : KWINT | KWDOUBLE | KWVOID ;
Save the lexer and parser source files, and build the solution.
When the service is built, Bison should generate the parser.cpp file with the new token.
Mapping tokens to color classes
You can map tokens to a specific color class. For example, in the default language the KWIF token maps to the ClassKeyword color class, and the NUMBER token maps to the ClassNumber color class (see ColorClass Enumeration for details).
Mapping information is provided by the Service::getTokenInfo method in service.cpp. This method returns a pointer to a static TokenInfo array. Each entry in the TokenInfo array contains the following information:
Element | Example |
---|---|
Token name | IDENTIFIER, NUMBER, ';' |
Color Class | ClassIdentifier, ClassNumber, ClassComment |
Description | identifier, number, comment |
Character Class | CharIdentifier, CharLiteral, CharComment |
Trigger Class (optional) | For more information, see Adding Triggers in How to: Provide Automatic Brace Matching. |
The character class typically coincides with the color class. The essential difference is that the color classes can be extended (as is described later) while the character classes are fixed. This allows Babel to use the character class for Visual Studio search and navigation functions.
Example of Mapping Tokens to Color Classes
A typical token table is as follows:
static TokenInfo tokenInfoTable[] =
{
    //TODO: Add your own token information here.
    { IDENTIFIER,       ClassIdentifier, "identifier '%s'", CharIdentifier  },
    { NUMBER,           ClassNumber,     "number ('%s')",   CharLiteral     },
    { KWIF,             ClassKeyword,    "if",              CharKeyword     },
    { KWELSE,           ClassKeyword,    "else",            CharKeyword     },
    { KWWHILE,          ClassKeyword,    "while",           CharKeyword     },
    { LEX_WHITE,        ClassText,       "white space",     CharWhiteSpace  },
    { LEX_LINE_COMMENT, ClassComment,    "comment",         CharLineComment },
    { LEX_COMMENT,      ClassComment,    "comment",         CharComment     },
    //Always end with the 'TokenEnd' token.
    { TokenEnd,         ClassText,       "<unknown>" }
};
The default color classes are defined in babelservice.idl:
enum DefaultColorClass
{
    ClassText,
    ClassKeyword,
    ClassComment,
    ClassIdentifier,
    ClassString,
    ClassNumber
};
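To make the role of the TokenEnd entry concrete, the following self-contained C++ sketch shows one way a lookup over such a table could work. It is an illustration under assumptions, not the Babel implementation: the enumerations and the `TokenInfo` structure here are simplified stand-ins for the SDK declarations, and `CharDefault` and `findTokenInfo` are hypothetical names. KWDOUBLE is deliberately declared but left out of the table to show the fallback.

```cpp
// Simplified stand-ins for the SDK declarations; the real ones may differ.
enum Token { TokenEnd, IDENTIFIER, NUMBER, KWIF, KWDOUBLE, LEX_WHITE, LEX_COMMENT };
enum ColorClass { ClassText, ClassKeyword, ClassComment,
                  ClassIdentifier, ClassString, ClassNumber };
enum CharClass { CharDefault,  // hypothetical placeholder for "no class given"
                 CharIdentifier, CharLiteral, CharKeyword,
                 CharWhiteSpace, CharComment };

struct TokenInfo {
    Token       token;
    ColorClass  colorClass;
    const char* description;
    CharClass   charClass;
};

static const TokenInfo tokenInfoTable[] = {
    { IDENTIFIER,  ClassIdentifier, "identifier '%s'", CharIdentifier },
    { NUMBER,      ClassNumber,     "number ('%s')",   CharLiteral    },
    { KWIF,        ClassKeyword,    "if",              CharKeyword    },
    { LEX_WHITE,   ClassText,       "white space",     CharWhiteSpace },
    { LEX_COMMENT, ClassComment,    "comment",         CharComment    },
    // The sentinel both ends the table and supplies the default entry.
    { TokenEnd,    ClassText,       "<unknown>",       CharDefault    }
};

// Linear search that stops at the sentinel, so a token with no entry
// receives the sentinel's color class in this sketch.
const TokenInfo* findTokenInfo(Token token) {
    const TokenInfo* entry = tokenInfoTable;
    while (entry->token != TokenEnd && entry->token != token) {
        ++entry;
    }
    return entry;
}
```

In this sketch, `findTokenInfo(KWIF)` yields the ClassKeyword entry, while `findTokenInfo(KWDOUBLE)` falls through to the sentinel and is treated as ClassText. A new keyword therefore needs its own table entry before it can be colored as a keyword.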
It is also possible to use custom color classes. For more information, see How to: Provide Custom Color Classes.
Change History
Date | History | Reason |
---|---|---|
July 2008 | Rewrote and refactored project. | Content bug fix. |