Languages Evolution: introduction of new keywords
Posted by forax on October 9, 2006 at 2:22 AM PDT
When you want to add features to a language without breaking backward compatibility, a widespread idea is that you can't add new keywords. That is why we currently see odd proposals in the Java space that try to reuse old keywords to express new kinds of abstraction, for example synchronized (closure v0.2, section 3) or for (Neal Gafter's blog about for).

Why does introducing a new keyword break already written code?

When you specify a new keyword, you need to change the lexer so that it recognizes a sequence of characters as a new token. As a result, the lexer no longer recognizes this sequence as an identifier, and any existing program that uses it as a name stops compiling.
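To make the breakage concrete, here is a minimal sketch of how a context-free lexer classifies a word. The class name, the method `classify`, and the abbreviated keyword sets are assumptions made for illustration; they are not taken from any real compiler.

```java
import java.util.Set;

class KeywordLexerSketch {
    // Abbreviated, illustrative keyword sets; the real language has many more keywords.
    static final Set<String> KEYWORDS_1_4 = Set.of("class", "int", "return");
    static final Set<String> KEYWORDS_1_5 = Set.of("class", "int", "return", "enum");

    /** Classifies a word the way a lexer with one global keyword table would. */
    static String classify(String word, Set<String> keywords) {
        return keywords.contains(word) ? "KEYWORD" : "IDENTIFIER";
    }

    public static void main(String[] args) {
        // The same word changes token kind as soon as it enters the keyword table.
        System.out.println(classify("enum", KEYWORDS_1_4)); // IDENTIFIER
        System.out.println(classify("enum", KEYWORDS_1_5)); // KEYWORD
    }
}
```

Because the table is global, the lexer cannot tell a declaration from a variable name: every occurrence of the word changes kind at once.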
One magic solution is to use one or more special characters to
differentiate keywords from identifiers.
Many scripting languages use '$', '#', etc. to tag variables;
Perl6
is the best example.
Java is a strongly typed language, so it doesn't need such special characters, and we are stuck as long as we keep thinking of lexers the way lex works. The problem comes from the lexer, so the solution is to change how the lexer works.

Contextual keywords
Let me take an example: "enum" is a new keyword introduced in 1.5
to declare enumerated types.
So the lexer of a 1.5 compiler
now recognizes "enum" as a keyword throughout the whole program.
The solution is to use a lexer that implements contextual keywords, i.e. a lexer that lets the parser activate or deactivate the rules needed to recognize tokens, depending on the parser state.
enum Foo { // keyword
  public static void main(String[] args) {
    Enumeration enum = ...; // identifier
  }
}
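The idea in the example above can be sketched as a lexer that asks the parser which keywords are active before classifying a word. This is only an illustration under stated assumptions: `ParserState`, `activeKeywords`, and `nextToken` are hypothetical names, the two states stand in for the many states of a real parser, and none of this is Tatoo's actual API.

```java
import java.util.Set;

class ContextualLexerSketch {
    /** Hypothetical parser states; a real parser has far more of them. */
    enum ParserState { EXPECT_DECLARATION, EXPECT_VARIABLE_NAME }

    /** Keywords the parser activates in a given state (assumed mapping). */
    static Set<String> activeKeywords(ParserState state) {
        switch (state) {
            case EXPECT_DECLARATION:
                return Set.of("enum", "class");
            default:
                // Where only a name can appear, no keyword rule is active.
                return Set.of();
        }
    }

    /** Classifies a word using only the rules active in the current parser state. */
    static String nextToken(String word, ParserState state) {
        return activeKeywords(state).contains(word)
                ? "KEYWORD(" + word + ")"
                : "IDENTIFIER(" + word + ")";
    }

    public static void main(String[] args) {
        // "enum Foo {": a declaration is expected, so the keyword rule is active.
        System.out.println(nextToken("enum", ParserState.EXPECT_DECLARATION));   // KEYWORD(enum)
        // "Enumeration enum": a variable name is expected, so the keyword rule is inactive.
        System.out.println(nextToken("enum", ParserState.EXPECT_VARIABLE_NAME)); // IDENTIFIER(enum)
    }
}
```

The design choice is that the keyword table is no longer global: it is a function of the parser state, so old code that uses the word as a name keeps lexing as before.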
With two colleagues, I've written a new parser generator
named Tatoo
that generates this kind of lexer.
Tatoo contains other innovative features such as grammar versioning, full NIO support (push lexer/parser), lexing without Unicode decoding, and an AST generator. I will blog about those features later.