The scripting language

CodeWorker is a script interpreter intended to parse and generate any kind of text or source code. The interpreter accepts options on the command line, some of which resemble those of a compiler.

CodeWorker does not provide a graphical user interface, but a console mode allows interaction with the user.

1 Command line of the interpreter

The leader script is the script that the interpreter executes first. There are six ways to pass this leader script to the interpreter via the command line:

To make it easier to find an input file among several directories, the option -I specifies a path to explore. It gives more flexibility in sharing input files (both scripts and user files, but not generated or expanded files) between directories, and it avoids hard-coding relative or absolute paths in scripts.

Properties can be defined on the command line with the option -define (or -D). These properties are intended to be read by scripts.

It is recommended to specify a working directory with the option -path. The assigned value is accessible in scripts via the function getWorkingPath(). This working directory generally indicates the output path for copying or generating files. The developer of the scripts decides how to use it.
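As an illustration of how a script might read what -define and -path pass to it, here is a minimal sketch. The script name, directory names and property names (TARGET, VERBOSE) are hypothetical; only getProperty() and getWorkingPath() come from this manual.

```codeworker
// Hypothetical invocation (file, directory and property names are illustrative):
//   codeworker -script build.cws -I ./shared -D TARGET=java -D VERBOSE -path ./generated

// -D TARGET=java: the property holds "java"
local sTarget = getProperty("TARGET");

// -D VERBOSE carries no value, so "true" is assigned to it by default
if getProperty("VERBOSE") {
    traceLine("target language: " + sTarget);
    traceLine("output directory: " + getWorkingPath());
}
```

Keeping such settings on the command line means the same script can serve several projects without being edited.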

CodeWorker interprets scripts efficiently. However, a standalone executable is often more convenient to run than the interpreter plus a set of script files. Moreover, once scripts are stable, compiling them into an executable makes the project run a few times faster. The option -c++ translates the leader script and all its dependencies to ready-to-compile C++ source code.

To make errors easier to track down, the option -debug starts an integrated debugger. It runs in the console, and a few classical commands allow taking control of the execution and exploring the stack and the variables.

The following table lists all switches allowed on the command line:

Switch
    Description
-args [arg]*
    Passes custom arguments on the command line. The list of arguments stops at the end of the command line or as soon as an option is encountered. The arguments are stored in a global array variable called _ARGS.
-autoexpand file-to-expand
    The file file-to-expand is explored for expanding code at markups, executing a template-based script inserted just below each markup. It is equivalent to executing the script function autoexpand(file-to-expand, project).
-c++ generated-project-path CodeWorker-path?
    Translates the leader script and all its dependencies to C++ source code, once the execution of the leader script has completed (same job as compileToCpp()). The optional CodeWorker-path gives the path to the includes and libraries of the software. However, it is now recommended to specify this path with the switch -home.
-c++2target script-file generated-project-path target-language?
    Translates the leader script and all its dependencies to C++ source code, then translates the C++ to a target language, all once the execution of the leader script has completed. Do not forget to give the path to the includes and libraries of CodeWorker by setting the switch -home.
    A preprocessor definition called "c++2target-path" is automatically created. It contains the path of the generated project. Call getProperty("c++2target-path") to retrieve the path value.
    target-language is optional if at least one script of the project holds the target in its filename, just before the extension. Example: "myscript.java.cwt" means that the target language of this script is "java".
    A property can follow the name of the target language, separated by a '=' symbol. The property is accessible via getProperty("c++2target-property"), and its nature depends on the target. For instance, in Java, this property represents the package the generated classes will belong to. Example: java=org.landscape.mountains.
-c++external filename
    Generates C++ source code for implementing all functions declared as external in scripts.
-commentBegin format
    Specifies the format of the beginning of a comment.
-commentEnd format
    Specifies the format of the end of a comment.
-compile scriptFile
    Compiles a script file, just to check whether the syntax is correct.

-commands commandFile
    Loads from commandFile all the arguments ordinarily given on the command line. No other switch or argument may be passed on the command line.
-console
    Opens a console session (the default mode if no script to interpret is specified via -script, -compile, -generate or -expand).
-debug [remote]?
    Debugs a script in a console while executing it. The optional argument remote defines parameters for remote socket control of the debugging session. remote looks like <hostname>:<port>. If <hostname> is empty, CodeWorker runs as a socket server.
-define VAR=value (or -D ...)
    Defines variables, as when using the C++ preprocessor or when passing properties to the Java compiler. These variables are similar to properties, insofar as they aren't exploited during the preprocessing of scripts to interpret. The form -define VAR may be used when no value has to be assigned; in that case, "true" is assigned by default to variable VAR. The script function getProperty("VAR") gives the value of variable VAR.
-expand pattern-script file-to-expand
    Script file pattern-script is executed to expand file file-to-expand at its markups. It is equivalent to executing the script function expand(pattern-script, project, file-to-expand).
-fast
    Optimizes speed. During generation, the output file is built in memory instead of in a temporary file.
-generate pattern-script file-to-generate
    Script file pattern-script is executed to generate file file-to-generate. It is equivalent to executing the script function generate(pattern-script, project, file-to-generate).
-genheader text
    Adds a header at the beginning of all generated files, followed by a text (see procedure setGenerationHeader()).
-help or ?
    Help about the command line.
-home CodeWorker-path
    Specifies the path to the home directory of CodeWorker.
-I path
    Specifies a path to explore when trying to find a file while invoking include, parseFree, parseAsBNF, generate, expand and so on. This option may be repeated to specify more than one path.
-insert variable_expression value
    Creates a new node in the main parse tree project and assigns a constant value to it. It is equivalent to executing the statement insert variable_expression = "value";.
-nologo
    The interpreter doesn't write the copyright notice to the shell at startup.

-nowarn warnings
    The specified warning types are ignored. They are separated by pipe symbols. Currently, the only recognized type is undeclvar, which warns the developer about the use of an undeclared variable.
-parseBNF BNF-parsing-script source-file
    The script file BNF-parsing-script parses source-file according to an extended BNF grammar. It is equivalent to executing the script function parseAsBNF(BNF-parsing-script, project, source-file).
-path path
    Output directory, returned by the script function getWorkingPath(), and ordinarily used to specify where to generate or copy a file.
-quantify [outputFile]?
    Executes scripts in quantify mode, which measures coverage and time consumption. Results are saved to the HTML file outputFile, or displayed on the console if outputFile is not given.
-report report-file request-flag
    Generates a report once the execution has completed. The report is saved to file report-file, and the nature of the information depends on the flag request-flag. This flag must be built by computing a bitwise OR of one or several of the following integer constants:
      • 1: provides every output file written by a template-based script (generate(), expand() or translate())
      • 2: provides every input file scanned by a BNF parse script (parseAsBNF() or translate())
      • 4: provides details of coverage recording for every output file using the #coverage directive
      • 8: provides details of coverage recording for every input file using the #matching directive
      • 16: provides details of coverage recording for every output file written by a template-based script
      • 32: provides details of coverage recording for every input file scanned by a BNF parse script
    Notice that flags 16 and 32 may become highly time and memory consuming, depending both on how many input/output files you have to process and on their size.
-script script-file
    Defines the leader script, which will be executed first.
-stack depth
    Limits the depth of recursive function calls, to avoid a stack overflow. By default, the depth is set to 1000.
-stdin filename
    Changes the standard input to read from an existing file. It may be useful for running a scenario.
-stdout filename
    Changes the standard output to write to a file.
-time
    Displays the execution time, expressed in milliseconds, just before exiting.

-translate translation-script source-file file-to-generate
    Script file translation-script performs a source-to-source translation. It is equivalent to executing the script function translate(translation-script, project, source-file, file-to-generate).
-varexist
    Triggers a warning when a script reads the value of a variable that doesn't exist.
-verbose
    Displays internal messages of the interpreter (information).
-version version-name
    Forces scripts to be interpreted as if written for the earlier version given by version-name.
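
As a worked illustration of the request-flag expected by the switch -report, the sketch below combines three of the constants listed in the table. The report file name is hypothetical. The constants are distinct powers of two, so adding them gives the same value as the bitwise OR; the sum is used here only to keep the sketch within the operators described in this chapter.

```codeworker
// Wanted: 1 (output files), 2 (input files) and
// 16 (coverage of output files written by template-based scripts).
// 1 + 2 + 16 = 19, which is also the bitwise OR of the three constants.
local iRequestFlag = $1 + 2 + 16$;

// displays the switch to put on the command line
// ("report.txt" is a hypothetical file name)
traceLine("-report report.txt " + iRequestFlag);
```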

Note that the interpreter offers a convenient shorthand for running a common script with arguments:

codeworker <script-file> <arg1> ... <argN> [<switch>]*

This form replaces the more verbose:

codeworker -script <script-file> -args <arg1> ... <argN> [<switch>]*
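
On the script side, the arguments arrive in the global array _ARGS. The following sketch assumes a hypothetical script file "hello.cws" and two illustrative arguments; it simply displays every argument it receives.

```codeworker
// file "hello.cws" (hypothetical), run with either form:
//   codeworker hello.cws Alice Bob
//   codeworker -script hello.cws -args Alice Bob
foreach i in _ARGS {
    traceLine("argument: " + i);
}
```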

A console mode is launched when the command line is empty. The console only accepts scripts written in the common syntax, with common functions and procedures. So, parsing and generation scripts cannot be typed directly in the console.

2 Syntax generalities and statements

A script in CodeWorker consists of a series of statements that are organized into blocks (also known as compound statements). A statement is an instruction the interpreter has to execute.

A single statement must close with a semicolon (';'). A compound statement is defined by enclosing instructions between braces ('{}'). A block can be used everywhere you can use a single statement and must never end with a semicolon after the trailing brace.

Comments are indicated either by surrounding the text with '/*' and '*/' or by preceding the rest of the line to ignore with a double slash ('//').
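
A short sketch of these rules, using only the traceLine() procedure that appears elsewhere in this chapter:

```codeworker
// a single statement closes with a semicolon
traceLine("single statement");

/* a compound statement: no semicolon
   after the trailing brace */
{
    traceLine("first statement of the block");
    traceLine("second statement of the block");
}
```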

There are three families of scripts. To facilitate their syntax highlighting in editors, or to indicate briefly the type of a script, we suggest using a file extension that depends on the nature of the script. The following table lists the extensions commonly used in CodeWorker.

Extension   Description
".cwt"      a template-based script, for text generation
".cwp"      an extended-BNF parse script, for parsing text
".cws"      a common script, neither of the above

The structure of the grammar is so rich that it is a challenge to find an editor whose syntax highlighting engine is powerful enough. JEdit allows writing production rules to describe a grammar, so it is possible to express the syntax highlighting of the scripting language.

You'll find a package dedicated to JEdit on the Web site, for the inclusion of these new highlighting modes. Many thanks to Patrick Brannan for this contribution.

2.1 Preprocessor directives

A preprocessor directive always starts with a '#' symbol and is followed by the name of the directive.

2.1.1 Including a file

The #include filename directive tells the preprocessor to replace the directive at the point where it appears by the contents of the file specified by the constant string filename. The preprocessor looks for the file in the current directory and then searches along the path specified by the -I option on the command line.

2.1.2 Extending the language via a package

A package is an extension of the scripting language that allows adding new functions in CodeWorker at runtime. A package is implemented as an executable module, which exports all new functions the developer wants to make available in the interpreter.

Loading of a package

The preprocessor directive #use tells the interpreter that it must extend itself with the functions exposed by a package.

The syntax is: #use package-name

Loading a package more than once has no effect.

The name of the package must prefix the name of the function, when calling it: package-name::my-function(parameters...)

Example:

#use PGSQL
PGSQL::connect("-U pilot -d emergencyDB");
local sRequest = "SELECT solution FROM average_adjustment WHERE damage = 'broken wing'";
local listOfSolutions;
PGSQL::selectList(sRequest, listOfSolutions);
if listOfSolutions.empty()
  traceLine("No solution. Suggestion: parachute jump?");
else {
  traceLine("Solutions:");
  foreach i in listOfSolutions
    traceLine(" -" + i);
}
PGSQL::disconnect(); // if the plane hasn't crashed yet

The PGSQL package serves here for connecting to and querying a PostgreSQL database. For this example, the package exports three functions: PGSQL::connect, PGSQL::selectList and PGSQL::disconnect.

The executable module

CodeWorker expects a dynamic library, whose name is deduced from the package name and from the platform the interpreter is running on.
The short name of the dynamic library is formed by appending "cw" to the package name. The extension of the dynamic library must be ".dll" under Microsoft Windows, and ".so" under Linux.

You must put the dynamic library at a place where CodeWorker will find it at runtime.
Microsoft Windows proceeds in the following order to locate the library:

Under Unix, a relative path for the shared object refers to the current directory (according to the man description of dlopen(3C)).

So, when CodeWorker reads #use PGSQL, it searches for a dynamic library called "PGSQLcw.dll" under Windows or "PGSQLcw.so" under Linux.

Building a package

This section is intended for those who want to build their own packages, for binding to a database or to a graphical library ... or just for gluing with their own libraries.

When the interpreter finds the preprocessor directive #use package-name in a script, it loads the executable module and executes the exported C-like function CW4DL_EXPORT_SYMBOL void package-name_Init(CW4dl::Interpreter*).

The preprocessor definition CW4DL_EXPORT_SYMBOL and the namespace CW4dl are both declared in the C++ header file "CW4dl.h". This header file is located in the "include" directory if you downloaded binaries, and at the root of the project if you downloaded sources.

The C-like function 'package-name_Init()' MUST be present! C-like means that it is declared extern "C" (done by CW4DL_EXPORT_SYMBOL).

Initializing the module that way is useful for registering new functions in the engine, via the function createCommand() of the interpreter (see the header file "CW4dl.h" in the declaration of the class Interpreter for learning more about it).

Every function to export must start its declaration with the preprocessor definition CW4DL_EXPORT_SYMBOL (means 'extern "C"', but a little more under Windows).

Every function returns const char*. The CodeWorker keyword null designates an atypical tree node. It cannot be navigated or referenced; it can only be passed as a parameter to a function. On the C++ side, this null tree node is seen as a null pointer of kind CW4dl::Tree*.

The interpreter CW4dl::Interpreter represents the runtime context of CodeWorker. It is the unavoidable intermediary between the module you are building and CodeWorker.
Use it for:

The #line directive resets the line counter of the script file being parsed. The line immediately after the directive is given the number specified after #line.

2.1.3 Changing the syntax of the scripting language

The #syntax directive tells the preprocessor not to parse the following instructions as classical statements of the scripting language, but as conforming to another syntax. It allows adapting the syntax to what you are programming. The directive is written as follows:
"#syntax" [parsing-mode [':' BNF-script-file]? | BNF-script-file]

How does it work? The piece of source code that doesn't conform to the syntax of the scripting language is put between the directives #syntax ... and #end syntax. If the trailing directive isn't found, the remainder of the script is considered as written in the foreign syntax. Note that the trailing directive must start at the beginning of a line to be recognized, and that no spaces are allowed between # and end.
At runtime, this piece of source code is parsed and processed via the BNF script file.

Note that it is possible to attach an identifier (called parsing-mode above) to a script file, and to specify later, in any other script, the parsing mode only; CodeWorker will find the corresponding BNF script file. This avoids handling the physical name of the BNF parsing file where the logical name of a parsing mode is more convenient.

Example:

     // the first time, a parsing mode may be attached to the BNF script file
     #syntax shell:"TinyShell.cwp"
     ...
     #end syntax
     
     // at the second call, it isn't recommended to use the path of the parsing file
     // it is better to use the parsing mode registered previously
     #syntax shell
     ...
     #end syntax
     
     // here, I know that I'll call it once only, so I don't care about a parsing mode
     #syntax "MakeFile.cwp"
     ...
     #end syntax

where the parsing script "TinyShell.cwp" might look like:

      // file "GettingStarted/TinyShell.cwp":
      tinyShell ::=
              #ignore(C++)
              #continue
              [
                  #readIdentifier:sCommand
                  #ignore(blanks) #continue
                  command<sCommand>
              ]* #empty;
     
      //----------------------------//
      // commands of the tiny shell //
      //----------------------------//
      command<"copy"> ::=
              #continue parameter:sSource parameter:sDestination
              => {copyFile(sSource, sDestination);};
     
      command<"rmdir"> ::=
              #continue parameter:sDirectory
              => {removeDirectory(sDirectory);};
     
      command<"del"> ::=
              #continue parameter:sFile
              => {deleteFile(sFile);};
     
     
      //--------------------
      // Some useful clauses
      //--------------------
      parameter:value ::=
              #readCString:parameter
                  |
              #!ignore #continue [~[' ' | '\t' | '\r' | '\n']]+:parameter;

Of course, the parsing and the processing are implemented in the scripting language, so changing the syntax will be slower than keeping the default one. However, it allows writing code that is easy to maintain and to understand.

2.1.4 Managing changes in a multi-language generation

The directives #reference and #attach notify you when a change has been made to a script that generates code in one language, but has not been carried over to the script for another language. For example, suppose you are writing a framework in both C++ and Java, and you add new features or correct mistakes in the C++ generation. You must then remember to update the Java generation too. Thanks to these directives, a warning is produced until the changes have been made in the other script.

How does it work? Directives must delimit the piece of script you have changed:
"#reference" key
...
"#end" key

The key is an identifier that allows putting more than one reference area into a script file. A #reference area may contain one or more #reference directives, without any confusion about boundaries. The directive must be put at the beginning of the line.

Here are the directives delimiting the piece of script that should be updated later in another file:
"#attach" reference-file ':' reference-key
...
"#end" reference-key

Like a #reference area, an #attach area may contain one or more #reference or #attach directives. The directive must be put at the beginning of the line.

The first time CodeWorker encounters the reference script file, it computes a number, called the magic number, that depends on the content of the area. The first time CodeWorker encounters an attached script file, it retrieves the magic number of the reference area, located by both the file name and the key of the reference. At that point, the reference and attached areas are considered as similar. CodeWorker stores the magic number of the reference just after the #attach directive:
"#attach" reference-file ':' reference-key ',' reference-number

A script file that must be updated to store the magic numbers of its attached areas is rewritten at the end of parsing, and only if no error was encountered. If the writefileHook() function (see writefileHook) is implemented, it is called, and the script file is left unchanged if it returns false. If the script file is read-only, the corresponding readonlyHook() function is called (see readonlyHook). If the script file cannot be saved, an error is thrown.

When the reference area changes, its magic number is recomputed the next time CodeWorker encounters it. When an attached piece of script is encountered after the change, the stored magic number of the reference is compared to the new one. If they differ, a warning is displayed to notify that the attached area hasn't been updated yet.

Once the changes have been carried over to the attached area, the magic number of the reference must be deleted from the #attach directive (don't forget the comma too!). The next time the interpreter encounters this attached area, it retrieves the current magic number of the reference area, and the reference area and the attached area are once again considered as similar.

Of course, these directives are quite constraining. However, they are the only way in CodeWorker to ensure that features and corrections have been carried over to all generated languages.
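
To make the mechanism more concrete, here is a hedged sketch of a reference area and of the area attached to it. The file names, the key "accessors" and the statements inside the areas are hypothetical. The sketch also assumes that reference-file is written as a quoted constant string, as file names are in the #syntax directive; this chapter does not show it explicitly.

```codeworker
// ---- file "cpp-gen.cws" (hypothetical): the reference area ----
#reference accessors
traceLine("generating C++ accessors");
#end accessors

// ---- file "java-gen.cws" (hypothetical): the attached area ----
// As written by the developer; CodeWorker later appends a comma and the
// magic number of the reference area to this directive.
#attach "cpp-gen.cws":accessors
traceLine("generating Java accessors");
#end accessors
```

After a change inside the reference area, a warning is expected for "java-gen.cws" until the Java side has been updated and the stored number, with its comma, has been deleted from the #attach line.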

2.2 Constant literals

CodeWorker handles all basic types as strings, and doesn't distinguish a double from a boolean or a date. A string literal is a sequence of characters from the source character set enclosed in double quotation marks (" "). String literals are used to represent a sequence of characters which, taken together, form a null-terminated string. The interpretation of the data depends on the context: the function increment(index) expects its argument index to contain a number, stored as a string.
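
A minimal sketch of this context-dependent interpretation; the variable name and its initial value are illustrative:

```codeworker
local index = "41";              // a number, stored as a string
increment(index);                // here the string is interpreted as a number
traceLine("index = " + index);   // here it is concatenated as text: index = 42
```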

A constant tree describes a tree as a list of constant trees and expressions, intended to be assigned to a variable. Example:

local aVariable = "a"{["yellow", "red":"or"{.alternative="orange"}], .vehicle="submarine"};
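
The following sketch declares such a constant tree and reads a few of its nodes back. It follows the constant-tree grammar given in the subsection on scope: the root holds the value "a", the array item with key "red" is itself a constant tree holding "or", and vehicle is an attribute of the root.

```codeworker
local aVariable = "a"{["yellow", "red":"or"{.alternative="orange"}], .vehicle="submarine"};

traceLine(aVariable);                      // value of the root node
traceLine(aVariable.vehicle);              // attribute of the root node
traceLine(aVariable["red"]);               // array item reached by its key
traceLine(aVariable["red"].alternative);   // attribute of that array item
```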

You'll find more information in the sub section Scope below.

2.3 Variables, declaration and assignment

Variables serve as containers for the data you use in scripts. The data type is a tree, which may be reduced to a leaf node that contains nothing but a value.

2.3.1 Declaring variables

It isn't necessary to declare a variable before using it for the first time. A variable that is assigned without being declared is understood as a new sub-node to be added to the current tree context. The current context is obtained by the read-only variable called this. It corresponds to the main parse tree, whose root name is project, when you are in the leader script, and to the variable passed as a parameter when calling a parsing or pattern script.

The following table lists all predefined variable names (accessible from anywhere) and their meaning:

Variable name   Description
project         The main parse tree, always present.
this            Points to the current context variable.
_ARGS           An array of all custom command-line arguments. Custom arguments are those that follow the script file name or the switch -args on the command line.
_REQUEST        If the interpreter works as a CGI program, it stores all parameters of the request in an association table, where each parameter name is the key of the corresponding value.

A variable that is read without being declared returns an empty string, but doesn't cause the creation of a sub-node. The danger is that you aren't safe from a spelling mistake. To prevent it, put the option -varexist on the command line and use the function existVariable() to check whether a variable exists or not.
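
A minimal sketch of that check; the attribute names are hypothetical, and "titel" is a deliberate misspelling of "title":

```codeworker
insert project.title = "User manual";

// project.titel was never created: reading it would silently
// yield an empty string, so test for its existence first
if !existVariable(project.titel) {
    traceLine("warning: project.titel doesn't exist");
}
```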

2.3.2 Scope

When you declare a local variable, it is valid for use within a specific area of code, called the scope. When the flow of execution leaves the scope, the content of the variable, a subtree specially allocated during its declaration, is deleted and disappears forever from the stack. A scope is delimited by a block.

To declare a variable on the stack, use the following declaration statement:
local-variable-statement ::= "local" local-variable-declaration ';'
local-variable-declaration ::= variable [ '=' assignment-expression ]?
assignment-expression ::= constant-tree | expression
constant-tree ::= [tree-value]? '{' [tree-array-or-attribute [',' tree-array-or-attribute]* ]? '}'
tree-value ::= expression
tree-array-or-attribute ::= tree-array | tree-attribute
tree-attribute ::= '.' attribute-name '=' assignment-expression
tree-array ::= '[' tree-array-item [',' tree-array-item]* ']'
tree-array-item ::= expression ':' assignment-expression | assignment-expression

An extension of the syntax allows the declaration of more than one variable in one shot. A comma separates the variable declarations:
local-variable-statement ::= "local" local-variable-declaration [ ',' local-variable-declaration ]* ';'

The local variable points to a new empty tree, pushed onto the stack.
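
A short sketch of local declarations and of the scope that bounds them; all variable names are illustrative:

```codeworker
local sTitle = "CodeWorker";        // a leaf node holding a value
local iWidth = 80, iHeight = 25;    // several declarations in one statement
{
    local sScratch = "only visible inside this block";
    traceLine(sScratch);
}
// sScratch no longer exists here: its subtree was deleted
// when the flow of execution left the block
traceLine(sTitle + ": " + iWidth + "x" + iHeight);
```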

To assign a reference to another variable, instead of the result of evaluating an expression or a constant tree, use the following declaration statement instead:
local-ref-statement ::= "localref" local-ref-declaration [ ',' local-ref-declaration ]* ';'
local-ref-declaration ::= variable '=' reference
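
As a sketch of the difference between a reference and a copy, the example below gives a second name to a node of a local tree. The variable and attribute names are hypothetical.

```codeworker
local config;
insert config.database.host = "localhost";

// 'db' is another name for the node config.database, not a copy of it
localref db = config.database;
insert db.port = "5432";

// the node created through the reference is reachable from the original tree
traceLine(config.database.host + ":" + config.database.port);
```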

In CodeWorker versions strictly older than 1.13, local variables declared in the body of a script or in the scope of a function could be accessed further down, in the scope of called functions, during their lifetime. Such scripts may therefore behave differently with a more recent CodeWorker interpreter.

This stack management had historical reasons, but it is now obsolete and often reveals an implementation error. To protect you from this kind of mistake while letting scripts written for versions strictly older than 1.13 continue to run, a warning may be displayed. Specify a version strictly older than 1.13 on the command line (option -version) to request that CodeWorker perform this check and generate the warning.

To correct this kind of mistake in old scripts, pass the variable as an argument to the functions that refer to it.

To declare a global variable, use the global statement. A global variable can be declared anywhere in scripts. The first time the declaration of a global variable is encountered, the interpreter registers it as accessible from any point in scripts. The second time the interpreter encounters a global declaration for the variable, the variable remains global but its content is cleared.
Note that if a local variable or an attribute of the current node (this) has the same name as an existing global variable, the global variable remains hidden until the flow of control leaves the scope that contains the homonym.

The global declaration statement looks like:
global-variable-statement ::= "global" global-variable-declaration [ ',' global-variable-declaration ]* ';'
global-variable-declaration ::= variable [ '=' assignment-expression ]?
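
A hedged sketch of a global variable shared with a function. The names are hypothetical, and the function declaration syntax used here is not described in this section; it is assumed to be the usual form of a user-defined function with an empty parameter list.

```codeworker
global gCallCounter = 0;

// assumed syntax of a user-defined function without parameters
function countCall() {
    increment(gCallCounter);   // the global is accessible from inside the function
}

countCall();
countCall();
traceLine("number of calls: " + gCallCounter);
```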

2.3.3 Navigating along branches

It is possible to navigate along a branch of the subtree put into the variable. A branch points to a node of the subtree. The syntax looks generally like:
branch ::= variable ['.' sub-node]*

If the branch isn't known before runtime, it may be built during the execution.

Example: while parsing an XML file, each time an XML attribute is encountered, one creates the corresponding attribute into the parse tree. But the name of the attribute is discovered during the parsing. The directive #evaluateVariable(expression) allows doing it. expression is evaluated at runtime and provides a branch:

#evaluateVariable("a.b.c")

will resolve the path "a.b.c" at runtime and navigate from a to c.
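
As a hedged sketch of the XML scenario described above, the attribute name below is held in a string, as if it had just been read from the parsed file. The names are hypothetical, and the sketch assumes that #evaluateVariable may be used as the target of an insert statement, since the grammar presents it as a form of branch.

```codeworker
local shape;
local sAttribute = "width";   // name discovered at runtime, e.g. while parsing XML

// builds the branch "shape.width" from a string, then assigns to it
insert #evaluateVariable("shape." + sAttribute) = "10";

traceLine(shape.width);
```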

A node may contain an array of nodes, which are indexed by a key that is a constant string. A branch allows navigating through arrays, and the complete syntax of branches conforms to:
branch ::= "#evaluateVariable" '(' expression ')'
                ::= variable ['.' sub-node | array-access]*
array-access ::= '[' expression ']'
                ::= '#' ["front" | "back" | "parent" | "root"]
                ::= '#' '[' integer-expression ']'

As shown, there are several ways to access an item node of an array or to change how to navigate from node to node:
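
A minimal sketch of these access forms, on a hypothetical array of colors built with the insert statement:

```codeworker
local colors;
insert colors["red"] = "#ff0000";
insert colors["green"] = "#00ff00";
insert colors["blue"] = "#0000ff";

traceLine(colors["green"]);   // access by key
traceLine(colors#front);      // first item of the array
traceLine(colors#back);       // last item of the array
traceLine(colors#[1]);        // access by position in the array
```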

2.3.4 Assignments

CodeWorker provides several ways to put data into a variable or into the node pointed to by a branch:

2.4 Expressions

2.4.1 Presentation

The BNF representation of an expression looks like:
expression ::= boolean-expr | ternary-expr
boolean-expr ::= comparison-expr [boolean-op comparison-expr]
boolean-op ::= '&' | '&&' | '|' | '||' | '^' | '^^'
ternary-expr ::= comparison-expr '?' expression ':' expression
comparison-expr ::= concatenation-expr [comparison-op concatenation-expr | "in" constant-set]
constant-set ::= '{' constant-string [',' constant-string]* '}'
comparison-op ::= '<' | '<=' | '==' | '=' | '!=' | '<>' | '>' | '>='
concatenation-expr ::= stdliteral-expr ['+' stdliteral-expr]*
stdliteral-expr ::= literal-expr
                ::= '$' arithmetic-expr '$'
literal-expr ::= constant-string | number
                ::= "true" | "false"
                ::= '(' expression ')'
                ::= '!' literal-expr
                ::= preprocessor-expr
                ::= function-call
                ::= variable-or-branch

arithmetic-expr ::= comparith-expr [boolean-op comparith-expr]*
comparith-expr ::= sum-expr [comparison-op sum-expr]
sum-expr ::= shift-expr [['+' | '-'] shift-expr]*
shift-expr ::= factor-expr [["<<" | ">>"] factor-expr]*
factor-expr ::= literal-expr [['*' | '/' | '%'] literal-expr]*
unary-expr ::= literal-expr ["++" | "--"]
literal-expr ::= string | variable-expr | number | unary-expr
                ::= '~' literal-expr
preprocessor-expr ::= '#' ["LINE" | "FILE"]

where:

2.4.2 Arithmetic expressions

The classical syntax of the interpreter forces expressions to work on sequences of characters. So, comparison operators apply the lexicographical order, the '+' operator concatenates two strings, and the '*' operator doesn't exist.

Of course, there are functions that handle strings as numbers and execute an arithmetic operation (the 'add()' or 'mult()' functions, for instance) or a comparison (the 'isPositive()' or 'inf()' functions, for instance).

However, it is clearly more convenient to write arithmetic operations and comparisons in a natural way, using operators instead of the corresponding functions. So, CodeWorker provides an escape mode, inspired by the way LaTeX expresses mathematical formulas: arithmetic expressions are delimited by the symbol '$'.

Example:


local a = 11;
local b = 7;
traceLine("Classical mode = '"
    + inf(add(mult(5, a), 3), sub(mult(a, a), mult(b, b))) + "'");
traceLine("Escape mode = '" + $5*a + 3 < a*a - b*b$ + "'");

Output:

Classical mode = 'true'
Escape mode = 'true'
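
The difference between the two modes also shows up in comparisons. In the sketch below, the classical mode compares "9" and "10" in lexicographical order, where "9" comes after "10", so the comparison should fail and yield an empty string; the escape mode compares the same operands as numbers.

```codeworker
// classical mode: operands are compared as sequences of characters
traceLine("Classical mode = '" + ("9" < "10") + "'");

// escape mode: operands are compared as numbers
traceLine("Escape mode = '" + $9 < 10$ + "'");
```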

2.5 Common statements

2.5.1 The 'if' statement

The BNF representation of the if statement is:
if-statement ::= "if" expression then-statement ["else" else-statement]?

The if statement evaluates the expression that immediately follows it. The expression must be of arithmetic, text, variable or condition type. In both forms of the if syntax, if the expression evaluates to a non-empty string, the statement that depends on the evaluation is executed; otherwise, it is skipped.

In the if...else syntax, the second statement is executed if the result of evaluating the expression is an empty string. The else clause of an if...else statement is associated with the closest previous if statement that does not have a corresponding else statement.

2.5.2 The 'while'/'do' statements

The BNF representation of the while statement is:
while_statement ::= "while" expression statement

The while statement lets you repeat a statement or compound statement until a specified expression becomes an empty string. The expression in a while statement is evaluated before the body of the loop is executed. Therefore, the body of the loop may never be executed. If expression returns an empty string, the while statement terminates and control passes to the next statement in the program. If expression is non-empty, the body is executed and the process is repeated. The while statement can also terminate when a break or return statement is executed within the statement body. When a continue statement is encountered, control jumps directly to the evaluation of the expression.

Note that the break and continue statements apply to the innermost enclosing loop statement (foreach/forfile/select, do/while), that is, the first one they encounter while leaving instruction blocks.

The BNF representation of the do statement is:
do_statement ::= "do" statement "while" expression ';'

The do-while statement lets you repeat a statement or compound statement until a specified expression becomes an empty string. The expression in a do-while statement is evaluated after the body of the loop is executed. Therefore, the body of the loop is always executed at least once. If expression returns an empty string, the do-while statement terminates and control passes to the next statement in the program. If expression is non-empty, the process is repeated. The do-while statement can also terminate when a break or return statement is executed within the statement body. When a continue statement is encountered, control is transferred to the evaluation of the expression.

2.5.3 The 'switch' statement

The BNF representation of this statement is:
switch_statement ::= "switch" '(' expression ')' '{' (label_declaration)* ("default" ':' statement)? '}'
label_declaration ::= ["case" | "start"] constant_string ':' statement

The switch statement allows selection among multiple sections of code, depending on the value of an expression. The expression enclosed in parentheses, the controlling expression, must be of string type.

The switch statement causes an unconditional jump to, into, or past the statement that is the switch body, depending on the value of the controlling expression, the constant string values of the case or start labels, and the presence or absence of a default label. The switch body is normally a compound statement (although this is not a syntactic requirement). Usually, some of the statements in the switch body are labeled with case labels or with start labels or with the default label. The default label can appear only once.

The constant-string in the case label is compared for equality with the controlling expression. The constant-string in the start label is compared for equality with the first characters of the controlling expression. In a given switch statement, no two constant strings in start or case statements can evaluate to the same value.

The behaviour of the switch statement depends on how the controlling expression matches the labels. If a case label exactly matches the controlling expression, control is transferred to the statement following that label. Otherwise, start labels are iterated in lexicographical order, and control is transferred to the statement following the first label that matches the beginning of the controlling expression. If no start label matches either, control is transferred to the default statement or, if there is none, an error is thrown.

A switch statement can be nested. In such cases, case or start or default labels associate with the most deeply nested switch statements that enclose them.

Control is not impeded by case or start or default labels. To stop execution at the end of a part of the compound statement, insert a break statement. This transfers control to the statement after the switch statement.
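
The label resolution described above can be summarized by a small model. The following Python sketch is not CodeWorker code; the function name select_label and its parameters are hypothetical, and it only reproduces the matching order stated in this section (exact case match, then start labels in lexicographical order, then default, else an error).

```python
def select_label(value, cases, starts, has_default):
    """Return (kind, label) for the label a controlling string selects.

    cases       -- constant strings of the case labels (exact match)
    starts      -- constant strings of the start labels (prefix match)
    has_default -- whether the switch body contains a default label
    """
    if value in cases:
        return ("case", value)
    for prefix in sorted(starts):  # start labels in lexicographical order
        if value.startswith(prefix):
            return ("start", prefix)
    if has_default:
        return ("default", None)
    raise RuntimeError("no label matches '" + value + "'")

cases = {"getName"}
starts = {"get", "set"}
print(select_label("getName", cases, starts, True))  # ('case', 'getName')
print(select_label("setAge", cases, starts, True))   # ('start', 'set')
print(select_label("clear", cases, starts, True))    # ('default', None)
```

Under this reading, when two start labels are prefixes of each other (for example "g" and "ge"), the one that comes first in lexicographical order is selected.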

2.5.4 The 'foreach' statement

The BNF representation of this statement is:
foreach_statement ::= "foreach" iterator "in" [direction]?
                [sorted_declaration]? [cascading_declaration]? list-node body_statement
direction ::= "reverse"
sorted_declaration ::= "sorted" ["no_case"]? ["by_value"]?
cascading_declaration ::= "cascading" ["first" | "last"]?

A foreach statement iterates over all items of the list owned by node list-node. The iterator refers to the current item of the list, and the body statement is executed on it.

Items are iterated either in the order of entrance, or in alphabetical order if option sorted is set. The sort operates on keys, unless option by_value is set. The order is inverted if option reverse is chosen. To ignore the case, option sorted must be followed by no_case. Otherwise, uppercase letters are considered smaller than any lowercase letter.

      // file "Documentation/ForeachSampleSorted.cws":
      local list;
      insert list["silverware"] = "tea spoon";
      insert list["Mountain"] = "Everest";
      insert list["SilverWare"] = "Tea Spoon";
      insert list["Boat"] = "Titanic";
      insert list["acrobat"] = "Circus";
     
      traceLine("Sorted list in a classical order:");
      foreach i in sorted list {
          traceLine("\t" + key(i));
      }
      traceLine("Note that uppercases are listed before lowercases." + endl());
     
      traceLine("Sorted list where the case is ignored:");
      foreach i in sorted no_case list {
          traceLine("\t" + key(i));
      }
     
      traceLine("Reverse sorted list:");
      foreach i in reverse sorted list {
          traceLine("\t" + key(i));
      }
     
      traceLine("Reverse sorted list where the case is ignored:");
      foreach i in reverse sorted no_case list {
          traceLine("\t" + key(i));
      }

Output:

Sorted list in a classical order:
    Boat
    Mountain
    SilverWare
    acrobat
    silverware
Note that uppercases are listed before lowercases.

Sorted list where the case is ignored:
    acrobat
    Boat
    Mountain
    SilverWare
    silverware
Reverse sorted list:
    silverware
    acrobat
    SilverWare
    Mountain
    Boat
Reverse sorted list where the case is ignored:
    silverware
    SilverWare
    Mountain
    Boat
    acrobat
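
The four orders listed above can be reproduced with ordinary string sorting. The following Python sketch is only a model, not CodeWorker code. The helper no_case is hypothetical, and the way it breaks ties between keys that differ only by case is an assumption chosen because it matches the listed output.

```python
# Keys in their order of entrance, as in ForeachSampleSorted.cws.
keys = ["silverware", "Mountain", "SilverWare", "Boat", "acrobat"]

def no_case(key: str):
    # Assumption: keys equal up to case are ordered by the case-sensitive
    # comparison, which reproduces the output listed in the manual.
    return (key.lower(), key)

sorted_keys = sorted(keys)                      # sorted
sorted_no_case = sorted(keys, key=no_case)      # sorted no_case
reverse_sorted = sorted(keys, reverse=True)     # reverse sorted
reverse_sorted_no_case = sorted(keys, key=no_case, reverse=True)

for title, result in [("sorted", sorted_keys),
                      ("sorted no_case", sorted_no_case),
                      ("reverse sorted", reverse_sorted),
                      ("reverse sorted no_case", reverse_sorted_no_case)]:
    print(title + ": " + ", ".join(result))
```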

Control need not be sequential within the body statement: break and return exit the loop for good, and continue transfers control to the head of the foreach statement for the next iteration.

Option cascading allows propagating foreach on item nodes. The way it works is illustrated by an example:


    foreach i in cascading myObjectModeling.packages ...

At the beginning, i points to myObjectModeling.packages#front and the body is executed. Before iterating i to the next item, the foreach checks whether the item node myObjectModeling.packages#front owns attribute packages or not. If yes, it applies recursively foreach on myObjectModeling.packages#front.packages.

Option cascading avoids writing the following code:


function propagateOnPackages(myPackages : node) {
    foreach i in myPackages {
        // my code to apply on this package
        // then propagate on the sub-packages of the current item, if any
        if existVariable(i.packages)
            propagateOnPackages(i.packages);
    }
}
propagateOnPackages(myObjectModeling.packages);
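
The traversal order implied by cascading can be checked on a small model. The following Python sketch is not CodeWorker code: nodes are represented by dictionaries, and the names cascade, model and visited are hypothetical. It assumes, as described above, that the body runs on an item before the iteration descends into its packages attribute.

```python
def cascade(items, visit, attribute="packages"):
    """Visit each item, then descend into its sub-list before moving on."""
    for item in items:
        visit(item)
        children = item.get(attribute)
        if children:
            cascade(children, visit, attribute)

# A tiny hypothetical model: two packages, the first owning two sub-packages.
model = [
    {"name": "core", "packages": [{"name": "core.io"}, {"name": "core.util"}]},
    {"name": "gui"},
]

visited = []
cascade(model, lambda package: visited.append(package["name"]))
print(visited)  # ['core', 'core.io', 'core.util', 'gui']
```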

Option cascading offers two behaviours:

2.5.5 The 'forfile' statement

The BNF representation of this statement is:
forfile_statement ::= "forfile" iterator "in" [sorted_declaration]? [cascading_declaration]? file-pattern body_statement
sorted_declaration ::= "sorted" ["no_case"]?
cascading_declaration ::= "cascading" ["first" | "last"]?

A forfile statement iterates over the names of all files that match the filter file-pattern. The iterator refers to the current item of the list composed of retained file names, and the body statement is executed on it. Note that the file pattern may begin with a path, which cannot contain wildcard characters ('*' and '?').

As with the foreach statement, items are iterated either in the order of entrance, or in alphabetical order of keys if option sorted is set. To ignore the case, the option must be followed by no_case. Otherwise, uppercase letters are considered smaller than any lowercase letter.

Control need not be sequential within the body statement: break and return exit the loop for good, and continue transfers control to the head of the forfile statement for the next iteration.

The option cascading allows propagating forfile on directories recursively. The way it works is illustrated by an example:

      // file "Documentation/ForfileSample.cws":
      local iIndex = 0;
      forfile i in cascading "*.html" {
          if $findString(i, "manual_") < 0$ &&
              $findString(i, "Bugs") < 0$ {
                  traceLine(i);
          }
          // if too long, stop the iteration
          if $iIndex > 15$ break;
          increment(iIndex);
      }

Output:

cs/DOTNET.html
cs/tests/data/MatchingTest/example.csv.html
Documentation/LastChanges.html
java/JAVAAPI.html
java/data/MatchingTest/example.csv.html
Scripts/Tutorial/GettingStarted/defaultDocumentation.html
WebSite/AllDownloads.html
WebSite/examples/basicInformation.html
WebSite/highlighting/basicInformation.html
WebSite/repository/highlighting.html
WebSite/repository/JEdit/Entity.java.cwt.html
WebSite/serewin/ExempleIllustre.html
WebSite/tutorials/DesignSpecificModeling/tutorial.html
WebSite/tutorials/DesignSpecificModeling/highlighting/demo.cws.html
WebSite/tutorials/overview/tinyDSL_spec.html
WebSite/tutorials/overview/scripts2HTML/CodeWorker_grammar.html

At the beginning, i points to the first HTML file of the current directory and the body is executed. Before iterating i to the next item, the forfile checks whether the directory of the current file owns subfolders or not. If yes, it applies recursively forfile on subfolders.

Option cascading offers two behaviours:

2.5.6 The 'select' statement

The BNF representation of this statement is:
select_statement ::= "select" iterator "in" [sorted_declaration]? node-motif body_statement
sorted_declaration ::= "sorted" first-key [',' other-key]*
first-key ::= branch
other-key ::= branch

A select statement iterates a list of nodes that match a motif expression. The iterator refers to the current item of the list composed of retained nodes, and the body statement is executed on it.

      // file "Documentation/SelectSample.cws":
      local a;
      pushItem a.b;
      pushItem a.b#back.c = "01";
      pushItem a.b#back.c = "02";
      pushItem a.b#back.c = "03";
      pushItem a.b;
      pushItem a.b#back.c = "11";
      pushItem a.b#back.c = "12";
      pushItem a.b#back.c = "13";
      pushItem a.b;
      pushItem a.b#back.c = "21";
      pushItem a.b#back.c = "22";
      pushItem a.b#back.c = "23";
      select i in a.b[].c[] {
          traceLine("i = "+ i);
      }

Output:

i = 01
i = 02
i = 03
i = 11
i = 12
i = 13
i = 21
i = 22
i = 23

As with the foreach statement, items are iterated either in the order of entrance, or according to the sorting result if option sorted is set.

Control need not be sequential within the body statement: break and return exit the loop for good, and continue transfers control to the head of the select statement for the next iteration.

2.5.7 The 'try'/'catch' statement

The BNF representation of this statement is:
try-catch-statement ::= "try" try-statement "catch" '('error_message_variable')' catch-statement

Error handling is implemented by using the try, catch, and error keywords. With error handling, your program can communicate unexpected events to a higher execution context that is better able to recover from such abnormal events. These errors are handled by code that is outside the normal flow of control.

The compound statement after the try clause is the guarded section of code. An error is thrown (or raised) when command error(message-text) is called or when CodeWorker encounters an internal error. The compound statement after the catch clause is the error handler, and catches (handles) the error thrown. The catch clause statement indicates the name of the variable that must receive the error message.

2.5.8 The 'exit' statement

The BNF representation of this statement is:
exit_statement ::= "exit" integer-expression ";"

An exit statement leaves the application and returns an error code, given by the integer-expression.

Example:

exit -1;

2.6 User-defined functions

The BNF representation of a user-defined function to implement is:
user-function ::= classical-function-definition | template-function-definition
classical-function-definition ::= classical-function-prototype compound-statement
classical-function-prototype ::= "function" function-name '(' parameters ')'
template-function-definition ::= see section Template functions for more information
parameters ::= parameter [',' parameter]*
parameter ::= argument [':' parameter-mode [':' default-value]? ]?
parameter-mode ::= "value" | "node" | "reference" | "index"
default-value ::= "project" | "this" | "null" | "true" | "false" | constant-string

The scripting language lets users implement their own functions. Parameters may be passed to the body of the function. A value may be returned by the function and, if so, the return type is necessarily a sequence of characters. Of course, functions manage their own stack, and so accept recursive calls.

An argument may have a default value, used if the parameter is missing in a call. All following arguments must then have default values too. A node argument can't have a constant string as a default value, but it can default to a global variable.

2.6.1 Parameters and return value

Arguments passed by parameter must be chosen among the following modes:

If you omit to return a value from a function, it returns an empty string; in that case, you are expected to call this function as a procedure, and the result isn't exploited. The special procedure nop takes a function call as parameter and allows executing the function and ignoring the result. It isn't compulsory to use nop for calling a function as a procedure. As in C or C++, you can type the function call followed by a semicolon and the result is lost.

There are two ways to return a value:

2.6.2 The 'finally' statement

If you wish to execute a particular process in any case before leaving a function, the finally statement guarantees that the block of instructions following the keyword is systematically executed before leaving. This declaration may be placed anywhere in the body of the function. Its syntax conforms to:
finally-statement ::= "finally" compound-statement

Example:

      // file "Documentation/FinallySample.cws":
      1 function f(v : value) {
      2     traceLine("BEGIN f(v)");
      3     finally {
      4         traceLine("END f(v)");
      5     }
      6     // the body of the function, with more than
      7     // one way to exit the function, for example:
      8     if !v return "empty";
      9     if v == "1" return "first";
      10     if v == "2" return "second";
      11     if v == "3" return "third";
      12     return "other";
      13 }
      14
      15 traceLine("...f(1) has been executed and returned '" + f(1) + "'");

line 3: the finally statement may be put anywhere in the body,
line 4: this statement is executed while exiting the function, even if an exception was raised.

Output:

BEGIN f(v)
END f(v)
...f(1) has been executed and returned 'first'
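
For readers who know try/finally from other languages, the same trace can be obtained as follows. This Python sketch is only an analogy, not CodeWorker code. The list trace is a hypothetical stand-in for traceLine. Note one difference: in Python the finally clause must wrap the guarded body, whereas the CodeWorker declaration may appear anywhere in the function.

```python
trace = []  # stands in for the lines written by traceLine

def f(v: str) -> str:
    trace.append("BEGIN f(v)")
    try:
        # the body of the function, with more than one way to exit
        if not v:
            return "empty"
        if v == "1":
            return "first"
        if v == "2":
            return "second"
        if v == "3":
            return "third"
        return "other"
    finally:
        # executed on every exit path, including when an error is raised
        trace.append("END f(v)")

result = f("1")
trace.append("...f(1) has been executed and returned '" + result + "'")
print("\n".join(trace))
```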

2.6.3 Unusual function declarations

It may happen that a function prototype must be declared before being implemented, because of a cross-reference with another function for instance. The scripting language offers the forward declaration to answer this need. To do that, the prototype of the function is written, preceded by the declare keyword:
forward-declaration ::= "declare" function-prototype ';'

If the body of the function must be implemented in another library, in C++ for example, the prototype of the function is preceded by the external keyword (see section C++ binding):
external-declaration ::= "external" function-prototype ';'

2.6.4 Template functions

CodeWorker proposes a special category of functions called template functions. Because CodeWorker doesn't provide a typed scripting language, template must not be understood as it is commonly used in C++ for instance.

A template function represents a set of functions with the same prototype, except for the dispatching constant. The dispatching constant is a constant string that extends the name of the function. These functions instantiate the template function for a particular dispatching constant. Each instantiated function implements its own body.

The BNF representation of a template function to implement is:
template-function-definition ::= instantiated-function-definition | generic-function-definition
instantiated-function-definition ::= instantiated-function-prototype compound-statement
instantiated-function-prototype ::= "function" function-name '<' dispatching-constant '>' '(' parameters ')'
dispatching-constant ::= a constant string between double quotes
generic-function-definition ::= generic-function-prototype [compound-statement | template-based-body]
generic-function-prototype ::= "function" function-name '<' generic-key '>' '(' parameters ')'
generic-key ::= an identifier that matches any dispatching constant with no attached prototype
template-based-body ::= "{{" template-based-script "}}"
template-based-script ::= a piece of template-based script describing the generic implementation

A call to a template function requires a dispatching expression, which determines the dispatching constant. The dispatching expression is evaluated during the execution, and CodeWorker resolves which instantiated function of this template to call: the result of the dispatching expression must match the dispatching constant of the instantiated function. The BNF representation of a call to a template function is:
instantiated-function-call ::= function-name '<' dispatching-expression '>' '(' parameters ')'
parameters ::= expression [',' expression]*

Note that a dispatching constant may be empty and such an instantiated function can be called as a classical function. In fact, classical functions are considered as instantiated functions where the dispatching constant is empty.

Template functions bring generic programming to the language. Imagine that we need a function getType(myType : node) to be specialized for every language we might have to generate (C++, Java, ...). Normally, you would write the following lines to obtain the type depending on the language for which you are producing the source code:


if doc_language == "C++" {
    sType = getCppType(myParameterType);
} else if doc_language == "JAVA" {
    sType = getJAVAType(myParameterType);
} else {
    error("unrecognized language '" + doc_language + "'");
}

Thanks to template functions, you may replace the preceding lines with the following one:


sType = getType<doc_language>(myParameterType);

with:


function getType<"JAVA">(myType : node) {
    ... // implementation for returning a Java type
}

function getType<"C++">(myType : node) {
    ... // implementation for returning a C++ type
}

During the execution, the function getType<T>(myType : node) resolves on what instantiated function it has to dispatch: either getType<"JAVA">(myType : node) or getType<"C++">(myType : node), depending on what value is assigned to variable doc_language.

Trying to call an instantiated function that doesn't exist raises an error at runtime. However, a default implementation can be provided through a generic key. For instance:


function getType<T>(myType : node) {
    ... // common implementation for any unrecognized language
}
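
The dispatching mechanism can be pictured as a table that maps dispatching constants to instantiated functions, with the generic function as a fallback. The following Python sketch is a model only, not how CodeWorker is implemented. All names (instances, instantiate, get_type and the others) are hypothetical, and the returned strings are placeholders.

```python
instances = {}  # dispatching constant -> instantiated function

def instantiate(constant):
    """Decorator registering the function instantiated for one constant."""
    def register(func):
        instances[constant] = func
        return func
    return register

@instantiate("JAVA")
def get_type_java(my_type):
    return "java:" + my_type   # placeholder for a Java type

@instantiate("C++")
def get_type_cpp(my_type):
    return "c++:" + my_type    # placeholder for a C++ type

def get_type_generic(key, my_type):
    # Plays the role of getType<T>: used when no instance matches the key.
    return "unknown(" + key + "):" + my_type

def get_type(dispatch_value, my_type):
    """Resolve the instance at call time from the dispatching value."""
    func = instances.get(dispatch_value)
    if func is None:
        return get_type_generic(dispatch_value, my_type)
    return func(my_type)

doc_language = "JAVA"
print(get_type(doc_language, "string"))  # java:string
print(get_type("Ada", "string"))         # unknown(Ada):string
```

Without the generic fallback, the lookup failure in get_type would correspond to the runtime error mentioned above.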

For those who know generic programming with C++ templates, here is a classical example of using template functions:


function f<1>() { return 1; }
function f<N>() { return $N*f<$N - 1$>()$; }
local f10 = f<10>();
if $f10 != 3628800$ error("10! should be worth 3628800");
traceLine("10! = " + f10);

Output:

10! = 3628800

To provide more flexibility in the implementation of the template function, depending on the generic key <T>, the body admits a template-based script to implement the source code of the function. The specialization of the function for a given template instantiation key is then resolved at runtime.

Example:
The template function f inserts a new attribute in a tree node. The attribute has the name passed to the generic key for instantiation, and the value of the instantiation key is assigned to the new attribute. Then, the function calls itself recursively on the instantiation key without the last character.
For instance, the source code of f<"field"> should be:

function f<"field">(x : node) {
      insert x.field = "field";
      f<"fiel">(x); // cut the last character
}

Code:

//a synonym of f<"">(x : node), terminal condition for recursive calls
function f(x : node) {/*does nothing*/}

function f<T>(x : node) {{
      // '{{' announces a template-based script, which
      // will generate the correct implementation during the instantiation
      insert x.@T@ = "@T@";
      f<"@T.rsubString(1)@">(x);
@
      // '}}' announces the end of the template-based script
}}

f<"field">(project);
traceObject(project);

Output:

Tracing variable 'project':
      field = "field"
      fiel = "fiel"
      fie = "fie"
      fi = "fi"
      f = "f"
End of variable's trace 'project'.

2.6.5 Methods

For better readability, syntactic facilities are offered to call a function on a node as if this function were a method of the node. For example, it is possible to call function leftString on the node a like this: a.leftString(2), instead of the classical functional form: leftString(a, 2).

The rule is that every function (user-defined included) whose first argument is passed either by value or by node or by index (but never by reference) can propose a method call.

In that case, the method call applies on the first argument, which has to be a node. The BNF representation of a method call is:
method-call ::= variable '.' function-name '(' parameters ')'
parameters ::= expression [',' expression]*
where parameters omits the first argument of the function called function-name.

There are some exceptions where the method doesn't apply to the first argument:

The following methods offer a synonym to the function name:

2.6.6 The 'readonly' hook

The BNF representation of this statement is:
readonlyHook-statement ::= "readonlyHook" '(' filename ')' compound-statement

The token filename is the argument name that the user chooses for passing the name of the file to the body of the hook.

This special function implements a hook that is called each time a read-only file is encountered while generating the output file through the generate or expand instruction.

Limitations: only one declaration of this hook is authorized, and it can't be declared inside a parsing or pattern script.

Example:

Common usage: file to generate has to be checked out from a source code control system (see system command to run executables).

readonlyHook(sFilename) {
  if !getProperty("SSProjectFolder") || !getProperty("SSWorkingFolder") || !getProperty("SSExecutablePath") || !getProperty("SSArchiveDir") {
    traceLine("WARNING: properties 'SSProjectFolder' and 'SSWorkingFolder' and 'SSExecutablePath' and 'SSArchiveDir' should be passed to the command line for checking out read-only files from Source Safe");
  } else {
    if startString(sFilename, getProperty("SSWorkingFolder")) {
      local sourceSafe;
      insert sourceSafe.fileName = sFilename;
      generate("SourceSafe.cwt", sourceSafe, getEnv("TMP") + "/SourceSafe.bat");
      if sourceSafe.isOk {
        putEnv("SSDIR", getProperty("SSArchiveDir"));
        traceLine("checking out '" + sFilename + "' from Source Safe archive '" + getProperty("SSArchiveDir") + "'");
        local sFailed = system(getEnv("TMP") + "/SourceSafe.bat");
        if sFailed {
          traceLine("Check out failed: '" + sFailed + "'");
        }
      }
    } else {
      traceLine("Unable to check out '" + sFilename + "': working folder starting with '" + getProperty("SSWorkingFolder") + "' expected");
    }
  }
}

2.6.7 The 'write file' hook

This special function allows implementing a hook that will be called just before writing a file, after ending a text generation process such as expanding or generating or translating text.

It is very important to notice that it returns a boolean value. A true value means that the generated text must be written into the file. A false boolean value means that the generated text doesn't have to be written into the file.

CodeWorker always interprets a function that doesn't explicitly return a value as returning an empty string. If you forget to return a value, the generated text will not be written into the file!

The BNF representation of this statement is:
writefileHook-statement ::= "writefileHook" '(' filename ',' position ',' creation ')' compound-statement

Argument    Type       Description
filename    string     The argument name that the user chooses for passing the file name to the body of the hook.
position    int        The argument name that the user chooses for passing the position where a difference occurs between the newly generated version of the file and the previous one. If the files don't have the same size, the position is -1.
creation    boolean    The argument name that the user chooses for passing whether the file is created or updated. The argument is true if the file doesn't exist yet.

Limitations: only one declaration of this hook is authorized, and it can't be declared inside a parsing or pattern script.

Example:

writefileHook(sFilename, iPosition, bCreation) {
    if bCreation {
        traceLine("Creating file '" + sFilename + "'!");
    } else {
        traceLine("Updating file '" + sFilename + "', difference at " + iPosition + "!");
    }
    return true;
}
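
The consequence of forgetting the return value can be modelled as follows. This Python sketch is not CodeWorker code. The dictionary store stands in for the file system, all names are hypothetical, and the computation of the position argument is deliberately not modelled.

```python
def write_file(store, filename, text, hook):
    """Write text into the in-memory store only if the hook agrees."""
    creation = filename not in store
    position = -1  # placeholder: the real position computation is not modelled
    result = hook(filename, position, creation)
    if result is None:
        result = ""  # no explicit return value counts as an empty string
    if result:       # a non-empty string means true
        store[filename] = text
        return True
    return False

def good_hook(filename, position, creation):
    return "true"

def forgetful_hook(filename, position, creation):
    pass  # no explicit return: the generated text is silently dropped

store = {}
print(write_file(store, "a.txt", "generated text", good_hook))       # True
print(write_file(store, "b.txt", "generated text", forgetful_hook))  # False
print(sorted(store))                                                 # ['a.txt']
```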

2.6.8 The 'step into' hook

This special function is automatically called before the extended BNF engine resolves the production rule of a BNF non-terminal. Combined with stepoutHook(), it is very useful for trace and debug tasks.

This hook can be implemented in parse scripts only.

The BNF representation of this statement is:
stepintoHook-statement ::= "stepintoHook" '(' sClauseName ',' localScope ')' compound-statement

Argument       Type      Description
sClauseName    string    The name of the non-terminal.
localScope     tree      The scope of parameters used in the production rule.

2.6.9 The 'step out' hook

This special function is automatically called once the extended BNF engine has finished the resolution of a BNF non-terminal. Combined with stepintoHook(), it is very useful for trace and debug tasks.

This hook can be implemented in parse scripts only.

The BNF representation of this statement is:
stepoutHook-statement ::= "stepoutHook" '(' sClauseName ',' localScope ',' bSuccess ')' compound-statement

Argument       Type       Description
sClauseName    string     The name of the non-terminal.
localScope     tree       The scope of local variables and parameters used in the production rule.
bSuccess       boolean    Whether the resolution of the production rule has succeeded or not.

2.7 Statement's modifiers

A statement's modifier is a directive that stands just before a statement, meaning an instruction or a compound statement.

This directive performs some actions in the scope of the statement and then restores the previous behaviour.

This action may be: