The scripting language
CodeWorker is a script interpreter intended to parse and
to generate any kind of text or source code. The interpreter accepts options
on the command line, some of which resemble those of a compiler.
CodeWorker doesn't provide a graphical user interface, but a console
mode allows interaction with the user.
1 Command line of the interpreter
The leader script is the name given to the script that is executed first by the interpreter.
There are six ways to pass this leader script to the interpreter via the command line:
- the script describes all the processing tasks for parsing text, decorating the graph and
generating code; the command-line option -script executes the script,
- the script describes an extended BNF grammar; the command-line option
-parseBNF executes the script and parses the source file,
- the script describes how to generate code; the command-line option
-generate executes the script and generates the output file,
- the script describes how to expand a file; the command-line option
-expand executes the script and expands the output file at its markups,
- a file contains embedded scripts driving their own expansion; the command-line option
-autoexpand executes the embedded scripts located below each markup, expanding the file at its markups,
- the script describes a source-to-source translation; the command-line option
-translate executes the script and translates the source file to the output file.
To make an input file easier to locate among several directories, the option -I
specifies a path to explore. It gives more flexibility in sharing input files (both scripts
and user files, but not generated or expanded files) between directories, and it avoids
hard-coding relative or absolute paths in scripts.
Properties can be defined on the command line with the option -define
(or -D). These properties are intended to be read by scripts.
It is recommended to specify a working directory with the option -path.
The assigned value is accessible in scripts via the function getWorkingPath(). This
working directory generally indicates the output path for copying or generating
files. The developer of the scripts decides how to use it.
CodeWorker interprets scripts efficiently. However, it is often more convenient to run
a standalone executable than the interpreter plus a set of script files. Moreover, once
scripts are stable, compiling them to an executable makes the project run a few times
faster. The option -c++ translates the leader script and all its dependencies to
ready-to-compile C++ source code.
To make errors easier to track down, an integrated debugger is started with the
option -debug. It runs in the console, and classical debugger commands allow taking
control of the execution and exploring the stack and the variables.
The following table lists all the switches allowed on the command line:
| Switch | Description |
| -args [arg]* |
Pass some arguments to the command line.
The list of arguments stops at the end of the command line or as soon as an option
is encountered. The arguments are stored in a global array variable called _ARGS. |
| -autoexpand file-to-expand |
The file file-to-expand is explored
for expanding code at markups, executing a template-based script inserted
just below each markup.
It is equivalent to calling the script function autoexpand(file-to-expand, project). |
| -c++ generated-project-path CodeWorker-path? |
To translate the leader script and all its
dependencies to C++ source code, once the execution of the leader script has
completed (same job as compileToCpp()). The CodeWorker-path
is optional and gives the path to the includes and libraries of the software. However, it is now
recommended to specify CodeWorker-path with the switch -home. |
| -c++2target script-file generated-project-path target-language? |
To translate the leader script and all its dependencies to C++ source code.
The C++ is then translated to a target language, once the execution
of the leader script has completed. Do not forget to give the path to the
includes and libraries of CodeWorker by setting the switch -home.
A preprocessor definition called "c++2target-path" is automatically
created. It contains the path of the generated project. Call
getProperty("c++2target-path") to retrieve the path value.
target-language is optional if at least one script of the project
holds the target in its filename, just before the extension. Example:
"myscript.java.cwt" means that the target language of this script is "java".
A property can follow the name of the target language, separated by a '=' symbol.
The property is accessible via getProperty("c++2target-property"), and
its nature depends on the target. For instance, in Java, this property represents
the package the generated classes will belong to. Example: java=org.landscape.mountains. |
| -c++external filename |
To generate C++ source code that implements
all functions declared as external in scripts. |
| -commentBegin format |
To specify the format of a beginning of comment. |
| -commentEnd format |
To specify the format of a comment's end. |
| -compile scriptFile |
To compile a script file, just
to check whether the syntax is correct. |
| -commands commandFile |
To load from commandFile all the arguments ordinarily given on the command line.
It must be the only switch, and the only argument of any kind, passed on the command line. |
| -console |
To open a console session (default mode if no script
to interpret is specified via -script or -compile or
-generate or -expand). |
| -debug [remote]? |
To debug a script in a console while executing it. The optional
argument remote defines parameters for a remote socket control of the debugging session.
remote looks like <hostname>:<port>. If <hostname> is empty, CodeWorker runs as a
socket server. |
| -define VAR=value or -D ... |
To define some variables, as
when using the C++ preprocessor or when passing properties to the JAVA compiler.
These variables are similar to properties, insofar as they aren't exploited during
the preprocessing of scripts to interpret. The form
-define VAR may be used when no value has to be assigned; in that case,
"true" is assigned by default to the variable VAR. The script
function getProperty("VAR") returns
the value of the variable VAR. |
| -expand pattern-script file-to-expand |
The script
file pattern-script is executed to expand the file file-to-expand
at its markups.
It is equivalent to calling the script function expand(pattern-script, project, file-to-expand). |
| -fast |
To optimize speed. During generation, the output
file is built in memory instead of in a temporary file. |
| -generate pattern-script file-to-generate |
The script
file pattern-script is executed to generate the file file-to-generate.
It is equivalent to calling the script function generate(pattern-script, project, file-to-generate). |
| -genheader text |
Adds a header at the beginning of all generated
files, followed by a text (see procedure setGenerationHeader()). |
| -help or ? |
Help about the command line. |
| -home CodeWorker-path |
Specifies the path to the home directory of CodeWorker. |
| -I path |
Specify a path to explore when trying
to find a file while invoking include or parseFree
or parseAsBNF or generate or expand or ... This option may be
repeated to specify more than one path. |
| -insert variable_expression value |
Creates a new node in
the main parse tree project and assigns a constant value to it. It is equivalent
to executing the statement insert variable_expression = " value " ;. |
| -nologo |
The interpreter doesn't write the copyright in the shell at the beginning. |
| -nowarn warnings |
The specified warning types are ignored. They are separated by pipe symbols.
Currently, the only recognized type is undeclvar, which warns the developer about the
use of an undeclared variable. |
| -parseBNF BNF-parsing-script source-file |
The script
file BNF-parsing-script parses source-file according to an extended BNF grammar.
It is equivalent to calling the script function parseAsBNF(BNF-parsing-script, project, source-file). |
| -path path |
Output directory, returned by the script
function getWorkingPath(), and ordinarily used to specify where
to generate or copy a file. |
| -quantify [outputFile]? |
To execute scripts in quantify mode, which
measures coverage and time consumption. Results are saved to the HTML file outputFile, or
displayed on the console if outputFile is not given. |
| -report report-file request-flag |
To generate a report once the execution has completed.
The report is saved to the file report-file, and the nature of the information
depends on the flag request-flag. This flag must be built by computing
a bitwise OR of one or several of the following integer constants:
- 1: provides every output file written by a template-based script (generate(), expand() or translate())
- 2: provides every input file scanned by a BNF parse script (parseAsBNF() or translate())
- 4: provides details of coverage recording for every output file using the #coverage directive
- 8: provides details of coverage recording for every input file using the #matching directive
- 16: provides details of coverage recording for every output file written by a template-based script
- 32: provides details of coverage recording for every input file scanned by a BNF parse script
Notice that flags 16 and 32 may become highly time and memory consuming, depending both on
how many input/output files you have to process and on their size. |
| -script script-file |
Defines the leader script, which will be
executed first. |
| -stack depth |
To limit the depth of recursive function calls, in order to avoid a
stack overflow. By default, the depth is set to 1000. |
| -stdin filename |
To change the standard input for reading from an
existing file. It may be useful for running a scenario. |
| -stdout filename |
To change the standard output for writing it to a file. |
| -time |
To display the execution time expressed in milliseconds, just
before exiting. |
| -translate translation-script source-file file-to-generate |
The script
file translation-script performs a source-to-source translation.
It is equivalent to calling the script function translate(translation-script, project, source-file, file-to-generate). |
| -varexist |
To trigger a warning when a script reads the value of a variable that doesn't
exist. |
| -verbose |
To display internal messages of the interpreter (information). |
| -version version-name |
To force scripts to be interpreted as if written for a
previous version, given by version-name. |
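The request-flag arithmetic of the -report switch can be sketched outside CodeWorker. The following Python fragment is only an illustration: the names ReportFlag, request_flag and report_arguments are hypothetical, and only the integer values are taken from the table above.

```python
from enum import IntFlag


class ReportFlag(IntFlag):
    """Integer constants accepted by -report (member names are hypothetical)."""
    OUTPUT_FILES = 1           # output files written by template-based scripts
    INPUT_FILES = 2            # input files scanned by BNF parse scripts
    COVERAGE_DIRECTIVE = 4     # coverage for output files using #coverage
    MATCHING_DIRECTIVE = 8     # coverage for input files using #matching
    COVERAGE_ALL_OUTPUTS = 16  # coverage for every generated output file
    COVERAGE_ALL_INPUTS = 32   # coverage for every parsed input file


def request_flag(*flags: ReportFlag) -> int:
    """Combine the given flags with a bitwise OR."""
    value = 0
    for flag in flags:
        value |= int(flag)
    return value


def report_arguments(report_file: str, *flags: ReportFlag) -> list[str]:
    """Build the '-report report-file request-flag' command-line fragment."""
    return ["-report", report_file, str(request_flag(*flags))]
```

For instance, asking for the list of output files (1) together with the #coverage details (4) gives a request-flag of 5.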
Note that the interpreter offers a convenient shorthand for running a common script with arguments:
codeworker <script-file> <arg1> ... <argN> [<switch>]*
This form replaces the more verbose:
codeworker -script <script-file> -args <arg1> ... <argN> [<switch>]*
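The equivalence between the two forms can be modelled as a small rewriting step. The Python sketch below is not part of CodeWorker; expand_shorthand is a hypothetical helper, and it assumes that a switch is any argument starting with '-', which is a simplification (the '?' help switch, for example, is not handled).

```python
def expand_shorthand(argv: list[str]) -> list[str]:
    """Rewrite 'codeworker script args... switches...' into the -script/-args form."""
    program, rest = argv[0], argv[1:]
    if not rest or rest[0].startswith("-"):
        # Empty command line, or already written with switches: nothing to do.
        return list(argv)
    script, tail = rest[0], rest[1:]
    # The argument list stops at the first switch (assumed to start with '-').
    split = next((i for i, arg in enumerate(tail) if arg.startswith("-")), len(tail))
    args, switches = tail[:split], tail[split:]
    expanded = [program, "-script", script]
    if args:
        expanded += ["-args", *args]
    return expanded + switches
```

This mirrors the rule given for -args: the list of arguments ends at the end of the command line or as soon as an option is encountered.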
A console mode is launched when the command line is empty. The console only accepts
scripts written in the common syntax, with common functions and procedures. So, parsing
and generation scripts aren't typed directly on the console.
2 Syntax generalities and statements
A script in CodeWorker consists of a series of statements that are organized into
blocks (also known as compound statements). A statement is an instruction
the interpreter has to execute.
A single statement must close with a semicolon (';'). A compound statement is defined
by enclosing instructions between braces ('{}'). A block can be used
everywhere you can use a single statement and must never end with a semicolon after the
trailing brace.
Comments are indicated either by surrounding the text with '/*' and '*/'
or by preceding the rest of the line to ignore with a double slash ('//').
There are three families of scripts. To facilitate their syntax highlighting in editors,
or to indicate briefly the type of a script, we suggest using file extensions
that depend on the nature of the script. The following table lists the extensions
commonly used in CodeWorker.
| Extension | Description |
| ".cwt" | a template-based script, for text generation |
| ".cwp" | an extended-BNF parse script, for parsing text |
| ".cws" | a common script, neither of the above |
The structure of the grammar is so rich that it is a challenge to find an editor
whose syntax highlighting engine is powerful enough. JEdit lets you write
production rules to describe a syntax, so it is able to express the syntax highlighting of
the scripting language.
You'll find a package dedicated to JEdit on the Web site, which adds these new
highlighting modes. Many thanks to Patrick Brannan for this contribution.
2.1 Preprocessor directives
A preprocessor directive always starts with a '#' symbol and is followed by the name
of the directive.
2.1.1 Including a file
The #include filename directive tells the preprocessor to replace
the directive at the point where it appears by the contents of the file specified by the
constant string filename. The preprocessor looks for the file in the current
directory and then searches along the path specified by the -I option on the command
line.
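The lookup order described above (current directory first, then the -I paths) can be sketched as follows. This Python fragment is only a model: find_input_file is a hypothetical name, and it assumes that the -I paths are tried in the order in which they were given, which the text does not state explicitly.

```python
import os


def find_input_file(filename, include_paths, current_dir="."):
    """Return the first existing candidate path, or None if the file is not found."""
    # The current directory is tried first, then each -I path in turn.
    for directory in [current_dir, *include_paths]:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

A file present both in the current directory and in an include path is therefore resolved to the copy in the current directory.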
2.1.2 Extending the language via a package
A package is an extension of the scripting language that allows adding new
functions in CodeWorker at runtime. A package is implemented as an executable module,
which exports all new functions the developer wants to make available in the interpreter.
Loading of a package
The preprocessor directive #use tells the interpreter that it must extend itself
with the functions exposed by a package.
The syntax is:
#use package-name
Loading a package more than once has no effect.
The name of the package must prefix the name of the function, when calling it:
package-name::my-function(parameters...)
Example:
#use PGSQL
PGSQL::connect("-U pilot -d emergencyDB");
local sRequest = "SELECT solution FROM average_adjustment WHERE damage = 'broken wing'";
local listOfSolutions;
PGSQL::selectList(sRequest, listOfSolutions);
if listOfSolutions.empty()
    traceLine("No solution. Suggestion: parachute jump?");
else {
    traceLine("Solutions:");
    foreach i in listOfSolutions
        traceLine(" -" + i);
}
PGSQL::disconnect(); // if the plane hasn't crashed yet
The PGSQL package is used here to connect to and query a PostgreSQL database.
For this example, the package exports three functions: PGSQL::connect,
PGSQL::selectList and PGSQL::disconnect.
The executable module
CodeWorker expects a dynamic library, whose name is deduced from the package name and from
the platform the interpreter is running on.
The short name of the dynamic library is the package name with "cw" appended
at the end. The extension of the dynamic library must be ".dll" under
Microsoft Windows, and ".so" under Linux.
You must put the dynamic library at a place where CodeWorker will find it at runtime.
Microsoft Windows proceeds in the following order to locate the library:
- The directory where the executable module for the current process is located.
- The current directory.
- The Windows system directory (not recommended - it concerns CodeWorker only).
- The Windows directory (not recommended - same reason).
- The directories listed in the PATH environment variable.
Under Unix, a relative path for the shared object refers to the current directory
(according to the man description of dlopen(3C)).
So, when CodeWorker reads #use PGSQL, it searches for a dynamic library
called "PGSQLcw.dll" under Windows or "PGSQLcw.so"
under Linux.
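The naming rule can be restated as a one-line computation. The Python helper below is a hypothetical illustration, not CodeWorker code; it only covers the two platforms named in the text.

```python
def package_library_name(package: str, platform: str) -> str:
    """Return the dynamic library file name expected for a package."""
    # Only the extensions documented above are known to this sketch.
    extensions = {"windows": ".dll", "linux": ".so"}
    if platform not in extensions:
        raise ValueError(f"no extension documented for platform {platform!r}")
    # Short name = package name + "cw", followed by the platform extension.
    return package + "cw" + extensions[platform]
```

With this rule, the package PGSQL maps to "PGSQLcw.dll" on Windows and to "PGSQLcw.so" on Linux.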
Building a package
This section is intended for those who want to build their own packages, for binding to
a database or to a graphical library, or just for gluing with their own libraries.
When the interpreter finds the preprocessor directive #use package-name
in a script, it loads the executable module and executes the exported C-like function
CW4DL_EXPORT_SYMBOL void package-name_Init(CW4dl::Interpreter*).
The preprocessor definition CW4DL_EXPORT_SYMBOL and the namespace CW4dl are
both declared in the C++ header file "CW4dl.h". This header file is located in
the "include" directory if you downloaded binaries, and at the root of the project
if you downloaded sources.
The C-like function 'package-name_Init()' MUST be present!
C-like means that it is declared extern "C" (done by CW4DL_EXPORT_SYMBOL).
Initializing the module that way is useful for registering new functions in the engine,
via the function createCommand() of the interpreter (see the header file "CW4dl.h"
in the declaration of the class Interpreter for learning more about it).
Every function to export must start its declaration with the preprocessor definition
CW4DL_EXPORT_SYMBOL (means 'extern "C"', but a little more under Windows).
- Up to 4 parameters, the signature of such a function looks like:
CW4DL_EXPORT_SYMBOL const char*
selectList(CW4dl::Interpreter*,
CW4dl::Parameter p1, CW4dl::Parameter p2);
where selectList is a function expecting 2 parameters.
The initializer PGSQL_Init() in our example informs the engine about the existence
of this function selectList in the package:
createCommand("selectList", VALUE_PARAMETER, NODE_PARAMETER);
which means that selectList expects a string followed by a tree.
In the body of the function 'selectList(...)', the C++ binding is obtained easily
by a cast of CW4dl::Parameter:
- (const char*) p1 for the value parameter p1,
- (CW4dl::Tree*) p2 for the node parameter p2,
- If a function has strictly more than 4 parameters, its signature changes
and takes a variable number of parameters:
CW4DL_EXPORT_SYMBOL const char*
myFunction(CW4dl::Interpreter*,
int nbParams, CW4dl::Parameter* tParams);
where tParams is an array of parameter types, and where 'nbParams' gives
the size.
The initializer PGSQL_Init() informs the engine about the existence
of this function in the package differently too:
createCommand("myFunction", 6, tParams);
which means that myFunction has 6 parameters whose types are provided
in tParams.
Every function returns const char*. The CodeWorker keyword null designates
a special tree node. It cannot be navigated or referenced; it can only be passed as a parameter to a function.
On the C++ side, this null tree node is seen as a null pointer of type CW4dl::Tree*.
The interpreter CW4dl::Interpreter represents the runtime context of CodeWorker.
It is the unavoidable intermediary between the module you are building and CodeWorker.
Use it for:
- registering new functions into the CodeWorker's engine,
- throwing an error,
- handling parse trees.
The #line directive sets the line counter of the script file
being parsed to another number. The line just after the directive is given the number specified after
#line.
2.1.3 Changing the syntax of the scripting language
The #syntax directive tells the preprocessor not to parse the following
instructions as classical statements of the scripting language, but as conforming to another
syntax. It allows adapting the syntax to what you are programming:
- If you are programming a kind of makefile logic, where you have to check
whether a file has been changed before another or not (using the function
fileLastModification() for example), it is clear
that you would prefer to implement it in a makefile-like syntax rather than
in the scripting language's syntax,
- If you are programming a kind of shell logic, where you have to copy files
and directories, or move or remove them, you would prefer to implement it in a shell-like
syntax rather than in the scripting language's syntax. For instance:
traceLine("Creating directory 'CodeWorker'...");
removeDirectory("CodeWorker");
copyFile("readme.txt", "CodeWorker/readme.txt");
...
might be written in a shell-like syntax, embedded in the CodeWorker script:
#syntax shell:"TinyShell.cwp"
echo Creating directory 'CodeWorker'...
rmdir CodeWorker
copy readme.txt CodeWorker/readme.txt
...
#end syntax
The directive admits the following writing:
"#syntax" [parsing-mode [':' BNF-script-file]? | BNF-script-file]
How does it work? The piece of source code that doesn't conform to the syntax of the scripting
language is put between the directives #syntax ... and #end syntax. If the
trailing directive isn't found, the remainder of the script is considered to be written in the
foreign syntax. Be careful: to be recognized, the trailing directive must start at the beginning of
the line, and no spaces are allowed between # and end.
At runtime, this piece of source code is parsed and processed via the BNF script file.
Note that it is possible to attach an identifier (called parsing-mode above) to a
script file, and to specify later, in any other script, the parsing mode only;
CodeWorker will find the corresponding BNF script file. This avoids handling the physical name
of the BNF parsing file where the logical name of a parsing mode is more convenient.
Example:
// the first time, a parsing mode may be attached to the BNF script file
#syntax shell:"TinyShell.cwp"
...
#end syntax
// at the second call, it isn't recommended to use the path of the parsing file
// it is better to use the parsing mode registered previously
#syntax shell
...
#end syntax
// here, I know that I'll call it once only, so I don't care about a parsing mode
#syntax "MakeFile.cwp"
...
#end syntax
where the parsing script "TinyShell.cwp" might look like:
// file "GettingStarted/TinyShell.cwp":
tinyShell ::=
    #ignore(C++)
    #continue
    [
        #readIdentifier:sCommand
        #ignore(blanks) #continue
        command<sCommand>
    ]* #empty;
//----------------------------//
// commands of the tiny shell //
//----------------------------//
command<"copy"> ::=
    #continue parameter:sSource parameter:sDestination
    => {copyFile(sSource, sDestination);};
command<"rmdir"> ::=
    #continue parameter:sDirectory
    => {removeDirectory(sDirectory);};
command<"del"> ::=
    #continue parameter:sFile
    => {deleteFile(sFile);};
//--------------------
// Some useful clauses
//--------------------
parameter:value ::=
        #readCString:parameter
    |
        #!ignore #continue [~[' ' | '\t' | '\r' | '\n']]+:parameter;
Of course, the parsing and the processing are implemented in the scripting language, so code
written in a changed syntax runs slower than code in the default one. However, it allows writing
code that is easy to maintain and to understand.
2.1.4 Managing changes in a multi-language generation
The directives #reference and #attach serve to notify you when a change has
been made in a script that generates for a given language, but has not yet been carried over to
another language. For example, you are writing a framework both in C++ and JAVA. You add some
new features in C++ or correct some mistakes. Later, you must take care not to forget to
update the JAVA generation. Thanks to these directives, a warning will be produced
until the changes have been carried over to the other script.
How does it work? Directives must delimit the piece of script you have changed:
"#reference" key
...
"#end" key
The key is an identifier that allows putting more than one reference area into a
script file. A #reference area may enclose one or more other #reference
areas, without any confusion about boundaries. The directive must be put at the beginning of
the line.
Here are the directives delimiting the piece of script that should be updated later in
another file:
"#attach" reference-file ':' reference-key
...
"#end" reference-key
A #attach area may enclose one or more #reference or #attach
areas, like a #reference area. The directive must be put at the beginning of the
line.
The first time CodeWorker encounters the reference script file, it computes
a number that depends on the content of the area. The first time CodeWorker encounters an
attached script file, it retrieves this magic number of the reference
area, located by both the file name and the key of the reference. At this point,
the reference and attached areas are considered as similar. CodeWorker stores
the magic number of the reference just after the #attach directive:
"#attach" reference-file ':' reference-key ',' reference-number
A script file that must be updated, so as to store the magic numbers of some attached
areas, is rewritten at the end of the parsing, and only if no error
was encountered. If the writefileHook() function (see writefileHook) is implemented,
it is called, and the script file isn't changed if it returns false. If the script file
is read-only, the corresponding readonlyHook() function is called (see readonlyHook).
If it isn't possible to save the script file, an error is thrown.
When a change occurs in the reference area, the magic number is recomputed the next time
CodeWorker encounters it. When an attached piece of script is encountered after
the change, the old magic number of the reference is compared to the new one. If they aren't
the same, a warning is displayed to notify that the attached area hasn't been updated yet.
Once the changes have been carried over to the attached area, the magic number of the reference
must be removed by hand (don't forget the comma too!). The next time this attached area is
encountered by the interpreter, it retrieves the new magic number of the reference
area, and the reference area and the attached area are considered as
similar once again.
Of course, the use of these directives is quite constraining. However, it is the only way in
CodeWorker to ensure that features and corrections have been carried over to all generated
languages.
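The life cycle of a magic number can be summarized by a small model. The Python sketch below is an assumption-laden illustration: the text does not say how CodeWorker computes the number, so a CRC-32 checksum stands in for it, and the names magic_number and check_attached are hypothetical.

```python
import zlib
from typing import Optional, Tuple


def magic_number(area_text: str) -> int:
    """Stand-in for the number computed from the content of a #reference area."""
    return zlib.crc32(area_text.encode("utf-8"))


def check_attached(reference_text: str, stored: Optional[int]) -> Tuple[int, bool]:
    """Return (number kept after the #attach directive, whether to warn)."""
    current = magic_number(reference_text)
    if stored is None:
        # First encounter, or number removed by hand after an update:
        # adopt the current number; both areas are considered similar.
        return current, False
    # A stored number that differs means the reference changed afterwards.
    return stored, stored != current
```

In this model, the warning persists until the stored number is removed, which corresponds to cutting the reference-number (and its comma) from the #attach directive.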
2.2 Constant literals
CodeWorker handles all basic types as strings, and doesn't distinguish a double from a boolean or a date.
A string literal is a sequence of characters from the source character set enclosed in double quotation marks (" ").
String literals are used to represent a sequence of characters which, taken together, form a null-terminated string.
The interpretation of the data depends on the context: the function increment(index)
expects its argument index to contain a number, even though it is stored as a string.
- Floating-point numbers are represented as they are commonly admitted into programming languages: 3.141592 or 5.5E+6.
- Integers are represented without the dot, 64 for instance.
- A character literal is represented between single quotes as in C or JAVA; it admits
classical escape characters.
- Bytes are represented as a pair of hexadecimal digits. The 4D byte is the
ASCII code of the letter M.
- About boolean types, an empty string "" means false, and any non-empty sequence of characters means true, such as "1" or "raspberries".
Two constant literals are provided: the keyword true is worth
"true" and false is an empty string.
- Dates are written according to a format that looks like 24sep2002, where:
- the day takes 2 digits,
- the month is represented by the first 3 letters of the corresponding English word: aug for August and may for May, for instance,
- the year takes 4 digits: 2002 but never 02.
- the time representation conforms to the format:
HH:MM:SS.millis
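Since every basic value is a string, the rules above amount to small decoding functions. The Python sketch below models two of them; is_true and parse_date are hypothetical names, and the sketch assumes that month abbreviations are written in lower case, as in the example 24sep2002.

```python
import datetime
import re

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
# 2 digits for the day, 3 letters for the month, 4 digits for the year.
DATE_PATTERN = re.compile(r"(\d{2})([a-z]{3})(\d{4})")


def is_true(value: str) -> bool:
    """An empty string means false; any other string means true."""
    return value != ""


def parse_date(text: str) -> datetime.date:
    """Decode a date literal such as '24sep2002'."""
    match = DATE_PATTERN.fullmatch(text)
    if match is None or match.group(2) not in MONTHS:
        raise ValueError(f"not a date literal: {text!r}")
    day, month, year = match.groups()
    return datetime.date(int(year), MONTHS.index(month) + 1, int(day))
```

A two-digit year such as 24sep02 is rejected by the pattern, in line with the rule that the year always takes 4 digits.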
A constant tree describes a tree as a list of constant trees and expressions, intended
to be assigned to a variable. Example:
local aVariable = "a"{["yellow", "red":"or"{.alternative="orange"}], .vehicle="submarine"};
You'll find more information in the sub section Scope below.
2.3 Variables, declaration and assignment
Variables serve as containers for the data you use in scripts. The data type is a tree, which may
be reduced to a single leaf node that contains nothing but a value.
2.3.1 Declaring variables
It isn't necessary to declare a variable before using it for the first time. A variable that
is assigned without being declared is understood as a new sub-node to be added to the current
tree context. The current context is obtained through the read-only variable called this.
It corresponds to the main parse tree, whose root name is project, when you are in
the leader script, and to the variable passed as parameter when calling a parsing or pattern
script.
The following table lists all predefined variable names (accessible from anywhere) and their meaning:
| Variable Name | Description |
| project | The main parse tree, always present. |
| this | It points to the current context variable. |
| _ARGS | An array of all custom command-line arguments. Custom arguments are those following the script file name or the switch -args on the command line. |
| _REQUEST | If the interpreter works as a CGI program, it stores all parameters of the request in an association table. The key is the parameter name, which is associated with the corresponding value. |
A variable that is read without being declared returns an empty string, but doesn't cause the
creation of a sub-node. The danger is that you aren't safe from a spelling mistake. To prevent
it, put the option -varexist on the command line and use the function existVariable()
to check whether a variable exists or not.
2.3.2 Scope
When you declare a local variable, it is valid for use within a specific area of code, called
the scope. When the flow of execution leaves the scope, the content of the variable,
a subtree specially allocated during its declaration, is deleted and disappears forever from
the stack. A scope is delimited by a block.
To declare a variable on the stack, use the following declaration statement:
local-variable-statement ::= "local" local-variable-declaration ';'
local-variable-declaration ::= variable [ '=' assignment-expression ]?
assignment-expression ::= constant-tree | expression
constant-tree ::= [tree-value]? '{' [tree-array-or-attribute [',' tree-array-or-attribute]* ]? '}'
tree-value ::= expression
tree-array-or-attribute ::= tree-array | tree-attribute
tree-attribute ::= '.' attribute-name '=' assignment-expression
tree-array ::= '[' tree-array-item [',' tree-array-item]* ']'
tree-array-item ::= expression ':' assignment-expression | assignment-expression
An extension of the syntax allows the declaration of more than one variable in one shot. A comma separates the
variable declarations:
local-variable-statement ::= "local" local-variable-declaration [ ',' local-variable-declaration ]* ';'
The local variable points to a new empty tree, pushed onto the stack.
- If an expression is present after the local declaration, it is evaluated and
the string result is assigned to the new local variable.
- If a constant tree is present after the local declaration, it is assigned to
the new local variable. Example:
local aVariable = {"a", {"yellow", "red"}, "submarine"};
is equivalent to:
local aVariable;
pushItem aVariable = "a";
pushItem aVariable;
pushItem aVariable#back = "yellow";
pushItem aVariable#back = "red";
pushItem aVariable = "submarine";
where pushItem means that a new item has to be added to the array owned
by aVariable, and where #back means accessing the last item
of the array.
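The step-by-step construction can be mimicked with a tiny tree model. The Python class below is a simplified, hypothetical stand-in for a CodeWorker node: it keeps a string value and a plain list of items, and ignores the keys and attributes that real nodes also carry.

```python
class Node:
    """A tree node reduced to a string value plus an array of child nodes."""

    def __init__(self, value: str = ""):
        self.value = value
        self.items: list["Node"] = []

    def push_item(self, value: str = "") -> "Node":
        """Model of 'pushItem': append a new item to the array and return it."""
        item = Node(value)
        self.items.append(item)
        return item

    @property
    def back(self) -> "Node":
        """Model of '#back': the last item of the array."""
        return self.items[-1]

    def as_nested(self):
        """Render the tree as nested lists, with leaves as plain strings."""
        if not self.items:
            return self.value
        return [item.as_nested() for item in self.items]


# Same sequence as the pushItem statements above.
a_variable = Node()
a_variable.push_item("a")
a_variable.push_item()
a_variable.back.push_item("yellow")
a_variable.back.push_item("red")
a_variable.push_item("submarine")
```

Calling a_variable.as_nested() yields the nested shape of the constant tree: "a", then the pair "yellow" and "red", then "submarine".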
To assign a reference to another variable, rather than the result of
evaluating an expression or a constant tree, use the following declaration statement:
local-ref-statement ::= "localref" local-ref-declaration [ ',' local-ref-declaration ]* ';'
local-ref-declaration ::= variable '=' reference
In CodeWorker versions strictly older than 1.13, local
variables declared in the body of a script or in the scope of a function could
be accessed further down in the scope of called functions during their lifetime. Such scripts
may therefore behave differently with a more recent CodeWorker interpreter.
This stack management had historical reasons, but it is now obsolete and often reveals an
implementation error. To protect you from this kind of mistake, a warning may be displayed,
so that scripts written for versions strictly older than 1.13 can continue to run. Specify a
version strictly older than 1.13 on the command line (option -version) to
request that CodeWorker checks for this and generates a warning.
To correct this kind of mistake in old scripts, the variable should be passed as an
argument to the functions that refer to it.
To declare a global variable, use the global statement. The declaration of a global
variable can be specified anywhere in scripts. The first time the declaration of a global
variable is encountered, the interpreter registers it as accessible from any point into
scripts. The second time the interpreter encounters a global declaration for the variable,
the latter remains global but its content is cleared.
Note that if a local variable or an attribute of the current node (this) is identical
to the name of an existing global variable, the global variable remains hidden while
the flow of control hasn't left the scope that contains the homonym.
The global declaration statement looks like:
global-variable-statement ::= "global" global-variable-declaration [ ',' global-variable-declaration ]* ';'
global-variable-declaration ::= variable [ '=' assignment-expression ]?
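The following lines are a small sketch of a global variable shared with a function. The names gCounter and countCall are hypothetical; increment is used as in the other examples of this manual.

```
// first encounter of the declaration: gCounter becomes accessible everywhere
global gCounter = 0;

function countCall() {
    // no local variable is named gCounter here, so the global one is reached
    increment(gCounter);
}

countCall();
countCall();
traceLine("number of calls = " + gCounter);
```

Declaring a local variable named gCounter inside countCall would hide the global one until the function is left.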
2.3.3 Navigating along branches
It is possible to navigate along a branch of the subtree put into the variable. A branch
points to a node of the subtree. The syntax looks generally like:
branch ::= variable ['.' sub-node]*
If the branch isn't known before runtime, it may be built during the execution.
Example: while parsing an XML file, each time an XML attribute is encountered, one creates the
corresponding attribute in the parse tree. But the name of the attribute is only discovered
during the parsing. The directive #evaluateVariable(expression) handles this case:
expression is evaluated at runtime and provides a branch:
#evaluateVariable("a.b.c")
will resolve the path "a.b.c" at runtime and
navigate from a to c.
A node may contain an array of nodes, which are indexed by a key that is a constant string.
A branch allows navigating through arrays, and the definitive syntax of branches conforms
to:
branch ::= "#evaluateVariable" '(' expression ')'
::= variable ['.' sub-node | array-access]*
array-access ::= '[' expression ']'
::= '#' ["front" | "back" | "parent" | "root"]
::= '#' '[' integer-expression ']'
We see that there are some ways to access an item node of an array or to change how to
navigate from nodes to nodes:
- sub-node '[' expression ']' means that
we'll access the node item associated to the string key resulting of the
expression's evaluation,
- sub-node '#' "front" means that the first item
node of the array is required. If the array is empty, an error occurs.
- sub-node '#' "back" means that the last item
node of the array is required. If the array is empty, an error occurs.
- sub-node '#' "parent" means that one comes back
up the parent's node of sub-node.
- sub-node '#' "root" means that one comes back
up the root's node of the tree sub-node belongs to.
- sub-node '#' '[' <integer-expression> ']' means
that we'll access the node item located at the position given by the evaluation
of the expression. Positions are counted from 0. An error is raised if the
position is out of bounds.
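The access forms listed above can be sketched as follows. The tree fleet and its keys are invented for the illustration, and the comments only restate the rules given in this section.

```
local fleet;
insert fleet.boats["Titanic"] = "liner";
insert fleet.boats["Nautilus"] = "submarine";
insert fleet.boats["Calypso"] = "research ship";

traceLine(fleet.boats["Nautilus"]);  // item associated to the key "Nautilus"
traceLine(fleet.boats#front);        // first item of the array
traceLine(fleet.boats#back);         // last item of the array
traceLine(fleet.boats#[1]);          // item at position 1, counting from 0
```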
2.3.4 Assignments
CodeWorker provides several ways to put data into a variable or into the node
pointed to by a branch:
- set variable-branch ['=' | '+='] assignment-expression :
the expression is evaluated and the resulting string value is assigned to the
variable, or concatenated to its current value if the operator '+=' is used. Keyword
set may be omitted. The node to assign is expected to exist already. If it doesn't,
the assignment is still done, but a warning is raised to point out a possible spelling
mistake in the variable's name.
- insert variable-branch [['=' | '+='] assignment-expression]? :
it works like the set assignment, except that it is the preferred way
to add a new node when the variable doesn't exist yet. If the node already exists,
it isn't added twice, and the assignment is done as expected. If no
assignment is specified after the variable's name, nothing is assigned to the node:
if the node didn't exist yet, it contains an empty string; otherwise,
its previous value is left unchanged.
- ref variable '=' existing-variable-or-branch :
the variable to assign will refer to an existing node. Inspecting the variable
amounts to inspecting the referenced node. If the referenced node doesn't
exist, an error occurs. If you apply the reference to a variable that already
refers to a node, that link is broken instead of propagating the reference to the
referred node. This operator is very useful during the decoration of the
parse tree, and turns the tree into a freely-oriented graph.
Be careful not to keep a reference to a local variable once the flow of execution
has left its scope: the local variable is deleted, so the reference points
to a corrupted part of the memory.
If you intend to assign a reference, inside a function, to a variable that was
passed as a parameter, don't forget to use the reference parameter
mode:
function badFunction(myVar : node) {
...
// myVar will keep up a reference to aNode
// up to the end of the function:
ref myVar = aNode;
...
// myVar is passed as variable, so the
// reference is cancelled once the function is left!
}
// To keep the reference after leaving the function, change the parameter
// mode to reference:
function goodFunction(myVar : reference) {
...
// myVar will keep up a reference to aNode
// up to the end of the function:
ref myVar = aNode;
...
// myVar is passed as reference, so the
// reference is kept once the function is left!
}
- setall variable-branch '=' existing-variable-or-branch :
value, attributes and array of the variable to assign are purged, and the subtree,
to which the existing variable points, is copied integrally to the node to
assign.
- merge variable-branch '=' existing-variable-or-branch :
the subtree, to which the existing variable points, is copied integrally to the
node to assign, preserving the attributes and the arrays of the assigned node,
which are updated or completed.
- pushItem variable-branch ['=' expression]? :
a new item node is added at the end of the variable's array, with its position
(starting at 0) as key. If the expression exists, then after evaluating it,
the result is assigned to the item node as a value. If no array existed
previously, the item becomes its first component.
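To contrast setall with merge, here is a short sketch under the rules stated above. The variables aSource, aMerged and aCopy and their attributes are hypothetical.

```
local aSource;
insert aSource.name = "Everest";
insert aSource.height = "8848";

local aMerged;
insert aMerged.country = "Nepal";
// merge preserves 'country' and completes the node with 'name' and 'height'
merge aMerged = aSource;

local aCopy;
insert aCopy.country = "Nepal";
// setall purges 'country' before copying the subtree of aSource
setall aCopy = aSource;

traceObject(aMerged);
traceObject(aCopy);
```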
2.4 Expressions
2.4.1 Presentation
The BNF representation of an expression looks like:
expression ::= boolean-expr | ternary-expr
boolean-expr ::= comparison-expr [boolean-op comparison-expr]
boolean-op ::= '&' | '&&' | '|' | '||' | '^' | '^^'
ternary-expr ::= comparison-expr '?' expression ':' expression
comparison-expr ::= concatenation-expr [comparison-op concatenation-expr | "in" constant-set]
constant-set ::= '{' constant-string [',' constant-string]* '}'
comparison-op ::= '<' | '<=' | '==' | '=' | '!=' | '<>' | '>' | '>='
concatenation-expr ::= stdliteral-expr ['+' stdliteral-expr]*
stdliteral-expr ::= literal-expr
::= '$' arithmetic-expr '$'
literal-expr ::= constant-string | number
::= "true" | "false"
::= '(' expression ')'
::= '!' literal-expr
::= preprocessor-expr
::= function-call
::= variable-or-branch
arithmetic-expr ::= comparith-expr [boolean-op comparith-expr]*
comparith-expr ::= sum-expr [comparison-op sum-expr]
sum-expr ::= shift-expr [['+' | '-'] shift-expr]*
shift-expr ::= factor-expr [["<<" | ">>"] factor-expr]*
factor-expr ::= literal-expr [['*' | '/' | '%'] literal-expr]*
unary-expr ::= literal-expr ["++" | "--"]
literal-expr ::= string | variable-expr | number | unary-expr
::= '~' literal-expr
preprocessor-expr ::= '#' ["LINE" | "FILE"]
where:
2.4.2 Arithmetic expressions
The classical syntax of the interpreter forces expressions to work on sequences of characters.
So, comparison operators apply the lexicographical order, the '+' operator concatenates
two strings, and the '*' operator doesn't exist.
Of course, there are functions to handle strings as numbers and to execute an arithmetic
operation (the 'add()' or 'mult()' functions for instance) or a comparison (the
'isPositive()' or 'inf()' functions for instance).
However, it is clearly more convenient to write arithmetic operations and comparisons in a
natural way, using operators instead of the corresponding functions. So, CodeWorker provides
an escape mode that draws its inspiration from LaTeX to express mathematical formulas:
arithmetic expressions are delimited by the symbol '$'.
Example:
local a = 11;
local b = 7;
traceLine("Classical mode = '"
+ inf(add(mult(5, a), 3), sub(mult(a, a), mult(b, b))) + "'");
traceLine("Escape mode = '" + $5*a + 3 < a*a - b*b$ + "'");
Output:
Classical mode = 'true'
Escape mode = 'true'
2.5 Common statements
2.5.1 The 'if' statement
The BNF representation of the if statement is:
if-statement ::= "if" expression then-statement
["else" else-statement]?
The if statement evaluates the expression following immediately. The expression
must be of arithmetic, text, variable or condition type. In both forms of the if
syntax, if the expression evaluates to a nonempty string, the statement dependent on the
evaluation is executed; otherwise, it is skipped.
In the if...else syntax, the second statement is executed if the result of evaluating
the expression is an empty string. The else clause of an if...else statement is
associated with the closest previous if statement that does not have a corresponding
else statement.
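A minimal sketch of both forms of the if statement, showing that an empty string plays the role of false (the variables sText and a are only illustrative):

```
local sText = "";
if sText {
    traceLine("sText is not empty");
} else {
    // this branch is taken: the expression evaluates to an empty string
    traceLine("sText is empty");
}

local a = 11;
// an arithmetic comparison written in escape mode
if $a > 10$ {
    traceLine("a is greater than 10");
}
```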
2.5.2 The 'while'/'do' statements
The BNF representation of the while statement is:
while_statement ::= "while" expression statement
The while statement lets you repeat a statement or compound
statement until a specified expression becomes an empty
string. The expression in a while statement is evaluated before
the body of the loop is executed. Therefore, the body of the loop
may never be executed. If expression returns an empty string,
the while statement terminates and control passes to the next
statement in the program. If expression is non-empty, the body is executed
and the process is repeated. The while statement can also terminate when a
break or return statement is executed within the statement
body. When a continue statement is encountered, control leaves
the body and jumps to the evaluation of the expression.
Note that the break and continue statements apply to
the innermost loop statement (foreach/forfile/select, do/while) that
encloses them.
The BNF representation of the do statement is:
do_statement ::= "do" statement "while" expression ';'
The do-while statement lets you repeat a statement or compound
statement until a specified expression becomes an empty string. The
expression in a do-while statement is evaluated after the body of the
loop is executed. Therefore, the body of the loop is always executed at
least once. If expression returns an empty string, the do-while
statement terminates and control passes to the next statement in the
program. If expression is non-empty, the process is repeated.
The do-while statement can also terminate when a break, or return
statement is executed within the statement body. When a continue
statement is encountered, control is transferred to the evaluation of
the expression.
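The difference between the two loops can be sketched like this, assuming a counter named iCount. The condition is already false when the do statement starts, yet its body still runs once.

```
local iCount = 0;
// the expression is evaluated before each execution of the body
while $iCount < 3$ {
    traceLine("while: iCount = " + iCount);
    increment(iCount);
}

// the expression is evaluated after the body, which is executed at least once
do {
    traceLine("do: iCount = " + iCount);
    increment(iCount);
} while $iCount < 3$;
```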
2.5.3 The 'switch' statement
The BNF representation of this statement is:
switch_statement ::= "switch" '(' expression ')' '{' (label_declaration)* ("default" ':' statement)? '}'
label_declaration ::= ["case" | "start"] constant_string ':' statement
The switch statement allows selection among multiple sections of
code, depending on the value of an expression. The expression enclosed
in parentheses, the controlling expression, must be of string type.
The switch statement causes an unconditional jump to, into, or past
the statement that is the switch body, depending on the value of the
controlling expression, the constant string values of the case or start labels,
and the presence or absence of a default label. The switch body is
normally a compound statement (although this is not a syntactic
requirement). Usually, some of the statements in the switch body are
labeled with case labels or with start labels or with the default
label. The default label can appear only once.
The constant-string in the case label is compared for equality with the controlling
expression. The constant-string in the start label is compared for equality with the
first characters of the controlling expression. In a given switch statement, no two
constant strings in start or case statements can evaluate to the same value.
The switch statement behaviour depends on how the controlling
expression matches the labels. If a case label exactly matches
the controlling expression, control is transferred to the statement
following that label. Otherwise, start labels are tried in
lexicographical order, and control is transferred to the statement
following the first label that matches the beginning of the controlling expression.
If no label matches, control is transferred to the default statement or, if there is none,
an error is thrown.
A switch statement can be nested. In such cases, case or start or
default labels associate with the most deeply nested switch
statements that enclose them.
Control is not impeded by case or start or default labels. To
stop execution at the end of a part of the compound statement, insert a
break statement. This transfers control to the statement after the
switch statement.
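As an illustration of case, start and default labels, here is a sketch built around a hypothetical function classify. Each label is followed by a single return statement, so no break is needed.

```
function classify(sName : value) {
    switch(sName) {
        case "README":
            // the controlling expression is exactly "README"
            return "exact match";
        start "test_":
            // the controlling expression begins with "test_"
            return "prefix match";
        default:
            return "no match";
    }
}

traceLine(classify("README"));
traceLine(classify("test_parser"));
traceLine(classify("main"));
```

Following the rules above, the three calls are expected to select the case label, the start label and the default label respectively.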
2.5.4 The 'foreach' statement
The BNF representation of this statement is:
foreach_statement ::= "foreach" iterator "in" [direction]?
[sorted_declaration]? [cascading_declaration]? list-node body_statement
direction ::= "reverse"
sorted_declaration ::= "sorted" ["no_case"]? ["by_value"]?
cascading_declaration ::= "cascading" ["first" | "last"]?
A foreach statement iterates all items of the list owned by node list-node.
The iterator refers to the current item of the list, and the body statement is executed
on it.
Items are iterated either in the order of entrance, or in alphabetical order if
option sorted is set. The sort operates on keys, except if the option by_value is set.
The order is inverted if option reverse was chosen.
To ignore the case, these options must be followed by no_case.
If not, uppercase letters are considered as smaller than any lowercase letter.
// file "Documentation/ForeachSampleSorted.cws":
local list;
insert list["silverware"] = "tea spoon";
insert list["Mountain"] = "Everest";
insert list["SilverWare"] = "Tea Spoon";
insert list["Boat"] = "Titanic";
insert list["acrobat"] = "Circus";
traceLine("Sorted list in a classical order:");
foreach i in sorted list {
traceLine("\t" + key(i));
}
traceLine("Note that uppercases are listed before lowercases." + endl());
traceLine("Sorted list where the case is ignored:");
foreach i in sorted no_case list {
traceLine("\t" + key(i));
}
traceLine("Reverse sorted list:");
foreach i in reverse sorted list {
traceLine("\t" + key(i));
}
traceLine("Reverse sorted list where the case is ignored:");
foreach i in reverse sorted no_case list {
traceLine("\t" + key(i));
}
Output:
Sorted list in a classical order:
Boat
Mountain
SilverWare
acrobat
silverware
Note that uppercases are listed before lowercases.
Sorted list where the case is ignored:
acrobat
Boat
Mountain
SilverWare
silverware
Reverse sorted list:
silverware
acrobat
SilverWare
Mountain
Boat
Reverse sorted list where the case is ignored:
silverware
SilverWare
Mountain
Boat
acrobat
Control isn't necessarily sequential in the body statement: break and return
exit the loop for good, and continue transfers control to the head of the
foreach statement for the next iteration.
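A short sketch of continue and break inside a foreach, on an invented list:

```
local list;
pushItem list = "alpha";
pushItem list = "beta";
pushItem list = "gamma";
pushItem list = "delta";

foreach i in list {
    if i == "beta" continue;  // skip this item and go to the next iteration
    if i == "delta" break;    // leave the loop
    traceLine(i);
}
```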
Option cascading allows propagating foreach on item nodes. The way
it works is illustrated by an example:
foreach i in cascading myObjectModeling.packages ...
At the beginning, i points to myObjectModeling.packages#front and
the body is executed. Before iterating i to the next item, the foreach checks whether
the item node myObjectModeling.packages#front owns attribute packages or not.
If yes, it applies recursively foreach on myObjectModeling.packages#front.packages.
Option cascading avoids writing the following code:
function propagateOnPackages(myPackages : node) {
foreach i in myPackages {
// my code to apply on this package
if existVariable(i.packages)
propagateOnPackages(i.packages);
}
}
propagateOnPackages(myObjectModeling.packages);
Option cascading offers two behaviours:
- first means that the item is cascaded before running the body,
// file "Documentation/ForeachSampleFirst.cws":
local myObjectModeling;
insert myObjectModeling.packages["Massif"] = "...";
local myPackage;
ref myPackage = myObjectModeling.packages["Massif"];
insert myPackage.packages["Alps"] = "...";
insert myPackage.packages["Himalaya"] = "...";
insert myPackage.packages["Rock Mountains"] = "...";
insert myObjectModeling.packages["Silverware"] = "...";
ref myPackage = myObjectModeling.packages["Silverware"];
insert myPackage.packages["Spoon"] = "...";
insert myPackage.packages["Fork"] = "...";
insert myPackage.packages["Knife"] = "...";
foreach i in cascading first myObjectModeling.packages {
traceLine("\t" + key(i));
}
Output:
Alps
Himalaya
Rock Mountains
Massif
Spoon
Fork
Knife
Silverware
- last is the default behaviour, as seen in previous examples, and
propagates the foreach on the current item after executing the body.
// file "Documentation/ForeachSampleLast.cws":
local myObjectModeling;
insert myObjectModeling.packages["Massif"] = "...";
local myPackage;
ref myPackage = myObjectModeling.packages["Massif"];
insert myPackage.packages["Alps"] = "...";
insert myPackage.packages["Himalaya"] = "...";
insert myPackage.packages["Rock Mountains"] = "...";
insert myObjectModeling.packages["Silverware"] = "...";
ref myPackage = myObjectModeling.packages["Silverware"];
insert myPackage.packages["Spoon"] = "...";
insert myPackage.packages["Fork"] = "...";
insert myPackage.packages["Knife"] = "...";
foreach i in cascading last myObjectModeling.packages {
traceLine("\t" + key(i));
}
Output:
Massif
Alps
Himalaya
Rock Mountains
Silverware
Spoon
Fork
Knife
2.5.5 The 'forfile' statement
The BNF representation of this statement is:
forfile_statement ::= "forfile" iterator "in" [sorted_declaration]? [cascading_declaration]? file-pattern body_statement
sorted_declaration ::= "sorted" ["no_case"]?
cascading_declaration ::= "cascading" ["first" | "last"]?
A forfile statement iterates over the names of all files that match the filter file-pattern.
The iterator refers to the current item of the list of retained file names, and the body statement is executed
on it. Note that the file pattern may begin with a path, which cannot contain wildcard characters ('*' and '?').
As for the foreach statement, items are iterated either in the order of entrance, or in alphabetical order of keys if
option sorted is set. To ignore the case, the option must be followed by no_case.
If not, uppercase letters are considered as smaller than any lowercase letter.
Control isn't necessarily sequential in the body statement: break and return
exit the loop for good, and continue transfers control to the head of the
forfile statement for the next iteration.
The option cascading allows propagating forfile on directories recursively.
The way it works is illustrated by an example:
// file "Documentation/ForfileSample.cws":
local iIndex = 0;
forfile i in cascading "*.html" {
if $findString(i, "manual_") < 0$ &&
$findString(i, "Bugs") < 0$ {
traceLine(i);
}
// if too long, stop the iteration
if $iIndex > 15$ break;
increment(iIndex);
}
Output:
cs/DOTNET.html
cs/tests/data/MatchingTest/example.csv.html
Documentation/LastChanges.html
java/JAVAAPI.html
java/data/MatchingTest/example.csv.html
Scripts/Tutorial/GettingStarted/defaultDocumentation.html
WebSite/AllDownloads.html
WebSite/examples/basicInformation.html
WebSite/highlighting/basicInformation.html
WebSite/repository/highlighting.html
WebSite/repository/JEdit/Entity.java.cwt.html
WebSite/serewin/ExempleIllustre.html
WebSite/tutorials/DesignSpecificModeling/tutorial.html
WebSite/tutorials/DesignSpecificModeling/highlighting/demo.cws.html
WebSite/tutorials/overview/tinyDSL_spec.html
WebSite/tutorials/overview/scripts2HTML/CodeWorker_grammar.html
At the beginning, i points to the first HTML file of the current directory and
the body is executed. Before iterating i to the next item, the forfile checks whether
the directory of the current file owns subfolders or not. If yes, it applies recursively
forfile on subfolders.
Option cascading offers two behaviours:
- first means that the subfolders are visited before running the body,
- last is the default behaviour, as seen in previous examples, and
propagates the forfile on the subfolder after executing the body.
2.5.6 The 'select' statement
The BNF representation of this statement is:
select_statement ::= "select" iterator "in" [sorted_declaration]? node-motif body_statement
sorted_declaration ::= "sorted" first-key [',' other-key]*
first-key ::= branch
other-key ::= branch
A select statement iterates a list of nodes that match a motif expression.
The iterator refers to the current item of the list composed of retained nodes, and the body statement is executed
on it.
// file "Documentation/SelectSample.cws":
local a;
pushItem a.b;
pushItem a.b#back.c = "01";
pushItem a.b#back.c = "02";
pushItem a.b#back.c = "03";
pushItem a.b;
pushItem a.b#back.c = "11";
pushItem a.b#back.c = "12";
pushItem a.b#back.c = "13";
pushItem a.b;
pushItem a.b#back.c = "21";
pushItem a.b#back.c = "22";
pushItem a.b#back.c = "23";
select i in a.b[].c[] {
traceLine("i = "+ i);
}
Output:
i = 01
i = 02
i = 03
i = 11
i = 12
i = 13
i = 21
i = 22
i = 23
Like for the foreach statement, items are iterated either in the order of entrance, or
according to the sorting result if the option sorted is set.
Control isn't necessarily sequential in the body statement: break and return
exit the loop for good, and continue transfers control to the head of the
select statement for the next iteration.
2.5.7 The 'try'/'catch' statement
The BNF representation of this statement is:
try-catch-statement ::= "try" try-statement
"catch" '('error_message_variable')'
catch-statement
Error handling is implemented by using the try, catch, and error
keywords. With error handling, your program can communicate unexpected events to a higher
execution context that is better able to recover from such abnormal events.
These errors are handled by code that is outside the normal flow of control.
The compound statement after the try clause is the guarded section of code. An error
is thrown (or raised) when command error(message-text) is called or
when CodeWorker encounters an internal error. The compound statement after the catch
clause is the error handler, and catches (handles) the error thrown. The catch clause
statement indicates the name of the variable that must receive the error message.
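A minimal sketch of a guarded section follows. The variable sError is declared beforehand as a cautious assumption; the text above does not say whether the catch clause requires it.

```
local sError;
try {
    traceLine("entering the guarded section");
    // raise an error explicitly: the rest of the guarded section is skipped
    error("something went wrong");
    traceLine("never displayed");
} catch(sError) {
    // sError receives the error message
    traceLine("error caught: " + sError);
}
```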
2.5.8 The 'exit' statement
The BNF representation of this statement is:
exit_statement ::= "exit" integer-expression ";"
An exit statement leaves the application and returns an error code, given by the integer-expression.
Example:
exit -1;
2.6 User-defined functions
The BNF representation of a user-defined function to implement is:
user-function ::= classical-function-definition | template-function-definition
classical-function-definition ::= classical-function-prototype compound-statement
classical-function-prototype ::= "function" function-name '(' parameters ')'
template-function-definition ::= see the next section, template function, for more information
parameters ::= parameter [',' parameter]*
parameter ::= argument [':' parameter-mode [':' default-value]? ]?
parameter-mode ::= "value" | "node" | "reference" | "index"
default-value ::= "project" | "this" | "null" | "true" | "false" | constant-string
The scripting language lets users implement their own functions. Parameters may be passed
to the body of the function. A value may be returned by the function and, if so, the return
type is necessarily a sequence of characters. Of course, functions manage their own stack, and
so accept recursive calls.
An argument may have a default value, used if the parameter is missing in a call. All following arguments
must then have default values too. A node argument can't have a constant string as a default
value, but it can default to a global variable.
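Following the BNF above, a default value comes after the parameter mode. The function greet and its arguments are hypothetical and only sketch the syntax.

```
function greet(sName : value, sPrefix : value : "Hello") {
    return sPrefix + " " + sName;
}

// sPrefix is missing in the call, so its default value is used
traceLine(greet("world"));
// both parameters are provided
traceLine(greet("world", "Goodbye"));
```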
2.6.1 Parameters and return value
Arguments passed by parameter must be chosen among the following modes:
- value: this is the default mode if the mode of the argument is omitted; it
requires a sequence of characters (the value of a node, a constant string or the
result of an expression),
- node: a node is passed and it may be changed or inspected in the body.
The scope of a reference assignment is limited to the scope of the function: once
the function is left, the variable receives the value of the referenced node. It
is explained by the fact that the parameter is a new local variable, which refers
to the node passed as argument. So, a reference assignment is applied on the
local variable only.
- iterator: the iterator of a foreach statement is expected, for applying
iterator functions on the argument (first() for instance). It is rarely
useful, since node is now sufficient.
- reference: a node is passed and it may be changed or inspected in the body.
Unlike the node mode, a reference assignment is propagated
outside the scope of the function.
If you omit to return a value from a function, it returns an empty string; in that
case, you presumably intend to call this function as a procedure, without exploiting the result. The
special procedure nop takes a function call as parameter and allows executing the
function and ignoring the result. It isn't compulsory to use nop for calling a function
as a procedure. As in C or C++, you can type the function call followed by a semicolon and
the result is lost.
There are two ways to return a value:
- to populate an internal local variable whose name is the same as the function name,
- to use the return statement, followed by the expression to evaluate.
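Both ways of returning a value are sketched below with two hypothetical functions. The set keyword is written explicitly, although it may be omitted.

```
function withReturn(sText : value) {
    // the return statement leaves the function with the evaluated expression
    return sText + "!";
}

function withName(sText : value) {
    // populate the internal local variable named after the function
    set withName = sText + "!";
}

traceLine(withReturn("first way"));
traceLine(withName("second way"));
```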
2.6.2 The 'finally' statement
If you wish to execute a particular process in any case before leaving a function, and:
- there is more than one control path that leaves the function,
- some errors may be raised,
the finally statement guarantees that the block of instructions that follows the
keyword will be systematically executed before leaving. This declaration may be placed
anywhere in the body of the function. Its syntax conforms to:
finally-statement ::= "finally" compound-statement
Example:
// file "Documentation/FinallySample.cws":
1 function f(v : value) {
2 traceLine("BEGIN f(v)");
3 finally {
4 traceLine("END f(v)");
5 }
6 // the body of the function, with more than
7 // one way to exit the function, for example:
8 if !v return "empty";
9 if v == "1" return "first";
10 if v == "2" return "second";
11 if v == "3" return "third";
12 return "other";
13 }
14
15 traceLine("...f(1) has been executed and returned '" + f(1) + "'");
line 3: the finally statement is put anywhere in the body,
line 4: this statement will be executed while exiting the function, even if an exception was
raised,
Output:
BEGIN f(v)
END f(v)
...f(1) has been executed and returned 'first'
2.6.3 Unusual function declarations
It may happen that a function prototype must be declared before being implemented, because of
a cross-reference with another function for instance. The scripting language offers the forward
declaration to answer this need: the prototype of the function is written,
preceded by the declare keyword:
forward-declaration ::= "declare" function-prototype ';'
If the body of the function must be implemented in another library, in C++ for example,
the prototype of the function is preceded by the external keyword (see section C++ binding):
external-declaration ::= "external" function-prototype ';'
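A forward declaration is typically needed for mutually recursive functions. The sketch below uses two hypothetical functions, isEven and isOdd, and assumes they are called with a non-negative integer.

```
// isOdd is used by isEven before being implemented
declare function isOdd(n : value);

function isEven(n : value) {
    if $n == 0$ return true;
    return isOdd($n - 1$);
}

function isOdd(n : value) {
    if $n == 0$ return false;
    return isEven($n - 1$);
}

if isEven(10) traceLine("10 is even");
```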
2.6.4 Template functions
CodeWorker offers a special category of functions called template functions.
Because CodeWorker doesn't provide a typed scripting language, template must not
be understood the way it is commonly used in C++ for instance.
A template function represents a set of functions with the
same prototype, except for the dispatching constant. The dispatching constant
is a constant string that extends the name of the function. These functions
instantiate the template function for a particular dispatching
constant. Each instantiated function implements its own body.
The BNF representation of a template function to implement is:
template-function-definition ::= instantiated-function-definition | generic-function-definition
instantiated-function-definition ::= instantiated-function-prototype compound-statement
instantiated-function-prototype ::= "function" function-name '<' dispatching-constant '>' '(' parameters ')'
dispatching-constant ::= a constant string between double quotes
generic-function-definition ::= generic-function-prototype [compound-statement | template-based-body]
generic-function-prototype ::= "function" function-name '<' generic-key '>' '(' parameters ')'
generic-key ::= an identifier that matches any dispatching constant with no attached prototype
template-based-body ::= "{{" template-based-script "}}"
template-based-script ::= a piece of template-based script describing the generic implementation
A call to a template function requires a dispatching expression
to determine the dispatching constant. The dispatching expression is
evaluated during the execution and CodeWorker resolves which instantiated
function of this template to call: the result of the dispatching expression must
match the dispatching constant of the instantiated function.
The BNF representation of a call to a template function is:
instantiated-function-call ::= function-name '<' dispatching-expression '>' '(' parameters ')'
parameters ::= expression [',' expression]*
Note that a dispatching constant may be empty and such an instantiated function
can be called as a classical function. In fact, classical functions are considered as
instantiated functions where the dispatching constant is empty.
Template functions bring generic programming to the language:
let's imagine that we need a function getType(myType : node), with one variant for every
language we could have to generate (C++, Java, ...).
Normally, you'll write the following lines to recover the type depending on the
language for which you are producing the source code:
if doc_language == "C++" {
sType = getCppType(myParameterType);
} else if doc_language == "JAVA" {
sType = getJAVAType(myParameterType);
} else {
error("unrecognized language '" + doc_language + "'");
}
Thanks to template functions, you may replace the preceding lines with the next one:
sType = getType<doc_language>(myParameterType);
with:
function getType<"JAVA">(myType : node) {
... // implementation for returning a Java type
}
function getType<"C++">(myType : node) {
... // implementation for returning a C++ type
}
During the execution, the function getType<T>(myType : node) resolves
on what instantiated function it has to dispatch: either getType<"JAVA">(myType : node)
or getType<"C++">(myType : node), depending on what value is assigned to
variable doc_language.
Trying to call an instantiated function that doesn't exist raises an error at runtime.
However, one might provide a default implementation. For instance:
function getType<T>(myType : node) {
... // common implementation for any unrecognized language
}
For those who know generic programming with C++ templates, here is a classical example of
using template functions:
function f<1>() { return 1; }
function f<N>() { return $N*f<$N - 1$>()$; }
local f10 = f<10>();
if $f10 != 3628800$ error("10! should be worth 3628800");
traceLine("10! = " + f10);
Output:
10! = 3628800
To provide more flexibility in the implementation of the template function, depending on
the generic key <T>, the body admits a template-based script to implement
the source code of the function.
The specialization of the function for a given template instantiation key is then resolved at runtime.
Example:
The template function f inserts a new attribute in a tree node. The attribute takes its name from
the instantiation key, and the key is also assigned to the new attribute as its value.
Then, the function calls itself recursively with the instantiation key stripped of its last character.
For instance, the source code of f<"field"> should be:
function f<"field">(x : node) {
insert x.field = "field";
f<"fiel">(x); // cut the last character
}
Code:
//a synonym of f<"">(x : node), terminal condition for recursive calls
function f(x : node) {/*does nothing*/}
function f<T>(x : node) {{
// '{{' announces a template-based script, which
// will generate the correct implementation during the instantiation
insert x.@T@ = "@T@";
f<"@T.rsubString(1)@">(x);
@
// '}}' announces the end of the template-based script
}}
f<"field">(project);
traceObject(project);
Output:
Tracing variable 'project':
field = "field"
fiel = "fiel"
fie = "fie"
fi = "fi"
f = "f"
End of variable's trace 'project'.
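The instantiation mechanism of this example can be mimicked with plain string formatting. The Python sketch below is an analogy, not CodeWorker code: instantiate_body and instantiation_chain are hypothetical helpers, and the whitespace of the generated body is simplified.

```python
def instantiate_body(key):
    """Text that the template-based script would produce for f<key>."""
    # '@T@' is replaced by the key, '@T.rsubString(1)@' by the key
    # without its last character.
    return 'insert x.%s = "%s";\nf<"%s">(x);' % (key, key, key[:-1])


def instantiation_chain(key):
    """Keys instantiated in turn, stopping before the empty key handled by f(x)."""
    chain = []
    while key:
        chain.append(key)
        key = key[:-1]
    return chain


print(instantiate_body("field"))
print(instantiation_chain("field"))
```

The chain lists the attribute names in the order they appear in the trace of project.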
2.6.5 Methods
For more readability, the language offers a syntactic facility to call a function on a node as if
the function were a method of that node. For example, the function
leftString can be called on the node a like this:
a.leftString(2), instead of the classical functional form:
leftString(a, 2).
The rule is that every function (user-defined included) whose first argument is passed either
by value or by node or by index (but never by reference) can
be called as a method.
In that case, the method call applies to the first argument, which has to be a node.
The BNF representation of a method call is:
method-call ::= variable '.' function-name '(' parameters ')'
parameters ::= expression [',' expression]*
where parameters omits the first argument of the function named function-name.
There are some exceptions where the method doesn't apply to the first argument:
- findElement applies to the second argument,
- replaceString applies to the third argument.
The following methods offer a synonym for the function name:
- empty is the method synonym of the function isEmpty,
- length is the method synonym of the function lengthString,
- size is the method synonym of the function getArraySize.
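The rewriting of a method call into its functional form, including the exceptions and synonyms listed above, can be summarized as a small table lookup. The Python sketch below only illustrates that rule: to_function_call is a hypothetical helper working on the text of the call, not part of CodeWorker.

```python
# 0-based argument position that receives the node, for the listed exceptions.
RECEIVER_POSITION = {"findElement": 1, "replaceString": 2}
# Method names that are synonyms of a differently named function.
SYNONYMS = {"empty": "isEmpty", "length": "lengthString", "size": "getArraySize"}


def to_function_call(receiver, method, args):
    """Return the functional form equivalent to receiver.method(args...)."""
    function = SYNONYMS.get(method, method)
    position = RECEIVER_POSITION.get(function, 0)  # first argument by default
    full_args = list(args)
    full_args.insert(position, receiver)
    return "%s(%s)" % (function, ", ".join(full_args))


print(to_function_call("a", "leftString", ["2"]))          # leftString(a, 2)
print(to_function_call("a", "length", []))                 # lengthString(a)
print(to_function_call("list", "findElement", ['"key"']))  # findElement("key", list)
```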
2.6.6 The 'readonly' hook
The BNF representation of this statement is:
readonlyHook-statement ::= "readonlyHook" '(' filename ')'
compound-statement
The token filename is the argument name that the user chooses for passing the name
of the file to the body of the hook.
This special function implements a hook that is called each time a
read-only file is encountered while generating the output file through the
generate or expand instruction.
Limitations: only one declaration of this hook is authorized, and it can't be declared inside
a parsing or pattern script.
Example:
Common usage: the file to generate has to be checked out from a source code control system
(see the system command to run executables).
readonlyHook(sFilename) {
if !getProperty("SSProjectFolder") || !getProperty("SSWorkingFolder") || !getProperty("SSExecutablePath") || !getProperty("SSArchiveDir") {
traceLine("WARNING: properties 'SSProjectFolder' and 'SSWorkingFolder' and 'SSExecutablePath' and 'SSArchiveDir' should be passed to the command line for checking out read-only files from Source Safe");
} else {
if startString(sFilename, getProperty("SSWorkingFolder")) {
local sourceSafe;
insert sourceSafe.fileName = sFilename;
generate("SourceSafe.cwt", sourceSafe, getEnv("TMP") + "/SourceSafe.bat");
if sourceSafe.isOk {
putEnv("SSDIR", getProperty("SSArchiveDir"));
traceLine("checking out '" + sFilename + "' from Source Safe archive '" + getProperty("SSArchiveDir") + "'");
local sFailed = system(getEnv("TMP") + "/SourceSafe.bat");
if sFailed {
traceLine("Check out failed: '" + sFailed + "'");
}
}
} else {
traceLine("Unable to check out '" + sFilename + "': working folder starting with '" + getProperty("SSWorkingFolder") + "' expected");
}
}
}
2.6.7 The 'write file' hook
This special function implements a hook that is called just before writing a
file, at the end of a text generation process such as expanding, generating or translating
text.
It is important to note that it returns a boolean value. A true value means that
the generated text must be written to the file. A false value means that the
generated text must not be written to the file.
CodeWorker always treats a function that doesn't explicitly return a value as
returning an empty string, which is read as false. If you forget to return a value, the generated text will not be
written to the file!
The BNF representation of this statement is:
writefileHook-statement ::= "writefileHook" '(' filename ',' position ',' creation ')' compound-statement
| Argument | Type | Description |
| filename | string | The argument name that the user chooses for passing the file name to the body of the hook. |
| position | int | The argument name that the user chooses for passing the position where a difference occurs between the newly generated version of the file and the previous one. If the files don't have the same size, the position is -1. |
| creation | boolean | The argument name that the user chooses for passing whether the file is created or updated. The argument is true if the file doesn't exist yet. |
Limitations: only one declaration of this hook is authorized, and it can't be declared inside
a parsing or pattern script.
Example:
writefileHook(sFilename, iPosition, bCreation) {
if bCreation {
traceLine("Creating file '" + sFilename + "'!");
} else {
traceLine("Updating file '" + sFilename + "', difference at " + iPosition + "!");
}
return true;
}
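The effect of a missing return statement can be illustrated with a small model. The Python sketch below is an analogy, not CodeWorker code: hook_allows_write is a hypothetical helper, and it assumes that a true value is represented by a non-empty string while an empty string means false.

```python
def hook_allows_write(hook, filename, position, creation):
    """Decide whether the generated text is written, from the hook's result."""
    result = hook(filename, position, creation)
    if result is None:   # the hook didn't return a value explicitly
        result = ""      # ... which is read as an empty string
    return result != ""  # assumption: an empty string means false


def explicit_hook(filename, position, creation):
    return "true"  # the generated text must be written


def forgetful_hook(filename, position, creation):
    # Traces the update but forgets to return a value.
    print("Updating file '" + filename + "'")


print(hook_allows_write(explicit_hook, "out.txt", -1, True))
print(hook_allows_write(forgetful_hook, "out.txt", 12, False))
```

The second hook silently prevents the file from being written, which is why the example above ends with return true.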
2.6.8 The 'step into' hook
This special function is called automatically just before the extended BNF engine
resolves the production rule of a BNF non-terminal. Combined with stepoutHook(),
it is very useful for trace and debug tasks.
This hook can be implemented in parse scripts only.
The BNF representation of this statement is:
stepintoHook-statement ::= "stepintoHook" '(' sClauseName ',' localScope ')' compound-statement
| Argument | Type | Description |
| sClauseName | string | The name of the non-terminal. |
| localScope | tree | The scope of the parameters used in the production rule. |
2.6.9 The 'step out' hook
This special function is automatically called once the extended BNF engine has finished
the resolution of a BNF non-terminal. Combined with stepintoHook(), it is very
useful for trace and debug tasks.
This hook can be implemented in parse scripts only.
The BNF representation of this statement is:
stepoutHook-statement ::= "stepoutHook" '(' sClauseName ',' localScope ',' bSuccess ')' compound-statement
| Argument | Type | Description |
| sClauseName | string | The name of the non-terminal. |
| localScope | tree | The scope of the local variables and parameters used in the production rule. |
| bSuccess | boolean | Whether the resolution of the production rule has succeeded or not. |
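The way the two hooks bracket the resolution of a non-terminal can be pictured with a small tracing model. The Python sketch below is an analogy, not CodeWorker code: the Tracer class and the resolve helper are hypothetical, and the two rules are trivial stand-ins for production rules.

```python
class Tracer:
    """Collects an indented trace from step-into and step-out notifications."""

    def __init__(self):
        self.depth = 0
        self.lines = []

    def step_into(self, clause_name, local_scope):
        self.lines.append("  " * self.depth + "-> " + clause_name)
        self.depth += 1

    def step_out(self, clause_name, local_scope, success):
        self.depth -= 1
        status = " ok" if success else " failed"
        self.lines.append("  " * self.depth + "<- " + clause_name + status)


def resolve(clause_name, rule, tracer):
    """Resolve one non-terminal, calling the hooks before and after."""
    local_scope = {}
    tracer.step_into(clause_name, local_scope)
    success = rule(local_scope)
    tracer.step_out(clause_name, local_scope, success)
    return success


tracer = Tracer()


def digit(scope):
    return True  # stand-in for a rule that always matches


def number(scope):
    return resolve("digit", digit, tracer)  # number ::= digit


resolve("number", number, tracer)
print("\n".join(tracer.lines))
```

The indentation grows on each step into and shrinks on each step out, so nested non-terminals appear as nested blocks in the trace.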
2.7 Statement's modifiers
A statement's modifier is a directive that stands just before a statement, that is, an instruction
or a compound statement.
This directive performs some action in the scope of the statement and then restores the
previous behaviour.
This action may be:
- to measure the time consumed by the execution of the statement,
- to redirect into a variable all messages intended for the console during the
execution of the statement,
- to push a new project parse tree,