EK9 Basics

Please take a look at the introduction if you are looking to understand what EK9 is. Review structure for an overview of the file structure and the constructs involved. This document goes into various details on how EK9 approaches variables, types and collections.

Getting Started

Symbols and Punctuation

As mentioned in the introduction, EK9 is indentation based. Rather than using the { } symbols to mark out blocks, indentation with spaces is used. The { } symbols are used for Dictionary (Map) initialisation.

Line Termination

The ; (semi-colon) is not used anywhere at all and is not needed as a line terminator.

The < > combination is only used for comments and not for any sort of generics/template declarations. Importantly, there are no in-line lambdas as such, so the symbol combinations -> and => are not used in that context.

Accepting Parameters

The -> symbol is used to denote incoming parameters and the <- symbol is used to mean 'coming out of', 'returning a value' or defining, inferring and initialising a constant/variable (see declarations and type inference). When calling methods or functions, parameters are passed using () (parentheses) as with most programming languages. It is only where parameters are accepted or returned in methods or functions that -> and <- are employed.

Parameter passing examples.

#!ek9
defines module introduction
  defines function
    
    function1()
      -> n Integer
      <- sum Integer: 0      
      ...
      
    function2()
      ->
        n1 Integer
        n2 Integer
      <-
        sum Integer: 0
      ...
//EOF

In general, if passing in one parameter, the single line input (-> n Integer) is preferred (function1). If there are two parameters (function2), then the -> should appear on a line by itself and the parameters should appear on consecutive lines, indented.

While it is not mandatory, the returning parameter (if any) should follow the same layout as the incoming parameter(s), i.e. single line (function1) or indented form (function2). There can only be zero or one named return parameter; multiple return values are not supported. However, as all parameters are passed by reference, it is possible to alter the value of incoming parameters by 'copying' data into them. This is a little like 'inout' parameters in some languages.

Operators

EK9 uses a range of symbols and symbol combinations for operators. This is intended to drive consistency in common areas, such as the addition of two items and the addition of an item to a collection. This is more in line with the C++ and Scala languages and is a move away from adding various inconsistent method names with very different semantics to Object types. Care has to be taken when overriding and using operators, as excessive or misleading/inappropriate use can lead to confusion and code that is hard to understand/maintain.

Objects

There are no primitive types in EK9 - this means that everything is an Object. This is quite an important point because most languages do have primitive types and so when values are passed to methods or functions they are passed by value. In EK9 all values are effectively passed by reference.

So what does this mean? A primitive int or float in most languages always has a value, even if it is just declared without initialisation (0 and 0.0f respectively). An amount of memory has been allocated to the variable for it to store the value.

So this means that a primitive always has memory allocated and there can never be a null pointer issue? You might think so, but in languages that support auto boxing, errors can still occur where the Object form of a value is automatically converted (unboxed) into a primitive.

Auto-Boxing (not supported)
  //If auto boxing was supported
  //primitiveInt would be initialised as 0
  primitiveInt as int
  //objectInt would not have any memory allocated to hold its value
  objectInt as Integer
  //So what happens here?
  primitiveInt = objectInt
  //Some type of null exception will occur
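
For comparison, the pseudo-code above maps quite directly onto Java, which is one of the languages that does support auto boxing. The sketch below is Java and not EK9; the class and method names are made up for illustration. It shows that assigning an unset Integer object to a primitive int compiles cleanly but fails at run time.

```java
final class AutoBoxingDemo {
    /** Returns true if unboxing a null Integer throws a NullPointerException. */
    static boolean unboxingNullThrows() {
        Integer objectInt = null;          // object form with nothing allocated to hold a value
        try {
            int primitiveInt = objectInt;  // the compiler inserts objectInt.intValue() here
            return false;                  // not reached: the line above throws
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unboxing null throws: " + unboxingNullThrows());
    }
}
```

This is exactly the class of error that EK9 removes by not having primitives at all.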

Consistency

In general when working in an Object Oriented or Functional Programming model it's best to be consistent. For example all the EK9 collection types work with any type of Object and even functions can be passed as Objects (delegates). This is the main reason EK9 does not have primitives - therefore no auto boxing is ever used and therefore that sort of null pointer error can never occur.

The downside is that Objects are much more heavyweight than pure primitive types, which could easily be held in a CPU register. This just means that the code generation phase has to work much harder to produce efficient executable instructions. But from a programmer/developer point of view, there is consistency.

Just think in terms of Objects and not low level primitives. There are no arrays, but there are Lists and Dictionaries, and to access the contents of these you can use Iterators but also Stream pipelines as outlined in the introduction.

Declarations

The standard approach in EK9 to declare a variable or a constant is to always allocate it some memory so that it always "points somewhere". This means that a variable (like an Integer) can be declared but, importantly, it can be declared as not having any meaningful value.

This may seem like a trivial, unimportant or even nonsensical thing to do. Why declare something with no meaning? See the example below.

Variable declarations
  //Declare an Integer - but with no value yet known.
  age <- Integer()
  //Declare an Integer - but with a known value from the outset.
  minimumAge <- 21

In the above example a variable of type Integer has been declared; the variable is called age. If we assume the program we are about to write is going to ask the user for their age, then at this point in time we don't yet know what that value will be. Normally you would have to use a value like -1 or some other sentinel to indicate that the value had not yet been set or, if you were to use an Object version, you might have left it as null.

With EK9 there is no need for this. Just declare age as above; it has memory allocated but is noted as having no meaningful value. The variable minimumAge is declared with a value that is known from the outset and, importantly, the type of minimumAge has been inferred as an Integer.
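
To make the idea of 'allocated but with no meaningful value' more concrete for readers coming from other languages, here is a rough Java model of it. This is only a sketch and says nothing about how EK9 itself is implemented: MaybeInteger and its methods are hypothetical names invented for this illustration, and isSet() merely stands in for the 'is set' check described later in this document.

```java
final class AgeDemo {
    /** Hypothetical holder: always allocated, but may hold no meaningful value. */
    static final class MaybeInteger {
        private int value;
        private boolean set;

        MaybeInteger() { }                          // like the unset 'age' above
        MaybeInteger(int value) { assign(value); }  // like 'minimumAge' above

        void assign(int newValue) { value = newValue; set = true; }

        boolean isSet() { return set; }

        int get() {
            if (!set) throw new IllegalStateException("no meaningful value");
            return value;
        }
    }

    public static void main(String[] args) {
        MaybeInteger age = new MaybeInteger();
        MaybeInteger minimumAge = new MaybeInteger(21);
        System.out.println(age.isSet());         // false
        System.out.println(minimumAge.isSet());  // true
        age.assign(34);
        System.out.println(age.isSet() && age.get() >= minimumAge.get()); // true
    }
}
```

Note that no sentinel such as -1 and no null reference is needed: the variable always refers to an object, and that object knows whether it has been given a value.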

It is also possible to declare the same value like this (below) if you prefer. In this case you are being explicit as to what the type is. This becomes important when you want a variable to be of a specific type (typically in class, component, trait or function hierarchies).

  //Declare an Integer - but with no value yet known.
  age as Integer: Integer()
  //Or like this.
  age as Integer = Integer()
  //Or even like this.
  age as Integer := Integer()
  //Declare an Integer - without type inference
  minimumAge as Integer := 21

It is also possible to declare variables as shown below. This sort of syntax should really only be used when receiving parameters on methods/functions. It is best avoided when declaring local variables or fields/properties on records, classes and components, as it is very likely to lead to some sort of error. An exception would be when you are dealing with an abstract type/trait/class, as it cannot be instantiated.

  //Declare an Integer - but no space allocated
  possibleError as Integer

Why use different ways to declare a variable/parameter, you may ask? In short, it is a mix of history, a desire for type inference and the need for polymorphic variables. Always prefer the following forms where possible.

  //Local variable declaration and initialisation.
  variableName <- TypeConstructor()
  //Polymorphic variable declaration and initialisation.
  variableName as SuperType: SubTypeConstructor()
  //Preferred field/property declaration and initialisation.
  fieldName as Type: TypeConstructor()

History

There are programming constructs and technologies, such as CSS and JSON, where the ':' is used extensively for assignment when data elements are initialised. Not only is this very easy on the eye, it also means that many programmers have become used to this syntax.

However HTML uses the '=' for assignment to properties in UI elements and most programming languages use '=' for assignment of variables, so again there will be many programmers that are used to using '=' for assignment.

So why add ':=' as well? Both Pascal and Ada (as well as other languages) use ':=' for assignment, but more importantly it brings consistency with the other operators listed below. Note that ':=' also looks and feels stronger as an assignment than either ':' or '=' separately. Use the assignment operator you feel most happy with, but try to be consistent.

There are more operators available, but those above all involve = either in the form of assignment or in the form of testing equality. So adding := does bring some form of two character consistency (in a way). EK9 also provides a range of coalescing operators, which work well with the 'is set' nature of EK9 variables.

Type Inference

The use of <- is the main mechanism to not only declare a variable but also have its type inferred. As you can see in the example below, the inference mechanism is terse and easy to write and understand.

  minimumAge <- 21
  //Alternatively - which can be useful and must be used for fields/properties
  minimumAge as Integer: 21

So EK9 uses type inference for the consumer of methods, functions and fields but forces the writer of the records, classes, components, methods and functions to fully declare the fields and parameters. This makes it much quicker and easier for the programmer to determine the types they are dealing with. They also get the speed and clarity of type inference when writing bodies of code, which is where the bulk of code is written.
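
Readers who know Java may recognise a similar split there: since Java 10, 'var' can be used to infer the type of a local variable, but method signatures and fields must still be fully declared. The sketch below is Java (assuming Java 10 or later), not EK9, and its names are invented for illustration; it only serves to show the same division between declared signatures and inferred bodies.

```java
import java.util.List;

final class InferenceDemo {
    // The signature is fully declared, so a reader sees the types immediately.
    static int totalLength(List<String> words) {
        var total = 0;                // inferred as int
        for (var word : words) {      // inferred as String
            total += word.length();
        }
        return total;
    }

    public static void main(String[] args) {
        var words = List.of("A", "BB", "CCC");   // inferred as List<String>
        System.out.println(totalLength(words));  // prints 6
    }
}
```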

Why not just use type inference everywhere?

It was noted when looking at type inference that, while it is possible with a Hindley-Milner type inference mechanism to deduce and infer all the types, developers actually needed to quickly see what type they were dealing with, either in the definition of structured constructs or in the use of such constructs. While modern IDEs can really help in this regard (by showing types via hovers, for example), it became very obvious that it was much quicker and easier for a developer to just follow the method call and immediately see what the type was.

Checking variables

Clearly, if it is possible to create/declare a variable and not set it to any meaningful value, there has to be some way to check it. As mentioned before, with primitives or Object types in some languages it would be necessary to check for -1 (if that denoted not set), or maybe even check if an object was set to null/nil if the language supported that concept (EK9 does not).

EK9 uses the ? (is set) operator in the following way.

  age <- Integer()
  if age?
    //This block would not be executed
  else
    //This block would be the section that is executed

The good thing about the ? operator is that it can be applied to any type of object. EK9 also provides a set of ternary and assignment coalescing operators specifically for dealing with variables that may or may not have meaningful values.

When you develop classes or records you can override the ? operator. This also applies to generics in the following way. While generics have not really been covered yet, they are intrinsic to how EK9 works, and the built-in set of generic types really are first class constructs and so are treated on a par with Integers and Strings, for example.

  //A simple empty list
  aList as List of Integer := List()
  if aList?
    //Code where aList has some meaningful value/content

Incidentally if you have an initial set of values you need in a list then you can use the shorthand below. But there is more on this in the collection types section.

  //A list with some values using the 'list shorthand of' [] - this is not an array!
  bList <- [ "A", "B", "C" ]
  if bList?
    //This block would be the section that is executed

This logic also applies to iterators (another supported generic type). The simple loop example below gets each value via an iterator and converts it to a String to be printed on the standard output (console). But see the introduction for alternatives to using Iterators.

  stdout <- Stdout()
  bList <- [ "A", "B", "C" ]
  iteratorB <- bList.iterator()
  while iteratorB?
    stdout.println("Value is [" + iteratorB.next() + "]")

  //The stream pipeline construct is cleaner though
  cat bList > stdout
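
As a point of comparison for readers more familiar with Java, the same two styles (explicit iterator versus pipeline) look roughly like this. This is a Java sketch with invented names, not EK9; the hasNext() call plays the role that the ? operator plays on the EK9 iterator above, and the stream version applies the same formatting as the loop so the two results can be compared.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;

final class IterationDemo {
    // Explicit iterator: keep going while the iterator still has a value.
    static List<String> viaIterator(List<String> items) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = items.iterator();
        while (it.hasNext()) {
            out.add("Value is [" + it.next() + "]");
        }
        return out;
    }

    // Pipeline style: no iterator variable to manage.
    static List<String> viaStream(List<String> items) {
        return items.stream()
                .map(v -> "Value is [" + v + "]")
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> bList = List.of("A", "B", "C");
        viaIterator(bList).forEach(System.out::println);
        viaStream(bList).forEach(System.out::println);
    }
}
```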

In summary then, it is always best to declare a variable and assign it some space to hold a value, even if that value is not yet known.

Accept, for example, that a Boolean can really have three states: true, false and, importantly, not yet known.

Assignment

The declarations above really combine the declaration of a variable name, its type (sometimes inferred) and its initial value into a single statement. In general, the creation of the variable and its assignment is always best done in one go like this. The easy <- syntax does make this approach more likely to be adopted (it is just less typing).

However, there are times when you will already have a variable and just want to assign it a new value. The example below shows that syntax. But review ternary operators, as there is a nice syntax that makes a single assignment much more obvious.

  //Declare charges and initialise
  charges <- 20.99
  //Use charges for some processing
  ...
  //Now set the same variable to a new value
  charges := 50.99
  //Use updated charges for some processing

Note that the assignment was done using ':=', but it could equally have been done with ':' or '='. Depending on your point of view, the reassignment of a variable to a new value could be considered quite significant. Indeed, some developers would say you should never reassign a variable. Let's see what that would look like using the other assignment operators.

  //Declare charges and initialise
  charges <- 20.99
  ...
  charges: 50.99
  charges = 50.99
  //Using := feels more significant
  charges := 50.99

EK9 does not stop you reassigning variables unless the block of code you do it in is marked as pure, in which case only return values can be reassigned. See the section on pure for more details. Pure is the main reason EK9 does not use keywords like const, final or static. It is a much stronger statement if developers are looking for immutability in their code.

While these keywords would provide very fine grained control over variables, EK9 takes the approach of forcing the whole block (function/method) to be pure. If the developer has concerns in this area they feel strongly about then adopting a more pure approach is the way to go. For most developers this might feel a little extreme.

Comments

There are four types of comments available in EK9 source code; these are shown below. Please note that the C/C++/C#/Java style of comment /* */ is not supported.

One Line Comments

A single line comment is shown below.

  //A one line comment
  variable <- 21 //Comments can be at the end of a line if necessary

The single line comment starts with '//' and continues until the end of the line.

Block Comments

There are two types of general block comments, which follow the HTML/XML coding standards as shown below.

  <!--
  These lines will be contained
  within a comment block.
  -->

The block comment below is an alternative and more consistent mark up.

  <!-
  These lines will be contained
  within a comment block, just like above.
  -!>

Documentation Comments

This comment type is really only used for documentation comments.

  <?-
    Just sums the squares of values from 1 to n.
  -?>

Pure Keyword

Whilst not a mainstream approach for developers with an Object Oriented or Procedural background, the concept of controlling or disallowing variable reassignment is a key element for those with a Functional Programming background. The idea is attractive in many ways and can lead to code that is much less error prone.

It is not without cost, however. Sometimes the solution approach you would have traditionally used cannot be applied, and you will need to rethink your approach to a problem and come up with an alternative solution.

Below is an example of a function that has been marked as pure. To a 'purist' functional programmer this would not qualify as being pure, as there are a couple of explicit and implicit reassignments.

#!ek9
defines module introduction
  defines function
    <?-
      Just sums the squares of values from 1 to n.
    -?>
    sumOfSquares() as pure
      -> n Integer
      <- sum Integer: 0      
      for i in 1 ... n
        sum += i*i
 
    //An alternative solution using stream pipeline
       
    <?-
      Square of val.
    -?>    
    square() as pure
      -> val as Integer
      <- squared as Integer: val * val
      
    streamingSumOfSquares() as pure
      -> n Integer
      <- sum Integer: for i in 1 ... n | map with square | collect as Integer
 
//EOF
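
For readers who want to compare with a more familiar language, the two EK9 functions above correspond roughly to the Java sketch below. This is Java, not EK9, and it assumes that the EK9 range 1 ... n includes n, in line with the 'from 1 to n' documentation comment. Java has no equivalent of pure; marking the parameter as final is the nearest per-variable analogue, whereas EK9 applies the restriction to the whole function.

```java
import java.util.stream.IntStream;

final class SumOfSquaresDemo {
    // Loop form: only the value being returned is updated.
    static int sumOfSquares(final int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i * i;
        }
        return sum;
    }

    static int square(int val) {
        return val * val;
    }

    // Pipeline form: no explicit reassignment in the body at all.
    static int streamingSumOfSquares(final int n) {
        return IntStream.rangeClosed(1, n).map(SumOfSquaresDemo::square).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));           // prints 30
        System.out.println(streamingSumOfSquares(4));  // prints 30
    }
}
```

Because n is declared final here, a statement such as n = 20 inside either method would be rejected by the Java compiler.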

But here is the advantage that marking something as pure has: the following code would result in a compiler error (because there is an assignment to an incoming parameter variable). The stream pipeline offers an alternative approach which is more functional in nature.

  ...
  sumOfSquares() as pure
    -> n as Integer
    <- sum Integer: 0
    n := 20
    for i in 1 ... n
  ...

It's important to know that pure can only call pure. If you have some existing functionality that is not marked as pure you cannot call it from your pure function! The converse is not true however!

Visibility

In EK9 there are several concepts of visibility that are important to understand; these are outlined below. Each of the different constructs has different capabilities, roles and purposes; as such, the level of visibility of fields/properties and methods varies from construct to construct. While this may seem confusing, there is a purpose to these different levels of visibility per construct.

Module Visibility

All constructs that are defined in a module (defines module introduction) are visible to each other, so the function sumOfSquares (above) would be visible to any other function or construct in the same module.

References

However if you need (and you will) to access constructs from other modules there are two ways you can do this. The first way is shown below and it is to use a references statement. Here a constant of PI is defined in a module namespace called net.customer.geometry.

#!ek9
module net.customer.geometry
  defines constant
    PI <- 3.142
//EOF

Let's assume that the developer now needs to access PI from another module namespace called com.solutions.areas and wants to define a function to calculate the area of a circle.

#!ek9
module com.solutions.areas
  references
    net.customer.geometry::PI
    
  defines function
    areaOfCircle()
      -> diameter as Float
      <- result as Float: PI * (diameter/2)^2
//EOF

By using the references statement above, the constant PI can be used within the module. The alternative mechanism to use PI is shown below.

#!ek9
module com.solutions.areas
  defines function
    areaOfCircle()
      -> diameter as Float
      <- result as Float: net.customer.geometry::PI * (diameter/2)^2
//EOF

Here the fully qualified name of the constant PI is used. Why have two mechanisms to do the same thing? There are cases (PI would be a bad example) where you have a construct, say a function, that has the same name as one in a different module. You cannot reference it because the names would clash, so the only way to access it is to use the fully qualified name. Clearly this is less convenient (especially if you want to use it in multiple places), so take care in the naming of constructs and also in how you link them together.
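
The same situation arises in other languages. As a loose analogy only, the Java sketch below (not EK9; the class and method names are invented) uses two standard library classes that share the simple name Date. One can be imported and used by its short name; the other then has to be written with its fully qualified name.

```java
import java.util.Date;

final class QualifiedNameDemo {
    static String describe() {
        Date utilDate = new Date(0L);                   // imported, so the short name works
        java.sql.Date sqlDate = new java.sql.Date(0L);  // clashing name, so fully qualified
        return utilDate.getClass().getName() + " / " + sqlDate.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(describe());  // prints java.util.Date / java.sql.Date
    }
}
```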

The same syntax is used for components, records, types, traits, classes and functions. But note there are no wildcards: you must reference every single construct that you need from other packages. This ensures that you are careful in packaging and re-use. Moreover, everything can be referenced; there is no mechanism to limit access to items that are defined in modules. This is aimed at simplicity. Other languages have mechanisms that enforce hiding of constructs, but EK9 does not support this.

The main focus of encapsulation in EK9 on a large scale is the component; other constructs are considered to be building blocks to make the components. Clearly there is nothing stopping a developer from using specific namespaces to denote internal constructs that should not be exposed, for example com.geometry.internal, but this is a naming convention and is not enforced by the compiler.

Function Visibility

Just like constants, all functions have public visibility. But again, just like constants, if they are to be addressed outside the module they are defined in then they must be either referenced or addressed by their fully qualified name: com.solutions.areas::areaOfCircle in the example of the function outlined above.

As an aside functions must have a unique name within a module, you cannot overload functions with the same names but different parameters (unlike methods that do support this). You can have a function of the same name in a different module namespace however.

Record Field Visibility

Record field/property visibility is always public. This means that when a record variable is accessible, all its fields are directly readable and can also be modified. As records only support fields and operators, and there are no methods on records, method visibility is not relevant for records.

Class Field/Method Visibility

Class field/property visibility is always private. This means that classes that extend other classes do not have any access to fields/properties in their super classes. It is not possible to create a field/property in a class that is protected or public. The inner workings of a class are private to that class. It is possible to write accessor methods to expose such inner workings. Clearly this is undesirable from an Object Oriented encapsulation point of view, but it can be done and the compiler will not stop it.

All operators are public and can only be public.

Class methods have the most variety of visibility, these values can be one of:

Trait Method Visibility

Trait method visibility is always public. As traits do not have fields/properties, field visibility is not relevant.

Component/Field/Method Visibility

Just like classes, component fields/properties are always private. As there are no operators on components, their visibility is not relevant.

Component methods have the following visibility:

There is no concept of protected access in components, they are intended to be composed of other building blocks such as classes, functions and other components and extension should be through composition and not inheritance. If you find yourself needing complex inheritance hierarchies with components you should look to pull that functionality down into classes. Moreover you should probably consider using some of the composition mechanisms available in classes to make them more reusable.

You may consider this advice rather opinionated and in some ways it is. EK9 has many more specific constructs than other languages; these are designed and intended to work in concert with other constructs in specific ways. Go 'with the grain' of this, going against the grain will not result in a good outcome. If you find this approach too constraining/limiting/restrictive or irritating see the other languages section - you may find a language there that you are more suited to.

Parameter Visibility

Clearly parameters passed in to a function or method must be visible in terms of being read and used, but their reassignment or alteration can be controlled via the use of the pure keyword - see the pure modifier for more details.

If you recall in the section on objects, parameters are always Objects and hence are always passed by reference. This means that it is possible to modify their internal state (subject to the pure modifier), hence all parameters can be in out parameters. This has major implications (both good and bad), it means that if you want to return multiple values from a function or method, you can pass parameters that you intend to be modified and updated. But it also means it is possible you may inadvertently modify an incoming parameter that you did not intend to.

Just using the ':', '=' or ':=' assignment operators would not modify the internal state of an object, as it would only alter the location that the variable in the function or method refers to. The original variable would remain as is. But using an operator like '+=' or '++' would actually alter the internal state of the variable. In addition, the copy operator ':=:' will alter the internal state, as would the merge operator ':~:'.
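
The distinction between reassigning a parameter and altering the object it refers to is the same one found in other object based languages. The Java sketch below (not EK9; the names are invented for illustration) shows it: reassignment inside the method is invisible to the caller, whereas a mutating call, which is comparable in effect to the EK9 '+=' described above, changes the caller's object.

```java
final class ParameterDemo {
    // Reassignment: only the local reference changes; the caller's object is untouched.
    static void reassign(StringBuilder text) {
        text = new StringBuilder("replaced");
    }

    // Mutation: the object the caller passed in is altered.
    static void mutate(StringBuilder text) {
        text.append(" world");
    }

    public static void main(String[] args) {
        StringBuilder greeting = new StringBuilder("hello");
        reassign(greeting);
        System.out.println(greeting);  // prints hello
        mutate(greeting);
        System.out.println(greeting);  // prints hello world
    }
}
```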

So by adopting an Object only approach and not supporting any pass by value semantics EK9 does open the door to potential errors here, but as most real world software always involves complex aggregates rather than just a few primitive types (like int or float) this is an issue that already has to be carefully managed.

Next Steps

To learn more about the language itself, in terms of what operators exist on built-in types and how to provide implementations in your own classes/records, take a look at the operators next.