 # Chapter 4 - Types and Expressions

In this chapter we’ll cover most of the types and the expressions that operate on them. Let’s get started with the simple types; types that can exist on their own.

## Simple Types

### Integers

You’ve been introduced to one of the integers already: `i32`. The ‘i’ stands for ‘integer’. The number after the ‘i’ is how many bits (zeros and ones) represent the integer, so it determines the range of numbers that can be stored. For instance, the `i8` is 8 bits (1 byte) in size, and stores the integer range -128 to 127. If you add one to an `i8` holding a value of 127, it is said to ‘overflow’, and wraps around to -128. Here are all the signed integers:

``````i8  min:-128 max:127
i16 min:-32,768 max:32,767
i32 min:-2,147,483,648 max:2,147,483,647
i64 min:-9,223,372,036,854,775,808 max:9,223,372,036,854,775,807
``````

Those are the ‘signed’ integers, the sign being the minus sign. There’s another kind of integer: the ‘unsigned’ integer. As the name suggests, unsigned integers don’t have a sign; they can’t hold negative values.

``````u8  min:0 max:255
u16 min:0 max:65,535
u32 min:0 max:4,294,967,295
u64 min:0 max:18,446,744,073,709,551,615
``````
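The overflow behaviour described earlier can be tried directly. This is a minimal sketch, assuming that `++` on an `i8` wraps around as the text says; the variable name is only for illustration.

```volt
import watt.io;

fn main() i32
{
	a: i8 = 127;  // The largest value an i8 can hold.
	a++;          // One more than the maximum: the value overflows.
	writeln(a);   // Should show -128 if overflow wraps as described above.
	return 0;
}
```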

#### Integer Literals

A series of digits on their own is an integer literal:

``````123
46
``````

If it’s preceded by `-` it’s a negative value:

``````-4
-42
``````

Integer literals are signed by default, but if you put `U` on the end they will be given an unsigned type:

``````32U
56U
``````

Underscores are ignored, which can help with larger numbers:

``````1000_000_000  // same as '1000000000'
3_2  // same as '32'
``````

A literal starting with `0x` is a hexadecimal literal:

``````0xFF  (same as 255)
``````

A literal starting with `0b` is a binary literal:

``````0b10  (same as 2)
``````
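As a quick check of the literal forms above, the following sketch compares each form against its plain decimal equivalent. It assumes only the `assert` statement used later in this chapter.

```volt
fn main() i32
{
	// Each assert compares two spellings of the same value.
	assert(1_000 == 1000);  // Underscores are ignored.
	assert(0xFF == 255);    // Hexadecimal literal.
	assert(0b10 == 2);      // Binary literal.
	return 0;
}
```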

### Floating Point

Floating point types represent numbers with a fractional part. You can’t represent pi as an integer, for instance, so that’s where the floating point types come in. Volt has two, and like integers they are distinguished by their size in bits.

``````f32
f64
``````

Floating point types use the same arithmetic expressions as integers, so `5.0 / 2.0` gives the result `2.5`. Floating point is a big topic, and the finer details deserve their own document. If you’re interested, David Goldberg’s ‘What Every Computer Scientist Should Know About Floating-Point Arithmetic’ is well worth reading.

#### Floating Point Literals

The simplest form of floating point literal is just a number with a decimal point in it:

``````0.0
``````

Floating point literals are double precision (`f64`) unless they end with `f`:

``````0.0f
``````

### Characters

Character types hold little pieces of text. There are three:

``````char   1 byte wide, for UTF-8
wchar  2 bytes wide, for UTF-16
dchar  4 bytes wide, for UTF-32
``````

#### Character Literals

A character in single quotes (`'`) is a character literal:

``````'a'
'2'
``````

The ‘escape’ character is a backslash (`\`). It lets you insert various special characters that you can’t type:

``````\' The ' character
\\ The \ character
\0 An ASCII nul (a zero)
\n A newline character
\r Carriage return
\t A tab
``````

### Null

`null` is a special kind of value. For several reference types (like pointers, classes, and arrays), `null` represents an uninitialised value.

``````ptr: i32* = null;
// *ptr = 2;  // Crash!
obj: MyObject = null;
// obj.someMethod();  // Crash!
``````

Note that in the examples above, assigning to `null` is redundant: both pointers and class instances default to a value of `null` unless otherwise assigned to.

One interaction of `null` that is worth noting is with array types. `null`, just as before, represents an uninitialised value:

``````arr: i32[] = null;
// arr[0] = 3;  // Crash!
``````

The main difference is that an uninitialised array can be operated on by several array operations without error. You can check the `length` field of an array initialised with `null` (`0`), and you can check the `ptr` field too (naturally, `null`). There are also operations that are valid with an empty array; concatenation (the `~` operator combines two arrays into one array), for example:

``````arr: i32[] = null;  // Again, technically redundant here.
arr ~= 1;  // arr is now [1], with a valid ptr field.
``````

Note that in expressions involving arrays of arrays, `null` has the type of the whole array, not the base type. This might seem obvious given `i32[]`, but given:

``````arr: i32[][];
arr ~= null;
``````

The `length` of `arr` is `0`, not `1`. To see why, simply expand out the expression by hand:

``````arr = arr ~ cast(i32[][])null;
``````

Two empty `i32[][]` concatenated together equals one empty array. If what you want to do is concatenate an empty `i32[]` to a list of `i32[]`s, then you can explicitly `cast` the `null` yourself:

``````arr: i32[][];
arr ~= cast(i32[])null;  // arr.length == 1
``````
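The two cases above can be put side by side. This is a small sketch that uses only the operations already shown in this section, with `assert` checking the lengths the text describes.

```volt
fn main() i32
{
	arr: i32[][];             // Defaults to null, so it is empty.
	assert(arr.length == 0);

	arr ~= null;              // null is treated as an empty i32[][].
	assert(arr.length == 0);

	arr ~= cast(i32[])null;   // An empty i32[] is appended as one element.
	assert(arr.length == 1);
	return 0;
}
```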

### Remaining Primitives

The last two primitives are `bool` and `void`.

`bool` is a boolean data type. It is either `true` or `false`.

`void` is not a type at all, but instead marks the absence of a type.

## Composite Types

These types can only be complete when paired with other composite types, and at least one concrete (a primitive, or user-defined) type.

### Arrays

We used these last chapter. To declare an array of type T, we would write `T[]`.

``````i32[]     an array of i32s.
bool[]    an array of bools.
i16[][][] an array of an array of an array of i16s.
``````

#### Strings

You’ve seen these. They’re arrays of characters.

``````month := "January";  // month is of type string
``````

The type definition of `string` is simple:

``````alias string = immutable(char)[];
``````

`alias` creates a shorthand way of referring to a type. Using `string` or `immutable(char)[]` is the same, and the compiler will generate the same code for either. `immutable` means you can’t change the portion in parens:

``````a: string = "hello";
a[0] = 'H';  // Error!
``````

You can, however, change the array:

``````a: string = "hello";
a = "H" ~ "ello";  // Hello
``````

Volt strings are arrays of UTF-8 code units. Using Unicode correctly deserves a document all of its own, but don’t assume everything is ASCII. It may seem, for example, that the `length` field is counting letters, but that’s not true. The `length` of `"world"` may be `5`, but the `length` of `"世界"` is not `2`, but `6`, because despite being made of `2` characters, it’s `6` bytes of UTF-8. Use the `count` function in `watt.text.utf` if you want to count ‘characters’, and don’t assume that the Nth index into a string will get you the Nth character.
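The difference between bytes and characters can be checked with a few assertions. This sketch assumes that `count` from `watt.text.utf` takes a string and returns the number of characters, as the paragraph above suggests; the variable names are illustrative.

```volt
import watt.text.utf;

fn main() i32
{
	ascii := "world";
	cjk := "世界";
	assert(ascii.length == 5);  // 5 bytes, and 5 characters.
	assert(cjk.length == 6);    // 6 bytes of UTF-8...
	assert(count(cjk) == 2);    // ...but only 2 characters.
	return 0;
}
```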

`\` escapes a string and lets you put special characters into a string:

``````writeln("hello\\\nworld");
``````

would display

``````hello\
world
``````

You can create raw strings if you need to use a lot of backslashes, for regexes and Windows paths:

``````r"C:\Users\Steve"
`C:\Users\Steve`
``````

Three backticks gets you a multiline string:

`````````
hello
world
this
is a multiline
string
```
``````

#### Array Literals

The `[` character denotes the start of the array literal. Values are separated by the `,` character, and the literal ends with the `]` character:

``````[1, 2, 3]
['a', 'b']
``````

### Pointers

A pointer holds the address of another value. To declare a pointer to type T, we would write `T*`.

``````i32*     a pointer to an i32.
bool[]*  a pointer to an array of bools.
char*[]  an array of pointers to chars.
``````

### Associative Arrays

An associative array associates ‘keys’ with ‘values’. To declare an associative array of T values, using J as a key, we would write `T[J]`.

``````aa: bool[i32]  // an associative array of bools keyed by i32
void*[string]  // an associative array of pointers to void keyed by string
``````

#### Associative Array Literals

Like the array literals, associative array literals open with `[`, have values separated by `,`, and end with `]`. The values of an associative array literal have the key and value separated by a `:` character. So `[1:"hello", 2:"goodbye"]` would be a `string[i32]`, that has a value of “hello” associated with a key 1, and so on.
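As a rough sketch, the literal above could be stored and queried like this. The index syntax `greetings[1]` for looking up a key is an assumption here, as is the variable name; the literal itself is the one from the text.

```volt
import watt.io;

fn main() i32
{
	// A string[i32]: string values keyed by i32.
	greetings: string[i32] = [1:"hello", 2:"goodbye"];
	// Assumed lookup syntax: index the array with a key.
	writeln(greetings[1]);
	return 0;
}
```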

## Storage Type

The three storage types are `const`, `immutable`, and `scope`. These can be applied to a type using parens. `const(i32[])*` is a pointer to a const array of `i32`s. Now, a quick overview of the meaning of each of these types.

### Const

A `const` type may not be modified.

``````a: const(i32) = 12;  // A const type may be initialised.
a = 6;  // Error, cannot modify const.
``````

### Immutable

An `immutable` type may not be modified. What separates this from `const` is that an `immutable` type instance cannot be constructed of a type that could be modified. If we think of a variable as a window into memory, a `const` window guarantees that the window marked as `const` will not be used to modify memory, but another window might modify that memory. An `immutable` window, on the other hand, makes the same guarantee that it will not be used to change that memory, but adds a further guarantee: the memory it’s looking at will not be changed by anyone.

``````i: i32 = 12;
ip: const(i32)* = &i;
assert(*ip == 12);
i = 6;
assert(*ip == 6);
``````

Whereas with `immutable`:

``````i: i32 = 12;
ip: immutable(i32)* = &i;  // error, cannot convert i32* to immutable(i32)*.
``````

Short of explicitly going over the type system’s head by `cast`ing away `immutable`, `immutable` values won’t change.

### Scope

A `scope` value can be modified, but it cannot become non-`scope`. A `scope`d type has a list of restrictions on what can be done with it. These restrictions are to prevent a reference to a piece of memory escaping the lifetime of the scope they were declared in, so that the pointer doesn’t become invalid.

That last sentence may have looked like Martian to you. At first, `scope` seems like an excuse for the compiler to yell at you for no good reason, but there is a genuine use to it, so here’s a concrete example.

Say we’re writing an API that takes a pointer to an `i32`, and doubles it (I didn’t say it was a useful API).

``````import watt.io;

fn processInteger(ip: i32*)
{
	*ip = *ip * 2;
}

fn apiUser()
{
	i := 32;
	processInteger(&i);
	writeln(i);  // output '64'
}

fn main() i32
{
	apiUser();
	return 0;
}
``````

As long as the pointer the user passes in is valid when `processInteger` is called, nothing will go wrong. But say you produce version 2.0 of the API, and this version stores pointers, and then processes them in a batch, later on.

``````module test;

import watt.io;

global integers: i32*[];

fn storeInteger(ip: i32*)
{
	integers ~= ip;
}

fn processIntegers()
{
	foreach (ip; integers) {
		*ip = *ip * 2;
		writeln(*ip);  // output ????
	}
}

fn apiUser()
{
	i := 32;
	storeInteger(&i);
}

fn main() i32
{
	apiUser();
	processIntegers();
	return 0;
}
``````

There’s no real way to predict how this program will behave on your machine. On mine, it produces seemingly random numbers, but it could just as easily crash. What happens here is known as ‘stack corruption’, and is the source of many very difficult to debug errors. Let’s break it down.

The `i` variable in `apiUser` is what is known as a ‘stack’ variable. The stack is an area of memory that functions use for temporary variables – the local variables that don’t use the GC, or any other form of memory allocation. Once you `return` from these functions, the memory is free to be reused somewhere else.

But if some code, like `storeInteger`, squirrels away that pointer, and then writes to it, all sorts of evil can happen. What `processIntegers` thinks it’s writing to is the `i` variable – that’s what the pointer was pointing to when `storeInteger` was called, after all. But now, even though the pointer value hasn’t changed, what it’s pointing to has. In fact, it’s likely (but not guaranteed) to be pointing to a point in `processIntegers`, as it was called right after `apiUser`. It’s easy to see how this can lead to bugs, especially in larger programs where it’s not obvious that this is happening. It leads to strange behaviours like functions jumping into the middle of other functions – the programs can limp on for a long time, doing all sorts of damage, before they crash.

Enter `scope`. Suppose the pointer types above were instead marked as `scope`:

``````global integers: scope i32*[];

fn storeInteger(ip: scope i32*)
{
	integers ~= ip;  // Error! Can't escape scope!
}
``````

If the pointer would leave the scope of the current function, e.g. by being assigned to a `global` variable as in the above example, the compiler will complain. This is why inline functions are typed as `scope dg` – they refer to the stack frame of the current function, so storing and calling them outside of that frame can lead to stack corruption.

So in addition to a `scope` value being disallowed from being implicitly converted to a non `scope` value, a `scope` value may not be assigned somewhere outside of the current function, and it may not be `return`ed, even if being `return`ed as a `scope` type.

``````fn storeInteger(ip: scope i32*) scope i32*
{
	return ip;  // Error! Can't escape scope!
}
``````

### Mutable Indirection

The above rules, where a type ‘cannot’ be converted to another, can be relaxed if a type has no ‘mutable indirection’. A type with mutable indirection, such as a pointer or an array, can change the memory it refers to. A purely value type, like an `i32`, or a `struct` that only has `i32` members, is said to have no mutable indirection, and the type system is more lenient when dealing with it.

Compare assigning an immutable array…

``````ia: immutable(i32[]);
ib: i32[] = ia;  // error!
``````

…to assigning an immutable integer:

``````i: immutable(i32);
j: i32 = i;
``````

The above is allowed, because `j` cannot impact `i`’s value in any way.

## Expressions

Expressions perform an operation on one or more values.

### Arithmetic

The simplest expressions are the basic math operations: `+`, `-`, `*`, and `/`, or addition, subtraction, multiplication, and division, respectively.

``````a := 5 + 3;  // 8
b := 5 - 3;  // 2
c := 5 * 3;  // 15
d := 5 / 3;  // 1
``````

If you’ve not dealt with integer math before, you were probably nodding your head right up to the last example there. Integers are whole numbers like `1`, `23`, `0`, or `-42`, with no pesky fractional portions. If you divide two integers with `/`, you will get ‘integer division’ – the fractional portion will be ‘chopped off’ (for positive results, effectively rounding down to the nearest integer, never up):

``````a := 10 / 6;  // 1
b := 2 / 5;   // 0
``````

Dividing by zero will cause your program to crash. If you want to represent a number with its fractional portion intact, you’ll need a floating point number (also known as a real):

``````a := 10.0 / 6.0;  // 1.666666666666667 (ish)
``````

If either side of the division operation is a real, then floating point division will be used, and the type of the operation will either be `f32` or `f64`.
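The contrast between the two kinds of division can be sketched with a few assertions. The values are chosen so that the floating point result is exact.

```volt
fn main() i32
{
	assert(10 / 4 == 2);        // Both sides are integers: the .5 is chopped off.
	assert(10.0 / 4.0 == 2.5);  // Both sides are reals: floating point division.
	assert(10.0 / 4 == 2.5);    // One real side is enough.
	return 0;
}
```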

One other operator that’s not quite as well known, but very useful, is the modulo operator, `%`, which returns the remainder of a division operation.

``````a := 10 % 4;  // 2
b := 10 % 5;  // 0
``````

You’ll often see this used to determine if a value is even or odd:

``````fn isEven(n: i32) bool
{
	return n % 2 == 0;
}
``````
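Integer division and modulo fit together: the quotient times the divisor, plus the remainder, gives back the original number. A small sketch of that relationship, with illustrative variable names:

```volt
fn main() i32
{
	a := 10;
	b := 4;
	q := a / b;  // 2, the whole number of times 4 goes into 10.
	r := a % b;  // 2, what is left over.
	assert(q * b + r == a);
	return 0;
}
```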

### Concatenation

In many languages, if you want to concatenate (stick) two strings together, you would use the `+` operator. Volt uses a separate operator altogether, `~` – the concatenation operator:

``````a := "hello " ~ "world";  // "hello world"
a ~= " nice.";  // "hello world nice."
``````

Note that `~` requires the language runtime to allocate memory for a new string, and concatenating in a loop can be surprisingly slow because of this. If you find yourself doing a lot of string concatenation, the `StringSink` `struct` in `watt.text.sink` is worth using.

### Relational Operators

These return a `bool` value.

`==` is the equality operator. If the two sides of this expression are the same, it returns `true`. Otherwise, it returns `false`.

``````import watt.io;

fn main() i32
{
	str := readln();  // Read a line of input (assumes readln from watt.io).
	if (str == "banana") {
		writeln("you wrote 'banana'");
	} else {
		writeln("you didn't write 'banana'");
	}
	return 0;
}
``````

`!=` is the inequality operator. It’s like `==`, but instead returns `true` if both sides are not the same, otherwise it returns `false`.

``````import watt.conv;
import watt.io;

fn main() i32
{
	writeln("I'm thinking of a number between one and one hundred. What is it?");
	n := toInt(readln());  // Convert the typed line to an i32 (assumes toInt from watt.conv).
	if (n != 32) {
		writeln("you didn't get it!");
	} else {
		writeln("correct!");
	}
	return 0;
}
``````

The `<` and `>` operators return true if the left side is less than or greater than the right side, respectively.

``````import watt.io;

fn main() i32
{
	if (5 > 2) {
		writeln("five is bigger than two");
	}
	if (2 < 5) {
		writeln("two is less than five");
	}
	return 0;
}
``````

Output:

``````five is bigger than two
two is less than five
``````

`<=` returns true if the left side is less than or equal to the right, and `>=` returns true if the left side is greater than or equal to the right.

``````5 >= 4  true
5 >= 5  true
5 >= 6  false
4 <= 5  true
4 <= 4  true
4 <= 3  false
``````

The relational comparison operators function on arrays, too. The first element that differs determines the ordering. If one array is shorter than the other, but equal in other respects, it evaluates as ‘less’ than the longer array. That is to say,

``````"aaaa" < "aaaaaaaa"  true
"b"    < "aaaaaaaa"  false
``````

### Logical Operators

`&&` returns true if both sides are true.

``````true && true    true
true && false   false
false && false  false
``````

`||` returns true if one or more side is true.

``````true || true    true
true || false   true
false || true   true
false || false  false
``````

`!` returns true if it’s applied to a false value, and false if it’s applied to a true one.

``````!true   false
!false  true
``````
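Logical operators are most often used to combine relational tests. As an illustration, here is a sketch of a range check; `inRange` and its parameter names are hypothetical, not part of any library.

```volt
// Returns true if n lies between low and high, inclusive.
fn inRange(n: i32, low: i32, high: i32) bool
{
	return n >= low && n <= high;
}

fn main() i32
{
	assert(inRange(5, 1, 10));    // Both comparisons are true.
	assert(!inRange(11, 1, 10));  // The second comparison fails, so ! makes it true.
	return 0;
}
```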

### Casts

Usually, Volt won’t let us assign a variable to another if the types aren’t the same, unless it knows it’ll fit. For instance,

``````a: i32;
b: i16;
a = b;  // This is okay, as any i16 can fit into the larger i32
b = a;  // Error: certain values of i32 may not fit into i16.
``````

`cast` lets us say “we know what we’re doing, assign anyway.”

``````import watt.io;

fn main() i32
{
	b: u8 = cast(u8)257;
	writeln(b);
	return 0;
}
``````

Output:

``````1
``````

As the maximum value that a `u8` can hold is 255, the cast forces it to truncate the value, and it wraps around.

### Increment and Decrement

The increment and decrement operators increase and decrease a variable’s value by 1 each time.

``````a := 0;
a++;  // a is 1
a--;  // a is 0
``````

They can be used both before and after the variable. These are known as prefix and postfix increment/decrement operators.

``````a := 0;
++a;  // a is 1
a++;  // a is 2
a--;  // a is 1
--a;  // a is 0
``````

There is a difference. The prefix operators return the value modified by the operation:

``````a := 0;
b := ++a;  // b is 1, a is 1
``````

While the postfix operators return the value before modifying it:

``````a := 0;
b := a++;  // b is 0, a is 1
``````

### Bitwise Operators

The bitwise operators perform logical operations on a bit level. See the Wikipedia page on bitwise operations for more detail. These operations operate on types of the same size.

`|` sets a bit if one or both of the bits are set.

``````0b0000 | 0b0001  // 0b0001
0b0001 | 0b0001  // 0b0001
0b0000 | 0b0000  // 0b0000
``````

`&` sets a bit if both of the bits are set.

``````0b0001 & 0b0001  // 0b0001
0b0000 & 0b0001  // 0b0000
0b0001 & 0b0000  // 0b0000
``````

`^` sets a bit if one (and only one) of the bits is set.

``````0b0000 ^ 0b0001  // 0b0001
0b0001 ^ 0b0001  // 0b0000
0b0000 ^ 0b0000  // 0b0000
``````

`<<` shifts the bit pattern left by the number on the right.

``````0b0001 << 1  // 0b0010
0b0001 << 2  // 0b0100
``````

`>>` shifts the bit pattern right by the number on the right.

``````0b1000 >> 1  // 0b0100
0b1000 >> 2  // 0b0010
``````
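A common use of these operators is treating the bits of an integer as a set of on/off flags. The sketch below is one way that might look; the variable name is illustrative. Note the parens around the `&` expression: `&` has a lower precedence than `!=`.

```volt
fn main() i32
{
	flags := 0b0000;
	flags = flags | 0b0100;         // Set the third bit.
	assert((flags & 0b0100) != 0);  // Test whether it is set.
	flags = flags ^ 0b0100;         // Toggle it back off.
	assert(flags == 0);

	assert((1 << 3) == 8);  // Each left shift doubles the value.
	assert((8 >> 3) == 1);  // Each right shift halves it.
	return 0;
}
```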

## Assignment

In addition to the regular assignment operator we’ve been using, there are several that are combined with the operators we’ve been touching on in this chapter.

`a += 1` is the same as `a = a + 1`.

`a *= 1` is the same as `a = a * 1`.

`a /= 1` is the same as `a = a / 1`.

`a -= 1` is the same as `a = a - 1`.
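A short sketch chaining the combined assignment operators, with the running value noted in the comments:

```volt
fn main() i32
{
	a := 10;
	a += 5;  // 15
	a -= 3;  // 12
	a *= 2;  // 24
	a /= 4;  // 6
	assert(a == 6);
	return 0;
}
```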

## Ternary

The ternary expression is like a compact `if` statement. It returns a value dependent on a boolean expression.

``````a := 5 > 2 ? "bigger" : "smaller";
writeln(a);
``````

Output:

``````bigger
``````

## Precedence

Operator precedence is the order in which operators (expressions) are evaluated. For instance, `*` has a higher precedence than `+`, so `5 * 5 + 2` is `27`, and not `35`. The associativity of an operator determines how operators of the same precedence are grouped. `=` is right associative, so `a = b = c` is interpreted as `a = (b = c)`, not `(a = b) = c`. In order from highest precedence to lowest:

``````* / %
+ - ~
<< >>
< > <= >= in !in
== != is !is
&
^
|
&&
||
= += *= etc
``````

If you wanted `5 * 5 + 2` to mean `35`, you can wrap a part of an expression in parens to give it a higher priority, as parens are evaluated first: `5 * (5 + 2)`.
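The precedence and associativity rules in this section can be checked with a few assertions. This is a minimal sketch; the variable names are illustrative.

```volt
fn main() i32
{
	assert(5 * 5 + 2 == 27);    // * is evaluated before +.
	assert(5 * (5 + 2) == 35);  // Parens are evaluated first.

	a: i32;
	b: i32;
	a = b = 7;  // Right associative: a = (b = 7).
	assert(a == 7);
	assert(b == 7);
	return 0;
}
```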