Source Text

Encoding

Forge source files must be encoded in UTF-8. The lexer assumes UTF-8 input and will produce errors on invalid byte sequences. No byte-order mark (BOM) is required or expected; if present, it is treated as ordinary content and will likely cause a parse error.

File Extension

Forge source files use the .fg file extension by convention. The CLI tools (forge run, forge test, forge fmt) expect this extension. Example:

hello.fg
server.fg
tests/math_test.fg

Line Endings

Forge recognizes two line ending sequences:

Sequence	Name	Unicode
`\n`	Line feed	U+000A
`\r\n`	Carriage return + line feed	U+000D U+000A

A bare carriage return (\r without a following \n) is not treated as a line ending. Both recognized forms are normalized to a single Newline token in the token stream.

Source Structure

A Forge source file consists of a sequence of top-level statements executed in order. There is no required main function, module declaration, or package header. The simplest valid Forge program is:

say "hello"

Forge programs are executed from the first statement to the last, top to bottom. Functions and type definitions are hoisted conceptually in that they can be referenced before their textual position, but side effects in top-level statements execute in source order.

Character Set

Within string literals, Forge supports the full Unicode character set. Outside of string literals, the following characters are meaningful to the lexer:

ASCII letters (a-z, A-Z) and underscore (_) begin identifiers and keywords.
ASCII digits (0-9) begin numeric literals.
Operator and punctuation characters: +, -, *, /, %, =, !, <, >, &, |, ., ,, :, ;, (, ), {, }, [, ], @, ?, #.
The double-quote character (") begins string literals.
Whitespace characters: space (U+0020), horizontal tab (U+0009).
Line terminators: line feed (U+000A), carriage return (U+000D).

All other characters outside of string literals are lexer errors.