Internal arrays (counter arrays) #213
Comments
Additions:
Array assignments
Assigning an array to another array will assign its elements. Subarray syntax will allow assigning parts of the array, or even shifting the array up or down (the copy needs to be made in reverse order when shifting up). Code will be generated for direct assignment from element to element. In the future, a mechanism for indirect access (a for loop) might be generated, suitable for large arrays where instruction limits could be hit.
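To illustrate why an upward shift has to copy in reverse order, here is a minimal Python sketch (not Mindcode; the function names `shift_up` and `shift_up_forward` are hypothetical and only serve this illustration). Copying forward overwrites source elements before they are read, while copying from the last element down does not.

```python
def shift_up(elements, start, count, offset):
    """Move `count` elements beginning at `start` up by `offset` (> 0), in place."""
    # Copy from the last element to the first, so that no source element
    # is overwritten before it has been read.
    for i in range(count - 1, -1, -1):
        elements[start + offset + i] = elements[start + i]


def shift_up_forward(elements, start, count, offset):
    """The same copy in forward order; wrong when source and target overlap."""
    for i in range(count):
        elements[start + offset + i] = elements[start + i]


correct = [1, 2, 3, 4, 5, 0]
shift_up(correct, 0, 5, 1)
print(correct)   # [1, 1, 2, 3, 4, 5]

broken = [1, 2, 3, 4, 5, 0]
shift_up_forward(broken, 0, 5, 1)
print(broken)    # [1, 1, 1, 1, 1, 1]
```

Shifting down is the mirror case: there the forward order is the safe one.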
External arrays
Allows allocating an external array from the heap (in 3.1, support for declaring an external variable with a storage different from the heap will also be added, but it's not part of this issue). Unless the array starts at index 0, this incurs the cost of adding the base index to the element index, but this might get optimized away, as the base index will be constant. Array assignment between internal and external arrays will also be supported. Array assignment between two external arrays might be encoded using a loop.
Bounds checking
A new compiler option for bounds checking. Possible values:
When an out-of-bounds access is detected at compile time, a compile error will always be triggered. A way to find the position of the error in optimized (say, loop-unrolled) code might need to be implemented. Currently, temporary variables do not hold a source position. They should be assigned the source position of the expression they correspond to.
Will internal arrays be passed by value to functions?
At this moment, only global arrays are planned, with no support for passing them as function arguments, except into inline functions. I don't plan to support passing arrays by value in any way. All other problems apart, creating copies of arrays is costly both in execution time and instruction space. Is there a particular use case for passing arrays by value?
Support could be relatively easily added to pass arrays into inline functions as general parameters of an array type. However, as inline functions are compiled anew for each call, no element copying will take place and the arrays will effectively be passed by reference. This will get implemented at the same time as variable types, or shortly thereafter.
I have a hazy idea of emulating "pointers to arrays" which could be stored in variables and passed around to functions, something similar to function pointers. It's a plan, but a distant one.
Edit: for the "array pointers", it might be better to interweave the read and write jump tables. To read, a jump to
I have an early prototype - and it works! Yay! (Implementing the code generation was pretty straightforward - the compiler rewrite definitely paid off. Optimizations will be worse.)
(There's no optimization of internal arrays whatsoever yet. They get resolved after all optimizations are done.) For comparison, almost the same syntax can be used to create external arrays:
Code after applying some (already existing) optimizations to it:
Notes:
This also illustrates the benefits of having different transfer variables for reads and writes: if it weren't so,
Instructions now hold information about side effects (i.e. variables read or written implicitly by the instruction without being part of the argument list). I'll need to make sure this new information is used in all places. Currently the optimization code is pretty convoluted, unfortunately.
I'm also aware there is a possibility to perform induction variable optimization on
I'm kinda bragging here, so please forgive me: @A4-Tacks created a program computing a Pascal Triangle as a compiler benchmark (description in English). When adapting it to use internal arrays - like this:
it evaluates everything at compile time, producing
It can be done because the triangle size is fixed. Compile-time evaluation is one of Mindcode's strengths, but its impact in real-life programs isn't as profound, so the benchmark is a bit unbalanced. A version using external arrays (memory cells) would still unroll all loops, but wouldn't compile-time evaluate expressions based on memory cells. When further declaring
I wanted to implement passing arrays into vararg functions, but ended up just expanding arrays into the parameter list, as is already the case with varargs themselves. As a result, it is possible to pass arrays into any function. Eg:
compiles into
The function call mechanism will be completely rewritten with the type system. However, all vararg functions (which includes built-in ones, such as
Internal arrays
Internal arrays ("counter arrays") are planned for release 3.1. Implementing them requires a number of changes.
Status
Arrays (internal/external) - grammar & code generation
Subarrays (internal/external) - grammar & code generation
`cell1[32 ... 64]`)
Error detection and handling
Optimization/finalization
Documentation
Grammar
Declarations
Allowing array variable declarations using the `var` keyword:
For now, always global and with zero-based indexes.
Array declaration creates the following variables:
`.<array>*<index>`: array members, e.g. `.array*0` to `.array*99`.
`.<array>*rval`: transfer variable for reads, e.g. `.array*rval`
`.<array>*wval`: transfer variable for writes, e.g. `.array*wval`
`.<array>*rret`: return address from read routine, e.g. `.array*rret`
`.<array>*wret`: return address from write routine, e.g. `.array*wret`
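The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the compiler's code; the helper name `array_variables` is hypothetical.

```python
def array_variables(name, size):
    """List the variable names an array declaration would create under the scheme above."""
    # One variable per element: .<array>*<index>
    members = [f".{name}*{index}" for index in range(size)]
    # Transfer variables and return addresses for the read and write routines.
    helpers = {kind: f".{name}*{kind}" for kind in ("rval", "wval", "rret", "wret")}
    return members, helpers


members, helpers = array_variables("array", 100)
print(members[0], members[-1])   # .array*0 .array*99
print(helpers["rval"])           # .array*rval
```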
The `*` separator corresponds to a compiler-generated variable. It ensures no collision with other variables. Would work even for local arrays.
Subarray selection
Allows selecting a subset of the array for later operation. Primarily for passing a portion of the array as a vararg argument, or for use with a for-each loop over a portion of the array.
The subarray indexes will have to be constant for now. Dynamic ranges might be achievable with for-each loops and perhaps would allow a jump-in-the-middle loop unrolling optimization. That would be nice.
For-each loops
For-each loops won't use jump tables (the counter array mechanism), but will expand the array automatically into individual elements and loop over them. Leads to way more efficient code than random access.
Current vararg syntax concatenates the vararg list with other arguments. Arrays will behave the same:
We need new syntax to express that the arrays should be traversed in parallel. Proposal:
Can be combined:
The new syntax allows this as well:
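As a language-neutral analogy for the two traversal modes discussed in this section, here is a short Python sketch. It is not the proposed Mindcode syntax; the lists `a`, `b`, `c`, `d` are made-up sample data, and the last line is only a guess at what "combined" could mean.

```python
from itertools import chain

a = [1, 2, 3]
b = [10, 20, 30]
c = [4, 5, 6]
d = [40, 50, 60]

# Concatenated traversal: one loop variable visits all of a, then all of b.
concatenated = list(chain(a, b))
print(concatenated)   # [1, 2, 3, 10, 20, 30]

# Parallel traversal: two loop variables advance through a and b together.
parallel = list(zip(a, b))
print(parallel)       # [(1, 10), (2, 20), (3, 30)]

# Both at once (assumed reading): each loop variable walks a concatenated list.
combined = list(zip(chain(a, c), chain(b, d)))
print(combined[:4])   # [(1, 10), (2, 20), (3, 30), (4, 40)]
```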
Code generation
A new virtual instruction will be made for array access, modeled after `read` and `write` (`value` is a standard input/output parameter to be freely optimized by DFO):
Instructions will be resolved into a local code + jump table (a read/write jump table shared among all reads/writes).
For each array, the following jump tables will be generated:
Virtual instructions will be resolved into this code:
The multijump instruction (`op add @counter`) in the expanded code needs to be aware of implicit reads/writes to the transfer variables and communicate this to the DFO. A new mechanism for this will be needed.
Optimizations
Unoptimized cost: 6 instructions per read/write.
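To make the cost figure concrete, here is a small Python simulation of a read through a jump table. The instruction sequence is an assumption chosen to be consistent with the description in this issue (return address, multijump via `@counter`, transfer variable); the code Mindcode actually generates may differ. The tuple encoding, the `*tmp` variable and the functions `run` and `build_read` are hypothetical.

```python
def run(program, variables):
    """Execute a tiny mlog-like program; return the number of instructions executed."""
    variables["@counter"] = 0
    executed = 0

    def value(operand):
        # Strings name variables, anything else is a literal.
        return variables[operand] if isinstance(operand, str) else operand

    while True:
        instruction = program[variables["@counter"]]
        variables["@counter"] += 1           # the counter advances before execution
        if instruction[0] == "end":
            return executed                  # "end" itself is not counted
        executed += 1
        if instruction[0] == "set":
            _, target, source = instruction
            variables[target] = value(source)
        elif instruction[0] == "op":
            _, operation, target, left, right = instruction
            a, b = value(left), value(right)
            variables[target] = a + b if operation == "add" else a * b


def build_read(name, size, index_var, result_var):
    """Build local code plus a read jump table for `result_var = name[index_var]`."""
    rval, rret = f".{name}*rval", f".{name}*rret"
    table_start = 5                                         # local code occupies addresses 0..4
    program = [
        ("set", rret, 3),                                   # 0: address the table returns to
        ("op", "mul", "*tmp", index_var, 2),                # 1: each table entry is 2 instructions
        ("op", "add", "@counter", "*tmp", table_start),     # 2: multijump into the table
        ("set", result_var, rval),                          # 3: copy transfer variable to the result
        ("end",),                                           # 4
    ]
    for i in range(size):
        program.append(("set", rval, f".{name}*{i}"))       # read element i into the transfer variable
        program.append(("set", "@counter", rret))           # jump back to the caller
    return program


variables = {".array*0": 10, ".array*1": 20, ".array*2": 30, "i": 1}
cost = run(build_read("array", 3, "i", "value"), variables)
print(variables["value"], cost)   # 20 6
```

Because the return address is kept in a variable, the same table can serve every read of the array; only the three local instructions and the final copy are repeated per access.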
`readarr`/`writearr` with constant index will be replaced by a `set value <array variable>`/`set <array variable> value`.
`EXPANDED` executed after the `ITERATED` phase if there are array access instructions.
Between the `ITERATED` and `EXPANDED` phases, jump tables will be generated and instructions will be resolved. Then another round of optimization iterations will happen. The current optimizations should then improve the code like this:
In the `EXPANDED` phase, replacing dynamic access with static access probably won't be possible anymore (certainly not in 3.1).
Possible future optimizations
Future optimizations (not in 3.1) - optimizations for speed (performed on cost/benefit basis):
`.array*rret`/`.array*wret`), and the transfer variable will be replaced directly by the value being read/written. Reduces cost to 4 instructions per access.
`array[i] + array[i + 1]`.
`array[i]++`), the update can be injected to the jump table. Saves 3 instructions.