MoveCode (Averest)

MoveCode Module

Namespace: Averest.MiniC
Assembly: Averest.dll

This module implements functions for move code programs executed by BED (Buffered Exposed Datapath) processor architectures as the SCAD processor. In addition to the data types for move instruction programs, the module also provides an instruction set simulator to define the semantics of the move code instructions of the SCAD machine (for more details, see the SCAD reference sheet of SCAD).

Types

Type	Description
MoveCode	This type defines move code instructions with methods for encoding, decoding, string conversion, and parsing from strings and byte arrays. For the encoding into machine words (byte arrays), see the addresses of PUs and buffers in the SCAD reference sheet. Hence, the general purpose PUs have indices \(2,\ldots,\mathsf{numPU}+1\), and their input and output buffers have addresses \(4,\ldots,2\cdot\mathsf{numPU}+3\). If these numbers are encoded as radix-2 numbers, we need \(\left\lceil{\log_2(2\cdot\mathsf{numPU}+4)}\right\rceil\) bits for the source address and the same number of bits for the target address. Together with the two bits to determine one of the four kinds of move instructions, we finally obtain instructions \|mv\|srcAdr\|tgtAdr\| where mv has two bits, and each one of srcAdr and tgtAdr has \(\left\lceil{\log_2(2\cdot\mathsf{numPU}+4)}\right\rceil\) bits. Hence, we get \(\mathsf{numBits(numPU)} := 2+2\cdot \left\lceil{\log_2(2\cdot\mathsf{numPU}+4)}\right\rceil\) \(\mathsf{maxPU(numBits)} := 2^{\frac{\mathsf{numBits}}{2}-2}-2\) However, we demand that the minimal size of the bitvectors is 16 to ensure a sufficient size for immediate operands, and this number is also sufficient for up to 62 PUs.
OpCodeBED	type for operation codes with methods for parsing, printing and encoding
ProcessorState<'a>	a processor state contains all values that may change in an execution; buffers are thereby array where NONE denotes no value, and where the head value is at index 0, so that the arrays end with a suffix of values NONE and have a prefix of SOME-values.
PuIdx	PUs have a unique indices; index 0 is reserved for the CU, index 1 is reserved for the LSU; and general purpose PUs have indices 2,...,numPU+1
SrcAdr	sources of move code instructions
TgtAdr	targets of move code instructions

Functions and values

Function or value	Description
`CopyProcessorState ps` Full Usage: `CopyProcessorState ps` Parameters: ps : `ProcessorState<'a>` Returns: `ProcessorState<'a>`	copy a processor state ps : `ProcessorState<'a>` Returns: `ProcessorState<'a>`
`ExampleProgFactorial n` Full Usage: `ExampleProgFactorial n` Parameters: n : `int` Returns: `MoveCode list`	This function generates a move code program to compute the factorial of a given number n. It makes use of three PUs and runs in linear time. n : `int` Returns: `MoveCode list`
`ExampleProgListSum n puNum` Full Usage: `ExampleProgListSum n puNum` Parameters: n : `int` puNum : `int` Returns: `MoveCode list`	This function generates a move code program to compute the sum of the first n natural numbers with puNum PUs. The generated move code program has 3*n-3 instructions and can be run with log(n)+1 many steps with n/2 many PUs. However, it requires a large issue width of about 4 times the number of PUs. A buffer size of 2 is however sufficient for optimal time. n : `int` puNum : `int` Returns: `MoveCode list`
`ExampleProgOddEvenSort n` Full Usage: `ExampleProgOddEvenSort n` Parameters: n : `int` Returns: `MoveCode list`	OddEven Transposition Sorting is a parallel sorting algorithm that requires O(n^2) work with a depth of O(n). The compare-and-swap operation is implemented by comparing x1 with x2, and conditional assignments to variables y1 = b1 and x1 or !b1 or x2 and y2 = !b1 and x1 or b1 or x2. Without leveling, the program results in a dataflow graph which cannot be scheduled on a single PU; however, two PUs are sufficient to run the program. In addition to this PU, we need another PU that provides the input pair of values for this operation, and will afterwards pick the result to insert in the sequence. n : `int` Returns: `MoveCode list`
`ExampleProgParallelPrefixBrentKung n` Full Usage: `ExampleProgParallelPrefixBrentKung n` Parameters: n : `int` Returns: `MoveCode list`	This function generates a move code program for computing the parallel prefix sum of the first n numbers. The function uses only a single PU and processes the levelized dataflow graph level by level from left to right. The values are thereby kept in the left input buffer and are either copied back to this buffer, duplicated for a following addition, or added to a valued put in inR after a previous duplication. The algorithm implements the Brent-Kung parallel prefix tree which is work and time optimal. n : `int` Returns: `MoveCode list`
`ExampleProgParallelPrefixKoggeStone n` Full Usage: `ExampleProgParallelPrefixKoggeStone n` Parameters: n : `int` Returns: `MoveCode list`	Kogge-Stone parallel prefix computation is not work-efficient, but almost twice as fast as the Brent-Kung parallel prefix computation. On SCAD, the additional work does not matter since the Brent-Kung implementation may require additional COPY operations as well. The implementation makes use of two PUs, since we need to duplicate the result of additions while having the values of the current level in the inL buffer of PU[0]. n : `int` Returns: `MoveCode list`
`ExampleProgTreeSum n puNum` Full Usage: `ExampleProgTreeSum n puNum` Parameters: n : `int` puNum : `int` Returns: `MoveCode list`	This function generates a move code program to compute the sum of the first n natural numbers with puNum PUs. The generated move code program has 3*n-3 instructions and can be run with log(n)+1 many steps with n/2 many PUs. However, it requires a large issue width of about the number of PUs. A buffer size of 2 is however sufficient for optimal time. n : `int` puNum : `int` Returns: `MoveCode list`
`MkInitState initVal` Full Usage: `MkInitState initVal` Parameters: initVal : `'a` Returns: `ProcessorState<'a>`	generate an initial state for the processor initVal : `'a` Returns: `ProcessorState<'a>`
`MoveCodeThreads mvProg` Full Usage: `MoveCodeThreads mvProg` Parameters: mvProg : `MoveCode list` Returns: `MoveCode list array`	split a move code program into threads, i.e., move code instructions that have to be sent to each PU mvProg : `MoveCode list` Returns: `MoveCode list array`
`NumberOfPUs mvProg` Full Usage: `NumberOfPUs mvProg` Parameters: mvProg : `MoveCode seq` Returns: `PuIdx`	determine the number of PUs used by the given move code program, e.g. to assign the mutable variable numPU in this module mvProg : `MoveCode seq` Returns: `PuIdx`
`OptimizeParameters mvProg` Full Usage: `OptimizeParameters mvProg` Parameters: mvProg : `MoveCode list` Returns: `int * int * int`	This function computes the number of PUs used by the program and then it computes the minimal issue width and buffer size to achieve the minimal runtime of the program. Note that the minimal runtime can be achieved if the processor may issue all instructions in the first step. mvProg : `MoveCode list` Returns: `int * int * int`
`ParseMoveProgramFromFile filename` Full Usage: `ParseMoveProgramFromFile filename` Parameters: filename : `string` Returns: `MoveCode list`	parse a move program from a file filename : `string` Returns: `MoveCode list`
`ParseMoveProgramFromStream ostr` Full Usage: `ParseMoveProgramFromStream ostr` Parameters: ostr : `TextReader` Returns: `MoveCode list`	parse a move program from a stream ostr : `TextReader` Returns: `MoveCode list`
`ParseMoveProgramFromString s` Full Usage: `ParseMoveProgramFromString s` Parameters: s : `string` Returns: `MoveCode list`	parse a move program from a string s : `string` Returns: `MoveCode list`
`PrintMoveProgram ostr moveProg` Full Usage: `PrintMoveProgram ostr moveProg` Parameters: ostr : `TextWriter` moveProg : `MoveCode seq`	write a byte array to a binary file ostr : `TextWriter` moveProg : `MoveCode seq`
`PrintThreads ostr threads` Full Usage: `PrintThreads ostr threads` Parameters: ostr : `TextWriter` threads : `MoveCode list array`	Print the threads per PU of a move code program ostr : `TextWriter` threads : `MoveCode list array`
`SimTraceToHtml ostr trace` Full Usage: `SimTraceToHtml ostr trace` Parameters: ostr : `TextWriter` trace : `(int array * int * ProcessorState<int> * ProcessorState<int> * ProcessorState<int>) list`	This function prints a SCAD machine simulation trace as generated by ScadExec in html output format. ostr : `TextWriter` trace : `(int array * int * ProcessorState<int> * ProcessorState<int> * ProcessorState<int>) list`
`SimulateMoveCode maxIssue maxSteps mvProg` Full Usage: `SimulateMoveCode maxIssue maxSteps mvProg` Parameters: maxIssue : `int` maxSteps : `int option` mvProg : `MoveCode list` Returns: `(int array * int * ProcessorState<int> * ProcessorState<int> * ProcessorState<int>) list`	This function is used to generate a simulation trace for a given move code program. Make sure that the mutable values numPU, bufSize, and memSize of this module are initialized accordingly; you may use function NumberOfPUs to determine the number of PUs used by the given program. Parameter maxIssue determines the maximal number of move instructions issued by the CU in one step (it may issue less if the buffers are full). Parameter maxSteps provides an optional upper bound of steps for the simulation; if it is None, the program runs up to termination. The result is a list of triples (numFirings,numTransfers,psState) where numFirings holds the numbers of firings of the PUs up to this point, numTransfers is the number of data transfers up to this point, and psState is the processor state at that point. maxIssue : `int` maxSteps : `int option` mvProg : `MoveCode list` Returns: `(int array * int * ProcessorState<int> * ProcessorState<int> * ProcessorState<int>) list`
`UpScaleNumPUs puNum mvProg` Full Usage: `UpScaleNumPUs puNum mvProg` Parameters: puNum : `int` mvProg : `MoveCode list` Returns: `MoveCode list`	Upscaling a Move Code Program to a given number of PUs. Note that any PU allocation remains valid by any partition refinement, i.e., nodes that are mapped to different PUs before must also be mapped to different PUs afterwards, but nodes that are currently mapped to the same PU may be moved to new PUs without producing conflicts unless the new PU is not used for nodes currently mapped to other PUs as well. puNum : `int` mvProg : `MoveCode list` Returns: `MoveCode list`
`WriteBinaryFile moveProg filename` Full Usage: `WriteBinaryFile moveProg filename` Parameters: moveProg : `MoveCode seq` filename : `string`	write a byte array to a binary file moveProg : `MoveCode seq` filename : `string`
`WriteMoveProgram moveProg filename` Full Usage: `WriteMoveProgram moveProg filename` Parameters: moveProg : `MoveCode seq` filename : `string`	write move program to a text file moveProg : `MoveCode seq` filename : `string`
`WriteTextFile moveProg filename` Full Usage: `WriteTextFile moveProg filename` Parameters: moveProg : `MoveCode seq` filename : `string`	write a byte array to a text file moveProg : `MoveCode seq` filename : `string`
`bufSize` Full Usage: `bufSize` Returns: `int`	size of the buffers Returns: `int`
`latencyMap` Full Usage: `latencyMap` Returns: `Map<OpCodeBED, int>`	latencies of the instructions Returns: `Map<OpCodeBED, int>`
`memSize` Full Usage: `memSize` Returns: `int`	size of the data memory Returns: `int`
`numPU` Full Usage: `numPU` Returns: `int`	The value numPU in this module must be initialized with the number of general purpose PUs (without CU and LSU) to ensure a correct instruction encoding. Returns: `int`