Header menu logo F# Header menu logo Averest

MoveCode Module

This module implements functions for move code programs executed by BED (Buffered Exposed Datapath) processor architectures as the SCAD processor. In addition to the data types for move instruction programs, the module also provides an instruction set simulator to define the semantics of the move code instructions of the SCAD machine (for more details, see the SCAD reference sheet of SCAD).

Types

Type Description

MoveCode

This type defines move code instructions with methods for encoding, decoding, string conversion, and parsing from strings and byte arrays. For the encoding into machine words (byte arrays), see the addresses of PUs and buffers in the SCAD reference sheet. Hence, the general purpose PUs have indices \(2,\ldots,\mathsf{numPU}+1\), and their input and output buffers have addresses \(4,\ldots,2\cdot\mathsf{numPU}+3\). If these numbers are encoded as radix-2 numbers, we need \(\left\lceil{\log_2(2\cdot\mathsf{numPU}+4)}\right\rceil\) bits for the source address and the same number of bits for the target address. Together with the two bits to determine one of the four kinds of move instructions, we finally obtain instructions |mv|srcAdr|tgtAdr| where mv has two bits, and each one of srcAdr and tgtAdr has \(\left\lceil{\log_2(2\cdot\mathsf{numPU}+4)}\right\rceil\) bits. Hence, we get

  • \(\mathsf{numBits(numPU)} := 2+2\cdot \left\lceil{\log_2(2\cdot\mathsf{numPU}+4)}\right\rceil\)
  • \(\mathsf{maxPU(numBits)} := 2^{\frac{\mathsf{numBits}}{2}-2}-2\)
However, we demand that the minimal size of the bitvectors is 16 to ensure a sufficient size for immediate operands, and this number is also sufficient for up to 62 PUs.

OpCodeBED

type for operation codes with methods for parsing, printing and encoding

ProcessorState<'a>

a processor state contains all values that may change in an execution; buffers are thereby array where NONE denotes no value, and where the head value is at index 0, so that the arrays end with a suffix of values NONE and have a prefix of SOME-values.

PuIdx

PUs have a unique indices; index 0 is reserved for the CU, index 1 is reserved for the LSU; and general purpose PUs have indices 2,...,numPU+1

SrcAdr

sources of move code instructions

TgtAdr

targets of move code instructions

Functions and values

Function or value Description

CopyProcessorState ps

Full Usage: CopyProcessorState ps

Parameters:
Returns: ProcessorState<'a>

copy a processor state

ps : ProcessorState<'a>
Returns: ProcessorState<'a>

ExampleProgFactorial n

Full Usage: ExampleProgFactorial n

Parameters:
    n : int

Returns: MoveCode list

This function generates a move code program to compute the factorial of a given number n. It makes use of three PUs and runs in linear time.

n : int
Returns: MoveCode list

ExampleProgListSum n puNum

Full Usage: ExampleProgListSum n puNum

Parameters:
    n : int
    puNum : int

Returns: MoveCode list

This function generates a move code program to compute the sum of the first n natural numbers with puNum PUs. The generated move code program has 3*n-3 instructions and can be run with log(n)+1 many steps with n/2 many PUs. However, it requires a large issue width of about 4 times the number of PUs. A buffer size of 2 is however sufficient for optimal time.

n : int
puNum : int
Returns: MoveCode list

ExampleProgOddEvenSort n

Full Usage: ExampleProgOddEvenSort n

Parameters:
    n : int

Returns: MoveCode list

OddEven Transposition Sorting is a parallel sorting algorithm that requires O(n^2) work with a depth of O(n). The compare-and-swap operation is implemented by comparing x1 with x2, and conditional assignments to variables y1 = b1 and x1 or !b1 or x2 and y2 = !b1 and x1 or b1 or x2. Without leveling, the program results in a dataflow graph which cannot be scheduled on a single PU; however, two PUs are sufficient to run the program. In addition to this PU, we need another PU that provides the input pair of values for this operation, and will afterwards pick the result to insert in the sequence.

n : int
Returns: MoveCode list

ExampleProgParallelPrefixBrentKung n

Full Usage: ExampleProgParallelPrefixBrentKung n

Parameters:
    n : int

Returns: MoveCode list

This function generates a move code program for computing the parallel prefix sum of the first n numbers. The function uses only a single PU and processes the levelized dataflow graph level by level from left to right. The values are thereby kept in the left input buffer and are either copied back to this buffer, duplicated for a following addition, or added to a valued put in inR after a previous duplication. The algorithm implements the Brent-Kung parallel prefix tree which is work and time optimal.

n : int
Returns: MoveCode list

ExampleProgParallelPrefixKoggeStone n

Full Usage: ExampleProgParallelPrefixKoggeStone n

Parameters:
    n : int

Returns: MoveCode list

Kogge-Stone parallel prefix computation is not work-efficient, but almost twice as fast as the Brent-Kung parallel prefix computation. On SCAD, the additional work does not matter since the Brent-Kung implementation may require additional COPY operations as well. The implementation makes use of two PUs, since we need to duplicate the result of additions while having the values of the current level in the inL buffer of PU[0].

n : int
Returns: MoveCode list

ExampleProgTreeSum n puNum

Full Usage: ExampleProgTreeSum n puNum

Parameters:
    n : int
    puNum : int

Returns: MoveCode list

This function generates a move code program to compute the sum of the first n natural numbers with puNum PUs. The generated move code program has 3*n-3 instructions and can be run with log(n)+1 many steps with n/2 many PUs. However, it requires a large issue width of about the number of PUs. A buffer size of 2 is however sufficient for optimal time.

n : int
puNum : int
Returns: MoveCode list

MkInitState initVal

Full Usage: MkInitState initVal

Parameters:
    initVal : 'a

Returns: ProcessorState<'a>

generate an initial state for the processor

initVal : 'a
Returns: ProcessorState<'a>

MoveCodeThreads mvProg

Full Usage: MoveCodeThreads mvProg

Parameters:
Returns: MoveCode list array

split a move code program into threads, i.e., move code instructions that have to be sent to each PU

mvProg : MoveCode list
Returns: MoveCode list array

NumberOfPUs mvProg

Full Usage: NumberOfPUs mvProg

Parameters:
Returns: PuIdx

determine the number of PUs used by the given move code program, e.g. to assign the mutable variable numPU in this module

mvProg : MoveCode seq
Returns: PuIdx

OptimizeParameters mvProg

Full Usage: OptimizeParameters mvProg

Parameters:
Returns: int * int * int

This function computes the number of PUs used by the program and then it computes the minimal issue width and buffer size to achieve the minimal runtime of the program. Note that the minimal runtime can be achieved if the processor may issue all instructions in the first step.

mvProg : MoveCode list
Returns: int * int * int

ParseMoveProgramFromFile filename

Full Usage: ParseMoveProgramFromFile filename

Parameters:
    filename : string

Returns: MoveCode list

parse a move program from a file

filename : string
Returns: MoveCode list

ParseMoveProgramFromStream ostr

Full Usage: ParseMoveProgramFromStream ostr

Parameters:
Returns: MoveCode list

parse a move program from a stream

ostr : TextReader
Returns: MoveCode list

ParseMoveProgramFromString s

Full Usage: ParseMoveProgramFromString s

Parameters:
    s : string

Returns: MoveCode list

parse a move program from a string

s : string
Returns: MoveCode list

PrintMoveProgram ostr moveProg

Full Usage: PrintMoveProgram ostr moveProg

Parameters:

write a byte array to a binary file

ostr : TextWriter
moveProg : MoveCode seq

PrintThreads ostr threads

Full Usage: PrintThreads ostr threads

Parameters:

Print the threads per PU of a move code program

ostr : TextWriter
threads : MoveCode list array

SimTraceToHtml ostr trace

Full Usage: SimTraceToHtml ostr trace

Parameters:

This function prints a SCAD machine simulation trace as generated by ScadExec in html output format.

ostr : TextWriter
trace : (int array * int * ProcessorState<int> * ProcessorState<int> * ProcessorState<int>) list

SimulateMoveCode maxIssue maxSteps mvProg

Full Usage: SimulateMoveCode maxIssue maxSteps mvProg

Parameters:
    maxIssue : int
    maxSteps : int option
    mvProg : MoveCode list

Returns: (int array * int * ProcessorState<int> * ProcessorState<int> * ProcessorState<int>) list

This function is used to generate a simulation trace for a given move code program. Make sure that the mutable values numPU, bufSize, and memSize of this module are initialized accordingly; you may use function NumberOfPUs to determine the number of PUs used by the given program. Parameter maxIssue determines the maximal number of move instructions issued by the CU in one step (it may issue less if the buffers are full). Parameter maxSteps provides an optional upper bound of steps for the simulation; if it is None, the program runs up to termination. The result is a list of triples (numFirings,numTransfers,psState) where numFirings holds the numbers of firings of the PUs up to this point, numTransfers is the number of data transfers up to this point, and psState is the processor state at that point.

maxIssue : int
maxSteps : int option
mvProg : MoveCode list
Returns: (int array * int * ProcessorState<int> * ProcessorState<int> * ProcessorState<int>) list

UpScaleNumPUs puNum mvProg

Full Usage: UpScaleNumPUs puNum mvProg

Parameters:
Returns: MoveCode list

Upscaling a Move Code Program to a given number of PUs. Note that any PU allocation remains valid by any partition refinement, i.e., nodes that are mapped to different PUs before must also be mapped to different PUs afterwards, but nodes that are currently mapped to the same PU may be moved to new PUs without producing conflicts unless the new PU is not used for nodes currently mapped to other PUs as well.

puNum : int
mvProg : MoveCode list
Returns: MoveCode list

WriteBinaryFile moveProg filename

Full Usage: WriteBinaryFile moveProg filename

Parameters:
    moveProg : MoveCode seq
    filename : string

write a byte array to a binary file

moveProg : MoveCode seq
filename : string

WriteMoveProgram moveProg filename

Full Usage: WriteMoveProgram moveProg filename

Parameters:
    moveProg : MoveCode seq
    filename : string

write move program to a text file

moveProg : MoveCode seq
filename : string

WriteTextFile moveProg filename

Full Usage: WriteTextFile moveProg filename

Parameters:
    moveProg : MoveCode seq
    filename : string

write a byte array to a text file

moveProg : MoveCode seq
filename : string

bufSize

Full Usage: bufSize

Returns: int

size of the buffers

Returns: int

latencyMap

Full Usage: latencyMap

Returns: Map<OpCodeBED, int>

latencies of the instructions

Returns: Map<OpCodeBED, int>

memSize

Full Usage: memSize

Returns: int

size of the data memory

Returns: int

numPU

Full Usage: numPU

Returns: int

The value numPU in this module must be initialized with the number of general purpose PUs (without CU and LSU) to ensure a correct instruction encoding.

Returns: int

Type something to start searching.