Averest


Types Module

This module implements basic data structures for the MiniC language, the intermediate language CmdType, and the Abacus assembler language. Moreover, there are some basic functions to work with these basic data types like those for type checking MiniC programs, and determining the variables that are read and written by instructions.

Types

Type Description

AbacusInstr

Abacus is a typical RISC architecture; the following data type specifies however only the scalar part of the instructions, while there are also vector instructions. See also Abacus reference card. For addition, subtraction, multiplication, division there are four variants:

  • AbcOp(rd,r1,r2) means signed operation Reg[rd] := Reg[r1] op Reg[r2]
  • AbcOpU(rd,r1,r2) means unsigned operation Reg[rd] := Reg[r1] op Reg[r2]
  • AbcOpI(rd,r1,c) means signed operation Reg[rd] := Reg[r1] op c
  • AbcOpIU(rd,r1,c) means unsigned operation Reg[rd] := Reg[r1] op c
The following are logic operations
  • AbcAnd(rd,r1,r2) means bitwise Reg[rd] := Reg[r1] & Reg[r2]
  • AbcOr(rd,r1,r2) means bitwise Reg[rd] := Reg[r1] | Reg[r2]
  • AbcXor(rd,r1,r2) means bitwise Reg[rd] := Reg[r1] ^ Reg[r2]
  • AbcNeg(rd,r1) means bitwise Reg[rd] := !Reg[r1]
  • AbcShift(rd,r1,r2) means left shift of Reg[r1] by Reg[r2] bits to Reg[rd]
The following are comparison operations
  • AbcSlt(rd,r1,r2) means signed operation Reg[rd] := Reg[r1] < Reg[r2]
  • AbcSltU(rd,r1,r2) means unsigned operation Reg[rd] := Reg[r1] < Reg[r2]
  • AbcSlet(rd,r1,r2) means signed operation Reg[rd] := Reg[r1] <= Reg[r2]
  • AbcSletU(rd,r1,r2) means unsigned operation Reg[rd] := Reg[r1] <= Reg[r2]
  • AbcSeq(rd,r1,r2) means equality Reg[rd] := Reg[r1] == Reg[r2]
  • AbcSne(rd,r1,r2) means negated equality Reg[rd] := Reg[r1] != Reg[r2]
The following are memory operations
  • AbcMove(rd,c) means Reg[rd] := bv2int(c)
  • AbcOvf(rd) means Reg[rd] := ovflw
  • AbcLoad(rd,r1,r2) means Reg[rd] := Mem[Reg[r1] + Reg[r2]]
  • AbcLoadI(rd,r1,c) means Reg[rd] := Mem[Reg[r1] + bv2nat(c)]
  • AbcStore(rd,r1,r2) means Mem[Reg[r1] + Reg[r2]] := Reg[rd]
  • AbcStoreI(rd,r1,c) means Mem[Reg[r1] + bv2nat(c)] := Reg[rd]
The following are control flow operations
  • AbcBez(rd,c) means if(Reg[rd]==0) pc := pc + bv2int(c)
  • AbcBnz(rd,c) means if(Reg[rd]!=0) pc := pc + bv2int(c)
  • AbcJmp(rd) means pc := pc + Reg[rd]
The following are operations for thread control where additionally a load-link register is modified and checked:
  • AbcLdLink(rd,r1,c) means Reg[rd] := Mem[Reg[r1] + bv2nat(c)]
  • AbcStCond(rd,r1,r2) means Mem[Reg[r1] + Reg[r2]] := Reg[rd]
  • AbcSync waits until the store buffer is emptied

AbcParamsType

AbcParams holds the parameters of a computer system with an Abacus processor, which are the following:

  • DataWidth is the bitwidth of registers, i.e., machine words
  • BlockSize is the number of bytes in a cache block
  • SetAssoc is the number of blocks in a set of the cache
  • CacheSize is the size of the data cache in bytes
  • MemSize is the size of physical main memory in bytes
  • PipeFetch : pipeline stage where next program counter is set
  • PipeDecode : pipeline stage where operand registers are read
  • PipeExecute : pipeline stage where ALU results are produced
  • PipeMemAcs : pipeline stage where Load results are produced
  • PipeRegWrite : pipeline stage where registers are written
  • RegBypass : whether registers reads yield current writes
  • Forwarding : to use forwarding from PipeExecute and PipeMemAcs
  • BranchInDecode : immediately write new PC in PipeDecode for Branches
  • SingleCycle : whether everything is done in one stage
See function CheckParameters for consistent values of these parameters.

CmdType

The following are instructions of the virtual intermediate machine (cmd language) with variables y,x1,x2,y1,y2:

  • CmdAssert(x) asserts that x!=0 should hold at this place.
  • CmdCopy(y,x) denotes the assignment y := x.
  • CmdSglAssign(y,x1,op2,x2) denotes the assignment y := x1 op2 x2 with a binary operator op2.
  • CmdDblAssign(y1,y2,x1,op2,x2) denotes the assignment y1,y2 := x1 op2 x2 with a binary operator op2. The rhs is evaluated in full precision and y1@y2 is seen as a double machine word.
  • CmdAccessArr(y,x1,x2) denotes y = x1[x2].
  • CmdAssignArr(y,x1,x2) denotes y[x1] = x2.
  • CmdAssignCnd(y,c,x1,x0) is a conditional assignment y = (c?x1:x0).
  • CmdGoto(i) denotes goto(i).
  • CmdIfGoto(x,i) denotes if(x) goto(i).
  • CmdReturn(x) denotes return(x).
  • CmdSync instructs the tread to wait until its writes to the shared variables have been made visible to the other threads.
Hence, all expressions make use of at most one operator, and therefore, we have to abbreviate intermediate expressions by temporary variables. Moreover, function calls have to be inlined since there is no means for the latter. Still, there can be threads of cmd language instructions, with the same shared variables.

ExceptionMiniC

This is the general exception raised by MiniC functions if something went wrong.

ExprC

These are right-hand-side expressions of the MiniC language, i.e., expressions that may occur on the right-hand side of an assignment.

  • RhsVar(vn) is a variable with name vn.
  • ConstB(b) is the Cbool value b as a machine word.
  • ConstN(i) is the Cnat value i as a machine word.
  • ConstZ(i) is the Cint value i as a machine word.
  • Apply1(op1,e) is the application of the unary operator op1 to the MiniC expression e.
  • Apply2(op2,e1,e2) is the application of the binary operator op2 to the MiniC expressions e1 and e2.
  • CndExpr(e0,e1,e2) is a conditional expression that evaluates to e1 if e0 evaluates to true and otherwise it evaluates to e2.
  • FunApp(f,[|e1,...,en|]) is the application of function f to the argument expressions e1,...,en.
  • RhsArrAcc(lhs,e) refers to array element lhs[e].
  • RhsTupAcc(lhs,i) refers to tuple element lhs.i.
  • ArrNew [|e1,...,en|] constructs an array.
  • TupNew [|e1,...,en|] constructs a tuple.

Iconst

InstructionClass

This type defines the four basic instruction classes, i.e., ALU operations, MEMory operations, BRanCh operations and JuMPs.

LhsExprC

LhsExpr are viewed as addresses with an offset; the address is thereby the base variable, and the offset is computed as an expression that depends on the access expressions of arrays and tuples. Note that types [m][n]alpha are parsed to Carr(m,Carr(n,alpha)) and that expressions a[i][j] are parsed to LhsArrAcc(LhsArrAcc(a,i),j). The following are left-hand-side expressions of the MiniC language, i.e., expressions that may occur on the left-hand side of an assignment.

  • LhsVar(vn) is a variable with name vn.
  • LhsArrAcc(lhs,e) refers to array element lhs[e].
  • LhsTupAcc(lhs,i) refers to tuple element lhs.i.

MiniCFunction

MiniC functions are defined by their name, their arguments and return type, and of course by their body statement. The body statement should always terminate with a return expression. MiniC functions can call other functions, but not themselves (also not via other functions). This restriction ensures statically bounded memory for function calls.

MiniCProgram

MiniC programs consist of a set of threads which are pairs of the name of the thread and its statement, functions that can be called by these threads, and the declaration of the globally shared variables that are used for the threads to communicate with each other. Further information about the syntax can be found in the reference card MiniC reference card.

Ops1

Ops1 declares an enumeration type of the unary operators of the MiniC language (this makes is flexible to add operators). Available operators are:

  • NotB: a bitwise negation of Cbool values.
  • CastB: instructs the type checker to interpret this machine word as boolean, i.e., make it a Cbool value.
  • CastN: instructs the type checker to interpret this machine word as Cnat, i.e., make it a Cnat value.
  • CastZ: instructs the type checker to interpret this machine word as Cint, i.e., make it a Cint value.
Note that cast operators do not change the machine word, but due to the different interpretation, the value might still change (since Cint interprets machine words using 2-complement while Cnat uses radix-2).

Ops2

Ops2 declares an enumeration type of the binary operators of the MiniC language (this makes is flexible to add operators). Available operators are:

  • AddN,SubN,MulN,DivN,ModN for addition, subtraction, multiplication, division and modulo operations of Cnat values. The result of these operations is a Cnat value.
  • AddZ,SubZ,MulZ,DivZ,ModZ for addition, subtraction, multiplication, division and modulo operations of Cint values. The result of these operations is a Cint value.
  • LesN,LeqN,EqqN,NeqN for less-than, less-than-or-equal, equal, and inequal relation of Cnat values. The result of these operations is a Cbool value.
  • LesZ,LeqZ,EqqZ,NeqZ for less-than, less-than-or-equal, equal, and inequal relation of Cint values. The result of these operations is a Cbool value.
  • AndB,OrB,EqqB,NeqB for bitwise conjunction, disjunction, equivalence and exclusive-or operation on Cbool values. The result of these operations is a Cbool value.

StmtC

These are the statements of the MiniC language (further information about the syntax can be found in the reference card MiniC reference card.)

  • Nothing is an empty statement for some code transformations.
  • SglAssign(lhs,rhs) is an assignment to one machine word lhs.
  • DblAssign(lhs1,lhs2,rhs) is an assignment to two machine words lhs1,lhs2. Only the topmost operator of rhs is evaluated in double precision this way and then assigned to lhs11@lhs2.
  • AssertMC(b) is an assertion of condition b.
  • FunCall(f,[|e1,...,en|]) is the call of function f with argument expressions e1,...,en.
  • Sequence(s1,s2) is the sequential execution of s1 and s2.
  • IfThenElse(b,s1,s2) executes s1 if b holds, otherwise s2.
  • WhileDo(b,s) executes s while b holds, checking b at first.
  • DoWhile(s,b) executes s while b holds, ignoring b at first.
  • ForLoop(i,min,max,s) is a for loop with loop variable i and body s where the loop variable starts with min and ends with max.
  • LocDec(decls,s) declares local variables for the scope s.
  • Return(e) returns the value of the expression e as the result of a MiniC function.
  • CSync synchronizes MiniC threads, i.e., a thread will wait here until its assignments to global variables become visible to other the other threads.

TypeC

TypeC declares the available data types of variables in MiniC. There are atomic types Cbool, Cnat and Cint, and composite types Carr and Cfun:

  • Cbool are booleans, i.e., true and false, which are the machine words 1...1 and 0...0, i.e., -1 and 0 (interpreted as Cint).
  • Cnat are unsigned integers bounded by the machine width
  • Cint are signed integers bounded by the machine width
  • Ctup(tyA) is a tuple type consisting of the types in array tyA.
  • Carr(dim,ty) is an array of dimension dim of type ty. For the definition of functions, it is allowed to use arrays with an unspecified dimension which makes their use more flexible.
  • Cfun(tyA,rty) is a function type mapping arguments of types in array tyA to a potential value of return type rty (None means void).
Note that MiniC is viewed, as C, as a macro language for assembler programs, so that all atomic data types are mapped to machine words. Cint uses 2-complement for interpreting machine words as integers, while Cnat simply interprets them as binary numbers.

Uconst

declaration

regindex

Functions and values

Function or value Description

CheckParameters ostr a

Full Usage: CheckParameters ostr a

Parameters:

check the consistency of the given parameters, i.e.,

  • DataWidth must be a multiple of 8 (i.e., a number of bytes), and it is also limited to 32 bits at the moment.
  • BlockSize must be a multiple of the number of bytes in DataWidth, and must be equal to that number if no cache is used.
  • CacheSize must be multiple of the set size, i.e., of SetAssoc * BlockSize.
  • MemSize must be multiple of BlockSize.
The function checks these conditions, and writes errors to the output stream ostr, and fails if one of them should not be satisfied with an ExceptionMiniC.

ostr : TextWriter
a : AbcParamsType

GetBaseOfLhs lhs

Full Usage: GetBaseOfLhs lhs

Parameters:
Returns: string

LhsExpr are viewed as addresses with an offset (see comment on LhsExpr). GetBaseOfLhs lhs computes thereby the address, which is the target variable in the LhsExpr.

lhs : LhsExprC
Returns: string

GetInstructionClass instr

Full Usage: GetInstructionClass instr

Parameters:
Returns: InstructionClass

determines the InstructionClass of a given instruction

instr : AbacusInstr
Returns: InstructionClass

Lhs2RhsExpr e

Full Usage: Lhs2RhsExpr e

Parameters:
Returns: ExprC

converts a LhsExpr e to a Expr (it never fails).

e : LhsExprC
Returns: ExprC

MayReadByCmd cmd

Full Usage: MayReadByCmd cmd

Parameters:
Returns: Set<string>

MayReadByCmd cmd determines the set of variables the cmd may read. Remark: CmdAccessArr(y,x1,x2), i.e. y=x1[x2], is mapped to a load instruction where x1 is the start address of the array which is a constant value determined by the memory mapping of the program. Note further that y := (c?x1:x0) may read c,x1, and x0, but only c is read for sure.

cmd : CmdType
Returns: Set<string>

MkOffsetExpr decl varSubst lhs

Full Usage: MkOffsetExpr decl varSubst lhs

Parameters:
Returns: ExprC

LhsExpr are viewed as addresses with an offset (see comment on LhsExpr). MkOffsetExpr computes thereby the offset from the base variable of lhs to the memory address of lhs. Since this offset depends on the variables occurring in lhs, it is computed symbolically, i.e., as an expression. To this end, one has to make use of the types of variables, and thus a declaration must be given, but also a variable renaming is applied before computing the result. Example: a[i][j].1 of type [m][n](bool * nat) has the following offset: i*SizeOfType([n](bool * nat)) + j*SizeOfType((bool * nat)) + 1

decl : Map<string, TypeC>
varSubst : Map<string, string>
lhs : LhsExprC
Returns: ExprC

MustReadByCmd cmd

Full Usage: MustReadByCmd cmd

Parameters:
Returns: Set<string>

MustReadByCmd cmd determines the set of variables the cmd must read. Remark: CmdAccessArr(y,x1,x2), i.e. y=x1[x2], is mapped to a load instruction where x1 is the start address of the array which is a constant value determined by the memory mapping of the program. Note further that y := (c?x1:x0) may read c,x1, and x0, but only c is read for sure.

cmd : CmdType
Returns: Set<string>

ReadWritesOfInstr instr

Full Usage: ReadWritesOfInstr instr

Parameters:
Returns: Set<regindex> * Set<regindex> * Set<int>

For any Abacus instruction instr, function ReadWritesOfInstr computes a triple (r,w,o) where

  • r is the set of registers read by instr,
  • w is the set of registers written by instr, and
  • o is the set of offsets to the current program counter for the next instructions that can follow the current instruction.
This is the basic information needed for dataflow analysis and for the analysis of pipeline conflicts.

instr : AbacusInstr
Returns: Set<regindex> * Set<regindex> * Set<int>

ReportProblem ostr s

Full Usage: ReportProblem ostr s

Parameters:
Returns: 'a

writes a comment s to output stream ostr and raises then exception ExceptionMiniC with message s.

ostr : TextWriter
s : string
Returns: 'a

Rhs2LhsExpr e

Full Usage: Rhs2LhsExpr e

Parameters:
Returns: LhsExprC

converts rhs expression e to a LhsExpr, if possible, otherwise fails.

e : ExprC
Returns: LhsExprC

SizeOfType cty

Full Usage: SizeOfType cty

Parameters:
Returns: int option

SizeOfType ty computes the size of a type ty in terms of machine words. The result is None in case that an array type has an unspecified dimension, otherwise it is Some(s) where s is the number of machine words required to store type ty.

cty : TypeC
Returns: int option

SuccOfCmd (i, cmd)

Full Usage: SuccOfCmd (i, cmd)

Parameters:
Returns: Set<int>

SuccOfCmd (i,cmd) determines the set of indices of potential successor commands of command cmd which is found at line i. It is a conservative "may" estimation, since in case of CmdIfGoto both paths are taken.

i : int
cmd : CmdType
Returns: Set<int>

TypeCheckFunction decl mf

Full Usage: TypeCheckFunction decl mf

Parameters:
Returns: MiniCFunction

For simplicity, the parser will resolve all overloaded operators like the arithmetic operators to Cnat type. This function takes care of given declarations and will then determine type-correct operators for the MiniC function mf.

decl : Map<string, TypeC>
mf : MiniCFunction
Returns: MiniCFunction

TypeCheckProgram p

Full Usage: TypeCheckProgram p

Parameters:
Returns: MiniCProgram

For simplicity, the parser will resolve all overloaded operators like the arithmetic operators to Cnat type. This function will determine type-correct operators for the MiniC program p.

p : MiniCProgram
Returns: MiniCProgram

TypeOfExprC decl e

Full Usage: TypeOfExprC decl e

Parameters:
Returns: TypeC * ExprC

For simplicity, the parser will resolve all overloaded operators like the arithmetic operators to Cnat type. This function takes care of given declarations and will then determine type-correct operators for the rhs expression. The result is a pair (ty,e') where e' is the expression with corrected types and ty is its type.

decl : Map<string, TypeC>
e : ExprC
Returns: TypeC * ExprC

TypeOfLhsExprC decl e

Full Usage: TypeOfLhsExprC decl e

Parameters:
Returns: TypeC * LhsExprC

For simplicity, the parser will resolve all overloaded operators like the arithmetic operators to Cnat type. This function takes care of given declarations and will then determine type-correct operators for the lhs expression. The result is a pair (ty,lhs') where lhs' is the LhsExpr with corrected types and ty is its type.

decl : Map<string, TypeC>
e : LhsExprC
Returns: TypeC * LhsExprC

TypeOfStmtC decl stmt

Full Usage: TypeOfStmtC decl stmt

Parameters:
Returns: TypeC option * StmtC

For simplicity, the parser will resolve all overloaded operators like the arithmetic operators to Cnat type. This function takes care of given declarations and will then determine type-correct operators for the statement. The result is a pair (ty,stmt) where stmt is the given statement with corrected types, and ty is the optional return type of that statement, which is usually None (meaning void) or otherwise the type of a return expression (for type checking function bodies).

decl : Map<string, TypeC>
stmt : StmtC
Returns: TypeC option * StmtC

VarsOfCmd cmd

Full Usage: VarsOfCmd cmd

Parameters:
Returns: Set<string>

VarsOfCmd cmd determines the set of variables ocurring in cmd.

cmd : CmdType
Returns: Set<string>

VarsOfExpr e

Full Usage: VarsOfExpr e

Parameters:
Returns: Set<string>

compute the set of variables that occur in an expression

e : ExprC
Returns: Set<string>

VarsOfLhsExpr lhs

Full Usage: VarsOfLhsExpr lhs

Parameters:
Returns: Set<string>

compute the set of variables that occur in a lhs-expression

lhs : LhsExprC
Returns: Set<string>

VarsOfStmt stmt

Full Usage: VarsOfStmt stmt

Parameters:
Returns: Set<string>

compute the set of variables that occur in a statement

stmt : StmtC
Returns: Set<string>

WrittenByCmd cmd

Full Usage: WrittenByCmd cmd

Parameters:
Returns: Set<string>

WrittenByCmd cmd determines the set of variables written by cmd. Remark: CmdAssignArr(y,x1,x2), i.e. y[x1]=x2, is mapped to a store instruction which does not modify any of the variables (y is the start address of the array which is not modified).

cmd : CmdType
Returns: Set<string>

getDeclsOfProgram p

Full Usage: getDeclsOfProgram p

Parameters:
Returns: (string * TypeC)[]

compute the global and local variable declarations of MiniC program p and checks for shaddowing

p : MiniCProgram
Returns: (string * TypeC)[]

getDeclsOfStmt st

Full Usage: getDeclsOfStmt st

Parameters:
Returns: (string * TypeC)[]

compute the local variable declarations of MiniC statement st

st : StmtC
Returns: (string * TypeC)[]

isVarname s

Full Usage: isVarname s

Parameters:
    s : string

Returns: bool

s : string
Returns: bool