Grammars Module
This module implements some functions for context-free grammars and parser generators.
Types
Type | Description |
data type for the actions of LR(0) and SLR(1) parsers |
|
|
|
|
|
symbols on the stack of a SLR(1) parser |
|
data types for nonterminals and terminals |
|
|
Functions and values
Function or value | Description |
|
|
Cocke-Younger-Kasami algorithm for checking which nonterminals of a grammar in Chomsky normal form can produce a given word. The function returns a parse table that maps integers (i,l) to sets of nonterminals such that those nonterminals can produce the subword starting at index i of length l. |
|
This function computes the set of nonterminal symbols of a grammar that can derive other nonterminal symbols via derivations with more than one production step. If one of the nonterminals can produce itself, then the algorithm for elimination of left recursion may end up in an infinite loop, and requires basic cleanup steps in advance. In all other cases, the algorithm for elimination of left recursion will terminate. |
|
|
|
|
Elimination of left recursion of grammar rules (note that this algorithm may not terminate in case that there is a nonterminal A that can derive itself via more than one production step): For the resulting grammar, the right hand sides of production rules of nonterminal A may only start with a nonterminal if that nonterminal is greater than A (where greater refers to the ordering given in the grammar).
|
This function computes an equation system that defines the First sets as a least fixpoint. It is directly read from the grammar using the nullable nonterminals. The function returns a map that maps each nonterminal to a set of strings denoting sets whose union is the First set. |
|
Compute the first sets of all terminals and nonterminals; i.e., the set of terminal symbols that are the leftmost symbols of words that can be derived from the considered symbols (and "" if the empty word can be derived). The result is a pair of maps (firstSets,firstSetsTable) where firstSets maps terminal and nonterminal symbols to a set of strings that are the first sets of those symbols. The second map firstSetsTable shows how the First sets have been computed by a fixpoint iteration.
|
|
This function computes an equation system that defines the Follow sets as a least fixpoint. It is directly read from the grammar using the nullable nonterminals. The function returns a map that maps each nonterminal to a set of strings denoting sets whose union is the Follow set. |
|
Full Usage:
GrammarOfPDA (states, sigmaPD, sigmaIN, initState, transRel)
Parameters:
string[]
sigmaPD : string[]
sigmaIN : string[]
initState : string
transRel : (string * string * string * string * string) seq
Returns: (Set<Symbol> * Set<Symbol> * Symbol * Grammar) * (Set<Symbol> * Set<Symbol> * Symbol * (Symbol * Symbol array array) array) * (Set<Symbol> * Set<Symbol> * Symbol * (Symbol * Symbol array array) array) option
|
generate a grammar for a given PDA according to the standard algorithm; the function returns three grammars (gr1,gr2,gr3) where gr1 is the direct result of the standard algorithm, gr2 is its basic clean-up by removing unreachable and useless symbols, and in grammar gr3, the nonterminals were renamed to A,B,C,.. if possible (therefore it is an option type).
|
|
function to be called by the cgi script
|
|
Greibach normal form demands that all right hand sides of production rules start with a terminal symbol. The following function assumes that the given grammar has no left recursion so that all right hand sides of production rules of a nonterminal start may only start with nonterminals greater than that nonterminal. This algorithm rewrites also these occurrences so that all right hand sides will start with terminal symbols only.
|
|
|
|
|
|
|
|
|
|
Parsing a grammar: captial letters are nonterminals, while all other letters are interpreted as terminal symbols. Rules must be given in the form A -> w1 | w2 | ... | wn where in each line contains the rules of one nonterminal symbol
|
Full Usage:
ParsePDA s
Parameters:
string
Returns: string[] * string[] * string[] * string * (string * string * string * string * string) list
|
parse a pushdown automaton which should looks, for example, as follows: states: q0,q1,q2 sigmaPD: $,a,b sigmaIN: a,b transitions: ($,q0,ϵ,ϵ,q0); ($,q0,a,$a,q0); (a,q0,a,aa,q0); (a,q0,b,ϵ,q1); ($,q1,ϵ,ϵ,q1); (a,q1,b,ϵ,q1);
|
Full Usage:
ParseTreeOfSLR12Dot ostr (nodes, edges)
Parameters:
TextWriter
nodes : Set<int * Symbol>
edges : Set<int * int>
|
write a parse tree as generated by ParseTreeOfSLR1Parse to dot format
|
Full Usage:
ParseTreeOfSLR1Parse pL
Parameters:
('a * 'b * ActionSLR1) list
Returns: Set<int * Symbol> * Set<int * int>
|
generate a parse tree of a parse run generated by function PrintSLR1Parse
|
Full Usage:
ParserSLR1 actTable w
Parameters:
Map<(int * Symbol), Set<ActionSLR1>>
w : string
Returns: (StackSymbol list * Symbol list * ActionSLR1) list
|
perform SLR(1) parsing based on an action table
|
|
function to be called by the cgi script
|
|
print a grammar
|
|
|
|
print the parsing derivation as computed by ParserSLR1 as a html table
|
|
|
|
|
|
|
|
|
Clean-up a given context-free grammar by removing useless non-terminals and their rules, i.e., nonterminals that are either not reachable or that are no productive (cannot generate any terminal word). The obtained grammar is equivalent, but has only useful symbols and rules. |
|
|