Encodings Module
The functions in this module implement algorithms for encoding a given alphabet of letters based on the Shannon-Fano, Huffman, and arithmetic encodings as well as functions for linear encodings and CRC error checksums.
Types
Type | Description |
Prefix codes are represented as binary trees whose leafs are the symbols that were encoded. The codes correspond with the paths from the root to the character where left/right branchings correspond with bits 0 and 1, respectively. |
Functions and values
Function or value | Description |
Full Usage:
ArithCode probList wordToEncode
Parameters:
(char * float) list
wordToEncode : string
Returns: Map<char, (float * float)> * (float * float) array
|
Arithmetic encoding yields the minimal number of bits due to Shannon's source coding theorem. Given list of characters c with probabilities and a word to encode, the function computes a mapping from characters to the corresponding intervals in [0.0,1.0], and an array of intervals that are associated with the prefixes of the given word to encode.
|
Full Usage:
AverageSize probList codeTree
Parameters:
(char * float) list
codeTree : BinCodeTree
Returns: float
|
Determine the average number of bits per character of the prefix code given by the code tree.
|
Full Usage:
BinCodeTree2Dot probList codeTree ostr
Parameters:
(char * float) list
codeTree : BinCodeTree
ostr : TextWriter
|
Write a code tree in dot file format to output stream ostr; arguments are the list of characters with their probabilities, the code tree, and the output stream.
|
Full Usage:
BinCodeTree2Encoding codeTree
Parameters:
BinCodeTree
Returns: Map<char, string>
|
BinCodeTree2Encoding determines from a binary code tree (whose leafs are chars) the corresponding encoding of the characters by bitvectors.
|
Full Usage:
CyclicReduncdancyCheck codeWord crcWord
Parameters:
string
crcWord : string
Returns: string
|
This function computes a polynomial division as used by cyclic redundancy checks, i.e., it divides bitvector codeWord by bitvector crcWord, where subtraction is done by bitwise exclusive-or operations. The resulting remainder must be zero to indicate no errors. The checksum of a given code can be obtained by adding (String.length crcWord)-1 zeros to the right before computing the polynom division.
|
|
CyclicRedundancyCheckTool is the function to be called from the web page; it expects bitvector arguments codeWord and crcWord.
|
|
Decode a given string s consisting only of 0 and 1 with a prefix code given by the code tree; trailing bits that do not form another full code word are ignored.
|
|
Encode a given word s with a prefix code given by the code tree.
|
Full Usage:
Entropy probList
Parameters:
(char * float) list
Returns: float
|
Compute the entropy of a given character probability list.
|
|
EntropyEncodingTool is the function to be called from the web page; it expects the following key/value pairs:
|
Full Usage:
FibonacciCode n
Parameters:
int
Returns: int array * int[] array * string array
|
Fibonacci encoding according to Zeckendorf's theorem: Every natural number can be written as a sum of Fibonacci numbers that does not include two successor Fibonacci numbers. This function computes a triple (fSeq,zeck,codes) such that fSeq is a prefix of the Fibonacci sequence that ends with the first Fibonacci number which is greater than n; zeck is an array such that zeck[k] is an array of indices of those Fibonacci numbers whose sum yields k, and codes is an array that maps k to its code word codes[k]. The code words codes[k] start with "11" and have no occurrences of "11" other than this prefix (hence, it forms a prefix code).
|
Full Usage:
FibonacciEncode probList codes wordToEncode
Parameters:
(char * float) list
codes : string[]
wordToEncode : string
Returns: int array * string
|
Encoding a given word with a Fibonacci code as generated by function FibonacciCode.
|
Full Usage:
HuffmanCodeTree probList
Parameters:
(char * float) list
Returns: BinCodeTree
|
HuffmanCodeTree generates a binary code tree according to the Huffman algorithm. In contrast to the Shannon-Fano algorithm, Huffmann encoding is provably optimal for fixed size code words.
|
|
|
|
Print the prefix code of a code tree as a html table mapping characters to their code words.
|
Full Usage:
ShannonFanoCodeTree probList
Parameters:
(char * float) list
Returns: BinCodeTree
|
ShannonFanoCodeTree generates a binary code tree for encoding characters given with probabilities in probList according to the Shannon-Fano algorithm.
|