
File:          alice29.txt
Contents:      Lewis Carroll: Alice in Wonderland (English novel)
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 128 (ASCII)
Byte per char: 1

File:          asyoulik.txt
Contents:      Shakespeare: As You Like It (English play)
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 128 (ASCII)
Byte per char: 1

File:          lcet10.txt
Contents:      Proceedings of Workshop on Electronic Texts (English
               technical writing)
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 128 (ASCII)
Byte per char: 1

File:          ailingNoASCIIx4.gb
Contents:      Excerpt of the Chinese novel "Love in a Fallen City" by
               Zhang Ailing (concatenated four times for length).
From:          http://www-personal.umich.edu/~dporter/sampler/sampler.html
Alphabet size: 2^16 (EUC-CN encoding of the GB2312 character set, with
               ASCII bytes (byte values < 128) removed)
Byte per char: 2

File:          y.tab.c
Contents:      Source code of a C program
From:          ftp://ftp.cwru.edu/pub/bash/bash-4.1.tar.gz
Alphabet size: 128 (ASCII)
Byte per char: 1

File:          E.coli
Contents:      Complete genome of the E. coli bacterium
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 4 (DNA)
Byte per char: 1

File:          random.txt
Contents:      Random characters
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 64 ([a-z|A-Z|0-9|!| ])
Byte per char: 1

File:          alphabet.txt
Contents:      The alphabet repeated
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 26 (a-z)
Byte per char: 1

File:          aaa.txt
Contents:      The letter 'a' repeated
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 1
Byte per char: 1

File:          pi.txt
Contents:      First million digits of pi
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 10 (decimal digits)
Byte per char: 1

File:          kennedy.xls
Contents:      Excel spreadsheet file
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 256 (binary data)
Byte per char: 1

File:          ptt5
Contents:      Fax image
From:          http://corpus.canterbury.ac.nz/descriptions/
Alphabet size: 256 (binary data)
Byte per char: 1

File:          string.ps.gz
Contents:      Gzip-compressed PostScript file
From:          Lecturer's archive
Alphabet size: 256 (binary data)
Byte per char: 1

File:          bash
Contents:      Compiled program
From:          /bin/bash on Ubuntu 9.10 for i686
Alphabet size: 2^32 (object code)
Byte per char: 4

File:          yosemite.jpg
Contents:      JPEG picture
From:          Private photo
Alphabet size: 256 (binary data)
Byte per char: 1

File:          yosemite_small.tiff
Contents:      TIFF picture
From:          Private photo
Alphabet size: 256 (binary data)
Byte per char: 1
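
The "Alphabet size" and "Byte per char" fields above can be sanity-checked
against a file's actual contents. The following is a minimal sketch, not part
of the original test setup: the helper name alphabet_size is hypothetical, and
it assumes symbols are fixed-width groups of bytes with any trailing partial
symbol ignored. The listed alphabet sizes appear to be nominal (for example,
128 for ASCII), so the observed count may well be smaller.

```python
def alphabet_size(data: bytes, bytes_per_char: int = 1) -> int:
    """Count the distinct fixed-width symbols that occur in data.

    Hypothetical helper for illustration only; bytes_per_char corresponds
    to the "Byte per char" field of a record.
    """
    if bytes_per_char < 1:
        raise ValueError("bytes_per_char must be at least 1")
    # Ignore a trailing partial symbol, if the length is not a multiple.
    usable = len(data) - len(data) % bytes_per_char
    symbols = {data[i:i + bytes_per_char]
               for i in range(0, usable, bytes_per_char)}
    return len(symbols)


if __name__ == "__main__":
    # Tiny in-memory stand-ins for aaa.txt and a DNA file.
    print(alphabet_size(b"aaaa"))        # one distinct symbol
    print(alphabet_size(b"ACGTACGT"))    # four distinct symbols
```

To check a real file, read it in binary mode and pass its bytes along with
the record's "Byte per char" value, e.g. 2 for ailingNoASCIIx4.gb.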
