Arithmetic coding is a method of encoding data using a variable number of bits. The number of bits used to encode each symbol varies according to the probability assigned to that symbol. Low probability symbols use many bits, high probability symbols use fewer bits. So far, this makes Arithmetic Coding sound very similar to Huffman coding. However, there is an important difference. An arithmetic encoder doesn’t have to use an integral number of bits to encode a symbol. If the optimal number of bits for a symbol is 2.4, a Huffman coder will probably use 2 bits per symbol, whereas the arithmetic encoder my use very close to 2.4. This means an arithmetic coder can usually encode a message using fewer bits. The method by which this is accomplished is somewhat complex, and is explained in some of the links shown below

# Arithmetic Coding revealed - A guided tour from theory to praxis

An updated and translated version of our German paper “Proseminar Datenkompression - Arithmetische Kodierung” from 2001. To the best of our knowledge, it is the first comprehensive paper that describes the whole way from the basic principles of AC up to a simple implementation, fully documented with C++ source code.

http://www.sable.mcgill.ca/~ebodde/pubs/sable-tr-2007-5.pdf

# Michael Dipperstein’s Airthmetic Coding Page

The third in Michael’s collection of pages explaining lossless compression algorithms. A nice tutorial accompanied by ANSI C source.

http://michael.dipperstein.com/arithmetic/index.html

# Five Cents on Arithmetic Encoding

Aliaksei Sanko makes a few improvements to the code in the original 1987 CACM article. His sample includes a templated producer and consumer.

http://codeguru.com/Cpp/Cpp/algorithms/compression/article.php/c7043/

# Compression and Encryption Sources

Links to a variety of lossless coders, includes source for Huffman, arithmetic, LZSS, and other compressors.

http://www.larkin.lanit.ru/compress/compr.htm

# PAQ4 Archiver

The latest in the series of multi-model compressors from Matt Mahoney. This improves on PAQ3n’s remarkable Calgary corpus performance by an additional 12K, at some expense in speed. Takes a whopping 84MB at runtime!

http://cs.fit.edu/~mmahoney/compression/paq4.cpp

# PAQ3N

Matt Mahoney says that with recent improvements by Serge Osnach, PAQ3N does better on the Calgary Corpus than any other open source compressor.

http://cs.fit.edu/~mmahoney/compression/paq3n.cpp

# Arithmetic Coding

The NIST page on arithmetic coding from their Dictionary of Algorithms and Data Structures. A terse definition and a couple of links.

http://www.nist.gov/dads/HTML/arithmeticCoding.html

# Arithmetic Coding Revisited: A New Floating Point Technique

Anand Jain discusses his new technique for arithmetic coding. An executable file is included, but no source code.

# Compression via Arithmetic Coding in Java

Bob Carpenter has created a nice Java package that implements a PPM/arithmetic coding compression system. This page includes links to the source code, javadocs, and a fair amount of tutorial material. Very complete!

http://www.colloquial.com/ArithmeticCoding/

# Effective Arithmetic Coding

Sachin Garg’s article and source on arithmetic coding, includes a 64-bit implementation.

http://www.geocities.com/schngrg/sgarith.html

# Arithmetic Coding Revisited

by Moffat, Neal, and Witten. The authors of the original CACM article on arithmetic coding take a fresh look at the topic with an additional ten years of knowledge.

http://www.cs.technion.ac.il/~ronir/courses/advancedTopics/pubs/moffat.pdf

# Parallel Lossless Image Compression Using Huffman and Arithmetic Coding

by P. G. Howard and J. S. Vitter. This paper shows how images can be encoded and decoded using parallel processing. Both Huffman and arithmetic coding are examined.

http://www.cs.duke.edu/~jsv/Papers/catalog/node66.html

# Lossless Compression Algorithms (Entropy Encoding)

An overview of the basics, including Shannon-Fano, Huffman, Arithmetic coding, and a section on LZW for good measure.

http://www.cs.cf.ac.uk/Dave/Multimedia/node207.html

# Pegasus ELS Sample Code

Pegasus has a patented range coder that they license at no charge. This archive contains some C code that provides a sample implementation.

ftp://www.jpg.com/Osaugels.zip

# David Scott’s bijective 2nd order English compressor

This is compressor has static probabilities designed for English language text, encoding ASCII text and spaces only.

http://bijective.dogma.net/compres2bse.htm

# David Scott’s Bijective Static Entropy Compression

This compressor is designed to operate on English text - it has a static probability table configured for written English.

http://bijective.dogma.net/compres1bse.htm

# Proseminar Datenkompression - Arithmetische Kodierung

This page gives an introduction to Arithmetic Coding and shows how to implement it using floats or integers. There is also a proof of the efficiency of the algorithms, along with visualization and Win32 binaries. This page is in English and includes links to material in both German and English.

DataCompression.info user Juergen Abel found the site quite good: *Clear description, especially the explanation of the renormalisation part, full source code.*

http://www.bodden.de/studies/ac/

# Arithmetic Coding for Data Compression

by P. G. Howard and J. S. Vitter. “Arithmetic Coding for Data Compression,” Proceedings of the IEEE, 82(6), June 1994, 857-865. This paper describes arithmetic coding, and introduces a technique that uses table lookups to make the process more efficient.

http://www.cs.duke.edu/~jsv/Papers/catalog/node71.html

# Practical Implementations of Arithmetic Coding

This paper by Paul Howard and Jeff Vitter goes over some of the basics of Arithmetic Coding, then outlines a coder that has increased efficiency by virtue of substituting table lookups for expensive arithmetic operations.

http://www.cs.duke.edu/~jsv/Papers/catalog/node64.html

# Arithmetic Coder Visualisation

The author of this code created this visualization executable for a seminar.

DCL reader Drew D. downloaded the code, executed it, and had this to say about it: *The program is an executable for windows with a dll and some gif’s. The program seems to require a screen size greater than 800×600 to view the fixed size window. The program is a bit cryptic to me since I only have a basic understanding of Arithmetic encoding but it offers nice step by step encoding with some statistical information. It also allows for the modifying of input types and reading from a file.*.

http://www.bodden.de/download.php?cat=ac&file=acvis_exe.zip

# Wikipedia Entry: Arithmetic Coding

The Wikipedia entry for Arithmetic coding. Too short for anything other than a thumbnail sketch.

http://en.wikipedia.org/wiki/Arithmetic_coding

# Analysis of Arithmetic Coding for Data Compression

A paper by Howard and Vitter, available in both PDF and PS formats. The paper provides an analysis of the effect that models and various implementations of arithmetic coding have on compression.

http://www.cs.duke.edu/~jsv/Papers/catalog/node62.html

# US5422734: Method for arithmetically encoding half-tone image in image processing system

An arithmetic coding patent which appears to be currently assigned to Samsung.

# A Two-Stage Modelling Method for Compressing Binary Images by Arithmetic Coding

A paper describing a compression scheme for bitonal images. Complete text of paper in compressed postscript.

http://www.cs.utu.fi/tko/reports/R-92-6.html

# Standard ECMA-159 Data Compression for Information Interchange - Binary Arithmetic Coding Algorithm

The ECMA has a standard defintion for Arithmetic Coding.

http://www.ecma.ch/ecma1/STAND/ECMA-159.HTM

# An Improved Data Structure for Cumulative Probability Tables

A paper by Alistair Moffat describing an improvement on Peter Fenwick’s method for maintaining cumulative probability tables. The pointer on this page leads to some source implementing said table.

http://www.cs.mu.oz.au/~alistair/abstracts/mof99:spe.html

# Carryless Rangecoder

Mikael Lundqvist has adapted Dmitry Subbotin’s C++ rangecoder to C. Michael says it is simple and fast.

One DataCompression.info commented that some more documentation would be helpful.

http://w1.515.telia.com/~u51507446/range.tgz

# BIAC

A bijective arithmetic compressor written in Java. Includes all source code, seems to be free.

http://www.mandala.co.uk/biac/index.html

# David A. Scott’s Fully Bijective Arithmetic Coder

The term *bijective* as used by Scott means that for any given given file X you are guarantted that A( B( X ) ) == B( A( X ) ) == X, where A and B are a pair of bijectively matched programs. In this particular case, A and B are a compressor and decompressor that use arithmetic coding. Includes C++ source.

http://bijective.dogma.net/compres10.htm

# JPEG arithmetic encoding and decoding portable software implementation

A back-end implemenation of arithmetic coding for JPEG as defined in the standard. It is distributed as an add-on that can be used with the Independent JPEG groups library. The work of Guido Vollbeding.