PPM is a specialized form of compression based on Markov modeling. In general, PPM predicts the probability of a given character based on a given number of characters that immediately precede it. PPM is conceptually simple, but often computationally expensive.

# XMLPPM: XML-Conscious PPM Compression

An open source project that performs PPM compression on XML files. The advance knowledge of XML format helps give this algorithm somewhat better compressions ratios on XML data than universal compressors.

Version 0.98.1 was shipping as of June, 2004.

http://www.cs.cornell.edu/People/jcheney/xmlppm/xmlppm.html

# PAQ4 Archiver

The latest in the series of multi-model compressors from Matt Mahoney. This improves on PAQ3n’s remarkable Calgary corpus performance by an additional 12K, at some expense in speed. Takes a whopping 84MB at runtime!

http://cs.fit.edu/~mmahoney/compression/paq4.cpp

# PAQ3N

Matt Mahoney says that with recent improvements by Serge Osnach, PAQ3N does better on the Calgary Corpus than any other open source compressor.

http://cs.fit.edu/~mmahoney/compression/paq3n.cpp

# Text Preprocessing for Data Compression

by Jürgen Abel and William Teahan. This paper looks at a few different techniques for preprocessing data before performing text compression, and compares the gains achieved when combining the preprocessors with PPM, BWT, and LZ algorithms.

http://www.data-compression.info/JuergenAbel/Preprints/Preprint_Text_Preprocessing.pdf

# ShipInBottle Archiver

This is a Russian language web site, so I’ll have to apologize in advance for any mistakes. As far as I can tell, this is a PPM-based archiver. I can’t tell if it’s free or not, so I’m listing it as both!

# Dmitry Shkarin

Dmitry Shkarin is the author of the PPMD compressor. His homepage has links to to versions A-I of the compressor as of April, 2003. This page is in Russian, but I find that running it through

BabelFish produces a usuable translation.

# Compress::PPMd

Salvador Fandiño García has created a Perl interface to Dmitry Shkarin PPMd compression library.

Version 0.05 is shipping as of March, 2003.

http://search.cpan.org/author/SALVA/

# Compression via Arithmetic Coding in Java

Bob Carpenter has created a nice Java package that implements a PPM/arithmetic coding compression system. This page includes links to the source code, javadocs, and a fair amount of tutorial material. Very complete!

http://www.colloquial.com/ArithmeticCoding/

# BICOM - BIjective COMpressor

BICOM is a freely available open source compressor. It uses a souped-up PPM algorithm, and is completely *bijective*.

Reader comment:

*Wow this is hot! …a bijective compressor
using full size Rijndael encryption…*

http://www3.sympatico.ca/mt0000/bicom/bicom.html

# UPL Compression : the complete professional toolkit

The UPL Compression Library is a high-performance professional compression library. It offers the ability to compress and decompress data, buffers, strings or single files and features the latest innovations in data compression. The library offers eight extremely powerful compression algorithms. Dynamic Huffman, Arithmetic, BWT, Ppm and several Lempel Ziv flavors.

DataCompression.info user John G. had this to say: *I was looking for adding a better compression to my Visual Basic project and it worked like a charm. The compression ratio is really good, better than Zip!*

# PPM FAQ

What appears to be a comprehensive FAQ on PPM programming. Russian language, and fairly slow loading, but no reason not to try Google or BabelFish for a translation.

http://www.arctest.narod.ru/descript/ppm-faq.htm

# PPMD var I1

From the archive: *PPMd program is file-to-file compressor, it is written for embedding in user programs mainly and it is not intended for immediate use. I was interested in speed and performance improvements of abstract PPM model [1-5] only, without tuning it to particular data types, therefore compressor works good enough for texts, but it is not so good for nonhomogeneous files (executables) and for noisy analog data (sounds, pictures etc.).*

ftp://ftp.elf.stuba.sk/pub/pc/pack/ppmdi1.rar

# Markov Predictive Coders : PPMZ

This directory contains source and executable for Charles Bloom’s PPMZ encoder, as well as a paper on PPMZ and some benchmark results. There are also links to a few other pages containing PPM information.

**News:** Charles Bloom has now released the source code to PPMZ2. He says it is both cleaner and faster than the original PPMZ code.

http://www.cbloom.com/src/ppmz.html

# Design and Analysis of Fast Text Compression Based on Quasi-Arithmetic Coding

by Paul Howard and Jeff Vitter. Here’s what they have to say about this paper from the abstract: *Our algorithm, related to the PPM method, simplifies the modeling phase by eliminating the escape mechanism, and speeds up coding by using a combination of quasi-arithmetic coding and Rice coding. We provide details of the use of quasi-arithmetic code tables, and analyze their compression performance. Our Fast PPM method is shown experimentally to be almost twice as fast as the PPMC method, while giving comparable compression.*.

http://www.cs.duke.edu/~jsv/Papers/catalog/node68.html

# Bill Teahan’s Papers

Bill has links to some of his papers here, citations for all.

http://www.scms.rgu.ac.uk/staff/smc/researchcoord/staff_publications/wjt.html

# PPM: one step to practicality

by Dmitry Shkarin. Dmitry is the author of PPMd, a well know compressor. PPM is a powerful compression technology which has a reputation for being resource intensive. In this paper Dmitry adds bells and whistles to basic PPM and gets fantastic results.

Reader jcp said: *Big problem: algorithm very slow, it’s impractical to search anything inside of compressed files (compared with gzip), .. But it’s good to send/receive letters of e-mail, well compressed!!!*

http://DataCompression.info/Miscellaneous/PPMII_DCC02.pdf

# Lectures on Statistical Modeling Theory

A paper by Jorma Rissanen, IBM Almaden Research Center.

http://www.cs.tut.fi/~rissanen/papers/model.ps

# Test Harness for the PPMD Compressor

Here’s a nice little frameword for testing PPMD, or for that matter, any other compressor.

http://www.codeproject.com/cpp/ppmd.asp

# PPMD Var ‘G’ (Dmitry Shkarin)

Free PPM program with full source, advertised as fast.

ftp://ftp.simtel.net/pub/simtelnet/win95/compress/ppmdi.zip

# On-Line Stochastic Processes in Data Compression

Suzanne Bunton’s PhD thesis. *A recent post to comp.compression said that Bunton had used floating point math for arithmetic coding. I haven’t verified this, but it sounds worth a look.*

ftp://ftp.cs.washington.edu/tr/1997/03/UW-CSE-97-03-02.PS.Z

# PPMD Var ‘F’ (Dmitry Shkarin)

A file to file compressor that uses the PPMD algorithm is boasts of being very fast.

ftp://ftp2.itb.it/files/PC/SAC/pack/ppmdf.rar

# Finite context modeling

An article by Arturo Campos that describes and discusses Finite Context Modeling. This modeling technique is uses by PPM compressors, although Campos makes the point that the ideas in this article can be used in other compressors as well.

http://www.arturocampos.com/ac_finite_context_modeling.html

# PPMD Var ‘E’ (Dmitry Shkarin)

Another version of the PPMD program by Dmitry Shkarin. Readme says this includes a bug fix, and removal of one model.

ftp://ftp.nsk.su/.3/windows/compress/ppmde.zip

# Lzp

Arturo Campos presents the LZP algorithm, first described by Charles Bloom. LZP is a hybrid of dictionary based coding and statistical modeling. This means it has some of the elements of popular LZSS encoders, but takes advantage of PPM style modeling as well. The combination of the two leads to very good compression ratios.

http://www.arturocampos.com/ac_lzp.html

# Lzp order-3 with linked lists

Another article about LZP by Arturo Campos. This piece describes a specific implementation technique using linked lists.

http://www.arturocampos.com/ac_lzp_o3_ll.html

# Lzp research

Arturo Campos has been doing quite a bit of research on the LZP algorithm (first described by Charles Bloom.) This paper presents all of his results to date.

http://www.arturocampos.com/ac_lzp_research.html

# Lossless Data Compression Toolkit

Version 1.1 of the lossless data compression toolkit by Nico deVries. The C sources in this toolkit include an LZW compressor, AR002 archiver, a PPM like compressor using arithmetic compression, Huffman compressor, splay tree program, and LZRW1. Quite a variety.

ftp://garbo.uwasa.fi/pc/programming/lds_11.zip

# PPM by Bill Teahan

This ftp site has a few PPM programs, along with Bill Teahan’s thesis.

ftp://ftp.cs.waikato.ac.nz/pub/compression/ppm/

# Arithmetic Coding + Statistical Modeling = Data Compression

An article that explains arithmetic coding, plus a sample program that implements a limited sort of PPM.

Christable C. had these kind words: *I have read some other articles, but not clearly known. When reading this article, I find that Arithmetic Coding is easy to know !*

http://www.dogma.net/markn/articles/arith/part1.htm

# TinyHAP v1.02c

Tiny HAP is based on Harald Feldman’s HAP & PAH 3.0, and is distributed freely. The source code is not identical and should run 25% faster.

ftp://ftp.turbo.nsk.su/pub/msdos/archiver/thap102c.zip