Data-Compression.org

data compression link collection

Benchmarks

Benchmarks are an important part of the data compression world. Performance against benchmarks is a good way to judge algorithms in a fair manner. The problem, of course, is selecting benchmarks that accurately model the needs of the eventual users of the algorithm.
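As a rough illustration of what a benchmark measures, here is a minimal sketch in Python that compares the compression ratio of three standard-library codecs on the same input. The `CODECS` table and the `compression_ratios` helper are names made up for this example, and the sample data is arbitrary; real benchmarks such as the ones listed below use fixed corpora and usually track speed and memory as well.

```python
import bz2
import lzma
import zlib

# Hypothetical codec table for this sketch: the keys are just labels,
# the values wrap standard-library compressors.
CODECS = {
    "zlib": lambda data: zlib.compress(data, 9),
    "bz2": lambda data: bz2.compress(data, 9),
    "lzma": lambda data: lzma.compress(data),
}


def compression_ratios(data: bytes) -> dict:
    """Return compressed size divided by original size per codec (lower is better)."""
    if not data:
        raise ValueError("need non-empty input")
    return {name: len(fn(data)) / len(data) for name, fn in CODECS.items()}


if __name__ == "__main__":
    # Arbitrary, highly repetitive sample; a real test would read corpus files.
    sample = b"the quick brown fox jumps over the lazy dog\n" * 200
    ranked = sorted(compression_ratios(sample).items(), key=lambda kv: kv[1])
    for name, ratio in ranked:
        print(f"{name:5s} {ratio:.4f}")
```

The ranking this prints depends entirely on the input, which is the point made above: a result is only as meaningful as the test data is representative.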


Maximum Compression

Werner Bergmans has created a new benchmark site that aims to show the best compression ratios possible for multiple file types, including English text, executables, graphics, and so on. Werner says he is running these tests with 80-100 programs for each file type!

Reader Werner B. says “Useful site to compare results of different compression programs. Regularly updated.”

http://www.maximumcompression.com/

* * * * *

Posted in February 28th, 2004

Berto’s Compression Spreadsheet

Comparisons of over 230 archivers, in handy Excel format, from Berto.

Reader Emiliano C. said “Wonderful! Great! Wonderful! Cool!”

http://cs.fit.edu/~mmahoney/compression/ct.xls

* * * * *

Posted in September 10th, 2003

Silesia compression corpus

Published in Files, Benchmarks

Sebastian Deorowicz decided to create a compression corpus of his own, attempting to overcome some of the deficiencies he sees in the old guard.

http://sun.iinf.polsl.gliwice.pl/~sdeor/corpus.htm

         

Posted in June 23rd, 2003

Benchmark Images and Files

Published in Files, Links, Benchmarks

David Cary is a major link farmer. One of the sections of his massive Data Compression page has links to images and files that are used in various benchmarks.

http://agora.rdrop.com/~cary/html/data_compression.html#benchmark

         

Posted in May 1st, 2003

Archive Comparison Test 2.0

ACT - by Jeff Gilchrist. ACT is the Archive Comparison Test, a long-running benchmark of well-known archiving programs. Lots of good updates in May of 2002.

http://compression.ca/

* * * * *

Posted in June 2nd, 2002

Jeff Gilchrist

Published in People, Benchmarks

This is Jeff Gilchrist’s home page. Jeff is the curator of the Archive Comparison Test, which presumably keeps him busy.

http://gilchrist.ca/jeff

* * * * *

Posted in June 25th, 2001

The Art Of Lossless Data Compression

A comprehensive set of tests on lossless data compression.

http://geocities.com/eri32/

* * * * *

Posted in June 10th, 2001

Neural Network Text Compression Programs and Papers

A couple of programs using neural networks for compression, along with a couple of papers by the author. This area of data compression is definitely underserved; check out what’s here and see if neural networks deserve more attention than they are getting.

Update: This page appears to now have some links to general lossless benchmarking info.

http://cs.fit.edu/~mmahoney/compression/

* * * *  

Posted in July 17th, 2000

Waterloo BragZone test suite

In the BragZone you will find the following:


  • A suite of test images, the “Waterloo Repertoire”.
  • Rate-Distortion plots for various compression codecs.
  • The data from which the above plots are derived.
  • Sample images at selected compression ratios.

ftp://links.uwaterloo.ca/pub/BragZone

* * * * *

Posted in November 14th, 1999

Waterloo BragZone

Published in Files, Benchmarks

Comparing different image compression programs has always been difficult. As a suite of test images and a place for archiving results, the Waterloo BragZone hopes to overcome these problems. Central to the effort is the Waterloo Repertoire, a suite of 32 test images.

http://links.uwaterloo.ca/bragzone.base.html

* * * * *

Posted in November 14th, 1999

PNG Suite from Willem van Schaik

Published in Files, Benchmarks, PNG

This is Willem van Schaik’s suite of PNG icons for testing PNG decoder engines, PNG viewers, and PNG browsers.

http://www.libpng.org/pub/png/pngsuite.html

* * * * *

Posted in November 14th, 1999

Where can I find Lenna and other images?

The comp.compression FAQ attempts to answer this for you.

http://www.faqs.org/faqs/compression-faq/part1/section-30.html

* * * * *

Posted in November 14th, 1999

yabbawhap - Y and AP compression filters

Public domain code by Daniel Bernstein. (Note that this ftp site has an excellent selection of compression programs and code.)

ftp://ftp.inria.fr/system/arch-compr/yabba.tar.Z

* * *    

Posted in November 13th, 1999

Project Runeberg

Published in Files, Swedish, Benchmarks

A huge collection of Swedish language text files.

ftp://ftp.lysator.liu.se/pub/runeberg/src

* * *    

Posted in November 13th, 1999

CCITT standard fax images

TIFF versions of the CCITT images.

http://www.cs.waikato.ac.nz/~singlis/ccitt.html

* * * * *

Posted in November 7th, 1999

The Canterbury Corpus

Published in Files, Benchmarks

This is the home page for the Canterbury Corpus, a test suite designed to provide a standard set of files for lossless compression testing. You will find links to the actual files in the test suite, as well as papers and test results.

http://corpus.canterbury.ac.nz/

         

Posted in September 21st, 1999

The British National Corpus

Published in Files, Benchmarks

The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of current British English, both spoken and written.

http://info.ox.ac.uk/bnc/

         

Posted in September 21st, 1999

The Calgary Corpus

This is the home page for the Calgary Corpus. This set of files has long been the standard used for comparison of various lossless compression techniques.

http://links.uwaterloo.ca/calgary.corpus.html

         

Posted in September 21st, 1999

Compression Ratios

A set of benchmarks for lossless compression of various test sets, including the CCITT B&W images, the Calgary Corpus, and a Gray Scale set. Includes some dates for checking historical progression.

http://www.cs.waikato.ac.nz/~singlis/ratios.html

         

Posted in December 24th, 1998

The Calgary Corpus Compression Challenge

Leonid A. Broukhis puts his money where his mouth is by offering a cash prize for good, reproducible compression. He has paid out at least one modest prize.

http://www.mailcom.com/challenge

         

Posted in December 23rd, 1998

Calgary Corpus test results

Published in Results, Benchmarks

A set of test results for compression programs run against the Calgary Corpus. These results are kept on the Canterbury web site so that they can be easily referenced for comparison purposes.

http://corpus.canterbury.ac.nz/details/

         

Posted in December 12th, 1998

An FTP site for the Calgary Corpus

Published in Benchmarks

The Calgary Corpus is a set of files that were put together by compression mavens Bell, Cleary, and Witten in 1989 for benchmarking lossless compression algorithms. The set includes English text, source code, executable code, and some data files.

ftp://ftp.cpsc.ucalgary.ca/pub/projects/text.compression.corpus/

         

Posted in December 8th, 1998