Publish a program (self-extracting archive) archive8.exe of
size S.
If run, it produces a 10^8 byte file data8 that
is identical to enwik8.
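The identity requirement above can be checked mechanically. The following is a minimal Python sketch, not part of the official rules or of the committee's verification procedure. The function name is_valid_output and the parameter names are illustrative assumptions; the file names are those used in the rules.

```python
import filecmp
import os

EXPECTED_SIZE = 10**8  # size of data8 required by the rules, in bytes


def is_valid_output(candidate="data8", reference="enwik8",
                    expected_size=EXPECTED_SIZE):
    """Return True if `candidate` has the expected size and is
    byte-for-byte identical to `reference`."""
    if os.path.getsize(candidate) != expected_size:
        return False
    # shallow=False forces a comparison of the file contents,
    # not just of the os.stat() metadata.
    return filecmp.cmp(candidate, reference, shallow=False)
```

Calling is_valid_output() with no arguments compares data8 against enwik8 in the current directory.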
Programs must be Windows or Linux (x86 32bit) executables.
Programs must run without input from other sources (files, network,
dictionaries, etc.) under Windows or Linux without additional
installations. Use of standard libraries, e.g. for file I/O, is allowed.
Programs should run in less than 10 hours on a 2GHz Pentium 4 with
1GB RAM and 10GB free HD for temporary files.
In lieu of a self-extracting archive, a decompressor program decomp8.exe
plus a compressed file archive8.bhm may be published, where
decomp8.exe produces data8 from archive8.bhm.
In this case, the total size is S := length(decomp8.exe)+length(archive8.bhm).
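As an illustration of how S is computed in both cases, here is a minimal Python sketch. The helper name total_size is an assumption made for this example; only the file names come from the rules.

```python
import os


def total_size(*paths):
    """Sum the sizes in bytes of all files that count towards S."""
    return sum(os.path.getsize(p) for p in paths)


# Illustrative usage (file names as in the rules):
#   S = total_size("archive8.exe")                 # self-extracting archive
#   S = total_size("decomp8.exe", "archive8.bhm")  # decompressor + archive
```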
In lieu of archive8, a compressor comp8.exe that creates archive8 from
enwik8 may be published, together with the size of archive8 (and any
options used to create it).
Resource restrictions for the (de)compressor are the same as for the
self-extracting archive.
The filenames used (except enwik8) are just for illustration.
If your contribution violates some of the rules, you may still be
eligible for the Large Compression Benchmark (see below).
Award = Z×(L-S)/L, where S = new record (size of archive8.exe or decomp8.exe+archive8.bhm+opt), L = previous record for S, Z = amount in prize fund (currently 50'000€)
Update: L := S.
Minimum award is 3% of Z.
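The award rule can be sketched in a few lines of Python. This is an illustrative reading of the formula, not an official calculator: the function name award and its parameter names are assumptions, and the sketch treats a submission that improves on L by less than 3% as earning nothing and leaving L unchanged.

```python
def award(previous_record, new_record, fund=50_000, min_fraction=0.03):
    """Return (award_in_euros, updated_record).

    previous_record is L and new_record is S, both in bytes;
    fund is Z in euros.
    """
    improvement = previous_record - new_record
    if improvement < min_fraction * previous_record:
        # Below the minimum: no award, and L stays as it was.
        return 0.0, previous_record
    # Award = Z * (L - S) / L, followed by the update L := S.
    return fund * improvement / previous_record, new_record
```

For example, with a hypothetical L of 1000 bytes and S of 900 bytes, the improvement is 10% and the award is 10% of Z.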
Contributions are dealt with in the order of their submission.
The contribution is subject to public comments for a period of at
least 30 days before the prize is awarded.
Compressors/decompressors do not have to be general purpose. They may
be tuned specifically to this benchmark and are allowed to reject or
fail on any input other than enwik8/archive8.bhm.
Only the version and combination of options submitted is eligible for
the prize.
If an author breaks his own record within 30 days,
the older submission is regarded as withdrawn.
If a submission fails to meet the criteria for the prize, the entrant
will be informed and the submission will henceforth be ignored. In
particular, a submission that misses the 3% criterion does not diminish
the prize (L remains unchanged).
If your compressor beats the current record but violates some of the
constraints regarding operating system, DLLs used, programming
language, etc., Matt Mahoney may be willing to assist you in satisfying
them. For instance, if you send portable C code, he can compile it under
Windows for you.
You can run some of the previous records on your system, and by
comparing your runtime with the displayed runtime, you can estimate
whether your algorithm will meet the time constraint on our machine.
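The comparison described above amounts to a simple proportional estimate. The Python sketch below assumes that runtimes scale linearly between your machine and the test machine, which is only a rough approximation; the function and parameter names are made up for this example.

```python
TIME_LIMIT_S = 10 * 3600  # "less than 10 hours", expressed in seconds


def estimate_test_machine_runtime(my_runtime_s, record_runtime_here_s,
                                  record_runtime_displayed_s):
    """Scale your program's runtime by the ratio observed for a previous
    record: its displayed runtime divided by its runtime on your system."""
    scale = record_runtime_displayed_s / record_runtime_here_s
    return my_runtime_s * scale
```

If a previous record takes 30 minutes on your system and its displayed runtime is 60 minutes, a program that needs 1 hour on your system would be estimated at about 2 hours on the test machine, well under TIME_LIMIT_S.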
Members of the prize committee are not eligible for prize money. If a
committee member publishes a (de)compressor that would have otherwise
won prize money, then L is updated as in the normal case, but no
money is paid.
The above formula currently amounts to 1€ for every 330 byte
improvement, with a minimum improvement of 494'449 bytes.
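These two figures can be cross-checked against each other. Since Award = Z×(L-S)/L, each byte of improvement is worth Z/L euros, so one euro corresponds to L/Z bytes. The Python sketch below back-derives an approximate L from the stated minimum improvement; that implied L is an estimate computed here, not a figure quoted from the rules.

```python
Z = 50_000                 # prize fund in euros, as stated in the rules
MIN_IMPROVEMENT = 494_449  # minimum improvement in bytes, as stated
MIN_FRACTION = 0.03        # the 3% criterion

# Minimum improvement = 3% of L, so L is roughly MIN_IMPROVEMENT / 0.03.
implied_L = MIN_IMPROVEMENT / MIN_FRACTION
bytes_per_euro = implied_L / Z

print(round(bytes_per_euro))  # rounds to 330, matching the stated rate
```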
If a decompressor has multiple authors, then a submission must
include instructions for dividing the prize money. All authors must
agree on this distribution before any money can be awarded.
There will be a waiting period of at least 30 days after submission
to allow for public comment and verification. Comments should be made to
the Hutter Prize Newsgroup or by email to members of the Prize committee.
The programs and/or data files must be available on the Internet for
free download and testing.
The (virtual) prize fund (Z) is constant. It is not decreased
after awarding a prize. It may increase if additional sponsors
contribute to it. (Please contact Marcus Hutter if you wish to
contribute.)
The prize will be paid if the solution reflects the spirit of the
contest. In particular, decompressors (secretly) receiving any kind of "outside"
information are forbidden. Also, in order to verify your claim, we need to
be able to run your executable on our machines. Payment of the prize
cannot be legally enforced. Marcus Hutter will make the final decision
on whether to recognize a record, whether to award a prize, and the amount.
Rules may change at any time without notice to meet the goals of
fairness, accuracy, maximizing public participation, and recognizing
existing practice.