X-Git-Url: http://git.pld-linux.org/?a=blobdiff_plain;f=bzip2-libtoolizeautoconf.patch;fp=bzip2-libtoolizeautoconf.patch;h=6c3f2e7a18cf4ebb0830fc40e3cb56361693964a;hb=d967e3ecf90efdffc98c28109ea6316ce4faffcd;hp=0000000000000000000000000000000000000000;hpb=c4e9b52407b34c3b1168d4de438d8490dec2f1b5;p=packages%2Fbzip2.git diff --git a/bzip2-libtoolizeautoconf.patch b/bzip2-libtoolizeautoconf.patch new file mode 100644 index 0000000..6c3f2e7 --- /dev/null +++ b/bzip2-libtoolizeautoconf.patch @@ -0,0 +1,13968 @@ +diff -Nru bzip2-1.0.1/AUTHORS bzip2-1.0.1.new/AUTHORS +--- bzip2-1.0.1/AUTHORS Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/AUTHORS Sat Jun 24 20:13:05 2000 +@@ -0,0 +1 @@ ++Julian Seward +diff -Nru bzip2-1.0.1/CHANGES bzip2-1.0.1.new/CHANGES +--- bzip2-1.0.1/CHANGES Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/CHANGES Thu Jan 1 01:00:00 1970 +@@ -1,167 +0,0 @@ +- +- +-0.9.0 +-~~~~~ +-First version. +- +- +-0.9.0a +-~~~~~~ +-Removed 'ranlib' from Makefile, since most modern Unix-es +-don't need it, or even know about it. +- +- +-0.9.0b +-~~~~~~ +-Fixed a problem with error reporting in bzip2.c. This does not effect +-the library in any way. Problem is: versions 0.9.0 and 0.9.0a (of the +-program proper) compress and decompress correctly, but give misleading +-error messages (internal panics) when an I/O error occurs, instead of +-reporting the problem correctly. This shouldn't give any data loss +-(as far as I can see), but is confusing. +- +-Made the inline declarations disappear for non-GCC compilers. +- +- +-0.9.0c +-~~~~~~ +-Fixed some problems in the library pertaining to some boundary cases. +-This makes the library behave more correctly in those situations. The +-fixes apply only to features (calls and parameters) not used by +-bzip2.c, so the non-fixedness of them in previous versions has no +-effect on reliability of bzip2.c. +- +-In bzlib.c: +- * made zero-length BZ_FLUSH work correctly in bzCompress(). 
+- * fixed bzWrite/bzRead to ignore zero-length requests. +- * fixed bzread to correctly handle read requests after EOF. +- * wrong parameter order in call to bzDecompressInit in +- bzBuffToBuffDecompress. Fixed. +- +-In compress.c: +- * changed setting of nGroups in sendMTFValues() so as to +- do a bit better on small files. This _does_ effect +- bzip2.c. +- +- +-0.9.5a +-~~~~~~ +-Major change: add a fallback sorting algorithm (blocksort.c) +-to give reasonable behaviour even for very repetitive inputs. +-Nuked --repetitive-best and --repetitive-fast since they are +-no longer useful. +- +-Minor changes: mostly a whole bunch of small changes/ +-bugfixes in the driver (bzip2.c). Changes pertaining to the +-user interface are: +- +- allow decompression of symlink'd files to stdout +- decompress/test files even without .bz2 extension +- give more accurate error messages for I/O errors +- when compressing/decompressing to stdout, don't catch control-C +- read flags from BZIP2 and BZIP environment variables +- decline to break hard links to a file unless forced with -f +- allow -c flag even with no filenames +- preserve file ownerships as far as possible +- make -s -1 give the expected block size (100k) +- add a flag -q --quiet to suppress nonessential warnings +- stop decoding flags after --, so files beginning in - can be handled +- resolved inconsistent naming: bzcat or bz2cat ? +- bzip2 --help now returns 0 +- +-Programming-level changes are: +- +- fixed syntax error in GET_LL4 for Borland C++ 5.02 +- let bzBuffToBuffDecompress return BZ_DATA_ERROR{_MAGIC} +- fix overshoot of mode-string end in bzopen_or_bzdopen +- wrapped bzlib.h in #ifdef __cplusplus ... extern "C" { ... } +- close file handles under all error conditions +- added minor mods so it compiles with DJGPP out of the box +- fixed Makefile so it doesn't give problems with BSD make +- fix uninitialised memory reads in dlltest.c +- +-0.9.5b +-~~~~~~ +-Open stdin/stdout in binary mode for DJGPP. 
+- +-0.9.5c +-~~~~~~ +-Changed BZ_N_OVERSHOOT to be ... + 2 instead of ... + 1. The + 1 +-version could cause the sorted order to be wrong in some extremely +-obscure cases. Also changed setting of quadrant in blocksort.c. +- +-0.9.5d +-~~~~~~ +-The only functional change is to make bzlibVersion() in the library +-return the correct string. This has no effect whatsoever on the +-functioning of the bzip2 program or library. Added a couple of casts +-so the library compiles without warnings at level 3 in MS Visual +-Studio 6.0. Included a Y2K statement in the file Y2K_INFO. All other +-changes are minor documentation changes. +- +-1.0 +-~~~ +-Several minor bugfixes and enhancements: +- +-* Large file support. The library uses 64-bit counters to +- count the volume of data passing through it. bzip2.c +- is now compiled with -D_FILE_OFFSET_BITS=64 to get large +- file support from the C library. -v correctly prints out +- file sizes greater than 4 gigabytes. All these changes have +- been made without assuming a 64-bit platform or a C compiler +- which supports 64-bit ints, so, except for the C library +- aspect, they are fully portable. +- +-* Decompression robustness. The library/program should be +- robust to any corruption of compressed data, detecting and +- handling _all_ corruption, instead of merely relying on +- the CRCs. What this means is that the program should +- never crash, given corrupted data, and the library should +- always return BZ_DATA_ERROR. +- +-* Fixed an obscure race-condition bug only ever observed on +- Solaris, in which, if you were very unlucky and issued +- control-C at exactly the wrong time, both input and output +- files would be deleted. +- +-* Don't run out of file handles on test/decompression when +- large numbers of files have invalid magic numbers. +- +-* Avoid library namespace pollution. Prefix all exported +- symbols with BZ2_. +- +-* Minor sorting enhancements from my DCC2000 paper. 
+- +-* Advance the version number to 1.0, so as to counteract the +- (false-in-this-case) impression some people have that programs +- with version numbers less than 1.0 are in someway, experimental, +- pre-release versions. +- +-* Create an initial Makefile-libbz2_so to build a shared library. +- Yes, I know I should really use libtool et al ... +- +-* Make the program exit with 2 instead of 0 when decompression +- fails due to a bad magic number (ie, an invalid bzip2 header). +- Also exit with 1 (as the manual claims :-) whenever a diagnostic +- message would have been printed AND the corresponding operation +- is aborted, for example +- bzip2: Output file xx already exists. +- When a diagnostic message is printed but the operation is not +- aborted, for example +- bzip2: Can't guess original name for wurble -- using wurble.out +- then the exit value 0 is returned, unless some other problem is +- also detected. +- +- I think it corresponds more closely to what the manual claims now. +- +- +-1.0.1 +-~~~~~ +-* Modified dlltest.c so it uses the new BZ2_ naming scheme. +-* Modified makefile-msc to fix minor build probs on Win2k. +-* Updated README.COMPILATION.PROBLEMS. +- +-There are no functionality changes or bug fixes relative to version +-1.0.0. This is just a documentation update + a fix for minor Win32 +-build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is +-utterly pointless. Don't bother. +diff -Nru bzip2-1.0.1/COPYING bzip2-1.0.1.new/COPYING +--- bzip2-1.0.1/COPYING Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/COPYING Sat Jun 24 20:13:05 2000 +@@ -0,0 +1,39 @@ ++ ++This program, "bzip2" and associated library "libbzip2", are ++copyright (C) 1996-2000 Julian R Seward. All rights reserved. ++ ++Redistribution and use in source and binary forms, with or without ++modification, are permitted provided that the following conditions ++are met: ++ ++1. 
Redistributions of source code must retain the above copyright ++ notice, this list of conditions and the following disclaimer. ++ ++2. The origin of this software must not be misrepresented; you must ++ not claim that you wrote the original software. If you use this ++ software in a product, an acknowledgment in the product ++ documentation would be appreciated but is not required. ++ ++3. Altered source versions must be plainly marked as such, and must ++ not be misrepresented as being the original software. ++ ++4. The name of the author may not be used to endorse or promote ++ products derived from this software without specific prior written ++ permission. ++ ++THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS ++OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED ++WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ++ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY ++DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL ++DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE ++GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS ++INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, ++WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING ++NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS ++SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ++ ++Julian Seward, Cambridge, UK. ++jseward@acm.org ++bzip2/libbzip2 version 1.0 of 21 March 2000 ++ +diff -Nru bzip2-1.0.1/ChangeLog bzip2-1.0.1.new/ChangeLog +--- bzip2-1.0.1/ChangeLog Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/ChangeLog Sat Jun 24 20:13:05 2000 +@@ -0,0 +1 @@ ++ +diff -Nru bzip2-1.0.1/INSTALL bzip2-1.0.1.new/INSTALL +--- bzip2-1.0.1/INSTALL Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/INSTALL Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,182 @@ ++Basic Installation ++================== ++ ++ These are generic installation instructions. 
++ ++ The `configure' shell script attempts to guess correct values for ++various system-dependent variables used during compilation. It uses ++those values to create a `Makefile' in each directory of the package. ++It may also create one or more `.h' files containing system-dependent ++definitions. Finally, it creates a shell script `config.status' that ++you can run in the future to recreate the current configuration, a file ++`config.cache' that saves the results of its tests to speed up ++reconfiguring, and a file `config.log' containing compiler output ++(useful mainly for debugging `configure'). ++ ++ If you need to do unusual things to compile the package, please try ++to figure out how `configure' could check whether to do them, and mail ++diffs or instructions to the address given in the `README' so they can ++be considered for the next release. If at some point `config.cache' ++contains results you don't want to keep, you may remove or edit it. ++ ++ The file `configure.in' is used to create `configure' by a program ++called `autoconf'. You only need `configure.in' if you want to change ++it or regenerate `configure' using a newer version of `autoconf'. ++ ++The simplest way to compile this package is: ++ ++ 1. `cd' to the directory containing the package's source code and type ++ `./configure' to configure the package for your system. If you're ++ using `csh' on an old version of System V, you might need to type ++ `sh ./configure' instead to prevent `csh' from trying to execute ++ `configure' itself. ++ ++ Running `configure' takes awhile. While running, it prints some ++ messages telling which features it is checking for. ++ ++ 2. Type `make' to compile the package. ++ ++ 3. Optionally, type `make check' to run any self-tests that come with ++ the package. ++ ++ 4. Type `make install' to install the programs and any data files and ++ documentation. ++ ++ 5. 
You can remove the program binaries and object files from the ++ source code directory by typing `make clean'. To also remove the ++ files that `configure' created (so you can compile the package for ++ a different kind of computer), type `make distclean'. There is ++ also a `make maintainer-clean' target, but that is intended mainly ++ for the package's developers. If you use it, you may have to get ++ all sorts of other programs in order to regenerate files that came ++ with the distribution. ++ ++Compilers and Options ++===================== ++ ++ Some systems require unusual options for compilation or linking that ++the `configure' script does not know about. You can give `configure' ++initial values for variables by setting them in the environment. Using ++a Bourne-compatible shell, you can do that on the command line like ++this: ++ CC=c89 CFLAGS=-O2 LIBS=-lposix ./configure ++ ++Or on systems that have the `env' program, you can do it like this: ++ env CPPFLAGS=-I/usr/local/include LDFLAGS=-s ./configure ++ ++Compiling For Multiple Architectures ++==================================== ++ ++ You can compile the package for more than one kind of computer at the ++same time, by placing the object files for each architecture in their ++own directory. To do this, you must use a version of `make' that ++supports the `VPATH' variable, such as GNU `make'. `cd' to the ++directory where you want the object files and executables to go and run ++the `configure' script. `configure' automatically checks for the ++source code in the directory that `configure' is in and in `..'. ++ ++ If you have to use a `make' that does not supports the `VPATH' ++variable, you have to compile the package for one architecture at a time ++in the source code directory. After you have installed the package for ++one architecture, use `make distclean' before reconfiguring for another ++architecture. 
++ ++Installation Names ++================== ++ ++ By default, `make install' will install the package's files in ++`/usr/local/bin', `/usr/local/man', etc. You can specify an ++installation prefix other than `/usr/local' by giving `configure' the ++option `--prefix=PATH'. ++ ++ You can specify separate installation prefixes for ++architecture-specific files and architecture-independent files. If you ++give `configure' the option `--exec-prefix=PATH', the package will use ++PATH as the prefix for installing programs and libraries. ++Documentation and other data files will still use the regular prefix. ++ ++ In addition, if you use an unusual directory layout you can give ++options like `--bindir=PATH' to specify different values for particular ++kinds of files. Run `configure --help' for a list of the directories ++you can set and what kinds of files go in them. ++ ++ If the package supports it, you can cause programs to be installed ++with an extra prefix or suffix on their names by giving `configure' the ++option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'. ++ ++Optional Features ++================= ++ ++ Some packages pay attention to `--enable-FEATURE' options to ++`configure', where FEATURE indicates an optional part of the package. ++They may also pay attention to `--with-PACKAGE' options, where PACKAGE ++is something like `gnu-as' or `x' (for the X Window System). The ++`README' should mention any `--enable-' and `--with-' options that the ++package recognizes. ++ ++ For packages that use the X Window System, `configure' can usually ++find the X include and library files automatically, but if it doesn't, ++you can use the `configure' options `--x-includes=DIR' and ++`--x-libraries=DIR' to specify their locations. ++ ++Specifying the System Type ++========================== ++ ++ There may be some features `configure' can not figure out ++automatically, but needs to determine by the type of host the package ++will run on. 
Usually `configure' can figure that out, but if it prints ++a message saying it can not guess the host type, give it the ++`--host=TYPE' option. TYPE can either be a short name for the system ++type, such as `sun4', or a canonical name with three fields: ++ CPU-COMPANY-SYSTEM ++ ++See the file `config.sub' for the possible values of each field. If ++`config.sub' isn't included in this package, then this package doesn't ++need to know the host type. ++ ++ If you are building compiler tools for cross-compiling, you can also ++use the `--target=TYPE' option to select the type of system they will ++produce code for and the `--build=TYPE' option to select the type of ++system on which you are compiling the package. ++ ++Sharing Defaults ++================ ++ ++ If you want to set default values for `configure' scripts to share, ++you can create a site shell script called `config.site' that gives ++default values for variables like `CC', `cache_file', and `prefix'. ++`configure' looks for `PREFIX/share/config.site' if it exists, then ++`PREFIX/etc/config.site' if it exists. Or, you can set the ++`CONFIG_SITE' environment variable to the location of the site script. ++A warning: not all `configure' scripts look for a site script. ++ ++Operation Controls ++================== ++ ++ `configure' recognizes the following options to control how it ++operates. ++ ++`--cache-file=FILE' ++ Use and save the results of the tests in FILE instead of ++ `./config.cache'. Set FILE to `/dev/null' to disable caching, for ++ debugging `configure'. ++ ++`--help' ++ Print a summary of the options to `configure', and exit. ++ ++`--quiet' ++`--silent' ++`-q' ++ Do not print messages saying which checks are being made. To ++ suppress all normal output, redirect it to `/dev/null' (any error ++ messages will still be shown). ++ ++`--srcdir=DIR' ++ Look for the package's source code in directory DIR. Usually ++ `configure' can determine that directory automatically. 
++ ++`--version' ++ Print the version of Autoconf used to generate the `configure' ++ script, and exit. ++ ++`configure' also accepts some other, not widely useful, options. +diff -Nru bzip2-1.0.1/LICENSE bzip2-1.0.1.new/LICENSE +--- bzip2-1.0.1/LICENSE Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/LICENSE Thu Jan 1 01:00:00 1970 +@@ -1,39 +0,0 @@ +- +-This program, "bzip2" and associated library "libbzip2", are +-copyright (C) 1996-2000 Julian R Seward. All rights reserved. +- +-Redistribution and use in source and binary forms, with or without +-modification, are permitted provided that the following conditions +-are met: +- +-1. Redistributions of source code must retain the above copyright +- notice, this list of conditions and the following disclaimer. +- +-2. The origin of this software must not be misrepresented; you must +- not claim that you wrote the original software. If you use this +- software in a product, an acknowledgment in the product +- documentation would be appreciated but is not required. +- +-3. Altered source versions must be plainly marked as such, and must +- not be misrepresented as being the original software. +- +-4. The name of the author may not be used to endorse or promote +- products derived from this software without specific prior written +- permission. +- +-THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS +-OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +-WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +-ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +-DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +-DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +-GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +-WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +-NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +-SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +- +-Julian Seward, Cambridge, UK. +-jseward@acm.org +-bzip2/libbzip2 version 1.0 of 21 March 2000 +- +diff -Nru bzip2-1.0.1/Makefile-libbz2_so bzip2-1.0.1.new/Makefile-libbz2_so +--- bzip2-1.0.1/Makefile-libbz2_so Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/Makefile-libbz2_so Thu Jan 1 01:00:00 1970 +@@ -1,43 +0,0 @@ +- +-# This Makefile builds a shared version of the library, +-# libbz2.so.1.0.1, with soname libbz2.so.1.0, +-# at least on x86-Linux (RedHat 5.2), +-# with gcc-2.7.2.3. Please see the README file for some +-# important info about building the library like this. 
+-
+-SHELL=/bin/sh
+-CC=gcc
+-BIGFILES=-D_FILE_OFFSET_BITS=64
+-CFLAGS=-fpic -fPIC -Wall -Winline -O2 -fomit-frame-pointer -fno-strength-reduce $(BIGFILES)
+-
+-OBJS= blocksort.o \
+-	huffman.o \
+-	crctable.o \
+-	randtable.o \
+-	compress.o \
+-	decompress.o \
+-	bzlib.o
+-
+-all: $(OBJS)
+-	$(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS)
+-	$(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.1
+-	rm -f libbz2.so.1.0
+-	ln -s libbz2.so.1.0.1 libbz2.so.1.0
+-
+-clean:
+-	rm -f $(OBJS) bzip2.o libbz2.so.1.0.1 libbz2.so.1.0 bzip2-shared
+-
+-blocksort.o: blocksort.c
+-	$(CC) $(CFLAGS) -c blocksort.c
+-huffman.o: huffman.c
+-	$(CC) $(CFLAGS) -c huffman.c
+-crctable.o: crctable.c
+-	$(CC) $(CFLAGS) -c crctable.c
+-randtable.o: randtable.c
+-	$(CC) $(CFLAGS) -c randtable.c
+-compress.o: compress.c
+-	$(CC) $(CFLAGS) -c compress.c
+-decompress.o: decompress.c
+-	$(CC) $(CFLAGS) -c decompress.c
+-bzlib.o: bzlib.c
+-	$(CC) $(CFLAGS) -c bzlib.c
+diff -Nru bzip2-1.0.1/Makefile.am bzip2-1.0.1.new/Makefile.am
+--- bzip2-1.0.1/Makefile.am Thu Jan 1 01:00:00 1970
++++ bzip2-1.0.1.new/Makefile.am Sat Jun 24 20:17:47 2000
+@@ -0,0 +1,31 @@
++SUBDIRS = doc
++
++bin_PROGRAMS = bzip2 bzip2recover
++bzip2_SOURCES = bzip2.c
++
++bzip2_LDADD = libbz2.la
++bzip2recover_SOURCES = bzip2recover.c
++lib_LTLIBRARIES = libbz2.la
++libbz2_la_SOURCES = \
++	blocksort.c \
++	huffman.c \
++	crctable.c \
++	randtable.c \
++	compress.c \
++	decompress.c \
++	bzlib.c \
++	bzlib.h \
++	bzlib_private.h
++
++libbz2_la_LDFLAGS = -version-info 1:0:0
++include_HEADERS = bzlib.h bzlib_private.h
++
++bzip2SCRIPTS = bzless
++
++EXTRA_DIST = README README.COMPILATION.PROBLEMS \
++	Y2K_INFO libbz2.def libbz2.dsp \
++	sample1.bz2 sample1.ref sample2.bz2 sample2.ref sample3.bz2 sample3.ref
++
++install-exec-hook:
++	$(LN_S) -f bzip2 $(DESTDIR)$(bindir)/bunzip2
++	$(LN_S) -f bzip2 $(DESTDIR)$(bindir)/bzcat
+diff -Nru bzip2-1.0.1/NEWS bzip2-1.0.1.new/NEWS
+--- bzip2-1.0.1/NEWS Thu Jan 1
01:00:00 1970 ++++ bzip2-1.0.1.new/NEWS Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,12 @@ ++ ++ ++1.0.1 ++~~~~~ ++* Modified dlltest.c so it uses the new BZ2_ naming scheme. ++* Modified makefile-msc to fix minor build probs on Win2k. ++* Updated README.COMPILATION.PROBLEMS. ++ ++There are no functionality changes or bug fixes relative to version ++1.0.0. This is just a documentation update + a fix for minor Win32 ++build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is ++utterly pointless. Don't bother. +diff -Nru bzip2-1.0.1/acinclude.m4 bzip2-1.0.1.new/acinclude.m4 +--- bzip2-1.0.1/acinclude.m4 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/acinclude.m4 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,129 @@ ++#serial 7 ++ ++dnl By default, many hosts won't let programs access large files; ++dnl one must use special compiler options to get large-file access to work. ++dnl For more details about this brain damage please see: ++dnl http://www.sas.com/standards/large.file/x_open.20Mar96.html ++ ++dnl Written by Paul Eggert . ++ ++dnl Internal subroutine of AC_SYS_LARGEFILE. ++dnl AC_SYS_LARGEFILE_FLAGS(FLAGSNAME) ++AC_DEFUN(AC_SYS_LARGEFILE_FLAGS, ++ [AC_CACHE_CHECK([for $1 value to request large file support], ++ ac_cv_sys_largefile_$1, ++ [if ($GETCONF LFS_$1) >conftest.1 2>conftest.2 && test ! -s conftest.2 ++ then ++ ac_cv_sys_largefile_$1=`cat conftest.1` ++ else ++ ac_cv_sys_largefile_$1=no ++ ifelse($1, CFLAGS, ++ [case "$host_os" in ++ # HP-UX 10.20 requires -D__STDC_EXT__ with gcc 2.95.1. ++changequote(, )dnl ++ hpux10.[2-9][0-9]* | hpux1[1-9]* | hpux[2-9][0-9]*) ++changequote([, ])dnl ++ if test "$GCC" = yes; then ++ ac_cv_sys_largefile_CFLAGS=-D__STDC_EXT__ ++ fi ++ ;; ++ # IRIX 6.2 and later require cc -n32. 
++changequote(, )dnl ++ irix6.[2-9]* | irix6.1[0-9]* | irix[7-9].* | irix[1-9][0-9]*) ++changequote([, ])dnl ++ if test "$GCC" != yes; then ++ ac_cv_sys_largefile_CFLAGS=-n32 ++ fi ++ esac ++ if test "$ac_cv_sys_largefile_CFLAGS" != no; then ++ ac_save_CC="$CC" ++ CC="$CC $ac_cv_sys_largefile_CFLAGS" ++ AC_TRY_LINK(, , , ac_cv_sys_largefile_CFLAGS=no) ++ CC="$ac_save_CC" ++ fi]) ++ fi ++ rm -f conftest*])]) ++ ++dnl Internal subroutine of AC_SYS_LARGEFILE. ++dnl AC_SYS_LARGEFILE_SPACE_APPEND(VAR, VAL) ++AC_DEFUN(AC_SYS_LARGEFILE_SPACE_APPEND, ++ [case $2 in ++ no) ;; ++ ?*) ++ case "[$]$1" in ++ '') $1=$2 ;; ++ *) $1=[$]$1' '$2 ;; ++ esac ;; ++ esac]) ++ ++dnl Internal subroutine of AC_SYS_LARGEFILE. ++dnl AC_SYS_LARGEFILE_MACRO_VALUE(C-MACRO, CACHE-VAR, COMMENT, CODE-TO-SET-DEFAULT) ++AC_DEFUN(AC_SYS_LARGEFILE_MACRO_VALUE, ++ [AC_CACHE_CHECK([for $1], $2, ++ [$2=no ++changequote(, )dnl ++ $4 ++ for ac_flag in $ac_cv_sys_largefile_CFLAGS no; do ++ case "$ac_flag" in ++ -D$1) ++ $2=1 ;; ++ -D$1=*) ++ $2=`expr " $ac_flag" : '[^=]*=\(.*\)'` ;; ++ esac ++ done ++changequote([, ])dnl ++ ]) ++ if test "[$]$2" != no; then ++ AC_DEFINE_UNQUOTED([$1], [$]$2, [$3]) ++ fi]) ++ ++AC_DEFUN(AC_SYS_LARGEFILE, ++ [AC_REQUIRE([AC_CANONICAL_HOST]) ++ AC_ARG_ENABLE(largefile, ++ [ --disable-largefile omit support for large files]) ++ if test "$enable_largefile" != no; then ++ AC_CHECK_TOOL(GETCONF, getconf) ++ AC_SYS_LARGEFILE_FLAGS(CFLAGS) ++ AC_SYS_LARGEFILE_FLAGS(LDFLAGS) ++ AC_SYS_LARGEFILE_FLAGS(LIBS) ++ ++ for ac_flag in $ac_cv_sys_largefile_CFLAGS no; do ++ case "$ac_flag" in ++ no) ;; ++ -D_FILE_OFFSET_BITS=*) ;; ++ -D_LARGEFILE_SOURCE | -D_LARGEFILE_SOURCE=*) ;; ++ -D_LARGE_FILES | -D_LARGE_FILES=*) ;; ++ -D?* | -I?*) ++ AC_SYS_LARGEFILE_SPACE_APPEND(CPPFLAGS, "$ac_flag") ;; ++ *) ++ AC_SYS_LARGEFILE_SPACE_APPEND(CFLAGS, "$ac_flag") ;; ++ esac ++ done ++ AC_SYS_LARGEFILE_SPACE_APPEND(LDFLAGS, "$ac_cv_sys_largefile_LDFLAGS") ++ AC_SYS_LARGEFILE_SPACE_APPEND(LIBS, 
"$ac_cv_sys_largefile_LIBS") ++ AC_SYS_LARGEFILE_MACRO_VALUE(_FILE_OFFSET_BITS, ++ ac_cv_sys_file_offset_bits, ++ [Number of bits in a file offset, on hosts where this is settable.], ++ [case "$host_os" in ++ # HP-UX 10.20 and later ++ hpux10.[2-9][0-9]* | hpux1[1-9]* | hpux[2-9][0-9]*) ++ ac_cv_sys_file_offset_bits=64 ;; ++ esac]) ++ AC_SYS_LARGEFILE_MACRO_VALUE(_LARGEFILE_SOURCE, ++ ac_cv_sys_largefile_source, ++ [Define to make fseeko etc. visible, on some hosts.], ++ [case "$host_os" in ++ # HP-UX 10.20 and later ++ hpux10.[2-9][0-9]* | hpux1[1-9]* | hpux[2-9][0-9]*) ++ ac_cv_sys_largefile_source=1 ;; ++ esac]) ++ AC_SYS_LARGEFILE_MACRO_VALUE(_LARGE_FILES, ++ ac_cv_sys_large_files, ++ [Define for large files, on AIX-style hosts.], ++ [case "$host_os" in ++ # AIX 4.2 and later ++ aix4.[2-9]* | aix4.1[0-9]* | aix[5-9].* | aix[1-9][0-9]*) ++ ac_cv_sys_large_files=1 ;; ++ esac]) ++ fi ++ ]) +diff -Nru bzip2-1.0.1/bzip2.1 bzip2-1.0.1.new/bzip2.1 +--- bzip2-1.0.1/bzip2.1 Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/bzip2.1 Thu Jan 1 01:00:00 1970 +@@ -1,439 +0,0 @@ +-.PU +-.TH bzip2 1 +-.SH NAME +-bzip2, bunzip2 \- a block-sorting file compressor, v1.0 +-.br +-bzcat \- decompresses files to stdout +-.br +-bzip2recover \- recovers data from damaged bzip2 files +- +-.SH SYNOPSIS +-.ll +8 +-.B bzip2 +-.RB [ " \-cdfkqstvzVL123456789 " ] +-[ +-.I "filenames \&..." +-] +-.ll -8 +-.br +-.B bunzip2 +-.RB [ " \-fkvsVL " ] +-[ +-.I "filenames \&..." +-] +-.br +-.B bzcat +-.RB [ " \-s " ] +-[ +-.I "filenames \&..." +-] +-.br +-.B bzip2recover +-.I "filename" +- +-.SH DESCRIPTION +-.I bzip2 +-compresses files using the Burrows-Wheeler block sorting +-text compression algorithm, and Huffman coding. Compression is +-generally considerably better than that achieved by more conventional +-LZ77/LZ78-based compressors, and approaches the performance of the PPM +-family of statistical compressors. 
+- +-The command-line options are deliberately very similar to +-those of +-.I GNU gzip, +-but they are not identical. +- +-.I bzip2 +-expects a list of file names to accompany the +-command-line flags. Each file is replaced by a compressed version of +-itself, with the name "original_name.bz2". +-Each compressed file +-has the same modification date, permissions, and, when possible, +-ownership as the corresponding original, so that these properties can +-be correctly restored at decompression time. File name handling is +-naive in the sense that there is no mechanism for preserving original +-file names, permissions, ownerships or dates in filesystems which lack +-these concepts, or have serious file name length restrictions, such as +-MS-DOS. +- +-.I bzip2 +-and +-.I bunzip2 +-will by default not overwrite existing +-files. If you want this to happen, specify the \-f flag. +- +-If no file names are specified, +-.I bzip2 +-compresses from standard +-input to standard output. In this case, +-.I bzip2 +-will decline to +-write compressed output to a terminal, as this would be entirely +-incomprehensible and therefore pointless. +- +-.I bunzip2 +-(or +-.I bzip2 \-d) +-decompresses all +-specified files. Files which were not created by +-.I bzip2 +-will be detected and ignored, and a warning issued. +-.I bzip2 +-attempts to guess the filename for the decompressed file +-from that of the compressed file as follows: +- +- filename.bz2 becomes filename +- filename.bz becomes filename +- filename.tbz2 becomes filename.tar +- filename.tbz becomes filename.tar +- anyothername becomes anyothername.out +- +-If the file does not end in one of the recognised endings, +-.I .bz2, +-.I .bz, +-.I .tbz2 +-or +-.I .tbz, +-.I bzip2 +-complains that it cannot +-guess the name of the original file, and uses the original name +-with +-.I .out +-appended. +- +-As with compression, supplying no +-filenames causes decompression from +-standard input to standard output. 
+- +-.I bunzip2 +-will correctly decompress a file which is the +-concatenation of two or more compressed files. The result is the +-concatenation of the corresponding uncompressed files. Integrity +-testing (\-t) +-of concatenated +-compressed files is also supported. +- +-You can also compress or decompress files to the standard output by +-giving the \-c flag. Multiple files may be compressed and +-decompressed like this. The resulting outputs are fed sequentially to +-stdout. Compression of multiple files +-in this manner generates a stream +-containing multiple compressed file representations. Such a stream +-can be decompressed correctly only by +-.I bzip2 +-version 0.9.0 or +-later. Earlier versions of +-.I bzip2 +-will stop after decompressing +-the first file in the stream. +- +-.I bzcat +-(or +-.I bzip2 -dc) +-decompresses all specified files to +-the standard output. +- +-.I bzip2 +-will read arguments from the environment variables +-.I BZIP2 +-and +-.I BZIP, +-in that order, and will process them +-before any arguments read from the command line. This gives a +-convenient way to supply default arguments. +- +-Compression is always performed, even if the compressed +-file is slightly +-larger than the original. Files of less than about one hundred bytes +-tend to get larger, since the compression mechanism has a constant +-overhead in the region of 50 bytes. Random data (including the output +-of most file compressors) is coded at about 8.05 bits per byte, giving +-an expansion of around 0.5%. +- +-As a self-check for your protection, +-.I +-bzip2 +-uses 32-bit CRCs to +-make sure that the decompressed version of a file is identical to the +-original. This guards against corruption of the compressed data, and +-against undetected bugs in +-.I bzip2 +-(hopefully very unlikely). The +-chances of data corruption going undetected is microscopic, about one +-chance in four billion for each file processed. 
Be aware, though, that +-the check occurs upon decompression, so it can only tell you that +-something is wrong. It can't help you +-recover the original uncompressed +-data. You can use +-.I bzip2recover +-to try to recover data from +-damaged files. +- +-Return values: 0 for a normal exit, 1 for environmental problems (file +-not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt +-compressed file, 3 for an internal consistency error (eg, bug) which +-caused +-.I bzip2 +-to panic. +- +-.SH OPTIONS +-.TP +-.B \-c --stdout +-Compress or decompress to standard output. +-.TP +-.B \-d --decompress +-Force decompression. +-.I bzip2, +-.I bunzip2 +-and +-.I bzcat +-are +-really the same program, and the decision about what actions to take is +-done on the basis of which name is used. This flag overrides that +-mechanism, and forces +-.I bzip2 +-to decompress. +-.TP +-.B \-z --compress +-The complement to \-d: forces compression, regardless of the +-invokation name. +-.TP +-.B \-t --test +-Check integrity of the specified file(s), but don't decompress them. +-This really performs a trial decompression and throws away the result. +-.TP +-.B \-f --force +-Force overwrite of output files. Normally, +-.I bzip2 +-will not overwrite +-existing output files. Also forces +-.I bzip2 +-to break hard links +-to files, which it otherwise wouldn't do. +-.TP +-.B \-k --keep +-Keep (don't delete) input files during compression +-or decompression. +-.TP +-.B \-s --small +-Reduce memory usage, for compression, decompression and testing. Files +-are decompressed and tested using a modified algorithm which only +-requires 2.5 bytes per block byte. This means any file can be +-decompressed in 2300k of memory, albeit at about half the normal speed. +- +-During compression, \-s selects a block size of 200k, which limits +-memory use to around the same figure, at the expense of your compression +-ratio. 
In short, if your machine is low on memory (8 megabytes or +-less), use \-s for everything. See MEMORY MANAGEMENT below. +-.TP +-.B \-q --quiet +-Suppress non-essential warning messages. Messages pertaining to +-I/O errors and other critical events will not be suppressed. +-.TP +-.B \-v --verbose +-Verbose mode -- show the compression ratio for each file processed. +-Further \-v's increase the verbosity level, spewing out lots of +-information which is primarily of interest for diagnostic purposes. +-.TP +-.B \-L --license -V --version +-Display the software version, license terms and conditions. +-.TP +-.B \-1 to \-9 +-Set the block size to 100 k, 200 k .. 900 k when compressing. Has no +-effect when decompressing. See MEMORY MANAGEMENT below. +-.TP +-.B \-- +-Treats all subsequent arguments as file names, even if they start +-with a dash. This is so you can handle files with names beginning +-with a dash, for example: bzip2 \-- \-myfilename. +-.TP +-.B \--repetitive-fast --repetitive-best +-These flags are redundant in versions 0.9.5 and above. They provided +-some coarse control over the behaviour of the sorting algorithm in +-earlier versions, which was sometimes useful. 0.9.5 and above have an +-improved algorithm which renders these flags irrelevant. +- +-.SH MEMORY MANAGEMENT +-.I bzip2 +-compresses large files in blocks. The block size affects +-both the compression ratio achieved, and the amount of memory needed for +-compression and decompression. The flags \-1 through \-9 +-specify the block size to be 100,000 bytes through 900,000 bytes (the +-default) respectively. At decompression time, the block size used for +-compression is read from the header of the compressed file, and +-.I bunzip2 +-then allocates itself just enough memory to decompress +-the file. Since block sizes are stored in compressed files, it follows +-that the flags \-1 to \-9 are irrelevant to and so ignored +-during decompression. 
+- +-Compression and decompression requirements, +-in bytes, can be estimated as: +- +- Compression: 400k + ( 8 x block size ) +- +- Decompression: 100k + ( 4 x block size ), or +- 100k + ( 2.5 x block size ) +- +-Larger block sizes give rapidly diminishing marginal returns. Most of +-the compression comes from the first two or three hundred k of block +-size, a fact worth bearing in mind when using +-.I bzip2 +-on small machines. +-It is also important to appreciate that the decompression memory +-requirement is set at compression time by the choice of block size. +- +-For files compressed with the default 900k block size, +-.I bunzip2 +-will require about 3700 kbytes to decompress. To support decompression +-of any file on a 4 megabyte machine, +-.I bunzip2 +-has an option to +-decompress using approximately half this amount of memory, about 2300 +-kbytes. Decompression speed is also halved, so you should use this +-option only where necessary. The relevant flag is -s. +- +-In general, try and use the largest block size memory constraints allow, +-since that maximises the compression achieved. Compression and +-decompression speed are virtually unaffected by block size. +- +-Another significant point applies to files which fit in a single block +--- that means most files you'd encounter using a large block size. The +-amount of real memory touched is proportional to the size of the file, +-since the file is smaller than a block. For example, compressing a file +-20,000 bytes long with the flag -9 will cause the compressor to +-allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 +-kbytes of it. Similarly, the decompressor will allocate 3700k but only +-touch 100k + 20000 * 4 = 180 kbytes. +- +-Here is a table which summarises the maximum memory usage for different +-block sizes. Also recorded is the total compressed size for 14 files of +-the Calgary Text Compression Corpus totalling 3,141,622 bytes. 
This +-column gives some feel for how compression varies with block size. +-These figures tend to understate the advantage of larger block sizes for +-larger files, since the Corpus is dominated by smaller files. +- +- Compress Decompress Decompress Corpus +- Flag usage usage -s usage Size +- +- -1 1200k 500k 350k 914704 +- -2 2000k 900k 600k 877703 +- -3 2800k 1300k 850k 860338 +- -4 3600k 1700k 1100k 846899 +- -5 4400k 2100k 1350k 845160 +- -6 5200k 2500k 1600k 838626 +- -7 6100k 2900k 1850k 834096 +- -8 6800k 3300k 2100k 828642 +- -9 7600k 3700k 2350k 828642 +- +-.SH RECOVERING DATA FROM DAMAGED FILES +-.I bzip2 +-compresses files in blocks, usually 900kbytes long. Each +-block is handled independently. If a media or transmission error causes +-a multi-block .bz2 +-file to become damaged, it may be possible to +-recover data from the undamaged blocks in the file. +- +-The compressed representation of each block is delimited by a 48-bit +-pattern, which makes it possible to find the block boundaries with +-reasonable certainty. Each block also carries its own 32-bit CRC, so +-damaged blocks can be distinguished from undamaged ones. +- +-.I bzip2recover +-is a simple program whose purpose is to search for +-blocks in .bz2 files, and write each block out into its own .bz2 +-file. You can then use +-.I bzip2 +-\-t +-to test the +-integrity of the resulting files, and decompress those which are +-undamaged. +- +-.I bzip2recover +-takes a single argument, the name of the damaged file, +-and writes a number of files "rec0001file.bz2", +-"rec0002file.bz2", etc, containing the extracted blocks. +-The output filenames are designed so that the use of +-wildcards in subsequent processing -- for example, +-"bzip2 -dc rec*file.bz2 > recovered_data" -- lists the files in +-the correct order. +- +-.I bzip2recover +-should be of most use dealing with large .bz2 +-files, as these will contain many blocks. 
It is clearly +-futile to use it on damaged single-block files, since a +-damaged block cannot be recovered. If you wish to minimise +-any potential data loss through media or transmission errors, +-you might consider compressing with a smaller +-block size. +- +-.SH PERFORMANCE NOTES +-The sorting phase of compression gathers together similar strings in the +-file. Because of this, files containing very long runs of repeated +-symbols, like "aabaabaabaab ..." (repeated several hundred times) may +-compress more slowly than normal. Versions 0.9.5 and above fare much +-better than previous versions in this respect. The ratio between +-worst-case and average-case compression time is in the region of 10:1. +-For previous versions, this figure was more like 100:1. You can use the +-\-vvvv option to monitor progress in great detail, if you want. +- +-Decompression speed is unaffected by these phenomena. +- +-.I bzip2 +-usually allocates several megabytes of memory to operate +-in, and then charges all over it in a fairly random fashion. This means +-that performance, both for compressing and decompressing, is largely +-determined by the speed at which your machine can service cache misses. +-Because of this, small changes to the code to reduce the miss rate have +-been observed to give disproportionately large performance improvements. +-I imagine +-.I bzip2 +-will perform best on machines with very large caches. +- +-.SH CAVEATS +-I/O error messages are not as helpful as they could be. +-.I bzip2 +-tries hard to detect I/O errors and exit cleanly, but the details of +-what the problem is sometimes seem rather misleading. +- +-This manual page pertains to version 1.0 of +-.I bzip2. +-Compressed +-data created by this version is entirely forwards and backwards +-compatible with the previous public releases, versions 0.1pl2, 0.9.0 +-and 0.9.5, +-but with the following exception: 0.9.0 and above can correctly +-decompress multiple concatenated compressed files. 
0.1pl2 cannot do +-this; it will stop after decompressing just the first file in the +-stream. +- +-.I bzip2recover +-uses 32-bit integers to represent bit positions in +-compressed files, so it cannot handle compressed files more than 512 +-megabytes long. This could easily be fixed. +- +-.SH AUTHOR +-Julian Seward, jseward@acm.org. +- +-http://sourceware.cygnus.com/bzip2 +-http://www.muraroa.demon.co.uk +- +-The ideas embodied in +-.I bzip2 +-are due to (at least) the following +-people: Michael Burrows and David Wheeler (for the block sorting +-transformation), David Wheeler (again, for the Huffman coder), Peter +-Fenwick (for the structured coding model in the original +-.I bzip, +-and many refinements), and Alistair Moffat, Radford Neal and Ian Witten +-(for the arithmetic coder in the original +-.I bzip). +-I am much +-indebted for their help, support and advice. See the manual in the +-source distribution for pointers to sources of documentation. Christian +-von Roques encouraged me to look for faster sorting algorithms, so as to +-speed up compression. Bela Lubkin encouraged me to improve the +-worst-case compression performance. Many people sent patches, helped +-with portability problems, lent machines, gave advice and were generally +-helpful. +diff -Nru bzip2-1.0.1/bzip2.1.preformatted bzip2-1.0.1.new/bzip2.1.preformatted +--- bzip2-1.0.1/bzip2.1.preformatted Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/bzip2.1.preformatted Thu Jan 1 01:00:00 1970 +@@ -1,462 +0,0 @@ +- +- +- +-bzip2(1) bzip2(1) +- +- +-NNAAMMEE +- bzip2, bunzip2 - a block-sorting file compressor, v1.0 +- bzcat - decompresses files to stdout +- bzip2recover - recovers data from damaged bzip2 files +- +- +-SSYYNNOOPPSSIISS +- bbzziipp22 [ --ccddffkkqqssttvvzzVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._. ] +- bbuunnzziipp22 [ --ffkkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._. ] +- bbzzccaatt [ --ss ] [ _f_i_l_e_n_a_m_e_s _._._. 
] +- bbzziipp22rreeccoovveerr _f_i_l_e_n_a_m_e +- +- +-DDEESSCCRRIIPPTTIIOONN +- _b_z_i_p_2 compresses files using the Burrows-Wheeler block +- sorting text compression algorithm, and Huffman coding. +- Compression is generally considerably better than that +- achieved by more conventional LZ77/LZ78-based compressors, +- and approaches the performance of the PPM family of sta- +- tistical compressors. +- +- The command-line options are deliberately very similar to +- those of _G_N_U _g_z_i_p_, but they are not identical. +- +- _b_z_i_p_2 expects a list of file names to accompany the com- +- mand-line flags. Each file is replaced by a compressed +- version of itself, with the name "original_name.bz2". +- Each compressed file has the same modification date, per- +- missions, and, when possible, ownership as the correspond- +- ing original, so that these properties can be correctly +- restored at decompression time. File name handling is +- naive in the sense that there is no mechanism for preserv- +- ing original file names, permissions, ownerships or dates +- in filesystems which lack these concepts, or have serious +- file name length restrictions, such as MS-DOS. +- +- _b_z_i_p_2 and _b_u_n_z_i_p_2 will by default not overwrite existing +- files. If you want this to happen, specify the -f flag. +- +- If no file names are specified, _b_z_i_p_2 compresses from +- standard input to standard output. In this case, _b_z_i_p_2 +- will decline to write compressed output to a terminal, as +- this would be entirely incomprehensible and therefore +- pointless. +- +- _b_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d_) decompresses all specified files. +- Files which were not created by _b_z_i_p_2 will be detected and +- ignored, and a warning issued. 
_b_z_i_p_2 attempts to guess +- the filename for the decompressed file from that of the +- compressed file as follows: +- +- filename.bz2 becomes filename +- filename.bz becomes filename +- filename.tbz2 becomes filename.tar +- +- +- +- 1 +- +- +- +- +- +-bzip2(1) bzip2(1) +- +- +- filename.tbz becomes filename.tar +- anyothername becomes anyothername.out +- +- If the file does not end in one of the recognised endings, +- _._b_z_2_, _._b_z_, _._t_b_z_2 or _._t_b_z_, _b_z_i_p_2 complains that it cannot +- guess the name of the original file, and uses the original +- name with _._o_u_t appended. +- +- As with compression, supplying no filenames causes decom- +- pression from standard input to standard output. +- +- _b_u_n_z_i_p_2 will correctly decompress a file which is the con- +- catenation of two or more compressed files. The result is +- the concatenation of the corresponding uncompressed files. +- Integrity testing (-t) of concatenated compressed files is +- also supported. +- +- You can also compress or decompress files to the standard +- output by giving the -c flag. Multiple files may be com- +- pressed and decompressed like this. The resulting outputs +- are fed sequentially to stdout. Compression of multiple +- files in this manner generates a stream containing multi- +- ple compressed file representations. Such a stream can be +- decompressed correctly only by _b_z_i_p_2 version 0.9.0 or +- later. Earlier versions of _b_z_i_p_2 will stop after decom- +- pressing the first file in the stream. +- +- _b_z_c_a_t (or _b_z_i_p_2 _-_d_c_) decompresses all specified files to +- the standard output. +- +- _b_z_i_p_2 will read arguments from the environment variables +- _B_Z_I_P_2 and _B_Z_I_P_, in that order, and will process them +- before any arguments read from the command line. This +- gives a convenient way to supply default arguments. +- +- Compression is always performed, even if the compressed +- file is slightly larger than the original. 
Files of less +- than about one hundred bytes tend to get larger, since the +- compression mechanism has a constant overhead in the +- region of 50 bytes. Random data (including the output of +- most file compressors) is coded at about 8.05 bits per +- byte, giving an expansion of around 0.5%. +- +- As a self-check for your protection, _b_z_i_p_2 uses 32-bit +- CRCs to make sure that the decompressed version of a file +- is identical to the original. This guards against corrup- +- tion of the compressed data, and against undetected bugs +- in _b_z_i_p_2 (hopefully very unlikely). The chances of data +- corruption going undetected is microscopic, about one +- chance in four billion for each file processed. Be aware, +- though, that the check occurs upon decompression, so it +- can only tell you that something is wrong. It can't help +- you recover the original uncompressed data. You can use +- _b_z_i_p_2_r_e_c_o_v_e_r to try to recover data from damaged files. +- +- +- +- 2 +- +- +- +- +- +-bzip2(1) bzip2(1) +- +- +- Return values: 0 for a normal exit, 1 for environmental +- problems (file not found, invalid flags, I/O errors, &c), +- 2 to indicate a corrupt compressed file, 3 for an internal +- consistency error (eg, bug) which caused _b_z_i_p_2 to panic. +- +- +-OOPPTTIIOONNSS +- --cc ----ssttddoouutt +- Compress or decompress to standard output. +- +- --dd ----ddeeccoommpprreessss +- Force decompression. _b_z_i_p_2_, _b_u_n_z_i_p_2 and _b_z_c_a_t are +- really the same program, and the decision about +- what actions to take is done on the basis of which +- name is used. This flag overrides that mechanism, +- and forces _b_z_i_p_2 to decompress. +- +- --zz ----ccoommpprreessss +- The complement to -d: forces compression, regard- +- less of the invokation name. +- +- --tt ----tteesstt +- Check integrity of the specified file(s), but don't +- decompress them. This really performs a trial +- decompression and throws away the result. 
+- +- --ff ----ffoorrccee +- Force overwrite of output files. Normally, _b_z_i_p_2 +- will not overwrite existing output files. Also +- forces _b_z_i_p_2 to break hard links to files, which it +- otherwise wouldn't do. +- +- --kk ----kkeeeepp +- Keep (don't delete) input files during compression +- or decompression. +- +- --ss ----ssmmaallll +- Reduce memory usage, for compression, decompression +- and testing. Files are decompressed and tested +- using a modified algorithm which only requires 2.5 +- bytes per block byte. This means any file can be +- decompressed in 2300k of memory, albeit at about +- half the normal speed. +- +- During compression, -s selects a block size of +- 200k, which limits memory use to around the same +- figure, at the expense of your compression ratio. +- In short, if your machine is low on memory (8 +- megabytes or less), use -s for everything. See +- MEMORY MANAGEMENT below. +- +- --qq ----qquuiieett +- Suppress non-essential warning messages. Messages +- pertaining to I/O errors and other critical events +- +- +- +- 3 +- +- +- +- +- +-bzip2(1) bzip2(1) +- +- +- will not be suppressed. +- +- --vv ----vveerrbboossee +- Verbose mode -- show the compression ratio for each +- file processed. Further -v's increase the ver- +- bosity level, spewing out lots of information which +- is primarily of interest for diagnostic purposes. +- +- --LL ----lliicceennssee --VV ----vveerrssiioonn +- Display the software version, license terms and +- conditions. +- +- --11 ttoo --99 +- Set the block size to 100 k, 200 k .. 900 k when +- compressing. Has no effect when decompressing. +- See MEMORY MANAGEMENT below. +- +- ---- Treats all subsequent arguments as file names, even +- if they start with a dash. This is so you can han- +- dle files with names beginning with a dash, for +- example: bzip2 -- -myfilename. +- +- ----rreeppeettiittiivvee--ffaasstt ----rreeppeettiittiivvee--bbeesstt +- These flags are redundant in versions 0.9.5 and +- above. 
They provided some coarse control over the +- behaviour of the sorting algorithm in earlier ver- +- sions, which was sometimes useful. 0.9.5 and above +- have an improved algorithm which renders these +- flags irrelevant. +- +- +-MMEEMMOORRYY MMAANNAAGGEEMMEENNTT +- _b_z_i_p_2 compresses large files in blocks. The block size +- affects both the compression ratio achieved, and the +- amount of memory needed for compression and decompression. +- The flags -1 through -9 specify the block size to be +- 100,000 bytes through 900,000 bytes (the default) respec- +- tively. At decompression time, the block size used for +- compression is read from the header of the compressed +- file, and _b_u_n_z_i_p_2 then allocates itself just enough memory +- to decompress the file. Since block sizes are stored in +- compressed files, it follows that the flags -1 to -9 are +- irrelevant to and so ignored during decompression. +- +- Compression and decompression requirements, in bytes, can +- be estimated as: +- +- Compression: 400k + ( 8 x block size ) +- +- Decompression: 100k + ( 4 x block size ), or +- 100k + ( 2.5 x block size ) +- +- Larger block sizes give rapidly diminishing marginal +- returns. Most of the compression comes from the first two +- +- +- +- 4 +- +- +- +- +- +-bzip2(1) bzip2(1) +- +- +- or three hundred k of block size, a fact worth bearing in +- mind when using _b_z_i_p_2 on small machines. It is also +- important to appreciate that the decompression memory +- requirement is set at compression time by the choice of +- block size. +- +- For files compressed with the default 900k block size, +- _b_u_n_z_i_p_2 will require about 3700 kbytes to decompress. To +- support decompression of any file on a 4 megabyte machine, +- _b_u_n_z_i_p_2 has an option to decompress using approximately +- half this amount of memory, about 2300 kbytes. Decompres- +- sion speed is also halved, so you should use this option +- only where necessary. The relevant flag is -s. 
+- +- In general, try and use the largest block size memory con- +- straints allow, since that maximises the compression +- achieved. Compression and decompression speed are virtu- +- ally unaffected by block size. +- +- Another significant point applies to files which fit in a +- single block -- that means most files you'd encounter +- using a large block size. The amount of real memory +- touched is proportional to the size of the file, since the +- file is smaller than a block. For example, compressing a +- file 20,000 bytes long with the flag -9 will cause the +- compressor to allocate around 7600k of memory, but only +- touch 400k + 20000 * 8 = 560 kbytes of it. Similarly, the +- decompressor will allocate 3700k but only touch 100k + +- 20000 * 4 = 180 kbytes. +- +- Here is a table which summarises the maximum memory usage +- for different block sizes. Also recorded is the total +- compressed size for 14 files of the Calgary Text Compres- +- sion Corpus totalling 3,141,622 bytes. This column gives +- some feel for how compression varies with block size. +- These figures tend to understate the advantage of larger +- block sizes for larger files, since the Corpus is domi- +- nated by smaller files. +- +- Compress Decompress Decompress Corpus +- Flag usage usage -s usage Size +- +- -1 1200k 500k 350k 914704 +- -2 2000k 900k 600k 877703 +- -3 2800k 1300k 850k 860338 +- -4 3600k 1700k 1100k 846899 +- -5 4400k 2100k 1350k 845160 +- -6 5200k 2500k 1600k 838626 +- -7 6100k 2900k 1850k 834096 +- -8 6800k 3300k 2100k 828642 +- -9 7600k 3700k 2350k 828642 +- +- +- +- +- +- +- 5 +- +- +- +- +- +-bzip2(1) bzip2(1) +- +- +-RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD FFIILLEESS +- _b_z_i_p_2 compresses files in blocks, usually 900kbytes long. +- Each block is handled independently. If a media or trans- +- mission error causes a multi-block .bz2 file to become +- damaged, it may be possible to recover data from the +- undamaged blocks in the file. 
+- +- The compressed representation of each block is delimited +- by a 48-bit pattern, which makes it possible to find the +- block boundaries with reasonable certainty. Each block +- also carries its own 32-bit CRC, so damaged blocks can be +- distinguished from undamaged ones. +- +- _b_z_i_p_2_r_e_c_o_v_e_r is a simple program whose purpose is to +- search for blocks in .bz2 files, and write each block out +- into its own .bz2 file. You can then use _b_z_i_p_2 -t to test +- the integrity of the resulting files, and decompress those +- which are undamaged. +- +- _b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam- +- aged file, and writes a number of files "rec0001file.bz2", +- "rec0002file.bz2", etc, containing the extracted blocks. +- The output filenames are designed so that the use of +- wildcards in subsequent processing -- for example, "bzip2 +- -dc rec*file.bz2 > recovered_data" -- lists the files in +- the correct order. +- +- _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2 +- files, as these will contain many blocks. It is clearly +- futile to use it on damaged single-block files, since a +- damaged block cannot be recovered. If you wish to min- +- imise any potential data loss through media or transmis- +- sion errors, you might consider compressing with a smaller +- block size. +- +- +-PPEERRFFOORRMMAANNCCEE NNOOTTEESS +- The sorting phase of compression gathers together similar +- strings in the file. Because of this, files containing +- very long runs of repeated symbols, like "aabaabaabaab +- ..." (repeated several hundred times) may compress more +- slowly than normal. Versions 0.9.5 and above fare much +- better than previous versions in this respect. The ratio +- between worst-case and average-case compression time is in +- the region of 10:1. For previous versions, this figure +- was more like 100:1. You can use the -vvvv option to mon- +- itor progress in great detail, if you want. 
+- +- Decompression speed is unaffected by these phenomena. +- +- _b_z_i_p_2 usually allocates several megabytes of memory to +- operate in, and then charges all over it in a fairly ran- +- dom fashion. This means that performance, both for com- +- pressing and decompressing, is largely determined by the +- +- +- +- 6 +- +- +- +- +- +-bzip2(1) bzip2(1) +- +- +- speed at which your machine can service cache misses. +- Because of this, small changes to the code to reduce the +- miss rate have been observed to give disproportionately +- large performance improvements. I imagine _b_z_i_p_2 will per- +- form best on machines with very large caches. +- +- +-CCAAVVEEAATTSS +- I/O error messages are not as helpful as they could be. +- _b_z_i_p_2 tries hard to detect I/O errors and exit cleanly, +- but the details of what the problem is sometimes seem +- rather misleading. +- +- This manual page pertains to version 1.0 of _b_z_i_p_2_. Com- +- pressed data created by this version is entirely forwards +- and backwards compatible with the previous public +- releases, versions 0.1pl2, 0.9.0 and 0.9.5, but with the +- following exception: 0.9.0 and above can correctly decom- +- press multiple concatenated compressed files. 0.1pl2 can- +- not do this; it will stop after decompressing just the +- first file in the stream. +- +- _b_z_i_p_2_r_e_c_o_v_e_r uses 32-bit integers to represent bit posi- +- tions in compressed files, so it cannot handle compressed +- files more than 512 megabytes long. This could easily be +- fixed. +- +- +-AAUUTTHHOORR +- Julian Seward, jseward@acm.org. 
+- +- http://sourceware.cygnus.com/bzip2 +- http://www.muraroa.demon.co.uk +- +- The ideas embodied in _b_z_i_p_2 are due to (at least) the fol- +- lowing people: Michael Burrows and David Wheeler (for the +- block sorting transformation), David Wheeler (again, for +- the Huffman coder), Peter Fenwick (for the structured cod- +- ing model in the original _b_z_i_p_, and many refinements), and +- Alistair Moffat, Radford Neal and Ian Witten (for the +- arithmetic coder in the original _b_z_i_p_)_. I am much +- indebted for their help, support and advice. See the man- +- ual in the source distribution for pointers to sources of +- documentation. Christian von Roques encouraged me to look +- for faster sorting algorithms, so as to speed up compres- +- sion. Bela Lubkin encouraged me to improve the worst-case +- compression performance. Many people sent patches, helped +- with portability problems, lent machines, gave advice and +- were generally helpful. +- +- +- +- +- +- +- +- +- 7 +- +- +diff -Nru bzip2-1.0.1/bzless bzip2-1.0.1.new/bzless +--- bzip2-1.0.1/bzless Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/bzless Sat Jun 24 20:16:09 2000 +@@ -0,0 +1,2 @@ ++#!/bin/sh ++%{_bindir}/bunzip2 -c "\$@" | /usr/bin/less +diff -Nru bzip2-1.0.1/config.h.in bzip2-1.0.1.new/config.h.in +--- bzip2-1.0.1/config.h.in Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/config.h.in Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,17 @@ ++/* config.h.in. Generated automatically from configure.in by autoheader. */ ++ ++/* Name of package */ ++#undef PACKAGE ++ ++/* Version number of package */ ++#undef VERSION ++ ++/* Number of bits in a file offset, on hosts where this is settable. */ ++#undef _FILE_OFFSET_BITS ++ ++/* Define to make fseeko etc. visible, on some hosts. */ ++#undef _LARGEFILE_SOURCE ++ ++/* Define for large files, on AIX-style hosts. 
*/ ++#undef _LARGE_FILES ++ +diff -Nru bzip2-1.0.1/configure.in bzip2-1.0.1.new/configure.in +--- bzip2-1.0.1/configure.in Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/configure.in Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,10 @@ ++AC_INIT(bzip2.c) ++AM_INIT_AUTOMAKE(bzip2,1.0.1) ++AM_CONFIG_HEADER(config.h) ++AC_PROG_CC ++AM_PROG_LIBTOOL ++AC_PROG_LN_S ++AC_SYS_LARGEFILE ++AC_OUTPUT(Makefile ++ doc/Makefile ++ doc/pl/Makefile) +diff -Nru bzip2-1.0.1/crctable.c bzip2-1.0.1.new/crctable.c +--- bzip2-1.0.1/crctable.c Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/crctable.c Sat Jun 24 20:13:06 2000 +@@ -58,6 +58,10 @@ + For more information on these sources, see the manual. + --*/ + ++#ifdef HAVE_CONFIG_H ++#include <config.h> ++#endif ++ + + #include "bzlib_private.h" + +diff -Nru bzip2-1.0.1/decompress.c bzip2-1.0.1.new/decompress.c +--- bzip2-1.0.1/decompress.c Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/decompress.c Sat Jun 24 20:13:06 2000 +@@ -58,6 +58,10 @@ + For more information on these sources, see the manual.
+ --*/ + ++#ifdef HAVE_CONFIG_H ++#include <config.h> ++#endif ++ + + #include "bzlib_private.h" + +diff -Nru bzip2-1.0.1/dlltest.c bzip2-1.0.1.new/dlltest.c +--- bzip2-1.0.1/dlltest.c Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/dlltest.c Sat Jun 24 20:13:06 2000 +@@ -8,6 +8,10 @@ + usage: minibz2 [-d] [-{1,2,..9}] [[srcfilename] destfilename] + */ + ++#ifdef HAVE_CONFIG_H ++#include <config.h> ++#endif ++ + #define BZ_IMPORT + #include <stdio.h> + #include <stdlib.h> +diff -Nru bzip2-1.0.1/doc/Makefile.am bzip2-1.0.1.new/doc/Makefile.am +--- bzip2-1.0.1/doc/Makefile.am Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/Makefile.am Sat Jun 24 20:14:43 2000 +@@ -0,0 +1,5 @@ ++ ++SUBDIRS = pl ++ ++man_MANS = bzip2.1 bunzip2.1 bzcat.1 bzip2recover.1 ++#info_TEXINFOS = bzip2.texi +diff -Nru bzip2-1.0.1/doc/bunzip2.1 bzip2-1.0.1.new/doc/bunzip2.1 +--- bzip2-1.0.1/doc/bunzip2.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/bunzip2.1 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1 @@ ++.so bzip2.1 +\ No newline at end of file +diff -Nru bzip2-1.0.1/doc/bzcat.1 bzip2-1.0.1.new/doc/bzcat.1 +--- bzip2-1.0.1/doc/bzcat.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/bzcat.1 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1 @@ ++.so bzip2.1 +\ No newline at end of file +diff -Nru bzip2-1.0.1/doc/bzip2.1 bzip2-1.0.1.new/doc/bzip2.1 +--- bzip2-1.0.1/doc/bzip2.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/bzip2.1 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,439 @@ ++.PU ++.TH bzip2 1 ++.SH NAME ++bzip2, bunzip2 \- a block-sorting file compressor, v1.0 ++.br ++bzcat \- decompresses files to stdout ++.br ++bzip2recover \- recovers data from damaged bzip2 files ++ ++.SH SYNOPSIS ++.ll +8 ++.B bzip2 ++.RB [ " \-cdfkqstvzVL123456789 " ] ++[ ++.I "filenames \&..." ++] ++.ll -8 ++.br ++.B bunzip2 ++.RB [ " \-fkvsVL " ] ++[ ++.I "filenames \&..." ++] ++.br ++.B bzcat ++.RB [ " \-s " ] ++[ ++.I "filenames \&..."
++] ++.br ++.B bzip2recover ++.I "filename" ++ ++.SH DESCRIPTION ++.I bzip2 ++compresses files using the Burrows-Wheeler block sorting ++text compression algorithm, and Huffman coding. Compression is ++generally considerably better than that achieved by more conventional ++LZ77/LZ78-based compressors, and approaches the performance of the PPM ++family of statistical compressors. ++ ++The command-line options are deliberately very similar to ++those of ++.I GNU gzip, ++but they are not identical. ++ ++.I bzip2 ++expects a list of file names to accompany the ++command-line flags. Each file is replaced by a compressed version of ++itself, with the name "original_name.bz2". ++Each compressed file ++has the same modification date, permissions, and, when possible, ++ownership as the corresponding original, so that these properties can ++be correctly restored at decompression time. File name handling is ++naive in the sense that there is no mechanism for preserving original ++file names, permissions, ownerships or dates in filesystems which lack ++these concepts, or have serious file name length restrictions, such as ++MS-DOS. ++ ++.I bzip2 ++and ++.I bunzip2 ++will by default not overwrite existing ++files. If you want this to happen, specify the \-f flag. ++ ++If no file names are specified, ++.I bzip2 ++compresses from standard ++input to standard output. In this case, ++.I bzip2 ++will decline to ++write compressed output to a terminal, as this would be entirely ++incomprehensible and therefore pointless. ++ ++.I bunzip2 ++(or ++.I bzip2 \-d) ++decompresses all ++specified files. Files which were not created by ++.I bzip2 ++will be detected and ignored, and a warning issued. 
++.I bzip2 ++attempts to guess the filename for the decompressed file ++from that of the compressed file as follows: ++ ++ filename.bz2 becomes filename ++ filename.bz becomes filename ++ filename.tbz2 becomes filename.tar ++ filename.tbz becomes filename.tar ++ anyothername becomes anyothername.out ++ ++If the file does not end in one of the recognised endings, ++.I .bz2, ++.I .bz, ++.I .tbz2 ++or ++.I .tbz, ++.I bzip2 ++complains that it cannot ++guess the name of the original file, and uses the original name ++with ++.I .out ++appended. ++ ++As with compression, supplying no ++filenames causes decompression from ++standard input to standard output. ++ ++.I bunzip2 ++will correctly decompress a file which is the ++concatenation of two or more compressed files. The result is the ++concatenation of the corresponding uncompressed files. Integrity ++testing (\-t) ++of concatenated ++compressed files is also supported. ++ ++You can also compress or decompress files to the standard output by ++giving the \-c flag. Multiple files may be compressed and ++decompressed like this. The resulting outputs are fed sequentially to ++stdout. Compression of multiple files ++in this manner generates a stream ++containing multiple compressed file representations. Such a stream ++can be decompressed correctly only by ++.I bzip2 ++version 0.9.0 or ++later. Earlier versions of ++.I bzip2 ++will stop after decompressing ++the first file in the stream. ++ ++.I bzcat ++(or ++.I bzip2 -dc) ++decompresses all specified files to ++the standard output. ++ ++.I bzip2 ++will read arguments from the environment variables ++.I BZIP2 ++and ++.I BZIP, ++in that order, and will process them ++before any arguments read from the command line. This gives a ++convenient way to supply default arguments. ++ ++Compression is always performed, even if the compressed ++file is slightly ++larger than the original. 
Files of less than about one hundred bytes ++tend to get larger, since the compression mechanism has a constant ++overhead in the region of 50 bytes. Random data (including the output ++of most file compressors) is coded at about 8.05 bits per byte, giving ++an expansion of around 0.5%. ++ ++As a self-check for your protection, ++.I ++bzip2 ++uses 32-bit CRCs to ++make sure that the decompressed version of a file is identical to the ++original. This guards against corruption of the compressed data, and ++against undetected bugs in ++.I bzip2 ++(hopefully very unlikely). The ++chance of data corruption going undetected is microscopic, about one ++chance in four billion for each file processed. Be aware, though, that ++the check occurs upon decompression, so it can only tell you that ++something is wrong. It can't help you ++recover the original uncompressed ++data. You can use ++.I bzip2recover ++to try to recover data from ++damaged files. ++ ++Return values: 0 for a normal exit, 1 for environmental problems (file ++not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt ++compressed file, 3 for an internal consistency error (eg, bug) which ++caused ++.I bzip2 ++to panic. ++ ++.SH OPTIONS ++.TP ++.B \-c --stdout ++Compress or decompress to standard output. ++.TP ++.B \-d --decompress ++Force decompression. ++.I bzip2, ++.I bunzip2 ++and ++.I bzcat ++are ++really the same program, and the decision about what actions to take is ++done on the basis of which name is used. This flag overrides that ++mechanism, and forces ++.I bzip2 ++to decompress. ++.TP ++.B \-z --compress ++The complement to \-d: forces compression, regardless of the ++invocation name. ++.TP ++.B \-t --test ++Check integrity of the specified file(s), but don't decompress them. ++This really performs a trial decompression and throws away the result. ++.TP ++.B \-f --force ++Force overwrite of output files. Normally, ++.I bzip2 ++will not overwrite ++existing output files.
Also forces ++.I bzip2 ++to break hard links ++to files, which it otherwise wouldn't do. ++.TP ++.B \-k --keep ++Keep (don't delete) input files during compression ++or decompression. ++.TP ++.B \-s --small ++Reduce memory usage, for compression, decompression and testing. Files ++are decompressed and tested using a modified algorithm which only ++requires 2.5 bytes per block byte. This means any file can be ++decompressed in 2300k of memory, albeit at about half the normal speed. ++ ++During compression, \-s selects a block size of 200k, which limits ++memory use to around the same figure, at the expense of your compression ++ratio. In short, if your machine is low on memory (8 megabytes or ++less), use \-s for everything. See MEMORY MANAGEMENT below. ++.TP ++.B \-q --quiet ++Suppress non-essential warning messages. Messages pertaining to ++I/O errors and other critical events will not be suppressed. ++.TP ++.B \-v --verbose ++Verbose mode -- show the compression ratio for each file processed. ++Further \-v's increase the verbosity level, spewing out lots of ++information which is primarily of interest for diagnostic purposes. ++.TP ++.B \-L --license -V --version ++Display the software version, license terms and conditions. ++.TP ++.B \-1 to \-9 ++Set the block size to 100 k, 200 k .. 900 k when compressing. Has no ++effect when decompressing. See MEMORY MANAGEMENT below. ++.TP ++.B \-- ++Treats all subsequent arguments as file names, even if they start ++with a dash. This is so you can handle files with names beginning ++with a dash, for example: bzip2 \-- \-myfilename. ++.TP ++.B \--repetitive-fast --repetitive-best ++These flags are redundant in versions 0.9.5 and above. They provided ++some coarse control over the behaviour of the sorting algorithm in ++earlier versions, which was sometimes useful. 0.9.5 and above have an ++improved algorithm which renders these flags irrelevant. ++ ++.SH MEMORY MANAGEMENT ++.I bzip2 ++compresses large files in blocks. 
The block size affects ++both the compression ratio achieved, and the amount of memory needed for ++compression and decompression. The flags \-1 through \-9 ++specify the block size to be 100,000 bytes through 900,000 bytes (the ++default) respectively. At decompression time, the block size used for ++compression is read from the header of the compressed file, and ++.I bunzip2 ++then allocates itself just enough memory to decompress ++the file. Since block sizes are stored in compressed files, it follows ++that the flags \-1 to \-9 are irrelevant to and so ignored ++during decompression. ++ ++Compression and decompression requirements, ++in bytes, can be estimated as: ++ ++ Compression: 400k + ( 8 x block size ) ++ ++ Decompression: 100k + ( 4 x block size ), or ++ 100k + ( 2.5 x block size ) ++ ++Larger block sizes give rapidly diminishing marginal returns. Most of ++the compression comes from the first two or three hundred k of block ++size, a fact worth bearing in mind when using ++.I bzip2 ++on small machines. ++It is also important to appreciate that the decompression memory ++requirement is set at compression time by the choice of block size. ++ ++For files compressed with the default 900k block size, ++.I bunzip2 ++will require about 3700 kbytes to decompress. To support decompression ++of any file on a 4 megabyte machine, ++.I bunzip2 ++has an option to ++decompress using approximately half this amount of memory, about 2300 ++kbytes. Decompression speed is also halved, so you should use this ++option only where necessary. The relevant flag is -s. ++ ++In general, try and use the largest block size memory constraints allow, ++since that maximises the compression achieved. Compression and ++decompression speed are virtually unaffected by block size. ++ ++Another significant point applies to files which fit in a single block ++-- that means most files you'd encounter using a large block size. 
The ++amount of real memory touched is proportional to the size of the file, ++since the file is smaller than a block. For example, compressing a file ++20,000 bytes long with the flag -9 will cause the compressor to ++allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 ++kbytes of it. Similarly, the decompressor will allocate 3700k but only ++touch 100k + 20000 * 4 = 180 kbytes. ++ ++Here is a table which summarises the maximum memory usage for different ++block sizes. Also recorded is the total compressed size for 14 files of ++the Calgary Text Compression Corpus totalling 3,141,622 bytes. This ++column gives some feel for how compression varies with block size. ++These figures tend to understate the advantage of larger block sizes for ++larger files, since the Corpus is dominated by smaller files. ++ ++ Compress Decompress Decompress Corpus ++ Flag usage usage -s usage Size ++ ++ -1 1200k 500k 350k 914704 ++ -2 2000k 900k 600k 877703 ++ -3 2800k 1300k 850k 860338 ++ -4 3600k 1700k 1100k 846899 ++ -5 4400k 2100k 1350k 845160 ++ -6 5200k 2500k 1600k 838626 ++ -7 6100k 2900k 1850k 834096 ++ -8 6800k 3300k 2100k 828642 ++ -9 7600k 3700k 2350k 828642 ++ ++.SH RECOVERING DATA FROM DAMAGED FILES ++.I bzip2 ++compresses files in blocks, usually 900kbytes long. Each ++block is handled independently. If a media or transmission error causes ++a multi-block .bz2 ++file to become damaged, it may be possible to ++recover data from the undamaged blocks in the file. ++ ++The compressed representation of each block is delimited by a 48-bit ++pattern, which makes it possible to find the block boundaries with ++reasonable certainty. Each block also carries its own 32-bit CRC, so ++damaged blocks can be distinguished from undamaged ones. ++ ++.I bzip2recover ++is a simple program whose purpose is to search for ++blocks in .bz2 files, and write each block out into its own .bz2 ++file. 
You can then use ++.I bzip2 ++\-t ++to test the ++integrity of the resulting files, and decompress those which are ++undamaged. ++ ++.I bzip2recover ++takes a single argument, the name of the damaged file, ++and writes a number of files "rec0001file.bz2", ++"rec0002file.bz2", etc, containing the extracted blocks. ++The output filenames are designed so that the use of ++wildcards in subsequent processing -- for example, ++"bzip2 -dc rec*file.bz2 > recovered_data" -- lists the files in ++the correct order. ++ ++.I bzip2recover ++should be of most use dealing with large .bz2 ++files, as these will contain many blocks. It is clearly ++futile to use it on damaged single-block files, since a ++damaged block cannot be recovered. If you wish to minimise ++any potential data loss through media or transmission errors, ++you might consider compressing with a smaller ++block size. ++ ++.SH PERFORMANCE NOTES ++The sorting phase of compression gathers together similar strings in the ++file. Because of this, files containing very long runs of repeated ++symbols, like "aabaabaabaab ..." (repeated several hundred times) may ++compress more slowly than normal. Versions 0.9.5 and above fare much ++better than previous versions in this respect. The ratio between ++worst-case and average-case compression time is in the region of 10:1. ++For previous versions, this figure was more like 100:1. You can use the ++\-vvvv option to monitor progress in great detail, if you want. ++ ++Decompression speed is unaffected by these phenomena. ++ ++.I bzip2 ++usually allocates several megabytes of memory to operate ++in, and then charges all over it in a fairly random fashion. This means ++that performance, both for compressing and decompressing, is largely ++determined by the speed at which your machine can service cache misses. ++Because of this, small changes to the code to reduce the miss rate have ++been observed to give disproportionately large performance improvements. 
++I imagine ++.I bzip2 ++will perform best on machines with very large caches. ++ ++.SH CAVEATS ++I/O error messages are not as helpful as they could be. ++.I bzip2 ++tries hard to detect I/O errors and exit cleanly, but the details of ++what the problem is sometimes seem rather misleading. ++ ++This manual page pertains to version 1.0 of ++.I bzip2. ++Compressed ++data created by this version is entirely forwards and backwards ++compatible with the previous public releases, versions 0.1pl2, 0.9.0 ++and 0.9.5, ++but with the following exception: 0.9.0 and above can correctly ++decompress multiple concatenated compressed files. 0.1pl2 cannot do ++this; it will stop after decompressing just the first file in the ++stream. ++ ++.I bzip2recover ++uses 32-bit integers to represent bit positions in ++compressed files, so it cannot handle compressed files more than 512 ++megabytes long. This could easily be fixed. ++ ++.SH AUTHOR ++Julian Seward, jseward@acm.org. ++ ++http://sourceware.cygnus.com/bzip2 ++http://www.muraroa.demon.co.uk ++ ++The ideas embodied in ++.I bzip2 ++are due to (at least) the following ++people: Michael Burrows and David Wheeler (for the block sorting ++transformation), David Wheeler (again, for the Huffman coder), Peter ++Fenwick (for the structured coding model in the original ++.I bzip, ++and many refinements), and Alistair Moffat, Radford Neal and Ian Witten ++(for the arithmetic coder in the original ++.I bzip). ++I am much ++indebted for their help, support and advice. See the manual in the ++source distribution for pointers to sources of documentation. Christian ++von Roques encouraged me to look for faster sorting algorithms, so as to ++speed up compression. Bela Lubkin encouraged me to improve the ++worst-case compression performance. Many people sent patches, helped ++with portability problems, lent machines, gave advice and were generally ++helpful. 
+diff -Nru bzip2-1.0.1/doc/bzip2.texi bzip2-1.0.1.new/doc/bzip2.texi +--- bzip2-1.0.1/doc/bzip2.texi Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/bzip2.texi Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,2217 @@ ++\input texinfo @c -*- Texinfo -*- ++@setfilename bzip2.info ++ ++@ignore ++This file documents bzip2 version 1.0, and associated library ++libbzip2, written by Julian Seward (jseward@acm.org). ++ ++Copyright (C) 1996-2000 Julian R Seward ++ ++Permission is granted to make and distribute verbatim copies of ++this manual provided the copyright notice and this permission notice ++are preserved on all copies. ++ ++Permission is granted to copy and distribute translations of this manual ++into another language, under the above conditions for verbatim copies. ++@end ignore ++ ++@ifinfo ++@format ++@dircategory File utilities: ++* Bzip2: (bzip2). A program and library for data ++ compression ++@end direntry ++@end format ++@end ifinfo ++ ++@iftex ++@c @finalout ++@settitle bzip2 and libbzip2 ++@titlepage ++@title bzip2 and libbzip2 ++@subtitle a program and library for data compression ++@subtitle copyright (C) 1996-2000 Julian Seward ++@subtitle version 1.0 of 21 March 2000 ++@author Julian Seward ++ ++@end titlepage ++ ++@parindent 0mm ++@parskip 2mm ++ ++@end iftex ++@node Top, Overview, (dir), (dir) ++ ++@top bzip2 ++ ++This program, @code{bzip2}, ++and associated library @code{libbzip2}, are ++Copyright (C) 1996-2000 Julian R Seward. All rights reserved. ++ ++Redistribution and use in source and binary forms, with or without ++modification, are permitted provided that the following conditions ++are met: ++@itemize @bullet ++@item ++ Redistributions of source code must retain the above copyright ++ notice, this list of conditions and the following disclaimer. ++@item ++ The origin of this software must not be misrepresented; you must ++ not claim that you wrote the original software. 
If you use this ++ software in a product, an acknowledgment in the product ++ documentation would be appreciated but is not required. ++@item ++ Altered source versions must be plainly marked as such, and must ++ not be misrepresented as being the original software. ++@item ++ The name of the author may not be used to endorse or promote ++ products derived from this software without specific prior written ++ permission. ++@end itemize ++THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS ++OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED ++WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ++ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY ++DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL ++DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE ++GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS ++INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, ++WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING ++NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS ++SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ++ ++Julian Seward, Cambridge, UK. ++ ++@code{jseward@@acm.org} ++ ++@code{http://sourceware.cygnus.com/bzip2} ++ ++@code{http://www.cacheprof.org} ++ ++@code{http://www.muraroa.demon.co.uk} ++ ++@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000. ++ ++PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented ++algorithms. However, I do not have the resources available to carry out ++a full patent search. Therefore I cannot give any guarantee of the ++above statement. ++ ++ ++ ++ ++ ++ ++ ++@node Overview, Implementation, Top, Top ++@chapter Introduction ++ ++@code{bzip2} compresses files using the Burrows-Wheeler ++block-sorting text compression algorithm, and Huffman coding. 
++Compression is generally considerably better than that ++achieved by more conventional LZ77/LZ78-based compressors, ++and approaches the performance of the PPM family of statistical compressors. ++ ++@code{bzip2} is built on top of @code{libbzip2}, a flexible library ++for handling compressed data in the @code{bzip2} format. This manual ++describes both how to use the program and ++how to work with the library interface. Most of the ++manual is devoted to this library, not the program, ++which is good news if your interest is only in the program. ++ ++Chapter 2 describes how to use @code{bzip2}; this is the only part ++you need to read if you just want to know how to operate the program. ++Chapter 3 describes the programming interfaces in detail, and ++Chapter 4 records some miscellaneous notes which I thought ++ought to be recorded somewhere. ++ ++ ++@chapter How to use @code{bzip2} ++ ++This chapter contains a copy of the @code{bzip2} man page, ++and nothing else. ++ ++@quotation ++ ++@unnumberedsubsubsec NAME ++@itemize ++@item @code{bzip2}, @code{bunzip2} ++- a block-sorting file compressor, v1.0 ++@item @code{bzcat} ++- decompresses files to stdout ++@item @code{bzip2recover} ++- recovers data from damaged bzip2 files ++@end itemize ++ ++@unnumberedsubsubsec SYNOPSIS ++@itemize ++@item @code{bzip2} [ -cdfkqstvzVL123456789 ] [ filenames ... ] ++@item @code{bunzip2} [ -fkvsVL ] [ filenames ... ] ++@item @code{bzcat} [ -s ] [ filenames ... ] ++@item @code{bzip2recover} filename ++@end itemize ++ ++@unnumberedsubsubsec DESCRIPTION ++ ++@code{bzip2} compresses files using the Burrows-Wheeler block sorting ++text compression algorithm, and Huffman coding. Compression is ++generally considerably better than that achieved by more conventional ++LZ77/LZ78-based compressors, and approaches the performance of the PPM ++family of statistical compressors. 
++ ++The command-line options are deliberately very similar to those of GNU ++@code{gzip}, but they are not identical. ++ ++@code{bzip2} expects a list of file names to accompany the command-line ++flags. Each file is replaced by a compressed version of itself, with ++the name @code{original_name.bz2}. Each compressed file has the same ++modification date, permissions, and, when possible, ownership as the ++corresponding original, so that these properties can be correctly ++restored at decompression time. File name handling is naive in the ++sense that there is no mechanism for preserving original file names, ++permissions, ownerships or dates in filesystems which lack these ++concepts, or have serious file name length restrictions, such as MS-DOS. ++ ++@code{bzip2} and @code{bunzip2} will by default not overwrite existing ++files. If you want this to happen, specify the @code{-f} flag. ++ ++If no file names are specified, @code{bzip2} compresses from standard ++input to standard output. In this case, @code{bzip2} will decline to ++write compressed output to a terminal, as this would be entirely ++incomprehensible and therefore pointless. ++ ++@code{bunzip2} (or @code{bzip2 -d}) decompresses all ++specified files. Files which were not created by @code{bzip2} ++will be detected and ignored, and a warning issued. 
++@code{bzip2} attempts to guess the filename for the decompressed file ++from that of the compressed file as follows: ++@itemize ++@item @code{filename.bz2 } becomes @code{filename} ++@item @code{filename.bz } becomes @code{filename} ++@item @code{filename.tbz2} becomes @code{filename.tar} ++@item @code{filename.tbz } becomes @code{filename.tar} ++@item @code{anyothername } becomes @code{anyothername.out} ++@end itemize ++If the file does not end in one of the recognised endings, ++@code{.bz2}, @code{.bz}, ++@code{.tbz2} or @code{.tbz}, @code{bzip2} complains that it cannot ++guess the name of the original file, and uses the original name ++with @code{.out} appended. ++ ++As with compression, supplying no ++filenames causes decompression from standard input to standard output. ++ ++@code{bunzip2} will correctly decompress a file which is the ++concatenation of two or more compressed files. The result is the ++concatenation of the corresponding uncompressed files. Integrity ++testing (@code{-t}) of concatenated compressed files is also supported. ++ ++You can also compress or decompress files to the standard output by ++giving the @code{-c} flag. Multiple files may be compressed and ++decompressed like this. The resulting outputs are fed sequentially to ++stdout. Compression of multiple files in this manner generates a stream ++containing multiple compressed file representations. Such a stream ++can be decompressed correctly only by @code{bzip2} version 0.9.0 or ++later. Earlier versions of @code{bzip2} will stop after decompressing ++the first file in the stream. ++ ++@code{bzcat} (or @code{bzip2 -dc}) decompresses all specified files to ++the standard output. ++ ++@code{bzip2} will read arguments from the environment variables ++@code{BZIP2} and @code{BZIP}, in that order, and will process them ++before any arguments read from the command line. This gives a ++convenient way to supply default arguments. 
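The suffix rules listed earlier in this section can be sketched in a few lines. This is an illustration only, written in Python rather than taken from the bzip2 sources; the names `SUFFIX_RULES` and `guess_output_name` are hypothetical.

```python
# Hypothetical helper mirroring the documented suffix rules.
# It is NOT bzip2's own code, only a restatement of the table in the manual.
SUFFIX_RULES = [
    (".bz2", ""),       # filename.bz2  becomes filename
    (".bz", ""),        # filename.bz   becomes filename
    (".tbz2", ".tar"),  # filename.tbz2 becomes filename.tar
    (".tbz", ".tar"),   # filename.tbz  becomes filename.tar
]


def guess_output_name(name: str) -> str:
    """Return the decompressed file name the manual says bunzip2 will pick."""
    for suffix, replacement in SUFFIX_RULES:
        if name.endswith(suffix):
            return name[: -len(suffix)] + replacement
    # Unrecognised ending: keep the original name and append ".out".
    return name + ".out"
```

For instance, `guess_output_name("backup.tbz2")` gives `backup.tar`, while a name with no recognised ending simply gains `.out`.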
++ ++Compression is always performed, even if the compressed file is slightly ++larger than the original. Files of less than about one hundred bytes ++tend to get larger, since the compression mechanism has a constant ++overhead in the region of 50 bytes. Random data (including the output ++of most file compressors) is coded at about 8.05 bits per byte, giving ++an expansion of around 0.5%. ++ ++As a self-check for your protection, @code{bzip2} uses 32-bit CRCs to ++make sure that the decompressed version of a file is identical to the ++original. This guards against corruption of the compressed data, and ++against undetected bugs in @code{bzip2} (hopefully very unlikely). The ++chance of data corruption going undetected is microscopic, about one ++chance in four billion for each file processed. Be aware, though, that ++the check occurs upon decompression, so it can only tell you that ++something is wrong. It can't help you recover the original uncompressed ++data. You can use @code{bzip2recover} to try to recover data from ++damaged files. ++ ++Return values: 0 for a normal exit, 1 for environmental problems (file ++not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt ++compressed file, 3 for an internal consistency error (eg, bug) which ++caused @code{bzip2} to panic. ++ ++ ++@unnumberedsubsubsec OPTIONS ++@table @code ++@item -c --stdout ++Compress or decompress to standard output. ++@item -d --decompress ++Force decompression. @code{bzip2}, @code{bunzip2} and @code{bzcat} are ++really the same program, and the decision about what actions to take is ++done on the basis of which name is used. This flag overrides that ++mechanism, and forces bzip2 to decompress. ++@item -z --compress ++The complement to @code{-d}: forces compression, regardless of the ++invocation name. ++@item -t --test ++Check integrity of the specified file(s), but don't decompress them. ++This really performs a trial decompression and throws away the result.
++@item -f --force ++Force overwrite of output files. Normally, @code{bzip2} will not overwrite ++existing output files. Also forces @code{bzip2} to break hard links ++to files, which it otherwise wouldn't do. ++@item -k --keep ++Keep (don't delete) input files during compression ++or decompression. ++@item -s --small ++Reduce memory usage, for compression, decompression and testing. Files ++are decompressed and tested using a modified algorithm which only ++requires 2.5 bytes per block byte. This means any file can be ++decompressed in 2300k of memory, albeit at about half the normal speed. ++ ++During compression, @code{-s} selects a block size of 200k, which limits ++memory use to around the same figure, at the expense of your compression ++ratio. In short, if your machine is low on memory (8 megabytes or ++less), use -s for everything. See MEMORY MANAGEMENT below. ++@item -q --quiet ++Suppress non-essential warning messages. Messages pertaining to ++I/O errors and other critical events will not be suppressed. ++@item -v --verbose ++Verbose mode -- show the compression ratio for each file processed. ++Further @code{-v}'s increase the verbosity level, spewing out lots of ++information which is primarily of interest for diagnostic purposes. ++@item -L --license -V --version ++Display the software version, license terms and conditions. ++@item -1 to -9 ++Set the block size to 100 k, 200 k .. 900 k when compressing. Has no ++effect when decompressing. See MEMORY MANAGEMENT below. ++@item -- ++Treats all subsequent arguments as file names, even if they start ++with a dash. This is so you can handle files with names beginning ++with a dash, for example: @code{bzip2 -- -myfilename}. ++@item --repetitive-fast ++@item --repetitive-best ++These flags are redundant in versions 0.9.5 and above. They provided ++some coarse control over the behaviour of the sorting algorithm in ++earlier versions, which was sometimes useful. 
0.9.5 and above have an ++improved algorithm which renders these flags irrelevant. ++@end table ++ ++ ++@unnumberedsubsubsec MEMORY MANAGEMENT ++ ++@code{bzip2} compresses large files in blocks. The block size affects ++both the compression ratio achieved, and the amount of memory needed for ++compression and decompression. The flags @code{-1} through @code{-9} ++specify the block size to be 100,000 bytes through 900,000 bytes (the ++default) respectively. At decompression time, the block size used for ++compression is read from the header of the compressed file, and ++@code{bunzip2} then allocates itself just enough memory to decompress ++the file. Since block sizes are stored in compressed files, it follows ++that the flags @code{-1} to @code{-9} are irrelevant to and so ignored ++during decompression. ++ ++Compression and decompression requirements, in bytes, can be estimated ++as: ++@example ++ Compression: 400k + ( 8 x block size ) ++ ++ Decompression: 100k + ( 4 x block size ), or ++ 100k + ( 2.5 x block size ) ++@end example ++Larger block sizes give rapidly diminishing marginal returns. Most of ++the compression comes from the first two or three hundred k of block ++size, a fact worth bearing in mind when using @code{bzip2} on small machines. ++It is also important to appreciate that the decompression memory ++requirement is set at compression time by the choice of block size. ++ ++For files compressed with the default 900k block size, @code{bunzip2} ++will require about 3700 kbytes to decompress. To support decompression ++of any file on a 4 megabyte machine, @code{bunzip2} has an option to ++decompress using approximately half this amount of memory, about 2300 ++kbytes. Decompression speed is also halved, so you should use this ++option only where necessary. The relevant flag is @code{-s}. ++ ++In general, try and use the largest block size memory constraints allow, ++since that maximises the compression achieved. 
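The memory estimates given in this section can be worked through mechanically. The Python sketch below is an illustration only: the function name `estimate_memory` is hypothetical, and it assumes that "k" means 1000 bytes, which is what the manual's own arithmetic elsewhere in this section (400k + 20000 * 8 = 560 kbytes) implies.

```python
K = 1000  # assumption: the manual's "k" is 1000 bytes, not 1024


def estimate_memory(level: int) -> dict:
    """Estimate peak memory in bytes for the flags -1 .. -9.

    Applies the formulas quoted in the manual:
      compression:        400k + 8   x block size
      decompression:      100k + 4   x block size
      decompression (-s): 100k + 2.5 x block size
    """
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    block_size = level * 100 * K  # -1 is 100,000 bytes, -9 is 900,000 bytes
    return {
        "compress": 400 * K + 8 * block_size,
        "decompress": 100 * K + 4 * block_size,
        "decompress_small": 100 * K + (5 * block_size) // 2,
    }
```

For `-9` this gives 7600k, 3700k and 2350k, matching the last row of the table later in this section. For `-7` the formula gives 6000k for compression, whereas the table lists 6100k.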
Compression and ++decompression speed are virtually unaffected by block size. ++ ++Another significant point applies to files which fit in a single block ++-- that means most files you'd encounter using a large block size. The ++amount of real memory touched is proportional to the size of the file, ++since the file is smaller than a block. For example, compressing a file ++20,000 bytes long with the flag @code{-9} will cause the compressor to ++allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 ++kbytes of it. Similarly, the decompressor will allocate 3700k but only ++touch 100k + 20000 * 4 = 180 kbytes. ++ ++Here is a table which summarises the maximum memory usage for different ++block sizes. Also recorded is the total compressed size for 14 files of ++the Calgary Text Compression Corpus totalling 3,141,622 bytes. This ++column gives some feel for how compression varies with block size. ++These figures tend to understate the advantage of larger block sizes for ++larger files, since the Corpus is dominated by smaller files. ++@example ++ Compress Decompress Decompress Corpus ++ Flag usage usage -s usage Size ++ ++ -1 1200k 500k 350k 914704 ++ -2 2000k 900k 600k 877703 ++ -3 2800k 1300k 850k 860338 ++ -4 3600k 1700k 1100k 846899 ++ -5 4400k 2100k 1350k 845160 ++ -6 5200k 2500k 1600k 838626 ++ -7 6100k 2900k 1850k 834096 ++ -8 6800k 3300k 2100k 828642 ++ -9 7600k 3700k 2350k 828642 ++@end example ++ ++@unnumberedsubsubsec RECOVERING DATA FROM DAMAGED FILES ++ ++@code{bzip2} compresses files in blocks, usually 900kbytes long. Each ++block is handled independently. If a media or transmission error causes ++a multi-block @code{.bz2} file to become damaged, it may be possible to ++recover data from the undamaged blocks in the file. ++ ++The compressed representation of each block is delimited by a 48-bit ++pattern, which makes it possible to find the block boundaries with ++reasonable certainty. 
Each block also carries its own 32-bit CRC, so ++damaged blocks can be distinguished from undamaged ones. ++ ++@code{bzip2recover} is a simple program whose purpose is to search for ++blocks in @code{.bz2} files, and write each block out into its own ++@code{.bz2} file. You can then use @code{bzip2 -t} to test the ++integrity of the resulting files, and decompress those which are ++undamaged. ++ ++@code{bzip2recover} ++takes a single argument, the name of the damaged file, ++and writes a number of files @code{rec0001file.bz2}, ++ @code{rec0002file.bz2}, etc, containing the extracted blocks. ++ The output filenames are designed so that the use of ++ wildcards in subsequent processing -- for example, ++@code{bzip2 -dc rec*file.bz2 > recovered_data} -- lists the files in ++ the correct order. ++ ++@code{bzip2recover} should be of most use dealing with large @code{.bz2} ++ files, as these will contain many blocks. It is clearly ++ futile to use it on damaged single-block files, since a ++ damaged block cannot be recovered. If you wish to minimise ++any potential data loss through media or transmission errors, ++you might consider compressing with a smaller ++ block size. ++ ++ ++@unnumberedsubsubsec PERFORMANCE NOTES ++ ++The sorting phase of compression gathers together similar strings in the ++file. Because of this, files containing very long runs of repeated ++symbols, like "aabaabaabaab ..." (repeated several hundred times) may ++compress more slowly than normal. Versions 0.9.5 and above fare much ++better than previous versions in this respect. The ratio between ++worst-case and average-case compression time is in the region of 10:1. ++For previous versions, this figure was more like 100:1. You can use the ++@code{-vvvv} option to monitor progress in great detail, if you want. ++ ++Decompression speed is unaffected by these phenomena. 
++ ++@code{bzip2} usually allocates several megabytes of memory to operate ++in, and then charges all over it in a fairly random fashion. This means ++that performance, both for compressing and decompressing, is largely ++determined by the speed at which your machine can service cache misses. ++Because of this, small changes to the code to reduce the miss rate have ++been observed to give disproportionately large performance improvements. ++I imagine @code{bzip2} will perform best on machines with very large ++caches. ++ ++ ++@unnumberedsubsubsec CAVEATS ++ ++I/O error messages are not as helpful as they could be. @code{bzip2} ++tries hard to detect I/O errors and exit cleanly, but the details of ++what the problem is sometimes seem rather misleading. ++ ++This manual page pertains to version 1.0 of @code{bzip2}. Compressed ++data created by this version is entirely forwards and backwards ++compatible with the previous public releases, versions 0.1pl2, 0.9.0 and ++0.9.5, but with the following exception: 0.9.0 and above can correctly ++decompress multiple concatenated compressed files. 0.1pl2 cannot do ++this; it will stop after decompressing just the first file in the ++stream. ++ ++@code{bzip2recover} uses 32-bit integers to represent bit positions in ++compressed files, so it cannot handle compressed files more than 512 ++megabytes long. This could easily be fixed. ++ ++ ++@unnumberedsubsubsec AUTHOR ++Julian Seward, @code{jseward@@acm.org}. ++ ++The ideas embodied in @code{bzip2} are due to (at least) the following ++people: Michael Burrows and David Wheeler (for the block sorting ++transformation), David Wheeler (again, for the Huffman coder), Peter ++Fenwick (for the structured coding model in the original @code{bzip}, ++and many refinements), and Alistair Moffat, Radford Neal and Ian Witten ++(for the arithmetic coder in the original @code{bzip}). I am much ++indebted for their help, support and advice. 
See the manual in the ++source distribution for pointers to sources of documentation. Christian ++von Roques encouraged me to look for faster sorting algorithms, so as to ++speed up compression. Bela Lubkin encouraged me to improve the ++worst-case compression performance. Many people sent patches, helped ++with portability problems, lent machines, gave advice and were generally ++helpful. ++ ++@end quotation ++ ++ ++ ++ ++@chapter Programming with @code{libbzip2} ++ ++This chapter describes the programming interface to @code{libbzip2}. ++ ++For general background information, particularly about memory ++use and performance aspects, you'd be well advised to read Chapter 2 ++as well. ++ ++@section Top-level structure ++ ++@code{libbzip2} is a flexible library for compressing and decompressing ++data in the @code{bzip2} data format. Although packaged as a single ++entity, it helps to regard the library as three separate parts: the low ++level interface, and the high level interface, and some utility ++functions. ++ ++The structure of @code{libbzip2}'s interfaces is similar to ++that of Jean-loup Gailly's and Mark Adler's excellent @code{zlib} ++library. ++ ++All externally visible symbols have names beginning @code{BZ2_}. ++This is new in version 1.0. The intention is to minimise pollution ++of the namespaces of library clients. ++ ++@subsection Low-level summary ++ ++This interface provides services for compressing and decompressing ++data in memory. There's no provision for dealing with files, streams ++or any other I/O mechanisms, just straight memory-to-memory work. ++In fact, this part of the library can be compiled without inclusion ++of @code{stdio.h}, which may be helpful for embedded applications. ++ ++The low-level part of the library has no global variables and ++is therefore thread-safe. 
++ ++Six routines make up the low level interface: ++@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, and @* @code{BZ2_bzCompressEnd} ++for compression, ++and a corresponding trio @code{BZ2_bzDecompressInit}, @* @code{BZ2_bzDecompress} ++and @code{BZ2_bzDecompressEnd} for decompression. ++The @code{*Init} functions allocate ++memory for compression/decompression and do other ++initialisations, whilst the @code{*End} functions close down operations ++and release memory. ++ ++The real work is done by @code{BZ2_bzCompress} and @code{BZ2_bzDecompress}. ++These compress and decompress data from a user-supplied input buffer ++to a user-supplied output buffer. These buffers can be any size; ++arbitrary quantities of data are handled by making repeated calls ++to these functions. This is a flexible mechanism allowing a ++consumer-pull style of activity, or producer-push, or a mixture of ++both. ++ ++ ++ ++@subsection High-level summary ++ ++This interface provides some handy wrappers around the low-level ++interface to facilitate reading and writing @code{bzip2} format ++files (@code{.bz2} files). The routines provide hooks to facilitate ++reading files in which the @code{bzip2} data stream is embedded ++within some larger-scale file structure, or where there are ++multiple @code{bzip2} data streams concatenated end-to-end. ++ ++For reading files, @code{BZ2_bzReadOpen}, @code{BZ2_bzRead}, ++@code{BZ2_bzReadClose} and @* @code{BZ2_bzReadGetUnused} are supplied. For ++writing files, @code{BZ2_bzWriteOpen}, @code{BZ2_bzWrite} and ++@code{BZ2_bzWriteClose} are available. ++ ++As with the low-level library, no global variables are used ++so the library is per se thread-safe. However, if I/O errors ++occur whilst reading or writing the underlying compressed files, ++you may have to consult @code{errno} to determine the cause of ++the error. In that case, you'd need a C library which correctly ++supports @code{errno} in a multithreaded environment.
++ ++To make the library a little simpler and more portable, ++@code{BZ2_bzReadOpen} and @code{BZ2_bzWriteOpen} require you to pass them file ++handles (@code{FILE*}s) which have previously been opened for reading or ++writing respectively. That avoids portability problems associated with ++file operations and file attributes, whilst not being much of an ++imposition on the programmer. ++ ++ ++ ++@subsection Utility functions summary ++For very simple needs, @code{BZ2_bzBuffToBuffCompress} and ++@code{BZ2_bzBuffToBuffDecompress} are provided. These compress ++data in memory from one buffer to another buffer in a single ++function call. You should assess whether these functions ++fulfill your memory-to-memory compression/decompression ++requirements before investing effort in understanding the more ++general but more complex low-level interface. ++ ++Yoshioka Tsuneo (@code{QWF00133@@niftyserve.or.jp} / ++@code{tsuneo-y@@is.aist-nara.ac.jp}) has contributed some functions to ++give better @code{zlib} compatibility. These functions are ++@code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, ++@code{BZ2_bzclose}, ++@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. You may find these functions ++more convenient for simple file reading and writing, than those in the ++high-level interface. These functions are not (yet) officially part of ++the library, and are minimally documented here. If they break, you ++get to keep all the pieces. I hope to document them properly when time ++permits. ++ ++Yoshioka also contributed modifications to allow the library to be ++built as a Windows DLL. ++ ++ ++@section Error handling ++ ++The library is designed to recover cleanly in all situations, including ++the worst-case situation of decompressing random data. I'm not ++100% sure that it can always do this, so you might want to add ++a signal handler to catch segmentation violations during decompression ++if you are feeling especially paranoid. 
I would be interested in ++hearing more about the robustness of the library to corrupted ++compressed data. ++ ++Version 1.0 is much more robust in this respect than ++0.9.0 or 0.9.5. Investigations with Checker (a tool for ++detecting problems with memory management, similar to Purify) ++indicate that, at least for the few files I tested, all single-bit ++errors in the decompressed data are caught properly, with no ++segmentation faults, no reads of uninitialised data and no ++out of range reads or writes. So it's certainly much improved, ++although I wouldn't claim it to be totally bombproof. ++ ++The file @code{bzlib.h} contains all definitions needed to use ++the library. In particular, you should definitely not include ++@code{bzlib_private.h}. ++ ++In @code{bzlib.h}, the various return values are defined. The following ++list is not intended as an exhaustive description of the circumstances ++in which a given value may be returned -- those descriptions are given ++later. Rather, it is intended to convey the rough meaning of each ++return value. The first five actions are normal and not intended to ++denote an error situation. ++@table @code ++@item BZ_OK ++The requested action was completed successfully. ++@item BZ_RUN_OK ++@itemx BZ_FLUSH_OK ++@itemx BZ_FINISH_OK ++In @code{BZ2_bzCompress}, the requested flush/finish/nothing-special action ++was completed successfully. ++@item BZ_STREAM_END ++Compression of data was completed, or the logical stream end was ++detected during decompression. ++@end table ++ ++The following return values indicate an error of some kind. ++@table @code ++@item BZ_CONFIG_ERROR ++Indicates that the library has been improperly compiled on your ++platform -- a major configuration error. Specifically, it means ++that @code{sizeof(char)}, @code{sizeof(short)} and @code{sizeof(int)} ++are not 1, 2 and 4 respectively, as they should be. 
Note that the ++library should still work properly on 64-bit platforms which follow ++the LP64 programming model -- that is, where @code{sizeof(long)} ++and @code{sizeof(void*)} are 8. Under LP64, @code{sizeof(int)} is ++still 4, so @code{libbzip2}, which doesn't use the @code{long} type, ++is OK. ++@item BZ_SEQUENCE_ERROR ++When using the library, it is important to call the functions in the ++correct sequence and with data structures (buffers etc) in the correct ++states. @code{libbzip2} checks as much as it can to ensure this is ++happening, and returns @code{BZ_SEQUENCE_ERROR} if not. Code which ++complies precisely with the function semantics, as detailed below, ++should never receive this value; such an event denotes buggy code ++which you should investigate. ++@item BZ_PARAM_ERROR ++Returned when a parameter to a function call is out of range ++or otherwise manifestly incorrect. As with @code{BZ_SEQUENCE_ERROR}, ++this denotes a bug in the client code. The distinction between ++@code{BZ_PARAM_ERROR} and @code{BZ_SEQUENCE_ERROR} is a bit hazy, but still worth ++making. ++@item BZ_MEM_ERROR ++Returned when a request to allocate memory failed. Note that the ++quantity of memory needed to decompress a stream cannot be determined ++until the stream's header has been read. So @code{BZ2_bzDecompress} and ++@code{BZ2_bzRead} may return @code{BZ_MEM_ERROR} even though some of ++the compressed data has been read. The same is not true for ++compression; once @code{BZ2_bzCompressInit} or @code{BZ2_bzWriteOpen} have ++successfully completed, @code{BZ_MEM_ERROR} cannot occur. ++@item BZ_DATA_ERROR ++Returned when a data integrity error is detected during decompression. ++Most importantly, this means when stored and computed CRCs for the ++data do not match. This value is also returned upon detection of any ++other anomaly in the compressed data. 
++@item BZ_DATA_ERROR_MAGIC ++As a special case of @code{BZ_DATA_ERROR}, it is sometimes useful to ++know when the compressed stream does not start with the correct ++magic bytes (@code{'B' 'Z' 'h'}). ++@item BZ_IO_ERROR ++Returned by @code{BZ2_bzRead} and @code{BZ2_bzWrite} when there is an error ++reading or writing in the compressed file, and by @code{BZ2_bzReadOpen} ++and @code{BZ2_bzWriteOpen} for attempts to use a file for which the ++error indicator (viz, @code{ferror(f)}) is set. ++On receipt of @code{BZ_IO_ERROR}, the caller should consult ++@code{errno} and/or @code{perror} to acquire operating-system ++specific information about the problem. ++@item BZ_UNEXPECTED_EOF ++Returned by @code{BZ2_bzRead} when the compressed file finishes ++before the logical end of stream is detected. ++@item BZ_OUTBUFF_FULL ++Returned by @code{BZ2_bzBuffToBuffCompress} and ++@code{BZ2_bzBuffToBuffDecompress} to indicate that the output data ++will not fit into the output buffer provided. ++@end table ++ ++ ++ ++@section Low-level interface ++ ++@subsection @code{BZ2_bzCompressInit} ++@example ++typedef ++ struct @{ ++ char *next_in; ++ unsigned int avail_in; ++ unsigned int total_in_lo32; ++ unsigned int total_in_hi32; ++ ++ char *next_out; ++ unsigned int avail_out; ++ unsigned int total_out_lo32; ++ unsigned int total_out_hi32; ++ ++ void *state; ++ ++ void *(*bzalloc)(void *,int,int); ++ void (*bzfree)(void *,void *); ++ void *opaque; ++ @} ++ bz_stream; ++ ++int BZ2_bzCompressInit ( bz_stream *strm, ++ int blockSize100k, ++ int verbosity, ++ int workFactor ); ++ ++@end example ++ ++Prepares for compression. The @code{bz_stream} structure ++holds all data pertaining to the compression activity. ++A @code{bz_stream} structure should be allocated and initialised ++prior to the call. ++The fields of @code{bz_stream} ++comprise the entirety of the user-visible data. @code{state} ++is a pointer to the private data structures required for compression. 
++ ++Custom memory allocators are supported, via fields @code{bzalloc}, ++@code{bzfree}, ++and @code{opaque}. The value ++@code{opaque} is passed as the first argument to ++all calls to @code{bzalloc} and @code{bzfree}, but is ++otherwise ignored by the library. ++The call @code{bzalloc ( opaque, n, m )} is expected to return a ++pointer @code{p} to ++@code{n * m} bytes of memory, and @code{bzfree ( opaque, p )} ++should free ++that memory. ++ ++If you don't want to use a custom memory allocator, set @code{bzalloc}, ++@code{bzfree} and ++@code{opaque} to @code{NULL}, ++and the library will then use the standard @code{malloc}/@code{free} ++routines. ++ ++Before calling @code{BZ2_bzCompressInit}, fields @code{bzalloc}, ++@code{bzfree} and @code{opaque} should ++be filled appropriately, as just described. Upon return, the internal ++state will have been allocated and initialised, and @code{total_in_lo32}, ++@code{total_in_hi32}, @code{total_out_lo32} and ++@code{total_out_hi32} will have been set to zero. ++These four fields are used by the library ++to inform the caller of the total amount of data passed into and out of ++the library, respectively. You should not try to change them. ++As of version 1.0, 64-bit counts are maintained, even on 32-bit ++platforms, using the @code{_hi32} fields to store the upper 32 bits ++of the count. So, for example, the total amount of data in ++is @code{(total_in_hi32 << 32) + total_in_lo32}, evaluated in a ++64-bit integer type. ++ ++Parameter @code{blockSize100k} specifies the block size to be used for ++compression. It should be a value between 1 and 9 inclusive, and the ++actual block size used is 100000 x this figure. 9 gives the best ++compression but takes most memory. ++ ++Parameter @code{verbosity} should be set to a number between 0 and 4 ++inclusive. 0 is silent, and greater numbers give increasingly verbose ++monitoring/debugging output. If the library has been compiled with ++@code{-DBZ_NO_STDIO}, no such output will appear for any verbosity ++setting.
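The 64-bit count described above needs a little care in C. Both halves are declared as `unsigned int`, so the high half has to be widened before it is shifted. The following is a minimal sketch, assuming a compiler that provides `<stdint.h>`; `combine_count` is a hypothetical helper and not part of `libbzip2`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of libbzip2: joins the _hi32 and _lo32
   halves of a byte count into a single 64-bit value. */
static uint64_t combine_count(unsigned int hi32, unsigned int lo32)
{
    /* The library itself requires sizeof(int) == 4 (see BZ_CONFIG_ERROR). */
    assert(sizeof(unsigned int) == 4);

    /* Widen the high half first: shifting a 32-bit value by 32 bits
       is undefined behaviour in C. */
    return ((uint64_t)hi32 << 32) | (uint64_t)lo32;
}
```

With a `bz_stream` named `strm`, the total input consumed so far would be `combine_count(strm.total_in_hi32, strm.total_in_lo32)`.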
++ ++Parameter @code{workFactor} controls how the compression phase behaves ++when presented with worst case, highly repetitive, input data. If ++compression runs into difficulties caused by repetitive data, the ++library switches from the standard sorting algorithm to a fallback ++algorithm. The fallback is slower than the standard algorithm by ++perhaps a factor of three, but always behaves reasonably, no matter how ++bad the input. ++ ++Lower values of @code{workFactor} reduce the amount of effort the ++standard algorithm will expend before resorting to the fallback. You ++should set this parameter carefully; too low, and many inputs will be ++handled by the fallback algorithm and so compress rather slowly, too ++high, and your average-to-worst case compression times can become very ++large. The default value of 30 gives reasonable behaviour over a wide ++range of circumstances. ++ ++Allowable values range from 0 to 250 inclusive. 0 is a special case, ++equivalent to using the default value of 30. ++ ++Note that the compressed output generated is the same regardless of ++whether or not the fallback algorithm is used. ++ ++Be aware also that this parameter may disappear entirely in future ++versions of the library. In principle it should be possible to devise a ++good way to automatically choose which algorithm to use. Such a ++mechanism would render the parameter obsolete. 
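The documented ranges for the three tuning parameters can be checked before `BZ2_bzCompressInit` is ever called. The sketch below is only an illustration of those ranges: the helper names are hypothetical, and the two return codes are repeated with the values used in `bzlib.h` so that the fragment compiles without the library headers.

```c
#include <assert.h>

/* Return codes as defined in bzlib.h, repeated here only so that the
   sketch compiles on its own. */
#ifndef BZ_OK
#define BZ_OK 0
#define BZ_PARAM_ERROR (-2)
#endif

/* Hypothetical helper: checks the parameters against the documented
   ranges (blockSize100k 1..9, verbosity 0..4, workFactor 0..250). */
static int check_compress_params(int blockSize100k, int verbosity,
                                 int workFactor)
{
    if (blockSize100k < 1 || blockSize100k > 9) return BZ_PARAM_ERROR;
    if (verbosity < 0 || verbosity > 4)         return BZ_PARAM_ERROR;
    if (workFactor < 0 || workFactor > 250)     return BZ_PARAM_ERROR;
    return BZ_OK;
}

/* Hypothetical helper: a workFactor of 0 means "use the default of 30". */
static int effective_work_factor(int workFactor)
{
    assert(workFactor >= 0 && workFactor <= 250);
    return workFactor == 0 ? 30 : workFactor;
}

/* Hypothetical helper: the block size in bytes is 100000 x blockSize100k. */
static long block_size_bytes(int blockSize100k)
{
    assert(blockSize100k >= 1 && blockSize100k <= 9);
    return 100000L * blockSize100k;
}
```

A caller that takes these values from user input could run `check_compress_params` first and report a readable message, rather than decoding `BZ_PARAM_ERROR` after the fact.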
++ ++Possible return values: ++@display ++ @code{BZ_CONFIG_ERROR} ++ if the library has been mis-compiled ++ @code{BZ_PARAM_ERROR} ++ if @code{strm} is @code{NULL} ++ or @code{blockSize} < 1 or @code{blockSize} > 9 ++ or @code{verbosity} < 0 or @code{verbosity} > 4 ++ or @code{workFactor} < 0 or @code{workFactor} > 250 ++ @code{BZ_MEM_ERROR} ++ if not enough memory is available ++ @code{BZ_OK} ++ otherwise ++@end display ++Allowable next actions: ++@display ++ @code{BZ2_bzCompress} ++ if @code{BZ_OK} is returned ++ no specific action needed in case of error ++@end display ++ ++@subsection @code{BZ2_bzCompress} ++@example ++ int BZ2_bzCompress ( bz_stream *strm, int action ); ++@end example ++Provides more input and/or output buffer space for the library. The ++caller maintains input and output buffers, and calls @code{BZ2_bzCompress} to ++transfer data between them. ++ ++Before each call to @code{BZ2_bzCompress}, @code{next_in} should point at ++the data to be compressed, and @code{avail_in} should indicate how many ++bytes the library may read. @code{BZ2_bzCompress} updates @code{next_in}, ++@code{avail_in} and @code{total_in} to reflect the number of bytes it ++has read. ++ ++Similarly, @code{next_out} should point to a buffer in which the ++compressed data is to be placed, with @code{avail_out} indicating how ++much output space is available. @code{BZ2_bzCompress} updates ++@code{next_out}, @code{avail_out} and @code{total_out} to reflect the ++number of bytes output. ++ ++You may provide and remove as little or as much data as you like on each ++call of @code{BZ2_bzCompress}. In the limit, it is acceptable to supply and ++remove data one byte at a time, although this would be terribly ++inefficient. You should always ensure that at least one byte of output ++space is available at each call. ++ ++A second purpose of @code{BZ2_bzCompress} is to request a change of mode of the ++compressed stream. 
++ ++Conceptually, a compressed stream can be in one of four states: IDLE, ++RUNNING, FLUSHING and FINISHING. Before initialisation ++(@code{BZ2_bzCompressInit}) and after termination (@code{BZ2_bzCompressEnd}), a ++stream is regarded as IDLE. ++ ++Upon initialisation (@code{BZ2_bzCompressInit}), the stream is placed in the ++RUNNING state. Subsequent calls to @code{BZ2_bzCompress} should pass ++@code{BZ_RUN} as the requested action; other actions are illegal and ++will result in @code{BZ_SEQUENCE_ERROR}. ++ ++At some point, the calling program will have provided all the input data ++it wants to. It will then want to finish up -- in effect, asking the ++library to process any data it might have buffered internally. In this ++state, @code{BZ2_bzCompress} will no longer attempt to read data from ++@code{next_in}, but it will want to write data to @code{next_out}. ++Because the output buffer supplied by the user can be arbitrarily small, ++the finishing-up operation cannot necessarily be done with a single call ++of @code{BZ2_bzCompress}. ++ ++Instead, the calling program passes @code{BZ_FINISH} as an action to ++@code{BZ2_bzCompress}. This changes the stream's state to FINISHING. Any ++remaining input (ie, @code{next_in[0 .. avail_in-1]}) is compressed and ++transferred to the output buffer. To do this, @code{BZ2_bzCompress} must be ++called repeatedly until all the output has been consumed. At that ++point, @code{BZ2_bzCompress} returns @code{BZ_STREAM_END}, and the stream's ++state is set back to IDLE. @code{BZ2_bzCompressEnd} should then be ++called. ++ ++Just to make sure the calling program does not cheat, the library makes ++a note of @code{avail_in} at the time of the first call to ++@code{BZ2_bzCompress} which has @code{BZ_FINISH} as an action (ie, at the ++time the program has announced its intention to not supply any more ++input). 
By comparing this value with that of @code{avail_in} over ++subsequent calls to @code{BZ2_bzCompress}, the library can detect any ++attempts to slip in more data to compress. Any calls for which this is ++detected will return @code{BZ_SEQUENCE_ERROR}. This indicates a ++programming mistake which should be corrected. ++ ++Instead of asking to finish, the calling program may ask ++@code{BZ2_bzCompress} to take all the remaining input, compress it and ++terminate the current (Burrows-Wheeler) compression block. This could ++be useful for error control purposes. The mechanism is analogous to ++that for finishing: call @code{BZ2_bzCompress} with an action of ++@code{BZ_FLUSH}, remove output data, and persist with the ++@code{BZ_FLUSH} action until the value @code{BZ_RUN_OK} is returned. As ++with finishing, @code{BZ2_bzCompress} detects any attempt to provide more ++input data once the flush has begun. ++ ++Once the flush is complete, the stream returns to the normal RUNNING ++state. ++ ++This all sounds pretty complex, but isn't really. Here's a table ++which shows which actions are allowable in each state, what action ++will be taken, what the next state is, and what the non-error return ++values are. Note that you can't explicitly ask what state the ++stream is in, but nor do you need to -- it can be inferred from the ++values returned by @code{BZ2_bzCompress}. ++@display ++IDLE/@code{any} ++ Illegal. IDLE state only exists after @code{BZ2_bzCompressEnd} or ++ before @code{BZ2_bzCompressInit}. ++ Return value = @code{BZ_SEQUENCE_ERROR} ++ ++RUNNING/@code{BZ_RUN} ++ Compress from @code{next_in} to @code{next_out} as much as possible. ++ Next state = RUNNING ++ Return value = @code{BZ_RUN_OK} ++ ++RUNNING/@code{BZ_FLUSH} ++ Remember current value of @code{next_in}. Compress from @code{next_in} ++ to @code{next_out} as much as possible, but do not accept any more input.
++ Next state = FLUSHING ++ Return value = @code{BZ_FLUSH_OK} ++ ++RUNNING/@code{BZ_FINISH} ++ Remember current value of @code{next_in}. Compress from @code{next_in} ++ to @code{next_out} as much as possible, but do not accept any more input. ++ Next state = FINISHING ++ Return value = @code{BZ_FINISH_OK} ++ ++FLUSHING/@code{BZ_FLUSH} ++ Compress from @code{next_in} to @code{next_out} as much as possible, ++ but do not accept any more input. ++ If all the existing input has been used up and all compressed ++ output has been removed ++ Next state = RUNNING; Return value = @code{BZ_RUN_OK} ++ else ++ Next state = FLUSHING; Return value = @code{BZ_FLUSH_OK} ++ ++FLUSHING/other ++ Illegal. ++ Return value = @code{BZ_SEQUENCE_ERROR} ++ ++FINISHING/@code{BZ_FINISH} ++ Compress from @code{next_in} to @code{next_out} as much as possible, ++ but do not accept any more input. ++ If all the existing input has been used up and all compressed ++ output has been removed ++ Next state = IDLE; Return value = @code{BZ_STREAM_END} ++ else ++ Next state = FINISHING; Return value = @code{BZ_FINISH_OK} ++ ++FINISHING/other ++ Illegal. ++ Return value = @code{BZ_SEQUENCE_ERROR} ++@end display ++ ++That still looks complicated? Well, fair enough. The usual sequence ++of calls for compressing a load of data is: ++@itemize @bullet ++@item Get started with @code{BZ2_bzCompressInit}. ++@item Shovel data in and shlurp out its compressed form using zero or more ++calls of @code{BZ2_bzCompress} with action = @code{BZ_RUN}. ++@item Finish up. ++Repeatedly call @code{BZ2_bzCompress} with action = @code{BZ_FINISH}, ++copying out the compressed output, until @code{BZ_STREAM_END} is returned. ++@item Close up and go home. Call @code{BZ2_bzCompressEnd}. ++@end itemize ++If the data you want to compress fits into your input buffer all ++at once, you can skip the calls of @code{BZ2_bzCompress ( ..., BZ_RUN )} and ++just do the @code{BZ2_bzCompress ( ..., BZ_FINISH )} calls.
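The state table above can be written down as a small transition function. This is a toy model of the table only, not the library's implementation: `model_state`, `model_step` and the `drained` flag (standing for "all existing input used up and all compressed output removed") are hypothetical names, and the constants repeat the values from `bzlib.h` so that the fragment compiles by itself. The model follows the table literally and leaves the state unchanged on an illegal action.

```c
#include <assert.h>

/* Actions and return codes as defined in bzlib.h, repeated here only so
   that the model compiles without the library headers. */
#ifndef BZ_RUN
#define BZ_RUN            0
#define BZ_FLUSH          1
#define BZ_FINISH         2
#define BZ_RUN_OK         1
#define BZ_FLUSH_OK       2
#define BZ_FINISH_OK      3
#define BZ_STREAM_END     4
#define BZ_SEQUENCE_ERROR (-1)
#endif

/* Hypothetical names for the four conceptual stream states. */
typedef enum { ST_IDLE, ST_RUNNING, ST_FLUSHING, ST_FINISHING } model_state;

/* Applies one row of the state table. 'drained' is nonzero when all
   pending input has been consumed and all output has been removed. */
static int model_step(model_state *st, int action, int drained)
{
    /* Only the three documented actions are modelled. */
    assert(action == BZ_RUN || action == BZ_FLUSH || action == BZ_FINISH);

    switch (*st) {
    case ST_RUNNING:
        if (action == BZ_RUN)   { return BZ_RUN_OK; }
        if (action == BZ_FLUSH) { *st = ST_FLUSHING; return BZ_FLUSH_OK; }
        *st = ST_FINISHING;
        return BZ_FINISH_OK;
    case ST_FLUSHING:
        if (action != BZ_FLUSH) break;          /* FLUSHING/other */
        if (drained) { *st = ST_RUNNING; return BZ_RUN_OK; }
        return BZ_FLUSH_OK;
    case ST_FINISHING:
        if (action != BZ_FINISH) break;         /* FINISHING/other */
        if (drained) { *st = ST_IDLE; return BZ_STREAM_END; }
        return BZ_FINISH_OK;
    case ST_IDLE:
        break;                                  /* IDLE/any */
    }
    return BZ_SEQUENCE_ERROR;
}
```

Stepping the model through `BZ_RUN`, then `BZ_FINISH` until `drained` is set, reproduces the usual call sequence listed above and ends in the IDLE state with `BZ_STREAM_END`.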
++ ++All required memory is allocated by @code{BZ2_bzCompressInit}. The ++compression library can accept any data at all (obviously). So you ++shouldn't get any error return values from the @code{BZ2_bzCompress} calls. ++If you do, they will be @code{BZ_SEQUENCE_ERROR}, and indicate a bug in ++your programming. ++ ++Trivial other possible return values: ++@display ++ @code{BZ_PARAM_ERROR} ++ if @code{strm} is @code{NULL}, or @code{strm->s} is @code{NULL} ++@end display ++ ++@subsection @code{BZ2_bzCompressEnd} ++@example ++int BZ2_bzCompressEnd ( bz_stream *strm ); ++@end example ++Releases all memory associated with a compression stream. ++ ++Possible return values: ++@display ++ @code{BZ_PARAM_ERROR} if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} ++ @code{BZ_OK} otherwise ++@end display ++ ++ ++@subsection @code{BZ2_bzDecompressInit} ++@example ++int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small ); ++@end example ++Prepares for decompression. As with @code{BZ2_bzCompressInit}, a ++@code{bz_stream} record should be allocated and initialised before the ++call. Fields @code{bzalloc}, @code{bzfree} and @code{opaque} should be ++set if a custom memory allocator is required, or made @code{NULL} for ++the normal @code{malloc}/@code{free} routines. Upon return, the internal ++state will have been initialised, and @code{total_in} and ++@code{total_out} will be zero. ++ ++For the meaning of parameter @code{verbosity}, see @code{BZ2_bzCompressInit}. ++ ++If @code{small} is nonzero, the library will use an alternative ++decompression algorithm which uses less memory but at the cost of ++decompressing more slowly (roughly speaking, half the speed, but the ++maximum memory requirement drops to around 2300k). See Chapter 2 for ++more information on memory management. 
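To see how much memory a stream really uses, the `bzalloc`/`bzfree`/`opaque` hooks can be pointed at a pair of counting wrappers. The sketch below assumes only the calling convention described for `BZ2_bzCompressInit`: `bzalloc ( opaque, n, m )` returns `n * m` bytes and `bzfree ( opaque, p )` releases them. The type `alloc_stats` and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record, reached through the 'opaque' pointer. */
typedef struct {
    size_t live_blocks;   /* blocks allocated and not yet freed */
    size_t total_bytes;   /* bytes requested over the stream's lifetime */
} alloc_stats;

/* Matches the bzalloc signature: void *(*)(void *, int, int). */
static void *counting_alloc(void *opaque, int n, int m)
{
    alloc_stats *st = opaque;
    size_t bytes;
    void *p;

    assert(n >= 0 && m >= 0);
    bytes = (size_t)n * (size_t)m;
    p = malloc(bytes);
    if (p != NULL) {
        st->live_blocks++;
        st->total_bytes += bytes;
    }
    return p;
}

/* Matches the bzfree signature: void (*)(void *, void *). */
static void counting_free(void *opaque, void *p)
{
    alloc_stats *st = opaque;

    if (p != NULL) {
        st->live_blocks--;
        free(p);
    }
}
```

Before calling `BZ2_bzDecompressInit`, the caller would set `strm.bzalloc = counting_alloc`, `strm.bzfree = counting_free` and `strm.opaque` to the address of an `alloc_stats` record. A nonzero `live_blocks` after `BZ2_bzDecompressEnd` would then point to a leak.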
++ ++Note that the amount of memory needed to decompress ++a stream cannot be determined until the stream's header has been read, ++so even if @code{BZ2_bzDecompressInit} succeeds, a subsequent ++@code{BZ2_bzDecompress} could fail with @code{BZ_MEM_ERROR}. ++ ++Possible return values: ++@display ++ @code{BZ_CONFIG_ERROR} ++ if the library has been mis-compiled ++ @code{BZ_PARAM_ERROR} ++ if @code{(small != 0 && small != 1)} ++ or @code{(verbosity < 0 || verbosity > 4)} ++ @code{BZ_MEM_ERROR} ++ if insufficient memory is available ++@end display ++ ++Allowable next actions: ++@display ++ @code{BZ2_bzDecompress} ++ if @code{BZ_OK} was returned ++ no specific action required in case of error ++@end display ++ ++ ++ ++@subsection @code{BZ2_bzDecompress} ++@example ++int BZ2_bzDecompress ( bz_stream *strm ); ++@end example ++Provides more input and/or output buffer space for the library. The ++caller maintains input and output buffers, and uses @code{BZ2_bzDecompress} ++to transfer data between them. ++ ++Before each call to @code{BZ2_bzDecompress}, @code{next_in} ++should point at the compressed data, ++and @code{avail_in} should indicate how many bytes the library ++may read. @code{BZ2_bzDecompress} updates @code{next_in}, @code{avail_in} ++and @code{total_in} ++to reflect the number of bytes it has read. ++ ++Similarly, @code{next_out} should point to a buffer in which the uncompressed ++output is to be placed, with @code{avail_out} indicating how much output space ++is available. @code{BZ2_bzDecompress} updates @code{next_out}, ++@code{avail_out} and @code{total_out} to reflect ++the number of bytes output. ++ ++You may provide and remove as little or as much data as you like on ++each call of @code{BZ2_bzDecompress}. ++In the limit, it is acceptable to ++supply and remove data one byte at a time, although this would be ++terribly inefficient. You should always ensure that at least one ++byte of output space is available at each call.
++ ++Use of @code{BZ2_bzDecompress} is simpler than @code{BZ2_bzCompress}. ++ ++You should provide input and remove output as described above, and ++repeatedly call @code{BZ2_bzDecompress} until @code{BZ_STREAM_END} is ++returned. Appearance of @code{BZ_STREAM_END} denotes that ++@code{BZ2_bzDecompress} has detected the logical end of the compressed ++stream. @code{BZ2_bzDecompress} will not produce @code{BZ_STREAM_END} until ++all output data has been placed into the output buffer, so once ++@code{BZ_STREAM_END} appears, you are guaranteed to have available all ++the decompressed output, and @code{BZ2_bzDecompressEnd} can safely be ++called. ++ ++In case of an error return value, you should call @code{BZ2_bzDecompressEnd} ++to clean up and release memory. ++ ++Possible return values: ++@display ++ @code{BZ_PARAM_ERROR} ++ if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} ++ or @code{strm->avail_out < 1} ++ @code{BZ_DATA_ERROR} ++ if a data integrity error is detected in the compressed stream ++ @code{BZ_DATA_ERROR_MAGIC} ++ if the compressed stream doesn't begin with the right magic bytes ++ @code{BZ_MEM_ERROR} ++ if there wasn't enough memory available ++ @code{BZ_STREAM_END} ++ if the logical end of the data stream was detected and all ++ output has been consumed, eg @code{s->avail_out > 0} ++ @code{BZ_OK} ++ otherwise ++@end display ++Allowable next actions: ++@display ++ @code{BZ2_bzDecompress} ++ if @code{BZ_OK} was returned ++ @code{BZ2_bzDecompressEnd} ++ otherwise ++@end display ++ ++ ++@subsection @code{BZ2_bzDecompressEnd} ++@example ++int BZ2_bzDecompressEnd ( bz_stream *strm ); ++@end example ++Releases all memory associated with a decompression stream. ++ ++Possible return values: ++@display ++ @code{BZ_PARAM_ERROR} ++ if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++Allowable next actions: ++@display ++ None.
++@end display ++ ++ ++@section High-level interface ++ ++This interface provides functions for reading and writing ++@code{bzip2} format files. First, some general points. ++ ++@itemize @bullet ++@item All of the functions take an @code{int*} first argument, ++ @code{bzerror}. ++ After each call, @code{bzerror} should be consulted first to determine ++ the outcome of the call. If @code{bzerror} is @code{BZ_OK}, ++ the call completed ++ successfully, and only then should the return value of the function ++ (if any) be consulted. If @code{bzerror} is @code{BZ_IO_ERROR}, ++ there was an error ++ reading/writing the underlying compressed file, and you should ++ then consult @code{errno}/@code{perror} to determine the ++ cause of the difficulty. ++ @code{bzerror} may also be set to various other values; precise details are ++ given on a per-function basis below. ++@item If @code{bzerror} indicates an error ++ (ie, anything except @code{BZ_OK} and @code{BZ_STREAM_END}), ++ you should immediately call @code{BZ2_bzReadClose} (or @code{BZ2_bzWriteClose}, ++ depending on whether you are attempting to read or to write) ++ to free up all resources associated ++ with the stream. Once an error has been indicated, behaviour of all calls ++ except @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) is undefined. ++ The implication is that (1) @code{bzerror} should ++ be checked after each call, and (2) if @code{bzerror} indicates an error, ++ @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) should then be called to clean up. ++@item The @code{FILE*} arguments passed to ++ @code{BZ2_bzReadOpen}/@code{BZ2_bzWriteOpen} ++ should be set to binary mode. ++ Most Unix systems will do this by default, but other platforms, ++ including Windows and Mac, will not. If you omit this, you may ++ encounter problems when moving code to new platforms. ++@item Memory allocation requests are handled by ++ @code{malloc}/@code{free}. 
++ At present ++ there is no facility for user-defined memory allocators in the file I/O ++ functions (could easily be added, though). ++@end itemize ++ ++ ++ ++@subsection @code{BZ2_bzReadOpen} ++@example ++ typedef void BZFILE; ++ ++ BZFILE *BZ2_bzReadOpen ( int *bzerror, FILE *f, ++ int small, int verbosity, ++ void *unused, int nUnused ); ++@end example ++Prepare to read compressed data from file handle @code{f}. @code{f} ++should refer to a file which has been opened for reading, and for which ++the error indicator (@code{ferror(f)}) is not set. If @code{small} is 1, ++the library will try to decompress using less memory, at the expense of ++speed. ++ ++For reasons explained below, @code{BZ2_bzRead} will decompress the ++@code{nUnused} bytes starting at @code{unused}, before starting to read ++from the file @code{f}. At most @code{BZ_MAX_UNUSED} bytes may be ++supplied like this. If this facility is not required, you should pass ++@code{NULL} and @code{0} for @code{unused} and @code{nUnused} ++respectively. ++ ++For the meaning of parameters @code{small} and @code{verbosity}, ++see @code{BZ2_bzDecompressInit}. ++ ++The amount of memory needed to decompress a file cannot be determined ++until the file's header has been read. So it is possible that ++@code{BZ2_bzReadOpen} returns @code{BZ_OK} but a subsequent call of ++@code{BZ2_bzRead} will return @code{BZ_MEM_ERROR}. ++ ++Possible assignments to @code{bzerror}: ++@display ++ @code{BZ_CONFIG_ERROR} ++ if the library has been mis-compiled ++ @code{BZ_PARAM_ERROR} ++ if @code{f} is @code{NULL} ++ or @code{small} is neither @code{0} nor @code{1} ++ or @code{(unused == NULL && nUnused != 0)} ++ or @code{(unused != NULL && !(0 <= nUnused <= BZ_MAX_UNUSED))} ++ @code{BZ_IO_ERROR} ++ if @code{ferror(f)} is nonzero ++ @code{BZ_MEM_ERROR} ++ if insufficient memory is available ++ @code{BZ_OK} ++ otherwise.
++@end display ++ ++Possible return values: ++@display ++ Pointer to an abstract @code{BZFILE} ++ if @code{bzerror} is @code{BZ_OK} ++ @code{NULL} ++ otherwise ++@end display ++ ++Allowable next actions: ++@display ++ @code{BZ2_bzRead} ++ if @code{bzerror} is @code{BZ_OK} ++ @code{BZ2_bzReadClose} ++ otherwise ++@end display ++ ++ ++@subsection @code{BZ2_bzRead} ++@example ++ int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len ); ++@end example ++Reads up to @code{len} (uncompressed) bytes from the compressed file ++@code{b} into ++the buffer @code{buf}. If the read was successful, ++@code{bzerror} is set to @code{BZ_OK} ++and the number of bytes read is returned. If the logical end-of-stream ++was detected, @code{bzerror} will be set to @code{BZ_STREAM_END}, ++and the number ++of bytes read is returned. All other @code{bzerror} values denote an error. ++ ++@code{BZ2_bzRead} will supply @code{len} bytes, ++unless the logical stream end is detected ++or an error occurs. Because of this, it is possible to detect the ++stream end by observing when the number of bytes returned is ++less than the number ++requested. Nevertheless, this is regarded as inadvisable; you should ++instead check @code{bzerror} after every call and watch out for ++@code{BZ_STREAM_END}. ++ ++Internally, @code{BZ2_bzRead} copies data from the compressed file in chunks ++of size @code{BZ_MAX_UNUSED} bytes ++before decompressing it. If the file contains more bytes than strictly ++needed to reach the logical end-of-stream, @code{BZ2_bzRead} will almost certainly ++read some of the trailing data before signalling @code{BZ_STREAM_END}. ++To collect the read but unused data once @code{BZ_STREAM_END} has ++appeared, call @code{BZ2_bzReadGetUnused} immediately before @code{BZ2_bzReadClose}.
++ ++Possible assignments to @code{bzerror}: ++@display ++ @code{BZ_PARAM_ERROR} ++ if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} ++ @code{BZ_SEQUENCE_ERROR} ++ if @code{b} was opened with @code{BZ2_bzWriteOpen} ++ @code{BZ_IO_ERROR} ++ if there is an error reading from the compressed file ++ @code{BZ_UNEXPECTED_EOF} ++ if the compressed file ended before the logical end-of-stream was detected ++ @code{BZ_DATA_ERROR} ++ if a data integrity error was detected in the compressed stream ++ @code{BZ_DATA_ERROR_MAGIC} ++ if the stream does not begin with the requisite header bytes (ie, is not ++ a @code{bzip2} data file). This is really a special case of @code{BZ_DATA_ERROR}. ++ @code{BZ_MEM_ERROR} ++ if insufficient memory was available ++ @code{BZ_STREAM_END} ++ if the logical end of stream was detected. ++ @code{BZ_OK} ++ otherwise. ++@end display ++ ++Possible return values: ++@display ++ number of bytes read ++ if @code{bzerror} is @code{BZ_OK} or @code{BZ_STREAM_END} ++ undefined ++ otherwise ++@end display ++ ++Allowable next actions: ++@display ++ collect data from @code{buf}, then @code{BZ2_bzRead} or @code{BZ2_bzReadClose} ++ if @code{bzerror} is @code{BZ_OK} ++ collect data from @code{buf}, then @code{BZ2_bzReadClose} or @code{BZ2_bzReadGetUnused} ++ if @code{bzerror} is @code{BZ_STREAM_END} ++ @code{BZ2_bzReadClose} ++ otherwise ++@end display ++ ++ ++ ++@subsection @code{BZ2_bzReadGetUnused} ++@example ++ void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, ++ void** unused, int* nUnused ); ++@end example ++Returns data which was read from the compressed file but was not needed ++to get to the logical end-of-stream. @code{*unused} is set to the address ++of the data, and @code{*nUnused} to the number of bytes. @code{*nUnused} will ++be set to a value between @code{0} and @code{BZ_MAX_UNUSED} inclusive.
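++A minimal sketch of keeping the bytes that @code{BZ2_bzReadGetUnused} hands back. Since @code{*unused} is only an address, and the manual later advises copying the data into your own buffer, a small helper can do the copy before @code{BZ2_bzReadClose} is called. The helper name @code{save_unused} and the fallback definition of @code{BZ_MAX_UNUSED} (5000 here) are assumptions made for illustration and are not part of the library; real code would include @code{bzlib.h} and use its value.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for the constant from bzlib.h so that this sketch
   compiles on its own; real code should include <bzlib.h> instead. */
#ifndef BZ_MAX_UNUSED
#define BZ_MAX_UNUSED 5000
#endif

/* Hypothetical helper: copy the nUnused bytes at `unused` into `dst`
   (capacity dstCap bytes).  Returns the number of bytes copied, or -1
   if the arguments are inconsistent or the data does not fit. */
int save_unused(const void *unused, int nUnused,
                unsigned char *dst, int dstCap)
{
    if (dst == NULL || nUnused < 0 || nUnused > BZ_MAX_UNUSED)
        return -1;
    if (nUnused > dstCap)
        return -1;
    if (nUnused > 0) {
        if (unused == NULL)
            return -1;
        memcpy(dst, unused, (size_t)nUnused);
    }
    return nUnused;
}
```

++A caller would typically invoke such a helper right after @code{BZ2_bzReadGetUnused} reports @code{BZ_OK}, and later hand the saved bytes to the next @code{BZ2_bzReadOpen} call through its @code{unused}/@code{nUnused} parameters.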
++ ++This function may only be called once @code{BZ2_bzRead} has signalled ++@code{BZ_STREAM_END} but before @code{BZ2_bzReadClose}. ++ ++Possible assignments to @code{bzerror}: ++@display ++ @code{BZ_PARAM_ERROR} ++ if @code{b} is @code{NULL} ++ or @code{unused} is @code{NULL} or @code{nUnused} is @code{NULL} ++ @code{BZ_SEQUENCE_ERROR} ++ if @code{BZ_STREAM_END} has not been signalled ++ or if @code{b} was opened with @code{BZ2_bzWriteOpen} ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++Allowable next actions: ++@display ++ @code{BZ2_bzReadClose} ++@end display ++ ++ ++@subsection @code{BZ2_bzReadClose} ++@example ++ void BZ2_bzReadClose ( int *bzerror, BZFILE *b ); ++@end example ++Releases all memory pertaining to the compressed file @code{b}. ++@code{BZ2_bzReadClose} does not call @code{fclose} on the underlying file ++handle, so you should do that yourself if appropriate. ++@code{BZ2_bzReadClose} should be called to clean up after all error ++situations. ++ ++Possible assignments to @code{bzerror}: ++@display ++ @code{BZ_SEQUENCE_ERROR} ++ if @code{b} was opened with @code{BZ2_bzWriteOpen} ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++Allowable next actions: ++@display ++ none ++@end display ++ ++ ++ ++@subsection @code{BZ2_bzWriteOpen} ++@example ++ BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, ++ int blockSize100k, int verbosity, ++ int workFactor ); ++@end example ++Prepare to write compressed data to file handle @code{f}. ++@code{f} should refer to ++a file which has been opened for writing, and for which the error ++indicator (@code{ferror(f)}) is not set. ++ ++For the meaning of parameters @code{blockSize100k}, ++@code{verbosity} and @code{workFactor}, see ++@* @code{BZ2_bzCompressInit}. ++ ++All required memory is allocated at this stage, so if the call ++completes successfully, @code{BZ_MEM_ERROR} cannot be signalled by a ++subsequent call to @code{BZ2_bzWrite}.
++ ++Possible assignments to @code{bzerror}: ++@display ++ @code{BZ_CONFIG_ERROR} ++ if the library has been mis-compiled ++ @code{BZ_PARAM_ERROR} ++ if @code{f} is @code{NULL} ++ or @code{blockSize100k < 1} or @code{blockSize100k > 9} ++ @code{BZ_IO_ERROR} ++ if @code{ferror(f)} is nonzero ++ @code{BZ_MEM_ERROR} ++ if insufficient memory is available ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++Possible return values: ++@display ++ Pointer to an abstract @code{BZFILE} ++ if @code{bzerror} is @code{BZ_OK} ++ @code{NULL} ++ otherwise ++@end display ++ ++Allowable next actions: ++@display ++ @code{BZ2_bzWrite} ++ if @code{bzerror} is @code{BZ_OK} ++ (you could go directly to @code{BZ2_bzWriteClose}, but this would be pretty pointless) ++ @code{BZ2_bzWriteClose} ++ otherwise ++@end display ++ ++ ++ ++@subsection @code{BZ2_bzWrite} ++@example ++ void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len ); ++@end example ++Absorbs @code{len} bytes from the buffer @code{buf}, eventually to be ++compressed and written to the file. ++ ++Possible assignments to @code{bzerror}: ++@display ++ @code{BZ_PARAM_ERROR} ++ if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} ++ @code{BZ_SEQUENCE_ERROR} ++ if @code{b} was opened with @code{BZ2_bzReadOpen} ++ @code{BZ_IO_ERROR} ++ if there is an error writing the compressed file. ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++ ++ ++ ++@subsection @code{BZ2_bzWriteClose} ++@example ++ void BZ2_bzWriteClose ( int *bzerror, BZFILE* b, ++ int abandon, ++ unsigned int* nbytes_in, ++ unsigned int* nbytes_out ); ++ ++ void BZ2_bzWriteClose64 ( int *bzerror, BZFILE* b, ++ int abandon, ++ unsigned int* nbytes_in_lo32, ++ unsigned int* nbytes_in_hi32, ++ unsigned int* nbytes_out_lo32, ++ unsigned int* nbytes_out_hi32 ); ++@end example ++ ++Compresses and flushes to the compressed file all data so far supplied ++by @code{BZ2_bzWrite}.
The logical end-of-stream markers are also written, so ++subsequent calls to @code{BZ2_bzWrite} are illegal. All memory associated ++with the compressed file @code{b} is released. ++@code{fflush} is called on the ++compressed file, but it is not @code{fclose}'d. ++ ++If @code{BZ2_bzWriteClose} is called to clean up after an error, the only ++action is to release the memory. The library records the error codes ++issued by previous calls, so this situation will be detected ++automatically. There is no attempt to complete the compression ++operation, nor to @code{fflush} the compressed file. You can force this ++behaviour to happen even in the case of no error, by passing a nonzero ++value to @code{abandon}. ++ ++If @code{nbytes_in} is non-null, @code{*nbytes_in} will be set to be the ++total volume of uncompressed data handled. Similarly, @code{nbytes_out} ++will be set to the total volume of compressed data written. For ++compatibility with older versions of the library, @code{BZ2_bzWriteClose} ++only yields the lower 32 bits of these counts. Use ++@code{BZ2_bzWriteClose64} if you want the full 64 bit counts. These ++two functions are otherwise absolutely identical. ++ ++ ++Possible assignments to @code{bzerror}: ++@display ++ @code{BZ_SEQUENCE_ERROR} ++ if @code{b} was opened with @code{BZ2_bzReadOpen} ++ @code{BZ_IO_ERROR} ++ if there is an error writing the compressed file ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++@subsection Handling embedded compressed data streams ++ ++The high-level library facilitates use of ++@code{bzip2} data streams which form some part of a surrounding, larger ++data stream. ++@itemize @bullet ++@item For writing, the library takes an open file handle, writes ++compressed data to it, @code{fflush}es it but does not @code{fclose} it. ++The calling application can write its own data before and after the ++compressed data stream, using that same file handle. 
++@item Reading is more complex, and the facilities are not as general ++as they could be since generality is hard to reconcile with efficiency. ++@code{BZ2_bzRead} reads from the compressed file in blocks of size ++@code{BZ_MAX_UNUSED} bytes, and in doing so probably will overshoot ++the logical end of compressed stream. ++To recover this data once decompression has ++ended, call @code{BZ2_bzReadGetUnused} after the last call of @code{BZ2_bzRead} ++(the one returning @code{BZ_STREAM_END}) but before calling ++@code{BZ2_bzReadClose}. ++@end itemize ++ ++This mechanism makes it easy to decompress multiple @code{bzip2} ++streams placed end-to-end. At the end of one stream, when @code{BZ2_bzRead} ++returns @code{BZ_STREAM_END}, call @code{BZ2_bzReadGetUnused} to collect the ++unused data (copy it into your own buffer somewhere). ++That data forms the start of the next compressed stream. ++To start uncompressing that next stream, call @code{BZ2_bzReadOpen} again, ++feeding in the unused data via the @code{unused}/@code{nUnused} ++parameters. ++Keep doing this until @code{BZ_STREAM_END} return coincides with the ++physical end of file (@code{feof(f)}). In this situation ++@code{BZ2_bzReadGetUnused} ++will of course return no data. ++ ++This should give some feel for how the high-level interface can be used. ++If you require extra flexibility, you'll have to bite the bullet and get ++to grips with the low-level interface.
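++One detail of the multiple-stream loop described above is the test for the physical end of file. In standard C, @code{feof(f)} only becomes true after a read has actually run into the end of the file, so it can still be false when no bytes remain. A minimal sketch of a peek-style check follows; the name @code{at_physical_eof} is hypothetical and not part of @code{libbzip2}, and the sketch assumes @code{f} is an input stream on which one character of pushback is acceptable.

```c
#include <stdio.h>

/* Hypothetical helper: return 1 if no more bytes can be read from f,
   0 otherwise.  It reads one character and, if there was one, pushes
   it back so that the next read by the caller sees the same byte. */
int at_physical_eof(FILE *f)
{
    int c = fgetc(f);
    if (c == EOF)
        return 1;        /* end of file (or a read error) */
    ungetc(c, f);
    return 0;
}
```

++In the multiple-stream loop, a check of this kind would be made once @code{BZ2_bzReadGetUnused} reports zero unused bytes; if unused bytes were returned, there is still data to feed to the next @code{BZ2_bzReadOpen}.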
++ ++@subsection Standard file-reading/writing code ++Here's how you'd write data to a compressed file: ++@example ++FILE* f; ++BZFILE* b; ++int nBuf; ++char buf[ /* whatever size you like */ ]; ++int bzerror; ++ ++/* open in binary mode, as advised for the FILE* arguments */ ++f = fopen ( "myfile.bz2", "wb" ); ++if (!f) @{ ++ /* handle error */ ++@} ++/* block size 9, verbosity 0, workFactor 0 */ ++b = BZ2_bzWriteOpen ( &bzerror, f, 9, 0, 0 ); ++if (bzerror != BZ_OK) @{ ++ BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); ++ /* handle error */ ++@} ++ ++while ( /* condition */ ) @{ ++ /* get data to write into buf, and set nBuf appropriately */ ++ BZ2_bzWrite ( &bzerror, b, buf, nBuf ); ++ if (bzerror == BZ_IO_ERROR) @{ ++ BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); ++ /* handle error */ ++ @} ++@} ++ ++BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); ++if (bzerror == BZ_IO_ERROR) @{ ++ /* handle error */ ++@} ++@end example ++And to read from a compressed file: ++@example ++FILE* f; ++BZFILE* b; ++int nBuf; ++char buf[ /* whatever size you like */ ]; ++int bzerror; ++ ++/* open in binary mode, as advised for the FILE* arguments */ ++f = fopen ( "myfile.bz2", "rb" ); ++if (!f) @{ ++ /* handle error */ ++@} ++/* verbosity 0, small 0, no unused data */ ++b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 ); ++if (bzerror != BZ_OK) @{ ++ BZ2_bzReadClose ( &bzerror, b ); ++ /* handle error */ ++@} ++ ++bzerror = BZ_OK; ++while (bzerror == BZ_OK && /* arbitrary other conditions */) @{ ++ nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ ); ++ /* the final call, returning BZ_STREAM_END, may also deliver data */ ++ if (bzerror == BZ_OK || bzerror == BZ_STREAM_END) @{ ++ /* do something with buf[0 .. nBuf-1] */ ++ @} ++@} ++if (bzerror != BZ_STREAM_END) @{ ++ BZ2_bzReadClose ( &bzerror, b ); ++ /* handle error */ ++@} else @{ ++ BZ2_bzReadClose ( &bzerror, b ); ++@} ++@end example ++ ++ ++ ++@section Utility functions ++@subsection @code{BZ2_bzBuffToBuffCompress} ++@example ++ int BZ2_bzBuffToBuffCompress( char* dest, ++ unsigned int* destLen, ++ char* source, ++ unsigned int sourceLen, ++ int blockSize100k, ++ int verbosity, ++ int workFactor ); ++@end example ++Attempts to compress the data in @code{source[0 .. sourceLen-1]} ++into the destination buffer, @code{dest[0 .. *destLen-1]}.
++If the destination buffer is big enough, @code{*destLen} is ++set to the size of the compressed data, and @code{BZ_OK} is ++returned. If the compressed data won't fit, @code{*destLen} ++is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. ++ ++Compression in this manner is a one-shot event, done with a single call ++to this function. The resulting compressed data is a complete ++@code{bzip2} format data stream. There is no mechanism for making ++additional calls to provide extra input data. If you want that kind of ++mechanism, use the low-level interface. ++ ++For the meaning of parameters @code{blockSize100k}, @code{verbosity} ++and @code{workFactor}, @* see @code{BZ2_bzCompressInit}. ++ ++To guarantee that the compressed data will fit in its buffer, allocate ++an output buffer of size 1% larger than the uncompressed data, plus ++six hundred extra bytes. ++ ++@code{BZ2_bzBuffToBuffCompress} will not write data at or ++beyond @code{dest[*destLen]}, even in case of buffer overflow. ++ ++Possible return values: ++@display ++ @code{BZ_CONFIG_ERROR} ++ if the library has been mis-compiled ++ @code{BZ_PARAM_ERROR} ++ if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} ++ or @code{blockSize100k < 1} or @code{blockSize100k > 9} ++ or @code{verbosity < 0} or @code{verbosity > 4} ++ or @code{workFactor < 0} or @code{workFactor > 250} ++ @code{BZ_MEM_ERROR} ++ if insufficient memory is available ++ @code{BZ_OUTBUFF_FULL} ++ if the size of the compressed data exceeds @code{*destLen} ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++ ++ ++@subsection @code{BZ2_bzBuffToBuffDecompress} ++@example ++ int BZ2_bzBuffToBuffDecompress ( char* dest, ++ unsigned int* destLen, ++ char* source, ++ unsigned int sourceLen, ++ int small, ++ int verbosity ); ++@end example ++Attempts to decompress the data in @code{source[0 .. sourceLen-1]} ++into the destination buffer, @code{dest[0 .. *destLen-1]}.
++If the destination buffer is big enough, @code{*destLen} is ++set to the size of the uncompressed data, and @code{BZ_OK} is ++returned. If the uncompressed data won't fit, @code{*destLen} ++is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. ++ ++@code{source} is assumed to hold a complete @code{bzip2} format ++data stream. @* @code{BZ2_bzBuffToBuffDecompress} tries to decompress ++the entirety of the stream into the output buffer. ++ ++For the meaning of parameters @code{small} and @code{verbosity}, ++see @code{BZ2_bzDecompressInit}. ++ ++Because the compression ratio of the compressed data cannot be known in ++advance, there is no easy way to guarantee that the output buffer will ++be big enough. You may of course make arrangements in your code to ++record the size of the uncompressed data, but such a mechanism is beyond ++the scope of this library. ++ ++@code{BZ2_bzBuffToBuffDecompress} will not write data at or ++beyond @code{dest[*destLen]}, even in case of buffer overflow. ++ ++Possible return values: ++@display ++ @code{BZ_CONFIG_ERROR} ++ if the library has been mis-compiled ++ @code{BZ_PARAM_ERROR} ++ if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} ++ or @code{small != 0 && small != 1} ++ or @code{verbosity < 0} or @code{verbosity > 4} ++ @code{BZ_MEM_ERROR} ++ if insufficient memory is available ++ @code{BZ_OUTBUFF_FULL} ++ if the size of the uncompressed data exceeds @code{*destLen} ++ @code{BZ_DATA_ERROR} ++ if a data integrity error was detected in the compressed data ++ @code{BZ_DATA_ERROR_MAGIC} ++ if the compressed data doesn't begin with the right magic bytes ++ @code{BZ_UNEXPECTED_EOF} ++ if the compressed data ends unexpectedly ++ @code{BZ_OK} ++ otherwise ++@end display ++ ++ ++ ++@section @code{zlib} compatibility functions ++Yoshioka Tsuneo has contributed some functions to ++give better @code{zlib} compatibility.
These functions are ++@code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, ++@code{BZ2_bzclose}, ++@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. ++These functions are not (yet) officially part of ++the library. If they break, you get to keep all the pieces. ++Nevertheless, I think they work ok. ++@example ++typedef void BZFILE; ++ ++const char * BZ2_bzlibVersion ( void ); ++@end example ++Returns a string indicating the library version. ++@example ++BZFILE * BZ2_bzopen ( const char *path, const char *mode ); ++BZFILE * BZ2_bzdopen ( int fd, const char *mode ); ++@end example ++Opens a @code{.bz2} file for reading or writing, using either its name ++or a pre-existing file descriptor. ++Analogous to @code{fopen} and @code{fdopen}. ++@example ++int BZ2_bzread ( BZFILE* b, void* buf, int len ); ++int BZ2_bzwrite ( BZFILE* b, void* buf, int len ); ++@end example ++Reads/writes data from/to a previously opened @code{BZFILE}. ++Analogous to @code{fread} and @code{fwrite}. ++@example ++int BZ2_bzflush ( BZFILE* b ); ++void BZ2_bzclose ( BZFILE* b ); ++@end example ++Flushes/closes a @code{BZFILE}. @code{BZ2_bzflush} doesn't actually do ++anything. Analogous to @code{fflush} and @code{fclose}. ++ ++@example ++const char * BZ2_bzerror ( BZFILE *b, int *errnum ); ++@end example ++Returns a string describing the most recent error status of ++@code{b}, and also sets @code{*errnum} to its numerical value. ++ ++ ++@section Using the library in a @code{stdio}-free environment ++ ++@subsection Getting rid of @code{stdio} ++ ++In a deeply embedded application, you might want to use just ++the memory-to-memory functions. You can do this conveniently ++by compiling the library with preprocessor symbol @code{BZ_NO_STDIO} ++defined.
Doing this gives you a library containing only the following ++eight functions: ++ ++@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, @code{BZ2_bzCompressEnd} @* ++@code{BZ2_bzDecompressInit}, @code{BZ2_bzDecompress}, @code{BZ2_bzDecompressEnd} @* ++@code{BZ2_bzBuffToBuffCompress}, @code{BZ2_bzBuffToBuffDecompress} ++ ++When compiled like this, all functions will ignore @code{verbosity} ++settings. ++ ++@subsection Critical error handling ++@code{libbzip2} contains a number of internal assertion checks which ++should, needless to say, never be activated. Nevertheless, if an ++assertion should fail, behaviour depends on whether or not the library ++was compiled with @code{BZ_NO_STDIO} set. ++ ++For a normal compile, an assertion failure yields the message ++@example ++ bzip2/libbzip2: internal error number N. ++ This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000. ++ Please report it to me at: jseward@@acm.org. If this happened ++ when you were using some program which uses libbzip2 as a ++ component, you should also report this bug to the author(s) ++ of that program. Please make an effort to report this bug; ++ timely and accurate bug reports eventually lead to higher ++ quality software. Thanks. Julian Seward, 21 March 2000. ++@end example ++where @code{N} is some error code number. @code{exit(3)} ++is then called. ++ ++For a @code{stdio}-free library, assertion failures result ++in a call to a function declared as: ++@example ++ extern void bz_internal_error ( int errcode ); ++@end example ++The relevant code is passed as a parameter. You should supply ++such a function. ++ ++In either case, once an assertion failure has occurred, any ++@code{bz_stream} records involved can be regarded as invalid. ++You should not attempt to resume normal operation with them. ++ ++You may, of course, change critical error handling to suit ++your needs. As I said above, critical errors indicate bugs ++in the library and should not occur. 
All "normal" error ++situations are indicated via error return codes from functions, ++and can be recovered from. ++ ++ ++@section Making a Windows DLL ++Everything related to Windows has been contributed by Yoshioka Tsuneo ++@* (@code{QWF00133@@niftyserve.or.jp} / ++@code{tsuneo-y@@is.aist-nara.ac.jp}), so you should send your queries to ++him (but perhaps Cc: me, @code{jseward@@acm.org}). ++ ++My vague understanding of what to do is: using Visual C++ 5.0, ++open the project file @code{libbz2.dsp}, and build. That's all. ++ ++If you can't ++open the project file for some reason, make a new one, naming these files: ++@code{blocksort.c}, @code{bzlib.c}, @code{compress.c}, ++@code{crctable.c}, @code{decompress.c}, @code{huffman.c}, @* ++@code{randtable.c} and @code{libbz2.def}. You will also need ++to name the header files @code{bzlib.h} and @code{bzlib_private.h}. ++ ++If you don't use VC++, you may need to define the preprocessor symbol ++@code{_WIN32}. ++ ++Finally, @code{dlltest.c} is a sample program using the DLL. It has a ++project file, @code{dlltest.dsp}. ++ ++If you just want a makefile for Visual C, have a look at ++@code{makefile.msc}. ++ ++Be aware that if you compile @code{bzip2} itself on Win32, you must set ++@code{BZ_UNIX} to 0 and @code{BZ_LCCWIN32} to 1, in the file ++@code{bzip2.c}, before compiling. Otherwise the resulting binary won't ++work correctly. ++ ++I haven't tried any of this stuff myself, but it all looks plausible. ++ ++ ++ ++@chapter Miscellanea ++ ++These are just some random thoughts of mine. Your mileage may ++vary. ++ ++@section Limitations of the compressed file format ++@code{bzip2-1.0}, @code{0.9.5} and @code{0.9.0} ++use exactly the same file format as the previous ++version, @code{bzip2-0.1}. This decision was made in the interests of ++stability. Creating yet another incompatible compressed file format ++would create further confusion and disruption for users. ++ ++Nevertheless, this is not a painless decision.
Development ++work since the release of @code{bzip2-0.1} in August 1997 ++has shown complexities in the file format which slow down ++decompression and, in retrospect, are unnecessary. These are: ++@itemize @bullet ++@item The run-length encoder, which is the first of the ++ compression transformations, is entirely irrelevant. ++ The original purpose was to protect the sorting algorithm ++ from the very worst case input: a string of repeated ++ symbols. But algorithm steps Q6a and Q6b in the original ++ Burrows-Wheeler technical report (SRC-124) show how ++ repeats can be handled without difficulty in block ++ sorting. ++@item The randomisation mechanism doesn't really need to be ++ there. Udi Manber and Gene Myers published a suffix ++ array construction algorithm a few years back, which ++ can be employed to sort any block, no matter how ++ repetitive, in O(N log N) time. Subsequent work by ++ Kunihiko Sadakane has produced a derivative O(N (log N)^2) ++ algorithm which usually outperforms the Manber-Myers ++ algorithm. ++ ++ I could have changed to Sadakane's algorithm, but I find ++ it to be slower than @code{bzip2}'s existing algorithm for ++ most inputs, and the randomisation mechanism protects ++ adequately against bad cases. I didn't think it was ++ a good tradeoff to make. Partly this is due to the fact ++ that I was not flooded with email complaints about ++ @code{bzip2-0.1}'s performance on repetitive data, so ++ perhaps it isn't a problem for real inputs. ++ ++ Probably the best long-term solution, ++ and the one I have incorporated into 0.9.5 and above, ++ is to use the existing sorting ++ algorithm initially, and fall back to an O(N (log N)^2) ++ algorithm if the standard algorithm gets into difficulties. ++@item The compressed file format was never designed to be ++ handled by a library, and I have had to jump through ++ some hoops to produce an efficient implementation of ++ decompression. It's a bit hairy.
Try passing ++ @code{decompress.c} through the C preprocessor ++ and you'll see what I mean. Much of this complexity ++ could have been avoided if the compressed size of ++ each block of data was recorded in the data stream. ++@item An Adler-32 checksum, rather than a CRC32 checksum, ++ would be faster to compute. ++@end itemize ++It would be fair to say that the @code{bzip2} format was frozen ++before I properly and fully understood the performance ++consequences of doing so. ++ ++Improvements which I was able to incorporate into ++0.9.0, despite using the same file format, are: ++@itemize @bullet ++@item Single array implementation of the inverse BWT. This ++ significantly speeds up decompression, presumably ++ because it reduces the number of cache misses. ++@item Faster inverse MTF transform for large MTF values. The ++ new implementation is based on the notion of sliding blocks ++ of values. ++@item @code{bzip2-0.9.0} now reads and writes files with @code{fread} ++ and @code{fwrite}; version 0.1 used @code{putc} and @code{getc}. ++ Duh! Well, you live and learn. ++ ++@end itemize ++Further ahead, it would be nice ++to be able to do random access into files. This will ++require some careful design of compressed file formats. ++ ++ ++ ++@section Portability issues ++After some consideration, I have decided not to use ++GNU @code{autoconf} to configure 0.9.5 or 1.0. ++ ++@code{autoconf}, admirable and wonderful though it is, ++mainly assists with portability problems between Unix-like ++platforms. But @code{bzip2} doesn't have much in the way ++of portability problems on Unix; most of the difficulties appear ++when porting to the Mac, or to Microsoft's operating systems. ++@code{autoconf} doesn't help in those cases, and brings in a ++whole load of new complexity. ++ ++Most people should be able to compile the library and program ++under Unix straight out-of-the-box, so to speak, especially ++if you have a version of GNU C available. 
++ ++There are a couple of @code{__inline__} directives in the code. GNU C ++(@code{gcc}) should be able to handle them. If you're not using ++GNU C, your C compiler shouldn't see them at all. ++If your compiler does, for some reason, see them and doesn't ++like them, just @code{#define} @code{__inline__} to be @code{/* */}. One ++easy way to do this is to compile with the flag @code{-D__inline__=}, ++which should be understood by most Unix compilers. ++ ++If you still have difficulties, try compiling with the macro ++@code{BZ_STRICT_ANSI} defined. This should enable you to build the ++library in a strictly ANSI compliant environment. Building the program ++itself like this is dangerous and not supported, since you remove ++@code{bzip2}'s checks against compressing directories, symbolic links, ++devices, and other not-really-a-file entities. This could cause ++filesystem corruption! ++ ++One other thing: if you create a @code{bzip2} binary for public ++distribution, please try and link it statically (@code{gcc -static}). This ++avoids all sorts of library-version issues that others may encounter ++later on. ++ ++If you build @code{bzip2} on Win32, you must set @code{BZ_UNIX} to 0 and ++@code{BZ_LCCWIN32} to 1, in the file @code{bzip2.c}, before compiling. ++Otherwise the resulting binary won't work correctly. ++ ++ ++ ++@section Reporting bugs ++I tried pretty hard to make sure @code{bzip2} is ++bug free, both by design and by testing. Hopefully ++you'll never need to read this section for real. ++ ++Nevertheless, if @code{bzip2} dies with a segmentation ++fault, a bus error or an internal assertion failure, it ++will ask you to email me a bug report. Experience with ++version 0.1 shows that almost all these problems can ++be traced to either compiler bugs or hardware problems. ++@itemize @bullet ++@item ++Recompile the program with no optimisation, and see if it ++works. And/or try a different compiler.
++I heard all sorts of stories about various flavours ++of GNU C (and other compilers) generating bad code for ++@code{bzip2}, and I've run across two such examples myself. ++ ++2.7.X versions of GNU C are known to generate bad code from ++time to time, at high optimisation levels. ++If you get problems, try using the flags ++@code{-O2} @code{-fomit-frame-pointer} @code{-fno-strength-reduce}. ++You should specifically @emph{not} use @code{-funroll-loops}. ++ ++You may notice that the Makefile runs six tests as part of ++the build process. If the program passes all of these, it's ++a pretty good (but not 100%) indication that the compiler has ++done its job correctly. ++@item ++If @code{bzip2} crashes randomly, and the crashes are not ++repeatable, you may have a flaky memory subsystem. @code{bzip2} ++really hammers your memory hierarchy, and if it's a bit marginal, ++you may get these problems. Ditto if your disk or I/O subsystem ++is slowly failing. Yup, this really does happen. ++ ++Try using a different machine of the same type, and see if ++you can repeat the problem. ++@item This isn't really a bug, but ... If @code{bzip2} tells ++you your file is corrupted on decompression, and you ++obtained the file via FTP, there is a possibility that you ++forgot to tell FTP to do a binary mode transfer. That absolutely ++will cause the file to be non-decompressible. You'll have to transfer ++it again. ++@end itemize ++ ++If you've incorporated @code{libbzip2} into your own program ++and are getting problems, please, please, please, check that the ++parameters you are passing in calls to the library, are ++correct, and in accordance with what the documentation says ++is allowable. I have tried to make the library robust against ++such problems, but I'm sure I haven't succeeded. ++ ++Finally, if the above comments don't help, you'll have to send ++me a bug report. 
Now, it's just amazing how many people will ++send me a bug report saying something like ++@display ++ bzip2 crashed with segmentation fault on my machine ++@end display ++and absolutely nothing else. Needless to say, such a report ++is @emph{totally, utterly, completely and comprehensively 100% useless; ++a waste of your time, my time, and net bandwidth}. ++With no details at all, there's no way I can possibly begin ++to figure out what the problem is. ++ ++The rules of the game are: facts, facts, facts. Don't omit ++them because "oh, they won't be relevant". At the bare ++minimum: ++@display ++ Machine type. Operating system version. ++ Exact version of @code{bzip2} (do @code{bzip2 -V}). ++ Exact version of the compiler used. ++ Flags passed to the compiler. ++@end display ++However, the most important single thing that will help me is ++the file that you were trying to compress or decompress at the ++time the problem happened. Without that, my ability to do anything ++more than speculate about the cause is limited. ++ ++Please remember that I connect to the Internet with a modem, so ++you should contact me before mailing me huge files. ++ ++ ++@section Did you get the right package? ++ ++@code{bzip2} is a resource hog. It soaks up large amounts of CPU cycles ++and memory. Also, it gives very large latencies. In the worst case, you ++can feed many megabytes of uncompressed data into the library before ++getting any compressed output, so this probably rules out applications ++requiring interactive behaviour. ++ ++These aren't faults of my implementation, I hope, but more ++an intrinsic property of the Burrows-Wheeler transform (unfortunately). ++Maybe this isn't what you want. ++ ++If you want a compressor and/or library which is faster, uses less ++memory but gets pretty good compression, and has minimal latency, ++consider Jean-loup ++Gailly's and Mark Adler's work, @code{zlib-1.1.2} and ++@code{gzip-1.2.4}.
Look for them at ++ ++@code{http://www.cdrom.com/pub/infozip/zlib} and ++@code{http://www.gzip.org} respectively. ++ ++For something faster and lighter still, you might try Markus F X J ++Oberhumer's @code{LZO} real-time compression/decompression library, at ++@* @code{http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html}. ++ ++If you want to use the @code{bzip2} algorithms to compress small blocks ++of data, 64k bytes or smaller, for example on an on-the-fly disk ++compressor, you'd be well advised not to use this library. Instead, ++I've made a special library tuned for that kind of use. It's part of ++@code{e2compr-0.40}, an on-the-fly disk compressor for the Linux ++@code{ext2} filesystem. Look at ++@code{http://www.netspace.net.au/~reiter/e2compr}. ++ ++ ++ ++@section Testing ++ ++A record of the tests I've done. ++ ++First, some data sets: ++@itemize @bullet ++@item B: a directory containing 6001 files, one for every length in the ++ range 0 to 6000 bytes. The files contain random lowercase ++ letters. 18.7 megabytes. ++@item H: my home directory tree. Documents, source code, mail files, ++ compressed data. H contains B, and also a directory of ++ files designed as boundary cases for the sorting; mostly very ++ repetitive, nasty files. 565 megabytes. ++@item A: directory tree holding various applications built from source: ++ @code{egcs}, @code{gcc-2.8.1}, KDE, GTK, Octave, etc. ++ 2200 megabytes. ++@end itemize ++The tests conducted are as follows. Each test means compressing ++(a copy of) each file in the data set, decompressing it and ++comparing it against the original. ++ ++First, a bunch of tests with block sizes and internal buffer ++sizes set very small, ++to detect any problems with the ++blocking and buffering mechanisms. ++This required modifying the source code so as to try to ++break it. ++@enumerate ++@item Data set H, with ++ buffer size of 1 byte, and block size of 23 bytes. ++@item Data set B, buffer sizes 1 byte, block size 1 byte. 
++@item As (2) but small-mode decompression. ++@item As (2) with block size 2 bytes. ++@item As (2) with block size 3 bytes. ++@item As (2) with block size 4 bytes. ++@item As (2) with block size 5 bytes. ++@item As (2) with block size 6 bytes and small-mode decompression. ++@item H with buffer size of 1 byte, but normal block ++ size (up to 900000 bytes). ++@end enumerate ++Then some tests with unmodified source code. ++@enumerate ++@item H, all settings normal. ++@item As (1), with small-mode decompress. ++@item H, compress with flag @code{-1}. ++@item H, compress with flag @code{-s}, decompress with flag @code{-s}. ++@item Forwards compatibility: H, @code{bzip2-0.1pl2} compressing, ++ @code{bzip2-0.9.5} decompressing, all settings normal. ++@item Backwards compatibility: H, @code{bzip2-0.9.5} compressing, ++ @code{bzip2-0.1pl2} decompressing, all settings normal. ++@item Bigger tests: A, all settings normal. ++@item As (7), using the fallback (Sadakane-like) sorting algorithm. ++@item As (8), compress with flag @code{-1}, decompress with flag ++ @code{-s}. ++@item H, using the fallback sorting algorithm. ++@item Forwards compatibility: A, @code{bzip2-0.1pl2} compressing, ++ @code{bzip2-0.9.5} decompressing, all settings normal. ++@item Backwards compatibility: A, @code{bzip2-0.9.5} compressing, ++ @code{bzip2-0.1pl2} decompressing, all settings normal. ++@item Misc test: about 400 megabytes of @code{.tar} files with ++ @code{bzip2} compiled with Checker (a memory access error ++ detector, like Purify). ++@item Misc tests to make sure it builds and runs ok on non-Linux/x86 ++ platforms. ++@end enumerate ++These tests were conducted on a 225 MHz IDT WinChip machine, running ++Linux 2.0.36. They represent nearly a week of continuous computation. ++All tests completed successfully. ++ ++ ++@section Further reading ++@code{bzip2} is not research work, in the sense that it doesn't present ++any new ideas. Rather, it's an engineering exercise based on existing ++ideas. 
++ ++Four documents describe essentially all the ideas behind @code{bzip2}: ++@example ++Michael Burrows and D. J. Wheeler: ++ "A block-sorting lossless data compression algorithm" ++ 10th May 1994. ++ Digital SRC Research Report 124. ++ ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz ++ If you have trouble finding it, try searching at the ++ New Zealand Digital Library, http://www.nzdl.org. ++ ++Daniel S. Hirschberg and Debra A. LeLewer ++ "Efficient Decoding of Prefix Codes" ++ Communications of the ACM, April 1990, Vol 33, Number 4. ++ You might be able to get an electronic copy of this ++ from the ACM Digital Library. ++ ++David J. Wheeler ++ Program bred3.c and accompanying document bred3.ps. ++ This contains the idea behind the multi-table Huffman ++ coding scheme. ++ ftp://ftp.cl.cam.ac.uk/users/djw3/ ++ ++Jon L. Bentley and Robert Sedgewick ++ "Fast Algorithms for Sorting and Searching Strings" ++ Available from Sedgewick's web page, ++ www.cs.princeton.edu/~rs ++@end example ++The following paper gives valuable additional insights into the ++algorithm, but is not immediately the basis of any code ++used in bzip2. ++@example ++Peter Fenwick: ++ Block Sorting Text Compression ++ Proceedings of the 19th Australasian Computer Science Conference, ++ Melbourne, Australia. Jan 31 - Feb 2, 1996. 
++ ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps ++@end example ++Kunihiko Sadakane's sorting algorithm, mentioned above, ++is available from: ++@example ++http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz ++@end example ++The Manber-Myers suffix array construction ++algorithm is described in a paper ++available from: ++@example ++http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps ++@end example ++Finally, the following paper documents some recent investigations ++I made into the performance of sorting algorithms: ++@example ++Julian Seward: ++ On the Performance of BWT Sorting Algorithms ++ Proceedings of the IEEE Data Compression Conference 2000 ++ Snowbird, Utah. 28-30 March 2000. ++@end example ++ ++ ++@contents ++ ++@bye ++ +diff -Nru bzip2-1.0.1/doc/bzip2recover.1 bzip2-1.0.1.new/doc/bzip2recover.1 +--- bzip2-1.0.1/doc/bzip2recover.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/bzip2recover.1 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1 @@ ++.so bzip2.1 +\ No newline at end of file +diff -Nru bzip2-1.0.1/doc/pl/Makefile.am bzip2-1.0.1.new/doc/pl/Makefile.am +--- bzip2-1.0.1/doc/pl/Makefile.am Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/pl/Makefile.am Sat Jun 24 20:13:06 2000 +@@ -0,0 +1,4 @@ ++ ++mandir = @mandir@/pl ++man_MANS = bzip2.1 bunzip2.1 bzcat.1 bzip2recover.1 ++ +diff -Nru bzip2-1.0.1/doc/pl/bunzip2.1 bzip2-1.0.1.new/doc/pl/bunzip2.1 +--- bzip2-1.0.1/doc/pl/bunzip2.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/pl/bunzip2.1 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1 @@ ++.so bzip2.1 +\ No newline at end of file +diff -Nru bzip2-1.0.1/doc/pl/bzcat.1 bzip2-1.0.1.new/doc/pl/bzcat.1 +--- bzip2-1.0.1/doc/pl/bzcat.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/pl/bzcat.1 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1 @@ ++.so bzip2.1 +\ No newline at end of file +diff -Nru bzip2-1.0.1/doc/pl/bzip2.1 bzip2-1.0.1.new/doc/pl/bzip2.1 +--- bzip2-1.0.1/doc/pl/bzip2.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/pl/bzip2.1 Sat Jun 24 20:13:06 
2000 +@@ -0,0 +1,384 @@ ++.\" Tłumaczenie Maciej Wojciechowski wojciech@staszic.waw.pl ++.PU ++.TH bzip2 1 "" "" "wersja 1.0" ++.SH NAZWA ++bzip2, bunzip2 \- sortujący bloki kompresor/dekompresor plików, v1.0 ++.br ++bzcat \- dekompresuje pliki na standardowe wyjście ++.br ++bzip2recover \- odzyskuje dane ze zniszczonych archiwów bzip2 ++.SH SKŁADNIA ++.ll +8 ++.B bzip2 ++.RB [ \-cdfkqstvzVL123456789 ] ++.RI [ nazwy_plików \&...] ++.ll -8 ++.br ++.B bunzip2 ++.RB [ \-fkvsVL ] ++.RI [ nazwy_plików \&...] ++.br ++.B bzcat ++.RB [ \-s ] ++.RI [ nazwy_plików \&...] ++.br ++.B bzip2recover ++.I nazwa_pliku ++.SH OPIS ++.I bzip2 ++kompresuje pliki używając algorytmu sortowania bloków Burrowsa-Wheelera i ++kodu Huffmana. Kompresja jest generalnie sporo lepsza od konwencjonalnych ++kompresorów opartych o metodę LZ77/LZ78, i jest porównywalna z ++osiągnięciami statystycznych kompresorów z rodziny PPM. ++ ++Opcje linii poleceń są w większości bardzo podobne do tych z ++.IR "GNU gzip" , ++ale nie są identyczne. ++ ++.I bzip2 ++oczekuje listy plików towarzyszących parametrom linii poleceń. Każdy plik jest ++zastępowany przez swoją skompresowaną wersję, z nazwą ++"oryginalny_plik.bz2". Każdy skompresowany plik ma ten sam czas modyfikacji, ++uprawnienia i, jeśli to możliwe, właściciela co oryginał, po to, aby te ++ustawienia mogły zostać odtworzone podczas dekompresji. Utrzymywanie nazwy ++plików nie jest do końca dokładne w tym sensie, że nie ma możliwości ++przetrzymywania daty, uprawnień, właściciela i nazw plików na systemach, na ++których brakuje tych możliwości lub mają ograniczenia co do długości nazwy, ++tak np. jak MS-DOS. ++ ++.I bzip2 ++i ++.I bunzip2 ++standardowo nie nadpisują istniejących już plików. Jeśli chcesz aby to ++robiły, musisz użyć parametru \-f. ++ ++Jeśli nie podano żadnej nazwy pliku, ++.I bzip2 ++kompresuje ze standardowego wejścia na standardowe wyjście.
Odmawia wówczas ++wypisywania skompresowanego wyjścia na terminal, gdyż byłoby to ++całkiem niezrozumiałe i przez to bez większego sensu. ++ ++.I bunzip2 ++(lub ++.IR bzip2 \-d ) ++dekompresuje wszystkie podane pliki. Pliki, które nie były ++utworzone przez ++.I bzip2 ++zostaną wykryte i zignorowane, a na ekranie pojawi się komunikat ++ostrzegawczy. ++.I bzip2 ++próbuje zgadnąć nazwę dla dekompresowanego pliku w następujący sposób: ++.nf ++ nazwa_pliku.bz2 staje się nazwa_pliku ++ nazwa_pliku.bz staje się nazwa_pliku ++ nazwa_pliku.tbz2 staje się nazwa_pliku.tar ++ nazwa_pliku.tbz staje się nazwa_pliku.tar ++ inna_nazwa staje się inna_nazwa.out ++.fi ++Jeśli plik nie ma jednego z następujących rozpoznawalnych rozszerzeń, ++.IR .bz2 , ++.IR .bz , ++.I .tbz2 ++lub ++.IR .tbz , ++to ++.I bzip2 ++napisze, że nie może zgadnąć nazwy pierwotnego pliku, i użyje ++oryginalnej nazwy z dodanym rozszerzeniem ++.IR .out . ++ ++Tak jak przy kompresji, niepodanie żadnych nazw plików powoduje dekompresję ze ++standardowego wejścia na standardowe wyjście. ++ ++.I bunzip2 ++poprawnie zdekompresuje plik, który jest połączeniem dwóch lub więcej ++skompresowanych plików. Rezultatem jest połączony odpowiedni ++nieskompresowany plik. Obsługiwane jest również sprawdzanie spójności ++(\-t) połączonych skompresowanych plików. ++ ++Możesz również kompresować lub dekompresować pliki na standardowe wyjście ++używając parametru \-c. W ten właśnie sposób można przeprowadzać kompresję ++wielu plików równocześnie. ++Powstałe wyniki są przesyłane sekwencyjnie na standardowe wyjście. ++W ten sposób kompresja wielu plików generuje strumień ++zawierający reprezentacje kilku skompresowanych plików. Taki strumień może ++być zdekompresowany poprawnie tylko przez ++.I bzip2 ++w wersji 0.9.0 lub późniejszej. Wcześniejsze wersje ++.I bzip2 ++zatrzymają się po zdekompresowaniu pierwszego pliku w strumieniu. ++ ++.I bzcat ++(lub ++.I bzip2 -dc) ++dekompresuje wszystkie wybrane pliki na standardowe wyjście.
++ ++.I bzip2 ++czyta argumenty ze zmiennych środowiskowych ++.I BZIP2 ++i ++.I BZIP, ++w podanej kolejności, i przetwarza je przed jakimikolwiek argumentami ++przeczytanymi z linii poleceń. To dobra metoda na specyfikowanie ++standardowych ustawień. ++ ++Kompresja stosowana jest zawsze, nawet jeśli skompresowany plik jest ++nieznacznie większy od pliku oryginalnego. Pliki mniejsze niż mniej więcej ++sto bajtów stają się większe, ponieważ mechanizm kompresji ma stały ++nagłówek wynoszący około 50 bajtów. Przypadkowe dane (włączając wyjście ++większości kompresorów plików) są kodowane na mniej więcej 8.05 bitu na ++bajt, dając przyrost rozmiaru około 0.5%. ++ ++Jako samosprawdzenie dla twojej ochrony ++.I bzip2 ++używa 32-bitowego CRC aby upewnić się, że zdekompresowana wersja pliku jest ++identyczna z oryginalną. To strzeże przed stratami w skompresowanych danych ++i przed niewykrytymi błędami w ++.I bzip2 ++(na szczęście bardzo rzadkich). Możliwość niewykrycia utraty danych ++jest mikroskopijna, mniej więcej jedna szansa na cztery miliardy dla każdego ++pliku. Uważaj jednak, gdyż sprawdzenie jest dokonywane podczas dekompresji, ++więc dowiesz się tylko tego, że coś jest nie w porządku. Nie pomoże ci to odzyskać ++oryginalnych nieskompresowanych danych. Możesz użyć ++.I bzip2recover ++aby spróbować odzyskać dane z uszkodzonych plików. ++ ++Zwracane wartości: 0 dla normalnego wyjścia, 1 dla problemów technicznych ++(plik nie znaleziony, niewłaściwy parametr, błąd wejścia/wyjścia itp.), 2 dla ++zasygnalizowania błędu skompresowanego pliku, 3 dla wewnętrznego błędu (np. ++bug), który zmusił \fIbzip2\fP do przerwania. ++ ++.SH OPCJE ++.TP ++.B \-c --stdout ++Kompresuje lub dekompresuje na standardowe wyjście. ++.TP ++.B \-d --decompress ++Wymusza dekompresję. ++.IR bzip2 , ++.I bunzip2 ++i ++.I bzcat ++są tak naprawdę tymi samymi programami i decyzja jakie akcje będą wykonane ++jest wykonywana na podstawie nazwy jaka została użyta.
Ten parametr ma wyższy ++priorytet i wymusza na \fIbzip2\fP dekompresję. ++.TP ++.B \-z --compress ++Podobne do \-d: wymusza kompresję, bez względu na sposób wywołania. ++.TP ++.B \-t --test ++Sprawdza integralność wybranego pliku(ów), ale nie dekompresuje ich. Wymusza ++to próbną dekompresję i mówi, jaki jest rezultat. ++.TP ++.B \-f --force ++Wymusza zastępowanie plików wyjściowych. Normalnie, \fIbzip2\fP nie ++zastępuje istniejących plików wyjściowych. Wymusza również na \fIbzip2\fP ++łamanie dowiązań twardych, czego normalnie nie robi. ++.TP ++.B \-k --keep ++Zatrzymaj (nie kasuj) pliki wejściowe przy kompresji lub dekompresji. ++.TP ++.B \-s --small ++Zredukuj użycie pamięci na kompresję, dekompresję i testowanie. Pliki są ++dekompresowane i testowane przy użyciu zmodyfikowanego algorytmu, który ++potrzebuje tylko 2.5 bajtu na każdy bajt bloku. Oznacza to, że każdy plik może ++być zdekompresowany przy użyciu około 2300k pamięci, jednak tracąc około połowę ++normalnej szybkości. ++ ++Podczas kompresji, \-s wybiera bloki wielkości 200k, których limity ++pamięci wynoszą mniej więcej tyle samo, w zamian za jakość kompresji. W ++skrócie, jeśli twój komputer ma mało pamięci (8 megabajtów lub mniej), ++używaj opcji \-s do wszystkiego. Zobacz \fBzarządzanie pamięcią\fP poniżej. ++.TP ++.B \-q --quiet ++Wyłącza wszystkie nieistotne komunikaty ostrzegawcze. ++Nie są eliminowane komunikaty dotyczące błędów wejścia/wyjścia i innych ++zdarzeń krytycznych. ++.TP ++.B \-v --verbose ++Tryb gadatliwy -- pokazuje stopień kompresji dla każdego pliku. Następne ++\fB\-v\fP zwiększają stopień gadatliwości, powodując wyświetlanie dużej ++ilości informacji, przydatnych głównie przy diagnostyce. ++.TP ++.B \-L --license -V --version ++Wyświetla wersję programu i warunki licencji. ++.TP ++.B \-1 do \-9 ++Ustawia wielkość bloku na 100 k, 200 k .. 900 k przy kompresji. Nie ma ++żadnego znaczenia przy dekompresji. Zobacz \fBzarządzanie pamięcią\fP ++poniżej.
++.TP ++.B \-- ++Traktuje wszystkie następujące po nim argumenty jako nazwy plików, nawet jeśli ++zaczynają się one od myślnika. Możesz więc kompresować i dekompresować ++pliki, których nazwa zaczyna się od myślnika, na przykład: bzip2 \-- ++\-mój_plik. ++.TP ++.B \--repetitive-fast --repetitive-best ++Te parametry nie mają znaczenia w wersjach 0.9.5 i wyższych. Umożliwiały one ++pewną infantylną kontrolę nad zachowaniem algorytmu sortującego we ++wcześniejszych wersjach, co było czasami użyteczne. Wersje 0.9.5 i wyższe ++mają usprawniony algorytm, który powoduje bezużyteczność tej funkcji. ++ ++.SH ZARZĄDZANIE PAMIĘCIĄ ++.I bzip2 ++kompresuje duże pliki w blokach. Rozmiar bloku ma wpływ zarówno na stopień ++osiąganej kompresji, jak również na ilość pamięci potrzebnej do kompresji ++i dekompresji. Parametry od \-1 do \-9 wybierają rozmiar bloku odpowiednio ++od 100,000 bajtów aż do 900,000 bajtów (standardowo). W czasie dekompresji, ++rozmiar bloku użytego do kompresji jest odczytywany z nagłówku pliku ++skompresowanego i ++.I bunzip2 ++sam zajmuje odpowiednią do dekompresji ilość pamięci. Ponieważ rozmiar ++bloków jest przetrzymywany w pliku skompresowanym, parametry od \-1 do \-9 ++nie mają przy dekompresji żadnego znaczenia. ++ ++Wymagania kompresji i dekompresji w bajtach, mogą być wyliczone przez: ++ ++ Kompresja : 400k + ( 8 x rozmiar bloku ) ++ ++ Dekompresja : 100k + ( 4 x rozmiar bloku ) lub ++ 100k + ( 2.5 x rozmiar bloku ) ++ ++Większe bloki dają duże zmniejszenie zwrotów marginalnych. Większość ++kompresji pochodzi z pierwszych stu lub dwustu kilobajtów rozmiaru bloku. ++Warto o tym pamiętać używając \fIbzip2\fP na wolnych ++komputerach. Warto również podkreślić, że rozmiar pamięci potrzebnej do ++dekompresji jest wybierany poprzez ustawienie odpowiedniej ++wielkości bloku przy kompresji. ++ ++Dla plików skompresowanych standardowym blokiem wielkości 900k, ++\fIbunzip2\fP będzie wymagał około 3700 kilobajtów do dekompresji.
Aby ++umożliwić dekompresję na komputerze wyposażonym jedynie w 4 megabajty ++pamięci, \fIbunzip2\fP ma opcję, która może zmniejszyć wymagania prawie do ++połowy, tzn. około 2300 kilobajtów. Prędkość dekompresji jest również bardzo ++zmniejszona, więc używaj tej opcji tylko wtedy, kiedy jest to konieczne. Tym ++parametrem jest -s. ++ ++Generalnie, próbuj i używaj największych rozmiarów bloków, jeśli ilość ++pamięci ci na to pozwala. Prędkość kompresji i dekompresji w zasadzie nie ++zależy od wielkości użytego bloku. ++ ++Inna ważna rzecz dotyczy plików, które mieszczą się w pojedynczym bloku -- ++oznacza to większość plików na które się natkniesz używając dużych bloków. ++Rozmiar realny pamięci zabieranej jest proporcjonalny do wielkości pliku, ++jeśli plik jest mniejszy niż blok. Na przykład, kompresja pliku o ++wielkości 20,000 bajtów z parametrem -9 wymusi na kompresorze odnalezienie ++7600 k pamięci, ale zajęcie tylko 400k + 20000 * 8 = 560 kilobajtów z ++tego. Podobnie, dekompresor odnajdzie 3700k, ale zajmie tylko 100k + 20000 ++* 4 = 180 kilobajtów. ++ ++Tu jest tabela, która podsumowuje maksymalne użycie pamięci dla różnych ++rozmiarów bloków. Podano też całkowity rozmiar skompresowanych 14 ++plików tekstowych (Calgary Text Compression Corpus) zajmujących razem ++3,141,622 bajtów. Ta kolumna daje pewne pojęcie o tym, jaki wpływ na ++kompresję ma wielkość bloków. Ta tabela uzmysławia również przewagę użycia ++większych bloków dla większych plików, ponieważ "Corpus" jest zdominowany ++przez mniejsze pliki.
++.nf ++ Użycie Użycie Użycie Corpus ++ Parametr kompresji dekompresji dekompresji -s Size ++ ++ -1 1200k 500k 350k 914704 ++ -2 2000k 900k 600k 877703 ++ -3 2800k 1300k 850k 860338 ++ -4 3600k 1700k 1100k 846899 ++ -5 4400k 2100k 1350k 845160 ++ -6 5200k 2500k 1600k 838626 ++ -7 6100k 2900k 1850k 834096 ++ -8 6800k 3300k 2100k 828642 ++ -9 7600k 3700k 2350k 828642 ++.fi ++.SH ODZYSKIWANIE DANYCH ZE ZNISZCZONYCH PLIKÓW BZIP2 ++.I bzip2 ++kompresuje pliki w blokach, zazwyczaj 900 kilobajtowych. Każdy blok jest ++trzymany osobno. Jeśli błędy transmisji lub nośnika uszkodzą plik ++wieloblokowy .bz2, możliwe jest odtworzenie danych zawartych w ++niezniszczonych blokach pliku. ++ ++Każdy blok jest reprezentowany przez 48-bitowy wzorzec, który umożliwia ++znajdowanie przyporządkowań bloków z rozsądną pewnością. Każdy blok ++ma również swój 32-bitowy CRC, więc bloki uszkodzone mogą być łatwo ++odseparowane od poprawnych. ++ ++.I bzip2recover ++jest oddzielnym programem, którego zadaniem jest poszukiwanie bloków w ++plikach .bz2 i zapisywanie ich do własnego pliku .bz2. Możesz potem użyć ++\fIbzip2\fP \-t aby sprawdzić spójność wyjściowego pliku i zdekompresować ++te, które nie są uszkodzone. ++ ++.I bzip2recover ++pobiera pojedynczy argument, nazwę uszkodzonego pliku, i tworzy pewną liczbę ++plików "rec0001plik.bz2", "rec0002plik.bz2", itd., przetrzymujące odzyskane ++bloki. Wyjściowe nazwy plików są tak tworzone, aby łatwo było potem używać ++ich razem za pomocą gwiazdek -- na przykład, "bzip2 -dc rec*plik.bz2 > ++odzyskany_plik" -- wylistuje pliki we właściwej kolejności. ++ ++.I bzip2recover ++powinien być używany najczęściej z dużymi plikami .bz2, jako iż one ++zawierają najczęściej dużo bloków. Jest czystym bezsensem używać go na ++uszkodzonym jedno-blokowym pliku, ponieważ uszkodzony blok nie może być ++odzyskany. Jeśli chcesz zminimalizować jakiekolwiek możliwe straty danych ++poprzez nośnik lub transmisję, powinieneś zastanowić się nad użyciem ++mniejszych bloków.
++ ++.SH OPISY WYNIKÓW ++Etap sortujący kompresji łączy razem podobne ciągi znaków w pliku. Przez ++to, pliki zawierające bardzo długie ciągi powtarzających się symboli, jak ++"aabaabaabaab ..." (powtórzone kilkaset razy) mogą być kompresowane wolniej ++niż normalnie. Wersje 0.9.5 i wyższe zachowują się dużo lepiej w tej ++sytuacji niż wersje poprzednie. Różnica czasu kompresji pomiędzy ++najgorszym i najlepszym przypadkiem kompresji wynosi około 10:1. Dla ++wcześniejszych wersji było to nawet około 100:1. Jeśli chcesz, możesz użyć ++parametru \-vvvv aby monitorować postępy bardzo szczegółowo. ++ ++Prędkość dekompresji nie jest zmieniana przez to zjawisko. ++ ++.I bzip2 ++zazwyczaj rezerwuje kilka megabajtów pamięci do działania a ++potem wykorzystuje ją w sposób zupełnie przypadkowy. ++Oznacza to, że zarówno prędkość kompresji jak i dekompresji jest w ++dużej części zależna od prędkości, z jaką twój komputer może naprawiać braki ++bufora podręcznego. Z tego powodu, wprowadzone zostały małe zmiany kodu aby ++zmniejszyć straty, które dały nieproporcjonalnie duży wzrost osiągnięć. ++Myślę, że ++.I bzip2 ++będzie działał najlepiej na komputerach z dużymi buforami podręcznymi. ++ ++.SH ZAKAMARKI ++Wiadomości o błędach wejścia/wyjścia nie są aż tak pomocne, jak mogłyby być. ++.I bzip2 ++stara się wykryć błąd wejścia/wyjścia i wyjść "czysto", ale ++szczegóły tego, jaki to problem mogą być czasami bardzo mylące. ++ ++Ta strona podręcznika odnosi się do wersji 1.0 programu \fIbzip2\fP. ++Skompresowane pliki utworzone przez tę wersję są kompatybilne zarówno ++w przód jak i wstecznie z poprzednimi publicznymi wydaniami, ++wersjami 0.1pl2, 0.9.0 i 0.9.5 ale z małymi wyjątkami: 0.9.0 i wyższe potrafią ++poprawnie dekompresować wiele skompresowanych plików złączonych w jeden. ++0.1pl2 nie potrafi tego; zatrzyma się już po dekompresji pierwszego pliku w ++strumieniu.
++ ++.I bzip2recover ++używa 32-bitowych liczb do reprezentacji pozycji bitu w skompresowanym ++pliku, więc nie może przetwarzać skompresowanych plików dłuższych niż 512 ++megabajtów. Można to łatwo naprawić. ++ ++.SH AUTOR ++Julian Seward, jseward@acm.org. ++ ++http://www.muraroa.demon.co.uk ++http://sourceware.cygnus.com/bzip2 ++ ++Idee zawarte w \fIbzip2\fP są podzielone (przynajmniej) pomiędzy ++następujących ludzi: Michael Burrows i David Wheeler (transformacja ++sortująca bloki), David Wheeler (znów, koder Huffmana), Peter Fenwick ++(struktura kodowania modelu w oryginalnym \fIbzip2\fP, i wiele ++udoskonaleń), i Alistair Moffat, Radford Neal i Ian Witten (arytmetyczny ++koder w oryginalnym \fIbzip2\fP). Jestem im bardzo wdzięczny za ich pomoc, ++wsparcie i porady. Zobacz stronę manuala w źródłowej dystrybucji po ++wskaźniki do źródeł dokumentacji. Christian von Roques zachęcił mnie do ++wymyślenia szybszego algorytmu sortującego, po to żeby przyspieszyć ++kompresję. Bela Lubkin zachęciła mnie do polepszenia najgorszych wyników ++kompresji. Wiele ludzi przysłało łatki, pomogło w różnych problemach, ++pożyczyło komputerów, dało rady i było ogólnie pomocnych. +diff -Nru bzip2-1.0.1/doc/pl/bzip2recover.1 bzip2-1.0.1.new/doc/pl/bzip2recover.1 +--- bzip2-1.0.1/doc/pl/bzip2recover.1 Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/doc/pl/bzip2recover.1 Sat Jun 24 20:13:06 2000 +@@ -0,0 +1 @@ ++.so bzip2.1 +\ No newline at end of file +diff -Nru bzip2-1.0.1/huffman.c bzip2-1.0.1.new/huffman.c +--- bzip2-1.0.1/huffman.c Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/huffman.c Sat Jun 24 20:13:06 2000 +@@ -58,6 +58,10 @@ + For more information on these sources, see the manual.
+ --*/ + ++#ifdef HAVE_CONFIG_H ++#include <config.h> ++#endif ++ + + #include "bzlib_private.h" + +diff -Nru bzip2-1.0.1/makefile.msc bzip2-1.0.1.new/makefile.msc +--- bzip2-1.0.1/makefile.msc Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/makefile.msc Thu Jan 1 01:00:00 1970 +@@ -1,63 +0,0 @@ +-# Makefile for Microsoft Visual C++ 6.0 +-# usage: nmake -f makefile.msc +-# K.M. Syring (syring@gsf.de) +-# Fixed up by JRS for bzip2-0.9.5d release. +- +-CC=cl +-CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64 +- +-OBJS= blocksort.obj \ +- huffman.obj \ +- crctable.obj \ +- randtable.obj \ +- compress.obj \ +- decompress.obj \ +- bzlib.obj +- +-all: lib bzip2 test +- +-bzip2: lib +- $(CC) $(CFLAGS) -o bzip2 bzip2.c libbz2.lib setargv.obj +- $(CC) $(CFLAGS) -o bzip2recover bzip2recover.c +- +-lib: $(OBJS) +- lib /out:libbz2.lib $(OBJS) +- +-test: bzip2 +- type words1 +- .\\bzip2 -1 < sample1.ref > sample1.rb2 +- .\\bzip2 -2 < sample2.ref > sample2.rb2 +- .\\bzip2 -3 < sample3.ref > sample3.rb2 +- .\\bzip2 -d < sample1.bz2 > sample1.tst +- .\\bzip2 -d < sample2.bz2 > sample2.tst +- .\\bzip2 -ds < sample3.bz2 > sample3.tst +- @echo All six of the fc's should find no differences. +- @echo If fc finds an error on sample3.bz2, this could be +- @echo because WinZip's 'TAR file smart CR/LF conversion' +- @echo is too clever for its own good. Disable this option. +- @echo The correct size for sample3.ref is 120,244. If it +- @echo is 150,251, WinZip has messed it up.
+- fc sample1.bz2 sample1.rb2 +- fc sample2.bz2 sample2.rb2 +- fc sample3.bz2 sample3.rb2 +- fc sample1.tst sample1.ref +- fc sample2.tst sample2.ref +- fc sample3.tst sample3.ref +- +- +- +-clean: +- del *.obj +- del libbz2.lib +- del bzip2.exe +- del bzip2recover.exe +- del sample1.rb2 +- del sample2.rb2 +- del sample3.rb2 +- del sample1.tst +- del sample2.tst +- del sample3.tst +- +-.c.obj: +- $(CC) $(CFLAGS) -c $*.c -o $*.obj +- +diff -Nru bzip2-1.0.1/manual.ps bzip2-1.0.1.new/manual.ps +--- bzip2-1.0.1/manual.ps Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/manual.ps Thu Jan 1 01:00:00 1970 +@@ -1,3808 +0,0 @@ +-%!PS-Adobe-2.0 +-%%Creator: dvips(k) 5.78 Copyright 1998 Radical Eye Software (www.radicaleye.com) +-%%Title: manual.dvi +-%%Pages: 39 +-%%PageOrder: Ascend +-%%BoundingBox: 0 0 596 842 +-%%EndComments +-%DVIPSCommandLine: dvips -o manual.ps manual.dvi +-%DVIPSParameters: dpi=600, compressed +-%DVIPSSource: TeX output 2000.03.23:2343 +-%%BeginProcSet: texc.pro +-%! +-/TeXDict 300 dict def TeXDict begin /N{def}def /B{bind def}N /S{exch}N +-/X{S N}B /TR{translate}N /isls false N /vsize 11 72 mul N /hsize 8.5 72 +-mul N /landplus90{false}def /@rigin{isls{[0 landplus90{1 -1}{-1 1} +-ifelse 0 0 0]concat}if 72 Resolution div 72 VResolution div neg scale +-isls{landplus90{VResolution 72 div vsize mul 0 exch}{Resolution -72 div +-hsize mul 0}ifelse TR}if Resolution VResolution vsize -72 div 1 add mul +-TR[matrix currentmatrix{dup dup round sub abs 0.00001 lt{round}if} +-forall round exch round exch]setmatrix}N /@landscape{/isls true N}B +-/@manualfeed{statusdict /manualfeed true put}B /@copies{/#copies X}B +-/FMat[1 0 0 -1 0 0]N /FBB[0 0 0 0]N /nn 0 N /IE 0 N /ctr 0 N /df-tail{ +-/nn 8 dict N nn begin /FontType 3 N /FontMatrix fntrx N /FontBBox FBB N +-string /base X array /BitMaps X /BuildChar{CharBuilder}N /Encoding IE N +-end dup{/foo setfont}2 array copy cvx N load 0 nn put /ctr 0 N[}B /df{ +-/sf 1 N /fntrx FMat N df-tail}B /dfs{div /sf X /fntrx[sf 0 0 sf 
neg 0 0] +-N df-tail}B /E{pop nn dup definefont setfont}B /ch-width{ch-data dup +-length 5 sub get}B /ch-height{ch-data dup length 4 sub get}B /ch-xoff{ +-128 ch-data dup length 3 sub get sub}B /ch-yoff{ch-data dup length 2 sub +-get 127 sub}B /ch-dx{ch-data dup length 1 sub get}B /ch-image{ch-data +-dup type /stringtype ne{ctr get /ctr ctr 1 add N}if}B /id 0 N /rw 0 N +-/rc 0 N /gp 0 N /cp 0 N /G 0 N /sf 0 N /CharBuilder{save 3 1 roll S dup +-/base get 2 index get S /BitMaps get S get /ch-data X pop /ctr 0 N ch-dx +-0 ch-xoff ch-yoff ch-height sub ch-xoff ch-width add ch-yoff +-setcachedevice ch-width ch-height true[1 0 0 -1 -.1 ch-xoff sub ch-yoff +-.1 sub]/id ch-image N /rw ch-width 7 add 8 idiv string N /rc 0 N /gp 0 N +-/cp 0 N{rc 0 ne{rc 1 sub /rc X rw}{G}ifelse}imagemask restore}B /G{{id +-gp get /gp gp 1 add N dup 18 mod S 18 idiv pl S get exec}loop}B /adv{cp +-add /cp X}B /chg{rw cp id gp 4 index getinterval putinterval dup gp add +-/gp X adv}B /nd{/cp 0 N rw exit}B /lsh{rw cp 2 copy get dup 0 eq{pop 1}{ +-dup 255 eq{pop 254}{dup dup add 255 and S 1 and or}ifelse}ifelse put 1 +-adv}B /rsh{rw cp 2 copy get dup 0 eq{pop 128}{dup 255 eq{pop 127}{dup 2 +-idiv S 128 and or}ifelse}ifelse put 1 adv}B /clr{rw cp 2 index string +-putinterval adv}B /set{rw cp fillstr 0 4 index getinterval putinterval +-adv}B /fillstr 18 string 0 1 17{2 copy 255 put pop}for N /pl[{adv 1 chg} +-{adv 1 chg nd}{1 add chg}{1 add chg nd}{adv lsh}{adv lsh nd}{adv rsh}{ +-adv rsh nd}{1 add adv}{/rc X nd}{1 add set}{1 add clr}{adv 2 chg}{adv 2 +-chg nd}{pop nd}]dup{bind pop}forall N /D{/cc X dup type /stringtype ne{] +-}if nn /base get cc ctr put nn /BitMaps get S ctr S sf 1 ne{dup dup +-length 1 sub dup 2 index S get sf div put}if put /ctr ctr 1 add N}B /I{ +-cc 1 add D}B /bop{userdict /bop-hook known{bop-hook}if /SI save N @rigin +-0 0 moveto /V matrix currentmatrix dup 1 get dup mul exch 0 get dup mul +-add .99 lt{/QV}{/RV}ifelse load def pop pop}N /eop{SI restore userdict +-/eop-hook 
known{eop-hook}if showpage}N /@start{userdict /start-hook +-known{start-hook}if pop /VResolution X /Resolution X 1000 div /DVImag X +-/IE 256 array N 2 string 0 1 255{IE S dup 360 add 36 4 index cvrs cvn +-put}for pop 65781.76 div /vsize X 65781.76 div /hsize X}N /p{show}N +-/RMat[1 0 0 -1 0 0]N /BDot 260 string N /rulex 0 N /ruley 0 N /v{/ruley +-X /rulex X V}B /V{}B /RV statusdict begin /product where{pop false[ +-(Display)(NeXT)(LaserWriter 16/600)]{dup length product length le{dup +-length product exch 0 exch getinterval eq{pop true exit}if}{pop}ifelse} +-forall}{false}ifelse end{{gsave TR -.1 .1 TR 1 1 scale rulex ruley false +-RMat{BDot}imagemask grestore}}{{gsave TR -.1 .1 TR rulex ruley scale 1 1 +-false RMat{BDot}imagemask grestore}}ifelse B /QV{gsave newpath transform +-round exch round exch itransform moveto rulex 0 rlineto 0 ruley neg +-rlineto rulex neg 0 rlineto fill grestore}B /a{moveto}B /delta 0 N /tail +-{dup /delta X 0 rmoveto}B /M{S p delta add tail}B /b{S p tail}B /c{-4 M} +-B /d{-3 M}B /e{-2 M}B /f{-1 M}B /g{0 M}B /h{1 M}B /i{2 M}B /j{3 M}B /k{ +-4 M}B /w{0 rmoveto}B /l{p -4 w}B /m{p -3 w}B /n{p -2 w}B /o{p -1 w}B /q{ +-p 1 w}B /r{p 2 w}B /s{p 3 w}B /t{p 4 w}B /x{0 S rmoveto}B /y{3 2 roll p +-a}B /bos{/SS save N}B /eos{SS restore}B end +- +-%%EndProcSet +-TeXDict begin 39158280 55380996 1000 600 600 (manual.dvi) +-@start +-%DVIPSBitmapFont: Fa cmti10 10.95 1 +-/Fa 1 47 df<120FEA3FC0127FA212FFA31380EA7F00123C0A0A77891C>46 +-D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fb cmbxti10 14.4 1 +-/Fb 1 47 df<13FCEA03FF000F13804813C05AA25AA2B5FCA31480A214006C5A6C5A6C5A +-EA0FE0121271912B>46 D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fc cmsl10 10.95 25 +-/Fc 25 122 df37 D44 D48 D<157015F014011407143F903803FFE0137FEBFFCFEBF80F1300 +-141F15C0A5143F1580A5147F1500A55C5CA513015CA513035CA513075CA5130F5CA3131F +-497EB612F8A31D3D78BC2D>I<133C137F5B481380A31400A26C5A137890C7FCB3EA0780 +-EA0FE0121F123FA5121FEA0F601200A213E05BA212015B120390C7FC5A1206120E5A5A12 
+-3012705A5A11397AA619>59 D97 DIIIII<147FEB3FFFA313017FA25CA513015CA51303 +-5CA4ED07F80107EB1FFF9139F0781FC09138F1E00F9139F38007E0ECF70002FE14F0495A +-5CA25CA24A130F131F4A14E0A4161F133F4A14C0A4163F137F91C71380A4167F5B491500 +-A300015D486C491380B5D8F87F13FCA32E3F7DBE33>104 D<1478EB01FE130314FFA25B +-14FE130314FCEB00F01400ACEB03F8EA01FF14F0A2EA001F130FA314E0A5131F14C0A513 +-3F1480A5137F1400A55B5BA4EA03FF007F13F0A2B5FC183E7DBD1A>I<143FEB1FFF5BA2 +-13017FA214FEA5130114FCA5130314F8A5130714F0A5130F14E0A5131F14C0A5133F1480 +-A5137F1400A55B5BA4EA03FF007F13F8A2B5FC183F7DBE1A>108 +-D<902707F007F8EB03FCD803FFD91FFF90380FFF80913CE0781FC03C0FE09126E1E00FEB +-F0073E001FE38007E1C003F090260FE700EBE38002EEDAF70013F802FC14FE02D85C14F8 +-4A5CA24A5C011F020F14074A4A14F0A5013F021F140F4A4A14E0A5017F023F141F91C749 +-14C0A549027F143F4992C71380A300014B147F486C496DEBFFC0B5D8F87FD9FC3F13FEA3 +-47287DA74C>I<903907F007F8D803FFEB1FFF9139E0781FC09138E1E00F3B001FE38007 +-E090380FE70002EE14F014FC14D814F85CA24A130F131F4A14E0A4161F133F4A14C0A416 +-3F137F91C71380A4167F5B491500A300015D486C491380B5D8F87F13FCA32E287DA733> +-II<91387F01FE903A7FFF0FFFC09139FE3E03F09238F801F890 +-3A01FFE000FE4B137F6D497F4990C713804A15C04A141FA218E0A20103150F5C18F0A317 +-1F010716E05CA3173F18C0130F4A147F1880A2EFFF004C5A011F5D16034C5A6E495AEE1F +-C06E495AD93FDC017EC7FC91388F01F8913883FFE0028090C8FC92C9FC137FA291CAFCA4 +-5BA25BA31201487EB512F8A3343A81A733>I<903907F01F80D803FFEB7FE09138E1E1F0 +-9138E387F839001FE707EB0FE614EE02FC13F002D813E09138F801804AC7FCA25C131FA2 +-5CA4133F5CA5137F91C8FCA55B5BA31201487EB512FEA325287EA724>114 +-D<9138FF81C0010713E390381F807F90397C003F8049131F4848130F5B00031407A24848 +-1400A27FA27F6D90C7FCEBFF8014FC6C13FF6C14C015F06C6C7F011F7F13079038007FFE +-1403140100381300157EA2123C153E157E007C147CA2007E147815F8007F495A4A5A486C +-485A26F9E01FC7FC38E0FFFC38C01FE0222A7DA824>II<01FE147F00FFEC7FFF4914FEA20007140300031401A34914FCA4 +-150312074914F8A41507120F4914F0A4150F121F4914E0A2151FA3153F4914C0157F15FF 
+-EC01DF3A0FC003BFE09138073FFF3803F01E3801FFF826003FE01380282977A733>III<90B539E007FFF05E18E0902707FE000313006D48EB01FC +-705A5F01014A5A5F16036E5C0100140794C7FC160E805E805E1678ED8070023F13F05EED +-81C015C191381FC38015C793C8FC15EF15EEEC0FFCA25DA26E5AA25DA26E5A5DA24AC9FC +-5C140E141E141C5C121C003F5B5A485B495A130300FE5B4848CAFCEA701EEA783CEA3FF0 +-EA0FC0343A80A630>121 D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fd cmtt12 14.4 10 +-/Fd 10 123 df50 D<383FFF805AB57EA37E7EEA003FAEED07FC92383FFF +-8092B512E002C314F802CF8002DF8091B7FCDBF80F1380DBC00113C092C713E04A143F4A +-EC1FF04A15F84A140F4AEC07FCA217034A15FE1701A318FF83A95F18FEA280170318FC6E +-140718F86E140FEF1FF06E143F6EEC7FE06EECFFC0DBC0031380EDF01F92B6120002DF14 +-FC02CF5C02C35CD91F8114C090260F807F90C7FC90C7EA0FF8384A7FC83E>98 +-D<923803FFF85D4B7FA38181ED0003AEEC1FF0ECFFFE0103EBFF83010F14E34914F3017F +-14FB90B7FC48EBF80F48EBC00191C7FC4848143F4848141F5B4848140F491407123F4914 +-03127F5BA312FF90C8FCA97F127FA216077F123F6D140FA26C6C141F6D143F000F157F6C +-6C14FF01FF5B6C6D5A6CD9F01FEBFFFC6C90B500FB13FE6D02F313FF6D14E3010F14C36D +-020113FE010101FC14FC9026003FE0C8FC384A7CC83E>100 D<143E147F4A7E497FA56D +-5B6EC8FC143E91C9FCAC003FB57E5A81A47EC7123FB3B3007FB71280B812C0A56C16802A +-4A76C93E>105 D<007FB512C0B6FC81A47EC7121FB3B3B3A5007FB712F8B812FCA56C16 +-F82E4978C83E>108 D111 +-DI<903901FFF00F011F9038 +-FE1F8090B612BF000315FF5A5A5A393FFE003F01F01307D87FC0130190C8FC5A48157FA4 +-7EEE3F00D87FC091C7FC13F0EA3FFE381FFFF06CEBFFC06C14FE6C6E7EC615E0013F14F8 +-010780D9003F7F02007F03071380030013C0003EED3FE0007F151F48150F17F06D1407A3 +-7FA26D140F6D15E0161F01FCEC3FC06D14FF9026FFC00F138091B612005E485D013F5C6D +-14E0D8FC0714802778007FF8C7FC2C3677B43E>115 D<147C14FC497EAD003FB712FC5A +-B87EA36C5EA2260001FEC9FCB3A6173FA2EF7F80A76E14FF6D16006F5A9238C007FE9138 +-7FF01F92B55A6E5C6E5C6E5C6E1480020149C7FC9138003FF031437DC13E>I<000FB812 +-804817C04817E0A418C001C0C712014C13804C1300EE1FFE4C5AEE7FF06C484A5A4B5BC8 
+-485B4B90C7FC4B5A4B5A4B5A4B5A4B5A4A5B4A5B4A90C8FC4A5A4A5A4A5A4A5A4A5A495B +-495B4990C9FC495A495A4948EC0FC0495A4948EC1FE0485B00075B4890C8FCEA1FFC485A +-485A90B8FCB9FCA46C17C07E33337CB23E>122 D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fe cmtt12 13.14 31 +-/Fe 31 123 df50 D<003FB6FC4815E0B712F882826C816C16802701FC000113C0 +-9238007FE0161FEE0FF0A2160717F81603A6160717F0A2160FEE1FE0163FEE7FC0923801 +-FF80030F130090B65A5E16F08216FEEEFF8017C001FCC7EA7FE0EE1FF0EE07F8160317FC +-EE01FE1600A217FF177FA717FF17FE16011603EE07FC160FEE3FF8EEFFF0003FB7FC4816 +-E0B812C01780EEFE006C15F86C15C030437DC238>66 DI<007FB512F8B7FC16C08216F8826C813A03F8001FFF15 +-07030113806F13C0167FEE3FE0161FEE0FF0A2EE07F8A2EE03FCA21601A217FE1600A417 +-7FAD17FF17FEA4160117FCA2160317F81607A2EE0FF0161FEE3FE0167FEEFFC04B13805D +-031F1300007FB65AB75A5E5E16C093C7FC6C14F830437DC238>I<007FB712FCB87EA57E +-D801FCC8FCA9177C94C7FCA6ED07C04B7EA590B6FCA79038FC000FA56F5A92C9FCA7EF0F +-80EF1FC0AA007FB8FCB9FCA56C178032437DC238>I<91391FF003C091397FFC07E049B5 +-FC010714CF4914EF4914FF5B90387FF81F9038FFE00748EB800191C7FC4848147F485A49 +-143F485A161F485AA249140F123F5BA2127F90C8EA07C093C7FCA35A5AAA923807FFFC4B +-13FE4B13FF7E7E6F13FE6F13FC9238000FE07F003F151FA27F121F7F163F6C7EA26C6C14 +-7F7F6C6C14FF6C6C5B6E5A6C6D5A90387FF81F6DB6FC6D14EF6D14CF6D148F0101140F90 +-3A007FFC07C0DA1FF0C7FC30457CC338>71 D<007FB612F0B712F8A56C15F0260001FCC7 +-FCB3B3B1007FB612F0B712F8A56C15F0254377C238>73 D<90380FFFFE90B612E0000315 +-F8488148814881A2273FFE000F138001F01301497F49147F4848EC3FC0A290C8121FA448 +-16E0A248150FB3AC6C151FA36C16C0A36D143FA36C6CEC7F806D14FF6D5B01FE130F6CB7 +-1200A26C5D6C5D6C5DC615E0010F49C7FC2B457AC338>79 D<003FB512F04814FEB77E16 +-E0826C816C813A01FC003FFEED07FF03017F81707E163F161F83160FA7161F5F163F167F +-4C5A5D030790C7FCED3FFE90B65A5E5E5EA282829038FC001FED07FC6F7E150115008282 +-AA18E0EF01F0EF03F8A31783EE3F87263FFFE0ECC7F0486D14FFB56C7F18E07013C06C49 +-6D13806C496D1300CA12FC35447EC238>82 D<003FB8FC481780B9FCA53BFE0007F0003F 
+-A9007CEE1F00C792C7FCB3B3A70107B512F04980A56D5C31437DC238>84 +-D<267FFFF090387FFFF0B56C90B512F8A56C496D13F0D801FCC73801FC00B3B3A66D1403 +-00005EA36D14076D5D6E130F6D6C495A6E133F6D6C495A6D6C495AECFF076D90B5C7FC6D +-5C6D5C6D5C023F13E0020F1380DA03FEC8FC35447FC238>I87 +-D<001FB712F04816F85AA417F090C8121F17E0EE3FC0167F1780EEFF00A24B5A4B5A123E +-C8485A4B5AA24B5A4B5AA24B5A4BC7FCA24A5A14035D4A5A140F5D4A5A143F5D4A5A14FF +-92C8FC495A13035C495AA2495A495AA2495A495A17F849C7EA01FC485AA2485A485AA248 +-5A121F5B485A127F90B7FCB8FCA56C16F82E437BC238>90 D<003FB712804816C0B812E0 +-A46C16C06C16802B087A7D38>95 D97 DIIIII<14F0497E497E497EA4 +-6D5A6D5A6D5A91C8FCAB383FFFFC487FB5FCA37E7EC7FCB3AF007FB612F0B712F816FCA3 +-16F86C15F0264476C338>105 D<387FFFFEB6FCA57EC77EB3B3B1007FB7FCB81280A56C +-1600294379C238>108 D<023FEB07E03B3FE0FFC01FF8D87FF39038E07FFCD8FFF76D48 +-7E90B500F97F15FB6C91B612806C01C1EBF83F00030100EBE01F4902C013C0A24990387F +-800FA2491400A349137EB3A73C3FFF03FFE07FFC4801879038F0FFFEB500C76D13FFA36C +-01874913FE6C01039038E07FFC383080AF38>IIII114 D<903907FF80F0017FEBF1F848B5 +-12FD000714FF5A5A5AEBFC00D87FE0131F0180130F48C71207481403A5007FEC01F001C0 +-90C7FCEA3FF013FE381FFFF86CEBFFC0000314F8C614FF013F1480010714E0D9003F13F0 +-020013F8ED0FFC1503003CEC01FE007E140000FE15FF167F7EA37F6D14FF16FE01F01303 +-6DEB07FC01FF137F91B512F816F016E04815C0D8FC3F1400010F13FCD8780113E0283278 +-B038>III<000FB712FC4816FE5AA417 +-FC0180C7EA1FF8EE3FF0EE7FE0EEFFC04B13804B13006CC7485AC8485A4B5A4B5A4B5A4B +-5A4A5B4A90C7FCEC07FC4A5A4A5A4A5A4A5A49485A4990C8FC495A495A495A495A494814 +-7C494814FE485B4890C8FC485A485A485A485A48B7FCB8FCA56C16FC2F2F7DAE38>122 +-D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Ff cmbx12 13.14 53 +-/Ff 53 122 df<923807FFE092B512FC020714FF021F81027F9038007FC0902601FFF0EB +-0FE04901C0497E4990C7487ED90FFC147F011F824A14FF495AA2137F5CA2715A715A715A +-EF078094C8FCA7EF07FCB9FCA526007FF0C7123F171FB3B3A2003FB5D8E00FB512F8A53D +-4D7ECC44>12 D45 DI<177817F8EE01FCA21603A2EE07F8A217F016 
+-0FA217E0161FA2EE3FC0A21780167FA217005EA24B5AA25E1503A24B5AA25E150FA25E15 +-1FA24B5AA25E157FA24BC7FCA25D1401A25D1403A24A5AA25D140FA24A5AA25D143FA25D +-147FA24AC8FCA25C1301A25C1303A2495AA25C130FA2495AA25C133FA25C137FA249C9FC +-A25B1201A2485AA25B1207A25B120FA2485AA25B123FA25B127FA248CAFCA25AA2127CA2 +-2E6D79D13D>I<15F014011407141F147FEB03FF137FB6FCA313FC1380C7FCB3B3B2007F +-B712E0A52B4777C63D>49 DIIIII<121F7F7F13FE90B812E0A45A18C0188018005F5FA25F485E90C8 +-EA07E0007E4B5A5F007C151F4CC7FC167E5E485D15014B5A4B5AC8485A4B5AA24BC8FC15 +-7EA25D1401A24A5A1407A24A5AA2141FA24A5AA2147FA314FFA3495BA45BA55BAA6D5BA2 +-6D90C9FCEB007C334B79C93D>III65 +-D<93261FFF80EB01C00307B500F81303033F02FE13074AB7EAC00F0207EEE03F021F903A +-FE007FF87F027F01E0903807FCFF91B5C70001B5FC010301FC6E7E4901F0151F4901C081 +-4949814990C97E494882494882485B48197F4A173F5A4A171F5A5C48190FA2485B1A07A2 +-5AA297C7FC91CDFCA2B5FCAD7EA280A2F207C07EA36C7FA26C190F6E18807E6E171F6C1A +-006E5F6C193E6C6D177E6D6C5F6D6C4C5A6D6D15036D6D4B5A6D01F04B5A6D01FCED3FC0 +-010001FFEDFF806E01E0D903FEC7FC021F01FEEB3FFC020790B612F002015EDA003F92C8 +-FC030714FCDB001F13804A4D79CB59>67 D +-III<93261FFF80EB01C00307B500F8 +-1303033F02FE13074AB7EAC00F0207EEE03F021F903AFE007FF87F027F01E0903807FCFF +-91B5C70001B5FC010301FC6E7E4901F0151F4901C0814949814990C97E49488249488248 +-5B48197F4A173F5A4A171F5A5C48190FA2485B1A07A25AA297C8FC91CEFCA2B5FCAD6C04 +-0FB712C0A280A36C93C7001FEBC000A2807EA27E807E807E806C7F7E6D7E6D7E6D7F6D01 +-E05D6D6D5D6D13FC010001FF4AB5FC6E01E0EB07F9021F01FFEB3FF0020791B5EAE07F02 +-01EEC01FDA003FED0007030702F81301DB001F018090C8FC524D79CB61>III76 DII +-II82 DI<003FBB12C0A5DA80019038FC001FD9FC001601D8 +-7FF09438007FE001C0183F49181F90C7170FA2007E1907A3007C1903A500FC1AF0481901 +-A5C894C7FCB3B3A749B812FCA54C4A7CC955>III89 D97 +-DI<91380FFF8091B512F8 +-010314FF010F15804948C613C0D97FF8EB1FE0D9FFE0EB3FF04849137F4849EBFFF84890 +-C7FCA2485A121FA24848EC7FF0EE3FE0EE1FC0007F92C7FC5BA212FFAC127FA27FA2123F 
+-A26C6C153EA26C6C157E177C6C6D14FC6C6D14F86C6D13036C6DEB07F0D97FFCEB1FE06D +-B4EBFFC0010F90B5120001035C010014F0020F13802F347CB237>IIIIII<13FCEA03FF487F487FA2487FA66C5BA26C5B6C90C7FCEA00FC90C8 +-FCABEB7FC0B5FCA512037EB3B3A2B61280A5194D7BCC22>I108 D<90287FC001FFC0EC7FF0B5010F01FC0103B5FC033F +-6D010F804B6D4980DBFE079026803F817F9126C1F801903AC07E007FF00003D9C3E0DAE0 +-F8806C9026C78000D9F1E06D7E02CFC7EBF3C002DEEDF780DD7FFF6E7E02FC93C7FC4A5D +-A24A5DA34A5DB3AAB6D8C03FB5D8F00FB512FCA55E327BB167>I<903A7FC001FFC0B501 +-0F13F8033F7F4B13FFDBFE077F9138C1F00300039026C3E0017F6CD9C78080ECCF0014DE +-02DC6D7F14FC5CA25CA35CB3AAB6D8C07FEBFFE0A53B327BB144>I<913807FF80027F13 +-F80103B6FC010F15C090261FFE017F903A7FF0003FF849486D7E480180EB07FE4890C76C +-7E4817804980000F17C048486E13E0A2003F17F0A249157F007F17F8A400FF17FCAB007F +-17F8A46C6CEDFFF0A2001F17E0A26C6C4A13C0A26C6C4A13806C6D4913006C5E6C01E0EB +-1FFC6D6C495A903A3FFE01FFF0010FB612C0010392C7FCD9007F13F80207138036347DB2 +-3D>I<90397FC007FFB5017F13E002C1B512FC02C714FF9126CFF80F7F9126DFC0037F00 +-0301FFC77F6C496E7E02F8814A6E7E717E4A81831980A28319C0A37113E0AC19C05FA319 +-805F19005F606E143F6E5D4D5A6E4A5A02FF495BDBC0075B9126EFF01F5B02E7B548C7FC +-02E114F8DAE07F13E0DB0FFEC8FC92CAFCAFB612C0A53B477CB144>I<9039FF803FE0B5 +-EBFFF8028113FE02837FDA87E11380EC8F830003D99F0713C06C139E14BCA214F8A24A6C +-13806F13006F5A4A90C7FCA45CB3A8B612E0A52A327CB132>114 +-D<903907FF8070017FEBF1F048B6FC1207380FFC01391FE0003F4848130F491307127F90 +-C71203A2481401A27FA27F01F090C7FC13FCEBFFC06C13FEECFFE06C14FC6C806CECFF80 +-6C15C06C15E06C15F06C7E011F14F8010114FCEB000FEC007FED1FFE0078140F00F81407 +-15037E1501A27E16FC7E15036D14F86D13076D14F001F8EB1FE001FFEBFFC04890B51280 +-486C1400D8F81F13FCD8E00313C027347CB230>I<14F8A51301A41303A21307A2130FA2 +-131F133F137F13FF1203000F90B512F0B7FCA426007FF8C7FCB3A7167CAA013F14F880A2 +-90391FFE01F0010F1303903907FF87E06DEBFFC06D14806D6C1300EC0FFC26467EC430> +-IIII<007FB500C090387FFFE0A5C601F0C73803F8006E5D017F5E6E140701 
+-3F5E80170F011F5E6E141F6D93C7FC6F5B6D153E6F137E6D157C6F13FCA26D6D5B16016D +-5DEDF803027F5CEDFC07023F5CEDFE0F021F5C15FF161F6E91C8FC16BF6E13BE16FE6E5B +-A26E5BA36E5BA26F5AA26F5AA26F5AA393C9FC5D153E157E157CD81F8013FC486C5B387F +-E001D8FFF05B14035D14074A5A49485A007F133F4948CAFC383F81FE381FFFF86C5B6C13 +-C0C648CBFC3B477EB041>121 D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fg cmtt12 17.28 6 +-/Fg 6 123 df<913803FFC0023F13FC49B67E010715F04981013F15FE498190B812C048 +-8348D9FC0180489026E0001F7F480180130391C87F48486F7E49153F4848ED0FFF834848 +-178083496F13C012FF8319E07FA2187FA36C5A6C5A6C5ACBFCA218FFA219C05FA219805F +-A24D13005F604D5A173F4D5A4D5AA24C5B4C5B4C5B041F90C7FC4C5A4C5A4C5A4B5B4B5B +-4B5B031F5B4B90C8FC4B5AEDFFF84A5B4A5B4A5B021F5B4A90C9FCEC7FFC4A5A495B495B +-010F5B495B4948CAFC4948ED1F804948ED3FC04849ED7FE0485B000F5B4890C9FC4890B8 +-FC5ABAFCA56C18C06C18803B5A79D94A>50 D<383FFFF0487F80B5FCA37EA27EEA000FB0 +-EE0FFC93B57E030714E0031F14F84B14FE92B7FC02FD8291B87E85DCE01F7FEE000703FC +-01017F4B6D7F03E0143F4B6E7E4B140F8592C87E4A6F1380A34A6F13C0A284A21AE0A219 +-7FAA19FFA21AC0A26E5DA24E138080606F1600606F4A5A6F143F6F4A5A6F4A5A6F130303 +-FF010F5BDCC03F5B93B65A6102FD93C7FC02FC5D6F5C031F14F0902607F80714C0902603 +-F00191C8FC90C8EA3FF043597FD74A>98 D105 D<003FB512FE4880B77EA57E7EC71201B3B3B3 +-B0003FB812FC4817FEBAFCA56C17FE6C17FC385877D74A>108 D +-112 D<000FB912E04818F04818F8A619F001F0C8000313E04D13C04D13804D13004D5A4D +-5A4D5A6C484A5B6C484A5BC9000F5B4C5B4C90C7FC4C5A4C5A4B5B4B5B4B5B4B5B4B5B4B +-90C8FC4B5A4B5A4A5B4A5B4A5B4A5B4A5B4A90C9FC4A5A4A5A495B495B495B4949EC07E0 +-4949EC0FF04948C8EA1FF8495A495A485B485B485B485B4890C9FC485A48B9FCBAFCA66C +-18F06C18E03D3E7BBD4A>122 D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fh cmbx12 17.28 28 +-/Fh 28 120 df<16F04B7E1507151F153FEC01FF1407147F010FB5FCB7FCA41487EBF007 +-C7FCB3B3B3B3007FB91280A6395E74DD51>49 D<913801FFF8021FEBFFC091B612F80103 +-15FF010F16C0013F8290267FFC0114F89027FFE0003F7F4890C7000F7F48486E7FD807F8 
+-6E148048486E14C048486E14E048486F13F001FC17F8486C816D17FC6E80B56C16FE8380 +-A219FFA283A36C5BA26C5B6C90C8FCD807FC5DEA01F0CA14FEA34D13FCA219F85F19F04D +-13E0A294B512C019804C14004C5B604C5B4C5B604C13804C90C7FC4C5A4C5A4B13F05F4B +-13804B90C8FC4B5AED1FF84B5A4B5A4B48143F4A5B4A48C8FC4A5A4A48157E4A5A4A5AEC +-7F8092C9FC02FE16FE495A495A4948ED01FCD90FC0150749B8FC5B5B90B9FC5A4818F85A +-5A5A5A5ABAFCA219F0A4405E78DD51>I<92B5FC020F14F8023F14FF49B712C04916F001 +-0FD9C01F13FC90271FFC00077FD93FE001017F49486D8049C86C7F484883486C6F7F14C0 +-486D826E806E82487FA4805CA36C5E4A5E6C5B6C5B6C495E011FC85A90C95CA294B55A61 +-4C91C7FC604C5B4C5B4C5B4C5B047F138092260FFFFEC8FC020FB512F817E094C9FC17F8 +-17FF91C7003F13E0040713F8040113FE707F717F7113E085717FA2717F85A285831A80A3 +-1AC0EA03FCEA0FFF487F487F487FA2B57EA31A80A34D14005C7E4A5E5F6C495E49C8485B +-D81FF85F000F5ED807FE92B55A6C6C6C4914806C01F0010791C7FC6C9026FF803F5B6D90 +-B65A011F16F0010716C001014BC8FCD9001F14F0020149C9FC426079DD51>II<4DB5ED03C0057F02F0 +-14070407B600FE140F047FDBFFC0131F4BB800F0133F030F05FC137F033F9127F8007FFE +-13FF92B6C73807FF814A02F0020113C3020702C09138007FE74A91C9001FB5FC023F01FC +-16074A01F08291B54882490280824991CB7E49498449498449498449865D49498490B5FC +-484A84A2484A84A24891CD127FA25A4A1A3F5AA348491A1FA44899C7FCA25CA3B5FCB07E +-A380A27EA2F50FC0A26C7FA37E6E1A1F6C1D80A26C801D3F6C6E1A00A26C6E616D1BFE6D +-7F6F4E5A7F6D6D4E5A6D6D4E5A6D6D4E5A6D6E171F6D02E04D5A6E6DEFFF806E01FC4C90 +-C7FC020F01FFEE07FE6E02C0ED1FF8020102F8ED7FF06E02FF913803FFE0033F02F8013F +-1380030F91B648C8FC030117F86F6C16E004071680DC007F02F8C9FC050191CAFC626677 +-E375>67 D72 DI77 +-D80 D<001FBEFCA64849C79126E0000F148002E0180091 +-C8171F498601F81A0349864986A2491B7FA2491B3F007F1DC090C9181FA4007E1C0FA600 +-FE1DE0481C07A5CA95C7FCB3B3B3A3021FBAFCA663617AE070>84 +-D<913803FFFE027FEBFFF00103B612FE010F6F7E4916E090273FFE001F7FD97FE001077F +-D9FFF801017F486D6D7F717E486D6E7F85717FA2717FA36C496E7FA26C5B6D5AEB1FC090 
+-C9FCA74BB6FC157F0207B7FC147F49B61207010F14C0013FEBFE004913F048B512C04891 +-C7FC485B4813F85A5C485B5A5CA2B55AA45FA25F806C5E806C047D7F6EEB01F96C6DD903 +-F1EBFF806C01FED90FE114FF6C9027FFC07FC01580000191B5487E6C6C4B7E011F02FC13 +-0F010302F001011400D9001F90CBFC49437CC14E>97 D<903807FF80B6FCA6C6FC7F7FB3 +-A8EFFFF8040FEBFF80047F14F00381B612FC038715FF038F010014C0DBBFF0011F7FDBFF +-C001077F93C76C7F4B02007F03F8824B6F7E4B6F13804B17C0851BE0A27313F0A21BF8A3 +-7313FCA41BFEAE1BFCA44F13F8A31BF0A24F13E0A24F13C06F17804F1300816F4B5A6F4A +-5B4AB402075B4A6C6C495B9126F83FE0013F13C09127F00FFC03B55A4A6CB648C7FCDAC0 +-0115F84A6C15E091C7001F91C8FC90C8000313E04F657BE35A>I<92380FFFF04AB67E02 +-0F15F0023F15FC91B77E01039039FE001FFF4901F8010113804901E0010713C049018049 +-13E0017F90C7FC49484A13F0A2485B485B5A5C5A7113E0485B7113C048701380943800FE +-0095C7FC485BA4B5FCAE7EA280A27EA2806C18FCA26C6D150119F87E6C6D15036EED07F0 +-6C18E06C6D150F6D6DEC1FC06D01E0EC7F806D6DECFF00010701FCEB03FE6D9039FFC03F +-FC010091B512F0023F5D020F1580020102FCC7FCDA000F13C03E437BC148>II<92380FFFC0 +-4AB512FC020FECFF80023F15E091B712F80103D9FE037F499039F0007FFF011F01C0011F +-7F49496D7F4990C76C7F49486E7F48498048844A804884485B727E5A5C48717EA35A5C72 +-1380A2B5FCA391B9FCA41A0002C0CBFCA67EA380A27EA27E6E160FF11F806C183F6C7FF1 +-7F006C7F6C6D16FE6C17016D6C4B5A6D6D4A5A6D01E04A5A6D6DEC3FE0010301FC49B45A +-6D9026FFC01F90C7FC6D6C90B55A021F15F8020715E0020092C8FC030713F041437CC14A +->III<903807FF80B6FCA6C6FC7F7FB3A8EF1FFF94B512F0040714 +-FC041F14FF4C8193267FE07F7F922781FE001F7FDB83F86D7FDB87F07FDB8FC0814C7F03 +-9FC78015BE03BC8003FC825DA25DA25DA45DB3B2B7D8F007B71280A651647BE35A>II<903807FF80B6 +-FCA6C6FC7F7FB3B3B3B3ADB712E0A623647BE32C>108 D<902607FF80D91FFFEEFFF8B6 +-91B500F00207EBFF80040702FC023F14E0041F02FF91B612F84C6F488193267FE07F6D48 +-01037F922781FE001F9027E00FF0007FC6DA83F86D9026F01FC06D7F6DD987F06D4A487F +-6DD98FC0DBF87EC7804C6D027C80039FC76E488203BEEEFDF003BC6E4A8003FC04FF834B 
+-5FA24B5FA24B94C8FCA44B5EB3B2B7D8F007B7D8803FB612FCA67E417BC087>I<902607 +-FF80EB1FFFB691B512F0040714FC041F14FF4C8193267FE07F7F922781FE001F7FC6DA83 +-F86D7F6DD987F07F6DD98FC0814C7F039FC78015BE03BC8003FC825DA25DA25DA45DB3B2 +-B7D8F007B71280A651417BC05A>I<923807FFE092B6FC020715E0021F15F8027F15FE49 +-4848C66C6C7E010701F0010F13E04901C001037F49496D7F4990C87F49486F7E49486F7E +-48496F13804819C04A814819E048496F13F0A24819F8A348496F13FCA34819FEA4B518FF +-AD6C19FEA46C6D4B13FCA36C19F8A26C6D4B13F0A26C19E06C6D4B13C0A26C6D4B13806C +-6D4B13006D6C4B5A6D6D495B6D6D495B010701F0010F13E06D01FE017F5B010090B7C7FC +-023F15FC020715E0020092C8FC030713E048437CC151>I114 D<913A3FFF8007800107B5EAF81F011FECFE7F017F91B5FC48B8FC48EBE0 +-014890C7121FD80FFC1407D81FF0801600485A007F167F49153FA212FF171FA27F7F7F6D +-92C7FC13FF14E014FF6C14F8EDFFC06C15FC16FF6C16C06C16F06C826C826C826C82013F +-1680010F16C01303D9007F15E0020315F0EC001F1500041F13F81607007C150100FC8117 +-7F6C163FA2171F7EA26D16F0A27F173F6D16E06D157F6D16C001FEEDFF806D0203130002 +-C0EB0FFE02FCEB7FFC01DFB65A010F5DD8FE0315C026F8007F49C7FC48010F13E035437B +-C140>II<90 +-2607FFC0ED3FFEB60207B5FCA6C6EE00076D826D82B3B3A260A360A2607F60183E6D6D14 +-7E4E7F6D6D4948806D6DD907F0ECFF806D01FFEB3FE06D91B55A6E1500021F5C020314F8 +-DA003F018002F0C7FC51427BC05A>I119 D +-E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fi cmsy10 10.95 1 +-/Fi 1 16 df15 +-D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fj cmtt10 10.95 89 +-/Fj 89 127 df<121C127FEAFF80B3EA7F00B2123EC7FCA8121C127FA2EAFF80A3EA7F00 +-A2121C09396DB830>33 D<00101304007C131F00FEEB3F80A26C137FA248133FB2007E14 +-00007C7F003C131E00101304191C75B830>I<903907C007C0A2496C487EA8011F131FA2 +-02C05BA3007FB7FCA2B81280A36C16006C5D3A007F807F80A2020090C7FCA9495BA2003F +-90B512FE4881B81280A36C1600A22701FC01FCC7FCA300031303A201F85BA76C486C5AA2 +-29387DB730>I38 DI<141E147F14FF5BEB03FEEB07FCEB0FF0EB1FE0EB3FC0EB7F80EBFF00 +-485A5B12035B485A120F5BA2485AA2123F5BA2127F90C7FCA412FEAD127FA47F123FA27F 
+-121FA26C7EA27F12076C7E7F12017F6C7EEB7F80EB3FC0EB1FE0EB0FF0EB07FCEB03FEEB +-01FF7F147F141E184771BE30>I<127812FE7E7F6C7E6C7EEA0FF06C7E6C7E6C7E6C7EEB +-7F80133F14C0131FEB0FE014F01307A2EB03F8A214FC1301A214FE1300A4147FAD14FEA4 +-130114FCA2130314F8A2EB07F0A2130F14E0EB1FC0133F1480137FEBFF00485A485A485A +-485AEA3FE0485A485A90C7FC5A1278184778BE30>I<14E0497E497EA60038EC0380007E +-EC0FC0D8FF83EB3FE001C3137F9038F3F9FF267FFBFB13C06CB61280000FECFE00000314 +-F86C5C6C6C13C0011F90C7FC017F13C048B512F04880000F14FE003FECFF80267FFBFB13 +-C026FFF3F913E09038C3F87F0183133FD87E03EB0FC00038EC0380000091C7FCA66D5A6D +-5A23277AAE30>I<143EA2147FAF007FB7FCA2B81280A36C1600A2C76CC8FCAF143EA229 +-297DAF30>II<007FB612F0 +-A2B712F8A36C15F0A225077B9E30>I<120FEA3FC0EA7FE0A2EAFFF0A4EA7FE0A2EA3FC0 +-EA0F000C0C6E8B30>I<16F01501ED03F8A21507A2ED0FF0A2ED1FE0A2ED3FC0A2ED7F80 +-A2EDFF00A24A5AA25D1403A24A5AA24A5AA24A5AA24A5AA24A5AA24AC7FCA2495AA25C13 +-03A2495AA2495AA2495AA2495AA2495AA249C8FCA2485AA25B1203A2485AA2485AA2485A +-A2485AA2485AA248C9FCA25AA2127CA225477BBE30>I<14FE903807FFC0497F013F13F8 +-497F90B57E48EB83FF4848C6138049137F4848EB3FC04848EB1FE049130F001F15F04913 +-07A24848EB03F8A290C712014815FCA400FEEC00FEAD6C14016C15FCA36D1303003F15F8 +-A26D1307001F15F0A26D130F6C6CEB1FE0A26C6CEB3FC06C6CEB7F806D13FF2601FF8313 +-006CEBFFFE6D5B6D5B010F13E06D5BD900FEC7FC273A7CB830>IIIII<000FB612804815C05AA316800180C8FCAEEB83FF019F13C0 +-90B512F015FC8181D9FE0313809039F0007FC049133F0180EB1FE06CC7120F000E15F0C8 +-1207A216F81503A31218127EA2B4FC150716F048140F6C15E06C141F6DEB3FC06D137F3A +-3FE001FF80261FFC0F13006CB55A6C5C6C5C6C14E06C6C1380D90FFCC7FC25397BB730> +-II<127CB712FC16FEA416FC48C7EA0FF816F0ED1FE0007CEC3FC0C8EA7F80EDFF00 +-A24A5A4A5A5D14075D140F5D4A5AA24A5AA24AC7FCA25C5C13015CA213035CA213075CA4 +-495AA6131F5CA96D5A6DC8FC273A7CB830>I<49B4FC011F13F0017F13FC90B57E0003EC +-FF804815C048010113E03A1FF8003FF049131FD83FC0EB07F8A24848EB03FC90C71201A5 +-6D1303003F15F86D13076C6CEB0FF06C6CEB1FE0D807FCEB7FC03A03FF83FF806C90B512 
+-006C6C13FC011F13F0497F90B512FE48802607FE0013C0D80FF8EB3FE0D81FE0EB0FF048 +-48EB07F8491303007F15FC90C712014815FE481400A66C14016C15FC6D1303003F15F86D +-1307D81FF0EB1FF06D133F3A0FFF01FFE06C90B512C06C1580C6ECFE006D5B011F13F001 +-0190C7FC273A7CB830>I<49B4FC010F13E0013F13F890B57E4880488048010113803A0F +-FC007FC0D81FF0EB3FE04848131F49EB0FF048481307A290C7EA03F85A4815FC1501A416 +-FEA37E7E6D130315076C7E6C6C130F6D133FD80FFC13FF6CB6FC7E6C14FE6C14F9013FEB +-E1FC010F138190380060011400ED03F8A2150716F0150F000F15E0486C131F486CEB3FC0 +-157FEDFF804A1300EC07FE391FF01FFC90B55A6C5C6C5C6C1480C649C7FCEB3FF0273A7C +-B830>I<120FEA3FC0EA7FE0A2EAFFF0A4EA7FE0A2EA3FC0EA0F00C7FCAF120FEA3FC0EA +-7FE0A2EAFFF0A4EA7FE0A2EA3FC0EA0F000C276EA630>II<16F01503ED07F8151F157FEDFFF014034A13C0021F138091383FFE00ECFFF8495B01 +-0713C0495BD93FFEC7FC495A3801FFF0485B000F13804890C8FCEA7FFC5BEAFFE05B7FEA +-7FF87FEA1FFF6C7F000313E06C7F38007FFC6D7E90380FFF806D7F010113F06D7FEC3FFE +-91381FFF80020713C06E13F01400ED7FF8151F1507ED03F01500252F7BB230>I<007FB7 +-FCA2B81280A36C16006C5DCBFCA7003FB612FE4881B81280A36C1600A229157DA530>I< +-1278127EB4FC13C07FEA7FF813FEEA1FFF6C13C000037F6C13F86C6C7EEB1FFF6D7F0103 +-13E06D7F9038007FFC6E7E91380FFF806E13C0020113F080ED3FF8151F153FEDFFF05C02 +-0713C04A138091383FFE004A5A903801FFF0495B010F13804990C7FCEB7FFC48485A4813 +-E0000F5B4890C8FCEA7FFE13F8EAFFE05B90C9FC127E1278252F7BB230>I64 D<147F4A7EA2497FA449 +-7F14F7A401077F14E3A3010F7FA314C1A2011F7FA490383F80FEA590387F007FA4498049 +-133F90B6FCA34881A39038FC001F00038149130FA4000781491307A2D87FFFEB7FFFB56C +-B51280A46C496C130029397DB830>I<007FB512F0B612FE6F7E82826C813A03F8001FF8 +-15076F7E1501A26F7EA615015EA24B5A1507ED1FF0ED7FE090B65A5E4BC7FC6F7E16E082 +-9039F8000FF8ED03FC6F7E1500167FA3EE3F80A6167F1700A25E4B5A1503ED1FFC007FB6 +-FCB75A5E16C05E6C02FCC7FC29387EB730>I<91387F803C903903FFF03E49EBFC7E011F +-13FE49EBFFFE5B9038FFE07F48EB801F3903FE000F484813075B48481303A2484813015B +-123F491300A2127F90C8FC167C16005A5AAC7E7EA2167C6D14FE123FA27F121F6D13016C 
+-6C14FCA26C6CEB03F86D13076C6CEB0FF03901FF801F6C9038E07FE06DB512C06D14806D +-1400010713FC6D13F09038007FC0273A7CB830>I<003FB512E04814FCB67E6F7E6C816C +-813A03F8007FF0ED1FF8150F6F7E6F7E15016F7EA2EE7F80A2163F17C0161FA4EE0FE0AC +-161F17C0A3163F1780A2167F17005E4B5A15034B5A150F4B5AED7FF0003FB65A485DB75A +-93C7FC6C14FC6C14E02B387FB730>I<007FB7FCB81280A47ED803F8C7123FA8EE1F0093 +-C7FCA4157C15FEA490B5FCA6EBF800A4157C92C8FCA5EE07C0EE0FE0A9007FB7FCB8FCA4 +-6C16C02B387EB730>I<003FB712804816C0B8FCA27E7ED801FCC7121FA8EE0F8093C7FC +-A5153E157FA490B6FCA69038FC007FA4153E92C8FCAE383FFFF8487FB5FCA27E6C5B2A38 +-7EB730>I<02FF13F00103EBC0F8010F13F1013F13FD4913FF90B6FC4813C1EC007F4848 +-133F4848131F49130F485A491307121F5B123F491303A2127F90C7FC6F5A92C8FC5A5AA8 +-92B5FC4A14805CA26C7F6C6D1400ED03F8A27F003F1407A27F121F6D130F120F7F6C6C13 +-1FA2D803FE133F6C6C137FECC1FF6C90B5FC7F6D13FB010F13F30103EBC1F0010090C8FC +-293A7DB830>I<3B3FFF800FFFE0486D4813F0B56C4813F8A26C496C13F06C496C13E0D8 +-03F8C7EAFE00B290B6FCA601F8C7FCB3A23B3FFF800FFFE0486D4813F0B56C4813F8A26C +-496C13F06C496C13E02D387FB730>I<007FB6FCB71280A46C1500260007F0C7FCB3B3A8 +-007FB6FCB71280A46C1500213879B730>I<49B512F04914F85BA27F6D14F090C7EAFE00 +-B3B3123C127EB4FCA24A5A1403EB8007397FF01FF86CB55A5D6C5C00075C000149C7FC38 +-003FF025397AB730>II<383FFFF8487FB57EA26C5B6C5BD801FCC9FCB3B0EE0F +-80EE1FC0A9003FB7FC5AB8FCA27E6C16802A387EB730>III<90383FFFE048B512FC00 +-0714FF4815804815C04815E0EBF80001E0133FD87F80EB0FF0A290C71207A44815F84814 +-03B3A96C1407A26C15F0A36D130FA26D131F6C6CEB3FE001F813FF90B6FC6C15C06C1580 +-6C1500000114FCD8003F13E0253A7BB830>I<007FB512F0B612FE6F7E16E0826C813903 +-F8003FED0FFCED03FE15016F7EA2821780163FA6167F17005EA24B5A1503ED0FFCED3FF8 +-90B6FC5E5E16804BC7FC15F001F8C9FCB0387FFFC0B57EA46C5B29387EB730>I<90383F +-FFE048B512FC000714FF4815804815C04815E0EBF80001E0133F4848EB1FF049130F90C7 +-1207A44815F8481403B3A8147E14FE6CEBFF076C15F0EC7F87A2EC3FC7018013CF9038C0 
+-1FFFD83FE014E0EBF80F90B6FC6C15C06C15806C1500000114FCD8003F7FEB00016E7EA2 +-1680157F16C0153F16E0151F16F0150FED07E025467BB830>I<003FB57E4814F0B612FC +-15FF6C816C812603F8017F9138003FF0151F6F7E15071503821501A515035E1507150F4B +-5A153F4AB45A90B65A5E93C7FC5D8182D9F8007FED3FE0151F150F821507A817F8EEF1FC +-A53A3FFF8003FB4801C0EBFFF8B56C7E17F06C496C13E06C49EB7FC0C9EA1F002E397FB7 +-30>I<90390FF803C0D97FFF13E048B512C74814F74814FF5A381FF80F383FE001497E48 +-48137F90C7123F5A48141FA2150FA37EED07C06C91C7FC7F7FEA3FF0EA1FFEEBFFF06C13 +-FF6C14E0000114F86C80011F13FF01031480D9003F13C014019138007FE0151FED0FF0A2 +-ED07F8A2007C140312FEA56C140716F07F6DEB0FE06D131F01F8EB3FC001FF13FF91B512 +-80160000FD5CD8FC7F13F8D8F81F5BD878011380253A7BB830>I<003FB712C04816E0B8 +-FCA43AFE003F800FA8007CED07C0C791C7FCB3B1011FB5FC4980A46D91C7FC2B387EB730 +->I<3B7FFFC007FFFCB56C4813FEA46C496C13FCD803F8C7EA3F80B3B16D147F00011600 +-A36C6C14FE6D13016D5CEC800390393FE00FF890391FF83FF06DB55A6D5C6D5C6D91C7FC +-9038007FFCEC1FF02F3980B730>III<3A3FFF01FFF84801837F02C77FA202835B6C01015B3A01FC007F806D91C7 +-FC00005C6D5BEB7F01EC81FCEB3F8314C3011F5B14E7010F5B14FF6D5BA26D5BA26D5BA2 +-6D90C8FCA4497FA2497FA2815B81EB0FE781EB1FC381EB3F8181EB7F0081497F49800001 +-143F49800003141F49800007140FD87FFEEB7FFFB590B5128080A25C6C486D130029387D +-B730>II<001FB612FC4815FE5AA490C7EA03FCED07F816F0150FED1FE016C0153F +-ED7F80003E1500C85A4A5A5D14034A5A5D140F4A5A5D143F4A5A92C7FC5C495A5C130349 +-5A5C130F495A5C133F495A91C8FC5B4848147C4914FE1203485A5B120F485A5B123F485A +-90B6FCB7FCA46C15FC27387CB730>I<007FB5FCB61280A4150048C8FCB3B3B3A5B6FC15 +-80A46C140019476DBE30>I<007FB5FCB61280A47EC7123FB3B3B3A5007FB5FCB6FCA46C +-140019477DBE30>93 D<1307EB1FC0EB7FF0497E000313FE000FEBFF80003F14E0D87FFD +-13F039FFF07FF8EBC01FEB800F38FE0003007CEB01F00010EB00401D0E77B730>I<007F +-B612F0A2B712F8A36C15F0A225077B7D30>I97 +-DII<913801FFE04A7F5C +-A28080EC0007AAEB03FE90381FFF874913E790B6FC5A5A481303380FFC00D81FF0133F49 
+-131F485A150F4848130790C7FCA25AA25AA87E6C140FA27F003F141F6D133F6C7E6D137F +-390FF801FF2607FE07EBFFC06CB712E06C16F06C14F76D01C713E0011F010313C0D907FC +-C8FC2C397DB730>I<49B4FC010713E0011F13F8017F7F90B57E488048018113803A07FC +-007FC04848133FD81FE0EB1FE0150F484814F0491307127F90C7FCED03F85A5AB7FCA516 +-F048C9FC7E7EA27F003FEC01F06DEB03F86C7E6C7E6D1307D807FEEB1FF03A03FFC07FE0 +-6C90B5FC6C15C0013F14806DEBFE00010713F8010013C0252A7CA830>IIII< +-14E0EB03F8A2497EA36D5AA2EB00E091C8FCA9381FFFF8487F5AA27E7EEA0001B3A9003F +-B612C04815E0B7FCA27E6C15C023397AB830>III<387FFFF8B57EA47EEA0001B3B3A8007FB612F0B712F8A46C15F025387BB7 +-30>I<02FC137E3B7FC3FF01FF80D8FFEF01877F90B500CF7F15DF92B57E6C010F138726 +-07FE07EB03F801FC13FE9039F803FC01A201F013F8A301E013F0B3A23C7FFE0FFF07FF80 +-B548018F13C0A46C486C01071380322881A730>II< +-49B4FC010F13E0013F13F8497F90B57E0003ECFF8014013A07FC007FC04848EB3FE0D81F +-E0EB0FF0A24848EB07F8491303007F15FC90C71201A300FEEC00FEA86C14016C15FCA26D +-1303003F15F86D13076D130F6C6CEB1FF06C6CEB3FE06D137F3A07FF01FFC06C90B51280 +-6C15006C6C13FC6D5B010F13E0010190C7FC272A7CA830>II<49B413F8010FEBC1FC013F13F14913FD48B6FC5A4813 +-81390FFC007F49131F4848130F491307485A491303127F90C7FC15015A5AA77E7E15037F +-A26C6C1307150F6C6C131F6C6C133F01FC137F3907FF01FF6C90B5FC6C14FD6C14F9013F +-13F1010F13C1903803FE0190C7FCAD92B512F84A14FCA46E14F82E3C7DA730>II<90381FFC1E48B5129F000714FF5A5A5A387FF007EB800100FEC7FC4880A46C143E +-007F91C7FC13E06CB4FC6C13FC6CEBFF806C14E0000114F86C6C7F01037F9038000FFF02 +-001380007C147F00FEEC1FC0A2150F7EA27F151F6DEB3F806D137F9039FC03FF0090B6FC +-5D5D00FC14F0D8F83F13C026780FFEC7FC222A79A830>III<3B3F +-FFC07FFF80486DB512C0B515E0A26C16C06C496C13803B01F80003F000A26D130700005D +-A26D130F017E5CA2017F131F6D5CA2EC803F011F91C7FCA26E5A010F137EA2ECE0FE0107 +-5BA214F101035BA3903801FBF0A314FF6D5BA36E5A6E5A2B277EA630>I<3B3FFFC01FFF +-E0486D4813F0B515F8A26C16F06C496C13E0D807E0C7EA3F00A26D5C0003157EA56D14FE +-00015DEC0F80EC1FC0EC3FE0A33A00FC7FF1F8A2147DA2ECFDF9017C5C14F8A3017E13FB 
+-A290393FF07FE0A3ECE03FA2011F5C90390F800F802D277FA630>I<3A3FFF81FFFC4801 +-C37FB580A26C5D6C01815BC648C66CC7FC137FEC80FE90383F81FC90381FC3F8EB0FE3EC +-E7F06DB45A6D5B7F6D5B92C8FC147E147F5C497F81903803F7E0EB07E790380FE3F0ECC1 +-F890381F81FC90383F80FE90387F007E017E137F01FE6D7E48486D7E267FFF80B5FCB500 +-C1148014E3A214C16C0180140029277DA630>I<3B3FFFC07FFF80486DB512C0B515E0A2 +-6C16C06C496C13803B01FC0003F000A2000014076D5C137E150F017F5C7F151FD91F805B +-A214C0010F49C7FCA214E00107137EA2EB03F0157C15FCEB01F85DA2EB00F9ECFDF0147D +-147FA26E5AA36E5AA35DA2143F92C8FCA25C147EA2000F13FE486C5AEA3FC1EBC3F81387 +-EB8FF0EBFFE06C5B5C6C90C9FC6C5AEA01F02B3C7EA630>I<001FB612FC4815FE5AA316 +-FC90C7EA0FF8ED1FF0ED3FE0ED7FC0EDFF80003E491300C7485A4A5A4A5A4A5A4A5A4A5A +-4A5A4990C7FC495A495A495A495A495A495A4948133E4890C7127F485A485A485A485A48 +-5A48B7FCB8FCA46C15FE28277DA630>II< +-127CA212FEB3B3B3AD127CA207476CBE30>II<017C13 +-3848B4137C48EB80FE4813C14813C348EBEFFC397FEFFFF0D8FF8713E0010713C0486C13 +-80D87C0113003838007C1F0C78B730>I E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fk cmbx12 14.4 49 +-/Fk 49 122 df12 D45 DI<157815FC14031407141F14FF130F00 +-07B5FCB6FCA2147F13F0EAF800C7FCB3B3B3A6007FB712FEA52F4E76CD43>49 +-DI<91380FFFC091B512FC0107ECFF80011F15E09026 +-3FF8077F9026FF800113FC4848C76C7ED803F86E7E491680D807FC8048B416C080486D15 +-E0A4805CA36C17C06C5B6C90C75AD801FC1680C9FC4C13005FA24C5A4B5B4B5B4B13C04B +-5BDBFFFEC7FC91B512F816E016FCEEFF80DA000713E0030113F89238007FFE707E701380 +-7013C018E07013F0A218F8A27013FCA218FEA2EA03E0EA0FF8487E487E487EB57EA318FC +-A25E18F891C7FC6C17F0495C6C4816E001F04A13C06C484A1380D80FF84A13006CB44A5A +-6CD9F0075BC690B612F06D5D011F1580010302FCC7FCD9001F1380374F7ACD43>I<177C +-17FEA2160116031607160FA2161F163F167FA216FF5D5DA25D5DED1FBFED3F3F153E157C +-15FCEC01F815F0EC03E01407EC0FC01580EC1F005C147E147C5C1301495A495A5C495A13 +-1F49C7FC133E5B13FC485A5B485A1207485A485A90C8FC123E127E5ABA12C0A5C96C48C7 +-FCAF020FB712C0A53A4F7CCE43>III<121F7F7FEB 
+-FF8091B81280A45A1900606060A2606060485F0180C86CC7FC007EC95A4C5A007C4B5A5F +-4C5A160F4C5A484B5A4C5A94C8FC16FEC812014B5A5E4B5A150F4B5AA24B5AA24B5A15FF +-A24A90C9FCA25C5D1407A2140FA25D141FA2143FA4147F5DA314FFA55BAC6D5BA2EC3FC0 +-6E5A395279D043>I<913807FFC0027F13FC0103B67E010F15E090261FFC0113F8903A3F +-E0003FFCD97F80EB0FFE49C76C7E48488048486E1380000717C04980120F18E0177FA212 +-1F7FA27F7F6E14FF02E015C014F802FE4913806C7FDBC00313009238F007FE6C02F85B92 +-38FE1FF86C9138FFBFF06CEDFFE017806C4BC7FC6D806D81010F15E06D81010115FC0107 +-81011F81491680EBFFE748018115C048D9007F14E04848011F14F048487F484813030300 +-14F8484880161F4848020713FC1601824848157F173FA2171FA2170FA218F8A27F007F17 +-F06D151FA26C6CED3FE0001F17C06D157F6C6CEDFF806C6C6C010313006C01E0EB0FFE6C +-01FCEBFFFC6C6CB612F06D5D010F1580010102FCC7FCD9000F13C0364F7ACD43>I<9138 +-0FFF8091B512F8010314FE010F6E7E4901037F90267FF8007F4948EB3FF048496D7E4849 +-80486F7E484980824817805A91C714C05A7013E0A218F0B5FCA318F8A618FCA46C5DA37E +-A25E6C7F6C5DA26C5D6C7F6C6D137B6C6D13F390387FF803011FB512E36D14C301030283 +-13F89039007FFE03EC00401500A218F05EA3D801F816E0487E486C16C0487E486D491380 +-A218005E5F4C5A91C7FC6C484A5A494A5A49495B6C48495BD803FC010F5B9027FF807FFE +-C7FC6C90B55A6C6C14F06D14C0010F49C8FC010013F0364F7ACD43>I<91B5FC010F14F8 +-017F14FF90B712C00003D9C00F7F2707FC00017FD80FE06D7F48486E7E48C87FD87FE06E +-7E7F7F486C1680A66C5A18006C485C6C5AC9485A5F4B5B4B5B4B5B4B5B4B90C7FC16FC4B +-5A4B5A16C04B5A93C8FC4A5A5D14035D5D14075DA25D140FA25DAB91CAFCAAEC1FC04A7E +-ECFFF8497FA2497FA76D5BA26D5BEC3FE06E5A315479D340>63 D68 DI +-I72 D +-I<027FB71280A591C76C90C7FCB3B3B3EA07F0EA1FFC487E487EA2B57EA44C5AA34A485B +-7E49495BD83FF8495BD81FE05DD80FFC011F5B2707FF807F90C8FC000190B512FC6C6C14 +-F0011F14C0010101F8C9FC39537DD145>I76 DI80 +-D82 D<91260FFF80130791B5 +-00F85B010702FF5B011FEDC03F49EDF07F9026FFFC006D5A4801E0EB0FFD4801800101B5 +-FC4848C87E48488149150F001F824981123F4981007F82A28412FF84A27FA26D82A27F7F 
+-6D93C7FC14C06C13F014FF15F86CECFF8016FC6CEDFFC017F06C16FC6C16FF6C17C06C83 +-6C836D826D82010F821303010082021F16801400030F15C0ED007F040714E01600173F05 +-0F13F08383A200788200F882A3187FA27EA219E07EA26CEFFFC0A27F6D4B13806D17006D +-5D01FC4B5A01FF4B5A02C04A5A02F8EC7FF0903B1FFFC003FFE0486C90B65AD8FC0393C7 +-FC48C66C14FC48010F14F048D9007F90C8FC3C5479D24B>I<003FBC1280A59126C0003F +-9038C0007F49C71607D87FF8060113C001E08449197F49193F90C8171FA2007E1A0FA300 +-7C1A07A500FC1BE0481A03A6C994C7FCB3B3AC91B912F0A553517BD05E>II87 +-D97 +-DI<913801FFF8021FEBFF8091B612F0010315FC010F9038C00FFE903A1FFE0001 +-FFD97FFC491380D9FFF05B4817C048495B5C5A485BA2486F138091C7FC486F1300705A48 +-92C8FC5BA312FFAD127F7FA27EA2EF03E06C7F17076C6D15C07E6E140F6CEE1F806C6DEC +-3F006C6D147ED97FFE5C6D6CEB03F8010F9038E01FF0010390B55A01001580023F49C7FC +-020113E033387CB63C>I<4DB47E0407B5FCA5EE001F1707B3A4913801FFE0021F13FC91 +-B6FC010315C7010F9038E03FE74990380007F7D97FFC0101B5FC49487F4849143F484980 +-485B83485B5A91C8FC5AA3485AA412FFAC127FA36C7EA37EA26C7F5F6C6D5C7E6C6D5C6C +-6D49B5FC6D6C4914E0D93FFED90FEFEBFF80903A0FFFC07FCF6D90B5128F0101ECFE0FD9 +-003F13F8020301C049C7FC41547CD24B>I<913803FFC0023F13FC49B6FC010715C04901 +-817F903A3FFC007FF849486D7E49486D7E4849130F48496D7E48178048497F18C0488191 +-C7FC4817E0A248815B18F0A212FFA490B8FCA318E049CAFCA6127FA27F7EA218E06CEE01 +-F06E14037E6C6DEC07E0A26C6DEC0FC06C6D141F6C6DEC3F806D6CECFF00D91FFEEB03FE +-903A0FFFC03FF8010390B55A010015C0021F49C7FC020113F034387CB63D>IIII<137F497E +-000313E0487FA2487FA76C5BA26C5BC613806DC7FC90C8FCADEB3FF0B5FCA512017EB3B3 +-A6B612E0A51B547BD325>I +-107 DIII<913801FFE0021F13FE91B612C0010315F0010F9038 +-807FFC903A1FFC000FFED97FF86D6C7E49486D7F48496D7F48496D7F4A147F48834890C8 +-6C7EA24883A248486F7EA3007F1880A400FF18C0AC007F1880A3003F18006D5DA26C5FA2 +-6C5F6E147F6C5F6C6D4A5A6C6D495B6C6D495B6D6C495BD93FFE011F90C7FC903A0FFF80 +-7FFC6D90B55A010015C0023F91C8FC020113E03A387CB643>I<903A3FF001FFE0B5010F 
+-13FE033FEBFFC092B612F002F301017F913AF7F8007FFE0003D9FFE0EB1FFFC602806D7F +-92C76C7F4A824A6E7F4A6E7FA2717FA285187F85A4721380AC1A0060A36118FFA2615F61 +-6E4A5BA26E4A5B6E4A5B6F495B6F4990C7FC03F0EBFFFC9126FBFE075B02F8B612E06F14 +-80031F01FCC8FC030313C092CBFCB1B612F8A5414D7BB54B>I<90397FE003FEB590380F +-FF80033F13E04B13F09238FE1FF89139E1F83FFC0003D9E3E013FEC6ECC07FECE78014EF +-150014EE02FEEB3FFC5CEE1FF8EE0FF04A90C7FCA55CB3AAB612FCA52F367CB537>114 +-D<903903FFF00F013FEBFE1F90B7FC120348EB003FD80FF81307D81FE0130148487F4980 +-127F90C87EA24881A27FA27F01F091C7FC13FCEBFFC06C13FF15F86C14FF16C06C15F06C +-816C816C81C681013F1580010F15C01300020714E0EC003F030713F015010078EC007F00 +-F8153F161F7E160FA27E17E07E6D141F17C07F6DEC3F8001F8EC7F0001FEEB01FE9039FF +-C00FFC6DB55AD8FC1F14E0D8F807148048C601F8C7FC2C387CB635>I<143EA6147EA414 +-FEA21301A313031307A2130F131F133F13FF5A000F90B6FCB8FCA426003FFEC8FCB3A9EE +-07C0AB011FEC0F8080A26DEC1F0015806DEBC03E6DEBF0FC6DEBFFF86D6C5B021F5B0203 +-13802A4D7ECB34>IIII121 D E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fl cmr10 10.95 86 +-/Fl 86 124 df<4AB4EB0FE0021F9038E03FFC913A7F00F8FC1ED901FC90383FF03FD907 +-F090397FE07F80494801FF13FF4948485BD93F805C137F0200ED7F00EF003E01FE6D91C7 +-FC82ADB97EA3C648C76CC8FCB3AE486C4A7E007FD9FC3FEBFF80A339407FBF35>11 +-D<4AB4FC021F13C091387F01F0903901FC0078D907F0131C4948133E494813FF49485A13 +-7F1400A213FE6F5A163893C7FCAA167FB8FCA33900FE00018182B3AC486CECFF80007FD9 +-FC3F13FEA32F407FBF33>I<4AB47E021F13F791387F00FFEB01F8903807F001EB0FE0EB +-1FC0EB3F80137F14008101FE80AEB8FCA3C648C77EB3AE486CECFF80007FD9FC3F13FEA3 +-2F407FBF33>I<4AB4ECFF80021FD9C00F13E0913B7F01F03F80F8903C01F80078FE003C +-D907F0D93FF8130E49484948131F49484948EB7F804948484913FF137F02005CA201FE92 +-C7FC6FED7F0070141C96C7FCAAF13F80BBFCA3C648C76CC7FC197F193FB3AC486C4A6CEB +-7FC0007FD9FC3FD9FE1FB5FCA348407FBF4C>I<121EEA7F80EAFFC0A9EA7F80ACEA3F00 +-AC121EAB120CC7FCA8121EEA7F80A2EAFFC0A4EA7F80A2EA1E000A4179C019>33 
+-D<001E130F397F803FC000FF137F01C013E0A201E013F0A3007F133F391E600F30000013 +-00A401E01370491360A3000114E04913C00003130101001380481303000EEB070048130E +-0018130C0038131C003013181C1C7DBE2D>I<013F4C7ED9FFC04B7E2601E0E015072607 +-C070150F48486C4B5A023E4BC7FC48486C5D48D90FC0EB01FE003ED90EF0EB07FCDA0F3F +-133E007E903A070FFFF8F8007C0200EBC1F0EE000300FC6D6C495A604D5A171F95C8FC17 +-3E177E177C5F16015F007C4948485A1607007E5E003E49495A020E131F003F93C9FC6C49 +-133E260F803C137E0238137C6C6C485B3901E0E0016CB448485AD93F0049133F90C74848 +-EBFFC0030F903801E0E093398007C0704B4848487E4B153C033E90381F001C4B497F03FC +-133E4B150F4A48017E7F0203147C5D4A4801FCEB0380140F5D4AC7FC5C143E5C14FC5C49 +-5A13034948027CEB07005C4948147E011F033E5B91C8140E013E153F017E6F5B017C9238 +-0F803C4917380001706C5A49923801E0E0496FB45A6C48043FC7FC41497BC34C>37 +-DI<121EEA7F8012FF13C0A213E0A3127FEA1E601200A413E013C0A3120113801203 +-13005A120E5A1218123812300B1C79BE19>I<1430147014E0EB01C0EB03801307EB0F00 +-131E133E133C5B13F85B12015B1203A2485AA2120F5BA2121F90C7FCA25AA3123E127EA6 +-127C12FCB2127C127EA6123E123FA37EA27F120FA27F1207A26C7EA212017F12007F1378 +-7F133E131E7FEB07801303EB01C0EB00E014701430145A77C323>I<12C07E12707E7E12 +-1E7E6C7E7F12036C7E7F12007F1378137CA27FA2133F7FA21480130FA214C0A3130714E0 +-A6130314F0B214E01307A614C0130FA31480A2131F1400A25B133EA25BA2137813F85B12 +-015B485A12075B48C7FC121E121C5A5A5A5A145A7BC323>II<121EEA7F8012FF13C0A213 +-E0A3127FEA1E601200A413E013C0A312011380120313005A120E5A1218123812300B1C79 +-8919>44 DI<121EEA7F80A2EAFFC0A4EA7F80A2EA1E000A0A79 +-8919>IIIIII<150E151E153EA2157EA215FE1401A21403EC +-077E1406140E141CA214381470A214E0EB01C0A2EB0380EB0700A2130E5BA25B5BA25B5B +-1201485A90C7FC5A120E120C121C5AA25A5AB8FCA3C8EAFE00AC4A7E49B6FCA3283E7EBD +-2D>I<00061403D80780131F01F813FE90B5FC5D5D5D15C092C7FC14FCEB3FE090C9FCAC +-EB01FE90380FFF8090383E03E090387001F8496C7E49137E497F90C713800006141FC813 +-C0A216E0150FA316F0A3120C127F7F12FFA416E090C7121F12FC007015C012780038EC3F 
+-80123C6CEC7F00001F14FE6C6C485A6C6C485A3903F80FE0C6B55A013F90C7FCEB07F824 +-3F7CBC2D>II<1238123C123F90B6 +-12FCA316F85A16F016E00078C712010070EC03C0ED078016005D48141E151C153C5DC812 +-7015F04A5A5D14034A5A92C7FC5C141EA25CA2147C147814F8A213015C1303A31307A313 +-0F5CA2131FA6133FAA6D5A0107C8FC26407BBD2D>III<12 +-1EEA7F80A2EAFFC0A4EA7F80A2EA1E00C7FCB3121EEA7F80A2EAFFC0A4EA7F80A2EA1E00 +-0A2779A619>I<121EEA7F80A2EAFFC0A4EA7F80A2EA1E00C7FCB3121E127FEAFF80A213 +-C0A4127F121E1200A412011380A3120313005A1206120E120C121C5A1230A20A3979A619 +->I<007FB912E0BA12F0A26C18E0CDFCAE007FB912E0BA12F0A26C18E03C167BA147>61 +-D63 D<15074B7EA34B7EA34B7EA34B7EA34B7E15E7A2913801C7FC15C3A291380381 +-FEA34AC67EA3020E6D7EA34A6D7EA34A6D7EA34A6D7EA34A6D7EA349486D7E91B6FCA249 +-819138800001A249C87EA24982010E157FA2011E82011C153FA2013C820138151FA20178 +-82170F13FC00034C7ED80FFF4B7EB500F0010FB512F8A33D417DC044>65 +-DII +-IIII< +-B6D8C01FB512F8A3000101E0C7383FFC0026007F80EC0FF0B3A691B7FCA30280C7120FB3 +-A92601FFE0EC3FFCB6D8C01FB512F8A33D3E7DBD44>II<011FB512FCA3D9000713006E5A1401B3B3A6123FEA +-7F80EAFFC0A44A5A1380D87F005B007C130700385C003C495A6C495A6C495A2603E07EC7 +-FC3800FFF8EB3FC026407CBD2F>IIIIIII +-III<003FB91280A3903AF0 +-007FE001018090393FC0003F48C7ED1FC0007E1707127C00781703A300701701A548EF00 +-E0A5C81600B3B14B7E4B7E0107B612FEA33B3D7DBC42>IIII<007F +-B5D8C003B512E0A3C649C7EBFC00D93FF8EC3FE06D48EC1F806D6C92C7FC171E6D6C141C +-6D6C143C5F6D6C14706D6D13F04C5ADA7FC05B023F13036F485ADA1FF090C8FC020F5BED +-F81E913807FC1C163C6E6C5A913801FF7016F06E5B6F5AA26F7E6F7EA28282153FED3BFE +-ED71FF15F103E07F913801C07F0203804B6C7EEC07004A6D7E020E6D7E5C023C6D7E0238 +-6D7E14784A6D7E4A6D7F130149486E7E4A6E7E130749C86C7E496F7E497ED9FFC04A7E00 +-076DEC7FFFB500FC0103B512FEA33F3E7EBD44>II<003FB712F8A391C7EA1FF013F801E0EC3FE00180EC7FC090C8FC003EED +-FF80A2003C4A1300007C4A5A12784B5A4B5AA200704A5AA24B5A4B5AA2C8485A4A90C7FC +-A24A5A4A5AA24A5AA24A5A4A5AA24A5A4A5AA24990C8FCA2495A4948141CA2495A495AA2 
+-495A495A173C495AA24890C8FC485A1778485A484815F8A24848140116034848140F4848 +-143FED01FFB8FCA32E3E7BBD38>I +-I<486C13C00003130101001380481303000EEB070048130E0018130C0038131C00301318 +-0070133800601330A300E01370481360A400CFEB678039FFC07FE001E013F0A3007F133F +-A2003F131F01C013E0390F0007801C1C73BE2D>II97 +-DI<49B4FC010F13E090383F00F8017C131E4848131F +-4848137F0007ECFF80485A5B121FA24848EB7F00151C007F91C7FCA290C9FC5AAB6C7EA3 +-003FEC01C07F001F140316806C6C13076C6C14000003140E6C6C131E6C6C137890383F01 +-F090380FFFC0D901FEC7FC222A7DA828>II +-II<167C903903F801 +-FF903A1FFF078F8090397E0FDE1F9038F803F83803F001A23B07E000FC0600000F6EC7FC +-49137E001F147FA8000F147E6D13FE00075C6C6C485AA23901F803E03903FE0FC026071F +-FFC8FCEB03F80006CAFC120EA3120FA27F7F6CB512E015FE6C6E7E6C15E06C810003813A +-0FC0001FFC48C7EA01FE003E140048157E825A82A46C5D007C153E007E157E6C5D6C6C49 +-5A6C6C495AD803F0EB0FC0D800FE017FC7FC90383FFFFC010313C0293D7EA82D>III<1478EB01FEA2EB03FFA4EB01FEA2EB00781400AC147FEB7FFFA313 +-017F147FB3B3A5123E127F38FF807E14FEA214FCEB81F8EA7F01387C03F0381E07C0380F +-FF803801FC00185185BD1C>II +-I<2701F801FE14FF00FF902707FFC00313E0913B1E07E00F03F0913B7803F03C01F80007 +-903BE001F87000FC2603F9C06D487F000101805C01FBD900FF147F91C75B13FF4992C7FC +-A2495CB3A6486C496CECFF80B5D8F87FD9FC3F13FEA347287DA74C>I<3901F801FE00FF +-903807FFC091381E07E091387803F000079038E001F82603F9C07F0001138001FB6D7E91 +-C7FC13FF5BA25BB3A6486C497EB5D8F87F13FCA32E287DA733>I<14FF010713E090381F +-81F890387E007E01F8131F4848EB0F804848EB07C04848EB03E0000F15F04848EB01F8A2 +-003F15FCA248C812FEA44815FFA96C15FEA36C6CEB01FCA3001F15F86C6CEB03F0A26C6C +-EB07E06C6CEB0FC06C6CEB1F80D8007EEB7E0090383F81FC90380FFFF0010090C7FC282A +-7EA82D>I<3901FC03FC00FF90381FFF8091387C0FE09039FDE003F03A03FFC001FC6C49 +-6C7E91C7127F49EC3F805BEE1FC017E0A2EE0FF0A3EE07F8AAEE0FF0A4EE1FE0A2EE3FC0 +-6D1580EE7F007F6E13FE9138C001F89039FDE007F09039FC780FC0DA3FFFC7FCEC07F891 +-C9FCAD487EB512F8A32D3A7EA733>I<02FF131C0107EBC03C90381F80F090397F00387C 
+-01FC131CD803F8130E4848EB0FFC150748481303121F485A1501485AA448C7FCAA6C7EA3 +-6C7EA2001F14036C7E15076C6C130F6C7E6C6C133DD8007E137990383F81F190380FFFC1 +-903801FE0190C7FCAD4B7E92B512F8A32D3A7DA730>I<3901F807E000FFEB1FF8EC787C +-ECE1FE3807F9C100031381EA01FB1401EC00FC01FF1330491300A35BB3A5487EB512FEA3 +-1F287EA724>I<90383FC0603901FFF8E03807C03F381F000F003E1307003C1303127C00 +-78130112F81400A27E7E7E6D1300EA7FF8EBFFC06C13F86C13FE6C7F6C1480000114C0D8 +-003F13E0010313F0EB001FEC0FF800E01303A214017E1400A27E15F07E14016C14E06CEB +-03C0903880078039F3E01F0038E0FFFC38C01FE01D2A7DA824>I<131CA6133CA4137CA2 +-13FCA2120112031207001FB512C0B6FCA2D801FCC7FCB3A215E0A912009038FE01C0A2EB +-7F03013F138090381F8700EB07FEEB01F81B397EB723>IIIIII<001FB61280A2EBE0000180140049485A001E495A121C4A5A003C49 +-5A141F00385C4A5A147F5D4AC7FCC6485AA2495A495A130F5C495A90393FC00380A2EB7F +-80EBFF005A5B484813071207491400485A48485BA248485B4848137F00FF495A90B6FCA2 +-21277EA628>II E +-%EndDVIPSBitmapFont +-%DVIPSBitmapFont: Fm cmbx12 20.736 9 +-/Fm 9 123 df<92380FFFE04AB67E020F15F0027F15FE49B87E4917E0010F17F8013F83 +-49D9C01F14FF9027FFFC0001814801E06D6C80480180021F804890C86C8048486F804848 +-6F8001FF6F804801C06E8002F081486D18806E816E18C0B5821BE06E81A37214F0A56C5B +-A36C5B6C5B6C5B000313C0C690C9FC90CA15E060A34E14C0A21B80601B0060626295B55A +-5F624D5C624D5C4D91C7FC614D5B4D13F04D5B6194B55A4C49C8FC4C5B4C5B4C13E04C5B +-604C90C9FCEE7FFC4C5A4B5B4B5B4B0180EC0FF04B90C8FC4B5A4B5A4B48ED1FE0EDFFE0 +-4A5B4A5B4A90C9FC4A48163F4A5ADA3FF017C05D4A48167F4A5A4990CA12FFD903FC1607 +-49BAFC5B4919805B5B90BBFC5A5A5A5A481A005A5ABCFCA462A44C7176F061>50 +-D<92383FFFF80207B612E0027F15FC49B87E010717E0011F83499026F0007F13FC4948C7 +-000F7F90B502036D7E486E6D806F6D80727F486E6E7F8486727FA28684A26C5C72806C5C +-6D90C8FC6D5AEB0FF8EB03E090CAFCA70507B6FC041FB7FC0303B8FC157F0203B9FC021F +-ECFE0391B612800103ECF800010F14C04991C7FC017F13FC90B512F04814C0485C4891C8 +-FC485B5A485B5C5A5CA2B5FC5CA360A36E5DA26C5F6E5D187E6C6D846E4A48806C6D4A48 
+-14FC6C6ED90FF0ECFFFC6C02E090263FE07F14FE00019139FC03FFC06C91B6487E013F4B +-487E010F4B1307010303F01301D9003F0280D9003F13FC020101F8CBFC57507ACE5E>97 +-D<903801FFFCB6FCA8C67E131F7FB3ADF0FFFC050FEBFFE0057F14FE0403B77E040F16E0 +-043F16F84CD9007F13FE9226FDFFF001077F92B500C001018094C86C13E004FC6F7F4C6F +-7F04E06F7F4C6F7F5E747F93C915804B7014C0A27414E0A21DF087A21DF8A31DFC87A41D +-FEAF1DFCA4631DF8A31DF098B5FC1DE0A25014C0A26F1980501400705D705F704B5B505B +-704B5B04FC4B5BDBE7FE92B55A9226C3FF8001035C038101E0011F49C7FC9226807FFC90 +-B55A4B6CB712F04A010F16C04A010393C8FC4A010015F84A023F14C090C9000301F0C9FC +-5F797AF76C>I<97380FFFE00607B6FCA8F00003190086B3AD93383FFF800307B512F803 +-3F14FF4AB712C0020716F0021F16FC027F9039FE007FFE91B500F0EB0FFF010302800101 +-90B5FC4949C87E49498149498149498149498190B548814884484A8192CAFC5AA2485BA2 +-5A5C5AA35A5CA4B5FCAF7EA4807EA37EA2807EA26C7F616C6E5D6C606C80616D6D5D6D6D +-5D6D6D92B67E6D6D4A15FC010301FF0207EDFFFE6D02C0EB3FFE6D6C9039FC01FFF86E90 +-B65A020F16C002031600DA007F14FC030F14E09226007FFEC749C7FC5F797AF76C>100 +-D105 D<903801FFFCB6FCA8C67E131F7FB3B3B3B3B3ABB812C0A82A7879F7 +-35>108 D<902601FFF891380FFFE0B692B512FE05036E7E050F15E0053F15F84D819327 +-01FFF01F7F4CD900077FDC07FC6D80C66CDA0FF06D80011FDA1FC07F6D4A48824CC8FC04 +-7E6F7F5EEDF9F85E03FB707F5E15FF5EA25EA293C9FCA45DB3B3A6B8D8E003B81280A861 +-4E79CD6C>110 D<902601FFFCEC7FFEB6020FB512F0057F14FE4CB712C0040716F0041F +-82047F16FE93B5C66C7F92B500F0010F14C0C66C0380010380011F4AC76C806D4A6E8004 +-F06F7F4C6F7F4C6F7F4C8193C915804B7014C0861DE0A27414F0A27414F8A47513FCA575 +-13FEAF5113FCA598B512F8A31DF0621DE0621DC0621D806F5E701800704B5B505B704B5B +-7092B55A04FC4A5C704A5C706C010F5C05E0013F49C7FC9227FE7FFC01B55A70B712F004 +-0F16C0040393C8FC040015F8053F14C0050301F0C9FC94CCFCB3A6B812E0A85F6F7ACD6C +->112 D<0007BA12FC1AFEA503E0C714FC4AC74814F84801F04A14F05C02804A14E091C8 +-4814C04D14805B494B14004D5B4992B55AA24C5C494A5C615E4C5C001F4B5C5B4C91C7FC 
+-4C5B93B55AA24B5CC8485C4B5CA24B5C4B5C4B91C8FCA24B5B92B55AA24A5C4A5C4A4A14 +-FFA24A5C4A5C4A91C8FC614A4915FE91B55A495CA2495C494A14035E5B495C4991C81207 +-A24949ED0FFC90B55A484A151FA2484A153F484A157F484A15FF1803484A140F4891C812 +-3F48490207B5FC91B9FCBB12F8A57E484D7BCC56>122 D E +-%EndDVIPSBitmapFont +-end +-%%EndProlog +-%%BeginSetup +-%%Feature: *Resolution 600dpi +-TeXDict begin +-%%PaperSize: A4 +- +-%%EndSetup +-%%Page: 1 1 +-1 0 bop 150 1318 a Fm(bzip2)64 b(and)g(libbzip2)p 150 +-1418 3600 34 v 2010 1515 a Fl(a)31 b(program)f(and)g(library)e(for)i +-(data)h(compression)2198 1623 y(cop)m(yrigh)m(t)f(\(C\))h(1996-2000)j +-(Julian)28 b(Sew)m(ard)2605 1731 y(v)m(ersion)i(1.0)h(of)g(21)g(Marc)m +-(h)g(2000)150 5091 y Fk(Julian)46 b(Sew)l(ard)p 150 5141 +-3600 17 v eop +-%%Page: 1 2 +-1 1 bop 3705 -116 a Fl(1)150 299 y(This)24 b(program,)j +-Fj(bzip2)p Fl(,)e(and)g(asso)s(ciated)i(library)c Fj(libbzip2)p +-Fl(,)i(are)h(Cop)m(yrigh)m(t)g(\(C\))g(1996-2000)j(Julian)150 +-408 y(R)h(Sew)m(ard.)40 b(All)29 b(righ)m(ts)h(reserv)m(ed.)150 +-565 y(Redistribution)f(and)i(use)h(in)f(source)h(and)g(binary)e(forms,) +-j(with)e(or)h(without)f(mo)s(di\014cation,)g(are)i(p)s(er-)150 +-675 y(mitted)d(pro)m(vided)f(that)i(the)f(follo)m(wing)f(conditions)g +-(are)i(met:)225 832 y Fi(\017)60 b Fl(Redistributions)26 +-b(of)k(source)g(co)s(de)g(m)m(ust)g(retain)f(the)h(ab)s(o)m(v)m(e)h +-(cop)m(yrigh)m(t)g(notice,)f(this)f(list)f(of)i(con-)330 +-941 y(ditions)e(and)i(the)h(follo)m(wing)e(disclaimer.)225 +-1076 y Fi(\017)60 b Fl(The)33 b(origin)f(of)h(this)f(soft)m(w)m(are)j +-(m)m(ust)e(not)h(b)s(e)e(misrepresen)m(ted;)i(y)m(ou)g(m)m(ust)f(not)g +-(claim)g(that)h(y)m(ou)330 1185 y(wrote)d(the)h(original)d(soft)m(w)m +-(are.)44 b(If)31 b(y)m(ou)g(use)g(this)f(soft)m(w)m(are)i(in)e(a)h(pro) +-s(duct,)g(an)f(ac)m(kno)m(wledgmen)m(t)330 1295 y(in)f(the)i(pro)s +-(duct)e(do)s(cumen)m(tation)h(w)m(ould)f(b)s(e)h(appreciated)g(but)g +-(is)f(not)i(required.)225 1429 y Fi(\017)60 b Fl(Altered)21 
+-b(source)g(v)m(ersions)f(m)m(ust)h(b)s(e)f(plainly)e(mark)m(ed)j(as)g +-(suc)m(h,)i(and)d(m)m(ust)h(not)g(b)s(e)f(misrepresen)m(ted)330 +-1539 y(as)31 b(b)s(eing)e(the)h(original)f(soft)m(w)m(are.)225 +-1674 y Fi(\017)60 b Fl(The)27 b(name)h(of)f(the)h(author)f(ma)m(y)h +-(not)g(b)s(e)f(used)g(to)h(endorse)f(or)h(promote)g(pro)s(ducts)e +-(deriv)m(ed)g(from)330 1783 y(this)j(soft)m(w)m(are)j(without)d(sp)s +-(eci\014c)h(prior)e(written)i(p)s(ermission.)150 1965 +-y(THIS)37 b(SOFTW)-10 b(ARE)38 b(IS)f(PR)m(O)m(VIDED)i(BY)g(THE)f(A)m +-(UTHOR)g(\\AS)g(IS")g(AND)h(ANY)f(EXPRESS)150 2074 y(OR)31 +-b(IMPLIED)h(W)-10 b(ARRANTIES,)31 b(INCLUDING,)i(BUT)f(NOT)f(LIMITED)g +-(TO,)h(THE)f(IMPLIED)150 2184 y(W)-10 b(ARRANTIES)27 +-b(OF)h(MER)m(CHANT)-8 b(ABILITY)28 b(AND)g(FITNESS)f(F)m(OR)g(A)h(P)-8 +-b(AR)g(TICULAR)28 b(PUR-)150 2294 y(POSE)37 b(ARE)g(DISCLAIMED.)h(IN)f +-(NO)h(EVENT)f(SHALL)g(THE)g(A)m(UTHOR)h(BE)g(LIABLE)g(F)m(OR)150 +-2403 y(ANY)56 b(DIRECT,)f(INDIRECT,)h(INCIDENT)-8 b(AL,)56 +-b(SPECIAL,)e(EXEMPLAR)-8 b(Y,)57 b(OR)e(CONSE-)150 2513 +-y(QUENTIAL)48 b(D)m(AMA)m(GES)i(\(INCLUDING,)g(BUT)f(NOT)f(LIMITED)g +-(TO,)g(PR)m(OCUREMENT)150 2622 y(OF)35 b(SUBSTITUTE)e(GOODS)i(OR)f(SER) +--10 b(VICES;)34 b(LOSS)f(OF)i(USE,)g(D)m(A)-8 b(T)g(A,)36 +-b(OR)f(PR)m(OFITS;)f(OR)150 2732 y(BUSINESS)28 b(INTERR)m(UPTION\))g +-(HO)m(WEVER)i(CA)m(USED)f(AND)g(ON)g(ANY)g(THEOR)-8 b(Y)29 +-b(OF)g(LIA-)150 2842 y(BILITY,)36 b(WHETHER)g(IN)g(CONTRA)m(CT,)g +-(STRICT)e(LIABILITY,)i(OR)g(TOR)-8 b(T)35 b(\(INCLUDING)150 +-2951 y(NEGLIGENCE)45 b(OR)g(OTHER)-10 b(WISE\))44 b(ARISING)h(IN)g(ANY) +-h(W)-10 b(A)i(Y)46 b(OUT)e(OF)i(THE)e(USE)h(OF)150 3061 +-y(THIS)29 b(SOFTW)-10 b(ARE,)31 b(EVEN)f(IF)g(AD)m(VISED)i(OF)e(THE)g +-(POSSIBILITY)e(OF)j(SUCH)f(D)m(AMA)m(GE.)150 3218 y(Julian)e(Sew)m +-(ard,)i(Cam)m(bridge,)g(UK.)150 3374 y Fj(jseward@acm.org)150 +-3531 y(http://sourceware.cygnus)o(.com)o(/bzi)o(p2)150 +-3688 y(http://www.cacheprof.org)150 3845 y(http://www.muraroa.demon)o 
+-(.co.)o(uk)150 4002 y(bzip2)p Fl(/)p Fj(libbzip2)d Fl(v)m(ersion)j(1.0) +-i(of)e(21)h(Marc)m(h)g(2000.)150 4159 y(P)-8 b(A)g(TENTS:)40 +-b(T)-8 b(o)40 b(the)g(b)s(est)g(of)g(m)m(y)g(kno)m(wledge,)j +-Fj(bzip2)38 b Fl(do)s(es)i(not)g(use)g(an)m(y)g(paten)m(ted)h +-(algorithms.)150 4268 y(Ho)m(w)m(ev)m(er,)33 b(I)e(do)f(not)h(ha)m(v)m +-(e)h(the)f(resources)g(a)m(v)-5 b(ailable)30 b(to)h(carry)g(out)g(a)g +-(full)d(paten)m(t)k(searc)m(h.)42 b(Therefore)150 4378 +-y(I)30 b(cannot)h(giv)m(e)g(an)m(y)g(guaran)m(tee)h(of)e(the)h(ab)s(o)m +-(v)m(e)g(statemen)m(t.)p eop +-%%Page: 2 3 +-2 2 bop 150 -116 a Fl(Chapter)30 b(1:)41 b(In)m(tro)s(duction)2591 +-b(2)150 299 y Fh(1)80 b(In)l(tro)t(duction)150 555 y +-Fj(bzip2)20 b Fl(compresses)h(\014les)f(using)g(the)h(Burro)m +-(ws-Wheeler)g(blo)s(c)m(k-sorting)f(text)j(compression)d(algorithm,)150 +-665 y(and)33 b(Hu\013man)g(co)s(ding.)50 b(Compression)32 +-b(is)h(generally)g(considerably)f(b)s(etter)i(than)f(that)h(ac)m(hiev)m +-(ed)h(b)m(y)150 775 y(more)f(con)m(v)m(en)m(tional)g(LZ77/LZ78-based)g +-(compressors,)g(and)f(approac)m(hes)h(the)f(p)s(erformance)g(of)h(the) +-150 884 y(PPM)c(family)f(of)i(statistical)f(compressors.)150 +-1041 y Fj(bzip2)k Fl(is)h(built)e(on)i(top)h(of)g Fj(libbzip2)p +-Fl(,)e(a)i(\015exible)e(library)f(for)i(handling)e(compressed)i(data)i +-(in)d(the)150 1151 y Fj(bzip2)c Fl(format.)43 b(This)30 +-b(man)m(ual)g(describ)s(es)g(b)s(oth)g(ho)m(w)i(to)g(use)f(the)g +-(program)g(and)g(ho)m(w)g(to)h(w)m(ork)f(with)150 1260 +-y(the)d(library)d(in)m(terface.)41 b(Most)28 b(of)g(the)g(man)m(ual)f +-(is)g(dev)m(oted)i(to)f(this)f(library)-8 b(,)26 b(not)i(the)g +-(program,)g(whic)m(h)150 1370 y(is)h(go)s(o)s(d)i(news)e(if)h(y)m(our)g +-(in)m(terest)h(is)e(only)g(in)h(the)g(program.)150 1527 +-y(Chapter)24 b(2)g(describ)s(es)f(ho)m(w)h(to)h(use)f +-Fj(bzip2)p Fl(;)h(this)e(is)g(the)i(only)e(part)h(y)m(ou)h(need)f(to)h +-(read)f(if)f(y)m(ou)h(just)g(w)m(an)m(t)150 1636 y(to)35 
+-b(kno)m(w)f(ho)m(w)g(to)g(op)s(erate)h(the)f(program.)51 +-b(Chapter)34 b(3)g(describ)s(es)e(the)i(programming)f(in)m(terfaces)h +-(in)150 1746 y(detail,)23 b(and)d(Chapter)h(4)h(records)f(some)h +-(miscellaneous)e(notes)i(whic)m(h)e(I)h(though)m(t)h(ough)m(t)g(to)g(b) +-s(e)f(recorded)150 1855 y(somewhere.)p eop +-%%Page: 3 4 +-3 3 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 +-b(to)g(use)f Fj(bzip2)2375 b Fl(3)150 299 y Fh(2)80 b(Ho)l(w)53 +-b(to)g(use)g Fg(bzip2)150 566 y Fl(This)29 b(c)m(hapter)i(con)m(tains)f +-(a)h(cop)m(y)g(of)g(the)f Fj(bzip2)f Fl(man)h(page,)h(and)f(nothing)g +-(else.)390 818 y Ff(NAME)570 1004 y Fj(bzip2)p Fl(,)f +-Fj(bunzip2)g Fl(-)h(a)h(blo)s(c)m(k-sorting)f(\014le)f(compressor,)i +-(v1.0)570 1136 y Fj(bzcat)e Fl(-)i(decompresses)f(\014les)f(to)i +-(stdout)570 1267 y Fj(bzip2recover)c Fl(-)k(reco)m(v)m(ers)h(data)f +-(from)f(damaged)g(bzip2)g(\014les)390 1519 y Ff(SYNOPSIS)570 +-1706 y Fj(bzip2)f Fl([)h(-cdfkqstvzVL123456789)35 b(])c([)g +-(\014lenames)e(...)41 b(])570 1837 y Fj(bunzip2)28 b +-Fl([)j(-fkvsVL)f(])h([)f(\014lenames)g(...)41 b(])570 +-1968 y Fj(bzcat)29 b Fl([)h(-s)h(])g([)f(\014lenames)g(...)41 +-b(])570 2100 y Fj(bzip2recover)27 b Fl(\014lename)390 +-2352 y Ff(DESCRIPTION)390 2538 y Fj(bzip2)i Fl(compresses)i(\014les)f +-(using)f(the)i(Burro)m(ws-Wheeler)g(blo)s(c)m(k)f(sorting)g(text)i +-(compres-)390 2642 y(sion)40 b(algorithm,)j(and)d(Hu\013man)h(co)s +-(ding.)71 b(Compression)40 b(is)g(generally)g(considerably)390 +-2746 y(b)s(etter)25 b(than)g(that)h(ac)m(hiev)m(ed)g(b)m(y)f(more)g +-(con)m(v)m(en)m(tional)h(LZ77/LZ78-based)g(compressors,)390 +-2850 y(and)k(approac)m(hes)h(the)f(p)s(erformance)g(of)h(the)f(PPM)g +-(family)f(of)i(statistical)f(compressors.)390 3001 y(The)e +-(command-line)e(options)i(are)h(delib)s(erately)d(v)m(ery)i(similar)e +-(to)j(those)g(of)f(GNU)h Fj(gzip)p Fl(,)390 3104 y(but)h(they)g(are)h +-(not)g(iden)m(tical.)390 3255 y Fj(bzip2)f Fl(exp)s(ects)h(a)g(list)f 
+-(of)h(\014le)f(names)h(to)h(accompan)m(y)h(the)e(command-line)e +-(\015ags.)43 b(Eac)m(h)390 3359 y(\014le)e(is)h(replaced)g(b)m(y)g(a)h +-(compressed)f(v)m(ersion)g(of)g(itself,)i(with)e(the)g(name)g +-Fj(original_)390 3463 y(name.bz2)p Fl(.)49 b(Eac)m(h)34 +-b(compressed)g(\014le)f(has)g(the)h(same)g(mo)s(di\014cation)e(date,)k +-(p)s(ermissions,)390 3567 y(and,)24 b(when)f(p)s(ossible,)f(o)m +-(wnership)f(as)j(the)f(corresp)s(onding)f(original,)h(so)g(that)h +-(these)g(prop-)390 3671 y(erties)34 b(can)g(b)s(e)f(correctly)i +-(restored)f(at)g(decompression)f(time.)51 b(File)34 b(name)g(handling)d +-(is)390 3774 y(naiv)m(e)26 b(in)f(the)i(sense)f(that)h(there)f(is)f(no) +-i(mec)m(hanism)e(for)h(preserving)f(original)f(\014le)i(names,)390 +-3878 y(p)s(ermissions,)37 b(o)m(wnerships)f(or)h(dates)i(in)d +-(\014lesystems)h(whic)m(h)g(lac)m(k)h(these)g(concepts,)j(or)390 +-3982 y(ha)m(v)m(e)32 b(serious)d(\014le)g(name)i(length)f +-(restrictions,)f(suc)m(h)h(as)h(MS-DOS.)390 4133 y Fj(bzip2)26 +-b Fl(and)h Fj(bunzip2)e Fl(will)f(b)m(y)k(default)e(not)i(o)m(v)m +-(erwrite)g(existing)e(\014les.)38 b(If)27 b(y)m(ou)h(w)m(an)m(t)g(this) +-390 4237 y(to)j(happ)s(en,)e(sp)s(ecify)g(the)i Fj(-f)e +-Fl(\015ag.)390 4388 y(If)34 b(no)h(\014le)f(names)g(are)i(sp)s +-(eci\014ed,)e Fj(bzip2)f Fl(compresses)i(from)f(standard)g(input)f(to)j +-(stan-)390 4491 y(dard)c(output.)49 b(In)32 b(this)g(case,)k +-Fj(bzip2)31 b Fl(will)g(decline)h(to)i(write)e(compressed)h(output)g +-(to)h(a)390 4595 y(terminal,)29 b(as)i(this)e(w)m(ould)g(b)s(e)h(en)m +-(tirely)f(incomprehensible)e(and)j(therefore)h(p)s(oin)m(tless.)390 +-4746 y Fj(bunzip2)36 b Fl(\(or)j Fj(bzip2)29 b(-d)p Fl(\))37 +-b(decompresses)i(all)e(sp)s(eci\014ed)f(\014les.)63 b(Files)37 +-b(whic)m(h)g(w)m(ere)i(not)390 4850 y(created)e(b)m(y)f +-Fj(bzip2)f Fl(will)e(b)s(e)i(detected)j(and)d(ignored,)i(and)e(a)i(w)m +-(arning)d(issued.)56 b Fj(bzip2)390 4954 y Fl(attempts)31 
+-b(to)f(guess)g(the)g(\014lename)f(for)h(the)g(decompressed)f(\014le)g +-(from)h(that)g(of)g(the)g(com-)390 5058 y(pressed)f(\014le)h(as)h +-(follo)m(ws:)570 5209 y Fj(filename.bz2)57 b Fl(b)s(ecomes)31 +-b Fj(filename)570 5340 y(filename.bz)58 b Fl(b)s(ecomes)30 +-b Fj(filename)p eop +-%%Page: 4 5 +-4 4 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 +-b(to)g(use)f Fj(bzip2)2375 b Fl(4)570 299 y Fj(filename.tbz2)27 +-b Fl(b)s(ecomes)j Fj(filename.tar)570 470 y(filename.tbz)57 +-b Fl(b)s(ecomes)31 b Fj(filename.tar)570 641 y(anyothername)57 +-b Fl(b)s(ecomes)31 b Fj(anyothername.out)390 859 y Fl(If)j(the)h +-(\014le)e(do)s(es)i(not)f(end)g(in)f(one)i(of)g(the)g(recognised)f +-(endings,)g Fj(.bz2)p Fl(,)h Fj(.bz)p Fl(,)g Fj(.tbz2)e +-Fl(or)390 963 y Fj(.tbz)p Fl(,)h Fj(bzip2)f Fl(complains)f(that)j(it)e +-(cannot)i(guess)f(the)g(name)h(of)f(the)g(original)e(\014le,)j(and)390 +-1067 y(uses)30 b(the)g(original)f(name)h(with)g Fj(.out)f +-Fl(app)s(ended.)390 1218 y(As)j(with)f(compression,)h(supplying)c(no)k +-(\014lenames)f(causes)i(decompression)e(from)h(stan-)390 +-1321 y(dard)d(input)g(to)i(standard)e(output.)390 1472 +-y Fj(bunzip2)k Fl(will)g(correctly)j(decompress)e(a)i(\014le)e(whic)m +-(h)g(is)h(the)g(concatenation)i(of)e(t)m(w)m(o)i(or)390 +-1576 y(more)j(compressed)f(\014les.)67 b(The)39 b(result)g(is)g(the)g +-(concatenation)i(of)f(the)g(corresp)s(onding)390 1680 +-y(uncompressed)c(\014les.)59 b(In)m(tegrit)m(y)38 b(testing)f(\()p +-Fj(-t)p Fl(\))g(of)g(concatenated)i(compressed)e(\014les)f(is)390 +-1784 y(also)30 b(supp)s(orted.)390 1935 y(Y)-8 b(ou)40 +-b(can)g(also)f(compress)g(or)g(decompress)g(\014les)g(to)h(the)f +-(standard)g(output)g(b)m(y)g(giving)390 2039 y(the)30 +-b Fj(-c)g Fl(\015ag.)40 b(Multiple)28 b(\014les)h(ma)m(y)i(b)s(e)e +-(compressed)h(and)f(decompressed)h(lik)m(e)f(this.)39 +-b(The)390 2142 y(resulting)31 b(outputs)i(are)h(fed)f(sequen)m(tially)f +-(to)i(stdout.)49 b(Compression)32 b(of)h(m)m(ultiple)e(\014les)390 +-2246 y(in)24 
b(this)g(manner)h(generates)h(a)g(stream)f(con)m(taining)g +-(m)m(ultiple)e(compressed)i(\014le)f(represen-)390 2350 +-y(tations.)58 b(Suc)m(h)36 b(a)g(stream)g(can)h(b)s(e)e(decompressed)h +-(correctly)h(only)e(b)m(y)h Fj(bzip2)e Fl(v)m(ersion)390 +-2454 y(0.9.0)g(or)e(later.)47 b(Earlier)30 b(v)m(ersions)i(of)g +-Fj(bzip2)f Fl(will)f(stop)i(after)h(decompressing)e(the)i(\014rst)390 +-2558 y(\014le)c(in)h(the)g(stream.)390 2709 y Fj(bzcat)f +-Fl(\(or)i Fj(bzip2)e(-dc)p Fl(\))g(decompresses)i(all)e(sp)s(eci\014ed) +-g(\014les)g(to)i(the)g(standard)e(output.)390 2860 y +-Fj(bzip2)f Fl(will)g(read)i(argumen)m(ts)g(from)f(the)h(en)m(vironmen)m +-(t)g(v)-5 b(ariables)28 b Fj(BZIP2)h Fl(and)g Fj(BZIP)p +-Fl(,)g(in)390 2963 y(that)24 b(order,)g(and)f(will)e(pro)s(cess)i(them) +-g(b)s(efore)g(an)m(y)h(argumen)m(ts)f(read)h(from)f(the)g(command)390 +-3067 y(line.)39 b(This)29 b(giv)m(es)h(a)h(con)m(v)m(enien)m(t)h(w)m(a) +-m(y)f(to)g(supply)d(default)i(argumen)m(ts.)390 3218 +-y(Compression)h(is)h(alw)m(a)m(ys)i(p)s(erformed,)e(ev)m(en)h(if)f(the) +-h(compressed)g(\014le)f(is)g(sligh)m(tly)f(larger)390 +-3322 y(than)26 b(the)g(original.)38 b(Files)25 b(of)h(less)g(than)g(ab) +-s(out)g(one)g(h)m(undred)e(b)m(ytes)j(tend)f(to)h(get)g(larger,)390 +-3426 y(since)34 b(the)g(compression)f(mec)m(hanism)h(has)f(a)i(constan) +-m(t)g(o)m(v)m(erhead)h(in)d(the)h(region)g(of)g(50)390 +-3529 y(b)m(ytes.)54 b(Random)34 b(data)h(\(including)d(the)i(output)h +-(of)f(most)h(\014le)f(compressors\))h(is)e(co)s(ded)390 +-3633 y(at)e(ab)s(out)f(8.05)i(bits)d(p)s(er)h(b)m(yte,)h(giving)e(an)h +-(expansion)g(of)g(around)g(0.5\045.)390 3784 y(As)h(a)g(self-c)m(hec)m +-(k)h(for)e(y)m(our)h(protection,)g Fj(bzip2)f Fl(uses)g(32-bit)h(CR)m +-(Cs)f(to)i(mak)m(e)f(sure)f(that)390 3888 y(the)45 b(decompressed)f(v)m +-(ersion)g(of)g(a)h(\014le)e(is)h(iden)m(tical)f(to)i(the)g(original.)81 +-b(This)43 b(guards)390 3992 y(against)i(corruption)e(of)h(the)h +-(compressed)f(data,)49 b(and)44 
b(against)h(undetected)g(bugs)e(in)390 +-4096 y Fj(bzip2)35 b Fl(\(hop)s(efully)e(v)m(ery)k(unlik)m(ely\).)56 +-b(The)36 b(c)m(hances)h(of)f(data)h(corruption)e(going)h(unde-)390 +-4199 y(tected)g(is)e(microscopic,)h(ab)s(out)f(one)h(c)m(hance)g(in)f +-(four)g(billion)d(for)j(eac)m(h)i(\014le)d(pro)s(cessed.)390 +-4303 y(Be)38 b(a)m(w)m(are,)k(though,)d(that)f(the)g(c)m(hec)m(k)i(o)s +-(ccurs)d(up)s(on)f(decompression,)j(so)f(it)f(can)h(only)390 +-4407 y(tell)28 b(y)m(ou)g(that)i(something)d(is)h(wrong.)40 +-b(It)28 b(can't)i(help)d(y)m(ou)i(reco)m(v)m(er)h(the)e(original)f +-(uncom-)390 4511 y(pressed)h(data.)41 b(Y)-8 b(ou)30 +-b(can)f(use)g Fj(bzip2recover)d Fl(to)k(try)f(to)h(reco)m(v)m(er)h +-(data)f(from)e(damaged)390 4614 y(\014les.)390 4766 y(Return)22 +-b(v)-5 b(alues:)37 b(0)23 b(for)g(a)g(normal)f(exit,)j(1)e(for)g(en)m +-(vironmen)m(tal)f(problems)f(\(\014le)i(not)g(found,)390 +-4869 y(in)m(v)-5 b(alid)30 b(\015ags,)k(I/O)f(errors,)g(&c\),)h(2)f(to) +-g(indicate)f(a)h(corrupt)f(compressed)h(\014le,)f(3)i(for)e(an)390 +-4973 y(in)m(ternal)d(consistency)h(error)g(\(eg,)i(bug\))e(whic)m(h)f +-(caused)i Fj(bzip2)e Fl(to)i(panic.)390 5304 y Ff(OPTIONS)p +-eop +-%%Page: 5 6 +-5 5 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 +-b(to)g(use)f Fj(bzip2)2375 b Fl(5)390 299 y Fj(-c)30 +-b(--stdout)870 403 y Fl(Compress)f(or)i(decompress)f(to)h(standard)e +-(output.)390 557 y Fj(-d)h(--decompress)870 661 y Fl(F)-8 +-b(orce)44 b(decompression.)77 b Fj(bzip2)p Fl(,)44 b +-Fj(bunzip2)d Fl(and)h Fj(bzcat)f Fl(are)i(really)f(the)870 +-764 y(same)27 b(program,)h(and)e(the)i(decision)d(ab)s(out)i(what)g +-(actions)g(to)h(tak)m(e)g(is)e(done)870 868 y(on)k(the)h(basis)e(of)i +-(whic)m(h)e(name)h(is)g(used.)40 b(This)28 b(\015ag)j(o)m(v)m(errides)f +-(that)h(mec)m(h-)870 972 y(anism,)e(and)h(forces)h(bzip2)e(to)i +-(decompress.)390 1126 y Fj(-z)f(--compress)870 1230 y +-Fl(The)39 b(complemen)m(t)h(to)g Fj(-d)p Fl(:)59 b(forces)40 
+-b(compression,)h(regardless)d(of)i(the)g(in-)870 1334 +-y(v)m(ok)-5 b(ation)31 b(name.)390 1488 y Fj(-t)f(--test)8 +-b Fl(Chec)m(k)33 b(in)m(tegrit)m(y)j(of)f(the)g(sp)s(eci\014ed)e +-(\014le\(s\),)k(but)d(don't)h(decompress)g(them.)870 +-1591 y(This)40 b(really)g(p)s(erforms)g(a)i(trial)e(decompression)h +-(and)g(thro)m(ws)g(a)m(w)m(a)m(y)j(the)870 1695 y(result.)390 +-1849 y Fj(-f)30 b(--force)870 1953 y Fl(F)-8 b(orce)31 +-b(o)m(v)m(erwrite)f(of)g(output)f(\014les.)40 b(Normally)-8 +-b(,)29 b Fj(bzip2)f Fl(will)f(not)j(o)m(v)m(erwrite)870 +-2057 y(existing)e(output)g(\014les.)39 b(Also)28 b(forces)h +-Fj(bzip2)e Fl(to)i(break)g(hard)e(links)f(to)k(\014les,)870 +-2161 y(whic)m(h)f(it)h(otherwise)g(w)m(ouldn't)f(do.)390 +-2315 y Fj(-k)h(--keep)8 b Fl(Keep)24 b(\(don't)i(delete\))h(input)d +-(\014les)g(during)g(compression)h(or)h(decompression.)390 +-2469 y Fj(-s)k(--small)870 2573 y Fl(Reduce)23 b(memory)f(usage,)j(for) +-d(compression,)h(decompression)f(and)g(testing.)870 2676 +-y(Files)f(are)i(decompressed)e(and)h(tested)h(using)e(a)h(mo)s +-(di\014ed)e(algorithm)h(whic)m(h)870 2780 y(only)30 b(requires)g(2.5)j +-(b)m(ytes)f(p)s(er)e(blo)s(c)m(k)h(b)m(yte.)44 b(This)30 +-b(means)h(an)m(y)h(\014le)e(can)i(b)s(e)870 2884 y(decompressed)d(in)f +-(2300k)j(of)e(memory)-8 b(,)30 b(alb)s(eit)e(at)i(ab)s(out)f(half)g +-(the)g(normal)870 2988 y(sp)s(eed.)870 3117 y(During)42 +-b(compression,)k Fj(-s)d Fl(selects)h(a)g(blo)s(c)m(k)g(size)f(of)h +-(200k,)k(whic)m(h)42 b(lim-)870 3220 y(its)33 b(memory)g(use)g(to)h +-(around)e(the)i(same)f(\014gure,)h(at)g(the)g(exp)s(ense)f(of)g(y)m +-(our)870 3324 y(compression)g(ratio.)50 b(In)33 b(short,)i(if)d(y)m +-(our)i(mac)m(hine)f(is)g(lo)m(w)g(on)h(memory)f(\(8)870 +-3428 y(megab)m(ytes)42 b(or)e(less\),)j(use)d(-s)g(for)g(ev)m +-(erything.)71 b(See)40 b(MEMOR)-8 b(Y)41 b(MAN-)870 3532 +-y(A)m(GEMENT)31 b(b)s(elo)m(w.)390 3686 y Fj(-q)f(--quiet)870 +-3790 y Fl(Suppress)j(non-essen)m(tial)j(w)m(arning)e(messages.)58 
+-b(Messages)38 b(p)s(ertaining)33 b(to)870 3893 y(I/O)d(errors)g(and)g +-(other)h(critical)e(ev)m(en)m(ts)j(will)27 b(not)k(b)s(e)f(suppressed.) +-390 4047 y Fj(-v)g(--verbose)870 4151 y Fl(V)-8 b(erb)s(ose)28 +-b(mo)s(de)f({)i(sho)m(w)e(the)h(compression)f(ratio)h(for)f(eac)m(h)i +-(\014le)e(pro)s(cessed.)870 4255 y(F)-8 b(urther)30 b +-Fj(-v)p Fl('s)g(increase)g(the)g(v)m(erb)s(osit)m(y)g(lev)m(el,)h(sp)s +-(ewing)d(out)j(lots)f(of)g(infor-)870 4359 y(mation)g(whic)m(h)f(is)h +-(primarily)d(of)j(in)m(terest)h(for)f(diagnostic)g(purp)s(oses.)390 +-4513 y Fj(-L)g(--license)e(-V)h(--version)870 4617 y +-Fl(Displa)m(y)h(the)g(soft)m(w)m(are)i(v)m(ersion,)e(license)f(terms)i +-(and)e(conditions.)390 4771 y Fj(-1)h(to)g(-9)72 b Fl(Set)35 +-b(the)g(blo)s(c)m(k)f(size)h(to)g(100)h(k,)g(200)g(k)f(..)53 +-b(900)36 b(k)f(when)f(compressing.)53 b(Has)870 4875 +-y(no)41 b(e\013ect)h(when)d(decompressing.)71 b(See)41 +-b(MEMOR)-8 b(Y)41 b(MANA)m(GEMENT)870 4978 y(b)s(elo)m(w.)390 +-5132 y Fj(--)324 b Fl(T)-8 b(reats)25 b(all)e(subsequen)m(t)g(argumen)m +-(ts)i(as)f(\014le)g(names,)h(ev)m(en)g(if)e(they)i(start)f(with)870 +-5236 y(a)32 b(dash.)43 b(This)29 b(is)h(so)i(y)m(ou)g(can)f(handle)f +-(\014les)g(with)g(names)i(b)s(eginning)c(with)870 5340 +-y(a)j(dash,)f(for)g(example:)40 b Fj(bzip2)29 b(--)h(-myfilename)p +-Fl(.)p eop +-%%Page: 6 7 +-6 6 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 +-b(to)g(use)f Fj(bzip2)2375 b Fl(6)390 299 y Fj(--repetitive-fast)390 +-427 y(--repetitive-best)870 530 y Fl(These)34 b(\015ags)g(are)h +-(redundan)m(t)e(in)g(v)m(ersions)g(0.9.5)j(and)e(ab)s(o)m(v)m(e.)53 +-b(They)34 b(pro-)870 634 y(vided)h(some)i(coarse)g(con)m(trol)g(o)m(v)m +-(er)g(the)g(b)s(eha)m(viour)e(of)h(the)g(sorting)g(algo-)870 +-738 y(rithm)h(in)h(earlier)g(v)m(ersions,)j(whic)m(h)d(w)m(as)h +-(sometimes)h(useful.)65 b(0.9.5)41 b(and)870 842 y(ab)s(o)m(v)m(e)34 +-b(ha)m(v)m(e)g(an)f(impro)m(v)m(ed)g(algorithm)f(whic)m(h)f(renders)h +-(these)h(\015ags)h(irrel-)870 946 
y(ev)-5 b(an)m(t.)390 +-1190 y Ff(MEMOR)-10 b(Y)40 b(MANA)m(GEMENT)390 1377 y +-Fj(bzip2)25 b Fl(compresses)i(large)g(\014les)e(in)g(blo)s(c)m(ks.)39 +-b(The)26 b(blo)s(c)m(k)h(size)f(a\013ects)i(b)s(oth)e(the)h(compres-) +-390 1481 y(sion)39 b(ratio)g(ac)m(hiev)m(ed,)k(and)d(the)f(amoun)m(t)i +-(of)e(memory)h(needed)f(for)h(compression)f(and)390 1585 +-y(decompression.)59 b(The)36 b(\015ags)h Fj(-1)f Fl(through)h +-Fj(-9)f Fl(sp)s(ecify)f(the)i(blo)s(c)m(k)g(size)f(to)i(b)s(e)e +-(100,000)390 1688 y(b)m(ytes)29 b(through)e(900,000)k(b)m(ytes)d(\(the) +-h(default\))e(resp)s(ectiv)m(ely)-8 b(.)40 b(A)m(t)29 +-b(decompression)e(time,)390 1792 y(the)32 b(blo)s(c)m(k)g(size)g(used)g +-(for)g(compression)f(is)g(read)h(from)g(the)g(header)g(of)h(the)f +-(compressed)390 1896 y(\014le,)j(and)f Fj(bunzip2)e Fl(then)i(allo)s +-(cates)h(itself)e(just)h(enough)g(memory)g(to)i(decompress)e(the)390 +-2000 y(\014le.)39 b(Since)29 b(blo)s(c)m(k)g(sizes)g(are)h(stored)f(in) +-f(compressed)h(\014les,)g(it)g(follo)m(ws)f(that)i(the)g(\015ags)g +-Fj(-1)390 2103 y Fl(to)h Fj(-9)f Fl(are)h(irrelev)-5 +-b(an)m(t)29 b(to)i(and)f(so)h(ignored)e(during)f(decompression.)390 +-2255 y(Compression)h(and)g(decompression)h(requiremen)m(ts,)f(in)g(b)m +-(ytes,)j(can)e(b)s(e)g(estimated)h(as:)869 2406 y Fj(Compression:)140 +-b(400k)46 b(+)i(\()f(8)h(x)f(block)f(size)h(\))869 2613 +-y(Decompression:)d(100k)i(+)i(\()f(4)h(x)f(block)f(size)h(\),)g(or)1585 +-2717 y(100k)f(+)i(\()f(2.5)g(x)g(block)g(size)f(\))390 +-2868 y Fl(Larger)29 b(blo)s(c)m(k)f(sizes)h(giv)m(e)g(rapidly)d +-(diminishing)e(marginal)k(returns.)39 b(Most)29 b(of)g(the)g(com-)390 +-2972 y(pression)d(comes)j(from)f(the)g(\014rst)g(t)m(w)m(o)h(or)f +-(three)h(h)m(undred)d(k)i(of)g(blo)s(c)m(k)g(size,)g(a)h(fact)g(w)m +-(orth)390 3075 y(b)s(earing)j(in)f(mind)g(when)h(using)f +-Fj(bzip2)h Fl(on)g(small)g(mac)m(hines.)47 b(It)33 b(is)f(also)h(imp)s +-(ortan)m(t)f(to)390 3179 y(appreciate)j(that)h(the)f(decompression)f 
+-(memory)h(requiremen)m(t)f(is)h(set)g(at)h(compression)390 +-3283 y(time)30 b(b)m(y)g(the)h(c)m(hoice)g(of)g(blo)s(c)m(k)f(size.)390 +-3434 y(F)-8 b(or)45 b(\014les)f(compressed)g(with)g(the)g(default)g +-(900k)i(blo)s(c)m(k)e(size,)49 b Fj(bunzip2)42 b Fl(will)g(require)390 +-3538 y(ab)s(out)29 b(3700)j(kb)m(ytes)e(to)h(decompress.)40 +-b(T)-8 b(o)30 b(supp)s(ort)e(decompression)h(of)h(an)m(y)g(\014le)f(on) +-g(a)i(4)390 3642 y(megab)m(yte)h(mac)m(hine,)d Fj(bunzip2)f +-Fl(has)i(an)g(option)f(to)i(decompress)e(using)g(appro)m(ximately)390 +-3745 y(half)k(this)g(amoun)m(t)i(of)f(memory)-8 b(,)36 +-b(ab)s(out)e(2300)i(kb)m(ytes.)53 b(Decompression)34 +-b(sp)s(eed)g(is)f(also)390 3849 y(halv)m(ed,)i(so)f(y)m(ou)h(should)d +-(use)h(this)g(option)h(only)f(where)h(necessary)-8 b(.)53 +-b(The)33 b(relev)-5 b(an)m(t)35 b(\015ag)390 3953 y(is)29 +-b Fj(-s)p Fl(.)390 4104 y(In)34 b(general,)i(try)f(and)f(use)g(the)h +-(largest)h(blo)s(c)m(k)e(size)h(memory)f(constrain)m(ts)h(allo)m(w,)h +-(since)390 4208 y(that)45 b(maximises)f(the)h(compression)f(ac)m(hiev)m +-(ed.)85 b(Compression)43 b(and)h(decompression)390 4311 +-y(sp)s(eed)30 b(are)g(virtually)e(una\013ected)j(b)m(y)f(blo)s(c)m(k)g +-(size.)390 4463 y(Another)25 b(signi\014can)m(t)f(p)s(oin)m(t)g +-(applies)f(to)j(\014les)e(whic)m(h)g(\014t)h(in)e(a)j(single)d(blo)s(c) +-m(k)i({)g(that)h(means)390 4566 y(most)g(\014les)g(y)m(ou'd)g(encoun)m +-(ter)h(using)d(a)j(large)f(blo)s(c)m(k)g(size.)39 b(The)25 +-b(amoun)m(t)i(of)f(real)g(memory)390 4670 y(touc)m(hed)38 +-b(is)e(prop)s(ortional)f(to)j(the)f(size)g(of)h(the)f(\014le,)h(since)f +-(the)g(\014le)g(is)f(smaller)g(than)h(a)390 4774 y(blo)s(c)m(k.)49 +-b(F)-8 b(or)35 b(example,)f(compressing)e(a)i(\014le)e(20,000)k(b)m +-(ytes)e(long)f(with)f(the)i(\015ag)g Fj(-9)f Fl(will)390 +-4878 y(cause)28 b(the)f(compressor)g(to)h(allo)s(cate)f(around)f(7600k) +-j(of)e(memory)-8 b(,)28 b(but)f(only)f(touc)m(h)i(400k)390 +-4981 y Fj(+)h 
Fl(20000)j(*)e(8)g(=)f(560)i(kb)m(ytes)f(of)g(it.)40 +-b(Similarly)-8 b(,)26 b(the)k(decompressor)f(will)e(allo)s(cate)j +-(3700k)390 5085 y(but)g(only)f(touc)m(h)i(100k)h Fj(+)e +-Fl(20000)i(*)f(4)g(=)f(180)i(kb)m(ytes.)390 5236 y(Here)41 +-b(is)f(a)i(table)f(whic)m(h)e(summarises)g(the)j(maxim)m(um)d(memory)i +-(usage)h(for)e(di\013eren)m(t)390 5340 y(blo)s(c)m(k)25 +-b(sizes.)38 b(Also)25 b(recorded)g(is)f(the)i(total)g(compressed)e +-(size)h(for)g(14)h(\014les)e(of)i(the)f(Calgary)p eop +-%%Page: 7 8 +-7 7 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 +-b(to)g(use)f Fj(bzip2)2375 b Fl(7)390 299 y(T)-8 b(ext)38 +-b(Compression)d(Corpus)h(totalling)h(3,141,622)k(b)m(ytes.)61 +-b(This)36 b(column)g(giv)m(es)i(some)390 403 y(feel)23 +-b(for)f(ho)m(w)h(compression)f(v)-5 b(aries)23 b(with)e(blo)s(c)m(k)i +-(size.)38 b(These)23 b(\014gures)f(tend)g(to)i(understate)390 +-506 y(the)g(adv)-5 b(an)m(tage)26 b(of)e(larger)f(blo)s(c)m(k)h(sizes)f +-(for)h(larger)f(\014les,)h(since)g(the)g(Corpus)e(is)h(dominated)390 +-610 y(b)m(y)30 b(smaller)f(\014les.)1107 761 y Fj(Compress)141 +-b(Decompress)g(Decompress)f(Corpus)773 865 y(Flag)238 +-b(usage)285 b(usage)332 b(-s)48 b(usage)237 b(Size)821 +-1073 y(-1)286 b(1200k)332 b(500k)429 b(350k)285 b(914704)821 +-1176 y(-2)h(2000k)332 b(900k)429 b(600k)285 b(877703)821 +-1280 y(-3)h(2800k)f(1300k)428 b(850k)285 b(860338)821 +-1384 y(-4)h(3600k)f(1700k)380 b(1100k)285 b(846899)821 +-1488 y(-5)h(4400k)f(2100k)380 b(1350k)285 b(845160)821 +-1591 y(-6)h(5200k)f(2500k)380 b(1600k)285 b(838626)821 +-1695 y(-7)h(6100k)f(2900k)380 b(1850k)285 b(834096)821 +-1799 y(-8)h(6800k)f(3300k)380 b(2100k)285 b(828642)821 +-1903 y(-9)h(7600k)f(3700k)380 b(2350k)285 b(828642)390 +-2147 y Ff(RECO)m(VERING)37 b(D)m(A)-10 b(T)g(A)40 b(FR)m(OM)h(D)m(AMA)m +-(GED)e(FILES)390 2333 y Fj(bzip2)25 b Fl(compresses)h(\014les)g(in)f +-(blo)s(c)m(ks,)h(usually)e(900kb)m(ytes)29 b(long.)39 +-b(Eac)m(h)27 b(blo)s(c)m(k)e(is)h(handled)390 2437 y(indep)s(enden)m 
+-(tly)-8 b(.)47 b(If)32 b(a)i(media)e(or)h(transmission)e(error)i +-(causes)h(a)f(m)m(ulti-blo)s(c)m(k)f Fj(.bz2)g Fl(\014le)390 +-2541 y(to)k(b)s(ecome)h(damaged,)g(it)e(ma)m(y)i(b)s(e)e(p)s(ossible)e +-(to)k(reco)m(v)m(er)g(data)f(from)g(the)f(undamaged)390 +-2645 y(blo)s(c)m(ks)30 b(in)f(the)h(\014le.)390 2796 +-y(The)j(compressed)h(represen)m(tation)f(of)h(eac)m(h)h(blo)s(c)m(k)e +-(is)g(delimited)e(b)m(y)j(a)g(48-bit)g(pattern,)390 2900 +-y(whic)m(h)27 b(mak)m(es)j(it)e(p)s(ossible)e(to)j(\014nd)e(the)i(blo)s +-(c)m(k)f(b)s(oundaries)e(with)i(reasonable)g(certain)m(t)m(y)-8 +-b(.)390 3003 y(Eac)m(h)34 b(blo)s(c)m(k)f(also)g(carries)g(its)g(o)m +-(wn)g(32-bit)g(CR)m(C,)h(so)f(damaged)h(blo)s(c)m(ks)f(can)g(b)s(e)g +-(distin-)390 3107 y(guished)c(from)h(undamaged)g(ones.)390 +-3258 y Fj(bzip2recover)37 b Fl(is)j(a)h(simple)e(program)h(whose)g +-(purp)s(ose)f(is)h(to)i(searc)m(h)f(for)f(blo)s(c)m(ks)g(in)390 +-3362 y Fj(.bz2)34 b Fl(\014les,)i(and)f(write)f(eac)m(h)j(blo)s(c)m(k)d +-(out)i(in)m(to)f(its)g(o)m(wn)g Fj(.bz2)f Fl(\014le.)55 +-b(Y)-8 b(ou)36 b(can)f(then)g(use)390 3466 y Fj(bzip2)29 +-b(-t)c Fl(to)i(test)f(the)g(in)m(tegrit)m(y)g(of)g(the)g(resulting)e +-(\014les,)i(and)f(decompress)h(those)g(whic)m(h)390 3569 +-y(are)31 b(undamaged.)390 3721 y Fj(bzip2recover)41 b +-Fl(tak)m(es)46 b(a)f(single)e(argumen)m(t,)49 b(the)44 +-b(name)h(of)g(the)f(damaged)h(\014le,)j(and)390 3824 +-y(writes)33 b(a)i(n)m(um)m(b)s(er)d(of)j(\014les)e Fj(rec0001file.bz2)p +-Fl(,)e Fj(rec0002file.bz2)p Fl(,)g(etc,)36 b(con)m(taining)390 +-3928 y(the)42 b(extracted)g(blo)s(c)m(ks.)74 b(The)41 +-b(output)g(\014lenames)f(are)i(designed)e(so)i(that)g(the)g(use)f(of) +-390 4032 y(wildcards)30 b(in)h(subsequen)m(t)h(pro)s(cessing)f({)i(for) +-g(example,)g Fj(bzip2)c(-dc)g(rec*file.bz2)e(>)390 4136 +-y(recovered_data)f Fl({)31 b(lists)e(the)i(\014les)e(in)g(the)i +-(correct)g(order.)390 4287 y Fj(bzip2recover)38 b Fl(should)i(b)s(e)g +-(of)i(most)g(use)f(dealing)f(with)g(large)i 
Fj(.bz2)e +-Fl(\014les,)k(as)d(these)390 4390 y(will)29 b(con)m(tain)j(man)m(y)g +-(blo)s(c)m(ks.)45 b(It)32 b(is)f(clearly)g(futile)f(to)i(use)g(it)f(on) +-h(damaged)g(single-blo)s(c)m(k)390 4494 y(\014les,)g(since)f(a)h +-(damaged)h(blo)s(c)m(k)e(cannot)i(b)s(e)e(reco)m(v)m(ered.)47 +-b(If)32 b(y)m(ou)g(wish)e(to)j(minimise)c(an)m(y)390 +-4598 y(p)s(oten)m(tial)36 b(data)i(loss)e(through)g(media)h(or)f +-(transmission)f(errors,)j(y)m(ou)f(migh)m(t)g(consider)390 +-4702 y(compressing)29 b(with)g(a)i(smaller)e(blo)s(c)m(k)h(size.)390 +-4946 y Ff(PERF)m(ORMANCE)39 b(NOTES)390 5132 y Fl(The)f(sorting)f +-(phase)h(of)h(compression)e(gathers)i(together)h(similar)35 +-b(strings)i(in)g(the)i(\014le.)390 5236 y(Because)54 +-b(of)f(this,)58 b(\014les)52 b(con)m(taining)g(v)m(ery)h(long)g(runs)e +-(of)i(rep)s(eated)g(sym)m(b)s(ols,)58 b(lik)m(e)390 5340 +-y Fj(")p Fl(aabaabaabaab)e(...)p Fj(")g Fl(\(rep)s(eated)g(sev)m(eral)f +-(h)m(undred)e(times\))i(ma)m(y)h(compress)f(more)p eop +-%%Page: 8 9 +-8 8 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 +-b(to)g(use)f Fj(bzip2)2375 b Fl(8)390 299 y(slo)m(wly)33 +-b(than)g(normal.)50 b(V)-8 b(ersions)33 b(0.9.5)i(and)f(ab)s(o)m(v)m(e) +-h(fare)e(m)m(uc)m(h)h(b)s(etter)g(than)f(previous)390 +-403 y(v)m(ersions)i(in)g(this)f(resp)s(ect.)57 b(The)35 +-b(ratio)h(b)s(et)m(w)m(een)h(w)m(orst-case)g(and)e(a)m(v)m(erage-case) +-40 b(com-)390 506 y(pression)e(time)h(is)f(in)g(the)h(region)g(of)h +-(10:1.)69 b(F)-8 b(or)40 b(previous)e(v)m(ersions,)j(this)d(\014gure)h +-(w)m(as)390 610 y(more)f(lik)m(e)g(100:1.)66 b(Y)-8 b(ou)38 +-b(can)h(use)e(the)i Fj(-vvvv)d Fl(option)i(to)h(monitor)e(progress)h +-(in)f(great)390 714 y(detail,)30 b(if)f(y)m(ou)i(w)m(an)m(t.)390 +-865 y(Decompression)f(sp)s(eed)g(is)f(una\013ected)i(b)m(y)f(these)h +-(phenomena.)390 1016 y Fj(bzip2)i Fl(usually)g(allo)s(cates)i(sev)m +-(eral)f(megab)m(ytes)j(of)d(memory)h(to)g(op)s(erate)h(in,)e(and)g +-(then)390 1120 
y(c)m(harges)k(all)d(o)m(v)m(er)j(it)f(in)e(a)i(fairly)e +-(random)h(fashion.)59 b(This)34 b(means)j(that)g(p)s(erformance,)390 +-1224 y(b)s(oth)27 b(for)h(compressing)f(and)g(decompressing,)h(is)f +-(largely)g(determined)g(b)m(y)h(the)g(sp)s(eed)f(at)390 +-1327 y(whic)m(h)35 b(y)m(our)h(mac)m(hine)g(can)g(service)g(cac)m(he)i +-(misses.)57 b(Because)37 b(of)g(this,)f(small)f(c)m(hanges)390 +-1431 y(to)f(the)f(co)s(de)h(to)f(reduce)g(the)h(miss)d(rate)j(ha)m(v)m +-(e)h(b)s(een)d(observ)m(ed)h(to)h(giv)m(e)g(disprop)s(ortion-)390 +-1535 y(ately)i(large)f(p)s(erformance)f(impro)m(v)m(emen)m(ts.)56 +-b(I)35 b(imagine)f Fj(bzip2)g Fl(will)e(p)s(erform)i(b)s(est)h(on)390 +-1639 y(mac)m(hines)30 b(with)f(v)m(ery)i(large)f(cac)m(hes.)390 +-1885 y Ff(CA)-14 b(VEA)k(TS)390 2072 y Fl(I/O)38 b(error)g(messages)h +-(are)f(not)h(as)f(helpful)e(as)i(they)g(could)f(b)s(e.)64 +-b Fj(bzip2)37 b Fl(tries)g(hard)g(to)390 2176 y(detect)29 +-b(I/O)e(errors)g(and)f(exit)i(cleanly)-8 b(,)27 b(but)g(the)h(details)e +-(of)h(what)h(the)f(problem)f(is)g(some-)390 2280 y(times)k(seem)h +-(rather)f(misleading.)390 2431 y(This)j(man)m(ual)g(page)i(p)s(ertains) +-e(to)i(v)m(ersion)f(1.0)i(of)e Fj(bzip2)p Fl(.)51 b(Compressed)34 +-b(data)h(created)390 2534 y(b)m(y)25 b(this)e(v)m(ersion)i(is)e(en)m +-(tirely)h(forw)m(ards)h(and)f(bac)m(kw)m(ards)h(compatible)f(with)f +-(the)i(previous)390 2638 y(public)18 b(releases,)24 b(v)m(ersions)c +-(0.1pl2,)k(0.9.0)e(and)f(0.9.5,)k(but)20 b(with)g(the)h(follo)m(wing)e +-(exception:)390 2742 y(0.9.0)43 b(and)e(ab)s(o)m(v)m(e)h(can)g +-(correctly)f(decompress)g(m)m(ultiple)e(concatenated)k(compressed)390 +-2846 y(\014les.)c(0.1pl2)30 b(cannot)g(do)f(this;)f(it)h(will)e(stop)i +-(after)h(decompressing)e(just)g(the)i(\014rst)e(\014le)g(in)390 +-2949 y(the)j(stream.)390 3100 y Fj(bzip2recover)20 b +-Fl(uses)k(32-bit)g(in)m(tegers)f(to)i(represen)m(t)f(bit)e(p)s +-(ositions)g(in)g(compressed)i(\014les,)390 3204 y(so)j(it)f(cannot)i 
+-(handle)d(compressed)i(\014les)f(more)h(than)f(512)i(megab)m(ytes)h +-(long.)39 b(This)25 b(could)390 3308 y(easily)30 b(b)s(e)f(\014xed.)390 +-3555 y Ff(A)m(UTHOR)390 3741 y Fl(Julian)f(Sew)m(ard,)i +-Fj(jseward@acm.org)p Fl(.)390 3892 y(The)24 b(ideas)f(em)m(b)s(o)s +-(died)f(in)h Fj(bzip2)f Fl(are)j(due)e(to)i(\(at)g(least\))g(the)f +-(follo)m(wing)e(p)s(eople:)37 b(Mic)m(hael)390 3996 y(Burro)m(ws)48 +-b(and)g(Da)m(vid)h(Wheeler)f(\(for)h(the)g(blo)s(c)m(k)f(sorting)g +-(transformation\),)53 b(Da)m(vid)390 4100 y(Wheeler)45 +-b(\(again,)50 b(for)45 b(the)g(Hu\013man)g(co)s(der\),)k(P)m(eter)d(F) +--8 b(en)m(wic)m(k)46 b(\(for)g(the)f(structured)390 4204 +-y(co)s(ding)26 b(mo)s(del)g(in)f(the)i(original)e Fj(bzip)p +-Fl(,)i(and)f(man)m(y)h(re\014nemen)m(ts\),)h(and)e(Alistair)f +-(Mo\013at,)390 4307 y(Radford)34 b(Neal)h(and)f(Ian)h(Witten)g(\(for)f +-(the)h(arithmetic)g(co)s(der)f(in)g(the)h(original)d +-Fj(bzip)p Fl(\).)390 4411 y(I)41 b(am)g(m)m(uc)m(h)h(indebted)e(for)h +-(their)f(help,)j(supp)s(ort)c(and)i(advice.)74 b(See)41 +-b(the)h(man)m(ual)e(in)390 4515 y(the)28 b(source)g(distribution)23 +-b(for)28 b(p)s(oin)m(ters)e(to)j(sources)e(of)h(do)s(cumen)m(tation.)40 +-b(Christian)25 b(v)m(on)390 4619 y(Ro)s(ques)31 b(encouraged)h(me)g(to) +-g(lo)s(ok)f(for)h(faster)g(sorting)f(algorithms,)f(so)i(as)g(to)g(sp)s +-(eed)f(up)390 4723 y(compression.)47 b(Bela)34 b(Lubkin)c(encouraged)k +-(me)f(to)g(impro)m(v)m(e)g(the)g(w)m(orst-case)i(compres-)390 +-4826 y(sion)25 b(p)s(erformance.)38 b(Man)m(y)26 b(p)s(eople)f(sen)m(t) +-h(patc)m(hes,)h(help)s(ed)d(with)g(p)s(ortabilit)m(y)f(problems,)390 +-4930 y(len)m(t)30 b(mac)m(hines,)g(ga)m(v)m(e)j(advice)d(and)g(w)m(ere) +-h(generally)f(helpful.)p eop +-%%Page: 9 10 +-9 9 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1927 b Fl(9)150 299 y Fh(3)80 b(Programming)53 +-b(with)h Fg(libbzip2)150 568 y Fl(This)29 b(c)m(hapter)i(describ)s(es)d +-(the)j(programming)e(in)m(terface)i(to)g 
Fj(libbzip2)p +-Fl(.)150 725 y(F)-8 b(or)36 b(general)e(bac)m(kground)h(information,)f +-(particularly)f(ab)s(out)h(memory)h(use)f(and)g(p)s(erformance)g(as-) +-150 834 y(p)s(ects,)d(y)m(ou'd)f(b)s(e)g(w)m(ell)f(advised)g(to)j(read) +-e(Chapter)g(2)g(as)h(w)m(ell.)150 1124 y Fk(3.1)68 b(T)-11 +-b(op-lev)l(el)46 b(structure)150 1316 y Fj(libbzip2)33 +-b Fl(is)i(a)h(\015exible)e(library)f(for)j(compressing)f(and)g +-(decompressing)f(data)j(in)d(the)i Fj(bzip2)e Fl(data)150 +-1426 y(format.)39 b(Although)24 b(pac)m(k)-5 b(aged)26 +-b(as)e(a)h(single)e(en)m(tit)m(y)-8 b(,)27 b(it)d(helps)f(to)i(regard)g +-(the)g(library)d(as)i(three)h(separate)150 1535 y(parts:)40 +-b(the)31 b(lo)m(w)f(lev)m(el)g(in)m(terface,)h(and)f(the)h(high)e(lev)m +-(el)h(in)m(terface,)h(and)f(some)h(utilit)m(y)d(functions.)150 +-1692 y(The)38 b(structure)g(of)g Fj(libbzip2)p Fl('s)e(in)m(terfaces)j +-(is)e(similar)f(to)j(that)g(of)g(Jean-loup)e(Gailly's)g(and)h(Mark)150 +-1802 y(Adler's)29 b(excellen)m(t)i Fj(zlib)e Fl(library)-8 +-b(.)150 1959 y(All)29 b(externally)g(visible)f(sym)m(b)s(ols)h(ha)m(v)m +-(e)i(names)f(b)s(eginning)e Fj(BZ2_)p Fl(.)39 b(This)29 +-b(is)g(new)h(in)f(v)m(ersion)h(1.0.)41 b(The)150 2068 +-y(in)m(ten)m(tion)30 b(is)f(to)i(minimise)d(p)s(ollution)f(of)k(the)f +-(namespaces)h(of)g(library)d(clien)m(ts.)150 2321 y Ff(3.1.1)63 +-b(Lo)m(w-lev)m(el)39 b(summary)150 2514 y Fl(This)21 +-b(in)m(terface)h(pro)m(vides)g(services)g(for)g(compressing)f(and)h +-(decompressing)f(data)i(in)e(memory)-8 b(.)38 b(There's)150 +-2623 y(no)43 b(pro)m(vision)e(for)h(dealing)g(with)f(\014les,)k +-(streams)e(or)g(an)m(y)g(other)g(I/O)g(mec)m(hanisms,)i(just)e(straigh) +-m(t)150 2733 y(memory-to-memory)25 b(w)m(ork.)38 b(In)23 +-b(fact,)k(this)22 b(part)i(of)f(the)h(library)d(can)j(b)s(e)f(compiled) +-f(without)h(inclusion)150 2843 y(of)31 b Fj(stdio.h)p +-Fl(,)d(whic)m(h)h(ma)m(y)i(b)s(e)f(helpful)d(for)k(em)m(b)s(edded)e +-(applications.)150 2999 y(The)h(lo)m(w-lev)m(el)g(part)g(of)h(the)f 
+-(library)e(has)i(no)h(global)e(v)-5 b(ariables)29 b(and)h(is)g +-(therefore)g(thread-safe.)150 3156 y(Six)d(routines)g(mak)m(e)j(up)d +-(the)i(lo)m(w)f(lev)m(el)g(in)m(terface:)41 b Fj(BZ2_bzCompressInit)p +-Fl(,)24 b Fj(BZ2_bzCompress)p Fl(,)h(and)150 3266 y Fj +-(BZ2_bzCompressEnd)h Fl(for)k(compression,)f(and)h(a)h(corresp)s +-(onding)d(trio)i Fj(BZ2_bzDecompressInit)p Fl(,)150 3375 +-y Fj(BZ2_bzDecompress)37 b Fl(and)j Fj(BZ2_bzDecompressEnd)c +-Fl(for)42 b(decompression.)72 b(The)41 b Fj(*Init)e Fl(functions)150 +-3485 y(allo)s(cate)44 b(memory)g(for)f(compression/decompression)f(and) +-h(do)h(other)g(initialisations,)f(whilst)f(the)150 3595 +-y Fj(*End)29 b Fl(functions)g(close)i(do)m(wn)f(op)s(erations)f(and)h +-(release)h(memory)-8 b(.)150 3751 y(The)36 b(real)f(w)m(ork)i(is)e +-(done)h(b)m(y)g Fj(BZ2_bzCompress)c Fl(and)j Fj(BZ2_bzDecompress)p +-Fl(.)54 b(These)36 b(compress)g(and)150 3861 y(decompress)30 +-b(data)h(from)f(a)h(user-supplied)c(input)i(bu\013er)g(to)i(a)g +-(user-supplied)c(output)j(bu\013er.)40 b(These)150 3971 +-y(bu\013ers)32 b(can)i(b)s(e)e(an)m(y)i(size;)g(arbitrary)e(quan)m +-(tities)h(of)g(data)h(are)g(handled)d(b)m(y)i(making)f(rep)s(eated)i +-(calls)150 4080 y(to)f(these)f(functions.)44 b(This)30 +-b(is)h(a)h(\015exible)e(mec)m(hanism)i(allo)m(wing)e(a)i(consumer-pull) +-e(st)m(yle)i(of)g(activit)m(y)-8 b(,)150 4190 y(or)30 +-b(pro)s(ducer-push,)e(or)i(a)h(mixture)e(of)i(b)s(oth.)150 +-4443 y Ff(3.1.2)63 b(High-lev)m(el)41 b(summary)150 4635 +-y Fl(This)d(in)m(terface)j(pro)m(vides)e(some)h(handy)f(wrapp)s(ers)f +-(around)h(the)i(lo)m(w-lev)m(el)f(in)m(terface)g(to)h(facilitate)150 +-4745 y(reading)26 b(and)g(writing)f Fj(bzip2)g Fl(format)i(\014les)f +-(\()p Fj(.bz2)g Fl(\014les\).)38 b(The)27 b(routines)e(pro)m(vide)h(ho) +-s(oks)h(to)g(facilitate)150 4854 y(reading)43 b(\014les)f(in)h(whic)m +-(h)f(the)i Fj(bzip2)f Fl(data)h(stream)g(is)f(em)m(b)s(edded)f(within)g +-(some)i(larger-scale)g(\014le)150 4964 y(structure,)30 
+-b(or)h(where)e(there)i(are)g(m)m(ultiple)d Fj(bzip2)h +-Fl(data)i(streams)f(concatenated)j(end-to-end.)150 5121 +-y(F)-8 b(or)31 b(reading)f(\014les,)f Fj(BZ2_bzReadOpen)p +-Fl(,)e Fj(BZ2_bzRead)p Fl(,)h Fj(BZ2_bzReadClose)e Fl(and)150 +-5230 y Fj(BZ2_bzReadGetUnused)19 b Fl(are)25 b(supplied.)36 +-b(F)-8 b(or)25 b(writing)d(\014les,)j Fj(BZ2_bzWriteOpen)p +-Fl(,)d Fj(BZ2_bzWrite)g Fl(and)150 5340 y Fj(BZ2_bzWriteFinish)k +-Fl(are)k(a)m(v)-5 b(ailable.)p eop +-%%Page: 10 11 +-10 10 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(10)150 299 y(As)24 b(with)f(the)h(lo)m +-(w-lev)m(el)h(library)-8 b(,)23 b(no)h(global)g(v)-5 +-b(ariables)23 b(are)h(used)g(so)g(the)h(library)c(is)j(p)s(er)f(se)h +-(thread-safe.)150 408 y(Ho)m(w)m(ev)m(er,)32 b(if)c(I/O)h(errors)g(o)s +-(ccur)g(whilst)e(reading)i(or)g(writing)e(the)j(underlying)c +-(compressed)j(\014les,)g(y)m(ou)150 518 y(ma)m(y)j(ha)m(v)m(e)g(to)g +-(consult)e Fj(errno)g Fl(to)h(determine)g(the)g(cause)g(of)h(the)f +-(error.)42 b(In)30 b(that)i(case,)h(y)m(ou'd)e(need)g(a)150 +-628 y(C)f(library)e(whic)m(h)h(correctly)i(supp)s(orts)d +-Fj(errno)h Fl(in)g(a)i(m)m(ultithreaded)e(en)m(vironmen)m(t.)150 +-784 y(T)-8 b(o)56 b(mak)m(e)g(the)g(library)d(a)j(little)e(simpler)f +-(and)i(more)h(p)s(ortable,)61 b Fj(BZ2_bzReadOpen)51 +-b Fl(and)k Fj(BZ2_)150 894 y(bzWriteOpen)34 b Fl(require)j(y)m(ou)g(to) +-i(pass)e(them)g(\014le)g(handles)f(\()p Fj(FILE*)p Fl(s\))g(whic)m(h)h +-(ha)m(v)m(e)h(previously)e(b)s(een)150 1004 y(op)s(ened)41 +-b(for)g(reading)f(or)h(writing)f(resp)s(ectiv)m(ely)-8 +-b(.)73 b(That)41 b(a)m(v)m(oids)h(p)s(ortabilit)m(y)d(problems)g(asso)s +-(ciated)150 1113 y(with)j(\014le)h(op)s(erations)g(and)g(\014le)g +-(attributes,)j(whilst)c(not)i(b)s(eing)e(m)m(uc)m(h)h(of)h(an)g(imp)s +-(osition)c(on)k(the)150 1223 y(programmer.)150 1474 y +-Ff(3.1.3)63 b(Utilit)m(y)40 b(functions)h(summary)150 +-1666 y Fl(F)-8 b(or)45 b(v)m(ery)g(simple)d(needs,)48 +-b 
Fj(BZ2_bzBuffToBuffCompres)o(s)38 b Fl(and)44 b Fj +-(BZ2_bzBuffToBuffDecompres)o(s)150 1776 y Fl(are)29 b(pro)m(vided.)38 +-b(These)28 b(compress)g(data)h(in)e(memory)h(from)g(one)h(bu\013er)e +-(to)i(another)f(bu\013er)g(in)f(a)h(single)150 1885 y(function)38 +-b(call.)67 b(Y)-8 b(ou)40 b(should)d(assess)j(whether)f(these)h +-(functions)d(ful\014ll)f(y)m(our)k(memory-to-memory)150 +-1995 y(compression/decompression)26 b(requiremen)m(ts)h(b)s(efore)g(in) +-m(v)m(esting)g(e\013ort)i(in)d(understanding)f(the)j(more)150 +-2105 y(general)i(but)g(more)h(complex)f(lo)m(w-lev)m(el)g(in)m +-(terface.)150 2261 y(Y)-8 b(oshiok)j(a)47 b(Tsuneo)e(\()p +-Fj(QWF00133@niftyserve.or.jp)40 b Fl(/)46 b Fj +-(tsuneo-y@is.aist-nara.ac.)o(jp)p Fl(\))40 b(has)150 +-2371 y(con)m(tributed)f(some)h(functions)e(to)j(giv)m(e)f(b)s(etter)g +-Fj(zlib)f Fl(compatibilit)m(y)-8 b(.)67 b(These)40 b(functions)e(are)i +-Fj(BZ2_)150 2481 y(bzopen)p Fl(,)e Fj(BZ2_bzread)p Fl(,)f +-Fj(BZ2_bzwrite)p Fl(,)g Fj(BZ2_bzflush)p Fl(,)g Fj(BZ2_bzclose)p +-Fl(,)f Fj(BZ2_bzerror)f Fl(and)i Fj(BZ2_)150 2590 y(bzlibVersion)p +-Fl(.)49 b(Y)-8 b(ou)35 b(ma)m(y)g(\014nd)e(these)i(functions)d(more)j +-(con)m(v)m(enien)m(t)g(for)f(simple)f(\014le)g(reading)h(and)150 +-2700 y(writing,)c(than)h(those)h(in)e(the)i(high-lev)m(el)e(in)m +-(terface.)45 b(These)31 b(functions)f(are)i(not)g(\(y)m(et\))h +-(o\016cially)d(part)150 2809 y(of)k(the)g(library)-8 +-b(,)33 b(and)g(are)h(minimally)c(do)s(cumen)m(ted)k(here.)51 +-b(If)33 b(they)h(break,)h(y)m(ou)f(get)h(to)g(k)m(eep)f(all)f(the)150 +-2919 y(pieces.)40 b(I)31 b(hop)s(e)e(to)i(do)s(cumen)m(t)g(them)f(prop) +-s(erly)e(when)h(time)i(p)s(ermits.)150 3076 y(Y)-8 b(oshiok)j(a)27 +-b(also)g(con)m(tributed)f(mo)s(di\014cations)f(to)i(allo)m(w)f(the)h +-(library)e(to)i(b)s(e)f(built)f(as)i(a)g(Windo)m(ws)f(DLL.)150 +-3362 y Fk(3.2)68 b(Error)45 b(handling)150 3554 y Fl(The)23 +-b(library)f(is)h(designed)g(to)i(reco)m(v)m(er)g(cleanly)f(in)e(all)h 
+-(situations,)h(including)d(the)j(w)m(orst-case)i(situation)150 +-3664 y(of)j(decompressing)e(random)g(data.)41 b(I'm)28 +-b(not)h(100\045)g(sure)f(that)h(it)f(can)h(alw)m(a)m(ys)g(do)f(this,)g +-(so)g(y)m(ou)h(migh)m(t)150 3774 y(w)m(an)m(t)i(to)g(add)e(a)i(signal)d +-(handler)g(to)j(catc)m(h)h(segmen)m(tation)f(violations)e(during)f +-(decompression)h(if)g(y)m(ou)150 3883 y(are)g(feeling)f(esp)s(ecially)f +-(paranoid.)39 b(I)28 b(w)m(ould)g(b)s(e)g(in)m(terested)h(in)e(hearing) +-h(more)h(ab)s(out)f(the)h(robustness)150 3993 y(of)i(the)f(library)e +-(to)j(corrupted)f(compressed)g(data.)150 4150 y(V)-8 +-b(ersion)39 b(1.0)h(is)f(m)m(uc)m(h)g(more)h(robust)e(in)g(this)g(resp) +-s(ect)i(than)f(0.9.0)i(or)e(0.9.5.)70 b(In)m(v)m(estigations)39 +-b(with)150 4259 y(Chec)m(k)m(er)21 b(\(a)g(to)s(ol)g(for)f(detecting)h +-(problems)d(with)h(memory)h(managemen)m(t,)k(similar)18 +-b(to)j(Purify\))e(indicate)150 4369 y(that,)40 b(at)e(least)f(for)g +-(the)h(few)e(\014les)h(I)g(tested,)j(all)c(single-bit)f(errors)i(in)e +-(the)j(decompressed)f(data)h(are)150 4478 y(caugh)m(t)c(prop)s(erly)-8 +-b(,)31 b(with)g(no)i(segmen)m(tation)h(faults,)e(no)g(reads)h(of)g +-(uninitialised)27 b(data)34 b(and)e(no)g(out)h(of)150 +-4588 y(range)f(reads)g(or)f(writes.)44 b(So)32 b(it's)f(certainly)g(m)m +-(uc)m(h)h(impro)m(v)m(ed,)g(although)f(I)g(w)m(ouldn't)g(claim)g(it)g +-(to)i(b)s(e)150 4698 y(totally)d(b)s(om)m(bpro)s(of.)150 +-4854 y(The)25 b(\014le)g Fj(bzlib.h)f Fl(con)m(tains)i(all)f +-(de\014nitions)e(needed)i(to)i(use)e(the)h(library)-8 +-b(.)37 b(In)26 b(particular,)f(y)m(ou)h(should)150 4964 +-y(de\014nitely)i(not)j(include)d Fj(bzlib_private.h)p +-Fl(.)150 5121 y(In)39 b Fj(bzlib.h)p Fl(,)h(the)g(v)-5 +-b(arious)39 b(return)f(v)-5 b(alues)39 b(are)h(de\014ned.)68 +-b(The)39 b(follo)m(wing)f(list)h(is)f(not)i(in)m(tended)f(as)150 +-5230 y(an)c(exhaustiv)m(e)h(description)d(of)i(the)h(circumstances)f +-(in)f(whic)m(h)g(a)i(giv)m(en)f(v)-5 b(alue)35 b(ma)m(y)h(b)s(e)e 
+-(returned)h({)150 5340 y(those)h(descriptions)d(are)j(giv)m(en)f +-(later.)56 b(Rather,)37 b(it)d(is)h(in)m(tended)f(to)i(con)m(v)m(ey)h +-(the)e(rough)g(meaning)g(of)p eop +-%%Page: 11 12 +-11 11 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(11)150 299 y(eac)m(h)38 +-b(return)d(v)-5 b(alue.)59 b(The)36 b(\014rst)g(\014v)m(e)g(actions)h +-(are)g(normal)f(and)f(not)i(in)m(tended)f(to)h(denote)g(an)f(error)150 +-408 y(situation.)150 592 y Fj(BZ_OK)180 b Fl(The)30 b(requested)g +-(action)h(w)m(as)g(completed)f(successfully)-8 b(.)150 +-756 y Fj(BZ_RUN_OK)150 866 y(BZ_FLUSH_OK)150 975 y(BZ_FINISH_OK)630 +-1085 y Fl(In)24 b Fj(BZ2_bzCompress)p Fl(,)e(the)i(requested)g +-(\015ush/\014nish/nothing-sp)s(ecial)c(action)k(w)m(as)h(com-)630 +-1194 y(pleted)30 b(successfully)-8 b(.)150 1358 y Fj(BZ_STREAM_END)630 +-1468 y Fl(Compression)38 b(of)j(data)f(w)m(as)h(completed,)h(or)f(the)f +-(logical)f(stream)i(end)e(w)m(as)i(detected)630 1577 +-y(during)28 b(decompression.)150 1761 y(The)i(follo)m(wing)f(return)g +-(v)-5 b(alues)30 b(indicate)f(an)h(error)g(of)h(some)g(kind.)150 +-1945 y Fj(BZ_CONFIG_ERROR)630 2055 y Fl(Indicates)48 +-b(that)h(the)g(library)e(has)h(b)s(een)g(improp)s(erly)d(compiled)j(on) +-g(y)m(our)h(platform)630 2164 y({)j(a)g(ma)5 b(jor)51 +-b(con\014guration)g(error.)104 b(Sp)s(eci\014cally)-8 +-b(,)55 b(it)c(means)g(that)h Fj(sizeof\(char\))p Fl(,)630 +-2274 y Fj(sizeof\(short\))44 b Fl(and)i Fj(sizeof\(int\))f +-Fl(are)j(not)f(1,)52 b(2)c(and)f(4)h(resp)s(ectiv)m(ely)-8 +-b(,)51 b(as)d(they)630 2384 y(should)27 b(b)s(e.)40 b(Note)30 +-b(that)g(the)f(library)e(should)g(still)g(w)m(ork)i(prop)s(erly)e(on)i +-(64-bit)g(platforms)630 2493 y(whic)m(h)d(follo)m(w)h(the)g(LP64)h +-(programming)e(mo)s(del)h({)g(that)h(is,)g(where)e Fj(sizeof\(long\))f +-Fl(and)630 2603 y Fj(sizeof\(void*\))e Fl(are)k(8.)40 +-b(Under)25 b(LP64,)j Fj(sizeof\(int\))c Fl(is)h(still)f(4,)k(so)f +-Fj(libbzip2)p Fl(,)e(whic)m(h)630 2712 
y(do)s(esn't)30 +-b(use)g(the)h Fj(long)e Fl(t)m(yp)s(e,)i(is)e(OK.)150 +-2876 y Fj(BZ_SEQUENCE_ERROR)630 2986 y Fl(When)43 b(using)f(the)i +-(library)-8 b(,)45 b(it)e(is)f(imp)s(ortan)m(t)h(to)h(call)e(the)i +-(functions)e(in)g(the)i(correct)630 3095 y(sequence)28 +-b(and)f(with)f(data)j(structures)e(\(bu\013ers)f(etc\))j(in)e(the)g +-(correct)i(states.)41 b Fj(libbzip2)630 3205 y Fl(c)m(hec)m(ks)26 +-b(as)e(m)m(uc)m(h)h(as)f(it)g(can)g(to)h(ensure)f(this)f(is)g(happ)s +-(ening,)h(and)f(returns)g Fj(BZ_SEQUENCE_)630 3314 y(ERROR)36 +-b Fl(if)h(not.)62 b(Co)s(de)37 b(whic)m(h)g(complies)f(precisely)g +-(with)h(the)g(function)g(seman)m(tics,)j(as)630 3424 +-y(detailed)d(b)s(elo)m(w,)i(should)d(nev)m(er)i(receiv)m(e)h(this)d(v) +--5 b(alue;)41 b(suc)m(h)d(an)g(ev)m(en)m(t)h(denotes)f(buggy)630 +-3534 y(co)s(de)31 b(whic)m(h)e(y)m(ou)h(should)f(in)m(v)m(estigate.)150 +-3697 y Fj(BZ_PARAM_ERROR)630 3807 y Fl(Returned)43 b(when)f(a)i +-(parameter)g(to)h(a)f(function)e(call)h(is)f(out)i(of)g(range)g(or)g +-(otherwise)630 3917 y(manifestly)34 b(incorrect.)57 b(As)36 +-b(with)e Fj(BZ_SEQUENCE_ERROR)p Fl(,)f(this)i(denotes)h(a)g(bug)f(in)g +-(the)630 4026 y(clien)m(t)23 b(co)s(de.)39 b(The)22 b(distinction)f(b)s +-(et)m(w)m(een)j Fj(BZ_PARAM_ERROR)c Fl(and)j Fj(BZ_SEQUENCE_ERROR)630 +-4136 y Fl(is)29 b(a)i(bit)f(hazy)-8 b(,)31 b(but)f(still)e(w)m(orth)i +-(making.)150 4300 y Fj(BZ_MEM_ERROR)630 4409 y Fl(Returned)g(when)f(a)i +-(request)f(to)i(allo)s(cate)f(memory)f(failed.)40 b(Note)31 +-b(that)g(the)g(quan)m(tit)m(y)g(of)630 4519 y(memory)21 +-b(needed)g(to)i(decompress)e(a)g(stream)h(cannot)g(b)s(e)f(determined)f +-(un)m(til)g(the)h(stream's)630 4628 y(header)29 b(has)g(b)s(een)g +-(read.)40 b(So)29 b Fj(BZ2_bzDecompress)c Fl(and)j Fj(BZ2_bzRead)f +-Fl(ma)m(y)j(return)e Fj(BZ_)630 4738 y(MEM_ERROR)d Fl(ev)m(en)k(though) +-e(some)h(of)g(the)g(compressed)g(data)g(has)g(b)s(een)f(read.)39 +-b(The)28 b(same)630 4847 y(is)38 b(not)i(true)f(for)g(compression;)k +-(once)d 
Fj(BZ2_bzCompressInit)34 b Fl(or)39 b Fj(BZ2_bzWriteOpen)630 +-4957 y Fl(ha)m(v)m(e)32 b(successfully)c(completed,)j +-Fj(BZ_MEM_ERROR)c Fl(cannot)k(o)s(ccur.)150 5121 y Fj(BZ_DATA_ERROR)630 +-5230 y Fl(Returned)h(when)g(a)h(data)g(in)m(tegrit)m(y)g(error)g(is)e +-(detected)k(during)30 b(decompression.)47 b(Most)630 +-5340 y(imp)s(ortan)m(tly)-8 b(,)31 b(this)f(means)i(when)f(stored)g +-(and)g(computed)h(CR)m(Cs)f(for)g(the)h(data)g(do)g(not)p +-eop +-%%Page: 12 13 +-12 12 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(12)630 299 y(matc)m(h.)41 +-b(This)28 b(v)-5 b(alue)29 b(is)f(also)i(returned)e(up)s(on)g +-(detection)i(of)g(an)m(y)g(other)f(anomaly)h(in)e(the)630 +-408 y(compressed)i(data.)150 560 y Fj(BZ_DATA_ERROR_MAGIC)630 +-670 y Fl(As)k(a)g(sp)s(ecial)f(case)i(of)f Fj(BZ_DATA_ERROR)p +-Fl(,)d(it)i(is)g(sometimes)h(useful)e(to)j(kno)m(w)f(when)f(the)630 +-779 y(compressed)d(stream)h(do)s(es)f(not)g(start)h(with)e(the)i +-(correct)h(magic)e(b)m(ytes)h(\()p Fj('B')f('Z')f('h')p +-Fl(\).)150 931 y Fj(BZ_IO_ERROR)630 1040 y Fl(Returned)k(b)m(y)h +-Fj(BZ2_bzRead)d Fl(and)i Fj(BZ2_bzWrite)e Fl(when)i(there)h(is)f(an)g +-(error)h(reading)f(or)630 1150 y(writing)28 b(in)h(the)h(compressed)g +-(\014le,)f(and)h(b)m(y)g Fj(BZ2_bzReadOpen)c Fl(and)j +-Fj(BZ2_bzWriteOpen)630 1259 y Fl(for)i(attempts)i(to)f(use)f(a)h +-(\014le)e(for)i(whic)m(h)e(the)h(error)g(indicator)g(\(viz,)g +-Fj(ferror\(f\))p Fl(\))f(is)g(set.)630 1369 y(On)h(receipt)g(of)h +-Fj(BZ_IO_ERROR)p Fl(,)e(the)h(caller)h(should)d(consult)i +-Fj(errno)g Fl(and/or)g Fj(perror)f Fl(to)630 1479 y(acquire)g(op)s +-(erating-system)g(sp)s(eci\014c)f(information)g(ab)s(out)h(the)h +-(problem.)150 1630 y Fj(BZ_UNEXPECTED_EOF)630 1740 y +-Fl(Returned)36 b(b)m(y)g Fj(BZ2_bzRead)e Fl(when)i(the)h(compressed)f +-(\014le)g(\014nishes)e(b)s(efore)j(the)f(logical)630 +-1849 y(end)30 b(of)g(stream)h(is)e(detected.)150 2001 +-y Fj(BZ_OUTBUFF_FULL)630 2110 y Fl(Returned)g(b)m(y)i 
+-Fj(BZ2_bzBuffToBuffCompres)o(s)24 b Fl(and)30 b Fj +-(BZ2_bzBuffToBuffDecompres)o(s)630 2220 y Fl(to)h(indicate)f(that)h +-(the)f(output)g(data)h(will)d(not)i(\014t)h(in)m(to)f(the)h(output)f +-(bu\013er)f(pro)m(vided.)150 2492 y Fk(3.3)68 b(Lo)l(w-lev)l(el)47 +-b(in)l(terface)150 2766 y Ff(3.3.1)63 b Fe(BZ2_bzCompressInit)390 +-2953 y Fj(typedef)533 3057 y(struct)46 b({)676 3161 y(char)h(*next_in;) +-676 3264 y(unsigned)f(int)h(avail_in;)676 3368 y(unsigned)f(int)h +-(total_in_lo32;)676 3472 y(unsigned)f(int)h(total_in_hi32;)676 +-3680 y(char)g(*next_out;)676 3783 y(unsigned)f(int)h(avail_out;)676 +-3887 y(unsigned)f(int)h(total_out_lo32;)676 3991 y(unsigned)f(int)h +-(total_out_hi32;)676 4198 y(void)g(*state;)676 4406 y(void)g +-(*\(*bzalloc\)\(void)c(*,int,int\);)676 4510 y(void)k +-(\(*bzfree\)\(void)d(*,void)i(*\);)676 4614 y(void)h(*opaque;)533 +-4717 y(})533 4821 y(bz_stream;)390 5029 y(int)g(BZ2_bzCompressInit)c +-(\()k(bz_stream)e(*strm,)1583 5132 y(int)i(blockSize100k,)1583 +-5236 y(int)g(verbosity,)1583 5340 y(int)g(workFactor)e(\);)p +-eop +-%%Page: 13 14 +-13 13 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(13)150 456 y(Prepares)32 +-b(for)h(compression.)47 b(The)32 b Fj(bz_stream)e Fl(structure)j(holds) +-e(all)h(data)h(p)s(ertaining)e(to)i(the)g(com-)150 565 +-y(pression)i(activit)m(y)-8 b(.)62 b(A)37 b Fj(bz_stream)e +-Fl(structure)h(should)f(b)s(e)i(allo)s(cated)g(and)f(initialised)e +-(prior)h(to)j(the)150 675 y(call.)67 b(The)39 b(\014elds)e(of)j +-Fj(bz_stream)d Fl(comprise)h(the)i(en)m(tiret)m(y)g(of)f(the)h +-(user-visible)c(data.)68 b Fj(state)38 b Fl(is)h(a)150 +-784 y(p)s(oin)m(ter)29 b(to)i(the)g(priv)-5 b(ate)30 +-b(data)h(structures)f(required)e(for)i(compression.)150 +-941 y(Custom)37 b(memory)g(allo)s(cators)g(are)h(supp)s(orted,)f(via)g +-(\014elds)f Fj(bzalloc)p Fl(,)h Fj(bzfree)p Fl(,)g(and)g +-Fj(opaque)p Fl(.)59 b(The)150 1051 y(v)-5 b(alue)32 b +-Fj(opaque)e 
Fl(is)i(passed)f(to)i(as)g(the)f(\014rst)g(argumen)m(t)h +-(to)g(all)e(calls)g(to)i Fj(bzalloc)d Fl(and)i Fj(bzfree)p +-Fl(,)f(but)h(is)150 1160 y(otherwise)d(ignored)g(b)m(y)h(the)g(library) +--8 b(.)38 b(The)29 b(call)h Fj(bzalloc)e(\()i(opaque,)e(n,)i(m)g(\))g +-Fl(is)e(exp)s(ected)j(to)f(return)150 1270 y(a)g(p)s(oin)m(ter)e +-Fj(p)h Fl(to)h Fj(n)g(*)g(m)f Fl(b)m(ytes)h(of)g(memory)-8 +-b(,)30 b(and)e Fj(bzfree)h(\()h(opaque,)f(p)h(\))f Fl(should)e(free)i +-(that)h(memory)-8 b(.)150 1427 y(If)33 b(y)m(ou)g(don't)h(w)m(an)m(t)g +-(to)g(use)f(a)g(custom)h(memory)f(allo)s(cator,)h(set)g +-Fj(bzalloc)p Fl(,)e Fj(bzfree)g Fl(and)h Fj(opaque)e +-Fl(to)150 1537 y Fj(NULL)p Fl(,)e(and)h(the)h(library)d(will)f(then)k +-(use)f(the)g(standard)g Fj(malloc)p Fl(/)p Fj(free)e +-Fl(routines.)150 1693 y(Before)39 b(calling)d Fj(BZ2_bzCompressInit)p +-Fl(,)f(\014elds)h Fj(bzalloc)p Fl(,)h Fj(bzfree)f Fl(and)h +-Fj(opaque)f Fl(should)g(b)s(e)h(\014lled)150 1803 y(appropriately)-8 +-b(,)35 b(as)h(just)f(describ)s(ed.)53 b(Up)s(on)34 b(return,)i(the)g +-(in)m(ternal)e(state)i(will)d(ha)m(v)m(e)j(b)s(een)f(allo)s(cated)150 +-1913 y(and)43 b(initialised,)g(and)g Fj(total_in_lo32)p +-Fl(,)h Fj(total_in_hi32)p Fl(,)f Fj(total_out_lo32)d +-Fl(and)j Fj(total_out_)150 2022 y(hi32)37 b Fl(will)f(ha)m(v)m(e)j(b)s +-(een)f(set)h(to)g(zero.)65 b(These)38 b(four)g(\014elds)e(are)j(used)f +-(b)m(y)g(the)g(library)e(to)j(inform)e(the)150 2132 y(caller)j(of)g +-(the)h(total)g(amoun)m(t)g(of)g(data)g(passed)f(in)m(to)g(and)g(out)g +-(of)h(the)g(library)-8 b(,)41 b(resp)s(ectiv)m(ely)-8 +-b(.)70 b(Y)-8 b(ou)150 2241 y(should)34 b(not)j(try)f(to)h(c)m(hange)g +-(them.)58 b(As)36 b(of)h(v)m(ersion)f(1.0,)j(64-bit)d(coun)m(ts)h(are)f +-(main)m(tained,)h(ev)m(en)g(on)150 2351 y(32-bit)i(platforms,)h(using)d +-(the)i Fj(_hi32)e Fl(\014elds)g(to)j(store)f(the)g(upp)s(er)d(32)k +-(bits)d(of)i(the)g(coun)m(t.)66 b(So,)41 b(for)150 2460 +-y(example,)30 b(the)h(total)g(amoun)m(t)g(of)f(data)h(in)f(is)f 
+-Fj(\(total_in_hi32)d(<<)k(32\))g(+)g(total_in_lo32)p +-Fl(.)150 2617 y(P)m(arameter)g Fj(blockSize100k)25 b +-Fl(sp)s(eci\014es)i(the)h(blo)s(c)m(k)g(size)h(to)g(b)s(e)f(used)f(for) +-h(compression.)40 b(It)28 b(should)f(b)s(e)150 2727 y(a)k(v)-5 +-b(alue)30 b(b)s(et)m(w)m(een)i(1)f(and)f(9)h(inclusiv)m(e,)e(and)h(the) +-h(actual)g(blo)s(c)m(k)f(size)g(used)g(is)g(100000)j(x)e(this)e +-(\014gure.)42 b(9)150 2836 y(giv)m(es)31 b(the)f(b)s(est)g(compression) +-g(but)f(tak)m(es)j(most)f(memory)-8 b(.)150 2993 y(P)m(arameter)29 +-b Fj(verbosity)c Fl(should)h(b)s(e)h(set)i(to)f(a)h(n)m(um)m(b)s(er)d +-(b)s(et)m(w)m(een)j(0)f(and)f(4)h(inclusiv)m(e.)38 b(0)28 +-b(is)f(silen)m(t,)h(and)150 3103 y(greater)j(n)m(um)m(b)s(ers)c(giv)m +-(e)j(increasingly)d(v)m(erb)s(ose)j(monitoring/debugging)d(output.)40 +-b(If)29 b(the)g(library)e(has)150 3212 y(b)s(een)j(compiled)e(with)i +-Fj(-DBZ_NO_STDIO)p Fl(,)d(no)j(suc)m(h)g(output)g(will)e(app)s(ear)h +-(for)h(an)m(y)h(v)m(erb)s(osit)m(y)f(setting.)150 3369 +-y(P)m(arameter)35 b Fj(workFactor)d Fl(con)m(trols)i(ho)m(w)g(the)g +-(compression)f(phase)h(b)s(eha)m(v)m(es)g(when)f(presen)m(ted)h(with) +-150 3479 y(w)m(orst)40 b(case,)j(highly)37 b(rep)s(etitiv)m(e,)k(input) +-d(data.)68 b(If)39 b(compression)g(runs)e(in)m(to)j(di\016culties)d +-(caused)i(b)m(y)150 3588 y(rep)s(etitiv)m(e)34 b(data,)j(the)e(library) +-d(switc)m(hes)j(from)f(the)h(standard)f(sorting)g(algorithm)g(to)i(a)f +-(fallbac)m(k)f(al-)150 3698 y(gorithm.)47 b(The)32 b(fallbac)m(k)g(is)g +-(slo)m(w)m(er)g(than)h(the)f(standard)g(algorithm)g(b)m(y)g(p)s(erhaps) +-f(a)i(factor)h(of)e(three,)150 3808 y(but)e(alw)m(a)m(ys)h(b)s(eha)m(v) +-m(es)f(reasonably)-8 b(,)31 b(no)f(matter)h(ho)m(w)g(bad)f(the)g +-(input.)150 3965 y(Lo)m(w)m(er)25 b(v)-5 b(alues)24 b(of)h +-Fj(workFactor)d Fl(reduce)i(the)h(amoun)m(t)g(of)g(e\013ort)g(the)g +-(standard)f(algorithm)f(will)f(exp)s(end)150 4074 y(b)s(efore)j +-(resorting)h(to)g(the)g(fallbac)m(k.)39 b(Y)-8 b(ou)27 
+-b(should)c(set)k(this)e(parameter)h(carefully;)g(to)s(o)h(lo)m(w,)g +-(and)e(man)m(y)150 4184 y(inputs)32 b(will)f(b)s(e)i(handled)f(b)m(y)i +-(the)g(fallbac)m(k)g(algorithm)f(and)g(so)h(compress)g(rather)g(slo)m +-(wly)-8 b(,)34 b(to)s(o)h(high,)150 4293 y(and)24 b(y)m(our)h(a)m(v)m +-(erage-to-w)m(orst)30 b(case)c(compression)e(times)h(can)g(b)s(ecome)g +-(v)m(ery)h(large.)39 b(The)24 b(default)g(v)-5 b(alue)150 +-4403 y(of)31 b(30)g(giv)m(es)f(reasonable)h(b)s(eha)m(viour)e(o)m(v)m +-(er)i(a)g(wide)e(range)i(of)f(circumstances.)150 4560 +-y(Allo)m(w)m(able)h(v)-5 b(alues)31 b(range)i(from)e(0)i(to)f(250)h +-(inclusiv)m(e.)44 b(0)32 b(is)f(a)h(sp)s(ecial)f(case,)i(equiv)-5 +-b(alen)m(t)32 b(to)g(using)f(the)150 4669 y(default)f(v)-5 +-b(alue)29 b(of)i(30.)150 4826 y(Note)38 b(that)f(the)g(compressed)f +-(output)g(generated)h(is)f(the)g(same)h(regardless)f(of)h(whether)f(or) +-g(not)h(the)150 4936 y(fallbac)m(k)30 b(algorithm)f(is)h(used.)150 +-5093 y(Be)23 b(a)m(w)m(are)h(also)f(that)g(this)f(parameter)h(ma)m(y)g +-(disapp)s(ear)e(en)m(tirely)h(in)f(future)h(v)m(ersions)g(of)h(the)g +-(library)-8 b(.)36 b(In)150 5202 y(principle)20 b(it)j(should)e(b)s(e)h +-(p)s(ossible)f(to)j(devise)f(a)g(go)s(o)s(d)g(w)m(a)m(y)i(to)f +-(automatically)f(c)m(ho)s(ose)h(whic)m(h)e(algorithm)150 +-5312 y(to)31 b(use.)41 b(Suc)m(h)29 b(a)i(mec)m(hanism)f(w)m(ould)f +-(render)g(the)i(parameter)g(obsolete.)p eop +-%%Page: 14 15 +-14 14 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(14)150 299 y(P)m(ossible)29 +-b(return)h(v)-5 b(alues:)572 450 y Fj(BZ_CONFIG_ERROR)663 +-554 y Fl(if)29 b(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 +-657 y Fj(BZ_PARAM_ERROR)663 761 y Fl(if)g Fj(strm)g Fl(is)h +-Fj(NULL)663 865 y Fl(or)g Fj(blockSize)e(<)i Fl(1)h(or)f +-Fj(blockSize)e(>)i Fl(9)663 969 y(or)g Fj(verbosity)e(<)i +-Fl(0)h(or)f Fj(verbosity)e(>)i Fl(4)663 1073 y(or)g Fj(workFactor)e(<)i +-Fl(0)g(or)h Fj(workFactor)c(>)j Fl(250)572 1176 y 
Fj(BZ_MEM_ERROR)663 +-1280 y Fl(if)f(not)i(enough)f(memory)g(is)f(a)m(v)-5 +-b(ailable)572 1384 y Fj(BZ_OK)663 1488 y Fl(otherwise)150 +-1645 y(Allo)m(w)m(able)30 b(next)g(actions:)572 1796 +-y Fj(BZ2_bzCompress)663 1899 y Fl(if)f Fj(BZ_OK)g Fl(is)g(returned)572 +-2003 y(no)h(sp)s(eci\014c)f(action)i(needed)f(in)f(case)i(of)g(error) +-150 2255 y Ff(3.3.2)63 b Fe(BZ2_bzCompress)533 2441 y +-Fj(int)47 b(BZ2_bzCompress)d(\()j(bz_stream)f(*strm,)g(int)h(action)f +-(\);)150 2598 y Fl(Pro)m(vides)28 b(more)g(input)f(and/or)h(output)g +-(bu\013er)g(space)h(for)f(the)h(library)-8 b(.)38 b(The)28 +-b(caller)g(main)m(tains)f(input)150 2708 y(and)j(output)g(bu\013ers,)f +-(and)h(calls)g Fj(BZ2_bzCompress)c Fl(to)31 b(transfer)f(data)h(b)s(et) +-m(w)m(een)g(them.)150 2865 y(Before)j(eac)m(h)g(call)e(to)i +-Fj(BZ2_bzCompress)p Fl(,)c Fj(next_in)h Fl(should)g(p)s(oin)m(t)h(at)h +-(the)g(data)h(to)g(b)s(e)e(compressed,)150 2974 y(and)41 +-b Fj(avail_in)f Fl(should)g(indicate)h(ho)m(w)h(man)m(y)f(b)m(ytes)i +-(the)f(library)d(ma)m(y)k(read.)75 b Fj(BZ2_bzCompress)150 +-3084 y Fl(up)s(dates)29 b Fj(next_in)p Fl(,)g Fj(avail_in)f +-Fl(and)i Fj(total_in)e Fl(to)j(re\015ect)g(the)g(n)m(um)m(b)s(er)e(of)h +-(b)m(ytes)h(it)f(has)g(read.)150 3241 y(Similarly)-8 +-b(,)27 b Fj(next_out)h Fl(should)g(p)s(oin)m(t)h(to)i(a)f(bu\013er)f +-(in)g(whic)m(h)g(the)h(compressed)g(data)h(is)e(to)i(b)s(e)e(placed,) +-150 3350 y(with)i Fj(avail_out)f Fl(indicating)h(ho)m(w)h(m)m(uc)m(h)h +-(output)f(space)h(is)f(a)m(v)-5 b(ailable.)46 b Fj(BZ2_bzCompress)29 +-b Fl(up)s(dates)150 3460 y Fj(next_out)p Fl(,)f Fj(avail_out)g +-Fl(and)i Fj(total_out)e Fl(to)j(re\015ect)g(the)f(n)m(um)m(b)s(er)g(of) +-g(b)m(ytes)h(output.)150 3617 y(Y)-8 b(ou)40 b(ma)m(y)g(pro)m(vide)e +-(and)h(remo)m(v)m(e)i(as)f(little)e(or)h(as)h(m)m(uc)m(h)f(data)h(as)g +-(y)m(ou)f(lik)m(e)g(on)g(eac)m(h)i(call)e(of)g Fj(BZ2_)150 +-3726 y(bzCompress)p Fl(.)48 b(In)33 b(the)h(limit,)f(it)h(is)f 
+-(acceptable)h(to)h(supply)c(and)j(remo)m(v)m(e)h(data)g(one)f(b)m(yte)g +-(at)h(a)f(time,)150 3836 y(although)28 b(this)f(w)m(ould)g(b)s(e)h +-(terribly)e(ine\016cien)m(t.)39 b(Y)-8 b(ou)29 b(should)e(alw)m(a)m(ys) +-h(ensure)g(that)h(at)g(least)g(one)f(b)m(yte)150 3946 +-y(of)j(output)f(space)g(is)g(a)m(v)-5 b(ailable)30 b(at)h(eac)m(h)g +-(call.)150 4102 y(A)38 b(second)h(purp)s(ose)d(of)j Fj(BZ2_bzCompress) +-34 b Fl(is)j(to)i(request)f(a)h(c)m(hange)g(of)g(mo)s(de)e(of)i(the)f +-(compressed)150 4212 y(stream.)150 4369 y(Conceptually)-8 +-b(,)24 b(a)g(compressed)g(stream)g(can)f(b)s(e)g(in)g(one)h(of)f(four)g +-(states:)39 b(IDLE,)24 b(R)m(UNNING,)h(FLUSH-)150 4478 +-y(ING)37 b(and)g(FINISHING.)g(Before)i(initialisation)33 +-b(\()p Fj(BZ2_bzCompressInit)p Fl(\))g(and)j(after)i(termination)150 +-4588 y(\()p Fj(BZ2_bzCompressEnd)p Fl(\),)27 b(a)j(stream)h(is)f +-(regarded)g(as)g(IDLE.)150 4745 y(Up)s(on)35 b(initialisation)e(\()p +-Fj(BZ2_bzCompressInit)p Fl(\),)h(the)i(stream)h(is)e(placed)h(in)e(the) +-j(R)m(UNNING)g(state.)150 4854 y(Subsequen)m(t)j(calls)g(to)i +-Fj(BZ2_bzCompress)37 b Fl(should)j(pass)g Fj(BZ_RUN)g +-Fl(as)h(the)g(requested)h(action;)47 b(other)150 4964 +-y(actions)31 b(are)f(illegal)f(and)h(will)d(result)j(in)f +-Fj(BZ_SEQUENCE_ERROR)p Fl(.)150 5121 y(A)m(t)38 b(some)f(p)s(oin)m(t,)h +-(the)f(calling)e(program)i(will)d(ha)m(v)m(e)k(pro)m(vided)e(all)f(the) +-i(input)e(data)j(it)e(w)m(an)m(ts)i(to.)61 b(It)150 5230 +-y(will)28 b(then)h(w)m(an)m(t)i(to)g(\014nish)d(up)h({)i(in)d +-(e\013ect,)k(asking)e(the)g(library)e(to)j(pro)s(cess)f(an)m(y)g(data)h +-(it)f(migh)m(t)g(ha)m(v)m(e)150 5340 y(bu\013ered)25 +-b(in)m(ternally)-8 b(.)38 b(In)25 b(this)g(state,)k Fj(BZ2_bzCompress) +-22 b Fl(will)i(no)i(longer)g(attempt)h(to)g(read)f(data)h(from)p +-eop +-%%Page: 15 16 +-15 15 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(15)150 299 y Fj(next_in)p +-Fl(,)33 
b(but)g(it)h(will)d(w)m(an)m(t)k(to)g(write)e(data)h(to)h +-Fj(next_out)p Fl(.)49 b(Because)36 b(the)e(output)f(bu\013er)g +-(supplied)150 408 y(b)m(y)e(the)h(user)e(can)i(b)s(e)f(arbitrarily)d +-(small,)j(the)g(\014nishing-up)d(op)s(eration)i(cannot)i(necessarily)e +-(b)s(e)h(done)150 518 y(with)e(a)i(single)e(call)h(of)g +-Fj(BZ2_bzCompress)p Fl(.)150 675 y(Instead,)47 b(the)d(calling)f +-(program)g(passes)h Fj(BZ_FINISH)d Fl(as)j(an)g(action)g(to)h +-Fj(BZ2_bzCompress)p Fl(.)77 b(This)150 784 y(c)m(hanges)30 +-b(the)f(stream's)g(state)h(to)f(FINISHING.)g(An)m(y)g(remaining)e +-(input)g(\(ie,)i Fj(next_in[0)f(..)i(avail_)150 894 y(in-1])p +-Fl(\))36 b(is)f(compressed)i(and)f(transferred)g(to)h(the)g(output)g +-(bu\013er.)58 b(T)-8 b(o)38 b(do)e(this,)i Fj(BZ2_bzCompress)150 +-1004 y Fl(m)m(ust)h(b)s(e)f(called)g(rep)s(eatedly)h(un)m(til)e(all)h +-(the)h(output)f(has)h(b)s(een)f(consumed.)66 b(A)m(t)40 +-b(that)g(p)s(oin)m(t,)g Fj(BZ2_)150 1113 y(bzCompress)h +-Fl(returns)h Fj(BZ_STREAM_END)p Fl(,)i(and)f(the)h(stream's)g(state)h +-(is)d(set)j(bac)m(k)f(to)g(IDLE.)g Fj(BZ2_)150 1223 y(bzCompressEnd)27 +-b Fl(should)h(then)i(b)s(e)g(called.)150 1380 y(Just)25 +-b(to)i(mak)m(e)g(sure)e(the)i(calling)d(program)i(do)s(es)g(not)g(c)m +-(heat,)i(the)f(library)c(mak)m(es)k(a)f(note)h(of)f Fj(avail_in)150 +-1489 y Fl(at)g(the)g(time)f(of)g(the)g(\014rst)g(call)g(to)h +-Fj(BZ2_bzCompress)21 b Fl(whic)m(h)j(has)h Fj(BZ_FINISH)e +-Fl(as)i(an)h(action)f(\(ie,)i(at)f(the)150 1599 y(time)d(the)h(program) +-g(has)f(announced)g(its)h(in)m(ten)m(tion)f(to)h(not)g(supply)e(an)m(y) +-i(more)g(input\).)37 b(By)24 b(comparing)150 1708 y(this)k(v)-5 +-b(alue)28 b(with)g(that)h(of)h Fj(avail_in)c Fl(o)m(v)m(er)k(subsequen) +-m(t)f(calls)f(to)h Fj(BZ2_bzCompress)p Fl(,)d(the)j(library)e(can)150 +-1818 y(detect)33 b(an)m(y)e(attempts)i(to)f(slip)d(in)h(more)h(data)h +-(to)h(compress.)43 b(An)m(y)31 b(calls)g(for)g(whic)m(h)f(this)g(is)h +-(detected)150 1928 y(will)j(return)h 
Fj(BZ_SEQUENCE_ERROR)p +-Fl(.)55 b(This)34 b(indicates)i(a)h(programming)e(mistak)m(e)i(whic)m +-(h)e(should)g(b)s(e)150 2037 y(corrected.)150 2194 y(Instead)i(of)g +-(asking)f(to)h(\014nish,)f(the)h(calling)f(program)g(ma)m(y)h(ask)g +-Fj(BZ2_bzCompress)c Fl(to)38 b(tak)m(e)g(all)e(the)150 +-2304 y(remaining)j(input,)i(compress)f(it)g(and)g(terminate)h(the)g +-(curren)m(t)f(\(Burro)m(ws-Wheeler\))h(compression)150 +-2413 y(blo)s(c)m(k.)e(This)26 b(could)h(b)s(e)g(useful)f(for)h(error)h +-(con)m(trol)g(purp)s(oses.)38 b(The)27 b(mec)m(hanism)g(is)g(analogous) +-h(to)g(that)150 2523 y(for)35 b(\014nishing:)46 b(call)35 +-b Fj(BZ2_bzCompress)c Fl(with)i(an)i(action)g(of)g Fj(BZ_FLUSH)p +-Fl(,)g(remo)m(v)m(e)h(output)f(data,)i(and)150 2632 y(p)s(ersist)h +-(with)g(the)i Fj(BZ_FLUSH)e Fl(action)i(un)m(til)e(the)i(v)-5 +-b(alue)39 b Fj(BZ_RUN)f Fl(is)h(returned.)68 b(As)39 +-b(with)g(\014nishing,)150 2742 y Fj(BZ2_bzCompress)23 +-b Fl(detects)28 b(an)m(y)f(attempt)h(to)f(pro)m(vide)f(more)h(input)e +-(data)i(once)g(the)g(\015ush)e(has)i(b)s(egun.)150 2899 +-y(Once)j(the)h(\015ush)e(is)g(complete,)i(the)g(stream)f(returns)g(to)h +-(the)f(normal)g(R)m(UNNING)h(state.)150 3056 y(This)f(all)h(sounds)g +-(prett)m(y)h(complex,)h(but)e(isn't)g(really)-8 b(.)45 +-b(Here's)33 b(a)f(table)g(whic)m(h)f(sho)m(ws)h(whic)m(h)f(actions)150 +-3165 y(are)e(allo)m(w)m(able)f(in)f(eac)m(h)j(state,)g(what)f(action)g +-(will)c(b)s(e)j(tak)m(en,)j(what)d(the)h(next)f(state)i(is,)e(and)g +-(what)h(the)150 3275 y(non-error)h(return)f(v)-5 b(alues)29 +-b(are.)41 b(Note)32 b(that)e(y)m(ou)h(can't)g(explicitly)d(ask)i(what)g +-(state)i(the)e(stream)h(is)e(in,)150 3384 y(but)h(nor)g(do)g(y)m(ou)h +-(need)f(to)h({)g(it)e(can)i(b)s(e)f(inferred)e(from)i(the)h(v)-5 +-b(alues)29 b(returned)h(b)m(y)g Fj(BZ2_bzCompress)p Fl(.)390 +-3535 y(IDLE/)p Fj(any)572 3639 y Fl(Illegal.)60 b(IDLE)30 +-b(state)i(only)d(exists)h(after)h Fj(BZ2_bzCompressEnd)26 +-b Fl(or)572 3743 y(b)s(efore)k Fj(BZ2_bzCompressInit)p 
+-Fl(.)572 3847 y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_SEQUENCE_ERROR)390 +-4054 y Fl(R)m(UNNING/)p Fj(BZ_RUN)572 4158 y Fl(Compress)f(from)h +-Fj(next_in)f Fl(to)i Fj(next_out)d Fl(as)i(m)m(uc)m(h)h(as)f(p)s +-(ossible.)572 4262 y(Next)h(state)h(=)e(R)m(UNNING)572 +-4366 y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_RUN_OK)390 +-4573 y Fl(R)m(UNNING/)p Fj(BZ_FLUSH)572 4677 y Fl(Remem)m(b)s(er)g +-(curren)m(t)g(v)-5 b(alue)30 b(of)g Fj(next_in)p Fl(.)59 +-b(Compress)30 b(from)g Fj(next_in)572 4781 y Fl(to)h +-Fj(next_out)d Fl(as)j(m)m(uc)m(h)f(as)h(p)s(ossible,)d(but)i(do)g(not)g +-(accept)i(an)m(y)f(more)f(input.)572 4885 y(Next)h(state)h(=)e +-(FLUSHING)572 4988 y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_FLUSH_OK)390 +-5196 y Fl(R)m(UNNING/)p Fj(BZ_FINISH)572 5300 y Fl(Remem)m(b)s(er)g +-(curren)m(t)g(v)-5 b(alue)30 b(of)g Fj(next_in)p Fl(.)59 +-b(Compress)30 b(from)g Fj(next_in)p eop +-%%Page: 16 17 +-16 16 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(16)572 299 y(to)31 b Fj(next_out)d +-Fl(as)j(m)m(uc)m(h)f(as)h(p)s(ossible,)d(but)i(do)g(not)g(accept)i(an)m +-(y)f(more)f(input.)572 403 y(Next)h(state)h(=)e(FINISHING)572 +-506 y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_FINISH_OK)390 +-714 y Fl(FLUSHING/)p Fj(BZ_FLUSH)572 818 y Fl(Compress)f(from)h +-Fj(next_in)f Fl(to)i Fj(next_out)d Fl(as)i(m)m(uc)m(h)h(as)f(p)s +-(ossible,)572 922 y(but)f(do)i(not)f(accept)i(an)m(y)f(more)f(input.) 
+-572 1025 y(If)g(all)f(the)i(existing)e(input)f(has)i(b)s(een)g(used)g +-(up)f(and)h(all)f(compressed)572 1129 y(output)h(has)g(b)s(een)g(remo)m +-(v)m(ed)663 1233 y(Next)h(state)h(=)e(R)m(UNNING;)i(Return)d(v)-5 +-b(alue)30 b(=)g Fj(BZ_RUN_OK)572 1337 y Fl(else)663 1440 +-y(Next)h(state)h(=)e(FLUSHING;)h(Return)e(v)-5 b(alue)30 +-b(=)g Fj(BZ_FLUSH_OK)390 1648 y Fl(FLUSHING/other)572 +-1752 y(Illegal.)572 1856 y(Return)f(v)-5 b(alue)30 b(=)g +-Fj(BZ_SEQUENCE_ERROR)390 2063 y Fl(FINISHING/)p Fj(BZ_FINISH)572 +-2167 y Fl(Compress)f(from)h Fj(next_in)f Fl(to)i Fj(next_out)d +-Fl(as)i(m)m(uc)m(h)h(as)f(p)s(ossible,)572 2271 y(but)f(to)j(not)e +-(accept)i(an)m(y)f(more)f(input.)572 2374 y(If)g(all)f(the)i(existing)e +-(input)f(has)i(b)s(een)g(used)g(up)f(and)h(all)f(compressed)572 +-2478 y(output)h(has)g(b)s(een)g(remo)m(v)m(ed)663 2582 +-y(Next)h(state)h(=)e(IDLE;)g(Return)g(v)-5 b(alue)30 +-b(=)g Fj(BZ_STREAM_END)572 2686 y Fl(else)663 2790 y(Next)h(state)h(=)e +-(FINISHING;)g(Return)g(v)-5 b(alue)30 b(=)g Fj(BZ_FINISHING)390 +-2997 y Fl(FINISHING/other)572 3101 y(Illegal.)572 3205 +-y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_SEQUENCE_ERROR)150 +-3361 y Fl(That)24 b(still)f(lo)s(oks)g(complicated?)39 +-b(W)-8 b(ell,)25 b(fair)f(enough.)38 b(The)24 b(usual)f(sequence)i(of)f +-(calls)g(for)g(compressing)150 3471 y(a)31 b(load)f(of)g(data)h(is:)225 +-3628 y Fi(\017)60 b Fl(Get)31 b(started)g(with)e Fj(BZ2_bzCompressInit) +-p Fl(.)225 3774 y Fi(\017)60 b Fl(Sho)m(v)m(el)38 b(data)h(in)e(and)g +-(shlurp)e(out)k(its)e(compressed)h(form)g(using)e(zero)j(or)f(more)h +-(calls)e(of)h Fj(BZ2_)330 3884 y(bzCompress)28 b Fl(with)h(action)h(=)g +-Fj(BZ_RUN)p Fl(.)225 4030 y Fi(\017)60 b Fl(Finish)23 +-b(up.)38 b(Rep)s(eatedly)25 b(call)f Fj(BZ2_bzCompress)e +-Fl(with)i(action)h(=)g Fj(BZ_FINISH)p Fl(,)f(cop)m(ying)h(out)h(the)330 +-4139 y(compressed)k(output,)g(un)m(til)f Fj(BZ_STREAM_END)e +-Fl(is)i(returned.)225 4285 y Fi(\017)60 b Fl(Close)30 +-b(up)f(and)h(go)h(home.)41 b(Call)29 b 
Fj(BZ2_bzCompressEnd)p +-Fl(.)150 4478 y(If)23 b(the)h(data)h(y)m(ou)f(w)m(an)m(t)h(to)f +-(compress)g(\014ts)f(in)m(to)h(y)m(our)g(input)e(bu\013er)h(all)f(at)j +-(once,)h(y)m(ou)e(can)g(skip)f(the)h(calls)150 4588 y(of)37 +-b Fj(BZ2_bzCompress)26 b(\()k(...,)f(BZ_RUN)g(\))36 b +-Fl(and)g(just)g(do)h(the)g Fj(BZ2_bzCompress)26 b(\()k(...,)f +-(BZ_FINISH)150 4698 y(\))h Fl(calls.)150 4854 y(All)36 +-b(required)g(memory)h(is)f(allo)s(cated)i(b)m(y)f Fj +-(BZ2_bzCompressInit)p Fl(.)56 b(The)37 b(compression)g(library)e(can) +-150 4964 y(accept)g(an)m(y)f(data)h(at)g(all)d(\(ob)m(viously\).)51 +-b(So)34 b(y)m(ou)g(shouldn't)e(get)j(an)m(y)f(error)f(return)g(v)-5 +-b(alues)33 b(from)h(the)150 5074 y Fj(BZ2_bzCompress)29 +-b Fl(calls.)46 b(If)32 b(y)m(ou)h(do,)g(they)g(will)d(b)s(e)i +-Fj(BZ_SEQUENCE_ERROR)p Fl(,)d(and)j(indicate)f(a)i(bug)f(in)150 +-5183 y(y)m(our)e(programming.)150 5340 y(T)-8 b(rivial)28 +-b(other)j(p)s(ossible)d(return)h(v)-5 b(alues:)p eop +-%%Page: 17 18 +-17 17 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(17)572 299 y Fj(BZ_PARAM_ERROR)663 +-403 y Fl(if)29 b Fj(strm)g Fl(is)h Fj(NULL)p Fl(,)f(or)i +-Fj(strm->s)d Fl(is)h Fj(NULL)150 652 y Ff(3.3.3)63 b +-Fe(BZ2_bzCompressEnd)390 839 y Fj(int)47 b(BZ2_bzCompressEnd)c(\()k +-(bz_stream)f(*strm)g(\);)150 996 y Fl(Releases)31 b(all)e(memory)h +-(asso)s(ciated)h(with)e(a)i(compression)e(stream.)150 +-1153 y(P)m(ossible)g(return)h(v)-5 b(alues:)481 1304 +-y Fj(BZ_PARAM_ERROR)117 b Fl(if)30 b Fj(strm)f Fl(is)g +-Fj(NULL)g Fl(or)i Fj(strm->s)d Fl(is)i Fj(NULL)481 1408 +-y(BZ_OK)120 b Fl(otherwise)150 1657 y Ff(3.3.4)63 b Fe +-(BZ2_bzDecompressInit)390 1844 y Fj(int)47 b(BZ2_bzDecompressInit)42 +-b(\()48 b(bz_stream)d(*strm,)h(int)h(verbosity,)e(int)i(small)f(\);)150 +-2001 y Fl(Prepares)30 b(for)f(decompression.)40 b(As)29 +-b(with)g Fj(BZ2_bzCompressInit)p Fl(,)c(a)31 b Fj(bz_stream)c +-Fl(record)j(should)e(b)s(e)150 2110 y(allo)s(cated)c(and)f(initialised) 
+-e(b)s(efore)i(the)i(call.)38 b(Fields)22 b Fj(bzalloc)p +-Fl(,)i Fj(bzfree)e Fl(and)i Fj(opaque)e Fl(should)g(b)s(e)h(set)i(if) +-150 2220 y(a)h(custom)f(memory)g(allo)s(cator)g(is)g(required,)f(or)h +-(made)h Fj(NULL)e Fl(for)h(the)g(normal)f Fj(malloc)p +-Fl(/)p Fj(free)f Fl(routines.)150 2330 y(Up)s(on)h(return,)h(the)g(in)m +-(ternal)f(state)i(will)c(ha)m(v)m(e)k(b)s(een)f(initialised,)d(and)i +-Fj(total_in)f Fl(and)h Fj(total_out)f Fl(will)150 2439 +-y(b)s(e)30 b(zero.)150 2596 y(F)-8 b(or)31 b(the)g(meaning)e(of)i +-(parameter)g Fj(verbosity)p Fl(,)d(see)j Fj(BZ2_bzCompressInit)p +-Fl(.)150 2753 y(If)e Fj(small)e Fl(is)h(nonzero,)i(the)f(library)e +-(will)f(use)j(an)g(alternativ)m(e)h(decompression)e(algorithm)g(whic)m +-(h)f(uses)150 2862 y(less)c(memory)g(but)g(at)h(the)g(cost)h(of)e +-(decompressing)g(more)g(slo)m(wly)g(\(roughly)f(sp)s(eaking,)i(half)f +-(the)h(sp)s(eed,)150 2972 y(but)34 b(the)i(maxim)m(um)d(memory)i +-(requiremen)m(t)g(drops)e(to)j(around)e(2300k\).)57 b(See)35 +-b(Chapter)g(2)g(for)g(more)150 3082 y(information)29 +-b(on)h(memory)g(managemen)m(t.)150 3238 y(Note)40 b(that)f(the)f(amoun) +-m(t)h(of)g(memory)f(needed)g(to)i(decompress)e(a)h(stream)f(cannot)h(b) +-s(e)f(determined)150 3348 y(un)m(til)j(the)h(stream's)h(header)f(has)g +-(b)s(een)g(read,)j(so)e(ev)m(en)g(if)e Fj(BZ2_bzDecompressInit)c +-Fl(succeeds,)46 b(a)150 3458 y(subsequen)m(t)30 b Fj(BZ2_bzDecompress)c +-Fl(could)j(fail)g(with)g Fj(BZ_MEM_ERROR)p Fl(.)150 3614 +-y(P)m(ossible)g(return)h(v)-5 b(alues:)572 3765 y Fj(BZ_CONFIG_ERROR) +-663 3869 y Fl(if)29 b(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 +-3973 y Fj(BZ_PARAM_ERROR)663 4077 y Fl(if)g Fj(\(small)46 +-b(!=)h(0)h(&&)f(small)f(!=)h(1\))663 4181 y Fl(or)30 +-b Fj(\(verbosity)45 b(<)j(0)f(||)g(verbosity)e(>)j(4\))572 +-4284 y(BZ_MEM_ERROR)663 4388 y Fl(if)29 b(insu\016cien)m(t)g(memory)h +-(is)f(a)m(v)-5 b(ailable)150 4545 y(Allo)m(w)m(able)30 +-b(next)g(actions:)572 4696 y Fj(BZ2_bzDecompress)663 +-4800 y 
Fl(if)f Fj(BZ_OK)g Fl(w)m(as)i(returned)572 4904 +-y(no)f(sp)s(eci\014c)f(action)i(required)e(in)g(case)i(of)g(error)150 +-5153 y Ff(3.3.5)63 b Fe(BZ2_bzDecompress)390 5340 y Fj(int)47 +-b(BZ2_bzDecompress)c(\()48 b(bz_stream)d(*strm)h(\);)p +-eop +-%%Page: 18 19 +-18 18 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(18)150 299 y(Pro)m(vides)24 +-b(more)g(input)f(and/out)h(output)g(bu\013er)g(space)h(for)f(the)g +-(library)-8 b(.)37 b(The)24 b(caller)g(main)m(tains)f(input)150 +-408 y(and)30 b(output)g(bu\013ers,)f(and)h(uses)g Fj(BZ2_bzDecompress)c +-Fl(to)31 b(transfer)f(data)h(b)s(et)m(w)m(een)g(them.)150 +-565 y(Before)g(eac)m(h)g(call)f(to)g Fj(BZ2_bzDecompress)p +-Fl(,)c Fj(next_in)i Fl(should)h(p)s(oin)m(t)g(at)h(the)h(compressed)e +-(data,)j(and)150 675 y Fj(avail_in)h Fl(should)h(indicate)h(ho)m(w)h +-(man)m(y)f(b)m(ytes)i(the)e(library)f(ma)m(y)i(read.)56 +-b Fj(BZ2_bzDecompress)32 b Fl(up-)150 784 y(dates)f Fj(next_in)p +-Fl(,)e Fj(avail_in)f Fl(and)h Fj(total_in)g Fl(to)i(re\015ect)g(the)f +-(n)m(um)m(b)s(er)f(of)i(b)m(ytes)g(it)f(has)g(read.)150 +-941 y(Similarly)-8 b(,)37 b Fj(next_out)f Fl(should)g(p)s(oin)m(t)i(to) +-g(a)h(bu\013er)e(in)g(whic)m(h)g(the)i(uncompressed)e(output)g(is)h(to) +-h(b)s(e)150 1051 y(placed,)d(with)e Fj(avail_out)f Fl(indicating)g(ho)m +-(w)i(m)m(uc)m(h)g(output)g(space)h(is)e(a)m(v)-5 b(ailable.)55 +-b Fj(BZ2_bzCompress)150 1160 y Fl(up)s(dates)29 b Fj(next_out)p +-Fl(,)g Fj(avail_out)f Fl(and)h Fj(total_out)f Fl(to)j(re\015ect)g(the)g +-(n)m(um)m(b)s(er)e(of)h(b)m(ytes)h(output.)150 1317 y(Y)-8 +-b(ou)40 b(ma)m(y)g(pro)m(vide)e(and)h(remo)m(v)m(e)i(as)f(little)e(or)h +-(as)h(m)m(uc)m(h)f(data)h(as)g(y)m(ou)f(lik)m(e)g(on)g(eac)m(h)i(call)e +-(of)g Fj(BZ2_)150 1427 y(bzDecompress)p Fl(.)e(In)27 +-b(the)i(limit,)d(it)i(is)f(acceptable)j(to)f(supply)d(and)h(remo)m(v)m +-(e)j(data)f(one)f(b)m(yte)h(at)g(a)g(time,)150 1537 y(although)f(this)f 
+-(w)m(ould)g(b)s(e)h(terribly)e(ine\016cien)m(t.)39 b(Y)-8 +-b(ou)29 b(should)e(alw)m(a)m(ys)h(ensure)g(that)h(at)g(least)g(one)f(b) +-m(yte)150 1646 y(of)j(output)f(space)g(is)g(a)m(v)-5 +-b(ailable)30 b(at)h(eac)m(h)g(call.)150 1803 y(Use)g(of)f +-Fj(BZ2_bzDecompress)c Fl(is)k(simpler)e(than)i Fj(BZ2_bzCompress)p +-Fl(.)150 1960 y(Y)-8 b(ou)31 b(should)d(pro)m(vide)h(input)f(and)i +-(remo)m(v)m(e)i(output)d(as)i(describ)s(ed)d(ab)s(o)m(v)m(e,)k(and)d +-(rep)s(eatedly)h(call)f Fj(BZ2_)150 2069 y(bzDecompress)35 +-b Fl(un)m(til)i Fj(BZ_STREAM_END)e Fl(is)j(returned.)64 +-b(App)s(earance)39 b(of)g Fj(BZ_STREAM_END)c Fl(denotes)150 +-2179 y(that)47 b Fj(BZ2_bzDecompress)42 b Fl(has)k(detected)h(the)f +-(logical)g(end)g(of)g(the)h(compressed)e(stream.)89 b +-Fj(BZ2_)150 2289 y(bzDecompress)28 b Fl(will)g(not)j(pro)s(duce)f +-Fj(BZ_STREAM_END)d Fl(un)m(til)j(all)f(output)i(data)h(has)e(b)s(een)h +-(placed)f(in)m(to)150 2398 y(the)36 b(output)g(bu\013er,)h(so)g(once)g +-Fj(BZ_STREAM_END)32 b Fl(app)s(ears,)38 b(y)m(ou)e(are)h(guaran)m(teed) +-g(to)g(ha)m(v)m(e)h(a)m(v)-5 b(ailable)150 2508 y(all)29 +-b(the)i(decompressed)f(output,)g(and)g Fj(BZ2_bzDecompressEnd)25 +-b Fl(can)31 b(safely)f(b)s(e)f(called.)150 2665 y(If)40 +-b(case)h(of)f(an)h(error)e(return)h(v)-5 b(alue,)42 b(y)m(ou)f(should)d +-(call)h Fj(BZ2_bzDecompressEnd)c Fl(to)41 b(clean)f(up)g(and)150 +-2774 y(release)31 b(memory)-8 b(.)150 2931 y(P)m(ossible)29 +-b(return)h(v)-5 b(alues:)572 3082 y Fj(BZ_PARAM_ERROR)663 +-3186 y Fl(if)29 b Fj(strm)g Fl(is)h Fj(NULL)f Fl(or)h +-Fj(strm->s)f Fl(is)g Fj(NULL)663 3290 y Fl(or)h Fj(strm->avail_out)44 +-b(<)j(1)572 3393 y(BZ_DATA_ERROR)663 3497 y Fl(if)29 +-b(a)i(data)g(in)m(tegrit)m(y)f(error)g(is)g(detected)h(in)e(the)i +-(compressed)f(stream)572 3601 y Fj(BZ_DATA_ERROR_MAGIC)663 +-3705 y Fl(if)f(the)i(compressed)f(stream)g(do)s(esn't)h(b)s(egin)e +-(with)g(the)h(righ)m(t)g(magic)h(b)m(ytes)572 3808 y +-Fj(BZ_MEM_ERROR)663 3912 y 
Fl(if)e(there)i(w)m(asn't)f(enough)h(memory) +-f(a)m(v)-5 b(ailable)572 4016 y Fj(BZ_STREAM_END)663 +-4120 y Fl(if)29 b(the)i(logical)e(end)h(of)h(the)f(data)h(stream)g(w)m +-(as)g(detected)g(and)f(all)663 4224 y(output)g(in)f(has)h(b)s(een)g +-(consumed,)f(eg)j Fj(s->avail_out)44 b(>)k(0)572 4327 +-y(BZ_OK)663 4431 y Fl(otherwise)150 4588 y(Allo)m(w)m(able)30 +-b(next)g(actions:)572 4739 y Fj(BZ2_bzDecompress)663 +-4843 y Fl(if)f Fj(BZ_OK)g Fl(w)m(as)i(returned)572 4946 +-y Fj(BZ2_bzDecompressEnd)663 5050 y Fl(otherwise)p eop +-%%Page: 19 20 +-19 19 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(19)150 299 y Ff(3.3.6)63 +-b Fe(BZ2_bzDecompressEnd)390 486 y Fj(int)47 b(BZ2_bzDecompressEnd)42 +-b(\()48 b(bz_stream)d(*strm)i(\);)150 643 y Fl(Releases)31 +-b(all)e(memory)h(asso)s(ciated)h(with)e(a)i(decompression)e(stream.)150 +-799 y(P)m(ossible)g(return)h(v)-5 b(alues:)572 950 y +-Fj(BZ_PARAM_ERROR)663 1054 y Fl(if)29 b Fj(strm)g Fl(is)h +-Fj(NULL)f Fl(or)h Fj(strm->s)f Fl(is)g Fj(NULL)572 1158 +-y(BZ_OK)663 1262 y Fl(otherwise)150 1419 y(Allo)m(w)m(able)h(next)g +-(actions:)572 1570 y(None.)150 1857 y Fk(3.4)68 b(High-lev)l(el)47 +-b(in)l(terface)150 2050 y Fl(This)35 b(in)m(terface)j(pro)m(vides)d +-(functions)h(for)g(reading)g(and)h(writing)e Fj(bzip2)g +-Fl(format)i(\014les.)59 b(First,)39 b(some)150 2159 y(general)30 +-b(p)s(oin)m(ts.)225 2316 y Fi(\017)60 b Fl(All)35 b(of)h(the)g +-(functions)e(tak)m(e)k(an)e Fj(int*)f Fl(\014rst)g(argumen)m(t,)j +-Fj(bzerror)p Fl(.)56 b(After)36 b(eac)m(h)h(call,)g Fj(bzerror)330 +-2426 y Fl(should)23 b(b)s(e)i(consulted)g(\014rst)g(to)h(determine)e +-(the)i(outcome)h(of)e(the)h(call.)38 b(If)25 b Fj(bzerror)f +-Fl(is)g Fj(BZ_OK)p Fl(,)i(the)330 2535 y(call)35 b(completed)g +-(successfully)-8 b(,)36 b(and)f(only)g(then)g(should)f(the)h(return)g +-(v)-5 b(alue)35 b(of)h(the)f(function)g(\(if)330 2645 +-y(an)m(y\))30 b(b)s(e)f(consulted.)39 b(If)29 b Fj(bzerror)e +-Fl(is)h 
Fj(BZ_IO_ERROR)p Fl(,)f(there)i(w)m(as)h(an)f(error)g +-(reading/writing)e(the)330 2754 y(underlying)32 b(compressed)j(\014le,) +-h(and)f(y)m(ou)h(should)d(then)i(consult)g Fj(errno)p +-Fl(/)p Fj(perror)e Fl(to)j(determine)330 2864 y(the)i(cause)g(of)g(the) +-g(di\016cult)m(y)-8 b(.)61 b Fj(bzerror)36 b Fl(ma)m(y)i(also)g(b)s(e)f +-(set)h(to)g(v)-5 b(arious)37 b(other)h(v)-5 b(alues;)41 +-b(precise)330 2974 y(details)29 b(are)i(giv)m(en)g(on)f(a)h(p)s +-(er-function)d(basis)h(b)s(elo)m(w.)225 3111 y Fi(\017)60 +-b Fl(If)40 b Fj(bzerror)f Fl(indicates)g(an)i(error)f(\(ie,)j(an)m +-(ything)d(except)h Fj(BZ_OK)f Fl(and)g Fj(BZ_STREAM_END)p +-Fl(\),)g(y)m(ou)330 3220 y(should)56 b(immediately)h(call)g +-Fj(BZ2_bzReadClose)e Fl(\(or)j Fj(BZ2_bzWriteClose)p +-Fl(,)j(dep)s(ending)56 b(on)330 3330 y(whether)50 b(y)m(ou)g(are)h +-(attempting)g(to)g(read)f(or)g(to)i(write\))d(to)j(free)e(up)f(all)h +-(resources)g(asso)s(ci-)330 3439 y(ated)33 b(with)e(the)i(stream.)47 +-b(Once)32 b(an)h(error)f(has)g(b)s(een)g(indicated,)f(b)s(eha)m(viour)g +-(of)i(all)e(calls)h(except)330 3549 y Fj(BZ2_bzReadClose)46 +-b Fl(\()p Fj(BZ2_bzWriteClose)p Fl(\))h(is)j(unde\014ned.)99 +-b(The)50 b(implication)e(is)i(that)h(\(1\))330 3659 y +-Fj(bzerror)44 b Fl(should)g(b)s(e)h(c)m(hec)m(k)m(ed)j(after)e(eac)m(h) +-h(call,)i(and)c(\(2\))i(if)e Fj(bzerror)f Fl(indicates)g(an)i(error,) +-330 3768 y Fj(BZ2_bzReadClose)26 b Fl(\()p Fj(BZ2_bzWriteClose)p +-Fl(\))h(should)h(then)i(b)s(e)g(called)g(to)h(clean)f(up.)225 +-3905 y Fi(\017)60 b Fl(The)33 b Fj(FILE*)f Fl(argumen)m(ts)h(passed)g +-(to)h Fj(BZ2_bzReadOpen)p Fl(/)p Fj(BZ2_bzWriteOp)o(en)27 +-b Fl(should)32 b(b)s(e)g(set)i(to)330 4015 y(binary)23 +-b(mo)s(de.)38 b(Most)26 b(Unix)d(systems)i(will)d(do)i(this)g(b)m(y)g +-(default,)i(but)e(other)g(platforms,)h(including)330 +-4124 y(Windo)m(ws)20 b(and)g(Mac,)k(will)19 b(not.)38 +-b(If)20 b(y)m(ou)h(omit)g(this,)h(y)m(ou)f(ma)m(y)h(encoun)m(ter)f +-(problems)e(when)h(mo)m(ving)330 4234 y(co)s(de)31 
b(to)g(new)f +-(platforms.)225 4371 y Fi(\017)60 b Fl(Memory)23 b(allo)s(cation)f +-(requests)h(are)g(handled)e(b)m(y)i Fj(malloc)p Fl(/)p +-Fj(free)p Fl(.)36 b(A)m(t)23 b(presen)m(t)g(there)g(is)f(no)h(facilit)m +-(y)330 4481 y(for)40 b(user-de\014ned)e(memory)i(allo)s(cators)g(in)f +-(the)h(\014le)g(I/O)g(functions)e(\(could)i(easily)f(b)s(e)g(added,)330 +-4590 y(though\).)150 4842 y Ff(3.4.1)63 b Fe(BZ2_bzReadOpen)533 +-5029 y Fj(typedef)46 b(void)h(BZFILE;)533 5236 y(BZFILE)f +-(*BZ2_bzReadOpen)e(\()j(int)g(*bzerror,)f(FILE)g(*f,)1726 +-5340 y(int)h(small,)f(int)h(verbosity,)p eop +-%%Page: 20 21 +-20 20 bop 150 -116 a Fl(Chapter)30 b(3:)h(Programming)e(with)g +-Fj(libbzip2)1891 b Fl(20)1726 299 y Fj(void)47 b(*unused,)f(int)g +-(nUnused)g(\);)150 456 y Fl(Prepare)29 b(to)g(read)g(compressed)f(data) +-i(from)e(\014le)g(handle)f Fj(f)p Fl(.)40 b Fj(f)29 b +-Fl(should)d(refer)j(to)h(a)f(\014le)f(whic)m(h)f(has)i(b)s(een)150 +-565 y(op)s(ened)h(for)h(reading,)f(and)h(for)f(whic)m(h)g(the)h(error)g +-(indicator)e(\()p Fj(ferror\(f\))p Fl(\)is)f(not)k(set.)42 +-b(If)31 b Fj(small)e Fl(is)h(1,)150 675 y(the)h(library)d(will)f(try)j +-(to)i(decompress)e(using)f(less)g(memory)-8 b(,)31 b(at)g(the)g(exp)s +-(ense)f(of)g(sp)s(eed.)150 832 y(F)-8 b(or)39 b(reasons)f(explained)f +-(b)s(elo)m(w,)j Fj(BZ2_bzRead)35 b Fl(will)h(decompress)i(the)g +-Fj(nUnused)e Fl(b)m(ytes)j(starting)f(at)150 941 y Fj(unused)p +-Fl(,)k(b)s(efore)e(starting)h(to)g(read)g(from)f(the)h(\014le)f +-Fj(f)p Fl(.)71 b(A)m(t)42 b(most)f Fj(BZ_MAX_UNUSED)c +-Fl(b)m(ytes)k(ma)m(y)h(b)s(e)150 1051 y(supplied)32 b(lik)m(e)k(this.) 
+-55 b(If)36 b(this)e(facilit)m(y)h(is)g(not)h(required,)g(y)m(ou)g +-(should)e(pass)h Fj(NULL)g Fl(and)g Fj(0)g Fl(for)h Fj(unused)150 +-1160 y Fl(and)30 b(n)p Fj(Unused)e Fl(resp)s(ectiv)m(ely)-8 +-b(.)150 1317 y(F)g(or)31 b(the)g(meaning)e(of)i(parameters)g +-Fj(small)e Fl(and)g Fj(verbosity)p Fl(,)f(see)j Fj +-(BZ2_bzDecompressInit)p Fl(.)150 1474 y(The)k(amoun)m(t)g(of)g(memory)g +-(needed)g(to)g(decompress)g(a)h(\014le)e(cannot)h(b)s(e)g(determined)e +-(un)m(til)h(the)h(\014le's)150 1584 y(header)22 b(has)f(b)s(een)g +-(read.)38 b(So)22 b(it)f(is)g(p)s(ossible)e(that)k Fj(BZ2_bzReadOpen)17 +-b Fl(returns)k Fj(BZ_OK)f Fl(but)h(a)i(subsequen)m(t)150 +-1693 y(call)30 b(of)g Fj(BZ2_bzRead)e Fl(will)f(return)j +-Fj(BZ_MEM_ERROR)p Fl(.)150 1850 y(P)m(ossible)f(assignmen)m(ts)h(to)h +-Fj(bzerror)p Fl(:)572 2001 y Fj(BZ_CONFIG_ERROR)663 2105 +-y Fl(if)e(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 +-2209 y Fj(BZ_PARAM_ERROR)663 2313 y Fl(if)g Fj(f)h Fl(is)g +-Fj(NULL)663 2416 y Fl(or)g Fj(small)f Fl(is)g(neither)h +-Fj(0)g Fl(nor)g Fj(1)663 2520 y Fl(or)g Fj(\(unused)46 +-b(==)h(NULL)g(&&)g(nUnused)f(!=)h(0\))663 2624 y Fl(or)30 +-b Fj(\(unused)46 b(!=)h(NULL)g(&&)g(!\(0)g(<=)g(nUnused)f(<=)h +-(BZ_MAX_UNUSED\)\))572 2728 y(BZ_IO_ERROR)663 2831 y +-Fl(if)29 b Fj(ferror\(f\))f Fl(is)h(nonzero)572 2935 +-y Fj(BZ_MEM_ERROR)663 3039 y Fl(if)g(insu\016cien)m(t)g(memory)h(is)f +-(a)m(v)-5 b(ailable)572 3143 y Fj(BZ_OK)663 3247 y Fl(otherwise.)150 +-3403 y(P)m(ossible)29 b(return)h(v)-5 b(alues:)572 3554 +-y(P)m(oin)m(ter)31 b(to)g(an)f(abstract)h Fj(BZFILE)663 +-3658 y Fl(if)e Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 3762 +-y(NULL)663 3866 y Fl(otherwise)150 4023 y(Allo)m(w)m(able)g(next)g +-(actions:)572 4174 y Fj(BZ2_bzRead)663 4277 y Fl(if)f +-Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 4381 y(BZ2_bzClose)663 +-4485 y Fl(otherwise)150 4887 y Ff(3.4.2)63 b Fe(BZ2_bzRead)533 +-5074 y Fj(int)47 b(BZ2_bzRead)e(\()j(int)e(*bzerror,)g(BZFILE)g(*b,)h +-(void)f(*buf,)h(int)g(len)g(\);)150 
5230 y Fl(Reads)35 +-b(up)f(to)h Fj(len)f Fl(\(uncompressed\))h(b)m(ytes)g(from)f(the)h +-(compressed)g(\014le)f Fj(b)g Fl(in)m(to)h(the)g(bu\013er)f +-Fj(buf)p Fl(.)53 b(If)150 5340 y(the)30 b(read)f(w)m(as)h(successful,)f +-Fj(bzerror)e Fl(is)i(set)h(to)g Fj(BZ_OK)e Fl(and)h(the)h(n)m(um)m(b)s +-(er)e(of)i(b)m(ytes)g(read)f(is)g(returned.)p eop +-%%Page: 21 22 +-21 21 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(21)150 299 y(If)33 b(the)g(logical)g +-(end-of-stream)h(w)m(as)g(detected,)i Fj(bzerror)31 b +-Fl(will)g(b)s(e)h(set)i(to)g Fj(BZ_STREAM_END)p Fl(,)d(and)i(the)150 +-408 y(n)m(um)m(b)s(er)c(of)i(b)m(ytes)f(read)h(is)e(returned.)40 +-b(All)29 b(other)h Fj(bzerror)f Fl(v)-5 b(alues)29 b(denote)i(an)g +-(error.)150 565 y Fj(BZ2_bzRead)37 b Fl(will)f(supply)h +-Fj(len)i Fl(b)m(ytes,)j(unless)c(the)i(logical)f(stream)h(end)e(is)h +-(detected)i(or)e(an)g(error)150 675 y(o)s(ccurs.)75 b(Because)43 +-b(of)f(this,)i(it)d(is)g(p)s(ossible)e(to)k(detect)g(the)f(stream)g +-(end)f(b)m(y)h(observing)f(when)g(the)150 784 y(n)m(um)m(b)s(er)29 +-b(of)h(b)m(ytes)g(returned)f(is)g(less)g(than)h(the)g(n)m(um)m(b)s(er)f +-(requested.)40 b(Nev)m(ertheless,)31 b(this)e(is)g(regarded)150 +-894 y(as)38 b(inadvisable;)g(y)m(ou)g(should)d(instead)i(c)m(hec)m(k)i +-Fj(bzerror)d Fl(after)i(ev)m(ery)g(call)e(and)h(w)m(atc)m(h)i(out)f +-(for)f Fj(BZ_)150 1004 y(STREAM_END)p Fl(.)150 1160 y(In)m(ternally)-8 +-b(,)47 b Fj(BZ2_bzRead)41 b Fl(copies)j(data)g(from)g(the)g(compressed) +-g(\014le)f(in)f(c)m(h)m(unks)i(of)g(size)g Fj(BZ_MAX_)150 +-1270 y(UNUSED)31 b Fl(b)m(ytes)i(b)s(efore)f(decompressing)f(it.)47 +-b(If)32 b(the)h(\014le)e(con)m(tains)i(more)g(b)m(ytes)g(than)f +-(strictly)f(needed)150 1380 y(to)48 b(reac)m(h)f(the)g(logical)f +-(end-of-stream,)52 b Fj(BZ2_bzRead)44 b Fl(will)g(almost)j(certainly)f +-(read)h(some)g(of)g(the)150 1489 y(trailing)c(data)j(b)s(efore)e +-(signalling)f Fj(BZ_SEQUENCE_END)p Fl(.)80 b(T)-8 b(o)46 
+-b(collect)f(the)g(read)g(but)g(un)m(used)e(data)150 1599 +-y(once)29 b Fj(BZ_SEQUENCE_END)24 b Fl(has)k(app)s(eared,)g(call)f +-Fj(BZ2_bzReadGetUnused)c Fl(immediately)j(b)s(efore)i +-Fj(BZ2_)150 1708 y(bzReadClose)p Fl(.)150 1865 y(P)m(ossible)h +-(assignmen)m(ts)h(to)h Fj(bzerror)p Fl(:)572 2016 y Fj(BZ_PARAM_ERROR) +-663 2120 y Fl(if)e Fj(b)h Fl(is)g Fj(NULL)f Fl(or)h Fj(buf)g +-Fl(is)f Fj(NULL)g Fl(or)i Fj(len)46 b(<)i(0)572 2224 +-y(BZ_SEQUENCE_ERROR)663 2328 y Fl(if)29 b Fj(b)h Fl(w)m(as)h(op)s(ened) +-e(with)h Fj(BZ2_bzWriteOpen)572 2431 y(BZ_IO_ERROR)663 +-2535 y Fl(if)f(there)i(is)e(an)h(error)g(reading)g(from)g(the)g +-(compressed)g(\014le)572 2639 y Fj(BZ_UNEXPECTED_EOF)663 +-2743 y Fl(if)f(the)i(compressed)f(\014le)f(ended)h(b)s(efore)g(the)g +-(logical)g(end-of-stream)h(w)m(as)g(detected)572 2847 +-y Fj(BZ_DATA_ERROR)663 2950 y Fl(if)e(a)i(data)g(in)m(tegrit)m(y)f +-(error)g(w)m(as)h(detected)h(in)d(the)h(compressed)g(stream)572 +-3054 y Fj(BZ_DATA_ERROR_MAGIC)663 3158 y Fl(if)f(the)i(stream)f(do)s +-(es)g(not)h(b)s(egin)e(with)g(the)i(requisite)e(header)h(b)m(ytes)h +-(\(ie,)f(is)g(not)663 3262 y(a)g Fj(bzip2)f Fl(data)i(\014le\).)61 +-b(This)28 b(is)i(really)f(a)i(sp)s(ecial)e(case)i(of)g +-Fj(BZ_DATA_ERROR)p Fl(.)572 3365 y Fj(BZ_MEM_ERROR)663 +-3469 y Fl(if)e(insu\016cien)m(t)g(memory)h(w)m(as)h(a)m(v)-5 +-b(ailable)572 3573 y Fj(BZ_STREAM_END)663 3677 y Fl(if)29 +-b(the)i(logical)e(end)h(of)h(stream)f(w)m(as)h(detected.)572 +-3781 y Fj(BZ_OK)663 3884 y Fl(otherwise.)150 4041 y(P)m(ossible)e +-(return)h(v)-5 b(alues:)572 4192 y(n)m(um)m(b)s(er)29 +-b(of)h(b)m(ytes)h(read)663 4296 y(if)e Fj(bzerror)f Fl(is)i +-Fj(BZ_OK)f Fl(or)h Fj(BZ_STREAM_END)572 4400 y Fl(unde\014ned)663 +-4503 y(otherwise)150 4660 y(Allo)m(w)m(able)g(next)g(actions:)572 +-4811 y(collect)h(data)g(from)f Fj(buf)p Fl(,)f(then)h +-Fj(BZ2_bzRead)e Fl(or)i Fj(BZ2_bzReadClose)663 4915 y +-Fl(if)f Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 5019 y Fl(collect)h(data)g +-(from)f Fj(buf)p 
Fl(,)f(then)h Fj(BZ2_bzReadClose)d Fl(or)j +-Fj(BZ2_bzReadGetUnused)663 5123 y Fl(if)f Fj(bzerror)f +-Fl(is)i Fj(BZ_SEQUENCE_END)572 5226 y(BZ2_bzReadClose)663 +-5330 y Fl(otherwise)p eop +-%%Page: 22 23 +-22 22 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(22)150 299 y Ff(3.4.3)63 +-b Fe(BZ2_bzReadGetUnused)533 486 y Fj(void)47 b(BZ2_bzReadGetUnused)42 +-b(\()48 b(int*)e(bzerror,)g(BZFILE)g(*b,)1822 589 y(void**)g(unused,)g +-(int*)g(nUnused)g(\);)150 746 y Fl(Returns)36 b(data)i(whic)m(h)d(w)m +-(as)j(read)f(from)f(the)h(compressed)g(\014le)f(but)g(w)m(as)h(not)h +-(needed)e(to)i(get)g(to)g(the)150 856 y(logical)k(end-of-stream.)78 +-b Fj(*unused)41 b Fl(is)h(set)h(to)g(the)g(address)f(of)g(the)h(data,)k +-(and)42 b Fj(*nUnused)e Fl(to)k(the)150 965 y(n)m(um)m(b)s(er)29 +-b(of)i(b)m(ytes.)41 b Fj(*nUnused)28 b Fl(will)g(b)s(e)h(set)i(to)g(a)g +-(v)-5 b(alue)30 b(b)s(et)m(w)m(een)h Fj(0)f Fl(and)g +-Fj(BZ_MAX_UNUSED)d Fl(inclusiv)m(e.)150 1122 y(This)d(function)h(ma)m +-(y)h(only)g(b)s(e)f(called)g(once)i Fj(BZ2_bzRead)c Fl(has)j(signalled) +-e Fj(BZ_STREAM_END)e Fl(but)j(b)s(efore)150 1232 y Fj(BZ2_bzReadClose)p +-Fl(.)150 1389 y(P)m(ossible)k(assignmen)m(ts)h(to)h Fj(bzerror)p +-Fl(:)572 1540 y Fj(BZ_PARAM_ERROR)663 1644 y Fl(if)e +-Fj(b)h Fl(is)g Fj(NULL)663 1747 y Fl(or)g Fj(unused)f +-Fl(is)g Fj(NULL)g Fl(or)i Fj(nUnused)d Fl(is)i Fj(NULL)572 +-1851 y(BZ_SEQUENCE_ERROR)663 1955 y Fl(if)f Fj(BZ_STREAM_END)e +-Fl(has)j(not)h(b)s(een)e(signalled)663 2059 y(or)h(if)f +-Fj(b)h Fl(w)m(as)h(op)s(ened)f(with)f Fj(BZ2_bzWriteOpen)542 +-2162 y(BZ_OK)663 2266 y Fl(otherwise)150 2423 y(Allo)m(w)m(able)h(next) +-g(actions:)572 2574 y Fj(BZ2_bzReadClose)150 2882 y Ff(3.4.4)63 +-b Fe(BZ2_bzReadClose)533 3068 y Fj(void)47 b(BZ2_bzReadClose)c(\()48 +-b(int)f(*bzerror,)e(BZFILE)h(*b)h(\);)150 3225 y Fl(Releases)36 +-b(all)e(memory)h(p)s(ertaining)e(to)i(the)h(compressed)f(\014le)f +-Fj(b)p Fl(.)54 b Fj(BZ2_bzReadClose)31 b 
Fl(do)s(es)k(not)h(call)150 +-3335 y Fj(fclose)c Fl(on)h(the)h(underlying)d(\014le)h(handle,)h(so)h +-(y)m(ou)g(should)e(do)h(that)h(y)m(ourself)f(if)g(appropriate.)49 +-b Fj(BZ2_)150 3445 y(bzReadClose)27 b Fl(should)i(b)s(e)g(called)h(to)h +-(clean)f(up)g(after)h(all)e(error)h(situations.)150 3601 +-y(P)m(ossible)f(assignmen)m(ts)h(to)h Fj(bzerror)p Fl(:)572 +-3752 y Fj(BZ_SEQUENCE_ERROR)663 3856 y Fl(if)e Fj(b)h +-Fl(w)m(as)h(op)s(ened)e(with)h Fj(BZ2_bzOpenWrite)572 +-3960 y(BZ_OK)663 4064 y Fl(otherwise)150 4221 y(Allo)m(w)m(able)g(next) +-g(actions:)572 4372 y(none)150 4679 y Ff(3.4.5)63 b Fe(BZ2_bzWriteOpen) +-533 4866 y Fj(BZFILE)46 b(*BZ2_bzWriteOpen)e(\()j(int)g(*bzerror,)e +-(FILE)i(*f,)1774 4970 y(int)g(blockSize100k,)d(int)j(verbosity,)1774 +-5074 y(int)g(workFactor)e(\);)150 5230 y Fl(Prepare)33 +-b(to)g(write)f(compressed)h(data)h(to)f(\014le)f(handle)g +-Fj(f)p Fl(.)47 b Fj(f)33 b Fl(should)e(refer)i(to)g(a)g(\014le)f(whic)m +-(h)g(has)h(b)s(een)150 5340 y(op)s(ened)d(for)g(writing,)e(and)i(for)g +-(whic)m(h)f(the)i(error)f(indicator)f(\()p Fj(ferror\(f\))p +-Fl(\)is)f(not)i(set.)p eop +-%%Page: 23 24 +-23 23 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(23)150 299 y(F)-8 b(or)31 +-b(the)g(meaning)e(of)i(parameters)g Fj(blockSize100k)p +-Fl(,)c Fj(verbosity)g Fl(and)j Fj(workFactor)p Fl(,)e(see)150 +-408 y Fj(BZ2_bzCompressInit)p Fl(.)150 565 y(All)d(required)f(memory)i +-(is)f(allo)s(cated)i(at)g(this)e(stage,)j(so)f(if)e(the)h(call)g +-(completes)g(successfully)-8 b(,)26 b Fj(BZ_MEM_)150 +-675 y(ERROR)j Fl(cannot)i(b)s(e)f(signalled)e(b)m(y)i(a)h(subsequen)m +-(t)f(call)f(to)i Fj(BZ2_bzWrite)p Fl(.)150 832 y(P)m(ossible)e +-(assignmen)m(ts)h(to)h Fj(bzerror)p Fl(:)572 983 y Fj(BZ_CONFIG_ERROR) +-663 1087 y Fl(if)e(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 +-1190 y Fj(BZ_PARAM_ERROR)663 1294 y Fl(if)g Fj(f)h Fl(is)g +-Fj(NULL)663 1398 y Fl(or)g Fj(blockSize100k)44 b(<)k(1)30 +-b Fl(or)g 
Fj(blockSize100k)44 b(>)k(9)572 1502 y(BZ_IO_ERROR)663 +-1605 y Fl(if)29 b Fj(ferror\(f\))f Fl(is)h(nonzero)572 +-1709 y Fj(BZ_MEM_ERROR)663 1813 y Fl(if)g(insu\016cien)m(t)g(memory)h +-(is)f(a)m(v)-5 b(ailable)572 1917 y Fj(BZ_OK)663 2021 +-y Fl(otherwise)150 2177 y(P)m(ossible)29 b(return)h(v)-5 +-b(alues:)572 2328 y(P)m(oin)m(ter)31 b(to)g(an)f(abstract)h +-Fj(BZFILE)663 2432 y Fl(if)e Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 +-2536 y(NULL)663 2640 y Fl(otherwise)150 2797 y(Allo)m(w)m(able)g(next)g +-(actions:)572 2948 y Fj(BZ2_bzWrite)663 3051 y Fl(if)f +-Fj(bzerror)f Fl(is)i Fj(BZ_OK)604 3155 y Fl(\(y)m(ou)25 +-b(could)e(go)h(directly)f(to)h Fj(BZ2_bzWriteClose)p +-Fl(,)c(but)j(this)g(w)m(ould)g(b)s(e)g(prett)m(y)h(p)s(oin)m(tless\)) +-572 3259 y Fj(BZ2_bzWriteClose)663 3363 y Fl(otherwise)150 +-3639 y Ff(3.4.6)63 b Fe(BZ2_bzWrite)533 3826 y Fj(void)47 +-b(BZ2_bzWrite)e(\()i(int)g(*bzerror,)e(BZFILE)h(*b,)h(void)g(*buf,)f +-(int)h(len)g(\);)150 3983 y Fl(Absorbs)26 b Fj(len)g +-Fl(b)m(ytes)i(from)e(the)i(bu\013er)e Fj(buf)p Fl(,)h(ev)m(en)m(tually) +-g(to)h(b)s(e)e(compressed)h(and)f(written)g(to)i(the)g(\014le.)150 +-4140 y(P)m(ossible)h(assignmen)m(ts)h(to)h Fj(bzerror)p +-Fl(:)572 4291 y Fj(BZ_PARAM_ERROR)663 4395 y Fl(if)e +-Fj(b)h Fl(is)g Fj(NULL)f Fl(or)h Fj(buf)g Fl(is)f Fj(NULL)g +-Fl(or)i Fj(len)46 b(<)i(0)572 4498 y(BZ_SEQUENCE_ERROR)663 +-4602 y Fl(if)29 b(b)h(w)m(as)h(op)s(ened)e(with)g Fj(BZ2_bzReadOpen)572 +-4706 y(BZ_IO_ERROR)663 4810 y Fl(if)g(there)i(is)e(an)h(error)g +-(writing)f(the)h(compressed)g(\014le.)572 4914 y Fj(BZ_OK)663 +-5017 y Fl(otherwise)150 5294 y Ff(3.4.7)63 b Fe(BZ2_bzWriteClose)p +-eop +-%%Page: 24 25 +-24 24 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(24)533 299 y Fj(void)47 +-b(BZ2_bzWriteClose)c(\()48 b(int)f(*bzerror,)e(BZFILE*)h(f,)1679 +-403 y(int)h(abandon,)1679 506 y(unsigned)e(int*)i(nbytes_in,)1679 +-610 y(unsigned)e(int*)i(nbytes_out)e(\);)533 818 y(void)i 
+-(BZ2_bzWriteClose64)c(\()k(int)g(*bzerror,)e(BZFILE*)h(f,)1774 +-922 y(int)h(abandon,)1774 1025 y(unsigned)f(int*)g(nbytes_in_lo32,)1774 +-1129 y(unsigned)g(int*)g(nbytes_in_hi32,)1774 1233 y(unsigned)g(int*)g +-(nbytes_out_lo32,)1774 1337 y(unsigned)g(int*)g(nbytes_out_hi32)e(\);) +-150 1493 y Fl(Compresses)39 b(and)g(\015ushes)g(to)h(the)g(compressed)g +-(\014le)f(all)f(data)j(so)f(far)g(supplied)c(b)m(y)k +-Fj(BZ2_bzWrite)p Fl(.)150 1603 y(The)27 b(logical)g(end-of-stream)h +-(mark)m(ers)g(are)g(also)f(written,)h(so)f(subsequen)m(t)g(calls)g(to)h +-Fj(BZ2_bzWrite)d Fl(are)150 1713 y(illegal.)50 b(All)33 +-b(memory)h(asso)s(ciated)g(with)f(the)i(compressed)e(\014le)h +-Fj(b)f Fl(is)g(released.)52 b Fj(fflush)33 b Fl(is)g(called)g(on)150 +-1822 y(the)e(compressed)f(\014le,)f(but)h(it)g(is)f(not)i +-Fj(fclose)p Fl('d.)150 1979 y(If)i Fj(BZ2_bzWriteClose)c +-Fl(is)k(called)f(to)j(clean)e(up)f(after)i(an)g(error,)g(the)g(only)e +-(action)i(is)f(to)h(release)g(the)150 2089 y(memory)-8 +-b(.)42 b(The)30 b(library)e(records)j(the)g(error)f(co)s(des)h(issued)e +-(b)m(y)h(previous)f(calls,)i(so)f(this)g(situation)g(will)150 +-2198 y(b)s(e)c(detected)h(automatically)-8 b(.)40 b(There)26 +-b(is)g(no)g(attempt)h(to)h(complete)e(the)h(compression)f(op)s +-(eration,)g(nor)150 2308 y(to)32 b Fj(fflush)d Fl(the)i(compressed)g +-(\014le.)42 b(Y)-8 b(ou)32 b(can)f(force)h(this)e(b)s(eha)m(viour)g(to) +-h(happ)s(en)f(ev)m(en)i(in)d(the)j(case)g(of)150 2417 +-y(no)e(error,)g(b)m(y)h(passing)e(a)i(nonzero)f(v)-5 +-b(alue)30 b(to)h Fj(abandon)p Fl(.)150 2574 y(If)j Fj(nbytes_in)d +-Fl(is)j(non-n)m(ull,)f Fj(*nbytes_in)e Fl(will)h(b)s(e)h(set)i(to)g(b)s +-(e)f(the)g(total)h(v)m(olume)f(of)g(uncompressed)150 +-2684 y(data)k(handled.)60 b(Similarly)-8 b(,)35 b Fj(nbytes_out)g +-Fl(will)g(b)s(e)h(set)i(to)g(the)g(total)g(v)m(olume)f(of)g(compressed) +-g(data)150 2793 y(written.)h(F)-8 b(or)27 b(compatibilit)m(y)d(with)h +-(older)g(v)m(ersions)h(of)g(the)g(library)-8 b(,)25 b 
+-Fj(BZ2_bzWriteClose)d Fl(only)j(yields)150 2903 y(the)40 +-b(lo)m(w)m(er)g(32)h(bits)d(of)i(these)h(coun)m(ts.)69 +-b(Use)40 b Fj(BZ2_bzWriteClose64)35 b Fl(if)k(y)m(ou)h(w)m(an)m(t)h +-(the)f(full)d(64)k(bit)150 3013 y(coun)m(ts.)g(These)30 +-b(t)m(w)m(o)i(functions)d(are)i(otherwise)f(absolutely)f(iden)m(tical.) +-150 3169 y(P)m(ossible)g(assignmen)m(ts)h(to)h Fj(bzerror)p +-Fl(:)572 3320 y Fj(BZ_SEQUENCE_ERROR)663 3424 y Fl(if)e +-Fj(b)h Fl(w)m(as)h(op)s(ened)e(with)h Fj(BZ2_bzReadOpen)572 +-3528 y(BZ_IO_ERROR)663 3632 y Fl(if)f(there)i(is)e(an)h(error)g +-(writing)f(the)h(compressed)g(\014le)572 3736 y Fj(BZ_OK)663 +-3839 y Fl(otherwise)150 4161 y Ff(3.4.8)63 b(Handling)41 +-b(em)m(b)s(edded)g(compressed)h(data)e(streams)150 4354 +-y Fl(The)i(high-lev)m(el)g(library)f(facilitates)h(use)h(of)g +-Fj(bzip2)e Fl(data)j(streams)f(whic)m(h)f(form)g(some)i(part)e(of)i(a) +-150 4463 y(surrounding,)27 b(larger)j(data)h(stream.)225 +-4620 y Fi(\017)60 b Fl(F)-8 b(or)22 b(writing,)f(the)g(library)e(tak)m +-(es)k(an)e(op)s(en)f(\014le)g(handle,)i(writes)e(compressed)h(data)h +-(to)g(it,)g Fj(fflush)p Fl(es)330 4730 y(it)34 b(but)f(do)s(es)h(not)h +-Fj(fclose)d Fl(it.)52 b(The)34 b(calling)f(application)g(can)h(write)g +-(its)f(o)m(wn)i(data)g(b)s(efore)f(and)330 4839 y(after)d(the)f +-(compressed)h(data)g(stream,)g(using)d(that)j(same)g(\014le)f(handle.) 
+-225 5011 y Fi(\017)60 b Fl(Reading)34 b(is)f(more)i(complex,)g(and)f +-(the)h(facilities)d(are)j(not)g(as)g(general)f(as)h(they)f(could)g(b)s +-(e)g(since)330 5121 y(generalit)m(y)e(is)f(hard)f(to)j(reconcile)e +-(with)f(e\016ciency)-8 b(.)46 b Fj(BZ2_bzRead)29 b Fl(reads)i(from)g +-(the)h(compressed)330 5230 y(\014le)39 b(in)g(blo)s(c)m(ks)g(of)h(size) +-g Fj(BZ_MAX_UNUSED)c Fl(b)m(ytes,)44 b(and)39 b(in)g(doing)g(so)h +-(probably)e(will)f(o)m(v)m(ersho)s(ot)330 5340 y(the)i(logical)g(end)f +-(of)h(compressed)f(stream.)67 b(T)-8 b(o)40 b(reco)m(v)m(er)g(this)e +-(data)i(once)f(decompression)f(has)p eop +-%%Page: 25 26 +-25 25 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(25)330 299 y(ended,)28 +-b(call)g Fj(BZ2_bzReadGetUnused)23 b Fl(after)29 b(the)g(last)f(call)g +-(of)g Fj(BZ2_bzRead)e Fl(\(the)j(one)g(returning)330 +-408 y Fj(BZ_STREAM_END)p Fl(\))e(but)j(b)s(efore)g(calling)f +-Fj(BZ2_bzReadClose)p Fl(.)150 596 y(This)51 b(mec)m(hanism)g(mak)m(es)j +-(it)e(easy)h(to)g(decompress)f(m)m(ultiple)e Fj(bzip2)i +-Fl(streams)g(placed)g(end-to-)150 706 y(end.)90 b(As)48 +-b(the)f(end)f(of)i(one)f(stream,)52 b(when)46 b Fj(BZ2_bzRead)f +-Fl(returns)h Fj(BZ_STREAM_END)p Fl(,)i(call)e Fj(BZ2_)150 +-816 y(bzReadGetUnused)36 b Fl(to)41 b(collect)g(the)g(un)m(used)e(data) +-i(\(cop)m(y)g(it)f(in)m(to)g(y)m(our)h(o)m(wn)f(bu\013er)f +-(somewhere\).)150 925 y(That)25 b(data)g(forms)f(the)h(start)h(of)e +-(the)h(next)g(compressed)g(stream.)39 b(T)-8 b(o)25 b(start)h +-(uncompressing)c(that)k(next)150 1035 y(stream,)40 b(call)d +-Fj(BZ2_bzReadOpen)d Fl(again,)40 b(feeding)d(in)g(the)h(un)m(used)e +-(data)j(via)e(the)h Fj(unused)p Fl(/)p Fj(nUnused)150 +-1144 y Fl(parameters.)54 b(Keep)34 b(doing)g(this)f(un)m(til)g +-Fj(BZ_STREAM_END)e Fl(return)j(coincides)f(with)h(the)g(ph)m(ysical)g +-(end)150 1254 y(of)d(\014le)e(\()p Fj(feof\(f\))p Fl(\).)39 +-b(In)30 b(this)f(situation)h Fj(BZ2_bzReadGetUnused)25 +-b 
Fl(will)i(of)k(course)g(return)e(no)h(data.)150 1411 +-y(This)c(should)f(giv)m(e)j(some)g(feel)f(for)g(ho)m(w)h(the)g +-(high-lev)m(el)e(in)m(terface)i(can)f(b)s(e)g(used.)39 +-b(If)27 b(y)m(ou)h(require)e(extra)150 1520 y(\015exibilit)m(y)-8 +-b(,)28 b(y)m(ou'll)i(ha)m(v)m(e)h(to)g(bite)f(the)h(bullet)d(and)i(get) +-i(to)f(grips)e(with)g(the)h(lo)m(w-lev)m(el)h(in)m(terface.)150 +-1779 y Ff(3.4.9)63 b(Standard)40 b(\014le-reading/writing)j(co)s(de)150 +-1972 y Fl(Here's)31 b(ho)m(w)f(y)m(ou'd)h(write)e(data)j(to)f(a)f +-(compressed)g(\014le:)390 2330 y Fj(FILE*)142 b(f;)390 +-2434 y(BZFILE*)46 b(b;)390 2538 y(int)238 b(nBuf;)390 +-2642 y(char)190 b(buf[)46 b(/*)i(whatever)d(size)i(you)g(like)f(*/)i +-(];)390 2746 y(int)238 b(bzerror;)390 2849 y(int)g(nWritten;)390 +-3057 y(f)47 b(=)h(fopen)e(\()i("myfile.bz2",)c("w")j(\);)390 +-3161 y(if)g(\(!f\))g({)533 3264 y(/*)g(handle)f(error)h(*/)390 +-3368 y(})390 3472 y(b)g(=)h(BZ2_bzWriteOpen)c(\()j(&bzerror,)e(f,)i(9)h +-(\);)390 3576 y(if)f(\(bzerror)f(!=)h(BZ_OK\))f({)533 +-3680 y(BZ2_bzWriteClose)e(\()j(b)g(\);)533 3783 y(/*)g(handle)f(error)h +-(*/)390 3887 y(})390 4095 y(while)f(\()i(/*)f(condition)e(*/)i(\))h({) +-533 4198 y(/*)f(get)g(data)g(to)g(write)f(into)h(buf,)g(and)g(set)g +-(nBuf)f(appropriately)e(*/)533 4302 y(nWritten)i(=)h(BZ2_bzWrite)e(\()i +-(&bzerror,)f(b,)h(buf,)f(nBuf)h(\);)533 4406 y(if)g(\(bzerror)f(==)h +-(BZ_IO_ERROR\))e({)676 4510 y(BZ2_bzWriteClose)f(\()j(&bzerror,)e(b)j +-(\);)676 4614 y(/*)g(handle)e(error)g(*/)533 4717 y(})390 +-4821 y(})390 5029 y(BZ2_bzWriteClose)d(\()48 b(&bzerror,)d(b)j(\);)390 +-5132 y(if)f(\(bzerror)f(==)h(BZ_IO_ERROR\))d({)533 5236 +-y(/*)j(handle)f(error)h(*/)390 5340 y(})p eop +-%%Page: 26 27 +-26 26 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(26)150 299 y(And)29 b(to)j(read)e(from)g +-(a)h(compressed)f(\014le:)390 450 y Fj(FILE*)142 b(f;)390 +-554 y(BZFILE*)46 b(b;)390 657 y(int)238 b(nBuf;)390 761 +-y(char)190 b(buf[)46 
b(/*)i(whatever)d(size)i(you)g(like)f(*/)i(];)390 +-865 y(int)238 b(bzerror;)390 969 y(int)g(nWritten;)390 +-1176 y(f)47 b(=)h(fopen)e(\()i("myfile.bz2",)c("r")j(\);)390 +-1280 y(if)g(\(!f\))g({)533 1384 y(/*)g(handle)f(error)h(*/)390 +-1488 y(})390 1591 y(b)g(=)h(BZ2_bzReadOpen)c(\()j(&bzerror,)f(f,)h(0,)g +-(NULL,)f(0)i(\);)390 1695 y(if)f(\(bzerror)f(!=)h(BZ_OK\))f({)533 +-1799 y(BZ2_bzReadClose)e(\()j(&bzerror,)f(b)h(\);)533 +-1903 y(/*)g(handle)f(error)h(*/)390 2007 y(})390 2214 +-y(bzerror)f(=)h(BZ_OK;)390 2318 y(while)f(\(bzerror)g(==)h(BZ_OK)f(&&)i +-(/*)f(arbitrary)e(other)h(conditions)f(*/\))i({)533 2422 +-y(nBuf)g(=)g(BZ2_bzRead)e(\()j(&bzerror,)d(b,)i(buf,)g(/*)g(size)g(of)g +-(buf)g(*/)g(\);)533 2525 y(if)g(\(bzerror)f(==)h(BZ_OK\))f({)676 +-2629 y(/*)i(do)f(something)e(with)i(buf[0)f(..)h(nBuf-1])f(*/)533 +-2733 y(})390 2837 y(})390 2941 y(if)h(\(bzerror)f(!=)h(BZ_STREAM_END\)) +-d({)533 3044 y(BZ2_bzReadClose)g(\()j(&bzerror,)f(b)h(\);)533 +-3148 y(/*)g(handle)f(error)h(*/)390 3252 y(})g(else)g({)533 +-3356 y(BZ2_bzReadClose)d(\()j(&bzerror)f(\);)390 3459 +-y(})150 3753 y Fk(3.5)68 b(Utilit)l(y)47 b(functions)150 +-4045 y Ff(3.5.1)63 b Fe(BZ2_bzBuffToBuffCompress)533 +-4232 y Fj(int)47 b(BZ2_bzBuffToBuffCompress\()41 b(char*)428 +-b(dest,)1965 4335 y(unsigned)46 b(int*)g(destLen,)1965 +-4439 y(char*)428 b(source,)1965 4543 y(unsigned)46 b(int)94 +-b(sourceLen,)1965 4647 y(int)524 b(blockSize100k,)1965 +-4751 y(int)g(verbosity,)1965 4854 y(int)g(workFactor)45 +-b(\);)150 5011 y Fl(A)m(ttempts)33 b(to)g(compress)f(the)g(data)h(in)e +-Fj(source[0)d(..)i(sourceLen-1])e Fl(in)m(to)k(the)h(destination)e +-(bu\013er,)150 5121 y Fj(dest[0)e(..)g(*destLen-1])p +-Fl(.)37 b(If)26 b(the)g(destination)g(bu\013er)f(is)h(big)f(enough,)j +-Fj(*destLen)c Fl(is)h(set)i(to)g(the)g(size)150 5230 +-y(of)i(the)f(compressed)h(data,)g(and)f Fj(BZ_OK)f Fl(is)h(returned.)39 +-b(If)28 b(the)h(compressed)f(data)h(w)m(on't)g(\014t,)g +-Fj(*destLen)150 5340 y 
Fl(is)g(unc)m(hanged,)i(and)e +-Fj(BZ_OUTBUFF_FULL)e Fl(is)i(returned.)p eop +-%%Page: 27 28 +-27 27 bop 150 -116 a Fl(Chapter)30 b(3:)h(Programming)e(with)g +-Fj(libbzip2)1891 b Fl(27)150 299 y(Compression)22 b(in)g(this)h(manner) +-g(is)g(a)h(one-shot)g(ev)m(en)m(t,)j(done)c(with)g(a)h(single)e(call)h +-(to)i(this)d(function.)37 b(The)150 408 y(resulting)25 +-b(compressed)i(data)i(is)d(a)i(complete)f Fj(bzip2)f +-Fl(format)i(data)g(stream.)40 b(There)27 b(is)f(no)i(mec)m(hanism)150 +-518 y(for)23 b(making)g(additional)e(calls)i(to)h(pro)m(vide)f(extra)h +-(input)e(data.)39 b(If)23 b(y)m(ou)h(w)m(an)m(t)g(that)g(kind)e(of)h +-(mec)m(hanism,)150 628 y(use)30 b(the)h(lo)m(w-lev)m(el)f(in)m +-(terface.)150 784 y(F)-8 b(or)31 b(the)g(meaning)e(of)i(parameters)g +-Fj(blockSize100k)p Fl(,)c Fj(verbosity)g Fl(and)j Fj(workFactor)p +-Fl(,)150 894 y(see)h Fj(BZ2_bzCompressInit)p Fl(.)150 +-1051 y(T)-8 b(o)27 b(guaran)m(tee)h(that)e(the)h(compressed)f(data)h +-(will)d(\014t)i(in)f(its)g(bu\013er,)i(allo)s(cate)f(an)g(output)g +-(bu\013er)g(of)g(size)150 1160 y(1\045)31 b(larger)f(than)g(the)g +-(uncompressed)f(data,)j(plus)c(six)h(h)m(undred)g(extra)i(b)m(ytes.)150 +-1317 y Fj(BZ2_bzBuffToBuffDecompre)o(ss)25 b Fl(will)k(not)j(write)e +-(data)j(at)f(or)f(b)s(ey)m(ond)g Fj(dest[*destLen])p +-Fl(,)d(ev)m(en)k(in)150 1427 y(case)f(of)g(bu\013er)e(o)m(v)m(er\015o)m +-(w.)150 1584 y(P)m(ossible)g(return)h(v)-5 b(alues:)572 +-1735 y Fj(BZ_CONFIG_ERROR)663 1839 y Fl(if)29 b(the)i(library)d(has)i +-(b)s(een)f(mis-compiled)572 1942 y Fj(BZ_PARAM_ERROR)663 +-2046 y Fl(if)g Fj(dest)g Fl(is)h Fj(NULL)f Fl(or)h Fj(destLen)f +-Fl(is)g Fj(NULL)663 2150 y Fl(or)h Fj(blockSize100k)44 +-b(<)k(1)30 b Fl(or)g Fj(blockSize100k)44 b(>)k(9)663 +-2254 y Fl(or)30 b Fj(verbosity)45 b(<)j(0)30 b Fl(or)g +-Fj(verbosity)45 b(>)j(4)663 2357 y Fl(or)30 b Fj(workFactor)45 +-b(<)j(0)30 b Fl(or)g Fj(workFactor)45 b(>)i(250)572 2461 +-y(BZ_MEM_ERROR)663 2565 y Fl(if)29 b(insu\016cien)m(t)g(memory)h(is)f 
+-(a)m(v)-5 b(ailable)572 2669 y Fj(BZ_OUTBUFF_FULL)663 +-2773 y Fl(if)29 b(the)i(size)f(of)g(the)h(compressed)f(data)h(exceeds)g +-Fj(*destLen)572 2876 y(BZ_OK)663 2980 y Fl(otherwise)150 +-3349 y Ff(3.5.2)63 b Fe(BZ2_bzBuffToBuffDecompress)533 +-3536 y Fj(int)47 b(BZ2_bzBuffToBuffDecompres)o(s)42 b(\()47 +-b(char*)428 b(dest,)2108 3640 y(unsigned)46 b(int*)g(destLen,)2108 +-3744 y(char*)428 b(source,)2108 3848 y(unsigned)46 b(int)94 +-b(sourceLen,)2108 3951 y(int)524 b(small,)2108 4055 y(int)g(verbosity) +-46 b(\);)150 4212 y Fl(A)m(ttempts)24 b(to)g(decompress)f(the)g(data)g +-(in)f Fj(source[0)28 b(..)i(sourceLen-1])20 b Fl(in)m(to)j(the)g +-(destination)f(bu\013er,)150 4322 y Fj(dest[0)29 b(..)g(*destLen-1])p +-Fl(.)37 b(If)26 b(the)g(destination)g(bu\013er)f(is)h(big)f(enough,)j +-Fj(*destLen)c Fl(is)h(set)i(to)g(the)g(size)150 4431 +-y(of)21 b(the)g(uncompressed)e(data,)24 b(and)c Fj(BZ_OK)f +-Fl(is)h(returned.)36 b(If)20 b(the)h(compressed)g(data)g(w)m(on't)h +-(\014t,)g Fj(*destLen)150 4541 y Fl(is)29 b(unc)m(hanged,)i(and)e +-Fj(BZ_OUTBUFF_FULL)e Fl(is)i(returned.)150 4698 y Fj(source)g +-Fl(is)g(assumed)h(to)h(hold)e(a)i(complete)f Fj(bzip2)f +-Fl(format)i(data)g(stream.)150 4807 y Fj(BZ2_bzBuffToBuffDecompre)o(ss) +-22 b Fl(tries)28 b(to)i(decompress)e(the)h(en)m(tiret)m(y)g(of)g(the)f +-(stream)h(in)m(to)g(the)f(out-)150 4917 y(put)i(bu\013er.)150 +-5074 y(F)-8 b(or)31 b(the)g(meaning)e(of)i(parameters)g +-Fj(small)e Fl(and)g Fj(verbosity)p Fl(,)f(see)j Fj +-(BZ2_bzDecompressInit)p Fl(.)150 5230 y(Because)j(the)f(compression)e +-(ratio)i(of)g(the)g(compressed)f(data)h(cannot)g(b)s(e)f(kno)m(wn)g(in) +-g(adv)-5 b(ance,)34 b(there)150 5340 y(is)d(no)h(easy)g(w)m(a)m(y)h(to) +-f(guaran)m(tee)i(that)e(the)g(output)f(bu\013er)g(will)e(b)s(e)i(big)g +-(enough.)45 b(Y)-8 b(ou)32 b(ma)m(y)h(of)f(course)p eop +-%%Page: 28 29 +-28 28 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(28)150 299 y(mak)m(e)36 
+-b(arrangemen)m(ts)f(in)e(y)m(our)i(co)s(de)g(to)g(record)g(the)g(size)f +-(of)h(the)g(uncompressed)f(data,)i(but)e(suc)m(h)h(a)150 +-408 y(mec)m(hanism)30 b(is)f(b)s(ey)m(ond)h(the)g(scop)s(e)h(of)f(this) +-g(library)-8 b(.)150 565 y Fj(BZ2_bzBuffToBuffDecompre)o(ss)25 +-b Fl(will)k(not)j(write)e(data)j(at)f(or)f(b)s(ey)m(ond)g +-Fj(dest[*destLen])p Fl(,)d(ev)m(en)k(in)150 675 y(case)f(of)g(bu\013er) +-e(o)m(v)m(er\015o)m(w.)150 832 y(P)m(ossible)g(return)h(v)-5 +-b(alues:)572 983 y Fj(BZ_CONFIG_ERROR)663 1087 y Fl(if)29 +-b(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 1190 +-y Fj(BZ_PARAM_ERROR)663 1294 y Fl(if)g Fj(dest)g Fl(is)h +-Fj(NULL)f Fl(or)h Fj(destLen)f Fl(is)g Fj(NULL)663 1398 +-y Fl(or)h Fj(small)46 b(!=)i(0)f(&&)g(small)g(!=)g(1)663 +-1502 y Fl(or)30 b Fj(verbosity)45 b(<)j(0)30 b Fl(or)g +-Fj(verbosity)45 b(>)j(4)572 1605 y(BZ_MEM_ERROR)663 1709 +-y Fl(if)29 b(insu\016cien)m(t)g(memory)h(is)f(a)m(v)-5 +-b(ailable)572 1813 y Fj(BZ_OUTBUFF_FULL)663 1917 y Fl(if)29 +-b(the)i(size)f(of)g(the)h(compressed)f(data)h(exceeds)g +-Fj(*destLen)572 2021 y(BZ_DATA_ERROR)663 2124 y Fl(if)e(a)i(data)g(in)m +-(tegrit)m(y)f(error)g(w)m(as)h(detected)h(in)d(the)h(compressed)g(data) +-572 2228 y Fj(BZ_DATA_ERROR_MAGIC)663 2332 y Fl(if)f(the)i(compressed)f +-(data)h(do)s(esn't)f(b)s(egin)f(with)g(the)i(righ)m(t)e(magic)i(b)m +-(ytes)572 2436 y Fj(BZ_UNEXPECTED_EOF)663 2539 y Fl(if)e(the)i +-(compressed)f(data)h(ends)e(unexp)s(ectedly)572 2643 +-y Fj(BZ_OK)663 2747 y Fl(otherwise)150 3116 y Fk(3.6)68 +-b Fd(zlib)43 b Fk(compatibilit)l(y)k(functions)150 3308 +-y Fl(Y)-8 b(oshiok)j(a)33 b(Tsuneo)e(has)h(con)m(tributed)g(some)g +-(functions)f(to)i(giv)m(e)g(b)s(etter)f Fj(zlib)f Fl(compatibilit)m(y) +--8 b(.)45 b(These)150 3418 y(functions)36 b(are)i Fj(BZ2_bzopen)p +-Fl(,)e Fj(BZ2_bzread)p Fl(,)h Fj(BZ2_bzwrite)p Fl(,)f +-Fj(BZ2_bzflush)p Fl(,)h Fj(BZ2_bzclose)p Fl(,)f Fj(BZ2_)150 +-3527 y(bzerror)23 b Fl(and)h Fj(BZ2_bzlibVersion)p Fl(.)34 +-b(These)25 
b(functions)e(are)j(not)f(\(y)m(et\))h(o\016cially)e(part)h +-(of)g(the)g(library)-8 b(.)150 3637 y(If)30 b(they)g(break,)h(y)m(ou)g +-(get)g(to)g(k)m(eep)g(all)f(the)g(pieces.)41 b(Nev)m(ertheless,)31 +-b(I)f(think)f(they)i(w)m(ork)f(ok.)390 3788 y Fj(typedef)46 +-b(void)g(BZFILE;)390 3995 y(const)g(char)h(*)g(BZ2_bzlibVersion)d(\()j +-(void)g(\);)150 4152 y Fl(Returns)29 b(a)i(string)f(indicating)e(the)i +-(library)e(v)m(ersion.)390 4303 y Fj(BZFILE)46 b(*)i(BZ2_bzopen)92 +-b(\()48 b(const)e(char)h(*path,)f(const)g(char)h(*mode)f(\);)390 +-4407 y(BZFILE)g(*)i(BZ2_bzdopen)c(\()k(int)381 b(fd,)190 +-b(const)46 b(char)h(*mode)f(\);)150 4564 y Fl(Op)s(ens)19 +-b(a)j Fj(.bz2)e Fl(\014le)g(for)g(reading)g(or)h(writing,)g(using)f +-(either)g(its)h(name)g(or)g(a)g(pre-existing)f(\014le)g(descriptor.)150 +-4674 y(Analogous)30 b(to)i Fj(fopen)c Fl(and)i Fj(fdopen)p +-Fl(.)390 4825 y Fj(int)47 b(BZ2_bzread)93 b(\()47 b(BZFILE*)f(b,)h +-(void*)f(buf,)h(int)g(len)g(\);)390 4928 y(int)g(BZ2_bzwrite)e(\()i +-(BZFILE*)f(b,)h(void*)f(buf,)h(int)g(len)g(\);)150 5085 +-y Fl(Reads/writes)30 b(data)h(from/to)g(a)g(previously)d(op)s(ened)i +-Fj(BZFILE)p Fl(.)39 b(Analogous)30 b(to)h Fj(fread)e +-Fl(and)h Fj(fwrite)p Fl(.)390 5236 y Fj(int)95 b(BZ2_bzflush)44 +-b(\()k(BZFILE*)e(b)h(\);)390 5340 y(void)g(BZ2_bzclose)d(\()k(BZFILE*)e +-(b)h(\);)p eop +-%%Page: 29 30 +-29 29 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(29)150 299 y(Flushes/closes)27 +-b(a)h Fj(BZFILE)p Fl(.)39 b Fj(BZ2_bzflush)24 b Fl(do)s(esn't)k +-(actually)f(do)h(an)m(ything.)39 b(Analogous)28 b(to)h +-Fj(fflush)150 408 y Fl(and)h Fj(fclose)p Fl(.)390 559 +-y Fj(const)46 b(char)h(*)g(BZ2_bzerror)e(\()j(BZFILE)e(*b,)h(int)g +-(*errnum)e(\))150 716 y Fl(Returns)31 b(a)i(string)e(describing)f(the)i +-(more)g(recen)m(t)h(error)f(status)h(of)f Fj(b)p Fl(,)g(and)g(also)g +-(sets)h Fj(*errnum)d Fl(to)j(its)150 826 y(n)m(umerical)c(v)-5 +-b(alue.)150 1242 y Fk(3.7)68 b(Using)46 
b(the)f(library)g(in)g(a)g +-Fd(stdio)p Fk(-free)f(en)l(vironmen)l(t)150 1615 y Ff(3.7.1)63 +-b(Getting)40 b(rid)h(of)g Fe(stdio)150 1807 y Fl(In)i(a)g(deeply)g(em)m +-(b)s(edded)f(application,)j(y)m(ou)f(migh)m(t)f(w)m(an)m(t)h(to)g(use)f +-(just)g(the)h(memory-to-memory)150 1917 y(functions.)39 +-b(Y)-8 b(ou)30 b(can)f(do)g(this)g(con)m(v)m(enien)m(tly)g(b)m(y)g +-(compiling)e(the)j(library)d(with)h(prepro)s(cessor)g(sym)m(b)s(ol)150 +-2026 y Fj(BZ_NO_STDIO)35 b Fl(de\014ned.)63 b(Doing)39 +-b(this)e(giv)m(es)h(y)m(ou)h(a)f(library)e(con)m(taining)i(only)f(the)i +-(follo)m(wing)e(eigh)m(t)150 2136 y(functions:)150 2293 +-y Fj(BZ2_bzCompressInit)p Fl(,)26 b Fj(BZ2_bzCompress)p +-Fl(,)g Fj(BZ2_bzCompressEnd)150 2402 y(BZ2_bzDecompressInit)p +-Fl(,)f Fj(BZ2_bzDecompress)p Fl(,)h Fj(BZ2_bzDecompressEnd)150 +-2512 y(BZ2_bzBuffToBuffCompress)o Fl(,)f Fj(BZ2_bzBuffToBuffDecompre)o +-(ss)150 2669 y Fl(When)30 b(compiled)f(lik)m(e)h(this,)f(all)g +-(functions)g(will)f(ignore)i Fj(verbosity)e Fl(settings.)150 +-3006 y Ff(3.7.2)63 b(Critical)40 b(error)h(handling)150 +-3199 y Fj(libbzip2)20 b Fl(con)m(tains)j(a)g(n)m(um)m(b)s(er)f(of)g(in) +-m(ternal)g(assertion)g(c)m(hec)m(ks)i(whic)m(h)d(should,)i(needless)f +-(to)h(sa)m(y)-8 b(,)26 b(nev)m(er)150 3308 y(b)s(e)g(activ)-5 +-b(ated.)40 b(Nev)m(ertheless,)28 b(if)d(an)i(assertion)f(should)e +-(fail,)i(b)s(eha)m(viour)f(dep)s(ends)f(on)j(whether)e(or)i(not)150 +-3418 y(the)k(library)d(w)m(as)i(compiled)f(with)g Fj(BZ_NO_STDIO)e +-Fl(set.)150 3575 y(F)-8 b(or)31 b(a)g(normal)e(compile,)h(an)g +-(assertion)g(failure)f(yields)f(the)j(message)533 3726 +-y Fj(bzip2/libbzip2:)44 b(internal)h(error)i(number)f(N.)533 +-3829 y(This)h(is)g(a)g(bug)g(in)h(bzip2/libbzip2,)43 +-b(1.0)k(of)g(21-Mar-2000.)533 3933 y(Please)f(report)g(it)i(to)f(me)g +-(at:)g(jseward@acm.org.)91 b(If)47 b(this)g(happened)533 +-4037 y(when)g(you)g(were)f(using)h(some)f(program)g(which)h(uses)f +-(libbzip2)g(as)h(a)533 4141 y(component,)e(you)i(should)f(also)h 
+-(report)f(this)h(bug)f(to)i(the)f(author\(s\))533 4244 +-y(of)g(that)g(program.)93 b(Please)46 b(make)h(an)g(effort)f(to)h +-(report)g(this)f(bug;)533 4348 y(timely)g(and)h(accurate)f(bug)h +-(reports)e(eventually)g(lead)i(to)g(higher)533 4452 y(quality)f +-(software.)93 b(Thanks.)h(Julian)46 b(Seward,)f(21)j(March)e(2000.)150 +-4609 y Fl(where)30 b Fj(N)g Fl(is)f(some)i(error)f(co)s(de)h(n)m(um)m +-(b)s(er.)39 b Fj(exit\(3\))28 b Fl(is)i(then)g(called.)150 +-4766 y(F)-8 b(or)31 b(a)g Fj(stdio)p Fl(-free)e(library)-8 +-b(,)29 b(assertion)h(failures)e(result)i(in)f(a)i(call)e(to)i(a)g +-(function)e(declared)h(as:)533 4917 y Fj(extern)46 b(void)h +-(bz_internal_error)c(\()k(int)g(errcode)f(\);)150 5074 +-y Fl(The)30 b(relev)-5 b(an)m(t)31 b(co)s(de)f(is)g(passed)f(as)i(a)g +-(parameter.)41 b(Y)-8 b(ou)31 b(should)d(supply)g(suc)m(h)i(a)h +-(function.)150 5230 y(In)g(either)g(case,)j(once)e(an)g(assertion)g +-(failure)e(has)h(o)s(ccurred,)h(an)m(y)g Fj(bz_stream)e +-Fl(records)h(in)m(v)m(olv)m(ed)h(can)150 5340 y(b)s(e)e(regarded)g(as)h +-(in)m(v)-5 b(alid.)38 b(Y)-8 b(ou)31 b(should)d(not)j(attempt)g(to)g +-(resume)f(normal)g(op)s(eration)f(with)g(them.)p eop +-%%Page: 30 31 +-30 30 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 +-b(with)g Fj(libbzip2)1881 b Fl(30)150 299 y(Y)-8 b(ou)22 +-b(ma)m(y)-8 b(,)25 b(of)d(course,)h(c)m(hange)g(critical)e(error)g +-(handling)e(to)j(suit)f(y)m(our)g(needs.)38 b(As)21 b(I)h(said)e(ab)s +-(o)m(v)m(e,)25 b(critical)150 408 y(errors)30 b(indicate)g(bugs)g(in)g +-(the)h(library)d(and)i(should)f(not)i(o)s(ccur.)42 b(All)29 +-b Fj(")p Fl(normal)p Fj(")h Fl(error)g(situations)g(are)150 +-518 y(indicated)f(via)h(error)g(return)f(co)s(des)i(from)f(functions,)f +-(and)g(can)i(b)s(e)f(reco)m(v)m(ered)i(from.)150 798 +-y Fk(3.8)68 b(Making)45 b(a)g(Windo)l(ws)h(DLL)150 990 +-y Fl(Ev)m(erything)30 b(related)g(to)h(Windo)m(ws)f(has)g(b)s(een)f +-(con)m(tributed)h(b)m(y)g(Y)-8 b(oshiok)j(a)31 b(Tsuneo)150 +-1100 y(\()p 
Fj(QWF00133@niftyserve.or.jp)46 b Fl(/)52 +-b Fj(tsuneo-y@is.aist-nara.ac.j)o(p)p Fl(\),)g(so)h(y)m(ou)f(should)f +-(send)150 1210 y(y)m(our)30 b(queries)g(to)h(him)e(\(but)h(p)s(erhaps)e +-(Cc:)41 b(me,)31 b Fj(jseward@acm.org)p Fl(\).)150 1366 +-y(My)43 b(v)-5 b(ague)44 b(understanding)d(of)i(what)g(to)h(do)f(is:)65 +-b(using)41 b(Visual)h(C)p Fj(++)g Fl(5.0,)48 b(op)s(en)42 +-b(the)h(pro)5 b(ject)44 b(\014le)150 1476 y Fj(libbz2.dsp)p +-Fl(,)28 b(and)i(build.)37 b(That's)31 b(all.)150 1633 +-y(If)41 b(y)m(ou)g(can't)h(op)s(en)e(the)h(pro)5 b(ject)42 +-b(\014le)e(for)h(some)g(reason,)j(mak)m(e)e(a)g(new)e(one,)k(naming)c +-(these)i(\014les:)150 1742 y Fj(blocksort.c)p Fl(,)28 +-b Fj(bzlib.c)p Fl(,)g Fj(compress.c)p Fl(,)g Fj(crctable.c)p +-Fl(,)g Fj(decompress.c)p Fl(,)f Fj(huffman.c)p Fl(,)150 +-1852 y Fj(randtable.c)32 b Fl(and)j Fj(libbz2.def)p Fl(.)53 +-b(Y)-8 b(ou)36 b(will)d(also)i(need)g(to)h(name)g(the)g(header)f +-(\014les)f Fj(bzlib.h)g Fl(and)150 1962 y Fj(bzlib_private.h)p +-Fl(.)150 2118 y(If)c(y)m(ou)h(don't)f(use)g(V)m(C)p Fj(++)p +-Fl(,)g(y)m(ou)h(ma)m(y)g(need)f(to)h(de\014ne)f(the)h(propro)s(cessor)e +-(sym)m(b)s(ol)g Fj(_WIN32)p Fl(.)150 2275 y(Finally)-8 +-b(,)28 b Fj(dlltest.c)e Fl(is)h(a)i(sample)f(program)g(using)g(the)g +-(DLL.)h(It)g(has)f(a)h(pro)5 b(ject)29 b(\014le,)g Fj(dlltest.dsp)p +-Fl(.)150 2432 y(If)h(y)m(ou)h(just)e(w)m(an)m(t)j(a)e(mak)m(e\014le)h +-(for)f(Visual)f(C,)h(ha)m(v)m(e)i(a)e(lo)s(ok)g(at)i +-Fj(makefile.msc)p Fl(.)150 2589 y(Be)k(a)m(w)m(are)g(that)g(if)e(y)m +-(ou)h(compile)f Fj(bzip2)g Fl(itself)g(on)h(Win32,)h(y)m(ou)g(m)m(ust)f +-(set)g Fj(BZ_UNIX)e Fl(to)j(0)f(and)g Fj(BZ_)150 2698 +-y(LCCWIN32)27 b Fl(to)j(1,)g(in)f(the)g(\014le)g Fj(bzip2.c)p +-Fl(,)e(b)s(efore)i(compiling.)39 b(Otherwise)28 b(the)h(resulting)f +-(binary)f(w)m(on't)150 2808 y(w)m(ork)j(correctly)-8 +-b(.)150 2965 y(I)30 b(ha)m(v)m(en't)i(tried)d(an)m(y)i(of)g(this)e +-(stu\013)h(m)m(yself,)g(but)g(it)f(all)h(lo)s(oks)g(plausible.)p +-eop 
+-%%Page: 31 32 +-31 31 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 +-b(31)150 299 y Fh(4)80 b(Miscellanea)150 583 y Fl(These)30 +-b(are)h(just)f(some)g(random)g(though)m(ts)h(of)f(mine.)40 +-b(Y)-8 b(our)30 b(mileage)h(ma)m(y)g(v)-5 b(ary)d(.)150 +-884 y Fk(4.1)68 b(Limitations)47 b(of)e(the)g(compressed)g(\014le)h +-(format)150 1077 y Fj(bzip2-1.0)p Fl(,)e Fj(0.9.5)e Fl(and)g +-Fj(0.9.0)g Fl(use)h(exactly)h(the)f(same)h(\014le)e(format)i(as)f(the)h +-(previous)d(v)m(ersion,)150 1186 y Fj(bzip2-0.1)p Fl(.)75 +-b(This)41 b(decision)g(w)m(as)i(made)g(in)e(the)i(in)m(terests)g(of)g +-(stabilit)m(y)-8 b(.)77 b(Creating)42 b(y)m(et)i(another)150 +-1296 y(incompatible)21 b(compressed)i(\014le)f(format)i(w)m(ould)e +-(create)i(further)e(confusion)g(and)h(disruption)d(for)j(users.)150 +-1453 y(Nev)m(ertheless,)31 b(this)e(is)g(not)h(a)g(painless)e +-(decision.)39 b(Dev)m(elopmen)m(t)31 b(w)m(ork)f(since)f(the)h(release) +-h(of)f Fj(bzip2-)150 1562 y(0.1)19 b Fl(in)g(August)i(1997)h(has)e(sho) +-m(wn)f(complexities)h(in)f(the)h(\014le)g(format)g(whic)m(h)f(slo)m(w)h +-(do)m(wn)g(decompression)150 1672 y(and,)30 b(in)f(retrosp)s(ect,)i +-(are)g(unnecessary)-8 b(.)40 b(These)31 b(are:)225 1829 +-y Fi(\017)60 b Fl(The)20 b(run-length)g(enco)s(der,)i(whic)m(h)e(is)g +-(the)h(\014rst)f(of)h(the)g(compression)f(transformations,)i(is)e(en)m +-(tirely)330 1938 y(irrelev)-5 b(an)m(t.)63 b(The)38 b(original)e(purp)s +-(ose)g(w)m(as)j(to)g(protect)g(the)f(sorting)g(algorithm)f(from)g(the)i +-(v)m(ery)330 2048 y(w)m(orst)h(case)h(input:)58 b(a)41 +-b(string)e(of)h(rep)s(eated)g(sym)m(b)s(ols.)68 b(But)40 +-b(algorithm)f(steps)h(Q6a)h(and)e(Q6b)330 2157 y(in)30 +-b(the)i(original)e(Burro)m(ws-Wheeler)i(tec)m(hnical)g(rep)s(ort)f +-(\(SR)m(C-124\))i(sho)m(w)f(ho)m(w)g(rep)s(eats)g(can)g(b)s(e)330 +-2267 y(handled)c(without)i(di\016cult)m(y)f(in)g(blo)s(c)m(k)h +-(sorting.)225 2409 y Fi(\017)60 b Fl(The)30 b(randomisation)e(mec)m 
+-(hanism)i(do)s(esn't)g(really)f(need)h(to)g(b)s(e)g(there.)41 +-b(Udi)29 b(Man)m(b)s(er)h(and)f(Gene)330 2518 y(My)m(ers)j(published)c +-(a)33 b(su\016x)e(arra)m(y)h(construction)f(algorithm)g(a)h(few)g(y)m +-(ears)h(bac)m(k,)g(whic)m(h)d(can)j(b)s(e)330 2628 y(emplo)m(y)m(ed)27 +-b(to)h(sort)g(an)m(y)f(blo)s(c)m(k,)h(no)f(matter)h(ho)m(w)f(rep)s +-(etitiv)m(e,)h(in)d(O\(N)j(log)f(N\))h(time.)39 b(Subsequen)m(t)330 +-2737 y(w)m(ork)25 b(b)m(y)f(Kunihik)m(o)f(Sadak)-5 b(ane)24 +-b(has)h(pro)s(duced)e(a)i(deriv)-5 b(ativ)m(e)24 b(O\(N)h(\(log)g(N\))p +-Fj(^)p Fl(2\))h(algorithm)d(whic)m(h)330 2847 y(usually)28 +-b(outp)s(erforms)h(the)i(Man)m(b)s(er-My)m(ers)g(algorithm.)330 +-2988 y(I)g(could)g(ha)m(v)m(e)i(c)m(hanged)f(to)g(Sadak)-5 +-b(ane's)32 b(algorithm,)f(but)g(I)g(\014nd)f(it)h(to)h(b)s(e)f(slo)m(w) +-m(er)h(than)f Fj(bzip2)p Fl('s)330 3098 y(existing)38 +-b(algorithm)g(for)h(most)h(inputs,)f(and)g(the)g(randomisation)f(mec)m +-(hanism)g(protects)i(ade-)330 3208 y(quately)34 b(against)f(bad)g +-(cases.)52 b(I)33 b(didn't)f(think)g(it)i(w)m(as)g(a)g(go)s(o)s(d)f +-(tradeo\013)i(to)f(mak)m(e.)51 b(P)m(artly)34 b(this)330 +-3317 y(is)39 b(due)h(to)h(the)f(fact)h(that)g(I)f(w)m(as)g(not)h(\015o) +-s(o)s(ded)e(with)g(email)g(complain)m(ts)g(ab)s(out)h +-Fj(bzip2-0.1)p Fl('s)330 3427 y(p)s(erformance)30 b(on)g(rep)s(etitiv)m +-(e)g(data,)h(so)g(p)s(erhaps)d(it)i(isn't)g(a)h(problem)d(for)j(real)f +-(inputs.)330 3568 y(Probably)i(the)h(b)s(est)g(long-term)g(solution,)g +-(and)g(the)g(one)h(I)f(ha)m(v)m(e)h(incorp)s(orated)e(in)m(to)i(0.9.5)h +-(and)330 3678 y(ab)s(o)m(v)m(e,)42 b(is)c(to)h(use)f(the)h(existing)f +-(sorting)g(algorithm)f(initially)-8 b(,)38 b(and)g(fall)f(bac)m(k)i(to) +-h(a)f(O\(N)f(\(log)330 3787 y(N\))p Fj(^)p Fl(2\))31 +-b(algorithm)f(if)f(the)i(standard)e(algorithm)h(gets)h(in)m(to)f +-(di\016culties.)225 3929 y Fi(\017)60 b Fl(The)31 b(compressed)f +-(\014le)g(format)i(w)m(as)f(nev)m(er)h(designed)d(to)j(b)s(e)f(handled) 
+-e(b)m(y)i(a)g(library)-8 b(,)29 b(and)i(I)g(ha)m(v)m(e)330 +-4039 y(had)d(to)i(jump)e(though)g(some)i(ho)s(ops)e(to)i(pro)s(duce)e +-(an)h(e\016cien)m(t)g(implemen)m(tation)f(of)h(decompres-)330 +-4148 y(sion.)38 b(It's)26 b(a)h(bit)e(hairy)-8 b(.)38 +-b(T)-8 b(ry)26 b(passing)f Fj(decompress.c)d Fl(through)k(the)g(C)f +-(prepro)s(cessor)g(and)h(y)m(ou'll)330 4258 y(see)32 +-b(what)g(I)f(mean.)45 b(Muc)m(h)32 b(of)g(this)e(complexit)m(y)i(could) +-f(ha)m(v)m(e)i(b)s(een)e(a)m(v)m(oided)h(if)e(the)i(compressed)330 +-4367 y(size)e(of)h(eac)m(h)g(blo)s(c)m(k)f(of)h(data)g(w)m(as)g +-(recorded)f(in)f(the)h(data)h(stream.)225 4509 y Fi(\017)60 +-b Fl(An)30 b(Adler-32)g(c)m(hec)m(ksum,)i(rather)e(than)g(a)h(CR)m(C32) +-g(c)m(hec)m(ksum,)g(w)m(ould)e(b)s(e)h(faster)h(to)g(compute.)150 +-4698 y(It)e(w)m(ould)f(b)s(e)g(fair)g(to)h(sa)m(y)h(that)g(the)f +-Fj(bzip2)e Fl(format)i(w)m(as)h(frozen)f(b)s(efore)f(I)h(prop)s(erly)d +-(and)j(fully)d(under-)150 4807 y(sto)s(o)s(d)k(the)h(p)s(erformance)e +-(consequences)i(of)g(doing)e(so.)150 4964 y(Impro)m(v)m(emen)m(ts)d +-(whic)m(h)e(I)i(w)m(as)g(able)f(to)h(incorp)s(orate)f(in)m(to)g(0.9.0,) +-k(despite)24 b(using)g(the)i(same)g(\014le)e(format,)150 +-5074 y(are:)225 5230 y Fi(\017)60 b Fl(Single)30 b(arra)m(y)i(implemen) +-m(tation)e(of)h(the)h(in)m(v)m(erse)f(BWT.)h(This)e(signi\014can)m(tly) +-f(sp)s(eeds)i(up)f(decom-)330 5340 y(pression,)f(presumably)f(b)s +-(ecause)i(it)g(reduces)g(the)h(n)m(um)m(b)s(er)e(of)i(cac)m(he)h +-(misses.)p eop +-%%Page: 32 33 +-32 32 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 +-b(32)225 299 y Fi(\017)60 b Fl(F)-8 b(aster)27 b(in)m(v)m(erse)e(MTF)h +-(transform)f(for)g(large)h(MTF)f(v)-5 b(alues.)39 b(The)25 +-b(new)g(implemen)m(tation)f(is)g(based)330 408 y(on)30 +-b(the)h(notion)f(of)g(sliding)e(blo)s(c)m(ks)h(of)i(v)-5 +-b(alues.)225 544 y Fi(\017)60 b Fj(bzip2-0.9.0)24 b Fl(no)m(w)k(reads)f +-(and)f(writes)h(\014les)f(with)g Fj(fread)g Fl(and)h +-Fj(fwrite)p 
Fl(;)f(v)m(ersion)h(0.1)i(used)d Fj(putc)330 +-653 y Fl(and)k Fj(getc)p Fl(.)39 b(Duh!)h(W)-8 b(ell,)31 +-b(y)m(ou)f(liv)m(e)g(and)g(learn.)150 836 y(F)-8 b(urther)30 +-b(ahead,)g(it)f(w)m(ould)g(b)s(e)g(nice)h(to)g(b)s(e)g(able)f(to)i(do)e +-(random)g(access)j(in)m(to)d(\014les.)40 b(This)28 b(will)f(require)150 +-945 y(some)k(careful)e(design)h(of)g(compressed)g(\014le)g(formats.)150 +-1227 y Fk(4.2)68 b(P)l(ortabilit)l(y)47 b(issues)150 +-1419 y Fl(After)36 b(some)g(consideration,)g(I)f(ha)m(v)m(e)i(decided)d +-(not)i(to)g(use)g(GNU)g Fj(autoconf)d Fl(to)j(con\014gure)g(0.9.5)h(or) +-150 1529 y(1.0.)150 1686 y Fj(autoconf)p Fl(,)31 b(admirable)g(and)h(w) +-m(onderful)f(though)i(it)f(is,)h(mainly)d(assists)j(with)e(p)s +-(ortabilit)m(y)g(problems)150 1795 y(b)s(et)m(w)m(een)f(Unix-lik)m(e)d +-(platforms.)40 b(But)29 b Fj(bzip2)f Fl(do)s(esn't)h(ha)m(v)m(e)h(m)m +-(uc)m(h)f(in)f(the)h(w)m(a)m(y)h(of)g(p)s(ortabilit)m(y)d(prob-)150 +-1905 y(lems)35 b(on)h(Unix;)j(most)d(of)g(the)h(di\016culties)d(app)s +-(ear)h(when)g(p)s(orting)g(to)i(the)f(Mac,)j(or)d(to)h(Microsoft's)150 +-2015 y(op)s(erating)26 b(systems.)40 b Fj(autoconf)25 +-b Fl(do)s(esn't)h(help)g(in)f(those)j(cases,)h(and)d(brings)f(in)g(a)j +-(whole)e(load)g(of)h(new)150 2124 y(complexit)m(y)-8 +-b(.)150 2281 y(Most)28 b(p)s(eople)f(should)f(b)s(e)h(able)g(to)h +-(compile)e(the)i(library)d(and)i(program)h(under)e(Unix)g(straigh)m(t)i +-(out-of-)150 2391 y(the-b)s(o)m(x,)j(so)g(to)g(sp)s(eak,)f(esp)s +-(ecially)f(if)g(y)m(ou)i(ha)m(v)m(e)g(a)g(v)m(ersion)f(of)g(GNU)h(C)f +-(a)m(v)-5 b(ailable.)150 2547 y(There)32 b(are)h(a)g(couple)f(of)h +-Fj(__inline__)d Fl(directiv)m(es)i(in)f(the)i(co)s(de.)48 +-b(GNU)33 b(C)f(\()p Fj(gcc)p Fl(\))g(should)f(b)s(e)h(able)g(to)150 +-2657 y(handle)24 b(them.)39 b(If)25 b(y)m(ou're)i(not)e(using)g(GNU)h +-(C,)f(y)m(our)h(C)f(compiler)f(shouldn't)g(see)i(them)f(at)i(all.)38 +-b(If)25 b(y)m(our)150 2767 y(compiler)k(do)s(es,)i(for)g(some)g 
+-(reason,)h(see)f(them)g(and)f(do)s(esn't)h(lik)m(e)f(them,)i(just)e +-Fj(#define)f(__inline__)150 2876 y Fl(to)37 b(b)s(e)f +-Fj(/*)30 b(*/)p Fl(.)58 b(One)36 b(easy)h(w)m(a)m(y)g(to)h(do)e(this)f +-(is)h(to)h(compile)e(with)g(the)i(\015ag)g Fj(-D__inline__=)p +-Fl(,)d(whic)m(h)150 2986 y(should)28 b(b)s(e)i(understo)s(o)s(d)f(b)m +-(y)h(most)h(Unix)e(compilers.)150 3143 y(If)35 b(y)m(ou)g(still)e(ha)m +-(v)m(e)j(di\016culties,)e(try)h(compiling)e(with)g(the)j(macro)f +-Fj(BZ_STRICT_ANSI)c Fl(de\014ned.)54 b(This)150 3252 +-y(should)28 b(enable)i(y)m(ou)h(to)g(build)d(the)i(library)e(in)h(a)i +-(strictly)f(ANSI)g(complian)m(t)f(en)m(vironmen)m(t.)41 +-b(Building)150 3362 y(the)25 b(program)f(itself)f(lik)m(e)g(this)h(is)f +-(dangerous)h(and)g(not)g(supp)s(orted,)g(since)g(y)m(ou)h(remo)m(v)m(e) +-g Fj(bzip2)p Fl('s)e(c)m(hec)m(ks)150 3471 y(against)30 +-b(compressing)f(directories,)g(sym)m(b)s(olic)g(links,)f(devices,)i +-(and)f(other)h(not-really-a-\014le)g(en)m(tities.)150 +-3581 y(This)f(could)g(cause)i(\014lesystem)f(corruption!)150 +-3738 y(One)e(other)i(thing:)39 b(if)27 b(y)m(ou)j(create)g(a)f +-Fj(bzip2)f Fl(binary)f(for)i(public)d(distribution,)g(please)i(try)h +-(and)g(link)d(it)150 3847 y(statically)g(\()p Fj(gcc)k(-s)p +-Fl(\).)39 b(This)25 b(a)m(v)m(oids)i(all)f(sorts)h(of)g(library-v)m +-(ersion)d(issues)h(that)i(others)g(ma)m(y)g(encoun)m(ter)150 +-3957 y(later)j(on.)150 4114 y(If)f(y)m(ou)g(build)e Fj(bzip2)h +-Fl(on)h(Win32,)h(y)m(ou)f(m)m(ust)g(set)h Fj(BZ_UNIX)e +-Fl(to)i(0)f(and)g Fj(BZ_LCCWIN32)d Fl(to)k(1,)g(in)e(the)i(\014le)150 +-4223 y Fj(bzip2.c)p Fl(,)f(b)s(efore)h(compiling.)38 +-b(Otherwise)29 b(the)i(resulting)d(binary)h(w)m(on't)i(w)m(ork)f +-(correctly)-8 b(.)150 4505 y Fk(4.3)68 b(Rep)t(orting)46 +-b(bugs)150 4698 y Fl(I)25 b(tried)f(prett)m(y)i(hard)e(to)i(mak)m(e)g +-(sure)f Fj(bzip2)e Fl(is)i(bug)f(free,)j(b)s(oth)d(b)m(y)h(design)f +-(and)h(b)m(y)g(testing.)39 b(Hop)s(efully)150 4807 y(y)m(ou'll)29 
+-b(nev)m(er)i(need)f(to)h(read)g(this)e(section)h(for)h(real.)150 +-4964 y(Nev)m(ertheless,)36 b(if)c Fj(bzip2)h Fl(dies)g(with)f(a)i +-(segmen)m(tation)h(fault,)g(a)f(bus)f(error)g(or)h(an)g(in)m(ternal)e +-(assertion)150 5074 y(failure,)i(it)h(will)d(ask)j(y)m(ou)g(to)g(email) +-f(me)h(a)g(bug)f(rep)s(ort.)54 b(Exp)s(erience)33 b(with)h(v)m(ersion)g +-(0.1)i(sho)m(ws)e(that)150 5183 y(almost)c(all)g(these)h(problems)d +-(can)j(b)s(e)f(traced)h(to)g(either)f(compiler)e(bugs)i(or)g(hardw)m +-(are)g(problems.)225 5340 y Fi(\017)60 b Fl(Recompile)22 +-b(the)h(program)g(with)f(no)h(optimisation,)g(and)f(see)i(if)e(it)g(w)m +-(orks.)39 b(And/or)22 b(try)h(a)g(di\013eren)m(t)p eop +-%%Page: 33 34 +-33 33 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 +-b(33)330 299 y(compiler.)77 b(I)43 b(heard)f(all)g(sorts)h(of)h +-(stories)e(ab)s(out)h(v)-5 b(arious)42 b(\015a)m(v)m(ours)h(of)h(GNU)f +-(C)g(\(and)g(other)330 408 y(compilers\))20 b(generating)i(bad)e(co)s +-(de)i(for)f Fj(bzip2)p Fl(,)h(and)f(I'v)m(e)h(run)e(across)i(t)m(w)m(o) +-g(suc)m(h)f(examples)g(m)m(yself.)330 606 y(2.7.X)35 +-b(v)m(ersions)e(of)g(GNU)h(C)f(are)h(kno)m(wn)f(to)h(generate)h(bad)d +-(co)s(de)i(from)f(time)g(to)h(time,)g(at)g(high)330 716 +-y(optimisation)20 b(lev)m(els.)37 b(If)21 b(y)m(ou)g(get)i(problems,)e +-(try)g(using)f(the)i(\015ags)f Fj(-O2)f(-fomit-frame-pointer)330 +-825 y(-fno-strength-reduce)p Fl(.)35 b(Y)-8 b(ou)31 b(should)d(sp)s +-(eci\014cally)h Fc(not)j Fl(use)e Fj(-funroll-loops)p +-Fl(.)330 1023 y(Y)-8 b(ou)38 b(ma)m(y)g(notice)g(that)g(the)g(Mak)m +-(e\014le)g(runs)e(six)g(tests)i(as)g(part)f(of)h(the)g(build)c(pro)s +-(cess.)62 b(If)37 b(the)330 1132 y(program)43 b(passes)g(all)f(of)h +-(these,)k(it's)c(a)h(prett)m(y)f(go)s(o)s(d)g(\(but)g(not)g(100\045\))i +-(indication)c(that)j(the)330 1242 y(compiler)29 b(has)h(done)g(its)g +-(job)g(correctly)-8 b(.)225 1440 y Fi(\017)60 b Fl(If)33 +-b Fj(bzip2)f Fl(crashes)i(randomly)-8 b(,)33 b(and)g(the)h(crashes)g 
+-(are)g(not)g(rep)s(eatable,)g(y)m(ou)g(ma)m(y)g(ha)m(v)m(e)h(a)f +-(\015aky)330 1549 y(memory)k(subsystem.)64 b Fj(bzip2)37 +-b Fl(really)g(hammers)h(y)m(our)g(memory)g(hierarc)m(h)m(y)-8 +-b(,)41 b(and)d(if)f(it's)h(a)h(bit)330 1659 y(marginal,)33 +-b(y)m(ou)h(ma)m(y)g(get)h(these)f(problems.)49 b(Ditto)34 +-b(if)f(y)m(our)h(disk)e(or)h(I/O)h(subsystem)e(is)h(slo)m(wly)330 +-1768 y(failing.)39 b(Y)-8 b(up,)30 b(this)f(really)g(do)s(es)h(happ)s +-(en.)330 1966 y(T)-8 b(ry)28 b(using)f(a)i(di\013eren)m(t)f(mac)m(hine) +-g(of)h(the)g(same)f(t)m(yp)s(e,)i(and)e(see)h(if)e(y)m(ou)i(can)g(rep)s +-(eat)g(the)f(problem.)225 2163 y Fi(\017)60 b Fl(This)21 +-b(isn't)i(really)f(a)h(bug,)i(but)d(...)39 b(If)23 b +-Fj(bzip2)f Fl(tells)g(y)m(ou)h(y)m(our)h(\014le)e(is)g(corrupted)h(on)g +-(decompression,)330 2273 y(and)29 b(y)m(ou)g(obtained)f(the)i(\014le)e +-(via)h(FTP)-8 b(,)29 b(there)h(is)e(a)h(p)s(ossibilit)m(y)d(that)k(y)m +-(ou)f(forgot)h(to)g(tell)e(FTP)h(to)330 2383 y(do)23 +-b(a)g(binary)e(mo)s(de)i(transfer.)38 b(That)23 b(absolutely)f(will)e +-(cause)j(the)h(\014le)e(to)h(b)s(e)g(non-decompressible.)330 +-2492 y(Y)-8 b(ou'll)30 b(ha)m(v)m(e)h(to)g(transfer)f(it)g(again.)150 +-2737 y(If)i(y)m(ou'v)m(e)h(incorp)s(orated)e Fj(libbzip2)f +-Fl(in)m(to)i(y)m(our)g(o)m(wn)g(program)g(and)g(are)g(getting)h +-(problems,)e(please,)150 2847 y(please,)d(please,)h(c)m(hec)m(k)g(that) +-f(the)g(parameters)g(y)m(ou)g(are)g(passing)f(in)f(calls)h(to)h(the)g +-(library)-8 b(,)26 b(are)j(correct,)150 2956 y(and)e(in)f(accordance)k +-(with)c(what)i(the)g(do)s(cumen)m(tation)f(sa)m(ys)h(is)f(allo)m(w)m +-(able.)39 b(I)28 b(ha)m(v)m(e)h(tried)e(to)h(mak)m(e)h(the)150 +-3066 y(library)f(robust)i(against)g(suc)m(h)g(problems,)f(but)h(I'm)g +-(sure)g(I)g(ha)m(v)m(en't)h(succeeded.)150 3223 y(Finally)-8 +-b(,)32 b(if)g(the)h(ab)s(o)m(v)m(e)i(commen)m(ts)e(don't)g(help,)g(y)m +-(ou'll)f(ha)m(v)m(e)i(to)g(send)e(me)h(a)g(bug)g(rep)s(ort.)48 +-b(No)m(w,)34 b(it's)150 3332 
y(just)c(amazing)g(ho)m(w)h(man)m(y)f(p)s +-(eople)g(will)d(send)j(me)g(a)h(bug)f(rep)s(ort)g(sa)m(ying)g +-(something)g(lik)m(e)481 3483 y(bzip2)f(crashed)h(with)f(segmen)m +-(tation)j(fault)e(on)g(m)m(y)g(mac)m(hine)150 3640 y(and)h(absolutely)f +-(nothing)h(else.)44 b(Needless)32 b(to)g(sa)m(y)-8 b(,)33 +-b(a)f(suc)m(h)f(a)h(rep)s(ort)f(is)g Fc(totally)-8 b(,)32 +-b(utterly)-8 b(,)32 b(completely)150 3750 y(and)40 b(comprehensiv)m +-(ely)g(100\045)h(useless;)46 b(a)41 b(w)m(aste)g(of)g(y)m(our)g(time,)i +-(m)m(y)e(time,)i(and)e(net)g(bandwidth)p Fl(.)150 3859 +-y(With)31 b(no)h(details)f(at)i(all,)e(there's)h(no)g(w)m(a)m(y)h(I)f +-(can)g(p)s(ossibly)d(b)s(egin)h(to)j(\014gure)e(out)i(what)e(the)i +-(problem)150 3969 y(is.)150 4126 y(The)d(rules)e(of)i(the)g(game)h +-(are:)41 b(facts,)32 b(facts,)f(facts.)41 b(Don't)31 +-b(omit)f(them)g(b)s(ecause)g Fj(")p Fl(oh,)g(they)g(w)m(on't)h(b)s(e) +-150 4235 y(relev)-5 b(an)m(t)p Fj(")p Fl(.)41 b(A)m(t)31 +-b(the)g(bare)f(minim)m(um:)481 4386 y(Mac)m(hine)h(t)m(yp)s(e.)61 +-b(Op)s(erating)29 b(system)h(v)m(ersion.)481 4490 y(Exact)h(v)m(ersion) +-f(of)h Fj(bzip2)e Fl(\(do)h Fj(bzip2)47 b(-V)p Fl(\).)481 +-4594 y(Exact)31 b(v)m(ersion)f(of)h(the)f(compiler)f(used.)481 +-4698 y(Flags)i(passed)e(to)j(the)e(compiler.)150 4854 +-y(Ho)m(w)m(ev)m(er,)i(the)d(most)h(imp)s(ortan)m(t)f(single)f(thing)g +-(that)i(will)d(help)h(me)h(is)f(the)i(\014le)e(that)i(y)m(ou)g(w)m(ere) +-g(trying)150 4964 y(to)f(compress)f(or)g(decompress)g(at)h(the)f(time)g +-(the)g(problem)f(happ)s(ened.)38 b(Without)28 b(that,)h(m)m(y)g(abilit) +-m(y)d(to)150 5074 y(do)k(an)m(ything)g(more)h(than)f(sp)s(eculate)g(ab) +-s(out)g(the)g(cause,)i(is)d(limited.)150 5230 y(Please)34 +-b(remem)m(b)s(er)f(that)h(I)f(connect)i(to)f(the)g(In)m(ternet)g(with)e +-(a)i(mo)s(dem,)g(so)f(y)m(ou)h(should)e(con)m(tact)k(me)150 +-5340 y(b)s(efore)30 b(mailing)e(me)j(h)m(uge)f(\014les.)p +-eop +-%%Page: 34 35 +-34 34 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 
+-b(34)150 299 y Fk(4.4)68 b(Did)45 b(y)l(ou)g(get)h(the)f(righ)l(t)h +-(pac)l(k)-7 b(age?)150 491 y Fj(bzip2)34 b Fl(is)h(a)h(resource)g(hog.) +-56 b(It)36 b(soaks)g(up)f(large)g(amoun)m(ts)h(of)g(CPU)f(cycles)h(and) +-f(memory)-8 b(.)57 b(Also,)36 b(it)150 601 y(giv)m(es)26 +-b(v)m(ery)h(large)f(latencies.)39 b(In)25 b(the)h(w)m(orst)g(case,)i(y) +-m(ou)f(can)f(feed)g(man)m(y)g(megab)m(ytes)h(of)f(uncompressed)150 +-711 y(data)45 b(in)m(to)e(the)i(library)c(b)s(efore)j(getting)g(an)m(y) +-g(compressed)g(output,)j(so)d(this)f(probably)f(rules)h(out)150 +-820 y(applications)29 b(requiring)e(in)m(teractiv)m(e)32 +-b(b)s(eha)m(viour.)150 977 y(These)38 b(aren't)h(faults)e(of)h(m)m(y)g +-(implemen)m(tation,)h(I)f(hop)s(e,)i(but)d(more)h(an)g(in)m(trinsic)e +-(prop)s(ert)m(y)h(of)i(the)150 1087 y(Burro)m(ws-Wheeler)30 +-b(transform)g(\(unfortunately\).)40 b(Ma)m(yb)s(e)31 +-b(this)e(isn't)h(what)g(y)m(ou)h(w)m(an)m(t.)150 1244 +-y(If)h(y)m(ou)h(w)m(an)m(t)g(a)g(compressor)g(and/or)f(library)e(whic)m +-(h)h(is)h(faster,)i(uses)e(less)g(memory)g(but)g(gets)h(prett)m(y)150 +-1353 y(go)s(o)s(d)e(compression,)g(and)g(has)h(minimal)c(latency)-8 +-b(,)33 b(consider)e(Jean-loup)f(Gailly's)g(and)h(Mark)h(Adler's)150 +-1463 y(w)m(ork,)f Fj(zlib-1.1.2)c Fl(and)j Fj(gzip-1.2.4)p +-Fl(.)38 b(Lo)s(ok)31 b(for)f(them)g(at)150 1620 y Fj +-(http://www.cdrom.com/pub)o(/inf)o(ozip)o(/zl)o(ib)24 +-b Fl(and)30 b Fj(http://www.gzip.org)25 b Fl(resp)s(ectiv)m(ely)-8 +-b(.)150 1776 y(F)g(or)32 b(something)f(faster)i(and)e(ligh)m(ter)f +-(still,)h(y)m(ou)g(migh)m(t)h(try)f(Markus)h(F)g(X)f(J)h(Ob)s(erh)m +-(umer's)d Fj(LZO)i Fl(real-)150 1886 y(time)f +-(compression/decompression)f(library)-8 b(,)28 b(at)150 +-1996 y Fj(http://wildsau.idv.uni-l)o(inz.)o(ac.a)o(t/m)o(fx/l)o(zo.h)o +-(tml)o Fl(.)150 2152 y(If)38 b(y)m(ou)h(w)m(an)m(t)g(to)h(use)e(the)g +-Fj(bzip2)g Fl(algorithms)f(to)i(compress)f(small)g(blo)s(c)m(ks)f(of)i +-(data,)j(64k)d(b)m(ytes)g(or)150 2262 
y(smaller,)i(for)e(example)g(on)h +-(an)f(on-the-\015y)h(disk)e(compressor,)k(y)m(ou'd)e(b)s(e)f(w)m(ell)g +-(advised)f(not)i(to)g(use)150 2372 y(this)i(library)-8 +-b(.)77 b(Instead,)47 b(I'v)m(e)d(made)f(a)h(sp)s(ecial)e(library)f +-(tuned)h(for)h(that)h(kind)d(of)j(use.)79 b(It's)43 b(part)150 +-2481 y(of)d Fj(e2compr-0.40)p Fl(,)f(an)g(on-the-\015y)h(disk)e +-(compressor)h(for)h(the)f(Lin)m(ux)f Fj(ext2)h Fl(\014lesystem.)67 +-b(Lo)s(ok)40 b(at)150 2591 y Fj(http://www.netspace.net.)o(au/~)o(reit) +-o(er/)o(e2co)o(mpr)p Fl(.)150 2880 y Fk(4.5)68 b(T)-11 +-b(esting)150 3072 y Fl(A)30 b(record)h(of)f(the)h(tests)g(I'v)m(e)g +-(done.)150 3229 y(First,)f(some)h(data)g(sets:)225 3386 +-y Fi(\017)60 b Fl(B:)32 b(a)f(directory)f(con)m(taining)h(6001)i +-(\014les,)d(one)h(for)g(ev)m(ery)h(length)e(in)g(the)h(range)g(0)h(to)f +-(6000)i(b)m(ytes.)330 3496 y(The)d(\014les)f(con)m(tain)i(random)e(lo)m +-(w)m(ercase)j(letters.)41 b(18.7)32 b(megab)m(ytes.)225 +-3633 y Fi(\017)60 b Fl(H:)36 b(m)m(y)f(home)h(directory)f(tree.)56 +-b(Do)s(cumen)m(ts,)38 b(source)d(co)s(de,)i(mail)d(\014les,)i +-(compressed)f(data.)57 b(H)330 3743 y(con)m(tains)39 +-b(B,)h(and)f(also)g(a)g(directory)g(of)g(\014les)f(designed)g(as)i(b)s +-(oundary)d(cases)j(for)f(the)g(sorting;)330 3853 y(mostly)30 +-b(v)m(ery)h(rep)s(etitiv)m(e,)f(nast)m(y)h(\014les.)39 +-b(565)32 b(megab)m(ytes.)225 3990 y Fi(\017)60 b Fl(A:)43 +-b(directory)f(tree)i(holding)d(v)-5 b(arious)41 b(applications)g(built) +-g(from)h(source:)66 b Fj(egcs)p Fl(,)45 b Fj(gcc-2.8.1)p +-Fl(,)330 4100 y(KDE,)31 b(GTK,)f(Octa)m(v)m(e,)j(etc.)41 +-b(2200)33 b(megab)m(ytes.)150 4285 y(The)i(tests)g(conducted)g(are)h +-(as)f(follo)m(ws.)54 b(Eac)m(h)36 b(test)g(means)f(compressing)f(\(a)h +-(cop)m(y)h(of)7 b(\))36 b(eac)m(h)g(\014le)e(in)150 4394 +-y(the)d(data)g(set,)g(decompressing)e(it)h(and)g(comparing)f(it)h +-(against)h(the)g(original.)150 4551 y(First,)26 b(a)g(bunc)m(h)f(of)h 
+-(tests)h(with)d(blo)s(c)m(k)h(sizes)h(and)f(in)m(ternal)g(bu\013er)f +-(sizes)i(set)g(v)m(ery)g(small,)g(to)g(detect)i(an)m(y)150 +-4661 y(problems)g(with)g(the)i(blo)s(c)m(king)f(and)g(bu\013ering)e +-(mec)m(hanisms.)40 b(This)28 b(required)g(mo)s(difying)f(the)j(source) +-150 4770 y(co)s(de)h(so)f(as)h(to)g(try)f(to)h(break)g(it.)199 +-4927 y(1.)61 b(Data)32 b(set)f(H,)g(with)e(bu\013er)g(size)h(of)h(1)g +-(b)m(yte,)g(and)f(blo)s(c)m(k)g(size)g(of)g(23)i(b)m(ytes.)199 +-5065 y(2.)61 b(Data)32 b(set)f(B,)g(bu\013er)e(sizes)h(1)h(b)m(yte,)g +-(blo)s(c)m(k)f(size)g(1)h(b)m(yte.)199 5202 y(3.)61 b(As)30 +-b(\(2\))i(but)d(small-mo)s(de)g(decompression.)199 5340 +-y(4.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h(size)g(2)h(b)m(ytes.)p +-eop +-%%Page: 35 36 +-35 35 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 +-b(35)199 299 y(5.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h(size)g(3)h +-(b)m(ytes.)199 431 y(6.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h(size) +-g(4)h(b)m(ytes.)199 564 y(7.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h +-(size)g(5)h(b)m(ytes.)199 697 y(8.)61 b(As)30 b(\(2\))i(with)d(blo)s(c) +-m(k)h(size)g(6)h(b)m(ytes)g(and)e(small-mo)s(de)g(decompression.)199 +-829 y(9.)61 b(H)30 b(with)g(bu\013er)f(size)h(of)h(1)f(b)m(yte,)i(but)d +-(normal)h(blo)s(c)m(k)g(size)g(\(up)f(to)j(900000)h(b)m(ytes\).)150 +-1009 y(Then)c(some)i(tests)g(with)e(unmo)s(di\014ed)f(source)i(co)s +-(de.)199 1166 y(1.)61 b(H,)31 b(all)e(settings)h(normal.)199 +-1299 y(2.)61 b(As)30 b(\(1\),)i(with)d(small-mo)s(de)g(decompress.)199 +-1431 y(3.)61 b(H,)31 b(compress)f(with)f(\015ag)i Fj(-1)p +-Fl(.)199 1564 y(4.)61 b(H,)31 b(compress)f(with)f(\015ag)i +-Fj(-s)p Fl(,)f(decompress)g(with)f(\015ag)i Fj(-s)p Fl(.)199 +-1697 y(5.)61 b(F)-8 b(orw)m(ards)33 b(compatibilit)m(y:)45 +-b(H,)33 b Fj(bzip2-0.1pl2)d Fl(compressing,)j Fj(bzip2-0.9.5)d +-Fl(decompressing,)330 1806 y(all)f(settings)i(normal.)199 +-1939 y(6.)61 b(Bac)m(kw)m(ards)23 b(compatibilit)m(y:)35 +-b(H,)23 b Fj(bzip2-0.9.5)c 
Fl(compressing,)k Fj(bzip2-0.1pl2)c +-Fl(decompressing,)330 2048 y(all)29 b(settings)i(normal.)199 +-2181 y(7.)61 b(Bigger)31 b(tests:)41 b(A,)31 b(all)e(settings)i +-(normal.)199 2314 y(8.)61 b(As)30 b(\(7\),)i(using)d(the)i(fallbac)m(k) +-e(\(Sadak)-5 b(ane-lik)m(e\))31 b(sorting)f(algorithm.)199 +-2446 y(9.)61 b(As)30 b(\(8\),)i(compress)e(with)f(\015ag)i +-Fj(-1)p Fl(,)f(decompress)g(with)f(\015ag)i Fj(-s)p Fl(.)154 +-2579 y(10.)61 b(H,)31 b(using)e(the)h(fallbac)m(k)g(sorting)g +-(algorithm.)154 2711 y(11.)61 b(F)-8 b(orw)m(ards)33 +-b(compatibilit)m(y:)45 b(A,)33 b Fj(bzip2-0.1pl2)d Fl(compressing,)j +-Fj(bzip2-0.9.5)d Fl(decompressing,)330 2821 y(all)f(settings)i(normal.) +-154 2954 y(12.)61 b(Bac)m(kw)m(ards)23 b(compatibilit)m(y:)35 +-b(A,)23 b Fj(bzip2-0.9.5)c Fl(compressing,)k Fj(bzip2-0.1pl2)c +-Fl(decompressing,)330 3063 y(all)29 b(settings)i(normal.)154 +-3196 y(13.)61 b(Misc)39 b(test:)58 b(ab)s(out)39 b(400)h(megab)m(ytes)h +-(of)e Fj(.tar)f Fl(\014les)f(with)h Fj(bzip2)f Fl(compiled)h(with)f +-(Chec)m(k)m(er)j(\(a)330 3305 y(memory)30 b(access)i(error)e(detector,) +-i(lik)m(e)e(Purify\).)154 3438 y(14.)61 b(Misc)30 b(tests)h(to)g(mak)m +-(e)h(sure)d(it)h(builds)e(and)h(runs)g(ok)i(on)f(non-Lin)m(ux/x86)g +-(platforms.)150 3618 y(These)35 b(tests)h(w)m(ere)f(conducted)g(on)g(a) +-h(225)g(MHz)g(IDT)f(WinChip)d(mac)m(hine,)k(running)d(Lin)m(ux)g +-(2.0.36.)150 3728 y(They)d(represen)m(t)g(nearly)g(a)h(w)m(eek)g(of)f +-(con)m(tin)m(uous)g(computation.)41 b(All)29 b(tests)i(completed)f +-(successfully)-8 b(.)150 4003 y Fk(4.6)68 b(F)-11 b(urther)44 +-b(reading)150 4196 y Fj(bzip2)28 b Fl(is)h(not)h(researc)m(h)g(w)m +-(ork,)g(in)e(the)i(sense)g(that)g(it)f(do)s(esn't)g(presen)m(t)h(an)m +-(y)g(new)f(ideas.)40 b(Rather,)30 b(it's)150 4306 y(an)g(engineering)f +-(exercise)i(based)f(on)g(existing)g(ideas.)150 4463 y(F)-8 +-b(our)31 b(do)s(cumen)m(ts)f(describ)s(e)e(essen)m(tially)i(all)f(the)i +-(ideas)e(b)s(ehind)f Fj(bzip2)p Fl(:)390 4614 y Fj(Michael)46 
+-b(Burrows)g(and)h(D.)g(J.)g(Wheeler:)485 4717 y("A)h(block-sorting)c +-(lossless)h(data)i(compression)e(algorithm")533 4821 +-y(10th)i(May)g(1994.)533 4925 y(Digital)f(SRC)h(Research)e(Report)i +-(124.)533 5029 y(ftp://ftp.digital.com/pub)o(/DEC)o(/SR)o(C/re)o(sear)o +-(ch-)o(repo)o(rts/)o(SRC)o(-124)o(.ps.)o(gz)533 5132 +-y(If)g(you)g(have)g(trouble)f(finding)g(it,)g(try)h(searching)f(at)h +-(the)533 5236 y(New)g(Zealand)f(Digital)g(Library,)f +-(http://www.nzdl.org.)p eop +-%%Page: 36 37 +-36 36 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 +-b(36)390 299 y Fj(Daniel)46 b(S.)h(Hirschberg)e(and)i(Debra)g(A.)g +-(LeLewer)485 403 y("Efficient)e(Decoding)h(of)h(Prefix)f(Codes")533 +-506 y(Communications)e(of)j(the)g(ACM,)g(April)f(1990,)h(Vol)f(33,)h +-(Number)f(4.)533 610 y(You)h(might)f(be)i(able)e(to)h(get)g(an)h +-(electronic)d(copy)h(of)h(this)676 714 y(from)g(the)g(ACM)g(Digital)f +-(Library.)390 922 y(David)g(J.)i(Wheeler)533 1025 y(Program)e(bred3.c)g +-(and)h(accompanying)d(document)i(bred3.ps.)533 1129 y(This)h(contains)e +-(the)i(idea)g(behind)f(the)h(multi-table)e(Huffman)533 +-1233 y(coding)h(scheme.)533 1337 y(ftp://ftp.cl.cam.ac.uk/us)o(ers/)o +-(djw)o(3/)390 1544 y(Jon)h(L.)g(Bentley)f(and)h(Robert)f(Sedgewick)485 +-1648 y("Fast)h(Algorithms)e(for)i(Sorting)f(and)g(Searching)g(Strings") +-533 1752 y(Available)f(from)i(Sedgewick's)e(web)i(page,)533 +-1856 y(www.cs.princeton.edu/~rs)150 2012 y Fl(The)29 +-b(follo)m(wing)f(pap)s(er)g(giv)m(es)h(v)-5 b(aluable)28 +-b(additional)g(insigh)m(ts)f(in)m(to)j(the)f(algorithm,)g(but)g(is)f +-(not)i(imme-)150 2122 y(diately)g(the)g(basis)f(of)i(an)m(y)g(co)s(de)f +-(used)g(in)f(bzip2.)390 2273 y Fj(Peter)46 b(Fenwick:)533 +-2377 y(Block)h(Sorting)e(Text)i(Compression)533 2481 +-y(Proceedings)e(of)i(the)g(19th)g(Australasian)d(Computer)i(Science)f +-(Conference,)629 2584 y(Melbourne,)g(Australia.)92 b(Jan)47 +-b(31)g(-)h(Feb)f(2,)g(1996.)533 2688 y(ftp://ftp.cs.auckland.ac.)o 
+-(nz/p)o(ub/)o(pete)o(r-f/)o(ACS)o(C96p)o(aper)o(.ps)150 +-2845 y Fl(Kunihik)m(o)28 b(Sadak)-5 b(ane's)31 b(sorting)e(algorithm,)h +-(men)m(tioned)g(ab)s(o)m(v)m(e,)i(is)d(a)m(v)-5 b(ailable)30 +-b(from:)390 2996 y Fj(http://naomi.is.s.u-toky)o(o.ac)o(.jp/)o(~sa)o +-(da/p)o(aper)o(s/S)o(ada9)o(8b.p)o(s.g)o(z)150 3153 y +-Fl(The)41 b(Man)m(b)s(er-My)m(ers)g(su\016x)g(arra)m(y)g(construction)g +-(algorithm)f(is)g(describ)s(ed)f(in)h(a)i(pap)s(er)e(a)m(v)-5 +-b(ailable)150 3262 y(from:)390 3413 y Fj(http://www.cs.arizona.ed)o +-(u/pe)o(ople)o(/ge)o(ne/P)o(APER)o(S/s)o(uffi)o(x.ps)150 +-3570 y Fl(Finally)d(,)33 b(the)h(follo)m(wing)e(pap)s(er)h(do)s(cumen)m +-(ts)g(some)h(recen)m(t)h(in)m(v)m(estigations)e(I)h(made)f(in)m(to)h +-(the)g(p)s(erfor-)150 3680 y(mance)d(of)f(sorting)g(algorithms:)390 +-3831 y Fj(Julian)46 b(Seward:)533 3935 y(On)h(the)g(Performance)e(of)i +-(BWT)g(Sorting)f(Algorithms)533 4038 y(Proceedings)f(of)i(the)g(IEEE)g +-(Data)f(Compression)f(Conference)g(2000)629 4142 y(Snowbird,)g(Utah.)94 +-b(28-30)46 b(March)h(2000.)p eop +-%%Page: -1 38 +--1 37 bop 3725 -116 a Fl(i)150 299 y Fh(T)-13 b(able)54 +-b(of)g(Con)l(ten)l(ts)150 641 y Fk(1)135 b(In)l(tro)t(duction)15 +-b Fb(.)20 b(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f +-(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)60 +-b Fk(2)150 911 y(2)135 b(Ho)l(w)45 b(to)h(use)f Fd(bzip2)31 +-b Fb(.)19 b(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g +-(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)78 b Fk(3)1047 +-1048 y Fl(NAME)20 b Fa(.)c(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)50 b Fl(3)1047 +-1157 y(SYNOPSIS)21 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
+-g(.)g(.)g(.)g(.)h(.)f(.)g(.)50 b Fl(3)1047 1267 y(DESCRIPTION)10 +-b Fa(.)j(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)40 +-b Fl(3)1047 1377 y(OPTIONS)16 b Fa(.)d(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)45 b Fl(4)1047 +-1486 y(MEMOR)-8 b(Y)31 b(MANA)m(GEMENT)14 b Fa(.)j(.)e(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)44 +-b Fl(6)1047 1596 y(RECO)m(VERING)30 b(D)m(A)-8 b(T)g(A)32 +-b(FR)m(OM)f(D)m(AMA)m(GED)i(FILES)1256 1705 y Fa(.)15 +-b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)57 b Fl(7)1047 1815 y(PERF)m(ORMANCE)30 +-b(NOTES)9 b Fa(.)14 b(.)h(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)38 b Fl(7)1047 1924 +-y(CA)-10 b(VEA)i(TS)10 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)40 b Fl(8)1047 2034 +-y(A)m(UTHOR)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)53 b Fl(8)150 2276 y Fk(3)135 +-b(Programming)46 b(with)f Fd(libbzip2)29 b Fb(.)16 b(.)j(.)h(.)f(.)h(.) +-f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)74 b Fk(9)449 +-2413 y Fl(3.1)92 b(T)-8 b(op-lev)m(el)30 b(structure)24 +-b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h +-(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)54 b Fl(9)748 2523 y(3.1.1)93 b(Lo)m(w-lev)m(el)30 +-b(summary)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
+-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)53 +-b Fl(9)748 2633 y(3.1.2)93 b(High-lev)m(el)29 b(summary)12 +-b Fa(.)i(.)h(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)41 b +-Fl(9)748 2742 y(3.1.3)93 b(Utilit)m(y)29 b(functions)g(summary)12 +-b Fa(.)h(.)j(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)41 b Fl(10)449 2852 y(3.2)92 b(Error)29 +-b(handling)18 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)48 b Fl(10)449 +-2961 y(3.3)92 b(Lo)m(w-lev)m(el)31 b(in)m(terface)d Fa(.)15 +-b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)57 b Fl(12)748 3071 y(3.3.1)93 b Fj(BZ2_bzCompressInit)21 +-b Fa(.)9 b(.)15 b(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)50 b Fl(12)748 +-3181 y(3.3.2)93 b Fj(BZ2_bzCompress)9 b Fa(.)h(.)15 b(.)g(.)g(.)g(.)g +-(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)38 b Fl(14)748 3290 y(3.3.3)93 +-b Fj(BZ2_bzCompressEnd)23 b Fa(.)10 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g +-(.)52 b Fl(17)748 3400 y(3.3.4)93 b Fj(BZ2_bzDecompressInit)16 +-b Fa(.)9 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h +-(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)45 b Fl(17)748 3509 +-y(3.3.5)93 b Fj(BZ2_bzDecompress)21 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
+-g(.)g(.)g(.)55 b Fl(17)748 3619 y(3.3.6)93 b Fj(BZ2_bzDecompressEnd)18 +-b Fa(.)10 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)48 b Fl(19)449 +-3729 y(3.4)92 b(High-lev)m(el)30 b(in)m(terface)16 b +-Fa(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h +-(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)45 b Fl(19)748 3838 y(3.4.1)93 b Fj(BZ2_bzReadOpen)9 +-b Fa(.)h(.)15 b(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)38 +-b Fl(19)748 3948 y(3.4.2)93 b Fj(BZ2_bzRead)18 b Fa(.)12 +-b(.)j(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)47 +-b Fl(20)748 4057 y(3.4.3)93 b Fj(BZ2_bzReadGetUnused)18 +-b Fa(.)10 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)48 b Fl(22)748 +-4167 y(3.4.4)93 b Fj(BZ2_bzReadClose)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)57 b Fl(22)748 4276 y(3.4.5)93 b +-Fj(BZ2_bzWriteOpen)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)57 b Fl(22)748 4386 y(3.4.6)93 b Fj(BZ2_bzWrite)16 +-b Fa(.)11 b(.)k(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
+-45 b Fl(23)748 4496 y(3.4.7)93 b Fj(BZ2_bzWriteClose)21 +-b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)55 b Fl(23)748 +-4605 y(3.4.8)93 b(Handling)28 b(em)m(b)s(edded)h(compressed)h(data)h +-(streams)17 b Fa(.)f(.)f(.)g(.)46 b Fl(24)748 4715 y(3.4.9)93 +-b(Standard)29 b(\014le-reading/writing)e(co)s(de)22 b +-Fa(.)16 b(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)52 +-b Fl(25)449 4824 y(3.5)92 b(Utilit)m(y)29 b(functions)f +-Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)59 b Fl(26)748 4934 y(3.5.1)93 b +-Fj(BZ2_bzBuffToBuffCompres)o(s)22 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)57 b Fl(26)748 +-5044 y(3.5.2)93 b Fj(BZ2_bzBuffToBuffDecompr)o(ess)17 +-b Fa(.)e(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-53 b Fl(27)449 5153 y(3.6)92 b Fj(zlib)29 b Fl(compatibilit)m(y)g +-(functions)23 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h +-(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)52 +-b Fl(28)449 5263 y(3.7)92 b(Using)30 b(the)g(library)e(in)h(a)i +-Fj(stdio)p Fl(-free)e(en)m(vironmen)m(t)23 b Fa(.)15 +-b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)52 b Fl(29)p +-eop +-%%Page: -2 39 +--2 38 bop 3699 -116 a Fl(ii)748 83 y(3.7.1)93 b(Getting)31 +-b(rid)d(of)j Fj(stdio)20 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)49 +-b Fl(29)748 193 y(3.7.2)93 b(Critical)28 b(error)i(handling)22 +-b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)54 b Fl(29)449 302 +-y(3.8)92 b(Making)30 b(a)h(Windo)m(ws)e(DLL)15 b Fa(.)h(.)f(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
+-g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)44 b Fl(30)150 545 +-y Fk(4)135 b(Miscellanea)11 b Fb(.)21 b(.)f(.)f(.)h(.)f(.)g(.)h(.)f(.)h +-(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.) +-h(.)f(.)g(.)h(.)56 b Fk(31)449 682 y Fl(4.1)92 b(Limitations)29 +-b(of)h(the)h(compressed)f(\014le)f(format)9 b Fa(.)15 +-b(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)38 +-b Fl(31)449 791 y(4.2)92 b(P)m(ortabilit)m(y)30 b(issues)14 +-b Fa(.)f(.)j(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)h(.)f(.)g(.)43 b Fl(32)449 901 y(4.3)92 b(Rep)s(orting)29 +-b(bugs)f Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)58 b Fl(32)449 1010 y(4.4)92 +-b(Did)29 b(y)m(ou)i(get)h(the)e(righ)m(t)g(pac)m(k)-5 +-b(age?)22 b Fa(.)17 b(.)e(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h +-(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)51 +-b Fl(34)449 1120 y(4.5)92 b(T)-8 b(esting)16 b Fa(.)f(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) +-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)45 b Fl(34)449 1230 y(4.6)92 +-b(F)-8 b(urther)30 b(reading)22 b Fa(.)14 b(.)h(.)g(.)h(.)f(.)g(.)g(.)g +-(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
+-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)51 +-b Fl(35)p eop +-%%Trailer +-end +-userdict /end-hook known{end-hook}if +-%%EOF +diff -Nru bzip2-1.0.1/manual.texi bzip2-1.0.1.new/manual.texi +--- bzip2-1.0.1/manual.texi Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/manual.texi Thu Jan 1 01:00:00 1970 +@@ -1,2215 +0,0 @@ +-\input texinfo @c -*- Texinfo -*- +-@setfilename bzip2.info +- +-@ignore +-This file documents bzip2 version 1.0, and associated library +-libbzip2, written by Julian Seward (jseward@acm.org). +- +-Copyright (C) 1996-2000 Julian R Seward +- +-Permission is granted to make and distribute verbatim copies of +-this manual provided the copyright notice and this permission notice +-are preserved on all copies. +- +-Permission is granted to copy and distribute translations of this manual +-into another language, under the above conditions for verbatim copies. +-@end ignore +- +-@ifinfo +-@format +-START-INFO-DIR-ENTRY +-* Bzip2: (bzip2). A program and library for data compression. +-END-INFO-DIR-ENTRY +-@end format +- +-@end ifinfo +- +-@iftex +-@c @finalout +-@settitle bzip2 and libbzip2 +-@titlepage +-@title bzip2 and libbzip2 +-@subtitle a program and library for data compression +-@subtitle copyright (C) 1996-2000 Julian Seward +-@subtitle version 1.0 of 21 March 2000 +-@author Julian Seward +- +-@end titlepage +- +-@parindent 0mm +-@parskip 2mm +- +-@end iftex +-@node Top, Overview, (dir), (dir) +- +-This program, @code{bzip2}, +-and associated library @code{libbzip2}, are +-Copyright (C) 1996-2000 Julian R Seward. All rights reserved. +- +-Redistribution and use in source and binary forms, with or without +-modification, are permitted provided that the following conditions +-are met: +-@itemize @bullet +-@item +- Redistributions of source code must retain the above copyright +- notice, this list of conditions and the following disclaimer. 
+-@item +- The origin of this software must not be misrepresented; you must +- not claim that you wrote the original software. If you use this +- software in a product, an acknowledgment in the product +- documentation would be appreciated but is not required. +-@item +- Altered source versions must be plainly marked as such, and must +- not be misrepresented as being the original software. +-@item +- The name of the author may not be used to endorse or promote +- products derived from this software without specific prior written +- permission. +-@end itemize +-THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS +-OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +-WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +-ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +-DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +-DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +-GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +-WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +-NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +-SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +- +-Julian Seward, Cambridge, UK. +- +-@code{jseward@@acm.org} +- +-@code{http://sourceware.cygnus.com/bzip2} +- +-@code{http://www.cacheprof.org} +- +-@code{http://www.muraroa.demon.co.uk} +- +-@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000. +- +-PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented +-algorithms. However, I do not have the resources available to carry out +-a full patent search. Therefore I cannot give any guarantee of the +-above statement. 
+- +- +- +- +- +- +- +-@node Overview, Implementation, Top, Top +-@chapter Introduction +- +-@code{bzip2} compresses files using the Burrows-Wheeler +-block-sorting text compression algorithm, and Huffman coding. +-Compression is generally considerably better than that +-achieved by more conventional LZ77/LZ78-based compressors, +-and approaches the performance of the PPM family of statistical compressors. +- +-@code{bzip2} is built on top of @code{libbzip2}, a flexible library +-for handling compressed data in the @code{bzip2} format. This manual +-describes both how to use the program and +-how to work with the library interface. Most of the +-manual is devoted to this library, not the program, +-which is good news if your interest is only in the program. +- +-Chapter 2 describes how to use @code{bzip2}; this is the only part +-you need to read if you just want to know how to operate the program. +-Chapter 3 describes the programming interfaces in detail, and +-Chapter 4 records some miscellaneous notes which I thought +-ought to be recorded somewhere. +- +- +-@chapter How to use @code{bzip2} +- +-This chapter contains a copy of the @code{bzip2} man page, +-and nothing else. +- +-@quotation +- +-@unnumberedsubsubsec NAME +-@itemize +-@item @code{bzip2}, @code{bunzip2} +-- a block-sorting file compressor, v1.0 +-@item @code{bzcat} +-- decompresses files to stdout +-@item @code{bzip2recover} +-- recovers data from damaged bzip2 files +-@end itemize +- +-@unnumberedsubsubsec SYNOPSIS +-@itemize +-@item @code{bzip2} [ -cdfkqstvzVL123456789 ] [ filenames ... ] +-@item @code{bunzip2} [ -fkvsVL ] [ filenames ... ] +-@item @code{bzcat} [ -s ] [ filenames ... ] +-@item @code{bzip2recover} filename +-@end itemize +- +-@unnumberedsubsubsec DESCRIPTION +- +-@code{bzip2} compresses files using the Burrows-Wheeler block sorting +-text compression algorithm, and Huffman coding. 
Compression is +-generally considerably better than that achieved by more conventional +-LZ77/LZ78-based compressors, and approaches the performance of the PPM +-family of statistical compressors. +- +-The command-line options are deliberately very similar to those of GNU +-@code{gzip}, but they are not identical. +- +-@code{bzip2} expects a list of file names to accompany the command-line +-flags. Each file is replaced by a compressed version of itself, with +-the name @code{original_name.bz2}. Each compressed file has the same +-modification date, permissions, and, when possible, ownership as the +-corresponding original, so that these properties can be correctly +-restored at decompression time. File name handling is naive in the +-sense that there is no mechanism for preserving original file names, +-permissions, ownerships or dates in filesystems which lack these +-concepts, or have serious file name length restrictions, such as MS-DOS. +- +-@code{bzip2} and @code{bunzip2} will by default not overwrite existing +-files. If you want this to happen, specify the @code{-f} flag. +- +-If no file names are specified, @code{bzip2} compresses from standard +-input to standard output. In this case, @code{bzip2} will decline to +-write compressed output to a terminal, as this would be entirely +-incomprehensible and therefore pointless. +- +-@code{bunzip2} (or @code{bzip2 -d}) decompresses all +-specified files. Files which were not created by @code{bzip2} +-will be detected and ignored, and a warning issued. 
+-@code{bzip2} attempts to guess the filename for the decompressed file +-from that of the compressed file as follows: +-@itemize +-@item @code{filename.bz2 } becomes @code{filename} +-@item @code{filename.bz } becomes @code{filename} +-@item @code{filename.tbz2} becomes @code{filename.tar} +-@item @code{filename.tbz } becomes @code{filename.tar} +-@item @code{anyothername } becomes @code{anyothername.out} +-@end itemize +-If the file does not end in one of the recognised endings, +-@code{.bz2}, @code{.bz}, +-@code{.tbz2} or @code{.tbz}, @code{bzip2} complains that it cannot +-guess the name of the original file, and uses the original name +-with @code{.out} appended. +- +-As with compression, supplying no +-filenames causes decompression from standard input to standard output. +- +-@code{bunzip2} will correctly decompress a file which is the +-concatenation of two or more compressed files. The result is the +-concatenation of the corresponding uncompressed files. Integrity +-testing (@code{-t}) of concatenated compressed files is also supported. +- +-You can also compress or decompress files to the standard output by +-giving the @code{-c} flag. Multiple files may be compressed and +-decompressed like this. The resulting outputs are fed sequentially to +-stdout. Compression of multiple files in this manner generates a stream +-containing multiple compressed file representations. Such a stream +-can be decompressed correctly only by @code{bzip2} version 0.9.0 or +-later. Earlier versions of @code{bzip2} will stop after decompressing +-the first file in the stream. +- +-@code{bzcat} (or @code{bzip2 -dc}) decompresses all specified files to +-the standard output. +- +-@code{bzip2} will read arguments from the environment variables +-@code{BZIP2} and @code{BZIP}, in that order, and will process them +-before any arguments read from the command line. This gives a +-convenient way to supply default arguments. 
+- +-Compression is always performed, even if the compressed file is slightly +-larger than the original. Files of less than about one hundred bytes +-tend to get larger, since the compression mechanism has a constant +-overhead in the region of 50 bytes. Random data (including the output +-of most file compressors) is coded at about 8.05 bits per byte, giving +-an expansion of around 0.5%. +- +-As a self-check for your protection, @code{bzip2} uses 32-bit CRCs to +-make sure that the decompressed version of a file is identical to the +-original. This guards against corruption of the compressed data, and +-against undetected bugs in @code{bzip2} (hopefully very unlikely). The +-chances of data corruption going undetected is microscopic, about one +-chance in four billion for each file processed. Be aware, though, that +-the check occurs upon decompression, so it can only tell you that +-something is wrong. It can't help you recover the original uncompressed +-data. You can use @code{bzip2recover} to try to recover data from +-damaged files. +- +-Return values: 0 for a normal exit, 1 for environmental problems (file +-not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt +-compressed file, 3 for an internal consistency error (eg, bug) which +-caused @code{bzip2} to panic. +- +- +-@unnumberedsubsubsec OPTIONS +-@table @code +-@item -c --stdout +-Compress or decompress to standard output. +-@item -d --decompress +-Force decompression. @code{bzip2}, @code{bunzip2} and @code{bzcat} are +-really the same program, and the decision about what actions to take is +-done on the basis of which name is used. This flag overrides that +-mechanism, and forces bzip2 to decompress. +-@item -z --compress +-The complement to @code{-d}: forces compression, regardless of the +-invokation name. +-@item -t --test +-Check integrity of the specified file(s), but don't decompress them. +-This really performs a trial decompression and throws away the result. 
+-@item -f --force +-Force overwrite of output files. Normally, @code{bzip2} will not overwrite +-existing output files. Also forces @code{bzip2} to break hard links +-to files, which it otherwise wouldn't do. +-@item -k --keep +-Keep (don't delete) input files during compression +-or decompression. +-@item -s --small +-Reduce memory usage, for compression, decompression and testing. Files +-are decompressed and tested using a modified algorithm which only +-requires 2.5 bytes per block byte. This means any file can be +-decompressed in 2300k of memory, albeit at about half the normal speed. +- +-During compression, @code{-s} selects a block size of 200k, which limits +-memory use to around the same figure, at the expense of your compression +-ratio. In short, if your machine is low on memory (8 megabytes or +-less), use -s for everything. See MEMORY MANAGEMENT below. +-@item -q --quiet +-Suppress non-essential warning messages. Messages pertaining to +-I/O errors and other critical events will not be suppressed. +-@item -v --verbose +-Verbose mode -- show the compression ratio for each file processed. +-Further @code{-v}'s increase the verbosity level, spewing out lots of +-information which is primarily of interest for diagnostic purposes. +-@item -L --license -V --version +-Display the software version, license terms and conditions. +-@item -1 to -9 +-Set the block size to 100 k, 200 k .. 900 k when compressing. Has no +-effect when decompressing. See MEMORY MANAGEMENT below. +-@item -- +-Treats all subsequent arguments as file names, even if they start +-with a dash. This is so you can handle files with names beginning +-with a dash, for example: @code{bzip2 -- -myfilename}. +-@item --repetitive-fast +-@item --repetitive-best +-These flags are redundant in versions 0.9.5 and above. They provided +-some coarse control over the behaviour of the sorting algorithm in +-earlier versions, which was sometimes useful. 
0.9.5 and above have an +-improved algorithm which renders these flags irrelevant. +-@end table +- +- +-@unnumberedsubsubsec MEMORY MANAGEMENT +- +-@code{bzip2} compresses large files in blocks. The block size affects +-both the compression ratio achieved, and the amount of memory needed for +-compression and decompression. The flags @code{-1} through @code{-9} +-specify the block size to be 100,000 bytes through 900,000 bytes (the +-default) respectively. At decompression time, the block size used for +-compression is read from the header of the compressed file, and +-@code{bunzip2} then allocates itself just enough memory to decompress +-the file. Since block sizes are stored in compressed files, it follows +-that the flags @code{-1} to @code{-9} are irrelevant to and so ignored +-during decompression. +- +-Compression and decompression requirements, in bytes, can be estimated +-as: +-@example +- Compression: 400k + ( 8 x block size ) +- +- Decompression: 100k + ( 4 x block size ), or +- 100k + ( 2.5 x block size ) +-@end example +-Larger block sizes give rapidly diminishing marginal returns. Most of +-the compression comes from the first two or three hundred k of block +-size, a fact worth bearing in mind when using @code{bzip2} on small machines. +-It is also important to appreciate that the decompression memory +-requirement is set at compression time by the choice of block size. +- +-For files compressed with the default 900k block size, @code{bunzip2} +-will require about 3700 kbytes to decompress. To support decompression +-of any file on a 4 megabyte machine, @code{bunzip2} has an option to +-decompress using approximately half this amount of memory, about 2300 +-kbytes. Decompression speed is also halved, so you should use this +-option only where necessary. The relevant flag is @code{-s}. +- +-In general, try and use the largest block size memory constraints allow, +-since that maximises the compression achieved. 
Compression and +-decompression speed are virtually unaffected by block size. +- +-Another significant point applies to files which fit in a single block +--- that means most files you'd encounter using a large block size. The +-amount of real memory touched is proportional to the size of the file, +-since the file is smaller than a block. For example, compressing a file +-20,000 bytes long with the flag @code{-9} will cause the compressor to +-allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 +-kbytes of it. Similarly, the decompressor will allocate 3700k but only +-touch 100k + 20000 * 4 = 180 kbytes. +- +-Here is a table which summarises the maximum memory usage for different +-block sizes. Also recorded is the total compressed size for 14 files of +-the Calgary Text Compression Corpus totalling 3,141,622 bytes. This +-column gives some feel for how compression varies with block size. +-These figures tend to understate the advantage of larger block sizes for +-larger files, since the Corpus is dominated by smaller files. +-@example +- Compress Decompress Decompress Corpus +- Flag usage usage -s usage Size +- +- -1 1200k 500k 350k 914704 +- -2 2000k 900k 600k 877703 +- -3 2800k 1300k 850k 860338 +- -4 3600k 1700k 1100k 846899 +- -5 4400k 2100k 1350k 845160 +- -6 5200k 2500k 1600k 838626 +- -7 6100k 2900k 1850k 834096 +- -8 6800k 3300k 2100k 828642 +- -9 7600k 3700k 2350k 828642 +-@end example +- +-@unnumberedsubsubsec RECOVERING DATA FROM DAMAGED FILES +- +-@code{bzip2} compresses files in blocks, usually 900kbytes long. Each +-block is handled independently. If a media or transmission error causes +-a multi-block @code{.bz2} file to become damaged, it may be possible to +-recover data from the undamaged blocks in the file. +- +-The compressed representation of each block is delimited by a 48-bit +-pattern, which makes it possible to find the block boundaries with +-reasonable certainty. 
Each block also carries its own 32-bit CRC, so +-damaged blocks can be distinguished from undamaged ones. +- +-@code{bzip2recover} is a simple program whose purpose is to search for +-blocks in @code{.bz2} files, and write each block out into its own +-@code{.bz2} file. You can then use @code{bzip2 -t} to test the +-integrity of the resulting files, and decompress those which are +-undamaged. +- +-@code{bzip2recover} +-takes a single argument, the name of the damaged file, +-and writes a number of files @code{rec0001file.bz2}, +- @code{rec0002file.bz2}, etc, containing the extracted blocks. +- The output filenames are designed so that the use of +- wildcards in subsequent processing -- for example, +-@code{bzip2 -dc rec*file.bz2 > recovered_data} -- lists the files in +- the correct order. +- +-@code{bzip2recover} should be of most use dealing with large @code{.bz2} +- files, as these will contain many blocks. It is clearly +- futile to use it on damaged single-block files, since a +- damaged block cannot be recovered. If you wish to minimise +-any potential data loss through media or transmission errors, +-you might consider compressing with a smaller +- block size. +- +- +-@unnumberedsubsubsec PERFORMANCE NOTES +- +-The sorting phase of compression gathers together similar strings in the +-file. Because of this, files containing very long runs of repeated +-symbols, like "aabaabaabaab ..." (repeated several hundred times) may +-compress more slowly than normal. Versions 0.9.5 and above fare much +-better than previous versions in this respect. The ratio between +-worst-case and average-case compression time is in the region of 10:1. +-For previous versions, this figure was more like 100:1. You can use the +-@code{-vvvv} option to monitor progress in great detail, if you want. +- +-Decompression speed is unaffected by these phenomena. 
+- +-@code{bzip2} usually allocates several megabytes of memory to operate +-in, and then charges all over it in a fairly random fashion. This means +-that performance, both for compressing and decompressing, is largely +-determined by the speed at which your machine can service cache misses. +-Because of this, small changes to the code to reduce the miss rate have +-been observed to give disproportionately large performance improvements. +-I imagine @code{bzip2} will perform best on machines with very large +-caches. +- +- +-@unnumberedsubsubsec CAVEATS +- +-I/O error messages are not as helpful as they could be. @code{bzip2} +-tries hard to detect I/O errors and exit cleanly, but the details of +-what the problem is sometimes seem rather misleading. +- +-This manual page pertains to version 1.0 of @code{bzip2}. Compressed +-data created by this version is entirely forwards and backwards +-compatible with the previous public releases, versions 0.1pl2, 0.9.0 and +-0.9.5, but with the following exception: 0.9.0 and above can correctly +-decompress multiple concatenated compressed files. 0.1pl2 cannot do +-this; it will stop after decompressing just the first file in the +-stream. +- +-@code{bzip2recover} uses 32-bit integers to represent bit positions in +-compressed files, so it cannot handle compressed files more than 512 +-megabytes long. This could easily be fixed. +- +- +-@unnumberedsubsubsec AUTHOR +-Julian Seward, @code{jseward@@acm.org}. +- +-The ideas embodied in @code{bzip2} are due to (at least) the following +-people: Michael Burrows and David Wheeler (for the block sorting +-transformation), David Wheeler (again, for the Huffman coder), Peter +-Fenwick (for the structured coding model in the original @code{bzip}, +-and many refinements), and Alistair Moffat, Radford Neal and Ian Witten +-(for the arithmetic coder in the original @code{bzip}). I am much +-indebted for their help, support and advice. 
See the manual in the +-source distribution for pointers to sources of documentation. Christian +-von Roques encouraged me to look for faster sorting algorithms, so as to +-speed up compression. Bela Lubkin encouraged me to improve the +-worst-case compression performance. Many people sent patches, helped +-with portability problems, lent machines, gave advice and were generally +-helpful. +- +-@end quotation +- +- +- +- +-@chapter Programming with @code{libbzip2} +- +-This chapter describes the programming interface to @code{libbzip2}. +- +-For general background information, particularly about memory +-use and performance aspects, you'd be well advised to read Chapter 2 +-as well. +- +-@section Top-level structure +- +-@code{libbzip2} is a flexible library for compressing and decompressing +-data in the @code{bzip2} data format. Although packaged as a single +-entity, it helps to regard the library as three separate parts: the low +-level interface, and the high level interface, and some utility +-functions. +- +-The structure of @code{libbzip2}'s interfaces is similar to +-that of Jean-loup Gailly's and Mark Adler's excellent @code{zlib} +-library. +- +-All externally visible symbols have names beginning @code{BZ2_}. +-This is new in version 1.0. The intention is to minimise pollution +-of the namespaces of library clients. +- +-@subsection Low-level summary +- +-This interface provides services for compressing and decompressing +-data in memory. There's no provision for dealing with files, streams +-or any other I/O mechanisms, just straight memory-to-memory work. +-In fact, this part of the library can be compiled without inclusion +-of @code{stdio.h}, which may be helpful for embedded applications. +- +-The low-level part of the library has no global variables and +-is therefore thread-safe. 
+- +-Six routines make up the low level interface: +-@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, and @* @code{BZ2_bzCompressEnd} +-for compression, +-and a corresponding trio @code{BZ2_bzDecompressInit}, @* @code{BZ2_bzDecompress} +-and @code{BZ2_bzDecompressEnd} for decompression. +-The @code{*Init} functions allocate +-memory for compression/decompression and do other +-initialisations, whilst the @code{*End} functions close down operations +-and release memory. +- +-The real work is done by @code{BZ2_bzCompress} and @code{BZ2_bzDecompress}. +-These compress and decompress data from a user-supplied input buffer +-to a user-supplied output buffer. These buffers can be any size; +-arbitrary quantities of data are handled by making repeated calls +-to these functions. This is a flexible mechanism allowing a +-consumer-pull style of activity, or producer-push, or a mixture of +-both. +- +- +- +-@subsection High-level summary +- +-This interface provides some handy wrappers around the low-level +-interface to facilitate reading and writing @code{bzip2} format +-files (@code{.bz2} files). The routines provide hooks to facilitate +-reading files in which the @code{bzip2} data stream is embedded +-within some larger-scale file structure, or where there are +-multiple @code{bzip2} data streams concatenated end-to-end. +- +-For reading files, @code{BZ2_bzReadOpen}, @code{BZ2_bzRead}, +-@code{BZ2_bzReadClose} and @* @code{BZ2_bzReadGetUnused} are supplied. For +-writing files, @code{BZ2_bzWriteOpen}, @code{BZ2_bzWrite} and +-@code{BZ2_bzWriteFinish} are available. +- +-As with the low-level library, no global variables are used +-so the library is per se thread-safe. However, if I/O errors +-occur whilst reading or writing the underlying compressed files, +-you may have to consult @code{errno} to determine the cause of +-the error. In that case, you'd need a C library which correctly +-supports @code{errno} in a multithreaded environment. 
+- +-To make the library a little simpler and more portable, +-@code{BZ2_bzReadOpen} and @code{BZ2_bzWriteOpen} require you to pass them file +-handles (@code{FILE*}s) which have previously been opened for reading or +-writing respectively. That avoids portability problems associated with +-file operations and file attributes, whilst not being much of an +-imposition on the programmer. +- +- +- +-@subsection Utility functions summary +-For very simple needs, @code{BZ2_bzBuffToBuffCompress} and +-@code{BZ2_bzBuffToBuffDecompress} are provided. These compress +-data in memory from one buffer to another buffer in a single +-function call. You should assess whether these functions +-fulfill your memory-to-memory compression/decompression +-requirements before investing effort in understanding the more +-general but more complex low-level interface. +- +-Yoshioka Tsuneo (@code{QWF00133@@niftyserve.or.jp} / +-@code{tsuneo-y@@is.aist-nara.ac.jp}) has contributed some functions to +-give better @code{zlib} compatibility. These functions are +-@code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, +-@code{BZ2_bzclose}, +-@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. You may find these functions +-more convenient for simple file reading and writing, than those in the +-high-level interface. These functions are not (yet) officially part of +-the library, and are minimally documented here. If they break, you +-get to keep all the pieces. I hope to document them properly when time +-permits. +- +-Yoshioka also contributed modifications to allow the library to be +-built as a Windows DLL. +- +- +-@section Error handling +- +-The library is designed to recover cleanly in all situations, including +-the worst-case situation of decompressing random data. I'm not +-100% sure that it can always do this, so you might want to add +-a signal handler to catch segmentation violations during decompression +-if you are feeling especially paranoid. 
I would be interested in +-hearing more about the robustness of the library to corrupted +-compressed data. +- +-Version 1.0 is much more robust in this respect than +-0.9.0 or 0.9.5. Investigations with Checker (a tool for +-detecting problems with memory management, similar to Purify) +-indicate that, at least for the few files I tested, all single-bit +-errors in the decompressed data are caught properly, with no +-segmentation faults, no reads of uninitialised data and no +-out of range reads or writes. So it's certainly much improved, +-although I wouldn't claim it to be totally bombproof. +- +-The file @code{bzlib.h} contains all definitions needed to use +-the library. In particular, you should definitely not include +-@code{bzlib_private.h}. +- +-In @code{bzlib.h}, the various return values are defined. The following +-list is not intended as an exhaustive description of the circumstances +-in which a given value may be returned -- those descriptions are given +-later. Rather, it is intended to convey the rough meaning of each +-return value. The first five actions are normal and not intended to +-denote an error situation. +-@table @code +-@item BZ_OK +-The requested action was completed successfully. +-@item BZ_RUN_OK +-@itemx BZ_FLUSH_OK +-@itemx BZ_FINISH_OK +-In @code{BZ2_bzCompress}, the requested flush/finish/nothing-special action +-was completed successfully. +-@item BZ_STREAM_END +-Compression of data was completed, or the logical stream end was +-detected during decompression. +-@end table +- +-The following return values indicate an error of some kind. +-@table @code +-@item BZ_CONFIG_ERROR +-Indicates that the library has been improperly compiled on your +-platform -- a major configuration error. Specifically, it means +-that @code{sizeof(char)}, @code{sizeof(short)} and @code{sizeof(int)} +-are not 1, 2 and 4 respectively, as they should be. 
Note that the +-library should still work properly on 64-bit platforms which follow +-the LP64 programming model -- that is, where @code{sizeof(long)} +-and @code{sizeof(void*)} are 8. Under LP64, @code{sizeof(int)} is +-still 4, so @code{libbzip2}, which doesn't use the @code{long} type, +-is OK. +-@item BZ_SEQUENCE_ERROR +-When using the library, it is important to call the functions in the +-correct sequence and with data structures (buffers etc) in the correct +-states. @code{libbzip2} checks as much as it can to ensure this is +-happening, and returns @code{BZ_SEQUENCE_ERROR} if not. Code which +-complies precisely with the function semantics, as detailed below, +-should never receive this value; such an event denotes buggy code +-which you should investigate. +-@item BZ_PARAM_ERROR +-Returned when a parameter to a function call is out of range +-or otherwise manifestly incorrect. As with @code{BZ_SEQUENCE_ERROR}, +-this denotes a bug in the client code. The distinction between +-@code{BZ_PARAM_ERROR} and @code{BZ_SEQUENCE_ERROR} is a bit hazy, but still worth +-making. +-@item BZ_MEM_ERROR +-Returned when a request to allocate memory failed. Note that the +-quantity of memory needed to decompress a stream cannot be determined +-until the stream's header has been read. So @code{BZ2_bzDecompress} and +-@code{BZ2_bzRead} may return @code{BZ_MEM_ERROR} even though some of +-the compressed data has been read. The same is not true for +-compression; once @code{BZ2_bzCompressInit} or @code{BZ2_bzWriteOpen} have +-successfully completed, @code{BZ_MEM_ERROR} cannot occur. +-@item BZ_DATA_ERROR +-Returned when a data integrity error is detected during decompression. +-Most importantly, this means when stored and computed CRCs for the +-data do not match. This value is also returned upon detection of any +-other anomaly in the compressed data. 
+-@item BZ_DATA_ERROR_MAGIC +-As a special case of @code{BZ_DATA_ERROR}, it is sometimes useful to +-know when the compressed stream does not start with the correct +-magic bytes (@code{'B' 'Z' 'h'}). +-@item BZ_IO_ERROR +-Returned by @code{BZ2_bzRead} and @code{BZ2_bzWrite} when there is an error +-reading or writing in the compressed file, and by @code{BZ2_bzReadOpen} +-and @code{BZ2_bzWriteOpen} for attempts to use a file for which the +-error indicator (viz, @code{ferror(f)}) is set. +-On receipt of @code{BZ_IO_ERROR}, the caller should consult +-@code{errno} and/or @code{perror} to acquire operating-system +-specific information about the problem. +-@item BZ_UNEXPECTED_EOF +-Returned by @code{BZ2_bzRead} when the compressed file finishes +-before the logical end of stream is detected. +-@item BZ_OUTBUFF_FULL +-Returned by @code{BZ2_bzBuffToBuffCompress} and +-@code{BZ2_bzBuffToBuffDecompress} to indicate that the output data +-will not fit into the output buffer provided. +-@end table +- +- +- +-@section Low-level interface +- +-@subsection @code{BZ2_bzCompressInit} +-@example +-typedef +- struct @{ +- char *next_in; +- unsigned int avail_in; +- unsigned int total_in_lo32; +- unsigned int total_in_hi32; +- +- char *next_out; +- unsigned int avail_out; +- unsigned int total_out_lo32; +- unsigned int total_out_hi32; +- +- void *state; +- +- void *(*bzalloc)(void *,int,int); +- void (*bzfree)(void *,void *); +- void *opaque; +- @} +- bz_stream; +- +-int BZ2_bzCompressInit ( bz_stream *strm, +- int blockSize100k, +- int verbosity, +- int workFactor ); +- +-@end example +- +-Prepares for compression. The @code{bz_stream} structure +-holds all data pertaining to the compression activity. +-A @code{bz_stream} structure should be allocated and initialised +-prior to the call. +-The fields of @code{bz_stream} +-comprise the entirety of the user-visible data. @code{state} +-is a pointer to the private data structures required for compression. 
+- +-Custom memory allocators are supported, via fields @code{bzalloc}, +-@code{bzfree}, +-and @code{opaque}. The value +-@code{opaque} is passed as the first argument to +-all calls to @code{bzalloc} and @code{bzfree}, but is +-otherwise ignored by the library. +-The call @code{bzalloc ( opaque, n, m )} is expected to return a +-pointer @code{p} to +-@code{n * m} bytes of memory, and @code{bzfree ( opaque, p )} +-should free +-that memory. +- +-If you don't want to use a custom memory allocator, set @code{bzalloc}, +-@code{bzfree} and +-@code{opaque} to @code{NULL}, +-and the library will then use the standard @code{malloc}/@code{free} +-routines. +- +-Before calling @code{BZ2_bzCompressInit}, fields @code{bzalloc}, +-@code{bzfree} and @code{opaque} should +-be filled appropriately, as just described. Upon return, the internal +-state will have been allocated and initialised, and @code{total_in_lo32}, +-@code{total_in_hi32}, @code{total_out_lo32} and +-@code{total_out_hi32} will have been set to zero. +-These four fields are used by the library +-to inform the caller of the total amount of data passed into and out of +-the library, respectively. You should not try to change them. +-As of version 1.0, 64-bit counts are maintained, even on 32-bit +-platforms, using the @code{_hi32} fields to store the upper 32 bits +-of the count. So, for example, the total amount of data in +-is @code{(total_in_hi32 << 32) + total_in_lo32}. +- +-Parameter @code{blockSize100k} specifies the block size to be used for +-compression. It should be a value between 1 and 9 inclusive, and the +-actual block size used is 100000 x this figure. 9 gives the best +-compression but takes most memory. +- +-Parameter @code{verbosity} should be set to a number between 0 and 4 +-inclusive. 0 is silent, and greater numbers give increasingly verbose +-monitoring/debugging output. If the library has been compiled with +-@code{-DBZ_NO_STDIO}, no such output will appear for any verbosity +-setting.
+- +-Parameter @code{workFactor} controls how the compression phase behaves +-when presented with worst case, highly repetitive, input data. If +-compression runs into difficulties caused by repetitive data, the +-library switches from the standard sorting algorithm to a fallback +-algorithm. The fallback is slower than the standard algorithm by +-perhaps a factor of three, but always behaves reasonably, no matter how +-bad the input. +- +-Lower values of @code{workFactor} reduce the amount of effort the +-standard algorithm will expend before resorting to the fallback. You +-should set this parameter carefully; too low, and many inputs will be +-handled by the fallback algorithm and so compress rather slowly, too +-high, and your average-to-worst case compression times can become very +-large. The default value of 30 gives reasonable behaviour over a wide +-range of circumstances. +- +-Allowable values range from 0 to 250 inclusive. 0 is a special case, +-equivalent to using the default value of 30. +- +-Note that the compressed output generated is the same regardless of +-whether or not the fallback algorithm is used. +- +-Be aware also that this parameter may disappear entirely in future +-versions of the library. In principle it should be possible to devise a +-good way to automatically choose which algorithm to use. Such a +-mechanism would render the parameter obsolete. 
+- +-Possible return values: +-@display +- @code{BZ_CONFIG_ERROR} +- if the library has been mis-compiled +- @code{BZ_PARAM_ERROR} +- if @code{strm} is @code{NULL} +- or @code{blockSize} < 1 or @code{blockSize} > 9 +- or @code{verbosity} < 0 or @code{verbosity} > 4 +- or @code{workFactor} < 0 or @code{workFactor} > 250 +- @code{BZ_MEM_ERROR} +- if not enough memory is available +- @code{BZ_OK} +- otherwise +-@end display +-Allowable next actions: +-@display +- @code{BZ2_bzCompress} +- if @code{BZ_OK} is returned +- no specific action needed in case of error +-@end display +- +-@subsection @code{BZ2_bzCompress} +-@example +- int BZ2_bzCompress ( bz_stream *strm, int action ); +-@end example +-Provides more input and/or output buffer space for the library. The +-caller maintains input and output buffers, and calls @code{BZ2_bzCompress} to +-transfer data between them. +- +-Before each call to @code{BZ2_bzCompress}, @code{next_in} should point at +-the data to be compressed, and @code{avail_in} should indicate how many +-bytes the library may read. @code{BZ2_bzCompress} updates @code{next_in}, +-@code{avail_in} and @code{total_in} to reflect the number of bytes it +-has read. +- +-Similarly, @code{next_out} should point to a buffer in which the +-compressed data is to be placed, with @code{avail_out} indicating how +-much output space is available. @code{BZ2_bzCompress} updates +-@code{next_out}, @code{avail_out} and @code{total_out} to reflect the +-number of bytes output. +- +-You may provide and remove as little or as much data as you like on each +-call of @code{BZ2_bzCompress}. In the limit, it is acceptable to supply and +-remove data one byte at a time, although this would be terribly +-inefficient. You should always ensure that at least one byte of output +-space is available at each call. +- +-A second purpose of @code{BZ2_bzCompress} is to request a change of mode of the +-compressed stream. 
+- +-Conceptually, a compressed stream can be in one of four states: IDLE, +-RUNNING, FLUSHING and FINISHING. Before initialisation +-(@code{BZ2_bzCompressInit}) and after termination (@code{BZ2_bzCompressEnd}), a +-stream is regarded as IDLE. +- +-Upon initialisation (@code{BZ2_bzCompressInit}), the stream is placed in the +-RUNNING state. Subsequent calls to @code{BZ2_bzCompress} should pass +-@code{BZ_RUN} as the requested action; other actions are illegal and +-will result in @code{BZ_SEQUENCE_ERROR}. +- +-At some point, the calling program will have provided all the input data +-it wants to. It will then want to finish up -- in effect, asking the +-library to process any data it might have buffered internally. In this +-state, @code{BZ2_bzCompress} will no longer attempt to read data from +-@code{next_in}, but it will want to write data to @code{next_out}. +-Because the output buffer supplied by the user can be arbitrarily small, +-the finishing-up operation cannot necessarily be done with a single call +-of @code{BZ2_bzCompress}. +- +-Instead, the calling program passes @code{BZ_FINISH} as an action to +-@code{BZ2_bzCompress}. This changes the stream's state to FINISHING. Any +-remaining input (ie, @code{next_in[0 .. avail_in-1]}) is compressed and +-transferred to the output buffer. To do this, @code{BZ2_bzCompress} must be +-called repeatedly until all the output has been consumed. At that +-point, @code{BZ2_bzCompress} returns @code{BZ_STREAM_END}, and the stream's +-state is set back to IDLE. @code{BZ2_bzCompressEnd} should then be +-called. +- +-Just to make sure the calling program does not cheat, the library makes +-a note of @code{avail_in} at the time of the first call to +-@code{BZ2_bzCompress} which has @code{BZ_FINISH} as an action (ie, at the +-time the program has announced its intention to not supply any more +-input). 
By comparing this value with that of @code{avail_in} over +-subsequent calls to @code{BZ2_bzCompress}, the library can detect any +-attempts to slip in more data to compress. Any calls for which this is +-detected will return @code{BZ_SEQUENCE_ERROR}. This indicates a +-programming mistake which should be corrected. +- +-Instead of asking to finish, the calling program may ask +-@code{BZ2_bzCompress} to take all the remaining input, compress it and +-terminate the current (Burrows-Wheeler) compression block. This could +-be useful for error control purposes. The mechanism is analogous to +-that for finishing: call @code{BZ2_bzCompress} with an action of +-@code{BZ_FLUSH}, remove output data, and persist with the +-@code{BZ_FLUSH} action until the value @code{BZ_RUN_OK} is returned. As +-with finishing, @code{BZ2_bzCompress} detects any attempt to provide more +-input data once the flush has begun. +- +-Once the flush is complete, the stream returns to the normal RUNNING +-state. +- +-This all sounds pretty complex, but isn't really. Here's a table +-which shows which actions are allowable in each state, what action +-will be taken, what the next state is, and what the non-error return +-values are. Note that you can't explicitly ask what state the +-stream is in, but nor do you need to -- it can be inferred from the +-values returned by @code{BZ2_bzCompress}. +-@display +-IDLE/@code{any} +- Illegal. IDLE state only exists after @code{BZ2_bzCompressEnd} or +- before @code{BZ2_bzCompressInit}. +- Return value = @code{BZ_SEQUENCE_ERROR} +- +-RUNNING/@code{BZ_RUN} +- Compress from @code{next_in} to @code{next_out} as much as possible. +- Next state = RUNNING +- Return value = @code{BZ_RUN_OK} +- +-RUNNING/@code{BZ_FLUSH} +- Remember current value of @code{next_in}. Compress from @code{next_in} +- to @code{next_out} as much as possible, but do not accept any more input.
+- Next state = FLUSHING +- Return value = @code{BZ_FLUSH_OK} +- +-RUNNING/@code{BZ_FINISH} +- Remember current value of @code{next_in}. Compress from @code{next_in} +- to @code{next_out} as much as possible, but do not accept any more input. +- Next state = FINISHING +- Return value = @code{BZ_FINISH_OK} +- +-FLUSHING/@code{BZ_FLUSH} +- Compress from @code{next_in} to @code{next_out} as much as possible, +- but do not accept any more input. +- If all the existing input has been used up and all compressed +- output has been removed +- Next state = RUNNING; Return value = @code{BZ_RUN_OK} +- else +- Next state = FLUSHING; Return value = @code{BZ_FLUSH_OK} +- +-FLUSHING/other +- Illegal. +- Return value = @code{BZ_SEQUENCE_ERROR} +- +-FINISHING/@code{BZ_FINISH} +- Compress from @code{next_in} to @code{next_out} as much as possible, +- but do not accept any more input. +- If all the existing input has been used up and all compressed +- output has been removed +- Next state = IDLE; Return value = @code{BZ_STREAM_END} +- else +- Next state = FINISHING; Return value = @code{BZ_FINISH_OK} +- +-FINISHING/other +- Illegal. +- Return value = @code{BZ_SEQUENCE_ERROR} +-@end display +- +-That still looks complicated? Well, fair enough. The usual sequence +-of calls for compressing a load of data is: +-@itemize @bullet +-@item Get started with @code{BZ2_bzCompressInit}. +-@item Shovel data in and shlurp out its compressed form using zero or more +-calls of @code{BZ2_bzCompress} with action = @code{BZ_RUN}. +-@item Finish up. +-Repeatedly call @code{BZ2_bzCompress} with action = @code{BZ_FINISH}, +-copying out the compressed output, until @code{BZ_STREAM_END} is returned. +-@item Close up and go home. Call @code{BZ2_bzCompressEnd}. +-@end itemize +-If the data you want to compress fits into your input buffer all +-at once, you can skip the calls of @code{BZ2_bzCompress ( ..., BZ_RUN )} and +-just do the @code{BZ2_bzCompress ( ..., BZ_FINISH )} calls.
+- +-All required memory is allocated by @code{BZ2_bzCompressInit}. The +-compression library can accept any data at all (obviously). So you +-shouldn't get any error return values from the @code{BZ2_bzCompress} calls. +-If you do, they will be @code{BZ_SEQUENCE_ERROR}, and indicate a bug in +-your programming. +- +-Trivial other possible return values: +-@display +- @code{BZ_PARAM_ERROR} +- if @code{strm} is @code{NULL}, or @code{strm->s} is @code{NULL} +-@end display +- +-@subsection @code{BZ2_bzCompressEnd} +-@example +-int BZ2_bzCompressEnd ( bz_stream *strm ); +-@end example +-Releases all memory associated with a compression stream. +- +-Possible return values: +-@display +- @code{BZ_PARAM_ERROR} if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} +- @code{BZ_OK} otherwise +-@end display +- +- +-@subsection @code{BZ2_bzDecompressInit} +-@example +-int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small ); +-@end example +-Prepares for decompression. As with @code{BZ2_bzCompressInit}, a +-@code{bz_stream} record should be allocated and initialised before the +-call. Fields @code{bzalloc}, @code{bzfree} and @code{opaque} should be +-set if a custom memory allocator is required, or made @code{NULL} for +-the normal @code{malloc}/@code{free} routines. Upon return, the internal +-state will have been initialised, and @code{total_in} and +-@code{total_out} will be zero. +- +-For the meaning of parameter @code{verbosity}, see @code{BZ2_bzCompressInit}. +- +-If @code{small} is nonzero, the library will use an alternative +-decompression algorithm which uses less memory but at the cost of +-decompressing more slowly (roughly speaking, half the speed, but the +-maximum memory requirement drops to around 2300k). See Chapter 2 for +-more information on memory management. 
+- +-Note that the amount of memory needed to decompress +-a stream cannot be determined until the stream's header has been read, +-so even if @code{BZ2_bzDecompressInit} succeeds, a subsequent +-@code{BZ2_bzDecompress} could fail with @code{BZ_MEM_ERROR}. +- +-Possible return values: +-@display +- @code{BZ_CONFIG_ERROR} +- if the library has been mis-compiled +- @code{BZ_PARAM_ERROR} +- if @code{(small != 0 && small != 1)} +- or @code{(verbosity < 0 || verbosity > 4)} +- @code{BZ_MEM_ERROR} +- if insufficient memory is available +-@end display +- +-Allowable next actions: +-@display +- @code{BZ2_bzDecompress} +- if @code{BZ_OK} was returned +- no specific action required in case of error +-@end display +- +- +- +-@subsection @code{BZ2_bzDecompress} +-@example +-int BZ2_bzDecompress ( bz_stream *strm ); +-@end example +-Provides more input and/or output buffer space for the library. The +-caller maintains input and output buffers, and uses @code{BZ2_bzDecompress} +-to transfer data between them. +- +-Before each call to @code{BZ2_bzDecompress}, @code{next_in} +-should point at the compressed data, +-and @code{avail_in} should indicate how many bytes the library +-may read. @code{BZ2_bzDecompress} updates @code{next_in}, @code{avail_in} +-and @code{total_in} +-to reflect the number of bytes it has read. +- +-Similarly, @code{next_out} should point to a buffer in which the uncompressed +-output is to be placed, with @code{avail_out} indicating how much output space +-is available. @code{BZ2_bzDecompress} updates @code{next_out}, +-@code{avail_out} and @code{total_out} to reflect +-the number of bytes output. +- +-You may provide and remove as little or as much data as you like on +-each call of @code{BZ2_bzDecompress}. +-In the limit, it is acceptable to +-supply and remove data one byte at a time, although this would be +-terribly inefficient. You should always ensure that at least one +-byte of output space is available at each call.
+- +-Use of @code{BZ2_bzDecompress} is simpler than @code{BZ2_bzCompress}. +- +-You should provide input and remove output as described above, and +-repeatedly call @code{BZ2_bzDecompress} until @code{BZ_STREAM_END} is +-returned. Appearance of @code{BZ_STREAM_END} denotes that +-@code{BZ2_bzDecompress} has detected the logical end of the compressed +-stream. @code{BZ2_bzDecompress} will not produce @code{BZ_STREAM_END} until +-all output data has been placed into the output buffer, so once +-@code{BZ_STREAM_END} appears, you are guaranteed to have available all +-the decompressed output, and @code{BZ2_bzDecompressEnd} can safely be +-called. +- +-In case of an error return value, you should call @code{BZ2_bzDecompressEnd} +-to clean up and release memory. +- +-Possible return values: +-@display +- @code{BZ_PARAM_ERROR} +- if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} +- or @code{strm->avail_out < 1} +- @code{BZ_DATA_ERROR} +- if a data integrity error is detected in the compressed stream +- @code{BZ_DATA_ERROR_MAGIC} +- if the compressed stream doesn't begin with the right magic bytes +- @code{BZ_MEM_ERROR} +- if there wasn't enough memory available +- @code{BZ_STREAM_END} +- if the logical end of the data stream was detected and all +- output has been consumed, eg @code{s->avail_out > 0} +- @code{BZ_OK} +- otherwise +-@end display +-Allowable next actions: +-@display +- @code{BZ2_bzDecompress} +- if @code{BZ_OK} was returned +- @code{BZ2_bzDecompressEnd} +- otherwise +-@end display +- +- +-@subsection @code{BZ2_bzDecompressEnd} +-@example +-int BZ2_bzDecompressEnd ( bz_stream *strm ); +-@end example +-Releases all memory associated with a decompression stream. +- +-Possible return values: +-@display +- @code{BZ_PARAM_ERROR} +- if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} +- @code{BZ_OK} +- otherwise +-@end display +- +-Allowable next actions: +-@display +- None.
+-@end display +- +- +-@section High-level interface +- +-This interface provides functions for reading and writing +-@code{bzip2} format files. First, some general points. +- +-@itemize @bullet +-@item All of the functions take an @code{int*} first argument, +- @code{bzerror}. +- After each call, @code{bzerror} should be consulted first to determine +- the outcome of the call. If @code{bzerror} is @code{BZ_OK}, +- the call completed +- successfully, and only then should the return value of the function +- (if any) be consulted. If @code{bzerror} is @code{BZ_IO_ERROR}, +- there was an error +- reading/writing the underlying compressed file, and you should +- then consult @code{errno}/@code{perror} to determine the +- cause of the difficulty. +- @code{bzerror} may also be set to various other values; precise details are +- given on a per-function basis below. +-@item If @code{bzerror} indicates an error +- (ie, anything except @code{BZ_OK} and @code{BZ_STREAM_END}), +- you should immediately call @code{BZ2_bzReadClose} (or @code{BZ2_bzWriteClose}, +- depending on whether you are attempting to read or to write) +- to free up all resources associated +- with the stream. Once an error has been indicated, behaviour of all calls +- except @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) is undefined. +- The implication is that (1) @code{bzerror} should +- be checked after each call, and (2) if @code{bzerror} indicates an error, +- @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) should then be called to clean up. +-@item The @code{FILE*} arguments passed to +- @code{BZ2_bzReadOpen}/@code{BZ2_bzWriteOpen} +- should be set to binary mode. +- Most Unix systems will do this by default, but other platforms, +- including Windows and Mac, will not. If you omit this, you may +- encounter problems when moving code to new platforms. +-@item Memory allocation requests are handled by +- @code{malloc}/@code{free}. 
+- At present +- there is no facility for user-defined memory allocators in the file I/O +- functions (could easily be added, though). +-@end itemize +- +- +- +-@subsection @code{BZ2_bzReadOpen} +-@example +- typedef void BZFILE; +- +- BZFILE *BZ2_bzReadOpen ( int *bzerror, FILE *f, +- int small, int verbosity, +- void *unused, int nUnused ); +-@end example +-Prepare to read compressed data from file handle @code{f}. @code{f} +-should refer to a file which has been opened for reading, and for which +-the error indicator (@code{ferror(f)}) is not set. If @code{small} is 1, +-the library will try to decompress using less memory, at the expense of +-speed. +- +-For reasons explained below, @code{BZ2_bzRead} will decompress the +-@code{nUnused} bytes starting at @code{unused}, before starting to read +-from the file @code{f}. At most @code{BZ_MAX_UNUSED} bytes may be +-supplied like this. If this facility is not required, you should pass +-@code{NULL} and @code{0} for @code{unused} and @code{nUnused} +-respectively. +- +-For the meaning of parameters @code{small} and @code{verbosity}, +-see @code{BZ2_bzDecompressInit}. +- +-The amount of memory needed to decompress a file cannot be determined +-until the file's header has been read. So it is possible that +-@code{BZ2_bzReadOpen} returns @code{BZ_OK} but a subsequent call of +-@code{BZ2_bzRead} will return @code{BZ_MEM_ERROR}. +- +-Possible assignments to @code{bzerror}: +-@display +- @code{BZ_CONFIG_ERROR} +- if the library has been mis-compiled +- @code{BZ_PARAM_ERROR} +- if @code{f} is @code{NULL} +- or @code{small} is neither @code{0} nor @code{1} +- or @code{(unused == NULL && nUnused != 0)} +- or @code{(unused != NULL && !(0 <= nUnused <= BZ_MAX_UNUSED))} +- @code{BZ_IO_ERROR} +- if @code{ferror(f)} is nonzero +- @code{BZ_MEM_ERROR} +- if insufficient memory is available +- @code{BZ_OK} +- otherwise.
+-@end display +- +-Possible return values: +-@display +- Pointer to an abstract @code{BZFILE} +- if @code{bzerror} is @code{BZ_OK} +- @code{NULL} +- otherwise +-@end display +- +-Allowable next actions: +-@display +- @code{BZ2_bzRead} +- if @code{bzerror} is @code{BZ_OK} +- @code{BZ2_bzReadClose} +- otherwise +-@end display +- +- +-@subsection @code{BZ2_bzRead} +-@example +- int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len ); +-@end example +-Reads up to @code{len} (uncompressed) bytes from the compressed file +-@code{b} into +-the buffer @code{buf}. If the read was successful, +-@code{bzerror} is set to @code{BZ_OK} +-and the number of bytes read is returned. If the logical end-of-stream +-was detected, @code{bzerror} will be set to @code{BZ_STREAM_END}, +-and the number +-of bytes read is returned. All other @code{bzerror} values denote an error. +- +-@code{BZ2_bzRead} will supply @code{len} bytes, +-unless the logical stream end is detected +-or an error occurs. Because of this, it is possible to detect the +-stream end by observing when the number of bytes returned is +-less than the number +-requested. Nevertheless, this is regarded as inadvisable; you should +-instead check @code{bzerror} after every call and watch out for +-@code{BZ_STREAM_END}. +- +-Internally, @code{BZ2_bzRead} copies data from the compressed file in chunks +-of size @code{BZ_MAX_UNUSED} bytes +-before decompressing it. If the file contains more bytes than strictly +-needed to reach the logical end-of-stream, @code{BZ2_bzRead} will almost certainly +-read some of the trailing data before signalling @code{BZ_STREAM_END}. +-To collect the read but unused data once @code{BZ_STREAM_END} has +-appeared, call @code{BZ2_bzReadGetUnused} immediately before @code{BZ2_bzReadClose}.
+- +-Possible assignments to @code{bzerror}: +-@display +- @code{BZ_PARAM_ERROR} +- if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} +- @code{BZ_SEQUENCE_ERROR} +- if @code{b} was opened with @code{BZ2_bzWriteOpen} +- @code{BZ_IO_ERROR} +- if there is an error reading from the compressed file +- @code{BZ_UNEXPECTED_EOF} +- if the compressed file ended before the logical end-of-stream was detected +- @code{BZ_DATA_ERROR} +- if a data integrity error was detected in the compressed stream +- @code{BZ_DATA_ERROR_MAGIC} +- if the stream does not begin with the requisite header bytes (ie, is not +- a @code{bzip2} data file). This is really a special case of @code{BZ_DATA_ERROR}. +- @code{BZ_MEM_ERROR} +- if insufficient memory was available +- @code{BZ_STREAM_END} +- if the logical end of stream was detected. +- @code{BZ_OK} +- otherwise. +-@end display +- +-Possible return values: +-@display +- number of bytes read +- if @code{bzerror} is @code{BZ_OK} or @code{BZ_STREAM_END} +- undefined +- otherwise +-@end display +- +-Allowable next actions: +-@display +- collect data from @code{buf}, then @code{BZ2_bzRead} or @code{BZ2_bzReadClose} +- if @code{bzerror} is @code{BZ_OK} +- collect data from @code{buf}, then @code{BZ2_bzReadClose} or @code{BZ2_bzReadGetUnused} +- if @code{bzerror} is @code{BZ_STREAM_END} +- @code{BZ2_bzReadClose} +- otherwise +-@end display +- +- +- +-@subsection @code{BZ2_bzReadGetUnused} +-@example +- void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, +- void** unused, int* nUnused ); +-@end example +-Returns data which was read from the compressed file but was not needed +-to get to the logical end-of-stream. @code{*unused} is set to the address +-of the data, and @code{*nUnused} to the number of bytes. @code{*nUnused} will +-be set to a value between @code{0} and @code{BZ_MAX_UNUSED} inclusive.
+- +-This function may only be called once @code{BZ2_bzRead} has signalled +-@code{BZ_STREAM_END} but before @code{BZ2_bzReadClose}. +- +-Possible assignments to @code{bzerror}: +-@display +- @code{BZ_PARAM_ERROR} +- if @code{b} is @code{NULL} +- or @code{unused} is @code{NULL} or @code{nUnused} is @code{NULL} +- @code{BZ_SEQUENCE_ERROR} +- if @code{BZ_STREAM_END} has not been signalled +- or if @code{b} was opened with @code{BZ2_bzWriteOpen} +- @code{BZ_OK} +- otherwise +-@end display +- +-Allowable next actions: +-@display +- @code{BZ2_bzReadClose} +-@end display +- +- +-@subsection @code{BZ2_bzReadClose} +-@example +- void BZ2_bzReadClose ( int *bzerror, BZFILE *b ); +-@end example +-Releases all memory pertaining to the compressed file @code{b}. +-@code{BZ2_bzReadClose} does not call @code{fclose} on the underlying file +-handle, so you should do that yourself if appropriate. +-@code{BZ2_bzReadClose} should be called to clean up after all error +-situations. +- +-Possible assignments to @code{bzerror}: +-@display +- @code{BZ_SEQUENCE_ERROR} +- if @code{b} was opened with @code{BZ2_bzWriteOpen} +- @code{BZ_OK} +- otherwise +-@end display +- +-Allowable next actions: +-@display +- none +-@end display +- +- +- +-@subsection @code{BZ2_bzWriteOpen} +-@example +- BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, +- int blockSize100k, int verbosity, +- int workFactor ); +-@end example +-Prepare to write compressed data to file handle @code{f}. +-@code{f} should refer to +-a file which has been opened for writing, and for which the error +-indicator (@code{ferror(f)}) is not set. +- +-For the meaning of parameters @code{blockSize100k}, +-@code{verbosity} and @code{workFactor}, see +-@* @code{BZ2_bzCompressInit}. +- +-All required memory is allocated at this stage, so if the call +-completes successfully, @code{BZ_MEM_ERROR} cannot be signalled by a +-subsequent call to @code{BZ2_bzWrite}.
+- +-Possible assignments to @code{bzerror}: +-@display +- @code{BZ_CONFIG_ERROR} +- if the library has been mis-compiled +- @code{BZ_PARAM_ERROR} +- if @code{f} is @code{NULL} +- or @code{blockSize100k < 1} or @code{blockSize100k > 9} +- @code{BZ_IO_ERROR} +- if @code{ferror(f)} is nonzero +- @code{BZ_MEM_ERROR} +- if insufficient memory is available +- @code{BZ_OK} +- otherwise +-@end display +- +-Possible return values: +-@display +- Pointer to an abstract @code{BZFILE} +- if @code{bzerror} is @code{BZ_OK} +- @code{NULL} +- otherwise +-@end display +- +-Allowable next actions: +-@display +- @code{BZ2_bzWrite} +- if @code{bzerror} is @code{BZ_OK} +- (you could go directly to @code{BZ2_bzWriteClose}, but this would be pretty pointless) +- @code{BZ2_bzWriteClose} +- otherwise +-@end display +- +- +- +-@subsection @code{BZ2_bzWrite} +-@example +- void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len ); +-@end example +-Absorbs @code{len} bytes from the buffer @code{buf}, eventually to be +-compressed and written to the file. +- +-Possible assignments to @code{bzerror}: +-@display +- @code{BZ_PARAM_ERROR} +- if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} +- @code{BZ_SEQUENCE_ERROR} +- if b was opened with @code{BZ2_bzReadOpen} +- @code{BZ_IO_ERROR} +- if there is an error writing the compressed file. +- @code{BZ_OK} +- otherwise +-@end display +- +- +- +- +-@subsection @code{BZ2_bzWriteClose} +-@example +- void BZ2_bzWriteClose ( int *bzerror, BZFILE* f, +- int abandon, +- unsigned int* nbytes_in, +- unsigned int* nbytes_out ); +- +- void BZ2_bzWriteClose64 ( int *bzerror, BZFILE* f, +- int abandon, +- unsigned int* nbytes_in_lo32, +- unsigned int* nbytes_in_hi32, +- unsigned int* nbytes_out_lo32, +- unsigned int* nbytes_out_hi32 ); +-@end example +- +-Compresses and flushes to the compressed file all data so far supplied +-by @code{BZ2_bzWrite}. 
The logical end-of-stream markers are also written, so +-subsequent calls to @code{BZ2_bzWrite} are illegal. All memory associated +-with the compressed file @code{b} is released. +-@code{fflush} is called on the +-compressed file, but it is not @code{fclose}'d. +- +-If @code{BZ2_bzWriteClose} is called to clean up after an error, the only +-action is to release the memory. The library records the error codes +-issued by previous calls, so this situation will be detected +-automatically. There is no attempt to complete the compression +-operation, nor to @code{fflush} the compressed file. You can force this +-behaviour to happen even in the case of no error, by passing a nonzero +-value to @code{abandon}. +- +-If @code{nbytes_in} is non-null, @code{*nbytes_in} will be set to be the +-total volume of uncompressed data handled. Similarly, @code{nbytes_out} +-will be set to the total volume of compressed data written. For +-compatibility with older versions of the library, @code{BZ2_bzWriteClose} +-only yields the lower 32 bits of these counts. Use +-@code{BZ2_bzWriteClose64} if you want the full 64 bit counts. These +-two functions are otherwise absolutely identical. +- +- +-Possible assignments to @code{bzerror}: +-@display +- @code{BZ_SEQUENCE_ERROR} +- if @code{b} was opened with @code{BZ2_bzReadOpen} +- @code{BZ_IO_ERROR} +- if there is an error writing the compressed file +- @code{BZ_OK} +- otherwise +-@end display +- +-@subsection Handling embedded compressed data streams +- +-The high-level library facilitates use of +-@code{bzip2} data streams which form some part of a surrounding, larger +-data stream. +-@itemize @bullet +-@item For writing, the library takes an open file handle, writes +-compressed data to it, @code{fflush}es it but does not @code{fclose} it. +-The calling application can write its own data before and after the +-compressed data stream, using that same file handle. 
+-@item Reading is more complex, and the facilities are not as general +-as they could be since generality is hard to reconcile with efficiency. +-@code{BZ2_bzRead} reads from the compressed file in blocks of size +-@code{BZ_MAX_UNUSED} bytes, and in doing so probably will overshoot +-the logical end of the compressed stream. +-To recover this data once decompression has +-ended, call @code{BZ2_bzReadGetUnused} after the last call of @code{BZ2_bzRead} +-(the one returning @code{BZ_STREAM_END}) but before calling +-@code{BZ2_bzReadClose}. +-@end itemize +- +-This mechanism makes it easy to decompress multiple @code{bzip2} +-streams placed end-to-end. At the end of one stream, when @code{BZ2_bzRead} +-returns @code{BZ_STREAM_END}, call @code{BZ2_bzReadGetUnused} to collect the +-unused data (copy it into your own buffer somewhere). +-That data forms the start of the next compressed stream. +-To start uncompressing that next stream, call @code{BZ2_bzReadOpen} again, +-feeding in the unused data via the @code{unused}/@code{nUnused} +-parameters. +-Keep doing this until @code{BZ_STREAM_END} return coincides with the +-physical end of file (@code{feof(f)}). In this situation +-@code{BZ2_bzReadGetUnused} +-will of course return no data. +- +-This should give some feel for how the high-level interface can be used. +-If you require extra flexibility, you'll have to bite the bullet and get +-to grips with the low-level interface.
+- +-@subsection Standard file-reading/writing code +-Here's how you'd write data to a compressed file: +-@example +-FILE* f; +-BZFILE* b; +-int nBuf; +-char buf[ /* whatever size you like */ ]; +-int bzerror; +-/* no byte count is kept: BZ2_bzWrite returns void */ +- +-f = fopen ( "myfile.bz2", "wb" ); /* binary mode */ +-if (!f) @{ +- /* handle error */ +-@} +-b = BZ2_bzWriteOpen ( &bzerror, f, 9, 0, 30 ); +-if (bzerror != BZ_OK) @{ +- BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); +- /* handle error */ +-@} +- +-while ( /* condition */ ) @{ +- /* get data to write into buf, and set nBuf appropriately */ +- BZ2_bzWrite ( &bzerror, b, buf, nBuf ); +- if (bzerror == BZ_IO_ERROR) @{ +- BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); +- /* handle error */ +- @} +-@} +- +-BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); +-if (bzerror == BZ_IO_ERROR) @{ +- /* handle error */ +-@} +-@end example +-And to read from a compressed file: +-@example +-FILE* f; +-BZFILE* b; +-int nBuf; +-char buf[ /* whatever size you like */ ]; +-int bzerror; +-/* nBuf holds the byte count returned by BZ2_bzRead */ +- +-f = fopen ( "myfile.bz2", "rb" ); /* binary mode */ +-if (!f) @{ +- /* handle error */ +-@} +-b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 ); +-if (bzerror != BZ_OK) @{ +- BZ2_bzReadClose ( &bzerror, b ); +- /* handle error */ +-@} +- +-bzerror = BZ_OK; +-while (bzerror == BZ_OK && /* arbitrary other conditions */) @{ +- nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ ); +- if (bzerror == BZ_OK || bzerror == BZ_STREAM_END) @{ +- /* do something with buf[0 .. nBuf-1] */ +- @} +-@} +-if (bzerror != BZ_STREAM_END) @{ +- BZ2_bzReadClose ( &bzerror, b ); +- /* handle error */ +-@} else @{ +- BZ2_bzReadClose ( &bzerror, b ); +-@} +-@end example +- +- +- +-@section Utility functions +-@subsection @code{BZ2_bzBuffToBuffCompress} +-@example +- int BZ2_bzBuffToBuffCompress( char* dest, +- unsigned int* destLen, +- char* source, +- unsigned int sourceLen, +- int blockSize100k, +- int verbosity, +- int workFactor ); +-@end example +-Attempts to compress the data in @code{source[0 .. sourceLen-1]} +-into the destination buffer, @code{dest[0 .. *destLen-1]}.
+-If the destination buffer is big enough, @code{*destLen} is +-set to the size of the compressed data, and @code{BZ_OK} is +-returned. If the compressed data won't fit, @code{*destLen} +-is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. +- +-Compression in this manner is a one-shot event, done with a single call +-to this function. The resulting compressed data is a complete +-@code{bzip2} format data stream. There is no mechanism for making +-additional calls to provide extra input data. If you want that kind of +-mechanism, use the low-level interface. +- +-For the meaning of parameters @code{blockSize100k}, @code{verbosity} +-and @code{workFactor}, @* see @code{BZ2_bzCompressInit}. +- +-To guarantee that the compressed data will fit in its buffer, allocate +-an output buffer of size 1% larger than the uncompressed data, plus +-six hundred extra bytes. +- +-@code{BZ2_bzBuffToBuffCompress} will not write data at or +-beyond @code{dest[*destLen]}, even in case of buffer overflow. +- +-Possible return values: +-@display +- @code{BZ_CONFIG_ERROR} +- if the library has been mis-compiled +- @code{BZ_PARAM_ERROR} +- if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} +- or @code{blockSize100k < 1} or @code{blockSize100k > 9} +- or @code{verbosity < 0} or @code{verbosity > 4} +- or @code{workFactor < 0} or @code{workFactor > 250} +- @code{BZ_MEM_ERROR} +- if insufficient memory is available +- @code{BZ_OUTBUFF_FULL} +- if the size of the compressed data exceeds @code{*destLen} +- @code{BZ_OK} +- otherwise +-@end display +- +- +- +-@subsection @code{BZ2_bzBuffToBuffDecompress} +-@example +- int BZ2_bzBuffToBuffDecompress ( char* dest, +- unsigned int* destLen, +- char* source, +- unsigned int sourceLen, +- int small, +- int verbosity ); +-@end example +-Attempts to decompress the data in @code{source[0 .. sourceLen-1]} +-into the destination buffer, @code{dest[0 .. *destLen-1]}. 
+-If the destination buffer is big enough, @code{*destLen} is +-set to the size of the uncompressed data, and @code{BZ_OK} is +-returned. If the uncompressed data won't fit, @code{*destLen} +-is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. +- +-@code{source} is assumed to hold a complete @code{bzip2} format +-data stream. @* @code{BZ2_bzBuffToBuffDecompress} tries to decompress +-the entirety of the stream into the output buffer. +- +-For the meaning of parameters @code{small} and @code{verbosity}, +-see @code{BZ2_bzDecompressInit}. +- +-Because the compression ratio of the compressed data cannot be known in +-advance, there is no easy way to guarantee that the output buffer will +-be big enough. You may of course make arrangements in your code to +-record the size of the uncompressed data, but such a mechanism is beyond +-the scope of this library. +- +-@code{BZ2_bzBuffToBuffDecompress} will not write data at or +-beyond @code{dest[*destLen]}, even in case of buffer overflow. +- +-Possible return values: +-@display +- @code{BZ_CONFIG_ERROR} +- if the library has been mis-compiled +- @code{BZ_PARAM_ERROR} +- if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} +- or @code{small != 0 && small != 1} +- or @code{verbosity < 0} or @code{verbosity > 4} +- @code{BZ_MEM_ERROR} +- if insufficient memory is available +- @code{BZ_OUTBUFF_FULL} +- if the size of the uncompressed data exceeds @code{*destLen} +- @code{BZ_DATA_ERROR} +- if a data integrity error was detected in the compressed data +- @code{BZ_DATA_ERROR_MAGIC} +- if the compressed data doesn't begin with the right magic bytes +- @code{BZ_UNEXPECTED_EOF} +- if the compressed data ends unexpectedly +- @code{BZ_OK} +- otherwise +-@end display +- +- +- +-@section @code{zlib} compatibility functions +-Yoshioka Tsuneo has contributed some functions to +-give better @code{zlib} compatibility. 
These functions are +-@code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, +-@code{BZ2_bzclose}, +-@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. +-These functions are not (yet) officially part of +-the library. If they break, you get to keep all the pieces. +-Nevertheless, I think they work ok. +-@example +-typedef void BZFILE; +- +-const char * BZ2_bzlibVersion ( void ); +-@end example +-Returns a string indicating the library version. +-@example +-BZFILE * BZ2_bzopen ( const char *path, const char *mode ); +-BZFILE * BZ2_bzdopen ( int fd, const char *mode ); +-@end example +-Opens a @code{.bz2} file for reading or writing, using either its name +-or a pre-existing file descriptor. +-Analogous to @code{fopen} and @code{fdopen}. +-@example +-int BZ2_bzread ( BZFILE* b, void* buf, int len ); +-int BZ2_bzwrite ( BZFILE* b, void* buf, int len ); +-@end example +-Reads/writes data from/to a previously opened @code{BZFILE}. +-Analogous to @code{fread} and @code{fwrite}. +-@example +-int BZ2_bzflush ( BZFILE* b ); +-void BZ2_bzclose ( BZFILE* b ); +-@end example +-Flushes/closes a @code{BZFILE}. @code{BZ2_bzflush} doesn't actually do +-anything. Analogous to @code{fflush} and @code{fclose}. +- +-@example +-const char * BZ2_bzerror ( BZFILE *b, int *errnum ); +-@end example +-Returns a string describing the most recent error status of +-@code{b}, and also sets @code{*errnum} to its numerical value. +- +- +-@section Using the library in a @code{stdio}-free environment +- +-@subsection Getting rid of @code{stdio} +- +-In a deeply embedded application, you might want to use just +-the memory-to-memory functions. You can do this conveniently +-by compiling the library with the preprocessor symbol @code{BZ_NO_STDIO} +-defined. 
Doing this gives you a library containing only the following +-eight functions: +- +-@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, @code{BZ2_bzCompressEnd} @* +-@code{BZ2_bzDecompressInit}, @code{BZ2_bzDecompress}, @code{BZ2_bzDecompressEnd} @* +-@code{BZ2_bzBuffToBuffCompress}, @code{BZ2_bzBuffToBuffDecompress} +- +-When compiled like this, all functions will ignore @code{verbosity} +-settings. +- +-@subsection Critical error handling +-@code{libbzip2} contains a number of internal assertion checks which +-should, needless to say, never be activated. Nevertheless, if an +-assertion should fail, behaviour depends on whether or not the library +-was compiled with @code{BZ_NO_STDIO} set. +- +-For a normal compile, an assertion failure yields the message +-@example +- bzip2/libbzip2: internal error number N. +- This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000. +- Please report it to me at: jseward@@acm.org. If this happened +- when you were using some program which uses libbzip2 as a +- component, you should also report this bug to the author(s) +- of that program. Please make an effort to report this bug; +- timely and accurate bug reports eventually lead to higher +- quality software. Thanks. Julian Seward, 21 March 2000. +-@end example +-where @code{N} is some error code number. @code{exit(3)} +-is then called. +- +-For a @code{stdio}-free library, assertion failures result +-in a call to a function declared as: +-@example +- extern void bz_internal_error ( int errcode ); +-@end example +-The relevant code is passed as a parameter. You should supply +-such a function. +- +-In either case, once an assertion failure has occurred, any +-@code{bz_stream} records involved can be regarded as invalid. +-You should not attempt to resume normal operation with them. +- +-You may, of course, change critical error handling to suit +-your needs. As I said above, critical errors indicate bugs +-in the library and should not occur. 
All "normal" error +-situations are indicated via error return codes from functions, +-and can be recovered from. +- +- +-@section Making a Windows DLL +-Everything related to Windows has been contributed by Yoshioka Tsuneo +-@* (@code{QWF00133@@niftyserve.or.jp} / +-@code{tsuneo-y@@is.aist-nara.ac.jp}), so you should send your queries to +-him (but perhaps Cc: me, @code{jseward@@acm.org}). +- +-My vague understanding of what to do is: using Visual C++ 5.0, +-open the project file @code{libbz2.dsp}, and build. That's all. +- +-If you can't +-open the project file for some reason, make a new one, naming these files: +-@code{blocksort.c}, @code{bzlib.c}, @code{compress.c}, +-@code{crctable.c}, @code{decompress.c}, @code{huffman.c}, @* +-@code{randtable.c} and @code{libbz2.def}. You will also need +-to name the header files @code{bzlib.h} and @code{bzlib_private.h}. +- +-If you don't use VC++, you may need to define the preprocessor symbol +-@code{_WIN32}. +- +-Finally, @code{dlltest.c} is a sample program using the DLL. It has a +-project file, @code{dlltest.dsp}. +- +-If you just want a makefile for Visual C, have a look at +-@code{makefile.msc}. +- +-Be aware that if you compile @code{bzip2} itself on Win32, you must set +-@code{BZ_UNIX} to 0 and @code{BZ_LCCWIN32} to 1, in the file +-@code{bzip2.c}, before compiling. Otherwise the resulting binary won't +-work correctly. +- +-I haven't tried any of this stuff myself, but it all looks plausible. +- +- +- +-@chapter Miscellanea +- +-These are just some random thoughts of mine. Your mileage may +-vary. +- +-@section Limitations of the compressed file format +-@code{bzip2-1.0}, @code{0.9.5} and @code{0.9.0} +-use exactly the same file format as the previous +-version, @code{bzip2-0.1}. This decision was made in the interests of +-stability. Creating yet another incompatible compressed file format +-would create further confusion and disruption for users. +- +-Nevertheless, this is not a painless decision. 
Development +-work since the release of @code{bzip2-0.1} in August 1997 +-has shown complexities in the file format which slow down +-decompression and, in retrospect, are unnecessary. These are: +-@itemize @bullet +-@item The run-length encoder, which is the first of the +- compression transformations, is entirely irrelevant. +- The original purpose was to protect the sorting algorithm +- from the very worst case input: a string of repeated +- symbols. But algorithm steps Q6a and Q6b in the original +- Burrows-Wheeler technical report (SRC-124) show how +- repeats can be handled without difficulty in block +- sorting. +-@item The randomisation mechanism doesn't really need to be +- there. Udi Manber and Gene Myers published a suffix +- array construction algorithm a few years back, which +- can be employed to sort any block, no matter how +- repetitive, in O(N log N) time. Subsequent work by +- Kunihiko Sadakane has produced a derivative O(N (log N)^2) +- algorithm which usually outperforms the Manber-Myers +- algorithm. +- +- I could have changed to Sadakane's algorithm, but I find +- it to be slower than @code{bzip2}'s existing algorithm for +- most inputs, and the randomisation mechanism protects +- adequately against bad cases. I didn't think it was +- a good tradeoff to make. Partly this is due to the fact +- that I was not flooded with email complaints about +- @code{bzip2-0.1}'s performance on repetitive data, so +- perhaps it isn't a problem for real inputs. +- +- Probably the best long-term solution, +- and the one I have incorporated into 0.9.5 and above, +- is to use the existing sorting +- algorithm initially, and fall back to an O(N (log N)^2) +- algorithm if the standard algorithm gets into difficulties. +-@item The compressed file format was never designed to be +- handled by a library, and I have had to jump through +- some hoops to produce an efficient implementation of +- decompression. It's a bit hairy. 
Try passing +- @code{decompress.c} through the C preprocessor +- and you'll see what I mean. Much of this complexity +- could have been avoided if the compressed size of +- each block of data was recorded in the data stream. +-@item An Adler-32 checksum, rather than a CRC32 checksum, +- would be faster to compute. +-@end itemize +-It would be fair to say that the @code{bzip2} format was frozen +-before I properly and fully understood the performance +-consequences of doing so. +- +-Improvements which I was able to incorporate into +-0.9.0, despite using the same file format, are: +-@itemize @bullet +-@item Single array implementation of the inverse BWT. This +- significantly speeds up decompression, presumably +- because it reduces the number of cache misses. +-@item Faster inverse MTF transform for large MTF values. The +- new implementation is based on the notion of sliding blocks +- of values. +-@item @code{bzip2-0.9.0} now reads and writes files with @code{fread} +- and @code{fwrite}; version 0.1 used @code{putc} and @code{getc}. +- Duh! Well, you live and learn. +- +-@end itemize +-Further ahead, it would be nice +-to be able to do random access into files. This will +-require some careful design of compressed file formats. +- +- +- +-@section Portability issues +-After some consideration, I have decided not to use +-GNU @code{autoconf} to configure 0.9.5 or 1.0. +- +-@code{autoconf}, admirable and wonderful though it is, +-mainly assists with portability problems between Unix-like +-platforms. But @code{bzip2} doesn't have much in the way +-of portability problems on Unix; most of the difficulties appear +-when porting to the Mac, or to Microsoft's operating systems. +-@code{autoconf} doesn't help in those cases, and brings in a +-whole load of new complexity. +- +-Most people should be able to compile the library and program +-under Unix straight out-of-the-box, so to speak, especially +-if you have a version of GNU C available. 
+- +-There are a couple of @code{__inline__} directives in the code. GNU C +-(@code{gcc}) should be able to handle them. If you're not using +-GNU C, your C compiler shouldn't see them at all. +-If your compiler does, for some reason, see them and doesn't +-like them, just @code{#define} @code{__inline__} to be @code{/* */}. One +-easy way to do this is to compile with the flag @code{-D__inline__=}, +-which should be understood by most Unix compilers. +- +-If you still have difficulties, try compiling with the macro +-@code{BZ_STRICT_ANSI} defined. This should enable you to build the +-library in a strictly ANSI compliant environment. Building the program +-itself like this is dangerous and not supported, since you remove +-@code{bzip2}'s checks against compressing directories, symbolic links, +-devices, and other not-really-a-file entities. This could cause +-filesystem corruption! +- +-One other thing: if you create a @code{bzip2} binary for public +-distribution, please try and link it statically (@code{gcc -static}). This +-avoids all sorts of library-version issues that others may encounter +-later on. +- +-If you build @code{bzip2} on Win32, you must set @code{BZ_UNIX} to 0 and +-@code{BZ_LCCWIN32} to 1, in the file @code{bzip2.c}, before compiling. +-Otherwise the resulting binary won't work correctly. +- +- +- +-@section Reporting bugs +-I tried pretty hard to make sure @code{bzip2} is +-bug free, both by design and by testing. Hopefully +-you'll never need to read this section for real. +- +-Nevertheless, if @code{bzip2} dies with a segmentation +-fault, a bus error or an internal assertion failure, it +-will ask you to email me a bug report. Experience with +-version 0.1 shows that almost all these problems can +-be traced to either compiler bugs or hardware problems. +-@itemize @bullet +-@item +-Recompile the program with no optimisation, and see if it +-works. And/or try a different compiler. 
+-I heard all sorts of stories about various flavours +-of GNU C (and other compilers) generating bad code for +-@code{bzip2}, and I've run across two such examples myself. +- +-2.7.X versions of GNU C are known to generate bad code from +-time to time, at high optimisation levels. +-If you get problems, try using the flags +-@code{-O2} @code{-fomit-frame-pointer} @code{-fno-strength-reduce}. +-You should specifically @emph{not} use @code{-funroll-loops}. +- +-You may notice that the Makefile runs six tests as part of +-the build process. If the program passes all of these, it's +-a pretty good (but not 100%) indication that the compiler has +-done its job correctly. +-@item +-If @code{bzip2} crashes randomly, and the crashes are not +-repeatable, you may have a flaky memory subsystem. @code{bzip2} +-really hammers your memory hierarchy, and if it's a bit marginal, +-you may get these problems. Ditto if your disk or I/O subsystem +-is slowly failing. Yup, this really does happen. +- +-Try using a different machine of the same type, and see if +-you can repeat the problem. +-@item This isn't really a bug, but ... If @code{bzip2} tells +-you your file is corrupted on decompression, and you +-obtained the file via FTP, there is a possibility that you +-forgot to tell FTP to do a binary mode transfer. That absolutely +-will cause the file to be non-decompressible. You'll have to transfer +-it again. +-@end itemize +- +-If you've incorporated @code{libbzip2} into your own program +-and are getting problems, please, please, please, check that the +-parameters you are passing in calls to the library, are +-correct, and in accordance with what the documentation says +-is allowable. I have tried to make the library robust against +-such problems, but I'm sure I haven't succeeded. +- +-Finally, if the above comments don't help, you'll have to send +-me a bug report. 
Now, it's just amazing how many people will +-send me a bug report saying something like +-@display +- bzip2 crashed with segmentation fault on my machine +-@end display +-and absolutely nothing else. Needless to say, such a report +-is @emph{totally, utterly, completely and comprehensively 100% useless; +-a waste of your time, my time, and net bandwidth}. +-With no details at all, there's no way I can possibly begin +-to figure out what the problem is. +- +-The rules of the game are: facts, facts, facts. Don't omit +-them because "oh, they won't be relevant". At the bare +-minimum: +-@display +- Machine type. Operating system version. +- Exact version of @code{bzip2} (do @code{bzip2 -V}). +- Exact version of the compiler used. +- Flags passed to the compiler. +-@end display +-However, the most important single thing that will help me is +-the file that you were trying to compress or decompress at the +-time the problem happened. Without that, my ability to do anything +-more than speculate about the cause is limited. +- +-Please remember that I connect to the Internet with a modem, so +-you should contact me before mailing me huge files. +- +- +-@section Did you get the right package? +- +-@code{bzip2} is a resource hog. It soaks up large amounts of CPU cycles +-and memory. Also, it gives very large latencies. In the worst case, you +-can feed many megabytes of uncompressed data into the library before +-getting any compressed output, so this probably rules out applications +-requiring interactive behaviour. +- +-These aren't faults of my implementation, I hope, but more +-an intrinsic property of the Burrows-Wheeler transform (unfortunately). +-Maybe this isn't what you want. +- +-If you want a compressor and/or library which is faster, uses less +-memory but gets pretty good compression, and has minimal latency, +-consider Jean-loup +-Gailly's and Mark Adler's work, @code{zlib-1.1.2} and +-@code{gzip-1.2.4}. 
Look for them at +- +-@code{http://www.cdrom.com/pub/infozip/zlib} and +-@code{http://www.gzip.org} respectively. +- +-For something faster and lighter still, you might try Markus F X J +-Oberhumer's @code{LZO} real-time compression/decompression library, at +-@* @code{http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html}. +- +-If you want to use the @code{bzip2} algorithms to compress small blocks +-of data, 64k bytes or smaller, for example on an on-the-fly disk +-compressor, you'd be well advised not to use this library. Instead, +-I've made a special library tuned for that kind of use. It's part of +-@code{e2compr-0.40}, an on-the-fly disk compressor for the Linux +-@code{ext2} filesystem. Look at +-@code{http://www.netspace.net.au/~reiter/e2compr}. +- +- +- +-@section Testing +- +-A record of the tests I've done. +- +-First, some data sets: +-@itemize @bullet +-@item B: a directory containing 6001 files, one for every length in the +- range 0 to 6000 bytes. The files contain random lowercase +- letters. 18.7 megabytes. +-@item H: my home directory tree. Documents, source code, mail files, +- compressed data. H contains B, and also a directory of +- files designed as boundary cases for the sorting; mostly very +- repetitive, nasty files. 565 megabytes. +-@item A: directory tree holding various applications built from source: +- @code{egcs}, @code{gcc-2.8.1}, KDE, GTK, Octave, etc. +- 2200 megabytes. +-@end itemize +-The tests conducted are as follows. Each test means compressing +-(a copy of) each file in the data set, decompressing it and +-comparing it against the original. +- +-First, a bunch of tests with block sizes and internal buffer +-sizes set very small, +-to detect any problems with the +-blocking and buffering mechanisms. +-This required modifying the source code so as to try to +-break it. +-@enumerate +-@item Data set H, with +- buffer size of 1 byte, and block size of 23 bytes. +-@item Data set B, buffer sizes 1 byte, block size 1 byte. 
+-@item As (2) but small-mode decompression. +-@item As (2) with block size 2 bytes. +-@item As (2) with block size 3 bytes. +-@item As (2) with block size 4 bytes. +-@item As (2) with block size 5 bytes. +-@item As (2) with block size 6 bytes and small-mode decompression. +-@item H with buffer size of 1 byte, but normal block +- size (up to 900000 bytes). +-@end enumerate +-Then some tests with unmodified source code. +-@enumerate +-@item H, all settings normal. +-@item As (1), with small-mode decompress. +-@item H, compress with flag @code{-1}. +-@item H, compress with flag @code{-s}, decompress with flag @code{-s}. +-@item Forwards compatibility: H, @code{bzip2-0.1pl2} compressing, +- @code{bzip2-0.9.5} decompressing, all settings normal. +-@item Backwards compatibility: H, @code{bzip2-0.9.5} compressing, +- @code{bzip2-0.1pl2} decompressing, all settings normal. +-@item Bigger tests: A, all settings normal. +-@item As (7), using the fallback (Sadakane-like) sorting algorithm. +-@item As (8), compress with flag @code{-1}, decompress with flag +- @code{-s}. +-@item H, using the fallback sorting algorithm. +-@item Forwards compatibility: A, @code{bzip2-0.1pl2} compressing, +- @code{bzip2-0.9.5} decompressing, all settings normal. +-@item Backwards compatibility: A, @code{bzip2-0.9.5} compressing, +- @code{bzip2-0.1pl2} decompressing, all settings normal. +-@item Misc test: about 400 megabytes of @code{.tar} files with +- @code{bzip2} compiled with Checker (a memory access error +- detector, like Purify). +-@item Misc tests to make sure it builds and runs ok on non-Linux/x86 +- platforms. +-@end enumerate +-These tests were conducted on a 225 MHz IDT WinChip machine, running +-Linux 2.0.36. They represent nearly a week of continuous computation. +-All tests completed successfully. +- +- +-@section Further reading +-@code{bzip2} is not research work, in the sense that it doesn't present +-any new ideas. Rather, it's an engineering exercise based on existing +-ideas. 
+- +-Four documents describe essentially all the ideas behind @code{bzip2}: +-@example +-Michael Burrows and D. J. Wheeler: +- "A block-sorting lossless data compression algorithm" +- 10th May 1994. +- Digital SRC Research Report 124. +- ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz +- If you have trouble finding it, try searching at the +- New Zealand Digital Library, http://www.nzdl.org. +- +-Daniel S. Hirschberg and Debra A. Lelewer +- "Efficient Decoding of Prefix Codes" +- Communications of the ACM, April 1990, Vol 33, Number 4. +- You might be able to get an electronic copy of this +- from the ACM Digital Library. +- +-David J. Wheeler +- Program bred3.c and accompanying document bred3.ps. +- This contains the idea behind the multi-table Huffman +- coding scheme. +- ftp://ftp.cl.cam.ac.uk/users/djw3/ +- +-Jon L. Bentley and Robert Sedgewick +- "Fast Algorithms for Sorting and Searching Strings" +- Available from Sedgewick's web page, +- www.cs.princeton.edu/~rs +-@end example +-The following paper gives valuable additional insights into the +-algorithm, but is not immediately the basis of any code +-used in bzip2. +-@example +-Peter Fenwick: +- Block Sorting Text Compression +- Proceedings of the 19th Australasian Computer Science Conference, +- Melbourne, Australia. Jan 31 - Feb 2, 1996. 
+- ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps +-@end example +-Kunihiko Sadakane's sorting algorithm, mentioned above, +-is available from: +-@example +-http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz +-@end example +-The Manber-Myers suffix array construction +-algorithm is described in a paper +-available from: +-@example +-http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps +-@end example +-Finally, the following paper documents some recent investigations +-I made into the performance of sorting algorithms: +-@example +-Julian Seward: +- On the Performance of BWT Sorting Algorithms +- Proceedings of the IEEE Data Compression Conference 2000 +- Snowbird, Utah. 28-30 March 2000. +-@end example +- +- +-@contents +- +-@bye +- +diff -Nru bzip2-1.0.1/manual_1.html bzip2-1.0.1.new/manual_1.html +--- bzip2-1.0.1/manual_1.html Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/manual_1.html Thu Jan 1 01:00:00 1970 +@@ -1,47 +0,0 @@ +- +- +- +- +-bzip2 and libbzip2 - Introduction +- +- +- +- +- +-

Go to the first, previous, next, last section, table of contents. +-


+- +- +-

Introduction

+- +-

+-bzip2 compresses files using the Burrows-Wheeler +-block-sorting text compression algorithm, and Huffman coding. +-Compression is generally considerably better than that +-achieved by more conventional LZ77/LZ78-based compressors, +-and approaches the performance of the PPM family of statistical compressors. +- +-

+-

+-bzip2 is built on top of libbzip2, a flexible library +-for handling compressed data in the bzip2 format. This manual +-describes both how to use the program and +-how to work with the library interface. Most of the +-manual is devoted to this library, not the program, +-which is good news if your interest is only in the program. +- +-

+-

+-Chapter 2 describes how to use bzip2; this is the only part +-you need to read if you just want to know how to operate the program. +-Chapter 3 describes the programming interfaces in detail, and +-Chapter 4 records some miscellaneous notes which I thought +-ought to be recorded somewhere. +- +-

+- +-


+-

Go to the first, previous, next, last section, table of contents. +- +- +diff -Nru bzip2-1.0.1/manual_2.html bzip2-1.0.1.new/manual_2.html +--- bzip2-1.0.1/manual_2.html Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/manual_2.html Thu Jan 1 01:00:00 1970 +@@ -1,484 +0,0 @@ +- +- +- +- +-bzip2 and libbzip2 - How to use bzip2 +- +- +- +- +- +- +-

Go to the first, previous, next, last section, table of contents. +-


+- +- +-

How to use bzip2

+- +-

+-This chapter contains a copy of the bzip2 man page, +-and nothing else. +- +-

+- +-
+- +- +- +-

NAME

+- +-
    +-
  • bzip2, bunzip2 +- +-- a block-sorting file compressor, v1.0 +-
  • bzcat +- +-- decompresses files to stdout +-
  • bzip2recover +- +-- recovers data from damaged bzip2 files +-
+- +- +- +-

SYNOPSIS

+- +-
    +-
  • bzip2 [ -cdfkqstvzVL123456789 ] [ filenames ... ] +- +-
  • bunzip2 [ -fkvsVL ] [ filenames ... ] +- +-
  • bzcat [ -s ] [ filenames ... ] +- +-
  • bzip2recover filename +- +-
+- +- +- +-

DESCRIPTION

+- +-

+-bzip2 compresses files using the Burrows-Wheeler block sorting +-text compression algorithm, and Huffman coding. Compression is +-generally considerably better than that achieved by more conventional +-LZ77/LZ78-based compressors, and approaches the performance of the PPM +-family of statistical compressors. +- +-

+-

+-The command-line options are deliberately very similar to those of GNU +-gzip, but they are not identical. +- +-

+-

+-bzip2 expects a list of file names to accompany the command-line +-flags. Each file is replaced by a compressed version of itself, with +-the name original_name.bz2. Each compressed file has the same +-modification date, permissions, and, when possible, ownership as the +-corresponding original, so that these properties can be correctly +-restored at decompression time. File name handling is naive in the +-sense that there is no mechanism for preserving original file names, +-permissions, ownerships or dates in filesystems which lack these +-concepts, or have serious file name length restrictions, such as MS-DOS. +- +-

+-

+-bzip2 and bunzip2 will by default not overwrite existing +-files. If you want this to happen, specify the -f flag. +- +-

+-

+-If no file names are specified, bzip2 compresses from standard +-input to standard output. In this case, bzip2 will decline to +-write compressed output to a terminal, as this would be entirely +-incomprehensible and therefore pointless. +- +-

+-

+-bunzip2 (or bzip2 -d) decompresses all +-specified files. Files which were not created by bzip2 +-will be detected and ignored, and a warning issued. +-bzip2 attempts to guess the filename for the decompressed file +-from that of the compressed file as follows: +- +-

    +-
  • filename.bz2 becomes filename +- +-
  • filename.bz becomes filename +- +-
  • filename.tbz2 becomes filename.tar +- +-
  • filename.tbz becomes filename.tar +- +-
  • anyothername becomes anyothername.out +- +-
+- +-

+-If the file does not end in one of the recognised endings, +-.bz2, .bz, +-.tbz2 or .tbz, bzip2 complains that it cannot +-guess the name of the original file, and uses the original name +-with .out appended. +- +-

+-

+-As with compression, supplying no +-filenames causes decompression from standard input to standard output. +- +-

+-

+-bunzip2 will correctly decompress a file which is the +-concatenation of two or more compressed files. The result is the +-concatenation of the corresponding uncompressed files. Integrity +-testing (-t) of concatenated compressed files is also supported. +- +-

+-

+-You can also compress or decompress files to the standard output by +-giving the -c flag. Multiple files may be compressed and +-decompressed like this. The resulting outputs are fed sequentially to +-stdout. Compression of multiple files in this manner generates a stream +-containing multiple compressed file representations. Such a stream +-can be decompressed correctly only by bzip2 version 0.9.0 or +-later. Earlier versions of bzip2 will stop after decompressing +-the first file in the stream. +- +-

+-

+-bzcat (or bzip2 -dc) decompresses all specified files to +-the standard output. +- +-

+-

+-bzip2 will read arguments from the environment variables +-BZIP2 and BZIP, in that order, and will process them +-before any arguments read from the command line. This gives a +-convenient way to supply default arguments. +- +-

+-

+-Compression is always performed, even if the compressed file is slightly +-larger than the original. Files of less than about one hundred bytes +-tend to get larger, since the compression mechanism has a constant +-overhead in the region of 50 bytes. Random data (including the output +-of most file compressors) is coded at about 8.05 bits per byte, giving +-an expansion of around 0.5%. +- +-

+-

+-As a self-check for your protection, bzip2 uses 32-bit CRCs to +-make sure that the decompressed version of a file is identical to the +-original. This guards against corruption of the compressed data, and +-against undetected bugs in bzip2 (hopefully very unlikely). The +-chance of data corruption going undetected is microscopic, about one +-chance in four billion for each file processed. Be aware, though, that +-the check occurs upon decompression, so it can only tell you that +-something is wrong. It can't help you recover the original uncompressed +-data. You can use bzip2recover to try to recover data from +-damaged files. +- +-

+-

+-Return values: 0 for a normal exit, 1 for environmental problems (file +-not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt +-compressed file, 3 for an internal consistency error (eg, bug) which +-caused bzip2 to panic. +- +-

+- +- +- +-

OPTIONS

+-
+- +-
-c --stdout +-
+-Compress or decompress to standard output. +-
-d --decompress +-
+-Force decompression. bzip2, bunzip2 and bzcat are +-really the same program, and the decision about what actions to take is +-done on the basis of which name is used. This flag overrides that +-mechanism, and forces bzip2 to decompress. +-
-z --compress +-
+-The complement to -d: forces compression, regardless of the +-invocation name. +-
-t --test +-
+-Check integrity of the specified file(s), but don't decompress them. +-This really performs a trial decompression and throws away the result. +-
-f --force +-
+-Force overwrite of output files. Normally, bzip2 will not overwrite +-existing output files. Also forces bzip2 to break hard links +-to files, which it otherwise wouldn't do. +-
-k --keep +-
+-Keep (don't delete) input files during compression +-or decompression. +-
-s --small +-
+-Reduce memory usage, for compression, decompression and testing. Files +-are decompressed and tested using a modified algorithm which only +-requires 2.5 bytes per block byte. This means any file can be +-decompressed in 2300k of memory, albeit at about half the normal speed. +- +-During compression, -s selects a block size of 200k, which limits +-memory use to around the same figure, at the expense of your compression +-ratio. In short, if your machine is low on memory (8 megabytes or +-less), use -s for everything. See MEMORY MANAGEMENT below. +-
-q --quiet +-
+-Suppress non-essential warning messages. Messages pertaining to +-I/O errors and other critical events will not be suppressed. +-
-v --verbose +-
+-Verbose mode -- show the compression ratio for each file processed. +-Further -v's increase the verbosity level, spewing out lots of +-information which is primarily of interest for diagnostic purposes. +-
-L --license -V --version +-
+-Display the software version, license terms and conditions. +-
-1 to -9 +-
+-Set the block size to 100 k, 200 k .. 900 k when compressing. Has no +-effect when decompressing. See MEMORY MANAGEMENT below. +-
-- +-
+-Treats all subsequent arguments as file names, even if they start +-with a dash. This is so you can handle files with names beginning +-with a dash, for example: bzip2 -- -myfilename. +-
--repetitive-fast +-
+-
--repetitive-best +-
+-These flags are redundant in versions 0.9.5 and above. They provided +-some coarse control over the behaviour of the sorting algorithm in +-earlier versions, which was sometimes useful. 0.9.5 and above have an +-improved algorithm which renders these flags irrelevant. +-
+- +- +- +-

MEMORY MANAGEMENT

+- +-

+-bzip2 compresses large files in blocks. The block size affects +-both the compression ratio achieved, and the amount of memory needed for +-compression and decompression. The flags -1 through -9 +-specify the block size to be 100,000 bytes through 900,000 bytes (the +-default) respectively. At decompression time, the block size used for +-compression is read from the header of the compressed file, and +-bunzip2 then allocates itself just enough memory to decompress +-the file. Since block sizes are stored in compressed files, it follows +-that the flags -1 to -9 are irrelevant to and so ignored +-during decompression. +- +-

+-

+-Compression and decompression requirements, in bytes, can be estimated +-as: +- +-

+-     Compression:   400k + ( 8 x block size )
+-
+-     Decompression: 100k + ( 4 x block size ), or
+-                    100k + ( 2.5 x block size )
+-
+- +-
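The two estimates above are easy to turn into a small calculation. What follows is only a sketch: the function names are hypothetical and not part of bzip2, and it assumes that "k" means 1000 bytes and that the block size is 100,000 bytes times the digit of the flag.

```c
/* Hypothetical helpers (not part of bzip2): estimate memory use in bytes
   for a block-size flag -1 .. -9, using the formulas quoted above.
   Assumes 1k = 1000 bytes and block size = level * 100000 bytes. */
long bz_est_compress_mem(int level)
{
    long block = (long)level * 100000L;
    return 400000L + 8L * block;           /* 400k + ( 8 x block size ) */
}

long bz_est_decompress_mem(int level, int small)
{
    long block = (long)level * 100000L;
    /* 2.5 x block size is written as 5/2 to stay in integer arithmetic */
    return small ? 100000L + (5L * block) / 2L
                 : 100000L + 4L * block;
}
```

Under these assumptions, level 9 comes out at 7600k for compression, 3700k for normal decompression and 2350k for decompression with -s.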

+-Larger block sizes give rapidly diminishing marginal returns. Most of +-the compression comes from the first two or three hundred k of block +-size, a fact worth bearing in mind when using bzip2 on small machines. +-It is also important to appreciate that the decompression memory +-requirement is set at compression time by the choice of block size. +- +-

+-

+-For files compressed with the default 900k block size, bunzip2 +-will require about 3700 kbytes to decompress. To support decompression +-of any file on a 4 megabyte machine, bunzip2 has an option to +-decompress using approximately half this amount of memory, about 2300 +-kbytes. Decompression speed is also halved, so you should use this +-option only where necessary. The relevant flag is -s. +- +-

+-

+-In general, try and use the largest block size memory constraints allow, +-since that maximises the compression achieved. Compression and +-decompression speed are virtually unaffected by block size. +- +-

+-

+-Another significant point applies to files which fit in a single block +--- that means most files you'd encounter using a large block size. The +-amount of real memory touched is proportional to the size of the file, +-since the file is smaller than a block. For example, compressing a file +-20,000 bytes long with the flag -9 will cause the compressor to +-allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 +-kbytes of it. Similarly, the decompressor will allocate 3700k but only +-touch 100k + 20000 * 4 = 180 kbytes. +- +-

+-

+-Here is a table which summarises the maximum memory usage for different +-block sizes. Also recorded is the total compressed size for 14 files of +-the Calgary Text Compression Corpus totalling 3,141,622 bytes. This +-column gives some feel for how compression varies with block size. +-These figures tend to understate the advantage of larger block sizes for +-larger files, since the Corpus is dominated by smaller files. +- +-

+-          Compress   Decompress   Decompress   Corpus
+-   Flag     usage      usage       -s usage     Size
+-
+-    -1      1200k       500k         350k      914704
+-    -2      2000k       900k         600k      877703
+-    -3      2800k      1300k         850k      860338
+-    -4      3600k      1700k        1100k      846899
+-    -5      4400k      2100k        1350k      845160
+-    -6      5200k      2500k        1600k      838626
+-    -7      6100k      2900k        1850k      834096
+-    -8      6800k      3300k        2100k      828642
+-    -9      7600k      3700k        2350k      828642
+-
+- +- +- +-

RECOVERING DATA FROM DAMAGED FILES

+- +-

+-bzip2 compresses files in blocks, usually 900kbytes long. Each +-block is handled independently. If a media or transmission error causes +-a multi-block .bz2 file to become damaged, it may be possible to +-recover data from the undamaged blocks in the file. +- +-

+-

+-The compressed representation of each block is delimited by a 48-bit +-pattern, which makes it possible to find the block boundaries with +-reasonable certainty. Each block also carries its own 32-bit CRC, so +-damaged blocks can be distinguished from undamaged ones. +- +-

+-

+-bzip2recover is a simple program whose purpose is to search for +-blocks in .bz2 files, and write each block out into its own +-.bz2 file. You can then use bzip2 -t to test the +-integrity of the resulting files, and decompress those which are +-undamaged. +- +-

+-

+-bzip2recover +-takes a single argument, the name of the damaged file, +-and writes a number of files rec0001file.bz2, +- rec0002file.bz2, etc, containing the extracted blocks. +- The output filenames are designed so that the use of +- wildcards in subsequent processing -- for example, +-bzip2 -dc rec*file.bz2 > recovered_data -- lists the files in +- the correct order. +- +-

+-

+-bzip2recover should be of most use dealing with large .bz2 +- files, as these will contain many blocks. It is clearly +- futile to use it on damaged single-block files, since a +- damaged block cannot be recovered. If you wish to minimise +-any potential data loss through media or transmission errors, +-you might consider compressing with a smaller +- block size. +- +-

+- +- +- +-

PERFORMANCE NOTES

+- +-

+-The sorting phase of compression gathers together similar strings in the +-file. Because of this, files containing very long runs of repeated +-symbols, like "aabaabaabaab ..." (repeated several hundred times) may +-compress more slowly than normal. Versions 0.9.5 and above fare much +-better than previous versions in this respect. The ratio between +-worst-case and average-case compression time is in the region of 10:1. +-For previous versions, this figure was more like 100:1. You can use the +--vvvv option to monitor progress in great detail, if you want. +- +-

+-

+-Decompression speed is unaffected by these phenomena. +- +-

+-

+-bzip2 usually allocates several megabytes of memory to operate +-in, and then charges all over it in a fairly random fashion. This means +-that performance, both for compressing and decompressing, is largely +-determined by the speed at which your machine can service cache misses. +-Because of this, small changes to the code to reduce the miss rate have +-been observed to give disproportionately large performance improvements. +-I imagine bzip2 will perform best on machines with very large +-caches. +- +-

+- +- +- +-

CAVEATS

+- +-

+-I/O error messages are not as helpful as they could be. bzip2 +-tries hard to detect I/O errors and exit cleanly, but the details of +-what the problem is sometimes seem rather misleading. +- +-

+-

+-This manual page pertains to version 1.0 of bzip2. Compressed +-data created by this version is entirely forwards and backwards +-compatible with the previous public releases, versions 0.1pl2, 0.9.0 and +-0.9.5, but with the following exception: 0.9.0 and above can correctly +-decompress multiple concatenated compressed files. 0.1pl2 cannot do +-this; it will stop after decompressing just the first file in the +-stream. +- +-

+-

+-bzip2recover uses 32-bit integers to represent bit positions in +-compressed files, so it cannot handle compressed files more than 512 +-megabytes long. This could easily be fixed. +- +-
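The 512 megabyte figure is plain arithmetic: a 32-bit bit position can distinguish 2^32 bits, and there are 8 bits to the byte. A minimal sketch of that calculation, with a hypothetical helper name and assuming a compiler that provides stdint.h:

```c
#include <stdint.h>

/* Hypothetical helper: largest file size, in bytes, whose bit positions
   fit in an unsigned counter of the given width (width must be 1..63). */
uint64_t max_bytes_for_bit_counter(unsigned width)
{
    uint64_t bits = (uint64_t)1 << width;  /* number of distinct bit positions */
    return bits / 8;                        /* 8 bits per byte */
}
```

For a width of 32 this gives 536,870,912 bytes, which is 512 megabytes when a megabyte is taken as 1024 x 1024 bytes.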

+- +- +- +-

AUTHOR

+-

+-Julian Seward, jseward@acm.org. +- +-

+-

+-The ideas embodied in bzip2 are due to (at least) the following +-people: Michael Burrows and David Wheeler (for the block sorting +-transformation), David Wheeler (again, for the Huffman coder), Peter +-Fenwick (for the structured coding model in the original bzip, +-and many refinements), and Alistair Moffat, Radford Neal and Ian Witten +-(for the arithmetic coder in the original bzip). I am much +-indebted for their help, support and advice. See the manual in the +-source distribution for pointers to sources of documentation. Christian +-von Roques encouraged me to look for faster sorting algorithms, so as to +-speed up compression. Bela Lubkin encouraged me to improve the +-worst-case compression performance. Many people sent patches, helped +-with portability problems, lent machines, gave advice and were generally +-helpful. +- +-

+-
+- +-


+-

Go to the first, previous, next, last section, table of contents. +- +- +diff -Nru bzip2-1.0.1/manual_3.html bzip2-1.0.1.new/manual_3.html +--- bzip2-1.0.1/manual_3.html Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/manual_3.html Thu Jan 1 01:00:00 1970 +@@ -1,1773 +0,0 @@ +- +- +- +- +-bzip2 and libbzip2 - Programming with libbzip2 +- +- +- +- +- +- +-



+- +- +-

Programming with libbzip2

+- +-

+-This chapter describes the programming interface to libbzip2. +- +-

+-

+-For general background information, particularly about memory +-use and performance aspects, you'd be well advised to read Chapter 2 +-as well. +- +-

+- +- +-

Top-level structure

+- +-

+-libbzip2 is a flexible library for compressing and decompressing +-data in the bzip2 data format. Although packaged as a single +-entity, it helps to regard the library as three separate parts: the low +-level interface, the high level interface, and some utility +-functions. +- +-

+-

+-The structure of libbzip2's interfaces is similar to +-that of Jean-loup Gailly's and Mark Adler's excellent zlib +-library. +- +-

+-

+-All externally visible symbols have names beginning BZ2_. +-This is new in version 1.0. The intention is to minimise pollution +-of the namespaces of library clients. +- +-

+- +- +-

Low-level summary

+- +-

+-This interface provides services for compressing and decompressing +-data in memory. There's no provision for dealing with files, streams +-or any other I/O mechanisms, just straight memory-to-memory work. +-In fact, this part of the library can be compiled without inclusion +-of stdio.h, which may be helpful for embedded applications. +- +-

+-

+-The low-level part of the library has no global variables and +-is therefore thread-safe. +- +-

+-

+-Six routines make up the low level interface: +-BZ2_bzCompressInit, BZ2_bzCompress, and BZ2_bzCompressEnd +-for compression, +-and a corresponding trio BZ2_bzDecompressInit, BZ2_bzDecompress +-and BZ2_bzDecompressEnd for decompression. +-The *Init functions allocate +-memory for compression/decompression and do other +-initialisations, whilst the *End functions close down operations +-and release memory. +- +-

+-

+-The real work is done by BZ2_bzCompress and BZ2_bzDecompress. +-These compress and decompress data from a user-supplied input buffer +-to a user-supplied output buffer. These buffers can be any size; +-arbitrary quantities of data are handled by making repeated calls +-to these functions. This is a flexible mechanism allowing a +-consumer-pull style of activity, or producer-push, or a mixture of +-both. +- +-

+- +- +- +-

High-level summary

+- +-

+-This interface provides some handy wrappers around the low-level +-interface to facilitate reading and writing bzip2 format +-files (.bz2 files). The routines provide hooks to facilitate +-reading files in which the bzip2 data stream is embedded +-within some larger-scale file structure, or where there are +-multiple bzip2 data streams concatenated end-to-end. +- +-

+-

+-For reading files, BZ2_bzReadOpen, BZ2_bzRead, +-BZ2_bzReadClose and BZ2_bzReadGetUnused are supplied. For +-writing files, BZ2_bzWriteOpen, BZ2_bzWrite and +-BZ2_bzWriteClose are available. +- +-

+-

+-As with the low-level library, no global variables are used +-so the library is per se thread-safe. However, if I/O errors +-occur whilst reading or writing the underlying compressed files, +-you may have to consult errno to determine the cause of +-the error. In that case, you'd need a C library which correctly +-supports errno in a multithreaded environment. +- +-

+-

+-To make the library a little simpler and more portable, +-BZ2_bzReadOpen and BZ2_bzWriteOpen require you to pass them file +-handles (FILE*s) which have previously been opened for reading or +-writing respectively. That avoids portability problems associated with +-file operations and file attributes, whilst not being much of an +-imposition on the programmer. +- +-

+- +- +- +-

Utility functions summary

+-

+-For very simple needs, BZ2_bzBuffToBuffCompress and +-BZ2_bzBuffToBuffDecompress are provided. These compress +-data in memory from one buffer to another buffer in a single +-function call. You should assess whether these functions +-fulfill your memory-to-memory compression/decompression +-requirements before investing effort in understanding the more +-general but more complex low-level interface. +- +-

+-

+-Yoshioka Tsuneo (QWF00133@niftyserve.or.jp / +-tsuneo-y@is.aist-nara.ac.jp) has contributed some functions to +-give better zlib compatibility. These functions are +-BZ2_bzopen, BZ2_bzread, BZ2_bzwrite, BZ2_bzflush, +-BZ2_bzclose, +-BZ2_bzerror and BZ2_bzlibVersion. You may find these functions +-more convenient for simple file reading and writing, than those in the +-high-level interface. These functions are not (yet) officially part of +-the library, and are minimally documented here. If they break, you +-get to keep all the pieces. I hope to document them properly when time +-permits. +- +-

+-

+-Yoshioka also contributed modifications to allow the library to be +-built as a Windows DLL. +- +-

+- +- +- +-

Error handling

+- +-

+-The library is designed to recover cleanly in all situations, including +-the worst-case situation of decompressing random data. I'm not +-100% sure that it can always do this, so you might want to add +-a signal handler to catch segmentation violations during decompression +-if you are feeling especially paranoid. I would be interested in +-hearing more about the robustness of the library to corrupted +-compressed data. +- +-

+-

+-Version 1.0 is much more robust in this respect than +-0.9.0 or 0.9.5. Investigations with Checker (a tool for +-detecting problems with memory management, similar to Purify) +-indicate that, at least for the few files I tested, all single-bit +-errors in the compressed data are caught properly, with no +-segmentation faults, no reads of uninitialised data and no +-out of range reads or writes. So it's certainly much improved, +-although I wouldn't claim it to be totally bombproof. +- +-

+-

+-The file bzlib.h contains all definitions needed to use +-the library. In particular, you should definitely not include +-bzlib_private.h. +- +-

+-

+-In bzlib.h, the various return values are defined. The following +-list is not intended as an exhaustive description of the circumstances +-in which a given value may be returned -- those descriptions are given +-later. Rather, it is intended to convey the rough meaning of each +-return value. The first five values are normal and not intended to +-denote an error situation. +-

+- +-
BZ_OK +-
+-The requested action was completed successfully. +-
BZ_RUN_OK +-
+-
BZ_FLUSH_OK +-
+-
BZ_FINISH_OK +-
+-In BZ2_bzCompress, the requested flush/finish/nothing-special action +-was completed successfully. +-
BZ_STREAM_END +-
+-Compression of data was completed, or the logical stream end was +-detected during decompression. +-
+- +-

+-The following return values indicate an error of some kind. +-

+- +-
BZ_CONFIG_ERROR +-
+-Indicates that the library has been improperly compiled on your +-platform -- a major configuration error. Specifically, it means +-that sizeof(char), sizeof(short) and sizeof(int) +-are not 1, 2 and 4 respectively, as they should be. Note that the +-library should still work properly on 64-bit platforms which follow +-the LP64 programming model -- that is, where sizeof(long) +-and sizeof(void*) are 8. Under LP64, sizeof(int) is +-still 4, so libbzip2, which doesn't use the long type, +-is OK. +-
BZ_SEQUENCE_ERROR +-
+-When using the library, it is important to call the functions in the +-correct sequence and with data structures (buffers etc) in the correct +-states. libbzip2 checks as much as it can to ensure this is +-happening, and returns BZ_SEQUENCE_ERROR if not. Code which +-complies precisely with the function semantics, as detailed below, +-should never receive this value; such an event denotes buggy code +-which you should investigate. +-
BZ_PARAM_ERROR +-
+-Returned when a parameter to a function call is out of range +-or otherwise manifestly incorrect. As with BZ_SEQUENCE_ERROR, +-this denotes a bug in the client code. The distinction between +-BZ_PARAM_ERROR and BZ_SEQUENCE_ERROR is a bit hazy, but still worth +-making. +-
BZ_MEM_ERROR +-
+-Returned when a request to allocate memory failed. Note that the +-quantity of memory needed to decompress a stream cannot be determined +-until the stream's header has been read. So BZ2_bzDecompress and +-BZ2_bzRead may return BZ_MEM_ERROR even though some of +-the compressed data has been read. The same is not true for +-compression; once BZ2_bzCompressInit or BZ2_bzWriteOpen have +-successfully completed, BZ_MEM_ERROR cannot occur. +-
BZ_DATA_ERROR +-
+-Returned when a data integrity error is detected during decompression. +-Most importantly, this means when stored and computed CRCs for the +-data do not match. This value is also returned upon detection of any +-other anomaly in the compressed data. +-
BZ_DATA_ERROR_MAGIC +-
+-As a special case of BZ_DATA_ERROR, it is sometimes useful to +-know when the compressed stream does not start with the correct +-magic bytes ('B' 'Z' 'h'). +-
BZ_IO_ERROR +-
+-Returned by BZ2_bzRead and BZ2_bzWrite when there is an error +-reading or writing in the compressed file, and by BZ2_bzReadOpen +-and BZ2_bzWriteOpen for attempts to use a file for which the +-error indicator (viz, ferror(f)) is set. +-On receipt of BZ_IO_ERROR, the caller should consult +-errno and/or perror to acquire operating-system +-specific information about the problem. +-
BZ_UNEXPECTED_EOF +-
+-Returned by BZ2_bzRead when the compressed file finishes +-before the logical end of stream is detected. +-
BZ_OUTBUFF_FULL +-
+-Returned by BZ2_bzBuffToBuffCompress and +-BZ2_bzBuffToBuffDecompress to indicate that the output data +-will not fit into the output buffer provided. +-
+- +- +- +-
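The condition behind BZ_CONFIG_ERROR can be checked by hand on a given platform. The following is a sketch of an equivalent test, not the library's own code; the function name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical check mirroring the condition described for
   BZ_CONFIG_ERROR: char, short and int must be 1, 2 and 4 bytes.
   Returns 1 if the sizes are as expected, 0 otherwise. */
int type_sizes_ok(void)
{
    return sizeof(char) == 1 && sizeof(short) == 2 && sizeof(int) == 4;
}
```

Note that sizeof(long) and sizeof(void*) play no part in the check, which is why an LP64 platform passes.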

Low-level interface

+- +- +- +-

BZ2_bzCompressInit

+- +-
+-typedef 
+-   struct {
+-      char *next_in;
+-      unsigned int avail_in;
+-      unsigned int total_in_lo32;
+-      unsigned int total_in_hi32;
+-
+-      char *next_out;
+-      unsigned int avail_out;
+-      unsigned int total_out_lo32;
+-      unsigned int total_out_hi32;
+-
+-      void *state;
+-
+-      void *(*bzalloc)(void *,int,int);
+-      void (*bzfree)(void *,void *);
+-      void *opaque;
+-   } 
+-   bz_stream;
+-
+-int BZ2_bzCompressInit ( bz_stream *strm, 
+-                         int blockSize100k, 
+-                         int verbosity,
+-                         int workFactor );
+-
+-
+- +-

+-Prepares for compression. The bz_stream structure +-holds all data pertaining to the compression activity. +-A bz_stream structure should be allocated and initialised +-prior to the call. +-The fields of bz_stream +-comprise the entirety of the user-visible data. state +-is a pointer to the private data structures required for compression. +- +-

+-

+-Custom memory allocators are supported, via fields bzalloc, +-bzfree, +-and opaque. The value +-opaque is passed as the first argument to +-all calls to bzalloc and bzfree, but is +-otherwise ignored by the library. +-The call bzalloc ( opaque, n, m ) is expected to return a +-pointer p to +-n * m bytes of memory, and bzfree ( opaque, p ) +-should free +-that memory. +- +-

+-

+-If you don't want to use a custom memory allocator, set bzalloc, +-bzfree and +-opaque to NULL, +-and the library will then use the standard malloc/free +-routines. +- +-
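As an illustration of the allocator hooks, here is a sketch of a pair of functions with the bzalloc and bzfree signatures that count allocations through the opaque pointer. The structure and function names are hypothetical; only the two signatures come from the bz_stream declaration.

```c
#include <stdlib.h>

/* Hypothetical bookkeeping structure, reached through the opaque pointer. */
typedef struct {
    int live_blocks;   /* allocations not yet freed */
    int total_allocs;  /* allocations ever made */
} alloc_stats;

/* Matches the bzalloc signature: returns n * m bytes, or NULL on failure. */
void *counting_bzalloc(void *opaque, int n, int m)
{
    alloc_stats *st = (alloc_stats *)opaque;
    void *p;

    if (n < 0 || m < 0) return NULL;        /* reject nonsensical sizes */
    p = malloc((size_t)n * (size_t)m);
    if (p != NULL && st != NULL) {
        st->live_blocks++;
        st->total_allocs++;
    }
    return p;
}

/* Matches the bzfree signature: releases p and updates the counters. */
void counting_bzfree(void *opaque, void *p)
{
    alloc_stats *st = (alloc_stats *)opaque;

    if (p == NULL) return;
    if (st != NULL) st->live_blocks--;
    free(p);
}
```

To use such a pair, one would set bzalloc to counting_bzalloc, bzfree to counting_bzfree and opaque to the address of an alloc_stats before calling BZ2_bzCompressInit.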

+-

+-Before calling BZ2_bzCompressInit, fields bzalloc, +-bzfree and opaque should +-be filled appropriately, as just described. Upon return, the internal +-state will have been allocated and initialised, and total_in_lo32, +-total_in_hi32, total_out_lo32 and +-total_out_hi32 will have been set to zero. +-These four fields are used by the library +-to inform the caller of the total amount of data passed into and out of +-the library, respectively. You should not try to change them. +-As of version 1.0, 64-bit counts are maintained, even on 32-bit +-platforms, using the _hi32 fields to store the upper 32 bits +-of the count. So, for example, the total amount of data in +-is (total_in_hi32 << 32) + total_in_lo32. +- +-
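One caution about the expression above: the _hi32 fields are 32-bit unsigned ints, so in C the high half has to be widened before it is shifted, since shifting a 32-bit value by 32 bits is not well defined. A sketch of the combination, assuming a compiler with stdint.h; the helper name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: combine the two 32-bit halves of a byte count
   into one 64-bit value. The cast widens hi32 before the shift. */
uint64_t bz_total64(unsigned int hi32, unsigned int lo32)
{
    return ((uint64_t)hi32 << 32) | (uint64_t)lo32;
}
```

A caller would pass total_in_hi32 and total_in_lo32 (or the total_out pair) from its bz_stream.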

+-

+-Parameter blockSize100k specifies the block size to be used for +-compression. It should be a value between 1 and 9 inclusive, and the +-actual block size used is 100000 x this figure. 9 gives the best +-compression but takes most memory. +- +-

+-

+-Parameter verbosity should be set to a number between 0 and 4 +-inclusive. 0 is silent, and greater numbers give increasingly verbose +-monitoring/debugging output. If the library has been compiled with +--DBZ_NO_STDIO, no such output will appear for any verbosity +-setting. +- +-

+-

+-Parameter workFactor controls how the compression phase behaves +-when presented with worst case, highly repetitive, input data. If +-compression runs into difficulties caused by repetitive data, the +-library switches from the standard sorting algorithm to a fallback +-algorithm. The fallback is slower than the standard algorithm by +-perhaps a factor of three, but always behaves reasonably, no matter how +-bad the input. +- +-

+-

+-Lower values of workFactor reduce the amount of effort the +-standard algorithm will expend before resorting to the fallback. You +-should set this parameter carefully; too low, and many inputs will be +-handled by the fallback algorithm and so compress rather slowly, too +-high, and your average-to-worst case compression times can become very +-large. The default value of 30 gives reasonable behaviour over a wide +-range of circumstances. +- +-

+-

+-Allowable values range from 0 to 250 inclusive. 0 is a special case, +-equivalent to using the default value of 30. +- +-

+-

+-Note that the compressed output generated is the same regardless of +-whether or not the fallback algorithm is used. +- +-

+-

+-Be aware also that this parameter may disappear entirely in future +-versions of the library. In principle it should be possible to devise a +-good way to automatically choose which algorithm to use. Such a +-mechanism would render the parameter obsolete. +- +-

+-

+-Possible return values: +- +-

+-      BZ_CONFIG_ERROR
+-         if the library has been mis-compiled
+-      BZ_PARAM_ERROR 
+-         if strm is NULL 
+-         or blockSize100k < 1 or blockSize100k > 9
+-         or verbosity < 0 or verbosity > 4
+-         or workFactor < 0 or workFactor > 250
+-      BZ_MEM_ERROR 
+-         if not enough memory is available
+-      BZ_OK 
+-         otherwise
+-
+- +-
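The parameter ranges listed above can be checked before the call is made. This is a sketch of such a pre-check, not code taken from the library; both function names are hypothetical, and the mapping of 0 to 30 simply restates the rule given for workFactor.

```c
/* Hypothetical pre-check: returns 1 if the three numeric parameters lie
   in the ranges listed for BZ2_bzCompressInit, 0 if BZ_PARAM_ERROR
   would be expected. */
int compress_params_ok(int blockSize100k, int verbosity, int workFactor)
{
    if (blockSize100k < 1 || blockSize100k > 9) return 0;
    if (verbosity < 0 || verbosity > 4) return 0;
    if (workFactor < 0 || workFactor > 250) return 0;
    return 1;
}

/* Hypothetical helper: workFactor 0 is documented as meaning the
   default value of 30. */
int effective_work_factor(int workFactor)
{
    return workFactor == 0 ? 30 : workFactor;
}
```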

+-Allowable next actions: +- +-

+-      BZ2_bzCompress 
+-         if BZ_OK is returned
+-      no specific action needed in case of error
+-
+- +- +- +-

BZ2_bzCompress

+- +-
+-   int BZ2_bzCompress ( bz_stream *strm, int action );
+-
+- +-

+-Provides more input and/or output buffer space for the library. The +-caller maintains input and output buffers, and calls BZ2_bzCompress to +-transfer data between them. +- +-

+-

+-Before each call to BZ2_bzCompress, next_in should point at +-the data to be compressed, and avail_in should indicate how many +-bytes the library may read. BZ2_bzCompress updates next_in, +-avail_in and the total_in_lo32/total_in_hi32 pair to reflect the number of bytes it +-has read. +- +-

+-

+-Similarly, next_out should point to a buffer in which the +-compressed data is to be placed, with avail_out indicating how +-much output space is available. BZ2_bzCompress updates +-next_out, avail_out and the total_out_lo32/total_out_hi32 pair to reflect the +-number of bytes output. +- +-

+-

+-You may provide and remove as little or as much data as you like on each +-call of BZ2_bzCompress. In the limit, it is acceptable to supply and +-remove data one byte at a time, although this would be terribly +-inefficient. You should always ensure that at least one byte of output +-space is available at each call. +- +-

+-

+-A second purpose of BZ2_bzCompress is to request a change of mode of the +-compressed stream. +- +-

+-

+-Conceptually, a compressed stream can be in one of four states: IDLE, +-RUNNING, FLUSHING and FINISHING. Before initialisation +-(BZ2_bzCompressInit) and after termination (BZ2_bzCompressEnd), a +-stream is regarded as IDLE. +- +-

+-

+-Upon initialisation (BZ2_bzCompressInit), the stream is placed in the +-RUNNING state. Subsequent calls to BZ2_bzCompress should pass +-BZ_RUN as the requested action; other actions are illegal and +-will result in BZ_SEQUENCE_ERROR. +- +-

+-

+-At some point, the calling program will have provided all the input data +-it wants to. It will then want to finish up -- in effect, asking the +-library to process any data it might have buffered internally. In this +-state, BZ2_bzCompress will no longer attempt to read data from +-next_in, but it will want to write data to next_out. +-Because the output buffer supplied by the user can be arbitrarily small, +-the finishing-up operation cannot necessarily be done with a single call +-of BZ2_bzCompress. +- +-

+-

+-Instead, the calling program passes BZ_FINISH as an action to +-BZ2_bzCompress. This changes the stream's state to FINISHING. Any +-remaining input (ie, next_in[0 .. avail_in-1]) is compressed and +-transferred to the output buffer. To do this, BZ2_bzCompress must be +-called repeatedly until all the output has been consumed. At that +-point, BZ2_bzCompress returns BZ_STREAM_END, and the stream's +-state is set back to IDLE. BZ2_bzCompressEnd should then be +-called. +- +-

+-

+-Just to make sure the calling program does not cheat, the library makes +-a note of avail_in at the time of the first call to +-BZ2_bzCompress which has BZ_FINISH as an action (ie, at the +-time the program has announced its intention to not supply any more +-input). By comparing this value with that of avail_in over +-subsequent calls to BZ2_bzCompress, the library can detect any +-attempts to slip in more data to compress. Any calls for which this is +-detected will return BZ_SEQUENCE_ERROR. This indicates a +-programming mistake which should be corrected. +- +-

+-

+-Instead of asking to finish, the calling program may ask +-BZ2_bzCompress to take all the remaining input, compress it and +-terminate the current (Burrows-Wheeler) compression block. This could +-be useful for error control purposes. The mechanism is analogous to +-that for finishing: call BZ2_bzCompress with an action of +-BZ_FLUSH, remove output data, and persist with the +-BZ_FLUSH action until the value BZ_RUN_OK is returned. As +-with finishing, BZ2_bzCompress detects any attempt to provide more +-input data once the flush has begun. +- +-

+-

+-Once the flush is complete, the stream returns to the normal RUNNING +-state. +- +-

+-

+-This all sounds pretty complex, but isn't really. Here's a table +-which shows which actions are allowable in each state, what action +-will be taken, what the next state is, and what the non-error return +-values are. Note that you can't explicitly ask what state the +-stream is in, but nor do you need to -- it can be inferred from the +-values returned by BZ2_bzCompress. +- +-

+-IDLE/any           
+-      Illegal.  IDLE state only exists after BZ2_bzCompressEnd or
+-      before BZ2_bzCompressInit.
+-      Return value = BZ_SEQUENCE_ERROR
+-
+-RUNNING/BZ_RUN     
+-      Compress from next_in to next_out as much as possible.
+-      Next state = RUNNING
+-      Return value = BZ_RUN_OK
+-
+-RUNNING/BZ_FLUSH   
+-      Remember current value of avail_in.  Compress from next_in
+-      to next_out as much as possible, but do not accept any more input.  
+-      Next state = FLUSHING
+-      Return value = BZ_FLUSH_OK
+-
+-RUNNING/BZ_FINISH  
+-      Remember current value of avail_in.  Compress from next_in
+-      to next_out as much as possible, but do not accept any more input.
+-      Next state = FINISHING
+-      Return value = BZ_FINISH_OK
+-
+-FLUSHING/BZ_FLUSH  
+-      Compress from next_in to next_out as much as possible, 
+-      but do not accept any more input.  
+-      If all the existing input has been used up and all compressed
+-      output has been removed
+-         Next state = RUNNING; Return value = BZ_RUN_OK
+-      else
+-         Next state = FLUSHING; Return value = BZ_FLUSH_OK
+-
+-FLUSHING/other     
+-      Illegal.
+-      Return value = BZ_SEQUENCE_ERROR
+-
+-FINISHING/BZ_FINISH  
+-      Compress from next_in to next_out as much as possible,
+-      but do not accept any more input.  
+-      If all the existing input has been used up and all compressed
+-      output has been removed
+-         Next state = IDLE; Return value = BZ_STREAM_END
+-      else
+-         Next state = FINISHING; Return value = BZ_FINISH_OK
+-
+-FINISHING/other
+-      Illegal.
+-      Return value = BZ_SEQUENCE_ERROR
+-
+- +-
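The table can also be read as a small state machine. The sketch below models only the state transitions and non-error return values, following the table rather than the library source. The enums are local stand-ins invented for this example; real code would use the BZ_* constants from bzlib.h and would not track the state itself.

```c
/* Local stand-ins for the states, actions and return values named in the
   table. These enums are hypothetical and exist only for this sketch. */
typedef enum { ST_IDLE, ST_RUNNING, ST_FLUSHING, ST_FINISHING } sim_state;
typedef enum { ACT_RUN, ACT_FLUSH, ACT_FINISH } sim_action;
typedef enum { RET_RUN_OK, RET_FLUSH_OK, RET_FINISH_OK,
               RET_STREAM_END, RET_SEQUENCE_ERROR } sim_result;

/* One simulated call. drained is nonzero when all existing input has
   been used up and all compressed output has been removed. */
sim_result sim_step(sim_state *state, sim_action action, int drained)
{
    switch (*state) {
    case ST_RUNNING:
        if (action == ACT_RUN)    { return RET_RUN_OK; }
        if (action == ACT_FLUSH)  { *state = ST_FLUSHING;  return RET_FLUSH_OK; }
        if (action == ACT_FINISH) { *state = ST_FINISHING; return RET_FINISH_OK; }
        break;
    case ST_FLUSHING:
        if (action == ACT_FLUSH) {
            if (drained) { *state = ST_RUNNING; return RET_RUN_OK; }
            return RET_FLUSH_OK;            /* still flushing */
        }
        break;
    case ST_FINISHING:
        if (action == ACT_FINISH) {
            if (drained) { *state = ST_IDLE; return RET_STREAM_END; }
            return RET_FINISH_OK;           /* still finishing */
        }
        break;
    case ST_IDLE:
        break;                              /* every action is illegal */
    }
    return RET_SEQUENCE_ERROR;
}
```

Starting from ST_RUNNING, a run of ACT_RUN calls followed by ACT_FINISH calls until RET_STREAM_END comes back corresponds to the usual compression sequence.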

+-That still looks complicated? Well, fair enough. The usual sequence +-of calls for compressing a load of data is: +- +-

    +-
  • Get started with BZ2_bzCompressInit. +- +-
  • Shovel data in and shlurp out its compressed form using zero or more +- +-calls of BZ2_bzCompress with action = BZ_RUN. +-
  • Finish up. +- +-Repeatedly call BZ2_bzCompress with action = BZ_FINISH, +-copying out the compressed output, until BZ_STREAM_END is returned. +-
  • Close up and go home. Call BZ2_bzCompressEnd. +- +-
+- +-

+-If the data you want to compress fits into your input buffer all +-at once, you can skip the calls of BZ2_bzCompress ( ..., BZ_RUN ) and +-just do the BZ2_bzCompress ( ..., BZ_FINISH ) calls. +- +-

+-

+-All required memory is allocated by BZ2_bzCompressInit. The +-compression library can accept any data at all (obviously). So you +-shouldn't get any error return values from the BZ2_bzCompress calls. +-If you do, they will be BZ_SEQUENCE_ERROR, and indicate a bug in +-your programming. +- +-
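Since a BZ_SEQUENCE_ERROR points at a wrong call order, it can help to model the state table from earlier in this section as a small transition function and test a planned call sequence against it. The sketch below is only a model: every sk_ name is hypothetical, the real library keeps its state private, and the "drained" flag stands in for "all existing input used up and all compressed output removed".

```c
#include <stdbool.h>

/* Hypothetical names for this model only; they are not libbzip2 symbols. */
typedef enum { SK_IDLE, SK_RUNNING, SK_FLUSHING, SK_FINISHING } sk_state;
typedef enum { SK_RUN, SK_FLUSH, SK_FINISH } sk_action;
typedef enum {
    SK_SEQUENCE_ERROR = -1,
    SK_RUN_OK = 1,
    SK_FLUSH_OK,
    SK_FINISH_OK,
    SK_STREAM_END
} sk_result;

/* Applies one row of the table.  On an illegal state/action pair the
   state is left unchanged and SK_SEQUENCE_ERROR is returned. */
sk_result sk_step(sk_state *state, sk_action action, bool drained)
{
    switch (*state) {
    case SK_RUNNING:
        if (action == SK_RUN)
            return SK_RUN_OK;
        if (action == SK_FLUSH) {
            *state = SK_FLUSHING;
            return SK_FLUSH_OK;
        }
        *state = SK_FINISHING;
        return SK_FINISH_OK;

    case SK_FLUSHING:
        if (action != SK_FLUSH)
            return SK_SEQUENCE_ERROR;
        if (drained) {
            *state = SK_RUNNING;
            return SK_RUN_OK;
        }
        return SK_FLUSH_OK;

    case SK_FINISHING:
        if (action != SK_FINISH)
            return SK_SEQUENCE_ERROR;
        if (drained) {
            *state = SK_IDLE;
            return SK_STREAM_END;
        }
        return SK_FINISH_OK;

    case SK_IDLE:
    default:
        return SK_SEQUENCE_ERROR;   /* IDLE/any is illegal */
    }
}
```

The model follows the table row by row, so the RUNNING rows ignore the drained flag.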

+-

+-Trivial other possible return values: +- +-

+-      BZ_PARAM_ERROR   
+-         if strm is NULL, or strm->s is NULL
+-
+- +- +- +-

BZ2_bzCompressEnd

+- +-
+-int BZ2_bzCompressEnd ( bz_stream *strm );
+-
+- +-

+-Releases all memory associated with a compression stream. +- +-

+-

+-Possible return values: +- +-

+-   BZ_PARAM_ERROR    if strm is NULL or strm->s is NULL
+-   BZ_OK    otherwise
+-
+- +- +- +-

BZ2_bzDecompressInit

+- +-
+-int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small );
+-
+- +-

+-Prepares for decompression. As with BZ2_bzCompressInit, a +-bz_stream record should be allocated and initialised before the +-call. Fields bzalloc, bzfree and opaque should be +-set if a custom memory allocator is required, or made NULL for +-the normal malloc/free routines. Upon return, the internal +-state will have been initialised, and total_in and +-total_out will be zero. +- +-

+-

+-For the meaning of parameter verbosity, see BZ2_bzCompressInit. +- +-

+-

+-If small is nonzero, the library will use an alternative +-decompression algorithm which uses less memory but at the cost of +-decompressing more slowly (roughly speaking, half the speed, but the +-maximum memory requirement drops to around 2300k). See Chapter 2 for +-more information on memory management. +- +-

+-

+-Note that the amount of memory needed to decompress +-a stream cannot be determined until the stream's header has been read, +-so even if BZ2_bzDecompressInit succeeds, a subsequent +-BZ2_bzDecompress could fail with BZ_MEM_ERROR. +- +-

+-

+-Possible return values: +- +-

+-      BZ_CONFIG_ERROR
+-         if the library has been mis-compiled
+-      BZ_PARAM_ERROR
+-         if (small != 0 && small != 1)
+-         or (verbosity < 0 || verbosity > 4)
+-      BZ_MEM_ERROR
+-         if insufficient memory is available
+-
+- +-

+-Allowable next actions: +- +-

+-      BZ2_bzDecompress
+-         if BZ_OK was returned
+-      no specific action required in case of error
+-
+- +-

+- +- +-

+- +- +-

BZ2_bzDecompress

+- +-
+-int BZ2_bzDecompress ( bz_stream *strm );
+-
+- +-

+-Provides more input and/or output buffer space for the library. The +-caller maintains input and output buffers, and uses BZ2_bzDecompress +-to transfer data between them. +- +-

+-

+-Before each call to BZ2_bzDecompress, next_in +-should point at the compressed data, +-and avail_in should indicate how many bytes the library +-may read. BZ2_bzDecompress updates next_in, avail_in +-and total_in +-to reflect the number of bytes it has read. +- +-

+-

+-Similarly, next_out should point to a buffer in which the uncompressed +-output is to be placed, with avail_out indicating how much output space +-is available. BZ2_bzDecompress updates next_out, +-avail_out and total_out to reflect +-the number of bytes output. +- +-

+-

+-You may provide and remove as little or as much data as you like on +-each call of BZ2_bzDecompress. +-In the limit, it is acceptable to +-supply and remove data one byte at a time, although this would be +-terribly inefficient. You should always ensure that at least one +-byte of output space is available at each call. +- +-

+-

+-Use of BZ2_bzDecompress is simpler than BZ2_bzCompress. +- +-

+-

+-You should provide input and remove output as described above, and +-repeatedly call BZ2_bzDecompress until BZ_STREAM_END is +-returned. Appearance of BZ_STREAM_END denotes that +-BZ2_bzDecompress has detected the logical end of the compressed +-stream. BZ2_bzDecompress will not produce BZ_STREAM_END until +-all output data has been placed into the output buffer, so once +-BZ_STREAM_END appears, you are guaranteed to have available all +-the decompressed output, and BZ2_bzDecompressEnd can safely be +-called. +- +-

+-

+-In case of an error return value, you should call BZ2_bzDecompressEnd +-to clean up and release memory. +- +-

+-

+-Possible return values: +- +-

+-      BZ_PARAM_ERROR
+-         if strm is NULL or strm->s is NULL
+-         or strm->avail_out < 1
+-      BZ_DATA_ERROR
+-         if a data integrity error is detected in the compressed stream
+-      BZ_DATA_ERROR_MAGIC
+-         if the compressed stream doesn't begin with the right magic bytes
+-      BZ_MEM_ERROR
+-         if there wasn't enough memory available
+-      BZ_STREAM_END
+-         if the logical end of the data stream was detected and all
+-         output has been consumed, e.g. strm->avail_out > 0
+-      BZ_OK
+-         otherwise
+-
+- +-

+-Allowable next actions: +- +-

+-      BZ2_bzDecompress
+-         if BZ_OK was returned
+-      BZ2_bzDecompressEnd
+-         otherwise
+-
+- +- +- +-

BZ2_bzDecompressEnd

+- +-
+-int BZ2_bzDecompressEnd ( bz_stream *strm );
+-
+- +-

+-Releases all memory associated with a decompression stream. +- +-

+-

+-Possible return values: +- +-

+-      BZ_PARAM_ERROR
+-         if strm is NULL or strm->s is NULL
+-      BZ_OK
+-         otherwise
+-
+- +-

+-Allowable next actions: +- +-

+-      None.
+-
+- +- +- +-

High-level interface

+- +-

+-This interface provides functions for reading and writing +-bzip2 format files. First, some general points. +- +-

+- +-
    +-
  • All of the functions take an int* first argument, +- +- bzerror. +- After each call, bzerror should be consulted first to determine +- the outcome of the call. If bzerror is BZ_OK, +- the call completed +- successfully, and only then should the return value of the function +- (if any) be consulted. If bzerror is BZ_IO_ERROR, +- there was an error +- reading/writing the underlying compressed file, and you should +- then consult errno/perror to determine the +- cause of the difficulty. +- bzerror may also be set to various other values; precise details are +- given on a per-function basis below. +-
  • If bzerror indicates an error +- +- (ie, anything except BZ_OK and BZ_STREAM_END), +- you should immediately call BZ2_bzReadClose (or BZ2_bzWriteClose, +- depending on whether you are attempting to read or to write) +- to free up all resources associated +- with the stream. Once an error has been indicated, behaviour of all calls +- except BZ2_bzReadClose (BZ2_bzWriteClose) is undefined. +- The implication is that (1) bzerror should +- be checked after each call, and (2) if bzerror indicates an error, +- BZ2_bzReadClose (BZ2_bzWriteClose) should then be called to clean up. +-
  • The FILE* arguments passed to +- +- BZ2_bzReadOpen/BZ2_bzWriteOpen +- should be set to binary mode. +- Most Unix systems will do this by default, but other platforms, +- including Windows and Mac, will not. If you omit this, you may +- encounter problems when moving code to new platforms. +-
  • Memory allocation requests are handled by +- +- malloc/free. +- At present +- there is no facility for user-defined memory allocators in the file I/O +- functions (could easily be added, though). +-
+- +- +- +-

BZ2_bzReadOpen

+- +-
+-   typedef void BZFILE;
+-
+-   BZFILE *BZ2_bzReadOpen ( int *bzerror, FILE *f, 
+-                            int verbosity, int small,
+-                            void *unused, int nUnused );
+-
+- +-

+-Prepare to read compressed data from file handle f. f +-should refer to a file which has been opened for reading, and for which +-the error indicator (ferror(f)) is not set. If small is 1, +-the library will try to decompress using less memory, at the expense of +-speed. +- +-

+-

+-For reasons explained below, BZ2_bzRead will decompress the +-nUnused bytes starting at unused, before starting to read +-from the file f. At most BZ_MAX_UNUSED bytes may be +-supplied like this. If this facility is not required, you should pass +-NULL and 0 for unused and nUnused +-respectively. +- +-

+-

+-For the meaning of parameters small and verbosity, +-see BZ2_bzDecompressInit. +- +-

+-

+-The amount of memory needed to decompress a file cannot be determined +-until the file's header has been read. So it is possible that +-BZ2_bzReadOpen returns BZ_OK but a subsequent call of +-BZ2_bzRead will return BZ_MEM_ERROR. +- +-

+-

+-Possible assignments to bzerror: +- +-

+-      BZ_CONFIG_ERROR
+-         if the library has been mis-compiled
+-      BZ_PARAM_ERROR
+-         if f is NULL 
+-         or small is neither 0 nor 1                 
+-         or (unused == NULL && nUnused != 0)
+-         or (unused != NULL && !(0 <= nUnused <= BZ_MAX_UNUSED))
+-      BZ_IO_ERROR    
+-         if ferror(f) is nonzero
+-      BZ_MEM_ERROR   
+-         if insufficient memory is available
+-      BZ_OK
+-         otherwise.
+-
+- +-

+-Possible return values: +- +-

+-      Pointer to an abstract BZFILE        
+-         if bzerror is BZ_OK   
+-      NULL
+-         otherwise
+-
+- +-

+-Allowable next actions: +- +-

+-      BZ2_bzRead
+-         if bzerror is BZ_OK   
+-      BZ2_bzReadClose 
+-         otherwise
+-
+- +- +- +-

BZ2_bzRead

+- +-
+-   int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len );
+-
+- +-

+-Reads up to len (uncompressed) bytes from the compressed file +-b into +-the buffer buf. If the read was successful, +-bzerror is set to BZ_OK +-and the number of bytes read is returned. If the logical end-of-stream +-was detected, bzerror will be set to BZ_STREAM_END, +-and the number +-of bytes read is returned. All other bzerror values denote an error. +- +-

+-

+-BZ2_bzRead will supply len bytes, +-unless the logical stream end is detected +-or an error occurs. Because of this, it is possible to detect the +-stream end by observing when the number of bytes returned is +-less than the number +-requested. Nevertheless, this is regarded as inadvisable; you should +-instead check bzerror after every call and watch out for +-BZ_STREAM_END. +- +-

+-

+-Internally, BZ2_bzRead copies data from the compressed file in chunks +-of size BZ_MAX_UNUSED bytes +-before decompressing it. If the file contains more bytes than strictly +-needed to reach the logical end-of-stream, BZ2_bzRead will almost certainly +-read some of the trailing data before signalling BZ_STREAM_END. +-To collect the read but unused data once BZ_STREAM_END has +-appeared, call BZ2_bzReadGetUnused immediately before BZ2_bzReadClose. +- +-

+-

+-Possible assignments to bzerror: +- +-

+-      BZ_PARAM_ERROR
+-         if b is NULL or buf is NULL or len < 0
+-      BZ_SEQUENCE_ERROR 
+-         if b was opened with BZ2_bzWriteOpen
+-      BZ_IO_ERROR 
+-         if there is an error reading from the compressed file
+-      BZ_UNEXPECTED_EOF 
+-         if the compressed file ended before the logical end-of-stream was detected
+-      BZ_DATA_ERROR 
+-         if a data integrity error was detected in the compressed stream
+-      BZ_DATA_ERROR_MAGIC
+-         if the stream does not begin with the requisite header bytes (ie, is not 
+-         a bzip2 data file).  This is really a special case of BZ_DATA_ERROR.
+-      BZ_MEM_ERROR 
+-         if insufficient memory was available
+-      BZ_STREAM_END 
+-         if the logical end of stream was detected.
+-      BZ_OK
+-         otherwise.
+-
+- +-

+-Possible return values: +- +-

+-      number of bytes read
+-         if bzerror is BZ_OK or BZ_STREAM_END
+-      undefined
+-         otherwise
+-
+- +-

+-Allowable next actions: +- +-

+-      collect data from buf, then BZ2_bzRead or BZ2_bzReadClose
+-         if bzerror is BZ_OK 
+-      collect data from buf, then BZ2_bzReadClose or BZ2_bzReadGetUnused 
+-         if bzerror is BZ_STREAM_END   
+-      BZ2_bzReadClose 
+-         otherwise
+-
+- +- +- +-

BZ2_bzReadGetUnused

+- +-
+-   void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, 
+-                              void** unused, int* nUnused );
+-
+- +-

+-Returns data which was read from the compressed file but was not needed +-to get to the logical end-of-stream. *unused is set to the address +-of the data, and *nUnused to the number of bytes. *nUnused will +-be set to a value between 0 and BZ_MAX_UNUSED inclusive. +- +-

+-

+-This function may only be called once BZ2_bzRead has signalled +-BZ_STREAM_END but before BZ2_bzReadClose. +- +-

+-

+-Possible assignments to bzerror: +- +-

+-      BZ_PARAM_ERROR 
+-         if b is NULL 
+-         or unused is NULL or nUnused is NULL
+-      BZ_SEQUENCE_ERROR 
+-         if BZ_STREAM_END has not been signalled
+-         or if b was opened with BZ2_bzWriteOpen
+-      BZ_OK
+-         otherwise
+-
+- +-

+-Allowable next actions: +- +-

+-      BZ2_bzReadClose
+-
+- +- +- +-

BZ2_bzReadClose

+- +-
+-   void BZ2_bzReadClose ( int *bzerror, BZFILE *b );
+-
+- +-

+-Releases all memory pertaining to the compressed file b. +-BZ2_bzReadClose does not call fclose on the underlying file +-handle, so you should do that yourself if appropriate. +-BZ2_bzReadClose should be called to clean up after all error +-situations. +- +-

+-

+-Possible assignments to bzerror: +- +-

+-      BZ_SEQUENCE_ERROR 
+-         if b was opened with BZ2_bzWriteOpen 
+-      BZ_OK 
+-         otherwise
+-
+- +-

+-Allowable next actions: +- +-

+-      none
+-
+- +- +- +-

BZ2_bzWriteOpen

+- +-
+-   BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, 
+-                             int blockSize100k, int verbosity,
+-                             int workFactor );
+-
+- +-

+-Prepare to write compressed data to file handle f. +-f should refer to +-a file which has been opened for writing, and for which the error +-indicator (ferror(f)) is not set. +- +-

+-

+-For the meaning of parameters blockSize100k, +-verbosity and workFactor, see +-
BZ2_bzCompressInit. +- +-

+-

+-All required memory is allocated at this stage, so if the call +-completes successfully, BZ_MEM_ERROR cannot be signalled by a +-subsequent call to BZ2_bzWrite. +- +-

+-

+-Possible assignments to bzerror: +- +-

+-      BZ_CONFIG_ERROR
+-         if the library has been mis-compiled
+-      BZ_PARAM_ERROR 
+-         if f is NULL 
+-         or blockSize100k < 1 or blockSize100k > 9
+-      BZ_IO_ERROR 
+-         if ferror(f) is nonzero
+-      BZ_MEM_ERROR 
+-         if insufficient memory is available
+-      BZ_OK 
+-         otherwise
+-
+- +-

+-Possible return values: +- +-

+-      Pointer to an abstract BZFILE  
+-         if bzerror is BZ_OK   
+-      NULL 
+-         otherwise
+-
+- +-

+-Allowable next actions: +- +-

+-      BZ2_bzWrite 
+-         if bzerror is BZ_OK 
+-         (you could go directly to BZ2_bzWriteClose, but this would be pretty pointless)
+-      BZ2_bzWriteClose 
+-         otherwise
+-
+- +- +- +-

BZ2_bzWrite

+- +-
+-   void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len );
+-
+- +-

+-Absorbs len bytes from the buffer buf, eventually to be +-compressed and written to the file. +- +-

+-

+-Possible assignments to bzerror: +- +-

+-      BZ_PARAM_ERROR 
+-         if b is NULL or buf is NULL or len < 0
+-      BZ_SEQUENCE_ERROR 
+-         if b was opened with BZ2_bzReadOpen
+-      BZ_IO_ERROR 
+-         if there is an error writing the compressed file.
+-      BZ_OK 
+-         otherwise
+-
+- +- +- +-

BZ2_bzWriteClose

+- +-
+-   void BZ2_bzWriteClose ( int *bzerror, BZFILE* b,
+-                           int abandon,
+-                           unsigned int* nbytes_in,
+-                           unsigned int* nbytes_out );
+-
+-   void BZ2_bzWriteClose64 ( int *bzerror, BZFILE* b,
+-                             int abandon,
+-                             unsigned int* nbytes_in_lo32,
+-                             unsigned int* nbytes_in_hi32,
+-                             unsigned int* nbytes_out_lo32,
+-                             unsigned int* nbytes_out_hi32 );
+-
+- +-

+-Compresses and flushes to the compressed file all data so far supplied +-by BZ2_bzWrite. The logical end-of-stream markers are also written, so +-subsequent calls to BZ2_bzWrite are illegal. All memory associated +-with the compressed file b is released. +-fflush is called on the +-compressed file, but it is not fclose'd. +- +-

+-

+-If BZ2_bzWriteClose is called to clean up after an error, the only +-action is to release the memory. The library records the error codes +-issued by previous calls, so this situation will be detected +-automatically. There is no attempt to complete the compression +-operation, nor to fflush the compressed file. You can force this +-behaviour to happen even in the case of no error, by passing a nonzero +-value to abandon. +- +-

+-

+-If nbytes_in is non-null, *nbytes_in will be set to be the +-total volume of uncompressed data handled. Similarly, nbytes_out +-will be set to the total volume of compressed data written. For +-compatibility with older versions of the library, BZ2_bzWriteClose +-only yields the lower 32 bits of these counts. Use +-BZ2_bzWriteClose64 if you want the full 64 bit counts. These +-two functions are otherwise absolutely identical. +- +-
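The two 32-bit halves reported by BZ2_bzWriteClose64 can be joined into one 64-bit count. A minimal sketch, assuming a C99 compiler with stdint.h; the helper name sketch_join_count is hypothetical and not part of the library.

```c
#include <stdint.h>

/* Hypothetical helper: combine the lo32/hi32 halves of a byte count. */
uint64_t sketch_join_count(unsigned int lo32, unsigned int hi32)
{
    /* Mask each half so a platform with a wider unsigned int
       cannot leak stray high bits into the result. */
    uint64_t lo = (uint64_t)lo32 & 0xFFFFFFFFu;
    uint64_t hi = (uint64_t)hi32 & 0xFFFFFFFFu;
    return (hi << 32) | lo;
}
```

A caller would pass the values stored through nbytes_in_lo32 and nbytes_in_hi32 (or the corresponding nbytes_out pair) after the close has completed.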

+- +-

+-Possible assignments to bzerror: +- +-

+-      BZ_SEQUENCE_ERROR 
+-         if b was opened with BZ2_bzReadOpen
+-      BZ_IO_ERROR 
+-         if there is an error writing the compressed file
+-      BZ_OK 
+-         otherwise
+-
+- +- +- +-

Handling embedded compressed data streams

+- +-

+-The high-level library facilitates use of +-bzip2 data streams which form some part of a surrounding, larger +-data stream. +- +-

    +-
  • For writing, the library takes an open file handle, writes +- +-compressed data to it, fflushes it but does not fclose it. +-The calling application can write its own data before and after the +-compressed data stream, using that same file handle. +-
  • Reading is more complex, and the facilities are not as general +- +-as they could be since generality is hard to reconcile with efficiency. +-BZ2_bzRead reads from the compressed file in blocks of size +-BZ_MAX_UNUSED bytes, and in doing so probably will overshoot +-the logical end of compressed stream. +-To recover this data once decompression has +-ended, call BZ2_bzReadGetUnused after the last call of BZ2_bzRead +-(the one returning BZ_STREAM_END) but before calling +-BZ2_bzReadClose. +-
+- +-

+-This mechanism makes it easy to decompress multiple bzip2 +-streams placed end-to-end. At the end of one stream, when BZ2_bzRead +-returns BZ_STREAM_END, call BZ2_bzReadGetUnused to collect the +-unused data (copy it into your own buffer somewhere). +-That data forms the start of the next compressed stream. +-To start uncompressing that next stream, call BZ2_bzReadOpen again, +-feeding in the unused data via the unused/nUnused +-parameters. +-Keep doing this until BZ_STREAM_END return coincides with the +-physical end of file (feof(f)). In this situation +-BZ2_bzReadGetUnused +-will of course return no data. +- +-

+-

+-This should give some feel for how the high-level interface can be used. +-If you require extra flexibility, you'll have to bite the bullet and get +-to grips with the low-level interface. +- +-

+- +- +-

Standard file-reading/writing code

+-

+-Here's how you'd write data to a compressed file: +- +-

+-FILE*   f;
+-BZFILE* b;
+-int     nBuf;
+-char    buf[ /* whatever size you like */ ];
+-int     bzerror;
+-
+-f = fopen ( "myfile.bz2", "wb" );   /* binary mode */
+-if (!f) {
+-   /* handle error */
+-}
+-b = BZ2_bzWriteOpen ( &bzerror, f, 9, 0, 0 );   /* verbosity 0, default workFactor */
+-if (bzerror != BZ_OK) {
+-   BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL );
+-   /* handle error */
+-}
+-
+-while ( /* condition */ ) {
+-   /* get data to write into buf, and set nBuf appropriately */
+-   BZ2_bzWrite ( &bzerror, b, buf, nBuf );   /* returns void */
+-   if (bzerror == BZ_IO_ERROR) { 
+-      BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL );
+-      /* handle error */
+-   }
+-}
+-
+-BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL );
+-if (bzerror == BZ_IO_ERROR) {
+-   /* handle error */
+-}
+-
+- +-

+-And to read from a compressed file: +- +-

+-FILE*   f;
+-BZFILE* b;
+-int     nBuf;
+-char    buf[ /* whatever size you like */ ];
+-int     bzerror;
+-
+-f = fopen ( "myfile.bz2", "rb" );   /* binary mode */
+-if (!f) {
+-   /* handle error */
+-}
+-b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 );
+-if (bzerror != BZ_OK) {
+-   BZ2_bzReadClose ( &bzerror, b );
+-   /* handle error */
+-}
+-
+-bzerror = BZ_OK;
+-while (bzerror == BZ_OK && /* arbitrary other conditions */) {
+-   nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ );
+-   if (bzerror == BZ_OK || bzerror == BZ_STREAM_END) {
+-      /* do something with buf[0 .. nBuf-1]; the final read also returns data */
+-   }
+-}
+-if (bzerror != BZ_STREAM_END) {
+-   BZ2_bzReadClose ( &bzerror, b );
+-   /* handle error */
+-} else {
+-   BZ2_bzReadClose ( &bzerror, b );
+-}
+-
+- +- +- +-

Utility functions

+- +- +-

BZ2_bzBuffToBuffCompress

+- +-
+-   int BZ2_bzBuffToBuffCompress( char*         dest,
+-                                 unsigned int* destLen,
+-                                 char*         source,
+-                                 unsigned int  sourceLen,
+-                                 int           blockSize100k,
+-                                 int           verbosity,
+-                                 int           workFactor );
+-
+- +-

+-Attempts to compress the data in source[0 .. sourceLen-1] +-into the destination buffer, dest[0 .. *destLen-1]. +-If the destination buffer is big enough, *destLen is +-set to the size of the compressed data, and BZ_OK is +-returned. If the compressed data won't fit, *destLen +-is unchanged, and BZ_OUTBUFF_FULL is returned. +- +-

+-

+-Compression in this manner is a one-shot event, done with a single call +-to this function. The resulting compressed data is a complete +-bzip2 format data stream. There is no mechanism for making +-additional calls to provide extra input data. If you want that kind of +-mechanism, use the low-level interface. +- +-

+-

+-For the meaning of parameters blockSize100k, verbosity +-and workFactor,
see BZ2_bzCompressInit. +- +-

+-

+-To guarantee that the compressed data will fit in its buffer, allocate +-an output buffer of size 1% larger than the uncompressed data, plus +-six hundred extra bytes. +- +-
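The sizing rule above is simple arithmetic, sketched here as a helper. The name sketch_compress_bound is hypothetical; the 1% is rounded up to a whole byte, and a result of 0 is this sketch's own convention for "does not fit in an unsigned int".

```c
#include <limits.h>

/* Hypothetical helper: output capacity for BZ2_bzBuffToBuffCompress,
   i.e. the input size plus 1% (rounded up) plus 600 bytes.
   Returns 0 if the sum would overflow unsigned int. */
unsigned int sketch_compress_bound(unsigned int sourceLen)
{
    unsigned int onePercent = sourceLen / 100u + (sourceLen % 100u != 0u);
    unsigned int extra = onePercent + 600u;

    if (sourceLen > UINT_MAX - extra)
        return 0;
    return sourceLen + extra;
}
```

For a 1000-byte input this gives 1610 bytes, which would then be passed in through *destLen.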

+-

+-BZ2_bzBuffToBuffCompress will not write data at or +-beyond dest[*destLen], even in case of buffer overflow. +- +-

+-

+-Possible return values: +- +-

+-      BZ_CONFIG_ERROR
+-         if the library has been mis-compiled
+-      BZ_PARAM_ERROR 
+-         if dest is NULL or destLen is NULL
+-         or blockSize100k < 1 or blockSize100k > 9
+-         or verbosity < 0 or verbosity > 4 
+-         or workFactor < 0 or workFactor > 250
+-      BZ_MEM_ERROR
+-         if insufficient memory is available 
+-      BZ_OUTBUFF_FULL
+-         if the size of the compressed data exceeds *destLen
+-      BZ_OK 
+-         otherwise
+-
+- +- +- +-

BZ2_bzBuffToBuffDecompress

+- +-
+-   int BZ2_bzBuffToBuffDecompress ( char*         dest,
+-                                    unsigned int* destLen,
+-                                    char*         source,
+-                                    unsigned int  sourceLen,
+-                                    int           small,
+-                                    int           verbosity );
+-
+- +-

+-Attempts to decompress the data in source[0 .. sourceLen-1] +-into the destination buffer, dest[0 .. *destLen-1]. +-If the destination buffer is big enough, *destLen is +-set to the size of the uncompressed data, and BZ_OK is +-returned. If the uncompressed data won't fit, *destLen +-is unchanged, and BZ_OUTBUFF_FULL is returned. +- +-

+-

+-source is assumed to hold a complete bzip2 format +-data stream.
BZ2_bzBuffToBuffDecompress tries to decompress +-the entirety of the stream into the output buffer. +- +-

+-

+-For the meaning of parameters small and verbosity, +-see BZ2_bzDecompressInit. +- +-

+-

+-Because the compression ratio of the compressed data cannot be known in +-advance, there is no easy way to guarantee that the output buffer will +-be big enough. You may of course make arrangements in your code to +-record the size of the uncompressed data, but such a mechanism is beyond +-the scope of this library. +- +-

+-

+-BZ2_bzBuffToBuffDecompress will not write data at or +-beyond dest[*destLen], even in case of buffer overflow. +- +-

+-

+-Possible return values: +- +-

+-      BZ_CONFIG_ERROR
+-         if the library has been mis-compiled
+-      BZ_PARAM_ERROR 
+-         if dest is NULL or destLen is NULL
+-         or small != 0 && small != 1
+-         or verbosity < 0 or verbosity > 4 
+-      BZ_MEM_ERROR
+-         if insufficient memory is available 
+-      BZ_OUTBUFF_FULL
+-         if the size of the uncompressed data exceeds *destLen
+-      BZ_DATA_ERROR
+-         if a data integrity error was detected in the compressed data
+-      BZ_DATA_ERROR_MAGIC
+-         if the compressed data doesn't begin with the right magic bytes
+-      BZ_UNEXPECTED_EOF
+-         if the compressed data ends unexpectedly
+-      BZ_OK 
+-         otherwise
+-
+- +- +- +-

zlib compatibility functions

+-

+-Yoshioka Tsuneo has contributed some functions to +-give better zlib compatibility. These functions are +-BZ2_bzopen, BZ2_bzread, BZ2_bzwrite, BZ2_bzflush, +-BZ2_bzclose, +-BZ2_bzerror and BZ2_bzlibVersion. +-These functions are not (yet) officially part of +-the library. If they break, you get to keep all the pieces. +-Nevertheless, I think they work ok. +- +-

+-typedef void BZFILE;
+-
+-const char * BZ2_bzlibVersion ( void );
+-
+- +-

+-Returns a string indicating the library version. +- +-

+-BZFILE * BZ2_bzopen  ( const char *path, const char *mode );
+-BZFILE * BZ2_bzdopen ( int        fd,    const char *mode );
+-
+- +-

+-Opens a .bz2 file for reading or writing, using either its name +-or a pre-existing file descriptor. +-Analogous to fopen and fdopen. +- +-

+-int BZ2_bzread  ( BZFILE* b, void* buf, int len );
+-int BZ2_bzwrite ( BZFILE* b, void* buf, int len );
+-
+- +-

+-Reads/writes data from/to a previously opened BZFILE. +-Analogous to fread and fwrite. +- +-

+-int  BZ2_bzflush ( BZFILE* b );
+-void BZ2_bzclose ( BZFILE* b );
+-
+- +-

+-Flushes/closes a BZFILE. BZ2_bzflush doesn't actually do +-anything. Analogous to fflush and fclose. +- +-

+- +-
+-const char * BZ2_bzerror ( BZFILE *b, int *errnum );
+-
+- +-

+-Returns a string describing the most recent error status of +-b, and also sets *errnum to its numerical value. +- +-

+- +- +- +-

Using the library in a stdio-free environment

+- +- +- +-

Getting rid of stdio

+- +-

+-In a deeply embedded application, you might want to use just +-the memory-to-memory functions. You can do this conveniently +-by compiling the library with preprocessor symbol BZ_NO_STDIO +-defined. Doing this gives you a library containing only the following +-eight functions: +- +-

+-

+-BZ2_bzCompressInit, BZ2_bzCompress, BZ2_bzCompressEnd
+-BZ2_bzDecompressInit, BZ2_bzDecompress, BZ2_bzDecompressEnd
+-BZ2_bzBuffToBuffCompress, BZ2_bzBuffToBuffDecompress +- +-

+-

+-When compiled like this, all functions will ignore verbosity +-settings. +- +-

+- +- +-

Critical error handling

+-

+-libbzip2 contains a number of internal assertion checks which +-should, needless to say, never be activated. Nevertheless, if an +-assertion should fail, behaviour depends on whether or not the library +-was compiled with BZ_NO_STDIO set. +- +-

+-

+-For a normal compile, an assertion failure yields the message +- +-

+-   bzip2/libbzip2: internal error number N.
+-   This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000.
+-   Please report it to me at: jseward@acm.org.  If this happened
+-   when you were using some program which uses libbzip2 as a
+-   component, you should also report this bug to the author(s)
+-   of that program.  Please make an effort to report this bug;
+-   timely and accurate bug reports eventually lead to higher
+-   quality software.  Thanks.  Julian Seward, 21 March 2000.
+-
+- +-

+-where N is some error code number. exit(3) +-is then called. +- +-

+-

+-For a stdio-free library, assertion failures result +-in a call to a function declared as: +- +-

+-   extern void bz_internal_error ( int errcode );
+-
+- +-

+-The relevant code is passed as a parameter. You should supply +-such a function. +- +-
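One way to supply such a function in a stdio-free build is to record the code and jump back to a recovery point chosen by the application. This is only a sketch under assumptions: every sketch_ name is hypothetical, the two work functions stand in for real library calls, and jumping out of the library this way leaves any bz_stream involved invalid, so it must be abandoned afterwards.

```c
#include <setjmp.h>

static jmp_buf      sketch_recovery;     /* where to resume after a failure */
static volatile int sketch_last_error;   /* code passed in by the library */

/* The hook a BZ_NO_STDIO build calls when an internal assertion fails. */
void bz_internal_error(int errcode)
{
    sketch_last_error = errcode;
    longjmp(sketch_recovery, 1);         /* does not return to the library */
}

/* Hypothetical wrapper: runs work() and returns 0 if it completed,
   or the internal error code if bz_internal_error was called.
   Not reentrant: only one guarded call may be active at a time. */
int sketch_run_guarded(void (*work)(void))
{
    sketch_last_error = 0;
    if (setjmp(sketch_recovery) != 0)
        return sketch_last_error;
    work();
    return 0;
}

/* Stand-ins for application code that calls into the library. */
static void sketch_ok_work(void) { }
static void sketch_failing_work(void) { bz_internal_error(1007); }
```

An even simpler hook could just log the code and halt the device; which is appropriate depends on the embedded environment.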

+-

+-In either case, once an assertion failure has occurred, any +-bz_stream records involved can be regarded as invalid. +-You should not attempt to resume normal operation with them. +- +-

+-

+-You may, of course, change critical error handling to suit +-your needs. As I said above, critical errors indicate bugs +-in the library and should not occur. All "normal" error +-situations are indicated via error return codes from functions, +-and can be recovered from. +- +-

+- +- +- +-

Making a Windows DLL

+-

+-Everything related to Windows has been contributed by Yoshioka Tsuneo +-
(QWF00133@niftyserve.or.jp / +-tsuneo-y@is.aist-nara.ac.jp), so you should send your queries to +-him (but perhaps Cc: me, jseward@acm.org). +- +-

+-

+-My vague understanding of what to do is: using Visual C++ 5.0, +-open the project file libbz2.dsp, and build. That's all. +- +-

+-

+-If you can't +-open the project file for some reason, make a new one, naming these files: +-blocksort.c, bzlib.c, compress.c, +-crctable.c, decompress.c, huffman.c,
+-randtable.c and libbz2.def. You will also need +-to name the header files bzlib.h and bzlib_private.h. +- +-

+-

+-If you don't use VC++, you may need to define the preprocessor symbol +-_WIN32. +- +-

+-

+-Finally, dlltest.c is a sample program using the DLL. It has a +-project file, dlltest.dsp. +- +-

+-

+-If you just want a makefile for Visual C, have a look at +-makefile.msc. +- +-

+-

+-Be aware that if you compile bzip2 itself on Win32, you must set +-BZ_UNIX to 0 and BZ_LCCWIN32 to 1, in the file +-bzip2.c, before compiling. Otherwise the resulting binary won't +-work correctly. +- +-

+-

+-I haven't tried any of this stuff myself, but it all looks plausible. +- +-

+- +-


+-

Go to the first, previous, next, last section, table of contents. +- +- +diff -Nru bzip2-1.0.1/manual_4.html bzip2-1.0.1.new/manual_4.html +--- bzip2-1.0.1/manual_4.html Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/manual_4.html Thu Jan 1 01:00:00 1970 +@@ -1,528 +0,0 @@ +- +- +- +- +-bzip2 and libbzip2 - Miscellanea +- +- +- +- +- +-

Go to the first, previous, next, last section, table of contents. +-


+- +- +-

Miscellanea

+- +-

+-These are just some random thoughts of mine. Your mileage may +-vary. +- +-

+- +- +-

Limitations of the compressed file format

+-

+-bzip2-1.0, 0.9.5 and 0.9.0 +-use exactly the same file format as the previous +-version, bzip2-0.1. This decision was made in the interests of +-stability. Creating yet another incompatible compressed file format +-would create further confusion and disruption for users. +- +-

+-

+-Nevertheless, this is not a painless decision. Development +-work since the release of bzip2-0.1 in August 1997 +-has shown complexities in the file format which slow down +-decompression and, in retrospect, are unnecessary. These are: +- +-

    +-
  • The run-length encoder, which is the first of the +- +- compression transformations, is entirely irrelevant. +- The original purpose was to protect the sorting algorithm +- from the very worst case input: a string of repeated +- symbols. But algorithm steps Q6a and Q6b in the original +- Burrows-Wheeler technical report (SRC-124) show how +- repeats can be handled without difficulty in block +- sorting. +-
  • The randomisation mechanism doesn't really need to be +- +- there. Udi Manber and Gene Myers published a suffix +- array construction algorithm a few years back, which +- can be employed to sort any block, no matter how +- repetitive, in O(N log N) time. Subsequent work by +- Kunihiko Sadakane has produced a derivative O(N (log N)^2) +- algorithm which usually outperforms the Manber-Myers +- algorithm. +- +- I could have changed to Sadakane's algorithm, but I find +- it to be slower than bzip2's existing algorithm for +- most inputs, and the randomisation mechanism protects +- adequately against bad cases. I didn't think it was +- a good tradeoff to make. Partly this is due to the fact +- that I was not flooded with email complaints about +- bzip2-0.1's performance on repetitive data, so +- perhaps it isn't a problem for real inputs. +- +- Probably the best long-term solution, +- and the one I have incorporated into 0.9.5 and above, +- is to use the existing sorting +- algorithm initially, and fall back to an O(N (log N)^2) +- algorithm if the standard algorithm gets into difficulties. +-
  • The compressed file format was never designed to be +- +- handled by a library, and I have had to jump through +- some hoops to produce an efficient implementation of +- decompression. It's a bit hairy. Try passing +- decompress.c through the C preprocessor +- and you'll see what I mean. Much of this complexity +- could have been avoided if the compressed size of +- each block of data was recorded in the data stream. +-
  • An Adler-32 checksum, rather than a CRC32 checksum, +- +- would be faster to compute. +-
+- +-
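To make the last point in the list above concrete, here is a minimal sketch of Adler-32 as specified in RFC 1950 (the zlib format). It is not part of bzip2, which uses CRC32, and the name adler32_sketch is made up for illustration. The checksum is just two running sums modulo 65521, so the per-byte work is a couple of additions; production implementations such as zlib's also defer the modulo across many bytes, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adler-32 per RFC 1950: a is 1 plus the sum of all bytes, b is the sum
   of every intermediate value of a, both taken modulo 65521.
   adler32_sketch is a hypothetical name, not a bzip2 or zlib function. */
uint32_t adler32_sketch(const unsigned char *buf, size_t len)
{
    const uint32_t mod = 65521u;   /* largest prime below 2^16 */
    uint32_t a = 1, b = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        a = (a + buf[i]) % mod;
        b = (b + a) % mod;
    }
    return (b << 16) | a;          /* b in the high half, a in the low half */
}
```

An empty input leaves a at 1 and b at 0, so the checksum of zero bytes is 1.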

+-It would be fair to say that the bzip2 format was frozen +-before I properly and fully understood the performance +-consequences of doing so. +- +-

+-

+-Improvements which I was able to incorporate into +-0.9.0, despite using the same file format, are: +- +-

    +-
  • Single array implementation of the inverse BWT. This +- +- significantly speeds up decompression, presumably +- because it reduces the number of cache misses. +-
  • Faster inverse MTF transform for large MTF values. The +- +- new implementation is based on the notion of sliding blocks +- of values. +-
  • bzip2-0.9.0 now reads and writes files with fread +- +- and fwrite; version 0.1 used putc and getc. +- Duh! Well, you live and learn. +- +-
+- +-
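The last item in the list above, block I/O instead of per-character I/O, can be sketched as follows. This is not code from bzip2.c; copy_blocks and the 4096-byte buffer size are assumptions chosen for illustration. The point is only that one fread or fwrite call moves a whole buffer, where getc and putc cost one call per byte.

```c
#include <assert.h>
#include <stdio.h>

/* Copy everything from src to dst in blocks.
   Returns the number of bytes copied, or -1 on a read or write error.
   copy_blocks is a hypothetical helper, not part of bzip2. */
long copy_blocks(FILE *src, FILE *dst)
{
    unsigned char buf[4096];       /* block size is an arbitrary choice */
    long total = 0;
    size_t n;

    assert(src != NULL && dst != NULL);
    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n)
            return -1;             /* short write */
        total += (long)n;
    }
    return ferror(src) ? -1 : total;
}
```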

+-Further ahead, it would be nice +-to be able to do random access into files. This will +-require some careful design of compressed file formats. +- +-

+- +- +- +-

Portability issues

+-

+-After some consideration, I have decided not to use +-GNU autoconf to configure 0.9.5 or 1.0. +- +-

+-

+-autoconf, admirable and wonderful though it is, +-mainly assists with portability problems between Unix-like +-platforms. But bzip2 doesn't have much in the way +-of portability problems on Unix; most of the difficulties appear +-when porting to the Mac, or to Microsoft's operating systems. +-autoconf doesn't help in those cases, and brings in a +-whole load of new complexity. +- +-

+-

+-Most people should be able to compile the library and program +-under Unix straight out-of-the-box, so to speak, especially +-if you have a version of GNU C available. +- +-

+-

+-There are a couple of __inline__ directives in the code. GNU C +-(gcc) should be able to handle them. If you're not using +-GNU C, your C compiler shouldn't see them at all. +-If your compiler does, for some reason, see them and doesn't +-like them, just #define __inline__ to be /* */. One +-easy way to do this is to compile with the flag -D__inline__=, +-which should be understood by most Unix compilers. +- +-
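As a sketch of the workaround just described, the fragment below defines __inline__ away when the compiler is not GNU C, which has the same effect as passing -D__inline__= on the command line. The __GNUC__ test and the helper next_index are assumptions made for this example; they are not taken from the bzip2 sources.

```c
#include <assert.h>

/* On compilers that do not define __GNUC__, make __inline__ expand to
   nothing. This mirrors -D__inline__= and is only an illustration. */
#ifndef __GNUC__
#define __inline__ /* */
#endif

/* Hypothetical helper marked __inline__, standing in for the directives
   mentioned in the text: advance an index around a buffer of n slots. */
static __inline__ unsigned next_index(unsigned i, unsigned n)
{
    assert(n > 0);
    return (i + 1) % n;
}
```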

+-

+-If you still have difficulties, try compiling with the macro +-BZ_STRICT_ANSI defined. This should enable you to build the +-library in a strictly ANSI compliant environment. Building the program +-itself like this is dangerous and not supported, since you remove +-bzip2's checks against compressing directories, symbolic links, +-devices, and other not-really-a-file entities. This could cause +-filesystem corruption! +- +-

+-

+-One other thing: if you create a bzip2 binary for public +-distribution, please try and link it statically (gcc -static). This +-avoids all sorts of library-version issues that others may encounter +-later on. +- +-

+-

+-If you build bzip2 on Win32, you must set BZ_UNIX to 0 and +-BZ_LCCWIN32 to 1, in the file bzip2.c, before compiling. +-Otherwise the resulting binary won't work correctly. +- +-

+- +- +- +-

Reporting bugs

+-

+-I tried pretty hard to make sure bzip2 is +-bug free, both by design and by testing. Hopefully +-you'll never need to read this section for real. +- +-

+-

+-Nevertheless, if bzip2 dies with a segmentation +-fault, a bus error or an internal assertion failure, it +-will ask you to email me a bug report. Experience with +-version 0.1 shows that almost all these problems can +-be traced to either compiler bugs or hardware problems. +- +-

    +-
  • +- +-Recompile the program with no optimisation, and see if it +-works. And/or try a different compiler. +-I heard all sorts of stories about various flavours +-of GNU C (and other compilers) generating bad code for +-bzip2, and I've run across two such examples myself. +- +-2.7.X versions of GNU C are known to generate bad code from +-time to time, at high optimisation levels. +-If you get problems, try using the flags +--O2 -fomit-frame-pointer -fno-strength-reduce. +-You should specifically not use -funroll-loops. +- +-You may notice that the Makefile runs six tests as part of +-the build process. If the program passes all of these, it's +-a pretty good (but not 100%) indication that the compiler has +-done its job correctly. +-
  • +- +-If bzip2 crashes randomly, and the crashes are not +-repeatable, you may have a flaky memory subsystem. bzip2 +-really hammers your memory hierarchy, and if it's a bit marginal, +-you may get these problems. Ditto if your disk or I/O subsystem +-is slowly failing. Yup, this really does happen. +- +-Try using a different machine of the same type, and see if +-you can repeat the problem. +-
  • This isn't really a bug, but ... If bzip2 tells +- +-you your file is corrupted on decompression, and you +-obtained the file via FTP, there is a possibility that you +-forgot to tell FTP to do a binary mode transfer. That absolutely +-will cause the file to be non-decompressible. You'll have to transfer +-it again. +-
+- +-

+-If you've incorporated libbzip2 into your own program +-and are getting problems, please, please, please, check that the +-parameters you are passing in calls to the library, are +-correct, and in accordance with what the documentation says +-is allowable. I have tried to make the library robust against +-such problems, but I'm sure I haven't succeeded. +- +-

+-

+-Finally, if the above comments don't help, you'll have to send +-me a bug report. Now, it's just amazing how many people will +-send me a bug report saying something like +- +-

+-   bzip2 crashed with segmentation fault on my machine
+-
+- +-

+-and absolutely nothing else. Needless to say, such a report +-is totally, utterly, completely and comprehensively 100% useless; +-a waste of your time, my time, and net bandwidth. +-With no details at all, there's no way I can possibly begin +-to figure out what the problem is. +- +-

+-

+-The rules of the game are: facts, facts, facts. Don't omit +-them because "oh, they won't be relevant". At the bare +-minimum: +- +-

+-   Machine type.  Operating system version.  
+-   Exact version of bzip2 (do bzip2 -V).  
+-   Exact version of the compiler used.  
+-   Flags passed to the compiler.
+-
+- +-

+-However, the most important single thing that will help me is +-the file that you were trying to compress or decompress at the +-time the problem happened. Without that, my ability to do anything +-more than speculate about the cause, is limited. +- +-

+-

+-Please remember that I connect to the Internet with a modem, so +-you should contact me before mailing me huge files. +- +-

+- +- +- +-

Did you get the right package?

+- +-

+-bzip2 is a resource hog. It soaks up large amounts of CPU cycles +-and memory. Also, it gives very large latencies. In the worst case, you +-can feed many megabytes of uncompressed data into the library before +-getting any compressed output, so this probably rules out applications +-requiring interactive behaviour. +- +-

+-

+-These aren't faults of my implementation, I hope, but more +-an intrinsic property of the Burrows-Wheeler transform (unfortunately). +-Maybe this isn't what you want. +- +-

+-

+-If you want a compressor and/or library which is faster, uses less +-memory but gets pretty good compression, and has minimal latency, +-consider Jean-loup +-Gailly's and Mark Adler's work, zlib-1.1.2 and +-gzip-1.2.4. Look for them at +- +-

+-

+-http://www.cdrom.com/pub/infozip/zlib and +-http://www.gzip.org respectively. +- +-

+-

+-For something faster and lighter still, you might try Markus F X J +-Oberhumer's LZO real-time compression/decompression library, at +-
http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html. +- +-

+-

+-If you want to use the bzip2 algorithms to compress small blocks +-of data, 64k bytes or smaller, for example on an on-the-fly disk +-compressor, you'd be well advised not to use this library. Instead, +-I've made a special library tuned for that kind of use. It's part of +-e2compr-0.40, an on-the-fly disk compressor for the Linux +-ext2 filesystem. Look at +-http://www.netspace.net.au/~reiter/e2compr. +- +-

+- +- +- +-

Testing

+- +-

+-A record of the tests I've done. +- +-

+-

+-First, some data sets: +- +-

    +-
  • B: a directory containing 6001 files, one for every length in the +- +- range 0 to 6000 bytes. The files contain random lowercase +- letters. 18.7 megabytes. +-
  • H: my home directory tree. Documents, source code, mail files, +- +- compressed data. H contains B, and also a directory of +- files designed as boundary cases for the sorting; mostly very +- repetitive, nasty files. 565 megabytes. +-
  • A: directory tree holding various applications built from source: +- +- egcs, gcc-2.8.1, KDE, GTK, Octave, etc. +- 2200 megabytes. +-
+- +-

+-The tests conducted are as follows. Each test means compressing +-(a copy of) each file in the data set, decompressing it and +-comparing it against the original. +- +-

+-

+-First, a bunch of tests with block sizes and internal buffer +-sizes set very small, +-to detect any problems with the +-blocking and buffering mechanisms. +-This required modifying the source code so as to try to +-break it. +- +-

    +-
  1. Data set H, with +- +- buffer size of 1 byte, and block size of 23 bytes. +-
  2. Data set B, buffer sizes 1 byte, block size 1 byte. +- +-
  3. As (2) but small-mode decompression. +- +-
  4. As (2) with block size 2 bytes. +- +-
  5. As (2) with block size 3 bytes. +- +-
  6. As (2) with block size 4 bytes. +- +-
  7. As (2) with block size 5 bytes. +- +-
  8. As (2) with block size 6 bytes and small-mode decompression. +- +-
  9. H with buffer size of 1 byte, but normal block +- +- size (up to 900000 bytes). +-
+- +-

+-Then some tests with unmodified source code. +- +-

    +-
  1. H, all settings normal. +- +-
  2. As (1), with small-mode decompress. +- +-
  3. H, compress with flag -1. +- +-
  4. H, compress with flag -s, decompress with flag -s. +- +-
  5. Forwards compatibility: H, bzip2-0.1pl2 compressing, +- +- bzip2-0.9.5 decompressing, all settings normal. +-
  6. Backwards compatibility: H, bzip2-0.9.5 compressing, +- +- bzip2-0.1pl2 decompressing, all settings normal. +-
  7. Bigger tests: A, all settings normal. +- +-
  8. As (7), using the fallback (Sadakane-like) sorting algorithm. +- +-
  9. As (8), compress with flag -1, decompress with flag +- +- -s. +-
  10. H, using the fallback sorting algorithm. +- +-
  11. Forwards compatibility: A, bzip2-0.1pl2 compressing, +- +- bzip2-0.9.5 decompressing, all settings normal. +-
  12. Backwards compatibility: A, bzip2-0.9.5 compressing, +- +- bzip2-0.1pl2 decompressing, all settings normal. +-
  13. Misc test: about 400 megabytes of .tar files with +- +- bzip2 compiled with Checker (a memory access error +- detector, like Purify). +-
  14. Misc tests to make sure it builds and runs ok on non-Linux/x86 +- +- platforms. +-
+- +-

+-These tests were conducted on a 225 MHz IDT WinChip machine, running +-Linux 2.0.36. They represent nearly a week of continuous computation. +-All tests completed successfully. +- +-

+- +- +- +-

Further reading

+-

+-bzip2 is not research work, in the sense that it doesn't present +-any new ideas. Rather, it's an engineering exercise based on existing +-ideas. +- +-

+-

+-Four documents describe essentially all the ideas behind bzip2: +- +-

+-Michael Burrows and D. J. Wheeler:
+-  "A block-sorting lossless data compression algorithm"
+-   10th May 1994. 
+-   Digital SRC Research Report 124.
+-   ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz
+-   If you have trouble finding it, try searching at the
+-   New Zealand Digital Library, http://www.nzdl.org.
+-
+-Daniel S. Hirschberg and Debra A. LeLewer
+-  "Efficient Decoding of Prefix Codes"
+-   Communications of the ACM, April 1990, Vol 33, Number 4.
+-   You might be able to get an electronic copy of this
+-      from the ACM Digital Library.
+-
+-David J. Wheeler
+-   Program bred3.c and accompanying document bred3.ps.
+-   This contains the idea behind the multi-table Huffman
+-   coding scheme.
+-   ftp://ftp.cl.cam.ac.uk/users/djw3/
+-
+-Jon L. Bentley and Robert Sedgewick
+-  "Fast Algorithms for Sorting and Searching Strings"
+-   Available from Sedgewick's web page,
+-   www.cs.princeton.edu/~rs
+-
+- +-

+-The following paper gives valuable additional insights into the +-algorithm, but is not immediately the basis of any code +-used in bzip2. +- +-

+-Peter Fenwick:
+-   Block Sorting Text Compression
+-   Proceedings of the 19th Australasian Computer Science Conference,
+-     Melbourne, Australia.  Jan 31 - Feb 2, 1996.
+-   ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps
+-
+- +-

+-Kunihiko Sadakane's sorting algorithm, mentioned above, +-is available from: +- +-

+-http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz
+-
+- +-

+-The Manber-Myers suffix array construction +-algorithm is described in a paper +-available from: +- +-

+-http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps
+-
+- +-

+-Finally, the following paper documents some recent investigations +-I made into the performance of sorting algorithms: +- +-

+-Julian Seward:
+-   On the Performance of BWT Sorting Algorithms
+-   Proceedings of the IEEE Data Compression Conference 2000
+-     Snowbird, Utah.  28-30 March 2000.
+-
+- +-


+-

Go to the first, previous, next, last section, table of contents. +- +- +diff -Nru bzip2-1.0.1/manual_toc.html bzip2-1.0.1.new/manual_toc.html +--- bzip2-1.0.1/manual_toc.html Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/manual_toc.html Thu Jan 1 01:00:00 1970 +@@ -1,173 +0,0 @@ +- +- +- +- +-bzip2 and libbzip2 - Table of Contents +- +- +- +-

bzip2 and libbzip2

+-

a program and library for data compression

+-

copyright (C) 1996-2000 Julian Seward

+-

version 1.0 of 21 March 2000

+-
Julian Seward
+-

+-


+- +-

+-This program, bzip2, +-and associated library libbzip2, are +-Copyright (C) 1996-2000 Julian R Seward. All rights reserved. +- +-

+-

+-Redistribution and use in source and binary forms, with or without +-modification, are permitted provided that the following conditions +-are met: +- +-

    +-
  • +- +- Redistributions of source code must retain the above copyright +- notice, this list of conditions and the following disclaimer. +-
  • +- +- The origin of this software must not be misrepresented; you must +- not claim that you wrote the original software. If you use this +- software in a product, an acknowledgment in the product +- documentation would be appreciated but is not required. +-
  • +- +- Altered source versions must be plainly marked as such, and must +- not be misrepresented as being the original software. +-
  • +- +- The name of the author may not be used to endorse or promote +- products derived from this software without specific prior written +- permission. +-
+- +-

+-THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS +-OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +-WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +-ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +-DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +-DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +-GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +-INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +-WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +-NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +-SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +- +-

+-

+-Julian Seward, Cambridge, UK. +- +-

+-

+-jseward@acm.org +- +-

+-

+-http://sourceware.cygnus.com/bzip2 +- +-

+-

+-http://www.cacheprof.org +- +-

+-

+-http://www.muraroa.demon.co.uk +- +-

+-

+-bzip2/libbzip2 version 1.0 of 21 March 2000. +- +-

+-

+-PATENTS: To the best of my knowledge, bzip2 does not use any patented +-algorithms. However, I do not have the resources available to carry out +-a full patent search. Therefore I cannot give any guarantee of the +-above statement. +- +-

+- +- +-


+-This document was generated on 23 March 2000 using the +-texi2html +-translator version 1.51a.

+- +- +diff -Nru bzip2-1.0.1/randtable.c bzip2-1.0.1.new/randtable.c +--- bzip2-1.0.1/randtable.c Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/randtable.c Sat Jun 24 20:13:06 2000 +@@ -58,6 +58,10 @@ + For more information on these sources, see the manual. + --*/ + ++#ifdef HAVE_CONFIG_H ++#include ++#endif ++ + + #include "bzlib_private.h" + +diff -Nru bzip2-1.0.1/spewG.c bzip2-1.0.1.new/spewG.c +--- bzip2-1.0.1/spewG.c Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/spewG.c Sat Jun 24 20:13:06 2000 +@@ -9,7 +9,10 @@ + (but is otherwise harmless). + */ + +-#define _FILE_OFFSET_BITS 64 ++#ifdef HAVE_CONFIG_H ++#include ++#endif ++ + + #include + #include +diff -Nru bzip2-1.0.1/stamp-h.in bzip2-1.0.1.new/stamp-h.in +--- bzip2-1.0.1/stamp-h.in Thu Jan 1 01:00:00 1970 ++++ bzip2-1.0.1.new/stamp-h.in Sat Jun 24 20:13:06 2000 +@@ -0,0 +1 @@ ++timestamp +diff -Nru bzip2-1.0.1/unzcrash.c bzip2-1.0.1.new/unzcrash.c +--- bzip2-1.0.1/unzcrash.c Sat Jun 24 20:13:27 2000 ++++ bzip2-1.0.1.new/unzcrash.c Sat Jun 24 20:13:06 2000 +@@ -13,6 +13,12 @@ + many hours. + */ + ++#ifdef HAVE_CONFIG_H ++#include ++#endif ++ ++ ++ + #include + #include + #include "bzlib.h"