diff -Nru bzip2-1.0.1/AUTHORS bzip2-1.0.1.new/AUTHORS --- bzip2-1.0.1/AUTHORS Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/AUTHORS Sat Jun 24 20:13:05 2000 @@ -0,0 +1 @@ +Julian Seward diff -Nru bzip2-1.0.1/CHANGES bzip2-1.0.1.new/CHANGES --- bzip2-1.0.1/CHANGES Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/CHANGES Thu Jan 1 01:00:00 1970 @@ -1,167 +0,0 @@ - - -0.9.0 -~~~~~ -First version. - - -0.9.0a -~~~~~~ -Removed 'ranlib' from Makefile, since most modern Unix-es -don't need it, or even know about it. - - -0.9.0b -~~~~~~ -Fixed a problem with error reporting in bzip2.c. This does not effect -the library in any way. Problem is: versions 0.9.0 and 0.9.0a (of the -program proper) compress and decompress correctly, but give misleading -error messages (internal panics) when an I/O error occurs, instead of -reporting the problem correctly. This shouldn't give any data loss -(as far as I can see), but is confusing. - -Made the inline declarations disappear for non-GCC compilers. - - -0.9.0c -~~~~~~ -Fixed some problems in the library pertaining to some boundary cases. -This makes the library behave more correctly in those situations. The -fixes apply only to features (calls and parameters) not used by -bzip2.c, so the non-fixedness of them in previous versions has no -effect on reliability of bzip2.c. - -In bzlib.c: - * made zero-length BZ_FLUSH work correctly in bzCompress(). - * fixed bzWrite/bzRead to ignore zero-length requests. - * fixed bzread to correctly handle read requests after EOF. - * wrong parameter order in call to bzDecompressInit in - bzBuffToBuffDecompress. Fixed. - -In compress.c: - * changed setting of nGroups in sendMTFValues() so as to - do a bit better on small files. This _does_ effect - bzip2.c. - - -0.9.5a -~~~~~~ -Major change: add a fallback sorting algorithm (blocksort.c) -to give reasonable behaviour even for very repetitive inputs. -Nuked --repetitive-best and --repetitive-fast since they are -no longer useful. 
- -Minor changes: mostly a whole bunch of small changes/ -bugfixes in the driver (bzip2.c). Changes pertaining to the -user interface are: - - allow decompression of symlink'd files to stdout - decompress/test files even without .bz2 extension - give more accurate error messages for I/O errors - when compressing/decompressing to stdout, don't catch control-C - read flags from BZIP2 and BZIP environment variables - decline to break hard links to a file unless forced with -f - allow -c flag even with no filenames - preserve file ownerships as far as possible - make -s -1 give the expected block size (100k) - add a flag -q --quiet to suppress nonessential warnings - stop decoding flags after --, so files beginning in - can be handled - resolved inconsistent naming: bzcat or bz2cat ? - bzip2 --help now returns 0 - -Programming-level changes are: - - fixed syntax error in GET_LL4 for Borland C++ 5.02 - let bzBuffToBuffDecompress return BZ_DATA_ERROR{_MAGIC} - fix overshoot of mode-string end in bzopen_or_bzdopen - wrapped bzlib.h in #ifdef __cplusplus ... extern "C" { ... } - close file handles under all error conditions - added minor mods so it compiles with DJGPP out of the box - fixed Makefile so it doesn't give problems with BSD make - fix uninitialised memory reads in dlltest.c - -0.9.5b -~~~~~~ -Open stdin/stdout in binary mode for DJGPP. - -0.9.5c -~~~~~~ -Changed BZ_N_OVERSHOOT to be ... + 2 instead of ... + 1. The + 1 -version could cause the sorted order to be wrong in some extremely -obscure cases. Also changed setting of quadrant in blocksort.c. - -0.9.5d -~~~~~~ -The only functional change is to make bzlibVersion() in the library -return the correct string. This has no effect whatsoever on the -functioning of the bzip2 program or library. Added a couple of casts -so the library compiles without warnings at level 3 in MS Visual -Studio 6.0. Included a Y2K statement in the file Y2K_INFO. All other -changes are minor documentation changes. 
- -1.0 -~~~ -Several minor bugfixes and enhancements: - -* Large file support. The library uses 64-bit counters to - count the volume of data passing through it. bzip2.c - is now compiled with -D_FILE_OFFSET_BITS=64 to get large - file support from the C library. -v correctly prints out - file sizes greater than 4 gigabytes. All these changes have - been made without assuming a 64-bit platform or a C compiler - which supports 64-bit ints, so, except for the C library - aspect, they are fully portable. - -* Decompression robustness. The library/program should be - robust to any corruption of compressed data, detecting and - handling _all_ corruption, instead of merely relying on - the CRCs. What this means is that the program should - never crash, given corrupted data, and the library should - always return BZ_DATA_ERROR. - -* Fixed an obscure race-condition bug only ever observed on - Solaris, in which, if you were very unlucky and issued - control-C at exactly the wrong time, both input and output - files would be deleted. - -* Don't run out of file handles on test/decompression when - large numbers of files have invalid magic numbers. - -* Avoid library namespace pollution. Prefix all exported - symbols with BZ2_. - -* Minor sorting enhancements from my DCC2000 paper. - -* Advance the version number to 1.0, so as to counteract the - (false-in-this-case) impression some people have that programs - with version numbers less than 1.0 are in someway, experimental, - pre-release versions. - -* Create an initial Makefile-libbz2_so to build a shared library. - Yes, I know I should really use libtool et al ... - -* Make the program exit with 2 instead of 0 when decompression - fails due to a bad magic number (ie, an invalid bzip2 header). - Also exit with 1 (as the manual claims :-) whenever a diagnostic - message would have been printed AND the corresponding operation - is aborted, for example - bzip2: Output file xx already exists. 
- When a diagnostic message is printed but the operation is not - aborted, for example - bzip2: Can't guess original name for wurble -- using wurble.out - then the exit value 0 is returned, unless some other problem is - also detected. - - I think it corresponds more closely to what the manual claims now. - - -1.0.1 -~~~~~ -* Modified dlltest.c so it uses the new BZ2_ naming scheme. -* Modified makefile-msc to fix minor build probs on Win2k. -* Updated README.COMPILATION.PROBLEMS. - -There are no functionality changes or bug fixes relative to version -1.0.0. This is just a documentation update + a fix for minor Win32 -build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is -utterly pointless. Don't bother. diff -Nru bzip2-1.0.1/COPYING bzip2-1.0.1.new/COPYING --- bzip2-1.0.1/COPYING Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/COPYING Sat Jun 24 20:13:05 2000 @@ -0,0 +1,39 @@ + +This program, "bzip2" and associated library "libbzip2", are +copyright (C) 1996-2000 Julian R Seward. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + +2. The origin of this software must not be misrepresented; you must + not claim that you wrote the original software. If you use this + software in a product, an acknowledgment in the product + documentation would be appreciated but is not required. + +3. Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. + +4. The name of the author may not be used to endorse or promote + products derived from this software without specific prior written + permission. 
+ +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS +OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Julian Seward, Cambridge, UK. +jseward@acm.org +bzip2/libbzip2 version 1.0 of 21 March 2000 + diff -Nru bzip2-1.0.1/ChangeLog bzip2-1.0.1.new/ChangeLog --- bzip2-1.0.1/ChangeLog Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/ChangeLog Sat Jun 24 20:13:05 2000 @@ -0,0 +1 @@ + diff -Nru bzip2-1.0.1/INSTALL bzip2-1.0.1.new/INSTALL --- bzip2-1.0.1/INSTALL Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/INSTALL Sat Jun 24 20:13:06 2000 @@ -0,0 +1,182 @@ +Basic Installation +================== + + These are generic installation instructions. + + The `configure' shell script attempts to guess correct values for +various system-dependent variables used during compilation. It uses +those values to create a `Makefile' in each directory of the package. +It may also create one or more `.h' files containing system-dependent +definitions. Finally, it creates a shell script `config.status' that +you can run in the future to recreate the current configuration, a file +`config.cache' that saves the results of its tests to speed up +reconfiguring, and a file `config.log' containing compiler output +(useful mainly for debugging `configure'). 
+ + If you need to do unusual things to compile the package, please try +to figure out how `configure' could check whether to do them, and mail +diffs or instructions to the address given in the `README' so they can +be considered for the next release. If at some point `config.cache' +contains results you don't want to keep, you may remove or edit it. + + The file `configure.in' is used to create `configure' by a program +called `autoconf'. You only need `configure.in' if you want to change +it or regenerate `configure' using a newer version of `autoconf'. + +The simplest way to compile this package is: + + 1. `cd' to the directory containing the package's source code and type + `./configure' to configure the package for your system. If you're + using `csh' on an old version of System V, you might need to type + `sh ./configure' instead to prevent `csh' from trying to execute + `configure' itself. + + Running `configure' takes a while. While running, it prints some + messages telling which features it is checking for. + + 2. Type `make' to compile the package. + + 3. Optionally, type `make check' to run any self-tests that come with + the package. + + 4. Type `make install' to install the programs and any data files and + documentation. + + 5. You can remove the program binaries and object files from the + source code directory by typing `make clean'. To also remove the + files that `configure' created (so you can compile the package for + a different kind of computer), type `make distclean'. There is + also a `make maintainer-clean' target, but that is intended mainly + for the package's developers. If you use it, you may have to get + all sorts of other programs in order to regenerate files that came + with the distribution. + +Compilers and Options +===================== + + Some systems require unusual options for compilation or linking that +the `configure' script does not know about. 
You can give `configure' +initial values for variables by setting them in the environment. Using +a Bourne-compatible shell, you can do that on the command line like +this: + CC=c89 CFLAGS=-O2 LIBS=-lposix ./configure + +Or on systems that have the `env' program, you can do it like this: + env CPPFLAGS=-I/usr/local/include LDFLAGS=-s ./configure + +Compiling For Multiple Architectures +==================================== + + You can compile the package for more than one kind of computer at the +same time, by placing the object files for each architecture in their +own directory. To do this, you must use a version of `make' that +supports the `VPATH' variable, such as GNU `make'. `cd' to the +directory where you want the object files and executables to go and run +the `configure' script. `configure' automatically checks for the +source code in the directory that `configure' is in and in `..'. + + If you have to use a `make' that does not support the `VPATH' +variable, you have to compile the package for one architecture at a time +in the source code directory. After you have installed the package for +one architecture, use `make distclean' before reconfiguring for another +architecture. + +Installation Names +================== + + By default, `make install' will install the package's files in +`/usr/local/bin', `/usr/local/man', etc. You can specify an +installation prefix other than `/usr/local' by giving `configure' the +option `--prefix=PATH'. + + You can specify separate installation prefixes for +architecture-specific files and architecture-independent files. If you +give `configure' the option `--exec-prefix=PATH', the package will use +PATH as the prefix for installing programs and libraries. +Documentation and other data files will still use the regular prefix. + + In addition, if you use an unusual directory layout you can give +options like `--bindir=PATH' to specify different values for particular +kinds of files. 
Run `configure --help' for a list of the directories +you can set and what kinds of files go in them. + + If the package supports it, you can cause programs to be installed +with an extra prefix or suffix on their names by giving `configure' the +option `--program-prefix=PREFIX' or `--program-suffix=SUFFIX'. + +Optional Features +================= + + Some packages pay attention to `--enable-FEATURE' options to +`configure', where FEATURE indicates an optional part of the package. +They may also pay attention to `--with-PACKAGE' options, where PACKAGE +is something like `gnu-as' or `x' (for the X Window System). The +`README' should mention any `--enable-' and `--with-' options that the +package recognizes. + + For packages that use the X Window System, `configure' can usually +find the X include and library files automatically, but if it doesn't, +you can use the `configure' options `--x-includes=DIR' and +`--x-libraries=DIR' to specify their locations. + +Specifying the System Type +========================== + + There may be some features `configure' can not figure out +automatically, but needs to determine by the type of host the package +will run on. Usually `configure' can figure that out, but if it prints +a message saying it can not guess the host type, give it the +`--host=TYPE' option. TYPE can either be a short name for the system +type, such as `sun4', or a canonical name with three fields: + CPU-COMPANY-SYSTEM + +See the file `config.sub' for the possible values of each field. If +`config.sub' isn't included in this package, then this package doesn't +need to know the host type. + + If you are building compiler tools for cross-compiling, you can also +use the `--target=TYPE' option to select the type of system they will +produce code for and the `--build=TYPE' option to select the type of +system on which you are compiling the package. 
+ +Sharing Defaults +================ + + If you want to set default values for `configure' scripts to share, +you can create a site shell script called `config.site' that gives +default values for variables like `CC', `cache_file', and `prefix'. +`configure' looks for `PREFIX/share/config.site' if it exists, then +`PREFIX/etc/config.site' if it exists. Or, you can set the +`CONFIG_SITE' environment variable to the location of the site script. +A warning: not all `configure' scripts look for a site script. + +Operation Controls +================== + + `configure' recognizes the following options to control how it +operates. + +`--cache-file=FILE' + Use and save the results of the tests in FILE instead of + `./config.cache'. Set FILE to `/dev/null' to disable caching, for + debugging `configure'. + +`--help' + Print a summary of the options to `configure', and exit. + +`--quiet' +`--silent' +`-q' + Do not print messages saying which checks are being made. To + suppress all normal output, redirect it to `/dev/null' (any error + messages will still be shown). + +`--srcdir=DIR' + Look for the package's source code in directory DIR. Usually + `configure' can determine that directory automatically. + +`--version' + Print the version of Autoconf used to generate the `configure' + script, and exit. + +`configure' also accepts some other, not widely useful, options. diff -Nru bzip2-1.0.1/LICENSE bzip2-1.0.1.new/LICENSE --- bzip2-1.0.1/LICENSE Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/LICENSE Thu Jan 1 01:00:00 1970 @@ -1,39 +0,0 @@ - -This program, "bzip2" and associated library "libbzip2", are -copyright (C) 1996-2000 Julian R Seward. All rights reserved. - -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: - -1. Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - -2. 
The origin of this software must not be misrepresented; you must - not claim that you wrote the original software. If you use this - software in a product, an acknowledgment in the product - documentation would be appreciated but is not required. - -3. Altered source versions must be plainly marked as such, and must - not be misrepresented as being the original software. - -4. The name of the author may not be used to endorse or promote - products derived from this software without specific prior written - permission. - -THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS -OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY -DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE -GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, -WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING -NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS -SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - -Julian Seward, Cambridge, UK. -jseward@acm.org -bzip2/libbzip2 version 1.0 of 21 March 2000 - diff -Nru bzip2-1.0.1/Makefile-libbz2_so bzip2-1.0.1.new/Makefile-libbz2_so --- bzip2-1.0.1/Makefile-libbz2_so Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/Makefile-libbz2_so Thu Jan 1 01:00:00 1970 @@ -1,43 +0,0 @@ - -# This Makefile builds a shared version of the library, -# libbz2.so.1.0.1, with soname libbz2.so.1.0, -# at least on x86-Linux (RedHat 5.2), -# with gcc-2.7.2.3. Please see the README file for some -# important info about building the library like this. 
- -SHELL=/bin/sh -CC=gcc -BIGFILES=-D_FILE_OFFSET_BITS=64 -CFLAGS=-fpic -fPIC -Wall -Winline -O2 -fomit-frame-pointer -fno-strength-reduce $(BIGFILES) - -OBJS= blocksort.o \ - huffman.o \ - crctable.o \ - randtable.o \ - compress.o \ - decompress.o \ - bzlib.o - -all: $(OBJS) - $(CC) -shared -Wl,-soname -Wl,libbz2.so.1.0 -o libbz2.so.1.0.1 $(OBJS) - $(CC) $(CFLAGS) -o bzip2-shared bzip2.c libbz2.so.1.0.1 - rm -f libbz2.so.1.0 - ln -s libbz2.so.1.0.1 libbz2.so.1.0 - -clean: - rm -f $(OBJS) bzip2.o libbz2.so.1.0.1 libbz2.so.1.0 bzip2-shared - -blocksort.o: blocksort.c - $(CC) $(CFLAGS) -c blocksort.c -huffman.o: huffman.c - $(CC) $(CFLAGS) -c huffman.c -crctable.o: crctable.c - $(CC) $(CFLAGS) -c crctable.c -randtable.o: randtable.c - $(CC) $(CFLAGS) -c randtable.c -compress.o: compress.c - $(CC) $(CFLAGS) -c compress.c -decompress.o: decompress.c - $(CC) $(CFLAGS) -c decompress.c -bzlib.o: bzlib.c - $(CC) $(CFLAGS) -c bzlib.c diff -Nru bzip2-1.0.1/Makefile.am bzip2-1.0.1.new/Makefile.am --- bzip2-1.0.1/Makefile.am Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/Makefile.am Sat Jun 24 20:17:47 2000 @@ -0,0 +1,31 @@ +SUBDIRS = doc + +bin_PROGRAMS = bzip2 bzip2recover +bzip2_SOURCES = bzip2.c + +bzip2_LDADD = libbz2.la +bzip2recover_SOURCES = bzip2recover.c +lib_LTLIBRARIES = libbz2.la +libbz2_la_SOURCES = \ + blocksort.c \ + huffman.c \ + crctable.c \ + randtable.c \ + compress.c \ + decompress.c \ + bzlib.c \ + bzlib.h \ + bzlib_private.h + +libbz2_la_LDFLAGS = -version-info 1:0:0 +include_HEADERS = bzlib.h bzlib_private.h + +bzip2SCRIPTS = bzless + +EXTRA_DIST = README README.COMPILATION.PROBLEMS \ + Y2K_INFO libbz2.def libbz2.dsp \ + sample1.bz2 sample1.ref sample2.bz2 sample2.ref sample3.bz2 sample3.ref + +install-exec-hook: + $(LN_S) -f bzip2 $(DESTDIR)$(bindir)/bunzip2 + $(LN_S) -f bzip2 $(DESTDIR)$(bindir)/bzcat diff -Nru bzip2-1.0.1/NEWS bzip2-1.0.1.new/NEWS --- bzip2-1.0.1/NEWS Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/NEWS Sat Jun 24 20:13:06 2000 @@ -0,0 
+1,12 @@ + + +1.0.1 +~~~~~ +* Modified dlltest.c so it uses the new BZ2_ naming scheme. +* Modified makefile-msc to fix minor build probs on Win2k. +* Updated README.COMPILATION.PROBLEMS. + +There are no functionality changes or bug fixes relative to version +1.0.0. This is just a documentation update + a fix for minor Win32 +build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is +utterly pointless. Don't bother. diff -Nru bzip2-1.0.1/acinclude.m4 bzip2-1.0.1.new/acinclude.m4 --- bzip2-1.0.1/acinclude.m4 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/acinclude.m4 Sat Jun 24 20:13:06 2000 @@ -0,0 +1,129 @@ +#serial 7 + +dnl By default, many hosts won't let programs access large files; +dnl one must use special compiler options to get large-file access to work. +dnl For more details about this brain damage please see: +dnl http://www.sas.com/standards/large.file/x_open.20Mar96.html + +dnl Written by Paul Eggert . + +dnl Internal subroutine of AC_SYS_LARGEFILE. +dnl AC_SYS_LARGEFILE_FLAGS(FLAGSNAME) +AC_DEFUN(AC_SYS_LARGEFILE_FLAGS, + [AC_CACHE_CHECK([for $1 value to request large file support], + ac_cv_sys_largefile_$1, + [if ($GETCONF LFS_$1) >conftest.1 2>conftest.2 && test ! -s conftest.2 + then + ac_cv_sys_largefile_$1=`cat conftest.1` + else + ac_cv_sys_largefile_$1=no + ifelse($1, CFLAGS, + [case "$host_os" in + # HP-UX 10.20 requires -D__STDC_EXT__ with gcc 2.95.1. +changequote(, )dnl + hpux10.[2-9][0-9]* | hpux1[1-9]* | hpux[2-9][0-9]*) +changequote([, ])dnl + if test "$GCC" = yes; then + ac_cv_sys_largefile_CFLAGS=-D__STDC_EXT__ + fi + ;; + # IRIX 6.2 and later require cc -n32. 
+changequote(, )dnl + irix6.[2-9]* | irix6.1[0-9]* | irix[7-9].* | irix[1-9][0-9]*) +changequote([, ])dnl + if test "$GCC" != yes; then + ac_cv_sys_largefile_CFLAGS=-n32 + fi + esac + if test "$ac_cv_sys_largefile_CFLAGS" != no; then + ac_save_CC="$CC" + CC="$CC $ac_cv_sys_largefile_CFLAGS" + AC_TRY_LINK(, , , ac_cv_sys_largefile_CFLAGS=no) + CC="$ac_save_CC" + fi]) + fi + rm -f conftest*])]) + +dnl Internal subroutine of AC_SYS_LARGEFILE. +dnl AC_SYS_LARGEFILE_SPACE_APPEND(VAR, VAL) +AC_DEFUN(AC_SYS_LARGEFILE_SPACE_APPEND, + [case $2 in + no) ;; + ?*) + case "[$]$1" in + '') $1=$2 ;; + *) $1=[$]$1' '$2 ;; + esac ;; + esac]) + +dnl Internal subroutine of AC_SYS_LARGEFILE. +dnl AC_SYS_LARGEFILE_MACRO_VALUE(C-MACRO, CACHE-VAR, COMMENT, CODE-TO-SET-DEFAULT) +AC_DEFUN(AC_SYS_LARGEFILE_MACRO_VALUE, + [AC_CACHE_CHECK([for $1], $2, + [$2=no +changequote(, )dnl + $4 + for ac_flag in $ac_cv_sys_largefile_CFLAGS no; do + case "$ac_flag" in + -D$1) + $2=1 ;; + -D$1=*) + $2=`expr " $ac_flag" : '[^=]*=\(.*\)'` ;; + esac + done +changequote([, ])dnl + ]) + if test "[$]$2" != no; then + AC_DEFINE_UNQUOTED([$1], [$]$2, [$3]) + fi]) + +AC_DEFUN(AC_SYS_LARGEFILE, + [AC_REQUIRE([AC_CANONICAL_HOST]) + AC_ARG_ENABLE(largefile, + [ --disable-largefile omit support for large files]) + if test "$enable_largefile" != no; then + AC_CHECK_TOOL(GETCONF, getconf) + AC_SYS_LARGEFILE_FLAGS(CFLAGS) + AC_SYS_LARGEFILE_FLAGS(LDFLAGS) + AC_SYS_LARGEFILE_FLAGS(LIBS) + + for ac_flag in $ac_cv_sys_largefile_CFLAGS no; do + case "$ac_flag" in + no) ;; + -D_FILE_OFFSET_BITS=*) ;; + -D_LARGEFILE_SOURCE | -D_LARGEFILE_SOURCE=*) ;; + -D_LARGE_FILES | -D_LARGE_FILES=*) ;; + -D?* | -I?*) + AC_SYS_LARGEFILE_SPACE_APPEND(CPPFLAGS, "$ac_flag") ;; + *) + AC_SYS_LARGEFILE_SPACE_APPEND(CFLAGS, "$ac_flag") ;; + esac + done + AC_SYS_LARGEFILE_SPACE_APPEND(LDFLAGS, "$ac_cv_sys_largefile_LDFLAGS") + AC_SYS_LARGEFILE_SPACE_APPEND(LIBS, "$ac_cv_sys_largefile_LIBS") + AC_SYS_LARGEFILE_MACRO_VALUE(_FILE_OFFSET_BITS, + 
ac_cv_sys_file_offset_bits, + [Number of bits in a file offset, on hosts where this is settable.], + [case "$host_os" in + # HP-UX 10.20 and later + hpux10.[2-9][0-9]* | hpux1[1-9]* | hpux[2-9][0-9]*) + ac_cv_sys_file_offset_bits=64 ;; + esac]) + AC_SYS_LARGEFILE_MACRO_VALUE(_LARGEFILE_SOURCE, + ac_cv_sys_largefile_source, + [Define to make fseeko etc. visible, on some hosts.], + [case "$host_os" in + # HP-UX 10.20 and later + hpux10.[2-9][0-9]* | hpux1[1-9]* | hpux[2-9][0-9]*) + ac_cv_sys_largefile_source=1 ;; + esac]) + AC_SYS_LARGEFILE_MACRO_VALUE(_LARGE_FILES, + ac_cv_sys_large_files, + [Define for large files, on AIX-style hosts.], + [case "$host_os" in + # AIX 4.2 and later + aix4.[2-9]* | aix4.1[0-9]* | aix[5-9].* | aix[1-9][0-9]*) + ac_cv_sys_large_files=1 ;; + esac]) + fi + ]) diff -Nru bzip2-1.0.1/bzip2.1 bzip2-1.0.1.new/bzip2.1 --- bzip2-1.0.1/bzip2.1 Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/bzip2.1 Thu Jan 1 01:00:00 1970 @@ -1,439 +0,0 @@ -.PU -.TH bzip2 1 -.SH NAME -bzip2, bunzip2 \- a block-sorting file compressor, v1.0 -.br -bzcat \- decompresses files to stdout -.br -bzip2recover \- recovers data from damaged bzip2 files - -.SH SYNOPSIS -.ll +8 -.B bzip2 -.RB [ " \-cdfkqstvzVL123456789 " ] -[ -.I "filenames \&..." -] -.ll -8 -.br -.B bunzip2 -.RB [ " \-fkvsVL " ] -[ -.I "filenames \&..." -] -.br -.B bzcat -.RB [ " \-s " ] -[ -.I "filenames \&..." -] -.br -.B bzip2recover -.I "filename" - -.SH DESCRIPTION -.I bzip2 -compresses files using the Burrows-Wheeler block sorting -text compression algorithm, and Huffman coding. Compression is -generally considerably better than that achieved by more conventional -LZ77/LZ78-based compressors, and approaches the performance of the PPM -family of statistical compressors. - -The command-line options are deliberately very similar to -those of -.I GNU gzip, -but they are not identical. - -.I bzip2 -expects a list of file names to accompany the -command-line flags. 
Each file is replaced by a compressed version of -itself, with the name "original_name.bz2". -Each compressed file -has the same modification date, permissions, and, when possible, -ownership as the corresponding original, so that these properties can -be correctly restored at decompression time. File name handling is -naive in the sense that there is no mechanism for preserving original -file names, permissions, ownerships or dates in filesystems which lack -these concepts, or have serious file name length restrictions, such as -MS-DOS. - -.I bzip2 -and -.I bunzip2 -will by default not overwrite existing -files. If you want this to happen, specify the \-f flag. - -If no file names are specified, -.I bzip2 -compresses from standard -input to standard output. In this case, -.I bzip2 -will decline to -write compressed output to a terminal, as this would be entirely -incomprehensible and therefore pointless. - -.I bunzip2 -(or -.I bzip2 \-d) -decompresses all -specified files. Files which were not created by -.I bzip2 -will be detected and ignored, and a warning issued. -.I bzip2 -attempts to guess the filename for the decompressed file -from that of the compressed file as follows: - - filename.bz2 becomes filename - filename.bz becomes filename - filename.tbz2 becomes filename.tar - filename.tbz becomes filename.tar - anyothername becomes anyothername.out - -If the file does not end in one of the recognised endings, -.I .bz2, -.I .bz, -.I .tbz2 -or -.I .tbz, -.I bzip2 -complains that it cannot -guess the name of the original file, and uses the original name -with -.I .out -appended. - -As with compression, supplying no -filenames causes decompression from -standard input to standard output. - -.I bunzip2 -will correctly decompress a file which is the -concatenation of two or more compressed files. The result is the -concatenation of the corresponding uncompressed files. Integrity -testing (\-t) -of concatenated -compressed files is also supported. 
- -You can also compress or decompress files to the standard output by -giving the \-c flag. Multiple files may be compressed and -decompressed like this. The resulting outputs are fed sequentially to -stdout. Compression of multiple files -in this manner generates a stream -containing multiple compressed file representations. Such a stream -can be decompressed correctly only by -.I bzip2 -version 0.9.0 or -later. Earlier versions of -.I bzip2 -will stop after decompressing -the first file in the stream. - -.I bzcat -(or -.I bzip2 -dc) -decompresses all specified files to -the standard output. - -.I bzip2 -will read arguments from the environment variables -.I BZIP2 -and -.I BZIP, -in that order, and will process them -before any arguments read from the command line. This gives a -convenient way to supply default arguments. - -Compression is always performed, even if the compressed -file is slightly -larger than the original. Files of less than about one hundred bytes -tend to get larger, since the compression mechanism has a constant -overhead in the region of 50 bytes. Random data (including the output -of most file compressors) is coded at about 8.05 bits per byte, giving -an expansion of around 0.5%. - -As a self-check for your protection, -.I -bzip2 -uses 32-bit CRCs to -make sure that the decompressed version of a file is identical to the -original. This guards against corruption of the compressed data, and -against undetected bugs in -.I bzip2 -(hopefully very unlikely). The -chances of data corruption going undetected is microscopic, about one -chance in four billion for each file processed. Be aware, though, that -the check occurs upon decompression, so it can only tell you that -something is wrong. It can't help you -recover the original uncompressed -data. You can use -.I bzip2recover -to try to recover data from -damaged files. 
- -Return values: 0 for a normal exit, 1 for environmental problems (file -not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt -compressed file, 3 for an internal consistency error (eg, bug) which -caused -.I bzip2 -to panic. - -.SH OPTIONS -.TP -.B \-c --stdout -Compress or decompress to standard output. -.TP -.B \-d --decompress -Force decompression. -.I bzip2, -.I bunzip2 -and -.I bzcat -are -really the same program, and the decision about what actions to take is -done on the basis of which name is used. This flag overrides that -mechanism, and forces -.I bzip2 -to decompress. -.TP -.B \-z --compress -The complement to \-d: forces compression, regardless of the -invokation name. -.TP -.B \-t --test -Check integrity of the specified file(s), but don't decompress them. -This really performs a trial decompression and throws away the result. -.TP -.B \-f --force -Force overwrite of output files. Normally, -.I bzip2 -will not overwrite -existing output files. Also forces -.I bzip2 -to break hard links -to files, which it otherwise wouldn't do. -.TP -.B \-k --keep -Keep (don't delete) input files during compression -or decompression. -.TP -.B \-s --small -Reduce memory usage, for compression, decompression and testing. Files -are decompressed and tested using a modified algorithm which only -requires 2.5 bytes per block byte. This means any file can be -decompressed in 2300k of memory, albeit at about half the normal speed. - -During compression, \-s selects a block size of 200k, which limits -memory use to around the same figure, at the expense of your compression -ratio. In short, if your machine is low on memory (8 megabytes or -less), use \-s for everything. See MEMORY MANAGEMENT below. -.TP -.B \-q --quiet -Suppress non-essential warning messages. Messages pertaining to -I/O errors and other critical events will not be suppressed. -.TP -.B \-v --verbose -Verbose mode -- show the compression ratio for each file processed. 
-Further \-v's increase the verbosity level, spewing out lots of -information which is primarily of interest for diagnostic purposes. -.TP -.B \-L --license -V --version -Display the software version, license terms and conditions. -.TP -.B \-1 to \-9 -Set the block size to 100 k, 200 k .. 900 k when compressing. Has no -effect when decompressing. See MEMORY MANAGEMENT below. -.TP -.B \-- -Treats all subsequent arguments as file names, even if they start -with a dash. This is so you can handle files with names beginning -with a dash, for example: bzip2 \-- \-myfilename. -.TP -.B \--repetitive-fast --repetitive-best -These flags are redundant in versions 0.9.5 and above. They provided -some coarse control over the behaviour of the sorting algorithm in -earlier versions, which was sometimes useful. 0.9.5 and above have an -improved algorithm which renders these flags irrelevant. - -.SH MEMORY MANAGEMENT -.I bzip2 -compresses large files in blocks. The block size affects -both the compression ratio achieved, and the amount of memory needed for -compression and decompression. The flags \-1 through \-9 -specify the block size to be 100,000 bytes through 900,000 bytes (the -default) respectively. At decompression time, the block size used for -compression is read from the header of the compressed file, and -.I bunzip2 -then allocates itself just enough memory to decompress -the file. Since block sizes are stored in compressed files, it follows -that the flags \-1 to \-9 are irrelevant to and so ignored -during decompression. - -Compression and decompression requirements, -in bytes, can be estimated as: - - Compression: 400k + ( 8 x block size ) - - Decompression: 100k + ( 4 x block size ), or - 100k + ( 2.5 x block size ) - -Larger block sizes give rapidly diminishing marginal returns. Most of -the compression comes from the first two or three hundred k of block -size, a fact worth bearing in mind when using -.I bzip2 -on small machines. 
-It is also important to appreciate that the decompression memory -requirement is set at compression time by the choice of block size. - -For files compressed with the default 900k block size, -.I bunzip2 -will require about 3700 kbytes to decompress. To support decompression -of any file on a 4 megabyte machine, -.I bunzip2 -has an option to -decompress using approximately half this amount of memory, about 2300 -kbytes. Decompression speed is also halved, so you should use this -option only where necessary. The relevant flag is -s. - -In general, try and use the largest block size memory constraints allow, -since that maximises the compression achieved. Compression and -decompression speed are virtually unaffected by block size. - -Another significant point applies to files which fit in a single block --- that means most files you'd encounter using a large block size. The -amount of real memory touched is proportional to the size of the file, -since the file is smaller than a block. For example, compressing a file -20,000 bytes long with the flag -9 will cause the compressor to -allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 -kbytes of it. Similarly, the decompressor will allocate 3700k but only -touch 100k + 20000 * 4 = 180 kbytes. - -Here is a table which summarises the maximum memory usage for different -block sizes. Also recorded is the total compressed size for 14 files of -the Calgary Text Compression Corpus totalling 3,141,622 bytes. This -column gives some feel for how compression varies with block size. -These figures tend to understate the advantage of larger block sizes for -larger files, since the Corpus is dominated by smaller files. 
- - Compress Decompress Decompress Corpus - Flag usage usage -s usage Size - - -1 1200k 500k 350k 914704 - -2 2000k 900k 600k 877703 - -3 2800k 1300k 850k 860338 - -4 3600k 1700k 1100k 846899 - -5 4400k 2100k 1350k 845160 - -6 5200k 2500k 1600k 838626 - -7 6100k 2900k 1850k 834096 - -8 6800k 3300k 2100k 828642 - -9 7600k 3700k 2350k 828642 - -.SH RECOVERING DATA FROM DAMAGED FILES -.I bzip2 -compresses files in blocks, usually 900kbytes long. Each -block is handled independently. If a media or transmission error causes -a multi-block .bz2 -file to become damaged, it may be possible to -recover data from the undamaged blocks in the file. - -The compressed representation of each block is delimited by a 48-bit -pattern, which makes it possible to find the block boundaries with -reasonable certainty. Each block also carries its own 32-bit CRC, so -damaged blocks can be distinguished from undamaged ones. - -.I bzip2recover -is a simple program whose purpose is to search for -blocks in .bz2 files, and write each block out into its own .bz2 -file. You can then use -.I bzip2 -\-t -to test the -integrity of the resulting files, and decompress those which are -undamaged. - -.I bzip2recover -takes a single argument, the name of the damaged file, -and writes a number of files "rec0001file.bz2", -"rec0002file.bz2", etc, containing the extracted blocks. -The output filenames are designed so that the use of -wildcards in subsequent processing -- for example, -"bzip2 -dc rec*file.bz2 > recovered_data" -- lists the files in -the correct order. - -.I bzip2recover -should be of most use dealing with large .bz2 -files, as these will contain many blocks. It is clearly -futile to use it on damaged single-block files, since a -damaged block cannot be recovered. If you wish to minimise -any potential data loss through media or transmission errors, -you might consider compressing with a smaller -block size. 
- -.SH PERFORMANCE NOTES -The sorting phase of compression gathers together similar strings in the -file. Because of this, files containing very long runs of repeated -symbols, like "aabaabaabaab ..." (repeated several hundred times) may -compress more slowly than normal. Versions 0.9.5 and above fare much -better than previous versions in this respect. The ratio between -worst-case and average-case compression time is in the region of 10:1. -For previous versions, this figure was more like 100:1. You can use the -\-vvvv option to monitor progress in great detail, if you want. - -Decompression speed is unaffected by these phenomena. - -.I bzip2 -usually allocates several megabytes of memory to operate -in, and then charges all over it in a fairly random fashion. This means -that performance, both for compressing and decompressing, is largely -determined by the speed at which your machine can service cache misses. -Because of this, small changes to the code to reduce the miss rate have -been observed to give disproportionately large performance improvements. -I imagine -.I bzip2 -will perform best on machines with very large caches. - -.SH CAVEATS -I/O error messages are not as helpful as they could be. -.I bzip2 -tries hard to detect I/O errors and exit cleanly, but the details of -what the problem is sometimes seem rather misleading. - -This manual page pertains to version 1.0 of -.I bzip2. -Compressed -data created by this version is entirely forwards and backwards -compatible with the previous public releases, versions 0.1pl2, 0.9.0 -and 0.9.5, -but with the following exception: 0.9.0 and above can correctly -decompress multiple concatenated compressed files. 0.1pl2 cannot do -this; it will stop after decompressing just the first file in the -stream. - -.I bzip2recover -uses 32-bit integers to represent bit positions in -compressed files, so it cannot handle compressed files more than 512 -megabytes long. This could easily be fixed. 
- -.SH AUTHOR -Julian Seward, jseward@acm.org. - -http://sourceware.cygnus.com/bzip2 -http://www.muraroa.demon.co.uk - -The ideas embodied in -.I bzip2 -are due to (at least) the following -people: Michael Burrows and David Wheeler (for the block sorting -transformation), David Wheeler (again, for the Huffman coder), Peter -Fenwick (for the structured coding model in the original -.I bzip, -and many refinements), and Alistair Moffat, Radford Neal and Ian Witten -(for the arithmetic coder in the original -.I bzip). -I am much -indebted for their help, support and advice. See the manual in the -source distribution for pointers to sources of documentation. Christian -von Roques encouraged me to look for faster sorting algorithms, so as to -speed up compression. Bela Lubkin encouraged me to improve the -worst-case compression performance. Many people sent patches, helped -with portability problems, lent machines, gave advice and were generally -helpful. diff -Nru bzip2-1.0.1/bzip2.1.preformatted bzip2-1.0.1.new/bzip2.1.preformatted --- bzip2-1.0.1/bzip2.1.preformatted Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/bzip2.1.preformatted Thu Jan 1 01:00:00 1970 @@ -1,462 +0,0 @@ - - - -bzip2(1) bzip2(1) - - -NNAAMMEE - bzip2, bunzip2 - a block-sorting file compressor, v1.0 - bzcat - decompresses files to stdout - bzip2recover - recovers data from damaged bzip2 files - - -SSYYNNOOPPSSIISS - bbzziipp22 [ --ccddffkkqqssttvvzzVVLL112233445566778899 ] [ _f_i_l_e_n_a_m_e_s _._._. ] - bbuunnzziipp22 [ --ffkkvvssVVLL ] [ _f_i_l_e_n_a_m_e_s _._._. ] - bbzzccaatt [ --ss ] [ _f_i_l_e_n_a_m_e_s _._._. ] - bbzziipp22rreeccoovveerr _f_i_l_e_n_a_m_e - - -DDEESSCCRRIIPPTTIIOONN - _b_z_i_p_2 compresses files using the Burrows-Wheeler block - sorting text compression algorithm, and Huffman coding. 
- Compression is generally considerably better than that - achieved by more conventional LZ77/LZ78-based compressors, - and approaches the performance of the PPM family of sta- - tistical compressors. - - The command-line options are deliberately very similar to - those of _G_N_U _g_z_i_p_, but they are not identical. - - _b_z_i_p_2 expects a list of file names to accompany the com- - mand-line flags. Each file is replaced by a compressed - version of itself, with the name "original_name.bz2". - Each compressed file has the same modification date, per- - missions, and, when possible, ownership as the correspond- - ing original, so that these properties can be correctly - restored at decompression time. File name handling is - naive in the sense that there is no mechanism for preserv- - ing original file names, permissions, ownerships or dates - in filesystems which lack these concepts, or have serious - file name length restrictions, such as MS-DOS. - - _b_z_i_p_2 and _b_u_n_z_i_p_2 will by default not overwrite existing - files. If you want this to happen, specify the -f flag. - - If no file names are specified, _b_z_i_p_2 compresses from - standard input to standard output. In this case, _b_z_i_p_2 - will decline to write compressed output to a terminal, as - this would be entirely incomprehensible and therefore - pointless. - - _b_u_n_z_i_p_2 (or _b_z_i_p_2 _-_d_) decompresses all specified files. - Files which were not created by _b_z_i_p_2 will be detected and - ignored, and a warning issued. 
_b_z_i_p_2 attempts to guess - the filename for the decompressed file from that of the - compressed file as follows: - - filename.bz2 becomes filename - filename.bz becomes filename - filename.tbz2 becomes filename.tar - - - - 1 - - - - - -bzip2(1) bzip2(1) - - - filename.tbz becomes filename.tar - anyothername becomes anyothername.out - - If the file does not end in one of the recognised endings, - _._b_z_2_, _._b_z_, _._t_b_z_2 or _._t_b_z_, _b_z_i_p_2 complains that it cannot - guess the name of the original file, and uses the original - name with _._o_u_t appended. - - As with compression, supplying no filenames causes decom- - pression from standard input to standard output. - - _b_u_n_z_i_p_2 will correctly decompress a file which is the con- - catenation of two or more compressed files. The result is - the concatenation of the corresponding uncompressed files. - Integrity testing (-t) of concatenated compressed files is - also supported. - - You can also compress or decompress files to the standard - output by giving the -c flag. Multiple files may be com- - pressed and decompressed like this. The resulting outputs - are fed sequentially to stdout. Compression of multiple - files in this manner generates a stream containing multi- - ple compressed file representations. Such a stream can be - decompressed correctly only by _b_z_i_p_2 version 0.9.0 or - later. Earlier versions of _b_z_i_p_2 will stop after decom- - pressing the first file in the stream. - - _b_z_c_a_t (or _b_z_i_p_2 _-_d_c_) decompresses all specified files to - the standard output. - - _b_z_i_p_2 will read arguments from the environment variables - _B_Z_I_P_2 and _B_Z_I_P_, in that order, and will process them - before any arguments read from the command line. This - gives a convenient way to supply default arguments. - - Compression is always performed, even if the compressed - file is slightly larger than the original. 
Files of less - than about one hundred bytes tend to get larger, since the - compression mechanism has a constant overhead in the - region of 50 bytes. Random data (including the output of - most file compressors) is coded at about 8.05 bits per - byte, giving an expansion of around 0.5%. - - As a self-check for your protection, _b_z_i_p_2 uses 32-bit - CRCs to make sure that the decompressed version of a file - is identical to the original. This guards against corrup- - tion of the compressed data, and against undetected bugs - in _b_z_i_p_2 (hopefully very unlikely). The chances of data - corruption going undetected is microscopic, about one - chance in four billion for each file processed. Be aware, - though, that the check occurs upon decompression, so it - can only tell you that something is wrong. It can't help - you recover the original uncompressed data. You can use - _b_z_i_p_2_r_e_c_o_v_e_r to try to recover data from damaged files. - - - - 2 - - - - - -bzip2(1) bzip2(1) - - - Return values: 0 for a normal exit, 1 for environmental - problems (file not found, invalid flags, I/O errors, &c), - 2 to indicate a corrupt compressed file, 3 for an internal - consistency error (eg, bug) which caused _b_z_i_p_2 to panic. - - -OOPPTTIIOONNSS - --cc ----ssttddoouutt - Compress or decompress to standard output. - - --dd ----ddeeccoommpprreessss - Force decompression. _b_z_i_p_2_, _b_u_n_z_i_p_2 and _b_z_c_a_t are - really the same program, and the decision about - what actions to take is done on the basis of which - name is used. This flag overrides that mechanism, - and forces _b_z_i_p_2 to decompress. - - --zz ----ccoommpprreessss - The complement to -d: forces compression, regard- - less of the invokation name. - - --tt ----tteesstt - Check integrity of the specified file(s), but don't - decompress them. This really performs a trial - decompression and throws away the result. - - --ff ----ffoorrccee - Force overwrite of output files. 
Normally, _b_z_i_p_2 - will not overwrite existing output files. Also - forces _b_z_i_p_2 to break hard links to files, which it - otherwise wouldn't do. - - --kk ----kkeeeepp - Keep (don't delete) input files during compression - or decompression. - - --ss ----ssmmaallll - Reduce memory usage, for compression, decompression - and testing. Files are decompressed and tested - using a modified algorithm which only requires 2.5 - bytes per block byte. This means any file can be - decompressed in 2300k of memory, albeit at about - half the normal speed. - - During compression, -s selects a block size of - 200k, which limits memory use to around the same - figure, at the expense of your compression ratio. - In short, if your machine is low on memory (8 - megabytes or less), use -s for everything. See - MEMORY MANAGEMENT below. - - --qq ----qquuiieett - Suppress non-essential warning messages. Messages - pertaining to I/O errors and other critical events - - - - 3 - - - - - -bzip2(1) bzip2(1) - - - will not be suppressed. - - --vv ----vveerrbboossee - Verbose mode -- show the compression ratio for each - file processed. Further -v's increase the ver- - bosity level, spewing out lots of information which - is primarily of interest for diagnostic purposes. - - --LL ----lliicceennssee --VV ----vveerrssiioonn - Display the software version, license terms and - conditions. - - --11 ttoo --99 - Set the block size to 100 k, 200 k .. 900 k when - compressing. Has no effect when decompressing. - See MEMORY MANAGEMENT below. - - ---- Treats all subsequent arguments as file names, even - if they start with a dash. This is so you can han- - dle files with names beginning with a dash, for - example: bzip2 -- -myfilename. - - ----rreeppeettiittiivvee--ffaasstt ----rreeppeettiittiivvee--bbeesstt - These flags are redundant in versions 0.9.5 and - above. They provided some coarse control over the - behaviour of the sorting algorithm in earlier ver- - sions, which was sometimes useful. 
0.9.5 and above - have an improved algorithm which renders these - flags irrelevant. - - -MMEEMMOORRYY MMAANNAAGGEEMMEENNTT - _b_z_i_p_2 compresses large files in blocks. The block size - affects both the compression ratio achieved, and the - amount of memory needed for compression and decompression. - The flags -1 through -9 specify the block size to be - 100,000 bytes through 900,000 bytes (the default) respec- - tively. At decompression time, the block size used for - compression is read from the header of the compressed - file, and _b_u_n_z_i_p_2 then allocates itself just enough memory - to decompress the file. Since block sizes are stored in - compressed files, it follows that the flags -1 to -9 are - irrelevant to and so ignored during decompression. - - Compression and decompression requirements, in bytes, can - be estimated as: - - Compression: 400k + ( 8 x block size ) - - Decompression: 100k + ( 4 x block size ), or - 100k + ( 2.5 x block size ) - - Larger block sizes give rapidly diminishing marginal - returns. Most of the compression comes from the first two - - - - 4 - - - - - -bzip2(1) bzip2(1) - - - or three hundred k of block size, a fact worth bearing in - mind when using _b_z_i_p_2 on small machines. It is also - important to appreciate that the decompression memory - requirement is set at compression time by the choice of - block size. - - For files compressed with the default 900k block size, - _b_u_n_z_i_p_2 will require about 3700 kbytes to decompress. To - support decompression of any file on a 4 megabyte machine, - _b_u_n_z_i_p_2 has an option to decompress using approximately - half this amount of memory, about 2300 kbytes. Decompres- - sion speed is also halved, so you should use this option - only where necessary. The relevant flag is -s. - - In general, try and use the largest block size memory con- - straints allow, since that maximises the compression - achieved. 
Compression and decompression speed are virtu- - ally unaffected by block size. - - Another significant point applies to files which fit in a - single block -- that means most files you'd encounter - using a large block size. The amount of real memory - touched is proportional to the size of the file, since the - file is smaller than a block. For example, compressing a - file 20,000 bytes long with the flag -9 will cause the - compressor to allocate around 7600k of memory, but only - touch 400k + 20000 * 8 = 560 kbytes of it. Similarly, the - decompressor will allocate 3700k but only touch 100k + - 20000 * 4 = 180 kbytes. - - Here is a table which summarises the maximum memory usage - for different block sizes. Also recorded is the total - compressed size for 14 files of the Calgary Text Compres- - sion Corpus totalling 3,141,622 bytes. This column gives - some feel for how compression varies with block size. - These figures tend to understate the advantage of larger - block sizes for larger files, since the Corpus is domi- - nated by smaller files. - - Compress Decompress Decompress Corpus - Flag usage usage -s usage Size - - -1 1200k 500k 350k 914704 - -2 2000k 900k 600k 877703 - -3 2800k 1300k 850k 860338 - -4 3600k 1700k 1100k 846899 - -5 4400k 2100k 1350k 845160 - -6 5200k 2500k 1600k 838626 - -7 6100k 2900k 1850k 834096 - -8 6800k 3300k 2100k 828642 - -9 7600k 3700k 2350k 828642 - - - - - - - 5 - - - - - -bzip2(1) bzip2(1) - - -RREECCOOVVEERRIINNGG DDAATTAA FFRROOMM DDAAMMAAGGEEDD FFIILLEESS - _b_z_i_p_2 compresses files in blocks, usually 900kbytes long. - Each block is handled independently. If a media or trans- - mission error causes a multi-block .bz2 file to become - damaged, it may be possible to recover data from the - undamaged blocks in the file. - - The compressed representation of each block is delimited - by a 48-bit pattern, which makes it possible to find the - block boundaries with reasonable certainty. 
Each block - also carries its own 32-bit CRC, so damaged blocks can be - distinguished from undamaged ones. - - _b_z_i_p_2_r_e_c_o_v_e_r is a simple program whose purpose is to - search for blocks in .bz2 files, and write each block out - into its own .bz2 file. You can then use _b_z_i_p_2 -t to test - the integrity of the resulting files, and decompress those - which are undamaged. - - _b_z_i_p_2_r_e_c_o_v_e_r takes a single argument, the name of the dam- - aged file, and writes a number of files "rec0001file.bz2", - "rec0002file.bz2", etc, containing the extracted blocks. - The output filenames are designed so that the use of - wildcards in subsequent processing -- for example, "bzip2 - -dc rec*file.bz2 > recovered_data" -- lists the files in - the correct order. - - _b_z_i_p_2_r_e_c_o_v_e_r should be of most use dealing with large .bz2 - files, as these will contain many blocks. It is clearly - futile to use it on damaged single-block files, since a - damaged block cannot be recovered. If you wish to min- - imise any potential data loss through media or transmis- - sion errors, you might consider compressing with a smaller - block size. - - -PPEERRFFOORRMMAANNCCEE NNOOTTEESS - The sorting phase of compression gathers together similar - strings in the file. Because of this, files containing - very long runs of repeated symbols, like "aabaabaabaab - ..." (repeated several hundred times) may compress more - slowly than normal. Versions 0.9.5 and above fare much - better than previous versions in this respect. The ratio - between worst-case and average-case compression time is in - the region of 10:1. For previous versions, this figure - was more like 100:1. You can use the -vvvv option to mon- - itor progress in great detail, if you want. - - Decompression speed is unaffected by these phenomena. - - _b_z_i_p_2 usually allocates several megabytes of memory to - operate in, and then charges all over it in a fairly ran- - dom fashion. 
This means that performance, both for com- - pressing and decompressing, is largely determined by the - - - - 6 - - - - - -bzip2(1) bzip2(1) - - - speed at which your machine can service cache misses. - Because of this, small changes to the code to reduce the - miss rate have been observed to give disproportionately - large performance improvements. I imagine _b_z_i_p_2 will per- - form best on machines with very large caches. - - -CCAAVVEEAATTSS - I/O error messages are not as helpful as they could be. - _b_z_i_p_2 tries hard to detect I/O errors and exit cleanly, - but the details of what the problem is sometimes seem - rather misleading. - - This manual page pertains to version 1.0 of _b_z_i_p_2_. Com- - pressed data created by this version is entirely forwards - and backwards compatible with the previous public - releases, versions 0.1pl2, 0.9.0 and 0.9.5, but with the - following exception: 0.9.0 and above can correctly decom- - press multiple concatenated compressed files. 0.1pl2 can- - not do this; it will stop after decompressing just the - first file in the stream. - - _b_z_i_p_2_r_e_c_o_v_e_r uses 32-bit integers to represent bit posi- - tions in compressed files, so it cannot handle compressed - files more than 512 megabytes long. This could easily be - fixed. - - -AAUUTTHHOORR - Julian Seward, jseward@acm.org. - - http://sourceware.cygnus.com/bzip2 - http://www.muraroa.demon.co.uk - - The ideas embodied in _b_z_i_p_2 are due to (at least) the fol- - lowing people: Michael Burrows and David Wheeler (for the - block sorting transformation), David Wheeler (again, for - the Huffman coder), Peter Fenwick (for the structured cod- - ing model in the original _b_z_i_p_, and many refinements), and - Alistair Moffat, Radford Neal and Ian Witten (for the - arithmetic coder in the original _b_z_i_p_)_. I am much - indebted for their help, support and advice. See the man- - ual in the source distribution for pointers to sources of - documentation. 
Christian von Roques encouraged me to look - for faster sorting algorithms, so as to speed up compres- - sion. Bela Lubkin encouraged me to improve the worst-case - compression performance. Many people sent patches, helped - with portability problems, lent machines, gave advice and - were generally helpful. - - - - - - - - - 7 - - diff -Nru bzip2-1.0.1/bzless bzip2-1.0.1.new/bzless --- bzip2-1.0.1/bzless Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/bzless Sat Jun 24 20:16:09 2000 @@ -0,0 +1,2 @@ +#!/bin/sh +%{_bindir}/bunzip2 -c "\$@" | /usr/bin/less diff -Nru bzip2-1.0.1/config.h.in bzip2-1.0.1.new/config.h.in --- bzip2-1.0.1/config.h.in Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/config.h.in Sat Jun 24 20:13:06 2000 @@ -0,0 +1,17 @@ +/* config.h.in. Generated automatically from configure.in by autoheader. */ + +/* Name of package */ +#undef PACKAGE + +/* Version number of package */ +#undef VERSION + +/* Number of bits in a file offset, on hosts where this is settable. */ +#undef _FILE_OFFSET_BITS + +/* Define to make fseeko etc. visible, on some hosts. */ +#undef _LARGEFILE_SOURCE + +/* Define for large files, on AIX-style hosts. */ +#undef _LARGE_FILES + diff -Nru bzip2-1.0.1/configure.in bzip2-1.0.1.new/configure.in --- bzip2-1.0.1/configure.in Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/configure.in Sat Jun 24 20:13:06 2000 @@ -0,0 +1,10 @@ +AC_INIT(bzip2.c) +AM_INIT_AUTOMAKE(bzip2,1.0.1) +AM_CONFIG_HEADER(config.h) +AC_PROG_CC +AM_PROG_LIBTOOL +AC_PROG_LN_S +AC_SYS_LARGEFILE +AC_OUTPUT(Makefile + doc/Makefile + doc/pl/Makefile) diff -Nru bzip2-1.0.1/crctable.c bzip2-1.0.1.new/crctable.c --- bzip2-1.0.1/crctable.c Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/crctable.c Sat Jun 24 20:13:06 2000 @@ -58,6 +58,10 @@ For more information on these sources, see the manual. 
--*/ +#ifdef HAVE_CONFIG_H +#include <config.h> +#endif + #include "bzlib_private.h" diff -Nru bzip2-1.0.1/decompress.c bzip2-1.0.1.new/decompress.c --- bzip2-1.0.1/decompress.c Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/decompress.c Sat Jun 24 20:13:06 2000 @@ -58,6 +58,10 @@ For more information on these sources, see the manual. --*/ +#ifdef HAVE_CONFIG_H +#include <config.h> +#endif + #include "bzlib_private.h" diff -Nru bzip2-1.0.1/dlltest.c bzip2-1.0.1.new/dlltest.c --- bzip2-1.0.1/dlltest.c Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/dlltest.c Sat Jun 24 20:13:06 2000 @@ -8,6 +8,10 @@ usage: minibz2 [-d] [-{1,2,..9}] [[srcfilename] destfilename] */ +#ifdef HAVE_CONFIG_H +#include <config.h> +#endif + #define BZ_IMPORT #include <stdio.h> #include <stdlib.h> diff -Nru bzip2-1.0.1/doc/Makefile.am bzip2-1.0.1.new/doc/Makefile.am --- bzip2-1.0.1/doc/Makefile.am Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/Makefile.am Sat Jun 24 20:14:43 2000 @@ -0,0 +1,5 @@ + +SUBDIRS = pl + +man_MANS = bzip2.1 bunzip2.1 bzcat.1 bzip2recover.1 +#info_TEXINFOS = bzip2.texi diff -Nru bzip2-1.0.1/doc/bunzip2.1 bzip2-1.0.1.new/doc/bunzip2.1 --- bzip2-1.0.1/doc/bunzip2.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/bunzip2.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1 @@ +.so bzip2.1 \ No newline at end of file diff -Nru bzip2-1.0.1/doc/bzcat.1 bzip2-1.0.1.new/doc/bzcat.1 --- bzip2-1.0.1/doc/bzcat.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/bzcat.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1 @@ +.so bzip2.1 \ No newline at end of file diff -Nru bzip2-1.0.1/doc/bzip2.1 bzip2-1.0.1.new/doc/bzip2.1 --- bzip2-1.0.1/doc/bzip2.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/bzip2.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1,439 @@ +.PU +.TH bzip2 1 +.SH NAME +bzip2, bunzip2 \- a block-sorting file compressor, v1.0 +.br +bzcat \- decompresses files to stdout +.br +bzip2recover \- recovers data from damaged bzip2 files + +.SH SYNOPSIS +.ll +8 +.B bzip2 +.RB [ " \-cdfkqstvzVL123456789 " ] +[ +.I "filenames \&..."
+] +.ll -8 +.br +.B bunzip2 +.RB [ " \-fkvsVL " ] +[ +.I "filenames \&..." +] +.br +.B bzcat +.RB [ " \-s " ] +[ +.I "filenames \&..." +] +.br +.B bzip2recover +.I "filename" + +.SH DESCRIPTION +.I bzip2 +compresses files using the Burrows-Wheeler block sorting +text compression algorithm, and Huffman coding. Compression is +generally considerably better than that achieved by more conventional +LZ77/LZ78-based compressors, and approaches the performance of the PPM +family of statistical compressors. + +The command-line options are deliberately very similar to +those of +.I GNU gzip, +but they are not identical. + +.I bzip2 +expects a list of file names to accompany the +command-line flags. Each file is replaced by a compressed version of +itself, with the name "original_name.bz2". +Each compressed file +has the same modification date, permissions, and, when possible, +ownership as the corresponding original, so that these properties can +be correctly restored at decompression time. File name handling is +naive in the sense that there is no mechanism for preserving original +file names, permissions, ownerships or dates in filesystems which lack +these concepts, or have serious file name length restrictions, such as +MS-DOS. + +.I bzip2 +and +.I bunzip2 +will by default not overwrite existing +files. If you want this to happen, specify the \-f flag. + +If no file names are specified, +.I bzip2 +compresses from standard +input to standard output. In this case, +.I bzip2 +will decline to +write compressed output to a terminal, as this would be entirely +incomprehensible and therefore pointless. + +.I bunzip2 +(or +.I bzip2 \-d) +decompresses all +specified files. Files which were not created by +.I bzip2 +will be detected and ignored, and a warning issued. 
+.I bzip2 +attempts to guess the filename for the decompressed file +from that of the compressed file as follows: + + filename.bz2 becomes filename + filename.bz becomes filename + filename.tbz2 becomes filename.tar + filename.tbz becomes filename.tar + anyothername becomes anyothername.out + +If the file does not end in one of the recognised endings, +.I .bz2, +.I .bz, +.I .tbz2 +or +.I .tbz, +.I bzip2 +complains that it cannot +guess the name of the original file, and uses the original name +with +.I .out +appended. + +As with compression, supplying no +filenames causes decompression from +standard input to standard output. + +.I bunzip2 +will correctly decompress a file which is the +concatenation of two or more compressed files. The result is the +concatenation of the corresponding uncompressed files. Integrity +testing (\-t) +of concatenated +compressed files is also supported. + +You can also compress or decompress files to the standard output by +giving the \-c flag. Multiple files may be compressed and +decompressed like this. The resulting outputs are fed sequentially to +stdout. Compression of multiple files +in this manner generates a stream +containing multiple compressed file representations. Such a stream +can be decompressed correctly only by +.I bzip2 +version 0.9.0 or +later. Earlier versions of +.I bzip2 +will stop after decompressing +the first file in the stream. + +.I bzcat +(or +.I bzip2 -dc) +decompresses all specified files to +the standard output. + +.I bzip2 +will read arguments from the environment variables +.I BZIP2 +and +.I BZIP, +in that order, and will process them +before any arguments read from the command line. This gives a +convenient way to supply default arguments. + +Compression is always performed, even if the compressed +file is slightly +larger than the original. Files of less than about one hundred bytes +tend to get larger, since the compression mechanism has a constant +overhead in the region of 50 bytes. 
Random data (including the output +of most file compressors) is coded at about 8.05 bits per byte, giving +an expansion of around 0.5%. + +As a self-check for your protection, +.I +bzip2 +uses 32-bit CRCs to +make sure that the decompressed version of a file is identical to the +original. This guards against corruption of the compressed data, and +against undetected bugs in +.I bzip2 +(hopefully very unlikely). The +chance of data corruption going undetected is microscopic, about one +chance in four billion for each file processed. Be aware, though, that +the check occurs upon decompression, so it can only tell you that +something is wrong. It can't help you +recover the original uncompressed +data. You can use +.I bzip2recover +to try to recover data from +damaged files. + +Return values: 0 for a normal exit, 1 for environmental problems (file +not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt +compressed file, 3 for an internal consistency error (eg, bug) which +caused +.I bzip2 +to panic. + +.SH OPTIONS +.TP +.B \-c --stdout +Compress or decompress to standard output. +.TP +.B \-d --decompress +Force decompression. +.I bzip2, +.I bunzip2 +and +.I bzcat +are +really the same program, and the decision about what actions to take is +done on the basis of which name is used. This flag overrides that +mechanism, and forces +.I bzip2 +to decompress. +.TP +.B \-z --compress +The complement to \-d: forces compression, regardless of the +invocation name. +.TP +.B \-t --test +Check integrity of the specified file(s), but don't decompress them. +This really performs a trial decompression and throws away the result. +.TP +.B \-f --force +Force overwrite of output files. Normally, +.I bzip2 +will not overwrite +existing output files. Also forces +.I bzip2 +to break hard links +to files, which it otherwise wouldn't do. +.TP +.B \-k --keep +Keep (don't delete) input files during compression +or decompression.
+.TP +.B \-s --small +Reduce memory usage, for compression, decompression and testing. Files +are decompressed and tested using a modified algorithm which only +requires 2.5 bytes per block byte. This means any file can be +decompressed in 2300k of memory, albeit at about half the normal speed. + +During compression, \-s selects a block size of 200k, which limits +memory use to around the same figure, at the expense of your compression +ratio. In short, if your machine is low on memory (8 megabytes or +less), use \-s for everything. See MEMORY MANAGEMENT below. +.TP +.B \-q --quiet +Suppress non-essential warning messages. Messages pertaining to +I/O errors and other critical events will not be suppressed. +.TP +.B \-v --verbose +Verbose mode -- show the compression ratio for each file processed. +Further \-v's increase the verbosity level, spewing out lots of +information which is primarily of interest for diagnostic purposes. +.TP +.B \-L --license -V --version +Display the software version, license terms and conditions. +.TP +.B \-1 to \-9 +Set the block size to 100 k, 200 k .. 900 k when compressing. Has no +effect when decompressing. See MEMORY MANAGEMENT below. +.TP +.B \-- +Treats all subsequent arguments as file names, even if they start +with a dash. This is so you can handle files with names beginning +with a dash, for example: bzip2 \-- \-myfilename. +.TP +.B \--repetitive-fast --repetitive-best +These flags are redundant in versions 0.9.5 and above. They provided +some coarse control over the behaviour of the sorting algorithm in +earlier versions, which was sometimes useful. 0.9.5 and above have an +improved algorithm which renders these flags irrelevant. + +.SH MEMORY MANAGEMENT +.I bzip2 +compresses large files in blocks. The block size affects +both the compression ratio achieved, and the amount of memory needed for +compression and decompression. 
The flags \-1 through \-9 +specify the block size to be 100,000 bytes through 900,000 bytes (the +default) respectively. At decompression time, the block size used for +compression is read from the header of the compressed file, and +.I bunzip2 +then allocates itself just enough memory to decompress +the file. Since block sizes are stored in compressed files, it follows +that the flags \-1 to \-9 are irrelevant to and so ignored +during decompression. + +Compression and decompression requirements, +in bytes, can be estimated as: + + Compression: 400k + ( 8 x block size ) + + Decompression: 100k + ( 4 x block size ), or + 100k + ( 2.5 x block size ) + +Larger block sizes give rapidly diminishing marginal returns. Most of +the compression comes from the first two or three hundred k of block +size, a fact worth bearing in mind when using +.I bzip2 +on small machines. +It is also important to appreciate that the decompression memory +requirement is set at compression time by the choice of block size. + +For files compressed with the default 900k block size, +.I bunzip2 +will require about 3700 kbytes to decompress. To support decompression +of any file on a 4 megabyte machine, +.I bunzip2 +has an option to +decompress using approximately half this amount of memory, about 2300 +kbytes. Decompression speed is also halved, so you should use this +option only where necessary. The relevant flag is -s. + +In general, try and use the largest block size memory constraints allow, +since that maximises the compression achieved. Compression and +decompression speed are virtually unaffected by block size. + +Another significant point applies to files which fit in a single block +-- that means most files you'd encounter using a large block size. The +amount of real memory touched is proportional to the size of the file, +since the file is smaller than a block. 
For example, compressing a file +20,000 bytes long with the flag -9 will cause the compressor to +allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 +kbytes of it. Similarly, the decompressor will allocate 3700k but only +touch 100k + 20000 * 4 = 180 kbytes. + +Here is a table which summarises the maximum memory usage for different +block sizes. Also recorded is the total compressed size for 14 files of +the Calgary Text Compression Corpus totalling 3,141,622 bytes. This +column gives some feel for how compression varies with block size. +These figures tend to understate the advantage of larger block sizes for +larger files, since the Corpus is dominated by smaller files. + + Compress Decompress Decompress Corpus + Flag usage usage -s usage Size + + -1 1200k 500k 350k 914704 + -2 2000k 900k 600k 877703 + -3 2800k 1300k 850k 860338 + -4 3600k 1700k 1100k 846899 + -5 4400k 2100k 1350k 845160 + -6 5200k 2500k 1600k 838626 + -7 6100k 2900k 1850k 834096 + -8 6800k 3300k 2100k 828642 + -9 7600k 3700k 2350k 828642 + +.SH RECOVERING DATA FROM DAMAGED FILES +.I bzip2 +compresses files in blocks, usually 900kbytes long. Each +block is handled independently. If a media or transmission error causes +a multi-block .bz2 +file to become damaged, it may be possible to +recover data from the undamaged blocks in the file. + +The compressed representation of each block is delimited by a 48-bit +pattern, which makes it possible to find the block boundaries with +reasonable certainty. Each block also carries its own 32-bit CRC, so +damaged blocks can be distinguished from undamaged ones. + +.I bzip2recover +is a simple program whose purpose is to search for +blocks in .bz2 files, and write each block out into its own .bz2 +file. You can then use +.I bzip2 +\-t +to test the +integrity of the resulting files, and decompress those which are +undamaged. 
+ +.I bzip2recover +takes a single argument, the name of the damaged file, +and writes a number of files "rec0001file.bz2", +"rec0002file.bz2", etc, containing the extracted blocks. +The output filenames are designed so that the use of +wildcards in subsequent processing -- for example, +"bzip2 -dc rec*file.bz2 > recovered_data" -- lists the files in +the correct order. + +.I bzip2recover +should be of most use dealing with large .bz2 +files, as these will contain many blocks. It is clearly +futile to use it on damaged single-block files, since a +damaged block cannot be recovered. If you wish to minimise +any potential data loss through media or transmission errors, +you might consider compressing with a smaller +block size. + +.SH PERFORMANCE NOTES +The sorting phase of compression gathers together similar strings in the +file. Because of this, files containing very long runs of repeated +symbols, like "aabaabaabaab ..." (repeated several hundred times) may +compress more slowly than normal. Versions 0.9.5 and above fare much +better than previous versions in this respect. The ratio between +worst-case and average-case compression time is in the region of 10:1. +For previous versions, this figure was more like 100:1. You can use the +\-vvvv option to monitor progress in great detail, if you want. + +Decompression speed is unaffected by these phenomena. + +.I bzip2 +usually allocates several megabytes of memory to operate +in, and then charges all over it in a fairly random fashion. This means +that performance, both for compressing and decompressing, is largely +determined by the speed at which your machine can service cache misses. +Because of this, small changes to the code to reduce the miss rate have +been observed to give disproportionately large performance improvements. +I imagine +.I bzip2 +will perform best on machines with very large caches. + +.SH CAVEATS +I/O error messages are not as helpful as they could be. 
+.I bzip2 +tries hard to detect I/O errors and exit cleanly, but the details of +what the problem is sometimes seem rather misleading. + +This manual page pertains to version 1.0 of +.I bzip2. +Compressed +data created by this version is entirely forwards and backwards +compatible with the previous public releases, versions 0.1pl2, 0.9.0 +and 0.9.5, +but with the following exception: 0.9.0 and above can correctly +decompress multiple concatenated compressed files. 0.1pl2 cannot do +this; it will stop after decompressing just the first file in the +stream. + +.I bzip2recover +uses 32-bit integers to represent bit positions in +compressed files, so it cannot handle compressed files more than 512 +megabytes long. This could easily be fixed. + +.SH AUTHOR +Julian Seward, jseward@acm.org. + +http://sourceware.cygnus.com/bzip2 +http://www.muraroa.demon.co.uk + +The ideas embodied in +.I bzip2 +are due to (at least) the following +people: Michael Burrows and David Wheeler (for the block sorting +transformation), David Wheeler (again, for the Huffman coder), Peter +Fenwick (for the structured coding model in the original +.I bzip, +and many refinements), and Alistair Moffat, Radford Neal and Ian Witten +(for the arithmetic coder in the original +.I bzip). +I am much +indebted for their help, support and advice. See the manual in the +source distribution for pointers to sources of documentation. Christian +von Roques encouraged me to look for faster sorting algorithms, so as to +speed up compression. Bela Lubkin encouraged me to improve the +worst-case compression performance. Many people sent patches, helped +with portability problems, lent machines, gave advice and were generally +helpful. 
diff -Nru bzip2-1.0.1/doc/bzip2.texi bzip2-1.0.1.new/doc/bzip2.texi --- bzip2-1.0.1/doc/bzip2.texi Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/bzip2.texi Sat Jun 24 20:13:06 2000 @@ -0,0 +1,2217 @@ +\input texinfo @c -*- Texinfo -*- +@setfilename bzip2.info + +@ignore +This file documents bzip2 version 1.0, and associated library +libbzip2, written by Julian Seward (jseward@acm.org). + +Copyright (C) 1996-2000 Julian R Seward + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for verbatim copies. +@end ignore + +@ifinfo +@format +@dircategory File utilities: +* Bzip2: (bzip2). A program and library for data + compression +@end direntry +@end format +@end ifinfo + +@iftex +@c @finalout +@settitle bzip2 and libbzip2 +@titlepage +@title bzip2 and libbzip2 +@subtitle a program and library for data compression +@subtitle copyright (C) 1996-2000 Julian Seward +@subtitle version 1.0 of 21 March 2000 +@author Julian Seward + +@end titlepage + +@parindent 0mm +@parskip 2mm + +@end iftex +@node Top, Overview, (dir), (dir) + +@top bzip2 + +This program, @code{bzip2}, +and associated library @code{libbzip2}, are +Copyright (C) 1996-2000 Julian R Seward. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: +@itemize @bullet +@item + Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. +@item + The origin of this software must not be misrepresented; you must + not claim that you wrote the original software. If you use this + software in a product, an acknowledgment in the product + documentation would be appreciated but is not required. 
+@item + Altered source versions must be plainly marked as such, and must + not be misrepresented as being the original software. +@item + The name of the author may not be used to endorse or promote + products derived from this software without specific prior written + permission. +@end itemize +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS +OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE +ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY +DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, +WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS +SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Julian Seward, Cambridge, UK. + +@code{jseward@@acm.org} + +@code{http://sourceware.cygnus.com/bzip2} + +@code{http://www.cacheprof.org} + +@code{http://www.muraroa.demon.co.uk} + +@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000. + +PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented +algorithms. However, I do not have the resources available to carry out +a full patent search. Therefore I cannot give any guarantee of the +above statement. + + + + + + + +@node Overview, Implementation, Top, Top +@chapter Introduction + +@code{bzip2} compresses files using the Burrows-Wheeler +block-sorting text compression algorithm, and Huffman coding. +Compression is generally considerably better than that +achieved by more conventional LZ77/LZ78-based compressors, +and approaches the performance of the PPM family of statistical compressors. 
+ +@code{bzip2} is built on top of @code{libbzip2}, a flexible library +for handling compressed data in the @code{bzip2} format. This manual +describes both how to use the program and +how to work with the library interface. Most of the +manual is devoted to this library, not the program, +which is good news if your interest is only in the program. + +Chapter 2 describes how to use @code{bzip2}; this is the only part +you need to read if you just want to know how to operate the program. +Chapter 3 describes the programming interfaces in detail, and +Chapter 4 records some miscellaneous notes which I thought +ought to be recorded somewhere. + + +@chapter How to use @code{bzip2} + +This chapter contains a copy of the @code{bzip2} man page, +and nothing else. + +@quotation + +@unnumberedsubsubsec NAME +@itemize +@item @code{bzip2}, @code{bunzip2} +- a block-sorting file compressor, v1.0 +@item @code{bzcat} +- decompresses files to stdout +@item @code{bzip2recover} +- recovers data from damaged bzip2 files +@end itemize + +@unnumberedsubsubsec SYNOPSIS +@itemize +@item @code{bzip2} [ -cdfkqstvzVL123456789 ] [ filenames ... ] +@item @code{bunzip2} [ -fkvsVL ] [ filenames ... ] +@item @code{bzcat} [ -s ] [ filenames ... ] +@item @code{bzip2recover} filename +@end itemize + +@unnumberedsubsubsec DESCRIPTION + +@code{bzip2} compresses files using the Burrows-Wheeler block sorting +text compression algorithm, and Huffman coding. Compression is +generally considerably better than that achieved by more conventional +LZ77/LZ78-based compressors, and approaches the performance of the PPM +family of statistical compressors. + +The command-line options are deliberately very similar to those of GNU +@code{gzip}, but they are not identical. + +@code{bzip2} expects a list of file names to accompany the command-line +flags. Each file is replaced by a compressed version of itself, with +the name @code{original_name.bz2}. 
Each compressed file has the same +modification date, permissions, and, when possible, ownership as the +corresponding original, so that these properties can be correctly +restored at decompression time. File name handling is naive in the +sense that there is no mechanism for preserving original file names, +permissions, ownerships or dates in filesystems which lack these +concepts, or have serious file name length restrictions, such as MS-DOS. + +@code{bzip2} and @code{bunzip2} will by default not overwrite existing +files. If you want this to happen, specify the @code{-f} flag. + +If no file names are specified, @code{bzip2} compresses from standard +input to standard output. In this case, @code{bzip2} will decline to +write compressed output to a terminal, as this would be entirely +incomprehensible and therefore pointless. + +@code{bunzip2} (or @code{bzip2 -d}) decompresses all +specified files. Files which were not created by @code{bzip2} +will be detected and ignored, and a warning issued. +@code{bzip2} attempts to guess the filename for the decompressed file +from that of the compressed file as follows: +@itemize +@item @code{filename.bz2 } becomes @code{filename} +@item @code{filename.bz } becomes @code{filename} +@item @code{filename.tbz2} becomes @code{filename.tar} +@item @code{filename.tbz } becomes @code{filename.tar} +@item @code{anyothername } becomes @code{anyothername.out} +@end itemize +If the file does not end in one of the recognised endings, +@code{.bz2}, @code{.bz}, +@code{.tbz2} or @code{.tbz}, @code{bzip2} complains that it cannot +guess the name of the original file, and uses the original name +with @code{.out} appended. + +As with compression, supplying no +filenames causes decompression from standard input to standard output. + +@code{bunzip2} will correctly decompress a file which is the +concatenation of two or more compressed files. The result is the +concatenation of the corresponding uncompressed files. 
Integrity +testing (@code{-t}) of concatenated compressed files is also supported. + +You can also compress or decompress files to the standard output by +giving the @code{-c} flag. Multiple files may be compressed and +decompressed like this. The resulting outputs are fed sequentially to +stdout. Compression of multiple files in this manner generates a stream +containing multiple compressed file representations. Such a stream +can be decompressed correctly only by @code{bzip2} version 0.9.0 or +later. Earlier versions of @code{bzip2} will stop after decompressing +the first file in the stream. + +@code{bzcat} (or @code{bzip2 -dc}) decompresses all specified files to +the standard output. + +@code{bzip2} will read arguments from the environment variables +@code{BZIP2} and @code{BZIP}, in that order, and will process them +before any arguments read from the command line. This gives a +convenient way to supply default arguments. + +Compression is always performed, even if the compressed file is slightly +larger than the original. Files of less than about one hundred bytes +tend to get larger, since the compression mechanism has a constant +overhead in the region of 50 bytes. Random data (including the output +of most file compressors) is coded at about 8.05 bits per byte, giving +an expansion of around 0.5%. + +As a self-check for your protection, @code{bzip2} uses 32-bit CRCs to +make sure that the decompressed version of a file is identical to the +original. This guards against corruption of the compressed data, and +against undetected bugs in @code{bzip2} (hopefully very unlikely). The +chance of data corruption going undetected is microscopic, about one +chance in four billion for each file processed. Be aware, though, that +the check occurs upon decompression, so it can only tell you that +something is wrong. It can't help you recover the original uncompressed +data. You can use @code{bzip2recover} to try to recover data from +damaged files.
+ +Return values: 0 for a normal exit, 1 for environmental problems (file +not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt +compressed file, 3 for an internal consistency error (eg, bug) which +caused @code{bzip2} to panic. + + +@unnumberedsubsubsec OPTIONS +@table @code +@item -c --stdout +Compress or decompress to standard output. +@item -d --decompress +Force decompression. @code{bzip2}, @code{bunzip2} and @code{bzcat} are +really the same program, and the decision about what actions to take is +done on the basis of which name is used. This flag overrides that +mechanism, and forces bzip2 to decompress. +@item -z --compress +The complement to @code{-d}: forces compression, regardless of the +invocation name. +@item -t --test +Check integrity of the specified file(s), but don't decompress them. +This really performs a trial decompression and throws away the result. +@item -f --force +Force overwrite of output files. Normally, @code{bzip2} will not overwrite +existing output files. Also forces @code{bzip2} to break hard links +to files, which it otherwise wouldn't do. +@item -k --keep +Keep (don't delete) input files during compression +or decompression. +@item -s --small +Reduce memory usage, for compression, decompression and testing. Files +are decompressed and tested using a modified algorithm which only +requires 2.5 bytes per block byte. This means any file can be +decompressed in 2300k of memory, albeit at about half the normal speed. + +During compression, @code{-s} selects a block size of 200k, which limits +memory use to around the same figure, at the expense of your compression +ratio. In short, if your machine is low on memory (8 megabytes or +less), use -s for everything. See MEMORY MANAGEMENT below. +@item -q --quiet +Suppress non-essential warning messages. Messages pertaining to +I/O errors and other critical events will not be suppressed. +@item -v --verbose +Verbose mode -- show the compression ratio for each file processed.
+Further @code{-v}'s increase the verbosity level, spewing out lots of +information which is primarily of interest for diagnostic purposes. +@item -L --license -V --version +Display the software version, license terms and conditions. +@item -1 to -9 +Set the block size to 100 k, 200 k .. 900 k when compressing. Has no +effect when decompressing. See MEMORY MANAGEMENT below. +@item -- +Treats all subsequent arguments as file names, even if they start +with a dash. This is so you can handle files with names beginning +with a dash, for example: @code{bzip2 -- -myfilename}. +@item --repetitive-fast +@item --repetitive-best +These flags are redundant in versions 0.9.5 and above. They provided +some coarse control over the behaviour of the sorting algorithm in +earlier versions, which was sometimes useful. 0.9.5 and above have an +improved algorithm which renders these flags irrelevant. +@end table + + +@unnumberedsubsubsec MEMORY MANAGEMENT + +@code{bzip2} compresses large files in blocks. The block size affects +both the compression ratio achieved, and the amount of memory needed for +compression and decompression. The flags @code{-1} through @code{-9} +specify the block size to be 100,000 bytes through 900,000 bytes (the +default) respectively. At decompression time, the block size used for +compression is read from the header of the compressed file, and +@code{bunzip2} then allocates itself just enough memory to decompress +the file. Since block sizes are stored in compressed files, it follows +that the flags @code{-1} to @code{-9} are irrelevant to and so ignored +during decompression. + +Compression and decompression requirements, in bytes, can be estimated +as: +@example + Compression: 400k + ( 8 x block size ) + + Decompression: 100k + ( 4 x block size ), or + 100k + ( 2.5 x block size ) +@end example +Larger block sizes give rapidly diminishing marginal returns. 
Most of +the compression comes from the first two or three hundred k of block +size, a fact worth bearing in mind when using @code{bzip2} on small machines. +It is also important to appreciate that the decompression memory +requirement is set at compression time by the choice of block size. + +For files compressed with the default 900k block size, @code{bunzip2} +will require about 3700 kbytes to decompress. To support decompression +of any file on a 4 megabyte machine, @code{bunzip2} has an option to +decompress using approximately half this amount of memory, about 2300 +kbytes. Decompression speed is also halved, so you should use this +option only where necessary. The relevant flag is @code{-s}. + +In general, try and use the largest block size memory constraints allow, +since that maximises the compression achieved. Compression and +decompression speed are virtually unaffected by block size. + +Another significant point applies to files which fit in a single block +-- that means most files you'd encounter using a large block size. The +amount of real memory touched is proportional to the size of the file, +since the file is smaller than a block. For example, compressing a file +20,000 bytes long with the flag @code{-9} will cause the compressor to +allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 +kbytes of it. Similarly, the decompressor will allocate 3700k but only +touch 100k + 20000 * 4 = 180 kbytes. + +Here is a table which summarises the maximum memory usage for different +block sizes. Also recorded is the total compressed size for 14 files of +the Calgary Text Compression Corpus totalling 3,141,622 bytes. This +column gives some feel for how compression varies with block size. +These figures tend to understate the advantage of larger block sizes for +larger files, since the Corpus is dominated by smaller files. 
+@example + Compress Decompress Decompress Corpus + Flag usage usage -s usage Size + + -1 1200k 500k 350k 914704 + -2 2000k 900k 600k 877703 + -3 2800k 1300k 850k 860338 + -4 3600k 1700k 1100k 846899 + -5 4400k 2100k 1350k 845160 + -6 5200k 2500k 1600k 838626 + -7 6100k 2900k 1850k 834096 + -8 6800k 3300k 2100k 828642 + -9 7600k 3700k 2350k 828642 +@end example + +@unnumberedsubsubsec RECOVERING DATA FROM DAMAGED FILES + +@code{bzip2} compresses files in blocks, usually 900kbytes long. Each +block is handled independently. If a media or transmission error causes +a multi-block @code{.bz2} file to become damaged, it may be possible to +recover data from the undamaged blocks in the file. + +The compressed representation of each block is delimited by a 48-bit +pattern, which makes it possible to find the block boundaries with +reasonable certainty. Each block also carries its own 32-bit CRC, so +damaged blocks can be distinguished from undamaged ones. + +@code{bzip2recover} is a simple program whose purpose is to search for +blocks in @code{.bz2} files, and write each block out into its own +@code{.bz2} file. You can then use @code{bzip2 -t} to test the +integrity of the resulting files, and decompress those which are +undamaged. + +@code{bzip2recover} +takes a single argument, the name of the damaged file, +and writes a number of files @code{rec0001file.bz2}, + @code{rec0002file.bz2}, etc, containing the extracted blocks. + The output filenames are designed so that the use of + wildcards in subsequent processing -- for example, +@code{bzip2 -dc rec*file.bz2 > recovered_data} -- lists the files in + the correct order. + +@code{bzip2recover} should be of most use dealing with large @code{.bz2} + files, as these will contain many blocks. It is clearly + futile to use it on damaged single-block files, since a + damaged block cannot be recovered. 
If you wish to minimise +any potential data loss through media or transmission errors, +you might consider compressing with a smaller + block size. + + +@unnumberedsubsubsec PERFORMANCE NOTES + +The sorting phase of compression gathers together similar strings in the +file. Because of this, files containing very long runs of repeated +symbols, like "aabaabaabaab ..." (repeated several hundred times) may +compress more slowly than normal. Versions 0.9.5 and above fare much +better than previous versions in this respect. The ratio between +worst-case and average-case compression time is in the region of 10:1. +For previous versions, this figure was more like 100:1. You can use the +@code{-vvvv} option to monitor progress in great detail, if you want. + +Decompression speed is unaffected by these phenomena. + +@code{bzip2} usually allocates several megabytes of memory to operate +in, and then charges all over it in a fairly random fashion. This means +that performance, both for compressing and decompressing, is largely +determined by the speed at which your machine can service cache misses. +Because of this, small changes to the code to reduce the miss rate have +been observed to give disproportionately large performance improvements. +I imagine @code{bzip2} will perform best on machines with very large +caches. + + +@unnumberedsubsubsec CAVEATS + +I/O error messages are not as helpful as they could be. @code{bzip2} +tries hard to detect I/O errors and exit cleanly, but the details of +what the problem is sometimes seem rather misleading. + +This manual page pertains to version 1.0 of @code{bzip2}. Compressed +data created by this version is entirely forwards and backwards +compatible with the previous public releases, versions 0.1pl2, 0.9.0 and +0.9.5, but with the following exception: 0.9.0 and above can correctly +decompress multiple concatenated compressed files. 0.1pl2 cannot do +this; it will stop after decompressing just the first file in the +stream. 
+ +@code{bzip2recover} uses 32-bit integers to represent bit positions in +compressed files, so it cannot handle compressed files more than 512 +megabytes long. This could easily be fixed. + + +@unnumberedsubsubsec AUTHOR +Julian Seward, @code{jseward@@acm.org}. + +The ideas embodied in @code{bzip2} are due to (at least) the following +people: Michael Burrows and David Wheeler (for the block sorting +transformation), David Wheeler (again, for the Huffman coder), Peter +Fenwick (for the structured coding model in the original @code{bzip}, +and many refinements), and Alistair Moffat, Radford Neal and Ian Witten +(for the arithmetic coder in the original @code{bzip}). I am much +indebted for their help, support and advice. See the manual in the +source distribution for pointers to sources of documentation. Christian +von Roques encouraged me to look for faster sorting algorithms, so as to +speed up compression. Bela Lubkin encouraged me to improve the +worst-case compression performance. Many people sent patches, helped +with portability problems, lent machines, gave advice and were generally +helpful. + +@end quotation + + + + +@chapter Programming with @code{libbzip2} + +This chapter describes the programming interface to @code{libbzip2}. + +For general background information, particularly about memory +use and performance aspects, you'd be well advised to read Chapter 2 +as well. + +@section Top-level structure + +@code{libbzip2} is a flexible library for compressing and decompressing +data in the @code{bzip2} data format. Although packaged as a single +entity, it helps to regard the library as three separate parts: the low +level interface, the high level interface, and some utility +functions. + +The structure of @code{libbzip2}'s interfaces is similar to +that of Jean-loup Gailly's and Mark Adler's excellent @code{zlib} +library. + +All externally visible symbols have names beginning @code{BZ2_}. +This is new in version 1.0.
The intention is to minimise pollution +of the namespaces of library clients. + +@subsection Low-level summary + +This interface provides services for compressing and decompressing +data in memory. There's no provision for dealing with files, streams +or any other I/O mechanisms, just straight memory-to-memory work. +In fact, this part of the library can be compiled without inclusion +of @code{stdio.h}, which may be helpful for embedded applications. + +The low-level part of the library has no global variables and +is therefore thread-safe. + +Six routines make up the low level interface: +@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, and @* @code{BZ2_bzCompressEnd} +for compression, +and a corresponding trio @code{BZ2_bzDecompressInit}, @* @code{BZ2_bzDecompress} +and @code{BZ2_bzDecompressEnd} for decompression. +The @code{*Init} functions allocate +memory for compression/decompression and do other +initialisations, whilst the @code{*End} functions close down operations +and release memory. + +The real work is done by @code{BZ2_bzCompress} and @code{BZ2_bzDecompress}. +These compress and decompress data from a user-supplied input buffer +to a user-supplied output buffer. These buffers can be any size; +arbitrary quantities of data are handled by making repeated calls +to these functions. This is a flexible mechanism allowing a +consumer-pull style of activity, or producer-push, or a mixture of +both. + + + +@subsection High-level summary + +This interface provides some handy wrappers around the low-level +interface to facilitate reading and writing @code{bzip2} format +files (@code{.bz2} files). The routines provide hooks to facilitate +reading files in which the @code{bzip2} data stream is embedded +within some larger-scale file structure, or where there are +multiple @code{bzip2} data streams concatenated end-to-end. + +For reading files, @code{BZ2_bzReadOpen}, @code{BZ2_bzRead}, +@code{BZ2_bzReadClose} and @* @code{BZ2_bzReadGetUnused} are supplied. 
For +writing files, @code{BZ2_bzWriteOpen}, @code{BZ2_bzWrite} and +@code{BZ2_bzWriteClose} are available. + +As with the low-level library, no global variables are used +so the library is per se thread-safe. However, if I/O errors +occur whilst reading or writing the underlying compressed files, +you may have to consult @code{errno} to determine the cause of +the error. In that case, you'd need a C library which correctly +supports @code{errno} in a multithreaded environment. + +To make the library a little simpler and more portable, +@code{BZ2_bzReadOpen} and @code{BZ2_bzWriteOpen} require you to pass them file +handles (@code{FILE*}s) which have previously been opened for reading or +writing respectively. That avoids portability problems associated with +file operations and file attributes, whilst not being much of an +imposition on the programmer. + + + +@subsection Utility functions summary +For very simple needs, @code{BZ2_bzBuffToBuffCompress} and +@code{BZ2_bzBuffToBuffDecompress} are provided. These compress and +decompress data in memory from one buffer to another buffer in a single +function call. You should assess whether these functions +fulfill your memory-to-memory compression/decompression +requirements before investing effort in understanding the more +general but more complex low-level interface. + +Yoshioka Tsuneo (@code{QWF00133@@niftyserve.or.jp} / +@code{tsuneo-y@@is.aist-nara.ac.jp}) has contributed some functions to +give better @code{zlib} compatibility. These functions are +@code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, +@code{BZ2_bzclose}, +@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. You may find these functions +more convenient for simple file reading and writing than those in the +high-level interface. These functions are not (yet) officially part of +the library, and are minimally documented here. If they break, you +get to keep all the pieces. I hope to document them properly when time +permits.
+ +Yoshioka also contributed modifications to allow the library to be +built as a Windows DLL. + + +@section Error handling + +The library is designed to recover cleanly in all situations, including +the worst-case situation of decompressing random data. I'm not +100% sure that it can always do this, so you might want to add +a signal handler to catch segmentation violations during decompression +if you are feeling especially paranoid. I would be interested in +hearing more about the robustness of the library to corrupted +compressed data. + +Version 1.0 is much more robust in this respect than +0.9.0 or 0.9.5. Investigations with Checker (a tool for +detecting problems with memory management, similar to Purify) +indicate that, at least for the few files I tested, all single-bit +errors in the decompressed data are caught properly, with no +segmentation faults, no reads of uninitialised data and no +out of range reads or writes. So it's certainly much improved, +although I wouldn't claim it to be totally bombproof. + +The file @code{bzlib.h} contains all definitions needed to use +the library. In particular, you should definitely not include +@code{bzlib_private.h}. + +In @code{bzlib.h}, the various return values are defined. The following +list is not intended as an exhaustive description of the circumstances +in which a given value may be returned -- those descriptions are given +later. Rather, it is intended to convey the rough meaning of each +return value. The first five actions are normal and not intended to +denote an error situation. +@table @code +@item BZ_OK +The requested action was completed successfully. +@item BZ_RUN_OK +@itemx BZ_FLUSH_OK +@itemx BZ_FINISH_OK +In @code{BZ2_bzCompress}, the requested flush/finish/nothing-special action +was completed successfully. +@item BZ_STREAM_END +Compression of data was completed, or the logical stream end was +detected during decompression. 
+@end table + +The following return values indicate an error of some kind. +@table @code +@item BZ_CONFIG_ERROR +Indicates that the library has been improperly compiled on your +platform -- a major configuration error. Specifically, it means +that @code{sizeof(char)}, @code{sizeof(short)} and @code{sizeof(int)} +are not 1, 2 and 4 respectively, as they should be. Note that the +library should still work properly on 64-bit platforms which follow +the LP64 programming model -- that is, where @code{sizeof(long)} +and @code{sizeof(void*)} are 8. Under LP64, @code{sizeof(int)} is +still 4, so @code{libbzip2}, which doesn't use the @code{long} type, +is OK. +@item BZ_SEQUENCE_ERROR +When using the library, it is important to call the functions in the +correct sequence and with data structures (buffers etc) in the correct +states. @code{libbzip2} checks as much as it can to ensure this is +happening, and returns @code{BZ_SEQUENCE_ERROR} if not. Code which +complies precisely with the function semantics, as detailed below, +should never receive this value; such an event denotes buggy code +which you should investigate. +@item BZ_PARAM_ERROR +Returned when a parameter to a function call is out of range +or otherwise manifestly incorrect. As with @code{BZ_SEQUENCE_ERROR}, +this denotes a bug in the client code. The distinction between +@code{BZ_PARAM_ERROR} and @code{BZ_SEQUENCE_ERROR} is a bit hazy, but still worth +making. +@item BZ_MEM_ERROR +Returned when a request to allocate memory failed. Note that the +quantity of memory needed to decompress a stream cannot be determined +until the stream's header has been read. So @code{BZ2_bzDecompress} and +@code{BZ2_bzRead} may return @code{BZ_MEM_ERROR} even though some of +the compressed data has been read. The same is not true for +compression; once @code{BZ2_bzCompressInit} or @code{BZ2_bzWriteOpen} have +successfully completed, @code{BZ_MEM_ERROR} cannot occur. 
+@item BZ_DATA_ERROR +Returned when a data integrity error is detected during decompression. +Most importantly, this means when stored and computed CRCs for the +data do not match. This value is also returned upon detection of any +other anomaly in the compressed data. +@item BZ_DATA_ERROR_MAGIC +As a special case of @code{BZ_DATA_ERROR}, it is sometimes useful to +know when the compressed stream does not start with the correct +magic bytes (@code{'B' 'Z' 'h'}). +@item BZ_IO_ERROR +Returned by @code{BZ2_bzRead} and @code{BZ2_bzWrite} when there is an error +reading or writing in the compressed file, and by @code{BZ2_bzReadOpen} +and @code{BZ2_bzWriteOpen} for attempts to use a file for which the +error indicator (viz, @code{ferror(f)}) is set. +On receipt of @code{BZ_IO_ERROR}, the caller should consult +@code{errno} and/or @code{perror} to acquire operating-system +specific information about the problem. +@item BZ_UNEXPECTED_EOF +Returned by @code{BZ2_bzRead} when the compressed file finishes +before the logical end of stream is detected. +@item BZ_OUTBUFF_FULL +Returned by @code{BZ2_bzBuffToBuffCompress} and +@code{BZ2_bzBuffToBuffDecompress} to indicate that the output data +will not fit into the output buffer provided. +@end table + + + +@section Low-level interface + +@subsection @code{BZ2_bzCompressInit} +@example +typedef + struct @{ + char *next_in; + unsigned int avail_in; + unsigned int total_in_lo32; + unsigned int total_in_hi32; + + char *next_out; + unsigned int avail_out; + unsigned int total_out_lo32; + unsigned int total_out_hi32; + + void *state; + + void *(*bzalloc)(void *,int,int); + void (*bzfree)(void *,void *); + void *opaque; + @} + bz_stream; + +int BZ2_bzCompressInit ( bz_stream *strm, + int blockSize100k, + int verbosity, + int workFactor ); + +@end example + +Prepares for compression. The @code{bz_stream} structure +holds all data pertaining to the compression activity. 
+A @code{bz_stream} structure should be allocated and initialised +prior to the call. +The fields of @code{bz_stream} +comprise the entirety of the user-visible data. @code{state} +is a pointer to the private data structures required for compression. + +Custom memory allocators are supported, via fields @code{bzalloc}, +@code{bzfree}, +and @code{opaque}. The value +@code{opaque} is passed as the first argument to +all calls to @code{bzalloc} and @code{bzfree}, but is +otherwise ignored by the library. +The call @code{bzalloc ( opaque, n, m )} is expected to return a +pointer @code{p} to +@code{n * m} bytes of memory, and @code{bzfree ( opaque, p )} +should free +that memory. + +If you don't want to use a custom memory allocator, set @code{bzalloc}, +@code{bzfree} and +@code{opaque} to @code{NULL}, +and the library will then use the standard @code{malloc}/@code{free} +routines. + +Before calling @code{BZ2_bzCompressInit}, fields @code{bzalloc}, +@code{bzfree} and @code{opaque} should +be filled appropriately, as just described. Upon return, the internal +state will have been allocated and initialised, and @code{total_in_lo32}, +@code{total_in_hi32}, @code{total_out_lo32} and +@code{total_out_hi32} will have been set to zero. +These four fields are used by the library +to inform the caller of the total amount of data passed into and out of +the library, respectively. You should not try to change them. +As of version 1.0, 64-bit counts are maintained, even on 32-bit +platforms, using the @code{_hi32} fields to store the upper 32 bits +of the count. So, for example, the total amount of data in +is @code{(total_in_hi32 << 32) + total_in_lo32}, evaluated in 64-bit +arithmetic. + +Parameter @code{blockSize100k} specifies the block size to be used for +compression. It should be a value between 1 and 9 inclusive, and the +actual block size used is 100000 x this figure. 9 gives the best +compression but takes most memory. + +Parameter @code{verbosity} should be set to a number between 0 and 4 +inclusive.
0 is silent, and greater numbers give increasingly verbose +monitoring/debugging output. If the library has been compiled with +@code{-DBZ_NO_STDIO}, no such output will appear for any verbosity +setting. + +Parameter @code{workFactor} controls how the compression phase behaves +when presented with worst case, highly repetitive, input data. If +compression runs into difficulties caused by repetitive data, the +library switches from the standard sorting algorithm to a fallback +algorithm. The fallback is slower than the standard algorithm by +perhaps a factor of three, but always behaves reasonably, no matter how +bad the input. + +Lower values of @code{workFactor} reduce the amount of effort the +standard algorithm will expend before resorting to the fallback. You +should set this parameter carefully; too low, and many inputs will be +handled by the fallback algorithm and so compress rather slowly, too +high, and your average-to-worst case compression times can become very +large. The default value of 30 gives reasonable behaviour over a wide +range of circumstances. + +Allowable values range from 0 to 250 inclusive. 0 is a special case, +equivalent to using the default value of 30. + +Note that the compressed output generated is the same regardless of +whether or not the fallback algorithm is used. + +Be aware also that this parameter may disappear entirely in future +versions of the library. In principle it should be possible to devise a +good way to automatically choose which algorithm to use. Such a +mechanism would render the parameter obsolete. 
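The parameter ranges and the 64-bit counts described above can be sketched in a few lines of C. This is only an illustration, not library code: the helper names @code{params_in_range}, @code{effective_work_factor} and @code{total64} are hypothetical, and the block deliberately does not include @code{bzlib.h}. The one non-obvious point is that the high word must be widened to 64 bits before shifting, since shifting a 32-bit @code{unsigned int} by 32 is undefined in C.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the arguments lie in the ranges
   documented for BZ2_bzCompressInit, 0 otherwise. */
int params_in_range(int blockSize100k, int verbosity, int workFactor)
{
    return blockSize100k >= 1 && blockSize100k <= 9
        && verbosity >= 0 && verbosity <= 4
        && workFactor >= 0 && workFactor <= 250;
}

/* Hypothetical helper: a workFactor of 0 is documented as meaning
   "use the default", which is 30. */
int effective_work_factor(int workFactor)
{
    return workFactor == 0 ? 30 : workFactor;
}

/* Hypothetical helper: combines a _hi32/_lo32 pair into one 64-bit count.
   The cast comes before the shift so the shift is done in 64 bits. */
uint64_t total64(unsigned int hi32, unsigned int lo32)
{
    return ((uint64_t)hi32 << 32) + lo32;
}
```

A caller would typically use it as @code{total64(strm.total_in_hi32, strm.total_in_lo32)}.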
+ +Possible return values: +@display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled + @code{BZ_PARAM_ERROR} + if @code{strm} is @code{NULL} + or @code{blockSize100k} < 1 or @code{blockSize100k} > 9 + or @code{verbosity} < 0 or @code{verbosity} > 4 + or @code{workFactor} < 0 or @code{workFactor} > 250 + @code{BZ_MEM_ERROR} + if not enough memory is available + @code{BZ_OK} + otherwise +@end display +Allowable next actions: +@display + @code{BZ2_bzCompress} + if @code{BZ_OK} is returned + no specific action needed in case of error +@end display + +@subsection @code{BZ2_bzCompress} +@example + int BZ2_bzCompress ( bz_stream *strm, int action ); +@end example +Provides more input and/or output buffer space for the library. The +caller maintains input and output buffers, and calls @code{BZ2_bzCompress} to +transfer data between them. + +Before each call to @code{BZ2_bzCompress}, @code{next_in} should point at +the data to be compressed, and @code{avail_in} should indicate how many +bytes the library may read. @code{BZ2_bzCompress} updates @code{next_in}, +@code{avail_in} and @code{total_in} to reflect the number of bytes it +has read. + +Similarly, @code{next_out} should point to a buffer in which the +compressed data is to be placed, with @code{avail_out} indicating how +much output space is available. @code{BZ2_bzCompress} updates +@code{next_out}, @code{avail_out} and @code{total_out} to reflect the +number of bytes output. + +You may provide and remove as little or as much data as you like on each +call of @code{BZ2_bzCompress}. In the limit, it is acceptable to supply and +remove data one byte at a time, although this would be terribly +inefficient. You should always ensure that at least one byte of output +space is available at each call. + +A second purpose of @code{BZ2_bzCompress} is to request a change of mode of the +compressed stream. + +Conceptually, a compressed stream can be in one of four states: IDLE, +RUNNING, FLUSHING and FINISHING.
Before initialisation +(@code{BZ2_bzCompressInit}) and after termination (@code{BZ2_bzCompressEnd}), a +stream is regarded as IDLE. + +Upon initialisation (@code{BZ2_bzCompressInit}), the stream is placed in the +RUNNING state. Subsequent calls to @code{BZ2_bzCompress} should pass +@code{BZ_RUN} as the requested action; other actions are illegal and +will result in @code{BZ_SEQUENCE_ERROR}. + +At some point, the calling program will have provided all the input data +it wants to. It will then want to finish up -- in effect, asking the +library to process any data it might have buffered internally. In this +state, @code{BZ2_bzCompress} will no longer attempt to read data from +@code{next_in}, but it will want to write data to @code{next_out}. +Because the output buffer supplied by the user can be arbitrarily small, +the finishing-up operation cannot necessarily be done with a single call +of @code{BZ2_bzCompress}. + +Instead, the calling program passes @code{BZ_FINISH} as an action to +@code{BZ2_bzCompress}. This changes the stream's state to FINISHING. Any +remaining input (ie, @code{next_in[0 .. avail_in-1]}) is compressed and +transferred to the output buffer. To do this, @code{BZ2_bzCompress} must be +called repeatedly until all the output has been consumed. At that +point, @code{BZ2_bzCompress} returns @code{BZ_STREAM_END}, and the stream's +state is set back to IDLE. @code{BZ2_bzCompressEnd} should then be +called. + +Just to make sure the calling program does not cheat, the library makes +a note of @code{avail_in} at the time of the first call to +@code{BZ2_bzCompress} which has @code{BZ_FINISH} as an action (ie, at the +time the program has announced its intention to not supply any more +input). By comparing this value with that of @code{avail_in} over +subsequent calls to @code{BZ2_bzCompress}, the library can detect any +attempts to slip in more data to compress. Any calls for which this is +detected will return @code{BZ_SEQUENCE_ERROR}. 
This indicates a +programming mistake which should be corrected. + +Instead of asking to finish, the calling program may ask +@code{BZ2_bzCompress} to take all the remaining input, compress it and +terminate the current (Burrows-Wheeler) compression block. This could +be useful for error control purposes. The mechanism is analogous to +that for finishing: call @code{BZ2_bzCompress} with an action of +@code{BZ_FLUSH}, remove output data, and persist with the +@code{BZ_FLUSH} action until the value @code{BZ_RUN_OK} is returned. As +with finishing, @code{BZ2_bzCompress} detects any attempt to provide more +input data once the flush has begun. + +Once the flush is complete, the stream returns to the normal RUNNING +state. + +This all sounds pretty complex, but isn't really. Here's a table +which shows which actions are allowable in each state, what action +will be taken, what the next state is, and what the non-error return +values are. Note that you can't explicitly ask what state the +stream is in, but nor do you need to -- it can be inferred from the +values returned by @code{BZ2_bzCompress}. +@display +IDLE/@code{any} + Illegal. IDLE state only exists after @code{BZ2_bzCompressEnd} or + before @code{BZ2_bzCompressInit}. + Return value = @code{BZ_SEQUENCE_ERROR} + +RUNNING/@code{BZ_RUN} + Compress from @code{next_in} to @code{next_out} as much as possible. + Next state = RUNNING + Return value = @code{BZ_RUN_OK} + +RUNNING/@code{BZ_FLUSH} + Remember current value of @code{avail_in}. Compress from @code{next_in} + to @code{next_out} as much as possible, but do not accept any more input. + Next state = FLUSHING + Return value = @code{BZ_FLUSH_OK} + +RUNNING/@code{BZ_FINISH} + Remember current value of @code{avail_in}. Compress from @code{next_in} + to @code{next_out} as much as possible, but do not accept any more input.
+ Next state = FINISHING + Return value = @code{BZ_FINISH_OK} + +FLUSHING/@code{BZ_FLUSH} + Compress from @code{next_in} to @code{next_out} as much as possible, + but do not accept any more input. + If all the existing input has been used up and all compressed + output has been removed + Next state = RUNNING; Return value = @code{BZ_RUN_OK} + else + Next state = FLUSHING; Return value = @code{BZ_FLUSH_OK} + +FLUSHING/other + Illegal. + Return value = @code{BZ_SEQUENCE_ERROR} + +FINISHING/@code{BZ_FINISH} + Compress from @code{next_in} to @code{next_out} as much as possible, + but do not accept any more input. + If all the existing input has been used up and all compressed + output has been removed + Next state = IDLE; Return value = @code{BZ_STREAM_END} + else + Next state = FINISHING; Return value = @code{BZ_FINISH_OK} + +FINISHING/other + Illegal. + Return value = @code{BZ_SEQUENCE_ERROR} +@end display + +That still looks complicated? Well, fair enough. The usual sequence +of calls for compressing a load of data is: +@itemize @bullet +@item Get started with @code{BZ2_bzCompressInit}. +@item Shovel data in and shlurp out its compressed form using zero or more +calls of @code{BZ2_bzCompress} with action = @code{BZ_RUN}. +@item Finish up. +Repeatedly call @code{BZ2_bzCompress} with action = @code{BZ_FINISH}, +copying out the compressed output, until @code{BZ_STREAM_END} is returned. +@item Close up and go home. Call @code{BZ2_bzCompressEnd}. +@end itemize +If the data you want to compress fits into your input buffer all +at once, you can skip the calls of @code{BZ2_bzCompress ( ..., BZ_RUN )} and +just do the @code{BZ2_bzCompress ( ..., BZ_FINISH )} calls. + +All required memory is allocated by @code{BZ2_bzCompressInit}. The +compression library can accept any data at all (obviously). So you +shouldn't get any error return values from the @code{BZ2_bzCompress} calls. +If you do, they will be @code{BZ_SEQUENCE_ERROR}, and indicate a bug in +your programming.
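For readers who find a state table easier to follow as code, the transitions above can be modelled as a small stand-alone function. This is a sketch of the table only, not of the library's implementation: every name in it (@code{model_state}, @code{model_action}, @code{model_result}, @code{model_step}) is hypothetical, and the @code{drained} flag stands in for the condition "all the existing input has been used up and all compressed output has been removed".

```c
/* Hypothetical names throughout: a model of the documented state table,
   not code taken from libbzip2. */
typedef enum { ST_IDLE, ST_RUNNING, ST_FLUSHING, ST_FINISHING } model_state;
typedef enum { ACT_RUN, ACT_FLUSH, ACT_FINISH } model_action;
typedef enum {
    RET_RUN_OK, RET_FLUSH_OK, RET_FINISH_OK, RET_STREAM_END, RET_SEQUENCE_ERROR
} model_result;

/* Applies one action to *s and returns the modelled return value.
   drained is only consulted in the FLUSHING and FINISHING states. */
model_result model_step(model_state *s, model_action a, int drained)
{
    switch (*s) {
    case ST_RUNNING:
        if (a == ACT_RUN) return RET_RUN_OK;
        if (a == ACT_FLUSH) { *s = ST_FLUSHING; return RET_FLUSH_OK; }
        *s = ST_FINISHING;
        return RET_FINISH_OK;
    case ST_FLUSHING:
        if (a != ACT_FLUSH) return RET_SEQUENCE_ERROR;
        if (drained) { *s = ST_RUNNING; return RET_RUN_OK; }
        return RET_FLUSH_OK;
    case ST_FINISHING:
        if (a != ACT_FINISH) return RET_SEQUENCE_ERROR;
        if (drained) { *s = ST_IDLE; return RET_STREAM_END; }
        return RET_FINISH_OK;
    case ST_IDLE:
    default:
        /* IDLE accepts no action at all. */
        return RET_SEQUENCE_ERROR;
    }
}
```

The usual sequence then reads as: start in @code{ST_RUNNING}, apply @code{ACT_RUN} zero or more times, apply @code{ACT_FINISH} until @code{RET_STREAM_END} comes back, at which point the model is in @code{ST_IDLE}.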
+ +Trivial other possible return values: +@display + @code{BZ_PARAM_ERROR} + if @code{strm} is @code{NULL}, or @code{strm->state} is @code{NULL} +@end display + +@subsection @code{BZ2_bzCompressEnd} +@example +int BZ2_bzCompressEnd ( bz_stream *strm ); +@end example +Releases all memory associated with a compression stream. + +Possible return values: +@display + @code{BZ_PARAM_ERROR} if @code{strm} is @code{NULL} or @code{strm->state} is @code{NULL} + @code{BZ_OK} otherwise +@end display + + +@subsection @code{BZ2_bzDecompressInit} +@example +int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small ); +@end example +Prepares for decompression. As with @code{BZ2_bzCompressInit}, a +@code{bz_stream} record should be allocated and initialised before the +call. Fields @code{bzalloc}, @code{bzfree} and @code{opaque} should be +set if a custom memory allocator is required, or made @code{NULL} for +the normal @code{malloc}/@code{free} routines. Upon return, the internal +state will have been initialised, and @code{total_in} and +@code{total_out} will be zero. + +For the meaning of parameter @code{verbosity}, see @code{BZ2_bzCompressInit}. + +If @code{small} is nonzero, the library will use an alternative +decompression algorithm which uses less memory but at the cost of +decompressing more slowly (roughly speaking, half the speed, but the +maximum memory requirement drops to around 2300k). See Chapter 2 for +more information on memory management. + +Note that the amount of memory needed to decompress +a stream cannot be determined until the stream's header has been read, +so even if @code{BZ2_bzDecompressInit} succeeds, a subsequent +@code{BZ2_bzDecompress} could fail with @code{BZ_MEM_ERROR}.
+ +Possible return values: +@display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled + @code{BZ_PARAM_ERROR} + if @code{(small != 0 && small != 1)} + or @code{(verbosity < 0 || verbosity > 4)} + @code{BZ_MEM_ERROR} + if insufficient memory is available + @code{BZ_OK} + otherwise +@end display + +Allowable next actions: +@display + @code{BZ2_bzDecompress} + if @code{BZ_OK} was returned + no specific action required in case of error +@end display + + + +@subsection @code{BZ2_bzDecompress} +@example +int BZ2_bzDecompress ( bz_stream *strm ); +@end example +Provides more input and/or output buffer space for the library. The +caller maintains input and output buffers, and uses @code{BZ2_bzDecompress} +to transfer data between them. + +Before each call to @code{BZ2_bzDecompress}, @code{next_in} +should point at the compressed data, +and @code{avail_in} should indicate how many bytes the library +may read. @code{BZ2_bzDecompress} updates @code{next_in}, @code{avail_in} +and @code{total_in} +to reflect the number of bytes it has read. + +Similarly, @code{next_out} should point to a buffer in which the uncompressed +output is to be placed, with @code{avail_out} indicating how much output space +is available. @code{BZ2_bzDecompress} updates @code{next_out}, +@code{avail_out} and @code{total_out} to reflect +the number of bytes output. + +You may provide and remove as little or as much data as you like on +each call of @code{BZ2_bzDecompress}. +In the limit, it is acceptable to +supply and remove data one byte at a time, although this would be +terribly inefficient. You should always ensure that at least one +byte of output space is available at each call. + +Use of @code{BZ2_bzDecompress} is simpler than @code{BZ2_bzCompress}. + +You should provide input and remove output as described above, and +repeatedly call @code{BZ2_bzDecompress} until @code{BZ_STREAM_END} is +returned.
Appearance of @code{BZ_STREAM_END} denotes that +@code{BZ2_bzDecompress} has detected the logical end of the compressed +stream. @code{BZ2_bzDecompress} will not produce @code{BZ_STREAM_END} until +all output data has been placed into the output buffer, so once +@code{BZ_STREAM_END} appears, you are guaranteed to have available all +the decompressed output, and @code{BZ2_bzDecompressEnd} can safely be +called. + +In case of an error return value, you should call @code{BZ2_bzDecompressEnd} +to clean up and release memory. + +Possible return values: +@display + @code{BZ_PARAM_ERROR} + if @code{strm} is @code{NULL} or @code{strm->state} is @code{NULL} + or @code{strm->avail_out < 1} + @code{BZ_DATA_ERROR} + if a data integrity error is detected in the compressed stream + @code{BZ_DATA_ERROR_MAGIC} + if the compressed stream doesn't begin with the right magic bytes + @code{BZ_MEM_ERROR} + if there wasn't enough memory available + @code{BZ_STREAM_END} + if the logical end of the data stream was detected and all + output has been consumed, eg @code{strm->avail_out > 0} + @code{BZ_OK} + otherwise +@end display +Allowable next actions: +@display + @code{BZ2_bzDecompress} + if @code{BZ_OK} was returned + @code{BZ2_bzDecompressEnd} + otherwise +@end display + + +@subsection @code{BZ2_bzDecompressEnd} +@example +int BZ2_bzDecompressEnd ( bz_stream *strm ); +@end example +Releases all memory associated with a decompression stream. + +Possible return values: +@display + @code{BZ_PARAM_ERROR} + if @code{strm} is @code{NULL} or @code{strm->state} is @code{NULL} + @code{BZ_OK} + otherwise +@end display + +Allowable next actions: +@display + None. +@end display + + +@section High-level interface + +This interface provides functions for reading and writing +@code{bzip2} format files. First, some general points. + +@itemize @bullet +@item All of the functions take an @code{int*} first argument, + @code{bzerror}.
+ After each call, @code{bzerror} should be consulted first to determine + the outcome of the call. If @code{bzerror} is @code{BZ_OK}, + the call completed + successfully, and only then should the return value of the function + (if any) be consulted. If @code{bzerror} is @code{BZ_IO_ERROR}, + there was an error + reading/writing the underlying compressed file, and you should + then consult @code{errno}/@code{perror} to determine the + cause of the difficulty. + @code{bzerror} may also be set to various other values; precise details are + given on a per-function basis below. +@item If @code{bzerror} indicates an error + (ie, anything except @code{BZ_OK} and @code{BZ_STREAM_END}), + you should immediately call @code{BZ2_bzReadClose} (or @code{BZ2_bzWriteClose}, + depending on whether you are attempting to read or to write) + to free up all resources associated + with the stream. Once an error has been indicated, behaviour of all calls + except @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) is undefined. + The implication is that (1) @code{bzerror} should + be checked after each call, and (2) if @code{bzerror} indicates an error, + @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) should then be called to clean up. +@item The @code{FILE*} arguments passed to + @code{BZ2_bzReadOpen}/@code{BZ2_bzWriteOpen} + should be set to binary mode. + Most Unix systems will do this by default, but other platforms, + including Windows and Mac, will not. If you omit this, you may + encounter problems when moving code to new platforms. +@item Memory allocation requests are handled by + @code{malloc}/@code{free}. + At present + there is no facility for user-defined memory allocators in the file I/O + functions (could easily be added, though). 
+@end itemize + + + +@subsection @code{BZ2_bzReadOpen} +@example + typedef void BZFILE; + + BZFILE *BZ2_bzReadOpen ( int *bzerror, FILE *f, + int verbosity, int small, + void *unused, int nUnused ); +@end example +Prepare to read compressed data from file handle @code{f}. @code{f} +should refer to a file which has been opened for reading, and for which +the error indicator (@code{ferror(f)}) is not set. If @code{small} is 1, +the library will try to decompress using less memory, at the expense of +speed. + +For reasons explained below, @code{BZ2_bzRead} will decompress the +@code{nUnused} bytes starting at @code{unused}, before starting to read +from the file @code{f}. At most @code{BZ_MAX_UNUSED} bytes may be +supplied like this. If this facility is not required, you should pass +@code{NULL} and @code{0} for @code{unused} and @code{nUnused} +respectively. + +For the meaning of parameters @code{small} and @code{verbosity}, +see @code{BZ2_bzDecompressInit}. + +The amount of memory needed to decompress a file cannot be determined +until the file's header has been read. So it is possible that +@code{BZ2_bzReadOpen} returns @code{BZ_OK} but a subsequent call of +@code{BZ2_bzRead} will return @code{BZ_MEM_ERROR}. + +Possible assignments to @code{bzerror}: +@display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled + @code{BZ_PARAM_ERROR} + if @code{f} is @code{NULL} + or @code{small} is neither @code{0} nor @code{1} + or @code{(unused == NULL && nUnused != 0)} + or @code{(unused != NULL && !(0 <= nUnused <= BZ_MAX_UNUSED))} + @code{BZ_IO_ERROR} + if @code{ferror(f)} is nonzero + @code{BZ_MEM_ERROR} + if insufficient memory is available + @code{BZ_OK} + otherwise.
+@end display + +Possible return values: +@display + Pointer to an abstract @code{BZFILE} + if @code{bzerror} is @code{BZ_OK} + @code{NULL} + otherwise +@end display + +Allowable next actions: +@display + @code{BZ2_bzRead} + if @code{bzerror} is @code{BZ_OK} + @code{BZ2_bzReadClose} + otherwise +@end display + + +@subsection @code{BZ2_bzRead} +@example + int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len ); +@end example +Reads up to @code{len} (uncompressed) bytes from the compressed file +@code{b} into +the buffer @code{buf}. If the read was successful, +@code{bzerror} is set to @code{BZ_OK} +and the number of bytes read is returned. If the logical end-of-stream +was detected, @code{bzerror} will be set to @code{BZ_STREAM_END}, +and the number +of bytes read is returned. All other @code{bzerror} values denote an error. + +@code{BZ2_bzRead} will supply @code{len} bytes, +unless the logical stream end is detected +or an error occurs. Because of this, it is possible to detect the +stream end by observing when the number of bytes returned is +less than the number +requested. Nevertheless, this is regarded as inadvisable; you should +instead check @code{bzerror} after every call and watch out for +@code{BZ_STREAM_END}. + +Internally, @code{BZ2_bzRead} copies data from the compressed file in chunks +of size @code{BZ_MAX_UNUSED} bytes +before decompressing it. If the file contains more bytes than strictly +needed to reach the logical end-of-stream, @code{BZ2_bzRead} will almost certainly +read some of the trailing data before signalling @code{BZ_STREAM_END}. +To collect the read but unused data once @code{BZ_STREAM_END} has +appeared, call @code{BZ2_bzReadGetUnused} immediately before @code{BZ2_bzReadClose}.
+ +Possible assignments to @code{bzerror}: +@display + @code{BZ_PARAM_ERROR} + if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} + @code{BZ_SEQUENCE_ERROR} + if @code{b} was opened with @code{BZ2_bzWriteOpen} + @code{BZ_IO_ERROR} + if there is an error reading from the compressed file + @code{BZ_UNEXPECTED_EOF} + if the compressed file ended before the logical end-of-stream was detected + @code{BZ_DATA_ERROR} + if a data integrity error was detected in the compressed stream + @code{BZ_DATA_ERROR_MAGIC} + if the stream does not begin with the requisite header bytes (ie, is not + a @code{bzip2} data file). This is really a special case of @code{BZ_DATA_ERROR}. + @code{BZ_MEM_ERROR} + if insufficient memory was available + @code{BZ_STREAM_END} + if the logical end of stream was detected. + @code{BZ_OK} + otherwise. +@end display + +Possible return values: +@display + number of bytes read + if @code{bzerror} is @code{BZ_OK} or @code{BZ_STREAM_END} + undefined + otherwise +@end display + +Allowable next actions: +@display + collect data from @code{buf}, then @code{BZ2_bzRead} or @code{BZ2_bzReadClose} + if @code{bzerror} is @code{BZ_OK} + collect data from @code{buf}, then @code{BZ2_bzReadClose} or @code{BZ2_bzReadGetUnused} + if @code{bzerror} is @code{BZ_STREAM_END} + @code{BZ2_bzReadClose} + otherwise +@end display + + + +@subsection @code{BZ2_bzReadGetUnused} +@example + void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, + void** unused, int* nUnused ); +@end example +Returns data which was read from the compressed file but was not needed +to get to the logical end-of-stream. @code{*unused} is set to the address +of the data, and @code{*nUnused} to the number of bytes. @code{*nUnused} will +be set to a value between @code{0} and @code{BZ_MAX_UNUSED} inclusive. + +This function may only be called once @code{BZ2_bzRead} has signalled +@code{BZ_STREAM_END} but before @code{BZ2_bzReadClose}.
+ +Possible assignments to @code{bzerror}: +@display + @code{BZ_PARAM_ERROR} + if @code{b} is @code{NULL} + or @code{unused} is @code{NULL} or @code{nUnused} is @code{NULL} + @code{BZ_SEQUENCE_ERROR} + if @code{BZ_STREAM_END} has not been signalled + or if @code{b} was opened with @code{BZ2_bzWriteOpen} + @code{BZ_OK} + otherwise +@end display + +Allowable next actions: +@display + @code{BZ2_bzReadClose} +@end display + + +@subsection @code{BZ2_bzReadClose} +@example + void BZ2_bzReadClose ( int *bzerror, BZFILE *b ); +@end example +Releases all memory pertaining to the compressed file @code{b}. +@code{BZ2_bzReadClose} does not call @code{fclose} on the underlying file +handle, so you should do that yourself if appropriate. +@code{BZ2_bzReadClose} should be called to clean up after all error +situations. + +Possible assignments to @code{bzerror}: +@display + @code{BZ_SEQUENCE_ERROR} + if @code{b} was opened with @code{BZ2_bzWriteOpen} + @code{BZ_OK} + otherwise +@end display + +Allowable next actions: +@display + none +@end display + + + +@subsection @code{BZ2_bzWriteOpen} +@example + BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, + int blockSize100k, int verbosity, + int workFactor ); +@end example +Prepares to write compressed data to file handle @code{f}. +@code{f} should refer to +a file which has been opened for writing, and for which the error +indicator (@code{ferror(f)}) is not set. + +For the meaning of parameters @code{blockSize100k}, +@code{verbosity} and @code{workFactor}, see +@* @code{BZ2_bzCompressInit}. + +All required memory is allocated at this stage, so if the call +completes successfully, @code{BZ_MEM_ERROR} cannot be signalled by a +subsequent call to @code{BZ2_bzWrite}. 
+ +Possible assignments to @code{bzerror}: +@display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled + @code{BZ_PARAM_ERROR} + if @code{f} is @code{NULL} + or @code{blockSize100k < 1} or @code{blockSize100k > 9} + @code{BZ_IO_ERROR} + if @code{ferror(f)} is nonzero + @code{BZ_MEM_ERROR} + if insufficient memory is available + @code{BZ_OK} + otherwise +@end display + +Possible return values: +@display + Pointer to an abstract @code{BZFILE} + if @code{bzerror} is @code{BZ_OK} + @code{NULL} + otherwise +@end display + +Allowable next actions: +@display + @code{BZ2_bzWrite} + if @code{bzerror} is @code{BZ_OK} + (you could go directly to @code{BZ2_bzWriteClose}, but this would be pretty pointless) + @code{BZ2_bzWriteClose} + otherwise +@end display + + + +@subsection @code{BZ2_bzWrite} +@example + void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len ); +@end example +Absorbs @code{len} bytes from the buffer @code{buf}, eventually to be +compressed and written to the file. + +Possible assignments to @code{bzerror}: +@display + @code{BZ_PARAM_ERROR} + if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} + @code{BZ_SEQUENCE_ERROR} + if @code{b} was opened with @code{BZ2_bzReadOpen} + @code{BZ_IO_ERROR} + if there is an error writing the compressed file. + @code{BZ_OK} + otherwise +@end display + + + + +@subsection @code{BZ2_bzWriteClose} +@example + void BZ2_bzWriteClose ( int *bzerror, BZFILE* b, + int abandon, + unsigned int* nbytes_in, + unsigned int* nbytes_out ); + + void BZ2_bzWriteClose64 ( int *bzerror, BZFILE* b, + int abandon, + unsigned int* nbytes_in_lo32, + unsigned int* nbytes_in_hi32, + unsigned int* nbytes_out_lo32, + unsigned int* nbytes_out_hi32 ); +@end example + +Compresses and flushes to the compressed file all data so far supplied +by @code{BZ2_bzWrite}. The logical end-of-stream markers are also written, so +subsequent calls to @code{BZ2_bzWrite} are illegal. 
All memory associated +with the compressed file @code{b} is released. +@code{fflush} is called on the +compressed file, but it is not @code{fclose}'d. + +If @code{BZ2_bzWriteClose} is called to clean up after an error, the only +action is to release the memory. The library records the error codes +issued by previous calls, so this situation will be detected +automatically. There is no attempt to complete the compression +operation, nor to @code{fflush} the compressed file. You can force this +behaviour to happen even in the case of no error, by passing a nonzero +value to @code{abandon}. + +If @code{nbytes_in} is non-null, @code{*nbytes_in} will be set to be the +total volume of uncompressed data handled. Similarly, @code{nbytes_out} +will be set to the total volume of compressed data written. For +compatibility with older versions of the library, @code{BZ2_bzWriteClose} +only yields the lower 32 bits of these counts. Use +@code{BZ2_bzWriteClose64} if you want the full 64 bit counts. These +two functions are otherwise absolutely identical. + + +Possible assignments to @code{bzerror}: +@display + @code{BZ_SEQUENCE_ERROR} + if @code{b} was opened with @code{BZ2_bzReadOpen} + @code{BZ_IO_ERROR} + if there is an error writing the compressed file + @code{BZ_OK} + otherwise +@end display + +@subsection Handling embedded compressed data streams + +The high-level library facilitates use of +@code{bzip2} data streams which form some part of a surrounding, larger +data stream. +@itemize @bullet +@item For writing, the library takes an open file handle, writes +compressed data to it, @code{fflush}es it but does not @code{fclose} it. +The calling application can write its own data before and after the +compressed data stream, using that same file handle. +@item Reading is more complex, and the facilities are not as general +as they could be since generality is hard to reconcile with efficiency. 
+@code{BZ2_bzRead} reads from the compressed file in blocks of size +@code{BZ_MAX_UNUSED} bytes, and in doing so probably will overshoot +the logical end of compressed stream. +To recover this data once decompression has +ended, call @code{BZ2_bzReadGetUnused} after the last call of @code{BZ2_bzRead} +(the one returning @code{BZ_STREAM_END}) but before calling +@code{BZ2_bzReadClose}. +@end itemize + +This mechanism makes it easy to decompress multiple @code{bzip2} +streams placed end-to-end. At the end of one stream, when @code{BZ2_bzRead} +returns @code{BZ_STREAM_END}, call @code{BZ2_bzReadGetUnused} to collect the +unused data (copy it into your own buffer somewhere). +That data forms the start of the next compressed stream. +To start uncompressing that next stream, call @code{BZ2_bzReadOpen} again, +feeding in the unused data via the @code{unused}/@code{nUnused} +parameters. +Keep doing this until @code{BZ_STREAM_END} return coincides with the +physical end of file (@code{feof(f)}). In this situation +@code{BZ2_bzReadGetUnused} +will of course return no data. + +This should give some feel for how the high-level interface can be used. +If you require extra flexibility, you'll have to bite the bullet and get +to grips with the low-level interface. 
+ +@subsection Standard file-reading/writing code +Here's how you'd write data to a compressed file: +@example +FILE* f; +BZFILE* b; +int nBuf; +char buf[ /* whatever size you like */ ]; +int bzerror; + +f = fopen ( "myfile.bz2", "wb" ); +if (!f) @{ + /* handle error */ +@} +/* block size 9 (900k), verbosity 0, default work factor 30 */ +b = BZ2_bzWriteOpen ( &bzerror, f, 9, 0, 30 ); +if (bzerror != BZ_OK) @{ + BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); + /* handle error */ +@} + +while ( /* condition */ ) @{ + /* get data to write into buf, and set nBuf appropriately */ + BZ2_bzWrite ( &bzerror, b, buf, nBuf ); + if (bzerror == BZ_IO_ERROR) @{ + BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); + /* handle error */ + @} +@} + +BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); +if (bzerror == BZ_IO_ERROR) @{ + /* handle error */ +@} +@end example +And to read from a compressed file: +@example +FILE* f; +BZFILE* b; +int nBuf; +char buf[ /* whatever size you like */ ]; +int bzerror; + +f = fopen ( "myfile.bz2", "rb" ); +if (!f) @{ + /* handle error */ +@} +b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 ); +if (bzerror != BZ_OK) @{ + BZ2_bzReadClose ( &bzerror, b ); + /* handle error */ +@} + +bzerror = BZ_OK; +while (bzerror == BZ_OK && /* arbitrary other conditions */) @{ + nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ ); + /* the final call returns data together with BZ_STREAM_END */ + if (bzerror == BZ_OK || bzerror == BZ_STREAM_END) @{ + /* do something with buf[0 .. nBuf-1] */ + @} +@} +if (bzerror != BZ_STREAM_END) @{ + BZ2_bzReadClose ( &bzerror, b ); + /* handle error */ +@} else @{ + BZ2_bzReadClose ( &bzerror, b ); +@} +@end example + + + +@section Utility functions +@subsection @code{BZ2_bzBuffToBuffCompress} +@example + int BZ2_bzBuffToBuffCompress( char* dest, + unsigned int* destLen, + char* source, + unsigned int sourceLen, + int blockSize100k, + int verbosity, + int workFactor ); +@end example +Attempts to compress the data in @code{source[0 .. sourceLen-1]} +into the destination buffer, @code{dest[0 .. *destLen-1]}. 
+If the destination buffer is big enough, @code{*destLen} is +set to the size of the compressed data, and @code{BZ_OK} is +returned. If the compressed data won't fit, @code{*destLen} +is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. + +Compression in this manner is a one-shot event, done with a single call +to this function. The resulting compressed data is a complete +@code{bzip2} format data stream. There is no mechanism for making +additional calls to provide extra input data. If you want that kind of +mechanism, use the low-level interface. + +For the meaning of parameters @code{blockSize100k}, @code{verbosity} +and @code{workFactor}, @* see @code{BZ2_bzCompressInit}. + +To guarantee that the compressed data will fit in its buffer, allocate +an output buffer of size 1% larger than the uncompressed data, plus +six hundred extra bytes. + +@code{BZ2_bzBuffToBuffCompress} will not write data at or +beyond @code{dest[*destLen]}, even in case of buffer overflow. + +Possible return values: +@display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled + @code{BZ_PARAM_ERROR} + if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} + or @code{blockSize100k < 1} or @code{blockSize100k > 9} + or @code{verbosity < 0} or @code{verbosity > 4} + or @code{workFactor < 0} or @code{workFactor > 250} + @code{BZ_MEM_ERROR} + if insufficient memory is available + @code{BZ_OUTBUFF_FULL} + if the size of the compressed data exceeds @code{*destLen} + @code{BZ_OK} + otherwise +@end display + + + +@subsection @code{BZ2_bzBuffToBuffDecompress} +@example + int BZ2_bzBuffToBuffDecompress ( char* dest, + unsigned int* destLen, + char* source, + unsigned int sourceLen, + int small, + int verbosity ); +@end example +Attempts to decompress the data in @code{source[0 .. sourceLen-1]} +into the destination buffer, @code{dest[0 .. *destLen-1]}. 
+If the destination buffer is big enough, @code{*destLen} is +set to the size of the uncompressed data, and @code{BZ_OK} is +returned. If the uncompressed data won't fit, @code{*destLen} +is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. + +@code{source} is assumed to hold a complete @code{bzip2} format +data stream. @* @code{BZ2_bzBuffToBuffDecompress} tries to decompress +the entirety of the stream into the output buffer. + +For the meaning of parameters @code{small} and @code{verbosity}, +see @code{BZ2_bzDecompressInit}. + +Because the compression ratio of the compressed data cannot be known in +advance, there is no easy way to guarantee that the output buffer will +be big enough. You may of course make arrangements in your code to +record the size of the uncompressed data, but such a mechanism is beyond +the scope of this library. + +@code{BZ2_bzBuffToBuffDecompress} will not write data at or +beyond @code{dest[*destLen]}, even in case of buffer overflow. + +Possible return values: +@display + @code{BZ_CONFIG_ERROR} + if the library has been mis-compiled + @code{BZ_PARAM_ERROR} + if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} + or @code{small != 0 && small != 1} + or @code{verbosity < 0} or @code{verbosity > 4} + @code{BZ_MEM_ERROR} + if insufficient memory is available + @code{BZ_OUTBUFF_FULL} + if the size of the uncompressed data exceeds @code{*destLen} + @code{BZ_DATA_ERROR} + if a data integrity error was detected in the compressed data + @code{BZ_DATA_ERROR_MAGIC} + if the compressed data doesn't begin with the right magic bytes + @code{BZ_UNEXPECTED_EOF} + if the compressed data ends unexpectedly + @code{BZ_OK} + otherwise +@end display + + + +@section @code{zlib} compatibility functions +Yoshioka Tsuneo has contributed some functions to +give better @code{zlib} compatibility. 
These functions are +@code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, +@code{BZ2_bzclose}, +@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. +These functions are not (yet) officially part of +the library. If they break, you get to keep all the pieces. +Nevertheless, I think they work ok. +@example +typedef void BZFILE; + +const char * BZ2_bzlibVersion ( void ); +@end example +Returns a string indicating the library version. +@example +BZFILE * BZ2_bzopen ( const char *path, const char *mode ); +BZFILE * BZ2_bzdopen ( int fd, const char *mode ); +@end example +Opens a @code{.bz2} file for reading or writing, using either its name +or a pre-existing file descriptor. +Analogous to @code{fopen} and @code{fdopen}. +@example +int BZ2_bzread ( BZFILE* b, void* buf, int len ); +int BZ2_bzwrite ( BZFILE* b, void* buf, int len ); +@end example +Reads/writes data from/to a previously opened @code{BZFILE}. +Analogous to @code{fread} and @code{fwrite}. +@example +int BZ2_bzflush ( BZFILE* b ); +void BZ2_bzclose ( BZFILE* b ); +@end example +Flushes/closes a @code{BZFILE}. @code{BZ2_bzflush} doesn't actually do +anything. Analogous to @code{fflush} and @code{fclose}. + +@example +const char * BZ2_bzerror ( BZFILE *b, int *errnum ); +@end example +Returns a string describing the most recent error status of +@code{b}, and also sets @code{*errnum} to its numerical value. + + +@section Using the library in a @code{stdio}-free environment + +@subsection Getting rid of @code{stdio} + +In a deeply embedded application, you might want to use just +the memory-to-memory functions. You can do this conveniently +by compiling the library with preprocessor symbol @code{BZ_NO_STDIO} +defined. 
Doing this gives you a library containing only the following +eight functions: + +@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, @code{BZ2_bzCompressEnd} @* +@code{BZ2_bzDecompressInit}, @code{BZ2_bzDecompress}, @code{BZ2_bzDecompressEnd} @* +@code{BZ2_bzBuffToBuffCompress}, @code{BZ2_bzBuffToBuffDecompress} + +When compiled like this, all functions will ignore @code{verbosity} +settings. + +@subsection Critical error handling +@code{libbzip2} contains a number of internal assertion checks which +should, needless to say, never be activated. Nevertheless, if an +assertion should fail, behaviour depends on whether or not the library +was compiled with @code{BZ_NO_STDIO} set. + +For a normal compile, an assertion failure yields the message +@example + bzip2/libbzip2: internal error number N. + This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000. + Please report it to me at: jseward@@acm.org. If this happened + when you were using some program which uses libbzip2 as a + component, you should also report this bug to the author(s) + of that program. Please make an effort to report this bug; + timely and accurate bug reports eventually lead to higher + quality software. Thanks. Julian Seward, 21 March 2000. +@end example +where @code{N} is some error code number. @code{exit(3)} +is then called. + +For a @code{stdio}-free library, assertion failures result +in a call to a function declared as: +@example + extern void bz_internal_error ( int errcode ); +@end example +The relevant code is passed as a parameter. You should supply +such a function. + +In either case, once an assertion failure has occurred, any +@code{bz_stream} records involved can be regarded as invalid. +You should not attempt to resume normal operation with them. + +You may, of course, change critical error handling to suit +your needs. As I said above, critical errors indicate bugs +in the library and should not occur. 
All "normal" error +situations are indicated via error return codes from functions, +and can be recovered from. + + +@section Making a Windows DLL +Everything related to Windows has been contributed by Yoshioka Tsuneo +@* (@code{QWF00133@@niftyserve.or.jp} / +@code{tsuneo-y@@is.aist-nara.ac.jp}), so you should send your queries to +him (but perhaps Cc: me, @code{jseward@@acm.org}). + +My vague understanding of what to do is: using Visual C++ 5.0, +open the project file @code{libbz2.dsp}, and build. That's all. + +If you can't +open the project file for some reason, make a new one, naming these files: +@code{blocksort.c}, @code{bzlib.c}, @code{compress.c}, +@code{crctable.c}, @code{decompress.c}, @code{huffman.c}, @* +@code{randtable.c} and @code{libbz2.def}. You will also need +to name the header files @code{bzlib.h} and @code{bzlib_private.h}. + +If you don't use VC++, you may need to define the preprocessor symbol +@code{_WIN32}. + +Finally, @code{dlltest.c} is a sample program using the DLL. It has a +project file, @code{dlltest.dsp}. + +If you just want a makefile for Visual C, have a look at +@code{makefile.msc}. + +Be aware that if you compile @code{bzip2} itself on Win32, you must set +@code{BZ_UNIX} to 0 and @code{BZ_LCCWIN32} to 1, in the file +@code{bzip2.c}, before compiling. Otherwise the resulting binary won't +work correctly. + +I haven't tried any of this stuff myself, but it all looks plausible. + + + +@chapter Miscellanea + +These are just some random thoughts of mine. Your mileage may +vary. + +@section Limitations of the compressed file format +@code{bzip2-1.0}, @code{0.9.5} and @code{0.9.0} +use exactly the same file format as the previous +version, @code{bzip2-0.1}. This decision was made in the interests of +stability. Creating yet another incompatible compressed file format +would create further confusion and disruption for users. + +Nevertheless, this is not a painless decision. 
Development +work since the release of @code{bzip2-0.1} in August 1997 +has shown complexities in the file format which slow down +decompression and, in retrospect, are unnecessary. These are: +@itemize @bullet +@item The run-length encoder, which is the first of the + compression transformations, is entirely irrelevant. + The original purpose was to protect the sorting algorithm + from the very worst case input: a string of repeated + symbols. But algorithm steps Q6a and Q6b in the original + Burrows-Wheeler technical report (SRC-124) show how + repeats can be handled without difficulty in block + sorting. +@item The randomisation mechanism doesn't really need to be + there. Udi Manber and Gene Myers published a suffix + array construction algorithm a few years back, which + can be employed to sort any block, no matter how + repetitive, in O(N log N) time. Subsequent work by + Kunihiko Sadakane has produced a derivative O(N (log N)^2) + algorithm which usually outperforms the Manber-Myers + algorithm. + + I could have changed to Sadakane's algorithm, but I find + it to be slower than @code{bzip2}'s existing algorithm for + most inputs, and the randomisation mechanism protects + adequately against bad cases. I didn't think it was + a good tradeoff to make. Partly this is due to the fact + that I was not flooded with email complaints about + @code{bzip2-0.1}'s performance on repetitive data, so + perhaps it isn't a problem for real inputs. + + Probably the best long-term solution, + and the one I have incorporated into 0.9.5 and above, + is to use the existing sorting + algorithm initially, and fall back to a O(N (log N)^2) + algorithm if the standard algorithm gets into difficulties. +@item The compressed file format was never designed to be + handled by a library, and I have had to jump through + some hoops to produce an efficient implementation of + decompression. It's a bit hairy. 
Try passing + @code{decompress.c} through the C preprocessor + and you'll see what I mean. Much of this complexity + could have been avoided if the compressed size of + each block of data was recorded in the data stream. +@item An Adler-32 checksum, rather than a CRC32 checksum, + would be faster to compute. +@end itemize +It would be fair to say that the @code{bzip2} format was frozen +before I properly and fully understood the performance +consequences of doing so. + +Improvements which I was able to incorporate into +0.9.0, despite using the same file format, are: +@itemize @bullet +@item Single array implementation of the inverse BWT. This + significantly speeds up decompression, presumably + because it reduces the number of cache misses. +@item Faster inverse MTF transform for large MTF values. The + new implementation is based on the notion of sliding blocks + of values. +@item @code{bzip2-0.9.0} now reads and writes files with @code{fread} + and @code{fwrite}; version 0.1 used @code{putc} and @code{getc}. + Duh! Well, you live and learn. + +@end itemize +Further ahead, it would be nice +to be able to do random access into files. This will +require some careful design of compressed file formats. + + + +@section Portability issues +After some consideration, I have decided not to use +GNU @code{autoconf} to configure 0.9.5 or 1.0. + +@code{autoconf}, admirable and wonderful though it is, +mainly assists with portability problems between Unix-like +platforms. But @code{bzip2} doesn't have much in the way +of portability problems on Unix; most of the difficulties appear +when porting to the Mac, or to Microsoft's operating systems. +@code{autoconf} doesn't help in those cases, and brings in a +whole load of new complexity. + +Most people should be able to compile the library and program +under Unix straight out-of-the-box, so to speak, especially +if you have a version of GNU C available. + +There are a couple of @code{__inline__} directives in the code. 
GNU C +(@code{gcc}) should be able to handle them. If you're not using +GNU C, your C compiler shouldn't see them at all. +If your compiler does, for some reason, see them and doesn't +like them, just @code{#define} @code{__inline__} to be @code{/* */}. One +easy way to do this is to compile with the flag @code{-D__inline__=}, +which should be understood by most Unix compilers. + +If you still have difficulties, try compiling with the macro +@code{BZ_STRICT_ANSI} defined. This should enable you to build the +library in a strictly ANSI compliant environment. Building the program +itself like this is dangerous and not supported, since you remove +@code{bzip2}'s checks against compressing directories, symbolic links, +devices, and other not-really-a-file entities. This could cause +filesystem corruption! + +One other thing: if you create a @code{bzip2} binary for public +distribution, please try and link it statically (@code{gcc -static}). This +avoids all sorts of library-version issues that others may encounter +later on. + +If you build @code{bzip2} on Win32, you must set @code{BZ_UNIX} to 0 and +@code{BZ_LCCWIN32} to 1, in the file @code{bzip2.c}, before compiling. +Otherwise the resulting binary won't work correctly. + + + +@section Reporting bugs +I tried pretty hard to make sure @code{bzip2} is +bug free, both by design and by testing. Hopefully +you'll never need to read this section for real. + +Nevertheless, if @code{bzip2} dies with a segmentation +fault, a bus error or an internal assertion failure, it +will ask you to email me a bug report. Experience with +version 0.1 shows that almost all these problems can +be traced to either compiler bugs or hardware problems. +@itemize @bullet +@item +Recompile the program with no optimisation, and see if it +works. And/or try a different compiler. +I heard all sorts of stories about various flavours +of GNU C (and other compilers) generating bad code for +@code{bzip2}, and I've run across two such examples myself. 
+ +2.7.X versions of GNU C are known to generate bad code from +time to time, at high optimisation levels. +If you get problems, try using the flags +@code{-O2} @code{-fomit-frame-pointer} @code{-fno-strength-reduce}. +You should specifically @emph{not} use @code{-funroll-loops}. + +You may notice that the Makefile runs six tests as part of +the build process. If the program passes all of these, it's +a pretty good (but not 100%) indication that the compiler has +done its job correctly. +@item +If @code{bzip2} crashes randomly, and the crashes are not +repeatable, you may have a flaky memory subsystem. @code{bzip2} +really hammers your memory hierarchy, and if it's a bit marginal, +you may get these problems. Ditto if your disk or I/O subsystem +is slowly failing. Yup, this really does happen. + +Try using a different machine of the same type, and see if +you can repeat the problem. +@item This isn't really a bug, but ... If @code{bzip2} tells +you your file is corrupted on decompression, and you +obtained the file via FTP, there is a possibility that you +forgot to tell FTP to do a binary mode transfer. That absolutely +will cause the file to be non-decompressible. You'll have to transfer +it again. +@end itemize + +If you've incorporated @code{libbzip2} into your own program +and are getting problems, please, please, please, check that the +parameters you are passing in calls to the library, are +correct, and in accordance with what the documentation says +is allowable. I have tried to make the library robust against +such problems, but I'm sure I haven't succeeded. + +Finally, if the above comments don't help, you'll have to send +me a bug report. Now, it's just amazing how many people will +send me a bug report saying something like +@display + bzip2 crashed with segmentation fault on my machine +@end display +and absolutely nothing else. 
Needless to say, such a report +is @emph{totally, utterly, completely and comprehensively 100% useless; +a waste of your time, my time, and net bandwidth}. +With no details at all, there's no way I can possibly begin +to figure out what the problem is. + +The rules of the game are: facts, facts, facts. Don't omit +them because "oh, they won't be relevant". At the bare +minimum: +@display + Machine type. Operating system version. + Exact version of @code{bzip2} (do @code{bzip2 -V}). + Exact version of the compiler used. + Flags passed to the compiler. +@end display +However, the most important single thing that will help me is +the file that you were trying to compress or decompress at the +time the problem happened. Without that, my ability to do anything +more than speculate about the cause is limited. + +Please remember that I connect to the Internet with a modem, so +you should contact me before mailing me huge files. + + +@section Did you get the right package? + +@code{bzip2} is a resource hog. It soaks up large amounts of CPU cycles +and memory. Also, it gives very large latencies. In the worst case, you +can feed many megabytes of uncompressed data into the library before +getting any compressed output, so this probably rules out applications +requiring interactive behaviour. + +These aren't faults of my implementation, I hope, but more +an intrinsic property of the Burrows-Wheeler transform (unfortunately). +Maybe this isn't what you want. + +If you want a compressor and/or library which is faster, uses less +memory but gets pretty good compression, and has minimal latency, +consider Jean-loup +Gailly's and Mark Adler's work, @code{zlib-1.1.2} and +@code{gzip-1.2.4}. Look for them at + +@code{http://www.cdrom.com/pub/infozip/zlib} and +@code{http://www.gzip.org} respectively. 
+ +For something faster and lighter still, you might try Markus F X J +Oberhumer's @code{LZO} real-time compression/decompression library, at +@* @code{http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html}. + +If you want to use the @code{bzip2} algorithms to compress small blocks +of data, 64k bytes or smaller, for example on an on-the-fly disk +compressor, you'd be well advised not to use this library. Instead, +I've made a special library tuned for that kind of use. It's part of +@code{e2compr-0.40}, an on-the-fly disk compressor for the Linux +@code{ext2} filesystem. Look at +@code{http://www.netspace.net.au/~reiter/e2compr}. + + + +@section Testing + +A record of the tests I've done. + +First, some data sets: +@itemize @bullet +@item B: a directory containing 6001 files, one for every length in the + range 0 to 6000 bytes. The files contain random lowercase + letters. 18.7 megabytes. +@item H: my home directory tree. Documents, source code, mail files, + compressed data. H contains B, and also a directory of + files designed as boundary cases for the sorting; mostly very + repetitive, nasty files. 565 megabytes. +@item A: directory tree holding various applications built from source: + @code{egcs}, @code{gcc-2.8.1}, KDE, GTK, Octave, etc. + 2200 megabytes. +@end itemize +The tests conducted are as follows. Each test means compressing +(a copy of) each file in the data set, decompressing it and +comparing it against the original. + +First, a bunch of tests with block sizes and internal buffer +sizes set very small, +to detect any problems with the +blocking and buffering mechanisms. +This required modifying the source code so as to try to +break it. +@enumerate +@item Data set H, with + buffer size of 1 byte, and block size of 23 bytes. +@item Data set B, buffer sizes 1 byte, block size 1 byte. +@item As (2) but small-mode decompression. +@item As (2) with block size 2 bytes. +@item As (2) with block size 3 bytes. +@item As (2) with block size 4 bytes. 
+@item As (2) with block size 5 bytes. +@item As (2) with block size 6 bytes and small-mode decompression. +@item H with buffer size of 1 byte, but normal block + size (up to 900000 bytes). +@end enumerate +Then some tests with unmodified source code. +@enumerate +@item H, all settings normal. +@item As (1), with small-mode decompress. +@item H, compress with flag @code{-1}. +@item H, compress with flag @code{-s}, decompress with flag @code{-s}. +@item Forwards compatibility: H, @code{bzip2-0.1pl2} compressing, + @code{bzip2-0.9.5} decompressing, all settings normal. +@item Backwards compatibility: H, @code{bzip2-0.9.5} compressing, + @code{bzip2-0.1pl2} decompressing, all settings normal. +@item Bigger tests: A, all settings normal. +@item As (7), using the fallback (Sadakane-like) sorting algorithm. +@item As (8), compress with flag @code{-1}, decompress with flag + @code{-s}. +@item H, using the fallback sorting algorithm. +@item Forwards compatibility: A, @code{bzip2-0.1pl2} compressing, + @code{bzip2-0.9.5} decompressing, all settings normal. +@item Backwards compatibility: A, @code{bzip2-0.9.5} compressing, + @code{bzip2-0.1pl2} decompressing, all settings normal. +@item Misc test: about 400 megabytes of @code{.tar} files with + @code{bzip2} compiled with Checker (a memory access error + detector, like Purify). +@item Misc tests to make sure it builds and runs ok on non-Linux/x86 + platforms. +@end enumerate +These tests were conducted on a 225 MHz IDT WinChip machine, running +Linux 2.0.36. They represent nearly a week of continuous computation. +All tests completed successfully. + + +@section Further reading +@code{bzip2} is not research work, in the sense that it doesn't present +any new ideas. Rather, it's an engineering exercise based on existing +ideas. + +Four documents describe essentially all the ideas behind @code{bzip2}: +@example +Michael Burrows and D. J. Wheeler: + "A block-sorting lossless data compression algorithm" + 10th May 1994. 
+ Digital SRC Research Report 124. + ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz + If you have trouble finding it, try searching at the + New Zealand Digital Library, http://www.nzdl.org. + +Daniel S. Hirschberg and Debra A. Lelewer + "Efficient Decoding of Prefix Codes" + Communications of the ACM, April 1990, Vol 33, Number 4. + You might be able to get an electronic copy of this + from the ACM Digital Library. + +David J. Wheeler + Program bred3.c and accompanying document bred3.ps. + This contains the idea behind the multi-table Huffman + coding scheme. + ftp://ftp.cl.cam.ac.uk/users/djw3/ + +Jon L. Bentley and Robert Sedgewick + "Fast Algorithms for Sorting and Searching Strings" + Available from Sedgewick's web page, + www.cs.princeton.edu/~rs +@end example +The following paper gives valuable additional insights into the +algorithm, but is not immediately the basis of any code +used in bzip2. +@example +Peter Fenwick: + Block Sorting Text Compression + Proceedings of the 19th Australasian Computer Science Conference, + Melbourne, Australia. Jan 31 - Feb 2, 1996. + ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps +@end example +Kunihiko Sadakane's sorting algorithm, mentioned above, +is available from: +@example +http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz +@end example +The Manber-Myers suffix array construction +algorithm is described in a paper +available from: +@example +http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps +@end example +Finally, the following paper documents some recent investigations +I made into the performance of sorting algorithms: +@example +Julian Seward: + On the Performance of BWT Sorting Algorithms + Proceedings of the IEEE Data Compression Conference 2000 + Snowbird, Utah. 28-30 March 2000. 
+@end example + + +@contents + +@bye + diff -Nru bzip2-1.0.1/doc/bzip2recover.1 bzip2-1.0.1.new/doc/bzip2recover.1 --- bzip2-1.0.1/doc/bzip2recover.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/bzip2recover.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1 @@ +.so bzip2.1 \ No newline at end of file diff -Nru bzip2-1.0.1/doc/pl/Makefile.am bzip2-1.0.1.new/doc/pl/Makefile.am --- bzip2-1.0.1/doc/pl/Makefile.am Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/pl/Makefile.am Sat Jun 24 20:13:06 2000 @@ -0,0 +1,4 @@ + +mandir = @mandir@/pl +man_MANS = bzip2.1 bunzip2.1 bzcat.1 bzip2recover.1 + diff -Nru bzip2-1.0.1/doc/pl/bunzip2.1 bzip2-1.0.1.new/doc/pl/bunzip2.1 --- bzip2-1.0.1/doc/pl/bunzip2.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/pl/bunzip2.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1 @@ +.so bzip2.1 \ No newline at end of file diff -Nru bzip2-1.0.1/doc/pl/bzcat.1 bzip2-1.0.1.new/doc/pl/bzcat.1 --- bzip2-1.0.1/doc/pl/bzcat.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/pl/bzcat.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1 @@ +.so bzip2.1 \ No newline at end of file diff -Nru bzip2-1.0.1/doc/pl/bzip2.1 bzip2-1.0.1.new/doc/pl/bzip2.1 --- bzip2-1.0.1/doc/pl/bzip2.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/pl/bzip2.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1,384 @@ +.\" Translation: Maciej Wojciechowski wojciech@staszic.waw.pl +.PU +.TH bzip2 1 "" "" "version 1.0" +.SH NAME +bzip2, bunzip2 \- a block-sorting file compressor/decompressor, v1.0 +.br +bzcat \- decompresses files to standard output +.br +bzip2recover \- recovers data from damaged bzip2 archives +.SH SYNOPSIS +.ll +8 +.B bzip2 +.RB [ \-cdfkqstvzVL123456789 ] +.RI [ filenames \&...] +.ll -8 +.br +.B bunzip2 +.RB [ \-fkvsVL ] +.RI [ filenames \&...] +.br +.B bzcat +.RB [ \-s ] +.RI [ filenames \&...] +.br +.B bzip2recover +.I filename +.SH DESCRIPTION +.I bzip2 +compresses files using the Burrows-Wheeler block-sorting algorithm and +Huffman coding.
Compression is generally considerably better than that achieved by +conventional LZ77/LZ78-based compressors, and approaches the +performance of the PPM family of statistical compressors. + +The command-line options are for the most part very similar to those of +.IR "GNU gzip" , +but they are not identical. + +.I bzip2 +expects a list of file names to accompany the command-line flags. Each file is +replaced by a compressed version of itself, with the name +"original_name.bz2". Each compressed file has the same modification time, +permissions and, where possible, ownership as the original, so that these +properties can be restored at decompression time. File name handling +is naive in the sense that there is no mechanism for +preserving original file names, permissions, ownerships or dates on file systems +which lack these concepts or have serious file name length restrictions, +such as MS-DOS. + +.I bzip2 +and +.I bunzip2 +will by default not overwrite existing files. If you want this to +happen, specify the \-f flag. + +If no file names are specified, +.I bzip2 +compresses from standard input to standard output. In this case it declines +to write compressed output to a terminal, as this would be +entirely incomprehensible and therefore pointless. + +.I bunzip2 +(or +.IR bzip2 \-d ) +decompresses all specified files. Files which were not +created by +.I bzip2 +will be detected and ignored, and a warning will be +issued.
+.I bzip2 +attempts to guess the file name for the decompressed file as follows: +.nf + filename.bz2 becomes filename + filename.bz becomes filename + filename.tbz2 becomes filename.tar + filename.tbz becomes filename.tar + anyothername becomes anyothername.out +.fi +If the file does not end in one of the recognised endings, +.IR .bz2 , +.IR .bz , +.I .tbz2 +or +.IR .tbz , +.I bzip2 +complains that it cannot guess the name of the original file, and uses +the original name with +.I .out +appended. + +As with compression, supplying no file names causes decompression from +standard input to standard output. + +.I bunzip2 +will correctly decompress a file which is the concatenation of two or more +compressed files. The result is the concatenation of the corresponding +uncompressed files. Integrity testing +(\-t) of concatenated compressed files is also supported. + +You can also compress or decompress files to standard output +by giving the \-c flag. Multiple files may be compressed and decompressed +this way. +The resulting outputs are fed sequentially to standard output. +Compressing multiple files in this manner generates a stream +containing multiple compressed file representations. Such a stream can +be decompressed correctly only by +.I bzip2 +version 0.9.0 or later. Earlier versions of +.I bzip2 +will stop after decompressing the first file in the stream. + +.I bzcat +(or +.I bzip2 -dc) +decompresses all specified files to standard output. + +.I bzip2 +reads arguments from the environment variables +.I BZIP2 +and +.I BZIP, +in that order, and processes them before any arguments +read from the command line. This gives a convenient way to supply +default arguments. + +Compression is always performed, even if the compressed file is +slightly larger than the original.
Files of less than about +one hundred bytes tend to get larger, since the compression mechanism has a constant +overhead of around 50 bytes. Random data (including the output +of most file compressors) is coded at about 8.05 bits per +byte, giving an expansion of around 0.5%. + +As a self-check for your protection, +.I bzip2 +uses 32-bit CRCs to make sure that the decompressed version of a file is +identical to the original. This guards against corruption of the compressed data, +and against undetected bugs in +.I bzip2 +(hopefully very unlikely). The chance of data corruption going undetected +is microscopic, about one chance in four billion for each +file processed. Be aware, though, that the check occurs upon decompression, +so it can only tell you that something is wrong. It cannot help you recover the +original uncompressed data. You can use +.I bzip2recover +to try to recover data from damaged files. + +Return values: 0 for a normal exit, 1 for environmental problems +(file not found, invalid flags, I/O errors, etc.), 2 to +indicate a corrupt compressed file, 3 for an internal consistency error (e.g. a +bug) which caused \fIbzip2\fP to panic. + +.SH OPTIONS +.TP +.B \-c --stdout +Compress or decompress to standard output. +.TP +.B \-d --decompress +Force decompression. +.IR bzip2 , +.I bunzip2 +and +.I bzcat +are really the same program, and the decision about what action to take +is made on the basis of the name under which it was invoked. This flag overrides +that mechanism and forces \fIbzip2\fP to decompress. +.TP +.B \-z --compress +The complement to \-d: forces compression, regardless of the invocation name. +.TP +.B \-t --test +Check the integrity of the specified file(s), but do not decompress them. This +performs a trial decompression, discards the output, and reports the outcome. +.TP +.B \-f --force +Force overwriting of output files. Normally, \fIbzip2\fP will not +overwrite existing output files.
It also forces \fIbzip2\fP +to break hard links to files, which it otherwise would not do. +.TP +.B \-k --keep +Keep (do not delete) input files during compression or decompression. +.TP +.B \-s --small +Reduce memory usage for compression, decompression and testing. Files are +decompressed and tested using a modified algorithm which +requires only 2.5 bytes per block byte. This means that any file can +be decompressed in about 2300k of memory, albeit at about half +the normal speed. + +During compression, \-s selects a block size of 200k, which limits +memory use to around the same figure, at the expense of compression ratio. In +short, if your machine is low on memory (8 megabytes or less), +use \-s for everything. See \fBMEMORY MANAGEMENT\fP below. +.TP +.B \-q --quiet +Suppress non-essential warning messages. +Messages pertaining to I/O errors and other +critical events are not suppressed. +.TP +.B \-v --verbose +Verbose mode -- show the compression ratio for each file processed. Further +\fB\-v\fP's increase the verbosity level, producing large amounts of +information which is primarily of interest for diagnostic purposes. +.TP +.B \-L --license -V --version +Display the software version and license terms. +.TP +.B \-1 to \-9 +Set the block size to 100 k, 200 k .. 900 k when compressing. Has +no effect when decompressing. See \fBMEMORY MANAGEMENT\fP +below. +.TP +.B \-- +Treats all subsequent arguments as file names, even if +they start with a dash. This lets you compress and decompress +files whose names begin with a dash, for example: bzip2 \-- +\-myfilename. +.TP +.B \--repetitive-fast --repetitive-best +These flags are redundant in versions 0.9.5 and above. They provided +some coarse control over the behaviour of the sorting algorithm in +earlier versions, which was sometimes useful.
Versions 0.9.5 and above +have an improved algorithm which renders these flags irrelevant. + +.SH MEMORY MANAGEMENT +.I bzip2 +compresses large files in blocks. The block size affects both the +compression ratio achieved and the amount of memory needed for compression +and decompression. The flags \-1 through \-9 select a block size of +100,000 bytes through 900,000 bytes (the default) respectively. At decompression time, +the block size used for compression is read from the header of the +compressed file, and +.I bunzip2 +then allocates just enough memory to decompress the file. Since block +sizes are stored in compressed files, the flags \-1 to \-9 +are irrelevant to decompression. + +Compression and decompression requirements, in bytes, can be estimated as: + + Compression : 400k + ( 8 x block size ) + + Decompression : 100k + ( 4 x block size ), or + 100k + ( 2.5 x block size ) + +Larger block sizes give rapidly diminishing marginal returns. Most of the +compression comes from the first one or two hundred kilobytes of block size, +a fact worth bearing in mind when using \fIbzip2\fP on machines with +little memory. It is also important to appreciate that the decompression memory +requirement is set at compression time by the choice of +block size. + +For files compressed with the default 900k block size, +\fIbunzip2\fP will require about 3700 kilobytes to decompress. To +support decompression on a machine with only 4 megabytes +of memory, \fIbunzip2\fP has an option to decompress using approximately +half this amount of memory, about 2300 kilobytes. Decompression speed is also +roughly halved, so you should use this option only where necessary. The +relevant flag is -s. + +In general, try to use the largest block size that memory +constraints allow. Compression and decompression speed are virtually +unaffected by block size.
+ +Another significant point applies to files which fit in a single block -- +that means most files you would encounter using a large block size. +The amount of real memory touched is proportional to the size of the file, +since the file is smaller than a block. For example, compressing a file +20,000 bytes long with the flag -9 will cause the compressor to allocate +around 7600k of memory, but only touch 400k + 20000 * 8 = 560 kilobytes of +it. Similarly, the decompressor will allocate 3700k but only touch 100k + 20000 +* 4 = 180 kilobytes. + +Here is a table which summarises the maximum memory usage for different +block sizes. Also recorded is the total compressed size for the 14 +files of the Calgary Text Compression Corpus, totalling +3,141,622 bytes. This column gives some feel for how compression +varies with block size. These figures tend to understate the advantage of +larger block sizes for larger files, since the Corpus is dominated +by smaller files. +.nf + Compress Decompress Decompress Corpus + Flag usage usage -s usage Size + + -1 1200k 500k 350k 914704 + -2 2000k 900k 600k 877703 + -3 2800k 1300k 850k 860338 + -4 3600k 1700k 1100k 846899 + -5 4400k 2100k 1350k 845160 + -6 5200k 2500k 1600k 838626 + -7 6100k 2900k 1850k 834096 + -8 6800k 3300k 2100k 828642 + -9 7600k 3700k 2350k 828642 +.fi +.SH RECOVERING DATA FROM DAMAGED BZIP2 FILES +.I bzip2 +compresses files in blocks, usually 900 kilobytes long. Each block is +handled independently. If a media or transmission error damages a multi-block .bz2 +file, it may be possible to recover data from +the undamaged blocks in the file. + +The compressed representation of each block is delimited by a 48-bit pattern, which makes it +possible to find the block boundaries with reasonable certainty. Each block +also carries its own 32-bit CRC, so damaged blocks can be +distinguished from undamaged ones.
+ +.I bzip2recover +is a separate program whose purpose is to search for blocks in .bz2 +files and write each block out into its own .bz2 file. You can then use +\fIbzip2\fP \-t to test the integrity of the resulting files, and decompress +those which are undamaged. + +.I bzip2recover +takes a single argument, the name of the damaged file, and writes a number of +files "rec0001file.bz2", "rec0002file.bz2", etc., containing the extracted +blocks. The output file names are designed so that the use of +wildcards in subsequent processing -- for example, "bzip2 -dc rec*file.bz2 > +recovered_data" -- lists the files in the correct order. + +.I bzip2recover +should be of most use dealing with large .bz2 files, as these will +contain many blocks. It is clearly futile to use it on +damaged single-block files, since a damaged block cannot be +recovered. If you wish to minimise any potential data loss +through media or transmission errors, you might consider compressing with a +smaller block size. + +.SH PERFORMANCE NOTES +The sorting phase of compression gathers together similar strings in the file. Because +of this, files containing very long runs of repeated symbols, like +"aabaabaabaab ..." (repeated several hundred times) may compress more slowly +than normal. Versions 0.9.5 and above fare much better in this +respect than previous versions. The compression time ratio between the +worst case and the average case is now about 10:1. For +previous versions, this figure was more like 100:1. You can use the +\-vvvv flag to monitor progress in great detail, if you want. + +Decompression speed is unaffected by these phenomena. + +.I bzip2 +usually allocates several megabytes of memory to operate in, and +then accesses it in a fairly random fashion. +This means that performance, both for compressing and decompressing, is +largely determined by the speed at which your machine can service cache +misses.
Because of this, small changes to the code to +reduce the miss rate have given disproportionately large performance improvements. +I imagine +.I bzip2 +will perform best on machines with very large caches. + +.SH CAVEATS +I/O error messages are not as helpful as they could be. +.I bzip2 +tries hard to detect I/O errors and exit cleanly, but the +details of what the problem is can sometimes be rather misleading. + +This manual page pertains to version 1.0 of \fIbzip2\fP. +Compressed data created by this version is both forwards and +backwards compatible with the previous public releases, +versions 0.1pl2, 0.9.0 and 0.9.5, with one exception: 0.9.0 and above can +correctly decompress multiple concatenated compressed files. +0.1pl2 cannot do this; it will stop after decompressing just the first file in +the stream. + +.I bzip2recover +uses 32-bit integers to represent bit positions in compressed +files, so it cannot handle compressed files more than 512 +megabytes long. This could easily be fixed. + +.SH AUTHOR +Julian Seward, jseward@acm.org. + +http://www.muraroa.demon.co.uk +http://sourceware.cygnus.com/bzip2 + +The ideas embodied in \fIbzip2\fP are due to (at least) the +following people: Michael Burrows and David Wheeler (for the block-sorting +transformation), David Wheeler (again, for the Huffman coder), Peter Fenwick +(for the structured coding model in the original \fIbzip\fP, and many +refinements), and Alistair Moffat, Radford Neal and Ian Witten (for the arithmetic +coder in the original \fIbzip\fP). I am much indebted to them for their help, +support and advice. See the manual in the source distribution for +pointers to sources of documentation. Christian von Roques encouraged me to +look for faster sorting algorithms, so as to speed up +compression. Bela Lubkin encouraged me to improve the worst-case +compression performance.
Many people sent patches, helped with various problems, +lent machines, gave advice and were generally helpful. diff -Nru bzip2-1.0.1/doc/pl/bzip2recover.1 bzip2-1.0.1.new/doc/pl/bzip2recover.1 --- bzip2-1.0.1/doc/pl/bzip2recover.1 Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/doc/pl/bzip2recover.1 Sat Jun 24 20:13:06 2000 @@ -0,0 +1 @@ +.so bzip2.1 \ No newline at end of file diff -Nru bzip2-1.0.1/huffman.c bzip2-1.0.1.new/huffman.c --- bzip2-1.0.1/huffman.c Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/huffman.c Sat Jun 24 20:13:06 2000 @@ -58,6 +58,10 @@ For more information on these sources, see the manual. --*/ +#ifdef HAVE_CONFIG_H +#include +#endif + #include "bzlib_private.h" diff -Nru bzip2-1.0.1/makefile.msc bzip2-1.0.1.new/makefile.msc --- bzip2-1.0.1/makefile.msc Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/makefile.msc Thu Jan 1 01:00:00 1970 @@ -1,63 +0,0 @@ -# Makefile for Microsoft Visual C++ 6.0 -# usage: nmake -f makefile.msc -# K.M. Syring (syring@gsf.de) -# Fixed up by JRS for bzip2-0.9.5d release. - -CC=cl -CFLAGS= -DWIN32 -MD -Ox -D_FILE_OFFSET_BITS=64 - -OBJS= blocksort.obj \ - huffman.obj \ - crctable.obj \ - randtable.obj \ - compress.obj \ - decompress.obj \ - bzlib.obj - -all: lib bzip2 test - -bzip2: lib - $(CC) $(CFLAGS) -o bzip2 bzip2.c libbz2.lib setargv.obj - $(CC) $(CFLAGS) -o bzip2recover bzip2recover.c - -lib: $(OBJS) - lib /out:libbz2.lib $(OBJS) - -test: bzip2 - type words1 - .\\bzip2 -1 < sample1.ref > sample1.rb2 - .\\bzip2 -2 < sample2.ref > sample2.rb2 - .\\bzip2 -3 < sample3.ref > sample3.rb2 - .\\bzip2 -d < sample1.bz2 > sample1.tst - .\\bzip2 -d < sample2.bz2 > sample2.tst - .\\bzip2 -ds < sample3.bz2 > sample3.tst - @echo All six of the fc's should find no differences. - @echo If fc finds an error on sample3.bz2, this could be - @echo because WinZip's 'TAR file smart CR/LF conversion' - @echo is too clever for its own good. Disable this option. - @echo The correct size for sample3.ref is 120,244.
If it - @echo is 150,251, WinZip has messed it up. - fc sample1.bz2 sample1.rb2 - fc sample2.bz2 sample2.rb2 - fc sample3.bz2 sample3.rb2 - fc sample1.tst sample1.ref - fc sample2.tst sample2.ref - fc sample3.tst sample3.ref - - - -clean: - del *.obj - del libbz2.lib - del bzip2.exe - del bzip2recover.exe - del sample1.rb2 - del sample2.rb2 - del sample3.rb2 - del sample1.tst - del sample2.tst - del sample3.tst - -.c.obj: - $(CC) $(CFLAGS) -c $*.c -o $*.obj - diff -Nru bzip2-1.0.1/manual.ps bzip2-1.0.1.new/manual.ps --- bzip2-1.0.1/manual.ps Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/manual.ps Thu Jan 1 01:00:00 1970 @@ -1,3808 +0,0 @@ -%!PS-Adobe-2.0 -%%Creator: dvips(k) 5.78 Copyright 1998 Radical Eye Software (www.radicaleye.com) -%%Title: manual.dvi -%%Pages: 39 -%%PageOrder: Ascend -%%BoundingBox: 0 0 596 842 -%%EndComments -%DVIPSCommandLine: dvips -o manual.ps manual.dvi -%DVIPSParameters: dpi=600, compressed -%DVIPSSource: TeX output 2000.03.23:2343 -%%BeginProcSet: texc.pro -%! 
-/TeXDict 300 dict def TeXDict begin /N{def}def /B{bind def}N /S{exch}N -/X{S N}B /TR{translate}N /isls false N /vsize 11 72 mul N /hsize 8.5 72 -mul N /landplus90{false}def /@rigin{isls{[0 landplus90{1 -1}{-1 1} -ifelse 0 0 0]concat}if 72 Resolution div 72 VResolution div neg scale -isls{landplus90{VResolution 72 div vsize mul 0 exch}{Resolution -72 div -hsize mul 0}ifelse TR}if Resolution VResolution vsize -72 div 1 add mul -TR[matrix currentmatrix{dup dup round sub abs 0.00001 lt{round}if} -forall round exch round exch]setmatrix}N /@landscape{/isls true N}B -/@manualfeed{statusdict /manualfeed true put}B /@copies{/#copies X}B -/FMat[1 0 0 -1 0 0]N /FBB[0 0 0 0]N /nn 0 N /IE 0 N /ctr 0 N /df-tail{ -/nn 8 dict N nn begin /FontType 3 N /FontMatrix fntrx N /FontBBox FBB N -string /base X array /BitMaps X /BuildChar{CharBuilder}N /Encoding IE N -end dup{/foo setfont}2 array copy cvx N load 0 nn put /ctr 0 N[}B /df{ -/sf 1 N /fntrx FMat N df-tail}B /dfs{div /sf X /fntrx[sf 0 0 sf neg 0 0] -N df-tail}B /E{pop nn dup definefont setfont}B /ch-width{ch-data dup -length 5 sub get}B /ch-height{ch-data dup length 4 sub get}B /ch-xoff{ -128 ch-data dup length 3 sub get sub}B /ch-yoff{ch-data dup length 2 sub -get 127 sub}B /ch-dx{ch-data dup length 1 sub get}B /ch-image{ch-data -dup type /stringtype ne{ctr get /ctr ctr 1 add N}if}B /id 0 N /rw 0 N -/rc 0 N /gp 0 N /cp 0 N /G 0 N /sf 0 N /CharBuilder{save 3 1 roll S dup -/base get 2 index get S /BitMaps get S get /ch-data X pop /ctr 0 N ch-dx -0 ch-xoff ch-yoff ch-height sub ch-xoff ch-width add ch-yoff -setcachedevice ch-width ch-height true[1 0 0 -1 -.1 ch-xoff sub ch-yoff -.1 sub]/id ch-image N /rw ch-width 7 add 8 idiv string N /rc 0 N /gp 0 N -/cp 0 N{rc 0 ne{rc 1 sub /rc X rw}{G}ifelse}imagemask restore}B /G{{id -gp get /gp gp 1 add N dup 18 mod S 18 idiv pl S get exec}loop}B /adv{cp -add /cp X}B /chg{rw cp id gp 4 index getinterval putinterval dup gp add -/gp X adv}B /nd{/cp 0 N rw exit}B /lsh{rw cp 2 copy get dup 0 
eq{pop 1}{ -dup 255 eq{pop 254}{dup dup add 255 and S 1 and or}ifelse}ifelse put 1 -adv}B /rsh{rw cp 2 copy get dup 0 eq{pop 128}{dup 255 eq{pop 127}{dup 2 -idiv S 128 and or}ifelse}ifelse put 1 adv}B /clr{rw cp 2 index string -putinterval adv}B /set{rw cp fillstr 0 4 index getinterval putinterval -adv}B /fillstr 18 string 0 1 17{2 copy 255 put pop}for N /pl[{adv 1 chg} -{adv 1 chg nd}{1 add chg}{1 add chg nd}{adv lsh}{adv lsh nd}{adv rsh}{ -adv rsh nd}{1 add adv}{/rc X nd}{1 add set}{1 add clr}{adv 2 chg}{adv 2 -chg nd}{pop nd}]dup{bind pop}forall N /D{/cc X dup type /stringtype ne{] -}if nn /base get cc ctr put nn /BitMaps get S ctr S sf 1 ne{dup dup -length 1 sub dup 2 index S get sf div put}if put /ctr ctr 1 add N}B /I{ -cc 1 add D}B /bop{userdict /bop-hook known{bop-hook}if /SI save N @rigin -0 0 moveto /V matrix currentmatrix dup 1 get dup mul exch 0 get dup mul -add .99 lt{/QV}{/RV}ifelse load def pop pop}N /eop{SI restore userdict -/eop-hook known{eop-hook}if showpage}N /@start{userdict /start-hook -known{start-hook}if pop /VResolution X /Resolution X 1000 div /DVImag X -/IE 256 array N 2 string 0 1 255{IE S dup 360 add 36 4 index cvrs cvn -put}for pop 65781.76 div /vsize X 65781.76 div /hsize X}N /p{show}N -/RMat[1 0 0 -1 0 0]N /BDot 260 string N /rulex 0 N /ruley 0 N /v{/ruley -X /rulex X V}B /V{}B /RV statusdict begin /product where{pop false[ -(Display)(NeXT)(LaserWriter 16/600)]{dup length product length le{dup -length product exch 0 exch getinterval eq{pop true exit}if}{pop}ifelse} -forall}{false}ifelse end{{gsave TR -.1 .1 TR 1 1 scale rulex ruley false -RMat{BDot}imagemask grestore}}{{gsave TR -.1 .1 TR rulex ruley scale 1 1 -false RMat{BDot}imagemask grestore}}ifelse B /QV{gsave newpath transform -round exch round exch itransform moveto rulex 0 rlineto 0 ruley neg -rlineto rulex neg 0 rlineto fill grestore}B /a{moveto}B /delta 0 N /tail -{dup /delta X 0 rmoveto}B /M{S p delta add tail}B /b{S p tail}B /c{-4 M} -B /d{-3 M}B /e{-2 M}B /f{-1 M}B /g{0 
M}B /h{1 M}B /i{2 M}B /j{3 M}B /k{ -4 M}B /w{0 rmoveto}B /l{p -4 w}B /m{p -3 w}B /n{p -2 w}B /o{p -1 w}B /q{ -p 1 w}B /r{p 2 w}B /s{p 3 w}B /t{p 4 w}B /x{0 S rmoveto}B /y{3 2 roll p -a}B /bos{/SS save N}B /eos{SS restore}B end - -%%EndProcSet -TeXDict begin 39158280 55380996 1000 600 600 (manual.dvi) -@start -%DVIPSBitmapFont: Fa cmti10 10.95 1 -/Fa 1 47 df<120FEA3FC0127FA212FFA31380EA7F00123C0A0A77891C>46 -D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fb cmbxti10 14.4 1 -/Fb 1 47 df<13FCEA03FF000F13804813C05AA25AA2B5FCA31480A214006C5A6C5A6C5A -EA0FE0121271912B>46 D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fc cmsl10 10.95 25 -/Fc 25 122 df37 D44 D48 D<157015F014011407143F903803FFE0137FEBFFCFEBF80F1300 -141F15C0A5143F1580A5147F1500A55C5CA513015CA513035CA513075CA5130F5CA3131F -497EB612F8A31D3D78BC2D>I<133C137F5B481380A31400A26C5A137890C7FCB3EA0780 -EA0FE0121F123FA5121FEA0F601200A213E05BA212015B120390C7FC5A1206120E5A5A12 -3012705A5A11397AA619>59 D97 DIIIII<147FEB3FFFA313017FA25CA513015CA51303 -5CA4ED07F80107EB1FFF9139F0781FC09138F1E00F9139F38007E0ECF70002FE14F0495A -5CA25CA24A130F131F4A14E0A4161F133F4A14C0A4163F137F91C71380A4167F5B491500 -A300015D486C491380B5D8F87F13FCA32E3F7DBE33>104 D<1478EB01FE130314FFA25B -14FE130314FCEB00F01400ACEB03F8EA01FF14F0A2EA001F130FA314E0A5131F14C0A513 -3F1480A5137F1400A55B5BA4EA03FF007F13F0A2B5FC183E7DBD1A>I<143FEB1FFF5BA2 -13017FA214FEA5130114FCA5130314F8A5130714F0A5130F14E0A5131F14C0A5133F1480 -A5137F1400A55B5BA4EA03FF007F13F8A2B5FC183F7DBE1A>108 -D<902707F007F8EB03FCD803FFD91FFF90380FFF80913CE0781FC03C0FE09126E1E00FEB -F0073E001FE38007E1C003F090260FE700EBE38002EEDAF70013F802FC14FE02D85C14F8 -4A5CA24A5C011F020F14074A4A14F0A5013F021F140F4A4A14E0A5017F023F141F91C749 -14C0A549027F143F4992C71380A300014B147F486C496DEBFFC0B5D8F87FD9FC3F13FEA3 -47287DA74C>I<903907F007F8D803FFEB1FFF9139E0781FC09138E1E00F3B001FE38007 -E090380FE70002EE14F014FC14D814F85CA24A130F131F4A14E0A4161F133F4A14C0A416 
-3F137F91C71380A4167F5B491500A300015D486C491380B5D8F87F13FCA32E287DA733> -II<91387F01FE903A7FFF0FFFC09139FE3E03F09238F801F890 -3A01FFE000FE4B137F6D497F4990C713804A15C04A141FA218E0A20103150F5C18F0A317 -1F010716E05CA3173F18C0130F4A147F1880A2EFFF004C5A011F5D16034C5A6E495AEE1F -C06E495AD93FDC017EC7FC91388F01F8913883FFE0028090C8FC92C9FC137FA291CAFCA4 -5BA25BA31201487EB512F8A3343A81A733>I<903907F01F80D803FFEB7FE09138E1E1F0 -9138E387F839001FE707EB0FE614EE02FC13F002D813E09138F801804AC7FCA25C131FA2 -5CA4133F5CA5137F91C8FCA55B5BA31201487EB512FEA325287EA724>114 -D<9138FF81C0010713E390381F807F90397C003F8049131F4848130F5B00031407A24848 -1400A27FA27F6D90C7FCEBFF8014FC6C13FF6C14C015F06C6C7F011F7F13079038007FFE -1403140100381300157EA2123C153E157E007C147CA2007E147815F8007F495A4A5A486C -485A26F9E01FC7FC38E0FFFC38C01FE0222A7DA824>II<01FE147F00FFEC7FFF4914FEA20007140300031401A34914FCA4 -150312074914F8A41507120F4914F0A4150F121F4914E0A2151FA3153F4914C0157F15FF -EC01DF3A0FC003BFE09138073FFF3803F01E3801FFF826003FE01380282977A733>III<90B539E007FFF05E18E0902707FE000313006D48EB01FC -705A5F01014A5A5F16036E5C0100140794C7FC160E805E805E1678ED8070023F13F05EED -81C015C191381FC38015C793C8FC15EF15EEEC0FFCA25DA26E5AA25DA26E5A5DA24AC9FC -5C140E141E141C5C121C003F5B5A485B495A130300FE5B4848CAFCEA701EEA783CEA3FF0 -EA0FC0343A80A630>121 D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fd cmtt12 14.4 10 -/Fd 10 123 df50 D<383FFF805AB57EA37E7EEA003FAEED07FC92383FFF -8092B512E002C314F802CF8002DF8091B7FCDBF80F1380DBC00113C092C713E04A143F4A -EC1FF04A15F84A140F4AEC07FCA217034A15FE1701A318FF83A95F18FEA280170318FC6E -140718F86E140FEF1FF06E143F6EEC7FE06EECFFC0DBC0031380EDF01F92B6120002DF14 -FC02CF5C02C35CD91F8114C090260F807F90C7FC90C7EA0FF8384A7FC83E>98 -D<923803FFF85D4B7FA38181ED0003AEEC1FF0ECFFFE0103EBFF83010F14E34914F3017F -14FB90B7FC48EBF80F48EBC00191C7FC4848143F4848141F5B4848140F491407123F4914 -03127F5BA312FF90C8FCA97F127FA216077F123F6D140FA26C6C141F6D143F000F157F6C 
-6C14FF01FF5B6C6D5A6CD9F01FEBFFFC6C90B500FB13FE6D02F313FF6D14E3010F14C36D -020113FE010101FC14FC9026003FE0C8FC384A7CC83E>100 D<143E147F4A7E497FA56D -5B6EC8FC143E91C9FCAC003FB57E5A81A47EC7123FB3B3007FB71280B812C0A56C16802A -4A76C93E>105 D<007FB512C0B6FC81A47EC7121FB3B3B3A5007FB712F8B812FCA56C16 -F82E4978C83E>108 D111 -DI<903901FFF00F011F9038 -FE1F8090B612BF000315FF5A5A5A393FFE003F01F01307D87FC0130190C8FC5A48157FA4 -7EEE3F00D87FC091C7FC13F0EA3FFE381FFFF06CEBFFC06C14FE6C6E7EC615E0013F14F8 -010780D9003F7F02007F03071380030013C0003EED3FE0007F151F48150F17F06D1407A3 -7FA26D140F6D15E0161F01FCEC3FC06D14FF9026FFC00F138091B612005E485D013F5C6D -14E0D8FC0714802778007FF8C7FC2C3677B43E>115 D<147C14FC497EAD003FB712FC5A -B87EA36C5EA2260001FEC9FCB3A6173FA2EF7F80A76E14FF6D16006F5A9238C007FE9138 -7FF01F92B55A6E5C6E5C6E5C6E1480020149C7FC9138003FF031437DC13E>I<000FB812 -804817C04817E0A418C001C0C712014C13804C1300EE1FFE4C5AEE7FF06C484A5A4B5BC8 -485B4B90C7FC4B5A4B5A4B5A4B5A4B5A4A5B4A5B4A90C8FC4A5A4A5A4A5A4A5A4A5A495B -495B4990C9FC495A495A4948EC0FC0495A4948EC1FE0485B00075B4890C8FCEA1FFC485A -485A90B8FCB9FCA46C17C07E33337CB23E>122 D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fe cmtt12 13.14 31 -/Fe 31 123 df50 D<003FB6FC4815E0B712F882826C816C16802701FC000113C0 -9238007FE0161FEE0FF0A2160717F81603A6160717F0A2160FEE1FE0163FEE7FC0923801 -FF80030F130090B65A5E16F08216FEEEFF8017C001FCC7EA7FE0EE1FF0EE07F8160317FC -EE01FE1600A217FF177FA717FF17FE16011603EE07FC160FEE3FF8EEFFF0003FB7FC4816 -E0B812C01780EEFE006C15F86C15C030437DC238>66 DI<007FB512F8B7FC16C08216F8826C813A03F8001FFF15 -07030113806F13C0167FEE3FE0161FEE0FF0A2EE07F8A2EE03FCA21601A217FE1600A417 -7FAD17FF17FEA4160117FCA2160317F81607A2EE0FF0161FEE3FE0167FEEFFC04B13805D -031F1300007FB65AB75A5E5E16C093C7FC6C14F830437DC238>I<007FB712FCB87EA57E -D801FCC8FCA9177C94C7FCA6ED07C04B7EA590B6FCA79038FC000FA56F5A92C9FCA7EF0F -80EF1FC0AA007FB8FCB9FCA56C178032437DC238>I<91391FF003C091397FFC07E049B5 
-FC010714CF4914EF4914FF5B90387FF81F9038FFE00748EB800191C7FC4848147F485A49 -143F485A161F485AA249140F123F5BA2127F90C8EA07C093C7FCA35A5AAA923807FFFC4B -13FE4B13FF7E7E6F13FE6F13FC9238000FE07F003F151FA27F121F7F163F6C7EA26C6C14 -7F7F6C6C14FF6C6C5B6E5A6C6D5A90387FF81F6DB6FC6D14EF6D14CF6D148F0101140F90 -3A007FFC07C0DA1FF0C7FC30457CC338>71 D<007FB612F0B712F8A56C15F0260001FCC7 -FCB3B3B1007FB612F0B712F8A56C15F0254377C238>73 D<90380FFFFE90B612E0000315 -F8488148814881A2273FFE000F138001F01301497F49147F4848EC3FC0A290C8121FA448 -16E0A248150FB3AC6C151FA36C16C0A36D143FA36C6CEC7F806D14FF6D5B01FE130F6CB7 -1200A26C5D6C5D6C5DC615E0010F49C7FC2B457AC338>79 D<003FB512F04814FEB77E16 -E0826C816C813A01FC003FFEED07FF03017F81707E163F161F83160FA7161F5F163F167F -4C5A5D030790C7FCED3FFE90B65A5E5E5EA282829038FC001FED07FC6F7E150115008282 -AA18E0EF01F0EF03F8A31783EE3F87263FFFE0ECC7F0486D14FFB56C7F18E07013C06C49 -6D13806C496D1300CA12FC35447EC238>82 D<003FB8FC481780B9FCA53BFE0007F0003F -A9007CEE1F00C792C7FCB3B3A70107B512F04980A56D5C31437DC238>84 -D<267FFFF090387FFFF0B56C90B512F8A56C496D13F0D801FCC73801FC00B3B3A66D1403 -00005EA36D14076D5D6E130F6D6C495A6E133F6D6C495A6D6C495AECFF076D90B5C7FC6D -5C6D5C6D5C023F13E0020F1380DA03FEC8FC35447FC238>I87 -D<001FB712F04816F85AA417F090C8121F17E0EE3FC0167F1780EEFF00A24B5A4B5A123E -C8485A4B5AA24B5A4B5AA24B5A4BC7FCA24A5A14035D4A5A140F5D4A5A143F5D4A5A14FF -92C8FC495A13035C495AA2495A495AA2495A495A17F849C7EA01FC485AA2485A485AA248 -5A121F5B485A127F90B7FCB8FCA56C16F82E437BC238>90 D<003FB712804816C0B812E0 -A46C16C06C16802B087A7D38>95 D97 DIIIII<14F0497E497E497EA4 -6D5A6D5A6D5A91C8FCAB383FFFFC487FB5FCA37E7EC7FCB3AF007FB612F0B712F816FCA3 -16F86C15F0264476C338>105 D<387FFFFEB6FCA57EC77EB3B3B1007FB7FCB81280A56C -1600294379C238>108 D<023FEB07E03B3FE0FFC01FF8D87FF39038E07FFCD8FFF76D48 -7E90B500F97F15FB6C91B612806C01C1EBF83F00030100EBE01F4902C013C0A24990387F -800FA2491400A349137EB3A73C3FFF03FFE07FFC4801879038F0FFFEB500C76D13FFA36C -01874913FE6C01039038E07FFC383080AF38>IIII114 
D<903907FF80F0017FEBF1F848B5 -12FD000714FF5A5A5AEBFC00D87FE0131F0180130F48C71207481403A5007FEC01F001C0 -90C7FCEA3FF013FE381FFFF86CEBFFC0000314F8C614FF013F1480010714E0D9003F13F0 -020013F8ED0FFC1503003CEC01FE007E140000FE15FF167F7EA37F6D14FF16FE01F01303 -6DEB07FC01FF137F91B512F816F016E04815C0D8FC3F1400010F13FCD8780113E0283278 -B038>III<000FB712FC4816FE5AA417 -FC0180C7EA1FF8EE3FF0EE7FE0EEFFC04B13804B13006CC7485AC8485A4B5A4B5A4B5A4B -5A4A5B4A90C7FCEC07FC4A5A4A5A4A5A4A5A49485A4990C8FC495A495A495A495A494814 -7C494814FE485B4890C8FC485A485A485A485A48B7FCB8FCA56C16FC2F2F7DAE38>122 -D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Ff cmbx12 13.14 53 -/Ff 53 122 df<923807FFE092B512FC020714FF021F81027F9038007FC0902601FFF0EB -0FE04901C0497E4990C7487ED90FFC147F011F824A14FF495AA2137F5CA2715A715A715A -EF078094C8FCA7EF07FCB9FCA526007FF0C7123F171FB3B3A2003FB5D8E00FB512F8A53D -4D7ECC44>12 D45 DI<177817F8EE01FCA21603A2EE07F8A217F016 -0FA217E0161FA2EE3FC0A21780167FA217005EA24B5AA25E1503A24B5AA25E150FA25E15 -1FA24B5AA25E157FA24BC7FCA25D1401A25D1403A24A5AA25D140FA24A5AA25D143FA25D -147FA24AC8FCA25C1301A25C1303A2495AA25C130FA2495AA25C133FA25C137FA249C9FC -A25B1201A2485AA25B1207A25B120FA2485AA25B123FA25B127FA248CAFCA25AA2127CA2 -2E6D79D13D>I<15F014011407141F147FEB03FF137FB6FCA313FC1380C7FCB3B3B2007F -B712E0A52B4777C63D>49 DIIIII<121F7F7F13FE90B812E0A45A18C0188018005F5FA25F485E90C8 -EA07E0007E4B5A5F007C151F4CC7FC167E5E485D15014B5A4B5AC8485A4B5AA24BC8FC15 -7EA25D1401A24A5A1407A24A5AA2141FA24A5AA2147FA314FFA3495BA45BA55BAA6D5BA2 -6D90C9FCEB007C334B79C93D>III65 -D<93261FFF80EB01C00307B500F81303033F02FE13074AB7EAC00F0207EEE03F021F903A -FE007FF87F027F01E0903807FCFF91B5C70001B5FC010301FC6E7E4901F0151F4901C081 -4949814990C97E494882494882485B48197F4A173F5A4A171F5A5C48190FA2485B1A07A2 -5AA297C7FC91CDFCA2B5FCAD7EA280A2F207C07EA36C7FA26C190F6E18807E6E171F6C1A -006E5F6C193E6C6D177E6D6C5F6D6C4C5A6D6D15036D6D4B5A6D01F04B5A6D01FCED3FC0 -010001FFEDFF806E01E0D903FEC7FC021F01FEEB3FFC020790B612F002015EDA003F92C8 
-FC030714FCDB001F13804A4D79CB59>67 D -III<93261FFF80EB01C00307B500F8 -1303033F02FE13074AB7EAC00F0207EEE03F021F903AFE007FF87F027F01E0903807FCFF -91B5C70001B5FC010301FC6E7E4901F0151F4901C0814949814990C97E49488249488248 -5B48197F4A173F5A4A171F5A5C48190FA2485B1A07A25AA297C8FC91CEFCA2B5FCAD6C04 -0FB712C0A280A36C93C7001FEBC000A2807EA27E807E807E806C7F7E6D7E6D7E6D7F6D01 -E05D6D6D5D6D13FC010001FF4AB5FC6E01E0EB07F9021F01FFEB3FF0020791B5EAE07F02 -01EEC01FDA003FED0007030702F81301DB001F018090C8FC524D79CB61>III76 DII -II82 DI<003FBB12C0A5DA80019038FC001FD9FC001601D8 -7FF09438007FE001C0183F49181F90C7170FA2007E1907A3007C1903A500FC1AF0481901 -A5C894C7FCB3B3A749B812FCA54C4A7CC955>III89 D97 -DI<91380FFF8091B512F8 -010314FF010F15804948C613C0D97FF8EB1FE0D9FFE0EB3FF04849137F4849EBFFF84890 -C7FCA2485A121FA24848EC7FF0EE3FE0EE1FC0007F92C7FC5BA212FFAC127FA27FA2123F -A26C6C153EA26C6C157E177C6C6D14FC6C6D14F86C6D13036C6DEB07F0D97FFCEB1FE06D -B4EBFFC0010F90B5120001035C010014F0020F13802F347CB237>IIIIII<13FCEA03FF487F487FA2487FA66C5BA26C5B6C90C7FCEA00FC90C8 -FCABEB7FC0B5FCA512037EB3B3A2B61280A5194D7BCC22>I108 D<90287FC001FFC0EC7FF0B5010F01FC0103B5FC033F -6D010F804B6D4980DBFE079026803F817F9126C1F801903AC07E007FF00003D9C3E0DAE0 -F8806C9026C78000D9F1E06D7E02CFC7EBF3C002DEEDF780DD7FFF6E7E02FC93C7FC4A5D -A24A5DA34A5DB3AAB6D8C03FB5D8F00FB512FCA55E327BB167>I<903A7FC001FFC0B501 -0F13F8033F7F4B13FFDBFE077F9138C1F00300039026C3E0017F6CD9C78080ECCF0014DE -02DC6D7F14FC5CA25CA35CB3AAB6D8C07FEBFFE0A53B327BB144>I<913807FF80027F13 -F80103B6FC010F15C090261FFE017F903A7FF0003FF849486D7E480180EB07FE4890C76C -7E4817804980000F17C048486E13E0A2003F17F0A249157F007F17F8A400FF17FCAB007F -17F8A46C6CEDFFF0A2001F17E0A26C6C4A13C0A26C6C4A13806C6D4913006C5E6C01E0EB -1FFC6D6C495A903A3FFE01FFF0010FB612C0010392C7FCD9007F13F80207138036347DB2 -3D>I<90397FC007FFB5017F13E002C1B512FC02C714FF9126CFF80F7F9126DFC0037F00 -0301FFC77F6C496E7E02F8814A6E7E717E4A81831980A28319C0A37113E0AC19C05FA319 
-805F19005F606E143F6E5D4D5A6E4A5A02FF495BDBC0075B9126EFF01F5B02E7B548C7FC -02E114F8DAE07F13E0DB0FFEC8FC92CAFCAFB612C0A53B477CB144>I<9039FF803FE0B5 -EBFFF8028113FE02837FDA87E11380EC8F830003D99F0713C06C139E14BCA214F8A24A6C -13806F13006F5A4A90C7FCA45CB3A8B612E0A52A327CB132>114 -D<903907FF8070017FEBF1F048B6FC1207380FFC01391FE0003F4848130F491307127F90 -C71203A2481401A27FA27F01F090C7FC13FCEBFFC06C13FEECFFE06C14FC6C806CECFF80 -6C15C06C15E06C15F06C7E011F14F8010114FCEB000FEC007FED1FFE0078140F00F81407 -15037E1501A27E16FC7E15036D14F86D13076D14F001F8EB1FE001FFEBFFC04890B51280 -486C1400D8F81F13FCD8E00313C027347CB230>I<14F8A51301A41303A21307A2130FA2 -131F133F137F13FF1203000F90B512F0B7FCA426007FF8C7FCB3A7167CAA013F14F880A2 -90391FFE01F0010F1303903907FF87E06DEBFFC06D14806D6C1300EC0FFC26467EC430> -IIII<007FB500C090387FFFE0A5C601F0C73803F8006E5D017F5E6E140701 -3F5E80170F011F5E6E141F6D93C7FC6F5B6D153E6F137E6D157C6F13FCA26D6D5B16016D -5DEDF803027F5CEDFC07023F5CEDFE0F021F5C15FF161F6E91C8FC16BF6E13BE16FE6E5B -A26E5BA36E5BA26F5AA26F5AA26F5AA393C9FC5D153E157E157CD81F8013FC486C5B387F -E001D8FFF05B14035D14074A5A49485A007F133F4948CAFC383F81FE381FFFF86C5B6C13 -C0C648CBFC3B477EB041>121 D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fg cmtt12 17.28 6 -/Fg 6 123 df<913803FFC0023F13FC49B67E010715F04981013F15FE498190B812C048 -8348D9FC0180489026E0001F7F480180130391C87F48486F7E49153F4848ED0FFF834848 -178083496F13C012FF8319E07FA2187FA36C5A6C5A6C5ACBFCA218FFA219C05FA219805F -A24D13005F604D5A173F4D5A4D5AA24C5B4C5B4C5B041F90C7FC4C5A4C5A4C5A4B5B4B5B -4B5B031F5B4B90C8FC4B5AEDFFF84A5B4A5B4A5B021F5B4A90C9FCEC7FFC4A5A495B495B -010F5B495B4948CAFC4948ED1F804948ED3FC04849ED7FE0485B000F5B4890C9FC4890B8 -FC5ABAFCA56C18C06C18803B5A79D94A>50 D<383FFFF0487F80B5FCA37EA27EEA000FB0 -EE0FFC93B57E030714E0031F14F84B14FE92B7FC02FD8291B87E85DCE01F7FEE000703FC -01017F4B6D7F03E0143F4B6E7E4B140F8592C87E4A6F1380A34A6F13C0A284A21AE0A219 -7FAA19FFA21AC0A26E5DA24E138080606F1600606F4A5A6F143F6F4A5A6F4A5A6F130303 
-FF010F5BDCC03F5B93B65A6102FD93C7FC02FC5D6F5C031F14F0902607F80714C0902603 -F00191C8FC90C8EA3FF043597FD74A>98 D105 D<003FB512FE4880B77EA57E7EC71201B3B3B3 -B0003FB812FC4817FEBAFCA56C17FE6C17FC385877D74A>108 D -112 D<000FB912E04818F04818F8A619F001F0C8000313E04D13C04D13804D13004D5A4D -5A4D5A6C484A5B6C484A5BC9000F5B4C5B4C90C7FC4C5A4C5A4B5B4B5B4B5B4B5B4B5B4B -90C8FC4B5A4B5A4A5B4A5B4A5B4A5B4A5B4A90C9FC4A5A4A5A495B495B495B4949EC07E0 -4949EC0FF04948C8EA1FF8495A495A485B485B485B485B4890C9FC485A48B9FCBAFCA66C -18F06C18E03D3E7BBD4A>122 D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fh cmbx12 17.28 28 -/Fh 28 120 df<16F04B7E1507151F153FEC01FF1407147F010FB5FCB7FCA41487EBF007 -C7FCB3B3B3B3007FB91280A6395E74DD51>49 D<913801FFF8021FEBFFC091B612F80103 -15FF010F16C0013F8290267FFC0114F89027FFE0003F7F4890C7000F7F48486E7FD807F8 -6E148048486E14C048486E14E048486F13F001FC17F8486C816D17FC6E80B56C16FE8380 -A219FFA283A36C5BA26C5B6C90C8FCD807FC5DEA01F0CA14FEA34D13FCA219F85F19F04D -13E0A294B512C019804C14004C5B604C5B4C5B604C13804C90C7FC4C5A4C5A4B13F05F4B -13804B90C8FC4B5AED1FF84B5A4B5A4B48143F4A5B4A48C8FC4A5A4A48157E4A5A4A5AEC -7F8092C9FC02FE16FE495A495A4948ED01FCD90FC0150749B8FC5B5B90B9FC5A4818F85A -5A5A5A5ABAFCA219F0A4405E78DD51>I<92B5FC020F14F8023F14FF49B712C04916F001 -0FD9C01F13FC90271FFC00077FD93FE001017F49486D8049C86C7F484883486C6F7F14C0 -486D826E806E82487FA4805CA36C5E4A5E6C5B6C5B6C495E011FC85A90C95CA294B55A61 -4C91C7FC604C5B4C5B4C5B4C5B047F138092260FFFFEC8FC020FB512F817E094C9FC17F8 -17FF91C7003F13E0040713F8040113FE707F717F7113E085717FA2717F85A285831A80A3 -1AC0EA03FCEA0FFF487F487F487FA2B57EA31A80A34D14005C7E4A5E5F6C495E49C8485B -D81FF85F000F5ED807FE92B55A6C6C6C4914806C01F0010791C7FC6C9026FF803F5B6D90 -B65A011F16F0010716C001014BC8FCD9001F14F0020149C9FC426079DD51>II<4DB5ED03C0057F02F0 -14070407B600FE140F047FDBFFC0131F4BB800F0133F030F05FC137F033F9127F8007FFE -13FF92B6C73807FF814A02F0020113C3020702C09138007FE74A91C9001FB5FC023F01FC 
-16074A01F08291B54882490280824991CB7E49498449498449498449865D49498490B5FC -484A84A2484A84A24891CD127FA25A4A1A3F5AA348491A1FA44899C7FCA25CA3B5FCB07E -A380A27EA2F50FC0A26C7FA37E6E1A1F6C1D80A26C801D3F6C6E1A00A26C6E616D1BFE6D -7F6F4E5A7F6D6D4E5A6D6D4E5A6D6D4E5A6D6E171F6D02E04D5A6E6DEFFF806E01FC4C90 -C7FC020F01FFEE07FE6E02C0ED1FF8020102F8ED7FF06E02FF913803FFE0033F02F8013F -1380030F91B648C8FC030117F86F6C16E004071680DC007F02F8C9FC050191CAFC626677 -E375>67 D72 DI77 -D80 D<001FBEFCA64849C79126E0000F148002E0180091 -C8171F498601F81A0349864986A2491B7FA2491B3F007F1DC090C9181FA4007E1C0FA600 -FE1DE0481C07A5CA95C7FCB3B3B3A3021FBAFCA663617AE070>84 -D<913803FFFE027FEBFFF00103B612FE010F6F7E4916E090273FFE001F7FD97FE001077F -D9FFF801017F486D6D7F717E486D6E7F85717FA2717FA36C496E7FA26C5B6D5AEB1FC090 -C9FCA74BB6FC157F0207B7FC147F49B61207010F14C0013FEBFE004913F048B512C04891 -C7FC485B4813F85A5C485B5A5CA2B55AA45FA25F806C5E806C047D7F6EEB01F96C6DD903 -F1EBFF806C01FED90FE114FF6C9027FFC07FC01580000191B5487E6C6C4B7E011F02FC13 -0F010302F001011400D9001F90CBFC49437CC14E>97 D<903807FF80B6FCA6C6FC7F7FB3 -A8EFFFF8040FEBFF80047F14F00381B612FC038715FF038F010014C0DBBFF0011F7FDBFF -C001077F93C76C7F4B02007F03F8824B6F7E4B6F13804B17C0851BE0A27313F0A21BF8A3 -7313FCA41BFEAE1BFCA44F13F8A31BF0A24F13E0A24F13C06F17804F1300816F4B5A6F4A -5B4AB402075B4A6C6C495B9126F83FE0013F13C09127F00FFC03B55A4A6CB648C7FCDAC0 -0115F84A6C15E091C7001F91C8FC90C8000313E04F657BE35A>I<92380FFFF04AB67E02 -0F15F0023F15FC91B77E01039039FE001FFF4901F8010113804901E0010713C049018049 -13E0017F90C7FC49484A13F0A2485B485B5A5C5A7113E0485B7113C048701380943800FE -0095C7FC485BA4B5FCAE7EA280A27EA2806C18FCA26C6D150119F87E6C6D15036EED07F0 -6C18E06C6D150F6D6DEC1FC06D01E0EC7F806D6DECFF00010701FCEB03FE6D9039FFC03F -FC010091B512F0023F5D020F1580020102FCC7FCDA000F13C03E437BC148>II<92380FFFC0 -4AB512FC020FECFF80023F15E091B712F80103D9FE037F499039F0007FFF011F01C0011F -7F49496D7F4990C76C7F49486E7F48498048844A804884485B727E5A5C48717EA35A5C72 
-1380A2B5FCA391B9FCA41A0002C0CBFCA67EA380A27EA27E6E160FF11F806C183F6C7FF1 -7F006C7F6C6D16FE6C17016D6C4B5A6D6D4A5A6D01E04A5A6D6DEC3FE0010301FC49B45A -6D9026FFC01F90C7FC6D6C90B55A021F15F8020715E0020092C8FC030713F041437CC14A ->III<903807FF80B6FCA6C6FC7F7FB3A8EF1FFF94B512F0040714 -FC041F14FF4C8193267FE07F7F922781FE001F7FDB83F86D7FDB87F07FDB8FC0814C7F03 -9FC78015BE03BC8003FC825DA25DA25DA45DB3B2B7D8F007B71280A651647BE35A>II<903807FF80B6 -FCA6C6FC7F7FB3B3B3B3ADB712E0A623647BE32C>108 D<902607FF80D91FFFEEFFF8B6 -91B500F00207EBFF80040702FC023F14E0041F02FF91B612F84C6F488193267FE07F6D48 -01037F922781FE001F9027E00FF0007FC6DA83F86D9026F01FC06D7F6DD987F06D4A487F -6DD98FC0DBF87EC7804C6D027C80039FC76E488203BEEEFDF003BC6E4A8003FC04FF834B -5FA24B5FA24B94C8FCA44B5EB3B2B7D8F007B7D8803FB612FCA67E417BC087>I<902607 -FF80EB1FFFB691B512F0040714FC041F14FF4C8193267FE07F7F922781FE001F7FC6DA83 -F86D7F6DD987F07F6DD98FC0814C7F039FC78015BE03BC8003FC825DA25DA25DA45DB3B2 -B7D8F007B71280A651417BC05A>I<923807FFE092B6FC020715E0021F15F8027F15FE49 -4848C66C6C7E010701F0010F13E04901C001037F49496D7F4990C87F49486F7E49486F7E -48496F13804819C04A814819E048496F13F0A24819F8A348496F13FCA34819FEA4B518FF -AD6C19FEA46C6D4B13FCA36C19F8A26C6D4B13F0A26C19E06C6D4B13C0A26C6D4B13806C -6D4B13006D6C4B5A6D6D495B6D6D495B010701F0010F13E06D01FE017F5B010090B7C7FC -023F15FC020715E0020092C8FC030713E048437CC151>I114 D<913A3FFF8007800107B5EAF81F011FECFE7F017F91B5FC48B8FC48EBE0 -014890C7121FD80FFC1407D81FF0801600485A007F167F49153FA212FF171FA27F7F7F6D -92C7FC13FF14E014FF6C14F8EDFFC06C15FC16FF6C16C06C16F06C826C826C826C82013F -1680010F16C01303D9007F15E0020315F0EC001F1500041F13F81607007C150100FC8117 -7F6C163FA2171F7EA26D16F0A27F173F6D16E06D157F6D16C001FEEDFF806D0203130002 -C0EB0FFE02FCEB7FFC01DFB65A010F5DD8FE0315C026F8007F49C7FC48010F13E035437B -C140>II<90 -2607FFC0ED3FFEB60207B5FCA6C6EE00076D826D82B3B3A260A360A2607F60183E6D6D14 -7E4E7F6D6D4948806D6DD907F0ECFF806D01FFEB3FE06D91B55A6E1500021F5C020314F8 -DA003F018002F0C7FC51427BC05A>I119 D 
-E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fi cmsy10 10.95 1 -/Fi 1 16 df15 -D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fj cmtt10 10.95 89 -/Fj 89 127 df<121C127FEAFF80B3EA7F00B2123EC7FCA8121C127FA2EAFF80A3EA7F00 -A2121C09396DB830>33 D<00101304007C131F00FEEB3F80A26C137FA248133FB2007E14 -00007C7F003C131E00101304191C75B830>I<903907C007C0A2496C487EA8011F131FA2 -02C05BA3007FB7FCA2B81280A36C16006C5D3A007F807F80A2020090C7FCA9495BA2003F -90B512FE4881B81280A36C1600A22701FC01FCC7FCA300031303A201F85BA76C486C5AA2 -29387DB730>I38 DI<141E147F14FF5BEB03FEEB07FCEB0FF0EB1FE0EB3FC0EB7F80EBFF00 -485A5B12035B485A120F5BA2485AA2123F5BA2127F90C7FCA412FEAD127FA47F123FA27F -121FA26C7EA27F12076C7E7F12017F6C7EEB7F80EB3FC0EB1FE0EB0FF0EB07FCEB03FEEB -01FF7F147F141E184771BE30>I<127812FE7E7F6C7E6C7EEA0FF06C7E6C7E6C7E6C7EEB -7F80133F14C0131FEB0FE014F01307A2EB03F8A214FC1301A214FE1300A4147FAD14FEA4 -130114FCA2130314F8A2EB07F0A2130F14E0EB1FC0133F1480137FEBFF00485A485A485A -485AEA3FE0485A485A90C7FC5A1278184778BE30>I<14E0497E497EA60038EC0380007E -EC0FC0D8FF83EB3FE001C3137F9038F3F9FF267FFBFB13C06CB61280000FECFE00000314 -F86C5C6C6C13C0011F90C7FC017F13C048B512F04880000F14FE003FECFF80267FFBFB13 -C026FFF3F913E09038C3F87F0183133FD87E03EB0FC00038EC0380000091C7FCA66D5A6D -5A23277AAE30>I<143EA2147FAF007FB7FCA2B81280A36C1600A2C76CC8FCAF143EA229 -297DAF30>II<007FB612F0 -A2B712F8A36C15F0A225077B9E30>I<120FEA3FC0EA7FE0A2EAFFF0A4EA7FE0A2EA3FC0 -EA0F000C0C6E8B30>I<16F01501ED03F8A21507A2ED0FF0A2ED1FE0A2ED3FC0A2ED7F80 -A2EDFF00A24A5AA25D1403A24A5AA24A5AA24A5AA24A5AA24A5AA24AC7FCA2495AA25C13 -03A2495AA2495AA2495AA2495AA2495AA249C8FCA2485AA25B1203A2485AA2485AA2485A -A2485AA2485AA248C9FCA25AA2127CA225477BBE30>I<14FE903807FFC0497F013F13F8 -497F90B57E48EB83FF4848C6138049137F4848EB3FC04848EB1FE049130F001F15F04913 -07A24848EB03F8A290C712014815FCA400FEEC00FEAD6C14016C15FCA36D1303003F15F8 -A26D1307001F15F0A26D130F6C6CEB1FE0A26C6CEB3FC06C6CEB7F806D13FF2601FF8313 
-006CEBFFFE6D5B6D5B010F13E06D5BD900FEC7FC273A7CB830>IIIII<000FB612804815C05AA316800180C8FCAEEB83FF019F13C0 -90B512F015FC8181D9FE0313809039F0007FC049133F0180EB1FE06CC7120F000E15F0C8 -1207A216F81503A31218127EA2B4FC150716F048140F6C15E06C141F6DEB3FC06D137F3A -3FE001FF80261FFC0F13006CB55A6C5C6C5C6C14E06C6C1380D90FFCC7FC25397BB730> -II<127CB712FC16FEA416FC48C7EA0FF816F0ED1FE0007CEC3FC0C8EA7F80EDFF00 -A24A5A4A5A5D14075D140F5D4A5AA24A5AA24AC7FCA25C5C13015CA213035CA213075CA4 -495AA6131F5CA96D5A6DC8FC273A7CB830>I<49B4FC011F13F0017F13FC90B57E0003EC -FF804815C048010113E03A1FF8003FF049131FD83FC0EB07F8A24848EB03FC90C71201A5 -6D1303003F15F86D13076C6CEB0FF06C6CEB1FE0D807FCEB7FC03A03FF83FF806C90B512 -006C6C13FC011F13F0497F90B512FE48802607FE0013C0D80FF8EB3FE0D81FE0EB0FF048 -48EB07F8491303007F15FC90C712014815FE481400A66C14016C15FC6D1303003F15F86D -1307D81FF0EB1FF06D133F3A0FFF01FFE06C90B512C06C1580C6ECFE006D5B011F13F001 -0190C7FC273A7CB830>I<49B4FC010F13E0013F13F890B57E4880488048010113803A0F -FC007FC0D81FF0EB3FE04848131F49EB0FF048481307A290C7EA03F85A4815FC1501A416 -FEA37E7E6D130315076C7E6C6C130F6D133FD80FFC13FF6CB6FC7E6C14FE6C14F9013FEB -E1FC010F138190380060011400ED03F8A2150716F0150F000F15E0486C131F486CEB3FC0 -157FEDFF804A1300EC07FE391FF01FFC90B55A6C5C6C5C6C1480C649C7FCEB3FF0273A7C -B830>I<120FEA3FC0EA7FE0A2EAFFF0A4EA7FE0A2EA3FC0EA0F00C7FCAF120FEA3FC0EA -7FE0A2EAFFF0A4EA7FE0A2EA3FC0EA0F000C276EA630>II<16F01503ED07F8151F157FEDFFF014034A13C0021F138091383FFE00ECFFF8495B01 -0713C0495BD93FFEC7FC495A3801FFF0485B000F13804890C8FCEA7FFC5BEAFFE05B7FEA -7FF87FEA1FFF6C7F000313E06C7F38007FFC6D7E90380FFF806D7F010113F06D7FEC3FFE -91381FFF80020713C06E13F01400ED7FF8151F1507ED03F01500252F7BB230>I<007FB7 -FCA2B81280A36C16006C5DCBFCA7003FB612FE4881B81280A36C1600A229157DA530>I< -1278127EB4FC13C07FEA7FF813FEEA1FFF6C13C000037F6C13F86C6C7EEB1FFF6D7F0103 -13E06D7F9038007FFC6E7E91380FFF806E13C0020113F080ED3FF8151F153FEDFFF05C02 -0713C04A138091383FFE004A5A903801FFF0495B010F13804990C7FCEB7FFC48485A4813 
-E0000F5B4890C8FCEA7FFE13F8EAFFE05B90C9FC127E1278252F7BB230>I64 D<147F4A7EA2497FA449 -7F14F7A401077F14E3A3010F7FA314C1A2011F7FA490383F80FEA590387F007FA4498049 -133F90B6FCA34881A39038FC001F00038149130FA4000781491307A2D87FFFEB7FFFB56C -B51280A46C496C130029397DB830>I<007FB512F0B612FE6F7E82826C813A03F8001FF8 -15076F7E1501A26F7EA615015EA24B5A1507ED1FF0ED7FE090B65A5E4BC7FC6F7E16E082 -9039F8000FF8ED03FC6F7E1500167FA3EE3F80A6167F1700A25E4B5A1503ED1FFC007FB6 -FCB75A5E16C05E6C02FCC7FC29387EB730>I<91387F803C903903FFF03E49EBFC7E011F -13FE49EBFFFE5B9038FFE07F48EB801F3903FE000F484813075B48481303A2484813015B -123F491300A2127F90C8FC167C16005A5AAC7E7EA2167C6D14FE123FA27F121F6D13016C -6C14FCA26C6CEB03F86D13076C6CEB0FF03901FF801F6C9038E07FE06DB512C06D14806D -1400010713FC6D13F09038007FC0273A7CB830>I<003FB512E04814FCB67E6F7E6C816C -813A03F8007FF0ED1FF8150F6F7E6F7E15016F7EA2EE7F80A2163F17C0161FA4EE0FE0AC -161F17C0A3163F1780A2167F17005E4B5A15034B5A150F4B5AED7FF0003FB65A485DB75A -93C7FC6C14FC6C14E02B387FB730>I<007FB7FCB81280A47ED803F8C7123FA8EE1F0093 -C7FCA4157C15FEA490B5FCA6EBF800A4157C92C8FCA5EE07C0EE0FE0A9007FB7FCB8FCA4 -6C16C02B387EB730>I<003FB712804816C0B8FCA27E7ED801FCC7121FA8EE0F8093C7FC -A5153E157FA490B6FCA69038FC007FA4153E92C8FCAE383FFFF8487FB5FCA27E6C5B2A38 -7EB730>I<02FF13F00103EBC0F8010F13F1013F13FD4913FF90B6FC4813C1EC007F4848 -133F4848131F49130F485A491307121F5B123F491303A2127F90C7FC6F5A92C8FC5A5AA8 -92B5FC4A14805CA26C7F6C6D1400ED03F8A27F003F1407A27F121F6D130F120F7F6C6C13 -1FA2D803FE133F6C6C137FECC1FF6C90B5FC7F6D13FB010F13F30103EBC1F0010090C8FC -293A7DB830>I<3B3FFF800FFFE0486D4813F0B56C4813F8A26C496C13F06C496C13E0D8 -03F8C7EAFE00B290B6FCA601F8C7FCB3A23B3FFF800FFFE0486D4813F0B56C4813F8A26C -496C13F06C496C13E02D387FB730>I<007FB6FCB71280A46C1500260007F0C7FCB3B3A8 -007FB6FCB71280A46C1500213879B730>I<49B512F04914F85BA27F6D14F090C7EAFE00 -B3B3123C127EB4FCA24A5A1403EB8007397FF01FF86CB55A5D6C5C00075C000149C7FC38 -003FF025397AB730>II<383FFFF8487FB57EA26C5B6C5BD801FCC9FCB3B0EE0F 
-80EE1FC0A9003FB7FC5AB8FCA27E6C16802A387EB730>III<90383FFFE048B512FC00 -0714FF4815804815C04815E0EBF80001E0133FD87F80EB0FF0A290C71207A44815F84814 -03B3A96C1407A26C15F0A36D130FA26D131F6C6CEB3FE001F813FF90B6FC6C15C06C1580 -6C1500000114FCD8003F13E0253A7BB830>I<007FB512F0B612FE6F7E16E0826C813903 -F8003FED0FFCED03FE15016F7EA2821780163FA6167F17005EA24B5A1503ED0FFCED3FF8 -90B6FC5E5E16804BC7FC15F001F8C9FCB0387FFFC0B57EA46C5B29387EB730>I<90383F -FFE048B512FC000714FF4815804815C04815E0EBF80001E0133F4848EB1FF049130F90C7 -1207A44815F8481403B3A8147E14FE6CEBFF076C15F0EC7F87A2EC3FC7018013CF9038C0 -1FFFD83FE014E0EBF80F90B6FC6C15C06C15806C1500000114FCD8003F7FEB00016E7EA2 -1680157F16C0153F16E0151F16F0150FED07E025467BB830>I<003FB57E4814F0B612FC -15FF6C816C812603F8017F9138003FF0151F6F7E15071503821501A515035E1507150F4B -5A153F4AB45A90B65A5E93C7FC5D8182D9F8007FED3FE0151F150F821507A817F8EEF1FC -A53A3FFF8003FB4801C0EBFFF8B56C7E17F06C496C13E06C49EB7FC0C9EA1F002E397FB7 -30>I<90390FF803C0D97FFF13E048B512C74814F74814FF5A381FF80F383FE001497E48 -48137F90C7123F5A48141FA2150FA37EED07C06C91C7FC7F7FEA3FF0EA1FFEEBFFF06C13 -FF6C14E0000114F86C80011F13FF01031480D9003F13C014019138007FE0151FED0FF0A2 -ED07F8A2007C140312FEA56C140716F07F6DEB0FE06D131F01F8EB3FC001FF13FF91B512 -80160000FD5CD8FC7F13F8D8F81F5BD878011380253A7BB830>I<003FB712C04816E0B8 -FCA43AFE003F800FA8007CED07C0C791C7FCB3B1011FB5FC4980A46D91C7FC2B387EB730 ->I<3B7FFFC007FFFCB56C4813FEA46C496C13FCD803F8C7EA3F80B3B16D147F00011600 -A36C6C14FE6D13016D5CEC800390393FE00FF890391FF83FF06DB55A6D5C6D5C6D91C7FC -9038007FFCEC1FF02F3980B730>III<3A3FFF01FFF84801837F02C77FA202835B6C01015B3A01FC007F806D91C7 -FC00005C6D5BEB7F01EC81FCEB3F8314C3011F5B14E7010F5B14FF6D5BA26D5BA26D5BA2 -6D90C8FCA4497FA2497FA2815B81EB0FE781EB1FC381EB3F8181EB7F0081497F49800001 -143F49800003141F49800007140FD87FFEEB7FFFB590B5128080A25C6C486D130029387D -B730>II<001FB612FC4815FE5AA490C7EA03FCED07F816F0150FED1FE016C0153F 
-ED7F80003E1500C85A4A5A5D14034A5A5D140F4A5A5D143F4A5A92C7FC5C495A5C130349 -5A5C130F495A5C133F495A91C8FC5B4848147C4914FE1203485A5B120F485A5B123F485A -90B6FCB7FCA46C15FC27387CB730>I<007FB5FCB61280A4150048C8FCB3B3B3A5B6FC15 -80A46C140019476DBE30>I<007FB5FCB61280A47EC7123FB3B3B3A5007FB5FCB6FCA46C -140019477DBE30>93 D<1307EB1FC0EB7FF0497E000313FE000FEBFF80003F14E0D87FFD -13F039FFF07FF8EBC01FEB800F38FE0003007CEB01F00010EB00401D0E77B730>I<007F -B612F0A2B712F8A36C15F0A225077B7D30>I97 -DII<913801FFE04A7F5C -A28080EC0007AAEB03FE90381FFF874913E790B6FC5A5A481303380FFC00D81FF0133F49 -131F485A150F4848130790C7FCA25AA25AA87E6C140FA27F003F141F6D133F6C7E6D137F -390FF801FF2607FE07EBFFC06CB712E06C16F06C14F76D01C713E0011F010313C0D907FC -C8FC2C397DB730>I<49B4FC010713E0011F13F8017F7F90B57E488048018113803A07FC -007FC04848133FD81FE0EB1FE0150F484814F0491307127F90C7FCED03F85A5AB7FCA516 -F048C9FC7E7EA27F003FEC01F06DEB03F86C7E6C7E6D1307D807FEEB1FF03A03FFC07FE0 -6C90B5FC6C15C0013F14806DEBFE00010713F8010013C0252A7CA830>IIII< -14E0EB03F8A2497EA36D5AA2EB00E091C8FCA9381FFFF8487F5AA27E7EEA0001B3A9003F -B612C04815E0B7FCA27E6C15C023397AB830>III<387FFFF8B57EA47EEA0001B3B3A8007FB612F0B712F8A46C15F025387BB7 -30>I<02FC137E3B7FC3FF01FF80D8FFEF01877F90B500CF7F15DF92B57E6C010F138726 -07FE07EB03F801FC13FE9039F803FC01A201F013F8A301E013F0B3A23C7FFE0FFF07FF80 -B548018F13C0A46C486C01071380322881A730>II< -49B4FC010F13E0013F13F8497F90B57E0003ECFF8014013A07FC007FC04848EB3FE0D81F -E0EB0FF0A24848EB07F8491303007F15FC90C71201A300FEEC00FEA86C14016C15FCA26D -1303003F15F86D13076D130F6C6CEB1FF06C6CEB3FE06D137F3A07FF01FFC06C90B51280 -6C15006C6C13FC6D5B010F13E0010190C7FC272A7CA830>II<49B413F8010FEBC1FC013F13F14913FD48B6FC5A4813 -81390FFC007F49131F4848130F491307485A491303127F90C7FC15015A5AA77E7E15037F -A26C6C1307150F6C6C131F6C6C133F01FC137F3907FF01FF6C90B5FC6C14FD6C14F9013F -13F1010F13C1903803FE0190C7FCAD92B512F84A14FCA46E14F82E3C7DA730>II<90381FFC1E48B5129F000714FF5A5A5A387FF007EB800100FEC7FC4880A46C143E 
-007F91C7FC13E06CB4FC6C13FC6CEBFF806C14E0000114F86C6C7F01037F9038000FFF02 -001380007C147F00FEEC1FC0A2150F7EA27F151F6DEB3F806D137F9039FC03FF0090B6FC -5D5D00FC14F0D8F83F13C026780FFEC7FC222A79A830>III<3B3F -FFC07FFF80486DB512C0B515E0A26C16C06C496C13803B01F80003F000A26D130700005D -A26D130F017E5CA2017F131F6D5CA2EC803F011F91C7FCA26E5A010F137EA2ECE0FE0107 -5BA214F101035BA3903801FBF0A314FF6D5BA36E5A6E5A2B277EA630>I<3B3FFFC01FFF -E0486D4813F0B515F8A26C16F06C496C13E0D807E0C7EA3F00A26D5C0003157EA56D14FE -00015DEC0F80EC1FC0EC3FE0A33A00FC7FF1F8A2147DA2ECFDF9017C5C14F8A3017E13FB -A290393FF07FE0A3ECE03FA2011F5C90390F800F802D277FA630>I<3A3FFF81FFFC4801 -C37FB580A26C5D6C01815BC648C66CC7FC137FEC80FE90383F81FC90381FC3F8EB0FE3EC -E7F06DB45A6D5B7F6D5B92C8FC147E147F5C497F81903803F7E0EB07E790380FE3F0ECC1 -F890381F81FC90383F80FE90387F007E017E137F01FE6D7E48486D7E267FFF80B5FCB500 -C1148014E3A214C16C0180140029277DA630>I<3B3FFFC07FFF80486DB512C0B515E0A2 -6C16C06C496C13803B01FC0003F000A2000014076D5C137E150F017F5C7F151FD91F805B -A214C0010F49C7FCA214E00107137EA2EB03F0157C15FCEB01F85DA2EB00F9ECFDF0147D -147FA26E5AA36E5AA35DA2143F92C8FCA25C147EA2000F13FE486C5AEA3FC1EBC3F81387 -EB8FF0EBFFE06C5B5C6C90C9FC6C5AEA01F02B3C7EA630>I<001FB612FC4815FE5AA316 -FC90C7EA0FF8ED1FF0ED3FE0ED7FC0EDFF80003E491300C7485A4A5A4A5A4A5A4A5A4A5A -4A5A4990C7FC495A495A495A495A495A495A4948133E4890C7127F485A485A485A485A48 -5A48B7FCB8FCA46C15FE28277DA630>II< -127CA212FEB3B3B3AD127CA207476CBE30>II<017C13 -3848B4137C48EB80FE4813C14813C348EBEFFC397FEFFFF0D8FF8713E0010713C0486C13 -80D87C0113003838007C1F0C78B730>I E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fk cmbx12 14.4 49 -/Fk 49 122 df12 D45 DI<157815FC14031407141F14FF130F00 -07B5FCB6FCA2147F13F0EAF800C7FCB3B3B3A6007FB712FEA52F4E76CD43>49 -DI<91380FFFC091B512FC0107ECFF80011F15E09026 -3FF8077F9026FF800113FC4848C76C7ED803F86E7E491680D807FC8048B416C080486D15 -E0A4805CA36C17C06C5B6C90C75AD801FC1680C9FC4C13005FA24C5A4B5B4B5B4B13C04B 
-5BDBFFFEC7FC91B512F816E016FCEEFF80DA000713E0030113F89238007FFE707E701380 -7013C018E07013F0A218F8A27013FCA218FEA2EA03E0EA0FF8487E487E487EB57EA318FC -A25E18F891C7FC6C17F0495C6C4816E001F04A13C06C484A1380D80FF84A13006CB44A5A -6CD9F0075BC690B612F06D5D011F1580010302FCC7FCD9001F1380374F7ACD43>I<177C -17FEA2160116031607160FA2161F163F167FA216FF5D5DA25D5DED1FBFED3F3F153E157C -15FCEC01F815F0EC03E01407EC0FC01580EC1F005C147E147C5C1301495A495A5C495A13 -1F49C7FC133E5B13FC485A5B485A1207485A485A90C8FC123E127E5ABA12C0A5C96C48C7 -FCAF020FB712C0A53A4F7CCE43>III<121F7F7FEB -FF8091B81280A45A1900606060A2606060485F0180C86CC7FC007EC95A4C5A007C4B5A5F -4C5A160F4C5A484B5A4C5A94C8FC16FEC812014B5A5E4B5A150F4B5AA24B5AA24B5A15FF -A24A90C9FCA25C5D1407A2140FA25D141FA2143FA4147F5DA314FFA55BAC6D5BA2EC3FC0 -6E5A395279D043>I<913807FFC0027F13FC0103B67E010F15E090261FFC0113F8903A3F -E0003FFCD97F80EB0FFE49C76C7E48488048486E1380000717C04980120F18E0177FA212 -1F7FA27F7F6E14FF02E015C014F802FE4913806C7FDBC00313009238F007FE6C02F85B92 -38FE1FF86C9138FFBFF06CEDFFE017806C4BC7FC6D806D81010F15E06D81010115FC0107 -81011F81491680EBFFE748018115C048D9007F14E04848011F14F048487F484813030300 -14F8484880161F4848020713FC1601824848157F173FA2171FA2170FA218F8A27F007F17 -F06D151FA26C6CED3FE0001F17C06D157F6C6CEDFF806C6C6C010313006C01E0EB0FFE6C -01FCEBFFFC6C6CB612F06D5D010F1580010102FCC7FCD9000F13C0364F7ACD43>I<9138 -0FFF8091B512F8010314FE010F6E7E4901037F90267FF8007F4948EB3FF048496D7E4849 -80486F7E484980824817805A91C714C05A7013E0A218F0B5FCA318F8A618FCA46C5DA37E -A25E6C7F6C5DA26C5D6C7F6C6D137B6C6D13F390387FF803011FB512E36D14C301030283 -13F89039007FFE03EC00401500A218F05EA3D801F816E0487E486C16C0487E486D491380 -A218005E5F4C5A91C7FC6C484A5A494A5A49495B6C48495BD803FC010F5B9027FF807FFE -C7FC6C90B55A6C6C14F06D14C0010F49C8FC010013F0364F7ACD43>I<91B5FC010F14F8 -017F14FF90B712C00003D9C00F7F2707FC00017FD80FE06D7F48486E7E48C87FD87FE06E -7E7F7F486C1680A66C5A18006C485C6C5AC9485A5F4B5B4B5B4B5B4B5B4B90C7FC16FC4B 
-5A4B5A16C04B5A93C8FC4A5A5D14035D5D14075DA25D140FA25DAB91CAFCAAEC1FC04A7E -ECFFF8497FA2497FA76D5BA26D5BEC3FE06E5A315479D340>63 D68 DI -I72 D -I<027FB71280A591C76C90C7FCB3B3B3EA07F0EA1FFC487E487EA2B57EA44C5AA34A485B -7E49495BD83FF8495BD81FE05DD80FFC011F5B2707FF807F90C8FC000190B512FC6C6C14 -F0011F14C0010101F8C9FC39537DD145>I76 DI80 -D82 D<91260FFF80130791B5 -00F85B010702FF5B011FEDC03F49EDF07F9026FFFC006D5A4801E0EB0FFD4801800101B5 -FC4848C87E48488149150F001F824981123F4981007F82A28412FF84A27FA26D82A27F7F -6D93C7FC14C06C13F014FF15F86CECFF8016FC6CEDFFC017F06C16FC6C16FF6C17C06C83 -6C836D826D82010F821303010082021F16801400030F15C0ED007F040714E01600173F05 -0F13F08383A200788200F882A3187FA27EA219E07EA26CEFFFC0A27F6D4B13806D17006D -5D01FC4B5A01FF4B5A02C04A5A02F8EC7FF0903B1FFFC003FFE0486C90B65AD8FC0393C7 -FC48C66C14FC48010F14F048D9007F90C8FC3C5479D24B>I<003FBC1280A59126C0003F -9038C0007F49C71607D87FF8060113C001E08449197F49193F90C8171FA2007E1A0FA300 -7C1A07A500FC1BE0481A03A6C994C7FCB3B3AC91B912F0A553517BD05E>II87 -D97 -DI<913801FFF8021FEBFF8091B612F0010315FC010F9038C00FFE903A1FFE0001 -FFD97FFC491380D9FFF05B4817C048495B5C5A485BA2486F138091C7FC486F1300705A48 -92C8FC5BA312FFAD127F7FA27EA2EF03E06C7F17076C6D15C07E6E140F6CEE1F806C6DEC -3F006C6D147ED97FFE5C6D6CEB03F8010F9038E01FF0010390B55A01001580023F49C7FC -020113E033387CB63C>I<4DB47E0407B5FCA5EE001F1707B3A4913801FFE0021F13FC91 -B6FC010315C7010F9038E03FE74990380007F7D97FFC0101B5FC49487F4849143F484980 -485B83485B5A91C8FC5AA3485AA412FFAC127FA36C7EA37EA26C7F5F6C6D5C7E6C6D5C6C -6D49B5FC6D6C4914E0D93FFED90FEFEBFF80903A0FFFC07FCF6D90B5128F0101ECFE0FD9 -003F13F8020301C049C7FC41547CD24B>I<913803FFC0023F13FC49B6FC010715C04901 -817F903A3FFC007FF849486D7E49486D7E4849130F48496D7E48178048497F18C0488191 -C7FC4817E0A248815B18F0A212FFA490B8FCA318E049CAFCA6127FA27F7EA218E06CEE01 -F06E14037E6C6DEC07E0A26C6DEC0FC06C6D141F6C6DEC3F806D6CECFF00D91FFEEB03FE -903A0FFFC03FF8010390B55A010015C0021F49C7FC020113F034387CB63D>IIII<137F497E 
-000313E0487FA2487FA76C5BA26C5BC613806DC7FC90C8FCADEB3FF0B5FCA512017EB3B3 -A6B612E0A51B547BD325>I -107 DIII<913801FFE0021F13FE91B612C0010315F0010F9038 -807FFC903A1FFC000FFED97FF86D6C7E49486D7F48496D7F48496D7F4A147F48834890C8 -6C7EA24883A248486F7EA3007F1880A400FF18C0AC007F1880A3003F18006D5DA26C5FA2 -6C5F6E147F6C5F6C6D4A5A6C6D495B6C6D495B6D6C495BD93FFE011F90C7FC903A0FFF80 -7FFC6D90B55A010015C0023F91C8FC020113E03A387CB643>I<903A3FF001FFE0B5010F -13FE033FEBFFC092B612F002F301017F913AF7F8007FFE0003D9FFE0EB1FFFC602806D7F -92C76C7F4A824A6E7F4A6E7FA2717FA285187F85A4721380AC1A0060A36118FFA2615F61 -6E4A5BA26E4A5B6E4A5B6F495B6F4990C7FC03F0EBFFFC9126FBFE075B02F8B612E06F14 -80031F01FCC8FC030313C092CBFCB1B612F8A5414D7BB54B>I<90397FE003FEB590380F -FF80033F13E04B13F09238FE1FF89139E1F83FFC0003D9E3E013FEC6ECC07FECE78014EF -150014EE02FEEB3FFC5CEE1FF8EE0FF04A90C7FCA55CB3AAB612FCA52F367CB537>114 -D<903903FFF00F013FEBFE1F90B7FC120348EB003FD80FF81307D81FE0130148487F4980 -127F90C87EA24881A27FA27F01F091C7FC13FCEBFFC06C13FF15F86C14FF16C06C15F06C -816C816C81C681013F1580010F15C01300020714E0EC003F030713F015010078EC007F00 -F8153F161F7E160FA27E17E07E6D141F17C07F6DEC3F8001F8EC7F0001FEEB01FE9039FF -C00FFC6DB55AD8FC1F14E0D8F807148048C601F8C7FC2C387CB635>I<143EA6147EA414 -FEA21301A313031307A2130F131F133F13FF5A000F90B6FCB8FCA426003FFEC8FCB3A9EE -07C0AB011FEC0F8080A26DEC1F0015806DEBC03E6DEBF0FC6DEBFFF86D6C5B021F5B0203 -13802A4D7ECB34>IIII121 D E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fl cmr10 10.95 86 -/Fl 86 124 df<4AB4EB0FE0021F9038E03FFC913A7F00F8FC1ED901FC90383FF03FD907 -F090397FE07F80494801FF13FF4948485BD93F805C137F0200ED7F00EF003E01FE6D91C7 -FC82ADB97EA3C648C76CC8FCB3AE486C4A7E007FD9FC3FEBFF80A339407FBF35>11 -D<4AB4FC021F13C091387F01F0903901FC0078D907F0131C4948133E494813FF49485A13 -7F1400A213FE6F5A163893C7FCAA167FB8FCA33900FE00018182B3AC486CECFF80007FD9 -FC3F13FEA32F407FBF33>I<4AB47E021F13F791387F00FFEB01F8903807F001EB0FE0EB 
-1FC0EB3F80137F14008101FE80AEB8FCA3C648C77EB3AE486CECFF80007FD9FC3F13FEA3 -2F407FBF33>I<4AB4ECFF80021FD9C00F13E0913B7F01F03F80F8903C01F80078FE003C -D907F0D93FF8130E49484948131F49484948EB7F804948484913FF137F02005CA201FE92 -C7FC6FED7F0070141C96C7FCAAF13F80BBFCA3C648C76CC7FC197F193FB3AC486C4A6CEB -7FC0007FD9FC3FD9FE1FB5FCA348407FBF4C>I<121EEA7F80EAFFC0A9EA7F80ACEA3F00 -AC121EAB120CC7FCA8121EEA7F80A2EAFFC0A4EA7F80A2EA1E000A4179C019>33 -D<001E130F397F803FC000FF137F01C013E0A201E013F0A3007F133F391E600F30000013 -00A401E01370491360A3000114E04913C00003130101001380481303000EEB070048130E -0018130C0038131C003013181C1C7DBE2D>I<013F4C7ED9FFC04B7E2601E0E015072607 -C070150F48486C4B5A023E4BC7FC48486C5D48D90FC0EB01FE003ED90EF0EB07FCDA0F3F -133E007E903A070FFFF8F8007C0200EBC1F0EE000300FC6D6C495A604D5A171F95C8FC17 -3E177E177C5F16015F007C4948485A1607007E5E003E49495A020E131F003F93C9FC6C49 -133E260F803C137E0238137C6C6C485B3901E0E0016CB448485AD93F0049133F90C74848 -EBFFC0030F903801E0E093398007C0704B4848487E4B153C033E90381F001C4B497F03FC -133E4B150F4A48017E7F0203147C5D4A4801FCEB0380140F5D4AC7FC5C143E5C14FC5C49 -5A13034948027CEB07005C4948147E011F033E5B91C8140E013E153F017E6F5B017C9238 -0F803C4917380001706C5A49923801E0E0496FB45A6C48043FC7FC41497BC34C>37 -DI<121EEA7F8012FF13C0A213E0A3127FEA1E601200A413E013C0A3120113801203 -13005A120E5A1218123812300B1C79BE19>I<1430147014E0EB01C0EB03801307EB0F00 -131E133E133C5B13F85B12015B1203A2485AA2120F5BA2121F90C7FCA25AA3123E127EA6 -127C12FCB2127C127EA6123E123FA37EA27F120FA27F1207A26C7EA212017F12007F1378 -7F133E131E7FEB07801303EB01C0EB00E014701430145A77C323>I<12C07E12707E7E12 -1E7E6C7E7F12036C7E7F12007F1378137CA27FA2133F7FA21480130FA214C0A3130714E0 -A6130314F0B214E01307A614C0130FA31480A2131F1400A25B133EA25BA2137813F85B12 -015B485A12075B48C7FC121E121C5A5A5A5A145A7BC323>II<121EEA7F8012FF13C0A213 -E0A3127FEA1E601200A413E013C0A312011380120313005A120E5A1218123812300B1C79 -8919>44 DI<121EEA7F80A2EAFFC0A4EA7F80A2EA1E000A0A79 
-8919>IIIIII<150E151E153EA2157EA215FE1401A21403EC -077E1406140E141CA214381470A214E0EB01C0A2EB0380EB0700A2130E5BA25B5BA25B5B -1201485A90C7FC5A120E120C121C5AA25A5AB8FCA3C8EAFE00AC4A7E49B6FCA3283E7EBD -2D>I<00061403D80780131F01F813FE90B5FC5D5D5D15C092C7FC14FCEB3FE090C9FCAC -EB01FE90380FFF8090383E03E090387001F8496C7E49137E497F90C713800006141FC813 -C0A216E0150FA316F0A3120C127F7F12FFA416E090C7121F12FC007015C012780038EC3F -80123C6CEC7F00001F14FE6C6C485A6C6C485A3903F80FE0C6B55A013F90C7FCEB07F824 -3F7CBC2D>II<1238123C123F90B6 -12FCA316F85A16F016E00078C712010070EC03C0ED078016005D48141E151C153C5DC812 -7015F04A5A5D14034A5A92C7FC5C141EA25CA2147C147814F8A213015C1303A31307A313 -0F5CA2131FA6133FAA6D5A0107C8FC26407BBD2D>III<12 -1EEA7F80A2EAFFC0A4EA7F80A2EA1E00C7FCB3121EEA7F80A2EAFFC0A4EA7F80A2EA1E00 -0A2779A619>I<121EEA7F80A2EAFFC0A4EA7F80A2EA1E00C7FCB3121E127FEAFF80A213 -C0A4127F121E1200A412011380A3120313005A1206120E120C121C5A1230A20A3979A619 ->I<007FB912E0BA12F0A26C18E0CDFCAE007FB912E0BA12F0A26C18E03C167BA147>61 -D63 D<15074B7EA34B7EA34B7EA34B7EA34B7E15E7A2913801C7FC15C3A291380381 -FEA34AC67EA3020E6D7EA34A6D7EA34A6D7EA34A6D7EA34A6D7EA349486D7E91B6FCA249 -819138800001A249C87EA24982010E157FA2011E82011C153FA2013C820138151FA20178 -82170F13FC00034C7ED80FFF4B7EB500F0010FB512F8A33D417DC044>65 -DII -IIII< -B6D8C01FB512F8A3000101E0C7383FFC0026007F80EC0FF0B3A691B7FCA30280C7120FB3 -A92601FFE0EC3FFCB6D8C01FB512F8A33D3E7DBD44>II<011FB512FCA3D9000713006E5A1401B3B3A6123FEA -7F80EAFFC0A44A5A1380D87F005B007C130700385C003C495A6C495A6C495A2603E07EC7 -FC3800FFF8EB3FC026407CBD2F>IIIIIII -III<003FB91280A3903AF0 -007FE001018090393FC0003F48C7ED1FC0007E1707127C00781703A300701701A548EF00 -E0A5C81600B3B14B7E4B7E0107B612FEA33B3D7DBC42>IIII<007F -B5D8C003B512E0A3C649C7EBFC00D93FF8EC3FE06D48EC1F806D6C92C7FC171E6D6C141C -6D6C143C5F6D6C14706D6D13F04C5ADA7FC05B023F13036F485ADA1FF090C8FC020F5BED -F81E913807FC1C163C6E6C5A913801FF7016F06E5B6F5AA26F7E6F7EA28282153FED3BFE 
-ED71FF15F103E07F913801C07F0203804B6C7EEC07004A6D7E020E6D7E5C023C6D7E0238 -6D7E14784A6D7E4A6D7F130149486E7E4A6E7E130749C86C7E496F7E497ED9FFC04A7E00 -076DEC7FFFB500FC0103B512FEA33F3E7EBD44>II<003FB712F8A391C7EA1FF013F801E0EC3FE00180EC7FC090C8FC003EED -FF80A2003C4A1300007C4A5A12784B5A4B5AA200704A5AA24B5A4B5AA2C8485A4A90C7FC -A24A5A4A5AA24A5AA24A5A4A5AA24A5A4A5AA24990C8FCA2495A4948141CA2495A495AA2 -495A495A173C495AA24890C8FC485A1778485A484815F8A24848140116034848140F4848 -143FED01FFB8FCA32E3E7BBD38>I -I<486C13C00003130101001380481303000EEB070048130E0018130C0038131C00301318 -0070133800601330A300E01370481360A400CFEB678039FFC07FE001E013F0A3007F133F -A2003F131F01C013E0390F0007801C1C73BE2D>II97 -DI<49B4FC010F13E090383F00F8017C131E4848131F -4848137F0007ECFF80485A5B121FA24848EB7F00151C007F91C7FCA290C9FC5AAB6C7EA3 -003FEC01C07F001F140316806C6C13076C6C14000003140E6C6C131E6C6C137890383F01 -F090380FFFC0D901FEC7FC222A7DA828>II -II<167C903903F801 -FF903A1FFF078F8090397E0FDE1F9038F803F83803F001A23B07E000FC0600000F6EC7FC -49137E001F147FA8000F147E6D13FE00075C6C6C485AA23901F803E03903FE0FC026071F -FFC8FCEB03F80006CAFC120EA3120FA27F7F6CB512E015FE6C6E7E6C15E06C810003813A -0FC0001FFC48C7EA01FE003E140048157E825A82A46C5D007C153E007E157E6C5D6C6C49 -5A6C6C495AD803F0EB0FC0D800FE017FC7FC90383FFFFC010313C0293D7EA82D>III<1478EB01FEA2EB03FFA4EB01FEA2EB00781400AC147FEB7FFFA313 -017F147FB3B3A5123E127F38FF807E14FEA214FCEB81F8EA7F01387C03F0381E07C0380F -FF803801FC00185185BD1C>II -I<2701F801FE14FF00FF902707FFC00313E0913B1E07E00F03F0913B7803F03C01F80007 -903BE001F87000FC2603F9C06D487F000101805C01FBD900FF147F91C75B13FF4992C7FC -A2495CB3A6486C496CECFF80B5D8F87FD9FC3F13FEA347287DA74C>I<3901F801FE00FF -903807FFC091381E07E091387803F000079038E001F82603F9C07F0001138001FB6D7E91 -C7FC13FF5BA25BB3A6486C497EB5D8F87F13FCA32E287DA733>I<14FF010713E090381F -81F890387E007E01F8131F4848EB0F804848EB07C04848EB03E0000F15F04848EB01F8A2 -003F15FCA248C812FEA44815FFA96C15FEA36C6CEB01FCA3001F15F86C6CEB03F0A26C6C 
-EB07E06C6CEB0FC06C6CEB1F80D8007EEB7E0090383F81FC90380FFFF0010090C7FC282A -7EA82D>I<3901FC03FC00FF90381FFF8091387C0FE09039FDE003F03A03FFC001FC6C49 -6C7E91C7127F49EC3F805BEE1FC017E0A2EE0FF0A3EE07F8AAEE0FF0A4EE1FE0A2EE3FC0 -6D1580EE7F007F6E13FE9138C001F89039FDE007F09039FC780FC0DA3FFFC7FCEC07F891 -C9FCAD487EB512F8A32D3A7EA733>I<02FF131C0107EBC03C90381F80F090397F00387C -01FC131CD803F8130E4848EB0FFC150748481303121F485A1501485AA448C7FCAA6C7EA3 -6C7EA2001F14036C7E15076C6C130F6C7E6C6C133DD8007E137990383F81F190380FFFC1 -903801FE0190C7FCAD4B7E92B512F8A32D3A7DA730>I<3901F807E000FFEB1FF8EC787C -ECE1FE3807F9C100031381EA01FB1401EC00FC01FF1330491300A35BB3A5487EB512FEA3 -1F287EA724>I<90383FC0603901FFF8E03807C03F381F000F003E1307003C1303127C00 -78130112F81400A27E7E7E6D1300EA7FF8EBFFC06C13F86C13FE6C7F6C1480000114C0D8 -003F13E0010313F0EB001FEC0FF800E01303A214017E1400A27E15F07E14016C14E06CEB -03C0903880078039F3E01F0038E0FFFC38C01FE01D2A7DA824>I<131CA6133CA4137CA2 -13FCA2120112031207001FB512C0B6FCA2D801FCC7FCB3A215E0A912009038FE01C0A2EB -7F03013F138090381F8700EB07FEEB01F81B397EB723>IIIIII<001FB61280A2EBE0000180140049485A001E495A121C4A5A003C49 -5A141F00385C4A5A147F5D4AC7FCC6485AA2495A495A130F5C495A90393FC00380A2EB7F -80EBFF005A5B484813071207491400485A48485BA248485B4848137F00FF495A90B6FCA2 -21277EA628>II E -%EndDVIPSBitmapFont -%DVIPSBitmapFont: Fm cmbx12 20.736 9 -/Fm 9 123 df<92380FFFE04AB67E020F15F0027F15FE49B87E4917E0010F17F8013F83 -49D9C01F14FF9027FFFC0001814801E06D6C80480180021F804890C86C8048486F804848 -6F8001FF6F804801C06E8002F081486D18806E816E18C0B5821BE06E81A37214F0A56C5B -A36C5B6C5B6C5B000313C0C690C9FC90CA15E060A34E14C0A21B80601B0060626295B55A -5F624D5C624D5C4D91C7FC614D5B4D13F04D5B6194B55A4C49C8FC4C5B4C5B4C13E04C5B -604C90C9FCEE7FFC4C5A4B5B4B5B4B0180EC0FF04B90C8FC4B5A4B5A4B48ED1FE0EDFFE0 -4A5B4A5B4A90C9FC4A48163F4A5ADA3FF017C05D4A48167F4A5A4990CA12FFD903FC1607 -49BAFC5B4919805B5B90BBFC5A5A5A5A481A005A5ABCFCA462A44C7176F061>50 
-D<92383FFFF80207B612E0027F15FC49B87E010717E0011F83499026F0007F13FC4948C7 -000F7F90B502036D7E486E6D806F6D80727F486E6E7F8486727FA28684A26C5C72806C5C -6D90C8FC6D5AEB0FF8EB03E090CAFCA70507B6FC041FB7FC0303B8FC157F0203B9FC021F -ECFE0391B612800103ECF800010F14C04991C7FC017F13FC90B512F04814C0485C4891C8 -FC485B5A485B5C5A5CA2B5FC5CA360A36E5DA26C5F6E5D187E6C6D846E4A48806C6D4A48 -14FC6C6ED90FF0ECFFFC6C02E090263FE07F14FE00019139FC03FFC06C91B6487E013F4B -487E010F4B1307010303F01301D9003F0280D9003F13FC020101F8CBFC57507ACE5E>97 -D<903801FFFCB6FCA8C67E131F7FB3ADF0FFFC050FEBFFE0057F14FE0403B77E040F16E0 -043F16F84CD9007F13FE9226FDFFF001077F92B500C001018094C86C13E004FC6F7F4C6F -7F04E06F7F4C6F7F5E747F93C915804B7014C0A27414E0A21DF087A21DF8A31DFC87A41D -FEAF1DFCA4631DF8A31DF098B5FC1DE0A25014C0A26F1980501400705D705F704B5B505B -704B5B04FC4B5BDBE7FE92B55A9226C3FF8001035C038101E0011F49C7FC9226807FFC90 -B55A4B6CB712F04A010F16C04A010393C8FC4A010015F84A023F14C090C9000301F0C9FC -5F797AF76C>I<97380FFFE00607B6FCA8F00003190086B3AD93383FFF800307B512F803 -3F14FF4AB712C0020716F0021F16FC027F9039FE007FFE91B500F0EB0FFF010302800101 -90B5FC4949C87E49498149498149498149498190B548814884484A8192CAFC5AA2485BA2 -5A5C5AA35A5CA4B5FCAF7EA4807EA37EA2807EA26C7F616C6E5D6C606C80616D6D5D6D6D -5D6D6D92B67E6D6D4A15FC010301FF0207EDFFFE6D02C0EB3FFE6D6C9039FC01FFF86E90 -B65A020F16C002031600DA007F14FC030F14E09226007FFEC749C7FC5F797AF76C>100 -D105 D<903801FFFCB6FCA8C67E131F7FB3B3B3B3B3ABB812C0A82A7879F7 -35>108 D<902601FFF891380FFFE0B692B512FE05036E7E050F15E0053F15F84D819327 -01FFF01F7F4CD900077FDC07FC6D80C66CDA0FF06D80011FDA1FC07F6D4A48824CC8FC04 -7E6F7F5EEDF9F85E03FB707F5E15FF5EA25EA293C9FCA45DB3B3A6B8D8E003B81280A861 -4E79CD6C>110 D<902601FFFCEC7FFEB6020FB512F0057F14FE4CB712C0040716F0041F -82047F16FE93B5C66C7F92B500F0010F14C0C66C0380010380011F4AC76C806D4A6E8004 -F06F7F4C6F7F4C6F7F4C8193C915804B7014C0861DE0A27414F0A27414F8A47513FCA575 -13FEAF5113FCA598B512F8A31DF0621DE0621DC0621D806F5E701800704B5B505B704B5B 
-7092B55A04FC4A5C704A5C706C010F5C05E0013F49C7FC9227FE7FFC01B55A70B712F004 -0F16C0040393C8FC040015F8053F14C0050301F0C9FC94CCFCB3A6B812E0A85F6F7ACD6C ->112 D<0007BA12FC1AFEA503E0C714FC4AC74814F84801F04A14F05C02804A14E091C8 -4814C04D14805B494B14004D5B4992B55AA24C5C494A5C615E4C5C001F4B5C5B4C91C7FC -4C5B93B55AA24B5CC8485C4B5CA24B5C4B5C4B91C8FCA24B5B92B55AA24A5C4A5C4A4A14 -FFA24A5C4A5C4A91C8FC614A4915FE91B55A495CA2495C494A14035E5B495C4991C81207 -A24949ED0FFC90B55A484A151FA2484A153F484A157F484A15FF1803484A140F4891C812 -3F48490207B5FC91B9FCBB12F8A57E484D7BCC56>122 D E -%EndDVIPSBitmapFont -end -%%EndProlog -%%BeginSetup -%%Feature: *Resolution 600dpi -TeXDict begin -%%PaperSize: A4 - -%%EndSetup -%%Page: 1 1 -1 0 bop 150 1318 a Fm(bzip2)64 b(and)g(libbzip2)p 150 -1418 3600 34 v 2010 1515 a Fl(a)31 b(program)f(and)g(library)e(for)i -(data)h(compression)2198 1623 y(cop)m(yrigh)m(t)f(\(C\))h(1996-2000)j -(Julian)28 b(Sew)m(ard)2605 1731 y(v)m(ersion)i(1.0)h(of)g(21)g(Marc)m -(h)g(2000)150 5091 y Fk(Julian)46 b(Sew)l(ard)p 150 5141 -3600 17 v eop -%%Page: 1 2 -1 1 bop 3705 -116 a Fl(1)150 299 y(This)24 b(program,)j -Fj(bzip2)p Fl(,)e(and)g(asso)s(ciated)i(library)c Fj(libbzip2)p -Fl(,)i(are)h(Cop)m(yrigh)m(t)g(\(C\))g(1996-2000)j(Julian)150 -408 y(R)h(Sew)m(ard.)40 b(All)29 b(righ)m(ts)h(reserv)m(ed.)150 -565 y(Redistribution)f(and)i(use)h(in)f(source)h(and)g(binary)e(forms,) -j(with)e(or)h(without)f(mo)s(di\014cation,)g(are)i(p)s(er-)150 -675 y(mitted)d(pro)m(vided)f(that)i(the)f(follo)m(wing)f(conditions)g -(are)i(met:)225 832 y Fi(\017)60 b Fl(Redistributions)26 -b(of)k(source)g(co)s(de)g(m)m(ust)g(retain)f(the)h(ab)s(o)m(v)m(e)h -(cop)m(yrigh)m(t)g(notice,)f(this)f(list)f(of)i(con-)330 -941 y(ditions)e(and)i(the)h(follo)m(wing)e(disclaimer.)225 -1076 y Fi(\017)60 b Fl(The)33 b(origin)f(of)h(this)f(soft)m(w)m(are)j -(m)m(ust)e(not)h(b)s(e)e(misrepresen)m(ted;)i(y)m(ou)g(m)m(ust)f(not)g -(claim)g(that)h(y)m(ou)330 1185 y(wrote)d(the)h(original)d(soft)m(w)m -(are.)44 b(If)31 
b(y)m(ou)g(use)g(this)f(soft)m(w)m(are)i(in)e(a)h(pro) -s(duct,)g(an)f(ac)m(kno)m(wledgmen)m(t)330 1295 y(in)f(the)i(pro)s -(duct)e(do)s(cumen)m(tation)h(w)m(ould)f(b)s(e)h(appreciated)g(but)g -(is)f(not)i(required.)225 1429 y Fi(\017)60 b Fl(Altered)21 -b(source)g(v)m(ersions)f(m)m(ust)h(b)s(e)f(plainly)e(mark)m(ed)j(as)g -(suc)m(h,)i(and)d(m)m(ust)h(not)g(b)s(e)f(misrepresen)m(ted)330 -1539 y(as)31 b(b)s(eing)e(the)h(original)f(soft)m(w)m(are.)225 -1674 y Fi(\017)60 b Fl(The)27 b(name)h(of)f(the)h(author)f(ma)m(y)h -(not)g(b)s(e)f(used)g(to)h(endorse)f(or)h(promote)g(pro)s(ducts)e -(deriv)m(ed)g(from)330 1783 y(this)j(soft)m(w)m(are)j(without)d(sp)s -(eci\014c)h(prior)e(written)i(p)s(ermission.)150 1965 -y(THIS)37 b(SOFTW)-10 b(ARE)38 b(IS)f(PR)m(O)m(VIDED)i(BY)g(THE)f(A)m -(UTHOR)g(\\AS)g(IS")g(AND)h(ANY)f(EXPRESS)150 2074 y(OR)31 -b(IMPLIED)h(W)-10 b(ARRANTIES,)31 b(INCLUDING,)i(BUT)f(NOT)f(LIMITED)g -(TO,)h(THE)f(IMPLIED)150 2184 y(W)-10 b(ARRANTIES)27 -b(OF)h(MER)m(CHANT)-8 b(ABILITY)28 b(AND)g(FITNESS)f(F)m(OR)g(A)h(P)-8 -b(AR)g(TICULAR)28 b(PUR-)150 2294 y(POSE)37 b(ARE)g(DISCLAIMED.)h(IN)f -(NO)h(EVENT)f(SHALL)g(THE)g(A)m(UTHOR)h(BE)g(LIABLE)g(F)m(OR)150 -2403 y(ANY)56 b(DIRECT,)f(INDIRECT,)h(INCIDENT)-8 b(AL,)56 -b(SPECIAL,)e(EXEMPLAR)-8 b(Y,)57 b(OR)e(CONSE-)150 2513 -y(QUENTIAL)48 b(D)m(AMA)m(GES)i(\(INCLUDING,)g(BUT)f(NOT)f(LIMITED)g -(TO,)g(PR)m(OCUREMENT)150 2622 y(OF)35 b(SUBSTITUTE)e(GOODS)i(OR)f(SER) --10 b(VICES;)34 b(LOSS)f(OF)i(USE,)g(D)m(A)-8 b(T)g(A,)36 -b(OR)f(PR)m(OFITS;)f(OR)150 2732 y(BUSINESS)28 b(INTERR)m(UPTION\))g -(HO)m(WEVER)i(CA)m(USED)f(AND)g(ON)g(ANY)g(THEOR)-8 b(Y)29 -b(OF)g(LIA-)150 2842 y(BILITY,)36 b(WHETHER)g(IN)g(CONTRA)m(CT,)g -(STRICT)e(LIABILITY,)i(OR)g(TOR)-8 b(T)35 b(\(INCLUDING)150 -2951 y(NEGLIGENCE)45 b(OR)g(OTHER)-10 b(WISE\))44 b(ARISING)h(IN)g(ANY) -h(W)-10 b(A)i(Y)46 b(OUT)e(OF)i(THE)e(USE)h(OF)150 3061 -y(THIS)29 b(SOFTW)-10 b(ARE,)31 b(EVEN)f(IF)g(AD)m(VISED)i(OF)e(THE)g 
-(POSSIBILITY)e(OF)j(SUCH)f(D)m(AMA)m(GE.)150 3218 y(Julian)e(Sew)m -(ard,)i(Cam)m(bridge,)g(UK.)150 3374 y Fj(jseward@acm.org)150 -3531 y(http://sourceware.cygnus)o(.com)o(/bzi)o(p2)150 -3688 y(http://www.cacheprof.org)150 3845 y(http://www.muraroa.demon)o -(.co.)o(uk)150 4002 y(bzip2)p Fl(/)p Fj(libbzip2)d Fl(v)m(ersion)j(1.0) -i(of)e(21)h(Marc)m(h)g(2000.)150 4159 y(P)-8 b(A)g(TENTS:)40 -b(T)-8 b(o)40 b(the)g(b)s(est)g(of)g(m)m(y)g(kno)m(wledge,)j -Fj(bzip2)38 b Fl(do)s(es)i(not)g(use)g(an)m(y)g(paten)m(ted)h -(algorithms.)150 4268 y(Ho)m(w)m(ev)m(er,)33 b(I)e(do)f(not)h(ha)m(v)m -(e)h(the)f(resources)g(a)m(v)-5 b(ailable)30 b(to)h(carry)g(out)g(a)g -(full)d(paten)m(t)k(searc)m(h.)42 b(Therefore)150 4378 -y(I)30 b(cannot)h(giv)m(e)g(an)m(y)g(guaran)m(tee)h(of)e(the)h(ab)s(o)m -(v)m(e)g(statemen)m(t.)p eop -%%Page: 2 3 -2 2 bop 150 -116 a Fl(Chapter)30 b(1:)41 b(In)m(tro)s(duction)2591 -b(2)150 299 y Fh(1)80 b(In)l(tro)t(duction)150 555 y -Fj(bzip2)20 b Fl(compresses)h(\014les)f(using)g(the)h(Burro)m -(ws-Wheeler)g(blo)s(c)m(k-sorting)f(text)j(compression)d(algorithm,)150 -665 y(and)33 b(Hu\013man)g(co)s(ding.)50 b(Compression)32 -b(is)h(generally)g(considerably)f(b)s(etter)i(than)f(that)h(ac)m(hiev)m -(ed)h(b)m(y)150 775 y(more)f(con)m(v)m(en)m(tional)g(LZ77/LZ78-based)g -(compressors,)g(and)f(approac)m(hes)h(the)f(p)s(erformance)g(of)h(the) -150 884 y(PPM)c(family)f(of)i(statistical)f(compressors.)150 -1041 y Fj(bzip2)k Fl(is)h(built)e(on)i(top)h(of)g Fj(libbzip2)p -Fl(,)e(a)i(\015exible)e(library)f(for)i(handling)e(compressed)i(data)i -(in)d(the)150 1151 y Fj(bzip2)c Fl(format.)43 b(This)30 -b(man)m(ual)g(describ)s(es)g(b)s(oth)g(ho)m(w)i(to)g(use)f(the)g -(program)g(and)g(ho)m(w)g(to)h(w)m(ork)f(with)150 1260 -y(the)d(library)d(in)m(terface.)41 b(Most)28 b(of)g(the)g(man)m(ual)f -(is)g(dev)m(oted)i(to)f(this)f(library)-8 b(,)26 b(not)i(the)g -(program,)g(whic)m(h)150 1370 y(is)h(go)s(o)s(d)i(news)e(if)h(y)m(our)g 
-(in)m(terest)h(is)e(only)g(in)h(the)g(program.)150 1527 -y(Chapter)24 b(2)g(describ)s(es)f(ho)m(w)h(to)h(use)f -Fj(bzip2)p Fl(;)h(this)e(is)g(the)i(only)e(part)h(y)m(ou)h(need)f(to)h -(read)f(if)f(y)m(ou)h(just)g(w)m(an)m(t)150 1636 y(to)35 -b(kno)m(w)f(ho)m(w)g(to)g(op)s(erate)h(the)f(program.)51 -b(Chapter)34 b(3)g(describ)s(es)e(the)i(programming)f(in)m(terfaces)h -(in)150 1746 y(detail,)23 b(and)d(Chapter)h(4)h(records)f(some)h -(miscellaneous)e(notes)i(whic)m(h)e(I)h(though)m(t)h(ough)m(t)g(to)g(b) -s(e)f(recorded)150 1855 y(somewhere.)p eop -%%Page: 3 4 -3 3 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 -b(to)g(use)f Fj(bzip2)2375 b Fl(3)150 299 y Fh(2)80 b(Ho)l(w)53 -b(to)g(use)g Fg(bzip2)150 566 y Fl(This)29 b(c)m(hapter)i(con)m(tains)f -(a)h(cop)m(y)g(of)g(the)f Fj(bzip2)f Fl(man)h(page,)h(and)f(nothing)g -(else.)390 818 y Ff(NAME)570 1004 y Fj(bzip2)p Fl(,)f -Fj(bunzip2)g Fl(-)h(a)h(blo)s(c)m(k-sorting)f(\014le)f(compressor,)i -(v1.0)570 1136 y Fj(bzcat)e Fl(-)i(decompresses)f(\014les)f(to)i -(stdout)570 1267 y Fj(bzip2recover)c Fl(-)k(reco)m(v)m(ers)h(data)f -(from)f(damaged)g(bzip2)g(\014les)390 1519 y Ff(SYNOPSIS)570 -1706 y Fj(bzip2)f Fl([)h(-cdfkqstvzVL123456789)35 b(])c([)g -(\014lenames)e(...)41 b(])570 1837 y Fj(bunzip2)28 b -Fl([)j(-fkvsVL)f(])h([)f(\014lenames)g(...)41 b(])570 -1968 y Fj(bzcat)29 b Fl([)h(-s)h(])g([)f(\014lenames)g(...)41 -b(])570 2100 y Fj(bzip2recover)27 b Fl(\014lename)390 -2352 y Ff(DESCRIPTION)390 2538 y Fj(bzip2)i Fl(compresses)i(\014les)f -(using)f(the)i(Burro)m(ws-Wheeler)g(blo)s(c)m(k)f(sorting)g(text)i -(compres-)390 2642 y(sion)40 b(algorithm,)j(and)d(Hu\013man)h(co)s -(ding.)71 b(Compression)40 b(is)g(generally)g(considerably)390 -2746 y(b)s(etter)25 b(than)g(that)h(ac)m(hiev)m(ed)g(b)m(y)f(more)g -(con)m(v)m(en)m(tional)h(LZ77/LZ78-based)g(compressors,)390 -2850 y(and)k(approac)m(hes)h(the)f(p)s(erformance)g(of)h(the)f(PPM)g -(family)f(of)i(statistical)f(compressors.)390 3001 y(The)e 
-(command-line)e(options)i(are)h(delib)s(erately)d(v)m(ery)i(similar)e -(to)j(those)g(of)f(GNU)h Fj(gzip)p Fl(,)390 3104 y(but)h(they)g(are)h -(not)g(iden)m(tical.)390 3255 y Fj(bzip2)f Fl(exp)s(ects)h(a)g(list)f -(of)h(\014le)f(names)h(to)h(accompan)m(y)h(the)e(command-line)e -(\015ags.)43 b(Eac)m(h)390 3359 y(\014le)e(is)h(replaced)g(b)m(y)g(a)h -(compressed)f(v)m(ersion)g(of)g(itself,)i(with)e(the)g(name)g -Fj(original_)390 3463 y(name.bz2)p Fl(.)49 b(Eac)m(h)34 -b(compressed)g(\014le)f(has)g(the)h(same)g(mo)s(di\014cation)e(date,)k -(p)s(ermissions,)390 3567 y(and,)24 b(when)f(p)s(ossible,)f(o)m -(wnership)f(as)j(the)f(corresp)s(onding)f(original,)h(so)g(that)h -(these)g(prop-)390 3671 y(erties)34 b(can)g(b)s(e)f(correctly)i -(restored)f(at)g(decompression)f(time.)51 b(File)34 b(name)g(handling)d -(is)390 3774 y(naiv)m(e)26 b(in)f(the)i(sense)f(that)h(there)f(is)f(no) -i(mec)m(hanism)e(for)h(preserving)f(original)f(\014le)i(names,)390 -3878 y(p)s(ermissions,)37 b(o)m(wnerships)f(or)h(dates)i(in)d -(\014lesystems)h(whic)m(h)g(lac)m(k)h(these)g(concepts,)j(or)390 -3982 y(ha)m(v)m(e)32 b(serious)d(\014le)g(name)i(length)f -(restrictions,)f(suc)m(h)h(as)h(MS-DOS.)390 4133 y Fj(bzip2)26 -b Fl(and)h Fj(bunzip2)e Fl(will)f(b)m(y)k(default)e(not)i(o)m(v)m -(erwrite)g(existing)e(\014les.)38 b(If)27 b(y)m(ou)h(w)m(an)m(t)g(this) -390 4237 y(to)j(happ)s(en,)e(sp)s(ecify)g(the)i Fj(-f)e -Fl(\015ag.)390 4388 y(If)34 b(no)h(\014le)f(names)g(are)i(sp)s -(eci\014ed,)e Fj(bzip2)f Fl(compresses)i(from)f(standard)g(input)f(to)j -(stan-)390 4491 y(dard)c(output.)49 b(In)32 b(this)g(case,)k -Fj(bzip2)31 b Fl(will)g(decline)h(to)i(write)e(compressed)h(output)g -(to)h(a)390 4595 y(terminal,)29 b(as)i(this)e(w)m(ould)g(b)s(e)h(en)m -(tirely)f(incomprehensible)e(and)j(therefore)h(p)s(oin)m(tless.)390 -4746 y Fj(bunzip2)36 b Fl(\(or)j Fj(bzip2)29 b(-d)p Fl(\))37 -b(decompresses)i(all)e(sp)s(eci\014ed)f(\014les.)63 b(Files)37 -b(whic)m(h)g(w)m(ere)i(not)390 4850 y(created)e(b)m(y)f 
-Fj(bzip2)f Fl(will)e(b)s(e)i(detected)j(and)d(ignored,)i(and)e(a)i(w)m -(arning)d(issued.)56 b Fj(bzip2)390 4954 y Fl(attempts)31 -b(to)f(guess)g(the)g(\014lename)f(for)h(the)g(decompressed)f(\014le)g -(from)h(that)g(of)g(the)g(com-)390 5058 y(pressed)f(\014le)h(as)h -(follo)m(ws:)570 5209 y Fj(filename.bz2)57 b Fl(b)s(ecomes)31 -b Fj(filename)570 5340 y(filename.bz)58 b Fl(b)s(ecomes)30 -b Fj(filename)p eop -%%Page: 4 5 -4 4 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 -b(to)g(use)f Fj(bzip2)2375 b Fl(4)570 299 y Fj(filename.tbz2)27 -b Fl(b)s(ecomes)j Fj(filename.tar)570 470 y(filename.tbz)57 -b Fl(b)s(ecomes)31 b Fj(filename.tar)570 641 y(anyothername)57 -b Fl(b)s(ecomes)31 b Fj(anyothername.out)390 859 y Fl(If)j(the)h -(\014le)e(do)s(es)i(not)f(end)g(in)f(one)i(of)g(the)g(recognised)f -(endings,)g Fj(.bz2)p Fl(,)h Fj(.bz)p Fl(,)g Fj(.tbz2)e -Fl(or)390 963 y Fj(.tbz)p Fl(,)h Fj(bzip2)f Fl(complains)f(that)j(it)e -(cannot)i(guess)f(the)g(name)h(of)f(the)g(original)e(\014le,)j(and)390 -1067 y(uses)30 b(the)g(original)f(name)h(with)g Fj(.out)f -Fl(app)s(ended.)390 1218 y(As)j(with)f(compression,)h(supplying)c(no)k -(\014lenames)f(causes)i(decompression)e(from)h(stan-)390 -1321 y(dard)d(input)g(to)i(standard)e(output.)390 1472 -y Fj(bunzip2)k Fl(will)g(correctly)j(decompress)e(a)i(\014le)e(whic)m -(h)g(is)h(the)g(concatenation)i(of)e(t)m(w)m(o)i(or)390 -1576 y(more)j(compressed)f(\014les.)67 b(The)39 b(result)g(is)g(the)g -(concatenation)i(of)f(the)g(corresp)s(onding)390 1680 -y(uncompressed)c(\014les.)59 b(In)m(tegrit)m(y)38 b(testing)f(\()p -Fj(-t)p Fl(\))g(of)g(concatenated)i(compressed)e(\014les)f(is)390 -1784 y(also)30 b(supp)s(orted.)390 1935 y(Y)-8 b(ou)40 -b(can)g(also)f(compress)g(or)g(decompress)g(\014les)g(to)h(the)f -(standard)g(output)g(b)m(y)g(giving)390 2039 y(the)30 -b Fj(-c)g Fl(\015ag.)40 b(Multiple)28 b(\014les)h(ma)m(y)i(b)s(e)e -(compressed)h(and)f(decompressed)h(lik)m(e)f(this.)39 -b(The)390 2142 y(resulting)31 
b(outputs)i(are)h(fed)f(sequen)m(tially)f -(to)i(stdout.)49 b(Compression)32 b(of)h(m)m(ultiple)e(\014les)390 -2246 y(in)24 b(this)g(manner)h(generates)h(a)g(stream)f(con)m(taining)g -(m)m(ultiple)e(compressed)i(\014le)f(represen-)390 2350 -y(tations.)58 b(Suc)m(h)36 b(a)g(stream)g(can)h(b)s(e)e(decompressed)h -(correctly)h(only)e(b)m(y)h Fj(bzip2)e Fl(v)m(ersion)390 -2454 y(0.9.0)g(or)e(later.)47 b(Earlier)30 b(v)m(ersions)i(of)g -Fj(bzip2)f Fl(will)f(stop)i(after)h(decompressing)e(the)i(\014rst)390 -2558 y(\014le)c(in)h(the)g(stream.)390 2709 y Fj(bzcat)f -Fl(\(or)i Fj(bzip2)e(-dc)p Fl(\))g(decompresses)i(all)e(sp)s(eci\014ed) -g(\014les)g(to)i(the)g(standard)e(output.)390 2860 y -Fj(bzip2)f Fl(will)g(read)i(argumen)m(ts)g(from)f(the)h(en)m(vironmen)m -(t)g(v)-5 b(ariables)28 b Fj(BZIP2)h Fl(and)g Fj(BZIP)p -Fl(,)g(in)390 2963 y(that)24 b(order,)g(and)f(will)e(pro)s(cess)i(them) -g(b)s(efore)g(an)m(y)h(argumen)m(ts)f(read)h(from)f(the)g(command)390 -3067 y(line.)39 b(This)29 b(giv)m(es)h(a)h(con)m(v)m(enien)m(t)h(w)m(a) -m(y)f(to)g(supply)d(default)i(argumen)m(ts.)390 3218 -y(Compression)h(is)h(alw)m(a)m(ys)i(p)s(erformed,)e(ev)m(en)h(if)f(the) -h(compressed)g(\014le)f(is)g(sligh)m(tly)f(larger)390 -3322 y(than)26 b(the)g(original.)38 b(Files)25 b(of)h(less)g(than)g(ab) -s(out)g(one)g(h)m(undred)e(b)m(ytes)j(tend)f(to)h(get)g(larger,)390 -3426 y(since)34 b(the)g(compression)f(mec)m(hanism)h(has)f(a)i(constan) -m(t)g(o)m(v)m(erhead)h(in)d(the)h(region)g(of)g(50)390 -3529 y(b)m(ytes.)54 b(Random)34 b(data)h(\(including)d(the)i(output)h -(of)f(most)h(\014le)f(compressors\))h(is)e(co)s(ded)390 -3633 y(at)e(ab)s(out)f(8.05)i(bits)d(p)s(er)h(b)m(yte,)h(giving)e(an)h -(expansion)g(of)g(around)g(0.5\045.)390 3784 y(As)h(a)g(self-c)m(hec)m -(k)h(for)e(y)m(our)h(protection,)g Fj(bzip2)f Fl(uses)g(32-bit)h(CR)m -(Cs)f(to)i(mak)m(e)f(sure)f(that)390 3888 y(the)45 b(decompressed)f(v)m -(ersion)g(of)g(a)h(\014le)e(is)h(iden)m(tical)f(to)i(the)g(original.)81 -b(This)43 
b(guards)390 3992 y(against)i(corruption)e(of)h(the)h -(compressed)f(data,)49 b(and)44 b(against)h(undetected)g(bugs)e(in)390 -4096 y Fj(bzip2)35 b Fl(\(hop)s(efully)e(v)m(ery)k(unlik)m(ely\).)56 -b(The)36 b(c)m(hances)h(of)f(data)h(corruption)e(going)h(unde-)390 -4199 y(tected)g(is)e(microscopic,)h(ab)s(out)f(one)h(c)m(hance)g(in)f -(four)g(billion)d(for)j(eac)m(h)i(\014le)d(pro)s(cessed.)390 -4303 y(Be)38 b(a)m(w)m(are,)k(though,)d(that)f(the)g(c)m(hec)m(k)i(o)s -(ccurs)d(up)s(on)f(decompression,)j(so)f(it)f(can)h(only)390 -4407 y(tell)28 b(y)m(ou)g(that)i(something)d(is)h(wrong.)40 -b(It)28 b(can't)i(help)d(y)m(ou)i(reco)m(v)m(er)h(the)e(original)f -(uncom-)390 4511 y(pressed)h(data.)41 b(Y)-8 b(ou)30 -b(can)f(use)g Fj(bzip2recover)d Fl(to)k(try)f(to)h(reco)m(v)m(er)h -(data)f(from)e(damaged)390 4614 y(\014les.)390 4766 y(Return)22 -b(v)-5 b(alues:)37 b(0)23 b(for)g(a)g(normal)f(exit,)j(1)e(for)g(en)m -(vironmen)m(tal)f(problems)f(\(\014le)i(not)g(found,)390 -4869 y(in)m(v)-5 b(alid)30 b(\015ags,)k(I/O)f(errors,)g(&c\),)h(2)f(to) -g(indicate)f(a)h(corrupt)f(compressed)h(\014le,)f(3)i(for)e(an)390 -4973 y(in)m(ternal)d(consistency)h(error)g(\(eg,)i(bug\))e(whic)m(h)f -(caused)i Fj(bzip2)e Fl(to)i(panic.)390 5304 y Ff(OPTIONS)p -eop -%%Page: 5 6 -5 5 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 -b(to)g(use)f Fj(bzip2)2375 b Fl(5)390 299 y Fj(-c)30 -b(--stdout)870 403 y Fl(Compress)f(or)i(decompress)f(to)h(standard)e -(output.)390 557 y Fj(-d)h(--decompress)870 661 y Fl(F)-8 -b(orce)44 b(decompression.)77 b Fj(bzip2)p Fl(,)44 b -Fj(bunzip2)d Fl(and)h Fj(bzcat)f Fl(are)i(really)f(the)870 -764 y(same)27 b(program,)h(and)e(the)i(decision)d(ab)s(out)i(what)g -(actions)g(to)h(tak)m(e)g(is)e(done)870 868 y(on)k(the)h(basis)e(of)i -(whic)m(h)e(name)h(is)g(used.)40 b(This)28 b(\015ag)j(o)m(v)m(errides)f -(that)h(mec)m(h-)870 972 y(anism,)e(and)h(forces)h(bzip2)e(to)i -(decompress.)390 1126 y Fj(-z)f(--compress)870 1230 y -Fl(The)39 b(complemen)m(t)h(to)g Fj(-d)p 
Fl(:)59 b(forces)40 -b(compression,)h(regardless)d(of)i(the)g(in-)870 1334 -y(v)m(ok)-5 b(ation)31 b(name.)390 1488 y Fj(-t)f(--test)8 -b Fl(Chec)m(k)33 b(in)m(tegrit)m(y)j(of)f(the)g(sp)s(eci\014ed)e -(\014le\(s\),)k(but)d(don't)h(decompress)g(them.)870 -1591 y(This)40 b(really)g(p)s(erforms)g(a)i(trial)e(decompression)h -(and)g(thro)m(ws)g(a)m(w)m(a)m(y)j(the)870 1695 y(result.)390 -1849 y Fj(-f)30 b(--force)870 1953 y Fl(F)-8 b(orce)31 -b(o)m(v)m(erwrite)f(of)g(output)f(\014les.)40 b(Normally)-8 -b(,)29 b Fj(bzip2)f Fl(will)f(not)j(o)m(v)m(erwrite)870 -2057 y(existing)e(output)g(\014les.)39 b(Also)28 b(forces)h -Fj(bzip2)e Fl(to)i(break)g(hard)e(links)f(to)k(\014les,)870 -2161 y(whic)m(h)f(it)h(otherwise)g(w)m(ouldn't)f(do.)390 -2315 y Fj(-k)h(--keep)8 b Fl(Keep)24 b(\(don't)i(delete\))h(input)d -(\014les)g(during)g(compression)h(or)h(decompression.)390 -2469 y Fj(-s)k(--small)870 2573 y Fl(Reduce)23 b(memory)f(usage,)j(for) -d(compression,)h(decompression)f(and)g(testing.)870 2676 -y(Files)f(are)i(decompressed)e(and)h(tested)h(using)e(a)h(mo)s -(di\014ed)e(algorithm)h(whic)m(h)870 2780 y(only)30 b(requires)g(2.5)j -(b)m(ytes)f(p)s(er)e(blo)s(c)m(k)h(b)m(yte.)44 b(This)30 -b(means)h(an)m(y)h(\014le)e(can)i(b)s(e)870 2884 y(decompressed)d(in)f -(2300k)j(of)e(memory)-8 b(,)30 b(alb)s(eit)e(at)i(ab)s(out)f(half)g -(the)g(normal)870 2988 y(sp)s(eed.)870 3117 y(During)42 -b(compression,)k Fj(-s)d Fl(selects)h(a)g(blo)s(c)m(k)g(size)f(of)h -(200k,)k(whic)m(h)42 b(lim-)870 3220 y(its)33 b(memory)g(use)g(to)h -(around)e(the)i(same)f(\014gure,)h(at)g(the)g(exp)s(ense)f(of)g(y)m -(our)870 3324 y(compression)g(ratio.)50 b(In)33 b(short,)i(if)d(y)m -(our)i(mac)m(hine)f(is)g(lo)m(w)g(on)h(memory)f(\(8)870 -3428 y(megab)m(ytes)42 b(or)e(less\),)j(use)d(-s)g(for)g(ev)m -(erything.)71 b(See)40 b(MEMOR)-8 b(Y)41 b(MAN-)870 3532 -y(A)m(GEMENT)31 b(b)s(elo)m(w.)390 3686 y Fj(-q)f(--quiet)870 -3790 y Fl(Suppress)j(non-essen)m(tial)j(w)m(arning)e(messages.)58 -b(Messages)38 
b(p)s(ertaining)33 b(to)870 3893 y(I/O)d(errors)g(and)g -(other)h(critical)e(ev)m(en)m(ts)j(will)27 b(not)k(b)s(e)f(suppressed.) -390 4047 y Fj(-v)g(--verbose)870 4151 y Fl(V)-8 b(erb)s(ose)28 -b(mo)s(de)f({)i(sho)m(w)e(the)h(compression)f(ratio)h(for)f(eac)m(h)i -(\014le)e(pro)s(cessed.)870 4255 y(F)-8 b(urther)30 b -Fj(-v)p Fl('s)g(increase)g(the)g(v)m(erb)s(osit)m(y)g(lev)m(el,)h(sp)s -(ewing)d(out)j(lots)f(of)g(infor-)870 4359 y(mation)g(whic)m(h)f(is)h -(primarily)d(of)j(in)m(terest)h(for)f(diagnostic)g(purp)s(oses.)390 -4513 y Fj(-L)g(--license)e(-V)h(--version)870 4617 y -Fl(Displa)m(y)h(the)g(soft)m(w)m(are)i(v)m(ersion,)e(license)f(terms)i -(and)e(conditions.)390 4771 y Fj(-1)h(to)g(-9)72 b Fl(Set)35 -b(the)g(blo)s(c)m(k)f(size)h(to)g(100)h(k,)g(200)g(k)f(..)53 -b(900)36 b(k)f(when)f(compressing.)53 b(Has)870 4875 -y(no)41 b(e\013ect)h(when)d(decompressing.)71 b(See)41 -b(MEMOR)-8 b(Y)41 b(MANA)m(GEMENT)870 4978 y(b)s(elo)m(w.)390 -5132 y Fj(--)324 b Fl(T)-8 b(reats)25 b(all)e(subsequen)m(t)g(argumen)m -(ts)i(as)f(\014le)g(names,)h(ev)m(en)g(if)e(they)i(start)f(with)870 -5236 y(a)32 b(dash.)43 b(This)29 b(is)h(so)i(y)m(ou)g(can)f(handle)f -(\014les)g(with)g(names)i(b)s(eginning)c(with)870 5340 -y(a)j(dash,)f(for)g(example:)40 b Fj(bzip2)29 b(--)h(-myfilename)p -Fl(.)p eop -%%Page: 6 7 -6 6 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 -b(to)g(use)f Fj(bzip2)2375 b Fl(6)390 299 y Fj(--repetitive-fast)390 -427 y(--repetitive-best)870 530 y Fl(These)34 b(\015ags)g(are)h -(redundan)m(t)e(in)g(v)m(ersions)g(0.9.5)j(and)e(ab)s(o)m(v)m(e.)53 -b(They)34 b(pro-)870 634 y(vided)h(some)i(coarse)g(con)m(trol)g(o)m(v)m -(er)g(the)g(b)s(eha)m(viour)e(of)h(the)g(sorting)g(algo-)870 -738 y(rithm)h(in)h(earlier)g(v)m(ersions,)j(whic)m(h)d(w)m(as)h -(sometimes)h(useful.)65 b(0.9.5)41 b(and)870 842 y(ab)s(o)m(v)m(e)34 -b(ha)m(v)m(e)g(an)f(impro)m(v)m(ed)g(algorithm)f(whic)m(h)f(renders)h -(these)h(\015ags)h(irrel-)870 946 y(ev)-5 b(an)m(t.)390 -1190 y Ff(MEMOR)-10 b(Y)40 
b(MANA)m(GEMENT)390 1377 y -Fj(bzip2)25 b Fl(compresses)i(large)g(\014les)e(in)g(blo)s(c)m(ks.)39 -b(The)26 b(blo)s(c)m(k)h(size)f(a\013ects)i(b)s(oth)e(the)h(compres-) -390 1481 y(sion)39 b(ratio)g(ac)m(hiev)m(ed,)k(and)d(the)f(amoun)m(t)i -(of)e(memory)h(needed)f(for)h(compression)f(and)390 1585 -y(decompression.)59 b(The)36 b(\015ags)h Fj(-1)f Fl(through)h -Fj(-9)f Fl(sp)s(ecify)f(the)i(blo)s(c)m(k)g(size)f(to)i(b)s(e)e -(100,000)390 1688 y(b)m(ytes)29 b(through)e(900,000)k(b)m(ytes)d(\(the) -h(default\))e(resp)s(ectiv)m(ely)-8 b(.)40 b(A)m(t)29 -b(decompression)e(time,)390 1792 y(the)32 b(blo)s(c)m(k)g(size)g(used)g -(for)g(compression)f(is)g(read)h(from)g(the)g(header)g(of)h(the)f -(compressed)390 1896 y(\014le,)j(and)f Fj(bunzip2)e Fl(then)i(allo)s -(cates)h(itself)e(just)h(enough)g(memory)g(to)i(decompress)e(the)390 -2000 y(\014le.)39 b(Since)29 b(blo)s(c)m(k)g(sizes)g(are)h(stored)f(in) -f(compressed)h(\014les,)g(it)g(follo)m(ws)f(that)i(the)g(\015ags)g -Fj(-1)390 2103 y Fl(to)h Fj(-9)f Fl(are)h(irrelev)-5 -b(an)m(t)29 b(to)i(and)f(so)h(ignored)e(during)f(decompression.)390 -2255 y(Compression)h(and)g(decompression)h(requiremen)m(ts,)f(in)g(b)m -(ytes,)j(can)e(b)s(e)g(estimated)h(as:)869 2406 y Fj(Compression:)140 -b(400k)46 b(+)i(\()f(8)h(x)f(block)f(size)h(\))869 2613 -y(Decompression:)d(100k)i(+)i(\()f(4)h(x)f(block)f(size)h(\),)g(or)1585 -2717 y(100k)f(+)i(\()f(2.5)g(x)g(block)g(size)f(\))390 -2868 y Fl(Larger)29 b(blo)s(c)m(k)f(sizes)h(giv)m(e)g(rapidly)d -(diminishing)e(marginal)k(returns.)39 b(Most)29 b(of)g(the)g(com-)390 -2972 y(pression)d(comes)j(from)f(the)g(\014rst)g(t)m(w)m(o)h(or)f -(three)h(h)m(undred)d(k)i(of)g(blo)s(c)m(k)g(size,)g(a)h(fact)g(w)m -(orth)390 3075 y(b)s(earing)j(in)f(mind)g(when)h(using)f -Fj(bzip2)h Fl(on)g(small)g(mac)m(hines.)47 b(It)33 b(is)f(also)h(imp)s -(ortan)m(t)f(to)390 3179 y(appreciate)j(that)h(the)f(decompression)f -(memory)h(requiremen)m(t)f(is)h(set)g(at)h(compression)390 -3283 y(time)30 
b(b)m(y)g(the)h(c)m(hoice)g(of)g(blo)s(c)m(k)f(size.)390 -3434 y(F)-8 b(or)45 b(\014les)f(compressed)g(with)g(the)g(default)g -(900k)i(blo)s(c)m(k)e(size,)49 b Fj(bunzip2)42 b Fl(will)g(require)390 -3538 y(ab)s(out)29 b(3700)j(kb)m(ytes)e(to)h(decompress.)40 -b(T)-8 b(o)30 b(supp)s(ort)e(decompression)h(of)h(an)m(y)g(\014le)f(on) -g(a)i(4)390 3642 y(megab)m(yte)h(mac)m(hine,)d Fj(bunzip2)f -Fl(has)i(an)g(option)f(to)i(decompress)e(using)g(appro)m(ximately)390 -3745 y(half)k(this)g(amoun)m(t)i(of)f(memory)-8 b(,)36 -b(ab)s(out)e(2300)i(kb)m(ytes.)53 b(Decompression)34 -b(sp)s(eed)g(is)f(also)390 3849 y(halv)m(ed,)i(so)f(y)m(ou)h(should)d -(use)h(this)g(option)h(only)f(where)h(necessary)-8 b(.)53 -b(The)33 b(relev)-5 b(an)m(t)35 b(\015ag)390 3953 y(is)29 -b Fj(-s)p Fl(.)390 4104 y(In)34 b(general,)i(try)f(and)f(use)g(the)h -(largest)h(blo)s(c)m(k)e(size)h(memory)f(constrain)m(ts)h(allo)m(w,)h -(since)390 4208 y(that)45 b(maximises)f(the)h(compression)f(ac)m(hiev)m -(ed.)85 b(Compression)43 b(and)h(decompression)390 4311 -y(sp)s(eed)30 b(are)g(virtually)e(una\013ected)j(b)m(y)f(blo)s(c)m(k)g -(size.)390 4463 y(Another)25 b(signi\014can)m(t)f(p)s(oin)m(t)g -(applies)f(to)j(\014les)e(whic)m(h)g(\014t)h(in)e(a)j(single)d(blo)s(c) -m(k)i({)g(that)h(means)390 4566 y(most)g(\014les)g(y)m(ou'd)g(encoun)m -(ter)h(using)d(a)j(large)f(blo)s(c)m(k)g(size.)39 b(The)25 -b(amoun)m(t)i(of)f(real)g(memory)390 4670 y(touc)m(hed)38 -b(is)e(prop)s(ortional)f(to)j(the)f(size)g(of)h(the)f(\014le,)h(since)f -(the)g(\014le)g(is)f(smaller)g(than)h(a)390 4774 y(blo)s(c)m(k.)49 -b(F)-8 b(or)35 b(example,)f(compressing)e(a)i(\014le)e(20,000)k(b)m -(ytes)e(long)f(with)f(the)i(\015ag)g Fj(-9)f Fl(will)390 -4878 y(cause)28 b(the)f(compressor)g(to)h(allo)s(cate)f(around)f(7600k) -j(of)e(memory)-8 b(,)28 b(but)f(only)f(touc)m(h)i(400k)390 -4981 y Fj(+)h Fl(20000)j(*)e(8)g(=)f(560)i(kb)m(ytes)f(of)g(it.)40 -b(Similarly)-8 b(,)26 b(the)k(decompressor)f(will)e(allo)s(cate)j -(3700k)390 5085 
y(but)g(only)f(touc)m(h)i(100k)h Fj(+)e -Fl(20000)i(*)f(4)g(=)f(180)i(kb)m(ytes.)390 5236 y(Here)41 -b(is)f(a)i(table)f(whic)m(h)e(summarises)g(the)j(maxim)m(um)d(memory)i -(usage)h(for)e(di\013eren)m(t)390 5340 y(blo)s(c)m(k)25 -b(sizes.)38 b(Also)25 b(recorded)g(is)f(the)i(total)g(compressed)e -(size)h(for)g(14)h(\014les)e(of)i(the)f(Calgary)p eop -%%Page: 7 8 -7 7 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 -b(to)g(use)f Fj(bzip2)2375 b Fl(7)390 299 y(T)-8 b(ext)38 -b(Compression)d(Corpus)h(totalling)h(3,141,622)k(b)m(ytes.)61 -b(This)36 b(column)g(giv)m(es)i(some)390 403 y(feel)23 -b(for)f(ho)m(w)h(compression)f(v)-5 b(aries)23 b(with)e(blo)s(c)m(k)i -(size.)38 b(These)23 b(\014gures)f(tend)g(to)i(understate)390 -506 y(the)g(adv)-5 b(an)m(tage)26 b(of)e(larger)f(blo)s(c)m(k)h(sizes)f -(for)h(larger)f(\014les,)h(since)g(the)g(Corpus)e(is)h(dominated)390 -610 y(b)m(y)30 b(smaller)f(\014les.)1107 761 y Fj(Compress)141 -b(Decompress)g(Decompress)f(Corpus)773 865 y(Flag)238 -b(usage)285 b(usage)332 b(-s)48 b(usage)237 b(Size)821 -1073 y(-1)286 b(1200k)332 b(500k)429 b(350k)285 b(914704)821 -1176 y(-2)h(2000k)332 b(900k)429 b(600k)285 b(877703)821 -1280 y(-3)h(2800k)f(1300k)428 b(850k)285 b(860338)821 -1384 y(-4)h(3600k)f(1700k)380 b(1100k)285 b(846899)821 -1488 y(-5)h(4400k)f(2100k)380 b(1350k)285 b(845160)821 -1591 y(-6)h(5200k)f(2500k)380 b(1600k)285 b(838626)821 -1695 y(-7)h(6100k)f(2900k)380 b(1850k)285 b(834096)821 -1799 y(-8)h(6800k)f(3300k)380 b(2100k)285 b(828642)821 -1903 y(-9)h(7600k)f(3700k)380 b(2350k)285 b(828642)390 -2147 y Ff(RECO)m(VERING)37 b(D)m(A)-10 b(T)g(A)40 b(FR)m(OM)h(D)m(AMA)m -(GED)e(FILES)390 2333 y Fj(bzip2)25 b Fl(compresses)h(\014les)g(in)f -(blo)s(c)m(ks,)h(usually)e(900kb)m(ytes)29 b(long.)39 -b(Eac)m(h)27 b(blo)s(c)m(k)e(is)h(handled)390 2437 y(indep)s(enden)m -(tly)-8 b(.)47 b(If)32 b(a)i(media)e(or)h(transmission)e(error)i -(causes)h(a)f(m)m(ulti-blo)s(c)m(k)f Fj(.bz2)g Fl(\014le)390 -2541 
y(to)k(b)s(ecome)h(damaged,)g(it)e(ma)m(y)i(b)s(e)e(p)s(ossible)e -(to)k(reco)m(v)m(er)g(data)f(from)g(the)f(undamaged)390 -2645 y(blo)s(c)m(ks)30 b(in)f(the)h(\014le.)390 2796 -y(The)j(compressed)h(represen)m(tation)f(of)h(eac)m(h)h(blo)s(c)m(k)e -(is)g(delimited)e(b)m(y)j(a)g(48-bit)g(pattern,)390 2900 -y(whic)m(h)27 b(mak)m(es)j(it)e(p)s(ossible)e(to)j(\014nd)e(the)i(blo)s -(c)m(k)f(b)s(oundaries)e(with)i(reasonable)g(certain)m(t)m(y)-8 -b(.)390 3003 y(Eac)m(h)34 b(blo)s(c)m(k)f(also)g(carries)g(its)g(o)m -(wn)g(32-bit)g(CR)m(C,)h(so)f(damaged)h(blo)s(c)m(ks)f(can)g(b)s(e)g -(distin-)390 3107 y(guished)c(from)h(undamaged)g(ones.)390 -3258 y Fj(bzip2recover)37 b Fl(is)j(a)h(simple)e(program)h(whose)g -(purp)s(ose)f(is)h(to)i(searc)m(h)f(for)f(blo)s(c)m(ks)g(in)390 -3362 y Fj(.bz2)34 b Fl(\014les,)i(and)f(write)f(eac)m(h)j(blo)s(c)m(k)d -(out)i(in)m(to)f(its)g(o)m(wn)g Fj(.bz2)f Fl(\014le.)55 -b(Y)-8 b(ou)36 b(can)f(then)g(use)390 3466 y Fj(bzip2)29 -b(-t)c Fl(to)i(test)f(the)g(in)m(tegrit)m(y)g(of)g(the)g(resulting)e -(\014les,)i(and)f(decompress)h(those)g(whic)m(h)390 3569 -y(are)31 b(undamaged.)390 3721 y Fj(bzip2recover)41 b -Fl(tak)m(es)46 b(a)f(single)e(argumen)m(t,)49 b(the)44 -b(name)h(of)g(the)f(damaged)h(\014le,)j(and)390 3824 -y(writes)33 b(a)i(n)m(um)m(b)s(er)d(of)j(\014les)e Fj(rec0001file.bz2)p -Fl(,)e Fj(rec0002file.bz2)p Fl(,)g(etc,)36 b(con)m(taining)390 -3928 y(the)42 b(extracted)g(blo)s(c)m(ks.)74 b(The)41 -b(output)g(\014lenames)f(are)i(designed)e(so)i(that)g(the)g(use)f(of) -390 4032 y(wildcards)30 b(in)h(subsequen)m(t)h(pro)s(cessing)f({)i(for) -g(example,)g Fj(bzip2)c(-dc)g(rec*file.bz2)e(>)390 4136 -y(recovered_data)f Fl({)31 b(lists)e(the)i(\014les)e(in)g(the)i -(correct)g(order.)390 4287 y Fj(bzip2recover)38 b Fl(should)i(b)s(e)g -(of)i(most)g(use)f(dealing)f(with)g(large)i Fj(.bz2)e -Fl(\014les,)k(as)d(these)390 4390 y(will)29 b(con)m(tain)j(man)m(y)g -(blo)s(c)m(ks.)45 b(It)32 b(is)f(clearly)g(futile)f(to)i(use)g(it)f(on) 
-h(damaged)g(single-blo)s(c)m(k)390 4494 y(\014les,)g(since)f(a)h -(damaged)h(blo)s(c)m(k)e(cannot)i(b)s(e)e(reco)m(v)m(ered.)47 -b(If)32 b(y)m(ou)g(wish)e(to)j(minimise)c(an)m(y)390 -4598 y(p)s(oten)m(tial)36 b(data)i(loss)e(through)g(media)h(or)f -(transmission)f(errors,)j(y)m(ou)f(migh)m(t)g(consider)390 -4702 y(compressing)29 b(with)g(a)i(smaller)e(blo)s(c)m(k)h(size.)390 -4946 y Ff(PERF)m(ORMANCE)39 b(NOTES)390 5132 y Fl(The)f(sorting)f -(phase)h(of)h(compression)e(gathers)i(together)h(similar)35 -b(strings)i(in)g(the)i(\014le.)390 5236 y(Because)54 -b(of)f(this,)58 b(\014les)52 b(con)m(taining)g(v)m(ery)h(long)g(runs)e -(of)i(rep)s(eated)g(sym)m(b)s(ols,)58 b(lik)m(e)390 5340 -y Fj(")p Fl(aabaabaabaab)e(...)p Fj(")g Fl(\(rep)s(eated)g(sev)m(eral)f -(h)m(undred)e(times\))i(ma)m(y)h(compress)f(more)p eop -%%Page: 8 9 -8 8 bop 150 -116 a Fl(Chapter)30 b(2:)41 b(Ho)m(w)31 -b(to)g(use)f Fj(bzip2)2375 b Fl(8)390 299 y(slo)m(wly)33 -b(than)g(normal.)50 b(V)-8 b(ersions)33 b(0.9.5)i(and)f(ab)s(o)m(v)m(e) -h(fare)e(m)m(uc)m(h)h(b)s(etter)g(than)f(previous)390 -403 y(v)m(ersions)i(in)g(this)f(resp)s(ect.)57 b(The)35 -b(ratio)h(b)s(et)m(w)m(een)h(w)m(orst-case)g(and)e(a)m(v)m(erage-case) -40 b(com-)390 506 y(pression)e(time)h(is)f(in)g(the)h(region)g(of)h -(10:1.)69 b(F)-8 b(or)40 b(previous)e(v)m(ersions,)j(this)d(\014gure)h -(w)m(as)390 610 y(more)f(lik)m(e)g(100:1.)66 b(Y)-8 b(ou)38 -b(can)h(use)e(the)i Fj(-vvvv)d Fl(option)i(to)h(monitor)e(progress)h -(in)f(great)390 714 y(detail,)30 b(if)f(y)m(ou)i(w)m(an)m(t.)390 -865 y(Decompression)f(sp)s(eed)g(is)f(una\013ected)i(b)m(y)f(these)h -(phenomena.)390 1016 y Fj(bzip2)i Fl(usually)g(allo)s(cates)i(sev)m -(eral)f(megab)m(ytes)j(of)d(memory)h(to)g(op)s(erate)h(in,)e(and)g -(then)390 1120 y(c)m(harges)k(all)d(o)m(v)m(er)j(it)f(in)e(a)i(fairly)e -(random)h(fashion.)59 b(This)34 b(means)j(that)g(p)s(erformance,)390 -1224 y(b)s(oth)27 b(for)h(compressing)f(and)g(decompressing,)h(is)f 
-(largely)g(determined)g(b)m(y)h(the)g(sp)s(eed)f(at)390 -1327 y(whic)m(h)35 b(y)m(our)h(mac)m(hine)g(can)g(service)g(cac)m(he)i -(misses.)57 b(Because)37 b(of)g(this,)f(small)f(c)m(hanges)390 -1431 y(to)f(the)f(co)s(de)h(to)f(reduce)g(the)h(miss)d(rate)j(ha)m(v)m -(e)h(b)s(een)d(observ)m(ed)h(to)h(giv)m(e)g(disprop)s(ortion-)390 -1535 y(ately)i(large)f(p)s(erformance)f(impro)m(v)m(emen)m(ts.)56 -b(I)35 b(imagine)f Fj(bzip2)g Fl(will)e(p)s(erform)i(b)s(est)h(on)390 -1639 y(mac)m(hines)30 b(with)f(v)m(ery)i(large)f(cac)m(hes.)390 -1885 y Ff(CA)-14 b(VEA)k(TS)390 2072 y Fl(I/O)38 b(error)g(messages)h -(are)f(not)h(as)f(helpful)e(as)i(they)g(could)f(b)s(e.)64 -b Fj(bzip2)37 b Fl(tries)g(hard)g(to)390 2176 y(detect)29 -b(I/O)e(errors)g(and)f(exit)i(cleanly)-8 b(,)27 b(but)g(the)h(details)e -(of)h(what)h(the)f(problem)f(is)g(some-)390 2280 y(times)k(seem)h -(rather)f(misleading.)390 2431 y(This)j(man)m(ual)g(page)i(p)s(ertains) -e(to)i(v)m(ersion)f(1.0)i(of)e Fj(bzip2)p Fl(.)51 b(Compressed)34 -b(data)h(created)390 2534 y(b)m(y)25 b(this)e(v)m(ersion)i(is)e(en)m -(tirely)h(forw)m(ards)h(and)f(bac)m(kw)m(ards)h(compatible)f(with)f -(the)i(previous)390 2638 y(public)18 b(releases,)24 b(v)m(ersions)c -(0.1pl2,)k(0.9.0)e(and)f(0.9.5,)k(but)20 b(with)g(the)h(follo)m(wing)e -(exception:)390 2742 y(0.9.0)43 b(and)e(ab)s(o)m(v)m(e)h(can)g -(correctly)f(decompress)g(m)m(ultiple)e(concatenated)k(compressed)390 -2846 y(\014les.)c(0.1pl2)30 b(cannot)g(do)f(this;)f(it)h(will)e(stop)i -(after)h(decompressing)e(just)g(the)i(\014rst)e(\014le)g(in)390 -2949 y(the)j(stream.)390 3100 y Fj(bzip2recover)20 b -Fl(uses)k(32-bit)g(in)m(tegers)f(to)i(represen)m(t)f(bit)e(p)s -(ositions)g(in)g(compressed)i(\014les,)390 3204 y(so)j(it)f(cannot)i -(handle)d(compressed)i(\014les)f(more)h(than)f(512)i(megab)m(ytes)h -(long.)39 b(This)25 b(could)390 3308 y(easily)30 b(b)s(e)f(\014xed.)390 -3555 y Ff(A)m(UTHOR)390 3741 y Fl(Julian)f(Sew)m(ard,)i -Fj(jseward@acm.org)p Fl(.)390 3892 y(The)24 
b(ideas)f(em)m(b)s(o)s -(died)f(in)h Fj(bzip2)f Fl(are)j(due)e(to)i(\(at)g(least\))g(the)f -(follo)m(wing)e(p)s(eople:)37 b(Mic)m(hael)390 3996 y(Burro)m(ws)48 -b(and)g(Da)m(vid)h(Wheeler)f(\(for)h(the)g(blo)s(c)m(k)f(sorting)g -(transformation\),)53 b(Da)m(vid)390 4100 y(Wheeler)45 -b(\(again,)50 b(for)45 b(the)g(Hu\013man)g(co)s(der\),)k(P)m(eter)d(F) --8 b(en)m(wic)m(k)46 b(\(for)g(the)f(structured)390 4204 -y(co)s(ding)26 b(mo)s(del)g(in)f(the)i(original)e Fj(bzip)p -Fl(,)i(and)f(man)m(y)h(re\014nemen)m(ts\),)h(and)e(Alistair)f -(Mo\013at,)390 4307 y(Radford)34 b(Neal)h(and)f(Ian)h(Witten)g(\(for)f -(the)h(arithmetic)g(co)s(der)f(in)g(the)h(original)d -Fj(bzip)p Fl(\).)390 4411 y(I)41 b(am)g(m)m(uc)m(h)h(indebted)e(for)h -(their)f(help,)j(supp)s(ort)c(and)i(advice.)74 b(See)41 -b(the)h(man)m(ual)e(in)390 4515 y(the)28 b(source)g(distribution)23 -b(for)28 b(p)s(oin)m(ters)e(to)j(sources)e(of)h(do)s(cumen)m(tation.)40 -b(Christian)25 b(v)m(on)390 4619 y(Ro)s(ques)31 b(encouraged)h(me)g(to) -g(lo)s(ok)f(for)h(faster)g(sorting)f(algorithms,)f(so)i(as)g(to)g(sp)s -(eed)f(up)390 4723 y(compression.)47 b(Bela)34 b(Lubkin)c(encouraged)k -(me)f(to)g(impro)m(v)m(e)g(the)g(w)m(orst-case)i(compres-)390 -4826 y(sion)25 b(p)s(erformance.)38 b(Man)m(y)26 b(p)s(eople)f(sen)m(t) -h(patc)m(hes,)h(help)s(ed)d(with)g(p)s(ortabilit)m(y)f(problems,)390 -4930 y(len)m(t)30 b(mac)m(hines,)g(ga)m(v)m(e)j(advice)d(and)g(w)m(ere) -h(generally)f(helpful.)p eop -%%Page: 9 10 -9 9 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1927 b Fl(9)150 299 y Fh(3)80 b(Programming)53 -b(with)h Fg(libbzip2)150 568 y Fl(This)29 b(c)m(hapter)i(describ)s(es)d -(the)j(programming)e(in)m(terface)i(to)g Fj(libbzip2)p -Fl(.)150 725 y(F)-8 b(or)36 b(general)e(bac)m(kground)h(information,)f -(particularly)f(ab)s(out)h(memory)h(use)f(and)g(p)s(erformance)g(as-) -150 834 y(p)s(ects,)d(y)m(ou'd)f(b)s(e)g(w)m(ell)f(advised)g(to)j(read) -e(Chapter)g(2)g(as)h(w)m(ell.)150 1124 y Fk(3.1)68 
b(T)-11 -b(op-lev)l(el)46 b(structure)150 1316 y Fj(libbzip2)33 -b Fl(is)i(a)h(\015exible)e(library)f(for)j(compressing)f(and)g -(decompressing)f(data)j(in)d(the)i Fj(bzip2)e Fl(data)150 -1426 y(format.)39 b(Although)24 b(pac)m(k)-5 b(aged)26 -b(as)e(a)h(single)e(en)m(tit)m(y)-8 b(,)27 b(it)d(helps)f(to)i(regard)g -(the)g(library)d(as)i(three)h(separate)150 1535 y(parts:)40 -b(the)31 b(lo)m(w)f(lev)m(el)g(in)m(terface,)h(and)f(the)h(high)e(lev)m -(el)h(in)m(terface,)h(and)f(some)h(utilit)m(y)d(functions.)150 -1692 y(The)38 b(structure)g(of)g Fj(libbzip2)p Fl('s)e(in)m(terfaces)j -(is)e(similar)f(to)j(that)g(of)g(Jean-loup)e(Gailly's)g(and)h(Mark)150 -1802 y(Adler's)29 b(excellen)m(t)i Fj(zlib)e Fl(library)-8 -b(.)150 1959 y(All)29 b(externally)g(visible)f(sym)m(b)s(ols)h(ha)m(v)m -(e)i(names)f(b)s(eginning)e Fj(BZ2_)p Fl(.)39 b(This)29 -b(is)g(new)h(in)f(v)m(ersion)h(1.0.)41 b(The)150 2068 -y(in)m(ten)m(tion)30 b(is)f(to)i(minimise)d(p)s(ollution)f(of)k(the)f -(namespaces)h(of)g(library)d(clien)m(ts.)150 2321 y Ff(3.1.1)63 -b(Lo)m(w-lev)m(el)39 b(summary)150 2514 y Fl(This)21 -b(in)m(terface)h(pro)m(vides)g(services)g(for)g(compressing)f(and)h -(decompressing)f(data)i(in)e(memory)-8 b(.)38 b(There's)150 -2623 y(no)43 b(pro)m(vision)e(for)h(dealing)g(with)f(\014les,)k -(streams)e(or)g(an)m(y)g(other)g(I/O)g(mec)m(hanisms,)i(just)e(straigh) -m(t)150 2733 y(memory-to-memory)25 b(w)m(ork.)38 b(In)23 -b(fact,)k(this)22 b(part)i(of)f(the)h(library)d(can)j(b)s(e)f(compiled) -f(without)h(inclusion)150 2843 y(of)31 b Fj(stdio.h)p -Fl(,)d(whic)m(h)h(ma)m(y)i(b)s(e)f(helpful)d(for)k(em)m(b)s(edded)e -(applications.)150 2999 y(The)h(lo)m(w-lev)m(el)g(part)g(of)h(the)f -(library)e(has)i(no)h(global)e(v)-5 b(ariables)29 b(and)h(is)g -(therefore)g(thread-safe.)150 3156 y(Six)d(routines)g(mak)m(e)j(up)d -(the)i(lo)m(w)f(lev)m(el)g(in)m(terface:)41 b Fj(BZ2_bzCompressInit)p -Fl(,)24 b Fj(BZ2_bzCompress)p Fl(,)h(and)150 3266 y Fj -(BZ2_bzCompressEnd)h 
Fl(for)k(compression,)f(and)h(a)h(corresp)s -(onding)d(trio)i Fj(BZ2_bzDecompressInit)p Fl(,)150 3375 -y Fj(BZ2_bzDecompress)37 b Fl(and)j Fj(BZ2_bzDecompressEnd)c -Fl(for)42 b(decompression.)72 b(The)41 b Fj(*Init)e Fl(functions)150 -3485 y(allo)s(cate)44 b(memory)g(for)f(compression/decompression)f(and) -h(do)h(other)g(initialisations,)f(whilst)f(the)150 3595 -y Fj(*End)29 b Fl(functions)g(close)i(do)m(wn)f(op)s(erations)f(and)h -(release)h(memory)-8 b(.)150 3751 y(The)36 b(real)f(w)m(ork)i(is)e -(done)h(b)m(y)g Fj(BZ2_bzCompress)c Fl(and)j Fj(BZ2_bzDecompress)p -Fl(.)54 b(These)36 b(compress)g(and)150 3861 y(decompress)30 -b(data)h(from)f(a)h(user-supplied)c(input)i(bu\013er)g(to)i(a)g -(user-supplied)c(output)j(bu\013er.)40 b(These)150 3971 -y(bu\013ers)32 b(can)i(b)s(e)e(an)m(y)i(size;)g(arbitrary)e(quan)m -(tities)h(of)g(data)h(are)g(handled)d(b)m(y)i(making)f(rep)s(eated)i -(calls)150 4080 y(to)f(these)f(functions.)44 b(This)30 -b(is)h(a)h(\015exible)e(mec)m(hanism)i(allo)m(wing)e(a)i(consumer-pull) -e(st)m(yle)i(of)g(activit)m(y)-8 b(,)150 4190 y(or)30 -b(pro)s(ducer-push,)e(or)i(a)h(mixture)e(of)i(b)s(oth.)150 -4443 y Ff(3.1.2)63 b(High-lev)m(el)41 b(summary)150 4635 -y Fl(This)d(in)m(terface)j(pro)m(vides)e(some)h(handy)f(wrapp)s(ers)f -(around)h(the)i(lo)m(w-lev)m(el)f(in)m(terface)g(to)h(facilitate)150 -4745 y(reading)26 b(and)g(writing)f Fj(bzip2)g Fl(format)i(\014les)f -(\()p Fj(.bz2)g Fl(\014les\).)38 b(The)27 b(routines)e(pro)m(vide)h(ho) -s(oks)h(to)g(facilitate)150 4854 y(reading)43 b(\014les)f(in)h(whic)m -(h)f(the)i Fj(bzip2)f Fl(data)h(stream)g(is)f(em)m(b)s(edded)f(within)g -(some)i(larger-scale)g(\014le)150 4964 y(structure,)30 -b(or)h(where)e(there)i(are)g(m)m(ultiple)d Fj(bzip2)h -Fl(data)i(streams)f(concatenated)j(end-to-end.)150 5121 -y(F)-8 b(or)31 b(reading)f(\014les,)f Fj(BZ2_bzReadOpen)p -Fl(,)e Fj(BZ2_bzRead)p Fl(,)h Fj(BZ2_bzReadClose)e Fl(and)150 -5230 y Fj(BZ2_bzReadGetUnused)19 b Fl(are)25 b(supplied.)36 -b(F)-8 b(or)25 
b(writing)d(\014les,)j Fj(BZ2_bzWriteOpen)p -Fl(,)d Fj(BZ2_bzWrite)g Fl(and)150 5340 y Fj(BZ2_bzWriteFinish)k -Fl(are)k(a)m(v)-5 b(ailable.)p eop -%%Page: 10 11 -10 10 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(10)150 299 y(As)24 b(with)f(the)h(lo)m -(w-lev)m(el)h(library)-8 b(,)23 b(no)h(global)g(v)-5 -b(ariables)23 b(are)h(used)g(so)g(the)h(library)c(is)j(p)s(er)f(se)h -(thread-safe.)150 408 y(Ho)m(w)m(ev)m(er,)32 b(if)c(I/O)h(errors)g(o)s -(ccur)g(whilst)e(reading)i(or)g(writing)e(the)j(underlying)c -(compressed)j(\014les,)g(y)m(ou)150 518 y(ma)m(y)j(ha)m(v)m(e)g(to)g -(consult)e Fj(errno)g Fl(to)h(determine)g(the)g(cause)g(of)h(the)f -(error.)42 b(In)30 b(that)i(case,)h(y)m(ou'd)e(need)g(a)150 -628 y(C)f(library)e(whic)m(h)h(correctly)i(supp)s(orts)d -Fj(errno)h Fl(in)g(a)i(m)m(ultithreaded)e(en)m(vironmen)m(t.)150 -784 y(T)-8 b(o)56 b(mak)m(e)g(the)g(library)d(a)j(little)e(simpler)f -(and)i(more)h(p)s(ortable,)61 b Fj(BZ2_bzReadOpen)51 -b Fl(and)k Fj(BZ2_)150 894 y(bzWriteOpen)34 b Fl(require)j(y)m(ou)g(to) -i(pass)e(them)g(\014le)g(handles)f(\()p Fj(FILE*)p Fl(s\))g(whic)m(h)h -(ha)m(v)m(e)h(previously)e(b)s(een)150 1004 y(op)s(ened)41 -b(for)g(reading)f(or)h(writing)f(resp)s(ectiv)m(ely)-8 -b(.)73 b(That)41 b(a)m(v)m(oids)h(p)s(ortabilit)m(y)d(problems)g(asso)s -(ciated)150 1113 y(with)j(\014le)h(op)s(erations)g(and)g(\014le)g -(attributes,)j(whilst)c(not)i(b)s(eing)e(m)m(uc)m(h)h(of)h(an)g(imp)s -(osition)c(on)k(the)150 1223 y(programmer.)150 1474 y -Ff(3.1.3)63 b(Utilit)m(y)40 b(functions)h(summary)150 -1666 y Fl(F)-8 b(or)45 b(v)m(ery)g(simple)d(needs,)48 -b Fj(BZ2_bzBuffToBuffCompres)o(s)38 b Fl(and)44 b Fj -(BZ2_bzBuffToBuffDecompres)o(s)150 1776 y Fl(are)29 b(pro)m(vided.)38 -b(These)28 b(compress)g(data)h(in)e(memory)h(from)g(one)h(bu\013er)e -(to)i(another)f(bu\013er)g(in)f(a)h(single)150 1885 y(function)38 -b(call.)67 b(Y)-8 b(ou)40 b(should)d(assess)j(whether)f(these)h 
-(functions)d(ful\014ll)f(y)m(our)k(memory-to-memory)150 -1995 y(compression/decompression)26 b(requiremen)m(ts)h(b)s(efore)g(in) -m(v)m(esting)g(e\013ort)i(in)d(understanding)f(the)j(more)150 -2105 y(general)i(but)g(more)h(complex)f(lo)m(w-lev)m(el)g(in)m -(terface.)150 2261 y(Y)-8 b(oshiok)j(a)47 b(Tsuneo)e(\()p -Fj(QWF00133@niftyserve.or.jp)40 b Fl(/)46 b Fj -(tsuneo-y@is.aist-nara.ac.)o(jp)p Fl(\))40 b(has)150 -2371 y(con)m(tributed)f(some)h(functions)e(to)j(giv)m(e)f(b)s(etter)g -Fj(zlib)f Fl(compatibilit)m(y)-8 b(.)67 b(These)40 b(functions)e(are)i -Fj(BZ2_)150 2481 y(bzopen)p Fl(,)e Fj(BZ2_bzread)p Fl(,)f -Fj(BZ2_bzwrite)p Fl(,)g Fj(BZ2_bzflush)p Fl(,)g Fj(BZ2_bzclose)p -Fl(,)f Fj(BZ2_bzerror)f Fl(and)i Fj(BZ2_)150 2590 y(bzlibVersion)p -Fl(.)49 b(Y)-8 b(ou)35 b(ma)m(y)g(\014nd)e(these)i(functions)d(more)j -(con)m(v)m(enien)m(t)g(for)f(simple)f(\014le)g(reading)h(and)150 -2700 y(writing,)c(than)h(those)h(in)e(the)i(high-lev)m(el)e(in)m -(terface.)45 b(These)31 b(functions)f(are)i(not)g(\(y)m(et\))h -(o\016cially)d(part)150 2809 y(of)k(the)g(library)-8 -b(,)33 b(and)g(are)h(minimally)c(do)s(cumen)m(ted)k(here.)51 -b(If)33 b(they)h(break,)h(y)m(ou)f(get)h(to)g(k)m(eep)f(all)f(the)150 -2919 y(pieces.)40 b(I)31 b(hop)s(e)e(to)i(do)s(cumen)m(t)g(them)f(prop) -s(erly)e(when)h(time)i(p)s(ermits.)150 3076 y(Y)-8 b(oshiok)j(a)27 -b(also)g(con)m(tributed)f(mo)s(di\014cations)f(to)i(allo)m(w)f(the)h -(library)e(to)i(b)s(e)f(built)f(as)i(a)g(Windo)m(ws)f(DLL.)150 -3362 y Fk(3.2)68 b(Error)45 b(handling)150 3554 y Fl(The)23 -b(library)f(is)h(designed)g(to)i(reco)m(v)m(er)g(cleanly)f(in)e(all)h -(situations,)h(including)d(the)j(w)m(orst-case)i(situation)150 -3664 y(of)j(decompressing)e(random)g(data.)41 b(I'm)28 -b(not)h(100\045)g(sure)f(that)h(it)f(can)h(alw)m(a)m(ys)g(do)f(this,)g -(so)g(y)m(ou)h(migh)m(t)150 3774 y(w)m(an)m(t)i(to)g(add)e(a)i(signal)d -(handler)g(to)j(catc)m(h)h(segmen)m(tation)f(violations)e(during)f -(decompression)h(if)g(y)m(ou)150 3883 
y(are)g(feeling)f(esp)s(ecially)f -(paranoid.)39 b(I)28 b(w)m(ould)g(b)s(e)g(in)m(terested)h(in)e(hearing) -h(more)h(ab)s(out)f(the)h(robustness)150 3993 y(of)i(the)f(library)e -(to)j(corrupted)f(compressed)g(data.)150 4150 y(V)-8 -b(ersion)39 b(1.0)h(is)f(m)m(uc)m(h)g(more)h(robust)e(in)g(this)g(resp) -s(ect)i(than)f(0.9.0)i(or)e(0.9.5.)70 b(In)m(v)m(estigations)39 -b(with)150 4259 y(Chec)m(k)m(er)21 b(\(a)g(to)s(ol)g(for)f(detecting)h -(problems)d(with)h(memory)h(managemen)m(t,)k(similar)18 -b(to)j(Purify\))e(indicate)150 4369 y(that,)40 b(at)e(least)f(for)g -(the)h(few)e(\014les)h(I)g(tested,)j(all)c(single-bit)f(errors)i(in)e -(the)j(decompressed)f(data)h(are)150 4478 y(caugh)m(t)c(prop)s(erly)-8 -b(,)31 b(with)g(no)i(segmen)m(tation)h(faults,)e(no)g(reads)h(of)g -(uninitialised)27 b(data)34 b(and)e(no)g(out)h(of)150 -4588 y(range)f(reads)g(or)f(writes.)44 b(So)32 b(it's)f(certainly)g(m)m -(uc)m(h)h(impro)m(v)m(ed,)g(although)f(I)g(w)m(ouldn't)g(claim)g(it)g -(to)i(b)s(e)150 4698 y(totally)d(b)s(om)m(bpro)s(of.)150 -4854 y(The)25 b(\014le)g Fj(bzlib.h)f Fl(con)m(tains)i(all)f -(de\014nitions)e(needed)i(to)i(use)e(the)h(library)-8 -b(.)37 b(In)26 b(particular,)f(y)m(ou)h(should)150 4964 -y(de\014nitely)i(not)j(include)d Fj(bzlib_private.h)p -Fl(.)150 5121 y(In)39 b Fj(bzlib.h)p Fl(,)h(the)g(v)-5 -b(arious)39 b(return)f(v)-5 b(alues)39 b(are)h(de\014ned.)68 -b(The)39 b(follo)m(wing)f(list)h(is)f(not)i(in)m(tended)f(as)150 -5230 y(an)c(exhaustiv)m(e)h(description)d(of)i(the)h(circumstances)f -(in)f(whic)m(h)g(a)i(giv)m(en)f(v)-5 b(alue)35 b(ma)m(y)h(b)s(e)e -(returned)h({)150 5340 y(those)h(descriptions)d(are)j(giv)m(en)f -(later.)56 b(Rather,)37 b(it)d(is)h(in)m(tended)f(to)i(con)m(v)m(ey)h -(the)e(rough)g(meaning)g(of)p eop -%%Page: 11 12 -11 11 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(11)150 299 y(eac)m(h)38 -b(return)d(v)-5 b(alue.)59 b(The)36 b(\014rst)g(\014v)m(e)g(actions)h 
-(are)g(normal)f(and)f(not)i(in)m(tended)f(to)h(denote)g(an)f(error)150 -408 y(situation.)150 592 y Fj(BZ_OK)180 b Fl(The)30 b(requested)g -(action)h(w)m(as)g(completed)f(successfully)-8 b(.)150 -756 y Fj(BZ_RUN_OK)150 866 y(BZ_FLUSH_OK)150 975 y(BZ_FINISH_OK)630 -1085 y Fl(In)24 b Fj(BZ2_bzCompress)p Fl(,)e(the)i(requested)g -(\015ush/\014nish/nothing-sp)s(ecial)c(action)k(w)m(as)h(com-)630 -1194 y(pleted)30 b(successfully)-8 b(.)150 1358 y Fj(BZ_STREAM_END)630 -1468 y Fl(Compression)38 b(of)j(data)f(w)m(as)h(completed,)h(or)f(the)f -(logical)f(stream)i(end)e(w)m(as)i(detected)630 1577 -y(during)28 b(decompression.)150 1761 y(The)i(follo)m(wing)f(return)g -(v)-5 b(alues)30 b(indicate)f(an)h(error)g(of)h(some)g(kind.)150 -1945 y Fj(BZ_CONFIG_ERROR)630 2055 y Fl(Indicates)48 -b(that)h(the)g(library)e(has)h(b)s(een)g(improp)s(erly)d(compiled)j(on) -g(y)m(our)h(platform)630 2164 y({)j(a)g(ma)5 b(jor)51 -b(con\014guration)g(error.)104 b(Sp)s(eci\014cally)-8 -b(,)55 b(it)c(means)g(that)h Fj(sizeof\(char\))p Fl(,)630 -2274 y Fj(sizeof\(short\))44 b Fl(and)i Fj(sizeof\(int\))f -Fl(are)j(not)f(1,)52 b(2)c(and)f(4)h(resp)s(ectiv)m(ely)-8 -b(,)51 b(as)d(they)630 2384 y(should)27 b(b)s(e.)40 b(Note)30 -b(that)g(the)f(library)e(should)g(still)g(w)m(ork)i(prop)s(erly)e(on)i -(64-bit)g(platforms)630 2493 y(whic)m(h)d(follo)m(w)h(the)g(LP64)h -(programming)e(mo)s(del)h({)g(that)h(is,)g(where)e Fj(sizeof\(long\))f -Fl(and)630 2603 y Fj(sizeof\(void*\))e Fl(are)k(8.)40 -b(Under)25 b(LP64,)j Fj(sizeof\(int\))c Fl(is)h(still)f(4,)k(so)f -Fj(libbzip2)p Fl(,)e(whic)m(h)630 2712 y(do)s(esn't)30 -b(use)g(the)h Fj(long)e Fl(t)m(yp)s(e,)i(is)e(OK.)150 -2876 y Fj(BZ_SEQUENCE_ERROR)630 2986 y Fl(When)43 b(using)f(the)i -(library)-8 b(,)45 b(it)e(is)f(imp)s(ortan)m(t)h(to)h(call)e(the)i -(functions)e(in)g(the)i(correct)630 3095 y(sequence)28 -b(and)f(with)f(data)j(structures)e(\(bu\013ers)f(etc\))j(in)e(the)g -(correct)i(states.)41 b Fj(libbzip2)630 3205 y Fl(c)m(hec)m(ks)26 
-b(as)e(m)m(uc)m(h)h(as)f(it)g(can)g(to)h(ensure)f(this)f(is)g(happ)s -(ening,)h(and)f(returns)g Fj(BZ_SEQUENCE_)630 3314 y(ERROR)36 -b Fl(if)h(not.)62 b(Co)s(de)37 b(whic)m(h)g(complies)f(precisely)g -(with)h(the)g(function)g(seman)m(tics,)j(as)630 3424 -y(detailed)d(b)s(elo)m(w,)i(should)d(nev)m(er)i(receiv)m(e)h(this)d(v) --5 b(alue;)41 b(suc)m(h)d(an)g(ev)m(en)m(t)h(denotes)f(buggy)630 -3534 y(co)s(de)31 b(whic)m(h)e(y)m(ou)h(should)f(in)m(v)m(estigate.)150 -3697 y Fj(BZ_PARAM_ERROR)630 3807 y Fl(Returned)43 b(when)f(a)i -(parameter)g(to)h(a)f(function)e(call)h(is)f(out)i(of)g(range)g(or)g -(otherwise)630 3917 y(manifestly)34 b(incorrect.)57 b(As)36 -b(with)e Fj(BZ_SEQUENCE_ERROR)p Fl(,)f(this)i(denotes)h(a)g(bug)f(in)g -(the)630 4026 y(clien)m(t)23 b(co)s(de.)39 b(The)22 b(distinction)f(b)s -(et)m(w)m(een)j Fj(BZ_PARAM_ERROR)c Fl(and)j Fj(BZ_SEQUENCE_ERROR)630 -4136 y Fl(is)29 b(a)i(bit)f(hazy)-8 b(,)31 b(but)f(still)e(w)m(orth)i -(making.)150 4300 y Fj(BZ_MEM_ERROR)630 4409 y Fl(Returned)g(when)f(a)i -(request)f(to)i(allo)s(cate)f(memory)f(failed.)40 b(Note)31 -b(that)g(the)g(quan)m(tit)m(y)g(of)630 4519 y(memory)21 -b(needed)g(to)i(decompress)e(a)g(stream)h(cannot)g(b)s(e)f(determined)f -(un)m(til)g(the)h(stream's)630 4628 y(header)29 b(has)g(b)s(een)g -(read.)40 b(So)29 b Fj(BZ2_bzDecompress)c Fl(and)j Fj(BZ2_bzRead)f -Fl(ma)m(y)j(return)e Fj(BZ_)630 4738 y(MEM_ERROR)d Fl(ev)m(en)k(though) -e(some)h(of)g(the)g(compressed)g(data)g(has)g(b)s(een)f(read.)39 -b(The)28 b(same)630 4847 y(is)38 b(not)i(true)f(for)g(compression;)k -(once)d Fj(BZ2_bzCompressInit)34 b Fl(or)39 b Fj(BZ2_bzWriteOpen)630 -4957 y Fl(ha)m(v)m(e)32 b(successfully)c(completed,)j -Fj(BZ_MEM_ERROR)c Fl(cannot)k(o)s(ccur.)150 5121 y Fj(BZ_DATA_ERROR)630 -5230 y Fl(Returned)h(when)g(a)h(data)g(in)m(tegrit)m(y)g(error)g(is)e -(detected)k(during)30 b(decompression.)47 b(Most)630 -5340 y(imp)s(ortan)m(tly)-8 b(,)31 b(this)f(means)i(when)f(stored)g 
-(and)g(computed)h(CR)m(Cs)f(for)g(the)h(data)g(do)g(not)p -eop -%%Page: 12 13 -12 12 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(12)630 299 y(matc)m(h.)41 -b(This)28 b(v)-5 b(alue)29 b(is)f(also)i(returned)e(up)s(on)g -(detection)i(of)g(an)m(y)g(other)f(anomaly)h(in)e(the)630 -408 y(compressed)i(data.)150 560 y Fj(BZ_DATA_ERROR_MAGIC)630 -670 y Fl(As)k(a)g(sp)s(ecial)f(case)i(of)f Fj(BZ_DATA_ERROR)p -Fl(,)d(it)i(is)g(sometimes)h(useful)e(to)j(kno)m(w)f(when)f(the)630 -779 y(compressed)d(stream)h(do)s(es)f(not)g(start)h(with)e(the)i -(correct)h(magic)e(b)m(ytes)h(\()p Fj('B')f('Z')f('h')p -Fl(\).)150 931 y Fj(BZ_IO_ERROR)630 1040 y Fl(Returned)k(b)m(y)h -Fj(BZ2_bzRead)d Fl(and)i Fj(BZ2_bzWrite)e Fl(when)i(there)h(is)f(an)g -(error)h(reading)f(or)630 1150 y(writing)28 b(in)h(the)h(compressed)g -(\014le,)f(and)h(b)m(y)g Fj(BZ2_bzReadOpen)c Fl(and)j -Fj(BZ2_bzWriteOpen)630 1259 y Fl(for)i(attempts)i(to)f(use)f(a)h -(\014le)e(for)i(whic)m(h)e(the)h(error)g(indicator)g(\(viz,)g -Fj(ferror\(f\))p Fl(\))f(is)g(set.)630 1369 y(On)h(receipt)g(of)h -Fj(BZ_IO_ERROR)p Fl(,)e(the)h(caller)h(should)d(consult)i -Fj(errno)g Fl(and/or)g Fj(perror)f Fl(to)630 1479 y(acquire)g(op)s -(erating-system)g(sp)s(eci\014c)f(information)g(ab)s(out)h(the)h -(problem.)150 1630 y Fj(BZ_UNEXPECTED_EOF)630 1740 y -Fl(Returned)36 b(b)m(y)g Fj(BZ2_bzRead)e Fl(when)i(the)h(compressed)f -(\014le)g(\014nishes)e(b)s(efore)j(the)f(logical)630 -1849 y(end)30 b(of)g(stream)h(is)e(detected.)150 2001 -y Fj(BZ_OUTBUFF_FULL)630 2110 y Fl(Returned)g(b)m(y)i -Fj(BZ2_bzBuffToBuffCompres)o(s)24 b Fl(and)30 b Fj -(BZ2_bzBuffToBuffDecompres)o(s)630 2220 y Fl(to)h(indicate)f(that)h -(the)f(output)g(data)h(will)d(not)i(\014t)h(in)m(to)f(the)h(output)f -(bu\013er)f(pro)m(vided.)150 2492 y Fk(3.3)68 b(Lo)l(w-lev)l(el)47 -b(in)l(terface)150 2766 y Ff(3.3.1)63 b Fe(BZ2_bzCompressInit)390 -2953 y Fj(typedef)533 3057 y(struct)46 b({)676 3161 y(char)h(*next_in;) -676 3264 
y(unsigned)f(int)h(avail_in;)676 3368 y(unsigned)f(int)h -(total_in_lo32;)676 3472 y(unsigned)f(int)h(total_in_hi32;)676 -3680 y(char)g(*next_out;)676 3783 y(unsigned)f(int)h(avail_out;)676 -3887 y(unsigned)f(int)h(total_out_lo32;)676 3991 y(unsigned)f(int)h -(total_out_hi32;)676 4198 y(void)g(*state;)676 4406 y(void)g -(*\(*bzalloc\)\(void)c(*,int,int\);)676 4510 y(void)k -(\(*bzfree\)\(void)d(*,void)i(*\);)676 4614 y(void)h(*opaque;)533 -4717 y(})533 4821 y(bz_stream;)390 5029 y(int)g(BZ2_bzCompressInit)c -(\()k(bz_stream)e(*strm,)1583 5132 y(int)i(blockSize100k,)1583 -5236 y(int)g(verbosity,)1583 5340 y(int)g(workFactor)e(\);)p -eop -%%Page: 13 14 -13 13 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(13)150 456 y(Prepares)32 -b(for)h(compression.)47 b(The)32 b Fj(bz_stream)e Fl(structure)j(holds) -e(all)h(data)h(p)s(ertaining)e(to)i(the)g(com-)150 565 -y(pression)i(activit)m(y)-8 b(.)62 b(A)37 b Fj(bz_stream)e -Fl(structure)h(should)f(b)s(e)i(allo)s(cated)g(and)f(initialised)e -(prior)h(to)j(the)150 675 y(call.)67 b(The)39 b(\014elds)e(of)j -Fj(bz_stream)d Fl(comprise)h(the)i(en)m(tiret)m(y)g(of)f(the)h -(user-visible)c(data.)68 b Fj(state)38 b Fl(is)h(a)150 -784 y(p)s(oin)m(ter)29 b(to)i(the)g(priv)-5 b(ate)30 -b(data)h(structures)f(required)e(for)i(compression.)150 -941 y(Custom)37 b(memory)g(allo)s(cators)g(are)h(supp)s(orted,)f(via)g -(\014elds)f Fj(bzalloc)p Fl(,)h Fj(bzfree)p Fl(,)g(and)g -Fj(opaque)p Fl(.)59 b(The)150 1051 y(v)-5 b(alue)32 b -Fj(opaque)e Fl(is)i(passed)f(to)i(as)g(the)f(\014rst)g(argumen)m(t)h -(to)g(all)e(calls)g(to)i Fj(bzalloc)d Fl(and)i Fj(bzfree)p -Fl(,)f(but)h(is)150 1160 y(otherwise)d(ignored)g(b)m(y)h(the)g(library) --8 b(.)38 b(The)29 b(call)h Fj(bzalloc)e(\()i(opaque,)e(n,)i(m)g(\))g -Fl(is)e(exp)s(ected)j(to)f(return)150 1270 y(a)g(p)s(oin)m(ter)e -Fj(p)h Fl(to)h Fj(n)g(*)g(m)f Fl(b)m(ytes)h(of)g(memory)-8 -b(,)30 b(and)e Fj(bzfree)h(\()h(opaque,)f(p)h(\))f Fl(should)e(free)i 
-(that)h(memory)-8 b(.)150 1427 y(If)33 b(y)m(ou)g(don't)h(w)m(an)m(t)g -(to)g(use)f(a)g(custom)h(memory)f(allo)s(cator,)h(set)g -Fj(bzalloc)p Fl(,)e Fj(bzfree)g Fl(and)h Fj(opaque)e -Fl(to)150 1537 y Fj(NULL)p Fl(,)e(and)h(the)h(library)d(will)f(then)k -(use)f(the)g(standard)g Fj(malloc)p Fl(/)p Fj(free)e -Fl(routines.)150 1693 y(Before)39 b(calling)d Fj(BZ2_bzCompressInit)p -Fl(,)f(\014elds)h Fj(bzalloc)p Fl(,)h Fj(bzfree)f Fl(and)h -Fj(opaque)f Fl(should)g(b)s(e)h(\014lled)150 1803 y(appropriately)-8 -b(,)35 b(as)h(just)f(describ)s(ed.)53 b(Up)s(on)34 b(return,)i(the)g -(in)m(ternal)e(state)i(will)d(ha)m(v)m(e)j(b)s(een)f(allo)s(cated)150 -1913 y(and)43 b(initialised,)g(and)g Fj(total_in_lo32)p -Fl(,)h Fj(total_in_hi32)p Fl(,)f Fj(total_out_lo32)d -Fl(and)j Fj(total_out_)150 2022 y(hi32)37 b Fl(will)f(ha)m(v)m(e)j(b)s -(een)f(set)h(to)g(zero.)65 b(These)38 b(four)g(\014elds)e(are)j(used)f -(b)m(y)g(the)g(library)e(to)j(inform)e(the)150 2132 y(caller)j(of)g -(the)h(total)g(amoun)m(t)g(of)g(data)g(passed)f(in)m(to)g(and)g(out)g -(of)h(the)g(library)-8 b(,)41 b(resp)s(ectiv)m(ely)-8 -b(.)70 b(Y)-8 b(ou)150 2241 y(should)34 b(not)j(try)f(to)h(c)m(hange)g -(them.)58 b(As)36 b(of)h(v)m(ersion)f(1.0,)j(64-bit)d(coun)m(ts)h(are)f -(main)m(tained,)h(ev)m(en)g(on)150 2351 y(32-bit)i(platforms,)h(using)d -(the)i Fj(_hi32)e Fl(\014elds)g(to)j(store)f(the)g(upp)s(er)d(32)k -(bits)d(of)i(the)g(coun)m(t.)66 b(So,)41 b(for)150 2460 -y(example,)30 b(the)h(total)g(amoun)m(t)g(of)f(data)h(in)f(is)f -Fj(\(total_in_hi32)d(<<)k(32\))g(+)g(total_in_lo32)p -Fl(.)150 2617 y(P)m(arameter)g Fj(blockSize100k)25 b -Fl(sp)s(eci\014es)i(the)h(blo)s(c)m(k)g(size)h(to)g(b)s(e)f(used)f(for) -h(compression.)40 b(It)28 b(should)f(b)s(e)150 2727 y(a)k(v)-5 -b(alue)30 b(b)s(et)m(w)m(een)i(1)f(and)f(9)h(inclusiv)m(e,)e(and)h(the) -h(actual)g(blo)s(c)m(k)f(size)g(used)g(is)g(100000)j(x)e(this)e -(\014gure.)42 b(9)150 2836 y(giv)m(es)31 b(the)f(b)s(est)g(compression) 
-g(but)f(tak)m(es)j(most)f(memory)-8 b(.)150 2993 y(P)m(arameter)29 -b Fj(verbosity)c Fl(should)h(b)s(e)h(set)i(to)f(a)h(n)m(um)m(b)s(er)d -(b)s(et)m(w)m(een)j(0)f(and)f(4)h(inclusiv)m(e.)38 b(0)28 -b(is)f(silen)m(t,)h(and)150 3103 y(greater)j(n)m(um)m(b)s(ers)c(giv)m -(e)j(increasingly)d(v)m(erb)s(ose)j(monitoring/debugging)d(output.)40 -b(If)29 b(the)g(library)e(has)150 3212 y(b)s(een)j(compiled)e(with)i -Fj(-DBZ_NO_STDIO)p Fl(,)d(no)j(suc)m(h)g(output)g(will)e(app)s(ear)h -(for)h(an)m(y)h(v)m(erb)s(osit)m(y)f(setting.)150 3369 -y(P)m(arameter)35 b Fj(workFactor)d Fl(con)m(trols)i(ho)m(w)g(the)g -(compression)f(phase)h(b)s(eha)m(v)m(es)g(when)f(presen)m(ted)h(with) -150 3479 y(w)m(orst)40 b(case,)j(highly)37 b(rep)s(etitiv)m(e,)k(input) -d(data.)68 b(If)39 b(compression)g(runs)e(in)m(to)j(di\016culties)d -(caused)i(b)m(y)150 3588 y(rep)s(etitiv)m(e)34 b(data,)j(the)e(library) -d(switc)m(hes)j(from)f(the)h(standard)f(sorting)g(algorithm)g(to)i(a)f -(fallbac)m(k)f(al-)150 3698 y(gorithm.)47 b(The)32 b(fallbac)m(k)g(is)g -(slo)m(w)m(er)g(than)h(the)f(standard)g(algorithm)g(b)m(y)g(p)s(erhaps) -f(a)i(factor)h(of)e(three,)150 3808 y(but)e(alw)m(a)m(ys)h(b)s(eha)m(v) -m(es)f(reasonably)-8 b(,)31 b(no)f(matter)h(ho)m(w)g(bad)f(the)g -(input.)150 3965 y(Lo)m(w)m(er)25 b(v)-5 b(alues)24 b(of)h -Fj(workFactor)d Fl(reduce)i(the)h(amoun)m(t)g(of)g(e\013ort)g(the)g -(standard)f(algorithm)f(will)f(exp)s(end)150 4074 y(b)s(efore)j -(resorting)h(to)g(the)g(fallbac)m(k.)39 b(Y)-8 b(ou)27 -b(should)c(set)k(this)e(parameter)h(carefully;)g(to)s(o)h(lo)m(w,)g -(and)e(man)m(y)150 4184 y(inputs)32 b(will)f(b)s(e)i(handled)f(b)m(y)i -(the)g(fallbac)m(k)g(algorithm)f(and)g(so)h(compress)g(rather)g(slo)m -(wly)-8 b(,)34 b(to)s(o)h(high,)150 4293 y(and)24 b(y)m(our)h(a)m(v)m -(erage-to-w)m(orst)30 b(case)c(compression)e(times)h(can)g(b)s(ecome)g -(v)m(ery)h(large.)39 b(The)24 b(default)g(v)-5 b(alue)150 -4403 y(of)31 b(30)g(giv)m(es)f(reasonable)h(b)s(eha)m(viour)e(o)m(v)m 
-(er)i(a)g(wide)e(range)i(of)f(circumstances.)150 4560 -y(Allo)m(w)m(able)h(v)-5 b(alues)31 b(range)i(from)e(0)i(to)f(250)h -(inclusiv)m(e.)44 b(0)32 b(is)f(a)h(sp)s(ecial)f(case,)i(equiv)-5 -b(alen)m(t)32 b(to)g(using)f(the)150 4669 y(default)f(v)-5 -b(alue)29 b(of)i(30.)150 4826 y(Note)38 b(that)f(the)g(compressed)f -(output)g(generated)h(is)f(the)g(same)h(regardless)f(of)h(whether)f(or) -g(not)h(the)150 4936 y(fallbac)m(k)30 b(algorithm)f(is)h(used.)150 -5093 y(Be)23 b(a)m(w)m(are)h(also)f(that)g(this)f(parameter)h(ma)m(y)g -(disapp)s(ear)e(en)m(tirely)h(in)f(future)h(v)m(ersions)g(of)h(the)g -(library)-8 b(.)36 b(In)150 5202 y(principle)20 b(it)j(should)e(b)s(e)h -(p)s(ossible)f(to)j(devise)f(a)g(go)s(o)s(d)g(w)m(a)m(y)i(to)f -(automatically)f(c)m(ho)s(ose)h(whic)m(h)e(algorithm)150 -5312 y(to)31 b(use.)41 b(Suc)m(h)29 b(a)i(mec)m(hanism)f(w)m(ould)f -(render)g(the)i(parameter)g(obsolete.)p eop -%%Page: 14 15 -14 14 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(14)150 299 y(P)m(ossible)29 -b(return)h(v)-5 b(alues:)572 450 y Fj(BZ_CONFIG_ERROR)663 -554 y Fl(if)29 b(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 -657 y Fj(BZ_PARAM_ERROR)663 761 y Fl(if)g Fj(strm)g Fl(is)h -Fj(NULL)663 865 y Fl(or)g Fj(blockSize)e(<)i Fl(1)h(or)f -Fj(blockSize)e(>)i Fl(9)663 969 y(or)g Fj(verbosity)e(<)i -Fl(0)h(or)f Fj(verbosity)e(>)i Fl(4)663 1073 y(or)g Fj(workFactor)e(<)i -Fl(0)g(or)h Fj(workFactor)c(>)j Fl(250)572 1176 y Fj(BZ_MEM_ERROR)663 -1280 y Fl(if)f(not)i(enough)f(memory)g(is)f(a)m(v)-5 -b(ailable)572 1384 y Fj(BZ_OK)663 1488 y Fl(otherwise)150 -1645 y(Allo)m(w)m(able)30 b(next)g(actions:)572 1796 -y Fj(BZ2_bzCompress)663 1899 y Fl(if)f Fj(BZ_OK)g Fl(is)g(returned)572 -2003 y(no)h(sp)s(eci\014c)f(action)i(needed)f(in)f(case)i(of)g(error) -150 2255 y Ff(3.3.2)63 b Fe(BZ2_bzCompress)533 2441 y -Fj(int)47 b(BZ2_bzCompress)d(\()j(bz_stream)f(*strm,)g(int)h(action)f -(\);)150 2598 y Fl(Pro)m(vides)28 
b(more)g(input)f(and/or)h(output)g -(bu\013er)g(space)h(for)f(the)h(library)-8 b(.)38 b(The)28 -b(caller)g(main)m(tains)f(input)150 2708 y(and)j(output)g(bu\013ers,)f -(and)h(calls)g Fj(BZ2_bzCompress)c Fl(to)31 b(transfer)f(data)h(b)s(et) -m(w)m(een)g(them.)150 2865 y(Before)j(eac)m(h)g(call)e(to)i -Fj(BZ2_bzCompress)p Fl(,)c Fj(next_in)h Fl(should)g(p)s(oin)m(t)h(at)h -(the)g(data)h(to)g(b)s(e)e(compressed,)150 2974 y(and)41 -b Fj(avail_in)f Fl(should)g(indicate)h(ho)m(w)h(man)m(y)f(b)m(ytes)i -(the)f(library)d(ma)m(y)k(read.)75 b Fj(BZ2_bzCompress)150 -3084 y Fl(up)s(dates)29 b Fj(next_in)p Fl(,)g Fj(avail_in)f -Fl(and)i Fj(total_in)e Fl(to)j(re\015ect)g(the)g(n)m(um)m(b)s(er)e(of)h -(b)m(ytes)h(it)f(has)g(read.)150 3241 y(Similarly)-8 -b(,)27 b Fj(next_out)h Fl(should)g(p)s(oin)m(t)h(to)i(a)f(bu\013er)f -(in)g(whic)m(h)g(the)h(compressed)g(data)h(is)e(to)i(b)s(e)e(placed,) -150 3350 y(with)i Fj(avail_out)f Fl(indicating)h(ho)m(w)h(m)m(uc)m(h)h -(output)f(space)h(is)f(a)m(v)-5 b(ailable.)46 b Fj(BZ2_bzCompress)29 -b Fl(up)s(dates)150 3460 y Fj(next_out)p Fl(,)f Fj(avail_out)g -Fl(and)i Fj(total_out)e Fl(to)j(re\015ect)g(the)f(n)m(um)m(b)s(er)g(of) -g(b)m(ytes)h(output.)150 3617 y(Y)-8 b(ou)40 b(ma)m(y)g(pro)m(vide)e -(and)h(remo)m(v)m(e)i(as)f(little)e(or)h(as)h(m)m(uc)m(h)f(data)h(as)g -(y)m(ou)f(lik)m(e)g(on)g(eac)m(h)i(call)e(of)g Fj(BZ2_)150 -3726 y(bzCompress)p Fl(.)48 b(In)33 b(the)h(limit,)f(it)h(is)f -(acceptable)h(to)h(supply)c(and)j(remo)m(v)m(e)h(data)g(one)f(b)m(yte)g -(at)h(a)f(time,)150 3836 y(although)28 b(this)f(w)m(ould)g(b)s(e)h -(terribly)e(ine\016cien)m(t.)39 b(Y)-8 b(ou)29 b(should)e(alw)m(a)m(ys) -h(ensure)g(that)h(at)g(least)g(one)f(b)m(yte)150 3946 -y(of)j(output)f(space)g(is)g(a)m(v)-5 b(ailable)30 b(at)h(eac)m(h)g -(call.)150 4102 y(A)38 b(second)h(purp)s(ose)d(of)j Fj(BZ2_bzCompress) -34 b Fl(is)j(to)i(request)f(a)h(c)m(hange)g(of)g(mo)s(de)e(of)i(the)f -(compressed)150 4212 y(stream.)150 4369 y(Conceptually)-8 -b(,)24 
b(a)g(compressed)g(stream)g(can)f(b)s(e)g(in)g(one)h(of)f(four)g -(states:)39 b(IDLE,)24 b(R)m(UNNING,)h(FLUSH-)150 4478 -y(ING)37 b(and)g(FINISHING.)g(Before)i(initialisation)33 -b(\()p Fj(BZ2_bzCompressInit)p Fl(\))g(and)j(after)i(termination)150 -4588 y(\()p Fj(BZ2_bzCompressEnd)p Fl(\),)27 b(a)j(stream)h(is)f -(regarded)g(as)g(IDLE.)150 4745 y(Up)s(on)35 b(initialisation)e(\()p -Fj(BZ2_bzCompressInit)p Fl(\),)h(the)i(stream)h(is)e(placed)h(in)e(the) -j(R)m(UNNING)g(state.)150 4854 y(Subsequen)m(t)j(calls)g(to)i -Fj(BZ2_bzCompress)37 b Fl(should)j(pass)g Fj(BZ_RUN)g -Fl(as)h(the)g(requested)h(action;)47 b(other)150 4964 -y(actions)31 b(are)f(illegal)f(and)h(will)d(result)j(in)f -Fj(BZ_SEQUENCE_ERROR)p Fl(.)150 5121 y(A)m(t)38 b(some)f(p)s(oin)m(t,)h -(the)f(calling)e(program)i(will)d(ha)m(v)m(e)k(pro)m(vided)e(all)f(the) -i(input)e(data)j(it)e(w)m(an)m(ts)i(to.)61 b(It)150 5230 -y(will)28 b(then)h(w)m(an)m(t)i(to)g(\014nish)d(up)h({)i(in)d -(e\013ect,)k(asking)e(the)g(library)e(to)j(pro)s(cess)f(an)m(y)g(data)h -(it)f(migh)m(t)g(ha)m(v)m(e)150 5340 y(bu\013ered)25 -b(in)m(ternally)-8 b(.)38 b(In)25 b(this)g(state,)k Fj(BZ2_bzCompress) -22 b Fl(will)i(no)i(longer)g(attempt)h(to)g(read)f(data)h(from)p -eop -%%Page: 15 16 -15 15 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(15)150 299 y Fj(next_in)p -Fl(,)33 b(but)g(it)h(will)d(w)m(an)m(t)k(to)g(write)e(data)h(to)h -Fj(next_out)p Fl(.)49 b(Because)36 b(the)e(output)f(bu\013er)g -(supplied)150 408 y(b)m(y)e(the)h(user)e(can)i(b)s(e)f(arbitrarily)d -(small,)j(the)g(\014nishing-up)d(op)s(eration)i(cannot)i(necessarily)e -(b)s(e)h(done)150 518 y(with)e(a)i(single)e(call)h(of)g -Fj(BZ2_bzCompress)p Fl(.)150 675 y(Instead,)47 b(the)d(calling)f -(program)g(passes)h Fj(BZ_FINISH)d Fl(as)j(an)g(action)g(to)h -Fj(BZ2_bzCompress)p Fl(.)77 b(This)150 784 y(c)m(hanges)30 -b(the)f(stream's)g(state)h(to)f(FINISHING.)g(An)m(y)g(remaining)e -(input)g(\(ie,)i Fj(next_in[0)f(..)i(avail_)150 894 
y(in-1])p -Fl(\))36 b(is)f(compressed)i(and)f(transferred)g(to)h(the)g(output)g -(bu\013er.)58 b(T)-8 b(o)38 b(do)e(this,)i Fj(BZ2_bzCompress)150 -1004 y Fl(m)m(ust)h(b)s(e)f(called)g(rep)s(eatedly)h(un)m(til)e(all)h -(the)h(output)f(has)h(b)s(een)f(consumed.)66 b(A)m(t)40 -b(that)g(p)s(oin)m(t,)g Fj(BZ2_)150 1113 y(bzCompress)h -Fl(returns)h Fj(BZ_STREAM_END)p Fl(,)i(and)f(the)h(stream's)g(state)h -(is)d(set)j(bac)m(k)f(to)g(IDLE.)g Fj(BZ2_)150 1223 y(bzCompressEnd)27 -b Fl(should)h(then)i(b)s(e)g(called.)150 1380 y(Just)25 -b(to)i(mak)m(e)g(sure)e(the)i(calling)d(program)i(do)s(es)g(not)g(c)m -(heat,)i(the)f(library)c(mak)m(es)k(a)f(note)h(of)f Fj(avail_in)150 -1489 y Fl(at)g(the)g(time)f(of)g(the)g(\014rst)g(call)g(to)h -Fj(BZ2_bzCompress)21 b Fl(whic)m(h)j(has)h Fj(BZ_FINISH)e -Fl(as)i(an)h(action)f(\(ie,)i(at)f(the)150 1599 y(time)d(the)h(program) -g(has)f(announced)g(its)h(in)m(ten)m(tion)f(to)h(not)g(supply)e(an)m(y) -i(more)g(input\).)37 b(By)24 b(comparing)150 1708 y(this)k(v)-5 -b(alue)28 b(with)g(that)h(of)h Fj(avail_in)c Fl(o)m(v)m(er)k(subsequen) -m(t)f(calls)f(to)h Fj(BZ2_bzCompress)p Fl(,)d(the)j(library)e(can)150 -1818 y(detect)33 b(an)m(y)e(attempts)i(to)f(slip)d(in)h(more)h(data)h -(to)h(compress.)43 b(An)m(y)31 b(calls)g(for)g(whic)m(h)f(this)g(is)h -(detected)150 1928 y(will)j(return)h Fj(BZ_SEQUENCE_ERROR)p -Fl(.)55 b(This)34 b(indicates)i(a)h(programming)e(mistak)m(e)i(whic)m -(h)e(should)g(b)s(e)150 2037 y(corrected.)150 2194 y(Instead)i(of)g -(asking)f(to)h(\014nish,)f(the)h(calling)f(program)g(ma)m(y)h(ask)g -Fj(BZ2_bzCompress)c Fl(to)38 b(tak)m(e)g(all)e(the)150 -2304 y(remaining)j(input,)i(compress)f(it)g(and)g(terminate)h(the)g -(curren)m(t)f(\(Burro)m(ws-Wheeler\))h(compression)150 -2413 y(blo)s(c)m(k.)e(This)26 b(could)h(b)s(e)g(useful)f(for)h(error)h -(con)m(trol)g(purp)s(oses.)38 b(The)27 b(mec)m(hanism)g(is)g(analogous) -h(to)g(that)150 2523 y(for)35 b(\014nishing:)46 b(call)35 -b Fj(BZ2_bzCompress)c Fl(with)i(an)i(action)g(of)g 
Fj(BZ_FLUSH)p -Fl(,)g(remo)m(v)m(e)h(output)f(data,)i(and)150 2632 y(p)s(ersist)h -(with)g(the)i Fj(BZ_FLUSH)e Fl(action)i(un)m(til)e(the)i(v)-5 -b(alue)39 b Fj(BZ_RUN)f Fl(is)h(returned.)68 b(As)39 -b(with)g(\014nishing,)150 2742 y Fj(BZ2_bzCompress)23 -b Fl(detects)28 b(an)m(y)f(attempt)h(to)f(pro)m(vide)f(more)h(input)e -(data)i(once)g(the)g(\015ush)e(has)i(b)s(egun.)150 2899 -y(Once)j(the)h(\015ush)e(is)g(complete,)i(the)g(stream)f(returns)g(to)h -(the)f(normal)g(R)m(UNNING)h(state.)150 3056 y(This)f(all)h(sounds)g -(prett)m(y)h(complex,)h(but)e(isn't)g(really)-8 b(.)45 -b(Here's)33 b(a)f(table)g(whic)m(h)f(sho)m(ws)h(whic)m(h)f(actions)150 -3165 y(are)e(allo)m(w)m(able)f(in)f(eac)m(h)j(state,)g(what)f(action)g -(will)c(b)s(e)j(tak)m(en,)j(what)d(the)h(next)f(state)i(is,)e(and)g -(what)h(the)150 3275 y(non-error)h(return)f(v)-5 b(alues)29 -b(are.)41 b(Note)32 b(that)e(y)m(ou)h(can't)g(explicitly)d(ask)i(what)g -(state)i(the)e(stream)h(is)e(in,)150 3384 y(but)h(nor)g(do)g(y)m(ou)h -(need)f(to)h({)g(it)e(can)i(b)s(e)f(inferred)e(from)i(the)h(v)-5 -b(alues)29 b(returned)h(b)m(y)g Fj(BZ2_bzCompress)p Fl(.)390 -3535 y(IDLE/)p Fj(any)572 3639 y Fl(Illegal.)60 b(IDLE)30 -b(state)i(only)d(exists)h(after)h Fj(BZ2_bzCompressEnd)26 -b Fl(or)572 3743 y(b)s(efore)k Fj(BZ2_bzCompressInit)p -Fl(.)572 3847 y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_SEQUENCE_ERROR)390 -4054 y Fl(R)m(UNNING/)p Fj(BZ_RUN)572 4158 y Fl(Compress)f(from)h -Fj(next_in)f Fl(to)i Fj(next_out)d Fl(as)i(m)m(uc)m(h)h(as)f(p)s -(ossible.)572 4262 y(Next)h(state)h(=)e(R)m(UNNING)572 -4366 y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_RUN_OK)390 -4573 y Fl(R)m(UNNING/)p Fj(BZ_FLUSH)572 4677 y Fl(Remem)m(b)s(er)g -(curren)m(t)g(v)-5 b(alue)30 b(of)g Fj(next_in)p Fl(.)59 -b(Compress)30 b(from)g Fj(next_in)572 4781 y Fl(to)h -Fj(next_out)d Fl(as)j(m)m(uc)m(h)f(as)h(p)s(ossible,)d(but)i(do)g(not)g -(accept)i(an)m(y)f(more)f(input.)572 4885 y(Next)h(state)h(=)e -(FLUSHING)572 4988 y(Return)f(v)-5 b(alue)30 b(=)g 
Fj(BZ_FLUSH_OK)390 -5196 y Fl(R)m(UNNING/)p Fj(BZ_FINISH)572 5300 y Fl(Remem)m(b)s(er)g -(curren)m(t)g(v)-5 b(alue)30 b(of)g Fj(next_in)p Fl(.)59 -b(Compress)30 b(from)g Fj(next_in)p eop -%%Page: 16 17 -16 16 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(16)572 299 y(to)31 b Fj(next_out)d -Fl(as)j(m)m(uc)m(h)f(as)h(p)s(ossible,)d(but)i(do)g(not)g(accept)i(an)m -(y)f(more)f(input.)572 403 y(Next)h(state)h(=)e(FINISHING)572 -506 y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_FINISH_OK)390 -714 y Fl(FLUSHING/)p Fj(BZ_FLUSH)572 818 y Fl(Compress)f(from)h -Fj(next_in)f Fl(to)i Fj(next_out)d Fl(as)i(m)m(uc)m(h)h(as)f(p)s -(ossible,)572 922 y(but)f(do)i(not)f(accept)i(an)m(y)f(more)f(input.) -572 1025 y(If)g(all)f(the)i(existing)e(input)f(has)i(b)s(een)g(used)g -(up)f(and)h(all)f(compressed)572 1129 y(output)h(has)g(b)s(een)g(remo)m -(v)m(ed)663 1233 y(Next)h(state)h(=)e(R)m(UNNING;)i(Return)d(v)-5 -b(alue)30 b(=)g Fj(BZ_RUN_OK)572 1337 y Fl(else)663 1440 -y(Next)h(state)h(=)e(FLUSHING;)h(Return)e(v)-5 b(alue)30 -b(=)g Fj(BZ_FLUSH_OK)390 1648 y Fl(FLUSHING/other)572 -1752 y(Illegal.)572 1856 y(Return)f(v)-5 b(alue)30 b(=)g -Fj(BZ_SEQUENCE_ERROR)390 2063 y Fl(FINISHING/)p Fj(BZ_FINISH)572 -2167 y Fl(Compress)f(from)h Fj(next_in)f Fl(to)i Fj(next_out)d -Fl(as)i(m)m(uc)m(h)h(as)f(p)s(ossible,)572 2271 y(but)f(to)j(not)e -(accept)i(an)m(y)f(more)f(input.)572 2374 y(If)g(all)f(the)i(existing)e -(input)f(has)i(b)s(een)g(used)g(up)f(and)h(all)f(compressed)572 -2478 y(output)h(has)g(b)s(een)g(remo)m(v)m(ed)663 2582 -y(Next)h(state)h(=)e(IDLE;)g(Return)g(v)-5 b(alue)30 -b(=)g Fj(BZ_STREAM_END)572 2686 y Fl(else)663 2790 y(Next)h(state)h(=)e -(FINISHING;)g(Return)g(v)-5 b(alue)30 b(=)g Fj(BZ_FINISHING)390 -2997 y Fl(FINISHING/other)572 3101 y(Illegal.)572 3205 -y(Return)f(v)-5 b(alue)30 b(=)g Fj(BZ_SEQUENCE_ERROR)150 -3361 y Fl(That)24 b(still)f(lo)s(oks)g(complicated?)39 -b(W)-8 b(ell,)25 b(fair)f(enough.)38 b(The)24 b(usual)f(sequence)i(of)f 
-(calls)g(for)g(compressing)150 3471 y(a)31 b(load)f(of)g(data)h(is:)225 -3628 y Fi(\017)60 b Fl(Get)31 b(started)g(with)e Fj(BZ2_bzCompressInit) -p Fl(.)225 3774 y Fi(\017)60 b Fl(Sho)m(v)m(el)38 b(data)h(in)e(and)g -(shlurp)e(out)k(its)e(compressed)h(form)g(using)e(zero)j(or)f(more)h -(calls)e(of)h Fj(BZ2_)330 3884 y(bzCompress)28 b Fl(with)h(action)h(=)g -Fj(BZ_RUN)p Fl(.)225 4030 y Fi(\017)60 b Fl(Finish)23 -b(up.)38 b(Rep)s(eatedly)25 b(call)f Fj(BZ2_bzCompress)e -Fl(with)i(action)h(=)g Fj(BZ_FINISH)p Fl(,)f(cop)m(ying)h(out)h(the)330 -4139 y(compressed)k(output,)g(un)m(til)f Fj(BZ_STREAM_END)e -Fl(is)i(returned.)225 4285 y Fi(\017)60 b Fl(Close)30 -b(up)f(and)h(go)h(home.)41 b(Call)29 b Fj(BZ2_bzCompressEnd)p -Fl(.)150 4478 y(If)23 b(the)h(data)h(y)m(ou)f(w)m(an)m(t)h(to)f -(compress)g(\014ts)f(in)m(to)h(y)m(our)g(input)e(bu\013er)h(all)f(at)j -(once,)h(y)m(ou)e(can)g(skip)f(the)h(calls)150 4588 y(of)37 -b Fj(BZ2_bzCompress)26 b(\()k(...,)f(BZ_RUN)g(\))36 b -Fl(and)g(just)g(do)h(the)g Fj(BZ2_bzCompress)26 b(\()k(...,)f -(BZ_FINISH)150 4698 y(\))h Fl(calls.)150 4854 y(All)36 -b(required)g(memory)h(is)f(allo)s(cated)i(b)m(y)f Fj -(BZ2_bzCompressInit)p Fl(.)56 b(The)37 b(compression)g(library)e(can) -150 4964 y(accept)g(an)m(y)f(data)h(at)g(all)d(\(ob)m(viously\).)51 -b(So)34 b(y)m(ou)g(shouldn't)e(get)j(an)m(y)f(error)f(return)g(v)-5 -b(alues)33 b(from)h(the)150 5074 y Fj(BZ2_bzCompress)29 -b Fl(calls.)46 b(If)32 b(y)m(ou)h(do,)g(they)g(will)d(b)s(e)i -Fj(BZ_SEQUENCE_ERROR)p Fl(,)d(and)j(indicate)f(a)i(bug)f(in)150 -5183 y(y)m(our)e(programming.)150 5340 y(T)-8 b(rivial)28 -b(other)j(p)s(ossible)d(return)h(v)-5 b(alues:)p eop -%%Page: 17 18 -17 17 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(17)572 299 y Fj(BZ_PARAM_ERROR)663 -403 y Fl(if)29 b Fj(strm)g Fl(is)h Fj(NULL)p Fl(,)f(or)i -Fj(strm->s)d Fl(is)h Fj(NULL)150 652 y Ff(3.3.3)63 b -Fe(BZ2_bzCompressEnd)390 839 y Fj(int)47 b(BZ2_bzCompressEnd)c(\()k 
-(bz_stream)f(*strm)g(\);)150 996 y Fl(Releases)31 b(all)e(memory)h -(asso)s(ciated)h(with)e(a)i(compression)e(stream.)150 -1153 y(P)m(ossible)g(return)h(v)-5 b(alues:)481 1304 -y Fj(BZ_PARAM_ERROR)117 b Fl(if)30 b Fj(strm)f Fl(is)g -Fj(NULL)g Fl(or)i Fj(strm->s)d Fl(is)i Fj(NULL)481 1408 -y(BZ_OK)120 b Fl(otherwise)150 1657 y Ff(3.3.4)63 b Fe -(BZ2_bzDecompressInit)390 1844 y Fj(int)47 b(BZ2_bzDecompressInit)42 -b(\()48 b(bz_stream)d(*strm,)h(int)h(verbosity,)e(int)i(small)f(\);)150 -2001 y Fl(Prepares)30 b(for)f(decompression.)40 b(As)29 -b(with)g Fj(BZ2_bzCompressInit)p Fl(,)c(a)31 b Fj(bz_stream)c -Fl(record)j(should)e(b)s(e)150 2110 y(allo)s(cated)c(and)f(initialised) -e(b)s(efore)i(the)i(call.)38 b(Fields)22 b Fj(bzalloc)p -Fl(,)i Fj(bzfree)e Fl(and)i Fj(opaque)e Fl(should)g(b)s(e)h(set)i(if) -150 2220 y(a)h(custom)f(memory)g(allo)s(cator)g(is)g(required,)f(or)h -(made)h Fj(NULL)e Fl(for)h(the)g(normal)f Fj(malloc)p -Fl(/)p Fj(free)f Fl(routines.)150 2330 y(Up)s(on)h(return,)h(the)g(in)m -(ternal)f(state)i(will)c(ha)m(v)m(e)k(b)s(een)f(initialised,)d(and)i -Fj(total_in)f Fl(and)h Fj(total_out)f Fl(will)150 2439 -y(b)s(e)30 b(zero.)150 2596 y(F)-8 b(or)31 b(the)g(meaning)e(of)i -(parameter)g Fj(verbosity)p Fl(,)d(see)j Fj(BZ2_bzCompressInit)p -Fl(.)150 2753 y(If)e Fj(small)e Fl(is)h(nonzero,)i(the)f(library)e -(will)f(use)j(an)g(alternativ)m(e)h(decompression)e(algorithm)g(whic)m -(h)f(uses)150 2862 y(less)c(memory)g(but)g(at)h(the)g(cost)h(of)e -(decompressing)g(more)g(slo)m(wly)g(\(roughly)f(sp)s(eaking,)i(half)f -(the)h(sp)s(eed,)150 2972 y(but)34 b(the)i(maxim)m(um)d(memory)i -(requiremen)m(t)g(drops)e(to)j(around)e(2300k\).)57 b(See)35 -b(Chapter)g(2)g(for)g(more)150 3082 y(information)29 -b(on)h(memory)g(managemen)m(t.)150 3238 y(Note)40 b(that)f(the)f(amoun) -m(t)h(of)g(memory)f(needed)g(to)i(decompress)e(a)h(stream)f(cannot)h(b) -s(e)f(determined)150 3348 y(un)m(til)j(the)h(stream's)h(header)f(has)g -(b)s(een)g(read,)j(so)e(ev)m(en)g(if)e 
Fj(BZ2_bzDecompressInit)c -Fl(succeeds,)46 b(a)150 3458 y(subsequen)m(t)30 b Fj(BZ2_bzDecompress)c -Fl(could)j(fail)g(with)g Fj(BZ_MEM_ERROR)p Fl(.)150 3614 -y(P)m(ossible)g(return)h(v)-5 b(alues:)572 3765 y Fj(BZ_CONFIG_ERROR) -663 3869 y Fl(if)29 b(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 -3973 y Fj(BZ_PARAM_ERROR)663 4077 y Fl(if)g Fj(\(small)46 -b(!=)h(0)h(&&)f(small)f(!=)h(1\))663 4181 y Fl(or)30 -b Fj(\(verbosity)45 b(<)j(0)f(||)g(verbosity)e(>)j(4\))572 -4284 y(BZ_MEM_ERROR)663 4388 y Fl(if)29 b(insu\016cien)m(t)g(memory)h -(is)f(a)m(v)-5 b(ailable)150 4545 y(Allo)m(w)m(able)30 -b(next)g(actions:)572 4696 y Fj(BZ2_bzDecompress)663 -4800 y Fl(if)f Fj(BZ_OK)g Fl(w)m(as)i(returned)572 4904 -y(no)f(sp)s(eci\014c)f(action)i(required)e(in)g(case)i(of)g(error)150 -5153 y Ff(3.3.5)63 b Fe(BZ2_bzDecompress)390 5340 y Fj(int)47 -b(BZ2_bzDecompress)c(\()48 b(bz_stream)d(*strm)h(\);)p -eop -%%Page: 18 19 -18 18 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(18)150 299 y(Pro)m(vides)24 -b(more)g(input)f(and/out)h(output)g(bu\013er)g(space)h(for)f(the)g -(library)-8 b(.)37 b(The)24 b(caller)g(main)m(tains)f(input)150 -408 y(and)30 b(output)g(bu\013ers,)f(and)h(uses)g Fj(BZ2_bzDecompress)c -Fl(to)31 b(transfer)f(data)h(b)s(et)m(w)m(een)g(them.)150 -565 y(Before)g(eac)m(h)g(call)f(to)g Fj(BZ2_bzDecompress)p -Fl(,)c Fj(next_in)i Fl(should)h(p)s(oin)m(t)g(at)h(the)h(compressed)e -(data,)j(and)150 675 y Fj(avail_in)h Fl(should)h(indicate)h(ho)m(w)h -(man)m(y)f(b)m(ytes)i(the)e(library)f(ma)m(y)i(read.)56 -b Fj(BZ2_bzDecompress)32 b Fl(up-)150 784 y(dates)f Fj(next_in)p -Fl(,)e Fj(avail_in)f Fl(and)h Fj(total_in)g Fl(to)i(re\015ect)g(the)f -(n)m(um)m(b)s(er)f(of)i(b)m(ytes)g(it)f(has)g(read.)150 -941 y(Similarly)-8 b(,)37 b Fj(next_out)f Fl(should)g(p)s(oin)m(t)i(to) -g(a)h(bu\013er)e(in)g(whic)m(h)g(the)i(uncompressed)e(output)g(is)h(to) -h(b)s(e)150 1051 y(placed,)d(with)e Fj(avail_out)f Fl(indicating)g(ho)m 
-(w)i(m)m(uc)m(h)g(output)g(space)h(is)e(a)m(v)-5 b(ailable.)55 -b Fj(BZ2_bzCompress)150 1160 y Fl(up)s(dates)29 b Fj(next_out)p -Fl(,)g Fj(avail_out)f Fl(and)h Fj(total_out)f Fl(to)j(re\015ect)g(the)g -(n)m(um)m(b)s(er)e(of)h(b)m(ytes)h(output.)150 1317 y(Y)-8 -b(ou)40 b(ma)m(y)g(pro)m(vide)e(and)h(remo)m(v)m(e)i(as)f(little)e(or)h -(as)h(m)m(uc)m(h)f(data)h(as)g(y)m(ou)f(lik)m(e)g(on)g(eac)m(h)i(call)e -(of)g Fj(BZ2_)150 1427 y(bzDecompress)p Fl(.)e(In)27 -b(the)i(limit,)d(it)i(is)f(acceptable)j(to)f(supply)d(and)h(remo)m(v)m -(e)j(data)f(one)f(b)m(yte)h(at)g(a)g(time,)150 1537 y(although)f(this)f -(w)m(ould)g(b)s(e)h(terribly)e(ine\016cien)m(t.)39 b(Y)-8 -b(ou)29 b(should)e(alw)m(a)m(ys)h(ensure)g(that)h(at)g(least)g(one)f(b) -m(yte)150 1646 y(of)j(output)f(space)g(is)g(a)m(v)-5 -b(ailable)30 b(at)h(eac)m(h)g(call.)150 1803 y(Use)g(of)f -Fj(BZ2_bzDecompress)c Fl(is)k(simpler)e(than)i Fj(BZ2_bzCompress)p -Fl(.)150 1960 y(Y)-8 b(ou)31 b(should)d(pro)m(vide)h(input)f(and)i -(remo)m(v)m(e)i(output)d(as)i(describ)s(ed)d(ab)s(o)m(v)m(e,)k(and)d -(rep)s(eatedly)h(call)f Fj(BZ2_)150 2069 y(bzDecompress)35 -b Fl(un)m(til)i Fj(BZ_STREAM_END)e Fl(is)j(returned.)64 -b(App)s(earance)39 b(of)g Fj(BZ_STREAM_END)c Fl(denotes)150 -2179 y(that)47 b Fj(BZ2_bzDecompress)42 b Fl(has)k(detected)h(the)f -(logical)g(end)g(of)g(the)h(compressed)e(stream.)89 b -Fj(BZ2_)150 2289 y(bzDecompress)28 b Fl(will)g(not)j(pro)s(duce)f -Fj(BZ_STREAM_END)d Fl(un)m(til)j(all)f(output)i(data)h(has)e(b)s(een)h -(placed)f(in)m(to)150 2398 y(the)36 b(output)g(bu\013er,)h(so)g(once)g -Fj(BZ_STREAM_END)32 b Fl(app)s(ears,)38 b(y)m(ou)e(are)h(guaran)m(teed) -g(to)g(ha)m(v)m(e)h(a)m(v)-5 b(ailable)150 2508 y(all)29 -b(the)i(decompressed)f(output,)g(and)g Fj(BZ2_bzDecompressEnd)25 -b Fl(can)31 b(safely)f(b)s(e)f(called.)150 2665 y(If)40 -b(case)h(of)f(an)h(error)e(return)h(v)-5 b(alue,)42 b(y)m(ou)f(should)d -(call)h Fj(BZ2_bzDecompressEnd)c Fl(to)41 b(clean)f(up)g(and)150 -2774 y(release)31 b(memory)-8 
b(.)150 2931 y(P)m(ossible)29 -b(return)h(v)-5 b(alues:)572 3082 y Fj(BZ_PARAM_ERROR)663 -3186 y Fl(if)29 b Fj(strm)g Fl(is)h Fj(NULL)f Fl(or)h -Fj(strm->s)f Fl(is)g Fj(NULL)663 3290 y Fl(or)h Fj(strm->avail_out)44 -b(<)j(1)572 3393 y(BZ_DATA_ERROR)663 3497 y Fl(if)29 -b(a)i(data)g(in)m(tegrit)m(y)f(error)g(is)g(detected)h(in)e(the)i -(compressed)f(stream)572 3601 y Fj(BZ_DATA_ERROR_MAGIC)663 -3705 y Fl(if)f(the)i(compressed)f(stream)g(do)s(esn't)h(b)s(egin)e -(with)g(the)h(righ)m(t)g(magic)h(b)m(ytes)572 3808 y -Fj(BZ_MEM_ERROR)663 3912 y Fl(if)e(there)i(w)m(asn't)f(enough)h(memory) -f(a)m(v)-5 b(ailable)572 4016 y Fj(BZ_STREAM_END)663 -4120 y Fl(if)29 b(the)i(logical)e(end)h(of)h(the)f(data)h(stream)g(w)m -(as)g(detected)g(and)f(all)663 4224 y(output)g(in)f(has)h(b)s(een)g -(consumed,)f(eg)j Fj(s->avail_out)44 b(>)k(0)572 4327 -y(BZ_OK)663 4431 y Fl(otherwise)150 4588 y(Allo)m(w)m(able)30 -b(next)g(actions:)572 4739 y Fj(BZ2_bzDecompress)663 -4843 y Fl(if)f Fj(BZ_OK)g Fl(w)m(as)i(returned)572 4946 -y Fj(BZ2_bzDecompressEnd)663 5050 y Fl(otherwise)p eop -%%Page: 19 20 -19 19 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(19)150 299 y Ff(3.3.6)63 -b Fe(BZ2_bzDecompressEnd)390 486 y Fj(int)47 b(BZ2_bzDecompressEnd)42 -b(\()48 b(bz_stream)d(*strm)i(\);)150 643 y Fl(Releases)31 -b(all)e(memory)h(asso)s(ciated)h(with)e(a)i(decompression)e(stream.)150 -799 y(P)m(ossible)g(return)h(v)-5 b(alues:)572 950 y -Fj(BZ_PARAM_ERROR)663 1054 y Fl(if)29 b Fj(strm)g Fl(is)h -Fj(NULL)f Fl(or)h Fj(strm->s)f Fl(is)g Fj(NULL)572 1158 -y(BZ_OK)663 1262 y Fl(otherwise)150 1419 y(Allo)m(w)m(able)h(next)g -(actions:)572 1570 y(None.)150 1857 y Fk(3.4)68 b(High-lev)l(el)47 -b(in)l(terface)150 2050 y Fl(This)35 b(in)m(terface)j(pro)m(vides)d -(functions)h(for)g(reading)g(and)h(writing)e Fj(bzip2)g -Fl(format)i(\014les.)59 b(First,)39 b(some)150 2159 y(general)30 -b(p)s(oin)m(ts.)225 2316 y Fi(\017)60 b Fl(All)35 b(of)h(the)g -(functions)e(tak)m(e)k(an)e 
Fj(int*)f Fl(\014rst)g(argumen)m(t,)j -Fj(bzerror)p Fl(.)56 b(After)36 b(eac)m(h)h(call,)g Fj(bzerror)330 -2426 y Fl(should)23 b(b)s(e)i(consulted)g(\014rst)g(to)h(determine)e -(the)i(outcome)h(of)e(the)h(call.)38 b(If)25 b Fj(bzerror)f -Fl(is)g Fj(BZ_OK)p Fl(,)i(the)330 2535 y(call)35 b(completed)g -(successfully)-8 b(,)36 b(and)f(only)g(then)g(should)f(the)h(return)g -(v)-5 b(alue)35 b(of)h(the)f(function)g(\(if)330 2645 -y(an)m(y\))30 b(b)s(e)f(consulted.)39 b(If)29 b Fj(bzerror)e -Fl(is)h Fj(BZ_IO_ERROR)p Fl(,)f(there)i(w)m(as)h(an)f(error)g -(reading/writing)e(the)330 2754 y(underlying)32 b(compressed)j(\014le,) -h(and)f(y)m(ou)h(should)d(then)i(consult)g Fj(errno)p -Fl(/)p Fj(perror)e Fl(to)j(determine)330 2864 y(the)i(cause)g(of)g(the) -g(di\016cult)m(y)-8 b(.)61 b Fj(bzerror)36 b Fl(ma)m(y)i(also)g(b)s(e)f -(set)h(to)g(v)-5 b(arious)37 b(other)h(v)-5 b(alues;)41 -b(precise)330 2974 y(details)29 b(are)i(giv)m(en)g(on)f(a)h(p)s -(er-function)d(basis)h(b)s(elo)m(w.)225 3111 y Fi(\017)60 -b Fl(If)40 b Fj(bzerror)f Fl(indicates)g(an)i(error)f(\(ie,)j(an)m -(ything)d(except)h Fj(BZ_OK)f Fl(and)g Fj(BZ_STREAM_END)p -Fl(\),)g(y)m(ou)330 3220 y(should)56 b(immediately)h(call)g -Fj(BZ2_bzReadClose)e Fl(\(or)j Fj(BZ2_bzWriteClose)p -Fl(,)j(dep)s(ending)56 b(on)330 3330 y(whether)50 b(y)m(ou)g(are)h -(attempting)g(to)g(read)f(or)g(to)i(write\))d(to)j(free)e(up)f(all)h -(resources)g(asso)s(ci-)330 3439 y(ated)33 b(with)e(the)i(stream.)47 -b(Once)32 b(an)h(error)f(has)g(b)s(een)g(indicated,)f(b)s(eha)m(viour)g -(of)i(all)e(calls)h(except)330 3549 y Fj(BZ2_bzReadClose)46 -b Fl(\()p Fj(BZ2_bzWriteClose)p Fl(\))h(is)j(unde\014ned.)99 -b(The)50 b(implication)e(is)i(that)h(\(1\))330 3659 y -Fj(bzerror)44 b Fl(should)g(b)s(e)h(c)m(hec)m(k)m(ed)j(after)e(eac)m(h) -h(call,)i(and)c(\(2\))i(if)e Fj(bzerror)f Fl(indicates)g(an)i(error,) -330 3768 y Fj(BZ2_bzReadClose)26 b Fl(\()p Fj(BZ2_bzWriteClose)p -Fl(\))h(should)h(then)i(b)s(e)g(called)g(to)h(clean)f(up.)225 -3905 y Fi(\017)60 
b Fl(The)33 b Fj(FILE*)f Fl(argumen)m(ts)h(passed)g -(to)h Fj(BZ2_bzReadOpen)p Fl(/)p Fj(BZ2_bzWriteOp)o(en)27 -b Fl(should)32 b(b)s(e)g(set)i(to)330 4015 y(binary)23 -b(mo)s(de.)38 b(Most)26 b(Unix)d(systems)i(will)d(do)i(this)g(b)m(y)g -(default,)i(but)e(other)g(platforms,)h(including)330 -4124 y(Windo)m(ws)20 b(and)g(Mac,)k(will)19 b(not.)38 -b(If)20 b(y)m(ou)h(omit)g(this,)h(y)m(ou)f(ma)m(y)h(encoun)m(ter)f -(problems)e(when)h(mo)m(ving)330 4234 y(co)s(de)31 b(to)g(new)f -(platforms.)225 4371 y Fi(\017)60 b Fl(Memory)23 b(allo)s(cation)f -(requests)h(are)g(handled)e(b)m(y)i Fj(malloc)p Fl(/)p -Fj(free)p Fl(.)36 b(A)m(t)23 b(presen)m(t)g(there)g(is)f(no)h(facilit)m -(y)330 4481 y(for)40 b(user-de\014ned)e(memory)i(allo)s(cators)g(in)f -(the)h(\014le)g(I/O)g(functions)e(\(could)i(easily)f(b)s(e)g(added,)330 -4590 y(though\).)150 4842 y Ff(3.4.1)63 b Fe(BZ2_bzReadOpen)533 -5029 y Fj(typedef)46 b(void)h(BZFILE;)533 5236 y(BZFILE)f -(*BZ2_bzReadOpen)e(\()j(int)g(*bzerror,)f(FILE)g(*f,)1726 -5340 y(int)h(small,)f(int)h(verbosity,)p eop -%%Page: 20 21 -20 20 bop 150 -116 a Fl(Chapter)30 b(3:)h(Programming)e(with)g -Fj(libbzip2)1891 b Fl(20)1726 299 y Fj(void)47 b(*unused,)f(int)g -(nUnused)g(\);)150 456 y Fl(Prepare)29 b(to)g(read)g(compressed)f(data) -i(from)e(\014le)g(handle)f Fj(f)p Fl(.)40 b Fj(f)29 b -Fl(should)d(refer)j(to)h(a)f(\014le)f(whic)m(h)f(has)i(b)s(een)150 -565 y(op)s(ened)h(for)h(reading,)f(and)h(for)f(whic)m(h)g(the)h(error)g -(indicator)e(\()p Fj(ferror\(f\))p Fl(\)is)f(not)k(set.)42 -b(If)31 b Fj(small)e Fl(is)h(1,)150 675 y(the)h(library)d(will)f(try)j -(to)i(decompress)e(using)f(less)g(memory)-8 b(,)31 b(at)g(the)g(exp)s -(ense)f(of)g(sp)s(eed.)150 832 y(F)-8 b(or)39 b(reasons)f(explained)f -(b)s(elo)m(w,)j Fj(BZ2_bzRead)35 b Fl(will)h(decompress)i(the)g -Fj(nUnused)e Fl(b)m(ytes)j(starting)f(at)150 941 y Fj(unused)p -Fl(,)k(b)s(efore)e(starting)h(to)g(read)g(from)f(the)h(\014le)f -Fj(f)p Fl(.)71 b(A)m(t)42 b(most)f Fj(BZ_MAX_UNUSED)c 
-Fl(b)m(ytes)k(ma)m(y)h(b)s(e)150 1051 y(supplied)32 b(lik)m(e)k(this.) -55 b(If)36 b(this)e(facilit)m(y)h(is)g(not)h(required,)g(y)m(ou)g -(should)e(pass)h Fj(NULL)g Fl(and)g Fj(0)g Fl(for)h Fj(unused)150 -1160 y Fl(and)30 b(n)p Fj(Unused)e Fl(resp)s(ectiv)m(ely)-8 -b(.)150 1317 y(F)g(or)31 b(the)g(meaning)e(of)i(parameters)g -Fj(small)e Fl(and)g Fj(verbosity)p Fl(,)f(see)j Fj -(BZ2_bzDecompressInit)p Fl(.)150 1474 y(The)k(amoun)m(t)g(of)g(memory)g -(needed)g(to)g(decompress)g(a)h(\014le)e(cannot)h(b)s(e)g(determined)e -(un)m(til)h(the)h(\014le's)150 1584 y(header)22 b(has)f(b)s(een)g -(read.)38 b(So)22 b(it)f(is)g(p)s(ossible)e(that)k Fj(BZ2_bzReadOpen)17 -b Fl(returns)k Fj(BZ_OK)f Fl(but)h(a)i(subsequen)m(t)150 -1693 y(call)30 b(of)g Fj(BZ2_bzRead)e Fl(will)f(return)j -Fj(BZ_MEM_ERROR)p Fl(.)150 1850 y(P)m(ossible)f(assignmen)m(ts)h(to)h -Fj(bzerror)p Fl(:)572 2001 y Fj(BZ_CONFIG_ERROR)663 2105 -y Fl(if)e(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 -2209 y Fj(BZ_PARAM_ERROR)663 2313 y Fl(if)g Fj(f)h Fl(is)g -Fj(NULL)663 2416 y Fl(or)g Fj(small)f Fl(is)g(neither)h -Fj(0)g Fl(nor)g Fj(1)663 2520 y Fl(or)g Fj(\(unused)46 -b(==)h(NULL)g(&&)g(nUnused)f(!=)h(0\))663 2624 y Fl(or)30 -b Fj(\(unused)46 b(!=)h(NULL)g(&&)g(!\(0)g(<=)g(nUnused)f(<=)h -(BZ_MAX_UNUSED\)\))572 2728 y(BZ_IO_ERROR)663 2831 y -Fl(if)29 b Fj(ferror\(f\))f Fl(is)h(nonzero)572 2935 -y Fj(BZ_MEM_ERROR)663 3039 y Fl(if)g(insu\016cien)m(t)g(memory)h(is)f -(a)m(v)-5 b(ailable)572 3143 y Fj(BZ_OK)663 3247 y Fl(otherwise.)150 -3403 y(P)m(ossible)29 b(return)h(v)-5 b(alues:)572 3554 -y(P)m(oin)m(ter)31 b(to)g(an)f(abstract)h Fj(BZFILE)663 -3658 y Fl(if)e Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 3762 -y(NULL)663 3866 y Fl(otherwise)150 4023 y(Allo)m(w)m(able)g(next)g -(actions:)572 4174 y Fj(BZ2_bzRead)663 4277 y Fl(if)f -Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 4381 y(BZ2_bzClose)663 -4485 y Fl(otherwise)150 4887 y Ff(3.4.2)63 b Fe(BZ2_bzRead)533 -5074 y Fj(int)47 
b(BZ2_bzRead)e(\()j(int)e(*bzerror,)g(BZFILE)g(*b,)h -(void)f(*buf,)h(int)g(len)g(\);)150 5230 y Fl(Reads)35 -b(up)f(to)h Fj(len)f Fl(\(uncompressed\))h(b)m(ytes)g(from)f(the)h -(compressed)g(\014le)f Fj(b)g Fl(in)m(to)h(the)g(bu\013er)f -Fj(buf)p Fl(.)53 b(If)150 5340 y(the)30 b(read)f(w)m(as)h(successful,)f -Fj(bzerror)e Fl(is)i(set)h(to)g Fj(BZ_OK)e Fl(and)h(the)h(n)m(um)m(b)s -(er)e(of)i(b)m(ytes)g(read)f(is)g(returned.)p eop -%%Page: 21 22 -21 21 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(21)150 299 y(If)33 b(the)g(logical)g -(end-of-stream)h(w)m(as)g(detected,)i Fj(bzerror)31 b -Fl(will)g(b)s(e)h(set)i(to)g Fj(BZ_STREAM_END)p Fl(,)d(and)i(the)150 -408 y(n)m(um)m(b)s(er)c(of)i(b)m(ytes)f(read)h(is)e(returned.)40 -b(All)29 b(other)h Fj(bzerror)f Fl(v)-5 b(alues)29 b(denote)i(an)g -(error.)150 565 y Fj(BZ2_bzRead)37 b Fl(will)f(supply)h -Fj(len)i Fl(b)m(ytes,)j(unless)c(the)i(logical)f(stream)h(end)e(is)h -(detected)i(or)e(an)g(error)150 675 y(o)s(ccurs.)75 b(Because)43 -b(of)f(this,)i(it)d(is)g(p)s(ossible)e(to)k(detect)g(the)f(stream)g -(end)f(b)m(y)h(observing)f(when)g(the)150 784 y(n)m(um)m(b)s(er)29 -b(of)h(b)m(ytes)g(returned)f(is)g(less)g(than)h(the)g(n)m(um)m(b)s(er)f -(requested.)40 b(Nev)m(ertheless,)31 b(this)e(is)g(regarded)150 -894 y(as)38 b(inadvisable;)g(y)m(ou)g(should)d(instead)i(c)m(hec)m(k)i -Fj(bzerror)d Fl(after)i(ev)m(ery)g(call)e(and)h(w)m(atc)m(h)i(out)f -(for)f Fj(BZ_)150 1004 y(STREAM_END)p Fl(.)150 1160 y(In)m(ternally)-8 -b(,)47 b Fj(BZ2_bzRead)41 b Fl(copies)j(data)g(from)g(the)g(compressed) -g(\014le)f(in)f(c)m(h)m(unks)i(of)g(size)g Fj(BZ_MAX_)150 -1270 y(UNUSED)31 b Fl(b)m(ytes)i(b)s(efore)f(decompressing)f(it.)47 -b(If)32 b(the)h(\014le)e(con)m(tains)i(more)g(b)m(ytes)g(than)f -(strictly)f(needed)150 1380 y(to)48 b(reac)m(h)f(the)g(logical)f -(end-of-stream,)52 b Fj(BZ2_bzRead)44 b Fl(will)g(almost)j(certainly)f -(read)h(some)g(of)g(the)150 1489 y(trailing)c(data)j(b)s(efore)e 
-(signalling)f Fj(BZ_SEQUENCE_END)p Fl(.)80 b(T)-8 b(o)46 -b(collect)f(the)g(read)g(but)g(un)m(used)e(data)150 1599 -y(once)29 b Fj(BZ_SEQUENCE_END)24 b Fl(has)k(app)s(eared,)g(call)f -Fj(BZ2_bzReadGetUnused)c Fl(immediately)j(b)s(efore)i -Fj(BZ2_)150 1708 y(bzReadClose)p Fl(.)150 1865 y(P)m(ossible)h -(assignmen)m(ts)h(to)h Fj(bzerror)p Fl(:)572 2016 y Fj(BZ_PARAM_ERROR) -663 2120 y Fl(if)e Fj(b)h Fl(is)g Fj(NULL)f Fl(or)h Fj(buf)g -Fl(is)f Fj(NULL)g Fl(or)i Fj(len)46 b(<)i(0)572 2224 -y(BZ_SEQUENCE_ERROR)663 2328 y Fl(if)29 b Fj(b)h Fl(w)m(as)h(op)s(ened) -e(with)h Fj(BZ2_bzWriteOpen)572 2431 y(BZ_IO_ERROR)663 -2535 y Fl(if)f(there)i(is)e(an)h(error)g(reading)g(from)g(the)g -(compressed)g(\014le)572 2639 y Fj(BZ_UNEXPECTED_EOF)663 -2743 y Fl(if)f(the)i(compressed)f(\014le)f(ended)h(b)s(efore)g(the)g -(logical)g(end-of-stream)h(w)m(as)g(detected)572 2847 -y Fj(BZ_DATA_ERROR)663 2950 y Fl(if)e(a)i(data)g(in)m(tegrit)m(y)f -(error)g(w)m(as)h(detected)h(in)d(the)h(compressed)g(stream)572 -3054 y Fj(BZ_DATA_ERROR_MAGIC)663 3158 y Fl(if)f(the)i(stream)f(do)s -(es)g(not)h(b)s(egin)e(with)g(the)i(requisite)e(header)h(b)m(ytes)h -(\(ie,)f(is)g(not)663 3262 y(a)g Fj(bzip2)f Fl(data)i(\014le\).)61 -b(This)28 b(is)i(really)f(a)i(sp)s(ecial)e(case)i(of)g -Fj(BZ_DATA_ERROR)p Fl(.)572 3365 y Fj(BZ_MEM_ERROR)663 -3469 y Fl(if)e(insu\016cien)m(t)g(memory)h(w)m(as)h(a)m(v)-5 -b(ailable)572 3573 y Fj(BZ_STREAM_END)663 3677 y Fl(if)29 -b(the)i(logical)e(end)h(of)h(stream)f(w)m(as)h(detected.)572 -3781 y Fj(BZ_OK)663 3884 y Fl(otherwise.)150 4041 y(P)m(ossible)e -(return)h(v)-5 b(alues:)572 4192 y(n)m(um)m(b)s(er)29 -b(of)h(b)m(ytes)h(read)663 4296 y(if)e Fj(bzerror)f Fl(is)i -Fj(BZ_OK)f Fl(or)h Fj(BZ_STREAM_END)572 4400 y Fl(unde\014ned)663 -4503 y(otherwise)150 4660 y(Allo)m(w)m(able)g(next)g(actions:)572 -4811 y(collect)h(data)g(from)f Fj(buf)p Fl(,)f(then)h -Fj(BZ2_bzRead)e Fl(or)i Fj(BZ2_bzReadClose)663 4915 y -Fl(if)f Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 5019 y Fl(collect)h(data)g 
-(from)f Fj(buf)p Fl(,)f(then)h Fj(BZ2_bzReadClose)d Fl(or)j -Fj(BZ2_bzReadGetUnused)663 5123 y Fl(if)f Fj(bzerror)f -Fl(is)i Fj(BZ_SEQUENCE_END)572 5226 y(BZ2_bzReadClose)663 -5330 y Fl(otherwise)p eop -%%Page: 22 23 -22 22 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(22)150 299 y Ff(3.4.3)63 -b Fe(BZ2_bzReadGetUnused)533 486 y Fj(void)47 b(BZ2_bzReadGetUnused)42 -b(\()48 b(int*)e(bzerror,)g(BZFILE)g(*b,)1822 589 y(void**)g(unused,)g -(int*)g(nUnused)g(\);)150 746 y Fl(Returns)36 b(data)i(whic)m(h)d(w)m -(as)j(read)f(from)f(the)h(compressed)g(\014le)f(but)g(w)m(as)h(not)h -(needed)e(to)i(get)g(to)g(the)150 856 y(logical)k(end-of-stream.)78 -b Fj(*unused)41 b Fl(is)h(set)h(to)g(the)g(address)f(of)g(the)h(data,)k -(and)42 b Fj(*nUnused)e Fl(to)k(the)150 965 y(n)m(um)m(b)s(er)29 -b(of)i(b)m(ytes.)41 b Fj(*nUnused)28 b Fl(will)g(b)s(e)h(set)i(to)g(a)g -(v)-5 b(alue)30 b(b)s(et)m(w)m(een)h Fj(0)f Fl(and)g -Fj(BZ_MAX_UNUSED)d Fl(inclusiv)m(e.)150 1122 y(This)d(function)h(ma)m -(y)h(only)g(b)s(e)f(called)g(once)i Fj(BZ2_bzRead)c Fl(has)j(signalled) -e Fj(BZ_STREAM_END)e Fl(but)j(b)s(efore)150 1232 y Fj(BZ2_bzReadClose)p -Fl(.)150 1389 y(P)m(ossible)k(assignmen)m(ts)h(to)h Fj(bzerror)p -Fl(:)572 1540 y Fj(BZ_PARAM_ERROR)663 1644 y Fl(if)e -Fj(b)h Fl(is)g Fj(NULL)663 1747 y Fl(or)g Fj(unused)f -Fl(is)g Fj(NULL)g Fl(or)i Fj(nUnused)d Fl(is)i Fj(NULL)572 -1851 y(BZ_SEQUENCE_ERROR)663 1955 y Fl(if)f Fj(BZ_STREAM_END)e -Fl(has)j(not)h(b)s(een)e(signalled)663 2059 y(or)h(if)f -Fj(b)h Fl(w)m(as)h(op)s(ened)f(with)f Fj(BZ2_bzWriteOpen)542 -2162 y(BZ_OK)663 2266 y Fl(otherwise)150 2423 y(Allo)m(w)m(able)h(next) -g(actions:)572 2574 y Fj(BZ2_bzReadClose)150 2882 y Ff(3.4.4)63 -b Fe(BZ2_bzReadClose)533 3068 y Fj(void)47 b(BZ2_bzReadClose)c(\()48 -b(int)f(*bzerror,)e(BZFILE)h(*b)h(\);)150 3225 y Fl(Releases)36 -b(all)e(memory)h(p)s(ertaining)e(to)i(the)h(compressed)f(\014le)f -Fj(b)p Fl(.)54 b Fj(BZ2_bzReadClose)31 b 
Fl(do)s(es)k(not)h(call)150 -3335 y Fj(fclose)c Fl(on)h(the)h(underlying)d(\014le)h(handle,)h(so)h -(y)m(ou)g(should)e(do)h(that)h(y)m(ourself)f(if)g(appropriate.)49 -b Fj(BZ2_)150 3445 y(bzReadClose)27 b Fl(should)i(b)s(e)g(called)h(to)h -(clean)f(up)g(after)h(all)e(error)h(situations.)150 3601 -y(P)m(ossible)f(assignmen)m(ts)h(to)h Fj(bzerror)p Fl(:)572 -3752 y Fj(BZ_SEQUENCE_ERROR)663 3856 y Fl(if)e Fj(b)h -Fl(w)m(as)h(op)s(ened)e(with)h Fj(BZ2_bzOpenWrite)572 -3960 y(BZ_OK)663 4064 y Fl(otherwise)150 4221 y(Allo)m(w)m(able)g(next) -g(actions:)572 4372 y(none)150 4679 y Ff(3.4.5)63 b Fe(BZ2_bzWriteOpen) -533 4866 y Fj(BZFILE)46 b(*BZ2_bzWriteOpen)e(\()j(int)g(*bzerror,)e -(FILE)i(*f,)1774 4970 y(int)g(blockSize100k,)d(int)j(verbosity,)1774 -5074 y(int)g(workFactor)e(\);)150 5230 y Fl(Prepare)33 -b(to)g(write)f(compressed)h(data)h(to)f(\014le)f(handle)g -Fj(f)p Fl(.)47 b Fj(f)33 b Fl(should)e(refer)i(to)g(a)g(\014le)f(whic)m -(h)g(has)h(b)s(een)150 5340 y(op)s(ened)d(for)g(writing,)e(and)i(for)g -(whic)m(h)f(the)i(error)f(indicator)f(\()p Fj(ferror\(f\))p -Fl(\)is)f(not)i(set.)p eop -%%Page: 23 24 -23 23 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(23)150 299 y(F)-8 b(or)31 -b(the)g(meaning)e(of)i(parameters)g Fj(blockSize100k)p -Fl(,)c Fj(verbosity)g Fl(and)j Fj(workFactor)p Fl(,)e(see)150 -408 y Fj(BZ2_bzCompressInit)p Fl(.)150 565 y(All)d(required)f(memory)i -(is)f(allo)s(cated)i(at)g(this)e(stage,)j(so)f(if)e(the)h(call)g -(completes)g(successfully)-8 b(,)26 b Fj(BZ_MEM_)150 -675 y(ERROR)j Fl(cannot)i(b)s(e)f(signalled)e(b)m(y)i(a)h(subsequen)m -(t)f(call)f(to)i Fj(BZ2_bzWrite)p Fl(.)150 832 y(P)m(ossible)e -(assignmen)m(ts)h(to)h Fj(bzerror)p Fl(:)572 983 y Fj(BZ_CONFIG_ERROR) -663 1087 y Fl(if)e(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 -1190 y Fj(BZ_PARAM_ERROR)663 1294 y Fl(if)g Fj(f)h Fl(is)g -Fj(NULL)663 1398 y Fl(or)g Fj(blockSize100k)44 b(<)k(1)30 -b Fl(or)g Fj(blockSize100k)44 b(>)k(9)572 1502 
y(BZ_IO_ERROR)663 -1605 y Fl(if)29 b Fj(ferror\(f\))f Fl(is)h(nonzero)572 -1709 y Fj(BZ_MEM_ERROR)663 1813 y Fl(if)g(insu\016cien)m(t)g(memory)h -(is)f(a)m(v)-5 b(ailable)572 1917 y Fj(BZ_OK)663 2021 -y Fl(otherwise)150 2177 y(P)m(ossible)29 b(return)h(v)-5 -b(alues:)572 2328 y(P)m(oin)m(ter)31 b(to)g(an)f(abstract)h -Fj(BZFILE)663 2432 y Fl(if)e Fj(bzerror)f Fl(is)i Fj(BZ_OK)572 -2536 y(NULL)663 2640 y Fl(otherwise)150 2797 y(Allo)m(w)m(able)g(next)g -(actions:)572 2948 y Fj(BZ2_bzWrite)663 3051 y Fl(if)f -Fj(bzerror)f Fl(is)i Fj(BZ_OK)604 3155 y Fl(\(y)m(ou)25 -b(could)e(go)h(directly)f(to)h Fj(BZ2_bzWriteClose)p -Fl(,)c(but)j(this)g(w)m(ould)g(b)s(e)g(prett)m(y)h(p)s(oin)m(tless\)) -572 3259 y Fj(BZ2_bzWriteClose)663 3363 y Fl(otherwise)150 -3639 y Ff(3.4.6)63 b Fe(BZ2_bzWrite)533 3826 y Fj(void)47 -b(BZ2_bzWrite)e(\()i(int)g(*bzerror,)e(BZFILE)h(*b,)h(void)g(*buf,)f -(int)h(len)g(\);)150 3983 y Fl(Absorbs)26 b Fj(len)g -Fl(b)m(ytes)i(from)e(the)i(bu\013er)e Fj(buf)p Fl(,)h(ev)m(en)m(tually) -g(to)h(b)s(e)e(compressed)h(and)f(written)g(to)i(the)g(\014le.)150 -4140 y(P)m(ossible)h(assignmen)m(ts)h(to)h Fj(bzerror)p -Fl(:)572 4291 y Fj(BZ_PARAM_ERROR)663 4395 y Fl(if)e -Fj(b)h Fl(is)g Fj(NULL)f Fl(or)h Fj(buf)g Fl(is)f Fj(NULL)g -Fl(or)i Fj(len)46 b(<)i(0)572 4498 y(BZ_SEQUENCE_ERROR)663 -4602 y Fl(if)29 b(b)h(w)m(as)h(op)s(ened)e(with)g Fj(BZ2_bzReadOpen)572 -4706 y(BZ_IO_ERROR)663 4810 y Fl(if)g(there)i(is)e(an)h(error)g -(writing)f(the)h(compressed)g(\014le.)572 4914 y Fj(BZ_OK)663 -5017 y Fl(otherwise)150 5294 y Ff(3.4.7)63 b Fe(BZ2_bzWriteClose)p -eop -%%Page: 24 25 -24 24 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(24)533 299 y Fj(void)47 -b(BZ2_bzWriteClose)c(\()48 b(int)f(*bzerror,)e(BZFILE*)h(f,)1679 -403 y(int)h(abandon,)1679 506 y(unsigned)e(int*)i(nbytes_in,)1679 -610 y(unsigned)e(int*)i(nbytes_out)e(\);)533 818 y(void)i -(BZ2_bzWriteClose64)c(\()k(int)g(*bzerror,)e(BZFILE*)h(f,)1774 -922 y(int)h(abandon,)1774 
1025 y(unsigned)f(int*)g(nbytes_in_lo32,)1774 -1129 y(unsigned)g(int*)g(nbytes_in_hi32,)1774 1233 y(unsigned)g(int*)g -(nbytes_out_lo32,)1774 1337 y(unsigned)g(int*)g(nbytes_out_hi32)e(\);) -150 1493 y Fl(Compresses)39 b(and)g(\015ushes)g(to)h(the)g(compressed)g -(\014le)f(all)f(data)j(so)f(far)g(supplied)c(b)m(y)k -Fj(BZ2_bzWrite)p Fl(.)150 1603 y(The)27 b(logical)g(end-of-stream)h -(mark)m(ers)g(are)g(also)f(written,)h(so)f(subsequen)m(t)g(calls)g(to)h -Fj(BZ2_bzWrite)d Fl(are)150 1713 y(illegal.)50 b(All)33 -b(memory)h(asso)s(ciated)g(with)f(the)i(compressed)e(\014le)h -Fj(b)f Fl(is)g(released.)52 b Fj(fflush)33 b Fl(is)g(called)g(on)150 -1822 y(the)e(compressed)f(\014le,)f(but)h(it)g(is)f(not)i -Fj(fclose)p Fl('d.)150 1979 y(If)i Fj(BZ2_bzWriteClose)c -Fl(is)k(called)f(to)j(clean)e(up)f(after)i(an)g(error,)g(the)g(only)e -(action)i(is)f(to)h(release)g(the)150 2089 y(memory)-8 -b(.)42 b(The)30 b(library)e(records)j(the)g(error)f(co)s(des)h(issued)e -(b)m(y)h(previous)f(calls,)i(so)f(this)g(situation)g(will)150 -2198 y(b)s(e)c(detected)h(automatically)-8 b(.)40 b(There)26 -b(is)g(no)g(attempt)h(to)h(complete)e(the)h(compression)f(op)s -(eration,)g(nor)150 2308 y(to)32 b Fj(fflush)d Fl(the)i(compressed)g -(\014le.)42 b(Y)-8 b(ou)32 b(can)f(force)h(this)e(b)s(eha)m(viour)g(to) -h(happ)s(en)f(ev)m(en)i(in)d(the)j(case)g(of)150 2417 -y(no)e(error,)g(b)m(y)h(passing)e(a)i(nonzero)f(v)-5 -b(alue)30 b(to)h Fj(abandon)p Fl(.)150 2574 y(If)j Fj(nbytes_in)d -Fl(is)j(non-n)m(ull,)f Fj(*nbytes_in)e Fl(will)h(b)s(e)h(set)i(to)g(b)s -(e)f(the)g(total)h(v)m(olume)f(of)g(uncompressed)150 -2684 y(data)k(handled.)60 b(Similarly)-8 b(,)35 b Fj(nbytes_out)g -Fl(will)g(b)s(e)h(set)i(to)g(the)g(total)g(v)m(olume)f(of)g(compressed) -g(data)150 2793 y(written.)h(F)-8 b(or)27 b(compatibilit)m(y)d(with)h -(older)g(v)m(ersions)h(of)g(the)g(library)-8 b(,)25 b -Fj(BZ2_bzWriteClose)d Fl(only)j(yields)150 2903 y(the)40 -b(lo)m(w)m(er)g(32)h(bits)d(of)i(these)h(coun)m(ts.)69 -b(Use)40 b 
Fj(BZ2_bzWriteClose64)35 b Fl(if)k(y)m(ou)h(w)m(an)m(t)h -(the)f(full)d(64)k(bit)150 3013 y(coun)m(ts.)g(These)30 -b(t)m(w)m(o)i(functions)d(are)i(otherwise)f(absolutely)f(iden)m(tical.) -150 3169 y(P)m(ossible)g(assignmen)m(ts)h(to)h Fj(bzerror)p -Fl(:)572 3320 y Fj(BZ_SEQUENCE_ERROR)663 3424 y Fl(if)e -Fj(b)h Fl(w)m(as)h(op)s(ened)e(with)h Fj(BZ2_bzReadOpen)572 -3528 y(BZ_IO_ERROR)663 3632 y Fl(if)f(there)i(is)e(an)h(error)g -(writing)f(the)h(compressed)g(\014le)572 3736 y Fj(BZ_OK)663 -3839 y Fl(otherwise)150 4161 y Ff(3.4.8)63 b(Handling)41 -b(em)m(b)s(edded)g(compressed)h(data)e(streams)150 4354 -y Fl(The)i(high-lev)m(el)g(library)f(facilitates)h(use)h(of)g -Fj(bzip2)e Fl(data)j(streams)f(whic)m(h)f(form)g(some)i(part)e(of)i(a) -150 4463 y(surrounding,)27 b(larger)j(data)h(stream.)225 -4620 y Fi(\017)60 b Fl(F)-8 b(or)22 b(writing,)f(the)g(library)e(tak)m -(es)k(an)e(op)s(en)f(\014le)g(handle,)i(writes)e(compressed)h(data)h -(to)g(it,)g Fj(fflush)p Fl(es)330 4730 y(it)34 b(but)f(do)s(es)h(not)h -Fj(fclose)d Fl(it.)52 b(The)34 b(calling)f(application)g(can)h(write)g -(its)f(o)m(wn)i(data)g(b)s(efore)f(and)330 4839 y(after)d(the)f -(compressed)h(data)g(stream,)g(using)d(that)j(same)g(\014le)f(handle.) 
-225 5011 y Fi(\017)60 b Fl(Reading)34 b(is)f(more)i(complex,)g(and)f -(the)h(facilities)d(are)j(not)g(as)g(general)f(as)h(they)f(could)g(b)s -(e)g(since)330 5121 y(generalit)m(y)e(is)f(hard)f(to)j(reconcile)e -(with)f(e\016ciency)-8 b(.)46 b Fj(BZ2_bzRead)29 b Fl(reads)i(from)g -(the)h(compressed)330 5230 y(\014le)39 b(in)g(blo)s(c)m(ks)g(of)h(size) -g Fj(BZ_MAX_UNUSED)c Fl(b)m(ytes,)44 b(and)39 b(in)g(doing)g(so)h -(probably)e(will)f(o)m(v)m(ersho)s(ot)330 5340 y(the)i(logical)g(end)f -(of)h(compressed)f(stream.)67 b(T)-8 b(o)40 b(reco)m(v)m(er)g(this)e -(data)i(once)f(decompression)f(has)p eop -%%Page: 25 26 -25 25 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(25)330 299 y(ended,)28 -b(call)g Fj(BZ2_bzReadGetUnused)23 b Fl(after)29 b(the)g(last)f(call)g -(of)g Fj(BZ2_bzRead)e Fl(\(the)j(one)g(returning)330 -408 y Fj(BZ_STREAM_END)p Fl(\))e(but)j(b)s(efore)g(calling)f -Fj(BZ2_bzReadClose)p Fl(.)150 596 y(This)51 b(mec)m(hanism)g(mak)m(es)j -(it)e(easy)h(to)g(decompress)f(m)m(ultiple)e Fj(bzip2)i -Fl(streams)g(placed)g(end-to-)150 706 y(end.)90 b(As)48 -b(the)f(end)f(of)i(one)f(stream,)52 b(when)46 b Fj(BZ2_bzRead)f -Fl(returns)h Fj(BZ_STREAM_END)p Fl(,)i(call)e Fj(BZ2_)150 -816 y(bzReadGetUnused)36 b Fl(to)41 b(collect)g(the)g(un)m(used)e(data) -i(\(cop)m(y)g(it)f(in)m(to)g(y)m(our)h(o)m(wn)f(bu\013er)f -(somewhere\).)150 925 y(That)25 b(data)g(forms)f(the)h(start)h(of)e -(the)h(next)g(compressed)g(stream.)39 b(T)-8 b(o)25 b(start)h -(uncompressing)c(that)k(next)150 1035 y(stream,)40 b(call)d -Fj(BZ2_bzReadOpen)d Fl(again,)40 b(feeding)d(in)g(the)h(un)m(used)e -(data)j(via)e(the)h Fj(unused)p Fl(/)p Fj(nUnused)150 -1144 y Fl(parameters.)54 b(Keep)34 b(doing)g(this)f(un)m(til)g -Fj(BZ_STREAM_END)e Fl(return)j(coincides)f(with)h(the)g(ph)m(ysical)g -(end)150 1254 y(of)d(\014le)e(\()p Fj(feof\(f\))p Fl(\).)39 -b(In)30 b(this)f(situation)h Fj(BZ2_bzReadGetUnused)25 -b Fl(will)i(of)k(course)g(return)e(no)h(data.)150 1411 
-y(This)c(should)f(giv)m(e)j(some)g(feel)f(for)g(ho)m(w)h(the)g -(high-lev)m(el)e(in)m(terface)i(can)f(b)s(e)g(used.)39 -b(If)27 b(y)m(ou)h(require)e(extra)150 1520 y(\015exibilit)m(y)-8 -b(,)28 b(y)m(ou'll)i(ha)m(v)m(e)h(to)g(bite)f(the)h(bullet)d(and)i(get) -i(to)f(grips)e(with)g(the)h(lo)m(w-lev)m(el)h(in)m(terface.)150 -1779 y Ff(3.4.9)63 b(Standard)40 b(\014le-reading/writing)j(co)s(de)150 -1972 y Fl(Here's)31 b(ho)m(w)f(y)m(ou'd)h(write)e(data)j(to)f(a)f -(compressed)g(\014le:)390 2330 y Fj(FILE*)142 b(f;)390 -2434 y(BZFILE*)46 b(b;)390 2538 y(int)238 b(nBuf;)390 -2642 y(char)190 b(buf[)46 b(/*)i(whatever)d(size)i(you)g(like)f(*/)i -(];)390 2746 y(int)238 b(bzerror;)390 2849 y(int)g(nWritten;)390 -3057 y(f)47 b(=)h(fopen)e(\()i("myfile.bz2",)c("w")j(\);)390 -3161 y(if)g(\(!f\))g({)533 3264 y(/*)g(handle)f(error)h(*/)390 -3368 y(})390 3472 y(b)g(=)h(BZ2_bzWriteOpen)c(\()j(&bzerror,)e(f,)i(9)h -(\);)390 3576 y(if)f(\(bzerror)f(!=)h(BZ_OK\))f({)533 -3680 y(BZ2_bzWriteClose)e(\()j(b)g(\);)533 3783 y(/*)g(handle)f(error)h -(*/)390 3887 y(})390 4095 y(while)f(\()i(/*)f(condition)e(*/)i(\))h({) -533 4198 y(/*)f(get)g(data)g(to)g(write)f(into)h(buf,)g(and)g(set)g -(nBuf)f(appropriately)e(*/)533 4302 y(nWritten)i(=)h(BZ2_bzWrite)e(\()i -(&bzerror,)f(b,)h(buf,)f(nBuf)h(\);)533 4406 y(if)g(\(bzerror)f(==)h -(BZ_IO_ERROR\))e({)676 4510 y(BZ2_bzWriteClose)f(\()j(&bzerror,)e(b)j -(\);)676 4614 y(/*)g(handle)e(error)g(*/)533 4717 y(})390 -4821 y(})390 5029 y(BZ2_bzWriteClose)d(\()48 b(&bzerror,)d(b)j(\);)390 -5132 y(if)f(\(bzerror)f(==)h(BZ_IO_ERROR\))d({)533 5236 -y(/*)j(handle)f(error)h(*/)390 5340 y(})p eop -%%Page: 26 27 -26 26 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(26)150 299 y(And)29 b(to)j(read)e(from)g -(a)h(compressed)f(\014le:)390 450 y Fj(FILE*)142 b(f;)390 -554 y(BZFILE*)46 b(b;)390 657 y(int)238 b(nBuf;)390 761 -y(char)190 b(buf[)46 b(/*)i(whatever)d(size)i(you)g(like)f(*/)i(];)390 -865 y(int)238 b(bzerror;)390 969 
y(int)g(nWritten;)390 -1176 y(f)47 b(=)h(fopen)e(\()i("myfile.bz2",)c("r")j(\);)390 -1280 y(if)g(\(!f\))g({)533 1384 y(/*)g(handle)f(error)h(*/)390 -1488 y(})390 1591 y(b)g(=)h(BZ2_bzReadOpen)c(\()j(&bzerror,)f(f,)h(0,)g -(NULL,)f(0)i(\);)390 1695 y(if)f(\(bzerror)f(!=)h(BZ_OK\))f({)533 -1799 y(BZ2_bzReadClose)e(\()j(&bzerror,)f(b)h(\);)533 -1903 y(/*)g(handle)f(error)h(*/)390 2007 y(})390 2214 -y(bzerror)f(=)h(BZ_OK;)390 2318 y(while)f(\(bzerror)g(==)h(BZ_OK)f(&&)i -(/*)f(arbitrary)e(other)h(conditions)f(*/\))i({)533 2422 -y(nBuf)g(=)g(BZ2_bzRead)e(\()j(&bzerror,)d(b,)i(buf,)g(/*)g(size)g(of)g -(buf)g(*/)g(\);)533 2525 y(if)g(\(bzerror)f(==)h(BZ_OK\))f({)676 -2629 y(/*)i(do)f(something)e(with)i(buf[0)f(..)h(nBuf-1])f(*/)533 -2733 y(})390 2837 y(})390 2941 y(if)h(\(bzerror)f(!=)h(BZ_STREAM_END\)) -d({)533 3044 y(BZ2_bzReadClose)g(\()j(&bzerror,)f(b)h(\);)533 -3148 y(/*)g(handle)f(error)h(*/)390 3252 y(})g(else)g({)533 -3356 y(BZ2_bzReadClose)d(\()j(&bzerror)f(\);)390 3459 -y(})150 3753 y Fk(3.5)68 b(Utilit)l(y)47 b(functions)150 -4045 y Ff(3.5.1)63 b Fe(BZ2_bzBuffToBuffCompress)533 -4232 y Fj(int)47 b(BZ2_bzBuffToBuffCompress\()41 b(char*)428 -b(dest,)1965 4335 y(unsigned)46 b(int*)g(destLen,)1965 -4439 y(char*)428 b(source,)1965 4543 y(unsigned)46 b(int)94 -b(sourceLen,)1965 4647 y(int)524 b(blockSize100k,)1965 -4751 y(int)g(verbosity,)1965 4854 y(int)g(workFactor)45 -b(\);)150 5011 y Fl(A)m(ttempts)33 b(to)g(compress)f(the)g(data)h(in)e -Fj(source[0)d(..)i(sourceLen-1])e Fl(in)m(to)k(the)h(destination)e -(bu\013er,)150 5121 y Fj(dest[0)e(..)g(*destLen-1])p -Fl(.)37 b(If)26 b(the)g(destination)g(bu\013er)f(is)h(big)f(enough,)j -Fj(*destLen)c Fl(is)h(set)i(to)g(the)g(size)150 5230 -y(of)i(the)f(compressed)h(data,)g(and)f Fj(BZ_OK)f Fl(is)h(returned.)39 -b(If)28 b(the)h(compressed)f(data)h(w)m(on't)g(\014t,)g -Fj(*destLen)150 5340 y Fl(is)g(unc)m(hanged,)i(and)e -Fj(BZ_OUTBUFF_FULL)e Fl(is)i(returned.)p eop -%%Page: 27 28 -27 27 bop 150 -116 a Fl(Chapter)30 
b(3:)h(Programming)e(with)g -Fj(libbzip2)1891 b Fl(27)150 299 y(Compression)22 b(in)g(this)h(manner) -g(is)g(a)h(one-shot)g(ev)m(en)m(t,)j(done)c(with)g(a)h(single)e(call)h -(to)i(this)d(function.)37 b(The)150 408 y(resulting)25 -b(compressed)i(data)i(is)d(a)i(complete)f Fj(bzip2)f -Fl(format)i(data)g(stream.)40 b(There)27 b(is)f(no)i(mec)m(hanism)150 -518 y(for)23 b(making)g(additional)e(calls)i(to)h(pro)m(vide)f(extra)h -(input)e(data.)39 b(If)23 b(y)m(ou)h(w)m(an)m(t)g(that)g(kind)e(of)h -(mec)m(hanism,)150 628 y(use)30 b(the)h(lo)m(w-lev)m(el)f(in)m -(terface.)150 784 y(F)-8 b(or)31 b(the)g(meaning)e(of)i(parameters)g -Fj(blockSize100k)p Fl(,)c Fj(verbosity)g Fl(and)j Fj(workFactor)p -Fl(,)150 894 y(see)h Fj(BZ2_bzCompressInit)p Fl(.)150 -1051 y(T)-8 b(o)27 b(guaran)m(tee)h(that)e(the)h(compressed)f(data)h -(will)d(\014t)i(in)f(its)g(bu\013er,)i(allo)s(cate)f(an)g(output)g -(bu\013er)g(of)g(size)150 1160 y(1\045)31 b(larger)f(than)g(the)g -(uncompressed)f(data,)j(plus)c(six)h(h)m(undred)g(extra)i(b)m(ytes.)150 -1317 y Fj(BZ2_bzBuffToBuffDecompre)o(ss)25 b Fl(will)k(not)j(write)e -(data)j(at)f(or)f(b)s(ey)m(ond)g Fj(dest[*destLen])p -Fl(,)d(ev)m(en)k(in)150 1427 y(case)f(of)g(bu\013er)e(o)m(v)m(er\015o)m -(w.)150 1584 y(P)m(ossible)g(return)h(v)-5 b(alues:)572 -1735 y Fj(BZ_CONFIG_ERROR)663 1839 y Fl(if)29 b(the)i(library)d(has)i -(b)s(een)f(mis-compiled)572 1942 y Fj(BZ_PARAM_ERROR)663 -2046 y Fl(if)g Fj(dest)g Fl(is)h Fj(NULL)f Fl(or)h Fj(destLen)f -Fl(is)g Fj(NULL)663 2150 y Fl(or)h Fj(blockSize100k)44 -b(<)k(1)30 b Fl(or)g Fj(blockSize100k)44 b(>)k(9)663 -2254 y Fl(or)30 b Fj(verbosity)45 b(<)j(0)30 b Fl(or)g -Fj(verbosity)45 b(>)j(4)663 2357 y Fl(or)30 b Fj(workFactor)45 -b(<)j(0)30 b Fl(or)g Fj(workFactor)45 b(>)i(250)572 2461 -y(BZ_MEM_ERROR)663 2565 y Fl(if)29 b(insu\016cien)m(t)g(memory)h(is)f -(a)m(v)-5 b(ailable)572 2669 y Fj(BZ_OUTBUFF_FULL)663 -2773 y Fl(if)29 b(the)i(size)f(of)g(the)h(compressed)f(data)h(exceeds)g -Fj(*destLen)572 2876 y(BZ_OK)663 
2980 y Fl(otherwise)150 -3349 y Ff(3.5.2)63 b Fe(BZ2_bzBuffToBuffDecompress)533 -3536 y Fj(int)47 b(BZ2_bzBuffToBuffDecompres)o(s)42 b(\()47 -b(char*)428 b(dest,)2108 3640 y(unsigned)46 b(int*)g(destLen,)2108 -3744 y(char*)428 b(source,)2108 3848 y(unsigned)46 b(int)94 -b(sourceLen,)2108 3951 y(int)524 b(small,)2108 4055 y(int)g(verbosity) -46 b(\);)150 4212 y Fl(A)m(ttempts)24 b(to)g(decompress)f(the)g(data)g -(in)f Fj(source[0)28 b(..)i(sourceLen-1])20 b Fl(in)m(to)j(the)g -(destination)f(bu\013er,)150 4322 y Fj(dest[0)29 b(..)g(*destLen-1])p -Fl(.)37 b(If)26 b(the)g(destination)g(bu\013er)f(is)h(big)f(enough,)j -Fj(*destLen)c Fl(is)h(set)i(to)g(the)g(size)150 4431 -y(of)21 b(the)g(uncompressed)e(data,)24 b(and)c Fj(BZ_OK)f -Fl(is)h(returned.)36 b(If)20 b(the)h(compressed)g(data)g(w)m(on't)h -(\014t,)g Fj(*destLen)150 4541 y Fl(is)29 b(unc)m(hanged,)i(and)e -Fj(BZ_OUTBUFF_FULL)e Fl(is)i(returned.)150 4698 y Fj(source)g -Fl(is)g(assumed)h(to)h(hold)e(a)i(complete)f Fj(bzip2)f -Fl(format)i(data)g(stream.)150 4807 y Fj(BZ2_bzBuffToBuffDecompre)o(ss) -22 b Fl(tries)28 b(to)i(decompress)e(the)h(en)m(tiret)m(y)g(of)g(the)f -(stream)h(in)m(to)g(the)f(out-)150 4917 y(put)i(bu\013er.)150 -5074 y(F)-8 b(or)31 b(the)g(meaning)e(of)i(parameters)g -Fj(small)e Fl(and)g Fj(verbosity)p Fl(,)f(see)j Fj -(BZ2_bzDecompressInit)p Fl(.)150 5230 y(Because)j(the)f(compression)e -(ratio)i(of)g(the)g(compressed)f(data)h(cannot)g(b)s(e)f(kno)m(wn)g(in) -g(adv)-5 b(ance,)34 b(there)150 5340 y(is)d(no)h(easy)g(w)m(a)m(y)h(to) -f(guaran)m(tee)i(that)e(the)g(output)f(bu\013er)g(will)e(b)s(e)i(big)g -(enough.)45 b(Y)-8 b(ou)32 b(ma)m(y)h(of)f(course)p eop -%%Page: 28 29 -28 28 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(28)150 299 y(mak)m(e)36 -b(arrangemen)m(ts)f(in)e(y)m(our)i(co)s(de)g(to)g(record)g(the)g(size)f -(of)h(the)g(uncompressed)f(data,)i(but)e(suc)m(h)h(a)150 -408 y(mec)m(hanism)30 b(is)f(b)s(ey)m(ond)h(the)g(scop)s(e)h(of)f(this) 
-g(library)-8 b(.)150 565 y Fj(BZ2_bzBuffToBuffDecompre)o(ss)25 -b Fl(will)k(not)j(write)e(data)j(at)f(or)f(b)s(ey)m(ond)g -Fj(dest[*destLen])p Fl(,)d(ev)m(en)k(in)150 675 y(case)f(of)g(bu\013er) -e(o)m(v)m(er\015o)m(w.)150 832 y(P)m(ossible)g(return)h(v)-5 -b(alues:)572 983 y Fj(BZ_CONFIG_ERROR)663 1087 y Fl(if)29 -b(the)i(library)d(has)i(b)s(een)f(mis-compiled)572 1190 -y Fj(BZ_PARAM_ERROR)663 1294 y Fl(if)g Fj(dest)g Fl(is)h -Fj(NULL)f Fl(or)h Fj(destLen)f Fl(is)g Fj(NULL)663 1398 -y Fl(or)h Fj(small)46 b(!=)i(0)f(&&)g(small)g(!=)g(1)663 -1502 y Fl(or)30 b Fj(verbosity)45 b(<)j(0)30 b Fl(or)g -Fj(verbosity)45 b(>)j(4)572 1605 y(BZ_MEM_ERROR)663 1709 -y Fl(if)29 b(insu\016cien)m(t)g(memory)h(is)f(a)m(v)-5 -b(ailable)572 1813 y Fj(BZ_OUTBUFF_FULL)663 1917 y Fl(if)29 -b(the)i(size)f(of)g(the)h(compressed)f(data)h(exceeds)g -Fj(*destLen)572 2021 y(BZ_DATA_ERROR)663 2124 y Fl(if)e(a)i(data)g(in)m -(tegrit)m(y)f(error)g(w)m(as)h(detected)h(in)d(the)h(compressed)g(data) -572 2228 y Fj(BZ_DATA_ERROR_MAGIC)663 2332 y Fl(if)f(the)i(compressed)f -(data)h(do)s(esn't)f(b)s(egin)f(with)g(the)i(righ)m(t)e(magic)i(b)m -(ytes)572 2436 y Fj(BZ_UNEXPECTED_EOF)663 2539 y Fl(if)e(the)i -(compressed)f(data)h(ends)e(unexp)s(ectedly)572 2643 -y Fj(BZ_OK)663 2747 y Fl(otherwise)150 3116 y Fk(3.6)68 -b Fd(zlib)43 b Fk(compatibilit)l(y)k(functions)150 3308 -y Fl(Y)-8 b(oshiok)j(a)33 b(Tsuneo)e(has)h(con)m(tributed)g(some)g -(functions)f(to)i(giv)m(e)g(b)s(etter)f Fj(zlib)f Fl(compatibilit)m(y) --8 b(.)45 b(These)150 3418 y(functions)36 b(are)i Fj(BZ2_bzopen)p -Fl(,)e Fj(BZ2_bzread)p Fl(,)h Fj(BZ2_bzwrite)p Fl(,)f -Fj(BZ2_bzflush)p Fl(,)h Fj(BZ2_bzclose)p Fl(,)f Fj(BZ2_)150 -3527 y(bzerror)23 b Fl(and)h Fj(BZ2_bzlibVersion)p Fl(.)34 -b(These)25 b(functions)e(are)j(not)f(\(y)m(et\))h(o\016cially)e(part)h -(of)g(the)g(library)-8 b(.)150 3637 y(If)30 b(they)g(break,)h(y)m(ou)g -(get)g(to)g(k)m(eep)g(all)f(the)g(pieces.)41 b(Nev)m(ertheless,)31 -b(I)f(think)f(they)i(w)m(ork)f(ok.)390 3788 y 
Fj(typedef)46 -b(void)g(BZFILE;)390 3995 y(const)g(char)h(*)g(BZ2_bzlibVersion)d(\()j -(void)g(\);)150 4152 y Fl(Returns)29 b(a)i(string)f(indicating)e(the)i -(library)e(v)m(ersion.)390 4303 y Fj(BZFILE)46 b(*)i(BZ2_bzopen)92 -b(\()48 b(const)e(char)h(*path,)f(const)g(char)h(*mode)f(\);)390 -4407 y(BZFILE)g(*)i(BZ2_bzdopen)c(\()k(int)381 b(fd,)190 -b(const)46 b(char)h(*mode)f(\);)150 4564 y Fl(Op)s(ens)19 -b(a)j Fj(.bz2)e Fl(\014le)g(for)g(reading)g(or)h(writing,)g(using)f -(either)g(its)h(name)g(or)g(a)g(pre-existing)f(\014le)g(descriptor.)150 -4674 y(Analogous)30 b(to)i Fj(fopen)c Fl(and)i Fj(fdopen)p -Fl(.)390 4825 y Fj(int)47 b(BZ2_bzread)93 b(\()47 b(BZFILE*)f(b,)h -(void*)f(buf,)h(int)g(len)g(\);)390 4928 y(int)g(BZ2_bzwrite)e(\()i -(BZFILE*)f(b,)h(void*)f(buf,)h(int)g(len)g(\);)150 5085 -y Fl(Reads/writes)30 b(data)h(from/to)g(a)g(previously)d(op)s(ened)i -Fj(BZFILE)p Fl(.)39 b(Analogous)30 b(to)h Fj(fread)e -Fl(and)h Fj(fwrite)p Fl(.)390 5236 y Fj(int)95 b(BZ2_bzflush)44 -b(\()k(BZFILE*)e(b)h(\);)390 5340 y(void)g(BZ2_bzclose)d(\()k(BZFILE*)e -(b)h(\);)p eop -%%Page: 29 30 -29 29 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(29)150 299 y(Flushes/closes)27 -b(a)h Fj(BZFILE)p Fl(.)39 b Fj(BZ2_bzflush)24 b Fl(do)s(esn't)k -(actually)f(do)h(an)m(ything.)39 b(Analogous)28 b(to)h -Fj(fflush)150 408 y Fl(and)h Fj(fclose)p Fl(.)390 559 -y Fj(const)46 b(char)h(*)g(BZ2_bzerror)e(\()j(BZFILE)e(*b,)h(int)g -(*errnum)e(\))150 716 y Fl(Returns)31 b(a)i(string)e(describing)f(the)i -(more)g(recen)m(t)h(error)f(status)h(of)f Fj(b)p Fl(,)g(and)g(also)g -(sets)h Fj(*errnum)d Fl(to)j(its)150 826 y(n)m(umerical)c(v)-5 -b(alue.)150 1242 y Fk(3.7)68 b(Using)46 b(the)f(library)g(in)g(a)g -Fd(stdio)p Fk(-free)f(en)l(vironmen)l(t)150 1615 y Ff(3.7.1)63 -b(Getting)40 b(rid)h(of)g Fe(stdio)150 1807 y Fl(In)i(a)g(deeply)g(em)m -(b)s(edded)f(application,)j(y)m(ou)f(migh)m(t)f(w)m(an)m(t)h(to)g(use)f -(just)g(the)h(memory-to-memory)150 1917 
y(functions.)39 -b(Y)-8 b(ou)30 b(can)f(do)g(this)g(con)m(v)m(enien)m(tly)g(b)m(y)g -(compiling)e(the)j(library)d(with)h(prepro)s(cessor)g(sym)m(b)s(ol)150 -2026 y Fj(BZ_NO_STDIO)35 b Fl(de\014ned.)63 b(Doing)39 -b(this)e(giv)m(es)h(y)m(ou)h(a)f(library)e(con)m(taining)i(only)f(the)i -(follo)m(wing)e(eigh)m(t)150 2136 y(functions:)150 2293 -y Fj(BZ2_bzCompressInit)p Fl(,)26 b Fj(BZ2_bzCompress)p -Fl(,)g Fj(BZ2_bzCompressEnd)150 2402 y(BZ2_bzDecompressInit)p -Fl(,)f Fj(BZ2_bzDecompress)p Fl(,)h Fj(BZ2_bzDecompressEnd)150 -2512 y(BZ2_bzBuffToBuffCompress)o Fl(,)f Fj(BZ2_bzBuffToBuffDecompre)o -(ss)150 2669 y Fl(When)30 b(compiled)f(lik)m(e)h(this,)f(all)g -(functions)g(will)f(ignore)i Fj(verbosity)e Fl(settings.)150 -3006 y Ff(3.7.2)63 b(Critical)40 b(error)h(handling)150 -3199 y Fj(libbzip2)20 b Fl(con)m(tains)j(a)g(n)m(um)m(b)s(er)f(of)g(in) -m(ternal)g(assertion)g(c)m(hec)m(ks)i(whic)m(h)d(should,)i(needless)f -(to)h(sa)m(y)-8 b(,)26 b(nev)m(er)150 3308 y(b)s(e)g(activ)-5 -b(ated.)40 b(Nev)m(ertheless,)28 b(if)d(an)i(assertion)f(should)e -(fail,)i(b)s(eha)m(viour)f(dep)s(ends)f(on)j(whether)e(or)i(not)150 -3418 y(the)k(library)d(w)m(as)i(compiled)f(with)g Fj(BZ_NO_STDIO)e -Fl(set.)150 3575 y(F)-8 b(or)31 b(a)g(normal)e(compile,)h(an)g -(assertion)g(failure)f(yields)f(the)j(message)533 3726 -y Fj(bzip2/libbzip2:)44 b(internal)h(error)i(number)f(N.)533 -3829 y(This)h(is)g(a)g(bug)g(in)h(bzip2/libbzip2,)43 -b(1.0)k(of)g(21-Mar-2000.)533 3933 y(Please)f(report)g(it)i(to)f(me)g -(at:)g(jseward@acm.org.)91 b(If)47 b(this)g(happened)533 -4037 y(when)g(you)g(were)f(using)h(some)f(program)g(which)h(uses)f -(libbzip2)g(as)h(a)533 4141 y(component,)e(you)i(should)f(also)h -(report)f(this)h(bug)f(to)i(the)f(author\(s\))533 4244 -y(of)g(that)g(program.)93 b(Please)46 b(make)h(an)g(effort)f(to)h -(report)g(this)f(bug;)533 4348 y(timely)g(and)h(accurate)f(bug)h -(reports)e(eventually)g(lead)i(to)g(higher)533 4452 y(quality)f -(software.)93 b(Thanks.)h(Julian)46 
b(Seward,)f(21)j(March)e(2000.)150 -4609 y Fl(where)30 b Fj(N)g Fl(is)f(some)i(error)f(co)s(de)h(n)m(um)m -(b)s(er.)39 b Fj(exit\(3\))28 b Fl(is)i(then)g(called.)150 -4766 y(F)-8 b(or)31 b(a)g Fj(stdio)p Fl(-free)e(library)-8 -b(,)29 b(assertion)h(failures)e(result)i(in)f(a)i(call)e(to)i(a)g -(function)e(declared)h(as:)533 4917 y Fj(extern)46 b(void)h -(bz_internal_error)c(\()k(int)g(errcode)f(\);)150 5074 -y Fl(The)30 b(relev)-5 b(an)m(t)31 b(co)s(de)f(is)g(passed)f(as)i(a)g -(parameter.)41 b(Y)-8 b(ou)31 b(should)d(supply)g(suc)m(h)i(a)h -(function.)150 5230 y(In)g(either)g(case,)j(once)e(an)g(assertion)g -(failure)e(has)h(o)s(ccurred,)h(an)m(y)g Fj(bz_stream)e -Fl(records)h(in)m(v)m(olv)m(ed)h(can)150 5340 y(b)s(e)e(regarded)g(as)h -(in)m(v)-5 b(alid.)38 b(Y)-8 b(ou)31 b(should)d(not)j(attempt)g(to)g -(resume)f(normal)g(op)s(eration)f(with)g(them.)p eop -%%Page: 30 31 -30 30 bop 150 -116 a Fl(Chapter)30 b(3:)41 b(Programming)29 -b(with)g Fj(libbzip2)1881 b Fl(30)150 299 y(Y)-8 b(ou)22 -b(ma)m(y)-8 b(,)25 b(of)d(course,)h(c)m(hange)g(critical)e(error)g -(handling)e(to)j(suit)f(y)m(our)g(needs.)38 b(As)21 b(I)h(said)e(ab)s -(o)m(v)m(e,)25 b(critical)150 408 y(errors)30 b(indicate)g(bugs)g(in)g -(the)h(library)d(and)i(should)f(not)i(o)s(ccur.)42 b(All)29 -b Fj(")p Fl(normal)p Fj(")h Fl(error)g(situations)g(are)150 -518 y(indicated)f(via)h(error)g(return)f(co)s(des)i(from)f(functions,)f -(and)g(can)i(b)s(e)f(reco)m(v)m(ered)i(from.)150 798 -y Fk(3.8)68 b(Making)45 b(a)g(Windo)l(ws)h(DLL)150 990 -y Fl(Ev)m(erything)30 b(related)g(to)h(Windo)m(ws)f(has)g(b)s(een)f -(con)m(tributed)h(b)m(y)g(Y)-8 b(oshiok)j(a)31 b(Tsuneo)150 -1100 y(\()p Fj(QWF00133@niftyserve.or.jp)46 b Fl(/)52 -b Fj(tsuneo-y@is.aist-nara.ac.j)o(p)p Fl(\),)g(so)h(y)m(ou)f(should)f -(send)150 1210 y(y)m(our)30 b(queries)g(to)h(him)e(\(but)h(p)s(erhaps)e -(Cc:)41 b(me,)31 b Fj(jseward@acm.org)p Fl(\).)150 1366 -y(My)43 b(v)-5 b(ague)44 b(understanding)d(of)i(what)g(to)h(do)f(is:)65 -b(using)41 
b(Visual)h(C)p Fj(++)g Fl(5.0,)48 b(op)s(en)42 -b(the)h(pro)5 b(ject)44 b(\014le)150 1476 y Fj(libbz2.dsp)p -Fl(,)28 b(and)i(build.)37 b(That's)31 b(all.)150 1633 -y(If)41 b(y)m(ou)g(can't)h(op)s(en)e(the)h(pro)5 b(ject)42 -b(\014le)e(for)h(some)g(reason,)j(mak)m(e)e(a)g(new)e(one,)k(naming)c -(these)i(\014les:)150 1742 y Fj(blocksort.c)p Fl(,)28 -b Fj(bzlib.c)p Fl(,)g Fj(compress.c)p Fl(,)g Fj(crctable.c)p -Fl(,)g Fj(decompress.c)p Fl(,)f Fj(huffman.c)p Fl(,)150 -1852 y Fj(randtable.c)32 b Fl(and)j Fj(libbz2.def)p Fl(.)53 -b(Y)-8 b(ou)36 b(will)d(also)i(need)g(to)h(name)g(the)g(header)f -(\014les)f Fj(bzlib.h)g Fl(and)150 1962 y Fj(bzlib_private.h)p -Fl(.)150 2118 y(If)c(y)m(ou)h(don't)f(use)g(V)m(C)p Fj(++)p -Fl(,)g(y)m(ou)h(ma)m(y)g(need)f(to)h(de\014ne)f(the)h(propro)s(cessor)e -(sym)m(b)s(ol)g Fj(_WIN32)p Fl(.)150 2275 y(Finally)-8 -b(,)28 b Fj(dlltest.c)e Fl(is)h(a)i(sample)f(program)g(using)g(the)g -(DLL.)h(It)g(has)f(a)h(pro)5 b(ject)29 b(\014le,)g Fj(dlltest.dsp)p -Fl(.)150 2432 y(If)h(y)m(ou)h(just)e(w)m(an)m(t)j(a)e(mak)m(e\014le)h -(for)f(Visual)f(C,)h(ha)m(v)m(e)i(a)e(lo)s(ok)g(at)i -Fj(makefile.msc)p Fl(.)150 2589 y(Be)k(a)m(w)m(are)g(that)g(if)e(y)m -(ou)h(compile)f Fj(bzip2)g Fl(itself)g(on)h(Win32,)h(y)m(ou)g(m)m(ust)f -(set)g Fj(BZ_UNIX)e Fl(to)j(0)f(and)g Fj(BZ_)150 2698 -y(LCCWIN32)27 b Fl(to)j(1,)g(in)f(the)g(\014le)g Fj(bzip2.c)p -Fl(,)e(b)s(efore)i(compiling.)39 b(Otherwise)28 b(the)h(resulting)f -(binary)f(w)m(on't)150 2808 y(w)m(ork)j(correctly)-8 -b(.)150 2965 y(I)30 b(ha)m(v)m(en't)i(tried)d(an)m(y)i(of)g(this)e -(stu\013)h(m)m(yself,)g(but)g(it)f(all)h(lo)s(oks)g(plausible.)p -eop -%%Page: 31 32 -31 31 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 -b(31)150 299 y Fh(4)80 b(Miscellanea)150 583 y Fl(These)30 -b(are)h(just)f(some)g(random)g(though)m(ts)h(of)f(mine.)40 -b(Y)-8 b(our)30 b(mileage)h(ma)m(y)g(v)-5 b(ary)d(.)150 -884 y Fk(4.1)68 b(Limitations)47 b(of)e(the)g(compressed)g(\014le)h -(format)150 1077 y Fj(bzip2-1.0)p 
Fl(,)e Fj(0.9.5)e Fl(and)g -Fj(0.9.0)g Fl(use)h(exactly)h(the)f(same)h(\014le)e(format)i(as)f(the)h -(previous)d(v)m(ersion,)150 1186 y Fj(bzip2-0.1)p Fl(.)75 -b(This)41 b(decision)g(w)m(as)i(made)g(in)e(the)i(in)m(terests)g(of)g -(stabilit)m(y)-8 b(.)77 b(Creating)42 b(y)m(et)i(another)150 -1296 y(incompatible)21 b(compressed)i(\014le)f(format)i(w)m(ould)e -(create)i(further)e(confusion)g(and)h(disruption)d(for)j(users.)150 -1453 y(Nev)m(ertheless,)31 b(this)e(is)g(not)h(a)g(painless)e -(decision.)39 b(Dev)m(elopmen)m(t)31 b(w)m(ork)f(since)f(the)h(release) -h(of)f Fj(bzip2-)150 1562 y(0.1)19 b Fl(in)g(August)i(1997)h(has)e(sho) -m(wn)f(complexities)h(in)f(the)h(\014le)g(format)g(whic)m(h)f(slo)m(w)h -(do)m(wn)g(decompression)150 1672 y(and,)30 b(in)f(retrosp)s(ect,)i -(are)g(unnecessary)-8 b(.)40 b(These)31 b(are:)225 1829 -y Fi(\017)60 b Fl(The)20 b(run-length)g(enco)s(der,)i(whic)m(h)e(is)g -(the)h(\014rst)f(of)h(the)g(compression)f(transformations,)i(is)e(en)m -(tirely)330 1938 y(irrelev)-5 b(an)m(t.)63 b(The)38 b(original)e(purp)s -(ose)g(w)m(as)j(to)g(protect)g(the)f(sorting)g(algorithm)f(from)g(the)i -(v)m(ery)330 2048 y(w)m(orst)h(case)h(input:)58 b(a)41 -b(string)e(of)h(rep)s(eated)g(sym)m(b)s(ols.)68 b(But)40 -b(algorithm)f(steps)h(Q6a)h(and)e(Q6b)330 2157 y(in)30 -b(the)i(original)e(Burro)m(ws-Wheeler)i(tec)m(hnical)g(rep)s(ort)f -(\(SR)m(C-124\))i(sho)m(w)f(ho)m(w)g(rep)s(eats)g(can)g(b)s(e)330 -2267 y(handled)c(without)i(di\016cult)m(y)f(in)g(blo)s(c)m(k)h -(sorting.)225 2409 y Fi(\017)60 b Fl(The)30 b(randomisation)e(mec)m -(hanism)i(do)s(esn't)g(really)f(need)h(to)g(b)s(e)g(there.)41 -b(Udi)29 b(Man)m(b)s(er)h(and)f(Gene)330 2518 y(My)m(ers)j(published)c -(a)33 b(su\016x)e(arra)m(y)h(construction)f(algorithm)g(a)h(few)g(y)m -(ears)h(bac)m(k,)g(whic)m(h)d(can)j(b)s(e)330 2628 y(emplo)m(y)m(ed)27 -b(to)h(sort)g(an)m(y)f(blo)s(c)m(k,)h(no)f(matter)h(ho)m(w)f(rep)s -(etitiv)m(e,)h(in)d(O\(N)j(log)f(N\))h(time.)39 b(Subsequen)m(t)330 -2737 y(w)m(ork)25 
b(b)m(y)f(Kunihik)m(o)f(Sadak)-5 b(ane)24 -b(has)h(pro)s(duced)e(a)i(deriv)-5 b(ativ)m(e)24 b(O\(N)h(\(log)g(N\))p -Fj(^)p Fl(2\))h(algorithm)d(whic)m(h)330 2847 y(usually)28 -b(outp)s(erforms)h(the)i(Man)m(b)s(er-My)m(ers)g(algorithm.)330 -2988 y(I)g(could)g(ha)m(v)m(e)i(c)m(hanged)f(to)g(Sadak)-5 -b(ane's)32 b(algorithm,)f(but)g(I)g(\014nd)f(it)h(to)h(b)s(e)f(slo)m(w) -m(er)h(than)f Fj(bzip2)p Fl('s)330 3098 y(existing)38 -b(algorithm)g(for)h(most)h(inputs,)f(and)g(the)g(randomisation)f(mec)m -(hanism)g(protects)i(ade-)330 3208 y(quately)34 b(against)f(bad)g -(cases.)52 b(I)33 b(didn't)f(think)g(it)i(w)m(as)g(a)g(go)s(o)s(d)f -(tradeo\013)i(to)f(mak)m(e.)51 b(P)m(artly)34 b(this)330 -3317 y(is)39 b(due)h(to)h(the)f(fact)h(that)g(I)f(w)m(as)g(not)h(\015o) -s(o)s(ded)e(with)g(email)g(complain)m(ts)g(ab)s(out)h -Fj(bzip2-0.1)p Fl('s)330 3427 y(p)s(erformance)30 b(on)g(rep)s(etitiv)m -(e)g(data,)h(so)g(p)s(erhaps)d(it)i(isn't)g(a)h(problem)d(for)j(real)f -(inputs.)330 3568 y(Probably)i(the)h(b)s(est)g(long-term)g(solution,)g -(and)g(the)g(one)h(I)f(ha)m(v)m(e)h(incorp)s(orated)e(in)m(to)i(0.9.5)h -(and)330 3678 y(ab)s(o)m(v)m(e,)42 b(is)c(to)h(use)f(the)h(existing)f -(sorting)g(algorithm)f(initially)-8 b(,)38 b(and)g(fall)f(bac)m(k)i(to) -h(a)f(O\(N)f(\(log)330 3787 y(N\))p Fj(^)p Fl(2\))31 -b(algorithm)f(if)f(the)i(standard)e(algorithm)h(gets)h(in)m(to)f -(di\016culties.)225 3929 y Fi(\017)60 b Fl(The)31 b(compressed)f -(\014le)g(format)i(w)m(as)f(nev)m(er)h(designed)d(to)j(b)s(e)f(handled) -e(b)m(y)i(a)g(library)-8 b(,)29 b(and)i(I)g(ha)m(v)m(e)330 -4039 y(had)d(to)i(jump)e(though)g(some)i(ho)s(ops)e(to)i(pro)s(duce)e -(an)h(e\016cien)m(t)g(implemen)m(tation)f(of)h(decompres-)330 -4148 y(sion.)38 b(It's)26 b(a)h(bit)e(hairy)-8 b(.)38 -b(T)-8 b(ry)26 b(passing)f Fj(decompress.c)d Fl(through)k(the)g(C)f -(prepro)s(cessor)g(and)h(y)m(ou'll)330 4258 y(see)32 -b(what)g(I)f(mean.)45 b(Muc)m(h)32 b(of)g(this)e(complexit)m(y)i(could) 
-f(ha)m(v)m(e)i(b)s(een)e(a)m(v)m(oided)h(if)e(the)i(compressed)330 -4367 y(size)e(of)h(eac)m(h)g(blo)s(c)m(k)f(of)h(data)g(w)m(as)g -(recorded)f(in)f(the)h(data)h(stream.)225 4509 y Fi(\017)60 -b Fl(An)30 b(Adler-32)g(c)m(hec)m(ksum,)i(rather)e(than)g(a)h(CR)m(C32) -g(c)m(hec)m(ksum,)g(w)m(ould)e(b)s(e)h(faster)h(to)g(compute.)150 -4698 y(It)e(w)m(ould)f(b)s(e)g(fair)g(to)h(sa)m(y)h(that)g(the)f -Fj(bzip2)e Fl(format)i(w)m(as)h(frozen)f(b)s(efore)f(I)h(prop)s(erly)d -(and)j(fully)d(under-)150 4807 y(sto)s(o)s(d)k(the)h(p)s(erformance)e -(consequences)i(of)g(doing)e(so.)150 4964 y(Impro)m(v)m(emen)m(ts)d -(whic)m(h)e(I)i(w)m(as)g(able)f(to)h(incorp)s(orate)f(in)m(to)g(0.9.0,) -k(despite)24 b(using)g(the)i(same)g(\014le)e(format,)150 -5074 y(are:)225 5230 y Fi(\017)60 b Fl(Single)30 b(arra)m(y)i(implemen) -m(tation)e(of)h(the)h(in)m(v)m(erse)f(BWT.)h(This)e(signi\014can)m(tly) -f(sp)s(eeds)i(up)f(decom-)330 5340 y(pression,)f(presumably)f(b)s -(ecause)i(it)g(reduces)g(the)h(n)m(um)m(b)s(er)e(of)i(cac)m(he)h -(misses.)p eop -%%Page: 32 33 -32 32 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 -b(32)225 299 y Fi(\017)60 b Fl(F)-8 b(aster)27 b(in)m(v)m(erse)e(MTF)h -(transform)f(for)g(large)h(MTF)f(v)-5 b(alues.)39 b(The)25 -b(new)g(implemen)m(tation)f(is)g(based)330 408 y(on)30 -b(the)h(notion)f(of)g(sliding)e(blo)s(c)m(ks)h(of)i(v)-5 -b(alues.)225 544 y Fi(\017)60 b Fj(bzip2-0.9.0)24 b Fl(no)m(w)k(reads)f -(and)f(writes)h(\014les)f(with)g Fj(fread)g Fl(and)h -Fj(fwrite)p Fl(;)f(v)m(ersion)h(0.1)i(used)d Fj(putc)330 -653 y Fl(and)k Fj(getc)p Fl(.)39 b(Duh!)h(W)-8 b(ell,)31 -b(y)m(ou)f(liv)m(e)g(and)g(learn.)150 836 y(F)-8 b(urther)30 -b(ahead,)g(it)f(w)m(ould)g(b)s(e)g(nice)h(to)g(b)s(e)g(able)f(to)i(do)e -(random)g(access)j(in)m(to)d(\014les.)40 b(This)28 b(will)f(require)150 -945 y(some)k(careful)e(design)h(of)g(compressed)g(\014le)g(formats.)150 -1227 y Fk(4.2)68 b(P)l(ortabilit)l(y)47 b(issues)150 -1419 y Fl(After)36 
b(some)g(consideration,)g(I)f(ha)m(v)m(e)i(decided)d -(not)i(to)g(use)g(GNU)g Fj(autoconf)d Fl(to)j(con\014gure)g(0.9.5)h(or) -150 1529 y(1.0.)150 1686 y Fj(autoconf)p Fl(,)31 b(admirable)g(and)h(w) -m(onderful)f(though)i(it)f(is,)h(mainly)d(assists)j(with)e(p)s -(ortabilit)m(y)g(problems)150 1795 y(b)s(et)m(w)m(een)f(Unix-lik)m(e)d -(platforms.)40 b(But)29 b Fj(bzip2)f Fl(do)s(esn't)h(ha)m(v)m(e)h(m)m -(uc)m(h)f(in)f(the)h(w)m(a)m(y)h(of)g(p)s(ortabilit)m(y)d(prob-)150 -1905 y(lems)35 b(on)h(Unix;)j(most)d(of)g(the)h(di\016culties)d(app)s -(ear)h(when)g(p)s(orting)g(to)i(the)f(Mac,)j(or)d(to)h(Microsoft's)150 -2015 y(op)s(erating)26 b(systems.)40 b Fj(autoconf)25 -b Fl(do)s(esn't)h(help)g(in)f(those)j(cases,)h(and)d(brings)f(in)g(a)j -(whole)e(load)g(of)h(new)150 2124 y(complexit)m(y)-8 -b(.)150 2281 y(Most)28 b(p)s(eople)f(should)f(b)s(e)h(able)g(to)h -(compile)e(the)i(library)d(and)i(program)h(under)e(Unix)g(straigh)m(t)i -(out-of-)150 2391 y(the-b)s(o)m(x,)j(so)g(to)g(sp)s(eak,)f(esp)s -(ecially)f(if)g(y)m(ou)i(ha)m(v)m(e)g(a)g(v)m(ersion)f(of)g(GNU)h(C)f -(a)m(v)-5 b(ailable.)150 2547 y(There)32 b(are)h(a)g(couple)f(of)h -Fj(__inline__)d Fl(directiv)m(es)i(in)f(the)i(co)s(de.)48 -b(GNU)33 b(C)f(\()p Fj(gcc)p Fl(\))g(should)f(b)s(e)h(able)g(to)150 -2657 y(handle)24 b(them.)39 b(If)25 b(y)m(ou're)i(not)e(using)g(GNU)h -(C,)f(y)m(our)h(C)f(compiler)f(shouldn't)g(see)i(them)f(at)i(all.)38 -b(If)25 b(y)m(our)150 2767 y(compiler)k(do)s(es,)i(for)g(some)g -(reason,)h(see)f(them)g(and)f(do)s(esn't)h(lik)m(e)f(them,)i(just)e -Fj(#define)f(__inline__)150 2876 y Fl(to)37 b(b)s(e)f -Fj(/*)30 b(*/)p Fl(.)58 b(One)36 b(easy)h(w)m(a)m(y)g(to)h(do)e(this)f -(is)h(to)h(compile)e(with)g(the)i(\015ag)g Fj(-D__inline__=)p -Fl(,)d(whic)m(h)150 2986 y(should)28 b(b)s(e)i(understo)s(o)s(d)f(b)m -(y)h(most)h(Unix)e(compilers.)150 3143 y(If)35 b(y)m(ou)g(still)e(ha)m -(v)m(e)j(di\016culties,)e(try)h(compiling)e(with)g(the)j(macro)f -Fj(BZ_STRICT_ANSI)c Fl(de\014ned.)54 b(This)150 3252 
-y(should)28 b(enable)i(y)m(ou)h(to)g(build)d(the)i(library)e(in)h(a)i -(strictly)f(ANSI)g(complian)m(t)f(en)m(vironmen)m(t.)41 -b(Building)150 3362 y(the)25 b(program)f(itself)f(lik)m(e)g(this)h(is)f -(dangerous)h(and)g(not)g(supp)s(orted,)g(since)g(y)m(ou)h(remo)m(v)m(e) -g Fj(bzip2)p Fl('s)e(c)m(hec)m(ks)150 3471 y(against)30 -b(compressing)f(directories,)g(sym)m(b)s(olic)g(links,)f(devices,)i -(and)f(other)h(not-really-a-\014le)g(en)m(tities.)150 -3581 y(This)f(could)g(cause)i(\014lesystem)f(corruption!)150 -3738 y(One)e(other)i(thing:)39 b(if)27 b(y)m(ou)j(create)g(a)f -Fj(bzip2)f Fl(binary)f(for)i(public)d(distribution,)g(please)i(try)h -(and)g(link)d(it)150 3847 y(statically)g(\()p Fj(gcc)k(-s)p -Fl(\).)39 b(This)25 b(a)m(v)m(oids)i(all)f(sorts)h(of)g(library-v)m -(ersion)d(issues)h(that)i(others)g(ma)m(y)g(encoun)m(ter)150 -3957 y(later)j(on.)150 4114 y(If)f(y)m(ou)g(build)e Fj(bzip2)h -Fl(on)h(Win32,)h(y)m(ou)f(m)m(ust)g(set)h Fj(BZ_UNIX)e -Fl(to)i(0)f(and)g Fj(BZ_LCCWIN32)d Fl(to)k(1,)g(in)e(the)i(\014le)150 -4223 y Fj(bzip2.c)p Fl(,)f(b)s(efore)h(compiling.)38 -b(Otherwise)29 b(the)i(resulting)d(binary)h(w)m(on't)i(w)m(ork)f -(correctly)-8 b(.)150 4505 y Fk(4.3)68 b(Rep)t(orting)46 -b(bugs)150 4698 y Fl(I)25 b(tried)f(prett)m(y)i(hard)e(to)i(mak)m(e)g -(sure)f Fj(bzip2)e Fl(is)i(bug)f(free,)j(b)s(oth)d(b)m(y)h(design)f -(and)h(b)m(y)g(testing.)39 b(Hop)s(efully)150 4807 y(y)m(ou'll)29 -b(nev)m(er)i(need)f(to)h(read)g(this)e(section)h(for)h(real.)150 -4964 y(Nev)m(ertheless,)36 b(if)c Fj(bzip2)h Fl(dies)g(with)f(a)i -(segmen)m(tation)h(fault,)g(a)f(bus)f(error)g(or)h(an)g(in)m(ternal)e -(assertion)150 5074 y(failure,)i(it)h(will)d(ask)j(y)m(ou)g(to)g(email) -f(me)h(a)g(bug)f(rep)s(ort.)54 b(Exp)s(erience)33 b(with)h(v)m(ersion)g -(0.1)i(sho)m(ws)e(that)150 5183 y(almost)c(all)g(these)h(problems)d -(can)j(b)s(e)f(traced)h(to)g(either)f(compiler)e(bugs)i(or)g(hardw)m -(are)g(problems.)225 5340 y Fi(\017)60 b Fl(Recompile)22 
-b(the)h(program)g(with)f(no)h(optimisation,)g(and)f(see)i(if)e(it)g(w)m -(orks.)39 b(And/or)22 b(try)h(a)g(di\013eren)m(t)p eop -%%Page: 33 34 -33 33 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 -b(33)330 299 y(compiler.)77 b(I)43 b(heard)f(all)g(sorts)h(of)h -(stories)e(ab)s(out)h(v)-5 b(arious)42 b(\015a)m(v)m(ours)h(of)h(GNU)f -(C)g(\(and)g(other)330 408 y(compilers\))20 b(generating)i(bad)e(co)s -(de)i(for)f Fj(bzip2)p Fl(,)h(and)f(I'v)m(e)h(run)e(across)i(t)m(w)m(o) -g(suc)m(h)f(examples)g(m)m(yself.)330 606 y(2.7.X)35 -b(v)m(ersions)e(of)g(GNU)h(C)f(are)h(kno)m(wn)f(to)h(generate)h(bad)d -(co)s(de)i(from)f(time)g(to)h(time,)g(at)g(high)330 716 -y(optimisation)20 b(lev)m(els.)37 b(If)21 b(y)m(ou)g(get)i(problems,)e -(try)g(using)f(the)i(\015ags)f Fj(-O2)f(-fomit-frame-pointer)330 -825 y(-fno-strength-reduce)p Fl(.)35 b(Y)-8 b(ou)31 b(should)d(sp)s -(eci\014cally)h Fc(not)j Fl(use)e Fj(-funroll-loops)p -Fl(.)330 1023 y(Y)-8 b(ou)38 b(ma)m(y)g(notice)g(that)g(the)g(Mak)m -(e\014le)g(runs)e(six)g(tests)i(as)g(part)f(of)h(the)g(build)c(pro)s -(cess.)62 b(If)37 b(the)330 1132 y(program)43 b(passes)g(all)f(of)h -(these,)k(it's)c(a)h(prett)m(y)f(go)s(o)s(d)g(\(but)g(not)g(100\045\))i -(indication)c(that)j(the)330 1242 y(compiler)29 b(has)h(done)g(its)g -(job)g(correctly)-8 b(.)225 1440 y Fi(\017)60 b Fl(If)33 -b Fj(bzip2)f Fl(crashes)i(randomly)-8 b(,)33 b(and)g(the)h(crashes)g -(are)g(not)g(rep)s(eatable,)g(y)m(ou)g(ma)m(y)g(ha)m(v)m(e)h(a)f -(\015aky)330 1549 y(memory)k(subsystem.)64 b Fj(bzip2)37 -b Fl(really)g(hammers)h(y)m(our)g(memory)g(hierarc)m(h)m(y)-8 -b(,)41 b(and)d(if)f(it's)h(a)h(bit)330 1659 y(marginal,)33 -b(y)m(ou)h(ma)m(y)g(get)h(these)f(problems.)49 b(Ditto)34 -b(if)f(y)m(our)h(disk)e(or)h(I/O)h(subsystem)e(is)h(slo)m(wly)330 -1768 y(failing.)39 b(Y)-8 b(up,)30 b(this)f(really)g(do)s(es)h(happ)s -(en.)330 1966 y(T)-8 b(ry)28 b(using)f(a)i(di\013eren)m(t)f(mac)m(hine) 
-g(of)h(the)g(same)f(t)m(yp)s(e,)i(and)e(see)h(if)e(y)m(ou)i(can)g(rep)s -(eat)g(the)f(problem.)225 2163 y Fi(\017)60 b Fl(This)21 -b(isn't)i(really)f(a)h(bug,)i(but)d(...)39 b(If)23 b -Fj(bzip2)f Fl(tells)g(y)m(ou)h(y)m(our)h(\014le)e(is)g(corrupted)h(on)g -(decompression,)330 2273 y(and)29 b(y)m(ou)g(obtained)f(the)i(\014le)e -(via)h(FTP)-8 b(,)29 b(there)h(is)e(a)h(p)s(ossibilit)m(y)d(that)k(y)m -(ou)f(forgot)h(to)g(tell)e(FTP)h(to)330 2383 y(do)23 -b(a)g(binary)e(mo)s(de)i(transfer.)38 b(That)23 b(absolutely)f(will)e -(cause)j(the)h(\014le)e(to)h(b)s(e)g(non-decompressible.)330 -2492 y(Y)-8 b(ou'll)30 b(ha)m(v)m(e)h(to)g(transfer)f(it)g(again.)150 -2737 y(If)i(y)m(ou'v)m(e)h(incorp)s(orated)e Fj(libbzip2)f -Fl(in)m(to)i(y)m(our)g(o)m(wn)g(program)g(and)g(are)g(getting)h -(problems,)e(please,)150 2847 y(please,)d(please,)h(c)m(hec)m(k)g(that) -f(the)g(parameters)g(y)m(ou)g(are)g(passing)f(in)f(calls)h(to)h(the)g -(library)-8 b(,)26 b(are)j(correct,)150 2956 y(and)e(in)f(accordance)k -(with)c(what)i(the)g(do)s(cumen)m(tation)f(sa)m(ys)h(is)f(allo)m(w)m -(able.)39 b(I)28 b(ha)m(v)m(e)h(tried)e(to)h(mak)m(e)h(the)150 -3066 y(library)f(robust)i(against)g(suc)m(h)g(problems,)f(but)h(I'm)g -(sure)g(I)g(ha)m(v)m(en't)h(succeeded.)150 3223 y(Finally)-8 -b(,)32 b(if)g(the)h(ab)s(o)m(v)m(e)i(commen)m(ts)e(don't)g(help,)g(y)m -(ou'll)f(ha)m(v)m(e)i(to)g(send)e(me)h(a)g(bug)g(rep)s(ort.)48 -b(No)m(w,)34 b(it's)150 3332 y(just)c(amazing)g(ho)m(w)h(man)m(y)f(p)s -(eople)g(will)d(send)j(me)g(a)h(bug)f(rep)s(ort)g(sa)m(ying)g -(something)g(lik)m(e)481 3483 y(bzip2)f(crashed)h(with)f(segmen)m -(tation)j(fault)e(on)g(m)m(y)g(mac)m(hine)150 3640 y(and)h(absolutely)f -(nothing)h(else.)44 b(Needless)32 b(to)g(sa)m(y)-8 b(,)33 -b(a)f(suc)m(h)f(a)h(rep)s(ort)f(is)g Fc(totally)-8 b(,)32 -b(utterly)-8 b(,)32 b(completely)150 3750 y(and)40 b(comprehensiv)m -(ely)g(100\045)h(useless;)46 b(a)41 b(w)m(aste)g(of)g(y)m(our)g(time,)i -(m)m(y)e(time,)i(and)e(net)g(bandwidth)p Fl(.)150 3859 
-y(With)31 b(no)h(details)f(at)i(all,)e(there's)h(no)g(w)m(a)m(y)h(I)f -(can)g(p)s(ossibly)d(b)s(egin)h(to)j(\014gure)e(out)i(what)e(the)i -(problem)150 3969 y(is.)150 4126 y(The)d(rules)e(of)i(the)g(game)h -(are:)41 b(facts,)32 b(facts,)f(facts.)41 b(Don't)31 -b(omit)f(them)g(b)s(ecause)g Fj(")p Fl(oh,)g(they)g(w)m(on't)h(b)s(e) -150 4235 y(relev)-5 b(an)m(t)p Fj(")p Fl(.)41 b(A)m(t)31 -b(the)g(bare)f(minim)m(um:)481 4386 y(Mac)m(hine)h(t)m(yp)s(e.)61 -b(Op)s(erating)29 b(system)h(v)m(ersion.)481 4490 y(Exact)h(v)m(ersion) -f(of)h Fj(bzip2)e Fl(\(do)h Fj(bzip2)47 b(-V)p Fl(\).)481 -4594 y(Exact)31 b(v)m(ersion)f(of)h(the)f(compiler)f(used.)481 -4698 y(Flags)i(passed)e(to)j(the)e(compiler.)150 4854 -y(Ho)m(w)m(ev)m(er,)i(the)d(most)h(imp)s(ortan)m(t)f(single)f(thing)g -(that)i(will)d(help)h(me)h(is)f(the)i(\014le)e(that)i(y)m(ou)g(w)m(ere) -g(trying)150 4964 y(to)f(compress)f(or)g(decompress)g(at)h(the)f(time)g -(the)g(problem)f(happ)s(ened.)38 b(Without)28 b(that,)h(m)m(y)g(abilit) -m(y)d(to)150 5074 y(do)k(an)m(ything)g(more)h(than)f(sp)s(eculate)g(ab) -s(out)g(the)g(cause,)i(is)d(limited.)150 5230 y(Please)34 -b(remem)m(b)s(er)f(that)h(I)f(connect)i(to)f(the)g(In)m(ternet)g(with)e -(a)i(mo)s(dem,)g(so)f(y)m(ou)h(should)e(con)m(tact)k(me)150 -5340 y(b)s(efore)30 b(mailing)e(me)j(h)m(uge)f(\014les.)p -eop -%%Page: 34 35 -34 34 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 -b(34)150 299 y Fk(4.4)68 b(Did)45 b(y)l(ou)g(get)h(the)f(righ)l(t)h -(pac)l(k)-7 b(age?)150 491 y Fj(bzip2)34 b Fl(is)h(a)h(resource)g(hog.) 
-56 b(It)36 b(soaks)g(up)f(large)g(amoun)m(ts)h(of)g(CPU)f(cycles)h(and) -f(memory)-8 b(.)57 b(Also,)36 b(it)150 601 y(giv)m(es)26 -b(v)m(ery)h(large)f(latencies.)39 b(In)25 b(the)h(w)m(orst)g(case,)i(y) -m(ou)f(can)f(feed)g(man)m(y)g(megab)m(ytes)h(of)f(uncompressed)150 -711 y(data)45 b(in)m(to)e(the)i(library)c(b)s(efore)j(getting)g(an)m(y) -g(compressed)g(output,)j(so)d(this)f(probably)f(rules)h(out)150 -820 y(applications)29 b(requiring)e(in)m(teractiv)m(e)32 -b(b)s(eha)m(viour.)150 977 y(These)38 b(aren't)h(faults)e(of)h(m)m(y)g -(implemen)m(tation,)h(I)f(hop)s(e,)i(but)d(more)h(an)g(in)m(trinsic)e -(prop)s(ert)m(y)h(of)i(the)150 1087 y(Burro)m(ws-Wheeler)30 -b(transform)g(\(unfortunately\).)40 b(Ma)m(yb)s(e)31 -b(this)e(isn't)h(what)g(y)m(ou)h(w)m(an)m(t.)150 1244 -y(If)h(y)m(ou)h(w)m(an)m(t)g(a)g(compressor)g(and/or)f(library)e(whic)m -(h)h(is)h(faster,)i(uses)e(less)g(memory)g(but)g(gets)h(prett)m(y)150 -1353 y(go)s(o)s(d)e(compression,)g(and)g(has)h(minimal)c(latency)-8 -b(,)33 b(consider)e(Jean-loup)f(Gailly's)g(and)h(Mark)h(Adler's)150 -1463 y(w)m(ork,)f Fj(zlib-1.1.2)c Fl(and)j Fj(gzip-1.2.4)p -Fl(.)38 b(Lo)s(ok)31 b(for)f(them)g(at)150 1620 y Fj -(http://www.cdrom.com/pub)o(/inf)o(ozip)o(/zl)o(ib)24 -b Fl(and)30 b Fj(http://www.gzip.org)25 b Fl(resp)s(ectiv)m(ely)-8 -b(.)150 1776 y(F)g(or)32 b(something)f(faster)i(and)e(ligh)m(ter)f -(still,)h(y)m(ou)g(migh)m(t)h(try)f(Markus)h(F)g(X)f(J)h(Ob)s(erh)m -(umer's)d Fj(LZO)i Fl(real-)150 1886 y(time)f -(compression/decompression)f(library)-8 b(,)28 b(at)150 -1996 y Fj(http://wildsau.idv.uni-l)o(inz.)o(ac.a)o(t/m)o(fx/l)o(zo.h)o -(tml)o Fl(.)150 2152 y(If)38 b(y)m(ou)h(w)m(an)m(t)g(to)h(use)e(the)g -Fj(bzip2)g Fl(algorithms)f(to)i(compress)f(small)g(blo)s(c)m(ks)f(of)i -(data,)j(64k)d(b)m(ytes)g(or)150 2262 y(smaller,)i(for)e(example)g(on)h -(an)f(on-the-\015y)h(disk)e(compressor,)k(y)m(ou'd)e(b)s(e)f(w)m(ell)g -(advised)f(not)i(to)g(use)150 2372 y(this)i(library)-8 -b(.)77 b(Instead,)47 
b(I'v)m(e)d(made)f(a)h(sp)s(ecial)e(library)f -(tuned)h(for)h(that)h(kind)d(of)j(use.)79 b(It's)43 b(part)150 -2481 y(of)d Fj(e2compr-0.40)p Fl(,)f(an)g(on-the-\015y)h(disk)e -(compressor)h(for)h(the)f(Lin)m(ux)f Fj(ext2)h Fl(\014lesystem.)67 -b(Lo)s(ok)40 b(at)150 2591 y Fj(http://www.netspace.net.)o(au/~)o(reit) -o(er/)o(e2co)o(mpr)p Fl(.)150 2880 y Fk(4.5)68 b(T)-11 -b(esting)150 3072 y Fl(A)30 b(record)h(of)f(the)h(tests)g(I'v)m(e)g -(done.)150 3229 y(First,)f(some)h(data)g(sets:)225 3386 -y Fi(\017)60 b Fl(B:)32 b(a)f(directory)f(con)m(taining)h(6001)i -(\014les,)d(one)h(for)g(ev)m(ery)h(length)e(in)g(the)h(range)g(0)h(to)f -(6000)i(b)m(ytes.)330 3496 y(The)d(\014les)f(con)m(tain)i(random)e(lo)m -(w)m(ercase)j(letters.)41 b(18.7)32 b(megab)m(ytes.)225 -3633 y Fi(\017)60 b Fl(H:)36 b(m)m(y)f(home)h(directory)f(tree.)56 -b(Do)s(cumen)m(ts,)38 b(source)d(co)s(de,)i(mail)d(\014les,)i -(compressed)f(data.)57 b(H)330 3743 y(con)m(tains)39 -b(B,)h(and)f(also)g(a)g(directory)g(of)g(\014les)f(designed)g(as)i(b)s -(oundary)d(cases)j(for)f(the)g(sorting;)330 3853 y(mostly)30 -b(v)m(ery)h(rep)s(etitiv)m(e,)f(nast)m(y)h(\014les.)39 -b(565)32 b(megab)m(ytes.)225 3990 y Fi(\017)60 b Fl(A:)43 -b(directory)f(tree)i(holding)d(v)-5 b(arious)41 b(applications)g(built) -g(from)h(source:)66 b Fj(egcs)p Fl(,)45 b Fj(gcc-2.8.1)p -Fl(,)330 4100 y(KDE,)31 b(GTK,)f(Octa)m(v)m(e,)j(etc.)41 -b(2200)33 b(megab)m(ytes.)150 4285 y(The)i(tests)g(conducted)g(are)h -(as)f(follo)m(ws.)54 b(Eac)m(h)36 b(test)g(means)f(compressing)f(\(a)h -(cop)m(y)h(of)7 b(\))36 b(eac)m(h)g(\014le)e(in)150 4394 -y(the)d(data)g(set,)g(decompressing)e(it)h(and)g(comparing)f(it)h -(against)h(the)g(original.)150 4551 y(First,)26 b(a)g(bunc)m(h)f(of)h -(tests)h(with)d(blo)s(c)m(k)h(sizes)h(and)f(in)m(ternal)g(bu\013er)f -(sizes)i(set)g(v)m(ery)g(small,)g(to)g(detect)i(an)m(y)150 -4661 y(problems)g(with)g(the)i(blo)s(c)m(king)f(and)g(bu\013ering)e -(mec)m(hanisms.)40 b(This)28 b(required)g(mo)s(difying)f(the)j(source) 
-150 4770 y(co)s(de)h(so)f(as)h(to)g(try)f(to)h(break)g(it.)199 -4927 y(1.)61 b(Data)32 b(set)f(H,)g(with)e(bu\013er)g(size)h(of)h(1)g -(b)m(yte,)g(and)f(blo)s(c)m(k)g(size)g(of)g(23)i(b)m(ytes.)199 -5065 y(2.)61 b(Data)32 b(set)f(B,)g(bu\013er)e(sizes)h(1)h(b)m(yte,)g -(blo)s(c)m(k)f(size)g(1)h(b)m(yte.)199 5202 y(3.)61 b(As)30 -b(\(2\))i(but)d(small-mo)s(de)g(decompression.)199 5340 -y(4.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h(size)g(2)h(b)m(ytes.)p -eop -%%Page: 35 36 -35 35 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 -b(35)199 299 y(5.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h(size)g(3)h -(b)m(ytes.)199 431 y(6.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h(size) -g(4)h(b)m(ytes.)199 564 y(7.)61 b(As)30 b(\(2\))i(with)d(blo)s(c)m(k)h -(size)g(5)h(b)m(ytes.)199 697 y(8.)61 b(As)30 b(\(2\))i(with)d(blo)s(c) -m(k)h(size)g(6)h(b)m(ytes)g(and)e(small-mo)s(de)g(decompression.)199 -829 y(9.)61 b(H)30 b(with)g(bu\013er)f(size)h(of)h(1)f(b)m(yte,)i(but)d -(normal)h(blo)s(c)m(k)g(size)g(\(up)f(to)j(900000)h(b)m(ytes\).)150 -1009 y(Then)c(some)i(tests)g(with)e(unmo)s(di\014ed)f(source)i(co)s -(de.)199 1166 y(1.)61 b(H,)31 b(all)e(settings)h(normal.)199 -1299 y(2.)61 b(As)30 b(\(1\),)i(with)d(small-mo)s(de)g(decompress.)199 -1431 y(3.)61 b(H,)31 b(compress)f(with)f(\015ag)i Fj(-1)p -Fl(.)199 1564 y(4.)61 b(H,)31 b(compress)f(with)f(\015ag)i -Fj(-s)p Fl(,)f(decompress)g(with)f(\015ag)i Fj(-s)p Fl(.)199 -1697 y(5.)61 b(F)-8 b(orw)m(ards)33 b(compatibilit)m(y:)45 -b(H,)33 b Fj(bzip2-0.1pl2)d Fl(compressing,)j Fj(bzip2-0.9.5)d -Fl(decompressing,)330 1806 y(all)f(settings)i(normal.)199 -1939 y(6.)61 b(Bac)m(kw)m(ards)23 b(compatibilit)m(y:)35 -b(H,)23 b Fj(bzip2-0.9.5)c Fl(compressing,)k Fj(bzip2-0.1pl2)c -Fl(decompressing,)330 2048 y(all)29 b(settings)i(normal.)199 -2181 y(7.)61 b(Bigger)31 b(tests:)41 b(A,)31 b(all)e(settings)i -(normal.)199 2314 y(8.)61 b(As)30 b(\(7\),)i(using)d(the)i(fallbac)m(k) -e(\(Sadak)-5 b(ane-lik)m(e\))31 b(sorting)f(algorithm.)199 -2446 
y(9.)61 b(As)30 b(\(8\),)i(compress)e(with)f(\015ag)i -Fj(-1)p Fl(,)f(decompress)g(with)f(\015ag)i Fj(-s)p Fl(.)154 -2579 y(10.)61 b(H,)31 b(using)e(the)h(fallbac)m(k)g(sorting)g -(algorithm.)154 2711 y(11.)61 b(F)-8 b(orw)m(ards)33 -b(compatibilit)m(y:)45 b(A,)33 b Fj(bzip2-0.1pl2)d Fl(compressing,)j -Fj(bzip2-0.9.5)d Fl(decompressing,)330 2821 y(all)f(settings)i(normal.) -154 2954 y(12.)61 b(Bac)m(kw)m(ards)23 b(compatibilit)m(y:)35 -b(A,)23 b Fj(bzip2-0.9.5)c Fl(compressing,)k Fj(bzip2-0.1pl2)c -Fl(decompressing,)330 3063 y(all)29 b(settings)i(normal.)154 -3196 y(13.)61 b(Misc)39 b(test:)58 b(ab)s(out)39 b(400)h(megab)m(ytes)h -(of)e Fj(.tar)f Fl(\014les)f(with)h Fj(bzip2)f Fl(compiled)h(with)f -(Chec)m(k)m(er)j(\(a)330 3305 y(memory)30 b(access)i(error)e(detector,) -i(lik)m(e)e(Purify\).)154 3438 y(14.)61 b(Misc)30 b(tests)h(to)g(mak)m -(e)h(sure)d(it)h(builds)e(and)h(runs)g(ok)i(on)f(non-Lin)m(ux/x86)g -(platforms.)150 3618 y(These)35 b(tests)h(w)m(ere)f(conducted)g(on)g(a) -h(225)g(MHz)g(IDT)f(WinChip)d(mac)m(hine,)k(running)d(Lin)m(ux)g -(2.0.36.)150 3728 y(They)d(represen)m(t)g(nearly)g(a)h(w)m(eek)g(of)f -(con)m(tin)m(uous)g(computation.)41 b(All)29 b(tests)i(completed)f -(successfully)-8 b(.)150 4003 y Fk(4.6)68 b(F)-11 b(urther)44 -b(reading)150 4196 y Fj(bzip2)28 b Fl(is)h(not)h(researc)m(h)g(w)m -(ork,)g(in)e(the)i(sense)g(that)g(it)f(do)s(esn't)g(presen)m(t)h(an)m -(y)g(new)f(ideas.)40 b(Rather,)30 b(it's)150 4306 y(an)g(engineering)f -(exercise)i(based)f(on)g(existing)g(ideas.)150 4463 y(F)-8 -b(our)31 b(do)s(cumen)m(ts)f(describ)s(e)e(essen)m(tially)i(all)f(the)i -(ideas)e(b)s(ehind)f Fj(bzip2)p Fl(:)390 4614 y Fj(Michael)46 -b(Burrows)g(and)h(D.)g(J.)g(Wheeler:)485 4717 y("A)h(block-sorting)c -(lossless)h(data)i(compression)e(algorithm")533 4821 -y(10th)i(May)g(1994.)533 4925 y(Digital)f(SRC)h(Research)e(Report)i -(124.)533 5029 y(ftp://ftp.digital.com/pub)o(/DEC)o(/SR)o(C/re)o(sear)o -(ch-)o(repo)o(rts/)o(SRC)o(-124)o(.ps.)o(gz)533 5132 
-y(If)g(you)g(have)g(trouble)f(finding)g(it,)g(try)h(searching)f(at)h -(the)533 5236 y(New)g(Zealand)f(Digital)g(Library,)f -(http://www.nzdl.org.)p eop -%%Page: 36 37 -36 36 bop 150 -116 a Fl(Chapter)30 b(4:)41 b(Miscellanea)2586 -b(36)390 299 y Fj(Daniel)46 b(S.)h(Hirschberg)e(and)i(Debra)g(A.)g -(LeLewer)485 403 y("Efficient)e(Decoding)h(of)h(Prefix)f(Codes")533 -506 y(Communications)e(of)j(the)g(ACM,)g(April)f(1990,)h(Vol)f(33,)h -(Number)f(4.)533 610 y(You)h(might)f(be)i(able)e(to)h(get)g(an)h -(electronic)d(copy)h(of)h(this)676 714 y(from)g(the)g(ACM)g(Digital)f -(Library.)390 922 y(David)g(J.)i(Wheeler)533 1025 y(Program)e(bred3.c)g -(and)h(accompanying)d(document)i(bred3.ps.)533 1129 y(This)h(contains)e -(the)i(idea)g(behind)f(the)h(multi-table)e(Huffman)533 -1233 y(coding)h(scheme.)533 1337 y(ftp://ftp.cl.cam.ac.uk/us)o(ers/)o -(djw)o(3/)390 1544 y(Jon)h(L.)g(Bentley)f(and)h(Robert)f(Sedgewick)485 -1648 y("Fast)h(Algorithms)e(for)i(Sorting)f(and)g(Searching)g(Strings") -533 1752 y(Available)f(from)i(Sedgewick's)e(web)i(page,)533 -1856 y(www.cs.princeton.edu/~rs)150 2012 y Fl(The)29 -b(follo)m(wing)f(pap)s(er)g(giv)m(es)h(v)-5 b(aluable)28 -b(additional)g(insigh)m(ts)f(in)m(to)j(the)f(algorithm,)g(but)g(is)f -(not)i(imme-)150 2122 y(diately)g(the)g(basis)f(of)i(an)m(y)g(co)s(de)f -(used)g(in)f(bzip2.)390 2273 y Fj(Peter)46 b(Fenwick:)533 -2377 y(Block)h(Sorting)e(Text)i(Compression)533 2481 -y(Proceedings)e(of)i(the)g(19th)g(Australasian)d(Computer)i(Science)f -(Conference,)629 2584 y(Melbourne,)g(Australia.)92 b(Jan)47 -b(31)g(-)h(Feb)f(2,)g(1996.)533 2688 y(ftp://ftp.cs.auckland.ac.)o -(nz/p)o(ub/)o(pete)o(r-f/)o(ACS)o(C96p)o(aper)o(.ps)150 -2845 y Fl(Kunihik)m(o)28 b(Sadak)-5 b(ane's)31 b(sorting)e(algorithm,)h -(men)m(tioned)g(ab)s(o)m(v)m(e,)i(is)d(a)m(v)-5 b(ailable)30 -b(from:)390 2996 y Fj(http://naomi.is.s.u-toky)o(o.ac)o(.jp/)o(~sa)o -(da/p)o(aper)o(s/S)o(ada9)o(8b.p)o(s.g)o(z)150 3153 y -Fl(The)41 
b(Man)m(b)s(er-My)m(ers)g(su\016x)g(arra)m(y)g(construction)g -(algorithm)f(is)g(describ)s(ed)f(in)h(a)i(pap)s(er)e(a)m(v)-5 -b(ailable)150 3262 y(from:)390 3413 y Fj(http://www.cs.arizona.ed)o -(u/pe)o(ople)o(/ge)o(ne/P)o(APER)o(S/s)o(uffi)o(x.ps)150 -3570 y Fl(Finally)d(,)33 b(the)h(follo)m(wing)e(pap)s(er)h(do)s(cumen)m -(ts)g(some)h(recen)m(t)h(in)m(v)m(estigations)e(I)h(made)f(in)m(to)h -(the)g(p)s(erfor-)150 3680 y(mance)d(of)f(sorting)g(algorithms:)390 -3831 y Fj(Julian)46 b(Seward:)533 3935 y(On)h(the)g(Performance)e(of)i -(BWT)g(Sorting)f(Algorithms)533 4038 y(Proceedings)f(of)i(the)g(IEEE)g -(Data)f(Compression)f(Conference)g(2000)629 4142 y(Snowbird,)g(Utah.)94 -b(28-30)46 b(March)h(2000.)p eop -%%Page: -1 38 --1 37 bop 3725 -116 a Fl(i)150 299 y Fh(T)-13 b(able)54 -b(of)g(Con)l(ten)l(ts)150 641 y Fk(1)135 b(In)l(tro)t(duction)15 -b Fb(.)20 b(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f -(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)60 -b Fk(2)150 911 y(2)135 b(Ho)l(w)45 b(to)h(use)f Fd(bzip2)31 -b Fb(.)19 b(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g -(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)78 b Fk(3)1047 -1048 y Fl(NAME)20 b Fa(.)c(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)50 b Fl(3)1047 -1157 y(SYNOPSIS)21 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)h(.)f(.)g(.)50 b Fl(3)1047 1267 y(DESCRIPTION)10 -b Fa(.)j(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)40 -b Fl(3)1047 1377 y(OPTIONS)16 b Fa(.)d(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
-g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)45 b Fl(4)1047 -1486 y(MEMOR)-8 b(Y)31 b(MANA)m(GEMENT)14 b Fa(.)j(.)e(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)44 -b Fl(6)1047 1596 y(RECO)m(VERING)30 b(D)m(A)-8 b(T)g(A)32 -b(FR)m(OM)f(D)m(AMA)m(GED)i(FILES)1256 1705 y Fa(.)15 -b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)57 b Fl(7)1047 1815 y(PERF)m(ORMANCE)30 -b(NOTES)9 b Fa(.)14 b(.)h(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)38 b Fl(7)1047 1924 -y(CA)-10 b(VEA)i(TS)10 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)40 b Fl(8)1047 2034 -y(A)m(UTHOR)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)53 b Fl(8)150 2276 y Fk(3)135 -b(Programming)46 b(with)f Fd(libbzip2)29 b Fb(.)16 b(.)j(.)h(.)f(.)h(.) -f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)74 b Fk(9)449 -2413 y Fl(3.1)92 b(T)-8 b(op-lev)m(el)30 b(structure)24 -b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h -(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)54 b Fl(9)748 2523 y(3.1.1)93 b(Lo)m(w-lev)m(el)30 -b(summary)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)53 -b Fl(9)748 2633 y(3.1.2)93 b(High-lev)m(el)29 b(summary)12 -b Fa(.)i(.)h(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)41 b -Fl(9)748 2742 y(3.1.3)93 b(Utilit)m(y)29 b(functions)g(summary)12 -b Fa(.)h(.)j(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
-g(.)g(.)g(.)g(.)g(.)41 b Fl(10)449 2852 y(3.2)92 b(Error)29 -b(handling)18 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)48 b Fl(10)449 -2961 y(3.3)92 b(Lo)m(w-lev)m(el)31 b(in)m(terface)d Fa(.)15 -b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)57 b Fl(12)748 3071 y(3.3.1)93 b Fj(BZ2_bzCompressInit)21 -b Fa(.)9 b(.)15 b(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)50 b Fl(12)748 -3181 y(3.3.2)93 b Fj(BZ2_bzCompress)9 b Fa(.)h(.)15 b(.)g(.)g(.)g(.)g -(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)38 b Fl(14)748 3290 y(3.3.3)93 -b Fj(BZ2_bzCompressEnd)23 b Fa(.)10 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g -(.)52 b Fl(17)748 3400 y(3.3.4)93 b Fj(BZ2_bzDecompressInit)16 -b Fa(.)9 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h -(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)45 b Fl(17)748 3509 -y(3.3.5)93 b Fj(BZ2_bzDecompress)21 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)55 b Fl(17)748 3619 y(3.3.6)93 b Fj(BZ2_bzDecompressEnd)18 -b Fa(.)10 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)48 b Fl(19)449 -3729 y(3.4)92 b(High-lev)m(el)30 b(in)m(terface)16 b -Fa(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h -(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
-g(.)g(.)45 b Fl(19)748 3838 y(3.4.1)93 b Fj(BZ2_bzReadOpen)9 -b Fa(.)h(.)15 b(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)38 -b Fl(19)748 3948 y(3.4.2)93 b Fj(BZ2_bzRead)18 b Fa(.)12 -b(.)j(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)47 -b Fl(20)748 4057 y(3.4.3)93 b Fj(BZ2_bzReadGetUnused)18 -b Fa(.)10 b(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)48 b Fl(22)748 -4167 y(3.4.4)93 b Fj(BZ2_bzReadClose)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)57 b Fl(22)748 4276 y(3.4.5)93 b -Fj(BZ2_bzWriteOpen)23 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)57 b Fl(22)748 4386 y(3.4.6)93 b Fj(BZ2_bzWrite)16 -b Fa(.)11 b(.)k(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -45 b Fl(23)748 4496 y(3.4.7)93 b Fj(BZ2_bzWriteClose)21 -b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)55 b Fl(23)748 -4605 y(3.4.8)93 b(Handling)28 b(em)m(b)s(edded)h(compressed)h(data)h -(streams)17 b Fa(.)f(.)f(.)g(.)46 b Fl(24)748 4715 y(3.4.9)93 -b(Standard)29 b(\014le-reading/writing)e(co)s(de)22 b -Fa(.)16 b(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)52 -b Fl(25)449 4824 y(3.5)92 b(Utilit)m(y)29 b(functions)f -Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
-g(.)g(.)g(.)g(.)g(.)59 b Fl(26)748 4934 y(3.5.1)93 b -Fj(BZ2_bzBuffToBuffCompres)o(s)22 b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)57 b Fl(26)748 -5044 y(3.5.2)93 b Fj(BZ2_bzBuffToBuffDecompr)o(ess)17 -b Fa(.)e(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -53 b Fl(27)449 5153 y(3.6)92 b Fj(zlib)29 b Fl(compatibilit)m(y)g -(functions)23 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h -(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)52 -b Fl(28)449 5263 y(3.7)92 b(Using)30 b(the)g(library)e(in)h(a)i -Fj(stdio)p Fl(-free)e(en)m(vironmen)m(t)23 b Fa(.)15 -b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)52 b Fl(29)p -eop -%%Page: -2 39 --2 38 bop 3699 -116 a Fl(ii)748 83 y(3.7.1)93 b(Getting)31 -b(rid)d(of)j Fj(stdio)20 b Fa(.)13 b(.)i(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)49 -b Fl(29)748 193 y(3.7.2)93 b(Critical)28 b(error)i(handling)22 -b Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)54 b Fl(29)449 302 -y(3.8)92 b(Making)30 b(a)h(Windo)m(ws)e(DLL)15 b Fa(.)h(.)f(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)44 b Fl(30)150 545 -y Fk(4)135 b(Miscellanea)11 b Fb(.)21 b(.)f(.)f(.)h(.)f(.)g(.)h(.)f(.)h -(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.)g(.)h(.)f(.)h(.)f(.)h(.)f(.) -h(.)f(.)g(.)h(.)56 b Fk(31)449 682 y Fl(4.1)92 b(Limitations)29 -b(of)h(the)h(compressed)f(\014le)f(format)9 b Fa(.)15 -b(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)38 -b Fl(31)449 791 y(4.2)92 b(P)m(ortabilit)m(y)30 b(issues)14 -b Fa(.)f(.)j(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) 
-g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)h(.)f(.)g(.)43 b Fl(32)449 901 y(4.3)92 b(Rep)s(orting)29 -b(bugs)f Fa(.)15 b(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)58 b Fl(32)449 1010 y(4.4)92 -b(Did)29 b(y)m(ou)i(get)h(the)e(righ)m(t)g(pac)m(k)-5 -b(age?)22 b Fa(.)17 b(.)e(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h -(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)51 -b Fl(34)449 1120 y(4.5)92 b(T)-8 b(esting)16 b Fa(.)f(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)45 b Fl(34)449 1230 y(4.6)92 -b(F)-8 b(urther)30 b(reading)22 b Fa(.)14 b(.)h(.)g(.)h(.)f(.)g(.)g(.)g -(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.) -g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)g(.)h(.)f(.)51 -b Fl(35)p eop -%%Trailer -end -userdict /end-hook known{end-hook}if -%%EOF diff -Nru bzip2-1.0.1/manual.texi bzip2-1.0.1.new/manual.texi --- bzip2-1.0.1/manual.texi Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/manual.texi Thu Jan 1 01:00:00 1970 @@ -1,2215 +0,0 @@ -\input texinfo @c -*- Texinfo -*- -@setfilename bzip2.info - -@ignore -This file documents bzip2 version 1.0, and associated library -libbzip2, written by Julian Seward (jseward@acm.org). - -Copyright (C) 1996-2000 Julian R Seward - -Permission is granted to make and distribute verbatim copies of -this manual provided the copyright notice and this permission notice -are preserved on all copies. - -Permission is granted to copy and distribute translations of this manual -into another language, under the above conditions for verbatim copies. -@end ignore - -@ifinfo -@format -START-INFO-DIR-ENTRY -* Bzip2: (bzip2). A program and library for data compression. 
-END-INFO-DIR-ENTRY -@end format - -@end ifinfo - -@iftex -@c @finalout -@settitle bzip2 and libbzip2 -@titlepage -@title bzip2 and libbzip2 -@subtitle a program and library for data compression -@subtitle copyright (C) 1996-2000 Julian Seward -@subtitle version 1.0 of 21 March 2000 -@author Julian Seward - -@end titlepage - -@parindent 0mm -@parskip 2mm - -@end iftex -@node Top, Overview, (dir), (dir) - -This program, @code{bzip2}, -and associated library @code{libbzip2}, are -Copyright (C) 1996-2000 Julian R Seward. All rights reserved. - -Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: -@itemize @bullet -@item - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. -@item - The origin of this software must not be misrepresented; you must - not claim that you wrote the original software. If you use this - software in a product, an acknowledgment in the product - documentation would be appreciated but is not required. -@item - Altered source versions must be plainly marked as such, and must - not be misrepresented as being the original software. -@item - The name of the author may not be used to endorse or promote - products derived from this software without specific prior written - permission. -@end itemize -THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS -OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY -DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE -GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, -WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING -NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS -SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - -Julian Seward, Cambridge, UK. - -@code{jseward@@acm.org} - -@code{http://sourceware.cygnus.com/bzip2} - -@code{http://www.cacheprof.org} - -@code{http://www.muraroa.demon.co.uk} - -@code{bzip2}/@code{libbzip2} version 1.0 of 21 March 2000. - -PATENTS: To the best of my knowledge, @code{bzip2} does not use any patented -algorithms. However, I do not have the resources available to carry out -a full patent search. Therefore I cannot give any guarantee of the -above statement. - - - - - - - -@node Overview, Implementation, Top, Top -@chapter Introduction - -@code{bzip2} compresses files using the Burrows-Wheeler -block-sorting text compression algorithm, and Huffman coding. -Compression is generally considerably better than that -achieved by more conventional LZ77/LZ78-based compressors, -and approaches the performance of the PPM family of statistical compressors. - -@code{bzip2} is built on top of @code{libbzip2}, a flexible library -for handling compressed data in the @code{bzip2} format. This manual -describes both how to use the program and -how to work with the library interface. Most of the -manual is devoted to this library, not the program, -which is good news if your interest is only in the program. - -Chapter 2 describes how to use @code{bzip2}; this is the only part -you need to read if you just want to know how to operate the program. 
-Chapter 3 describes the programming interfaces in detail, and -Chapter 4 records some miscellaneous notes which I thought -ought to be recorded somewhere. - - -@chapter How to use @code{bzip2} - -This chapter contains a copy of the @code{bzip2} man page, -and nothing else. - -@quotation - -@unnumberedsubsubsec NAME -@itemize -@item @code{bzip2}, @code{bunzip2} -- a block-sorting file compressor, v1.0 -@item @code{bzcat} -- decompresses files to stdout -@item @code{bzip2recover} -- recovers data from damaged bzip2 files -@end itemize - -@unnumberedsubsubsec SYNOPSIS -@itemize -@item @code{bzip2} [ -cdfkqstvzVL123456789 ] [ filenames ... ] -@item @code{bunzip2} [ -fkvsVL ] [ filenames ... ] -@item @code{bzcat} [ -s ] [ filenames ... ] -@item @code{bzip2recover} filename -@end itemize - -@unnumberedsubsubsec DESCRIPTION - -@code{bzip2} compresses files using the Burrows-Wheeler block sorting -text compression algorithm, and Huffman coding. Compression is -generally considerably better than that achieved by more conventional -LZ77/LZ78-based compressors, and approaches the performance of the PPM -family of statistical compressors. - -The command-line options are deliberately very similar to those of GNU -@code{gzip}, but they are not identical. - -@code{bzip2} expects a list of file names to accompany the command-line -flags. Each file is replaced by a compressed version of itself, with -the name @code{original_name.bz2}. Each compressed file has the same -modification date, permissions, and, when possible, ownership as the -corresponding original, so that these properties can be correctly -restored at decompression time. File name handling is naive in the -sense that there is no mechanism for preserving original file names, -permissions, ownerships or dates in filesystems which lack these -concepts, or have serious file name length restrictions, such as MS-DOS. - -@code{bzip2} and @code{bunzip2} will by default not overwrite existing -files. 
If you want this to happen, specify the @code{-f} flag. - -If no file names are specified, @code{bzip2} compresses from standard -input to standard output. In this case, @code{bzip2} will decline to -write compressed output to a terminal, as this would be entirely -incomprehensible and therefore pointless. - -@code{bunzip2} (or @code{bzip2 -d}) decompresses all -specified files. Files which were not created by @code{bzip2} -will be detected and ignored, and a warning issued. -@code{bzip2} attempts to guess the filename for the decompressed file -from that of the compressed file as follows: -@itemize -@item @code{filename.bz2 } becomes @code{filename} -@item @code{filename.bz } becomes @code{filename} -@item @code{filename.tbz2} becomes @code{filename.tar} -@item @code{filename.tbz } becomes @code{filename.tar} -@item @code{anyothername } becomes @code{anyothername.out} -@end itemize -If the file does not end in one of the recognised endings, -@code{.bz2}, @code{.bz}, -@code{.tbz2} or @code{.tbz}, @code{bzip2} complains that it cannot -guess the name of the original file, and uses the original name -with @code{.out} appended. - -As with compression, supplying no -filenames causes decompression from standard input to standard output. - -@code{bunzip2} will correctly decompress a file which is the -concatenation of two or more compressed files. The result is the -concatenation of the corresponding uncompressed files. Integrity -testing (@code{-t}) of concatenated compressed files is also supported. - -You can also compress or decompress files to the standard output by -giving the @code{-c} flag. Multiple files may be compressed and -decompressed like this. The resulting outputs are fed sequentially to -stdout. Compression of multiple files in this manner generates a stream -containing multiple compressed file representations. Such a stream -can be decompressed correctly only by @code{bzip2} version 0.9.0 or -later. 
Earlier versions of @code{bzip2} will stop after decompressing -the first file in the stream. - -@code{bzcat} (or @code{bzip2 -dc}) decompresses all specified files to -the standard output. - -@code{bzip2} will read arguments from the environment variables -@code{BZIP2} and @code{BZIP}, in that order, and will process them -before any arguments read from the command line. This gives a -convenient way to supply default arguments. - -Compression is always performed, even if the compressed file is slightly -larger than the original. Files of less than about one hundred bytes -tend to get larger, since the compression mechanism has a constant -overhead in the region of 50 bytes. Random data (including the output -of most file compressors) is coded at about 8.05 bits per byte, giving -an expansion of around 0.5%. - -As a self-check for your protection, @code{bzip2} uses 32-bit CRCs to -make sure that the decompressed version of a file is identical to the -original. This guards against corruption of the compressed data, and -against undetected bugs in @code{bzip2} (hopefully very unlikely). The -chances of data corruption going undetected is microscopic, about one -chance in four billion for each file processed. Be aware, though, that -the check occurs upon decompression, so it can only tell you that -something is wrong. It can't help you recover the original uncompressed -data. You can use @code{bzip2recover} to try to recover data from -damaged files. - -Return values: 0 for a normal exit, 1 for environmental problems (file -not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt -compressed file, 3 for an internal consistency error (eg, bug) which -caused @code{bzip2} to panic. - - -@unnumberedsubsubsec OPTIONS -@table @code -@item -c --stdout -Compress or decompress to standard output. -@item -d --decompress -Force decompression. 
@code{bzip2}, @code{bunzip2} and @code{bzcat} are -really the same program, and the decision about what actions to take is -done on the basis of which name is used. This flag overrides that -mechanism, and forces bzip2 to decompress. -@item -z --compress -The complement to @code{-d}: forces compression, regardless of the -invokation name. -@item -t --test -Check integrity of the specified file(s), but don't decompress them. -This really performs a trial decompression and throws away the result. -@item -f --force -Force overwrite of output files. Normally, @code{bzip2} will not overwrite -existing output files. Also forces @code{bzip2} to break hard links -to files, which it otherwise wouldn't do. -@item -k --keep -Keep (don't delete) input files during compression -or decompression. -@item -s --small -Reduce memory usage, for compression, decompression and testing. Files -are decompressed and tested using a modified algorithm which only -requires 2.5 bytes per block byte. This means any file can be -decompressed in 2300k of memory, albeit at about half the normal speed. - -During compression, @code{-s} selects a block size of 200k, which limits -memory use to around the same figure, at the expense of your compression -ratio. In short, if your machine is low on memory (8 megabytes or -less), use -s for everything. See MEMORY MANAGEMENT below. -@item -q --quiet -Suppress non-essential warning messages. Messages pertaining to -I/O errors and other critical events will not be suppressed. -@item -v --verbose -Verbose mode -- show the compression ratio for each file processed. -Further @code{-v}'s increase the verbosity level, spewing out lots of -information which is primarily of interest for diagnostic purposes. -@item -L --license -V --version -Display the software version, license terms and conditions. -@item -1 to -9 -Set the block size to 100 k, 200 k .. 900 k when compressing. Has no -effect when decompressing. See MEMORY MANAGEMENT below. 
-@item -- -Treats all subsequent arguments as file names, even if they start -with a dash. This is so you can handle files with names beginning -with a dash, for example: @code{bzip2 -- -myfilename}. -@item --repetitive-fast -@item --repetitive-best -These flags are redundant in versions 0.9.5 and above. They provided -some coarse control over the behaviour of the sorting algorithm in -earlier versions, which was sometimes useful. 0.9.5 and above have an -improved algorithm which renders these flags irrelevant. -@end table - - -@unnumberedsubsubsec MEMORY MANAGEMENT - -@code{bzip2} compresses large files in blocks. The block size affects -both the compression ratio achieved, and the amount of memory needed for -compression and decompression. The flags @code{-1} through @code{-9} -specify the block size to be 100,000 bytes through 900,000 bytes (the -default) respectively. At decompression time, the block size used for -compression is read from the header of the compressed file, and -@code{bunzip2} then allocates itself just enough memory to decompress -the file. Since block sizes are stored in compressed files, it follows -that the flags @code{-1} to @code{-9} are irrelevant to and so ignored -during decompression. - -Compression and decompression requirements, in bytes, can be estimated -as: -@example - Compression: 400k + ( 8 x block size ) - - Decompression: 100k + ( 4 x block size ), or - 100k + ( 2.5 x block size ) -@end example -Larger block sizes give rapidly diminishing marginal returns. Most of -the compression comes from the first two or three hundred k of block -size, a fact worth bearing in mind when using @code{bzip2} on small machines. -It is also important to appreciate that the decompression memory -requirement is set at compression time by the choice of block size. - -For files compressed with the default 900k block size, @code{bunzip2} -will require about 3700 kbytes to decompress. 
To support decompression -of any file on a 4 megabyte machine, @code{bunzip2} has an option to -decompress using approximately half this amount of memory, about 2300 -kbytes. Decompression speed is also halved, so you should use this -option only where necessary. The relevant flag is @code{-s}. - -In general, try and use the largest block size memory constraints allow, -since that maximises the compression achieved. Compression and -decompression speed are virtually unaffected by block size. - -Another significant point applies to files which fit in a single block --- that means most files you'd encounter using a large block size. The -amount of real memory touched is proportional to the size of the file, -since the file is smaller than a block. For example, compressing a file -20,000 bytes long with the flag @code{-9} will cause the compressor to -allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 -kbytes of it. Similarly, the decompressor will allocate 3700k but only -touch 100k + 20000 * 4 = 180 kbytes. - -Here is a table which summarises the maximum memory usage for different -block sizes. Also recorded is the total compressed size for 14 files of -the Calgary Text Compression Corpus totalling 3,141,622 bytes. This -column gives some feel for how compression varies with block size. -These figures tend to understate the advantage of larger block sizes for -larger files, since the Corpus is dominated by smaller files. -@example - Compress Decompress Decompress Corpus - Flag usage usage -s usage Size - - -1 1200k 500k 350k 914704 - -2 2000k 900k 600k 877703 - -3 2800k 1300k 850k 860338 - -4 3600k 1700k 1100k 846899 - -5 4400k 2100k 1350k 845160 - -6 5200k 2500k 1600k 838626 - -7 6100k 2900k 1850k 834096 - -8 6800k 3300k 2100k 828642 - -9 7600k 3700k 2350k 828642 -@end example - -@unnumberedsubsubsec RECOVERING DATA FROM DAMAGED FILES - -@code{bzip2} compresses files in blocks, usually 900kbytes long. Each -block is handled independently. 
If a media or transmission error causes -a multi-block @code{.bz2} file to become damaged, it may be possible to -recover data from the undamaged blocks in the file. - -The compressed representation of each block is delimited by a 48-bit -pattern, which makes it possible to find the block boundaries with -reasonable certainty. Each block also carries its own 32-bit CRC, so -damaged blocks can be distinguished from undamaged ones. - -@code{bzip2recover} is a simple program whose purpose is to search for -blocks in @code{.bz2} files, and write each block out into its own -@code{.bz2} file. You can then use @code{bzip2 -t} to test the -integrity of the resulting files, and decompress those which are -undamaged. - -@code{bzip2recover} -takes a single argument, the name of the damaged file, -and writes a number of files @code{rec0001file.bz2}, - @code{rec0002file.bz2}, etc, containing the extracted blocks. - The output filenames are designed so that the use of - wildcards in subsequent processing -- for example, -@code{bzip2 -dc rec*file.bz2 > recovered_data} -- lists the files in - the correct order. - -@code{bzip2recover} should be of most use dealing with large @code{.bz2} - files, as these will contain many blocks. It is clearly - futile to use it on damaged single-block files, since a - damaged block cannot be recovered. If you wish to minimise -any potential data loss through media or transmission errors, -you might consider compressing with a smaller - block size. - - -@unnumberedsubsubsec PERFORMANCE NOTES - -The sorting phase of compression gathers together similar strings in the -file. Because of this, files containing very long runs of repeated -symbols, like "aabaabaabaab ..." (repeated several hundred times) may -compress more slowly than normal. Versions 0.9.5 and above fare much -better than previous versions in this respect. The ratio between -worst-case and average-case compression time is in the region of 10:1. 
-For previous versions, this figure was more like 100:1. You can use the -@code{-vvvv} option to monitor progress in great detail, if you want. - -Decompression speed is unaffected by these phenomena. - -@code{bzip2} usually allocates several megabytes of memory to operate -in, and then charges all over it in a fairly random fashion. This means -that performance, both for compressing and decompressing, is largely -determined by the speed at which your machine can service cache misses. -Because of this, small changes to the code to reduce the miss rate have -been observed to give disproportionately large performance improvements. -I imagine @code{bzip2} will perform best on machines with very large -caches. - - -@unnumberedsubsubsec CAVEATS - -I/O error messages are not as helpful as they could be. @code{bzip2} -tries hard to detect I/O errors and exit cleanly, but the details of -what the problem is sometimes seem rather misleading. - -This manual page pertains to version 1.0 of @code{bzip2}. Compressed -data created by this version is entirely forwards and backwards -compatible with the previous public releases, versions 0.1pl2, 0.9.0 and -0.9.5, but with the following exception: 0.9.0 and above can correctly -decompress multiple concatenated compressed files. 0.1pl2 cannot do -this; it will stop after decompressing just the first file in the -stream. - -@code{bzip2recover} uses 32-bit integers to represent bit positions in -compressed files, so it cannot handle compressed files more than 512 -megabytes long. This could easily be fixed. - - -@unnumberedsubsubsec AUTHOR -Julian Seward, @code{jseward@@acm.org}. 
- -The ideas embodied in @code{bzip2} are due to (at least) the following -people: Michael Burrows and David Wheeler (for the block sorting -transformation), David Wheeler (again, for the Huffman coder), Peter -Fenwick (for the structured coding model in the original @code{bzip}, -and many refinements), and Alistair Moffat, Radford Neal and Ian Witten -(for the arithmetic coder in the original @code{bzip}). I am much -indebted for their help, support and advice. See the manual in the -source distribution for pointers to sources of documentation. Christian -von Roques encouraged me to look for faster sorting algorithms, so as to -speed up compression. Bela Lubkin encouraged me to improve the -worst-case compression performance. Many people sent patches, helped -with portability problems, lent machines, gave advice and were generally -helpful. - -@end quotation - - - - -@chapter Programming with @code{libbzip2} - -This chapter describes the programming interface to @code{libbzip2}. - -For general background information, particularly about memory -use and performance aspects, you'd be well advised to read Chapter 2 -as well. - -@section Top-level structure - -@code{libbzip2} is a flexible library for compressing and decompressing -data in the @code{bzip2} data format. Although packaged as a single -entity, it helps to regard the library as three separate parts: the low -level interface, and the high level interface, and some utility -functions. - -The structure of @code{libbzip2}'s interfaces is similar to -that of Jean-loup Gailly's and Mark Adler's excellent @code{zlib} -library. - -All externally visible symbols have names beginning @code{BZ2_}. -This is new in version 1.0. The intention is to minimise pollution -of the namespaces of library clients. - -@subsection Low-level summary - -This interface provides services for compressing and decompressing -data in memory. 
There's no provision for dealing with files, streams -or any other I/O mechanisms, just straight memory-to-memory work. -In fact, this part of the library can be compiled without inclusion -of @code{stdio.h}, which may be helpful for embedded applications. - -The low-level part of the library has no global variables and -is therefore thread-safe. - -Six routines make up the low level interface: -@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, and @* @code{BZ2_bzCompressEnd} -for compression, -and a corresponding trio @code{BZ2_bzDecompressInit}, @* @code{BZ2_bzDecompress} -and @code{BZ2_bzDecompressEnd} for decompression. -The @code{*Init} functions allocate -memory for compression/decompression and do other -initialisations, whilst the @code{*End} functions close down operations -and release memory. - -The real work is done by @code{BZ2_bzCompress} and @code{BZ2_bzDecompress}. -These compress and decompress data from a user-supplied input buffer -to a user-supplied output buffer. These buffers can be any size; -arbitrary quantities of data are handled by making repeated calls -to these functions. This is a flexible mechanism allowing a -consumer-pull style of activity, or producer-push, or a mixture of -both. - - - -@subsection High-level summary - -This interface provides some handy wrappers around the low-level -interface to facilitate reading and writing @code{bzip2} format -files (@code{.bz2} files). The routines provide hooks to facilitate -reading files in which the @code{bzip2} data stream is embedded -within some larger-scale file structure, or where there are -multiple @code{bzip2} data streams concatenated end-to-end. - -For reading files, @code{BZ2_bzReadOpen}, @code{BZ2_bzRead}, -@code{BZ2_bzReadClose} and @* @code{BZ2_bzReadGetUnused} are supplied. For -writing files, @code{BZ2_bzWriteOpen}, @code{BZ2_bzWrite} and -@code{BZ2_bzWriteFinish} are available. 
- -As with the low-level library, no global variables are used -so the library is per se thread-safe. However, if I/O errors -occur whilst reading or writing the underlying compressed files, -you may have to consult @code{errno} to determine the cause of -the error. In that case, you'd need a C library which correctly -supports @code{errno} in a multithreaded environment. - -To make the library a little simpler and more portable, -@code{BZ2_bzReadOpen} and @code{BZ2_bzWriteOpen} require you to pass them file -handles (@code{FILE*}s) which have previously been opened for reading or -writing respectively. That avoids portability problems associated with -file operations and file attributes, whilst not being much of an -imposition on the programmer. - - - -@subsection Utility functions summary -For very simple needs, @code{BZ2_bzBuffToBuffCompress} and -@code{BZ2_bzBuffToBuffDecompress} are provided. These compress -data in memory from one buffer to another buffer in a single -function call. You should assess whether these functions -fulfill your memory-to-memory compression/decompression -requirements before investing effort in understanding the more -general but more complex low-level interface. - -Yoshioka Tsuneo (@code{QWF00133@@niftyserve.or.jp} / -@code{tsuneo-y@@is.aist-nara.ac.jp}) has contributed some functions to -give better @code{zlib} compatibility. These functions are -@code{BZ2_bzopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, @code{BZ2_bzflush}, -@code{BZ2_bzclose}, -@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. You may find these functions -more convenient for simple file reading and writing, than those in the -high-level interface. These functions are not (yet) officially part of -the library, and are minimally documented here. If they break, you -get to keep all the pieces. I hope to document them properly when time -permits. - -Yoshioka also contributed modifications to allow the library to be -built as a Windows DLL. 
- - -@section Error handling - -The library is designed to recover cleanly in all situations, including -the worst-case situation of decompressing random data. I'm not -100% sure that it can always do this, so you might want to add -a signal handler to catch segmentation violations during decompression -if you are feeling especially paranoid. I would be interested in -hearing more about the robustness of the library to corrupted -compressed data. - -Version 1.0 is much more robust in this respect than -0.9.0 or 0.9.5. Investigations with Checker (a tool for -detecting problems with memory management, similar to Purify) -indicate that, at least for the few files I tested, all single-bit -errors in the decompressed data are caught properly, with no -segmentation faults, no reads of uninitialised data and no -out of range reads or writes. So it's certainly much improved, -although I wouldn't claim it to be totally bombproof. - -The file @code{bzlib.h} contains all definitions needed to use -the library. In particular, you should definitely not include -@code{bzlib_private.h}. - -In @code{bzlib.h}, the various return values are defined. The following -list is not intended as an exhaustive description of the circumstances -in which a given value may be returned -- those descriptions are given -later. Rather, it is intended to convey the rough meaning of each -return value. The first five actions are normal and not intended to -denote an error situation. -@table @code -@item BZ_OK -The requested action was completed successfully. -@item BZ_RUN_OK -@itemx BZ_FLUSH_OK -@itemx BZ_FINISH_OK -In @code{BZ2_bzCompress}, the requested flush/finish/nothing-special action -was completed successfully. -@item BZ_STREAM_END -Compression of data was completed, or the logical stream end was -detected during decompression. -@end table - -The following return values indicate an error of some kind. 
-@table @code -@item BZ_CONFIG_ERROR -Indicates that the library has been improperly compiled on your -platform -- a major configuration error. Specifically, it means -that @code{sizeof(char)}, @code{sizeof(short)} and @code{sizeof(int)} -are not 1, 2 and 4 respectively, as they should be. Note that the -library should still work properly on 64-bit platforms which follow -the LP64 programming model -- that is, where @code{sizeof(long)} -and @code{sizeof(void*)} are 8. Under LP64, @code{sizeof(int)} is -still 4, so @code{libbzip2}, which doesn't use the @code{long} type, -is OK. -@item BZ_SEQUENCE_ERROR -When using the library, it is important to call the functions in the -correct sequence and with data structures (buffers etc) in the correct -states. @code{libbzip2} checks as much as it can to ensure this is -happening, and returns @code{BZ_SEQUENCE_ERROR} if not. Code which -complies precisely with the function semantics, as detailed below, -should never receive this value; such an event denotes buggy code -which you should investigate. -@item BZ_PARAM_ERROR -Returned when a parameter to a function call is out of range -or otherwise manifestly incorrect. As with @code{BZ_SEQUENCE_ERROR}, -this denotes a bug in the client code. The distinction between -@code{BZ_PARAM_ERROR} and @code{BZ_SEQUENCE_ERROR} is a bit hazy, but still worth -making. -@item BZ_MEM_ERROR -Returned when a request to allocate memory failed. Note that the -quantity of memory needed to decompress a stream cannot be determined -until the stream's header has been read. So @code{BZ2_bzDecompress} and -@code{BZ2_bzRead} may return @code{BZ_MEM_ERROR} even though some of -the compressed data has been read. The same is not true for -compression; once @code{BZ2_bzCompressInit} or @code{BZ2_bzWriteOpen} have -successfully completed, @code{BZ_MEM_ERROR} cannot occur. -@item BZ_DATA_ERROR -Returned when a data integrity error is detected during decompression. 
-Most importantly, this means when stored and computed CRCs for the -data do not match. This value is also returned upon detection of any -other anomaly in the compressed data. -@item BZ_DATA_ERROR_MAGIC -As a special case of @code{BZ_DATA_ERROR}, it is sometimes useful to -know when the compressed stream does not start with the correct -magic bytes (@code{'B' 'Z' 'h'}). -@item BZ_IO_ERROR -Returned by @code{BZ2_bzRead} and @code{BZ2_bzWrite} when there is an error -reading or writing in the compressed file, and by @code{BZ2_bzReadOpen} -and @code{BZ2_bzWriteOpen} for attempts to use a file for which the -error indicator (viz, @code{ferror(f)}) is set. -On receipt of @code{BZ_IO_ERROR}, the caller should consult -@code{errno} and/or @code{perror} to acquire operating-system -specific information about the problem. -@item BZ_UNEXPECTED_EOF -Returned by @code{BZ2_bzRead} when the compressed file finishes -before the logical end of stream is detected. -@item BZ_OUTBUFF_FULL -Returned by @code{BZ2_bzBuffToBuffCompress} and -@code{BZ2_bzBuffToBuffDecompress} to indicate that the output data -will not fit into the output buffer provided. -@end table - - - -@section Low-level interface - -@subsection @code{BZ2_bzCompressInit} -@example -typedef - struct @{ - char *next_in; - unsigned int avail_in; - unsigned int total_in_lo32; - unsigned int total_in_hi32; - - char *next_out; - unsigned int avail_out; - unsigned int total_out_lo32; - unsigned int total_out_hi32; - - void *state; - - void *(*bzalloc)(void *,int,int); - void (*bzfree)(void *,void *); - void *opaque; - @} - bz_stream; - -int BZ2_bzCompressInit ( bz_stream *strm, - int blockSize100k, - int verbosity, - int workFactor ); - -@end example - -Prepares for compression. The @code{bz_stream} structure -holds all data pertaining to the compression activity. -A @code{bz_stream} structure should be allocated and initialised -prior to the call. 
-The fields of @code{bz_stream} -comprise the entirety of the user-visible data. @code{state} -is a pointer to the private data structures required for compression. - -Custom memory allocators are supported, via fields @code{bzalloc}, -@code{bzfree}, -and @code{opaque}. The value -@code{opaque} is passed to as the first argument to -all calls to @code{bzalloc} and @code{bzfree}, but is -otherwise ignored by the library. -The call @code{bzalloc ( opaque, n, m )} is expected to return a -pointer @code{p} to -@code{n * m} bytes of memory, and @code{bzfree ( opaque, p )} -should free -that memory. - -If you don't want to use a custom memory allocator, set @code{bzalloc}, -@code{bzfree} and -@code{opaque} to @code{NULL}, -and the library will then use the standard @code{malloc}/@code{free} -routines. - -Before calling @code{BZ2_bzCompressInit}, fields @code{bzalloc}, -@code{bzfree} and @code{opaque} should -be filled appropriately, as just described. Upon return, the internal -state will have been allocated and initialised, and @code{total_in_lo32}, -@code{total_in_hi32}, @code{total_out_lo32} and -@code{total_out_hi32} will have been set to zero. -These four fields are used by the library -to inform the caller of the total amount of data passed into and out of -the library, respectively. You should not try to change them. -As of version 1.0, 64-bit counts are maintained, even on 32-bit -platforms, using the @code{_hi32} fields to store the upper 32 bits -of the count. So, for example, the total amount of data in -is @code{(total_in_hi32 << 32) + total_in_lo32}. - -Parameter @code{blockSize100k} specifies the block size to be used for -compression. It should be a value between 1 and 9 inclusive, and the -actual block size used is 100000 x this figure. 9 gives the best -compression but takes most memory. - -Parameter @code{verbosity} should be set to a number between 0 and 4 -inclusive. 
0 is silent, and greater numbers give increasingly verbose -monitoring/debugging output. If the library has been compiled with -@code{-DBZ_NO_STDIO}, no such output will appear for any verbosity -setting. - -Parameter @code{workFactor} controls how the compression phase behaves -when presented with worst case, highly repetitive, input data. If -compression runs into difficulties caused by repetitive data, the -library switches from the standard sorting algorithm to a fallback -algorithm. The fallback is slower than the standard algorithm by -perhaps a factor of three, but always behaves reasonably, no matter how -bad the input. - -Lower values of @code{workFactor} reduce the amount of effort the -standard algorithm will expend before resorting to the fallback. You -should set this parameter carefully; too low, and many inputs will be -handled by the fallback algorithm and so compress rather slowly, too -high, and your average-to-worst case compression times can become very -large. The default value of 30 gives reasonable behaviour over a wide -range of circumstances. - -Allowable values range from 0 to 250 inclusive. 0 is a special case, -equivalent to using the default value of 30. - -Note that the compressed output generated is the same regardless of -whether or not the fallback algorithm is used. - -Be aware also that this parameter may disappear entirely in future -versions of the library. In principle it should be possible to devise a -good way to automatically choose which algorithm to use. Such a -mechanism would render the parameter obsolete. 
- -Possible return values: -@display - @code{BZ_CONFIG_ERROR} - if the library has been mis-compiled - @code{BZ_PARAM_ERROR} - if @code{strm} is @code{NULL} - or @code{blockSize} < 1 or @code{blockSize} > 9 - or @code{verbosity} < 0 or @code{verbosity} > 4 - or @code{workFactor} < 0 or @code{workFactor} > 250 - @code{BZ_MEM_ERROR} - if not enough memory is available - @code{BZ_OK} - otherwise -@end display -Allowable next actions: -@display - @code{BZ2_bzCompress} - if @code{BZ_OK} is returned - no specific action needed in case of error -@end display - -@subsection @code{BZ2_bzCompress} -@example - int BZ2_bzCompress ( bz_stream *strm, int action ); -@end example -Provides more input and/or output buffer space for the library. The -caller maintains input and output buffers, and calls @code{BZ2_bzCompress} to -transfer data between them. - -Before each call to @code{BZ2_bzCompress}, @code{next_in} should point at -the data to be compressed, and @code{avail_in} should indicate how many -bytes the library may read. @code{BZ2_bzCompress} updates @code{next_in}, -@code{avail_in} and @code{total_in} to reflect the number of bytes it -has read. - -Similarly, @code{next_out} should point to a buffer in which the -compressed data is to be placed, with @code{avail_out} indicating how -much output space is available. @code{BZ2_bzCompress} updates -@code{next_out}, @code{avail_out} and @code{total_out} to reflect the -number of bytes output. - -You may provide and remove as little or as much data as you like on each -call of @code{BZ2_bzCompress}. In the limit, it is acceptable to supply and -remove data one byte at a time, although this would be terribly -inefficient. You should always ensure that at least one byte of output -space is available at each call. - -A second purpose of @code{BZ2_bzCompress} is to request a change of mode of the -compressed stream. - -Conceptually, a compressed stream can be in one of four states: IDLE, -RUNNING, FLUSHING and FINISHING. 
Before initialisation -(@code{BZ2_bzCompressInit}) and after termination (@code{BZ2_bzCompressEnd}), a -stream is regarded as IDLE. - -Upon initialisation (@code{BZ2_bzCompressInit}), the stream is placed in the -RUNNING state. Subsequent calls to @code{BZ2_bzCompress} should pass -@code{BZ_RUN} as the requested action; other actions are illegal and -will result in @code{BZ_SEQUENCE_ERROR}. - -At some point, the calling program will have provided all the input data -it wants to. It will then want to finish up -- in effect, asking the -library to process any data it might have buffered internally. In this -state, @code{BZ2_bzCompress} will no longer attempt to read data from -@code{next_in}, but it will want to write data to @code{next_out}. -Because the output buffer supplied by the user can be arbitrarily small, -the finishing-up operation cannot necessarily be done with a single call -of @code{BZ2_bzCompress}. - -Instead, the calling program passes @code{BZ_FINISH} as an action to -@code{BZ2_bzCompress}. This changes the stream's state to FINISHING. Any -remaining input (ie, @code{next_in[0 .. avail_in-1]}) is compressed and -transferred to the output buffer. To do this, @code{BZ2_bzCompress} must be -called repeatedly until all the output has been consumed. At that -point, @code{BZ2_bzCompress} returns @code{BZ_STREAM_END}, and the stream's -state is set back to IDLE. @code{BZ2_bzCompressEnd} should then be -called. - -Just to make sure the calling program does not cheat, the library makes -a note of @code{avail_in} at the time of the first call to -@code{BZ2_bzCompress} which has @code{BZ_FINISH} as an action (ie, at the -time the program has announced its intention to not supply any more -input). By comparing this value with that of @code{avail_in} over -subsequent calls to @code{BZ2_bzCompress}, the library can detect any -attempts to slip in more data to compress. Any calls for which this is -detected will return @code{BZ_SEQUENCE_ERROR}. 
This indicates a -programming mistake which should be corrected. - -Instead of asking to finish, the calling program may ask -@code{BZ2_bzCompress} to take all the remaining input, compress it and -terminate the current (Burrows-Wheeler) compression block. This could -be useful for error control purposes. The mechanism is analogous to -that for finishing: call @code{BZ2_bzCompress} with an action of -@code{BZ_FLUSH}, remove output data, and persist with the -@code{BZ_FLUSH} action until the value @code{BZ_RUN} is returned. As -with finishing, @code{BZ2_bzCompress} detects any attempt to provide more -input data once the flush has begun. - -Once the flush is complete, the stream returns to the normal RUNNING -state. - -This all sounds pretty complex, but isn't really. Here's a table -which shows which actions are allowable in each state, what action -will be taken, what the next state is, and what the non-error return -values are. Note that you can't explicitly ask what state the -stream is in, but nor do you need to -- it can be inferred from the -values returned by @code{BZ2_bzCompress}. -@display -IDLE/@code{any} - Illegal. IDLE state only exists after @code{BZ2_bzCompressEnd} or - before @code{BZ2_bzCompressInit}. - Return value = @code{BZ_SEQUENCE_ERROR} - -RUNNING/@code{BZ_RUN} - Compress from @code{next_in} to @code{next_out} as much as possible. - Next state = RUNNING - Return value = @code{BZ_RUN_OK} - -RUNNING/@code{BZ_FLUSH} - Remember current value of @code{next_in}. Compress from @code{next_in} - to @code{next_out} as much as possible, but do not accept any more input. - Next state = FLUSHING - Return value = @code{BZ_FLUSH_OK} - -RUNNING/@code{BZ_FINISH} - Remember current value of @code{next_in}. Compress from @code{next_in} - to @code{next_out} as much as possible, but do not accept any more input. 
- Next state = FINISHING - Return value = @code{BZ_FINISH_OK} - -FLUSHING/@code{BZ_FLUSH} - Compress from @code{next_in} to @code{next_out} as much as possible, - but do not accept any more input. - If all the existing input has been used up and all compressed - output has been removed - Next state = RUNNING; Return value = @code{BZ_RUN_OK} - else - Next state = FLUSHING; Return value = @code{BZ_FLUSH_OK} - -FLUSHING/other - Illegal. - Return value = @code{BZ_SEQUENCE_ERROR} - -FINISHING/@code{BZ_FINISH} - Compress from @code{next_in} to @code{next_out} as much as possible, - but to not accept any more input. - If all the existing input has been used up and all compressed - output has been removed - Next state = IDLE; Return value = @code{BZ_STREAM_END} - else - Next state = FINISHING; Return value = @code{BZ_FINISHING} - -FINISHING/other - Illegal. - Return value = @code{BZ_SEQUENCE_ERROR} -@end display - -That still looks complicated? Well, fair enough. The usual sequence -of calls for compressing a load of data is: -@itemize @bullet -@item Get started with @code{BZ2_bzCompressInit}. -@item Shovel data in and shlurp out its compressed form using zero or more -calls of @code{BZ2_bzCompress} with action = @code{BZ_RUN}. -@item Finish up. -Repeatedly call @code{BZ2_bzCompress} with action = @code{BZ_FINISH}, -copying out the compressed output, until @code{BZ_STREAM_END} is returned. -@item Close up and go home. Call @code{BZ2_bzCompressEnd}. -@end itemize -If the data you want to compress fits into your input buffer all -at once, you can skip the calls of @code{BZ2_bzCompress ( ..., BZ_RUN )} and -just do the @code{BZ2_bzCompress ( ..., BZ_FINISH )} calls. - -All required memory is allocated by @code{BZ2_bzCompressInit}. The -compression library can accept any data at all (obviously). So you -shouldn't get any error return values from the @code{BZ2_bzCompress} calls. -If you do, they will be @code{BZ_SEQUENCE_ERROR}, and indicate a bug in -your programming. 
- -Trivial other possible return values: -@display - @code{BZ_PARAM_ERROR} - if @code{strm} is @code{NULL}, or @code{strm->s} is @code{NULL} -@end display - -@subsection @code{BZ2_bzCompressEnd} -@example -int BZ2_bzCompressEnd ( bz_stream *strm ); -@end example -Releases all memory associated with a compression stream. - -Possible return values: -@display - @code{BZ_PARAM_ERROR} if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} - @code{BZ_OK} otherwise -@end display - - -@subsection @code{BZ2_bzDecompressInit} -@example -int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small ); -@end example -Prepares for decompression. As with @code{BZ2_bzCompressInit}, a -@code{bz_stream} record should be allocated and initialised before the -call. Fields @code{bzalloc}, @code{bzfree} and @code{opaque} should be -set if a custom memory allocator is required, or made @code{NULL} for -the normal @code{malloc}/@code{free} routines. Upon return, the internal -state will have been initialised, and @code{total_in} and -@code{total_out} will be zero. - -For the meaning of parameter @code{verbosity}, see @code{BZ2_bzCompressInit}. - -If @code{small} is nonzero, the library will use an alternative -decompression algorithm which uses less memory but at the cost of -decompressing more slowly (roughly speaking, half the speed, but the -maximum memory requirement drops to around 2300k). See Chapter 2 for -more information on memory management. - -Note that the amount of memory needed to decompress -a stream cannot be determined until the stream's header has been read, -so even if @code{BZ2_bzDecompressInit} succeeds, a subsequent -@code{BZ2_bzDecompress} could fail with @code{BZ_MEM_ERROR}. 
- -Possible return values: -@display - @code{BZ_CONFIG_ERROR} - if the library has been mis-compiled - @code{BZ_PARAM_ERROR} - if @code{(small != 0 && small != 1)} - or @code{(verbosity < 0 || verbosity > 4)} - @code{BZ_MEM_ERROR} - if insufficient memory is available -@end display - -Allowable next actions: -@display - @code{BZ2_bzDecompress} - if @code{BZ_OK} was returned - no specific action required in case of error -@end display - - - -@subsection @code{BZ2_bzDecompress} -@example -int BZ2_bzDecompress ( bz_stream *strm ); -@end example -Provides more input and/or output buffer space for the library. The -caller maintains input and output buffers, and uses @code{BZ2_bzDecompress} -to transfer data between them. - -Before each call to @code{BZ2_bzDecompress}, @code{next_in} -should point at the compressed data, -and @code{avail_in} should indicate how many bytes the library -may read. @code{BZ2_bzDecompress} updates @code{next_in}, @code{avail_in} -and @code{total_in} -to reflect the number of bytes it has read. - -Similarly, @code{next_out} should point to a buffer in which the uncompressed -output is to be placed, with @code{avail_out} indicating how much output space -is available. @code{BZ2_bzDecompress} updates @code{next_out}, -@code{avail_out} and @code{total_out} to reflect -the number of bytes output. - -You may provide and remove as little or as much data as you like on -each call of @code{BZ2_bzDecompress}. -In the limit, it is acceptable to -supply and remove data one byte at a time, although this would be -terribly inefficient. You should always ensure that at least one -byte of output space is available at each call. - -Use of @code{BZ2_bzDecompress} is simpler than @code{BZ2_bzCompress}. - -You should provide input and remove output as described above, and -repeatedly call @code{BZ2_bzDecompress} until @code{BZ_STREAM_END} is -returned.
Appearance of @code{BZ_STREAM_END} denotes that -@code{BZ2_bzDecompress} has detected the logical end of the compressed -stream. @code{BZ2_bzDecompress} will not produce @code{BZ_STREAM_END} until -all output data has been placed into the output buffer, so once -@code{BZ_STREAM_END} appears, you are guaranteed to have available all -the decompressed output, and @code{BZ2_bzDecompressEnd} can safely be -called. - -In case of an error return value, you should call @code{BZ2_bzDecompressEnd} -to clean up and release memory. - -Possible return values: -@display - @code{BZ_PARAM_ERROR} - if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} - or @code{strm->avail_out < 1} - @code{BZ_DATA_ERROR} - if a data integrity error is detected in the compressed stream - @code{BZ_DATA_ERROR_MAGIC} - if the compressed stream doesn't begin with the right magic bytes - @code{BZ_MEM_ERROR} - if there wasn't enough memory available - @code{BZ_STREAM_END} - if the logical end of the data stream was detected and all - output has been consumed, eg @code{s->avail_out > 0} - @code{BZ_OK} - otherwise -@end display -Allowable next actions: -@display - @code{BZ2_bzDecompress} - if @code{BZ_OK} was returned - @code{BZ2_bzDecompressEnd} - otherwise -@end display - - -@subsection @code{BZ2_bzDecompressEnd} -@example -int BZ2_bzDecompressEnd ( bz_stream *strm ); -@end example -Releases all memory associated with a decompression stream. - -Possible return values: -@display - @code{BZ_PARAM_ERROR} - if @code{strm} is @code{NULL} or @code{strm->s} is @code{NULL} - @code{BZ_OK} - otherwise -@end display - -Allowable next actions: -@display - None. -@end display - - -@section High-level interface - -This interface provides functions for reading and writing -@code{bzip2} format files. First, some general points. - -@itemize @bullet -@item All of the functions take an @code{int*} first argument, - @code{bzerror}.
- After each call, @code{bzerror} should be consulted first to determine - the outcome of the call. If @code{bzerror} is @code{BZ_OK}, - the call completed - successfully, and only then should the return value of the function - (if any) be consulted. If @code{bzerror} is @code{BZ_IO_ERROR}, - there was an error - reading/writing the underlying compressed file, and you should - then consult @code{errno}/@code{perror} to determine the - cause of the difficulty. - @code{bzerror} may also be set to various other values; precise details are - given on a per-function basis below. -@item If @code{bzerror} indicates an error - (ie, anything except @code{BZ_OK} and @code{BZ_STREAM_END}), - you should immediately call @code{BZ2_bzReadClose} (or @code{BZ2_bzWriteClose}, - depending on whether you are attempting to read or to write) - to free up all resources associated - with the stream. Once an error has been indicated, behaviour of all calls - except @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) is undefined. - The implication is that (1) @code{bzerror} should - be checked after each call, and (2) if @code{bzerror} indicates an error, - @code{BZ2_bzReadClose} (@code{BZ2_bzWriteClose}) should then be called to clean up. -@item The @code{FILE*} arguments passed to - @code{BZ2_bzReadOpen}/@code{BZ2_bzWriteOpen} - should be set to binary mode. - Most Unix systems will do this by default, but other platforms, - including Windows and Mac, will not. If you omit this, you may - encounter problems when moving code to new platforms. -@item Memory allocation requests are handled by - @code{malloc}/@code{free}. - At present - there is no facility for user-defined memory allocators in the file I/O - functions (could easily be added, though). 
-@end itemize - - - -@subsection @code{BZ2_bzReadOpen} -@example - typedef void BZFILE; - - BZFILE *BZ2_bzReadOpen ( int *bzerror, FILE *f, - int verbosity, int small, - void *unused, int nUnused ); -@end example -Prepare to read compressed data from file handle @code{f}. @code{f} -should refer to a file which has been opened for reading, and for which -the error indicator (@code{ferror(f)}) is not set. If @code{small} is 1, -the library will try to decompress using less memory, at the expense of -speed. - -For reasons explained below, @code{BZ2_bzRead} will decompress the -@code{nUnused} bytes starting at @code{unused}, before starting to read -from the file @code{f}. At most @code{BZ_MAX_UNUSED} bytes may be -supplied like this. If this facility is not required, you should pass -@code{NULL} and @code{0} for @code{unused} and @code{nUnused} -respectively. - -For the meaning of parameters @code{small} and @code{verbosity}, -see @code{BZ2_bzDecompressInit}. - -The amount of memory needed to decompress a file cannot be determined -until the file's header has been read. So it is possible that -@code{BZ2_bzReadOpen} returns @code{BZ_OK} but a subsequent call of -@code{BZ2_bzRead} will return @code{BZ_MEM_ERROR}. - -Possible assignments to @code{bzerror}: -@display - @code{BZ_CONFIG_ERROR} - if the library has been mis-compiled - @code{BZ_PARAM_ERROR} - if @code{f} is @code{NULL} - or @code{small} is neither @code{0} nor @code{1} - or @code{(unused == NULL && nUnused != 0)} - or @code{(unused != NULL && !(0 <= nUnused <= BZ_MAX_UNUSED))} - @code{BZ_IO_ERROR} - if @code{ferror(f)} is nonzero - @code{BZ_MEM_ERROR} - if insufficient memory is available - @code{BZ_OK} - otherwise.
-@end display - -Possible return values: -@display - Pointer to an abstract @code{BZFILE} - if @code{bzerror} is @code{BZ_OK} - @code{NULL} - otherwise -@end display - -Allowable next actions: -@display - @code{BZ2_bzRead} - if @code{bzerror} is @code{BZ_OK} - @code{BZ2_bzReadClose} - otherwise -@end display - - -@subsection @code{BZ2_bzRead} -@example - int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len ); -@end example -Reads up to @code{len} (uncompressed) bytes from the compressed file -@code{b} into -the buffer @code{buf}. If the read was successful, -@code{bzerror} is set to @code{BZ_OK} -and the number of bytes read is returned. If the logical end-of-stream -was detected, @code{bzerror} will be set to @code{BZ_STREAM_END}, -and the number -of bytes read is returned. All other @code{bzerror} values denote an error. - -@code{BZ2_bzRead} will supply @code{len} bytes, -unless the logical stream end is detected -or an error occurs. Because of this, it is possible to detect the -stream end by observing when the number of bytes returned is -less than the number -requested. Nevertheless, this is regarded as inadvisable; you should -instead check @code{bzerror} after every call and watch out for -@code{BZ_STREAM_END}. - -Internally, @code{BZ2_bzRead} copies data from the compressed file in chunks -of size @code{BZ_MAX_UNUSED} bytes -before decompressing it. If the file contains more bytes than strictly -needed to reach the logical end-of-stream, @code{BZ2_bzRead} will almost certainly -read some of the trailing data before signalling @code{BZ_STREAM_END}. -To collect the read but unused data once @code{BZ_STREAM_END} has -appeared, call @code{BZ2_bzReadGetUnused} immediately before @code{BZ2_bzReadClose}.
- -Possible assignments to @code{bzerror}: -@display - @code{BZ_PARAM_ERROR} - if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} - @code{BZ_SEQUENCE_ERROR} - if @code{b} was opened with @code{BZ2_bzWriteOpen} - @code{BZ_IO_ERROR} - if there is an error reading from the compressed file - @code{BZ_UNEXPECTED_EOF} - if the compressed file ended before the logical end-of-stream was detected - @code{BZ_DATA_ERROR} - if a data integrity error was detected in the compressed stream - @code{BZ_DATA_ERROR_MAGIC} - if the stream does not begin with the requisite header bytes (ie, is not - a @code{bzip2} data file). This is really a special case of @code{BZ_DATA_ERROR}. - @code{BZ_MEM_ERROR} - if insufficient memory was available - @code{BZ_STREAM_END} - if the logical end of stream was detected. - @code{BZ_OK} - otherwise. -@end display - -Possible return values: -@display - number of bytes read - if @code{bzerror} is @code{BZ_OK} or @code{BZ_STREAM_END} - undefined - otherwise -@end display - -Allowable next actions: -@display - collect data from @code{buf}, then @code{BZ2_bzRead} or @code{BZ2_bzReadClose} - if @code{bzerror} is @code{BZ_OK} - collect data from @code{buf}, then @code{BZ2_bzReadClose} or @code{BZ2_bzReadGetUnused} - if @code{bzerror} is @code{BZ_STREAM_END} - @code{BZ2_bzReadClose} - otherwise -@end display - - - -@subsection @code{BZ2_bzReadGetUnused} -@example - void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, - void** unused, int* nUnused ); -@end example -Returns data which was read from the compressed file but was not needed -to get to the logical end-of-stream. @code{*unused} is set to the address -of the data, and @code{*nUnused} to the number of bytes. @code{*nUnused} will -be set to a value between @code{0} and @code{BZ_MAX_UNUSED} inclusive. - -This function may only be called once @code{BZ2_bzRead} has signalled -@code{BZ_STREAM_END} but before @code{BZ2_bzReadClose}.
- -Possible assignments to @code{bzerror}: -@display - @code{BZ_PARAM_ERROR} - if @code{b} is @code{NULL} - or @code{unused} is @code{NULL} or @code{nUnused} is @code{NULL} - @code{BZ_SEQUENCE_ERROR} - if @code{BZ_STREAM_END} has not been signalled - or if @code{b} was opened with @code{BZ2_bzWriteOpen} - @code{BZ_OK} - otherwise -@end display - -Allowable next actions: -@display - @code{BZ2_bzReadClose} -@end display - - -@subsection @code{BZ2_bzReadClose} -@example - void BZ2_bzReadClose ( int *bzerror, BZFILE *b ); -@end example -Releases all memory pertaining to the compressed file @code{b}. -@code{BZ2_bzReadClose} does not call @code{fclose} on the underlying file -handle, so you should do that yourself if appropriate. -@code{BZ2_bzReadClose} should be called to clean up after all error -situations. - -Possible assignments to @code{bzerror}: -@display - @code{BZ_SEQUENCE_ERROR} - if @code{b} was opened with @code{BZ2_bzWriteOpen} - @code{BZ_OK} - otherwise -@end display - -Allowable next actions: -@display - none -@end display - - - -@subsection @code{BZ2_bzWriteOpen} -@example - BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, - int blockSize100k, int verbosity, - int workFactor ); -@end example -Prepare to write compressed data to file handle @code{f}. -@code{f} should refer to -a file which has been opened for writing, and for which the error -indicator (@code{ferror(f)}) is not set. - -For the meaning of parameters @code{blockSize100k}, -@code{verbosity} and @code{workFactor}, see -@* @code{BZ2_bzCompressInit}. - -All required memory is allocated at this stage, so if the call -completes successfully, @code{BZ_MEM_ERROR} cannot be signalled by a -subsequent call to @code{BZ2_bzWrite}.
- -Possible assignments to @code{bzerror}: -@display - @code{BZ_CONFIG_ERROR} - if the library has been mis-compiled - @code{BZ_PARAM_ERROR} - if @code{f} is @code{NULL} - or @code{blockSize100k < 1} or @code{blockSize100k > 9} - @code{BZ_IO_ERROR} - if @code{ferror(f)} is nonzero - @code{BZ_MEM_ERROR} - if insufficient memory is available - @code{BZ_OK} - otherwise -@end display - -Possible return values: -@display - Pointer to an abstract @code{BZFILE} - if @code{bzerror} is @code{BZ_OK} - @code{NULL} - otherwise -@end display - -Allowable next actions: -@display - @code{BZ2_bzWrite} - if @code{bzerror} is @code{BZ_OK} - (you could go directly to @code{BZ2_bzWriteClose}, but this would be pretty pointless) - @code{BZ2_bzWriteClose} - otherwise -@end display - - - -@subsection @code{BZ2_bzWrite} -@example - void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len ); -@end example -Absorbs @code{len} bytes from the buffer @code{buf}, eventually to be -compressed and written to the file. - -Possible assignments to @code{bzerror}: -@display - @code{BZ_PARAM_ERROR} - if @code{b} is @code{NULL} or @code{buf} is @code{NULL} or @code{len < 0} - @code{BZ_SEQUENCE_ERROR} - if b was opened with @code{BZ2_bzReadOpen} - @code{BZ_IO_ERROR} - if there is an error writing the compressed file. - @code{BZ_OK} - otherwise -@end display - - - - -@subsection @code{BZ2_bzWriteClose} -@example - void BZ2_bzWriteClose ( int *bzerror, BZFILE* f, - int abandon, - unsigned int* nbytes_in, - unsigned int* nbytes_out ); - - void BZ2_bzWriteClose64 ( int *bzerror, BZFILE* f, - int abandon, - unsigned int* nbytes_in_lo32, - unsigned int* nbytes_in_hi32, - unsigned int* nbytes_out_lo32, - unsigned int* nbytes_out_hi32 ); -@end example - -Compresses and flushes to the compressed file all data so far supplied -by @code{BZ2_bzWrite}. The logical end-of-stream markers are also written, so -subsequent calls to @code{BZ2_bzWrite} are illegal. 
All memory associated -with the compressed file @code{b} is released. -@code{fflush} is called on the -compressed file, but it is not @code{fclose}'d. - -If @code{BZ2_bzWriteClose} is called to clean up after an error, the only -action is to release the memory. The library records the error codes -issued by previous calls, so this situation will be detected -automatically. There is no attempt to complete the compression -operation, nor to @code{fflush} the compressed file. You can force this -behaviour to happen even in the case of no error, by passing a nonzero -value to @code{abandon}. - -If @code{nbytes_in} is non-null, @code{*nbytes_in} will be set to be the -total volume of uncompressed data handled. Similarly, @code{nbytes_out} -will be set to the total volume of compressed data written. For -compatibility with older versions of the library, @code{BZ2_bzWriteClose} -only yields the lower 32 bits of these counts. Use -@code{BZ2_bzWriteClose64} if you want the full 64 bit counts. These -two functions are otherwise absolutely identical. - - -Possible assignments to @code{bzerror}: -@display - @code{BZ_SEQUENCE_ERROR} - if @code{b} was opened with @code{BZ2_bzReadOpen} - @code{BZ_IO_ERROR} - if there is an error writing the compressed file - @code{BZ_OK} - otherwise -@end display - -@subsection Handling embedded compressed data streams - -The high-level library facilitates use of -@code{bzip2} data streams which form some part of a surrounding, larger -data stream. -@itemize @bullet -@item For writing, the library takes an open file handle, writes -compressed data to it, @code{fflush}es it but does not @code{fclose} it. -The calling application can write its own data before and after the -compressed data stream, using that same file handle. -@item Reading is more complex, and the facilities are not as general -as they could be since generality is hard to reconcile with efficiency. 
-@code{BZ2_bzRead} reads from the compressed file in blocks of size -@code{BZ_MAX_UNUSED} bytes, and in doing so probably will overshoot -the logical end of compressed stream. -To recover this data once decompression has -ended, call @code{BZ2_bzReadGetUnused} after the last call of @code{BZ2_bzRead} -(the one returning @code{BZ_STREAM_END}) but before calling -@code{BZ2_bzReadClose}. -@end itemize - -This mechanism makes it easy to decompress multiple @code{bzip2} -streams placed end-to-end. At the end of one stream, when @code{BZ2_bzRead} -returns @code{BZ_STREAM_END}, call @code{BZ2_bzReadGetUnused} to collect the -unused data (copy it into your own buffer somewhere). -That data forms the start of the next compressed stream. -To start uncompressing that next stream, call @code{BZ2_bzReadOpen} again, -feeding in the unused data via the @code{unused}/@code{nUnused} -parameters. -Keep doing this until @code{BZ_STREAM_END} return coincides with the -physical end of file (@code{feof(f)}). In this situation -@code{BZ2_bzReadGetUnused} -will of course return no data. - -This should give some feel for how the high-level interface can be used. -If you require extra flexibility, you'll have to bite the bullet and get -to grips with the low-level interface.
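The "copy it into your own buffer somewhere" step matters because the pointer handed back by @code{BZ2_bzReadGetUnused} refers to storage owned by the stream, which is released by @code{BZ2_bzReadClose}. Here is a minimal sketch of such a carry-over buffer. It does not call the library at all: @code{Carry}, @code{carry_save} and @code{CARRY_MAX} are hypothetical names, and @code{CARRY_MAX} is assumed to match @code{BZ_MAX_UNUSED} (5000 in the 1.0 headers); in real code you would use @code{BZ_MAX_UNUSED} itself.

```c
#include <assert.h>
#include <string.h>

/* Assumption: mirrors BZ_MAX_UNUSED from bzlib.h. */
#define CARRY_MAX 5000

/* Caller-owned copy of the bytes that overshot the end of a stream. */
typedef struct {
    unsigned char data[CARRY_MAX];
    int len;
} Carry;

/* Copies nUnused bytes starting at unused into c.
   Returns 0 on success, -1 if the arguments are out of range;
   on failure c is left unchanged. */
int carry_save(Carry *c, const void *unused, int nUnused)
{
    assert(c != NULL);
    if (nUnused < 0 || nUnused > CARRY_MAX) return -1;
    if (nUnused > 0 && unused == NULL) return -1;
    if (nUnused > 0) memcpy(c->data, unused, (size_t)nUnused);
    c->len = nUnused;
    return 0;
}
```

The intended use is to call something like @code{carry_save} between @code{BZ2_bzReadGetUnused} and @code{BZ2_bzReadClose}, then pass @code{c.data} and @code{c.len} as the @code{unused}/@code{nUnused} arguments of the next @code{BZ2_bzReadOpen}.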
- -@subsection Standard file-reading/writing code -Here's how you'd write data to a compressed file: -@example -FILE* f; -BZFILE* b; -int nBuf; -char buf[ /* whatever size you like */ ]; -int bzerror; - -f = fopen ( "myfile.bz2", "wb" ); -if (!f) @{ - /* handle error */ -@} -b = BZ2_bzWriteOpen ( &bzerror, f, 9, 0, 0 ); -if (bzerror != BZ_OK) @{ - BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); - /* handle error */ -@} - -while ( /* condition */ ) @{ - /* get data to write into buf, and set nBuf appropriately */ - BZ2_bzWrite ( &bzerror, b, buf, nBuf ); - if (bzerror == BZ_IO_ERROR) @{ - BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); - /* handle error */ - @} -@} - -BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL ); -if (bzerror == BZ_IO_ERROR) @{ - /* handle error */ -@} -@end example -And to read from a compressed file: -@example -FILE* f; -BZFILE* b; -int nBuf; -char buf[ /* whatever size you like */ ]; -int bzerror; - -f = fopen ( "myfile.bz2", "rb" ); -if (!f) @{ - /* handle error */ -@} -b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 ); -if (bzerror != BZ_OK) @{ - BZ2_bzReadClose ( &bzerror, b ); - /* handle error */ -@} - -bzerror = BZ_OK; -while (bzerror == BZ_OK && /* arbitrary other conditions */) @{ - nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ ); - if (bzerror == BZ_OK || bzerror == BZ_STREAM_END) @{ - /* do something with buf[0 .. nBuf-1] */ - @} -@} -if (bzerror != BZ_STREAM_END) @{ - BZ2_bzReadClose ( &bzerror, b ); - /* handle error */ -@} else @{ - BZ2_bzReadClose ( &bzerror, b ); -@} -@end example - - - -@section Utility functions -@subsection @code{BZ2_bzBuffToBuffCompress} -@example - int BZ2_bzBuffToBuffCompress( char* dest, - unsigned int* destLen, - char* source, - unsigned int sourceLen, - int blockSize100k, - int verbosity, - int workFactor ); -@end example -Attempts to compress the data in @code{source[0 .. sourceLen-1]} -into the destination buffer, @code{dest[0 .. *destLen-1]}.
-If the destination buffer is big enough, @code{*destLen} is -set to the size of the compressed data, and @code{BZ_OK} is -returned. If the compressed data won't fit, @code{*destLen} -is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. - -Compression in this manner is a one-shot event, done with a single call -to this function. The resulting compressed data is a complete -@code{bzip2} format data stream. There is no mechanism for making -additional calls to provide extra input data. If you want that kind of -mechanism, use the low-level interface. - -For the meaning of parameters @code{blockSize100k}, @code{verbosity} -and @code{workFactor}, @* see @code{BZ2_bzCompressInit}. - -To guarantee that the compressed data will fit in its buffer, allocate -an output buffer of size 1% larger than the uncompressed data, plus -six hundred extra bytes. - -@code{BZ2_bzBuffToBuffCompress} will not write data at or -beyond @code{dest[*destLen]}, even in case of buffer overflow. - -Possible return values: -@display - @code{BZ_CONFIG_ERROR} - if the library has been mis-compiled - @code{BZ_PARAM_ERROR} - if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} - or @code{blockSize100k < 1} or @code{blockSize100k > 9} - or @code{verbosity < 0} or @code{verbosity > 4} - or @code{workFactor < 0} or @code{workFactor > 250} - @code{BZ_MEM_ERROR} - if insufficient memory is available - @code{BZ_OUTBUFF_FULL} - if the size of the compressed data exceeds @code{*destLen} - @code{BZ_OK} - otherwise -@end display - - - -@subsection @code{BZ2_bzBuffToBuffDecompress} -@example - int BZ2_bzBuffToBuffDecompress ( char* dest, - unsigned int* destLen, - char* source, - unsigned int sourceLen, - int small, - int verbosity ); -@end example -Attempts to decompress the data in @code{source[0 .. sourceLen-1]} -into the destination buffer, @code{dest[0 .. *destLen-1]}.
-If the destination buffer is big enough, @code{*destLen} is -set to the size of the uncompressed data, and @code{BZ_OK} is -returned. If the uncompressed data won't fit, @code{*destLen} -is unchanged, and @code{BZ_OUTBUFF_FULL} is returned. - -@code{source} is assumed to hold a complete @code{bzip2} format -data stream. @* @code{BZ2_bzBuffToBuffDecompress} tries to decompress -the entirety of the stream into the output buffer. - -For the meaning of parameters @code{small} and @code{verbosity}, -see @code{BZ2_bzDecompressInit}. - -Because the compression ratio of the compressed data cannot be known in -advance, there is no easy way to guarantee that the output buffer will -be big enough. You may of course make arrangements in your code to -record the size of the uncompressed data, but such a mechanism is beyond -the scope of this library. - -@code{BZ2_bzBuffToBuffDecompress} will not write data at or -beyond @code{dest[*destLen]}, even in case of buffer overflow. - -Possible return values: -@display - @code{BZ_CONFIG_ERROR} - if the library has been mis-compiled - @code{BZ_PARAM_ERROR} - if @code{dest} is @code{NULL} or @code{destLen} is @code{NULL} - or @code{small != 0 && small != 1} - or @code{verbosity < 0} or @code{verbosity > 4} - @code{BZ_MEM_ERROR} - if insufficient memory is available - @code{BZ_OUTBUFF_FULL} - if the size of the uncompressed data exceeds @code{*destLen} - @code{BZ_DATA_ERROR} - if a data integrity error was detected in the compressed data - @code{BZ_DATA_ERROR_MAGIC} - if the compressed data doesn't begin with the right magic bytes - @code{BZ_UNEXPECTED_EOF} - if the compressed data ends unexpectedly - @code{BZ_OK} - otherwise -@end display - - - -@section @code{zlib} compatibility functions -Yoshioka Tsuneo has contributed some functions to -give better @code{zlib} compatibility.
These functions are -@code{BZ2_bzopen}, @code{BZ2_bzdopen}, @code{BZ2_bzread}, @code{BZ2_bzwrite}, -@code{BZ2_bzflush}, @code{BZ2_bzclose}, -@code{BZ2_bzerror} and @code{BZ2_bzlibVersion}. -These functions are not (yet) officially part of -the library. If they break, you get to keep all the pieces. -Nevertheless, I think they work ok. -@example -typedef void BZFILE; - -const char * BZ2_bzlibVersion ( void ); -@end example -Returns a string indicating the library version. -@example -BZFILE * BZ2_bzopen ( const char *path, const char *mode ); -BZFILE * BZ2_bzdopen ( int fd, const char *mode ); -@end example -Opens a @code{.bz2} file for reading or writing, using either its name -or a pre-existing file descriptor. -Analogous to @code{fopen} and @code{fdopen}. -@example -int BZ2_bzread ( BZFILE* b, void* buf, int len ); -int BZ2_bzwrite ( BZFILE* b, void* buf, int len ); -@end example -Reads/writes data from/to a previously opened @code{BZFILE}. -Analogous to @code{fread} and @code{fwrite}. -@example -int BZ2_bzflush ( BZFILE* b ); -void BZ2_bzclose ( BZFILE* b ); -@end example -Flushes/closes a @code{BZFILE}. @code{BZ2_bzflush} doesn't actually do -anything. Analogous to @code{fflush} and @code{fclose}. - -@example -const char * BZ2_bzerror ( BZFILE *b, int *errnum ); -@end example -Returns a string describing the most recent error status of -@code{b}, and also sets @code{*errnum} to its numerical value. - - -@section Using the library in a @code{stdio}-free environment - -@subsection Getting rid of @code{stdio} - -In a deeply embedded application, you might want to use just -the memory-to-memory functions. You can do this conveniently -by compiling the library with preprocessor symbol @code{BZ_NO_STDIO} -defined.
Doing this gives you a library containing only the following -eight functions: - -@code{BZ2_bzCompressInit}, @code{BZ2_bzCompress}, @code{BZ2_bzCompressEnd} @* -@code{BZ2_bzDecompressInit}, @code{BZ2_bzDecompress}, @code{BZ2_bzDecompressEnd} @* -@code{BZ2_bzBuffToBuffCompress}, @code{BZ2_bzBuffToBuffDecompress} - -When compiled like this, all functions will ignore @code{verbosity} -settings. - -@subsection Critical error handling -@code{libbzip2} contains a number of internal assertion checks which -should, needless to say, never be activated. Nevertheless, if an -assertion should fail, behaviour depends on whether or not the library -was compiled with @code{BZ_NO_STDIO} set. - -For a normal compile, an assertion failure yields the message -@example - bzip2/libbzip2: internal error number N. - This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000. - Please report it to me at: jseward@@acm.org. If this happened - when you were using some program which uses libbzip2 as a - component, you should also report this bug to the author(s) - of that program. Please make an effort to report this bug; - timely and accurate bug reports eventually lead to higher - quality software. Thanks. Julian Seward, 21 March 2000. -@end example -where @code{N} is some error code number. @code{exit(3)} -is then called. - -For a @code{stdio}-free library, assertion failures result -in a call to a function declared as: -@example - extern void bz_internal_error ( int errcode ); -@end example -The relevant code is passed as a parameter. You should supply -such a function. - -In either case, once an assertion failure has occurred, any -@code{bz_stream} records involved can be regarded as invalid. -You should not attempt to resume normal operation with them. - -You may, of course, change critical error handling to suit -your needs. As I said above, critical errors indicate bugs -in the library and should not occur. 
All "normal" error -situations are indicated via error return codes from functions, -and can be recovered from. - - -@section Making a Windows DLL -Everything related to Windows has been contributed by Yoshioka Tsuneo -@* (@code{QWF00133@@niftyserve.or.jp} / -@code{tsuneo-y@@is.aist-nara.ac.jp}), so you should send your queries to -him (but perhaps Cc: me, @code{jseward@@acm.org}). - -My vague understanding of what to do is: using Visual C++ 5.0, -open the project file @code{libbz2.dsp}, and build. That's all. - -If you can't -open the project file for some reason, make a new one, naming these files: -@code{blocksort.c}, @code{bzlib.c}, @code{compress.c}, -@code{crctable.c}, @code{decompress.c}, @code{huffman.c}, @* -@code{randtable.c} and @code{libbz2.def}. You will also need -to name the header files @code{bzlib.h} and @code{bzlib_private.h}. - -If you don't use VC++, you may need to define the preprocessor symbol -@code{_WIN32}. - -Finally, @code{dlltest.c} is a sample program using the DLL. It has a -project file, @code{dlltest.dsp}. - -If you just want a makefile for Visual C, have a look at -@code{makefile.msc}. - -Be aware that if you compile @code{bzip2} itself on Win32, you must set -@code{BZ_UNIX} to 0 and @code{BZ_LCCWIN32} to 1, in the file -@code{bzip2.c}, before compiling. Otherwise the resulting binary won't -work correctly. - -I haven't tried any of this stuff myself, but it all looks plausible. - - - -@chapter Miscellanea - -These are just some random thoughts of mine. Your mileage may -vary. - -@section Limitations of the compressed file format -@code{bzip2-1.0}, @code{0.9.5} and @code{0.9.0} -use exactly the same file format as the previous -version, @code{bzip2-0.1}. This decision was made in the interests of -stability. Creating yet another incompatible compressed file format -would create further confusion and disruption for users. - -Nevertheless, this is not a painless decision.
Development -work since the release of @code{bzip2-0.1} in August 1997 -has shown complexities in the file format which slow down -decompression and, in retrospect, are unnecessary. These are: -@itemize @bullet -@item The run-length encoder, which is the first of the - compression transformations, is entirely irrelevant. - The original purpose was to protect the sorting algorithm - from the very worst case input: a string of repeated - symbols. But algorithm steps Q6a and Q6b in the original - Burrows-Wheeler technical report (SRC-124) show how - repeats can be handled without difficulty in block - sorting. -@item The randomisation mechanism doesn't really need to be - there. Udi Manber and Gene Myers published a suffix - array construction algorithm a few years back, which - can be employed to sort any block, no matter how - repetitive, in O(N log N) time. Subsequent work by - Kunihiko Sadakane has produced a derivative O(N (log N)^2) - algorithm which usually outperforms the Manber-Myers - algorithm. - - I could have changed to Sadakane's algorithm, but I find - it to be slower than @code{bzip2}'s existing algorithm for - most inputs, and the randomisation mechanism protects - adequately against bad cases. I didn't think it was - a good tradeoff to make. Partly this is due to the fact - that I was not flooded with email complaints about - @code{bzip2-0.1}'s performance on repetitive data, so - perhaps it isn't a problem for real inputs. - - Probably the best long-term solution, - and the one I have incorporated into 0.9.5 and above, - is to use the existing sorting - algorithm initially, and fall back to a O(N (log N)^2) - algorithm if the standard algorithm gets into difficulties. -@item The compressed file format was never designed to be - handled by a library, and I have had to jump through - some hoops to produce an efficient implementation of - decompression. It's a bit hairy.
Try passing - @code{decompress.c} through the C preprocessor - and you'll see what I mean. Much of this complexity - could have been avoided if the compressed size of - each block of data was recorded in the data stream. -@item An Adler-32 checksum, rather than a CRC32 checksum, - would be faster to compute. -@end itemize -It would be fair to say that the @code{bzip2} format was frozen -before I properly and fully understood the performance -consequences of doing so. - -Improvements which I was able to incorporate into -0.9.0, despite using the same file format, are: -@itemize @bullet -@item Single array implementation of the inverse BWT. This - significantly speeds up decompression, presumably - because it reduces the number of cache misses. -@item Faster inverse MTF transform for large MTF values. The - new implementation is based on the notion of sliding blocks - of values. -@item @code{bzip2-0.9.0} now reads and writes files with @code{fread} - and @code{fwrite}; version 0.1 used @code{putc} and @code{getc}. - Duh! Well, you live and learn. - -@end itemize -Further ahead, it would be nice -to be able to do random access into files. This will -require some careful design of compressed file formats. - - - -@section Portability issues -After some consideration, I have decided not to use -GNU @code{autoconf} to configure 0.9.5 or 1.0. - -@code{autoconf}, admirable and wonderful though it is, -mainly assists with portability problems between Unix-like -platforms. But @code{bzip2} doesn't have much in the way -of portability problems on Unix; most of the difficulties appear -when porting to the Mac, or to Microsoft's operating systems. -@code{autoconf} doesn't help in those cases, and brings in a -whole load of new complexity. - -Most people should be able to compile the library and program -under Unix straight out-of-the-box, so to speak, especially -if you have a version of GNU C available. - -There are a couple of @code{__inline__} directives in the code. 
GNU C -(@code{gcc}) should be able to handle them. If you're not using -GNU C, your C compiler shouldn't see them at all. -If your compiler does, for some reason, see them and doesn't -like them, just @code{#define} @code{__inline__} to be @code{/* */}. One -easy way to do this is to compile with the flag @code{-D__inline__=}, -which should be understood by most Unix compilers. - -If you still have difficulties, try compiling with the macro -@code{BZ_STRICT_ANSI} defined. This should enable you to build the -library in a strictly ANSI compliant environment. Building the program -itself like this is dangerous and not supported, since you remove -@code{bzip2}'s checks against compressing directories, symbolic links, -devices, and other not-really-a-file entities. This could cause -filesystem corruption! - -One other thing: if you create a @code{bzip2} binary for public -distribution, please try to link it statically (@code{gcc -static}). This -avoids all sorts of library-version issues that others may encounter -later on. - -If you build @code{bzip2} on Win32, you must set @code{BZ_UNIX} to 0 and -@code{BZ_LCCWIN32} to 1, in the file @code{bzip2.c}, before compiling. -Otherwise the resulting binary won't work correctly. - - - -@section Reporting bugs -I tried pretty hard to make sure @code{bzip2} is -bug free, both by design and by testing. Hopefully -you'll never need to read this section for real. - -Nevertheless, if @code{bzip2} dies with a segmentation -fault, a bus error or an internal assertion failure, it -will ask you to email me a bug report. Experience with -version 0.1 shows that almost all these problems can -be traced to either compiler bugs or hardware problems. -@itemize @bullet -@item -Recompile the program with no optimisation, and see if it -works. And/or try a different compiler. -I heard all sorts of stories about various flavours -of GNU C (and other compilers) generating bad code for -@code{bzip2}, and I've run across two such examples myself.
- -2.7.X versions of GNU C are known to generate bad code from -time to time, at high optimisation levels. -If you get problems, try using the flags -@code{-O2} @code{-fomit-frame-pointer} @code{-fno-strength-reduce}. -You should specifically @emph{not} use @code{-funroll-loops}. - -You may notice that the Makefile runs six tests as part of -the build process. If the program passes all of these, it's -a pretty good (but not 100%) indication that the compiler has -done its job correctly. -@item -If @code{bzip2} crashes randomly, and the crashes are not -repeatable, you may have a flaky memory subsystem. @code{bzip2} -really hammers your memory hierarchy, and if it's a bit marginal, -you may get these problems. Ditto if your disk or I/O subsystem -is slowly failing. Yup, this really does happen. - -Try using a different machine of the same type, and see if -you can repeat the problem. -@item This isn't really a bug, but ... If @code{bzip2} tells -you your file is corrupted on decompression, and you -obtained the file via FTP, there is a possibility that you -forgot to tell FTP to do a binary mode transfer. That absolutely -will cause the file to be non-decompressible. You'll have to transfer -it again. -@end itemize - -If you've incorporated @code{libbzip2} into your own program -and are getting problems, please, please, please, check that the -parameters you are passing in calls to the library, are -correct, and in accordance with what the documentation says -is allowable. I have tried to make the library robust against -such problems, but I'm sure I haven't succeeded. - -Finally, if the above comments don't help, you'll have to send -me a bug report. Now, it's just amazing how many people will -send me a bug report saying something like -@display - bzip2 crashed with segmentation fault on my machine -@end display -and absolutely nothing else. 
Needless to say, such a report -is @emph{totally, utterly, completely and comprehensively 100% useless; -a waste of your time, my time, and net bandwidth}. -With no details at all, there's no way I can possibly begin -to figure out what the problem is. - -The rules of the game are: facts, facts, facts. Don't omit -them because "oh, they won't be relevant". At the bare -minimum: -@display - Machine type. Operating system version. - Exact version of @code{bzip2} (do @code{bzip2 -V}). - Exact version of the compiler used. - Flags passed to the compiler. -@end display -However, the most important single thing that will help me is -the file that you were trying to compress or decompress at the -time the problem happened. Without that, my ability to do anything -more than speculate about the cause is limited. - -Please remember that I connect to the Internet with a modem, so -you should contact me before mailing me huge files. - - -@section Did you get the right package? - -@code{bzip2} is a resource hog. It soaks up large amounts of CPU cycles -and memory. Also, it gives very large latencies. In the worst case, you -can feed many megabytes of uncompressed data into the library before -getting any compressed output, so this probably rules out applications -requiring interactive behaviour. - -These aren't faults of my implementation, I hope, but more -an intrinsic property of the Burrows-Wheeler transform (unfortunately). -Maybe this isn't what you want. - -If you want a compressor and/or library which is faster, uses less -memory but gets pretty good compression, and has minimal latency, -consider Jean-loup -Gailly's and Mark Adler's work, @code{zlib-1.1.2} and -@code{gzip-1.2.4}. Look for them at - -@code{http://www.cdrom.com/pub/infozip/zlib} and -@code{http://www.gzip.org} respectively.
- -For something faster and lighter still, you might try Markus F X J -Oberhumer's @code{LZO} real-time compression/decompression library, at -@* @code{http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html}. - -If you want to use the @code{bzip2} algorithms to compress small blocks -of data, 64k bytes or smaller, for example on an on-the-fly disk -compressor, you'd be well advised not to use this library. Instead, -I've made a special library tuned for that kind of use. It's part of -@code{e2compr-0.40}, an on-the-fly disk compressor for the Linux -@code{ext2} filesystem. Look at -@code{http://www.netspace.net.au/~reiter/e2compr}. - - - -@section Testing - -A record of the tests I've done. - -First, some data sets: -@itemize @bullet -@item B: a directory containing 6001 files, one for every length in the - range 0 to 6000 bytes. The files contain random lowercase - letters. 18.7 megabytes. -@item H: my home directory tree. Documents, source code, mail files, - compressed data. H contains B, and also a directory of - files designed as boundary cases for the sorting; mostly very - repetitive, nasty files. 565 megabytes. -@item A: directory tree holding various applications built from source: - @code{egcs}, @code{gcc-2.8.1}, KDE, GTK, Octave, etc. - 2200 megabytes. -@end itemize -The tests conducted are as follows. Each test means compressing -(a copy of) each file in the data set, decompressing it and -comparing it against the original. - -First, a bunch of tests with block sizes and internal buffer -sizes set very small, -to detect any problems with the -blocking and buffering mechanisms. -This required modifying the source code so as to try to -break it. -@enumerate -@item Data set H, with - buffer size of 1 byte, and block size of 23 bytes. -@item Data set B, buffer sizes 1 byte, block size 1 byte. -@item As (2) but small-mode decompression. -@item As (2) with block size 2 bytes. -@item As (2) with block size 3 bytes. -@item As (2) with block size 4 bytes. 
-@item As (2) with block size 5 bytes. -@item As (2) with block size 6 bytes and small-mode decompression. -@item H with buffer size of 1 byte, but normal block - size (up to 900000 bytes). -@end enumerate -Then some tests with unmodified source code. -@enumerate -@item H, all settings normal. -@item As (1), with small-mode decompress. -@item H, compress with flag @code{-1}. -@item H, compress with flag @code{-s}, decompress with flag @code{-s}. -@item Forwards compatibility: H, @code{bzip2-0.1pl2} compressing, - @code{bzip2-0.9.5} decompressing, all settings normal. -@item Backwards compatibility: H, @code{bzip2-0.9.5} compressing, - @code{bzip2-0.1pl2} decompressing, all settings normal. -@item Bigger tests: A, all settings normal. -@item As (7), using the fallback (Sadakane-like) sorting algorithm. -@item As (8), compress with flag @code{-1}, decompress with flag - @code{-s}. -@item H, using the fallback sorting algorithm. -@item Forwards compatibility: A, @code{bzip2-0.1pl2} compressing, - @code{bzip2-0.9.5} decompressing, all settings normal. -@item Backwards compatibility: A, @code{bzip2-0.9.5} compressing, - @code{bzip2-0.1pl2} decompressing, all settings normal. -@item Misc test: about 400 megabytes of @code{.tar} files with - @code{bzip2} compiled with Checker (a memory access error - detector, like Purify). -@item Misc tests to make sure it builds and runs ok on non-Linux/x86 - platforms. -@end enumerate -These tests were conducted on a 225 MHz IDT WinChip machine, running -Linux 2.0.36. They represent nearly a week of continuous computation. -All tests completed successfully. - - -@section Further reading -@code{bzip2} is not research work, in the sense that it doesn't present -any new ideas. Rather, it's an engineering exercise based on existing -ideas. - -Four documents describe essentially all the ideas behind @code{bzip2}: -@example -Michael Burrows and D. J. Wheeler: - "A block-sorting lossless data compression algorithm" - 10th May 1994. 
- Digital SRC Research Report 124. - ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz - If you have trouble finding it, try searching at the - New Zealand Digital Library, http://www.nzdl.org. - -Daniel S. Hirschberg and Debra A. LeLewer - "Efficient Decoding of Prefix Codes" - Communications of the ACM, April 1990, Vol 33, Number 4. - You might be able to get an electronic copy of this - from the ACM Digital Library. - -David J. Wheeler - Program bred3.c and accompanying document bred3.ps. - This contains the idea behind the multi-table Huffman - coding scheme. - ftp://ftp.cl.cam.ac.uk/users/djw3/ - -Jon L. Bentley and Robert Sedgewick - "Fast Algorithms for Sorting and Searching Strings" - Available from Sedgewick's web page, - www.cs.princeton.edu/~rs -@end example -The following paper gives valuable additional insights into the -algorithm, but is not immediately the basis of any code -used in bzip2. -@example -Peter Fenwick: - Block Sorting Text Compression - Proceedings of the 19th Australasian Computer Science Conference, - Melbourne, Australia. Jan 31 - Feb 2, 1996. - ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps -@end example -Kunihiko Sadakane's sorting algorithm, mentioned above, -is available from: -@example -http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz -@end example -The Manber-Myers suffix array construction -algorithm is described in a paper -available from: -@example -http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps -@end example -Finally, the following paper documents some recent investigations -I made into the performance of sorting algorithms: -@example -Julian Seward: - On the Performance of BWT Sorting Algorithms - Proceedings of the IEEE Data Compression Conference 2000 - Snowbird, Utah. 28-30 March 2000. 
-@end example - - -@contents - -@bye - diff -Nru bzip2-1.0.1/manual_1.html bzip2-1.0.1.new/manual_1.html --- bzip2-1.0.1/manual_1.html Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/manual_1.html Thu Jan 1 01:00:00 1970 @@ -1,47 +0,0 @@ - - - - -bzip2 and libbzip2 - Introduction - - - - - -



- - -

Introduction

- -

-bzip2 compresses files using the Burrows-Wheeler -block-sorting text compression algorithm, and Huffman coding. -Compression is generally considerably better than that -achieved by more conventional LZ77/LZ78-based compressors, -and approaches the performance of the PPM family of statistical compressors. - -

-

-bzip2 is built on top of libbzip2, a flexible library -for handling compressed data in the bzip2 format. This manual -describes both how to use the program and -how to work with the library interface. Most of the -manual is devoted to this library, not the program, -which is good news if your interest is only in the program. - -

-

-Chapter 2 describes how to use bzip2; this is the only part -you need to read if you just want to know how to operate the program. -Chapter 3 describes the programming interfaces in detail, and -Chapter 4 records some miscellaneous notes which I thought -ought to be recorded somewhere. - -

- -


-

diff -Nru bzip2-1.0.1/manual_2.html bzip2-1.0.1.new/manual_2.html --- bzip2-1.0.1/manual_2.html Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/manual_2.html Thu Jan 1 01:00:00 1970 @@ -1,484 +0,0 @@ - - - - -bzip2 and libbzip2 - How to use bzip2 - - - - - - -



- - -

How to use bzip2

- -

-This chapter contains a copy of the bzip2 man page, -and nothing else. - -

- -
- - - -

NAME

- -
    -
  • bzip2, bunzip2 - -- a block-sorting file compressor, v1.0 -
  • bzcat - -- decompresses files to stdout -
  • bzip2recover - -- recovers data from damaged bzip2 files -
- - - -

SYNOPSIS

- -
    -
  • bzip2 [ -cdfkqstvzVL123456789 ] [ filenames ... ] - -
  • bunzip2 [ -fkvsVL ] [ filenames ... ] - -
  • bzcat [ -s ] [ filenames ... ] - -
  • bzip2recover filename - -
- - - -

DESCRIPTION

- -

-bzip2 compresses files using the Burrows-Wheeler block sorting -text compression algorithm, and Huffman coding. Compression is -generally considerably better than that achieved by more conventional -LZ77/LZ78-based compressors, and approaches the performance of the PPM -family of statistical compressors. - -

-

-The command-line options are deliberately very similar to those of GNU -gzip, but they are not identical. - -

-

-bzip2 expects a list of file names to accompany the command-line -flags. Each file is replaced by a compressed version of itself, with -the name original_name.bz2. Each compressed file has the same -modification date, permissions, and, when possible, ownership as the -corresponding original, so that these properties can be correctly -restored at decompression time. File name handling is naive in the -sense that there is no mechanism for preserving original file names, -permissions, ownerships or dates in filesystems which lack these -concepts, or have serious file name length restrictions, such as MS-DOS. - -

-

-bzip2 and bunzip2 will by default not overwrite existing -files. If you want this to happen, specify the -f flag. - -

-

-If no file names are specified, bzip2 compresses from standard -input to standard output. In this case, bzip2 will decline to -write compressed output to a terminal, as this would be entirely -incomprehensible and therefore pointless. - -

-

-bunzip2 (or bzip2 -d) decompresses all -specified files. Files which were not created by bzip2 -will be detected and ignored, and a warning issued. -bzip2 attempts to guess the filename for the decompressed file -from that of the compressed file as follows: - -

    -
  • filename.bz2 becomes filename - -
  • filename.bz becomes filename - -
  • filename.tbz2 becomes filename.tar - -
  • filename.tbz becomes filename.tar - -
  • anyothername becomes anyothername.out - -
- -

-If the file does not end in one of the recognised endings, -.bz2, .bz, -.tbz2 or .tbz, bzip2 complains that it cannot -guess the name of the original file, and uses the original name -with .out appended. - -
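
The renaming rules above can be sketched in C roughly as follows. This is only an illustration of the table in the text, not code taken from bzip2.c; the names `ends_with` and `guess_output_name` and the return-value convention are invented for the sketch.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` ends with `suffix`, else 0. */
static int ends_with(const char *name, const char *suffix)
{
    size_t n = strlen(name), s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/*
 * Hypothetical helper: write the guessed decompressed name for `in`
 * into `out` (capacity `cap`).
 * Returns  1 if a recognised ending was mapped,
 *          0 if the ".out" fallback was used,
 *         -1 if `out` is too small.
 */
int guess_output_name(const char *in, char *out, size_t cap)
{
    /* Endings and replacements as listed in the text. */
    static const struct { const char *from; const char *to; } rules[] = {
        { ".bz2",  ""     },
        { ".bz",   ""     },
        { ".tbz2", ".tar" },
        { ".tbz",  ".tar" },
    };
    size_t i;
    int written;

    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (ends_with(in, rules[i].from)) {
            size_t stem = strlen(in) - strlen(rules[i].from);
            written = snprintf(out, cap, "%.*s%s", (int)stem, in, rules[i].to);
            return (written < 0 || (size_t)written >= cap) ? -1 : 1;
        }
    }

    /* Unrecognised ending: keep the original name and append ".out". */
    written = snprintf(out, cap, "%s.out", in);
    return (written < 0 || (size_t)written >= cap) ? -1 : 0;
}
```

A caller would pass a buffer at least five bytes longer than the input name, since the fallback case appends four characters plus the terminating NUL.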

-

-As with compression, supplying no -filenames causes decompression from standard input to standard output. - -

-

-bunzip2 will correctly decompress a file which is the -concatenation of two or more compressed files. The result is the -concatenation of the corresponding uncompressed files. Integrity -testing (-t) of concatenated compressed files is also supported. - -

-

-You can also compress or decompress files to the standard output by -giving the -c flag. Multiple files may be compressed and -decompressed like this. The resulting outputs are fed sequentially to -stdout. Compression of multiple files in this manner generates a stream -containing multiple compressed file representations. Such a stream -can be decompressed correctly only by bzip2 version 0.9.0 or -later. Earlier versions of bzip2 will stop after decompressing -the first file in the stream. - -

-

-bzcat (or bzip2 -dc) decompresses all specified files to -the standard output. - -

-

-bzip2 will read arguments from the environment variables -BZIP2 and BZIP, in that order, and will process them -before any arguments read from the command line. This gives a -convenient way to supply default arguments. - -

-

-Compression is always performed, even if the compressed file is slightly -larger than the original. Files of less than about one hundred bytes -tend to get larger, since the compression mechanism has a constant -overhead in the region of 50 bytes. Random data (including the output -of most file compressors) is coded at about 8.05 bits per byte, giving -an expansion of around 0.5%. - -

-

-As a self-check for your protection, bzip2 uses 32-bit CRCs to -make sure that the decompressed version of a file is identical to the -original. This guards against corruption of the compressed data, and -against undetected bugs in bzip2 (hopefully very unlikely). The -chance of data corruption going undetected is microscopic, about one -chance in four billion for each file processed. Be aware, though, that -the check occurs upon decompression, so it can only tell you that -something is wrong. It can't help you recover the original uncompressed -data. You can use bzip2recover to try to recover data from -damaged files. -

-

-Return values: 0 for a normal exit, 1 for environmental problems (file -not found, invalid flags, I/O errors, &c), 2 to indicate a corrupt -compressed file, 3 for an internal consistency error (eg, bug) which -caused bzip2 to panic. - -
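
For scripts or wrapper programs that launch bzip2, the four return values can be turned into messages along these lines. This is a sketch only: the enumerator names and the function `bzip2_exit_meaning` are made up here and do not come from the bzip2 sources; only the numeric values and their meanings are from the text above.

```c
#include <stddef.h>

/* Exit statuses as listed in the text; the names are hypothetical. */
enum bzip2_exit {
    BZIP2_EXIT_OK          = 0,  /* normal exit */
    BZIP2_EXIT_ENVIRONMENT = 1,  /* file not found, invalid flags, I/O errors */
    BZIP2_EXIT_CORRUPT     = 2,  /* corrupt compressed file */
    BZIP2_EXIT_INTERNAL    = 3   /* internal consistency error (a bug) */
};

/* Map an exit status to a short description. */
const char *bzip2_exit_meaning(int status)
{
    switch (status) {
    case BZIP2_EXIT_OK:          return "normal exit";
    case BZIP2_EXIT_ENVIRONMENT: return "environmental problem";
    case BZIP2_EXIT_CORRUPT:     return "corrupt compressed file";
    case BZIP2_EXIT_INTERNAL:    return "internal consistency error";
    default:                     return "unknown status";
    }
}
```

Any status outside 0 to 3 is reported as unknown rather than guessed at, since the text defines no other values.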

- - - -

OPTIONS

-
- -
-c --stdout -
-Compress or decompress to standard output. -
-d --decompress -
-Force decompression. bzip2, bunzip2 and bzcat are -really the same program, and the decision about what actions to take is -done on the basis of which name is used. This flag overrides that -mechanism, and forces bzip2 to decompress. -
-z --compress -
-The complement to -d: forces compression, regardless of the -invocation name. -
-t --test -
-Check integrity of the specified file(s), but don't decompress them. -This really performs a trial decompression and throws away the result. -
-f --force -
-Force overwrite of output files. Normally, bzip2 will not overwrite -existing output files. Also forces bzip2 to break hard links -to files, which it otherwise wouldn't do. -
-k --keep -
-Keep (don't delete) input files during compression -or decompression. -
-s --small -
-Reduce memory usage, for compression, decompression and testing. Files -are decompressed and tested using a modified algorithm which only -requires 2.5 bytes per block byte. This means any file can be -decompressed in 2300k of memory, albeit at about half the normal speed. - -During compression, -s selects a block size of 200k, which limits -memory use to around the same figure, at the expense of your compression -ratio. In short, if your machine is low on memory (8 megabytes or -less), use -s for everything. See MEMORY MANAGEMENT below. -
-q --quiet -
-Suppress non-essential warning messages. Messages pertaining to -I/O errors and other critical events will not be suppressed. -
-v --verbose -
-Verbose mode -- show the compression ratio for each file processed. -Further -v's increase the verbosity level, spewing out lots of -information which is primarily of interest for diagnostic purposes. -
-L --license -V --version -
-Display the software version, license terms and conditions. -
-1 to -9 -
-Set the block size to 100 k, 200 k .. 900 k when compressing. Has no -effect when decompressing. See MEMORY MANAGEMENT below. -
-- -
-Treats all subsequent arguments as file names, even if they start -with a dash. This is so you can handle files with names beginning -with a dash, for example: bzip2 -- -myfilename. -
--repetitive-fast -
-
--repetitive-best -
-These flags are redundant in versions 0.9.5 and above. They provided -some coarse control over the behaviour of the sorting algorithm in -earlier versions, which was sometimes useful. 0.9.5 and above have an -improved algorithm which renders these flags irrelevant. -
- - - -

MEMORY MANAGEMENT

- -

-bzip2 compresses large files in blocks. The block size affects -both the compression ratio achieved, and the amount of memory needed for -compression and decompression. The flags -1 through -9 -specify the block size to be 100,000 bytes through 900,000 bytes (the -default) respectively. At decompression time, the block size used for -compression is read from the header of the compressed file, and -bunzip2 then allocates itself just enough memory to decompress -the file. Since block sizes are stored in compressed files, it follows -that the flags -1 to -9 are irrelevant to and so ignored -during decompression. - -

-

-Compression and decompression requirements, in bytes, can be estimated -as: - -

-     Compression:   400k + ( 8 x block size )
-
-     Decompression: 100k + ( 4 x block size ), or
-                    100k + ( 2.5 x block size )
-
- -

-Larger block sizes give rapidly diminishing marginal returns. Most of -the compression comes from the first two or three hundred k of block -size, a fact worth bearing in mind when using bzip2 on small machines. -It is also important to appreciate that the decompression memory -requirement is set at compression time by the choice of block size. - -

-

-For files compressed with the default 900k block size, bunzip2 -will require about 3700 kbytes to decompress. To support decompression -of any file on a 4 megabyte machine, bunzip2 has an option to -decompress using approximately half this amount of memory, about 2300 -kbytes. Decompression speed is also halved, so you should use this -option only where necessary. The relevant flag is -s. - -

-

-In general, try and use the largest block size memory constraints allow, -since that maximises the compression achieved. Compression and -decompression speed are virtually unaffected by block size. - -

-

-Another significant point applies to files which fit in a single block --- that means most files you'd encounter using a large block size. The -amount of real memory touched is proportional to the size of the file, -since the file is smaller than a block. For example, compressing a file -20,000 bytes long with the flag -9 will cause the compressor to -allocate around 7600k of memory, but only touch 400k + 20000 * 8 = 560 -kbytes of it. Similarly, the decompressor will allocate 3700k but only -touch 100k + 20000 * 4 = 180 kbytes. - -
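
The memory estimates in this section are easy to reproduce. The sketch below simply encodes the three formulas given earlier, taking 1k to mean 1000 bytes so that the results match the "100,000 bytes through 900,000 bytes" block sizes; the function names are invented for illustration and are not part of libbzip2.

```c
/* 1k is taken as 1000 bytes, matching the block sizes quoted in the text. */
#define BYTES_PER_K 1000UL

/* Compression: 400k + ( 8 x block size ) */
unsigned long compress_bytes(unsigned long block_size)
{
    return 400 * BYTES_PER_K + 8 * block_size;
}

/* Normal decompression: 100k + ( 4 x block size ) */
unsigned long decompress_bytes(unsigned long block_size)
{
    return 100 * BYTES_PER_K + 4 * block_size;
}

/* Small-mode (-s) decompression: 100k + ( 2.5 x block size ),
   computed as 5/2 to stay in integer arithmetic. */
unsigned long decompress_small_bytes(unsigned long block_size)
{
    return 100 * BYTES_PER_K + (5 * block_size) / 2;
}
```

Passing 900000 gives the allocation figures for the default block size, while passing the file length (for example 20000) gives the amount of memory actually touched for a file smaller than one block, as in the worked example above.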

-

-Here is a table which summarises the maximum memory usage for different -block sizes. Also recorded is the total compressed size for 14 files of -the Calgary Text Compression Corpus totalling 3,141,622 bytes. This -column gives some feel for how compression varies with block size. -These figures tend to understate the advantage of larger block sizes for -larger files, since the Corpus is dominated by smaller files. - -

-          Compress   Decompress   Decompress   Corpus
-   Flag     usage      usage       -s usage     Size
-
-    -1      1200k       500k         350k      914704
-    -2      2000k       900k         600k      877703
-    -3      2800k      1300k         850k      860338
-    -4      3600k      1700k        1100k      846899
-    -5      4400k      2100k        1350k      845160
-    -6      5200k      2500k        1600k      838626
-    -7      6100k      2900k        1850k      834096
-    -8      6800k      3300k        2100k      828642
-    -9      7600k      3700k        2350k      828642
-
- - - -

RECOVERING DATA FROM DAMAGED FILES

- -

-bzip2 compresses files in blocks, usually 900kbytes long. Each -block is handled independently. If a media or transmission error causes -a multi-block .bz2 file to become damaged, it may be possible to -recover data from the undamaged blocks in the file. - -

-

-The compressed representation of each block is delimited by a 48-bit -pattern, which makes it possible to find the block boundaries with -reasonable certainty. Each block also carries its own 32-bit CRC, so -damaged blocks can be distinguished from undamaged ones. - -

-

-bzip2recover is a simple program whose purpose is to search for -blocks in .bz2 files, and write each block out into its own -.bz2 file. You can then use bzip2 -t to test the -integrity of the resulting files, and decompress those which are -undamaged. - -

-

-bzip2recover -takes a single argument, the name of the damaged file, -and writes a number of files rec0001file.bz2, - rec0002file.bz2, etc, containing the extracted blocks. - The output filenames are designed so that the use of - wildcards in subsequent processing -- for example, -bzip2 -dc rec*file.bz2 > recovered_data -- lists the files in - the correct order. - -
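
The reason the wildcard lists the pieces in the right order is the fixed-width, zero-padded block number. A minimal sketch of that naming scheme, assuming the four-digit form shown in the text (the function name `recovered_name` is hypothetical and this is not the bzip2recover source):

```c
#include <stdio.h>

/*
 * Build the name of the output file for block number `block_no`
 * extracted from `file`, e.g. (1, "file.bz2") -> "rec0001file.bz2".
 * Returns 0 on success, -1 if `out` (capacity `cap`) is too small.
 */
int recovered_name(int block_no, const char *file, char *out, size_t cap)
{
    /* %04d pads with zeros so that names sort in block order. */
    int n = snprintf(out, cap, "rec%04d%s", block_no, file);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

With this padding, "rec0002file.bz2" sorts before "rec0010file.bz2"; without it, "rec10" would sort before "rec2" and the recovered data would be concatenated out of order.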

-

-bzip2recover should be of most use dealing with large .bz2 - files, as these will contain many blocks. It is clearly - futile to use it on damaged single-block files, since a - damaged block cannot be recovered. If you wish to minimise -any potential data loss through media or transmission errors, -you might consider compressing with a smaller - block size. - -

- - - -

PERFORMANCE NOTES

- -

-The sorting phase of compression gathers together similar strings in the -file. Because of this, files containing very long runs of repeated -symbols, like "aabaabaabaab ..." (repeated several hundred times) may -compress more slowly than normal. Versions 0.9.5 and above fare much -better than previous versions in this respect. The ratio between -worst-case and average-case compression time is in the region of 10:1. -For previous versions, this figure was more like 100:1. You can use the --vvvv option to monitor progress in great detail, if you want. - -

-

-Decompression speed is unaffected by these phenomena. - -

-

-bzip2 usually allocates several megabytes of memory to operate -in, and then charges all over it in a fairly random fashion. This means -that performance, both for compressing and decompressing, is largely -determined by the speed at which your machine can service cache misses. -Because of this, small changes to the code to reduce the miss rate have -been observed to give disproportionately large performance improvements. -I imagine bzip2 will perform best on machines with very large -caches. - -

- - - -

CAVEATS

- -

-I/O error messages are not as helpful as they could be. bzip2 -tries hard to detect I/O errors and exit cleanly, but the details of -what the problem is sometimes seem rather misleading. - -

-

-This manual page pertains to version 1.0 of bzip2. Compressed -data created by this version is entirely forwards and backwards -compatible with the previous public releases, versions 0.1pl2, 0.9.0 and -0.9.5, but with the following exception: 0.9.0 and above can correctly -decompress multiple concatenated compressed files. 0.1pl2 cannot do -this; it will stop after decompressing just the first file in the -stream. - -

-

-bzip2recover uses 32-bit integers to represent bit positions in -compressed files, so it cannot handle compressed files more than 512 -megabytes long. This could easily be fixed. - -
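
The 512 megabyte figure follows from simple arithmetic: a 32-bit unsigned integer can distinguish 2^32 bit positions, and there are 8 bits per byte. A small sketch of the calculation, with a hypothetical function name and assuming `bits` is less than 64:

```c
#include <stdint.h>

/*
 * Largest file size, in bytes, for which every bit position fits in an
 * unsigned integer `bits` wide. Only valid for bits < 64.
 */
uint64_t max_addressable_bytes(unsigned bits)
{
    return (UINT64_C(1) << bits) / 8;
}
```

For `bits` equal to 32 this yields 536870912 bytes, which is 512 x 1024 x 1024; widening the position counter is what would lift the limit.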

- - - -

AUTHOR

-

-Julian Seward, jseward@acm.org. - -

-

-The ideas embodied in bzip2 are due to (at least) the following -people: Michael Burrows and David Wheeler (for the block sorting -transformation), David Wheeler (again, for the Huffman coder), Peter -Fenwick (for the structured coding model in the original bzip, -and many refinements), and Alistair Moffat, Radford Neal and Ian Witten -(for the arithmetic coder in the original bzip). I am much -indebted for their help, support and advice. See the manual in the -source distribution for pointers to sources of documentation. Christian -von Roques encouraged me to look for faster sorting algorithms, so as to -speed up compression. Bela Lubkin encouraged me to improve the -worst-case compression performance. Many people sent patches, helped -with portability problems, lent machines, gave advice and were generally -helpful. - -

-
- -


-

diff -Nru bzip2-1.0.1/manual_3.html bzip2-1.0.1.new/manual_3.html --- bzip2-1.0.1/manual_3.html Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/manual_3.html Thu Jan 1 01:00:00 1970 @@ -1,1773 +0,0 @@ - - - - -bzip2 and libbzip2 - Programming with libbzip2 - - - - - - -



- - -

Programming with libbzip2

- -

-This chapter describes the programming interface to libbzip2. - -

-

-For general background information, particularly about memory -use and performance aspects, you'd be well advised to read Chapter 2 -as well. - -

- - -

Top-level structure

- -

-libbzip2 is a flexible library for compressing and decompressing -data in the bzip2 data format. Although packaged as a single -entity, it helps to regard the library as three separate parts: the -low-level interface, the high-level interface, and some utility -functions. -

-

-The structure of libbzip2's interfaces is similar to -that of Jean-loup Gailly's and Mark Adler's excellent zlib -library. - -

-

-All externally visible symbols have names beginning BZ2_. -This is new in version 1.0. The intention is to minimise pollution -of the namespaces of library clients. - -

- - -

Low-level summary

- -

-This interface provides services for compressing and decompressing -data in memory. There's no provision for dealing with files, streams -or any other I/O mechanisms, just straight memory-to-memory work. -In fact, this part of the library can be compiled without inclusion -of stdio.h, which may be helpful for embedded applications. - -

-

-The low-level part of the library has no global variables and -is therefore thread-safe. - -

-

-Six routines make up the low-level interface: -BZ2_bzCompressInit, BZ2_bzCompress, and
BZ2_bzCompressEnd -for compression, -and a corresponding trio BZ2_bzDecompressInit,
BZ2_bzDecompress -and BZ2_bzDecompressEnd for decompression. -The *Init functions allocate -memory for compression/decompression and do other -initialisations, whilst the *End functions close down operations -and release memory. - -

-

-The real work is done by BZ2_bzCompress and BZ2_bzDecompress. -These compress and decompress data from a user-supplied input buffer -to a user-supplied output buffer. These buffers can be any size; -arbitrary quantities of data are handled by making repeated calls -to these functions. This is a flexible mechanism allowing a -consumer-pull style of activity, or producer-push, or a mixture of -both. - -

- - - -

High-level summary

- -

-This interface provides some handy wrappers around the low-level -interface to facilitate reading and writing bzip2 format -files (.bz2 files). The routines provide hooks to facilitate -reading files in which the bzip2 data stream is embedded -within some larger-scale file structure, or where there are -multiple bzip2 data streams concatenated end-to-end. - -

-

-For reading files, BZ2_bzReadOpen, BZ2_bzRead, -BZ2_bzReadClose and
BZ2_bzReadGetUnused are supplied. For -writing files, BZ2_bzWriteOpen, BZ2_bzWrite and -BZ2_bzWriteClose are available. -

-

-As with the low-level library, no global variables are used -so the library is per se thread-safe. However, if I/O errors -occur whilst reading or writing the underlying compressed files, -you may have to consult errno to determine the cause of -the error. In that case, you'd need a C library which correctly -supports errno in a multithreaded environment. - -

-

-To make the library a little simpler and more portable, -BZ2_bzReadOpen and BZ2_bzWriteOpen require you to pass them file -handles (FILE*s) which have previously been opened for reading or -writing respectively. That avoids portability problems associated with -file operations and file attributes, whilst not being much of an -imposition on the programmer. - -

- - - -

Utility functions summary

-

-For very simple needs, BZ2_bzBuffToBuffCompress and -BZ2_bzBuffToBuffDecompress are provided. These compress -data in memory from one buffer to another buffer in a single -function call. You should assess whether these functions -fulfill your memory-to-memory compression/decompression -requirements before investing effort in understanding the more -general but more complex low-level interface. - -

-

-Yoshioka Tsuneo (QWF00133@niftyserve.or.jp / -tsuneo-y@is.aist-nara.ac.jp) has contributed some functions to -give better zlib compatibility. These functions are -BZ2_bzopen, BZ2_bzread, BZ2_bzwrite, BZ2_bzflush, -BZ2_bzclose, -BZ2_bzerror and BZ2_bzlibVersion. You may find these functions -more convenient for simple file reading and writing, than those in the -high-level interface. These functions are not (yet) officially part of -the library, and are minimally documented here. If they break, you -get to keep all the pieces. I hope to document them properly when time -permits. - -

-

-Yoshioka also contributed modifications to allow the library to be -built as a Windows DLL. - -

- - - -

Error handling

- -

-The library is designed to recover cleanly in all situations, including -the worst-case situation of decompressing random data. I'm not -100% sure that it can always do this, so you might want to add -a signal handler to catch segmentation violations during decompression -if you are feeling especially paranoid. I would be interested in -hearing more about the robustness of the library to corrupted -compressed data. - -

-

-Version 1.0 is much more robust in this respect than -0.9.0 or 0.9.5. Investigations with Checker (a tool for -detecting problems with memory management, similar to Purify) -indicate that, at least for the few files I tested, all single-bit -errors in the compressed data are caught properly, with no -segmentation faults, no reads of uninitialised data and no -out of range reads or writes. So it's certainly much improved, -although I wouldn't claim it to be totally bombproof. - -

-

-The file bzlib.h contains all definitions needed to use -the library. In particular, you should definitely not include -bzlib_private.h. - -

-

-In bzlib.h, the various return values are defined. The following -list is not intended as an exhaustive description of the circumstances -in which a given value may be returned -- those descriptions are given -later. Rather, it is intended to convey the rough meaning of each -return value. The first five values are normal and not intended to -denote an error situation. -

- -
BZ_OK -
-The requested action was completed successfully. -
BZ_RUN_OK -
-
BZ_FLUSH_OK -
-
BZ_FINISH_OK -
-In BZ2_bzCompress, the requested flush/finish/nothing-special action -was completed successfully. -
BZ_STREAM_END -
-Compression of data was completed, or the logical stream end was -detected during decompression. -
- -

-The following return values indicate an error of some kind. -

- -
BZ_CONFIG_ERROR -
-Indicates that the library has been improperly compiled on your -platform -- a major configuration error. Specifically, it means -that sizeof(char), sizeof(short) and sizeof(int) -are not 1, 2 and 4 respectively, as they should be. Note that the -library should still work properly on 64-bit platforms which follow -the LP64 programming model -- that is, where sizeof(long) -and sizeof(void*) are 8. Under LP64, sizeof(int) is -still 4, so libbzip2, which doesn't use the long type, -is OK. -
BZ_SEQUENCE_ERROR -
-When using the library, it is important to call the functions in the -correct sequence and with data structures (buffers etc) in the correct -states. libbzip2 checks as much as it can to ensure this is -happening, and returns BZ_SEQUENCE_ERROR if not. Code which -complies precisely with the function semantics, as detailed below, -should never receive this value; such an event denotes buggy code -which you should investigate. -
BZ_PARAM_ERROR -
-Returned when a parameter to a function call is out of range -or otherwise manifestly incorrect. As with BZ_SEQUENCE_ERROR, -this denotes a bug in the client code. The distinction between -BZ_PARAM_ERROR and BZ_SEQUENCE_ERROR is a bit hazy, but still worth -making. -
BZ_MEM_ERROR -
-Returned when a request to allocate memory failed. Note that the -quantity of memory needed to decompress a stream cannot be determined -until the stream's header has been read. So BZ2_bzDecompress and -BZ2_bzRead may return BZ_MEM_ERROR even though some of -the compressed data has been read. The same is not true for -compression; once BZ2_bzCompressInit or BZ2_bzWriteOpen have -successfully completed, BZ_MEM_ERROR cannot occur. -
BZ_DATA_ERROR -
-Returned when a data integrity error is detected during decompression. -Most importantly, this means when stored and computed CRCs for the -data do not match. This value is also returned upon detection of any -other anomaly in the compressed data. -
BZ_DATA_ERROR_MAGIC -
-As a special case of BZ_DATA_ERROR, it is sometimes useful to -know when the compressed stream does not start with the correct -magic bytes ('B' 'Z' 'h'). -
BZ_IO_ERROR -
-Returned by BZ2_bzRead and BZ2_bzWrite when there is an error -reading or writing in the compressed file, and by BZ2_bzReadOpen -and BZ2_bzWriteOpen for attempts to use a file for which the -error indicator (viz, ferror(f)) is set. -On receipt of BZ_IO_ERROR, the caller should consult -errno and/or perror to acquire operating-system -specific information about the problem. -
BZ_UNEXPECTED_EOF -
-Returned by BZ2_bzRead when the compressed file finishes -before the logical end of stream is detected. -
BZ_OUTBUFF_FULL -
-Returned by BZ2_bzBuffToBuffCompress and -BZ2_bzBuffToBuffDecompress to indicate that the output data -will not fit into the output buffer provided. -
- - - -
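The BZ_CONFIG_ERROR condition described in the list above can be sketched in a few lines of C. This is only an illustration of the size check as the text states it, not the library's own code; the function name type_sizes_ok is made up for this example.

```c
/* Sketch of the type-size condition behind BZ_CONFIG_ERROR.
   type_sizes_ok is a hypothetical name, not a libbzip2 function. */
static int type_sizes_ok(void)
{
    /* The library expects char, short and int to be 1, 2 and 4 bytes.
       long and pointers may be 8 bytes (LP64) without any problem. */
    return sizeof(char) == 1 && sizeof(short) == 2 && sizeof(int) == 4;
}
```

On the usual 32-bit and LP64 platforms this returns 1. A platform with a 2-byte int would return 0, which is the situation in which the library reports BZ_CONFIG_ERROR.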

Low-level interface

- - - -

BZ2_bzCompressInit

- -
-typedef 
-   struct {
-      char *next_in;
-      unsigned int avail_in;
-      unsigned int total_in_lo32;
-      unsigned int total_in_hi32;
-
-      char *next_out;
-      unsigned int avail_out;
-      unsigned int total_out_lo32;
-      unsigned int total_out_hi32;
-
-      void *state;
-
-      void *(*bzalloc)(void *,int,int);
-      void (*bzfree)(void *,void *);
-      void *opaque;
-   } 
-   bz_stream;
-
-int BZ2_bzCompressInit ( bz_stream *strm, 
-                         int blockSize100k, 
-                         int verbosity,
-                         int workFactor );
-
-
- -

-Prepares for compression. The bz_stream structure -holds all data pertaining to the compression activity. -A bz_stream structure should be allocated and initialised -prior to the call. -The fields of bz_stream -comprise the entirety of the user-visible data. state -is a pointer to the private data structures required for compression. - -

-

-Custom memory allocators are supported, via fields bzalloc, -bzfree, -and opaque. The value -opaque is passed as the first argument to -all calls to bzalloc and bzfree, but is -otherwise ignored by the library. -The call bzalloc ( opaque, n, m ) is expected to return a -pointer p to -n * m bytes of memory, and bzfree ( opaque, p ) -should free -that memory. - -

-

-If you don't want to use a custom memory allocator, set bzalloc, -bzfree and -opaque to NULL, -and the library will then use the standard malloc/free -routines. - -
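As a rough illustration of the allocator contract, here is a pair of functions with the same signatures as the bzalloc and bzfree fields, which count allocations through the opaque pointer. The names alloc_stats, counting_alloc and counting_free are invented for this sketch and are not part of the library.

```c
#include <stdlib.h>

/* Hypothetical bookkeeping record, handed to the library via `opaque`. */
typedef struct {
    int  live_blocks;   /* allocations not yet freed */
    long total_bytes;   /* bytes requested so far */
} alloc_stats;

/* Same shape as the bzalloc field: void *(*)(void *, int, int).
   Returns a pointer to n * m bytes, or NULL on failure. */
static void *counting_alloc(void *opaque, int n, int m)
{
    alloc_stats *st = (alloc_stats *)opaque;
    void *p = malloc((size_t)n * (size_t)m);
    if (p != NULL && st != NULL) {
        st->live_blocks++;
        st->total_bytes += (long)n * (long)m;
    }
    return p;
}

/* Same shape as the bzfree field: void (*)(void *, void *). */
static void counting_free(void *opaque, void *p)
{
    alloc_stats *st = (alloc_stats *)opaque;
    if (p != NULL) {
        if (st != NULL) st->live_blocks--;
        free(p);
    }
}
```

To use something like this, you would set bzalloc to counting_alloc, bzfree to counting_free and opaque to the address of an alloc_stats record before calling BZ2_bzCompressInit.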

-

-Before calling BZ2_bzCompressInit, fields bzalloc, -bzfree and opaque should -be filled appropriately, as just described. Upon return, the internal -state will have been allocated and initialised, and total_in_lo32, -total_in_hi32, total_out_lo32 and -total_out_hi32 will have been set to zero. -These four fields are used by the library -to inform the caller of the total amount of data passed into and out of -the library, respectively. You should not try to change them. -As of version 1.0, 64-bit counts are maintained, even on 32-bit -platforms, using the _hi32 fields to store the upper 32 bits -of the count. So, for example, the total amount of data in -is (total_in_hi32 << 32) + total_in_lo32. - -
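The expression for the 64-bit total is best read as arithmetic rather than as literal C: shifting a 32-bit unsigned int left by 32 is undefined behaviour. A minimal sketch of a safe version follows, assuming a compiler which provides stdint.h; the helper name bz_total64 is hypothetical.

```c
#include <stdint.h>

/* Combine the two 32-bit halves of a byte count into one 64-bit value.
   The widening cast comes before the shift, so the shift is done on a
   64-bit quantity. bz_total64 is an illustrative name only. */
static uint64_t bz_total64(unsigned int hi32, unsigned int lo32)
{
    return ((uint64_t)hi32 << 32) | (uint64_t)lo32;
}
```

For example, bz_total64 ( strm.total_in_hi32, strm.total_in_lo32 ) gives the total number of input bytes seen so far.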

-

-Parameter blockSize100k specifies the block size to be used for -compression. It should be a value between 1 and 9 inclusive, and the -actual block size used is 100000 x this figure. 9 gives the best -compression but takes most memory. - -

-

-Parameter verbosity should be set to a number between 0 and 4 -inclusive. 0 is silent, and greater numbers give increasingly verbose -monitoring/debugging output. If the library has been compiled with --DBZ_NO_STDIO, no such output will appear for any verbosity -setting. - -

-

-Parameter workFactor controls how the compression phase behaves -when presented with worst case, highly repetitive, input data. If -compression runs into difficulties caused by repetitive data, the -library switches from the standard sorting algorithm to a fallback -algorithm. The fallback is slower than the standard algorithm by -perhaps a factor of three, but always behaves reasonably, no matter how -bad the input. - -

-

-Lower values of workFactor reduce the amount of effort the -standard algorithm will expend before resorting to the fallback. You -should set this parameter carefully; too low, and many inputs will be -handled by the fallback algorithm and so compress rather slowly, too -high, and your average-to-worst case compression times can become very -large. The default value of 30 gives reasonable behaviour over a wide -range of circumstances. - -

-

-Allowable values range from 0 to 250 inclusive. 0 is a special case, -equivalent to using the default value of 30. - -

-

-Note that the compressed output generated is the same regardless of -whether or not the fallback algorithm is used. - -

-

-Be aware also that this parameter may disappear entirely in future -versions of the library. In principle it should be possible to devise a -good way to automatically choose which algorithm to use. Such a -mechanism would render the parameter obsolete. - -

-

-Possible return values: - -

-      BZ_CONFIG_ERROR
-         if the library has been mis-compiled
-      BZ_PARAM_ERROR 
-         if strm is NULL 
-         or blockSize100k < 1 or blockSize100k > 9
-         or verbosity < 0 or verbosity > 4
-         or workFactor < 0 or workFactor > 250
-      BZ_MEM_ERROR 
-         if not enough memory is available
-      BZ_OK 
-         otherwise
-
- -
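The parameter ranges listed above can be restated as a small check. This is a sketch of the documented rules only, not the library's implementation; PARAMS_OK, PARAMS_BAD, check_compress_params and block_size_bytes are names invented here.

```c
/* Hypothetical result codes; the real library reports BZ_OK or
   BZ_PARAM_ERROR from bzlib.h. */
enum { PARAMS_OK = 0, PARAMS_BAD = -1 };

/* Checks the documented ranges. A workFactor of 0 is replaced by the
   documented default of 30. */
static int check_compress_params(int blockSize100k, int verbosity,
                                 int *workFactor)
{
    if (workFactor == 0) return PARAMS_BAD;              /* null pointer */
    if (blockSize100k < 1 || blockSize100k > 9) return PARAMS_BAD;
    if (verbosity < 0 || verbosity > 4) return PARAMS_BAD;
    if (*workFactor < 0 || *workFactor > 250) return PARAMS_BAD;
    if (*workFactor == 0) *workFactor = 30;              /* special case */
    return PARAMS_OK;
}

/* Actual block size in bytes: 100000 x blockSize100k. */
static long block_size_bytes(int blockSize100k)
{
    return 100000L * blockSize100k;
}
```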

-Allowable next actions: - -

-      BZ2_bzCompress 
-         if BZ_OK is returned
-      no specific action needed in case of error
-
- - - -

BZ2_bzCompress

- -
-   int BZ2_bzCompress ( bz_stream *strm, int action );
-
- -

-Provides more input and/or output buffer space for the library. The -caller maintains input and output buffers, and calls BZ2_bzCompress to -transfer data between them. - -

-

-Before each call to BZ2_bzCompress, next_in should point at -the data to be compressed, and avail_in should indicate how many -bytes the library may read. BZ2_bzCompress updates next_in, -avail_in and total_in to reflect the number of bytes it -has read. - -

-

-Similarly, next_out should point to a buffer in which the -compressed data is to be placed, with avail_out indicating how -much output space is available. BZ2_bzCompress updates -next_out, avail_out and total_out to reflect the -number of bytes output. - -

-

-You may provide and remove as little or as much data as you like on each -call of BZ2_bzCompress. In the limit, it is acceptable to supply and -remove data one byte at a time, although this would be terribly -inefficient. You should always ensure that at least one byte of output -space is available at each call. - -

-

-A second purpose of BZ2_bzCompress is to request a change of mode of the -compressed stream. - -

-

-Conceptually, a compressed stream can be in one of four states: IDLE, -RUNNING, FLUSHING and FINISHING. Before initialisation -(BZ2_bzCompressInit) and after termination (BZ2_bzCompressEnd), a -stream is regarded as IDLE. - -

-

-Upon initialisation (BZ2_bzCompressInit), the stream is placed in the -RUNNING state. Subsequent calls to BZ2_bzCompress should pass -BZ_RUN as the requested action; other actions are illegal and -will result in BZ_SEQUENCE_ERROR. - -

-

-At some point, the calling program will have provided all the input data -it wants to. It will then want to finish up -- in effect, asking the -library to process any data it might have buffered internally. In this -state, BZ2_bzCompress will no longer attempt to read data from -next_in, but it will want to write data to next_out. -Because the output buffer supplied by the user can be arbitrarily small, -the finishing-up operation cannot necessarily be done with a single call -of BZ2_bzCompress. - -

-

-Instead, the calling program passes BZ_FINISH as an action to -BZ2_bzCompress. This changes the stream's state to FINISHING. Any -remaining input (ie, next_in[0 .. avail_in-1]) is compressed and -transferred to the output buffer. To do this, BZ2_bzCompress must be -called repeatedly until all the output has been consumed. At that -point, BZ2_bzCompress returns BZ_STREAM_END, and the stream's -state is set back to IDLE. BZ2_bzCompressEnd should then be -called. - -

-

-Just to make sure the calling program does not cheat, the library makes -a note of avail_in at the time of the first call to -BZ2_bzCompress which has BZ_FINISH as an action (ie, at the -time the program has announced its intention to not supply any more -input). By comparing this value with that of avail_in over -subsequent calls to BZ2_bzCompress, the library can detect any -attempts to slip in more data to compress. Any calls for which this is -detected will return BZ_SEQUENCE_ERROR. This indicates a -programming mistake which should be corrected. - -

-

-Instead of asking to finish, the calling program may ask -BZ2_bzCompress to take all the remaining input, compress it and -terminate the current (Burrows-Wheeler) compression block. This could -be useful for error control purposes. The mechanism is analogous to -that for finishing: call BZ2_bzCompress with an action of -BZ_FLUSH, remove output data, and persist with the -BZ_FLUSH action until the value BZ_RUN_OK is returned. As -with finishing, BZ2_bzCompress detects any attempt to provide more -input data once the flush has begun. - -

-

-Once the flush is complete, the stream returns to the normal RUNNING -state. - -

-

-This all sounds pretty complex, but isn't really. Here's a table -which shows which actions are allowable in each state, what action -will be taken, what the next state is, and what the non-error return -values are. Note that you can't explicitly ask what state the -stream is in, but nor do you need to -- it can be inferred from the -values returned by BZ2_bzCompress. - -

-IDLE/any           
-      Illegal.  IDLE state only exists after BZ2_bzCompressEnd or
-      before BZ2_bzCompressInit.
-      Return value = BZ_SEQUENCE_ERROR
-
-RUNNING/BZ_RUN     
-      Compress from next_in to next_out as much as possible.
-      Next state = RUNNING
-      Return value = BZ_RUN_OK
-
-RUNNING/BZ_FLUSH   
-      Remember current value of avail_in.  Compress from next_in
-      to next_out as much as possible, but do not accept any more input.  
-      Next state = FLUSHING
-      Return value = BZ_FLUSH_OK
-
-RUNNING/BZ_FINISH  
-      Remember current value of avail_in.  Compress from next_in
-      to next_out as much as possible, but do not accept any more input.
-      Next state = FINISHING
-      Return value = BZ_FINISH_OK
-
-FLUSHING/BZ_FLUSH  
-      Compress from next_in to next_out as much as possible, 
-      but do not accept any more input.  
-      If all the existing input has been used up and all compressed
-      output has been removed
-         Next state = RUNNING; Return value = BZ_RUN_OK
-      else
-         Next state = FLUSHING; Return value = BZ_FLUSH_OK
-
-FLUSHING/other     
-      Illegal.
-      Return value = BZ_SEQUENCE_ERROR
-
-FINISHING/BZ_FINISH  
-      Compress from next_in to next_out as much as possible,
-      but do not accept any more input.
-      If all the existing input has been used up and all compressed
-      output has been removed
-         Next state = IDLE; Return value = BZ_STREAM_END
-      else
-         Next state = FINISHING; Return value = BZ_FINISH_OK
-
-FINISHING/other
-      Illegal.
-      Return value = BZ_SEQUENCE_ERROR
-
- -
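For readers who prefer code to tables, the transitions can be modelled as a single function. This is a toy model of the state table only: it does no compression, and the enum names and the drained flag (meaning "all existing input has been used up and all compressed output has been removed") are assumptions made for the sketch.

```c
/* Toy model of the compression state table. All names are hypothetical. */
typedef enum { ST_IDLE, ST_RUNNING, ST_FLUSHING, ST_FINISHING } bz_state;
typedef enum { ACT_RUN, ACT_FLUSH, ACT_FINISH } bz_action;
typedef enum { RET_SEQUENCE_ERROR, RET_RUN_OK, RET_FLUSH_OK,
               RET_FINISH_OK, RET_STREAM_END } bz_ret;

/* Applies one action to *st and returns the modelled return value. */
static bz_ret step(bz_state *st, bz_action act, int drained)
{
    switch (*st) {
    case ST_RUNNING:
        if (act == ACT_RUN) return RET_RUN_OK;
        if (act == ACT_FLUSH) { *st = ST_FLUSHING; return RET_FLUSH_OK; }
        *st = ST_FINISHING;
        return RET_FINISH_OK;
    case ST_FLUSHING:
        if (act != ACT_FLUSH) return RET_SEQUENCE_ERROR;
        if (drained) { *st = ST_RUNNING; return RET_RUN_OK; }
        return RET_FLUSH_OK;
    case ST_FINISHING:
        if (act != ACT_FINISH) return RET_SEQUENCE_ERROR;
        if (drained) { *st = ST_IDLE; return RET_STREAM_END; }
        return RET_FINISH_OK;
    case ST_IDLE:
    default:
        return RET_SEQUENCE_ERROR;   /* any action is illegal when IDLE */
    }
}
```

Note how the caller can infer the state from the return value alone: a flush is over when RET_RUN_OK comes back, and a finish is over when RET_STREAM_END comes back.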

-That still looks complicated? Well, fair enough. The usual sequence -of calls for compressing a load of data is: - -

    -
  • Get started with BZ2_bzCompressInit. - -
  • Shovel data in and shlurp out its compressed form using zero or more - -calls of BZ2_bzCompress with action = BZ_RUN. -
  • Finish up. - -Repeatedly call BZ2_bzCompress with action = BZ_FINISH, -copying out the compressed output, until BZ_STREAM_END is returned. -
  • Close up and go home. Call BZ2_bzCompressEnd. - -
- -

-If the data you want to compress fits into your input buffer all -at once, you can skip the calls of BZ2_bzCompress ( ..., BZ_RUN ) and -just do the BZ2_bzCompress ( ..., BZ_FINISH ) calls. - -

-

-All required memory is allocated by BZ2_bzCompressInit. The -compression library can accept any data at all (obviously). So you -shouldn't get any error return values from the BZ2_bzCompress calls. -If you do, they will be BZ_SEQUENCE_ERROR, and indicate a bug in -your programming. - -

-

-Trivial other possible return values: - -

-      BZ_PARAM_ERROR   
-         if strm is NULL, or strm->state is NULL
-
- - - -

BZ2_bzCompressEnd

- -
-int BZ2_bzCompressEnd ( bz_stream *strm );
-
- -

-Releases all memory associated with a compression stream. - -

-

-Possible return values: - -

-   BZ_PARAM_ERROR    if strm is NULL or strm->state is NULL
-   BZ_OK    otherwise
-
- - - -

BZ2_bzDecompressInit

- -
-int BZ2_bzDecompressInit ( bz_stream *strm, int verbosity, int small );
-
- -

-Prepares for decompression. As with BZ2_bzCompressInit, a -bz_stream record should be allocated and initialised before the -call. Fields bzalloc, bzfree and opaque should be -set if a custom memory allocator is required, or made NULL for -the normal malloc/free routines. Upon return, the internal -state will have been initialised, and total_in and -total_out will be zero. - -

-

-For the meaning of parameter verbosity, see BZ2_bzCompressInit. - -

-

-If small is nonzero, the library will use an alternative -decompression algorithm which uses less memory but at the cost of -decompressing more slowly (roughly speaking, half the speed, but the -maximum memory requirement drops to around 2300k). See Chapter 2 for -more information on memory management. - -

-

-Note that the amount of memory needed to decompress -a stream cannot be determined until the stream's header has been read, -so even if BZ2_bzDecompressInit succeeds, a subsequent -BZ2_bzDecompress could fail with BZ_MEM_ERROR. - -

-

-Possible return values: - -

-      BZ_CONFIG_ERROR
-         if the library has been mis-compiled
-      BZ_PARAM_ERROR
-         if (small != 0 && small != 1)
-         or (verbosity < 0 || verbosity > 4)
-      BZ_MEM_ERROR
-         if insufficient memory is available
-      BZ_OK
-         otherwise
-
- -

-Allowable next actions: - -

-      BZ2_bzDecompress
-         if BZ_OK was returned
-      no specific action required in case of error
-
- -

- - -

- - -

BZ2_bzDecompress

- -
-int BZ2_bzDecompress ( bz_stream *strm );
-
- -

-Provides more input and/or output buffer space for the library. The -caller maintains input and output buffers, and uses BZ2_bzDecompress -to transfer data between them. - -

-

-Before each call to BZ2_bzDecompress, next_in -should point at the compressed data, -and avail_in should indicate how many bytes the library -may read. BZ2_bzDecompress updates next_in, avail_in -and total_in -to reflect the number of bytes it has read. - -

-

-Similarly, next_out should point to a buffer in which the uncompressed -output is to be placed, with avail_out indicating how much output space -is available. BZ2_bzDecompress updates next_out, -avail_out and total_out to reflect -the number of bytes output. - -

-

-You may provide and remove as little or as much data as you like on -each call of BZ2_bzDecompress. -In the limit, it is acceptable to -supply and remove data one byte at a time, although this would be -terribly inefficient. You should always ensure that at least one -byte of output space is available at each call. - -

-

-Use of BZ2_bzDecompress is simpler than BZ2_bzCompress. - -

-

-You should provide input and remove output as described above, and -repeatedly call BZ2_bzDecompress until BZ_STREAM_END is -returned. Appearance of BZ_STREAM_END denotes that -BZ2_bzDecompress has detected the logical end of the compressed -stream. BZ2_bzDecompress will not produce BZ_STREAM_END until -all output data has been placed into the output buffer, so once -BZ_STREAM_END appears, you are guaranteed to have available all -the decompressed output, and BZ2_bzDecompressEnd can safely be -called. - -

-

-In case of an error return value, you should call BZ2_bzDecompressEnd -to clean up and release memory. - -

-

-Possible return values: - -

-      BZ_PARAM_ERROR
-         if strm is NULL or strm->state is NULL
-         or strm->avail_out < 1
-      BZ_DATA_ERROR
-         if a data integrity error is detected in the compressed stream
-      BZ_DATA_ERROR_MAGIC
-         if the compressed stream doesn't begin with the right magic bytes
-      BZ_MEM_ERROR
-         if there wasn't enough memory available
-      BZ_STREAM_END
-         if the logical end of the data stream was detected and all
-         output has been consumed, eg strm->avail_out > 0
-      BZ_OK
-         otherwise
-
- -

-Allowable next actions: - -

-      BZ2_bzDecompress
-         if BZ_OK was returned
-      BZ2_bzDecompressEnd
-         otherwise
-
- - - -

BZ2_bzDecompressEnd

- -
-int BZ2_bzDecompressEnd ( bz_stream *strm );
-
- -

-Releases all memory associated with a decompression stream. - -

-

-Possible return values: - -

-      BZ_PARAM_ERROR
-         if strm is NULL or strm->state is NULL
-      BZ_OK
-         otherwise
-
- -

-Allowable next actions: - -

-      None.
-
- - - -

High-level interface

- -

-This interface provides functions for reading and writing -bzip2 format files. First, some general points. - -

- -
    -
  • All of the functions take an int* first argument, - - bzerror. - After each call, bzerror should be consulted first to determine - the outcome of the call. If bzerror is BZ_OK, - the call completed - successfully, and only then should the return value of the function - (if any) be consulted. If bzerror is BZ_IO_ERROR, - there was an error - reading/writing the underlying compressed file, and you should - then consult errno/perror to determine the - cause of the difficulty. - bzerror may also be set to various other values; precise details are - given on a per-function basis below. -
  • If bzerror indicates an error - - (ie, anything except BZ_OK and BZ_STREAM_END), - you should immediately call BZ2_bzReadClose (or BZ2_bzWriteClose, - depending on whether you are attempting to read or to write) - to free up all resources associated - with the stream. Once an error has been indicated, behaviour of all calls - except BZ2_bzReadClose (BZ2_bzWriteClose) is undefined. - The implication is that (1) bzerror should - be checked after each call, and (2) if bzerror indicates an error, - BZ2_bzReadClose (BZ2_bzWriteClose) should then be called to clean up. -
  • The FILE* arguments passed to - - BZ2_bzReadOpen/BZ2_bzWriteOpen - should be set to binary mode. - Most Unix systems will do this by default, but other platforms, - including Windows and Mac, will not. If you omit this, you may - encounter problems when moving code to new platforms. -
  • Memory allocation requests are handled by - - malloc/free. - At present - there is no facility for user-defined memory allocators in the file I/O - functions (could easily be added, though). -
- - - -

BZ2_bzReadOpen

- -
-   typedef void BZFILE;
-
-   BZFILE *BZ2_bzReadOpen ( int *bzerror, FILE *f, 
-                            int verbosity, int small,
-                            void *unused, int nUnused );
-
- -

-Prepare to read compressed data from file handle f. f -should refer to a file which has been opened for reading, and for which -the error indicator (ferror(f)) is not set. If small is 1, -the library will try to decompress using less memory, at the expense of -speed. - -

-

-For reasons explained below, BZ2_bzRead will decompress the -nUnused bytes starting at unused, before starting to read -from the file f. At most BZ_MAX_UNUSED bytes may be -supplied like this. If this facility is not required, you should pass -NULL and 0 for unused and nUnused -respectively. - -

-

-For the meaning of parameters small and verbosity, -see BZ2_bzDecompressInit. - -

-

-The amount of memory needed to decompress a file cannot be determined -until the file's header has been read. So it is possible that -BZ2_bzReadOpen returns BZ_OK but a subsequent call of -BZ2_bzRead will return BZ_MEM_ERROR. - -

-

-Possible assignments to bzerror: - -

-      BZ_CONFIG_ERROR
-         if the library has been mis-compiled
-      BZ_PARAM_ERROR
-         if f is NULL 
-         or small is neither 0 nor 1                 
-         or (unused == NULL && nUnused != 0)
-         or (unused != NULL && !(0 <= nUnused <= BZ_MAX_UNUSED))
-      BZ_IO_ERROR    
-         if ferror(f) is nonzero
-      BZ_MEM_ERROR   
-         if insufficient memory is available
-      BZ_OK
-         otherwise.
-
- -

-Possible return values: - -

-      Pointer to an abstract BZFILE        
-         if bzerror is BZ_OK   
-      NULL
-         otherwise
-
- -

-Allowable next actions: - -

-      BZ2_bzRead
-         if bzerror is BZ_OK   
-      BZ2_bzReadClose
-         otherwise
-
- - - -

BZ2_bzRead

- -
-   int BZ2_bzRead ( int *bzerror, BZFILE *b, void *buf, int len );
-
- -

-Reads up to len (uncompressed) bytes from the compressed file -b into -the buffer buf. If the read was successful, -bzerror is set to BZ_OK -and the number of bytes read is returned. If the logical end-of-stream -was detected, bzerror will be set to BZ_STREAM_END, -and the number -of bytes read is returned. All other bzerror values denote an error. - -

-

-BZ2_bzRead will supply len bytes, -unless the logical stream end is detected -or an error occurs. Because of this, it is possible to detect the -stream end by observing when the number of bytes returned is -less than the number -requested. Nevertheless, this is regarded as inadvisable; you should -instead check bzerror after every call and watch out for -BZ_STREAM_END. - -

-

-Internally, BZ2_bzRead copies data from the compressed file in chunks -of size BZ_MAX_UNUSED bytes -before decompressing it. If the file contains more bytes than strictly -needed to reach the logical end-of-stream, BZ2_bzRead will almost certainly -read some of the trailing data before signalling BZ_STREAM_END. -To collect the read but unused data once BZ_STREAM_END has -appeared, call BZ2_bzReadGetUnused immediately before BZ2_bzReadClose. - -

-

-Possible assignments to bzerror: - -

-      BZ_PARAM_ERROR
-         if b is NULL or buf is NULL or len < 0
-      BZ_SEQUENCE_ERROR 
-         if b was opened with BZ2_bzWriteOpen
-      BZ_IO_ERROR 
-         if there is an error reading from the compressed file
-      BZ_UNEXPECTED_EOF 
-         if the compressed file ended before the logical end-of-stream was detected
-      BZ_DATA_ERROR 
-         if a data integrity error was detected in the compressed stream
-      BZ_DATA_ERROR_MAGIC
-         if the stream does not begin with the requisite header bytes (ie, is not 
-         a bzip2 data file).  This is really a special case of BZ_DATA_ERROR.
-      BZ_MEM_ERROR 
-         if insufficient memory was available
-      BZ_STREAM_END 
-         if the logical end of stream was detected.
-      BZ_OK
-         otherwise.
-
- -

-Possible return values: - -

-      number of bytes read
-         if bzerror is BZ_OK or BZ_STREAM_END
-      undefined
-         otherwise
-
- -

-Allowable next actions: - -

-      collect data from buf, then BZ2_bzRead or BZ2_bzReadClose
-         if bzerror is BZ_OK 
-      collect data from buf, then BZ2_bzReadClose or BZ2_bzReadGetUnused 
-         if bzerror is BZ_STREAM_END
-      BZ2_bzReadClose 
-         otherwise
-
- - - -

BZ2_bzReadGetUnused

- -
-   void BZ2_bzReadGetUnused ( int* bzerror, BZFILE *b, 
-                              void** unused, int* nUnused );
-
- -

-Returns data which was read from the compressed file but was not needed -to get to the logical end-of-stream. *unused is set to the address -of the data, and *nUnused to the number of bytes. *nUnused will -be set to a value between 0 and BZ_MAX_UNUSED inclusive. - -

-

-This function may only be called once BZ2_bzRead has signalled -BZ_STREAM_END but before BZ2_bzReadClose. - -

-

-Possible assignments to bzerror: - -

-      BZ_PARAM_ERROR 
-         if b is NULL 
-         or unused is NULL or nUnused is NULL
-      BZ_SEQUENCE_ERROR 
-         if BZ_STREAM_END has not been signalled
-         or if b was opened with BZ2_bzWriteOpen
-      BZ_OK
-         otherwise
-
- -

-Allowable next actions: - -

-      BZ2_bzReadClose
-
- - - -

BZ2_bzReadClose

- -
-   void BZ2_bzReadClose ( int *bzerror, BZFILE *b );
-
- -

-Releases all memory pertaining to the compressed file b. -BZ2_bzReadClose does not call fclose on the underlying file -handle, so you should do that yourself if appropriate. -BZ2_bzReadClose should be called to clean up after all error -situations. - -

-

-Possible assignments to bzerror: - -

-      BZ_SEQUENCE_ERROR 
-         if b was opened with BZ2_bzWriteOpen
-      BZ_OK 
-         otherwise
-
- -

-Allowable next actions: - -

-      none
-
- - - -

BZ2_bzWriteOpen

- -
-   BZFILE *BZ2_bzWriteOpen ( int *bzerror, FILE *f, 
-                             int blockSize100k, int verbosity,
-                             int workFactor );
-
- -

-Prepare to write compressed data to file handle f. -f should refer to -a file which has been opened for writing, and for which the error -indicator (ferror(f)) is not set. - -

-

-For the meaning of parameters blockSize100k, -verbosity and workFactor, see -
BZ2_bzCompressInit. - -

-

-All required memory is allocated at this stage, so if the call -completes successfully, BZ_MEM_ERROR cannot be signalled by a -subsequent call to BZ2_bzWrite. - -

-

-Possible assignments to bzerror: - -

-      BZ_CONFIG_ERROR
-         if the library has been mis-compiled
-      BZ_PARAM_ERROR 
-         if f is NULL 
-         or blockSize100k < 1 or blockSize100k > 9
-      BZ_IO_ERROR 
-         if ferror(f) is nonzero
-      BZ_MEM_ERROR 
-         if insufficient memory is available
-      BZ_OK 
-         otherwise
-
- -

-Possible return values: - -

-      Pointer to an abstract BZFILE  
-         if bzerror is BZ_OK   
-      NULL 
-         otherwise
-
- -

-Allowable next actions: - -

-      BZ2_bzWrite 
-         if bzerror is BZ_OK 
-         (you could go directly to BZ2_bzWriteClose, but this would be pretty pointless)
-      BZ2_bzWriteClose 
-         otherwise
-
- - - -

BZ2_bzWrite

- -
-   void BZ2_bzWrite ( int *bzerror, BZFILE *b, void *buf, int len );
-
- -

-Absorbs len bytes from the buffer buf, eventually to be -compressed and written to the file. - -

-

-Possible assignments to bzerror: - -

-      BZ_PARAM_ERROR 
-         if b is NULL or buf is NULL or len < 0
-      BZ_SEQUENCE_ERROR 
-         if b was opened with BZ2_bzReadOpen
-      BZ_IO_ERROR 
-         if there is an error writing the compressed file.
-      BZ_OK 
-         otherwise
-
- - - -

BZ2_bzWriteClose

- -
-   void BZ2_bzWriteClose ( int *bzerror, BZFILE* b,
-                           int abandon,
-                           unsigned int* nbytes_in,
-                           unsigned int* nbytes_out );
-
-   void BZ2_bzWriteClose64 ( int *bzerror, BZFILE* b,
-                             int abandon,
-                             unsigned int* nbytes_in_lo32,
-                             unsigned int* nbytes_in_hi32,
-                             unsigned int* nbytes_out_lo32,
-                             unsigned int* nbytes_out_hi32 );
-
- -

-Compresses and flushes to the compressed file all data so far supplied -by BZ2_bzWrite. The logical end-of-stream markers are also written, so -subsequent calls to BZ2_bzWrite are illegal. All memory associated -with the compressed file b is released. -fflush is called on the -compressed file, but it is not fclose'd. - -

-

-If BZ2_bzWriteClose is called to clean up after an error, the only -action is to release the memory. The library records the error codes -issued by previous calls, so this situation will be detected -automatically. There is no attempt to complete the compression -operation, nor to fflush the compressed file. You can force this -behaviour to happen even in the case of no error, by passing a nonzero -value to abandon. - -

-

-If nbytes_in is non-null, *nbytes_in will be set to be the -total volume of uncompressed data handled. Similarly, nbytes_out -will be set to the total volume of compressed data written. For -compatibility with older versions of the library, BZ2_bzWriteClose -only yields the lower 32 bits of these counts. Use -BZ2_bzWriteClose64 if you want the full 64 bit counts. These -two functions are otherwise absolutely identical. - -
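The two 32-bit halves reported by BZ2_bzWriteClose64 can be joined back into a single 64-bit count. The following is only a sketch, assuming a C99 compiler that provides stdint.h; the helper name combine_counts is hypothetical and is not part of libbzip2.

```c
#include <stdint.h>

/* Hypothetical helper (not part of libbzip2): joins the low and high
   32-bit halves reported by BZ2_bzWriteClose64 into one 64-bit count. */
uint64_t combine_counts(unsigned int lo32, unsigned int hi32)
{
    /* Mask each half to 32 bits in case unsigned int is wider. */
    uint64_t lo = (uint64_t)lo32 & 0xFFFFFFFFu;
    uint64_t hi = (uint64_t)hi32 & 0xFFFFFFFFu;
    return (hi << 32) | lo;
}
```

After BZ2_bzWriteClose64 returns, combine_counts(nbytes_in_lo32, nbytes_in_hi32) would give the total volume of uncompressed data, and the same call on the two nbytes_out halves the compressed volume.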

- -

-Possible assignments to bzerror: - -

-      BZ_SEQUENCE_ERROR 
-         if b was opened with BZ2_bzReadOpen
-      BZ_IO_ERROR 
-         if there is an error writing the compressed file
-      BZ_OK 
-         otherwise
-
- - - -

Handling embedded compressed data streams

- -

-The high-level library facilitates use of -bzip2 data streams which form some part of a surrounding, larger -data stream. - -

    -
  • For writing, the library takes an open file handle, writes - -compressed data to it, fflushes it but does not fclose it. -The calling application can write its own data before and after the -compressed data stream, using that same file handle. -
  • Reading is more complex, and the facilities are not as general - -as they could be since generality is hard to reconcile with efficiency. -BZ2_bzRead reads from the compressed file in blocks of size -BZ_MAX_UNUSED bytes, and in doing so probably will overshoot -the logical end of compressed stream. -To recover this data once decompression has -ended, call BZ2_bzReadGetUnused after the last call of BZ2_bzRead -(the one returning BZ_STREAM_END) but before calling -BZ2_bzReadClose. -
- -

-This mechanism makes it easy to decompress multiple bzip2 -streams placed end-to-end. At the end of one stream, when BZ2_bzRead -returns BZ_STREAM_END, call BZ2_bzReadGetUnused to collect the -unused data (copy it into your own buffer somewhere). -That data forms the start of the next compressed stream. -To start uncompressing that next stream, call BZ2_bzReadOpen again, -feeding in the unused data via the unused/nUnused -parameters. -Keep doing this until the BZ_STREAM_END return coincides with the -physical end of file (feof(f)). In this situation -BZ2_bzReadGetUnused -will of course return no data. - -

-

-This should give some feel for how the high-level interface can be used. -If you require extra flexibility, you'll have to bite the bullet and get -to grips with the low-level interface. - -

- - -

Standard file-reading/writing code

-

-Here's how you'd write data to a compressed file: - -

-FILE*   f;
-BZFILE* b;
-int     nBuf;
-char    buf[ /* whatever size you like */ ];
-int     bzerror;
-
-f = fopen ( "myfile.bz2", "wb" );
-if (!f) {
-   /* handle error */
-}
-/* blockSize100k = 9, verbosity = 0, workFactor = 0 (library default) */
-b = BZ2_bzWriteOpen ( &bzerror, f, 9, 0, 0 );
-if (bzerror != BZ_OK) {
-   BZ2_bzWriteClose ( &bzerror, b, 1, NULL, NULL );
-   /* handle error */
-}
-
-while ( /* condition */ ) {
-   /* get data to write into buf, and set nBuf appropriately */
-   BZ2_bzWrite ( &bzerror, b, buf, nBuf );
-   if (bzerror == BZ_IO_ERROR) { 
-      BZ2_bzWriteClose ( &bzerror, b, 1, NULL, NULL );
-      /* handle error */
-   }
-}
-
-/* abandon = 0: finish the stream; the byte counts are not wanted here */
-BZ2_bzWriteClose ( &bzerror, b, 0, NULL, NULL );
-if (bzerror == BZ_IO_ERROR) {
-   /* handle error */
-}
-
- -

-And to read from a compressed file: - -

-FILE*   f;
-BZFILE* b;
-int     nBuf;
-char    buf[ /* whatever size you like */ ];
-int     bzerror;
-
-f = fopen ( "myfile.bz2", "rb" );
-if (!f) {
-   /* handle error */
-}
-/* verbosity and small both 0; no unused data from an earlier stream */
-b = BZ2_bzReadOpen ( &bzerror, f, 0, 0, NULL, 0 );
-if (bzerror != BZ_OK) {
-   BZ2_bzReadClose ( &bzerror, b );
-   /* handle error */
-}
-
-bzerror = BZ_OK;
-while (bzerror == BZ_OK && /* arbitrary other conditions */) {
-   nBuf = BZ2_bzRead ( &bzerror, b, buf, /* size of buf */ );
-   /* the final read returns BZ_STREAM_END and may still carry data */
-   if (bzerror == BZ_OK || bzerror == BZ_STREAM_END) {
-      /* do something with buf[0 .. nBuf-1] */
-   }
-}
-if (bzerror != BZ_STREAM_END) {
-   BZ2_bzReadClose ( &bzerror, b );
-   /* handle error */
-} else {
-   BZ2_bzReadClose ( &bzerror, b );
-}
-
- - - -

Utility functions

- - -

BZ2_bzBuffToBuffCompress

- -
-   int BZ2_bzBuffToBuffCompress( char*         dest,
-                                 unsigned int* destLen,
-                                 char*         source,
-                                 unsigned int  sourceLen,
-                                 int           blockSize100k,
-                                 int           verbosity,
-                                 int           workFactor );
-
- -

-Attempts to compress the data in source[0 .. sourceLen-1] -into the destination buffer, dest[0 .. *destLen-1]. -If the destination buffer is big enough, *destLen is -set to the size of the compressed data, and BZ_OK is -returned. If the compressed data won't fit, *destLen -is unchanged, and BZ_OUTBUFF_FULL is returned. - -

-

-Compression in this manner is a one-shot event, done with a single call -to this function. The resulting compressed data is a complete -bzip2 format data stream. There is no mechanism for making -additional calls to provide extra input data. If you want that kind of -mechanism, use the low-level interface. - -

-

-For the meaning of parameters blockSize100k, verbosity -and workFactor,
see BZ2_bzCompressInit. - -

-

-To guarantee that the compressed data will fit in its buffer, allocate -an output buffer of size 1% larger than the uncompressed data, plus -six hundred extra bytes. - -
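The sizing rule above is simple arithmetic, and can be sketched as a small helper. The names dest_bound and dest_bound_fits are hypothetical, not part of libbzip2; the 1% is rounded up so that the bound is never an underestimate.

```c
#include <limits.h>

/* Hypothetical helper (not part of libbzip2): output buffer size per
   the rule above, i.e. the input size plus 1% (rounded up) plus 600
   bytes.  Computed in unsigned long long so that it does not wrap
   when unsigned int is 32 bits wide. */
unsigned long long dest_bound(unsigned int sourceLen)
{
    unsigned long long n = sourceLen;
    return n + (n + 99) / 100 + 600;
}

/* Nonzero if the bound can be passed through an unsigned int destLen. */
int dest_bound_fits(unsigned int sourceLen)
{
    return dest_bound(sourceLen) <= UINT_MAX;
}
```

A caller would allocate dest_bound(sourceLen) bytes and store that value in *destLen before calling BZ2_bzBuffToBuffCompress, after checking dest_bound_fits, since destLen is only an unsigned int.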

-

-BZ2_bzBuffToBuffCompress will not write data at or -beyond dest[*destLen], even in case of buffer overflow. - -

-

-Possible return values: - -

-      BZ_CONFIG_ERROR
-         if the library has been mis-compiled
-      BZ_PARAM_ERROR 
-         if dest is NULL or destLen is NULL
-         or blockSize100k < 1 or blockSize100k > 9
-         or verbosity < 0 or verbosity > 4 
-         or workFactor < 0 or workFactor > 250
-      BZ_MEM_ERROR
-         if insufficient memory is available 
-      BZ_OUTBUFF_FULL
-         if the size of the compressed data exceeds *destLen
-      BZ_OK 
-         otherwise
-
- - - -

BZ2_bzBuffToBuffDecompress

- -
-   int BZ2_bzBuffToBuffDecompress ( char*         dest,
-                                    unsigned int* destLen,
-                                    char*         source,
-                                    unsigned int  sourceLen,
-                                    int           small,
-                                    int           verbosity );
-
- -

-Attempts to decompress the data in source[0 .. sourceLen-1] -into the destination buffer, dest[0 .. *destLen-1]. -If the destination buffer is big enough, *destLen is -set to the size of the uncompressed data, and BZ_OK is -returned. If the uncompressed data won't fit, *destLen -is unchanged, and BZ_OUTBUFF_FULL is returned. - -

-

-source is assumed to hold a complete bzip2 format -data stream.
BZ2_bzBuffToBuffDecompress tries to decompress -the entirety of the stream into the output buffer. - -

-

-For the meaning of parameters small and verbosity, -see BZ2_bzDecompressInit. - -

-

-Because the compression ratio of the compressed data cannot be known in -advance, there is no easy way to guarantee that the output buffer will -be big enough. You may of course make arrangements in your code to -record the size of the uncompressed data, but such a mechanism is beyond -the scope of this library. - -

-

-BZ2_bzBuffToBuffDecompress will not write data at or -beyond dest[*destLen], even in case of buffer overflow. - -

-

-Possible return values: - -

-      BZ_CONFIG_ERROR
-         if the library has been mis-compiled
-      BZ_PARAM_ERROR 
-         if dest is NULL or destLen is NULL
-         or small != 0 && small != 1
-         or verbosity < 0 or verbosity > 4 
-      BZ_MEM_ERROR
-         if insufficient memory is available 
-      BZ_OUTBUFF_FULL
-         if the size of the uncompressed data exceeds *destLen
-      BZ_DATA_ERROR
-         if a data integrity error was detected in the compressed data
-      BZ_DATA_ERROR_MAGIC
-         if the compressed data doesn't begin with the right magic bytes
-      BZ_UNEXPECTED_EOF
-         if the compressed data ends unexpectedly
-      BZ_OK 
-         otherwise
-
- - - -

zlib compatibility functions

-

-Yoshioka Tsuneo has contributed some functions to -give better zlib compatibility. These functions are -BZ2_bzopen, BZ2_bzread, BZ2_bzwrite, BZ2_bzflush, -BZ2_bzclose, -BZ2_bzerror and BZ2_bzlibVersion. -These functions are not (yet) officially part of -the library. If they break, you get to keep all the pieces. -Nevertheless, I think they work ok. - -

-typedef void BZFILE;
-
-const char * BZ2_bzlibVersion ( void );
-
- -

-Returns a string indicating the library version. - -

-BZFILE * BZ2_bzopen  ( const char *path, const char *mode );
-BZFILE * BZ2_bzdopen ( int        fd,    const char *mode );
-
- -

-Opens a .bz2 file for reading or writing, using either its name -or a pre-existing file descriptor. -Analogous to fopen and fdopen. - -

-int BZ2_bzread  ( BZFILE* b, void* buf, int len );
-int BZ2_bzwrite ( BZFILE* b, void* buf, int len );
-
- -

-Reads/writes data from/to a previously opened BZFILE. -Analogous to fread and fwrite. - -

-int  BZ2_bzflush ( BZFILE* b );
-void BZ2_bzclose ( BZFILE* b );
-
- -

-Flushes/closes a BZFILE. BZ2_bzflush doesn't actually do -anything. Analogous to fflush and fclose. - -

- -
-const char * BZ2_bzerror ( BZFILE *b, int *errnum );
-
- -

-Returns a string describing the most recent error status of -b, and also sets *errnum to its numerical value. - -

- - - -

Using the library in a stdio-free environment

- - - -

Getting rid of stdio

- -

-In a deeply embedded application, you might want to use just -the memory-to-memory functions. You can do this conveniently -by compiling the library with preprocessor symbol BZ_NO_STDIO -defined. Doing this gives you a library containing only the following -eight functions: - -

-

-BZ2_bzCompressInit, BZ2_bzCompress, BZ2_bzCompressEnd
-BZ2_bzDecompressInit, BZ2_bzDecompress, BZ2_bzDecompressEnd
-BZ2_bzBuffToBuffCompress, BZ2_bzBuffToBuffDecompress - -

-

-When compiled like this, all functions will ignore verbosity -settings. - -

- - -

Critical error handling

-

-libbzip2 contains a number of internal assertion checks which -should, needless to say, never be activated. Nevertheless, if an -assertion should fail, behaviour depends on whether or not the library -was compiled with BZ_NO_STDIO set. - -

-

-For a normal compile, an assertion failure yields the message - -

-   bzip2/libbzip2: internal error number N.
-   This is a bug in bzip2/libbzip2, 1.0 of 21-Mar-2000.
-   Please report it to me at: jseward@acm.org.  If this happened
-   when you were using some program which uses libbzip2 as a
-   component, you should also report this bug to the author(s)
-   of that program.  Please make an effort to report this bug;
-   timely and accurate bug reports eventually lead to higher
-   quality software.  Thanks.  Julian Seward, 21 March 2000.
-
- -

-where N is some error code number. exit(3) -is then called. - -

-

-For a stdio-free library, assertion failures result -in a call to a function declared as: - -

-   extern void bz_internal_error ( int errcode );
-
- -

-The relevant code is passed as a parameter. You should supply -such a function. - -

-

-In either case, once an assertion failure has occurred, any -bz_stream records involved can be regarded as invalid. -You should not attempt to resume normal operation with them. - -
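For a stdio-free build, one possible shape for bz_internal_error is sketched below. This is an illustration under assumptions, not the library's own handler: it records the code and uses longjmp to abandon the failed operation, on the understanding that the caller then discards every bz_stream involved. The names run_guarded, demo_failing_op and demo_ok_op are hypothetical.

```c
#include <setjmp.h>

static jmp_buf panic_env;            /* where to resume after a panic   */
static volatile int panic_code = 0;  /* last code reported by the library */

/* Called by a BZ_NO_STDIO build of libbzip2 when an assertion fails. */
void bz_internal_error(int errcode)
{
    panic_code = errcode;
    longjmp(panic_env, 1);
}

/* Hypothetical wrapper: runs op(arg) and returns 0 if it completes
   normally, or the internal error code if bz_internal_error fired.
   Any bz_stream used by op must be treated as invalid afterwards. */
int run_guarded(void (*op)(int), int arg)
{
    panic_code = 0;
    if (setjmp(panic_env) != 0) {
        return panic_code;
    }
    op(arg);
    return 0;
}

/* Stand-ins for library work, for demonstration only. */
void demo_failing_op(int code) { bz_internal_error(code); }
void demo_ok_op(int code)      { (void)code; }
```

On a target with no sensible way to continue, the body of bz_internal_error could instead log the code and halt; the library only requires that a function with this signature exists.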

-

-You may, of course, change critical error handling to suit -your needs. As I said above, critical errors indicate bugs -in the library and should not occur. All "normal" error -situations are indicated via error return codes from functions, -and can be recovered from. - -

- - - -

Making a Windows DLL

-

-Everything related to Windows has been contributed by Yoshioka Tsuneo -
(QWF00133@niftyserve.or.jp / -tsuneo-y@is.aist-nara.ac.jp), so you should send your queries to -him (but perhaps Cc: me, jseward@acm.org). - -

-

-My vague understanding of what to do is: using Visual C++ 5.0, -open the project file libbz2.dsp, and build. That's all. - -

-

-If you can't -open the project file for some reason, make a new one, naming these files: -blocksort.c, bzlib.c, compress.c, -crctable.c, decompress.c, huffman.c,
-randtable.c and libbz2.def. You will also need -to name the header files bzlib.h and bzlib_private.h. - -

-

-If you don't use VC++, you may need to define the preprocessor symbol -_WIN32. - -

-

-Finally, dlltest.c is a sample program using the DLL. It has a -project file, dlltest.dsp. - -

-

-If you just want a makefile for Visual C, have a look at -makefile.msc. - -

-

-Be aware that if you compile bzip2 itself on Win32, you must set -BZ_UNIX to 0 and BZ_LCCWIN32 to 1, in the file -bzip2.c, before compiling. Otherwise the resulting binary won't -work correctly. - -

-

-I haven't tried any of this stuff myself, but it all looks plausible. - -

- -


-

- - diff -Nru bzip2-1.0.1/manual_4.html bzip2-1.0.1.new/manual_4.html --- bzip2-1.0.1/manual_4.html Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/manual_4.html Thu Jan 1 01:00:00 1970 @@ -1,528 +0,0 @@ - - - - -bzip2 and libbzip2 - Miscellanea - - - - - -



- - -

Miscellanea

- -

-These are just some random thoughts of mine. Your mileage may -vary. - -

- - -

Limitations of the compressed file format

-

-bzip2-1.0, 0.9.5 and 0.9.0 -use exactly the same file format as the previous -version, bzip2-0.1. This decision was made in the interests of -stability. Creating yet another incompatible compressed file format -would create further confusion and disruption for users. - -

-

-Nevertheless, this is not a painless decision. Development -work since the release of bzip2-0.1 in August 1997 -has shown complexities in the file format which slow down -decompression and, in retrospect, are unnecessary. These are: - -

    -
  • The run-length encoder, which is the first of the - - compression transformations, is entirely irrelevant. - The original purpose was to protect the sorting algorithm - from the very worst case input: a string of repeated - symbols. But algorithm steps Q6a and Q6b in the original - Burrows-Wheeler technical report (SRC-124) show how - repeats can be handled without difficulty in block - sorting. -
  • The randomisation mechanism doesn't really need to be - - there. Udi Manber and Gene Myers published a suffix - array construction algorithm a few years back, which - can be employed to sort any block, no matter how - repetitive, in O(N log N) time. Subsequent work by - Kunihiko Sadakane has produced a derivative O(N (log N)^2) - algorithm which usually outperforms the Manber-Myers - algorithm. - - I could have changed to Sadakane's algorithm, but I find - it to be slower than bzip2's existing algorithm for - most inputs, and the randomisation mechanism protects - adequately against bad cases. I didn't think it was - a good tradeoff to make. Partly this is due to the fact - that I was not flooded with email complaints about - bzip2-0.1's performance on repetitive data, so - perhaps it isn't a problem for real inputs. - - Probably the best long-term solution, - and the one I have incorporated into 0.9.5 and above, - is to use the existing sorting - algorithm initially, and fall back to a O(N (log N)^2) - algorithm if the standard algorithm gets into difficulties. -
  • The compressed file format was never designed to be - - handled by a library, and I have had to jump through - some hoops to produce an efficient implementation of - decompression. It's a bit hairy. Try passing - decompress.c through the C preprocessor - and you'll see what I mean. Much of this complexity - could have been avoided if the compressed size of - each block of data was recorded in the data stream. -
  • An Adler-32 checksum, rather than a CRC32 checksum, - - would be faster to compute. -
- -
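To give a feel for why Adler-32 is cheap, here is a textbook version of it (the checksum used by zlib). It is not code from bzip2, which uses CRC32; the function name adler32_sketch is hypothetical. Each input byte costs two additions and two reductions, with no table lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Textbook Adler-32: two running sums, each kept modulo 65521.
   Illustration only; bzip2 itself uses CRC32. */
uint32_t adler32_sketch(const unsigned char *data, size_t len)
{
    const uint32_t MOD = 65521u;  /* largest prime below 2^16 */
    uint32_t a = 1, b = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        a = (a + data[i]) % MOD;  /* sum of the bytes, plus one */
        b = (b + a) % MOD;        /* sum of the successive values of a */
    }
    return (b << 16) | a;
}
```

A production version would normally defer the modulo reductions over several thousand bytes rather than applying them per byte, which is where most of the speed comes from.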

-It would be fair to say that the bzip2 format was frozen -before I properly and fully understood the performance -consequences of doing so. - -

-

-Improvements which I was able to incorporate into -0.9.0, despite using the same file format, are: - -

    -
  • Single array implementation of the inverse BWT. This - - significantly speeds up decompression, presumably - because it reduces the number of cache misses. -
  • Faster inverse MTF transform for large MTF values. The - - new implementation is based on the notion of sliding blocks - of values. -
  • bzip2-0.9.0 now reads and writes files with fread - - and fwrite; version 0.1 used putc and getc. - Duh! Well, you live and learn. - -
- -
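For reference, a plain textbook inverse move-to-front transform looks like the sketch below. This is not the sliding-block implementation mentioned above, only the straightforward version it improves on; the name mtf_decode_sketch is hypothetical. Note that the cost of each step grows with the index being decoded, which is why large MTF values are the case worth speeding up.

```c
#include <stddef.h>
#include <string.h>

/* Textbook inverse move-to-front over the byte alphabet.  Reads n
   indices from in[] and writes the n decoded bytes to out[]. */
void mtf_decode_sketch(const unsigned char *in, unsigned char *out, size_t n)
{
    unsigned char list[256];
    size_t i;
    int k;

    for (k = 0; k < 256; k++) list[k] = (unsigned char)k;

    for (i = 0; i < n; i++) {
        unsigned char idx = in[i];
        unsigned char sym = list[idx];
        /* Shift list[0 .. idx-1] up by one place, then put sym in front. */
        memmove(list + 1, list, idx);
        list[0] = sym;
        out[i] = sym;
    }
}
```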

-Further ahead, it would be nice -to be able to do random access into files. This will -require some careful design of compressed file formats. - -

- - - -

Portability issues

-

-After some consideration, I have decided not to use -GNU autoconf to configure 0.9.5 or 1.0. - -

-

-autoconf, admirable and wonderful though it is, -mainly assists with portability problems between Unix-like -platforms. But bzip2 doesn't have much in the way -of portability problems on Unix; most of the difficulties appear -when porting to the Mac, or to Microsoft's operating systems. -autoconf doesn't help in those cases, and brings in a -whole load of new complexity. - -

-

-Most people should be able to compile the library and program -under Unix straight out-of-the-box, so to speak, especially -if you have a version of GNU C available. - -

-

-There are a couple of __inline__ directives in the code. GNU C -(gcc) should be able to handle them. If you're not using -GNU C, your C compiler shouldn't see them at all. -If your compiler does, for some reason, see them and doesn't -like them, just #define __inline__ to be /* */. One -easy way to do this is to compile with the flag -D__inline__=, -which should be understood by most Unix compilers. - -

-

-If you still have difficulties, try compiling with the macro -BZ_STRICT_ANSI defined. This should enable you to build the -library in a strictly ANSI compliant environment. Building the program -itself like this is dangerous and not supported, since you remove -bzip2's checks against compressing directories, symbolic links, -devices, and other not-really-a-file entities. This could cause -filesystem corruption! - -

-

-One other thing: if you create a bzip2 binary for public -distribution, please try and link it statically (gcc -static). This -avoids all sorts of library-version issues that others may encounter -later on. - -

-

-If you build bzip2 on Win32, you must set BZ_UNIX to 0 and -BZ_LCCWIN32 to 1, in the file bzip2.c, before compiling. -Otherwise the resulting binary won't work correctly. - -

- - - -

Reporting bugs

-

-I tried pretty hard to make sure bzip2 is -bug free, both by design and by testing. Hopefully -you'll never need to read this section for real. - -

-

-Nevertheless, if bzip2 dies with a segmentation -fault, a bus error or an internal assertion failure, it -will ask you to email me a bug report. Experience with -version 0.1 shows that almost all these problems can -be traced to either compiler bugs or hardware problems. - -

    -
  • - -Recompile the program with no optimisation, and see if it -works. And/or try a different compiler. -I heard all sorts of stories about various flavours -of GNU C (and other compilers) generating bad code for -bzip2, and I've run across two such examples myself. - -2.7.X versions of GNU C are known to generate bad code from -time to time, at high optimisation levels. -If you get problems, try using the flags --O2 -fomit-frame-pointer -fno-strength-reduce. -You should specifically not use -funroll-loops. - -You may notice that the Makefile runs six tests as part of -the build process. If the program passes all of these, it's -a pretty good (but not 100%) indication that the compiler has -done its job correctly. -
  • - -If bzip2 crashes randomly, and the crashes are not -repeatable, you may have a flaky memory subsystem. bzip2 -really hammers your memory hierarchy, and if it's a bit marginal, -you may get these problems. Ditto if your disk or I/O subsystem -is slowly failing. Yup, this really does happen. - -Try using a different machine of the same type, and see if -you can repeat the problem. -
  • This isn't really a bug, but ... If bzip2 tells - -you your file is corrupted on decompression, and you -obtained the file via FTP, there is a possibility that you -forgot to tell FTP to do a binary mode transfer. That absolutely -will cause the file to be non-decompressible. You'll have to transfer -it again. -
- -

-If you've incorporated libbzip2 into your own program -and are getting problems, please, please, please, check that the -parameters you are passing in calls to the library are -correct and in accordance with what the documentation says -is allowable. I have tried to make the library robust against -such problems, but I'm sure I haven't succeeded. - -

-

-Finally, if the above comments don't help, you'll have to send -me a bug report. Now, it's just amazing how many people will -send me a bug report saying something like - -

-   bzip2 crashed with segmentation fault on my machine
-
- -

-and absolutely nothing else. Needless to say, such a report -is totally, utterly, completely and comprehensively 100% useless; -a waste of your time, my time, and net bandwidth. -With no details at all, there's no way I can possibly begin -to figure out what the problem is. - -

-

-The rules of the game are: facts, facts, facts. Don't omit -them because "oh, they won't be relevant". At the bare -minimum: - -

-   Machine type.  Operating system version.  
-   Exact version of bzip2 (do bzip2 -V).  
-   Exact version of the compiler used.  
-   Flags passed to the compiler.
-
- -

-However, the most important single thing that will help me is -the file that you were trying to compress or decompress at the -time the problem happened. Without that, my ability to do anything -more than speculate about the cause is limited. - -

-

-Please remember that I connect to the Internet with a modem, so -you should contact me before mailing me huge files. - -

- - - -

Did you get the right package?

- -

-bzip2 is a resource hog. It soaks up large amounts of CPU cycles -and memory. Also, it gives very large latencies. In the worst case, you -can feed many megabytes of uncompressed data into the library before -getting any compressed output, so this probably rules out applications -requiring interactive behaviour. - -

-

-These aren't faults of my implementation, I hope, but more -an intrinsic property of the Burrows-Wheeler transform (unfortunately). -Maybe this isn't what you want. - -

-

-If you want a compressor and/or library which is faster, uses less -memory but gets pretty good compression, and has minimal latency, -consider Jean-loup -Gailly's and Mark Adler's work, zlib-1.1.2 and -gzip-1.2.4. Look for them at - -

-

-http://www.cdrom.com/pub/infozip/zlib and -http://www.gzip.org respectively. - -

-

-For something faster and lighter still, you might try Markus F X J -Oberhumer's LZO real-time compression/decompression library, at -
http://wildsau.idv.uni-linz.ac.at/mfx/lzo.html. - -

-

-If you want to use the bzip2 algorithms to compress small blocks -of data, 64k bytes or smaller, for example on an on-the-fly disk -compressor, you'd be well advised not to use this library. Instead, -I've made a special library tuned for that kind of use. It's part of -e2compr-0.40, an on-the-fly disk compressor for the Linux -ext2 filesystem. Look at -http://www.netspace.net.au/~reiter/e2compr. - -

- - - -

Testing

- -

-A record of the tests I've done. - -

-

-First, some data sets: - -

    -
  • B: a directory containing 6001 files, one for every length in the - - range 0 to 6000 bytes. The files contain random lowercase - letters. 18.7 megabytes. -
  • H: my home directory tree. Documents, source code, mail files, - - compressed data. H contains B, and also a directory of - files designed as boundary cases for the sorting; mostly very - repetitive, nasty files. 565 megabytes. -
  • A: directory tree holding various applications built from source: - - egcs, gcc-2.8.1, KDE, GTK, Octave, etc. - 2200 megabytes. -
- -
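Files like those in data set B are easy to regenerate. The sketch below only fills a buffer with pseudo-random lowercase letters; it is not the program I used, and the generator, its constants (those of the sample rand() in the C standard) and the names next_random and fill_lowercase are all assumptions made for illustration.

```c
#include <stddef.h>

/* Small linear congruential generator, modelled on the sample rand()
   given in the C standard.  Returns a value in 0 .. 32767. */
static unsigned int next_random(unsigned long *state)
{
    *state = *state * 1103515245UL + 12345UL;
    return (unsigned int)((*state / 65536UL) % 32768UL);
}

/* Fills buf[0 .. len-1] with pseudo-random lowercase letters. */
void fill_lowercase(char *buf, size_t len, unsigned long *state)
{
    static const char letters[] = "abcdefghijklmnopqrstuvwxyz";
    size_t i;
    for (i = 0; i < len; i++) {
        buf[i] = letters[next_random(state) % 26u];
    }
}
```

Calling fill_lowercase once for each length from 0 to 6000 and writing each buffer to its own file would give a set of the same shape as B.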

-The tests conducted are as follows. Each test means compressing -(a copy of) each file in the data set, decompressing it and -comparing it against the original. - -

-

-First, a bunch of tests with block sizes and internal buffer -sizes set very small, -to detect any problems with the -blocking and buffering mechanisms. -This required modifying the source code so as to try to -break it. - -

    -
  1. Data set H, with - - buffer size of 1 byte, and block size of 23 bytes. -
  2. Data set B, buffer sizes 1 byte, block size 1 byte. - -
  3. As (2) but small-mode decompression. - -
  4. As (2) with block size 2 bytes. - -
  5. As (2) with block size 3 bytes. - -
  6. As (2) with block size 4 bytes. - -
  7. As (2) with block size 5 bytes. - -
  8. As (2) with block size 6 bytes and small-mode decompression. - -
  9. H with buffer size of 1 byte, but normal block - - size (up to 900000 bytes). -
- -

-Then some tests with unmodified source code. - -

    -
  1. H, all settings normal. - -
  2. As (1), with small-mode decompress. - -
  3. H, compress with flag -1. - -
  4. H, compress with flag -s, decompress with flag -s. - -
  5. Forwards compatibility: H, bzip2-0.1pl2 compressing, - - bzip2-0.9.5 decompressing, all settings normal. -
  6. Backwards compatibility: H, bzip2-0.9.5 compressing, - - bzip2-0.1pl2 decompressing, all settings normal. -
  7. Bigger tests: A, all settings normal. - -
  8. As (7), using the fallback (Sadakane-like) sorting algorithm. - -
  9. As (8), compress with flag -1, decompress with flag - - -s. -
  10. H, using the fallback sorting algorithm. - -
  11. Forwards compatibility: A, bzip2-0.1pl2 compressing, - - bzip2-0.9.5 decompressing, all settings normal. -
  12. Backwards compatibility: A, bzip2-0.9.5 compressing, - - bzip2-0.1pl2 decompressing, all settings normal. -
  13. Misc test: about 400 megabytes of .tar files with - - bzip2 compiled with Checker (a memory access error - detector, like Purify). -
  14. Misc tests to make sure it builds and runs ok on non-Linux/x86 - - platforms. -
- -

-These tests were conducted on a 225 MHz IDT WinChip machine, running -Linux 2.0.36. They represent nearly a week of continuous computation. -All tests completed successfully. - -

- - - -

Further reading

-

-bzip2 is not research work, in the sense that it doesn't present -any new ideas. Rather, it's an engineering exercise based on existing -ideas. - -

-

-Four documents describe essentially all the ideas behind bzip2: - -

-Michael Burrows and D. J. Wheeler:
-  "A block-sorting lossless data compression algorithm"
-   10th May 1994. 
-   Digital SRC Research Report 124.
-   ftp://ftp.digital.com/pub/DEC/SRC/research-reports/SRC-124.ps.gz
-   If you have trouble finding it, try searching at the
-   New Zealand Digital Library, http://www.nzdl.org.
-
-Daniel S. Hirschberg and Debra A. Lelewer
-  "Efficient Decoding of Prefix Codes"
-   Communications of the ACM, April 1990, Vol 33, Number 4.
-   You might be able to get an electronic copy of this
-      from the ACM Digital Library.
-
-David J. Wheeler
-   Program bred3.c and accompanying document bred3.ps.
-   This contains the idea behind the multi-table Huffman
-   coding scheme.
-   ftp://ftp.cl.cam.ac.uk/users/djw3/
-
-Jon L. Bentley and Robert Sedgewick
-  "Fast Algorithms for Sorting and Searching Strings"
-   Available from Sedgewick's web page,
-   www.cs.princeton.edu/~rs
-
- -

-The following paper gives valuable additional insights into the -algorithm, but is not immediately the basis of any code -used in bzip2. - -

-Peter Fenwick:
-   Block Sorting Text Compression
-   Proceedings of the 19th Australasian Computer Science Conference,
-     Melbourne, Australia.  Jan 31 - Feb 2, 1996.
-   ftp://ftp.cs.auckland.ac.nz/pub/peter-f/ACSC96paper.ps
-
- -

-Kunihiko Sadakane's sorting algorithm, mentioned above, -is available from: - -

-http://naomi.is.s.u-tokyo.ac.jp/~sada/papers/Sada98b.ps.gz
-
- -

-The Manber-Myers suffix array construction -algorithm is described in a paper -available from: - -

-http://www.cs.arizona.edu/people/gene/PAPERS/suffix.ps
-
- -

-Finally, the following paper documents some recent investigations -I made into the performance of sorting algorithms: - -

-Julian Seward:
-   On the Performance of BWT Sorting Algorithms
-   Proceedings of the IEEE Data Compression Conference 2000
-     Snowbird, Utah.  28-30 March 2000.
-
- -


-

- - diff -Nru bzip2-1.0.1/manual_toc.html bzip2-1.0.1.new/manual_toc.html --- bzip2-1.0.1/manual_toc.html Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/manual_toc.html Thu Jan 1 01:00:00 1970 @@ -1,173 +0,0 @@ - - - - -bzip2 and libbzip2 - Table of Contents - - - -

bzip2 and libbzip2

-

a program and library for data compression

-

copyright (C) 1996-2000 Julian Seward

-

version 1.0 of 21 March 2000

-
Julian Seward
-

-


- -

-This program, bzip2, -and associated library libbzip2, are -Copyright (C) 1996-2000 Julian R Seward. All rights reserved. - -

-

-Redistribution and use in source and binary forms, with or without -modification, are permitted provided that the following conditions -are met: - -

    -
  • - - Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. -
  • - - The origin of this software must not be misrepresented; you must - not claim that you wrote the original software. If you use this - software in a product, an acknowledgment in the product - documentation would be appreciated but is not required. -
  • - - Altered source versions must be plainly marked as such, and must - not be misrepresented as being the original software. -
  • - - The name of the author may not be used to endorse or promote - products derived from this software without specific prior written - permission. -
- -

-THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS -OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE -ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY -DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL -DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE -GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS -INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, -WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING -NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS -SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - -

-

-Julian Seward, Cambridge, UK. - -

-

-jseward@acm.org - -

-

-http://sourceware.cygnus.com/bzip2 - -

-

-http://www.cacheprof.org - -

-

-http://www.muraroa.demon.co.uk - -

-

-bzip2/libbzip2 version 1.0 of 21 March 2000. - -

-

-PATENTS: To the best of my knowledge, bzip2 does not use any patented -algorithms. However, I do not have the resources available to carry out -a full patent search. Therefore I cannot give any guarantee of the -above statement. - -

- - -


-This document was generated on 23 March 2000 using the -texi2html -translator version 1.51a.

- - diff -Nru bzip2-1.0.1/randtable.c bzip2-1.0.1.new/randtable.c --- bzip2-1.0.1/randtable.c Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/randtable.c Sat Jun 24 20:13:06 2000 @@ -58,6 +58,10 @@ For more information on these sources, see the manual. --*/ +#ifdef HAVE_CONFIG_H +#include +#endif + #include "bzlib_private.h" diff -Nru bzip2-1.0.1/spewG.c bzip2-1.0.1.new/spewG.c --- bzip2-1.0.1/spewG.c Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/spewG.c Sat Jun 24 20:13:06 2000 @@ -9,7 +9,10 @@ (but is otherwise harmless). */ -#define _FILE_OFFSET_BITS 64 +#ifdef HAVE_CONFIG_H +#include +#endif + #include #include diff -Nru bzip2-1.0.1/stamp-h.in bzip2-1.0.1.new/stamp-h.in --- bzip2-1.0.1/stamp-h.in Thu Jan 1 01:00:00 1970 +++ bzip2-1.0.1.new/stamp-h.in Sat Jun 24 20:13:06 2000 @@ -0,0 +1 @@ +timestamp diff -Nru bzip2-1.0.1/unzcrash.c bzip2-1.0.1.new/unzcrash.c --- bzip2-1.0.1/unzcrash.c Sat Jun 24 20:13:27 2000 +++ bzip2-1.0.1.new/unzcrash.c Sat Jun 24 20:13:06 2000 @@ -13,6 +13,12 @@ many hours. */ +#ifdef HAVE_CONFIG_H +#include +#endif + + + #include #include #include "bzlib.h"