[fitsbits] Rice compression from the command line



 
 
July 12th 06, 08:23 PM, posted to sci.astro.fits
Rob Seaman

Work on the next release of the NOAO Science Archive has caused me to
revisit an earlier selection of gzip (which itself was the result of
an exercise in "satisficing" the choice of compression). For all the
obvious reasons (improved read/write speed, higher compression
factors, transparent access) we're taking another look at FITS Rice
compression. Not much seems to have changed over the past five years
- except that it seems like the example imcopy program in the cfitsio
distribution is actually being used in production environments. This
program has several functional shortcomings, and it also lacks the
obvious logistical features of a tool such as the Unix gzip command.

I've appended a quickly modified prototype that addresses some of
those issues. (Compile and link as with imcopy.c.) If there are
alternative FITS Rice compression tools already available, I would be
delighted to hear about them. In the meantime, let me describe some
of the issues I see with Rice compression, whether at the level of
the FITS Convention, CFITSIO or imcopy:

http://heasarc.gsfc.nasa.gov/docs/so...mpression.html
http://heasarc.gsfc.nasa.gov/docs/so...o/compression/compress_image.html
http://heasarc.gsfc.nasa.gov/docs/so...io/fitsio.html
http://heasarc.gsfc.nasa.gov/docs/so...mples/imcopy.c

Starting with the imcopy application, there are, as I say, many
missing features. The two most obvious are the ability to compress
"in-place" and the ability to process a list of files. One of the
primary use cases for compression is as a magic wand to wave over a
file or a directory to shrink the disk usage. A compression utility
that instead creates a second file misses the point for many users.
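
The in-place, many-files behaviour described above could be sketched roughly as follows. This is only an illustration in Python, not the appended prototype: `process_in_place`, `process_files` and the `transform` callback are hypothetical names, and the callback stands in for whatever actually performs the Rice compression (for example, a wrapper around imcopy).

```python
import os
import tempfile
from typing import Callable, Iterable, List

# Hypothetical callback: read the file at `src`, write the converted file to `dst`.
Transform = Callable[[str, str], None]

def process_in_place(path: str, transform: Transform) -> None:
    """Replace `path` with its transformed version, gzip-style."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem and the original survives any failure.
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        transform(path, tmp)
        os.replace(tmp, path)  # swap the new file into place
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise

def process_files(paths: Iterable[str], transform: Transform) -> List[str]:
    """Process every file in turn and return the paths that failed."""
    failed = []
    for path in paths:
        try:
            process_in_place(path, transform)
        except Exception:
            failed.append(path)
    return failed
```

The original file is only replaced once the transform has finished without error, so an interrupted run leaves the input untouched. A real transform built on CFITSIO would have to allow for the temporary output file already existing.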

The next two issues appear to me to reflect limitations in the
conceptual design of the CFITSIO interface. 1) A copy operation is
not idempotent. Since the interface is semantically aware of the
meaning, as well as the contents, of headers, a new copy may differ in
various ways from the original. This is a problem for a compression
application that wants to be able to restore a byte-by-byte copy of
the original. 2) Updating an HDU does not necessarily update the
checksums. When the checksums are not updated, the checksum convention
mandates that the CHECKSUM and DATASUM keywords be deleted, but
instead CFITSIO leaves stale keywords (which remain stale even after
restoring the uncompressed HDU, see #1).
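
To make the stale-checksum problem concrete, here is a minimal sketch of how an application might check a DATASUM value itself rather than trusting the keyword. It assumes the usual definition from the checksum convention (a 32-bit ones' complement sum over the data unit read as big-endian 32-bit words, recorded in the header as a decimal string). The function names are illustrative and are not CFITSIO routines.

```python
import struct

def ones_complement_sum32(data: bytes) -> int:
    """32-bit ones' complement sum of `data` read as big-endian words."""
    if len(data) % 4:
        raise ValueError("FITS data units are padded to a multiple of 4 bytes")
    total = sum(word for (word,) in struct.iter_unpack(">I", data))
    # Fold any overflow back into the low 32 bits (end-around carry).
    while total >> 32:
        total = (total & 0xFFFFFFFF) + (total >> 32)
    return total

def datasum_is_stale(datasum_value: str, data: bytes) -> bool:
    """True if the recorded DATASUM string no longer matches the data unit."""
    return int(datasum_value.strip()) != ones_complement_sum32(data)
```

A compression tool could run a check like this before compressing and again after restoring, and delete or rewrite the keywords whenever the answer is "stale".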

(Tests indicate that the output file resulting from compressing and
then uncompressing any input file may itself be idempotent, that is,
a further round trip leaves it unchanged. I don't know if this will
hold up for all cases or for FITS interfaces other than CFITSIO. Such
an action is something like the FITS equivalent of canonicalizing
XML.)
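
The property suggested by those tests can be phrased as a fixed-point check: one round trip may change the file, but a second round trip should change nothing further. A toy sketch follows, in which `round_trip` is a hypothetical stand-in for "compress, then uncompress" and `toy_round_trip` merely imitates a copy that reformats what it reads. Neither touches real FITS files.

```python
from typing import Callable

def is_canonical_after_round_trip(data: bytes,
                                  round_trip: Callable[[bytes], bytes]) -> bool:
    """True if a second round trip reproduces the first one byte for byte."""
    once = round_trip(data)
    twice = round_trip(once)
    return once == twice

def toy_round_trip(data: bytes) -> bytes:
    # Stand-in only: collapses runs of whitespace, loosely imitating a
    # semantically aware copy that rewrites header cards.
    return b" ".join(data.split())
```

With the toy function, `b"A  =   1"` is changed by the first pass but not by the second, which is the behaviour a "canonicalizing" round trip would show.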

Finally, the FITS compression convention is incomplete. It doesn't
actually express a coherent strategy for compressing and/or
uncompressing general FITS objects, but is limited to per-HDU
issues. For example, if an "SIF" file (that is, not an "MEF") is
compressed, an MEF is generated to contain the resulting binary
table. No information is retained to describe the original file
structure, so uncompressing this file later generates an ambiguity
about whether the original was indeed an SIF or rather was an
uncompressed MEF with a single IMAGE extension. A complementary
issue arises with MEF input, if the primary HDU is not dataless.
Does the "extra" extension resulting from compression become the
first output extension or the last? How many extensions does such a
restored file have? N or N+1?

Philosophically, FITS compression is not like gzip or other "opaque"
compression. The output is itself a legal FITS object and interfaces
like CFITSIO or tools like imcopy can invisibly regard a compressed
image array as equivalent to an uncompressed array. This is a great
strength, but it doesn't remove the utility of other compression use
cases. For instance, I would be grateful if somebody could tell me
how to infer the compression status of an HDU using CFITSIO.
Invisibility is nice, but Claude Rains tells us its limits. (One such
limit is that the prototype doesn't currently uncompress, simply
because it can't decide a priori whether the input is compressed to
begin with. The obvious workaround is to have separate "grice" and
"gunrice" commands. This might be desirable in any case for reasons I
won't belabor here.)
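
Independent of what CFITSIO offers, the compression convention itself marks a compressed image: it is stored in a BINTABLE extension whose header carries ZIMAGE = T. The sketch below inspects raw header bytes for that flag. It assumes standard 80-character cards, parses values naively, and uses illustrative function names. No CFITSIO calls are involved.

```python
CARD = 80  # FITS header cards are fixed-width 80-byte records

def header_cards(header: bytes) -> dict:
    """Map keyword -> raw value text for the cards that precede END."""
    cards = {}
    for i in range(0, len(header) - CARD + 1, CARD):
        card = header[i:i + CARD].decode("ascii", errors="replace")
        keyword = card[:8].rstrip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            # Naive comment stripping: assumes no '/' inside a quoted string.
            cards[keyword] = card[10:].split("/", 1)[0].strip()
    return cards

def is_compressed_image(header: bytes) -> bool:
    """True if this extension header describes a tile-compressed image."""
    cards = header_cards(header)
    return (cards.get("XTENSION", "").strip("' ") == "BINTABLE"
            and cards.get("ZIMAGE") == "T")
```

With an answer like this in hand, a single command could decide whether to compress or uncompress, and separate "grice" and "gunrice" commands could refuse input that is already in the requested state.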

Some questions to mull over:

1) Does a better alternative to the CFITSIO imcopy already exist?
(Options don't have to be limited to ANSI C.) How best might we
encourage a wide adoption of a single standard across the
astronomical community? Gzip is ubiquitous, but so is FITS.

2) What features should a general purpose command line FITS
compression tool have? (For instance, should the checksums from the
original file be cached for later comparison to restored HDUs -
whether on disk or in memory?)

3) Should idempotency and correct checksum handling be the
responsibility of CFITSIO, or rather of the application?

4) What logistical procedures and semantic structures need to be
added to the FITS compression convention to support real-world usage?

5) Note that I have not talked about compression algorithms at all.
Has any progress been made on these issues in the last few years that
FITS could benefit from? The compression convention is intended to
support multiple algorithms, of course.

Please take a look at the attached code. Please don't just take it
and use it under battlefield conditions - this appears to be what
happened with the original imcopy program :-) I've traded some email
with Bill Pence about this issue, but would be delighted to hear
additional feedback. If it turns out that further work is warranted
on this prototype, I'll gladly donate the results to be incorporated
into CFITSIO as Bill may deem appropriate. Folks interested in
collaborating are always welcome.

Rob Seaman
NOAO

--------


 



