[fitsbits] Potential new compression method for FITS tables



 
 
December 22nd, 2010, 11:29 AM, posted to sci.astro.fits
Mark Taylor

Bill and others,

On Tue, 21 Dec 2010, William Pence wrote:

Thank you for carefully reading our document (describing a potential new
compression method for FITS binary tables). Here are a few more comments, in
addition to the previous ones from Rob Seaman:


Thank you for considering my comments. I have a couple of follow-ups:

I agree with Mark's observation that this compressed table format is not very
convenient for applications that need random access to the rows and columns of
data. This is no different, however, from the case where the entire FITS file
is compressed with gzip. In both cases, it is usually necessary to
uncompress the table before the application reads or writes data in the table.


Quite true. There is a significant difference in convenience/usability,
however, in that everybody understands what a .fits.gz file is and how
to uncompress it, whereas it will be much less obvious to people what
a tile-compressed table is, and how to make sense of it. If the format
becomes widely used this issue will be ameliorated, but that would probably
take quite some time.

This can be done either by explicitly creating an uncompressed copy of the
FITS file (e.g., by using our fpack/funpack FITS file compression utility
programs) which is then processed by the application program, or by having the
FITS reader create an uncompressed virtual FITS file in memory, which is then
accessed by the application program on the fly. I'm planning to implement
this latter approach in the CFITSIO library, similar to what has already been
done to support the tiled-image compression format. Application programs that
use CFITSIO to access these compressed tables will be able to do so in exactly
the same way as for normal uncompressed tables; CFITSIO will transparently
uncompress the table when necessary, and if the application modifies the
table, then CFITSIO will automatically recompress it when the application is
finished.
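
The two approaches described above (an explicit uncompressed copy on disk versus an uncompressed "virtual file" held in memory) can be sketched roughly as follows. This is only an illustration under loud assumptions: it uses gzip from the Python standard library as a stand-in for the proposed table compression, it is not CFITSIO or fpack/funpack, and every function name and the toy 16-byte row layout are hypothetical.

```python
import gzip
import io
import os
import shutil
import tempfile

RECORD_SIZE = 16  # bytes per fixed-width row in this toy "table" (assumption)


def write_demo_gz(path, n_rows):
    """Write n_rows fixed-width rows to a gzip file (toy stand-in for a table)."""
    with gzip.open(path, "wb") as out:
        for i in range(n_rows):
            out.write(f"row{i:013d}".encode("ascii"))  # exactly 16 bytes


def uncompress_to_disk(gz_path, out_path):
    """Approach 1: create an explicit uncompressed copy on disk."""
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def uncompress_to_memory(gz_path):
    """Approach 2: hold the uncompressed bytes in an in-memory file object."""
    with gzip.open(gz_path, "rb") as src:
        return io.BytesIO(src.read())


def read_row(fileobj, index):
    """Random access to one row; cheap only because the data is uncompressed."""
    fileobj.seek(index * RECORD_SIZE)
    return fileobj.read(RECORD_SIZE)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        gz_path = os.path.join(tmp, "table.dat.gz")
        write_demo_gz(gz_path, 1000)

        plain_path = uncompress_to_disk(gz_path, os.path.join(tmp, "table.dat"))
        with open(plain_path, "rb") as on_disk:
            print(read_row(on_disk, 500))

        in_memory = uncompress_to_memory(gz_path)
        print(read_row(in_memory, 500))
```

In both cases the whole table is decompressed before the first row can be read, which is the up-front cost being discussed; the in-memory variant additionally needs enough RAM to hold the uncompressed table.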


If I have time, and if this format looks like becoming widely used, I'd do
something similar in STIL/TOPCAT. But in the case of large tables,
it would still equate to a significantly longer processing time
than being able to do direct random access on an existing disk file.

My feeling is that, disk space being cheap, for most *user* contexts
the compression levels achievable with tile-compressed FITS will not
represent a good trade-off against the additional inconvenience of
using them. I am happy to admit however that for archives the reverse
may well be true.

Mark also expressed concerns about possible confusion between the compressed
and uncompressed versions of the same table, by humans or by software that is
unaware of this compression convention. It is true that the headers of the
compressed and uncompressed tables look quite similar, because only the
NAXIS2, PCOUNT, and TFORMn keyword values must necessarily differ. All the
other keywords can remain unchanged. I think this is largely a positive,
because readers of the compressed table header (whether human or software) can
quite easily understand the contents of the compressed table. I don't think
there is any danger that unsuspecting software could mistakenly process the
compressed table and produce misleading scientific results, if for no other
reason than because the compressed table will only contain a single row of
data in most cases. Mark suggested inventing a new extension type (instead of
BINTABLE) for these compressed tables, but I don't think we want to encourage
a proliferation of new extension types simply because the contents of the
table are slightly different. In any case, section 3.4.2 of the FITS standard
says that only one extension format shall be approved for each type of data
organization.
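
To make the point about similar headers concrete, here is a small sketch that compares two headers and lists the keywords whose values differ. The headers are plain Python dicts, trimmed to a few keywords, and every value in them is invented for illustration (none is taken from the proposal document); the function name is hypothetical too.

```python
def differing_keywords(header_a, header_b):
    """Return the sorted keywords whose values differ between two headers."""
    keys = sorted(set(header_a) | set(header_b))
    return [key for key in keys if header_a.get(key) != header_b.get(key)]


# Invented example values; only the pattern of what changes follows the text.
uncompressed = {
    "XTENSION": "BINTABLE",
    "NAXIS2": 100000,   # number of rows
    "PCOUNT": 0,        # size of the heap area
    "TFIELDS": 2,
    "TTYPE1": "RA",
    "TFORM1": "1D",
    "TTYPE2": "DEC",
    "TFORM2": "1D",
}

# Same header with only NAXIS2, PCOUNT and the TFORMn values changed.
compressed = dict(uncompressed, NAXIS2=1, PCOUNT=734512,
                  TFORM1="1PB", TFORM2="1PB")

if __name__ == "__main__":
    print(differing_keywords(uncompressed, compressed))
```

A reader skimming the two headers sees the same column names and count in both, which is the source of both the convenience and the possible confusion discussed here.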


I do agree that this is not likely to lead to subtly inaccurate
scientific results. I still think user confusion is quite likely,
but admit that this is a less serious issue.

One possible improvement we could make is to add a few COMMENT keywords to the
header of the compressed table to tell readers that table columns have been
compressed, and include a link to further information about how to interpret
the contents.
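
As a rough sketch of what such annotation could look like, the snippet below formats free text into 80-character COMMENT cards (keyword in columns 1-8, text in columns 9-80). The wording of the comment and the function name are hypothetical, and the documentation link is deliberately left as a placeholder rather than a real URL.

```python
import textwrap

CARD_LENGTH = 80               # every FITS header card is 80 characters
TEXT_WIDTH = CARD_LENGTH - 8   # columns 9-80 are available for comment text


def comment_cards(text):
    """Wrap text into COMMENT cards, each padded to exactly 80 characters."""
    lines = textwrap.wrap(text, width=TEXT_WIDTH)
    return [("COMMENT " + line).ljust(CARD_LENGTH) for line in lines]


if __name__ == "__main__":
    # Hypothetical wording; the link is a placeholder, not a real address.
    note = ("The columns of this table have been compressed. "
            "See (link to documentation) for how to interpret the contents.")
    for card in comment_cards(note):
        print(repr(card))
```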


I think recommending this kind of additional annotation, along with
some discussion in the document of the pros and cons of using this
format in various contexts, would be an appropriate way to address
my concerns.

Best festive wishes,

Mark

--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
+44-117-928-8776 http://www.star.bris.ac.uk/~mbt/

 



