[fitsbits] FITS Bintable proposals
The following comments about the FITS binary table proposals from Arnold
Rots and Preben Grosbol may be of general interest, so I'm reposting them
here to the wider FITSBITS audience. I have updated the draft proposals,
available from http://fits.gsfc.nasa.gov/bintable_proposals.html with the
changes that are discussed here.

-Bill Pence

---------------------------------------------------------------

Arnold Rots wrote:

> OK, I read the binary table proposals and have some comments/questions.
>
> The last sentence of the changes brought about by Proposal 1:
> "The meaning of these bytes is defined in section 8.3.5."
> The standard currently says "...One proposed application is described
> in Appendix B.1."
> Does this change mean that we will now restrict PCOUNT to use by
> Variable Length Arrays only? That's OK, but probably should be said
> explicitly.

This was an oversight on my part; I had not intended to restrict other
possible uses of PCOUNT and the heap storage area. In principle there is
no reason 2 or more conventions could not share the heap space (just as
the memory space managed by 'malloc' can be used for multiple purposes).
Note also that currently it is legal to have a binary table with a
non-zero heap (PCOUNT > 0) but without any 'P' variable length columns in
the table; in that case the meaning of the heap data is currently
undefined.

The proposal draft has been modified so that section 8.3.3.2, in its
entirety, will read:

  "8.3.3.2 Bytes Following Main Table

  The main data table shall be followed by an additional data area
  containing zero or more bytes, as specified by the value of the PCOUNT
  keyword. One use for this data area is described in section 8.3.5.
  This does not preclude other uses for these bytes."

> In the text of 8.3.5:
> Second par.: it says tables can be read by programs not understanding
> VLAs (:-). That's correct, but it may be good to point out that they'd
> better know about applying PCOUNT.
The value of PCOUNT should not affect how programs read the main data
table; it is only necessary to apply PCOUNT when calculating the total
number of bits in the binary table extension, and hence the starting
location of the next HDU. This is already stated in section 5.4.1.2 on
Conforming Extensions. Given the legalistic style of the FITS Standard
(it is not a verbose User's Guide), it is probably better to leave the
current wording as is and not duplicate the same requirement in more
than one section.

> I would prefer to refer to data consistently as plural:
> P 1, par 3, l 3: "...data are not stored..."
> P 2, par 2, l 1-2: "...a table are not stored...records; they are
> stored..."
> There may be more...

These and a few other cases have been corrected, so now "data" is always
used as plural.

> P 2, par 5: Twice it says "NAXIS x NAXIS2"; this should be
> "NAXIS1 x NAXIS2"

This strange 'cut and paste' typo has been corrected.

> One thing is not clear about PCOUNT: is it (gap + heap) or
> (gap + heap + padding)? The latter seems implied by the example in the
> next par.

PCOUNT does not include any padding bytes needed to make the length of
the data unit a multiple of 2880 bytes. I've changed the example to use
a heap size of 3000 (instead of 2880) to make this clear.

> P 2, par 6, l 1: "...5 rows of each 168..."
>                             ^^

This has been reworded.

> I would also like to see a better definition of PCOUNT in the example.

This has been reworded and hopefully is now clearer.

> Re-reading all of this made me realize that, in retrospect, I am
> uncomfortable with the 32 bit signed restriction. Here we start
> worrying about 64 bit integers but we restrict the size of the heap
> to 2 GB through the second half of the P fields.
> But I should not reopen the debate :-)

Some reasons for restricting the array length and offset to signed
integers are:

 - there is no precedent in FITS for using unsigned 32-bit integers
 - use of unsigned integers is problematic in some languages like
   Fortran
 - as far as I'm aware, the current software implementations of the
   heap (e.g. CFITSIO) interpret these values as signed integers

I share your discomfort about this, however, and think that perhaps in
the future we could reverse this decision and redefine these to be
unsigned integers. (This is possible because doing so would not
invalidate any existing FITS files and thus would not violate the "once
FITS, always FITS" rule.) We need more time to evaluate all the
implications before making this decision, so for now I think it is best
to restrict these fields to be signed integers, but leave the door open
for a change in the future. (See also Preben's related comment, below.)

> As to the TDIMn replacement for 8.3.2, I cringe a bit at the thought
> of the opportunities for abuse of combining a TDIM with a P TFORM.
> Would it be helpful (again, without reopening the discussion) to give
> an indication of what usage is envisioned here?

I'm not sure what else needs to be added, since this is the FITS
standard and not a user's guide. There are perhaps 2 main uses of TDIMn
with VLAs:

Case 1. Every row of the VLA column contains the same size/shape image,
except for some rows where the array has zero length (is not present).
This may happen if an image is optional, and not necessarily applicable
to every row of the table. In this case the TDIMn keyword is used
normally to give the dimensions of the image, when it is present. The
TDIMn keyword is ignored for those rows with no image.

Case 2. If the VLA column contains a different size/shape image in each
row, then the TDIMn keyword cannot be used, and instead the TDIMn value
for each row would be given in another column of the table (whose name
is 'TDIMn' where n is the number of the VLA column). This follows the
'Greenbank Convention' for collapsing a constant table column into a
keyword, or expanding a keyword whose value varies from row to row into
a column.

============================================

In a separate message on 14-Dec, Preben Grosbol wrote:

> I found only one other issue, namely the first modification of VLA,
> which explicitly includes P columns in the types affected by TNULLn.
> My two comments are:
>
> 1) Since specifying a zero array length has the same effect, TNULL
>    values in P columns are not needed.

That will work only if the entire array is null; we still need to use
TNULLn to define a null value if only some elements of the variable
length array are null.

> 2) We have signed integers, so in principle we should explicitly
>    define that VLA elements with either length or offset being
>    negative are regarded as undefined values.

The following sentence has been added to the draft proposal: "The
meaning of a negative value for either of these integers is undefined by
this standard." This leaves an opening for future experimentation with
using unsigned integers (large unsigned integers are equivalent to
negative signed integers) in these fields.

--
____________________________________________________________________
Dr. William Pence                       NASA/GSFC Code 662
HEASARC                                 +1-301-286-4599 (voice)
Greenbelt MD 20771                      +1-301-286-1684 (fax)
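
For readers who want to check the arithmetic discussed above, here is a minimal Python sketch. It is only an illustration, not text from the draft proposals. It shows that PCOUNT is added to the main table size before any padding to a 2880-byte boundary, and that a large unsigned 32-bit value has the same bit pattern as a negative signed one. The function names are hypothetical, and the example assumes no gap between the main table and the heap, so that PCOUNT equals the 3000-byte heap size mentioned in the reply; the revised example in the draft may use a different gap.

```python
import struct

FITS_BLOCK = 2880  # FITS data units are padded to a multiple of 2880 bytes


def bintable_data_bytes(naxis1, naxis2, pcount):
    """Unpadded size in bytes of a binary table data unit.

    Binary tables have BITPIX = 8 and GCOUNT = 1, so the size is simply
    the main table (NAXIS1 x NAXIS2) plus the PCOUNT bytes that follow it.
    """
    return naxis1 * naxis2 + pcount


def padded_bytes(nbytes):
    """Round nbytes up to the next multiple of the 2880-byte FITS block."""
    return -(-nbytes // FITS_BLOCK) * FITS_BLOCK


def as_signed32(raw4):
    """Interpret 4 big-endian bytes as a signed 32-bit integer."""
    return struct.unpack(">i", raw4)[0]


# 5 rows of 168 bytes each, followed by a 3000-byte heap.
# Assumption: no gap before the heap, so PCOUNT = 3000.
size = bintable_data_bytes(naxis1=168, naxis2=5, pcount=3000)
print(size)                # 3840: PCOUNT itself contains no padding
print(padded_bytes(size))  # 5760: offset of the next HDU from the start of the data

# The largest unsigned 32-bit value reads back as -1 when treated as signed.
print(as_signed32(b"\xff\xff\xff\xff"))  # -1
```

A reader that does not understand variable length arrays still needs the second number, since that is where the header of the next HDU begins.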