April 1st 08, 04:39 PM, posted to sci.astro.fits
William Pence[_2_]
[fitsbits] Interpretation of repeat count in binary tables

Paddy Leahy wrote:
On Mon, 31 Mar 2008, William Pence wrote:
In the example you cite below, the binary table contains 12288 rows of data,
and each row contains 3 vectors, containing 1024 single precision floating
point numbers each. There is no explicit relationship between the vectors
in different rows of the table.


"No explicit relationship" sounds worrying. Does this mean that different
FITS readers can legitimately interpret the same table in different ways?
Or are there some keywords that clarify the relationship between rows?


The mandatory keywords in this case simply define a FITS table that
consists of a sequence of 1024 element vectors. There is no explicitly
defined relationship between the vectors on different rows. Implicitly,
however, the fact that all these vectors have been grouped together into
a single table suggests that they are probably related in some way.
There could be other mission-specific keywords in the header that
explain the relationship.

[snip]
The TDIMn keyword, if present in this case, would specify the dimensionality
of each individual 1024 element vector in that column. For example, TDIM1 =
'(16,64)' would mean that each vector in column 1 should be interpreted as a
16 x 64 2-D array. If there is no TDIMn keyword, then the vector would be
interpreted as a 1024 element 1-D array.
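
As a rough illustration of the TDIMn interpretation, here is a minimal
Python sketch. The helper names parse_tdim and as_2d are hypothetical and
not part of any FITS library. It assumes the usual FITS ordering, in which
the first axis varies fastest, so '(16,64)' gives 64 runs of 16 elements.

```python
def parse_tdim(value):
    """Parse a TDIMn value such as '(16,64)' into a tuple of ints."""
    inner = value.strip().strip("()")
    return tuple(int(part) for part in inner.split(","))


def as_2d(flat, dims):
    """View a flat vector as a 2-D array, first axis varying fastest.

    Returns a list of dims[1] lists, each holding dims[0] elements, so
    that result[j][i] is flat[i + dims[0] * j].
    """
    n1, n2 = dims
    if len(flat) != n1 * n2:
        raise ValueError("vector length does not match TDIM")
    return [flat[j * n1:(j + 1) * n1] for j in range(n2)]


# Stand-in for one 1024 element vector read from column 1 of one row.
vector = list(range(1024))
dims = parse_tdim("(16,64)")
image = as_2d(vector, dims)
```

Without a TDIMn keyword the same 1024 values would simply be left as the
flat vector.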


So, let me be clear: a FITS reader is asked to return the data from column
2 of a table which has N_row rows and a repeat count of R for column 2.
This should definitely be returned as a 2-D array of (R, N_row) elements,
even if TDIM2 is not present? (Or maybe, given your "no explicit
relationship" comment, as something looser, like a linked list of arrays
of length R?).


It mainly depends on what the application program itself wants to read
from the table. Some applications might read the vectors sequentially
one row at a time in a loop; other applications might ask for all the
vectors at once, as a big array of length = (column vector count) *
(NAXIS2 rows). The application could optionally interpret this as a 2D
array if it desires, but the FITS keywords do not mandate this.
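
The two access patterns can be sketched in plain Python as follows. This is
only an illustration under stated assumptions: the table is a small
synthetic one (repeat count R = 4, 3 rows, 3 columns of big-endian 4-byte
floats, as in a binary table data area), and read_vector and
read_column_flat are hypothetical helpers, not functions of any FITS reader.

```python
import struct

R = 4            # repeat count per column (1024 in the real example)
N_ROWS = 3       # NAXIS2 (12288 in the real example)
N_COLS = 3       # TFIELDS
ROW_BYTES = N_COLS * R * 4   # NAXIS1: bytes per table row


def make_table():
    """Build a synthetic binary-table data area, row after row."""
    rows = []
    for row in range(N_ROWS):
        values = []
        for col in range(N_COLS):
            values.extend(float(100 * col + 10 * row + k) for k in range(R))
        rows.append(struct.pack(">%df" % (N_COLS * R), *values))
    return b"".join(rows)


def read_vector(table, row, col):
    """Read the R element vector stored in one row of one column."""
    offset = row * ROW_BYTES + col * R * 4
    return list(struct.unpack_from(">%df" % R, table, offset))


def read_column_flat(table, col):
    """Read every row of one column as a single R * N_ROWS array."""
    flat = []
    for row in range(N_ROWS):
        flat.extend(read_vector(table, row, col))
    return flat


table = make_table()
# Pattern 1: one vector per row, read in a loop (second column, index 1).
per_row = [read_vector(table, row, 1) for row in range(N_ROWS)]
# Pattern 2: the whole column at once, as one long array.
flat = read_column_flat(table, 1)
```

Both calls return the same numbers; whether they are then treated as
separate vectors, one long vector, or a 2-D array is the application's
choice.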

The point is that, I guess because some FITS readers are highly
inefficient at reading tables with short rows, it has become common
practice, at least in the CMB field, to use a repeat count just as a way of
packing data so that it can be read efficiently. The writer of my
example intended each column to be read as 1 long vector of R*N_row
elements. Is that definitely contrary to the standard?


No, it is not contrary to the standard. Projects are free to invent their
own local conventions for storing the data in whatever structure is most
convenient for them. If, however, the intention is to pack three 1024 x
12288 element 2-D arrays into a binary table, then a more explicit way
to do this would be as follows:

NAXIS1 = 150994944 / width of table (1024*12288*3*4)
NAXIS2 = 1 / table has only 1 row
....
TFIELDS = 3
TFORM1 = '12582912E'
TDIM1 = '(1024,12288)'
TFORM2 = '12582912E'
TDIM2 = '(1024,12288)'
TFORM3 = '12582912E'
TDIM3 = '(1024,12288)'

In this structure, all the elements for each array are contiguous in the
table, which would make it more efficient when reading the whole
2-D array at once.
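
The keyword values above can be checked with a few lines of Python. This is
just the arithmetic, not a FITS writer; the variable names are chosen here
for illustration, and the 4 bytes per element follows from the 'E' (single
precision) data type.

```python
BYTES_PER_E = 4          # size of one single precision ('E') element
DIMS = (1024, 12288)     # shape of each 2-D array, as given in TDIMn
N_COLUMNS = 3            # TFIELDS

# Repeat count of each column: every element of one 2-D array.
repeat = DIMS[0] * DIMS[1]

# Width of the single table row in bytes.
naxis1 = repeat * BYTES_PER_E * N_COLUMNS

cards = {"NAXIS1": naxis1, "NAXIS2": 1, "TFIELDS": N_COLUMNS}
for n in range(1, N_COLUMNS + 1):
    cards["TFORM%d" % n] = "%dE" % repeat
    cards["TDIM%d" % n] = "(%d,%d)" % DIMS

# Byte offset of each column within the row: each array occupies one
# unbroken block, which is what makes reading a whole array efficient.
column_offsets = [n * repeat * BYTES_PER_E for n in range(N_COLUMNS)]

for key, value in cards.items():
    print(key, "=", value)
```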

Bill Pence
--
____________________________________________________________________
Dr. William Pence
NASA/GSFC Code 662 HEASARC +1-301-286-4599 (voice)
Greenbelt MD 20771 +1-301-286-1684 (fax)