August 21st 07, 01:05 PM, posted to sci.astro.fits
LC's NoSpam Newsreading account[_2_]
[fitsbits] Proposed Changes to the FITS Standard

On Sun, 19 Aug 2007, Rob Seaman wrote:

[...]
change. Duplicate keywords, on the other hand, are a frequent
occurrence (thus the interest in eliminating them :-)

[...]
It seems to me that in this imperfect world it would be better if the
major FITS software packages adopted a coherent behavior on
encountering duplicate keywords. A header with duplicate FITS
keywords is not a bug. Currently, it is perfectly legal FITS, if
questionable practice.


Maybe the point is that the nature (usage?) of most keywords (or keyword
types) is indeterminate (or unpredictable by whoever wrote the file?), or
at least oscillates between these two extreme cases:

- a keyword is intended as a named resource to be mainly read by
software, maybe into a variable, and then be acted upon (all the
mandatory and WCS keywords, those defined by specific conventions,
etc.)

- a keyword just records some information associated with a file, which
is intended to be read by a human, but is hardly relevant to any
software (essentially "commentary" keywords).

If all commentary information were written into commentary or
"value-less" keywords (4.4.2.4, 4.1.2.2, 4.1.2.3), a generic reader would
have no problems.

Talking about readers we can think of essentially two types of readers:

- specific readers, which read only the keywords they know of
beforehand. They read them by name. They know beforehand they should
correspond to a variable of a given type (integer, real, string...).
They most likely search for a keyword of a given name (and probably
stop at the first occurrence).

But if they know of, or expect, a duplicated keyword, they may
knowingly act in some predefined way (does anybody know such a beast?)

- all-purpose readers. I can imagine things like reading the entire
header into memory, or generating some data structure scanning the
entire header. I have for instance an IDL procedure which reads a
file into a structure with elements a.kwd1, a.kwd2 ... a.kwdn and
a.data (the data array). Actually my procedure does not read FITS
files, but a format of my own (which can however also be generated
from FITS) ... and it relies on the (sound) idea that keywords have
unique names, because structure element names are built on the fly
from kwd names (so a.bitpix, a.naxis, a.bunit ...).

In such a procedure duplicate kwds are a nuisance and trigger an
error. In fact my procedure skips COMMENTs (it does not enter them
in the structure at all), treats two particular keywords
(HISTORY and another non-FITS one) as repeatable (in which case
it generates structure elements a.h0001, a.h0002 etc.) and fails
with an error in all other cases.
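
The all-purpose reader described above can be illustrated with a minimal sketch. This is Python rather than IDL, and it is not the actual procedure: the function name build_structure, the (keyword, value) input list and the DuplicateKeywordError class are assumptions made for the example. It only mirrors the behaviour described: COMMENTs are skipped, HISTORY becomes h0001, h0002, ..., and any other duplicate is an error.

```python
from types import SimpleNamespace


class DuplicateKeywordError(ValueError):
    """Raised when a non-repeatable keyword appears more than once."""


def build_structure(cards, repeatable=None, skipped=("COMMENT",)):
    """Build an object with one attribute per keyword (hypothetical helper).

    cards      -- list of (keyword, value) pairs, assumed already parsed
    repeatable -- maps a repeatable keyword to the prefix of its numbered
                  attributes; defaults to HISTORY -> h0001, h0002, ...
    skipped    -- keywords that are not entered in the structure at all
    """
    if repeatable is None:
        repeatable = {"HISTORY": "h"}
    fields = {}
    counters = {}
    for keyword, value in cards:
        key = keyword.strip().upper()
        if key in skipped:
            continue
        if key in repeatable:
            # Repeatable keywords get a running four-digit suffix.
            counters[key] = counters.get(key, 0) + 1
            name = f"{repeatable[key]}{counters[key]:04d}"
        else:
            # Attribute names are built on the fly from the keyword name.
            name = key.lower().replace("-", "_")
        if name in fields:
            raise DuplicateKeywordError(f"duplicate keyword: {key}")
        fields[name] = value
    return SimpleNamespace(**fields)


cards = [("BITPIX", 16), ("NAXIS", 2), ("COMMENT", "skipped"),
         ("HISTORY", "first step"), ("HISTORY", "second step")]
a = build_structure(cards)
print(a.bitpix, a.naxis, a.h0001, a.h0002)
```

Passing a list that contains, say, BUNIT twice raises DuplicateKeywordError, which is the "fails with an error" case.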

All this seemed to me reasonably sound practice, and it inspired the
idea to forbid duplication of (named, non-commentary, non-valueless)
keywords in FITS 3.0.

Given now that there seem to be more live FITS files which, on purpose
or by accident (not error), contain duplicated keywords, we could probably
demote the change from forbidding to strongly recommending against.

But it is hard to define a preferred way to deal with duplicated kwds,
unless we register alternate conventions which explicitly specify what
the reader should do about duplicated keywords. E.g.

   DUPKWDS = 'none'          assures that the FITS file was written without
                             any duplicated keywords

   DUPKWDS = 'ignore'        (or 'comments') declares that duplicated
                             keywords are of commentary nature, so they can
                             be ignored by s/w or dealt with as HISTORY or
                             COMMENTS

   DUPKWDS = 'take_first'    declare that only the first or last value
   DUPKWDS = 'take_last'     shall be considered

   DUPKWDS = 'concatenate'   declare (string) values wanting to be
                             concatenated (also numeric arrays ??)

Any other cases possible ?
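
To make the proposed values concrete, here is a rough Python sketch of how a generic reader might act on them. DUPKWDS is only a proposal in this message, not a registered convention, and the function name resolve_duplicates and the (keyword, value) input are assumptions for the example. Two readings are also assumed: under 'ignore' every occurrence of a duplicated keyword is set aside as commentary, and 'concatenate' simply joins the values as strings (the numeric-array case is left open, as above).

```python
def resolve_duplicates(cards, policy):
    """Apply one of the proposed DUPKWDS policies (hypothetical helper).

    cards  -- list of (keyword, value) pairs, assumed already parsed
    policy -- 'none', 'ignore', 'comments', 'take_first', 'take_last'
              or 'concatenate'
    Returns (resolved, commentary): a dict of single-valued keywords and
    a list of (keyword, value) pairs demoted to commentary.
    """
    known = ("none", "ignore", "comments", "take_first", "take_last",
             "concatenate")
    if policy not in known:
        raise ValueError(f"unknown DUPKWDS value: {policy!r}")

    # Group values by keyword; a dict keeps first-seen order.
    grouped = {}
    for keyword, value in cards:
        grouped.setdefault(keyword, []).append(value)

    resolved = {}
    commentary = []
    for keyword, values in grouped.items():
        if len(values) == 1:
            resolved[keyword] = values[0]
        elif policy == "none":
            # The writer promised no duplicates, so this file is inconsistent.
            raise ValueError(f"duplicate keyword {keyword} with DUPKWDS='none'")
        elif policy in ("ignore", "comments"):
            commentary.extend((keyword, v) for v in values)
        elif policy == "take_first":
            resolved[keyword] = values[0]
        elif policy == "take_last":
            resolved[keyword] = values[-1]
        else:  # 'concatenate'
            resolved[keyword] = "".join(str(v) for v in values)
    return resolved, commentary


cards = [("BITPIX", 16), ("OBSERVER", "A"), ("OBSERVER", "B")]
for policy in ("take_first", "take_last", "concatenate", "ignore"):
    print(policy, resolve_duplicates(cards, policy))
```

The sketch has no answer for files that carry no DUPKWDS keyword at all; the caller would still have to pick a default.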

But even with such conventions, we are still left with the problem of
what a generic reader should do with (older or not) files not following
any convention.

Lucio Chiappetti

--
----------------------------------------------------------------------
is a newsreading account used by more persons to
avoid unwanted spam. Any mail returning to this address will be rejected.
Users can disclose their e-mail address in the article if they wish so.