[fitsbits] Proposed Changes to the FITS Standard

**Rob Seaman** · #4 August 21st 07, 06:43 PM posted to sci.astro.fits

Lucio Chiappetti wrote:

- a keyword is intended as a named resource to be mainly read by
software, maybe into a variable, and then be acted upon (all the
mandatory and WCS keywords, those defined by specific conventions,
etc.)

- a keyword just records some information associated with a file, which
is intended to be read by a human, but it is hardly relevant to any
software (essentially "commentary" keywords).

I'd suggest FITS keywords fall into three categories:

1) FITS metadata, that is "data about FITS data" - examples start
with the mandatory keywords, SIMPLE, XTENSION, BITPIX, NAXISn,
PCOUNT, GCOUNT, but also CHECKSUM and DATASUM, etc.

2) Science metadata, that is "data about the data represented within
the HDU or file" - examples are DATE-OBS, EXPTIME, the slew of WCS
keywords, etc.

3) Provenance - this may be purely commentary, including COMMENT and
HISTORY, or may be contained in keywords with values. The point is
that it doesn't describe the file as it is, but rather how it came to
be. The most obvious example here is DATE.

One can make these distinctions finer grained - for instance INHERIT
is meta-science-metadata - but it isn't clear how useful that is
likely to be.
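
As a rough illustration of the three categories above, here is a minimal Python sketch. The function name `classify_keyword` and the category labels are hypothetical, and the sets hold only the keywords named in this post; the WCS keywords are not enumerated, so they fall through to "unclassified".

```python
# Keyword sets taken from the examples in the post; they are not exhaustive.
FITS_METADATA = {"SIMPLE", "XTENSION", "BITPIX", "NAXIS",
                 "PCOUNT", "GCOUNT", "CHECKSUM", "DATASUM"}
SCIENCE_METADATA = {"DATE-OBS", "EXPTIME"}
PROVENANCE = {"COMMENT", "HISTORY", "DATE"}


def classify_keyword(name):
    """Return a hypothetical category label for a FITS keyword name."""
    key = name.strip().upper()
    # NAXISn: "NAXIS" followed by an axis number, e.g. NAXIS1, NAXIS2.
    is_naxis_n = key.startswith("NAXIS") and key[5:].isdigit()
    if key in FITS_METADATA or is_naxis_n:
        return "fits"
    if key in SCIENCE_METADATA:
        return "science"
    if key in PROVENANCE:
        return "provenance"
    return "unclassified"
```

A real classifier would need a much longer list, and some keywords (INHERIT, for one) would not fit cleanly into any of the three bins.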

DUPKWDS = 'none'         assures that the FITS file was written without
                         any duplicated keywords

DUPKWDS = 'ignore'       (or 'comments') declares that duplicated
                         keywords are of commentary nature, so they can
                         be ignored by s/w or dealt with as HISTORY or
                         COMMENTS

DUPKWDS = 'take_first'   declare that only the first or last value
DUPKWDS = 'take_last'    shall be considered

DUPKWDS = 'concatenate'  declare (string) values wanting to be
                         concatenated (also numeric arrays ??)

Any other cases possible ?
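
To make the proposed values concrete, here is a minimal Python sketch of how a reader might act on them. DUPKWDS is only a proposal in this thread, not part of the FITS standard, and `resolve_duplicates` is a hypothetical helper; the header is modelled as a plain list of (keyword, value) pairs rather than real 80-character cards. Returning `None` for 'ignore' is one possible reading of "can be ignored by s/w".

```python
def resolve_duplicates(cards, keyword, policy):
    """Pick a value for `keyword` from (keyword, value) pairs in header order.

    `policy` is one of the proposed DUPKWDS values.
    """
    values = [v for k, v in cards if k == keyword]
    if not values:
        raise KeyError(keyword)
    if len(values) == 1:
        return values[0]          # no duplicate, nothing to resolve
    if policy == "none":
        # The writer promised there were no duplicates, so this file is suspect.
        raise ValueError(f"{keyword} is duplicated but DUPKWDS = 'none'")
    if policy in ("ignore", "comments"):
        return None               # treat duplicates as commentary only
    if policy == "take_first":
        return values[0]
    if policy == "take_last":
        return values[-1]
    if policy == "concatenate":
        return "".join(str(v) for v in values)
    raise ValueError(f"unknown DUPKWDS policy: {policy!r}")
```

For example, with `cards = [("OBSERVER", "A"), ("OBSERVER", "B")]`, the 'take_first' and 'take_last' policies pick opposite ends of the list, which is exactly the choice the thread has not settled.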

I suspect most will think we're reaching diminishing returns. If we
can't reach consensus on whether the first or last instance should
take precedence then "indeterminate" it will have to be. I'm still
interested to hear of cases where the duplicates are intentional.
Perhaps these would be addressed better through some other mechanism
than duplication?

But even with such conventions, we are still left with the problem of
what a generic reader should do with (older or not) files not following
any convention.

What is this generic reader people keep talking about? Data is only
ever read for some purpose. If the purpose is to display the header
to a human, then display both copies of duplicate keywords. If the
purpose is to semantically capture the value of such a keyword, INDEF
seems appropriate (and we would do our users a favor to clarify the
standard to say so). If the purpose is to copy the input to the
output, copy it verbatim. If the purpose is to validate the data
structures, throw a warning if you want on detecting a duplicate
keyword - just don't throw an error. But if it is one of the key
structural keywords, there is no need to clarify the standard to know
to throw a big, fat, juicy error, e.g., duplicating BITPIX calls the
parsing of the file into question. Beneath every standard lies a
bedrock of logic.
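
The purpose-by-purpose behaviour described above might look roughly like this in Python. This is a sketch under stated assumptions: `validate` and `semantic_value` are hypothetical names, the header is a list of (keyword, value) pairs, the "structural" set is assumed to be the mandatory keywords named earlier in this post, and `INDEF` is represented by `None`.

```python
import warnings
from collections import Counter

# Assumed structural keywords; NAXISn is matched separately below.
STRUCTURAL = {"SIMPLE", "XTENSION", "BITPIX", "NAXIS", "PCOUNT", "GCOUNT"}
COMMENTARY = {"COMMENT", "HISTORY", ""}   # repeats of these are normal
INDEF = None                              # stand-in for an indeterminate value


def is_structural(key):
    return key in STRUCTURAL or (key.startswith("NAXIS") and key[5:].isdigit())


def validate(cards):
    """Warn on duplicated keywords; raise only for structural ones."""
    counts = Counter(k for k, _ in cards if k not in COMMENTARY)
    for key, n in counts.items():
        if n < 2:
            continue
        if is_structural(key):
            raise ValueError(f"structural keyword {key} appears {n} times")
        warnings.warn(f"duplicate keyword {key} ({n} occurrences)")


def semantic_value(cards, keyword):
    """Return the keyword's value, or INDEF if it is duplicated."""
    values = [v for k, v in cards if k == keyword]
    if not values:
        raise KeyError(keyword)
    return values[0] if len(values) == 1 else INDEF
```

Displaying and copying need no special handling in this model: both simply pass every pair through in order, duplicates included.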

A nod is as good as a wink to a blind horse.

Rob