August 17th 07, 08:55 PM, posted to sci.astro.fits
William Pence
[fitsbits] Proposed Changes to the FITS Standard

Rob Seaman wrote:
Bill said:

The "once FITS always FITS" philosophy captures the spirit of FITS, but
in practice each new version of the FITS Standard has imposed new
requirements that in principle could invalidate existing FITS files.
For example, version 2.0 of the FITS Standard introduced a new
requirement that the value and comment fields in a keyword MUST be
separated by a slash character.


It would be interesting to review past such instances. I don't
personally recall changes of this mandatory nature. The example
regarding comments is pretty tame since any reasonable implementation
would already be ignoring the comments. Do you have another example
to quote?


Some other new requirements were:

- keyword values are restricted to be a single value, not an array
- logical keyword values must consist of a single T or F followed
only by a space or a slash character
- integer and float keyword values must not contain embedded spaces
- complex keyword values must be enclosed in parentheses
- no other keywords may intervene between the mandatory keywords in
the primary array or extension
- the TFORM keyword values must be upper case (e.g., F5.2, not f5.2)
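A minimal sketch of how a reader might test a single header card against a few of the rules listed above (logical values, embedded spaces in numbers, upper-case TFORM). It assumes the usual fixed layout, with the keyword name in columns 1-8 and "= " in columns 9-10. The function names are hypothetical and the checks are far from a complete validator.

```python
def split_value_comment(field):
    """Split a card's value field at the first slash outside a quoted string."""
    in_quote = False
    for i, ch in enumerate(field):
        if ch == "'":
            in_quote = not in_quote
        elif ch == "/" and not in_quote:
            return field[:i].strip(), field[i + 1:].strip()
    return field.strip(), ""


def card_problems(card):
    """Return a list of rule violations found in one header card."""
    keyword = card[:8].rstrip()
    if card[8:10] != "= ":
        return []  # no value indicator, so none of these rules apply
    value, _comment = split_value_comment(card[10:])
    problems = []
    if value.startswith("'"):
        # Character string value: only the TFORM case rule is checked here.
        text = value.strip("'")
        if keyword.startswith("TFORM") and text != text.upper():
            problems.append("TFORM value must be upper case")
    elif value.startswith("("):
        pass  # complex value in parentheses; not checked further here
    elif value[:1] in ("T", "F"):
        if value not in ("T", "F"):
            problems.append("logical value must be a single T or F")
    elif " " in value:
        problems.append("numeric value contains embedded spaces")
    return problems


for card in ("SIMPLE  =                    T / file conforms",
             "NAXIS1  =                 1 024",
             "TFORM1  = 'f5.2    '"):
    print(card[:8], card_problems(card))
```

Files written before these rules existed can fail such checks, which is why a reader would probably report these as warnings rather than refuse the file.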

There are only 3 proposed new absolute requirements in this list:

1. Keywords that have a value shall not be repeated in a header.


I have many examples (hundreds of thousands?) of files in which
keywords are repeated. Rather than the wording in the current
proposal, I would replace the attempt at a requirement with a strong
recommendation and a clarification that the final copy of any such
repeated keyword should take precedence.


Imposing a new requirement on software systems to read the last instance
of the keyword would likely have a lot of negative repercussions.
Current software systems produce different results when reading a FITS
file with duplicate keywords. CFITSIO cyclically scans the header for
the next occurrence of the keyword following the last keyword that was
read or written, so the same application may read a different value
depending on exactly what processing was done beforehand. I'm sure
other commonly used software systems will always return the first
instance of the keyword, while other systems will always return the last
instance. Requiring all software systems to follow the same behavior is
not practical, so the only sure way to prevent users from getting an
incorrect result when analyzing the file is to eliminate duplicate
keywords in the first place. There is less harm if the duplicated
keywords all have the same value, so maybe the wording of this
requirement should be modified to take this into account.
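To make the ambiguity concrete, here is a toy model of the three lookup strategies described above, with a header represented as a list of (keyword, value) pairs. The cyclic reader only imitates the behavior as described in this message; it is not CFITSIO code, and all names are hypothetical.

```python
from collections import defaultdict


def read_first(cards, key):
    """Return the value of the first card named key."""
    for k, v in cards:
        if k == key:
            return v
    raise KeyError(key)


def read_last(cards, key):
    """Return the value of the last card named key."""
    for k, v in reversed(cards):
        if k == key:
            return v
    raise KeyError(key)


class CyclicReader:
    """Search forward from the card after the last one read, wrapping around."""

    def __init__(self, cards):
        self.cards = cards
        self.pos = 0

    def read(self, key):
        n = len(self.cards)
        for step in range(n):
            i = (self.pos + step) % n
            if self.cards[i][0] == key:
                self.pos = (i + 1) % n
                return self.cards[i][1]
        raise KeyError(key)


def conflicting_duplicates(cards):
    """List keywords that occur with more than one distinct value."""
    values = defaultdict(set)
    for k, v in cards:
        values[k].add(v)
    return sorted(k for k, vs in values.items() if len(vs) > 1)


header = [("EXPTIME", 10.0), ("OBJECT", "M31"), ("EXPTIME", 20.0)]
print(read_first(header, "EXPTIME"), read_last(header, "EXPTIME"))

reader = CyclicReader(header)
reader.read("OBJECT")            # earlier processing moves the position
print(reader.read("EXPTIME"))    # finds the second EXPTIME card
print(conflicting_duplicates(header))
```

Duplicates that all carry the same value are not reported by conflicting_duplicates, matching the observation that they do less harm.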

2. PCOUNT and GCOUNT must immediately follow the last NAXISn
keyword in all conforming extensions (as is already required
in IMAGE, TABLE, and BINTABLE extensions).
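A rough sketch of how this ordering rule could be checked, given only the keyword names of an extension header in file order. The function name is hypothetical, and the check assumes PCOUNT must come before GCOUNT, as in the standard extensions.

```python
import re


def pcount_gcount_in_order(keywords):
    """True if PCOUNT, GCOUNT directly follow the last NAXIS/NAXISn keyword."""
    axis_positions = [i for i, k in enumerate(keywords)
                      if re.fullmatch(r"NAXIS\d*", k)]
    if not axis_positions:
        return False  # no NAXIS keyword at all, so the rule cannot be met
    last_axis = axis_positions[-1]
    return keywords[last_axis + 1:last_axis + 3] == ["PCOUNT", "GCOUNT"]


good = ["XTENSION", "BITPIX", "NAXIS", "NAXIS1", "PCOUNT", "GCOUNT"]
swapped = ["XTENSION", "BITPIX", "NAXIS", "NAXIS1", "GCOUNT", "PCOUNT"]
print(pcount_gcount_in_order(good), pcount_gcount_in_order(swapped))
```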


I guess I'd like to know if there are any such extensions.


At least some of your FOREIGN extensions have the order of
these 2 keywords reversed.


3. Embedded space characters are now forbidden within numeric
values in an ASCII Table (e.g. "1 23 4.5" is no longer
allowed to represent the decimal value 1234.5)


Again - are there any examples of such usage in the field?


No, as far as we know. If there are any, then it is very likely that
most current software systems do not support embedded spaces in the
value and will silently read an incorrect value, or will exit with an
error. Thus, it seems better to me to outlaw this usage rather than
just not recommend it or deprecate it.
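The risk of a silently wrong value can be shown with three hypothetical ways of parsing an ASCII table field. This is only a sketch with invented function names; it ignores details such as Fortran-style exponents.

```python
def parse_strict(field):
    """Reject embedded spaces, as the proposed rule would require."""
    text = field.strip()
    if " " in text:
        raise ValueError(f"embedded space in numeric field: {field!r}")
    return float(text)


def parse_legacy(field):
    """Old interpretation: embedded spaces are simply ignored."""
    return float(field.replace(" ", ""))


def parse_naive(field):
    """A reader that takes the first whitespace-separated token."""
    return float(field.split()[0])


field = "1 23 4.5"
print(parse_legacy(field))   # the value the old rule intended
print(parse_naive(field))    # a different value, with no error raised
try:
    parse_strict(field)
except ValueError as err:
    print("rejected:", err)
```

The naive reader is the worrying case: it returns a number without complaint, so the user has no sign that anything went wrong.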

(...)


And should new dragons appear that the community deems must be slain,
it does indeed appear to this observer that an explicit version
keyword (whether a comment or not) should be simultaneously required
to trigger new conformance restrictions.


I don't really see any practical benefit to having a version keyword.
Either the software will support a new requirement, or it won't; the
presence of a version (or DATE) keyword isn't really helpful, except
maybe to a human reading the header.

The loose wording about pre-existing data is unenforceable since
there is no requirement (whether
or not there ought to be) for a DATE keyword to separate old from
new. Perhaps the new version tag could itself supply a date - in
that case, I'd recommend that any revisions of the standard should
contain explicit references to the date(s) that apply for different
feature(s).


The proposed new statement ("Existing FITS files that conformed to the
latest version of the standard at the time the files were created are
expressly exempt from any new requirements imposed by subsequent
versions of the standard.") is, I think, mainly intended as a political
statement to reassure institutions that the FITS committees are not
imposing new unfunded mandates that require modifications to existing
FITS archives. I don't see this statement as having much relevance to
the way software is implemented.

Bill Pence
--
____________________________________________________________________
Dr. William Pence
NASA/GSFC Code 662 HEASARC +1-301-286-4599 (voice)
Greenbelt MD 20771 +1-301-286-1684 (fax)