A Space & astronomy forum. SpaceBanter.com


[fitsbits] Proposed Changes to the FITS Standard



 
 
  #11  
Old August 17th 07, 06:53 PM posted to sci.astro.fits
Steve Allen

On Fri 2007-08-17T16:38:27 +0200, LC's NoSpam Newsreading account hath writ:
However it is always possible to add a COMMENT which claims conformance
with the latest (e.g. 3.0) standard.


For the sake of the problem that triggered us to overcome the starting
friction and actually write the FITS MIME document I would like to
go much farther than a comment.

I would like to see the IAUFWG establish a registry (in the sense of
the IANA) wherein all the documented FITS conventions have unique
names, and I would like to see a series of keywords which can be
placed in the PHDU to assert that "this FITS file employs these named
conventions".

When we get that far we may have solved Bill Joye's problem with
the ds9 viewer, which is to answer the questions:
I've just been given a FITS file.
What might I do with it?
How might I best present its content to the user?

FITS MIME could only go as far as representing to the internet
community that we have a file format and a robust process for
taking care of it. The rest of this work is only beginning.

--
Steve Allen WGS-84 (GPS)
UCO/Lick Observatory Natural Sciences II, Room 165 Lat +36.99855
University of California Voice: +1 831 459 3046 Lng -122.06015
Santa Cruz, CA 95064 http://www.ucolick.org/~sla/ Hgt +250 m
  #12  
Old August 17th 07, 07:43 PM posted to sci.astro.fits
Rob Seaman

Bill said:

The "once FITS always FITS" philosophy captures the spirit of FITS, but
in practice each new version of the FITS Standard has imposed new
requirements that in principle could invalidate existing FITS files.
For example, version 2.0 of the FITS Standard introduced a new
requirement that the value and comment fields in a keyword MUST be
separated by a slash character.


It would be interesting to review past such instances. I don't
personally recall changes of this mandatory nature. The example
regarding comments is pretty tame since any reasonable implementation
would already be ignoring the comments. Do you have another example
to quote?

Of course, if the FITS community thinks a new requirement would cause
too much dislocation to existing data or software, then an alternative
would be to just "strongly recommend" instead of "require" the new
feature.


Indeed. I think that would be best in all three instances quoted below.

It's also possible to specify that a new requirement will come
into effect at some point in the future to allow time for software
systems to adapt, as was done with the Y2000 change to the
DATE keyword format.


Obviously the scheduling for the Y2K changes was forced, but I agree
with Steve that any such new requirements should be announced well in
advance of taking effect.

I think, however, that there is a misapprehension about the DATE/DATE-OBS
changes. The new ISO date format was very carefully designed to
only be required for post-Y2K data (there was also some overlap
period as I recall). The old format remained - and remains - valid
to describe 20th century data. In fact, the old dd/mm/yy format was
clarified to explicitly denote such dates. No after-the-fact
requirements were leveraged onto archival data. This is very
different than attempting to place new absolute requirements that
would invalidate old data sets. I say "attempting to place", because
there is no mechanism for enforcement.
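
To make the coexistence of the two date forms concrete, here is a minimal sketch in Python of a reader that accepts both. It assumes the old form is always dd/mm/yy with the year read as 19yy, as described above, and that the new form starts with yyyy-mm-dd. The function name parse_fits_date is made up for illustration and does not come from any FITS library.

```python
from datetime import date

def parse_fits_date(value):
    """Return the calendar date from an old-style or ISO-style date string."""
    value = value.strip()
    if "/" in value:
        # Old form 'dd/mm/yy': the two-digit year is taken to mean 19yy.
        day, month, year = (int(part) for part in value.split("/"))
        return date(1900 + year, month, day)
    # New form: 'yyyy-mm-dd', optionally followed by 'Thh:mm:ss...'.
    # Only the date part is parsed here; the time of day is ignored.
    return date.fromisoformat(value[:10])
```

Because the old form stays valid for 20th century data, a reader along these lines never has to reject or rewrite an archival file.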

There are only 3 proposed new absolute requirements in this list:

1. Keywords that have a value shall not be repeated in a header.


I have many examples (hundreds of thousands?) of files in which
keywords are repeated. Rather than the wording in the current
proposal, I would replace the attempt at a requirement with a strong
recommendation and a clarification that the final copy of any such
repeated keyword should take precedence.

2. PCOUNT and GCOUNT must immediately follow the last NAXISn
keyword in all conforming extensions (as is already required
in IMAGE, TABLE, and BINTABLE extensions).


I guess I'd like to know if there are any such extensions. If not,
this is relatively safe. If so, make it a strong recommendation for
an explicit list of grandfathered extension types and an absolute
requirement for any newly defined extensions.
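
Finding out whether any such extensions exist is a cheap check to run over an archive. A rough sketch in Python, assuming the header has already been reduced to an ordered list of keyword names; the function name is hypothetical:

```python
import re
from typing import List

def pcount_gcount_follow_naxis(keywords: List[str]) -> bool:
    """True if PCOUNT and then GCOUNT come right after the last NAXISn."""
    naxis_positions = [
        i for i, name in enumerate(keywords) if re.fullmatch(r"NAXIS\d*", name)
    ]
    if not naxis_positions:
        return False  # no NAXIS keyword at all, so nothing to follow
    last = naxis_positions[-1]
    return keywords[last + 1:last + 3] == ["PCOUNT", "GCOUNT"]
```

A survey tool could log the extension type for every header where this returns False, which would give the explicit list of candidates for grandfathering.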

3. Embedded space characters are now forbidden within numeric
values in an ASCII Table (e.g. "1 23 4.5" is no longer
allowed to represent the decimal value 1234.5)


Again - are there any examples of such usage in the field?

I think the general principle, however, should reflect the "letter of
the law", not "spirit of the law".

I should end here by repeating my earlier appreciation of the
excellent effort that has gone into the revision. If this careful
revision has not uncovered any other critical new requirements that
must be applied ex post facto, one can opine that there are no
lurking dragons that need to be fought. That being the case, it
seems to me that the responsibility lies in preserving the great
investment in archival data products rather than in attempting to
legislate these new requirements on the back of our current holdings
and current software investment.

And should new dragons appear that the community deems must be slain,
it does indeed appear to this observer that an explicit version
keyword (whether a comment or not) should be simultaneously required
to trigger new conformance restrictions. The loose wording about
pre-existing data is unenforceable since there is no requirement (whether
or not there ought to be) for a DATE keyword to separate old from
new. Perhaps the new version tag could itself supply a date - in
that case, I'd recommend that any revisions of the standard should
contain explicit references to the date(s) that apply for different
feature(s).

Rob

  #13  
Old August 17th 07, 07:47 PM posted to sci.astro.fits
William Pence

Steve Allen wrote:
On Fri 2007-08-17T13:18:40 -0400, William Pence hath writ:
1. Keywords that have a value shall not be repeated in a header.


If this is to be implemented with exactly that wording then
on behalf of UCO/Lick/Keck I have to ask for a very clear
answer to this question:

Starting when?

We can do it, but in order to move the organization to get there we're
going to need a little warning beforehand.


Whether or not this exact wording is approved, you should probably
consider yourself warned that it might happen. :-)

The earliest that the regional FITS committees and the IAUFWG could
approve a new version of the FITS Standard would be early 2008. If
there were major disagreements, I would guess that it could take up to
an additional year to resolve the issues.

Bill Pence
--
__________________________________________________ __________________
Dr. William Pence
NASA/GSFC Code 662 HEASARC +1-301-286-4599 (voice)
Greenbelt MD 20771 +1-301-286-1684 (fax)


  #14  
Old August 17th 07, 08:13 PM posted to sci.astro.fits
Tim Pearson

On Aug 17, 2007, at 11:43 AM, Rob Seaman wrote:

There are only 3 proposed new absolute requirements in this list:

1. Keywords that have a value shall not be repeated in a header.


I have many examples (hundreds of thousands?) of files in which
keywords are repeated. Rather than the wording in the current
proposal, I would replace the attempt at a requirement with a strong
recommendation and a clarification that the final copy of any such
repeated keyword should take precedence.


I strongly support this. The proposed text makes such files invalid
FITS, retrospectively. Taking the last instance of a keyword is a
much more reasonable interpretation. But I note that a program that
blindly drops all but the last instance may lose the information
conveyed by: the number of instances, the list of values and their
order, and any associated comments.

Are there existing applications where a keyword can occur more than
once with different values, in which more than just the last
occurrence is intended to carry significant information?

[I understand that technically COMMENT and HISTORY keywords do not
"have a value".]

- Tim Pearson
  #15  
Old August 17th 07, 08:32 PM posted to sci.astro.fits
Rob Seaman

1. Keywords that have a value shall not be repeated in a header.

I have many examples (hundreds of thousands?) of files in which
keywords are repeated. Rather than the wording in the current
proposal, I would replace the attempt at a requirement with a strong
recommendation and a clarification that the final copy of any such
repeated keyword should take precedence.

2. PCOUNT and GCOUNT must immediately follow the last NAXISn
keyword in all conforming extensions (as is already required
in IMAGE, TABLE, and BINTABLE extensions).


I guess I'd like to know if there are any such extensions. If not,
this is relatively safe. If so, make it a strong recommendation for
an explicit list of grandfathered extension types and an absolute
requirement for any newly defined extensions.


It got me thinking, so I looked at the FITS parser in iSTB (the
current version of save-the-bits deployed on three mountaintops and
handling several terabytes of raw data annually). And no, I don't
currently require PCOUNT and GCOUNT to immediately follow NAXISn. I
do, however, throw an error if these particular keywords are
duplicated :-)

Speaking of which, it is the duplicate keyword requirement that seems
most onerous. To implement this efficiently for all keywords, one
would have to build a hash table or some such for each header. Then
one is left with the question of what to do upon detecting a
duplicate. The sense of a requirement is to simply throw an error
and exit. How helpful is that? STB will toss a FITS file if any of
the structural keywords (NAXISn, BITPIX, PCOUNT, GCOUNT, etc.) are
questionable - precisely because this calls into question the
possibility of handling the data appropriately. The daemon needs to
know the size of the file because it is reading it on the standard
input, perhaps concatenated with other files. The size of each
extension must be known to find subsequent extensions. Etc.
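
The reason these keywords are structural is that the size of the data unit follows from them alone. A minimal sketch in Python, assuming the usual formula from the FITS Standard, Nbits = |BITPIX| x GCOUNT x (PCOUNT + NAXIS1 x ... x NAXISm), with the result padded to whole 2880-byte blocks. The function name is hypothetical, and random-groups primary arrays are not handled.

```python
import math

BLOCK = 2880  # FITS block length in bytes

def data_bytes(bitpix, naxes, pcount=0, gcount=1):
    """Bytes taken by an HDU's data unit, including padding to a full block."""
    if not naxes:
        return 0  # NAXIS = 0: treated here as having no data unit
    bits = abs(bitpix) * gcount * (pcount + math.prod(naxes))
    nbytes = bits // 8
    return -(-nbytes // BLOCK) * BLOCK  # round up to a whole number of blocks
```

If any one of the inputs is wrong or ambiguous, the offset of every later extension in the stream is wrong too, which is why a duplicate there is fatal in a way that a duplicated temperature keyword is not.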

But am I to discard brand new data simply because some camera
temperature keyword appears twice? I spend a lot of time every week
trying to convince a dozen different instrument teams to provide the
archive with a reliable DATE-OBS, EXPTIME, FILTER, OBSTYPE, etcetera
and so forth. They'll rebel if I start tossing their data due to
foibles with minor engineering keywords.

I really think enforcing #1 will prove impossible in practice. I'm
not going to build a hash table to search for duplicates for every
keyword just so I can throw an error that will anger my stakeholders
over trivial details. And on the other hand, for pipeline reduced
science data sets, no requirement is needed since there already is
sufficient impetus for data providers to carefully tailor their data
products, eliminating duplicate keywords as a matter of course.
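
For scale, the hash-table pass itself is short. A sketch in Python, assuming the header is available as a list of 80-character card strings; the names and the two-level severity policy are hypothetical, chosen to mirror the distinction drawn above between structural and minor keywords.

```python
from collections import defaultdict

COMMENTARY = {"COMMENT", "HISTORY", ""}
STRUCTURAL = {"BITPIX", "PCOUNT", "GCOUNT"}  # NAXIS and NAXISn are handled below

def find_duplicates(cards):
    """Map each repeated value-bearing keyword to the card indices where it occurs."""
    seen = defaultdict(list)
    for index, card in enumerate(cards):
        keyword = card[:8].strip()
        if keyword == "END":
            break
        # A value-bearing card has '= ' in columns 9-10; skip everything else.
        if keyword in COMMENTARY or card[8:10] != "= ":
            continue
        seen[keyword].append(index)
    return {name: where for name, where in seen.items() if len(where) > 1}

def severity(keyword):
    """Hypothetical triage: only duplicated structural keywords are fatal."""
    if keyword in STRUCTURAL or keyword.startswith("NAXIS"):
        return "error"
    return "warning"
```

The open question is the policy in severity, not the detection: the code cannot decide what an archive should do with the file once the duplicate is found.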

Making it a strong recommendation is my own strong recommendation.

Rob

  #16  
Old August 17th 07, 08:38 PM posted to sci.astro.fits
Steve Allen

On Fri 2007-08-17T12:13:57 -0700, Tim Pearson hath writ:
Are there existing applications where a keyword can occur more than
once with different values, in which more than just the last
occurrence is intended to carry significant information?


In the instruments delivered to Keck by UCO/Lick and Caltech there
are repeated occurrences of keywords which not only have values,
but for which the data type of the value is different in the
different occurrences.

The information is significant, but only in the sense that it is a
dump of information which might be relevant to someone who is
debugging engineering aspects of the image data acquisition system
(and that's pretty much the explanation for why this atrocity
of poor form in database normalization exists in the first place).

To the extent that we share a common code base I have already
implemented the changes necessary to avoid such FITS files, but it
will have to be retrofitted and tested on each affected instrument.

--
Steve Allen WGS-84 (GPS)
UCO/Lick Observatory Natural Sciences II, Room 165 Lat +36.99855
University of California Voice: +1 831 459 3046 Lng -122.06015
Santa Cruz, CA 95064 http://www.ucolick.org/~sla/ Hgt +250 m
  #17  
Old August 17th 07, 08:55 PM posted to sci.astro.fits
William Pence

Rob Seaman wrote:
Bill said:

The "once FITS always FITS" philosophy captures the spirit of FITS, but
in practice each new version of the FITS Standard has imposed new
requirements that in principle could invalidate existing FITS files.
For example, version 2.0 of the FITS Standard introduced a new
requirement that the value and comment fields in a keyword MUST be
separated by a slash character.


It would be interesting to review past such instances. I don't
personally recall changes of this mandatory nature. The example
regarding comments is pretty tame since any reasonable implementation
would already be ignoring the comments. Do you have another example
to quote?


Some other new requirements were:

- keyword values are restricted to be a single value, not an array
- logical keyword values must consist of a single T or F followed
only by a space or a slash character
- integer and float keyword values must not contain embedded spaces
- complex keyword values must be enclosed in parentheses
- no other keywords may intervene between the mandatory keywords in
the primary array or extension
- the TFORM keyword values must be upper case (e.g., F5.2, not f5.2)

There are only 3 proposed new absolute requirements in this list:

1. Keywords that have a value shall not be repeated in a header.


I have many examples (hundreds of thousands?) of files in which
keywords are repeated. Rather than the wording in the current
proposal, I would replace the attempt at a requirement with a strong
recommendation and a clarification that the final copy of any such
repeated keyword should take precedence.


Imposing a new requirement on software systems to read the last instance
of the keyword would likely have a lot of negative repercussions.
Current software systems produce different results when reading a FITS
file with duplicate keywords. CFITSIO cyclically scans the header for
the next occurrence of the keyword following the last keyword that was
read or written, so the same application may read a different value
depending on exactly what processing was done beforehand. I'm sure
other commonly used software systems will always return the first
instance of the keyword, while other systems will always return the last
instance. Requiring all software systems to follow the same behavior is
not practical, so the only sure way to prevent users from getting an
incorrect result when analyzing the file is to eliminate duplicate
keywords in the first place. There is less harm if the duplicated
keywords all have the same value, so maybe the wording of this
requirement should be modified to take this into account.
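
The three behaviours can be compared side by side with a toy model. The Python sketch below is only an illustration of the lookup policies described above, not code from CFITSIO or any other library; the class and method names are made up.

```python
class Header:
    """Toy header: an ordered list of (keyword, value) pairs plus a read cursor."""

    def __init__(self, cards):
        self.cards = list(cards)
        self.cursor = 0  # index just past the most recently read card

    def first(self, key):
        return next(v for k, v in self.cards if k == key)

    def last(self, key):
        return next(v for k, v in reversed(self.cards) if k == key)

    def next_cyclic(self, key):
        """Search from the cursor, wrapping around to the top of the header."""
        n = len(self.cards)
        for step in range(n):
            i = (self.cursor + step) % n
            if self.cards[i][0] == key:
                self.cursor = i + 1
                return self.cards[i][1]
        raise KeyError(key)


cards = [("EXPTIME", 30.0), ("FILTER", "V"), ("EXPTIME", 60.0)]

fresh = Header(cards)
direct = fresh.next_cyclic("EXPTIME")        # cursor starts at the top

used = Header(cards)
used.next_cyclic("FILTER")                   # an earlier, unrelated read
after_filter = used.next_cyclic("EXPTIME")   # now finds the second copy
```

In this model direct is 30.0 and after_filter is 60.0 for the same file and the same keyword, which is the hazard: the answer depends on what was read earlier.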

2. PCOUNT and GCOUNT must immediately follow the last NAXISn
keyword in all conforming extensions (as is already required
in IMAGE, TABLE, and BINTABLE extensions).


I guess I'd like to know if there are any such extensions.


At least some of your FOREIGN extensions have the order of
these 2 keywords reversed.


3. Embedded space characters are now forbidden within numeric
values in an ASCII Table (e.g. "1 23 4.5" is no longer
allowed to represent the decimal value 1234.5)


Again - are there any examples of such usage in the field?


No, as far as we know. If there are any, then it is very likely that
most current software systems do not support embedded spaces in the
value and will silently read an incorrect value, or will exit with an
error. Thus, it seems better to me to outlaw this usage rather than
just not recommend it or deprecate it.
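
A short Python sketch of the two readings, for illustration only. The function names are hypothetical, and Fortran-style D exponents, which ASCII tables may also contain, are not handled.

```python
def parse_lenient(field):
    """Old reading: blanks inside the field are dropped before conversion."""
    return float(field.replace(" ", ""))

def parse_strict(field):
    """Proposed reading: surrounding blanks are fine, embedded ones are an error."""
    text = field.strip()
    if " " in text:
        raise ValueError("embedded space in numeric field: %r" % field)
    return float(text)
```

A reader that instead takes the first whitespace-separated token, as in float(field.split()[0]), would turn "1 23 4.5" into 1.0 without any warning, which is the silent failure described above.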

(...)


And should new dragons appear that the community deems must be slain,
it does indeed appear to this observer that an explicit version
keyword (whether a comment or not) should be simultaneously required
to trigger new conformance restrictions.


I don't really see any practical benefit to having a version keyword.
Either the software will support a new requirement, or it won't; the
presence of a version (or DATE) keyword isn't really helpful, except
maybe to a human reading the header.

The loose wording about
pre-existing data is unenforceable since there is no requirement (whether
or not there ought to be) for a DATE keyword to separate old from
new. Perhaps the new version tag could itself supply a date - in
that case, I'd recommend that any revisions of the standard should
contain explicit references to the date(s) that apply for different
feature(s).


The proposed new statement ("Existing FITS files that conformed to the
latest version of the standard at the time the files were created are
expressly exempt from any new requirements imposed by subsequent
versions of the standard.") is, I think, mainly intended as a political
statement to reassure institutions that the FITS committees are not
imposing new unfunded mandates that require modifications to existing
FITS archives. I don't see this statement as having much relevance to
the way software is implemented.

Bill Pence
--
__________________________________________________ __________________
Dr. William Pence
NASA/GSFC Code 662 HEASARC +1-301-286-4599 (voice)
Greenbelt MD 20771 +1-301-286-1684 (fax)


  #18  
Old August 17th 07, 09:21 PM posted to sci.astro.fits
William Pence

Rob Seaman wrote:
Speaking of which, it is the duplicate keyword requirement that seems
most onerous. To implement this efficiently for all keywords, one
would have to build a hash table or some such for each header. Then
one is left with the question of what to do upon detecting a
duplicate. The sense of a requirement is to simply throw an error
and exit. How helpful is that?


This is missing the main point of this new requirement. No current
software system that I am aware of (except for the FITS verifier code)
checks for duplicated keywords, so users have no idea which of the
duplicated keywords is being used by a particular program. The software
might be using the first, the 'next', or the last instance of the
keyword. This could easily cause the user to derive incorrect
scientific results. What is the best way to prevent this from
happening? Seems to me we should focus on the root of the problem and
(formally at least) disallow duplicated keywords in a conforming FITS
file. This doesn't mean software should automatically throw out a file
that inadvertently has a duplicated keyword.

Stepping back a little, I think the seriousness of this problem depends
on what keyword is duplicated. If it is just some observatory-specific
keyword that does not directly affect the scientific results, then it
does not matter very much, and data providers need not worry about it.
But if a critical WCS keyword, or exposure time keyword is duplicated in
the file with different values, then surely the data providers need to
take responsibility and fix the problem.

Bill Pence
--
__________________________________________________ __________________
Dr. William Pence
NASA/GSFC Code 662 HEASARC +1-301-286-4599 (voice)
Greenbelt MD 20771 +1-301-286-1684 (fax)


  #19  
Old August 17th 07, 09:24 PM posted to sci.astro.fits
Thierry Forveille

I think, however, that there is a misapprehension about the DATE/DATE-OBS
changes. The new ISO date format was very carefully designed to
only be required for post-Y2K data (there was also some overlap
period as I recall). The old format remained - and remains - valid
to describe 20th century data. In fact, the old dd/mm/yy format was
clarified to explicitly denote such dates. No after-the-fact
requirements were leveraged onto archival data.

Plus, at that time some change HAD to be made, since the old format
was going to wrap around, while here we have a choice.

1. Keywords that have a value shall not be repeated in a header.


I have many examples (hundreds of thousands?) of files in which
keywords are repeated. Rather than the wording in the current
proposal, I would replace the attempt at a requirement with a strong
recommendation and a clarification that the final copy of any such
repeated keyword should take precedence.

I similarly cannot see the value of this particular proposed
change: FITS readers will need to support repeated keywords forever,
given the very large numbers of existing files with them, so it's
not even as if this would simplify reading FITS. I am also
very much in favour of instead simply clarifying that the last
occurrence has precedence.

The other changes look more like matching the letter of the law
with its spirit, so are perfectly fine with me.

  #20  
Old August 18th 07, 09:30 PM posted to sci.astro.fits
Craig Markwardt


William Pence writes:
... There are only 3
proposed new absolute requirements in this list:

1. Keywords that have a value shall not be repeated in a header.

2. PCOUNT and GCOUNT must immediately follow the last NAXISn
keyword in all conforming extensions (as is already required
in IMAGE, TABLE, and BINTABLE extensions).

3. Embedded space characters are now forbidden within numeric
values in an ASCII Table (e.g. "1 23 4.5" is no longer
allowed to represent the decimal value 1234.5)

The public comment period on these, as well as all the other
recommended changes, remains open here on this email list/newsgroup
until at least the end of September...


Another proposed change, the case of the EXTEND keyword being made
optional, will also impose a software-change burden. Software which
previously relied on that keyword will now be required to check for
the presence of extensions in a different way.
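
One such different way is to skip past the primary HDU and look at what follows. The Python sketch below assumes the whole file is in memory as bytes, that the primary HDU is an ordinary array (no random groups), and that the integer keywords carry no '/' inside their values. The function name is hypothetical.

```python
BLOCK = 2880  # FITS block length in bytes
CARD = 80     # length of one header card in bytes

def has_extensions(raw):
    """Return True if an XTENSION header follows the primary HDU in raw."""
    values = {}
    offset = 0
    done = False
    while not done:
        block = raw[offset:offset + BLOCK]
        if len(block) < BLOCK:
            raise ValueError("primary header has no END card")
        offset += BLOCK
        for i in range(0, BLOCK, CARD):
            card = block[i:i + CARD].decode("ascii")
            keyword = card[:8].strip()
            if keyword == "END":
                done = True
                break
            if card[8:10] == "= ":
                # Keep only the value text before any '/' comment.
                values[keyword] = card[10:].split("/")[0].strip()
    naxis = int(values["NAXIS"])
    npix = 1
    for n in range(1, naxis + 1):
        npix *= int(values["NAXIS%d" % n])
    nbytes = abs(int(values["BITPIX"])) // 8 * npix if naxis else 0
    offset += -(-nbytes // BLOCK) * BLOCK  # skip the padded data unit
    return raw[offset:offset + 8] == b"XTENSION"
```

This costs a full parse of the primary header, where a reader that trusted EXTEND only had to test a single keyword.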

Craig

 



