#51
[fitsbits] Abuse of EXTEND keyword
I don't think you can have it both ways.

If currently, you can have extensions *without* having EXTEND=T in the initial header, then by the "once FITS, always FITS" rule, EXTEND=T has no meaning. To me, it originally had value because I didn't have to figure out where the next header was (I just returned an error if that was attempted without EXTEND=T), but if you've allowed other uses, you have to live with it and just create more headache for the software developers. Deprecate it. I agree with Arnold and others.

Arne
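
The two reader behaviours contrasted in this post can be sketched in a few lines of Python. This is only an illustration under assumptions: `primary` is taken to be a dict of already-parsed primary-header keywords, and the function name and the `strict` flag are hypothetical, not part of any FITS library.

```python
def should_look_for_extensions(primary, strict=True):
    """Decide whether a reader should try to locate a second header.

    primary: dict of parsed primary-header keywords (hypothetical input).
    strict=True mimics the older behaviour described above: without
    EXTEND = T the reader refuses to look for extensions and raises.
    strict=False ignores EXTEND and leaves it to the caller to probe the file.
    """
    if not strict:
        return True
    if primary.get("EXTEND") is True:
        return True
    raise ValueError("extension requested but primary header lacks EXTEND = T")
```

If files with extensions but no EXTEND = T are legal, only the `strict=False` path reads every legal file, which is the point being made about the keyword carrying no information.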
#52
[fitsbits] EXTEND/NEXTEND reality check
On Fri 2007-08-24T14:39:58 -0400, William Pence hath writ:
> if software needs to know if there are extensions in the file then it can just do an inventory of the file in real time. This is not a time consuming operation since it does not require reading the whole file; the software can directly jump to the header of each subsequent extension, at least on random access storage devices.

Let's do one reality check here. I agree with the above, but if you meet me on the street I am probably carrying 50 GB of file storage which can be addressed in arbitrary sequence (and I don't even know where I could find a tape drive to read the boxes of 9-track tapes in my office).

If there is anyone on whose equipment the process described by Bill Pence is burdensome, speak now, otherwise don't reply.

--
Steve Allen                                                  WGS-84 (GPS)
UCO/Lick Observatory        Natural Sciences II, Room 165    Lat  +36.99855
University of California    Voice: +1 831 459 3046           Lng -122.06015
Santa Cruz, CA 95064        http://www.ucolick.org/~sla/     Hgt +250 m
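
For readers who want to see what the header-to-header inventory quoted above involves, here is a minimal Python sketch. It is an illustration, not CFITSIO's implementation: the function names are made up for this example, only simple keyword values are parsed (no embedded quotes or slashes in strings, no CONTINUE cards), and random-groups primary arrays are not handled. The data size uses the usual rule of |BITPIX|/8 × GCOUNT × (PCOUNT + NAXIS1 × ... × NAXISn) bytes, rounded up to whole 2880-byte blocks.

```python
import io

BLOCK = 2880  # FITS logical record length in bytes
CARD = 80     # length of one header card


def read_header(f):
    """Read one header starting at the current position.

    Returns a dict of keyword -> value string, or None at end of file.
    """
    cards = {}
    while True:
        block = f.read(BLOCK)
        if len(block) < BLOCK:
            return None  # end of file (or a truncated block)
        for i in range(0, BLOCK, CARD):
            card = block[i:i + CARD].decode("ascii")
            key = card[:8].strip()
            if key == "END":
                return cards
            if card[8:10] == "= ":
                value = card[10:].split("/")[0].strip()  # drop any comment
                cards[key] = value.strip("'").strip()    # unquote strings


def data_bytes(h):
    """Unpadded size in bytes of the data unit that follows header h."""
    naxis = int(h["NAXIS"])
    if naxis == 0:
        return 0
    npix = 1
    for i in range(1, naxis + 1):
        npix *= int(h[f"NAXIS{i}"])
    gcount = int(h.get("GCOUNT", 1))
    pcount = int(h.get("PCOUNT", 0))
    return abs(int(h["BITPIX"])) // 8 * gcount * (pcount + npix)


def inventory(f):
    """Return [(byte_offset, type), ...] for every HDU in a seekable file."""
    hdus = []
    while True:
        offset = f.tell()
        h = read_header(f)
        if h is None:
            return hdus
        hdus.append((offset, h.get("XTENSION", "PRIMARY")))
        padded = -(-data_bytes(h) // BLOCK) * BLOCK  # round up to whole blocks
        f.seek(padded, io.SEEK_CUR)  # skip the data without reading it
```

With a file opened in binary mode, `inventory(f)` reads only the header blocks; the data units are skipped with a relative seek, which is why the cost depends on the number of extensions rather than on the file size.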
#53
[fitsbits] EXTEND/NEXTEND reality check
Hi All -

There is nothing in the FITS standard which prevents a primary header from containing information describing the remainder of the file, and in fact this is often done with multi-extension files (the INHERIT convention for example). Likewise there is nothing in FITS which prevents one from adding a keyword such as NEXTEND. It can even be useful to do these things. Of course such usage does require that a FITS writer be prepared to modify the primary header, but that is not such a big deal (perhaps easier than hopping along sequentially through 50 extensions), and existing software already does this kind of thing. Since it is legal FITS, already current practice, and possibly even useful, there is no reason to deprecate such usage.

MEF files are an important part of FITS, but unfortunately this issue of describing the overall MEF as a structured object has never been adequately addressed. While it may be possible for a MEF to be a random collection of FITS extensions that are concatenated together, this is not the only model nor necessarily even the best one. In many applications one would like to consider the MEF to be a structured object, and be able to add high level information to describe the object, independent from what is stored in the individual extensions. The most logical place to put this information in FITS currently is in the primary header. One could easily argue that we need *more* keywords in the primary header to properly describe a MEF, not less as seems to be the trend of this discussion.

- Doug
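
As an illustration of what "be prepared to modify the primary header" means for a writer, the sketch below rewrites an existing NEXTEND card in place. It rests on assumptions: the function names and the comment text are invented for this example, the file is assumed to be open for reading and writing in binary mode, and only the case where a NEXTEND card already exists is covered (adding a new card can require growing the header by a 2880-byte block, which is not shown).

```python
CARD = 80  # every header card is exactly 80 bytes


def format_int_card(keyword, value, comment=""):
    """Format a fixed-format integer card (value right-justified to column 30)."""
    text = f"{keyword:<8}= {value:>20}"
    if comment:
        text += f" / {comment}"
    return text[:CARD].ljust(CARD).encode("ascii")


def set_nextend(f, count):
    """Overwrite the NEXTEND card in the primary header of a seekable file.

    Returns True if the card was found and rewritten, False otherwise.
    The header length is unchanged because the card is replaced byte for byte.
    """
    f.seek(0)
    pos = 0
    while True:
        card = f.read(CARD)
        if len(card) < CARD:
            return False  # ran off the end of the file
        key = card[:8].decode("ascii").strip()
        if key == "END":
            return False  # no NEXTEND card in the primary header
        if key == "NEXTEND":
            f.seek(pos)
            f.write(format_int_card("NEXTEND", count, "Number of extensions"))
            return True
        pos += CARD
```

A writer that appends an extension would call something like `set_nextend(f, n + 1)` afterwards. The update touches 80 bytes at a known place, which is the sense in which it is "not such a big deal" on random-access media.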
#54
[fitsbits] EXTEND/NEXTEND reality check
Steve Allen inquired:
> If there is anyone on whose equipment the process described by Bill Pence is burdensome, speak now, otherwise don't reply.

I'm not sure we should be discussing burdensome demands placed on anybody's equipment, and you guys can do whatever you want with [N]EXTEND, but it behooves us to consider that even in a world of limitless resources some requirements pertain to serialization, not random access. Not all FITS files are realized on hard media, random or otherwise - in particular, any number of applications must parse FITS received via the standard input or similar byte stream transport. Certainly NOAO's Save-the-bits does.

The question isn't only whether you need to seek willy-nilly through a file, the question is whether it is efficient to seek in whatever manner for whatever application. STB doesn't care since my mandate is to - well - save all the intervening bits. And supporting some modest complexity MEF file on disk likely won't be a challenge since you only have to parse and skip a few dozen times. On the other hand, perhaps there are applications relying on data files with hundreds or thousands of extensions that require idiosyncratic routing? In that case, even on random access media [N]EXTEND likely doesn't do *enough* and one might be tempted to maintain a table of XTENSION keyword byte offsets in the primary (or some other) HDU.

The tape-based STB relied on constructing a bi-directional linked list of FITS header objects since a single large(ish) tape file might contain hundreds of small IR images packed as IMAGE extensions. Even so, the problem was made much simpler by requiring an intermediate random access copy of the new MEF before writing it to tape. This was deemed much more robust in an uncertain mountain environment than copying the serial input direct through to the serial output. A similar logic applied to the CD-based STB which (of course) used the normal two step process of 1) creating ISO image on scratch disk, and 2) mastering CD.

Rob
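
To make the serialization point concrete, here is a hedged Python sketch of parsing a FITS byte stream that cannot seek (standard input, a pipe, a socket). It is not the Save-the-bits code; the function names are hypothetical and the header parsing is deliberately simplified (simple values only, no random groups). Every data block has to be read to get past it, and the byte offsets it yields are the kind of table of XTENSION offsets mentioned above.

```python
BLOCK = 2880  # FITS logical record length in bytes
CARD = 80     # length of one header card


def read_exact(stream, n):
    """Read n bytes from a stream that may return short reads; fewer only at EOF."""
    chunks = []
    while n > 0:
        chunk = stream.read(n)
        if not chunk:
            break
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def stream_offsets(stream):
    """Yield (byte_offset, type, data_blocks) per HDU, using read() only."""
    offset = 0
    while True:
        start = offset
        cards = {}
        done = False
        while not done:
            block = read_exact(stream, BLOCK)
            if len(block) < BLOCK:
                return  # end of stream
            offset += BLOCK
            for i in range(0, BLOCK, CARD):
                card = block[i:i + CARD].decode("ascii")
                key = card[:8].strip()
                if key == "END":
                    done = True
                    break
                if card[8:10] == "= ":
                    value = card[10:].split("/")[0].strip()
                    cards[key] = value.strip("'").strip()
        size = 0
        naxis = int(cards["NAXIS"])
        if naxis > 0:
            npix = 1
            for i in range(1, naxis + 1):
                npix *= int(cards[f"NAXIS{i}"])
            size = (abs(int(cards["BITPIX"])) // 8 * int(cards.get("GCOUNT", 1))
                    * (int(cards.get("PCOUNT", 0)) + npix))
        nblocks = -(-size // BLOCK)  # data unit length in whole blocks
        for _ in range(nblocks):
            # No seek available: the data must be consumed (here, discarded).
            if len(read_exact(stream, BLOCK)) < BLOCK:
                break
            offset += BLOCK
        yield start, cards.get("XTENSION", "PRIMARY"), nblocks
```

Fed with `sys.stdin.buffer`, the generator produces the offset table only after the whole stream has passed through, so a count such as NEXTEND in the primary header cannot be verified by a streaming reader until the end of the file.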
#55
[fitsbits] Abuse of EXTEND keyword
Keep in mind that if a keyword is officially "deprecated", then we are telling software developers that they really should change their software that creates FITS files so that it no longer writes the keyword (unless one claims the loophole that applications that already use a deprecated structure can continue to use it). Simply making the keyword optional would have less impact on developers and users.

For example, if I were mandated to change CFITSIO to no longer write the EXTEND keyword (it currently automatically writes EXTEND = T in every new FITS file), then after projects upgrade to the new version, their FITS files will suddenly be different than before. The change should be harmless, but still this may cause their regression testing to fail.

arne wrote:
> I don't think you can have it both ways.
>
> If currently, you can have extensions *without* having EXTEND=T in the initial header, then by the "once FITS, always FITS" rule, EXTEND=T has no meaning. To me, it originally had value because I didn't have to figure out where the next header was (I just returned an error if that was attempted without EXTEND=T), but if you've allowed other uses, you have to live with it and just create more headache for the software developers. Deprecate it. I agree with Arnold and others.
>
> Arne

--
Dr. William Pence                          NASA/GSFC Code 662
HEASARC                                    +1-301-286-4599 (voice)
Greenbelt MD 20771                         +1-301-286-1684 (fax)
#56
[fitsbits] Abuse of EXTEND keyword
William Pence wrote:
> This mention of the NEXTEND keyword caused me to rethink what is really meant by EXTEND = F: EXTEND = T is defined to mean that the FITS file *is permitted* to have extensions following the primary array (the actual wording in the Standard is "may contain extensions"); it does not mean that the file actually has any extensions. If EXTEND = F, then this logically means the opposite, i.e., that the FITS file is *not permitted* to have any extensions. But this is not the meaning that the STEREO mission folks really intended by setting EXTEND = F since they surely don't care if users were to add extensions to the files at a later date.

We don't expect users to add extensions on their own.

Personally, I have no objection to deprecating the EXTEND keyword. The current definition isn't particularly meaningful.

Bill Thompson

--
William Thompson
NASA Goddard Space Flight Center
Code 671
Greenbelt, MD 20771
USA
301-286-2040
#57
[fitsbits] Abuse of EXTEND keyword
William Thompson wrote:
> William Pence wrote:
>> This mention of the NEXTEND keyword caused me to rethink what is really meant by EXTEND = F: EXTEND = T is defined to mean that the FITS file *is permitted* to have extensions following the primary array (the actual wording in the Standard is "may contain extensions"); it does not mean that the file actually has any extensions. If EXTEND = F, then this logically means the opposite, i.e., that the FITS file is *not permitted* to have any extensions. But this is not the meaning that the STEREO mission folks really intended by setting EXTEND = F since they surely don't care if users were to add extensions to the files at a later date.
>
> We don't expect users to add extensions on their own.

This is going on the order of "I hate to say this, but": I've come to expect that users don't really care what we expect.

Aloha,
Maren
#58
[fitsbits] EXTEND/NEXTEND reality check
Doug Tody wrote:
> MEF files are an important part of FITS, but unfortunately this issue of describing the overall MEF as a structured object has never been adequately addressed. While it may be possible for a MEF to be a random collection of FITS extensions that are concatenated together, this is not the only model nor necessarily even the best one. In many applications one would like to consider the MEF to be a structured object, and be able to add high level information to describe the object, independent from what is stored in the individual extensions. The most logical place to put this information in FITS currently is in the primary header. One could easily argue that we need *more* keywords in the primary header to properly describe a MEF, not less as seems to be the trend of this discussion.

Or one could take the (in my view more logical) view that a MEF file is the wrong choice when strong structural information needs to be included :-)
#59
[fitsbits] EXTEND/NEXTEND reality check
Well, we are talking about FITS here though. In any case, one can make a strong case, for data beyond a certain level of complexity, for composing complex datasets by aggregating primary data objects (image, table, etc. - standard components) in a general container, and describing the structure logically via a combination of relationships and higher level metadata. MEF works reasonably well for this. The more "obvious" approach of explicit structure (XML etc.) makes the structure more directly visible to a human, but is less flexible and extensible. Basically this is the old relational vs hierarchical structure debate. There is much more that could be said about this; suffice it to say that it is by no means obvious that the explicitly structured alternatives are a better choice, although both approaches have their place.

Getting back to the FITS issues, the main point is that we already routinely produce complex datasets, usually containing raw or calibrated instrumental data, composed as a FITS MEF, often with some higher level metadata in the primary header (or sometimes in a special extension). This is quite an important and powerful way to use FITS and should not be deprecated.

One of the biggest problems we have currently is that there is no way in standard FITS to warn a generic client application that it cannot safely modify the file (e.g., edit individual extensions) without the possibility of breaking things. One sees this with something as simple as NEXTEND. Basically there are two types of MEF, one where the extensions are completely independent, and one where they are related in some fashion. One could easily argue that, rather than do away with things like NEXTEND, we could use a few more extension-related keywords in the primary header to say something about how extensions are used in this particular MEF.

- Doug

On Sat, 25 Aug 2007, Thierry Forveille wrote:
> Doug Tody wrote:
>> MEF files are an important part of FITS, but unfortunately this issue of describing the overall MEF as a structured object has never been adequately addressed. While it may be possible for a MEF to be a random collection of FITS extensions that are concatenated together, this is not the only model nor necessarily even the best one. In many applications one would like to consider the MEF to be a structured object, and be able to add high level information to describe the object, independent from what is stored in the individual extensions. The most logical place to put this information in FITS currently is in the primary header. One could easily argue that we need *more* keywords in the primary header to properly describe a MEF, not less as seems to be the trend of this discussion.
>
> Or one could take the (in my view more logical) view that a MEF file is the wrong choice when strong structural information needs to be included :-)
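
The NEXTEND breakage described in this post is easy to reproduce on paper. The sketch below is purely illustrative: it models a MEF as a Python list of header dicts (primary first), which is an assumption made for brevity, and `nextend_mismatch` is a hypothetical helper rather than a function of any FITS library.

```python
def nextend_mismatch(headers):
    """Compare NEXTEND in the primary header with the actual extension count.

    headers: list of dicts, primary header first (hypothetical in-memory form).
    Returns None when NEXTEND is absent or correct, else (declared, actual).
    """
    declared = headers[0].get("NEXTEND")
    actual = len(headers) - 1
    if declared is None or int(declared) == actual:
        return None
    return int(declared), actual


# A structured MEF whose primary header describes the whole file.
mef = [
    {"SIMPLE": True, "NEXTEND": 2},
    {"XTENSION": "IMAGE"},
    {"XTENSION": "BINTABLE"},
]
consistent_before = nextend_mismatch(mef)  # None: header and file agree

# A generic client drops one extension without touching the primary header.
del mef[2]
broken_after = nextend_mismatch(mef)       # (2, 1): NEXTEND is now stale
```

Nothing in the file tells the generic client that the primary header depends on the extensions that follow, which is the gap the proposed extension-related keywords would be meant to close.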
#60
[fitsbits] EXTENDing FITS
(I see Doug has also replied, arguing a middle path.)

Thierry Forveille wrote:
> Or one could take the (in my view more logical) view that a MEF file is the wrong choice when strong structural information needs to be included :-)

Presumably we would then not want to absolutely deprecate special records :-)

Astronomical data is both large and often composed of large chunks. A modest sized single data object of almost any nature can be squeezed into a bintable with a purpose tailored schema. One such example would be a tile-compressed image. A large but fine-grained data object, e.g., a catalog, also wants something equivalent to a bintable. It is the chunkiness (of FITS and of the underlying data) that demands an MEF-style mapping.

Whatever the nature of each chunk - array or table or something else - there are usage scenarios requiring simple access at the level of both the individual chunks and the entire file. MEF is one solution that provides this. A tightly coupled global tabular structure would be significantly more unwieldy, e.g., Unix-style software tools would no longer work, etc.

If we don't think MEFs are sufficient to the task - or that they could be modified to serve the purpose - but we do want to retain FITS as our foundation - then something other than stuffing everything into a single monolithic bintable will be needed. Special records would be the only place to go.

Rather, it seems to me that both FITS and MEF are worthy - if idiosyncratic - platforms for the next round of evolution of astronomical data. Our formats need to scale to data past the 2 GB divide, e.g., LSST's 3.2 gigapixel camera - yet astronomers will want to use familiar image processing tools (IRAF, IDL, etc.) on portions and the whole of the focal plane. It is immaterial if these looming data cows choose some alternate internal format, since unless FITS is going to be retired community-wide, it will have to retain significant utility with new data products anyway. (Or perhaps IVOA has a master plan of sinking FITS beneath the waves?)

Rather, I think something along the lines of "Encapsulated FITS" would combine both a semantically mature skeleton (fine grained per-HDU data model with versioning/name-spacing) with coarse grained structuring - perhaps a new INDEX extension type to be appended as needed to complex MEF files. Per-HDU semantics plus per-file indexing. There are options such as embedding XML structures to make everybody happy (or maybe just as well - equally unhappy :-)

Meta-discussions as with EXTEND or duplicate keywords or keyword continuation reflect a need for coherently extending our vision of FITS. The status quo likely won't scale to projects currently under development. The alternatives therefore are to use something other than FITS, or to extend FITS. Whichever it is, we should get started soonish.

Rob