#1
[fitsbits] Dataset identifications.
Thomas McGlynn writes:
One that comes to mind is whether use of this keyword should be recommended only for the primary header of a FITS file. If not, then a file may not be associated with a unique dataset id. Is that a problem, or actually a desirable characteristic? I could imagine a single FITS file containing data from several origins in different extensions (e.g. M31 images from IRAS, ROSAT, and the CfA CO(1-0) survey) where different DS_IDENT would be appropriate. Or should such a merged dataset be given a new DS_IDENT and forget about the original ones? At first sight at least, retaining the original pedigree sounds like a good thing.
#2
[fitsbits] Dataset identifications.
Thierry Forveille wrote:
Thomas McGlynn writes: One that comes to mind is whether use of this keyword should be recommended only for the primary header of a FITS file. If not, then a file may not be associated with a unique dataset id. Is that a problem, or actually a desirable characteristic? I could imagine a single FITS file containing data from several origins in different extensions (e.g. M31 images from IRAS, ROSAT, and the CfA CO(1-0) survey) where different DS_IDENT would be appropriate. Or should such a merged dataset be given a new DS_IDENT and forget about the original ones? At first sight at least, retaining the original pedigree sounds like a good thing.

Keeping the pedigree is a good idea - that would surely be one use of the IDs. However, I'm concerned that if a single FITS file is not atomic with respect to the dataset identifier, then it would generally be difficult for software to use the ID to tie things together. E.g., one would need to scan the entire FITS file to find the included IDs.

If one has a derived data product that was the concatenation of several datasets, then I'd expect there to be some link in the header to the original datasets, but that this derived data product would indeed have its own dataset id (assuming it had any).

My feeling is that if the two elements of data are so tightly coupled that they would normally be put in the same FITS file, then one would consider them to be the same dataset. At the HEASARC datasets generally comprise more than one FITS file. I can't think off the top of my head of an example where a given FITS file would naturally split up into multiple IDs.

Tom
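To make the scanning concern concrete, here is a minimal pure-Python sketch of what "scan the entire FITS file to find the included IDs" involves: every header has to be read, and every data unit skipped, before the full list of IDs is known. The helper names (`card`, `make_hdu`, `read_header`, `data_size`, `collect_ids`) and the ID values are hypothetical, the string parsing ignores doubled quotes and long strings, and random-groups data is not handled. DS_IDENT is only the keyword under discussion in this thread, not an adopted convention.

```python
import io

BLOCK = 2880  # FITS logical record length in bytes
CARD = 80     # length of one header card

def card(key, value):
    """Format one header card (simplified fixed-format writer)."""
    if isinstance(value, bool):
        text = f"{'T' if value else 'F':>20}"
    elif isinstance(value, str):
        text = f"'{value:<8}'"
    else:
        text = f"{value:>20}"
    return f"{key:<8}= {text}"

def make_hdu(cards, data=b""):
    """Build one HDU: header cards plus END, each unit padded to 2880 bytes."""
    header = "".join(c.ljust(CARD) for c in cards + ["END"]).encode("ascii")
    header += b" " * (-len(header) % BLOCK)
    return header + data + b"\0" * (-len(data) % BLOCK)

def parse_value(field):
    """Return the value part of a card as a string (comments dropped)."""
    field = field.strip()
    if field.startswith("'"):
        return field[1:field.find("'", 1)].rstrip()
    return field.split("/")[0].strip()

def read_header(stream):
    """Read one header; return {keyword: value string}, or None at end of file."""
    cards = {}
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            return None
        for i in range(0, BLOCK, CARD):
            text = block[i:i + CARD].decode("ascii")
            key = text[:8].strip()
            if key == "END":
                return cards
            if text[8:10] == "= ":
                cards[key] = parse_value(text[10:])

def data_size(header):
    """Padded size in bytes of the data unit that follows this header."""
    naxis = int(header.get("NAXIS", 0))
    if naxis == 0:
        return 0
    count = 1
    for i in range(1, naxis + 1):
        count *= int(header[f"NAXIS{i}"])
    nbytes = (abs(int(header["BITPIX"])) // 8
              * int(header.get("GCOUNT", 1))
              * (int(header.get("PCOUNT", 0)) + count))
    return nbytes + (-nbytes % BLOCK)

def collect_ids(stream, keyword="DS_IDENT"):
    """Walk every HDU and return [(hdu_index, id), ...]."""
    ids = []
    index = 0
    while True:
        header = read_header(stream)
        if header is None:
            return ids
        if keyword in header:
            ids.append((index, header[keyword]))
        stream.seek(data_size(header), io.SEEK_CUR)  # skip the data unit
        index += 1

def image_extension(ds_ident):
    """A tiny 4x2 8-bit IMAGE extension carrying its own (made-up) ID."""
    cards = [card("XTENSION", "IMAGE"), card("BITPIX", 8), card("NAXIS", 2),
             card("NAXIS1", 4), card("NAXIS2", 2), card("PCOUNT", 0),
             card("GCOUNT", 1), card("DS_IDENT", ds_ident)]
    return make_hdu(cards, bytes(8))

primary = make_hdu([card("SIMPLE", True), card("BITPIX", 8), card("NAXIS", 0)])
fits_bytes = primary + b"".join(
    image_extension(name) for name in ("iras-m31", "rosat-m31", "cfa-co-m31"))

print(collect_ids(io.BytesIO(fits_bytes)))
```

The data units are skipped with a seek rather than read, so the cost is one header read per HDU, but the scan still has to visit every HDU in the file.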
#4
[fitsbits] Dataset identifications.
On Thu 2004-03-11T12:20:12 -0500, Thomas McGlynn hath writ:
My feeling is that if the two elements of data are so tightly coupled that they would normally be put in the same FITS file, then one would consider them to be the same dataset. At the HEASARC datasets generally comprise more than one FITS file. I can't think off the top of my head of an example where a given FITS file would naturally split up into multiple IDs.

We are currently experimenting with a new observing mode for a mosaic of CCDs on an echelle spectrograph. The notion is that if the red end of the spectrum has a very different flux than the blue end, it might be good to run multiple exposures of one CCD while running one exposure of another CCD.

For example, if the blue flux is low enough to be limited by readout noise then that CCD wants one long exposure. But if the red flux is notably higher then it might be nice to run two or three sub-exposures there so as to be able to remove cosmic rays.

Although our ideal plan would be to write all of the readouts into separate IMAGE extensions of one FITS file, engineering constraints may cause us initially to write the different sub-exposures into separate FITS files which we would want to recombine later.

So I am not convinced that the rules about splitting files and combining HDUs are simple enough not to require the IDENT in each of the HDUs.

--
Steve Allen
UCO/Lick Observatory
Santa Cruz, CA 95064
Voice: +1 831 459 3046
http://www.ucolick.org/~sla
PGP: 1024/E46978C5 F6 78 D1 10 62 94 8F 2E 49 89 0E FE 26 B4 14 93
#6
[fitsbits] Dataset identifications.
Tom McGlynn writes:
My feeling is that if the two elements of data are so tightly coupled that they would normally be put in the same FITS file, then one would consider them to be the same dataset. At the HEASARC datasets generally comprise more than one FITS file. I can't think off the top of my head of an example where a given FITS file would naturally split up into multiple IDs.

NOAO (through "Save the bits") has three or four million discrete FITS images packaged up into MEF files for purposes of efficient and easy handling. On the other hand, HEASARC's usage supplies an example involving one dataset that contains several files.

Surely there are (or will be) other examples than NOAO's of one FITS file containing several datasets. And if we don't believe this to be a useful characteristic of FITS, we should be seeking to outlaw FITS as a mechanism for storing "unrelated" data under the umbrella of a single file. (Good luck defining "unrelated".) Failing that, any dataset ID convention (or any FITS dataset grouping tools in general) will have to deal with both possibilities.

Personally, I think before we reserve "DS_IDENT" or any other keyword for the purpose of identifying datasets, we should define the concept of a "dataset". Do other communities beyond astronomy share this notion? Does astronomy itself share a single clear vision of what constitutes a dataset? In short, where does the dataset object reside in the cosmic class diagram in the sky?

I suspect I'm also not alone in wondering why our grand discussions of discovering and developing an overall astronomical ontology and semantic web and all those other VO "vision" things continually break down into ad hoc, off-the-cuff suggestions of random notions of the "right" way to do this thing or that thing with no connection to the whole.

Either our headers are real world, all too human, faulty examples of an underlying Platonic ideal - and we should try to characterize that ideal - or we should stop pretending that our individual datasets share any common attributes at all.

Rob Seaman
NOAO Science Data Systems
#8
[fitsbits] Dataset identifications.
On Thu 2004/03/11 12:20:12 CDT, Thomas McGlynn wrote in a message to Thierry Forveille:
one use of the IDs. However, I'm concerned that if a single FITS file is not atomic with respect to the dataset identifier, then it would generally be difficult for software to use the ID to tie things together. E.g., one would need to scan the entire FITS file to find the included IDs.

Putting an index of IDs in the primary header, i.e. just a list of the IDs that come later, would preclude the need to scan past the first header when looking for a particular ID.

Mark Calabretta
ATNF
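One way such an index could look is sketched below in Python, with headers modelled as plain dictionaries. The post does not say how the list of IDs would be encoded, so the numbered keywords (`DSID1`, `DSID2`, ...) and the function names are purely hypothetical, as are the ID values.

```python
def add_id_index(primary, extensions, keyword="DS_IDENT", prefix="DSID"):
    """Return a copy of the primary header listing the IDs found in later HDUs.

    The keyword DSIDn (hypothetical) holds the ID carried by extension n.
    """
    indexed = dict(primary)
    for n, ext in enumerate(extensions, start=1):
        if keyword in ext:
            indexed[f"{prefix}{n}"] = ext[keyword]
    return indexed

def find_extension(primary, wanted, prefix="DSID"):
    """Look up an ID using only the primary header; return its extension number."""
    for key, value in primary.items():
        suffix = key[len(prefix):]
        if key.startswith(prefix) and suffix.isdigit() and value == wanted:
            return int(suffix)
    return None

# Made-up headers: the second extension carries no ID of its own.
extensions = [
    {"XTENSION": "IMAGE", "DS_IDENT": "iras-m31"},
    {"XTENSION": "BINTABLE"},
    {"XTENSION": "IMAGE", "DS_IDENT": "cfa-co-m31"},
]
primary = add_id_index({"SIMPLE": True}, extensions)

print(find_extension(primary, "cfa-co-m31"))
print(find_extension(primary, "not-in-this-file"))
```

The lookup reads nothing past the first header, at the price of keeping the index in step with the extensions whenever the file is edited.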
#10
[fitsbits] Dataset identifications.
Thomas McGlynn writes:
My feeling is that if the two elements of data are so tightly coupled that they would normally be put in the same FITS file, then one would consider them to be the same dataset. At the HEASARC datasets generally comprise more than one FITS file. I can't think off the top of my head of an example where a given FITS file would naturally split up into multiple IDs.

The need I had in mind is further down the line than either telescopes or places like HEASARC: an astronomer builds a single FITS file with all the images+photon-lists+tables+... that he uses in a given A&A paper, gives them a new DS_IDENT of course (probably derived from the bib code of the paper) in the primary header, and keeps the original DS_IDENT for each extension. Or actually, that merging might end up being done by the journals.

It's a slight extension of what you proposed and it might perhaps never actually develop into reality, but I see no major reason to outright forbid that use.
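The layout described here, a new ID in the primary header with the original ID kept in each extension, can be sketched in Python with headers modelled as dictionaries. The function names and all ID values are hypothetical placeholders, and no particular ID syntax is implied.

```python
def merge(new_id, source_headers, keyword="DS_IDENT"):
    """Return headers for a merged file: the primary header gets the new ID,
    and each extension keeps whatever ID its source header already carried."""
    primary = {keyword: new_id}
    extensions = [dict(header) for header in source_headers]
    return [primary] + extensions

def pedigree(headers, keyword="DS_IDENT"):
    """Return (ID of the merged file, list of original IDs found in extensions)."""
    file_id = headers[0].get(keyword)
    originals = [h[keyword] for h in headers[1:] if keyword in h]
    return file_id, originals

# Made-up source headers and a made-up paper-derived ID.
sources = [
    {"XTENSION": "IMAGE", "DS_IDENT": "iras-m31"},
    {"XTENSION": "BINTABLE", "DS_IDENT": "rosat-m31"},
]
merged = merge("paper-example-2004", sources)

print(pedigree(merged))
```

Software that only needs to tie the file to one dataset can stop at the primary header, while the per-extension IDs preserve the pedigree for anyone who looks further.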