A Space & astronomy forum. SpaceBanter.com


[fitsbits] Dataset identifications.



 
 
  #1  
Old March 11th 04, 01:14 AM
Thierry Forveille
external usenet poster
 
Posts: n/a
Default [fitsbits] Dataset identifications.

Thomas McGlynn writes:
One that comes to mind is whether use of this
keyword should be recommended only for the primary header of a FITS
file. If not then a file may not be associated with a unique dataset
id.

Is that a problem, or actually a desirable characteristic? I could imagine
a single FITS file containing data from several origins in different
extensions (e.g. M31 images from IRAS, ROSAT, and the CfA CO(1-0) survey)
where different DS_IDENT would be appropriate. Or should such a merged
dataset be given a new DS_IDENT and forget about the original ones? At
first sight at least, retaining the original pedigree sounds like
a good thing.

  #2  
Old March 11th 04, 06:20 PM
Thomas McGlynn
external usenet poster
 
Posts: n/a
Default [fitsbits] Dataset identifications.

Thierry Forveille wrote:

Thomas McGlynn writes:
One that comes to mind is whether use of this
keyword should be recommended only for the primary header of a FITS
file. If not then a file may not be associated with a unique dataset
id.

Is that a problem, or actually a desirable characteristic? I could imagine
a single FITS file containing data from several origins in different
extensions (e.g. M31 images from IRAS, ROSAT, and the CfA CO(1-0) survey)
where different DS_IDENT would be appropriate. Or should such a merged
dataset be given a new DS_IDENT and forget about the original ones? At
first sight at least, retaining the original pedigree sounds like
a good thing.


Keeping the pedigree is a good idea - that would surely be
one use of the IDs. However, I'm
concerned that if a single FITS file is not atomic with respect
to the dataset identifier, then it would generally be difficult
for software to use the ID to tie things together. E.g., one
would need to scan the entire FITS file to find the included IDs.
If one has a derived data product that was the concatenation of
several datasets, then I'd expect there to be some link in the header
to the original datasets, but that this derived data product would
indeed have its own dataset id (assuming it had any).
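
To illustrate the scan described above, here is a minimal sketch in Python (standard library only) that walks every HDU of a FITS file and reports the dataset ID found in each header. DS_IDENT is the keyword name proposed in this thread, not a reserved FITS keyword. The helper names (read_header, data_bytes, dataset_ids) are hypothetical, and the parser is deliberately simplified: it ignores random groups, CONTINUE cards, and escaped quotes inside string values.

```python
import io

BLOCK = 2880  # FITS logical record size in bytes
CARD = 80     # each header card is 80 ASCII characters


def parse_value(field):
    """Extract a value from the part of a card after '= ' (simplified)."""
    field = field.strip()
    if field.startswith("'"):
        end = field.find("'", 1)  # assumption: no '' escapes in the string
        return field[1:end].rstrip()
    return field.split("/", 1)[0].strip()  # drop any trailing comment


def read_header(stream):
    """Read one HDU header; return {keyword: value} or None at end of file."""
    cards = {}
    while True:
        block = stream.read(BLOCK)
        if len(block) < BLOCK:
            return None
        for i in range(0, BLOCK, CARD):
            card = block[i:i + CARD].decode("ascii")
            key = card[:8].strip()
            if key == "END":
                return cards
            if card[8:10] == "= ":
                cards[key] = parse_value(card[10:])


def data_bytes(hdr):
    """Size of the data unit that follows a header, padded to whole blocks."""
    naxis = int(hdr.get("NAXIS", 0))
    if naxis == 0:
        return 0
    n = 1
    for i in range(1, naxis + 1):
        n *= int(hdr[f"NAXIS{i}"])
    n = (abs(int(hdr["BITPIX"])) // 8) * int(hdr.get("GCOUNT", 1)) * (
        int(hdr.get("PCOUNT", 0)) + n)
    return -(-n // BLOCK) * BLOCK  # round up to a multiple of 2880


def dataset_ids(stream, keyword="DS_IDENT"):
    """Return [(hdu_index, id_or_None), ...] for every HDU in the stream."""
    ids = []
    index = 0
    while (hdr := read_header(stream)) is not None:
        ids.append((index, hdr.get(keyword)))
        stream.seek(data_bytes(hdr), io.SEEK_CUR)  # skip over the data unit
        index += 1
    return ids
```

Even in this stripped-down form, every header in the file has to be read and every data unit skipped before the full list of IDs is known. If the ID were restricted to the primary header, a single call to read_header would be enough.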

My feeling is that if the two elements of data are so tightly
coupled that they would normally be put in the same FITS file,
then one would consider them to be the same dataset. At the HEASARC
datasets generally comprise more than one FITS file. I can't think
off the top of my head of an example where a given FITS file would
naturally split up into multiple IDs.

Tom

  #4  
Old March 11th 04, 07:49 PM
Steve Allen
external usenet poster
 
Posts: n/a
Default [fitsbits] Dataset identifications.

On Thu 2004-03-11T12:20:12 -0500, Thomas McGlynn hath writ:
My feeling is that if the two elements of data are so tightly
coupled that they would normally be put in the same FITS file,
then one would consider them to be the same dataset. At the HEASARC
datasets generally comprise more than one FITS file. I can't think
off the top of my head of an example where a given FITS file would
naturally split up into multiple IDs.


We are currently experimenting with a new observing mode for a
mosaic of CCDs on an echelle spectrograph. The notion is that
if the red end of the spectrum has a very different flux than the
blue end, it might be good to run multiple exposures of one CCD
while running one exposure of another CCD.

For example, if the blue flux is low enough to be limited by readout
noise then that CCD wants one long exposure. But if the red flux is
notably higher then it might be nice to run two or three sub-exposures
there so as to be able to remove cosmic rays.

Although our ideal plan would be to write all of the readouts into
separate IMAGE extensions of one FITS file, engineering constraints
may cause us initially to write the different sub-exposures into
separate FITS files which we would want to recombine later.

So I am not convinced that the rules about splitting files and
combining HDUs are simple enough not to require the IDENT in
each of the HDUs.

--
Steve Allen UCO/Lick Observatory Santa Cruz, CA 95064
Voice: +1 831 459 3046 http://www.ucolick.org/~sla
PGP: 1024/E46978C5 F6 78 D1 10 62 94 8F 2E 49 89 0E FE 26 B4 14 93

  #6  
Old March 11th 04, 11:18 PM
Rob Seaman
external usenet poster
 
Posts: n/a
Default [fitsbits] Dataset identifications.

Tom McGlynn writes:

My feeling is that if the two elements of data are so tightly
coupled that they would normally be put in the same FITS file,
then one would consider them to be the same dataset. At the HEASARC
datasets generally comprise more than one FITS file. I can't think
off the top of my head of an example where a given FITS file would
naturally split up into multiple IDs.


NOAO (through "Save the bits") has three or four million discrete FITS
images packaged up into MEF files for purposes of efficient and easy
handling. On the other hand, HEASARC's usage supplies an example
involving one dataset that contains several files. Surely there are
(or will be) other examples than NOAO's of one FITS file containing
several datasets. And if we don't believe this to be a useful
characteristic of FITS, we should be seeking to outlaw FITS as a
mechanism for storing "unrelated" data under the umbrella of a single
file. (Good luck defining "unrelated".) Failing that, any dataset
ID convention (or any FITS dataset grouping tools in general) will have
to deal with both possibilities.

Personally, I think before we reserve "DS_IDENT" or any other keyword
for the purpose of identifying datasets, we should define the concept
of a "dataset". Do other communities beyond astronomy share this
notion? Does astronomy itself share a single clear vision of what
constitutes a dataset? In short, where does the dataset object reside
in the cosmic class diagram in the sky?

I suspect I'm also not alone in wondering why our grand discussions of
discovering and developing an overall astronomical ontology and semantic
web and all those other VO "vision" things continually break down into
ad hoc, off the cuff, suggestions of random notions of the "right" way
to do this thing or that thing with no connection to the whole.

Either our headers are real world, all too human, faulty examples of an
underlying Platonic ideal - and we should try to characterize that ideal -
or we should stop pretending that our individual datasets share any
common attributes at all.

Rob Seaman
NOAO Science Data Systems
  #10  
Old March 12th 04, 01:14 AM
Thierry Forveille
external usenet poster
 
Posts: n/a
Default [fitsbits] Dataset identifications.

Thomas McGlynn writes:
My feeling is that if the two elements of data are so tightly
coupled that they would normally be put in the same FITS file,
then one would consider them to be the same dataset. At the HEASARC
datasets generally comprise more than one FITS file. I can't think
off the top of my head of an example where a given FITS file would
naturally split up into multiple IDs.

The need I had in mind is further down the line than either telescopes or
places like HEASARC: an astronomer builds a single FITS file with all the
images+photon-lists+tables+... that he uses in a given A&A paper, gives
them a new DS_IDENT of course (probably derived from the bib code of the
paper) in the primary header, and keeps the original DS_IDENT for each
extension. Or actually, that merging might end up being done by the
journals.

It's a slight extension of what you proposed and it might perhaps never
actually develop into reality, but I see no major reason to outright
forbid that use.
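
A minimal sketch of that layout, in Python, might look like the following. It models headers as plain dictionaries rather than reading real FITS files. The HDU class, the pedigree function, and all ID strings are hypothetical placeholders, and the rule that an extension without its own DS_IDENT falls back to the primary header's value is an assumption made for illustration, not something proposed in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class HDU:
    """Stand-in for one header-data unit: a name plus its header keywords."""
    name: str
    header: dict = field(default_factory=dict)


def pedigree(hdus, keyword="DS_IDENT"):
    """Return (id of the merged file, {extension name: original id})."""
    primary, *extensions = hdus
    file_id = primary.header.get(keyword)
    # Assumption: an extension lacking its own ID inherits the primary's.
    origins = {h.name: h.header.get(keyword, file_id) for h in extensions}
    return file_id, origins


# Placeholder IDs; a real merged file might derive the primary ID from
# the paper's bib code, as suggested above.
merged = [
    HDU("PRIMARY", {"DS_IDENT": "PAPER-ID"}),
    HDU("IRAS", {"DS_IDENT": "IRAS-ID"}),
    HDU("ROSAT", {"DS_IDENT": "ROSAT-ID"}),
    HDU("CO", {}),
]
file_id, origins = pedigree(merged)
```

Here the primary header carries the new ID for the merged product, while each extension that has one keeps the ID of the dataset it came from.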

 





All times are GMT +1.

