March 25th, 2004, 03:45 PM
Arnold Rots
[fitsbits] 'Dataset Identifications' postings (digest)

Lucio Chiappetti wrote:
Let me answer a bunch of messages in one go.

...

From: Arnold Rots
Date: Wed, 24 Mar 2004 15:47:57 -0500 (EST)


The scope of Tom's proposal is really quite limited:

He is announcing the establishment of a convention that employs
a keyword (DS_IDENT) or set of keywords (DS_IDiii).
The intent is that the value of that keyword contains a label or key
that will allow users to obtain a pointer to a particular volume in
astronomical data space. No less, but also no more.
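
To make the mechanics concrete, here is a minimal sketch of how such a keyword could be written as a fixed-format FITS header card. The helper name fits_string_card and the identifier value are made up for illustration; only the keyword name DS_IDENT comes from the proposal.

```python
def fits_string_card(keyword: str, value: str) -> str:
    """Format one 80-character FITS header card holding a string value."""
    if len(keyword) > 8:
        raise ValueError("FITS keywords are at most 8 characters long")
    # Single quotes inside the value are doubled, and the string is padded
    # to at least 8 characters so the closing quote lands in column 20 or later.
    quoted = "'" + value.replace("'", "''").ljust(8) + "'"
    card = f"{keyword:<8}= {quoted}"
    if len(card) > 80:
        raise ValueError("value too long for a single card")
    return card.ljust(80)


# Hypothetical identifier, invented for this example only.
card = fits_string_card("DS_IDENT", "ADS/Example.Facility#obs/00001")
print(card)
```

The value is treated as an opaque string here; nothing in the sketch interprets it.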


just a little bit more


OK, just the following sentence.


Within the space of data identifier strings only the subspace of
strings starting with "ADS/" (case-insensitive!) is reserved.
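
A reader's software could test for the reserved subspace roughly as follows. This is only a sketch: the function name is mine, and it checks nothing beyond the case-insensitive "ADS/" prefix stated above.

```python
def is_reserved_ads_identifier(identifier: str) -> bool:
    """Return True if the identifier falls in the reserved "ADS/" subspace."""
    # Only the four-character prefix is compared, ignoring case.
    return identifier[:4].upper() == "ADS/"


# Hypothetical identifier strings, invented for this example.
print(is_reserved_ads_identifier("ads/Example.Facility#obs/00001"))
print(is_reserved_ads_identifier("XYZ/something-else"))
```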


I believe you should also reserve the rule that the first part of the id
is the namespace, and delegate all the rest to the namespace authority.

Maybe one should also add another keyword (DSAUTHOR) which points to a URL
of the namespace authority.

Or are we imagining something like the DNS with a set of "root
nameservers"?
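
Purely to illustrate the suggestion, a namespace split could look like the sketch below. It assumes the namespace is whatever precedes the first "/" and that it compares case-insensitively; both are assumptions drawn from the suggestion, not something the proposal specifies.

```python
def split_namespace(identifier: str) -> tuple[str, str]:
    """Split an identifier into (namespace, remainder) at the first "/"."""
    namespace, separator, remainder = identifier.partition("/")
    if not separator or not namespace:
        raise ValueError("identifier has no namespace prefix")
    # Upper-casing the namespace is an assumption, so that "ads" and "ADS"
    # compare equal; the remainder is left untouched for the authority.
    return namespace.upper(), remainder


# Hypothetical identifier, invented for this example.
print(split_namespace("ads/Example.Facility#obs/00001"))
```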


Nothing is implied or recommended by this proposal.
We took great pains to ensure that the ADS/ identifiers be conforming
with the standard being developed by the IVOA, but that is not part of
this proposal. Others may want to suggest further conventions tying
the two together in the future, but this is not the time to do that -
for one thing, the IVOA standard has not yet been completed.


and purposes. For the Chandra Data Archive what you will get in
response to the key is a URL that will allow you to request a download
of data products associated with a particular observation - or maybe a
set of observations. If you try again next month, the files may be
different: we may have reprocessed or decided to add some products to
the package.


Hmmm ... I'm a bit worried by the fact that the dataset may change. Maybe
that's why it is not yet so clear to me what use a user will make of the
dataset identifier. Let's look at some examples.


a) I read a paper, which tells me "the data used here belong to dataset
xyz". I want to repeat the analysis of the SAME data myself, so I
use the id to retrieve the data. Obviously here I want to get the
SAME data, not a later and better version (do I?).

No FITS file involved here though on the user end.


Do you just want to repeat the analysis or do you want to do a better
job? We would give you the current (best) set of data products based
on the same raw observational data, so you can do your (better) job.
If you can't reconcile results, you can ask us for the version that
was (most likely) used for the paper and we'll be happy to give it to
you, provided it was a "good" version.


b) I retrieve the files, and I want to check they really belong to the
correct dataset.

c) I have somehow obtained some files, and I want to know which observation
they refer to, or to retrieve more files of the same dataset, or to
find what papers have been published using them.


This all goes by OBSID, not DS_IDENT, at least for us, although we
could make it work through the identifier as well.


d) I do my analysis and produce some more files. These are private, but
I may want to document that the starting point of the analysis was
the given dataset. But DS_IDENT is not the right way: my data DO NOT
belong to the dataset, so I need a separate history keyword ...


Agreed.
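
One way to picture case (d) is sketched below: the derived file drops the parent's identifier keywords and records the provenance separately. Using the standard FITS HISTORY commentary keyword for this is my assumption, as is reading the "iii" in DS_IDiii as a numeric index; the header is modelled as a plain list of (keyword, value) pairs.

```python
import re

# Matches DS_IDENT and DS_IDiii, assuming "iii" is a run of digits.
DS_KEYWORD = re.compile(r"DS_IDENT|DS_ID\d+")


def derived_header(parent_cards):
    """Copy header cards for a derived product, recording provenance."""
    derived = []
    parent_ids = []
    for keyword, value in parent_cards:
        if DS_KEYWORD.fullmatch(keyword):
            # The derived data do not belong to the parent dataset.
            parent_ids.append(value)
        else:
            derived.append((keyword, value))
    for ds_id in parent_ids:
        derived.append(("HISTORY", f"Derived from dataset {ds_id}"))
    return derived


# Hypothetical header content, invented for this example.
parent = [("OBJECT", "Crab"), ("DS_IDENT", "ADS/Example.Facility#obs/00001")]
print(derived_header(parent))
```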


... if I ever distribute the data (I suppose I also have to quote
the DS_IDENT in any paper I write, for the ADS to use it)


That's the idea - or multiple identifiers.


Again, think of the dataset identifier as a key that allows the user
to obtain a pointer to the dataset. There is no need to encode any
information in it - nor is that prohibited.
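
The "key, not content" idea can be sketched as a plain lookup in which the identifier is never parsed. The table, the key and the URL below are all invented for illustration; a real archive would answer such a query through its own service.

```python
from typing import Optional

# Hypothetical resolver table; both the key and the URL are made up.
RESOLVER = {
    "ADS/Example.Facility#obs/00001":
        "https://archive.example.org/retrieve?obs=00001",
}


def resolve(identifier: str) -> Optional[str]:
    """Return a pointer (here a URL) for the identifier, or None if unknown."""
    # The identifier is used as an opaque key: no field is extracted from it.
    return RESOLVER.get(identifier)


print(resolve("ADS/Example.Facility#obs/00001"))
print(resolve("ADS/Example.Facility#obs/99999"))
```

What the pointer leads to may change over time, as described for the Chandra archive, without the key itself changing.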


Agreed

The list of informational metadata that Rob provided looks to me more
like metadata that ought to reside in a database.


(or in other keywords in the same file if desired)


--------------------------------------------------------------------------
Arnold H. Rots Chandra X-ray Science Center
Smithsonian Astrophysical Observatory tel: +1 617 496 7701
60 Garden Street, MS 67 fax: +1 617 495 7356
Cambridge, MA 02138
USA
http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------