December 4th 03, 11:50 AM
Mike Dworetsky
Question about classification

"Alex T." wrote in message
...
I read a couple of papers about stellar classification in
which classification results were reported not in terms of
error rate, as is usually done in classification, but
instead as something like: classification accuracy is 1.6
subclasses.

I don't think I fully understand this system.

Does a classification accuracy of 2 subclasses mean that all
objects were classified either correctly or as a neighbouring
class? Not a single object was classified as a different,
completely random class? I find that hardly reasonable.

And what about an accuracy of 1.6 subclasses? Does that mean that
40% of the objects were classified correctly and 60%
as their neighbours? Once again, not a single object was
classified absolutely randomly?

Thanks a lot.


The figure quoted is a standard deviation (an rms scatter) expressed in
spectral subclasses, not an error rate. Classification errors are not
completely random, so the distribution of errors would not necessarily be a
normal distribution, if only because the "measurements" are quantized into
boxes before the statistical analysis is performed.

The error quoted would mean, for example, that a star with a given
classification of G4 has a probability of about 0.68 of being assigned a
class somewhere between roughly G2.5 and G5.5 (G4 +/- 1.6 subclasses) if
reobserved and reclassified in the same experiment. There is a small
probability of the error being "three-sigma", or +/- 4.8 subclasses: if the
errors were normally distributed, there would be roughly a 0.3% chance that
a repeat observation and classification would come out earlier than F9/G0 or
later than G8/9. However, the errors are probably not distributed normally,
so the chance of extreme statistical outliers is much reduced: another way
to say this is that it is awfully hard to mistake a G star for an A star, if
the spectra are as good as the claimed error of 1.6 subclasses would
suggest. Incidentally, I know of occasional big mistakes in the old HD
catalogue, made when the spectra were so poorly exposed that the classifiers
were practically guessing, but that was poor data, not inexpert classifying.
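
For anyone who wants to check the arithmetic, here is a rough sketch of the one-sigma and three-sigma ranges for a G4 star. It assumes the errors are normally distributed (which, as noted above, they probably are not) and that spectral types sit on a linear scale of ten subclasses per letter; the function names are made up for this illustration and do not come from any of the papers.

```python
import math

# Assumption: a linear scale with ten subclasses per letter, O0 = 0 ... M9 = 69.
LETTERS = "OBAFGKM"

def to_index(sptype):
    """Convert a type such as 'G4' to a position on the linear scale."""
    return LETTERS.index(sptype[0]) * 10 + float(sptype[1:])

def from_index(x):
    """Convert a position on the linear scale back to a type such as 'G2.4'."""
    x = round(x, 1)
    return f"{LETTERS[int(x // 10)]}{x % 10:.1f}"

def prob_within(k):
    """Probability that a normal error falls within +/- k sigma."""
    return math.erf(k / math.sqrt(2))

sigma = 1.6                      # quoted accuracy, in subclasses
centre = to_index("G4")

for k in (1, 3):
    low = from_index(centre - k * sigma)
    high = from_index(centre + k * sigma)
    inside = prob_within(k)
    print(f"{k}-sigma: {low} to {high}, "
          f"P(inside) = {inside:.4f}, P(outside) = {1 - inside:.4f}")
```

The normal-distribution tail is only an upper guide here; with good spectra the real chance of a large misclassification should be smaller still.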

--
Mike Dworetsky

(Remove "pants" spamblock to send e-mail)