September 2nd 16, 01:55 AM, posted to sci.astro.research
Phillip Helbig (undress to reply)
Statistics Problem/Question

In article ,
"Robert L. Oldershaw" writes:

> Thanks for the analysis. I tried two samples of 30 stars each and found
> a deviation of 0.001 solar mass for the first sample, which had error
> bars on the masses that were 0.004 to 0.005 solar mass on average.


The error on the total mass will be larger than on the individual
masses, so you can't know the total mass that well. If the deviation is
really 0.001, that is interesting, but since the error on the total mass
is much larger, probably not meaningful.

> The second sample had mass uncertainties that were less than or equal
> to 0.01 solar mass, but the average was closer to 0.01 than 0.005. The
> deviation for that sample was 0.008 solar mass.


Still, the deviation is less than the error on the total mass.
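
To put rough numbers on this, here is a minimal sketch. It assumes that the errors on the individual masses are independent and roughly equal, so that they add in quadrature; that is my assumption for illustration, not something established above. The per-star values (0.0045 as the midpoint of 0.004 to 0.005, and 0.01 as an upper bound for the second sample) and the function name are likewise just illustrative.

```python
import math

def total_mass_error(per_star_error, n_stars):
    """1-sigma error on the sum of n_stars masses, assuming equal and
    independent per-star errors (quadrature sum: sqrt(n) * sigma)."""
    return math.sqrt(n_stars) * per_star_error

# First sample: per-star errors of 0.004 to 0.005, midpoint taken as an assumption.
first_total_error = total_mass_error(0.0045, 30)
# Second sample: per-star errors up to 0.01, upper bound taken as an assumption.
second_total_error = total_mass_error(0.01, 30)

print(f"sample 1: total error ~{first_total_error:.3f}, deviation 0.001")
print(f"sample 2: total error ~{second_total_error:.3f}, deviation 0.008")
```

Under these assumptions, the error on the total is a few hundredths of a solar mass in both cases, well above the quoted deviations.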

> Personally, I find this mildly encouraging, but I acknowledge that it
> will take more accurate stellar masses to fully test my hypothesis.


Much more accurate.

> There must be a better way to search for hidden patterns in noisy data,
> but this is obviously not my strong suit.


Yes. There is a huge literature on this. (H-I intensity mapping, for
example, deals with a signal-to-noise ratio of much less than 1.)

Before going further in this direction, though, why 30 stars? Why not
25 or 300? Why two samples of 30 each rather than one sample of 60?

If one accepts that something interesting is going on when the
probability of getting the observed results under the null hypothesis is
less than, say, 1%, keep in mind that if you do 100 tests, you should
EXPECT to get a false positive once. So unless it is abundantly clear
that there are objective reasons for 30 stars and so on, it will be hard
to demonstrate that you didn't try various sample sizes and so on until
something looked significant while not talking about the majority of
cases where nothing looked significant. Even if you didn't do this, the
reader has to be able to see that objectively what you are doing is
straightforward. A-posteriori statistics is shaky ground.
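
The arithmetic behind the false-positive remark can be sketched as follows, assuming the tests are independent and the null hypothesis is true in every one of them (the function names are just for illustration):

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests falling below the threshold alpha by chance,
    when the null hypothesis holds in every test."""
    return n_tests * alpha

def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance that at least one of n_tests independent tests falls below
    alpha purely by chance."""
    return 1.0 - (1.0 - alpha) ** n_tests

n_tests, alpha = 100, 0.01
expected = expected_false_positives(n_tests, alpha)
p_any = prob_at_least_one_false_positive(n_tests, alpha)

print(f"expected false positives: {expected:.2f}")
print(f"probability of at least one: {p_any:.3f}")
```

With 100 tests at the 1% level, one false positive is expected on average, and the chance of seeing at least one is roughly 63%.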

The 0.145 has come up here many times. My impression was that even you
were once convinced that there was no signal in the stuff you presented.
So, even if there is some signal now, the question arises of how many
places you looked until you found something. Again, even if this is not
what happened, the reader has to be convinced that the signal is really
unexpected and not just the expected fluke which one will always see if
one looks long enough.