#1
Statistics Problem/Question
I am not looking for an exact answer to the following question, but rather for an approximate one. Say you had 30 low-mass red dwarfs (an unbiased sample) whose masses are all measured to +/- 0.01 solar mass. Say you added all 30 masses and divided by 0.145 solar mass. What would the approximate (or exact, if you prefer) probability be of finding an exact multiple? What would be the expected probability for a deviation of +/- 0.001 solar mass, and finally what would be the expected probability for a deviation of +/- 0.01 solar mass?

What I am looking for is a better feeling for whether it makes sense to do such a test, and whether a result that comes from such a test might be meaningful or misleading.

First, second, third, fourth, ... opinions are most welcome.
#2
In article ,
"Robert L. Oldershaw" writes:

> Say you had 30 low-mass red dwarfs (unbiased sample) whose masses are all measured to +/- 0.01 solar mass. Say you added all 30 masses and divided by 0.145 solar mass. What would the approximate (or exact if you prefer) probability be for finding an exact multiple?

The probability for finding an EXACT multiple is zero.

> What would be the expected probability for a deviation of +/- 0.001 solar mass,

This doesn't make any sense if, as stated above, the "masses are all measured to +/- 0.01 solar mass".

> and finally what would be the expected probability for a deviation of +/- 0.01 solar mass?

Presumably, you want to test whether the null hypothesis that the masses are "random" can be ruled out, and whether there is some evidence for 0.145 being some sort of preferred value.

The number 30 doesn't matter. The sum of any number of "random" numbers is itself "random". The probability that the sum deviates from an integer multiple of 0.145 by less than 0.0725 is 100%, and the probability that it deviates by more is 0. This is because 0.0725 = 0.145/2 is the maximum possible deviation. If the masses are measured to 0.01 solar masses, then you can forget about detecting any smaller deviations.

It doesn't make sense to talk about "a deviation of +/- 0.01 solar mass", since the probability of any exact value is 0. What you might mean is the probability that the deviation is less than 0.01. That would be roughly 14% (0.01/0.0725). Suppose you know the masses to 0.001 instead of 0.01. Then the probability of the deviation being less than 0.001 is about 1.4%. If you actually found this, it would be considered marginal evidence in favour of ruling out the null hypothesis. A more detailed analysis would also put an error bar on the sum (add the individual errors in quadrature).
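The estimates in the reply above can be checked with a small simulation. The following is only a sketch under stated assumptions: the masses are drawn uniformly between 0.08 and 0.60 solar mass (an illustrative range that is not from the thread), and the function names are made up for this example. It compares the analytic value threshold/0.0725 with a Monte Carlo estimate, and computes the quadrature error on the sum.

```python
import math
import random

UNIT = 0.145  # hypothesized mass unit in solar masses (value from the thread)


def deviation_from_multiple(total, unit=UNIT):
    """Absolute distance from `total` to the nearest integer multiple of `unit`."""
    remainder = math.fmod(total, unit)
    return min(remainder, unit - remainder)


def null_probability(threshold, unit=UNIT):
    """P(deviation < threshold) if the remainder is uniform on [0, unit)."""
    return min(1.0, threshold / (unit / 2))


def simulate(threshold, n_stars=30, n_trials=20000, seed=1):
    """Monte Carlo estimate of the same probability.

    ASSUMPTION: masses are uniform on [0.08, 0.60] solar mass. Any smooth
    distribution much wider than UNIT should give a similar answer.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        total = sum(rng.uniform(0.08, 0.60) for _ in range(n_stars))
        if deviation_from_multiple(total) < threshold:
            hits += 1
    return hits / n_trials


def error_on_sum(sigma, n_stars):
    """Error on a sum of n_stars masses, each with independent error sigma."""
    return sigma * math.sqrt(n_stars)


if __name__ == "__main__":
    print("analytic  P(dev < 0.01):", null_probability(0.01))
    print("simulated P(dev < 0.01):", simulate(0.01))
    print("error on sum of 30 masses:", error_on_sum(0.01, 30))
```

With 30 masses each known to 0.01, the quadrature error on the sum is about 0.055 solar mass, which is a large fraction of the maximum possible deviation of 0.0725.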
#3
On Wednesday, August 31, 2016 at 4:52:39 PM UTC-4, Phillip Helbig (undress to reply) wrote:
> It doesn't make sense to talk about "a deviation of +/- 0.01 solar mass", since this probability is 0. What you might mean is the probability that the deviation is less than 0.01. That would be roughly 14% (0.01/0.0725). Suppose you know the masses to 0.001 instead of 0.01. Then the probability of the deviation being less than 0.001 is about 1.4%. If you actually found this, it would be considered marginal evidence in favour of ruling out the null hypothesis. A more detailed analysis would also put an error bar on the sum (add the individual errors quadratically).

Thanks for the analysis.

I tried two samples of 30 stars each and found a deviation of 0.001 solar mass for the first sample, which had error bars on the masses that were 0.004 to 0.005 solar mass on average.

The second sample had mass uncertainties that were no larger than 0.01 solar mass, though the average was closer to 0.01 than to 0.005. The deviation for that sample was 0.008 solar mass.

Personally, I find this mildly encouraging, but I acknowledge that it will take more accurate stellar masses to fully test my hypothesis.

There must be a better way to search for hidden patterns in noisy data, but this is obviously not my strong suit.
#4
In article ,
"Robert L. Oldershaw" writes:

> Thanks for the analysis. I tried two samples of 30 stars each and found a deviation of 0.001 solar mass for the first sample, which had error bars on the masses that were 0.004 to 0.005 solar mass on average.

The error on the total mass will be larger than on the individual masses, so you can't know the total mass that well. If the deviation is really 0.001, that is interesting, but since the error on the total mass is much larger, probably not meaningful.

> The second sample had mass uncertainties that were no larger than 0.01 solar mass, though the average was closer to 0.01 than to 0.005. The deviation for that sample was 0.008 solar mass.

Still, the deviation is less than the error on the total mass.

> Personally, I find this mildly encouraging, but I acknowledge that it will take more accurate stellar masses to fully test my hypothesis.

Much more accurate.

> There must be a better way to search for hidden patterns in noisy data, but this is obviously not my strong suit.

Yes. There is a huge literature on this. (H-I intensity mapping, for example, deals with a signal-to-noise ratio of much less than 1.)

Before going further in this direction, though, why 30 stars? Why not 25 or 300? Why two samples of 30 each rather than one sample of 60? If one accepts that something interesting is going on if the probability of getting the observed results if the null hypothesis is true is less than 1% or whatever, keep in mind that if you do 100 tests, you should EXPECT to get a false positive once. So unless it is abundantly clear that there are objective reasons for 30 stars and so on, it will be hard to demonstrate that you didn't try various sample sizes and so on until something looked significant while not talking about the majority of cases where nothing looked significant. Even if you didn't do this, the reader has to be able to see that objectively what you are doing is straightforward. A-posteriori statistics is shaky ground.

The 0.145 has come up here many times. My impression was that even you were once convinced that there was no signal in the stuff you presented. So, even if there is some signal now, the question arises in how many places you looked until you found something. Again, even if this is not what happened, the reader has to be convinced that the signal is really unexpected and not just the expected fluke which one will always see if one looks long enough.
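The point about 100 tests at the 1% level can be put in numbers. This is a minimal sketch that assumes the tests are statistically independent (an assumption not stated in the post); the function names are illustrative only. The Bonferroni threshold shown last is a standard correction that is not mentioned in the thread.

```python
def expected_false_positives(alpha, n_tests):
    """Mean number of false positives when each test has false-positive rate alpha."""
    return alpha * n_tests


def prob_at_least_one_false_positive(alpha, n_tests):
    """Chance of one or more false positives, ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the overall false-positive rate near alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha, n_tests = 0.01, 100
    print("expected false positives:", expected_false_positives(alpha, n_tests))
    print("P(at least one):", prob_at_least_one_false_positive(alpha, n_tests))
    print("Bonferroni per-test threshold:", bonferroni_threshold(alpha, n_tests))
```

Under the independence assumption, 100 tests at the 1% level give one expected false positive and roughly a 63% chance of seeing at least one.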
#5
On Thursday, September 1, 2016 at 8:55:44 PM UTC-4, Phillip Helbig (undress to reply) wrote:

> Before going further in this direction, though, why 30 stars? Why not 25 or 300? Why two samples of 30 each rather than one of 60 each? If one accepts that something interesting is going on if the probability of getting the observed results if the null hypothesis is true is less than 1% or whatever, keep in mind that if you do 100 tests, you should EXPECT to get a false positive once. So unless it is abundantly clear that there are objective reasons for 30 stars and so on, it will be hard to demonstrate that you didn't try various sample sizes and so on until something looked significant while not talking about the majority of cases where nothing looked significant. Even if you didn't do this, the reader has to be able to see that objectively what you are doing is straightforward. A-posteriori statistics is shaky ground.
>
> The 0.145 has come up here many times. My impression was that even you were once convinced that there was no signal in the stuff you presented. So, even if there is some signal now, the question arises in how many places you looked until you found something. Again, even if this is not what happened, the reader has to be convinced that the signal is really unexpected and not just the expected fluke which one will always see if one looks long enough.

Well, I suppose an answer is warranted.

Firstly, you are badly mistaken if you intuit that I have given up on the hypothesis. I am more aware of how hard it will be to convince anyone of a positive result, since it conflicts so radically with current wisdom. I think it will take the proverbial "extraordinary evidence" that cannot be hand-waved away or ignored.

I readily admit that the hypothesis is extraordinarily hard to test at present, given the available data. However, the mass estimates will continue to get better with time.

The first sample came from this recently posted paper: http://arxiv.org/abs/1608.04775 . I did not go looking for some sample that fit my hypothesis, and I find the implication that I would do so more than a bit unfair.

The second sample was just the bottom 30 stars in Southworth's sample of detached binary stars, which I have discussed before. I chose the bottom 30 so as to emphasize low-mass stars, and I restricted the sample to stars with masses estimated to +/- 0.01 solar mass.

It never hurts to keep an open mind on open questions.

Robert L. Oldershaw
http://www3.amherst.edu/~rloldershaw
#6
In article ,
"Robert L. Oldershaw" writes:

> Say you had 30 low-mass red dwarfs (unbiased sample) whose masses are all measured to +/- 0.01 solar mass. Say you added all 30 masses

The uncertainty in the sum would be 0.01 * sqrt(30), or about 0.055 solar mass. I don't understand the rest of the question, but...

> and divided by 0.145 solar mass.

If you are testing some hypothesis that the masses are quantized, I don't see why you would add the masses. You'd want to find out whether the individual masses predominantly fall close to the value you hypothesize or not.

The sensitivity of the test depends on, among other things, how precisely you state your hypothesis. It's a lot easier to test "within 0.00001 of 0.14500" than "near some fixed but unknown value between 0.10 and 0.20". (Uncertainty in the hypothesis is a separate issue from uncertainty in the observations.) As a quick estimate, if your hypothesis also has an uncertainty of 0.01 Msun, I think you'd need about 100 stars to make a first test. (That only applies if you don't "tune" your hypothesis to the actual data. If you do, you'd need several times more stars.) You don't need fancy statistics; just plot up the mass differences from your hypothesized value and see whether they fall near zero more often than a random distribution.

But as I say, I don't understand the question, so maybe you are after something entirely different.

--
Help keep our newsgroup healthy; please don't feed the trolls.
Steve Willner Phone 617-495-7123
Cambridge, MA 02138 USA
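The per-star check suggested in the last post could be sketched as follows. This is an illustration under stated assumptions, not the poster's own procedure: the function names are hypothetical, the residual is taken to the nearest integer multiple of 0.145 (one reading of "differences from your hypothesized value"), and the binomial tail is an optional extra beyond simply plotting the residuals. No real stellar masses are included.

```python
import math

UNIT = 0.145  # hypothesized mass unit in solar masses (value from the thread)


def residual_to_nearest_multiple(mass, unit=UNIT):
    """Signed difference between mass and the nearest integer multiple of unit."""
    k = round(mass / unit)
    return mass - k * unit


def fraction_near_zero(masses, window, unit=UNIT):
    """Fraction of stars whose residual lies within +/- window of zero."""
    near = sum(1 for m in masses
               if abs(residual_to_nearest_multiple(m, unit)) < window)
    return near / len(masses)


def expected_fraction_uniform(window, unit=UNIT):
    """Same fraction expected if residuals are uniform on [-unit/2, unit/2)."""
    return min(1.0, 2 * window / unit)


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more near-zero
    residuals out of n stars under the uniform (null) expectation."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(k, n + 1))
```

A usage pattern would be to count how many of the n measured masses fall within the measurement error of a multiple, then pass that count, n, and `expected_fraction_uniform(window)` to `binomial_tail`. A small tail probability would suggest clustering near zero, subject to the earlier caveat about how many samples and sample sizes were tried.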