supernova data

**Oh No** · #1 September 1st 08, 10:18 AM posted to sci.astro.research

There is a new and substantially larger compilation, including new
nearby SNe (which should much improve the dataset), which appeared while
I wasn't looking:

http://arxiv.org/abs/0804.4142

They have a website where they intend to keep regular updates, and from
which the data can be downloaded.

http://supernova.lbl.gov/Union/

Regards

--
Charles Francis
moderator sci.physics.foundations.
charles (dot) e (dot) h (dot) francis (at) googlemail.com (remove spaces and
braces)

http://www.teleconnection.info/rqg/MainIndex

**Bob Day** · September 10th 08, 02:42 PM posted to sci.astro.research

"Oh No" wrote in message
...

I have a couple of concerns about this data:

First, on page 13 of the paper on the SCP Union by Kowalski et al.,
http://arxiv.org/PS_cache/arxiv/pdf/...804.4142v1.pdf ,
about 2/3 down in the text of the left-hand column, the authors indicate
that they adjusted sigma-sys to force the value of chi-square to be
1 per degree of freedom. Other research teams, calling it sigma-int,
have done this also. For example, A. Conley et al.,
http://arxiv.org/PS_cache/astro-ph/p.../0602411v2.pdf , page 13, last
paragraph, and Astier et al.,
http://arxiv.org/PS_cache/astro-ph/p.../0510447v1.pdf , page 10,
right column, about 1/4 down.

But isn't that fudging the data? -- using the model used to fit the data
to optimize the fit? I can see that might be OK if you are sure your
model is correct and are just trying to optimize the parameters, but it
seems to me that it makes the data suspect if you are trying to fit
other models.
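
As a rough illustration of the adjustment being questioned, here is a minimal Python sketch. It assumes the extra term is added in quadrature to each point's quoted error and tuned until chi-square per degree of freedom equals 1. The residuals, the errors, and the function names are invented for the example; the cited papers may do this differently in detail.

```python
def chi_square(residuals, sigmas, sigma_sys):
    """Chi-square with sigma_sys added in quadrature to every error."""
    return sum(r * r / (s * s + sigma_sys * sigma_sys)
               for r, s in zip(residuals, sigmas))


def solve_sigma_sys(residuals, sigmas, dof, tol=1e-12):
    """Find sigma_sys >= 0 such that chi-square / dof = 1, by bisection.

    Returns 0.0 if chi-square / dof is already <= 1 with no extra term.
    """
    if chi_square(residuals, sigmas, 0.0) <= dof:
        return 0.0
    lo, hi = 0.0, max(abs(r) for r in residuals)
    # Chi-square falls monotonically as sigma_sys grows; widen the bracket
    # until the upper end gives a chi-square at or below the target.
    while chi_square(residuals, sigmas, hi) > dof:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if chi_square(residuals, sigmas, mid) > dof:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Invented residuals (data minus best-fit model) and quoted errors.
residuals = [0.30, -0.25, 0.40, -0.10, 0.20, -0.35]
sigmas = [0.10, 0.15, 0.10, 0.20, 0.12, 0.10]
dof = len(residuals) - 2  # assuming a two-parameter model

sigma_sys = solve_sigma_sys(residuals, sigmas, dof)
reduced_before = chi_square(residuals, sigmas, 0.0) / dof
reduced_after = chi_square(residuals, sigmas, sigma_sys) / dof
print(f"sigma_sys = {sigma_sys:.3f}")
print(f"chi^2/dof before = {reduced_before:.2f}, after = {reduced_after:.2f}")
```

Note that the residuals depend on the model that was fitted, so the value of sigma_sys obtained this way is tied to that model.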

Second, the density of low redshift supernovae is much greater than
that of high redshift supernovae. There are more than four times as
many supernovae in the range 0 - 0.8 as there are in the
range 0.8 - 1.6. As a consequence, models that fit low redshift data
will be disproportionately favored. For my own purposes, I have
weeded through the SCP Union data to create a "thinned" dataset
of 50 points consisting of a maximum of the seven lowest uncertainty
data points in each of the ranges 0.0 < z <= 0.2, 0.2 < z <= 0.4, etc.
I'm seeing significant differences in the values I'm getting for fit
parameters.
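
For anyone who wants to try the same kind of thinning, here is a minimal Python sketch. The tuple layout (name, z, mu, mu_err), the function name, and the sample values are assumptions made for illustration and do not reflect the actual format of the Union data files.

```python
import math
from collections import defaultdict


def thin_by_redshift(points, bin_width=0.2, per_bin=7):
    """Keep at most `per_bin` lowest-uncertainty points per redshift bin.

    `points` is a list of (name, z, mu, mu_err) tuples. Bins are
    (0, w], (w, 2w], ... with w = bin_width.
    """
    bins = defaultdict(list)
    for point in points:
        z = point[1]
        # ceil(z / w) - 1 puts z in (k*w, (k+1)*w] into bin k. Values lying
        # exactly on a bin edge are subject to floating-point rounding.
        bins[math.ceil(z / bin_width) - 1].append(point)
    thinned = []
    for k in sorted(bins):
        best = sorted(bins[k], key=lambda p: p[3])[:per_bin]
        thinned.extend(sorted(best, key=lambda p: p[1]))
    return thinned


# Invented sample: 12 points below z = 0.2 and 3 points near z = 0.9.
sample = [(f"SN-a{i}", 0.01 + 0.015 * i, 34.0 + 0.3 * i, 0.10 + 0.01 * i)
          for i in range(12)]
sample += [("SN-b0", 0.85, 43.5, 0.30),
           ("SN-b1", 0.90, 43.7, 0.25),
           ("SN-b2", 0.95, 43.9, 0.35)]

thinned = thin_by_redshift(sample)
print(len(sample), "->", len(thinned))
```

Crowded bins are cut down to seven points while sparse bins are kept whole, which is why the total can come out below seven times the number of bins.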

-- Bob Day

**Oh No** · September 10th 08, 03:29 PM posted to sci.astro.research

Thus spake Bob Day:

> But isn't that fudging the data? -- using the model used to fit the data
> to optimize the fit? I can see that might be OK if you are sure your
> model is correct and are just trying to optimize the parameters, but it
> seems to me that it makes the data suspect if you are trying to fit
> other models.

I don't think this is intrinsically a bad thing. Rather, I would say it
is irrelevant. One should realise that the "absolute" value of chi^2 is
floating. If one wants to compare models, then one has to normalise
chi^2 to one model (the one with the lower chi^2 value) in order to see
whether the value for the other would cause one to reject the fit.

> Second, the density of low redshift supernovae is much greater than
> that of high redshift supernovae. There are more than four times as
> many supernovae in the range 0 - 0.8 as there are in the
> range 0.8 - 1.6.

This has concerned me too.

> As a consequence, models that fit low redshift data
> will be disproportionately favored.

I am less sure of that. The nature of the curve is that high redshift
datapoints have more "leverage". If we had more points high up, there
might be more of a scatter away from a curve which has been
overinfluenced by an individual SN.
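
To put a number on "leverage", here is a minimal Python sketch using the standard leverage formula for an unweighted straight-line fit. This is only a stand-in: the real distance-modulus relation is nonlinear in z and the points carry unequal errors. The redshift values and the function name are invented for the example.

```python
def line_fit_leverage(xs):
    """Leverage h_i of each point in an unweighted straight-line fit.

    h_i = 1/n + (x_i - mean)^2 / sum_j (x_j - mean)^2, the diagonal of the
    hat matrix. The values sum to 2, the number of fitted parameters.
    """
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return [1.0 / n + (x - mean) ** 2 / sxx for x in xs]


# Invented redshifts: 20 points crowded at low z, two isolated at high z.
redshifts = [0.02 * (i + 1) for i in range(20)] + [1.2, 1.5]
leverage = line_fit_leverage(redshifts)

for z, h in zip(redshifts[-4:], leverage[-4:]):
    print(f"z = {z:.2f}  leverage = {h:.3f}")
```

In this toy set the two isolated high-z points carry far more leverage than any of the crowded low-z points, so a single outlying high-z supernova can pull the fitted curve noticeably.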

> For my own purposes, I have
> weeded through the SCP Union data to create a "thinned" dataset
> of 50 points consisting of a maximum of the seven lowest uncertainty
> data points in each of the ranges 0.0 < z <= 0.2, 0.2 < z <= 0.4, etc.
> I'm seeing significant differences in the values I'm getting for fit
> parameters.

This strikes me as a good test, since the data points with lowest
uncertainty would not be expected to contain a bias. If the differences
are significant in the strict statistical sense, then I think it goes to
confirm what I found in previous analyses, that at the current time we
really do not have adequate supernova data to say anything very much
about cosmological parameters.

Regards


**Bob Day** · September 10th 08, 05:43 PM posted to sci.astro.research

"Oh No" wrote in message
...
> I don't think this is intrinsically a bad thing. Rather, I would say it
> is irrelevant. One should realise that the "absolute" value of chi^2 is
> floating. If one wants to compare models, then one has to normalise
> chi^2 to one model (the one with the lower chi^2 value) in order to see
> whether the value for the other would cause one to reject the fit.

It seems to me that if I fit a straight line to some data that is
somewhat scattered about a sharply curved parabola and
normalized the chi-square of the straight line fit to 1 by adding
a constant to the denominator of the chi-square terms (as the
authors of some of the supernova cosmology studies appear
to do), then I could use the fit of the straight line to reject the
fit of the parabola. No?

[Mod. note: no, because if the reduced chi^2 of a model that was
originally a poor fit is now = 1, then the reduced chi^2 of the model
that was originally a good fit is now < 1. Having said that, in my
view, doing *any* sort of goodness of fit analysis with data whose
chi^2 you have renormalized in this way is invalid, but I haven't
looked at the papers to see what the authors are actually doing.
-- mjh]
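
The straight-line versus parabola case can be tried numerically. Below is a minimal Python sketch under simplifying assumptions: every point has the same quoted error, the extra term sigma_int is added in quadrature, and the data are invented. With equal errors the best-fit coefficients do not depend on the size of the errors, so no refit is needed after inflating them.

```python
def polyfit(xs, ys, degree):
    """Unweighted least-squares polynomial fit via the normal equations."""
    n = degree + 1
    A = [[sum(x ** (j + k) for x in xs) for k in range(n)] for j in range(n)]
    b = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(n)]
    # Gaussian elimination with partial pivoting, then back-substitution.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= f * A[col][k]
            b[row] -= f * b[col]
    coeffs = [0.0] * n
    for row in reversed(range(n)):
        s = sum(A[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (b[row] - s) / A[row][row]
    return coeffs  # coeffs[j] multiplies x**j


def sum_sq_residuals(xs, ys, coeffs):
    """Sum of squared differences between the data and the polynomial."""
    return sum((y - sum(c * x ** j for j, c in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))


# Invented data scattered about the parabola y = x^2.
xs = [float(x) for x in range(-5, 6)]
noise = [0.4, -0.6, 0.3, 0.5, -0.2, -0.5, 0.6, -0.3, 0.1, -0.4, 0.2]
ys = [x * x + e for x, e in zip(xs, noise)]
sigma = 0.5  # quoted error, identical for every point

dof_line, dof_parabola = len(xs) - 2, len(xs) - 3
ss_line = sum_sq_residuals(xs, ys, polyfit(xs, ys, 1))
ss_parabola = sum_sq_residuals(xs, ys, polyfit(xs, ys, 2))

# Choose sigma_int so that the straight line has chi^2/dof = 1.
total_var = ss_line / dof_line  # equals sigma^2 + sigma_int^2
sigma_int = (total_var - sigma ** 2) ** 0.5

line_before = ss_line / sigma ** 2 / dof_line
line_after = ss_line / total_var / dof_line
parabola_after = ss_parabola / total_var / dof_parabola
print(f"sigma_int = {sigma_int:.2f}")
print(f"line: chi^2/dof {line_before:.1f} -> {line_after:.2f}")
print(f"parabola after inflation: chi^2/dof = {parabola_after:.4f}")
```

After inflation the line sits at exactly 1 and the parabola falls far below 1, so the line's value gives no grounds for rejecting the parabola; it only shows that the inflated errors are now much too large for it.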

-- Bob Day

September 10th 08, 07:04 PM posted to sci.astro.research

"Bob Day" writes:

> that they adjusted sigma-sys to force the value of chi-square to be
> 1 per degree of freedom.
>
> But isn't that fudging the data? -- using the model used to fit the data
> to optimize the fit? I can see that might be OK if you are sure your
> model is correct and are just trying to optimize the parameters, but it
> seems to me that it makes the data suspect if you are trying to fit
> other models.

This procedure is actually quite common. See the last part of
Sect. 15.1 of the second edition of NUMERICAL RECIPES IN FORTRAN, for
example, for a description.

If the errors were correctly estimated, then the chi-square should be 1
per DOF, as you state, for the correct model. But suppose the value
obtained is vastly different from 1. Then the errors cannot have been
correctly estimated (assuming that the best-fit model is in fact the
correct model), and it makes sense to re-calibrate the errors based on
the assumption that the chi-square should be 1 per DOF.

Of course, if one does this, then one cannot estimate the goodness of
fit!

There would be a problem only if, after re-estimating the errors, one
does a new parameter fit and gets a significantly different result.
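
Whether a refit changes depends on how the errors are re-estimated. Here is a minimal Python sketch with invented data and a weighted straight-line fit. It compares multiplying all errors by a common factor with adding a constant in quadrature; the constant 0.3 is arbitrary and chosen only for illustration.

```python
def weighted_line_fit(xs, ys, sigmas):
    """Weighted least-squares fit of y = a + b*x; returns (a, b, chi2)."""
    w = [1.0 / (s * s) for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    delta = S * Sxx - Sx * Sx
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    chi2 = sum(wi * (y - a - b * x) ** 2 for wi, x, y in zip(w, xs, ys))
    return a, b, chi2


# Invented data with unequal quoted errors.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 1.4, 1.7, 3.5, 3.6, 5.6]
sigmas = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3]
dof = len(xs) - 2

a, b, chi2 = weighted_line_fit(xs, ys, sigmas)

# (1) Multiply every error by one factor so that chi^2/dof becomes 1.
factor = (chi2 / dof) ** 0.5
scaled = [s * factor for s in sigmas]
a_scaled, b_scaled, chi2_scaled = weighted_line_fit(xs, ys, scaled)

# (2) Add an arbitrary constant in quadrature instead.
extra = 0.3
inflated = [(s * s + extra * extra) ** 0.5 for s in sigmas]
a_quad, b_quad, chi2_quad = weighted_line_fit(xs, ys, inflated)

print(f"original:   a = {a:.4f}, b = {b:.4f}, chi^2/dof = {chi2 / dof:.2f}")
print(f"rescaled:   a = {a_scaled:.4f}, b = {b_scaled:.4f}, "
      f"chi^2/dof = {chi2_scaled / dof:.2f}")
print(f"quadrature: a = {a_quad:.4f}, b = {b_quad:.4f}, "
      f"chi^2/dof = {chi2_quad / dof:.2f}")
```

A common scale factor leaves the relative weights, and therefore the best-fit parameters, untouched. A term added in quadrature changes the relative weights whenever the quoted errors are unequal, so the refit can move.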

**Bob Day** · September 11th 08, 02:52 PM posted to sci.astro.research

"Bob Day" wrote in message
...
> [Mod. note: no, because if the reduced chi^2 of a model that was
> originally a poor fit is now = 1, then the reduced chi^2 of the model
> that was originally a good fit is now < 1. Having said that, in my
> view, doing *any* sort of goodness of fit analysis with data whose
> chi^2 you have renormalized in this way is invalid, but I haven't
> looked at the papers to see what the authors are actually doing.
> -- mjh]

For curve fitting, the ideal value of chi-square is 1 per degree of
freedom. Either too low or too high indicates that the model is
incorrect. For example, a polynomial of a high enough order can
be made to fit just about any data perfectly, yielding a chi-square
value of zero. But that doesn't mean it's the correct curve!

[Mod. note: no, a reduced chi^2 < 1 doesn't mean the model is
incorrect. It means, most likely, that the errors have been
overestimated. One never rejects a *model* because the reduced chi^2
is too low, but one does question one's error estimates.

A polynomial of high enough order to give a perfect fit will have at
least as many free parameters as there are data points, the number of
degrees of freedom will be zero, and the *reduced* chi^2 will be
undefined, so your example sheds no light on the question. -- mjh]
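
The degrees-of-freedom point can be checked with a few lines of Python. This minimal sketch uses Lagrange interpolation as a convenient way to build the polynomial that passes through every point; the data values and the function name are invented for the example.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate at x the unique polynomial through all points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Invented, deliberately scattered data with equal errors.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.3, -0.7, 2.9, 0.4, 5.1]
sigmas = [0.5] * len(xs)

chi2 = sum(((y - lagrange_eval(xs, ys, x)) / s) ** 2
           for x, y, s in zip(xs, ys, sigmas))
n_params = len(xs)  # a degree-4 polynomial has 5 coefficients
dof = len(xs) - n_params

# With zero degrees of freedom the reduced chi^2 is 0/0, i.e. undefined.
reduced = chi2 / dof if dof > 0 else None
print(f"chi^2 = {chi2}, dof = {dof}, reduced chi^2 = {reduced}")
```

The fit is perfect, but there are no degrees of freedom left with which to judge it.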

Now suppose we fudge the data by adding a constant to the
denominator of the chi-square terms so that the chi-square value
of an imperfectly fitting, incorrect polynomial model becomes
the ideal value of 1 per degree of freedom. Doing that could
very well drive the value for a correct model lower, causing it to
be rejected in favor of the polynomial.

-- Bob Day

**Bob Day** · September 12th 08, 02:58 PM posted to sci.astro.research

Oops. You're right. Bad example. Thanks for pointing that out!

-- Bob Day
