Cosmology Raw Data: Access Unanalyzed Experiments

In summary: A great deal of raw cosmology data is publicly available, and not all of it is housed on NASA servers, because NASA also funds many other observational teams that are bound by the same public-release agreement. It is up to the individual researcher to decide whether to go looking for that data, but it is generally available to anyone who wants it.
  • #1
barnflakes
I was reading Roger Penrose's book and he mentions that there are huge amounts of raw data from experiments that haven't yet been fully analysed. Is there a way I can get my hands on this raw data, for cosmology experiments or any other experiments with large amounts of raw data? I know that very new data isn't available because researchers on that particular experiment are allowed "first run" at the data, but old data is fine by me.
 
  • #2
There is a ton of raw data available. There's stuff from SDSS and other surveys, there's stuff that's been compiled (not necessarily analyzed) on NASA's Extragalactic Database, HyperLEDA and on and on. If you have a yen for statistical analysis of huge blocks of data, you can knock yourself out. This can be brain-numbing work - the kind of stuff that you'd like to enslave some grad students to do, but there are lots of observations that are available to you. Warning: you may spend a good deal of time trying to put measurements in compatible formats. Redshifts can be expressed in lots of reference frames, luminosities can be expressed in different bands, etc. If you're interested in looking for small systemic effects, you'll have to get the measurements expressed in compatible terms in order to detect them.
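To make the point about reference frames concrete, here is a minimal Python sketch of one such conversion: taking a heliocentric redshift to the CMB rest frame. It is only an illustration, not any catalogue's official procedure. The function names are hypothetical, the Sun's speed and apex direction are rounded, assumed values that you should check against a current reference before using them, and the correction is first order in v/c.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

# ASSUMPTION: rounded values for the Sun's motion relative to the CMB rest
# frame (speed in km/s, apex in Galactic coordinates). Verify before real use.
SUN_SPEED_KM_S = 370.0
APEX_L_DEG = 264.0
APEX_B_DEG = 48.0


def cos_separation(l1_deg, b1_deg, l2_deg, b2_deg):
    """Cosine of the angle between two directions given as Galactic (l, b)."""
    l1, b1, l2, b2 = map(math.radians, (l1_deg, b1_deg, l2_deg, b2_deg))
    return (math.sin(b1) * math.sin(b2)
            + math.cos(b1) * math.cos(b2) * math.cos(l1 - l2))


def heliocentric_to_cmb(z_hel, l_deg, b_deg,
                        v_sun=SUN_SPEED_KM_S,
                        apex_l=APEX_L_DEG, apex_b=APEX_B_DEG):
    """Convert a heliocentric redshift to the CMB frame (first order in v/c).

    A source in the direction the Sun is moving looks slightly blueshifted
    from here, so its CMB-frame redshift comes out larger than z_hel.
    """
    v_los = v_sun * cos_separation(l_deg, b_deg, apex_l, apex_b)
    return (1.0 + z_hel) * (1.0 + v_los / C_KM_S) - 1.0


if __name__ == "__main__":
    # Same heliocentric redshift, two opposite directions on the sky.
    print(heliocentric_to_cmb(0.01, 264.0, 48.0))   # toward the assumed apex
    print(heliocentric_to_cmb(0.01, 84.0, -48.0))   # away from it
```

Two catalogues that quote "the" redshift of the same galaxy in different frames can disagree by roughly a part in a thousand this way, which matters if you are hunting for small effects.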
 
  • #3
Thanks turbo, I'm not yet at the stage where I'd know what the hell to do with such data but I aim to start "playing around" with it next summer when I have time to learn statistics/cosmology in more depth. I think it'd be pretty fun though, not mind numbing, surely once you've written the code you just let it run and see what it finds?
 
  • #4
The primary problem with dealing with raw data is systematic errors. This means that to actually make good use of raw data, you have to really understand not only the physics and the statistical analysis techniques used, but also the hardware used to collect the data.

Usually, of course, the observational teams will do as much as they can do to ensure that the systematic effects are taken care of. But it's generally a good idea to read their papers in detail to really get an idea of what's going on.

If you want to get your elbows into this, I'd suggest starting with SDSS: http://www.sdss.org/
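As a toy illustration of why systematic errors are the primary problem, the Python sketch below simulates repeated measurements of a made-up quantity. The function name, the noise level and the 0.05 offset are all hypothetical choices for the demonstration, not values from any real instrument. Random noise averages down as you collect more samples; a fixed instrumental offset does not.

```python
import random


def mean_error(n, noise_sigma, systematic_offset, seed=0):
    """Return (sample mean - true value) for n simulated measurements.

    Each measurement is the true value plus Gaussian noise plus a constant
    offset standing in for an uncorrected systematic effect.
    """
    rng = random.Random(seed)
    true_value = 1.0  # arbitrary toy quantity
    total = 0.0
    for _ in range(n):
        total += true_value + systematic_offset + rng.gauss(0.0, noise_sigma)
    return total / n - true_value


if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        clean = mean_error(n, noise_sigma=1.0, systematic_offset=0.0)
        biased = mean_error(n, noise_sigma=1.0, systematic_offset=0.05)
        print(f"n={n:>7}: noise only {clean:+.4f}, with offset {biased:+.4f}")
```

The noise-only error shrinks roughly as one over the square root of n, while the biased estimate settles on the offset no matter how much data you pile up. That is why more data alone does not help unless you understand the hardware well enough to model the offset.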
 
  • #5
turbo-1 said:
There is a ton of raw data available. There's stuff from SDSS and other surveys, there's stuff that's been compiled (not necessarily analyzed) on NASA's Extragalactic Database, HyperLEDA and on and on. If you have a yen for statistical analysis of huge blocks of data, you can knock yourself out. This can be brain-numbing work - the kind of stuff that you'd like to enslave some grad students to do, but there are lots of observations that are available to you. Warning: you may spend a good deal of time trying to put measurements in compatible formats. Redshifts can be expressed in lots of reference frames, luminosities can be expressed in different bands, etc. If you're interested in looking for small systemic effects, you'll have to get the measurements expressed in compatible terms in order to detect them.

Perhaps much, if not just about all, raw data of the type that you want, is considered proprietary, and NOT available for private use. There may well be serious computer security concerns about its possible transfer also. LOL
 
  • #6
justwondering said:
Perhaps much, if not just about all, raw data of the type that you want, is considered proprietary, and NOT available for private use. There may well be serious computer security concerns about its possible transfer also. LOL

This is the case in some circles. Any data that is collected as part of a NASA mission must be released to the public, however. And many other astronomy groups are moving in the same direction.
 
  • #7
Chalnoth said:
This is the case in some circles. Any data that is collected as part of a NASA mission must be released to the public, however. And many other astronomy groups are moving in the same direction.

So then NASA, which must have received many thousands of terabytes of raw data in total, is obligated to 'hand it over' to me, and any others in the USA?
 
  • #8
justwondering said:
So then NASA, which must have received many thousands of terabytes of raw data in total, is obligated to 'hand it over' to me, and any others in the USA?

It's in the public domain, and usually available online. See here for one area of NASA research:
http://lambda.gsfc.nasa.gov/

Bear in mind that not all of the data made public under this agreement is housed on NASA servers, as NASA also funds lots of other observational teams, who are then bound by the same agreement. And yes, there is a heck of a lot of data available.
 
  • #9
justwondering said:
So then NASA, which must have received many thousands of terabytes of raw data in total, is obligated to 'hand it over' to me, and any others in the USA?

Yes.

You can start at http://lambda.gsfc.nasa.gov/

Data that NASA maintains is covered under the Freedom of Information Act. Also, OMB Circular A-110 Subpart A(d)(1)(2) requires that non-profit organizations receiving federal grant money make their research data available under the FOIA, although they can charge for the costs of copying data.

http://www.whitehouse.gov/omb/rewrite/circulars/a110/a110.html

What typically happens with NASA space missions is that the research group that was the main group involved with the mission gets the privilege of publishing the first paper using the data, but once that paper is published the data is made available to everyone else.
 
  • #10
barnflakes said:
Thanks turbo, I'm not yet at the stage where I'd know what the hell to do with such data but I aim to start "playing around" with it next summer when I have time to learn statistics/cosmology in more depth. I think it'd be pretty fun though, not mind numbing, surely once you've written the code you just let it run and see what it finds?

As others have indicated, the process of comparing different data sets on a consistent footing is extremely nontrivial. Every survey has different selection effects, error budgets, etc. That's before you get to the issue that for cosmology to say anything meaningful about physics, you need to have a robust prediction to compare to (i.e. given a physics, what would we see?) and for almost all useful observations, this prediction itself is a not entirely solved problem (though for some data sets such as the CMB we are pretty much there for all but the most esoteric models). For instance, we know very well what the LCDM model predicts for the number and nature of dark matter structures, over a wide range of parameter values. We know this from detailed simulations. However, translating that into a genuinely comparable prediction to real observations, given the unknown details of galaxy formation and evolution, is far from solved. We've come a long way but we aren't there yet.
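For a feel of what "given a physics, what would we see?" means in the simplest possible case, here is a toy Python sketch that predicts distance moduli in a flat LCDM model and scores them against data with a chi-square. Everything in it is an assumption for illustration: the function names are hypothetical, H0 = 70 km/s/Mpc and Omega_m = 0.3 are placeholder parameter values, and the "data" are noise-free mock points generated from the model itself, not real measurements. It ignores exactly the hard parts described above (selection effects, systematics, galaxy formation).

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Flat LCDM luminosity distance in Mpc (trapezoidal integral of 1/E)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))

    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        integral += inv_e(i * dz)
    integral *= dz
    return (1.0 + z) * (C_KM_S / h0) * integral


def distance_modulus(z, **params):
    """Predicted distance modulus, 5 log10(d_L / Mpc) + 25."""
    return 5.0 * math.log10(luminosity_distance_mpc(z, **params)) + 25.0


def chi_square(data, **params):
    """Sum of squared, error-weighted residuals over (z, mu, sigma) rows."""
    return sum(((mu - distance_modulus(z, **params)) / sigma) ** 2
               for z, mu, sigma in data)


if __name__ == "__main__":
    # MOCK data: generated from the Omega_m = 0.3 model, assumed 0.1 mag errors.
    mock = [(z, distance_modulus(z, omega_m=0.3), 0.1)
            for z in (0.1, 0.3, 0.5, 0.8)]
    for om in (0.1, 0.3, 0.5, 1.0):
        print(f"Omega_m={om:.1f}: chi2={chi_square(mock, omega_m=om):.2f}")
```

Here the prediction is a one-line integral, which is why this kind of comparison is comparatively clean. For most other observables the prediction step is where the unsolved problems live.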

I don't want to dissuade you, cosmology is very interesting at any level of investigation, but don't take the quote "there are huge amounts of raw data from experiments that haven't yet been fully analysed" to mean "there is heaps of low-hanging fruit out there just waiting to be picked off". All the cosmological data that has come in has been analysed already in all kinds of ways, even if there are even more things you could think of doing that haven't been done yet. Yes, there could be (and probably are) surprises lurking in data already taken, but if they still remain hidden it's not for lack of trying.
 

FAQ: Cosmology Raw Data: Access Unanalyzed Experiments

What is cosmology raw data?

Cosmology raw data refers to the unprocessed and unanalyzed data collected from experiments and observations related to the study of the origin, evolution, and structure of the universe.

Why is access to unanalyzed experiments important?

Access to unanalyzed experiments allows scientists to study and analyze the data themselves, which can lead to new discoveries and insights about the universe. It also allows for independent verification of results and promotes transparency in scientific research.

How is cosmology raw data collected?

Cosmology raw data is collected through various methods, including telescopes, satellites, and other instruments that capture signals from distant objects in the universe. The data is then stored in databases and made available for analysis.

What are the challenges of working with cosmology raw data?

Working with cosmology raw data can be challenging due to its large volume and complexity. The data may also contain noise or errors that need to be accounted for during analysis. Additionally, the interpretation of raw data requires advanced statistical and computational techniques.

What are the potential benefits of studying cosmology raw data?

Studying cosmology raw data can lead to new discoveries and insights about the universe, such as the understanding of dark matter and dark energy. It can also help to test and refine existing theories and models of the universe, and guide future research in the field of cosmology.
