What's the Difference Between Data Cooking and Data Rigging?

  • Thread starter Gruxg
  • Tags
    Data
In summary: Actually, what I want to do is to call attention to what I consider an incorrect method often used by many people, without addressing any particular person. I don't want to accuse all these people of deliberately lying, but I think they are in some way lying to themselves.
  • #1
Gruxg
I'm not sure where I should post this question, so I will try here.

It's a language question. I am not a native English speaker and I don't know the difference between 'data cooking' and 'data rigging'. Which one sounds more offensive, or describes a more serious fault for a scientist?

What would you call, in English, a method of data processing and analysis that is not totally objective and is influenced by the result we would like to obtain? I think the author doesn't want to lie but is using an incorrect trick to get misleadingly good results. I'd like to be clear but not very rude.

Thanks!
 
  • #2
Those two terms might as well be synonymous. They're both euphemisms, so the specific intended meaning is ambiguous. The conclusion in both cases is simply that the data has been deliberately manipulated and cannot be trusted. Which euphemism is used doesn't really seem to matter.

Re-reading your post, I am under the impression that you plan to write to the author and ... well ... accuse him of manipulating the data.

Any label you use will come across as rude, especially if it carries the accusation that the data was deliberately manipulated. If you want to avoid being rude, avoid labels altogether; just tell him that you think his data analysis is flawed. This leaves him an "out", in that you are not directly accusing him of deliberately cooking his data.
 
  • #3
Thanks for your reply, Dave

Actually, what I want to do is to call attention to what I consider an incorrect method often used by many people, without addressing any particular person. I don't want to accuse all these people of deliberately lying, but I think they are in some way lying, although not deliberately.
 
  • #4
There is also the term "cherry picking": the data itself might be correct, but picking out certain data points and twisting them so that they appear to support your hypothesis is unethical.
 
  • #5
Another appropriate term is data mining, but cherry picking, as suggested above, is clearer. I don't believe the terms you suggested in your original post communicate what you are trying to say.
 
  • #6
Evo said:
There is also the term "cherry picking"
Ooh. Good one.

John Creighto said:
Another appropriate term is data mining...
This is not my understanding of data mining.

I thought data mining simply meant deep, number-crunchy processing of data in search of patterns.

As an example, one might look at data much more closely than was originally intended. In a company of 10,000 people one might find some very interesting emergent data that was not apparent from the individual data points: say, that a disproportionate number of employees at a military technology vendor are correlated with long-distance overseas calls to hostile countries.

There is nothing wrong with the data or the methods it is subjected to. I.e., in my understanding, data mining is not the term that the OP is looking for.
 
  • #7
DaveC426913 said:
This is not my understanding of data mining.

I thought data mining simply meant deep, number-crunchy processing of data in search of patterns.

As an example, one might look at data much closer than was originally intended. In company of 10,000 people one might find some very interesting emergent data that was not apparent from the individual data points - say, a disproportionate number of employees at a military technology vendor are correlated with long distance overseas calls with hostile countries.

Nothing wrong with the data or the methods it is subjected to. i.e. in my understanding, data mining is not the term that the OP is looking for.

This is true. However, if you've ever followed the Climate Audit blog, Steve McIntyre uses the term to describe the use of principal component analysis to extract a hockey stick shape from the data used in MBH98. The point is that the technique amplifies low-power noise, and therefore, by selecting a region with an apparent upward trend, the application of the technique gives the desired result. The word mining is used because we are digging through the data to get the desired result rather than trying to find an unbiased vantage point.

Even more so, the proxies selected by the technique were highly correlated with CO2 and thus established the desired correlation between CO2 and temperature. I do not know whether this use of the word is limited to McIntyre's blog or has wider usage, but the term cherry picking is certainly widely used.
 
  • #8
Thanks a lot for the comments.

I knew the term "data mining" with the meaning explained by Dave:
http://en.wikipedia.org/wiki/Data_mining

I didn't know the term "cherry picking", but after searching a bit on the web, I think it refers to using only the data that support a hypothesis and disregarding the rest, while in my case the problem is the analysis rather than the selection of the data.

Maybe I should take Dave's advice and avoid any label.
 
  • #9
Well, both the data and the method of analysis are things that can be cherry-picked.
 
  • #10
John Creighto said:
The point being the word mining is used because we are digging through the data to get the desired result rather than trying to find an unbiased vantage point.
Analogously, I could go to the local library to dig through the data there to get my desired result. But that does not make "going to the library" a term with negative or dishonest connotations.

Whereas cooking data, rigging data and cherry-picking are all distinctly negative and dishonest.
 
  • #11
Cherry mining.
 
  • #12
lisab said:
Cherry mining.

:smile:

I see your problem. Your tree is upside down.

:biggrin:
 
  • #13
I have to observe that I understand a subtle difference between 'cooking' and 'rigging' something.

To me, cooking implies falsifying or hiding data after the event, as in an accountant 'cooking the books' to present a false financial picture.

On the other hand, rigging implies prearranging something so the outcome will be skewed in some desired fashion, as in 'loading the dice'. I don't think anyone would describe this as cooking the dice, but I have heard the term rigging the dice.

The OP may also be interested in the following distinction.

Tax evasion is a crime.

Tax avoidance is common sense.
 

FAQ: What's the Difference Between Data Cooking and Data Rigging?

What is data cooking/data rigging?

Data cooking, also known as data rigging, is the process of manipulating or altering data in order to achieve a desired outcome. This can involve intentionally changing or omitting data points, or selectively choosing which data to analyze in order to support a particular conclusion.

Why is data cooking/data rigging unethical?

Data cooking/data rigging is considered unethical because it involves intentionally misrepresenting data in order to support a desired outcome or agenda. This can lead to biased or misleading conclusions, which can have serious consequences in fields such as scientific research or business decision-making.

What are some examples of data cooking/data rigging?

Examples of data cooking/data rigging include selectively choosing which data to include in a study, altering data to fit a predetermined narrative, or cherry-picking data to support a specific argument. It can also involve manipulating data through statistical techniques or altering data collection methods in order to achieve a desired outcome.
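As a rough illustration of the cherry-picking point raised in the thread (correct data, biased selection), here is a minimal Python sketch. It is not anyone's actual analysis: the series is pure random noise, and the names `slope`, `window_slopes`, the window width of 20 and the seed are all assumptions made for the example. It fits a straight line to the whole series, then to every 20-point window, and reports the steepest window, which is what a selective analyst would present.

```python
import random


def slope(ys):
    """Least-squares slope of ys against the indices 0..n-1."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def window_slopes(ys, width):
    """Slope of every contiguous window of the given width."""
    return [slope(ys[i:i + width]) for i in range(len(ys) - width + 1)]


# Hypothetical data: pure noise with no real trend (seed chosen arbitrarily).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]

slopes = window_slopes(noise, 20)
full_series = slope(noise)      # the honest summary: all of the data
cherry_picked = max(slopes)     # the window chosen *because* it trends upward

print(f"slope over all 200 points:      {full_series:+.4f}")
print(f"steepest 20-point window slope: {cherry_picked:+.4f}")
```

Every number in the cherry-picked window is genuine; only the choice of window is biased, which is why comparing a reported result against the full data set is one of the checks mentioned in the next answer.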

How can data cooking/data rigging be detected?

Data cooking/data rigging can be detected through careful examination of the data and analysis methods used. This can involve checking for unusual patterns or discrepancies in the data, reviewing the methods used to collect and analyze the data, and comparing the results to other studies or data sets. Collaboration and peer review can also help to identify potential instances of data cooking/data rigging.

How can we prevent data cooking/data rigging?

Preventing data cooking/data rigging requires a combination of ethical standards and careful data management. Scientists and researchers should adhere to ethical guidelines and practices, such as transparent data collection and reporting methods, and avoiding conflicts of interest. Additionally, maintaining a culture of collaboration and peer review can help to identify and prevent instances of data cooking/data rigging.
