5.3. Transparency of Methods and Analysis
Learning Outcome
Students will be able to identify clarity in methods of analysis of data and demonstrate how conclusions can be misleading.
Sample Tasks
Differentiate between settings where an algorithm will lead to biased results and to unbiased results.
Identify the traits that demonstrate that the analysis, findings, and conclusions made from data are reliable and reproducible.
Explain the conclusions of the analysis of data in terms that are understandable to an appropriate audience.
Identify misinterpretations in conclusions of data analysis.
Our first reading, from Modern Data Science with R [BKH21], gives some examples where “true” data is presented in such a way as to convey false meaning. Similar issues were discussed in Section 2.6.
Reading Question
Is global temperature increasing?
Our second reading, also from Modern Data Science with R [BKH21], discusses the importance of making your analysis reproducible, so that others can check it.
Reading Question
If your analysis is correct, why do others need to be able to reproduce it?
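As a small illustration of what "reproducible" can mean in practice, the sketch below runs a toy analysis that depends on random sampling. The function name `run_analysis`, the seed values, and the sample size are hypothetical choices made for this example and do not come from the reading. Fixing the seed is only one ingredient of reproducibility, alongside sharing the data and the code.

```python
import random
import statistics


def run_analysis(seed: int, n: int = 1000) -> float:
    """Toy analysis: draw n standard-normal values and return their mean.

    The seed is an explicit argument so that anyone re-running the
    analysis with the same seed obtains exactly the same number.
    """
    rng = random.Random(seed)  # private generator; no hidden global state
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return statistics.mean(sample)


if __name__ == "__main__":
    first = run_analysis(seed=42)
    second = run_analysis(seed=42)
    # Same seed, same code, same result: a reader can check the claim.
    print("Results match:", first == second)
```

If the seed were left unrecorded, a reader who re-ran the code would get a slightly different mean and could not tell whether the difference came from chance or from an error.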
Our third reading, a 2021 blog post, discusses the dangers poor data ethics pose to research.
Reading Question
If your data contradicts your hypothesis, should you still (try to) publish it?
Our fourth set of readings consists of Wikipedia’s entries on:
Data dredging (also known as data snooping or p-hacking)
HARKing (hypothesizing after the results are known)
Reading Questions
Your friend finds a hedge fund that has outperformed the market for 10 years in a row and suggests that you invest in it. Did your friend engage in data dredging?
You notice that when you drop heavy things on your foot, your foot hurts. Did you just engage in HARKing?
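One way to explore the hedge fund question is a short simulation. The sketch below assumes a hypothetical population of 5,000 funds with no skill at all, each of which beats the market in any given year with probability 0.5, independently of other years. The number of funds, the probability, and the function names are assumptions made for illustration, not figures from the readings.

```python
import random


def prob_streak(years: int, p_beat: float = 0.5) -> float:
    """Chance that a single skill-free fund beats the market every year."""
    return p_beat ** years


def prob_at_least_one(n_funds: int, years: int, p_beat: float = 0.5) -> float:
    """Chance that at least one of n_funds skill-free funds has a perfect streak."""
    return 1.0 - (1.0 - prob_streak(years, p_beat)) ** n_funds


def count_streak_funds(n_funds: int, years: int,
                       p_beat: float = 0.5, seed: int = 0) -> int:
    """Simulate n_funds coin-flip funds and count those with a perfect streak."""
    rng = random.Random(seed)
    return sum(
        all(rng.random() < p_beat for _ in range(years))
        for _ in range(n_funds)
    )


if __name__ == "__main__":
    print("P(one given fund wins 10 years in a row):", prob_streak(10))
    print("P(at least one of 5000 funds does):", prob_at_least_one(5000, 10))
    print("Simulated funds with a 10-year streak:", count_streak_funds(5000, 10))
```

Under these assumptions, a perfect ten-year record is very unlikely for any one fund chosen in advance (1 in 1,024), yet almost certain to appear somewhere among thousands of funds. Whether a record is impressive therefore depends on how the fund was selected.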
Further Resources
Federal Data Strategy Data Ethics Framework
5 Principles of Data Ethics for Business, a 2021 blog post.
What Is Data Ethics?, a 2021 blog post.
When It Comes to Data Collection, Transparency and Ethical Standards Are Not a Matter of Choice, a 2021 blog post.
Why Data Transparency Matters, a post.