![]() ![]() There's certainly evidence that p-hacking is "out there", e.g. Someone reporting p-values from a stepwise regression (because they find stepwise procedures "produce good models", but aren't aware the purported p-values are invalidated) is in the latter camp, but the effect is still p-hacking under the last of my bullet points above. Although some dubious motivations and (particularly in the competition for academic publication) counterproductive incentives are obvious, I suspect it's hard to figure out quite why it's done, whether deliberate malpractice or simple ignorance. It is often listed as one of the "dangers of the p-value" and was mentioned in the ASA report on statistical significance, discussed here on Cross Validated, so we also know it's a Bad Thing. experimentation during model-fitting, particularly covariates to include, but also regarding data transformations/functional form.the previous example is related to optional stopping, i.e., analyzing a dataset and deciding on whether to collect more data or not depending on the data collected so far ("this is almost significant, let's measure three more students!") without accounting for this in the analysis.in a meta-analysis, it may be a finely balanced argument whether a particular study's methodology is sufficient robust to include) in an econometric study of "developed countries", different definitions yield different sets of countries), or qualitative inclusion criteria (e.g. One opportunity comes when "data-cleaning outliers", but also when applying an ambiguous definition (e.g. experimenting with inclusion/exclusion of data points, until the desired result is obtained.both a parametric and a non-parametric test ( there's some discussion of that in this thread), but only reporting the most significant trying different tests of the same hypothesis, e.g.failing to adjust properly for multiple testing, particularly post-hoc testing and failing to report tests carried out that were not significant. ![]() ![]() only analysing an "interesting" subset of the data, in which a pattern was found. ![]() There are many ways to procure a "more significant" result, including but by no means limited to: The phrase p-hacking (also: "data dredging", "snooping" or "fishing") refers to various kinds of statistical malpractice in which results become artificially statistically significant. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |