
Editorial: Data mining to uncover patterns in data


P-hacking, data dredging, data fishing. These terms all describe the use of data mining to uncover patterns in data that can be presented as statistically significant, without first devising a specific hypothesis about the underlying causality [1]. An example: the number of people who drowned by falling into a pool correlates with the number of movies Nicolas Cage appeared in from 1999 to 2009 (r=0.67) [2]. This seems bizarre, but these kinds of practices, in a far more subtle form of course, occur in research groups. Data files are analyzed exhaustively in the hope of finding significant results. P-hacking is often followed by HARKing (hypothesizing after the results are known), which defies the founding principles of empiricism [3]. The hypothesis and method are altered after the results have been analyzed, instead of the other way around. This behavior is driven by many factors, for instance the pressure to publish and the tendency of journals to publish mostly positive results. These malpractices were already described in the statistical literature in the previous century, but awareness in the medical sciences seems to be increasing only slowly. Several solutions to these problems have been proposed: reporting ‘clinical relevance’ rather than merely ‘significant’ results, preregistering research plans before collecting data, and a variety of statistical remedies [4]. I would like to encourage students to study these ideas and to be critical of their own actions and those of fellow researchers.
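For readers who want to see how easily such spurious findings arise, the following is a minimal sketch in Python, not an analysis taken from any of the cited papers. It assumes purely random, unrelated data: one "outcome" series of 11 yearly values (matching the 1999 to 2009 window of the example) is correlated against 1000 random candidate variables. The names `pearson_r`, `dredge` and `R_CRITICAL` are hypothetical, and the threshold of 0.602 is the tabulated critical value of |r| for a two-sided test at p < 0.05 with 9 degrees of freedom.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Assumption: tabulated critical |r| for p < 0.05 (two-sided), n = 11, df = 9.
R_CRITICAL = 0.602


def dredge(n_candidates=1000, n_points=11, seed=0):
    """Correlate one random outcome with many unrelated random variables.

    Returns the (index, r) pairs that cross the significance threshold,
    even though no real relationship exists in the simulated data.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_points)]
    hits = []
    for i in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        r = pearson_r(outcome, candidate)
        if abs(r) > R_CRITICAL:
            hits.append((i, r))
    return hits


hits = dredge()
print(f"{len(hits)} of 1000 unrelated variables look 'significant'")
```

With a threshold of p < 0.05, roughly one in twenty of the unrelated variables is expected to pass by chance alone. Reporting only those hits, and inventing a hypothesis for them afterwards, is the pattern described above.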

To set an example, the Amsterdam Medical Student journal (AMSj) tries to be critical of its authors and will always value a solid methodology over positive results. Once again, we are proud to present our next edition. In this edition, S.L. Verhaart reviews the safety and efficacy of CTLA-4 checkpoint inhibitors in the treatment of cancer. In addition, we would like to welcome back J.J.K. van Diemen, who presents a case report about the enduring relationship between clinical presentation and troponin-T. Our international colleagues from New Zealand contributed a special ‘Crossing Borders: From Aotearoa to Amsterdam’ item, which gives insight into a medical system 18,000 kilometres away. As I near the end of my period as Student Editor-in-Chief (AMC), we are excited to welcome Rens Kempeneers as my successor. Combining his clinical rotations with a PhD program in Surgery gives him the complete package to become a great Student Editor-in-Chief. I would like to thank all the authors who have contributed to this amazing initiative over the past year and a half, and I wish you all a bright future.

M.T.U. Schuijt


  1. Munafò MR, Nosek BA, Bishop DVM, et al. A manifesto for reproducible science. Nature Human Behaviour. 2017;1(1):0021.
  2. Spurious-correlations [Internet]. 2018. Available here.
  3. Kerr NL. HARKing: hypothesizing after the results are known. Personality and Social Psychology Review. 1998;2:196-217.
  4. Head ML, Holman L, Lanfear R, et al. The Extent and Consequences of P-Hacking in Science. PLoS Biol. 2015;13(3):e1002106.
