Using Statistics Ethically to Combat “A Scientific Credibility Crisis”

February 10, 2017 - A survey of more than 1,500 investigators, published in a 2016 issue of Nature, showed that more than 70 percent of researchers have tried and failed to reproduce other scientists’ experiments, and more than half have failed to reproduce their own experiments.

Additional studies have come to similar conclusions, says Rochelle Tractenberg, PhD, associate professor of neurology at Georgetown University Medical Center with secondary appointments in biostatistics, bioinformatics and biomathematics, and rehabilitation medicine. “Irreproducible results do harm that can be difficult to discover and even more difficult to undo,” she said.

A consulting statistician and practicing scientist for the past 20 years, Tractenberg started pursuing her interest in promoting ethical research skills in 2009 after being invited to join a GUMC task force to explore these challenges. Tractenberg has worked with colleagues at GUMC and other institutions around the country to promote this brand of responsible research, and she'll have a larger stage later this month.

Tractenberg will discuss responsible research, and its relevance for all statisticians, data analysts and data scientists, in a symposium she organized for the upcoming annual meeting of the American Association for the Advancement of Science (AAAS) in Boston on February 19.

Credibility crises

Most investigators have taken just a single course in statistics and are therefore laypeople when it comes to analyzing data, whether for typical experiments or for big data analyses. It is perhaps not surprising, then, that so many studies cannot be replicated or their results reproduced, Tractenberg says.

“My focus on promoting ethical statistical practice arose because a scientific credibility crisis is emerging due partly to scientists who do not conduct - or insist upon - appropriate statistical analysis or interpretation, or both,” she says. “If ethical statistical practice becomes the norm across statistics and data science, it may then be taken up into other domains where data analysis makes important contributions.”

Several elements of a study can lead to irreproducible results, including incorrect analysis, improper interpretation of data, cherry-picking results, or failing to transparently report the number of analyses that were done, Tractenberg says. Avoiding these pitfalls is a principle of ethical statistical practice as well as of responsible conduct in research.
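To illustrate why the number of analyses matters, here is a minimal sketch in Python. It is not drawn from Tractenberg's work or from the ASA guidelines. The function names are hypothetical, and the calculation assumes that the tests are independent and that no real effect exists. Under those assumptions, running 20 tests at the conventional 0.05 level gives roughly a 64 percent chance of at least one spurious "significant" result. That is why reporting only the one test that "worked" can mislead. The Bonferroni adjustment shown is one standard correction among several.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumes (for illustration only) that the tests are independent, that
    every null hypothesis is true, and that each test uses level alpha.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    # P(no false positive in any test) = (1 - alpha) ** n_tests
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    # Show how the false-positive risk grows with the number of analyses.
    for n in (1, 5, 20):
        rate = familywise_error_rate(0.05, n)
        per_test = bonferroni_threshold(0.05, n)
        print(f"{n:2d} tests: risk of a false positive = {rate:.2f}, "
              f"Bonferroni per-test level = {per_test:.4f}")
```

A reader can only make this kind of adjustment if the total number of analyses is disclosed, which is the point of the transparency principle described above.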

“Although it can often seem that data analysis is secondary to the ‘main’ science or study purpose, the analytic method and its interpretation are essential attributes of both rigor and reproducibility, and this is true for their own work and for their peer review of others' work,” says Tractenberg.

A large number of these irreproducible studies might never have been published if peer reviewers who were unable to evaluate the statistics “just told the editor they don’t feel qualified to evaluate the study’s statistical argument, and that a formal statistical review is needed,” she says. Having a formal statistical review does not guarantee reproducibility or rigor, but not having or insisting on one virtually guarantees the continuation of the reproducibility crisis.

Data “Must Be Treated Ethically”

A faculty member at Georgetown since 2002, Tractenberg was appointed to the national Committee on Professional Ethics of the American Statistical Association (ASA) in 2013, a committee that she now chairs. In her 90-minute panel at the AAAS meeting, “How Ethical Science Supports Ethical Policy: Disciplinary Perspectives,” she will discuss the ASA Ethical Guidelines for Statistical Practice, which anyone who analyzes data can use, whether dealing with “small” or “big” data. She says that ethical statistical practice - by every data analyst - is integral to maintaining the value of science in society.

Tractenberg's panel will also bring together specialists in engineering and economics to describe their efforts to establish and promote ethical practices and policies within their disciplines. These three perspectives will then be discussed with respect to their potential to influence and support ethical policy and decision making.

“All scientific fields have different relationships to data and how the data should be interpreted,” Tractenberg says. “But the core of all in this work is the data and its analysis, and I firmly believe these must be dealt with ethically. Otherwise, decisions that are based on these results may be incorrect or indefensible, or both.”

“The data analyst, whether a professional statistician or just the group member who is most skilled with the analysis software, has an obligation to treat and interpret the data ethically,” Tractenberg says. “In a post-truth world, this may be the best way to promote scientific integrity.”

Renee Twombly
GUMC Communications