An international group of scientists is calling on scientific journals to demand more transparency from researchers in computer-related fields when accepting their reports for publication.
They also want computational researchers to include information about their code, models, and computational environments in published reports.
Their call, published in Nature in October, was a response to research conducted by Google Health and published in Nature last January.
The research claimed an artificial intelligence system was faster and more accurate at screening for breast cancer than human radiologists.
Google funded the study, which was led by Google researcher Scott McKinney and other Google employees.
Criticisms of the Google Study
“In their study, McKinney et al. showed the high potential of artificial intelligence for breast cancer screening,” the international group of scientists, led by Benjamin Haibe-Kains, of the University of Toronto, stated.
“However, the lack of detailed methods and computer code undermines its scientific value. This shortcoming limits the evidence required for others to prospectively validate and clinically implement such technologies.”
Scientific progress depends on the ability of independent researchers to scrutinize the results of a study, reproduce its main findings using its materials, and build on them in future work, the scientists said, citing Nature's policies.
McKinney and his co-authors stated that it was not feasible to release the code used for training the models because it has a large number of dependencies on internal tooling, infrastructure, and hardware, Haibe-Kains’ group noted.
However, many frameworks and platforms are available to make AI research more transparent and reproducible, the group said. These include code-sharing platforms such as Bitbucket and GitHub; package managers such as Conda; and container and virtualization systems such as Code Ocean and Gigantum.
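Disclosure of this kind can be lightweight even when full code release is not possible. As a minimal sketch, assuming a Python-based workflow (the package names below are illustrative placeholders, not those used in the Google study), a few lines of standard-library code can record the interpreter, operating system, and installed dependency versions alongside a paper's published materials:

```python
# Sketch: capture the computational environment for a study's public repository.
# Package names are placeholders for illustration only.
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Record interpreter, OS, and installed package versions."""
    env = {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            env["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            env["packages"][name] = "not installed"
    return env


if __name__ == "__main__":
    # A real study would enumerate every dependency it relies on.
    snapshot = snapshot_environment(["numpy", "tensorflow"])
    with open("environment_snapshot.json", "w") as f:
        json.dump(snapshot, f, indent=2)
```

A snapshot like this does not substitute for releasing code, but it gives independent investigators the version information they would need to reconstruct a comparable environment.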
AI shows great promise for use in the field of medicine. "Unfortunately, the biomedical literature is littered with studies that have failed the test of reproducibility, and many of these can be tied to methodologies and experimental practices that could not be investigated due to failure to fully disclose software and data," Haibe-Kains' group said.
Google did not respond to our request for comment on this story.
Patents Pending?
There might be good business reasons for companies not to disclose full details about their AI research studies.
"This research is also considered confidential in the development of technology," Jim McGregor, a principal analyst at Tirias Research, told TechNewsWorld. "Should technology companies be forced to give away technology they've spent billions of dollars developing?"
What researchers are doing with AI “is phenomenal and is leading to technological breakthroughs, some of which are going to be covered by patent protection,” McGregor said. “So not all of the information is going to be available for testing, but just because you can’t test it doesn’t mean it isn’t correct or true.”
Haibe-Kains' group recommended that if data cannot be shared with the entire scientific community because of licensing or other insurmountable issues, "at a minimum, a mechanism should be set so that some highly trained, independent investigators can access the data and verify the analyses."
Driven by Hype
Problems with verifiability and reproducibility plague AI research as a whole. Only 15 percent of AI research papers publish their code, according to the State of AI Report 2020, produced by AI investors Nathan Benaich and Ian Hogarth.
They particularly single out Google’s AI subsidiary and laboratory DeepMind and AI research and development company OpenAI as culprits.
“Many of the problems in scientific research are driven by the rising hype about it, [which] is needed to generate funding,” Dr. Jeffrey Funk, a technology economics and business consultant based in Singapore, told TechNewsWorld.
“This hype, and its exaggerated claims, fuel a need for results that match those claims, and thus a tolerance for research that is not reproducible.”
Scientists and funding agencies will have to “dial back on the hype” to achieve more reproducibility, Funk observed. However, that “may reduce the amount of funding for AI and other technologies, funding that has exploded because lawmakers have been convinced that AI will generate $15 trillion in economic gains by 2030.”