This study compares the citations of reproducible and non-reproducible papers. It is based on 328 published models that Tiwari et al. classified according to their reproducibility. Hypothesis testing is performed using a flexible Bayesian approach that provides a complete assessment of the posteriors. The approach handles outliers via a non-central t distribution. Results show that reproducible papers are significantly more cited between 2013 and 2020, i.e. 10 years after the introduction of SBML. This trend also persists for later periods with more than 95% credibility. In conclusion, this statistical analysis demonstrates the long-term benefits of reproducible modeling for the individual researcher and the scientific community.
DOI: 10.15490/fairdomhub.1.study.1103.2
Created at: 11th Jan 2023 at 16:22
Statistical analysis and BEST method of Kruschke for Python applied to citation data in Systems Biology
The statistical analysis was performed in a Jupyter notebook.
This notebook contains the commands for all analyses performed (Statistical_analysis_of_FAIR_citations.ipynb).
The Bayesian Estimation Supersedes the t Test (BEST) method of Kruschke (2013) was used for Bayesian significance testing.
The method was implemented in a Python class together with methods for visualization and distributional analysis (BEST_method_python_Kruschke2012.py).
Also the bayesian multiple comparison analysis can be
...
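To illustrate the idea behind BEST, the following is a minimal sketch of the two-group model in plain Python. It is not the class from BEST_method_python_Kruschke2012.py: the function names are hypothetical, a simple random-walk Metropolis sampler stands in for the sampler of a probabilistic programming library, and the priors are assumed to be the defaults described by Kruschke (2013), i.e. wide normal priors on the group means, uniform priors on the scales and a shifted exponential prior on the shared degrees of freedom. The heavy-tailed likelihood is written here as a Student-t with location and scale.

```python
import math
import random
import statistics


def t_logpdf(x, mu, sigma, nu):
    """Log density of a Student-t with location mu, scale sigma and nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1) / 2 * math.log1p(z * z / nu))


def log_posterior(theta, y1, y2, pooled_mean, pooled_sd):
    """Unnormalised log posterior; theta = (mu1, mu2, log sigma1, log sigma2, log(nu - 1))."""
    mu1, mu2, log_s1, log_s2, log_num1 = theta
    s1, s2, num1 = math.exp(log_s1), math.exp(log_s2), math.exp(log_num1)
    # Assumed prior: each sigma uniform between pooled_sd / 1000 and pooled_sd * 1000.
    if not all(pooled_sd / 1000 < s < pooled_sd * 1000 for s in (s1, s2)):
        return -math.inf
    lp = log_s1 + log_s2 + log_num1   # Jacobians of the log transforms
    lp += -num1 / 29.0                # assumed prior: nu - 1 ~ Exponential(mean 29)
    for mu in (mu1, mu2):             # assumed prior: mu ~ Normal(pooled_mean, 1000 * pooled_sd)
        lp += -0.5 * ((mu - pooled_mean) / (1000 * pooled_sd)) ** 2
    lp += sum(t_logpdf(y, mu1, s1, num1 + 1) for y in y1)
    lp += sum(t_logpdf(y, mu2, s2, num1 + 1) for y in y2)
    return lp


def sample_mean_difference(y1, y2, n_iter=6000, burn_in=1000, seed=0):
    """Random-walk Metropolis draws of mu1 - mu2 (illustrative step sizes, not tuned)."""
    rng = random.Random(seed)
    pooled = list(y1) + list(y2)
    pooled_mean, pooled_sd = statistics.mean(pooled), statistics.stdev(pooled)
    theta = [statistics.mean(y1), statistics.mean(y2),
             math.log(pooled_sd), math.log(pooled_sd), math.log(29.0)]
    steps = [0.5 * pooled_sd / math.sqrt(len(y1)),
             0.5 * pooled_sd / math.sqrt(len(y2)), 0.1, 0.1, 0.5]
    current = log_posterior(theta, y1, y2, pooled_mean, pooled_sd)
    diffs = []
    for i in range(n_iter):
        proposal = [t + rng.gauss(0.0, s) for t, s in zip(theta, steps)]
        candidate = log_posterior(proposal, y1, y2, pooled_mean, pooled_sd)
        # Accept with probability min(1, exp(candidate - current)).
        if math.log(1.0 - rng.random()) < candidate - current:
            theta, current = proposal, candidate
        if i >= burn_in:
            diffs.append(theta[0] - theta[1])
    return diffs
```

Given two lists of values `y1` and `y2`, the share of positive draws, `sum(d > 0 for d in diffs) / len(diffs)`, approximates the posterior probability that the first group has the larger mean. For real analyses, the notebook and class provided with this study should be used instead of this sketch.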
Curated citation data
The classification into reproducible and non-reproducible models was made by Tiwari et al.
Citations were looked up in Scopus, Web of Science and Google Scholar.
The following journals had to be excluded because Journal Impact Factors (JIF) were missing or the journals were discontinued:
* Experientia was closed in 1996 and continued as Cellular and Molecular Life Sciences from 1997
* The American journal of physiology – split into fields in 1977, with further splits in 1980 and 1989
* IFAC Proceedings Volumes – last issue
...
- Citation data.zip
Posterior traces and visualizations
The results of the analysis are structured in three parts:
1. The results of the main analysis
2. The results with a broader prior (sensitivity analysis)
3. The results of the multiple period comparison
For each part, full posterior traces for all analyses and the visualizations used in the paper are available.
Furthermore, diagnostics and traces were added for the different analyses.
The trace for the multiple comparison was too large to upload and is available on request.
- Results.zip
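Posterior traces such as these are typically summarised by a highest density interval (HDI) and by the posterior probability that an effect is positive. The sketch below shows both summaries for a plain list of posterior draws. It assumes the draws have already been loaded from a trace into a list of floats; the function names are hypothetical and are not taken from the files in Results.zip.

```python
import math


def hdi(samples, cred_mass=0.95):
    """Return the narrowest interval that contains cred_mass of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Number of draws that must lie inside the interval.
    inside = min(n, max(1, math.ceil(cred_mass * n)))
    # Slide a window of that size over the sorted draws and keep the narrowest one.
    start = min(range(n - inside + 1),
                key=lambda i: ordered[i + inside - 1] - ordered[i])
    return ordered[start], ordered[start + inside - 1]


def prob_positive(samples):
    """Share of posterior draws above zero."""
    return sum(s > 0 for s in samples) / len(samples)
```

Applied to draws of a difference between two groups, a `prob_positive` value above 0.95 corresponds to a statement of "more than 95% credibility" for a positive difference.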
BEST method and executable notebook
The folder contains the Jupyter notebook for executing all analyses of the study.
The BEST method is used in the notebook and is provided in a separate Python script.
There is a class for the BEST method according to Kruschke and a class for the BEST multiple comparison.
A conda environment file listing all libraries required for the analysis, including the package versions, is provided.
The environment can be created via
conda env create -f pymc_env.yml
- Statistical analysis and python BEST method.zip
These checksums allow you to verify that a snapshot you have downloaded has not been modified.
MD5: 420604823ed43ccff2ecbd7d6b890cba
SHA1: ae6b4ac3ade99ebbe71d16d5719380792f742c33
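One way to compare a downloaded snapshot against these values is sketched below, using only the Python standard library. The function names and the file name `snapshot.zip` are hypothetical; the actual name of the downloaded file may differ.

```python
import hashlib


def file_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Hex digest of a file, read in chunks so large archives fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex, algorithm="sha1"):
    """True if the file's digest equals the published checksum."""
    return file_digest(path, algorithm).lower() == expected_hex.strip().lower()


# Hypothetical usage with the SHA1 value listed above:
# matches_checksum("snapshot.zip", "ae6b4ac3ade99ebbe71d16d5719380792f742c33")
```

A result of `False` means the file differs from the published snapshot, for example because the download was incomplete.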
Last updated: 11th Jan 2023 at 16:31