Commit
IRSI: Preliminary information about the project
- added extended information about the project, spectrum used, and preliminary results from the predictive experiment
- added the feature importance, y_true, and y_pred plots within the README
- added gitignore

Signed-off-by: akhilpandey95 <[email protected]>
1 parent c3e7466 · commit 51fea6b
Showing 5 changed files with 189 additions and 0 deletions.

.gitignore
@@ -0,0 +1,156 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# add the specific text dataset to ignore
data/acmpapers_fulltext_labels_04_24.csv

README.md
@@ -1,2 +1,35 @@
# IRSI

> **I**nfluence of **R**eproducibility on **S**cientific **I**mpact

Reproducibility is an important feature of science: experiments are retested, and analyses are repeated. We examine a myriad of features in scholarly articles published in computer science conferences and journals and model how they influence scientific impact.

### Reproducibility Spectrum

The author-centric framework focuses on acknowledging the availability, accessibility, and quality of the artifacts provided within a scientific document, as a signal that the prerequisites for reproducing a paper are satisfied. The author-centric categories within the spectrum are $A_i \in \{A_{PWA}, A_{PUNX}, A_{PAX}\}$: papers without artifacts ($A_{PWA}$), papers with artifacts that are not permanently archived ($A_{PUNX}$), and papers with artifacts that are permanently archived ($A_{PAX}$).

The external-agent framework presents the reproducibility evaluation status of a paper. Its categories are $E_i \in \{E_{NR}, E_{AR}, E_{R^{Re}}, E_{R}\}$: papers that cannot be reproduced ($E_{NR}$), papers awaiting reproducibility evaluation ($E_{AR}$), reproduced papers ($E_{R^{Re}}$), and reproducible papers ($E_{R}$).
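
To make the two frameworks concrete, the sketch below encodes the author-centric and external-agent categories as Python enumerations attached to a paper record. This is a hypothetical illustration of the taxonomy, not code from this repository; the class and field names are placeholders.

```python
from dataclasses import dataclass
from enum import Enum

class AuthorCentric(Enum):
    """Author-centric artifact categories (A_i)."""
    PWA = "paper without artifacts"
    PUNX = "paper with artifacts that are not permanently archived"
    PAX = "paper with artifacts that are permanently archived"

class ExternalAgent(Enum):
    """External-agent reproducibility evaluation status (E_i)."""
    NR = "cannot be reproduced"
    AR = "awaiting reproducibility evaluation"
    R_RE = "reproduced"
    R = "reproducible"

@dataclass
class PaperLabel:
    """A paper placed on the reproducibility spectrum (hypothetical schema)."""
    doi: str
    author_centric: AuthorCentric
    external_agent: ExternalAgent

# Example: a paper with permanently archived artifacts that has also been reproduced.
label = PaperLabel(
    doi="10.1000/example",
    author_centric=AuthorCentric.PAX,
    external_agent=ExternalAgent.R_RE,
)
print(label.author_centric.name, label.external_agent.name)
```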

<img src="media/AC_EA_DARK.png"/>

### Influence on Scientific Impact

The concept of reproducibility echoes diverse sentiments, and from a taxonomy standpoint, the formal definition of reproducibility has evolved as a term and a concept. We align with the National Academy of Sciences in defining *reproducibility* as the process of obtaining consistent computational results using the same input data, computational steps, methods, code, and conditions of analysis. This definition provides an ideal, generalizable standard applicable to large sections of scientific research within the sub-domains of computer science. Consensus on this definition can make it easier to recognize procedures and protocols for validating and verifying scientific claims. The relevance and importance of reproducibility are heightened more than ever, given the current outgrowth of Artificial Intelligence (AI) models into diverse public domains. The modus operandi of scientific workflows in AI has shifted from offering posterior fixes to building a priori reproducible AI pipelines. Regardless of the complexity, we can observe the push for making models, datasets, and algorithms available in containers, open-source repositories, and webpages. The significance of reproducibility is multifaceted. First, it upholds a standard for sustaining quality in the results and analysis of scholarly works, ensuring that scientific findings are robust, reliable, and unbiased. Second, it enables researchers to innovate and expand on proven findings quickly. Third, in the context of AI, reproducibility addresses essential safety and trust considerations by ensuring accountability within the systems implementing algorithms that make decisions affecting human lives.

<img src="media/IRSSI_test_val_mdi_feat_imp.png"/>

> Fig. 1.a: Important features observed while predicting the scholarly impact of reproducible papers in computer science.

Preliminary evidence from utilizing diverse feature groups central to the computational sciences in predicting scholarly impact highlights the importance of transparency, reproducibility, clear communication, and practical contributions in enhancing the scholarly impact of academic papers. These findings reflect broader trends in the scientific community towards open science and reproducibility, which are key to our interest in addressing the **Reproducibility crisis in AI**.
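
The mean-decrease-in-impurity (MDI) importances shown in Fig. 1.a can be obtained from a tree-ensemble regressor. The minimal sketch below illustrates the idea; the feature names, data file, and model settings are placeholders for illustration and are not the project's actual pipeline.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Placeholder feature columns and target; the real feature groups and dataset differ.
FEATURES = ["artifact_available", "artifact_archived", "readability_score", "num_equations"]
TARGET = "citation_count"

df = pd.read_csv("data/papers_features.csv")  # hypothetical file
X_train, X_test, y_train, y_test = train_test_split(
    df[FEATURES], df[TARGET], test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=300, random_state=42)
model.fit(X_train, y_train)

# In scikit-learn tree ensembles, feature_importances_ is the MDI importance.
mdi = pd.Series(model.feature_importances_, index=FEATURES).sort_values(ascending=False)
print(mdi)
```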

<img src="media/IRSSI_test_val_ytrue_ypred.png"/>

> Fig. 1.b: Plotting the true citation counts against the predicted values.
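
A plot like Fig. 1.b can be produced by scattering the held-out true citation counts against the model's predictions. The sketch below continues the hypothetical example above (reusing `model`, `X_test`, and `y_test`) and illustrates only the plotting step.

```python
import matplotlib.pyplot as plt

y_pred = model.predict(X_test)

fig, ax = plt.subplots(figsize=(5, 5))
ax.scatter(y_test, y_pred, alpha=0.4)
lims = [min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())]
ax.plot(lims, lims, linestyle="--", color="gray")  # y = x reference line
ax.set_xlabel("True citation count")
ax.set_ylabel("Predicted citation count")
ax.set_title("True vs. predicted citation counts (held-out set)")
plt.tight_layout()
plt.show()
```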

### Authors

[Akhil Pandey](https://github.com/akhilpandey95), [Hamed Alhoori](https://github.com/alhoori)

### Acknowledgement

This work is supported in part by NSF Grant No. [2022443](https://www.nsf.gov/awardsearch/showAward?AWD_ID=2022443&HistoricalAwards=false).

The remaining changed files could not be rendered in this view.