Merge pull request #3 from drcassar/sciglass
Sciglass
drcassar authored Jul 4, 2020
2 parents 3aca830 + 6103f6d commit 2333066
Showing 15 changed files with 599 additions and 29 deletions.
3 changes: 2 additions & 1 deletion MANIFEST.in
@@ -1,2 +1,3 @@
include *.txt
recursive-include doc *.txt
recursive-include doc *
include glasspy/data/datafiles/*
23 changes: 18 additions & 5 deletions README.md
@@ -7,7 +7,7 @@ GlassPy is a Python module for scientists working with glass materials.
The aim is to provide classes and functions written in Python for materials scientists working with glass and non-crystalline materials. The hope is that with an open and collaborative project, we can build a reliable toolset to support faster and reproducible research on this topic.

## How to install
The source code is hosted on GitHub at: https://github.com/drcassar/glasspy.
The source code is hosted on GitHub at https://github.com/drcassar/glasspy.

Binary installers for the latest released version are available at the [Python Package Index](https://pypi.org/project/glasspy/). To install GlassPy using pip run

@@ -20,11 +20,10 @@ To install the latest development version of GlassPy run
```sh
pip install --upgrade git+git://github.com/drcassar/glasspy
```

## Development
GlassPy was born as a personal tool back in 2013 when I started coding with Python. It is based on a colection of MATLAB code that I wrote for the Glass State graduate course of 2010 and for the numerical analysis during my PhD.
GlassPy was born as a personal tool back in 2013 when I started coding with Python. It is based on a collection of MATLAB code that I wrote for the Glass State graduate course of 2010 and the numerical analysis during my PhD.

Right now, I'm sorting all my code and adequately documenting it to build this Python module. My personal objective is to increase the reproducibility of my research and hopefully be useful for researchers working in the field of glass science.
Right now, I'm sorting all my code and adequately documenting it to build this Python module. My personal objective is to increase my research's reproducibility and hopefully be useful for researchers working in the field of glass science.

## Documentation
There is no documentation right now, but all the functions have detailed docstrings.
@@ -42,12 +41,26 @@ Some examples are provided as notebooks in Google Colab (they run in the cloud,
- [SciPy](https://www.scipy.org/)
- [Pandas](https://pandas.pydata.org/)
- [lmfit](https://lmfit.github.io/lmfit-py/)
- [chemparse](https://pypi.org/project/chemparse/)

## Other python repositories for glass science
- [RelaxPy](https://github.com/Mauro-Glass-Group/RelaxPy) - Module to compute glass relaxation kinetics.
- [PyGlass](https://github.com/jrafolsr/PyGlass) - Module to simulate the specific heat signature of glasses with a specified thermal treatment following the Tool-Narayanaswamy-Moynihan model.

## License
## SciGlass database license
[ODbL](https://github.com/drcassar/glasspy/blob/master/glasspy/data/datafiles/LICENSE_sciglass)

ODC Open Database License (ODbL)

Copyright (c) 2019 EPAM Systems

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

## GlassPy license
[GPL](https://github.com/drcassar/glasspy/blob/master/LICENSE)

GlassPy, Python module for scientists working with glass materials. Copyright (C) 2019-2020 Daniel Roberto Cassar
23 changes: 18 additions & 5 deletions README_PyPI.md
@@ -5,7 +5,7 @@ GlassPy is a Python module for scientists working with glass materials.
The aim is to provide classes and functions written in Python for materials scientists working with glass and non-crystalline materials. The hope is that with an open and collaborative project, we can build a reliable toolset to support faster and reproducible research on this topic.

## How to install
The source code is hosted on GitHub at: https://github.com/drcassar/glasspy.
The source code is hosted on GitHub at https://github.com/drcassar/glasspy.

Binary installers for the latest released version are available at the [Python Package Index](https://pypi.org/project/glasspy/). To install GlassPy using pip run

@@ -18,11 +18,10 @@ To install the latest development version of GlassPy run
```sh
pip install --upgrade git+git://github.com/drcassar/glasspy
```

## Development
GlassPy was born as a personal tool back in 2013 when I started coding with Python. It is based on a colection of MATLAB code that I wrote for the Glass State graduate course of 2010 and for the numerical analysis during my PhD.
GlassPy was born as a personal tool back in 2013 when I started coding with Python. It is based on a collection of MATLAB code that I wrote for the Glass State graduate course of 2010 and the numerical analysis during my PhD.

Right now, I'm sorting all my code and adequately documenting it to build this Python module. My personal objective is to increase the reproducibility of my research and hopefully be useful for researchers working in the field of glass science.
Right now, I'm sorting all my code and adequately documenting it to build this Python module. My personal objective is to increase my research's reproducibility and hopefully be useful for researchers working in the field of glass science.

## Documentation
There is no documentation right now, but all the functions have detailed docstrings.
@@ -40,12 +39,26 @@ Some examples are provided as notebooks in Google Colab (they run in the cloud,
- [SciPy](https://www.scipy.org/)
- [Pandas](https://pandas.pydata.org/)
- [lmfit](https://lmfit.github.io/lmfit-py/)
- [chemparse](https://pypi.org/project/chemparse/)

## Other python repositories for glass science
- [RelaxPy](https://github.com/Mauro-Glass-Group/RelaxPy) - Module to compute glass relaxation kinetics.
- [PyGlass](https://github.com/jrafolsr/PyGlass) - Module to simulate the specific heat signature of glasses with a specified thermal treatment following the Tool-Narayanaswamy-Moynihan model.

## License
## SciGlass database license
[ODbL](https://github.com/drcassar/glasspy/blob/master/glasspy/data/datafiles/LICENSE_sciglass)

ODC Open Database License (ODbL)

Copyright (c) 2019 EPAM Systems

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

## GlassPy license
[GPL](https://github.com/drcassar/glasspy/blob/master/LICENSE)

GlassPy, Python module for scientists working with glass materials. Copyright (C) 2019-2020 Daniel Roberto Cassar
18 changes: 10 additions & 8 deletions glasspy/data/analysis.py
@@ -2,20 +2,22 @@
from scipy.spatial.distance import cdist


def relativeNeighborhoodDeviation(X,
Y,
distance_threshold,
metric='euclidean'):
def relativeNeighborhoodDeviation(
X,
Y,
distance_threshold,
metric='euclidean',
):
'''Computes the Relative Neighbourhood Deviation (RND).
RND is used to check the intrinsic deviation in the data. An example of it
being used can be seen in Figure 3 from Ref. [1].
RND is used to check the intrinsic deviation in the data. See Ref. [1] for
an example.
Parameters
----------
X : n-d array
Values of the features (or independent variable). This function uses a
lot of RAM depending on the size of X.
Values of the features (or independent variable). Beware! This function
uses a lot of RAM depending on the size of X.
Y : 1-d array
Values of the target (or dependent variable).
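The RAM warning in the docstring can be made concrete. The body of `relativeNeighborhoodDeviation` is not shown in this hunk, so the following is only a sketch under the assumption that it builds a full pairwise distance matrix with `cdist` (which the file imports). The helper `pairwise_matrix_bytes` is a hypothetical name introduced for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist


def pairwise_matrix_bytes(n_samples, itemsize=8):
    """Bytes needed for a dense n_samples x n_samples float64 matrix.

    Hypothetical helper, not part of GlassPy.
    """
    return n_samples * n_samples * itemsize


# Small random feature matrix: 200 samples, 5 features (made-up data).
rng = np.random.default_rng(0)
X = rng.random((200, 5))

# cdist returns a dense (200, 200) array of float64 distances.
D = cdist(X, X, metric='euclidean')

print(D.shape, D.nbytes, pairwise_matrix_bytes(200))

# The cost grows quadratically: 100,000 samples would need about 80 GB.
print(pairwise_matrix_bytes(100_000) / 1e9, 'GB')
```

If the function does follow this pattern, subsampling `X` or processing it in chunks would be the usual ways to keep memory bounded.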
21 changes: 21 additions & 0 deletions glasspy/data/datafiles/LICENSE_sciglass
@@ -0,0 +1,21 @@
ODC Open Database License (ODbL)

Copyright (c) 2019 EPAM Systems

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Binary file added glasspy/data/datafiles/sciglass.csv.xz
Binary file not shown.
Binary file added glasspy/data/datafiles/sciglass_atfrac.csv.xz
Binary file not shown.
Binary file added glasspy/data/datafiles/sciglass_comp.csv.xz
Binary file not shown.
Binary file added glasspy/data/datafiles/viscosity_at_frac.csv.xz
Binary file not shown.
Binary file added glasspy/data/datafiles/viscosity_compounds.csv.xz
Binary file not shown.
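The data files added above are xz-compressed CSVs. As a rough illustration of why `load.py` can pass such paths straight to `pd.read_csv`, the sketch below round-trips a toy table through a `.csv.xz` file; pandas infers the compression from the file extension by default. The table contents and the file name `toy.csv.xz` are made up for the example.

```python
import os
import tempfile

import pandas as pd

# Toy table standing in for a shipped data file (values are made up).
toy = pd.DataFrame(
    {'Tg': [800.0, 950.0], 'Year': [1990, 2005]},
    index=[10, 11],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'toy.csv.xz')
    # Compression is inferred from the ".xz" suffix on both write and read.
    toy.to_csv(path)
    loaded = pd.read_csv(path, index_col=0)

print(loaded)
```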
200 changes: 200 additions & 0 deletions glasspy/data/load.py
@@ -0,0 +1,200 @@
#!/usr/bin/env python3

import pandas as pd
import numpy as np
import os

from .manipulate import removeColumnsWithOnlyZerosMultiIndex

__cur_path = os.path.dirname(__file__)
SCIGLASS_DATABASE_PATH = os.path.join(__cur_path, 'datafiles/sciglass.csv.xz')
SCIGLASS_COMP_DATABASE_PATH = os.path.join(__cur_path, 'datafiles/sciglass_comp.csv.xz')
SCIGLASS_ATFRAC_DATABASE_PATH = os.path.join(__cur_path, 'datafiles/sciglass_atfrac.csv.xz')


def sciglass(load_compounds=False, load_atomic_fraction=True):
    """Load SciGlass data into a pandas DataFrame.

    SciGlass is a database of glass properties Copyright (c) 2019 EPAM Systems
    and licensed under ODC Open Database License (ODbL). The database is hosted
    on GitHub [1]. A portion of the SciGlass database is shipped with GlassPy,
    so no additional downloads are necessary.

    This function returns a MultiIndex pandas DataFrame. The first-level
    indexes are:
        at_frac : relative to the atomic fraction of the chemical elements that
            make the glass. Only available if "load_atomic_fraction" is True.
        comp : relative to the chemical compounds that make the glass. Only
            available if "load_compounds" is True.
        meta : metadata.
        prop : properties.

    The property column names are:
        RefractiveIndex : refractive index measured at a wavelength of
            589.3 nm. Dimensionless.
        AbbeNumber : Abbe number. Dimensionless.
        CTE : linear coefficient of thermal expansion below the glass transition
            temperature. Unit: K^{-1}.
        ElasticModulus : Elastic or Young's Modulus. Unit: GPa.
        Tg : glass transition temperature. Unit: K.
        Tliquidus : liquidus temperature. Unit: K.
        T0 to T12 : "Tn" is the temperature where the base-10 logarithm of
            viscosity (in Pa.s) is "n". Example: T4 is the temperature where
            log10(viscosity) = 4. Unit: K.
        ViscosityAt773K to ViscosityAt2473K : value of base-10 logarithm of
            viscosity (in Pa.s) at a certain temperature. Example:
            ViscosityAt1073K is the log10(viscosity) at 1073 Kelvin.
            Dimensionless.

    Parameters
    ----------
    load_compounds : bool, False
        If True then chemical compounds are loaded and added to the DataFrame.

    load_atomic_fraction : bool, True
        If True then the atomic fractions are loaded and added to the DataFrame.

    Returns
    -------
    data : pandas DataFrame
        MultiIndex DataFrame containing a portion of the SciGlass database.

    References
    ----------
    [1] Epam/SciGlass. 2019. EPAM Systems, 2019.
        https://github.com/epam/SciGlass.
    """
    data = pd.read_csv(SCIGLASS_DATABASE_PATH, index_col=0)
    metadata_index = ['ChemicalAnalysis', 'Author', 'Year']
    property_index = np.array(sorted(set(data.columns) - set(metadata_index)))
    d = {}

    if load_atomic_fraction:
        data_af = pd.read_csv(SCIGLASS_ATFRAC_DATABASE_PATH, index_col=0)
        data['NumberChemicalElements'] = data_af.astype('bool').sum(axis=1)
        metadata_index.append('NumberChemicalElements')
        d['at_frac'] = data_af

    if load_compounds:
        data_c = pd.read_csv(SCIGLASS_COMP_DATABASE_PATH, index_col=0)
        data['NumberCompounds'] = data_c.astype('bool').sum(axis=1)
        metadata_index.append('NumberCompounds')
        d['comp'] = data_c

    d['meta'] = data[metadata_index]
    d['prop'] = data[property_index]
    data = pd.concat(d, axis=1)

    return data


def sciglassOxides(
        minimum_fraction_oxygen=0.3,
        elements_to_remove=['S', 'H', 'C', 'Pt', 'Au', 'F', 'Cl', 'N', 'Br', 'I'],
        load_compounds=False,
):
    '''Load only the oxides from the SciGlass database into a pandas DataFrame.

    The default settings of this function follow the definition of an oxide
    glass used in [1]. These can be changed with the parameters of the function.

    SciGlass is a database of glass properties Copyright (c) 2019 EPAM Systems
    and licensed under ODC Open Database License (ODbL). The database is hosted
    on GitHub [2]. A portion of the SciGlass database is shipped with GlassPy,
    so no additional downloads are necessary.

    This function returns a MultiIndex pandas DataFrame. The first-level
    indexes are:
        at_frac : relative to the atomic fraction of the chemical elements that
            make the glass. Always loaded by this function.
        comp : relative to the chemical compounds that make the glass. Only
            available if "load_compounds" is True.
        meta : metadata.
        prop : properties.

    The property column names are:
        RefractiveIndex : refractive index measured at a wavelength of
            589.3 nm. Dimensionless.
        AbbeNumber : Abbe number. Dimensionless.
        CTE : linear coefficient of thermal expansion below the glass transition
            temperature. Unit: K^{-1}.
        ElasticModulus : Elastic or Young's Modulus. Unit: GPa.
        Tg : glass transition temperature. Unit: K.
        Tliquidus : liquidus temperature. Unit: K.
        T0 to T12 : "Tn" is the temperature where the base-10 logarithm of
            viscosity (in Pa.s) is "n". Example: T4 is the temperature where
            log10(viscosity) = 4. Unit: K.
        ViscosityAt773K to ViscosityAt2473K : value of base-10 logarithm of
            viscosity (in Pa.s) at a certain temperature. Example:
            ViscosityAt1073K is the log10(viscosity) at 1073 Kelvin.
            Dimensionless.

    Parameters
    ----------
    minimum_fraction_oxygen : float
        Minimum atomic fraction of oxygen for the glass to be considered an
        oxide. A value between 0 and 1 is expected.

    elements_to_remove : list or None or False
        Iterable with the chemical elements (strings) that must not be present
        in the glass. If None or False then no chemical element is removed.
        Default value is ['S', 'H', 'C', 'Pt', 'Au', 'F', 'Cl', 'N', 'Br', 'I'].

    load_compounds : bool
        If True then chemical compounds are loaded and added to the DataFrame.
        Default value is False.

    Returns
    -------
    data : pandas DataFrame
        MultiIndex DataFrame containing a portion of the SciGlass database.

    References
    ----------
    [1] Alcobaça, E., Mastelini, S.M., Botari, T., Pimentel, B.A., Cassar, D.R.,
        de Carvalho, A.C.P. de L.F., and Zanotto, E.D. (2020). Explainable
        Machine Learning Algorithms For Predicting Glass Transition
        Temperatures. Acta Materialia 188, 92–100.
    [2] Epam/SciGlass. 2019. EPAM Systems, 2019.
        https://github.com/epam/SciGlass.
    '''
    data = sciglass(load_compounds, load_atomic_fraction=True)
    logic = data['at_frac']['O'] >= minimum_fraction_oxygen
    data = data.loc[data[logic].index]

    if elements_to_remove:
        for el in elements_to_remove:
            logic = data['at_frac'][el] == 0
            data = data.loc[data[logic].index]

    data = removeColumnsWithOnlyZerosMultiIndex(data, 'at_frac')

    if load_compounds:
        data = removeColumnsWithOnlyZerosMultiIndex(data, 'comp')

    return data
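To see how the MultiIndex layout and the oxide filter fit together without the shipped data files, here is a small sketch on made-up numbers. It builds a two-level DataFrame the same way `sciglass` does (`pd.concat` of a dict along `axis=1`) and then applies the same kind of row selection as `sciglassOxides`. The compositions, the `Tg` values, and the variable names are all illustrative assumptions.

```python
import pandas as pd

# Made-up atomic fractions for three glasses (rows) and four elements.
at_frac = pd.DataFrame(
    {
        'O': [0.60, 0.10, 0.65],
        'Si': [0.30, 0.00, 0.25],
        'Na': [0.10, 0.00, 0.00],
        'F': [0.00, 0.90, 0.10],
    },
    index=[1, 2, 3],
)
prop = pd.DataFrame({'Tg': [800.0, 500.0, 700.0]}, index=[1, 2, 3])

# Dict keys become the first level of the column MultiIndex.
data = pd.concat({'at_frac': at_frac, 'prop': prop}, axis=1)

# First-level access returns an ordinary DataFrame for that group.
print(data['at_frac'].columns.tolist())

# Oxide filter: enough oxygen and none of the excluded elements.
mask = data['at_frac']['O'] >= 0.3
for el in ['F']:
    mask = mask & (data['at_frac'][el] == 0)
oxides = data.loc[mask]

print(oxides.index.tolist())
```

Only glass 1 survives here: glass 2 has too little oxygen and glass 3 contains fluorine. Combining the conditions into one boolean mask selects the same rows as the repeated `data.loc[data[logic].index]` steps in the commit.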
25 changes: 25 additions & 0 deletions glasspy/data/manipulate.py
@@ -0,0 +1,25 @@
#!/usr/bin/env python3

def removeColumnsWithOnlyZerosMultiIndex(data, first_index):
    '''Remove columns with only zeros in MultiIndex pandas DataFrames.

    Parameters
    ----------
    data : DataFrame
        MultiIndex dataframe.

    first_index : string
        Name of the first-level index to search for columns with only zeros.

    Returns
    -------
    data_clean : DataFrame
        DataFrame with columns with only zeros removed.
    '''
    nonzero_cols_bool = data[first_index].sum(axis=0).astype(bool)
    zero_cols = data[first_index].columns.values[~nonzero_cols_bool]
    drop_cols = [(first_index, col) for col in zero_cols]
    data_clean = data.drop(drop_cols, axis=1)

    return data_clean
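The helper above detects all-zero columns through the column sum. That is safe for non-negative data such as atomic fractions, but a column whose entries cancel (for example `1.0` and `-1.0`) would also be dropped. The sketch below contrasts the sum-based test with an element-wise test on toy data; `drop_by_sum` mirrors the logic in the commit, while `drop_by_all_zero` is a hypothetical alternative, not part of GlassPy.

```python
import pandas as pd


def drop_by_sum(data, first_index):
    """Mirror of the committed logic: a column counts as zero if its sum is 0."""
    sub = data[first_index]
    nonzero = sub.sum(axis=0).astype(bool).to_numpy()
    zero_cols = sub.columns.values[~nonzero]
    return data.drop(columns=[(first_index, col) for col in zero_cols])


def drop_by_all_zero(data, first_index):
    """Hypothetical variant: a column counts as zero only if every entry is 0."""
    sub = data[first_index]
    all_zero = (sub == 0).all(axis=0).to_numpy()
    zero_cols = sub.columns.values[all_zero]
    return data.drop(columns=[(first_index, col) for col in zero_cols])


# Toy data: 'F' is truly all zeros, 'delta' only sums to zero.
at_frac = pd.DataFrame({'O': [0.6, 0.5], 'Si': [0.4, 0.5], 'F': [0.0, 0.0]})
prop = pd.DataFrame({'delta': [1.0, -1.0]})
data = pd.concat({'at_frac': at_frac, 'prop': prop}, axis=1)

# Both approaches agree on non-negative data.
print(drop_by_sum(data, 'at_frac').columns.tolist())
print(drop_by_all_zero(data, 'at_frac').columns.tolist())

# They differ when entries cancel out.
print(drop_by_sum(data, 'prop').columns.tolist())
print(drop_by_all_zero(data, 'prop').columns.tolist())
```

For the `at_frac` and `comp` groups used in this commit, which hold non-negative fractions, the two tests should give the same result.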