Commit
Short expansion of conditional mutual information (#387)
* wip...

* Reorganize into separate file for the SECMI test

* tests, docs

* changelog + version

* Slightly reorganize tests

* fix tests

* docs

* no need to store shuffles

* examples

* fix deprecated syntax

* reproducible tests

* Add cross-references

* documentation example for SECMI

* Actually show docstrings

* docstring

* docstring

* add relevant imports

* reproducible tests

* better description

* Fix implementation for mu < 0

* CI badge for main branch only

* Update changelog and version

* Increase sample size to have enough points to get consistent results

* better tests for secmi

* a comment explaining the marginal selection

* Add min/max variables for SECMI

* Use SECMITest in oce tests
kahaaga authored Nov 20, 2024
1 parent 1054b8f commit 7bfb8b9
Showing 22 changed files with 479 additions and 97 deletions.
2 changes: 1 addition & 1 deletion Project.toml
@@ -2,7 +2,7 @@ name = "Associations"
uuid = "614afb3a-e278-4863-8805-9959372b9ec2"
authors = ["Kristian Agasøster Haaga <[email protected]>", "Tor Einar Møller <[email protected]>", "George Datseris <[email protected]>"]
repo = "https://github.com/kahaaga/Associations.jl.git"
-version = "4.3.0"
+version = "4.4.0"

[deps]
Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"
2 changes: 1 addition & 1 deletion README.md
@@ -1,6 +1,6 @@
# Associations

-[![CI](https://github.com/juliadynamics/Associations.jl/workflows/CI/badge.svg)](https://github.com/JuliaDynamics/Associations.jl/actions)
+[![CI (main)](https://github.com/juliadynamics/Associations.jl/workflows/CI/badge.svg?branch=main)](https://github.com/JuliaDynamics/Associations.jl/actions)
[![](https://img.shields.io/badge/docs-latest_tagged-blue.svg)](https://juliadynamics.github.io/Associations.jl/stable/)
[![](https://img.shields.io/badge/docs-dev_(main)-blue.svg)](https://juliadynamics.github.io/Associations.jl/dev/)
[![codecov](https://codecov.io/gh/JuliaDynamics/Associations.jl/branch/main/graph/badge.svg?token=0b71n6x6AP)](https://codecov.io/gh/JuliaDynamics/Associations.jl)
5 changes: 5 additions & 0 deletions changelog.md
@@ -2,6 +2,11 @@

From version v4.0 onwards, this package has been renamed to Associations.jl.

# 4.4

- New association measure: `SECMI` (`ShortExpansionConditionalMutualInformation`)
- New independence test: `SECMITest`, which is based on `SECMI`.

# 4.3

- Compatibility with StateSpaceSets.jl v2.X
10 changes: 10 additions & 0 deletions docs/refs.bib
@@ -1333,4 +1333,14 @@ @article{Azadkia2021
pages={3070--3102},
year={2021},
publisher={Institute of Mathematical Statistics}
}

@article{Kubkowski2021,
title={How to gain on power: novel conditional independence tests based on short expansion of conditional mutual information},
author={Kubkowski, Mariusz and Mielniczuk, Jan and Teisseyre, Pawe{\l}},
journal={Journal of Machine Learning Research},
volume={22},
number={62},
pages={1--57},
year={2021}
}
6 changes: 6 additions & 0 deletions docs/src/associations.md
@@ -107,6 +107,12 @@ EmbeddingTE
PartialMutualInformation
```

### Short expansion of conditional mutual information

```@docs
ShortExpansionConditionalMutualInformation
```

## [Correlation measures](@id correlation_api)

```@docs
18 changes: 18 additions & 0 deletions docs/src/examples/examples_associations.md
@@ -1093,6 +1093,24 @@ est = MIDecomposition(CMIShannon(base = 2), KSG1(k = 10))
association(est, x, z, y)
```

## [`ShortExpansionConditionalMutualInformation`](@ref)

### [[`JointProbabilities`](@ref) with [`CodifyVariables`](@ref) and [`ValueBinning`](@ref)](@id example_ShortExpansionConditionalMutualInformation_JointProbabilities_CodifyVariables_ValueBinning)

```@example
using Associations
using Test
using Random; rng = Xoshiro(1234)
n = 20
x = rand(rng, n)
y = randn(rng, n) .+ x .^ 2
z = randn(rng, n) .* y
# An estimator for estimating the SECMI measure
est = JointProbabilities(SECMI(base = 2), CodifyVariables(ValueBinning(3)))
association(est, x, z, y)
```

### [[`EntropyDecomposition`](@ref) + [`Kraskov`](@ref)](@id example_CMIShannon_EntropyDecomposition_Kraskov)

Any [`DifferentialInfoEstimator`](@ref) can also be used to compute conditional
52 changes: 51 additions & 1 deletion docs/src/examples/examples_independence.md
@@ -469,4 +469,54 @@ connecting `x` and `z`.)
independence(test, x, z, y)
```

-The test verifies our expectation.
+The test verifies our expectation.
## [[`SECMITest`](@ref)](@id example_SECMITEST)

### [[`JointProbabilities`](@ref) estimation on numeric data](@id example_SECMITEST_JointProbabilities_CodifyVariables_ValueBinning)

```@example example_SECMITEst
using Associations
using Test
using Random; rng = Xoshiro(1234)
n = 25
x = rand(rng, n)
y = randn(rng, n) .+ x .^ 2
z = randn(rng, n) .* y
# An estimator for estimating the SECMI measure
est = JointProbabilities(SECMI(base = 2), CodifyVariables(ValueBinning(3)))
test = SECMITest(est; nshuffles = 19)
```

When analyzing ``SECMI(x, y | z)``, the expectation is to reject the null hypothesis (independence), since `x` and `y` are connected, regardless of the effect of `z`.

```@example example_SECMITEst
independence(test, x, y, z)
```

We can detect this association, even for `n = 25`! When analyzing ``SECMI(x, z | y)``, we
expect that we can't reject the null (independence), precisely because `x` and `z` are *not*
connected when "conditioning away" `y`.

```@example example_SECMITEst
independence(test, x, z, y)
```

### [[`JointProbabilities`](@ref) estimation on categorical data](@id example_SECMITEST_JointProbabilities_CodifyVariables_UniqueElements)

Note that this also works for categorical variables. Just use [`UniqueElements`](@ref) to
discretize!

```@example example_SECMITest_categorical
using Associations
using Test
using Random; rng = Xoshiro(1234)
n = 24
x = rand(rng, ["vegetables", "candy"], n)
y = [xᵢ == "candy" && rand(rng) > 0.3 ? "yummy" : "yuck" for xᵢ in x]
z = [yᵢ == "yummy" && rand(rng) > 0.6 ? "grown-up" : "child" for yᵢ in y]
d = CodifyVariables(UniqueElements())
est = JointProbabilities(SECMI(base = 2), d)
independence(SECMITest(est; nshuffles = 19), x, z, y)
```
7 changes: 7 additions & 0 deletions docs/src/independence.md
@@ -50,3 +50,10 @@ JDDTestResult
CorrTest
CorrTestResult
```

## [`SECMITest`](@ref)

```@docs
SECMITest
SECMITestResult
```
2 comments on commit 7bfb8b9

@kahaaga
@JuliaRegistrator
Registration pull request created: JuliaRegistries/General/119881

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

@JuliaRegistrator register

Release notes:

## Breaking changes

- blah

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v4.4.0 -m "<description of version>" 7bfb8b927d41cecb4fe467837308cf64d54b3174
git push origin v4.4.0