Fix bug in bivariate MI/TE estimation and min stats
Fix bug in bivariate MI/TE estimation. Conditioning was not performed
correctly. Find non-uniform embedding separately for each link. This
requires conditioning on the target's past for TE and conditioning on
all variables already selected for this link.
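
The per-link conditioning described above can be sketched as follows. This is only an illustration, not IDTxl code: the helper name `conditioning_set_for_link` and the representation of variables as `(process, lag)` tuples are assumptions made for this example.

```python
# Hypothetical helper, not part of IDTxl. Variables are (process, lag) tuples.
def conditioning_set_for_link(selected_vars, source, target, measure="te"):
    """Return the variables to condition on when testing one source-target link.

    Only variables already selected for this link are used; for TE the
    variables selected from the target's own past are added as well.
    """
    cond = [v for v in selected_vars if v[0] == source]
    if measure == "te":
        cond += [v for v in selected_vars if v[0] == target]
    return sorted(cond)


# Process 0 is the target; processes 1 and 2 are potential sources.
selected = [(0, 1), (0, 2), (1, 1), (2, 3)]
te_cond = conditioning_set_for_link(selected, source=1, target=0)
mi_cond = conditioning_set_for_link(selected, source=1, target=0, measure="mi")
print(te_cond)  # variables of source 1 plus the target's past
print(mi_cond)  # variables of source 1 only
```

Variables selected from other sources (here process 2) never enter the conditional of the link from process 1, which is what makes the estimate bivariate.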

Fix bug in minimum statistics: conditional for surrogate creation should
not contain the minimum candidate. Otherwise the minimum candidate's CMI
is calculated with a conditional that is smaller by one dimension
compared to the conditional used for surrogate creation.
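
A minimal sketch of the corrected behaviour, under stated assumptions: the function name `min_candidate_and_conditional` and the callable `cmi(candidate, conditional)` are hypothetical stand-ins for an estimator, and the toy CMI values are made up for illustration.

```python
# Hypothetical sketch, not IDTxl code.
def min_candidate_and_conditional(selected_vars, cmi):
    """Find the variable with the smallest CMI and the matching conditional.

    `cmi(candidate, conditional)` is assumed to return the CMI between
    `candidate` and the current value, given `conditional`.
    """
    scores = {}
    for candidate in selected_vars:
        # Each candidate is tested conditional on all other selected variables.
        conditional = [v for v in selected_vars if v != candidate]
        scores[candidate] = cmi(candidate, conditional)
    min_candidate = min(scores, key=scores.get)
    # Surrogates must use the same conditional as the original estimate,
    # i.e. the selected set without the minimum candidate itself.
    surrogate_conditional = [v for v in selected_vars if v != min_candidate]
    return min_candidate, scores[min_candidate], surrogate_conditional


# Toy CMI values keyed by (process, lag); purely illustrative.
toy_values = {(0, 1): 0.30, (1, 1): 0.05, (1, 2): 0.12}


def toy_cmi(candidate, conditional):
    assert candidate not in conditional
    return toy_values[candidate]


cand, value, cond = min_candidate_and_conditional(list(toy_values), toy_cmi)
print(cand, value, cond)
```

Keeping the minimum candidate out of `surrogate_conditional` means the original CMI and the surrogate CMIs are estimated with conditionals of the same dimension.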

Add unit tests and change documentation.
pwollstadt committed Aug 19, 2018
1 parent 3596453 commit 634a769
Showing 10 changed files with 735 additions and 238 deletions.
23 changes: 16 additions & 7 deletions idtxl/bivariate_mi.py
@@ -170,15 +170,22 @@ def analyse_single_target(self, settings, data, target, sources='all'):
 processes and the target process. Uses bivariate, non-uniform embedding
 found through information maximisation
-MI is calculated in two steps:
+MI is calculated in three steps:
-(1) find all relevant samples in a single source processes' past, by
-iteratively adding candidate samples that have significant
+(1) find all relevant variables in a single source processes' past, by
+iteratively adding candidate variables that have significant
 conditional mutual information (CMI) with the current value
-(conditional on all samples that were added previously)
-(2) statistics on the final set of sources (test for over-all transfer
+(conditional on all variables that were added previously)
+(2) prune the final conditional set for each link (i.e., each
+process-target pairing): test the CMI between each variable in
+the final set and the current value, conditional on all other
+variables in the final set of the current link; treat each
+potential source process separately, i.e., the CMI is calculated
+with respect to already selected variables the current processes'
+past only
+(3) statistics on the final set of sources (test for over-all transfer
 between the final conditional set and the current value, and for
-significant transfer of all individual samples in the set)
+significant transfer of all individual variables in the set)
 Note:
 For a further description of the algorithm see references in the
@@ -263,7 +270,9 @@ class docstring.
 # Main algorithm.
 print('\n---------------------------- (1) include source candidates')
 self._include_source_candidates(data)
-print('\n---------------------------- (2) omnibus test')
+print('\n---------------------------- (2) prune cadidates')
+self._prune_candidates(data)
+print('\n---------------------------- (3) final statistics')
 self._test_final_conditional(data)

 # Clean up and return results.
25 changes: 17 additions & 8 deletions idtxl/bivariate_te.py
@@ -176,15 +176,22 @@ def analyse_single_target(self, settings, data, target, sources='all'):
 Bivariate TE is calculated in four steps:
-(1) find all relevant samples in the target processes' own past, by
-iteratively adding candidate samples that have significant
+(1) find all relevant variables in the target processes' own past, by
+iteratively adding candidate variables that have significant
 conditional mutual information (CMI) with the current value
-(conditional on all samples that were added previously)
-(2) find all relevant samples in the single source processes' pasts
-(again by finding all candidates with significant CMI)
-(3) statistics on the final set of sources (test for over-all transfer
+(conditional on all variables that were added previously)
+(2) find all relevant variables in the single source processes' pasts
+(again by finding all candidates with significant CMI); treat each
+potential source process separately, i.e., the CMI is calculated
+with respect to already selected variables from the target's past
+and from the current processes' past only
+(3) prune the final conditional set for each link (i.e., each
+process-target pairing): test the CMI between each variable in
+the final set and the current value, conditional on all other
+variables in the final set of the current link
+(4) statistics on the final set of sources (test for over-all transfer
 between the final conditional set and the current value, and for
-significant transfer of all individual samples in the set)
+significant transfer of all individual variables in the set)
 Note:
 For a further description of the algorithm see references in the
@@ -268,7 +275,9 @@ class docstring.
 self._include_target_candidates(data)
 print('\n---------------------------- (2) include source candidates')
 self._include_source_candidates(data)
-print('\n---------------------------- (3) omnibus test')
+print('\n---------------------------- (3) prune cadidates')
+self._prune_candidates(data)
+print('\n---------------------------- (4) final statistics')
 self._test_final_conditional(data)

 # Clean up and return results.
14 changes: 7 additions & 7 deletions idtxl/multivariate_te.py
@@ -179,18 +179,18 @@ def analyse_single_target(self, settings, data, target, sources='all'):
 through information maximisation. Multivariate TE is calculated in four
 steps:
-(1) find all relevant samples in the target processes' own past, by
-iteratively adding candidate samples that have significant
+(1) find all relevant variables in the target processes' own past, by
+iteratively adding candidate variables that have significant
 conditional mutual information (CMI) with the current value
-(conditional on all samples that were added previously)
-(2) find all relevant samples in the source processes' pasts (again
+(conditional on all variables that were added previously)
+(2) find all relevant variables in the source processes' pasts (again
 by finding all candidates with significant CMI)
 (3) prune the final conditional set by testing the CMI between each
-sample in the final set and the current value, conditional on all
-other samples in the final set
+variable in the final set and the current value, conditional on all
+other variables in the final set
 (4) statistics on the final set of sources (test for over-all transfer
 between the final conditional set and the current value, and for
-significant transfer of all individual samples in the set)
+significant transfer of all individual variables in the set)
 Note:
 For a further description of the algorithm see references in the