Commit

added bootstrapping tutorial
jakobrunge committed Aug 3, 2023
1 parent 09e5397 commit 494b8bd
Showing 32 changed files with 1,512 additions and 1,359 deletions.
Binary file modified docs/_build/.doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/_build/.doctrees/index.doctree
Binary file not shown.
1 change: 1 addition & 0 deletions docs/_build/_modules/index.html
@@ -52,6 +52,7 @@ <h1>All modules for which code is available</h1>
<li><a href="tigramite/models.html">tigramite.models</a></li>
<li><a href="tigramite/pcmci.html">tigramite.pcmci</a></li>
<li><a href="tigramite/plotting.html">tigramite.plotting</a></li>
<li><a href="tigramite/rpcmci.html">tigramite.rpcmci</a></li>
<li><a href="tigramite/toymodels/structural_causal_processes.html">tigramite.toymodels.structural_causal_processes</a></li>
</ul>

117 changes: 59 additions & 58 deletions docs/_build/_modules/tigramite/data_processing.html
@@ -58,18 +58,18 @@ <h1>Source code for tigramite.data_processing</h1><div class="highlight"><pre>
<span class="sd"> ----------</span>
<span class="sd"> data : array-like</span>
<span class="sd"> if analysis_mode == &#39;single&#39;:</span>
<span class="sd"> 1) Numpy array of shape (observations T, variables N)</span>
<span class="sd"> OR</span>
<span class="sd"> 2) Dictionary with a single entry whose value is a numpy array of</span>
<span class="sd"> shape (observations T, variables N)</span>
<span class="sd"> Numpy array of shape (observations T, variables N)</span>
<span class="sd"> OR</span>
<span class="sd"> Dictionary with a single entry whose value is a numpy array of</span>
<span class="sd"> shape (observations T, variables N)</span>
<span class="sd"> if analysis_mode == &#39;multiple&#39;:</span>
<span class="sd"> 1) Numpy array of shape (multiple datasets M, observations T,</span>
<span class="sd"> variables N)</span>
<span class="sd"> OR</span>
<span class="sd"> 2) Dictionary whose values are numpy arrays of shape</span>
<span class="sd"> (observations T_i, variables N), where the number of observations</span>
<span class="sd"> T_i may vary across the multiple datasets but the number of variables</span>
<span class="sd"> N is fixed. </span>
<span class="sd"> Numpy array of shape (multiple datasets M, observations T,</span>
<span class="sd"> variables N)</span>
<span class="sd"> OR</span>
<span class="sd"> Dictionary whose values are numpy arrays of shape</span>
<span class="sd"> (observations T_i, variables N), where the number of observations</span>
<span class="sd"> T_i may vary across the multiple datasets but the number of variables</span>
<span class="sd"> N is fixed. </span>
<span class="sd"> mask : array-like, optional (default: None)</span>
<span class="sd"> Optional mask array, must be of same format and shape as data.</span>
<span class="sd"> data_type : array-like</span>
@@ -114,18 +114,18 @@ <h1>Source code for tigramite.data_processing</h1><div class="highlight"><pre>
<span class="sd"> At least one value must be in [0, 1, ..., T_max-1].</span>
<span class="sd"> time_offsets : None or dict, optional (default: None)</span>
<span class="sd"> if analysis_mode == &#39;single&#39;:</span>
<span class="sd"> Must be None.</span>
<span class="sd"> Shared time axis defined by the time indices of the single time series</span>
<span class="sd"> Must be None.</span>
<span class="sd"> Shared time axis defined by the time indices of the single time series</span>
<span class="sd"> if analysis_mode == &#39;multiple&#39; and data is numpy array:</span>
<span class="sd"> Must be None.</span>
<span class="sd"> All datasets are assumed to be already aligned in time with</span>
<span class="sd"> respect to a shared time axis, which is the time axis of data</span>
<span class="sd"> Must be None.</span>
<span class="sd"> All datasets are assumed to be already aligned in time with</span>
<span class="sd"> respect to a shared time axis, which is the time axis of data</span>
<span class="sd"> if analysis_mode == &#39;multiple&#39; and data is dictionary:</span>
<span class="sd"> Must be dictionary of the form {key(m): time_offset(m), ...} whose</span>
<span class="sd"> set of keys agrees with the set of keys of data and whose values are</span>
<span class="sd"> non-negative integers, at least one of which is 0. The value</span>
<span class="sd"> time_offset(m) defines the time offset of dataset m with</span>
<span class="sd"> respect to a shared time axis.</span>
<span class="sd"> Must be dictionary of the form {key(m): time_offset(m), ...} whose</span>
<span class="sd"> set of keys agrees with the set of keys of data and whose values are</span>
<span class="sd"> non-negative integers, at least one of which is 0. The value</span>
<span class="sd"> time_offset(m) defines the time offset of dataset m with</span>
<span class="sd"> respect to a shared time axis.</span>

<span class="sd"> Attributes</span>
<span class="sd"> ----------</span>
@@ -135,16 +135,11 @@ <h1>Source code for tigramite.data_processing</h1><div class="highlight"><pre>
<span class="sd"> self.values : dictionary</span>
<span class="sd"> Dictionary holding the observations given by data internally mapped to a</span>
<span class="sd"> dictionary representation as follows:</span>
<span class="sd"> If analysis_mode == &#39;single&#39;:</span>
<span class="sd"> If self._initialized_from == &#39;2d numpy array&#39;:</span>
<span class="sd"> Is {0: data}</span>
<span class="sd"> If self._initialized_from == &#39;dict&#39;:</span>
<span class="sd"> Is data</span>
<span class="sd"> If analysis_mode == &#39;multiple&#39;:</span>
<span class="sd"> If self._initialized_from == &#39;3d numpy array&#39;:</span>
<span class="sd"> Is {m: data[m, :, :] for m in range(data.shape[0])}</span>
<span class="sd"> If self._initialized_from == &#39;dict&#39;:</span>
<span class="sd"> Is data</span>
<span class="sd"> If analysis_mode == &#39;single&#39;: for self._initialized_from == &#39;2d numpy array&#39; this</span>
<span class="sd"> is {0: data} and for self._initialized_from == &#39;dict&#39; this is data.</span>
<span class="sd"> If analysis_mode == &#39;multiple&#39;: If self._initialized_from == &#39;3d numpy array&#39;, this is</span>
<span class="sd"> {m: data[m, :, :] for m in range(data.shape[0])} and for self._initialized_from == &#39;dict&#39; this</span>
<span class="sd"> is data.</span>
<span class="sd"> self.datasets: list</span>
<span class="sd"> List of the keys identifying the multiple datasets, i.e.,</span>
<span class="sd"> list(self.values.keys())</span>
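The docstring above describes how the input data is mapped to the internal dictionary `self.values`. A minimal sketch of that mapping follows, assuming only what the docstring states. The helper name `to_values_dict` is hypothetical and is not part of tigramite; nested lists stand in for numpy arrays so that the sketch runs without third-party packages.

```python
def to_values_dict(data, analysis_mode="single"):
    """Map array-like or dict input to a {dataset_key: 2-D data} dictionary.

    Hypothetical helper that mirrors the mapping described in the docstring;
    it is not the tigramite implementation.
    """
    if isinstance(data, dict):
        # Dictionary input is used as-is in both analysis modes.
        return data
    if analysis_mode == "single":
        # A single (T, N) array is stored under the key 0.
        return {0: data}
    if analysis_mode == "multiple":
        # An (M, T, N) array is split along its first axis, one entry per dataset.
        return {m: data[m] for m in range(len(data))}
    raise ValueError("analysis_mode must be 'single' or 'multiple'")


# Two datasets (M=2), each with T=2 observations of N=2 variables.
stacked = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
values = to_values_dict(stacked, analysis_mode="multiple")
datasets = list(values.keys())  # corresponds to self.datasets
```

Under these assumptions, `datasets` holds the keys `0` and `1`, matching the description of `self.datasets` as `list(self.values.keys())`.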
@@ -628,44 +623,50 @@ <h1>Source code for tigramite.data_processing</h1><div class="highlight"><pre>
<span class="sd"> Whether to perform sanity checks on input X,Y,Z</span>
<span class="sd"> remove_overlaps : bool, optional (default: True)</span>
<span class="sd"> Whether to remove variables from Z/extraZ if they overlap with X or Y.</span>
<span class="sd"> cut_off : {&#39;2xtau_max&#39;, &#39;tau_max&#39;, &#39;max_lag&#39;, &#39;max_lag_or_tau_max&#39;,</span>
<span class="sd"> 2xtau_max_future}</span>
<span class="sd"> cut_off : {&#39;2xtau_max&#39;, &#39;tau_max&#39;, &#39;max_lag&#39;, &#39;max_lag_or_tau_max&#39;, 2xtau_max_future}</span>
<span class="sd"> If cut_off == &#39;2xtau_max&#39;:</span>
<span class="sd"> - 2*tau_max samples are cut off at the beginning of the time</span>
<span class="sd"> series (&#39;beginning&#39; here refers to the temporally first time</span>
<span class="sd"> steps). This guarantees that (as long as no mask is used) all</span>
<span class="sd"> MCI tests are conducted on the same samples, independent of X,</span>
<span class="sd"> Y, and Z.</span>
<span class="sd"> series (&#39;beginning&#39; here refers to the temporally first</span>
<span class="sd"> time steps). This guarantees that (as long as no mask is</span>
<span class="sd"> used) all MCI tests are conducted on the same samples,</span>
<span class="sd"> independent of X, Y, and Z.</span>

<span class="sd"> - If at time step t_missing a data value is missing, then the</span>
<span class="sd"> time steps t_missing, ..., t_missing + 2*tau_max are cut out.</span>
<span class="sd"> The latter part only holds if remove_missing_upto_maxlag=True.</span>
<span class="sd"> time steps t_missing, ..., t_missing + 2*tau_max are cut</span>
<span class="sd"> out. The latter part only holds if</span>
<span class="sd"> remove_missing_upto_maxlag=True.</span>

<span class="sd"> If cut_off == &#39;max_lag&#39;:</span>
<span class="sd"> - max_lag(X, Y, Z) samples are cut off at the beginning of the</span>
<span class="sd"> time series, where max_lag(X, Y, Z) is the maximum lag of all</span>
<span class="sd"> nodes in X, Y, and Z. These are all samples that can in</span>
<span class="sd"> principle be used.</span>
<span class="sd"> time series, where max_lag(X, Y, Z) is the maximum lag of</span>
<span class="sd"> all nodes in X, Y, and Z. These are all samples that can in</span>
<span class="sd"> principle be used.</span>

<span class="sd"> - If at time step t_missing a data value is missing, then the</span>
<span class="sd"> time steps t_missing, ..., t_missing + max_lag(X, Y, Z) are cut</span>
<span class="sd"> out.</span>
<span class="sd"> The latter part only holds if remove_missing_upto_maxlag=True.</span>
<span class="sd"> time steps t_missing, ..., t_missing + max_lag(X, Y, Z) are</span>
<span class="sd"> cut out. The latter part only holds if</span>
<span class="sd"> remove_missing_upto_maxlag=True.</span>

<span class="sd"> If cut_off == &#39;max_lag_or_tau_max&#39;:</span>
<span class="sd"> - max(max_lag(X, Y, Z), tau_max) are cut off at the beginning.</span>
<span class="sd"> This may be useful for modeling by comparing multiple models on</span>
<span class="sd"> the same samples. </span>
<span class="sd"> This may be useful for modeling by comparing multiple</span>
<span class="sd"> models on the same samples. </span>

<span class="sd"> - If at time step t_missing a data value is missing, then the</span>
<span class="sd"> time steps</span>
<span class="sd"> t_missing, ..., t_missing + max(max_lag(X, Y, Z), tau_max)</span>
<span class="sd"> are cut out.</span>
<span class="sd"> The latter part only holds if remove_missing_upto_maxlag=True.</span>
<span class="sd"> time steps t_missing, ..., t_missing + max(max_lag(X, Y,</span>
<span class="sd"> Z), tau_max) are cut out. The latter part only holds if</span>
<span class="sd"> remove_missing_upto_maxlag=True.</span>

<span class="sd"> If cut_off == &#39;tau_max&#39;:</span>
<span class="sd"> - tau_max samples are cut off at the beginning.</span>
<span class="sd"> This may be useful for modeling by comparing multiple models on</span>
<span class="sd"> the same samples. </span>
<span class="sd"> - tau_max samples are cut off at the beginning. This may be</span>
<span class="sd"> useful for modeling by comparing multiple models on the</span>
<span class="sd"> same samples. </span>

<span class="sd"> - If at time step t_missing a data value is missing, then the</span>
<span class="sd"> time steps</span>
<span class="sd"> t_missing, ..., t_missing + max(max_lag(X, Y, Z), tau_max)</span>
<span class="sd"> are cut out.</span>
<span class="sd"> The latter part only holds if remove_missing_upto_maxlag=True.</span>
<span class="sd"> time steps t_missing, ..., t_missing + max(max_lag(X, Y,</span>
<span class="sd"> Z), tau_max) are cut out. The latter part only holds if</span>
<span class="sd"> remove_missing_upto_maxlag=True.</span>
<span class="sd"> </span>
<span class="sd"> If cut_off == &#39;2xtau_max_future&#39;:</span>
<span class="sd"> First, the relevant time steps are determined as for cut_off ==</span>
<span class="sd"> &#39;max_lag&#39;. Then, the temporally latest time steps are removed</span>
@@ -49,56 +49,56 @@ <h1>Source code for tigramite.independence_tests.robust_parcorr</h1><div class="
<div class="viewcode-block" id="RobustParCorr"><a class="viewcode-back" href="../../../index.html#tigramite.independence_tests.robust_parcorr.RobustParCorr">[docs]</a><span class="k">class</span> <span class="nc">RobustParCorr</span><span class="p">(</span><span class="n">CondIndTest</span><span class="p">):</span>
<span class="w"> </span><span class="sa">r</span><span class="sd">&quot;&quot;&quot;Robust partial correlation test based on non-paranormal models.</span>

<span class="sd"> Partial correlation is estimated through transformation to standard</span>
<span class="sd"> normal marginals, ordinary least squares (OLS) regression, and a test for</span>
<span class="sd"> non-zero linear Pearson correlation on the residuals.</span>
<span class="sd"> Partial correlation is estimated through transformation to standard</span>
<span class="sd"> normal marginals, ordinary least squares (OLS) regression, and a test for</span>
<span class="sd"> non-zero linear Pearson correlation on the residuals.</span>

<span class="sd"> Notes</span>
<span class="sd"> -----</span>
<span class="sd"> To test :math:`X \perp Y | Z`, firstly, each marginal is transformed to be</span>
<span class="sd"> standard normally distributed. For that, the transform</span>
<span class="sd"> :math:`\Phi^{-1}\circ\hat{F}` is used. Here, :math:`\Phi^{-1}` is the</span>
<span class="sd"> quantile function of a standard normal distribution and </span>
<span class="sd"> :math:`\hat{F}` is the empirical distribution function for the respective</span>
<span class="sd"> marginal.</span>
<span class="sd"> Notes</span>
<span class="sd"> -----</span>
<span class="sd"> To test :math:`X \perp Y | Z`, firstly, each marginal is transformed to be</span>
<span class="sd"> standard normally distributed. For that, the transform</span>
<span class="sd"> :math:`\Phi^{-1}\circ\hat{F}` is used. Here, :math:`\Phi^{-1}` is the</span>
<span class="sd"> quantile function of a standard normal distribution and </span>
<span class="sd"> :math:`\hat{F}` is the empirical distribution function for the respective</span>
<span class="sd"> marginal.</span>


<span class="sd"> This idea stems from the literature on nonparanormal models, see:</span>
<span class="sd"> This idea stems from the literature on nonparanormal models, see:</span>

<span class="sd"> - Han Liu, John Lafferty, and Larry Wasserman. The nonparanormal:</span>
<span class="sd"> semiparametric estimation of high dimensional undirected graphs. J.</span>
<span class="sd"> Mach. Learn. Res., 10:2295–2328, 2009.</span>
<span class="sd"> - Han Liu, John Lafferty, and Larry Wasserman. The nonparanormal:</span>
<span class="sd"> semiparametric estimation of high dimensional undirected graphs. J.</span>
<span class="sd"> Mach. Learn. Res., 10:2295–2328, 2009.</span>

<span class="sd"> - Han Liu, Fang Han, Ming Yuan, John Lafferty, and Larry Wasserman.</span>
<span class="sd"> High-dimensional semiparametric Gaussian copula graphical models. Ann.</span>
<span class="sd"> Statist., 40(4):2293–2326, 2012a.</span>
<span class="sd"> - Han Liu, Fang Han, Ming Yuan, John Lafferty, and Larry Wasserman.</span>
<span class="sd"> High-dimensional semiparametric Gaussian copula graphical models. Ann.</span>
<span class="sd"> Statist., 40(4):2293–2326, 2012a.</span>

<span class="sd"> - Naftali Harris, Mathias Drton. PC Algorithm for Nonparanormal Graphical</span>
<span class="sd"> Models. Journal of Machine Learning Research, 14: 3365-3383, 2013.</span>
<span class="sd"> - Naftali Harris, Mathias Drton. PC Algorithm for Nonparanormal Graphical</span>
<span class="sd"> Models. Journal of Machine Learning Research, 14: 3365-3383, 2013.</span>

<span class="sd"> Afterwards (where Z, X, and Y are now assumed to be transformed to the</span>
<span class="sd"> standard normal scale):</span>
<span class="sd"> Afterwards (where Z, X, and Y are now assumed to be transformed to the</span>
<span class="sd"> standard normal scale):</span>

<span class="sd"> :math:`Z` is regressed out from</span>
<span class="sd"> :math:`X` and :math:`Y` assuming the model</span>
<span class="sd"> :math:`Z` is regressed out from</span>
<span class="sd"> :math:`X` and :math:`Y` assuming the model</span>

<span class="sd"> .. math:: X &amp; = Z \beta_X + \epsilon_{X} \\</span>
<span class="sd"> Y &amp; = Z \beta_Y + \epsilon_{Y}</span>
<span class="sd"> .. math:: X &amp; = Z \beta_X + \epsilon_{X} \\</span>
<span class="sd"> Y &amp; = Z \beta_Y + \epsilon_{Y}</span>

<span class="sd"> using OLS regression. Then the dependency of the residuals is tested with</span>
<span class="sd"> the Pearson correlation test.</span>
<span class="sd"> using OLS regression. Then the dependency of the residuals is tested with</span>
<span class="sd"> the Pearson correlation test.</span>

<span class="sd"> .. math:: \rho\left(r_X, r_Y\right)</span>
<span class="sd"> .. math:: \rho\left(r_X, r_Y\right)</span>

<span class="sd"> For the ``significance=&#39;analytic&#39;`` Student&#39;s-*t* distribution with</span>
<span class="sd"> :math:`T-D_Z-2` degrees of freedom is implemented.</span>
<span class="sd"> For the ``significance=&#39;analytic&#39;`` Student&#39;s-*t* distribution with</span>
<span class="sd"> :math:`T-D_Z-2` degrees of freedom is implemented.</span>

<span class="sd"> Parameters</span>
<span class="sd"> ----------</span>
<span class="sd"> **kwargs :</span>
<span class="sd"> Arguments passed on to Parent class CondIndTest.</span>
<span class="sd"> Parameters</span>
<span class="sd"> ----------</span>
<span class="sd"> **kwargs :</span>
<span class="sd"> Arguments passed on to Parent class CondIndTest.</span>
<span class="sd"> &quot;&quot;&quot;</span>
<span class="c1"># documentation</span>

<span class="nd">@property</span>
<span class="k">def</span> <span class="nf">measure</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="w"> </span><span class="sd">&quot;&quot;&quot;</span>
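The RobustParCorr docstring above describes a first step in which each marginal is mapped through \Phi^{-1}\circ\hat{F} to the standard normal scale. A minimal standard-library sketch of that step alone is given below. The function `to_normal_scores` is hypothetical and not the tigramite implementation. It assumes there are no ties, and it scales ranks by `n + 1` rather than `n` so that the largest value does not map to an infinite quantile. The regression and correlation steps are not shown.

```python
from statistics import NormalDist


def to_normal_scores(values):
    """Transform one marginal to approximately standard normal scores.

    Hypothetical sketch of Phi^{-1} applied to an empirical distribution
    function; assumes no ties and uses rank / (n + 1) as the ECDF estimate.
    """
    n = len(values)
    # Rank of each observation (1 = smallest), computed via argsort.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Quantile function of the standard normal applied to the scaled ranks.
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# A skewed sample: the transform depends only on the ordering of the values.
scores = to_normal_scores([10.0, 0.5, 3.0])
```

Because only ranks enter the transform, it is invariant under monotone rescaling of the marginal, which is what makes the subsequent partial correlation test robust to non-Gaussian marginals.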