Forthcoming fixes
Changes for forthcoming version 7.6.9
New features:
- Added new options for cyclic SEM initialization (Fixed Radius and Product Capped), with parameters exposed in the interface and config, ensuring stable simulation of feedback loops (see the stabilization sketch following this list).
- Improved the Fisher Z independence test by adding Ledoit–Wolf shrinkage and a dedicated ridge option, producing more reliable results in cyclic settings (see the Fisher Z sketch following this list).
- Enhanced CCD and related algorithms: stable SEM initialization plus shrinkage-based Fisher Z now yield correct behavior on the canonical 4-node cyclic model.
- Updated configuration defaults and interface controls to include cyclic parameters (CYCLIC_RADIUS, CYCLIC_MAX_PROD, CYCLIC_COEF_STYLE) with conservative values pre-set.
- Revised Fisher Z regularization handling by separating ridge and pseudoinverse modes for greater clarity and control.
- Expanded unit tests to cover expected vs. spurious independencies in cyclic SEMs, improving robustness of regression checks.
- Added full implementation of the RCIT (Randomized Conditional Independence Test) and RCoT (unconditional variant) with support for multiple approximation methods (Gamma, Chi-square, HBE, LPB4, Permutation).
- Exposed new RCIT parameters in the interface and config (RCIT_LAMBDA, RCIT_MODE, RCIT_PERMUTATIONS, RCIT_APPROX, RCIT_CENTER_FEATURES, RCIT_NUM_FEATURES, RCIT_NUM_FEATURES_XY, RCIT_NUM_FEATURES_Z) for fine-grained control, fully documented in the manual.
- Added robustness improvements to covariance handling in kernel-based tests, ensuring compatibility with EJML 0.44.0 and improved numerical stability.
- Added a translation of RCIT (Randomized Conditional Independence Test, Strobl) from Python (which was translated from the original R).
- Added a new implementation of KCI (Kernel Conditional Independence, Zhang), motivated by the implementation of the same in Python.
- Added an implementation of the GIN (Generalized Independent Noise) algorithm.
- Added the GIN (Residual Independence) independence test as well.
- Added a new latent variable clustering search (TSC, Trek Separation Clusters), new implementations of BPC (Build Pure Clusters), FOFC (Find One Factor Clusters), and FTFC (Find Two Factor Clusters), and GFFC (Generalized Find Factor Clusters), a generalization of FOFC and FTFC, providing multiple strategies for discovering latent clusterings from measurement data.
- Added new buttons and boxes to the Tetrad interface for searching for latent clusters and latent structure.
- Integrated new Mimbuild-Bollen and Mimbuild-PCA implementations, offering alternative approaches for latent measurement model construction with corresponding interface parameters.
- Revised the random Multiple Indicators Model (MIM) graph generator to allow for more flexible generation of random MIMs with arbitrary ranks.
- Extended latent clustering search to support rank-based tests, clique detection, and improved handling of patchy or overlapping latent structures.
- Allowed knowledge to be defined over latent cluster variables.
- Expanded simulation tools for latent variable models, including flexible latent group specifications and random MIM generators.
- Added an effective sample size (ESS) parameter to all tests and scores that appear in the interface (the Fisher Z sketch following this list illustrates how an ESS can stand in for the raw sample size).
- Implemented a general False Discovery Rate (FDR) independence test wrapper that can be used wherever a test is used, and added this option to all constraint-based algorithms in the interface (see the FDR sketch following this list).
- Added new implementations of the PC-style algorithms by incorporating sepset, conservative, and max-P options for PC and FCI. Made the max-P option the default based on performance.
- Implemented a new graph type for time lag models that repeats edges across time lags automatically and used this to produce updated implementations of SVAR-FCI and SVAR-BOSS.
- Implemented the causal unmixing algorithm of Zhang and Glymour, which decomposes a dataset into mixture components with distinct causal structures, and added it to the interface as a new Data Manipulation item.
- Added the ability for the Search box to run searches over multiple datasets in the Data box and present the results in a tabbed pane, inside the Search box, that the user can navigate. This is useful, for example, when the user generates multiple runs in the Simulation box or unmixes a dataset into multiple datasets.
- Added recursive blocking algorithm for adjustment set discovery, along with soundness and completeness verification hooks.
- Added a translation of Huang et al.'s CD-NOD algorithm from Python (https://github.com/py-why/causal-learn/blob/main/causallearn/search/ConstraintBased/CDNOD.py). This requires either a time series or time lag data with a context variable (e.g., time index) as the last variable in the data.
- Added PCMCI algorithm for causal discovery in time series, extending PC with lagged conditional independence tests.
- Added an implementation of the Causal Additive Model (CAM) algorithm of Peters et al., which assumes a nonlinear additive structural model.
- Added an implementation of an Additive Noise Model (ANM) simulator, for use, among other things, in testing the CAM algorithm.
- Added an implementation of the Instance-specific GFCI and FGES algorithms described in Jabbari's Ph.D. dissertation.
- Added a new API for modeling hybrid continuous/discrete data using an augmented conditional Gaussian method. This includes a new Hybrid CG parametric model, a new Hybrid CG instantiated model, simulation facilities based on the instantiated model, and a new Hybrid CG Estimator, which takes the PM and IM as input. Search over mixed data could already be done using the Conditional Gaussian and Degenerate Gaussian scores and tests.
- SVAR-style graph replication parameters were added to all algorithms that can use them, including those that previously had separately modified SVAR versions; those separate SVAR algorithms have been removed, since this generalization makes them redundant. The idea is to allow edges to be replicated across time lags when analyzing time lag data, where the variables have canonical names, e.g., X for the current time lag, X:1 for one time lag back, X:2 for two time lags back, and so on (see the replication sketch following this list).
- Added a new template for latent structure search.
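
The following is a minimal sketch of the idea behind stabilizing a cyclic SEM for simulation, using EJML (already a dependency of the kernel-based tests noted below): cap the spectral radius of the coefficient matrix so that (I - B) stays invertible and feedback loops remain stable. The class and method names are illustrative, and the actual Fixed Radius and Product Capped options may differ in detail.

```java
import org.ejml.data.Complex_F64;
import org.ejml.simple.SimpleEVD;
import org.ejml.simple.SimpleMatrix;

/**
 * Sketch of one way to keep a cyclic SEM simulable: rescale the coefficient
 * matrix B so its spectral radius falls below a fixed target. Illustrative
 * only; not the Tetrad implementation.
 */
public final class CyclicStabilizationSketch {

    /** Largest eigenvalue magnitude of B. */
    static double spectralRadius(SimpleMatrix b) {
        SimpleEVD<SimpleMatrix> evd = b.eig();
        double radius = 0.0;
        for (int i = 0; i < evd.getNumberOfEigenvalues(); i++) {
            Complex_F64 ev = evd.getEigenvalue(i);
            radius = Math.max(radius, ev.getMagnitude());
        }
        return radius;
    }

    /** Scales B so its spectral radius is at most targetRadius (< 1 for stability). */
    static SimpleMatrix capRadius(SimpleMatrix b, double targetRadius) {
        double radius = spectralRadius(b);
        return radius <= targetRadius ? b : b.scale(targetRadius / radius);
    }
}
```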
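
Below is a minimal sketch of a shrinkage-based Fisher Z statistic with an effective sample size (ESS) standing in for the raw sample size, again using EJML. It shows only the general shape of the computation; it is not the Tetrad implementation, and the shrinkage target and names are illustrative.

```java
import org.ejml.simple.SimpleMatrix;

/**
 * Minimal sketch of a shrinkage-based Fisher Z statistic with an
 * effective-sample-size (ESS) adjustment. Illustrative only.
 */
public final class ShrinkageFisherZSketch {

    /** Shrinks the covariance matrix toward a scaled identity (a common Ledoit-Wolf target). */
    static SimpleMatrix shrink(SimpleMatrix s, double lambda) {
        int p = s.numRows();
        double mu = s.trace() / p;                        // average variance
        SimpleMatrix target = SimpleMatrix.identity(p).scale(mu);
        return s.scale(1.0 - lambda).plus(target.scale(lambda));
    }

    /**
     * Fisher Z statistic for X _||_ Y | Z, computed from the (shrunk) covariance of
     * the variables ordered [X, Y, Z...]; ess plays the role of the sample size.
     */
    static double fisherZ(SimpleMatrix cov, double lambda, double ess) {
        int p = cov.numRows();
        SimpleMatrix prec = shrink(cov, lambda).invert();  // precision matrix
        // Partial correlation of variables 0 and 1 given the rest.
        double r = -prec.get(0, 1) / Math.sqrt(prec.get(0, 0) * prec.get(1, 1));
        int numCond = p - 2;                               // size of the conditioning set
        double z = 0.5 * Math.log((1.0 + r) / (1.0 - r));  // Fisher's z-transform
        return Math.sqrt(ess - numCond - 3.0) * z;         // compare to N(0, 1)
    }
}
```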
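
The sketch below shows the standard Benjamini-Hochberg step that an FDR wrapper could apply to a batch of p-values; the actual Tetrad wrapper may use a different procedure, and the names here are hypothetical.

```java
import java.util.Arrays;

/** Minimal Benjamini-Hochberg sketch: which p-values in a batch pass at FDR level q. */
public final class FdrSketch {

    /** Returns the largest p-value p_(k) with p_(k) <= k * q / m, or 0 if none qualifies. */
    static double bhThreshold(double[] pValues, double q) {
        double[] p = pValues.clone();
        Arrays.sort(p);
        int m = p.length;
        double threshold = 0.0;
        for (int k = 1; k <= m; k++) {
            if (p[k - 1] <= k * q / m) {
                threshold = p[k - 1];   // largest p-value satisfying the BH condition so far
            }
        }
        return threshold;
    }

    public static void main(String[] args) {
        double[] p = {0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205};
        double cutoff = bhThreshold(p, 0.05);
        System.out.println("Reject independence for p <= " + cutoff);
    }
}
```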
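
Finally, a small sketch of replicating an edge across time lags using the canonical naming convention (X, X:1, X:2, and so on). Tetrad's time lag graph classes handle this internally; the helper names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of replicating a directed edge across time lags; illustrative only. */
public final class LagReplicationSketch {

    /** "X:2" has base name "X" and lag 2; a bare name has lag 0. */
    static String base(String v) { return v.contains(":") ? v.substring(0, v.indexOf(':')) : v; }
    static int lag(String v)     { return v.contains(":") ? Integer.parseInt(v.substring(v.indexOf(':') + 1)) : 0; }

    /** Name of the variable shifted back by `shift` additional lags. */
    static String shifted(String v, int shift) {
        int l = lag(v) + shift;
        return l == 0 ? base(v) : base(v) + ":" + l;
    }

    /** Replicates the edge "from --> to" across all lags up to maxLag. */
    static List<String> replicate(String from, String to, int maxLag) {
        List<String> edges = new ArrayList<>();
        for (int shift = 0; lag(from) + shift <= maxLag && lag(to) + shift <= maxLag; shift++) {
            edges.add(shifted(from, shift) + " --> " + shifted(to, shift));
        }
        return edges;
    }

    public static void main(String[] args) {
        // X one lag back influences Y now; replicate through two lags.
        System.out.println(replicate("X:1", "Y", 2)); // [X:1 --> Y, X:2 --> Y:1]
    }
}
```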
Bug fixes / Technical improvements:
- Fixed eigenvalue/eigenvector handling in kernel-based tests by migrating to SimpleEVD in EJML 0.44.0, eliminating deprecated API usage.
- Improved numerical stability of test statistics by enforcing minimum ridge values (lambda ≥ 1e-12).
- Standardized z-scoring (ddof = 1) across all kernel-based tests for consistent variance estimation (see the z-scoring sketch at the end of this list).
- Minor interface fixes to prevent empty/null dataset names from being dropped in result graph labels.
- Fixed bug preventing some latent structure graphical models from being edited in the interface.
- Fixed a bug in the interface that prevented time lag model specifications from being handled consistently across session boxes.
- Patched ICA-LingD so it no longer throws exceptions and stabilized its behavior; ICA-based methods are now usable, though demoted in importance compared to DirectLiNGAM.
- Fixed a bug in the RICF likelihood calculation that resulted from using the covariance rather than the precision of the undirected component; the likelihood sketch at the end of this list shows the precision-matrix form. In testing, the likelihood scores for simple models are now as expected.
- Fixed a bug in knowledge handling for time lag models, where knowledge the user specified in a Knowledge box and supplied to an algorithm was being ignored. The preference order has been changed so that user-specified knowledge takes precedence.
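
For reference, a minimal sketch of z-scoring a column with the sample (ddof = 1) standard deviation, i.e., dividing the sum of squared deviations by n - 1 rather than n:

```java
/** Sketch of z-scoring a column with the sample (ddof = 1) standard deviation. */
public final class ZScoreSketch {

    static double[] zScore(double[] x) {
        int n = x.length;
        double mean = 0.0;
        for (double v : x) mean += v;
        mean /= n;

        double ss = 0.0;
        for (double v : x) ss += (v - mean) * (v - mean);
        double sd = Math.sqrt(ss / (n - 1));   // ddof = 1: divide by n - 1, not n

        double[] z = new double[n];
        for (int i = 0; i < n; i++) z[i] = (x[i] - mean) / sd;
        return z;
    }
}
```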
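
And a sketch of the multivariate Gaussian log-likelihood written in terms of the precision matrix, which is the form relevant to the RICF fix above. This is the standard formula only, not the Tetrad RICF code, and the names are illustrative.

```java
import org.ejml.simple.SimpleMatrix;

/**
 * Gaussian log-likelihood in terms of the precision matrix Theta = Sigma^{-1}:
 * the undirected component contributes through its precision, not its covariance.
 */
public final class GaussianLikelihoodSketch {

    /** ll = -n/2 * ( p*log(2*pi) - log det(Theta) + tr(S * Theta) ), S = sample covariance. */
    static double logLikelihood(SimpleMatrix sampleCov, SimpleMatrix theta, int n) {
        int p = sampleCov.numRows();
        double logDetTheta = Math.log(theta.determinant());
        double trace = sampleCov.mult(theta).trace();
        return -0.5 * n * (p * Math.log(2.0 * Math.PI) - logDetTheta + trace);
    }
}
```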