Conversation

@dlaehnemann
Member

@dlaehnemann dlaehnemann commented Oct 31, 2025

the fgsea script throws:
"Error: C stack usage 7974932 is too close to the limit"

Summary by CodeRabbit

  • Bug Fixes

    • Removed debug output from workflow script for cleaner execution.
  • Chores

    • Enhanced error handling in analysis workflow.

@coderabbitai
Contributor

coderabbitai bot commented Oct 31, 2025

Walkthrough

The FGSEA R script was updated to add error handling infrastructure via rlang::global_entrace() and to remove two debug print statements for cleaner output.

Changes

Cohort / File(s) | Summary
Error handling & debugging cleanup (workflow/scripts/fgsea.R) | Added rlang::global_entrace() call for enhanced error handling; removed interim debug prints (print(gene_sets) and print(ranked_genes))

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

  • Single-file change with minimal modifications
  • Addition of error handling utility is a standard practice
  • Removal of debug statements is straightforward

Poem

🐰 Errors now caught with entrances grand,
Debug prints banished from the land,
The FGSEA script runs clean and bright,
With better tracing through the night! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check | ✅ Passed | The PR title "fix: debug fgsea error" is directly related to the changeset's main objective. The PR addresses a C stack usage error in the fgsea script, and the changes (adding rlang::global_entrace() and removing debug print statements) are targeted at resolving this issue. The title uses the "fix:" prefix to indicate a bug fix and specifically references "fgsea error," making it clear and specific rather than vague or generic. While the word "debug" could potentially be interpreted in multiple ways, the title is sufficiently clear to summarize the primary change from a teammate's perspective reviewing the git history.
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 37df153 and d24ffb3.

📒 Files selected for processing (1)
  • workflow/scripts/fgsea.R (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: dlaehnemann
Repo: snakemake-workflows/rna-seq-kallisto-sleuth PR: 0
File: :0-0
Timestamp: 2025-10-10T07:43:04.543Z
Learning: In this Snakemake workflow project, tidyverse package functions (such as `read_tsv()` from `readr`) provide good error handling and clear error messages. Manual error handling with `tryCatch()` for common cases is not necessary when using tidyverse functions.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: run-three-prime-rna-workflow
  • GitHub Check: run-rna-workflow
🔇 Additional comments (1)
workflow/scripts/fgsea.R (1)

17-39: Good: Debug print statements removed.

The removal of interim debug prints (print(gene_sets) and print(ranked_genes)) improves code cleanliness for production use. The data loading and preprocessing logic looks sound.

sink(log)
sink(log, type="message")

rlang::global_entrace()

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify that rlang::global_entrace() actually resolves the C stack error.

The PR description indicates a "C stack usage is too close to the limit" error. However, rlang::global_entrace() is primarily an error tracing utility that captures better backtraces—it doesn't directly address C stack limit issues, which typically arise from deep recursion or large nested data structures.
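The number in the error message can be put in context with a little arithmetic. The sketch below assumes the common Linux default soft stack limit of 8192 KiB (what `ulimit -s` typically reports); the function name `stack_usage_percent` is made up for illustration, and the actual limit on the machine that ran the workflow may differ.

```shell
#!/bin/bash
# Hypothetical helper: print the reported C stack usage as an integer
# percentage of a stack limit given in KiB (the unit used by `ulimit -s`).
stack_usage_percent() {
  local usage_bytes="$1"
  local limit_kib="$2"
  local limit_bytes=$((limit_kib * 1024))
  echo $((usage_bytes * 100 / limit_bytes))
}

# 7974932 is the usage from the error message in this PR.
# 8192 KiB is an ASSUMED limit; substitute the output of `ulimit -s`.
stack_usage_percent 7974932 8192
```

Under that assumption the script had used roughly 95% of the available stack when R aborted, which fits the deep-recursion explanation and is not something a backtrace utility would change.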

Could you confirm that adding this line actually resolves the C stack error? If the issue persists, consider these alternatives:

  • Adjust the nproc parameter in the fgsea() call (line 52) to control parallelization
  • Increase expression depth limit with options(expressions = 50000) before the fgsea() call
  • Check if input data size can be reduced or chunked
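A further option, not among those listed above, is to raise the stack limit itself, since R reads its C stack limit from the process environment at startup. This is a minimal sketch assuming a bash shell; `run_with_stack_limit` is a hypothetical helper, and the wrapped command is a placeholder for however the workflow is actually launched.

```shell
#!/bin/bash
# Hypothetical helper: run a command with a different soft stack limit.
# The subshell keeps the change from leaking into the calling shell.
run_with_stack_limit() {
  local limit_kib="$1"
  shift
  (
    # -S changes only the soft limit; it cannot exceed the hard limit.
    if ! ulimit -S -s "$limit_kib" 2>/dev/null; then
      echo "warning: could not set stack limit to ${limit_kib} KiB" >&2
    fi
    # Child processes (including R) inherit the limit set above.
    "$@"
  )
}

# Example: show the limit a child process would see.
run_with_stack_limit 4096 bash -c 'ulimit -S -s'
```

If the recursion is unbounded, a larger stack only delays the failure, so this is better treated as a diagnostic step than as a fix.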

Run the script with the changes to confirm the C stack error is resolved:


🏁 Script executed:

#!/bin/bash
# Description: Check if rlang is listed as a dependency
# This helps verify if rlang::global_entrace() will work as expected

# Search for rlang in package dependencies (e.g., DESCRIPTION, environment.yaml, requirements files)
fd -t f -e yaml -e yml -e txt --exec rg -l "rlang" {}
rg -n "library.*rlang|require.*rlang" --type r

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Find project structure and R package metadata files
echo "=== Project metadata files ==="
fd -t f \( -name "DESCRIPTION" -o -name "renv.lock" -o -name ".Rprofile" -o -name "environment.yml" \)

echo -e "\n=== Listing workflow/scripts directory ==="
fd -t f --base-directory workflow/scripts

echo -e "\n=== Contents of fgsea.R (first 60 lines) ==="
head -60 workflow/scripts/fgsea.R

echo -e "\n=== Search for package dependencies on rlang ==="
rg -n "rlang|fgsea" --type r -C 2 workflow/scripts/fgsea.R | head -80

Length of output: 6701


🏁 Script executed:

#!/bin/bash
# Verify if fgsea depends on rlang or tidyverse
echo "=== Search for package dependency information ==="
# Look for any R package metadata that might list dependencies
fd -t f \( -name "DESCRIPTION" -o -name "*.lock" \) 

echo -e "\n=== Check common.R to understand what's loaded ==="
head -30 workflow/scripts/common.R

echo -e "\n=== Search for any version control or environment files ==="
fd -t f \( -name "*.lock" -o -name "*requirements*" -o -name "*.txt" \) --max-depth 2 2>/dev/null | head -20

Length of output: 1263


🌐 Web query:

rlang global_entrace C stack error fix

💡 Result:

The C-stack errors are caused by stack overflows (deep/infinitely recursive code) while rlang’s global entrace/error handlers try to capture a backtrace. Fixes:

  • Stop the recursion (best fix).
  • Temporarily disable rlang’s global entracing when running code that may overflow:
    • In your session: rlang::global_entrace(FALSE).
    • Or remove/avoid calling rlang::global_entrace() from your Rprofile. [1][2]
  • Upgrade rlang and R (rlang’s handlers use R >= 4.0 globalCallingHandlers; newer rlang releases include behavior and docs about stack-overflow handling). If you rely on handlers for diagnostics, keep rlang up to date. [1][3]
  • If you must catch stack overflows, use tryCatch/try_fetch around the risky call (note: handlers behave differently for stack overflows and R >= 4.2). See rlang docs on try_fetch and stack overflows. [4]

References:
[1] rlang::global_entrace documentation. [2] rlang::entrace docs (how entrace/handlers work). [3] rlang global_handle / handlers docs. [4] rlang docs on try_fetch and stack overflows.


rlang::global_entrace() likely worsens rather than fixes C stack overflow errors.

Based on rlang documentation, global_entrace() captures detailed error backtraces and adds error handling overhead. For C stack overflow errors (like "C stack usage 7974932 is too close to the limit"), this approach is counterproductive because it increases stack consumption during error processing.

The recommended fixes for C stack overflows are:

  • Disable global_entrace() rather than enable it
  • Reduce nproc parameter in the fgsea() call (line 48) to decrease parallelization overhead
  • Raise the eps parameter, which sets the lower bound for p-value estimation in fgsea(), to reduce the amount of computation
  • Reduce input data size if possible

Please verify that adding this line actually resolves the reported C stack error. If the error persists, consider removing global_entrace() and implementing one of the alternatives above.

🤖 Prompt for AI Agents
In workflow/scripts/fgsea.R around line 5, the call to rlang::global_entrace()
is likely increasing stack usage and can worsen C stack overflow errors; remove
or comment out this line so backtrace capture is disabled, then in the fgsea()
invocation around line 48 reduce the nproc value (e.g., lower or set to 1) to
limit parallel threads, consider increasing the eps parameter (the lower bound
for p-value estimation) to reduce computation, and if feasible reduce the input data size;
after these changes run the failing analysis to confirm the C stack error is
resolved and only re-enable advanced error tracing if the overflow no longer
occurs.
