Take charge of parsing and evaluating inputs #960

lionel- · 2025-11-06T13:54:52Z

Progress towards posit-dev/positron#1766
Addresses posit-dev/positron#6553
Closes #598
Closes #722
Closes #840

This PR does two things:

It takes charge of Reading/Parsing and Evaluation of R code. (First commit)
Refactors, simplifies, and fixes a bunch of issues. (All other commits)

Doing parsing and evaluation in Ark is a significant departure of how read_console(), the heart of the kernel, is currently structured. This handler is hooked to the R frontend method ReadConsole, which takes user inputs as strings. The R REPL gets these strings, parses them, evaluates them, prints the result, then loops back. With this PR, we take control of both parsing and evaluation.

There is no functional changes here except for a few bugfixes. In the future though, this will allow us to:

Parse all inputs sent from editors with source references and injected breakpoints.
When debugging, evaluate code in the frame selected in the IDE.

Fixes

Minor: We no longer have any size limitation on lines of inputs. They were previously limited to 4096 bytes, the maximum size of the ReadConsole input buffer.
Restart during debug sessions now works as expected: Restart the R kernel during debugging session leads to bad state positron#6553
We now interrupt ongoing evaluations before restarting. This shouldn't have any impact on Positron, which sends an interrupt before restarting, but should make restarts behave better with other frontends. This is also a nice defensive measure even with Positron.
We now flush autoprint output when an error is thrown by a print method. I was quite confused by this bug because it also swallowed debugging output!
The set of syntax errors that throw an R error instead of returning an error code no longer cause a leaked backtrace in the console (Sending a pair of empty backticks to the kernel results in an internal error showing up in the console #598, Weird error with "\s" #722)

These fixes (and more) are covered by integration tests.

Approach

We parse inputs into a vector of complete expressions. These expressions are stored in the read-console state in a new PendingInputs struct that implements a stack interface, with a pop method. This stack of pending inputs replaces the stack pending lines we implemented in Split inputs by newlines #536.
When read-console is called by R for a new input, it is evaluated and the result is stored in base::.ark_last_value. We store it in the base environment for robust access from (almost) any evaluation environment (global on top level but the debugger can evaluate in any environment). We only require the presence of :: so that we can reach into base.
Read-console returns to R with the string input "base::.ark_last_value" if evaluation returned visibly, or "base::invisible(base::.ark_last_value)" if it returned invisibly. The R REPL then parses this expression and evaluates it.
Because we are still hooked into the R REPL, it has a chance to do top-level stuff like printing warnings, running task callbacks, updating .Last.value, etc.

While this approach works, I hit two main issues that we need to workaround:

When an error happens, the R_EvalDepth variable is not reset before the next evaluation (it's normally reset here: https://github.com/r-devel/r-svn/blob/811080fb/src/main/main.c#L260). This could be bad especially if that error is an exceeded evaluation depth error and we're looping back to evaluate a new expression without having reset it.
When a nested debug REPL returns to a parent REPL, the R_ConsoleIob parse buffer is normally reset after returning from eval() (here: https://github.com/r-devel/r-svn/blob/811080fb/src/main/main.c#L279). With this approach, it doesn't have a chance to reset before entering read-console again.

To solve the first issue, we evaluate a dummy input ("base::.Last.value" to keep the last value stable, this is tested) after an error occurs. This gives R a chance to fully reset state. To solve the second issue, we send a dummy parse causing a PARSE_NULL everytime we detect a nested read-console has returned. This causes R to reset the parse buffer here: https://github.com/r-devel/r-svn/blob/811080fb/src/main/main.c#L238.

These workarounds are unfortunate but have been mostly contained in the r_read_console() method, which keeps track of all these details and other state.

Reorganisation

Read-console is both an event loop (when we're waiting for input from the frontend) and a state machine (to manage execution of inputs). The event loop part is now extracted in a new method run_event_loop(). We no longer wait for both input replies (when we're asking the frontend to reply to a readline() or menu() question) and execution requests (when we're waiting for the frontend to send us the next bit of code to execute) at the same time, it's either one or the other, which simplifies the code. The handler for input requests now invokes the event loop itself, instead of returning to the top level of read-console to fall through the event loop.
The state machine part of read-console has been simplified in other ways. We now inspect the states from the top level and pass it down through arguments, which makes it easier to reason about. See in particular the new take_exception() and take_result() methods called from top-level and passed down to the handler for replying to an active request when there are no inputs left to evaluate. This allowed simplifying things quite a bit.
We now keep track of the number of nested consoles (due to active debug sessions) on the stack. Not used for anything but it seems good information to keep track of. To support this (and other state bookkeeping), we use a new wrapper around R_ExecWithCleanup().
The handling of debugger commands has been consolidated into a single location.
A benefit of parsing expressions ourselves is that it is no longer possible for R to send us an incomplete prompt. This has been completely removed from the possible states read-console can be in, which is another nice simplification.
The DummyFrontend for integration tests has now a few more methods like execute_request() that capture common patterns of messaging. This removes quite a lot of boilerplate.

I couldn't produce any issues with `R_EvalDepth` but it seems more sound to let R reset state before evaluation

Would be hard to get right in the case of nested browser sessions

We've never supported it, so better be explicit about it. This allows us to simplify some special-casing for that question. Also set `--no-restore-data` by default for consistency.

DavisVaughan

Looks good! I've been running this for a few days and have not had any major issues.

In addition to my comments below, two other things

I still think Option would be useful around srcrefs to show to the reader that they are optional. I confused myself about them again today. It doesn't seem too hard to do this #961

There is one regression I noted with debugonce() when inside a package that you have done load_all() with:

devtools::load_all()
debugonce(vec_count)
vec_count(1:5)

If you take a step and then run 1:2 in the Console, then with this PR you get jumped to a fallback view of vec_count, which is pretty jarring and isn't right compared with the current behavior of staying in the deparsed srcref file we created for vctrs.

This seems worth fixing before merging.

This PR:

Screen.Recording.2025-11-11.at.9.15.22.AM.mov

Current main:

Screen.Recording.2025-11-11.at.9.16.28.AM.mov

crates/libr/src/r.rs

DavisVaughan · 2025-11-11T13:05:35Z

crates/ark/src/sys/unix/interface.rs

+        if stdext::IS_TESTING {
+            use libr::ptr_R_CleanUp;
+
+            use crate::interface::r_cleanup_for_tests;
+
+            libr::set(ptr_R_CleanUp, Some(r_cleanup_for_tests));
+        }


Suggested change

if stdext::IS_TESTING {

use libr::ptr_R_CleanUp;

use crate::interface::r_cleanup_for_tests;

libr::set(ptr_R_CleanUp, Some(r_cleanup_for_tests));

}

// TODO!: Comment about why we need this

if stdext::IS_TESTING {

libr::set(

libr::ptr_R_CleanUp,

Some(crate::interface::r_cleanup_for_tests),

);

}

I think this needs a big comment like the one below it about why we need this for tests.

I also very much dislike this local use style. I've never seen other Rust programmers use it, and it doesn't feel very ergonomic to me.

IIUC, it is something like

// Install a CleanUp hook for integration tests that test the shutdown process. // We confirm that shutdown occurs by waiting in the test until `CLEANUP_SIGNAL`'s // condition variable sends a notification, which occurs in this cleanup method // that is called during R's shutdown process.

I like that local uses make it very clear what the scope of the usage is, especially in these special paths.

DavisVaughan · 2025-11-11T13:09:17Z

crates/harp/src/parser/srcref.rs

+    RObject::new(result)
+}
+
+pub fn get_block_srcrefs(call: libr::SEXP) -> RObject {


The name block confuses me. Is this just for calls? Can it be get_call_srcrefs()?

Is this really different from your pub fn srcrefs(&self) helper above? I think we should avoid duplication of work if possible for hygeine reasons.

DavisVaughan · 2025-11-11T13:12:18Z

crates/amalthea/tests/client.rs

+    // Send a shutdown request with restart = false
+    frontend.send_shutdown_request(false);
+
+    // Shutdown requests generate busy/idle status messages on IOPub
+    frontend.recv_iopub_busy();
+
+    // Receive the shutdown reply
+    let reply = frontend.recv_control_shutdown_reply();
+    assert_eq!(reply.status, Status::Ok);
+    assert_eq!(reply.restart, false);
+
+    frontend.recv_iopub_idle();
+
+    // Test with restart = true


I don't think these comments add much

DavisVaughan · 2025-11-11T13:12:52Z

crates/ark/tests/kernel.rs


+// Avoids our global calling handler from rlangifying errors.
+// This causes some test instability across configs.


I don't think I understand this comment

CI was failing because the error included rlang's ! prefix. I don't understand why this doesn't happen locally.

DavisVaughan · 2025-11-11T15:28:03Z

crates/ark/src/interface.rs

+
+    // SAFETY: Call this from a POD frame. Inputs must be protected.
+    unsafe fn eval(expr: libr::SEXP, srcref: libr::SEXP, buf: *mut c_uchar, buflen: c_int) {
+        let frame = harp::r_current_frame();


Hmmmmmmmmmmm, big code smell about this being an RObject in a POD frame! Destructor probably won't run if this jumps?

Oh hmm, you've moved ownership into a SEXP and destroyed the RObject in the next line.

That really confused me!

DavisVaughan · 2025-11-11T15:34:34Z

crates/ark/src/interface.rs

                }
            },
-            Err(err) => ConsoleResult::Error(err),
+            Err(err) => ConsoleResult::Error(format!("{err:?}")),


Why remove the amalthea::Error?

It felt like an odd choice since we're in Ark not Amalthea

DavisVaughan · 2025-11-11T15:41:12Z

crates/ark/src/interface.rs

+    // and are now ready to evaluate the actual input
+    if let Some(next_input) = main.read_console_next_input.take() {
+        RMain::on_console_input(buf, buflen, next_input).unwrap();
+        return 1;


Suggested change

// and are now ready to evaluate the actual input

if let Some(next_input) = main.read_console_next_input.take() {

RMain::on_console_input(buf, buflen, next_input).unwrap();

return 1;

// and are now ready to evaluate the actual input, which is typically

// just `.ark_last_value`.

if let Some(next_input) = main.read_console_next_input.take() {

RMain::on_console_input(buf, buflen, next_input).unwrap();

return 1;

Right? At first I was concerned that we could write something here that would overflow the buffer.

DavisVaughan · 2025-11-11T15:42:36Z

crates/ark/src/interface.rs

+    // Technically this also resets time limits (see `base::setTimeLimit()`) but
+    // these aren't supported in Ark because they cause errors when we poll R
+    // events.
+    if main.last_error.is_some() && main.read_console_threw_error.get() {


Suggested change

if main.last_error.is_some() && main.read_console_threw_error.get() {

if main.read_console_threw_error.get() {

Is one of these enough?

If not, it's mildly confusing that you reset read_console_threw_error but not last_error in the if block

DavisVaughan · 2025-11-11T15:52:13Z

crates/ark/src/interface.rs

+use std::sync::Condvar;
+pub static CLEANUP_SIGNAL: (Mutex<bool>, Condvar) = (Mutex::new(false), Condvar::new());
+
+#[no_mangle]
+pub extern "C-unwind" fn r_cleanup_for_tests(_save_act: i32, _status: i32, _run_last: i32) {


I think we are likely to get confused about the OS level support for this right now.

Can we move all of this into a unix/ folder's file?

It is only supported by unix/interface.rs, so it is misleading to have CLEANUP_SIGNAL here.

It becomes very confusing when you look at DummyArkFrontend::wait_for_cleanup(), which looks like it can be used on any OS because CLEANUP_SIGNAL is available on any OS. But that will never get a notification on Windows!

So it would be nice if wait_for_cleanup() would be marked with cfg(unix), which requires making everything here cfg(unix) as well.

lionel- force-pushed the feature/read-srcref branch 2 times, most recently from 3b0cb2f to d7dbc33 Compare November 6, 2025 15:11

lionel- added 28 commits November 7, 2025 08:21

Take charge of parsing and evaluation

c93bf69

More caller tracking

5df36e9

Consolidate debugger states

7dd0f4e

Extract handle_input_request()

2f07846

Extract handle_pending_input()

0650040

Rename finalize_call_text() to handle_read_console()

3367930

Make read() a constructor method on PendingInputs

dccf144

Refactor console error and result handling

fc697b7

Cancel pending inputs when we get in the debugger

2040e08

Add test for invalid syntax

4bae770

Tweak documentation

019cba0

Remove into_protected() method

7c3c168

Make reply_execute_request() a free function

69ca0dc

Create Jupyter exception in the global condition handler

cad9a3f

Fully remove incomplete prompts heuristic since they are now impossible

e884469

Extract ReadConsole event loop into method

e6ec6e4

Return to base REPL in case of error

acf96e4

I couldn't produce any issues with `R_EvalDepth` but it seems more sound to let R reset state before evaluation

Rename eval_pending() to eval()

30d244b

Don't clear pending expressions in browser sessions

fde428e

Would be hard to get right in the case of nested browser sessions

Tweak docs

13f1fd0

Add failing test

666a727

RMain::eval() doesn't need self ref

fb69d6b

Add harp::exec_with_cleanup()

02a8aa4

Keep track of nested REPLs and clean up R's state when needed

acf7965

Flush autoprint buffer in case of error

22e21bb

Use a PromptKind enum to discriminate prompts

6bb6f3b

Evaluate in current environment

94229c6

Keep track of console frame via r_read_console()

8810197

lionel- added 13 commits November 7, 2025 08:21

Clearer documentation and variable name

83ba148

Add shutdown handling to dummy frontend

5780f6d

Add integration tests for shutdown

d257e05

Prevent R from asking about saving workspace

3ec41cb

We've never supported it, so better be explicit about it. This allows us to simplify some special-casing for that question. Also set `--no-restore-data` by default for consistency.

Shutdown all nested consoles in case of Shutdown request

5e219c2

Send interrupt before shutting down

6216051

Move signal declaration inside of Unix context

ab9a936

Make the error message test less brittle

cf63dbc

Disable brittle tests

42f8775

Opt out of Shutdown tests on Windows

6d2f1e8

Improve naming a bit

492d33c

Don't include backtrace in syntax errors

337efde

Fix backtraces in special syntax errors

b4839c2

lionel- force-pushed the feature/read-srcref branch from ae27d23 to b4839c2 Compare November 7, 2025 08:39

Disable error entracing in sensitive tests

a4272d1

lionel- requested a review from DavisVaughan November 7, 2025 09:05

Extract FrontendDummy::execute_request() and variants

0b9052c

DavisVaughan mentioned this pull request Nov 11, 2025

Make it clear that srcrefs is not always present #961

Open

DavisVaughan approved these changes Nov 11, 2025

View reviewed changes


		// Avoids our global calling handler from rlangifying errors.
		// This causes some test instability across configs.

	if main.last_error.is_some() && main.read_console_threw_error.get() {
	if main.read_console_threw_error.get() {

Take charge of parsing and evaluating inputs #960

Are you sure you want to change the base?

Take charge of parsing and evaluating inputs #960

Uh oh!

Conversation

lionel- commented Nov 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fixes

Approach

Reorganisation

Uh oh!

DavisVaughan left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

lionel- commented Nov 6, 2025 •

edited

Loading