Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deserialize function doesn't have uri parameter #125757

Closed
alyssajotice opened this issue Jun 7, 2021 · 16 comments
Closed

Deserialize function doesn't have uri parameter #125757

alyssajotice opened this issue Jun 7, 2021 · 16 comments
Assignees
Milestone

Comments

@alyssajotice
Copy link

I am working on getting Live Share notebooks functional again after recent API changes. We want Live Share notebooks to be running on stable APIs, which means using serializers rather than content providers.

However, serializers do not work for the guest scenario. Previously, the guest's content provider would have an openNotebook function with a uri parameter for the document being opened. We proxied that request to the host, where we could use vscode.notebook.openNotebookDocument with the uri. Now, deserialize only includes the document's content. I can add the serializer's view type to proxied request, but the only information we can give the host is content and viewtype. We need it to return NotebookData, but we cannot identify which notebook document to deserialize.

Could vscode.commands.executeCommand('_resolveNotebookContentProvider'); return serializers that have callable deserialize functions?

cc @jrieken @rebornix @Davsterl

@gregvanl
Copy link

gregvanl commented Jun 8, 2021

Looks like a product question so moving to the vscode product repo.

@gregvanl gregvanl transferred this issue from microsoft/vscode-docs Jun 8, 2021
@alyssajotice
Copy link
Author

Thanks @gregvanl, it was in the wrong repo.

@Downchuck

This comment has been minimized.

@jrieken jrieken modified the milestones: June 2021, July 2021 Jun 24, 2021
@rebornix rebornix removed their assignment Jul 1, 2021
@jrieken jrieken removed this from the July 2021 milestone Jul 12, 2021
@rebornix rebornix removed the notebook label Oct 21, 2021
@alyssajotice
Copy link
Author

Hey @jrieken, can we reopen the conversation about migrating Live Share to serializers? We have a new suggestion for what we need to move away from content providers. Can we get an API to get a reference to a serializer and call deserializeNotebook on it? Then, we could keep a 1:1 mapping of serializers from the host to the guest. We can include the notebook type in our RPC request and then from the notebook type, get the reference to the serializer and call it.

Would this work?

@jrieken
Copy link
Member

jrieken commented Mar 31, 2022

Things needed to onboard LS onto serializers

  • make sure vscode.resolveNotebookContentProviders includes both providers/sources
  • add command to "execute" notebook serializer, e.g vscode.executeNotebookSerializer(notebookType, data)

Rough sketch of the flow

  1. LS gets notebook types (file pattern, name, type) from host
  2. for guest LS adds serializer per type (serializer knows its type)
  3. serializer forward their type and given data to the host where it runs the executeNotebookSerializer command, host returns data from command to guest

jrieken added a commit that referenced this issue Apr 1, 2022
…Notebook` to execute notebook serializers by type, #125757
@jrieken jrieken added this to the April 2022 milestone Apr 1, 2022
@jrieken
Copy link
Member

jrieken commented Apr 1, 2022

I have pushed changes to make sure the vscode.resolveNotebookContentProviders works (it was broken since 579e0a3), I ensures that it does include all kinds of notebook data providers.

I have also pushed to new commands that allow to "execute" a notebook serializer: vscode.executeDataToNotebook and vscode.executeNotebookToData. Both need to be called with a notebook type and the respective data. This is a sample:

test('command: vscode.executeDataToNotebook', async function () {
const value = 'dataToNotebook';
const data = await vscode.commands.executeCommand<vscode.NotebookData>('vscode.executeDataToNotebook', notebookType, new TextEncoder().encode(value));
assert.ok(data instanceof vscode.NotebookData);
assert.strictEqual(data.cells.length, 1);
assert.strictEqual(data.cells[0].value, value);
});
test('command: vscode.executeNotebookToData', async function () {
const value = 'notebookToData';
const notebook = new vscode.NotebookData([new vscode.NotebookCellData(vscode.NotebookCellKind.Code, value, 'fooLang')]);
const data = await vscode.commands.executeCommand<Uint8Array>('vscode.executeNotebookToData', notebookType, notebook);
assert.ok(data instanceof Uint8Array);
assert.deepStrictEqual(new TextDecoder().decode(data), value);
});

@jrieken jrieken closed this as completed Apr 1, 2022
@jrieken
Copy link
Member

jrieken commented Apr 1, 2022

These changes should allow LiveShare to migrate onto the notebook serializer API. Please let me know how that goes

@alyssajotice
Copy link
Author

@jrieken things are working with the serializers for most files. I have one test file that this command does not work with and I get the following error:
image
Other .ipynb files work as expected, and other notebook types work as expected.

To repro, I set up a serializer that deserializes the content of the file with this command using the jupyter serializer.

extension.ts
import { TextEncoder } from 'util';
import * as vscode from 'vscode';

export async function activate(context: vscode.ExtensionContext) {
  context.subscriptions.push(vscode.workspace.registerNotebookSerializer('vsls-testing', new class implements vscode.NotebookSerializer {
  	deserializeNotebook(content: Uint8Array, _token: vscode.CancellationToken): vscode.NotebookData | Thenable<vscode.NotebookData> {
  		return vscode.commands.executeCommand<vscode.NotebookData>('vscode.executeDataToNotebook', 'jupyter-notebook', content); 
  	}
  	serializeNotebook(data: vscode.NotebookData, _token: vscode.CancellationToken): Uint8Array | Thenable<Uint8Array> {
  		return new TextEncoder().encode(data.cells[0].value);
  	}
  }, {}, {
  	displayName: 'LS',
  	filenamePattern: ['*'],
  	exclusive: true
  }));
}

With this extension activated, open this file:

nb.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "dotnet_interactive": {
     "language": "html"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Hello world\n"
     ]
    }
   ],
   "source": [
    "print(\"Hello world\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    }
   },
   "source": [
    "## Hello W  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    }
   },
   "outputs": [],
   "source": [
    "Console.WriteLine(\"hello\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "dotnet_interactive": {
     "language": "csharp"
    }
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Print()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "70adf7d9d67fce19c0d10a40853929ab78ed9880e9ec63b083cd87fcc27327e6"
  },
  "kernelspec": {
   "display_name": ".NET (C#)",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.9"
  },
  "metadata": {
   "interpreter": {
    "hash": "0a6456904a4e417f6aca58f6dd6b4feac7b82dab23c8ff870c0b8bcfafad0149"
   }
  },
  "orig_nbformat": 3
 },
 "nbformat": 4,
 "nbformat_minor": 2
}


Two other issues I'm working on to remove content providers:

  1. We previously used the openNotebook (the function the guest calls with the notebook data to be deserialized) call on the host to block excluded files from being opened. We need the filename to do this.
  2. We create our cell managers in openNotebook on the host. These managers coordinate coediting and cell operations between the guest and host. These edits and operations are tracked via filename.

cc @daytonellwanger for visibility on these two issues.

@jrieken
Copy link
Member

jrieken commented Apr 19, 2022

call on the host to block excluded files from being opened.

Not sure what excluded means in this in this context?

We create our cell managers in openNotebook on the host. These managers coordinate coediting and cell operations between the guest and host. These edits and operations are tracked via filename.

Would the onDidOpenNotebook and onDidCloseNotebook events be an alternative for that?

@alyssajotice
Copy link
Author

For the excluded scenario, the host can have a list of files that they do not want to show to guests. When the guest requests an open, we check the list of excluded files and do not return the notebook data if the requested notebook is marked as excluded.

For the cell managers, I think we can use the onDidOpenNotebook call on the guest, send an RPC notification to the host, and create the cell managers there.

Do you have any updates about the cause of the error I mentioned?

@jrieken
Copy link
Member

jrieken commented Apr 19, 2022

Do you have any updates about the cause of the error I mentioned?

Please file a new, dedicated issue for that.

When the guest requests an open, we check the list of excluded files and do not return the notebook data if the requested notebook is marked as excluded.

Wouldn't it make more sense to do the check in the layer that implements the file system? E.g I assume the guest has a virtual file system which is implemented by the host. To me that looks like a better place for such checks, otherwise the guest can always workaround those protections, e.g have an extension snippet doing the equivalent of vscode.workspace.fs.readFile(vscode.Uri.parse('vsls:///some/excluded/file.nb'))

@alyssajotice
Copy link
Author

New issue here: #147711 (comment).

@alyssajotice
Copy link
Author

I'm not sure if here is the right place to continue this conversation (perhaps #147248 would be better?).

You're correct about the exclusion issue. We can block excluded files from being read (and then handed to the serializer) in our file system.

However, the cell ids remain an issue, and I'm not sure that we can switch to serializers with the current APIs. We depend on the NotebookDocument returned from the host to have associated cell ids that we use for all coediting functionality. Unfortunately with serializers, the host is returning the notebook contents that come from on disk, which does not include unsaved changes (like our cell ids). We have a few ideas to work around this, but they are a complicated and not ideal.

I can schedule this work for after Build, but it will take some time to figure out the right way to rework this. In the meantime to move us away from content providers, could we add an option to vscode.proposed.notebookLiveShare.d.ts, where the deserialize call has an overload with the uri of the file with the contents that need to be deserialized?

I can explain the cell id situation in greater detail if that would be helpful.

@jrieken
Copy link
Member

jrieken commented Apr 21, 2022

I can explain the cell id situation in greater detail if that would be helpful.

Yeah, that would be great because I believe I am missing the bigger picture.

Adding the uri-parameter to these calls might be possible but there is no guarantee that we always have this information. Internally this is just a context-free transform operation from bytes to structured data (similar to how text encoding/decoding works) and that can occur without a file/uri at hands. I also want to make sure that we are on the same page wrt lifecycle: Serializing and deserializing a notebook can occur multiple times "per notebook", e.g we deserialize for the initial open but it can happen again without the notebook having been closed/opened. In short: taking the deserialize call as signal to install something is likely wrong and error prone. The open/close events should be used instead. I also want to learn about the cell ids and how it drives coediting, I thought LS build upon the fact that every notebook cell is also a text document.

@jrieken jrieken removed the feature-request Request for new features or functionality label Apr 25, 2022
@github-actions github-actions bot locked and limited conversation to collaborators May 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants
@Downchuck @rebornix @jrieken @tanhakabir @gregvanl @alyssajotice and others