Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce timeout for keeping connection contexts alive #13082

Merged
merged 7 commits into from
Dec 18, 2023

Conversation

tsmaeder
Copy link
Contributor

What it does

Fixes #12823
The basic idea of this PR is that when a the websocket connecting the front end to back-end services disconnects, we don't throw the connection context on the back end away right away, but we keep it around in case the front end reconnects. Only after a configurable timeout do we close the connection context.

  • refactor front end to allow for multiple reconnections
  • remove IWebsocket abstractions
  • separate front end connections from service channel management
  • introduce mechanism to reconnect front end to existing connection context based on timeouts

Contributed on behalf of STMicroelectronics

How to test

I've added the following code into sample-menu-contribution.ts:

const reconnectCommand: Command = {
    id: 'test.reconnect',
    label: 'Test frontend reconnect'
};

@injectable()
export class SampleCommandContribution implements CommandContribution {

    @inject(QuickInputService)
    protected readonly quickInputService: QuickInputService;

    @inject(MessageService)
    protected readonly messageService: MessageService;

    // Test command contribution

    @inject(WebsocketConnectionSource)
    protected connectionProvider: WebsocketConnectionSource;

    registerCommands(commands: CommandRegistry): void {
        commands.registerCommand(reconnectCommand, {
            execute: async () => {
                const timeoutString = await this.quickInputService.input({
                    value: '500',
                    validateInput: input => Promise.resolve(Number.parseInt(input) === undefined ? 'not an integer' : undefined)
                });
                if (timeoutString) {
                    // eslint-disable-next-line @typescript-eslint/no-explicit-any
                    (this.connectionProvider as any).socket.disconnect();
                    setTimeout(() => {
                        // eslint-disable-next-line @typescript-eslint/no-explicit-any
                        (this.connectionProvider as any).socket.connect();
                    }, Number.parseInt(timeoutString));
                }
            }
        });

The command lets you disconnect the front end websocket for a configurable amount of milliseconds. This allows to simulate the disconnection for different amounts of time.
An application package can configure the timeout behaviour in it's package.json. here's the electron example, for example:

theia": {
    "target": "electron",
    "frontend": {
      "config": {
        "applicationName": "Theia Electron Example",
        "reloadOnReconnect": true
      }
    },
    "backend": {
      "config": {
        "frontendConnectionTimeout": -1
      }
    }
  },

any negative value for frontendConnectionTimeout will be interpreted as "keep forever". Any positive value or zero will be milliseconds to wait before discarding the connection context after the websocket closes.

The testing procedure is to run through different disconnect/reconnect scenarios:

  1. Close windows
  2. Refresh Windows
  3. SSH remote connection connect/disconnect
  4. Timeout behaviour with/without reloadOnReconnect (you'll have to rebuild the application to play around with the settings).
  5. Upon reconnecting, the IDE should work just as before, even when messages have been generated during the disconnection
  6. Makes sure all services (including plugins) work in all scenarios

Follow-ups

There are some deprecations in the code that we should clean up eventually.

Review checklist

Reminder for reviewers

@tsmaeder
Copy link
Contributor Author

@thegecko I've moved some stuff from "common" to "browser". Maybe have a look?

@thegecko
Copy link
Member

@thegecko I've moved some stuff from "common" to "browser". Maybe have a look?

It looks like this is just electron-connection-handler.ts moving out of common which is fine as long as this isn't used in the browser.

@tsmaeder tsmaeder requested a review from msujew December 1, 2023 13:28
Copy link
Member

@msujew msujew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality wise, this is looking quite good. Will perform a final testing review once the code comments are resolved.

Fixes eclipse-theia#12823

- refactor front end to allow for multiple reconnections
- remove IWebsockt abstractions
- separate front end connections from service channel management
- introduce mechanism to reconnect front end to existing connection
  context based on timeouts

Contributed on behalf of STMicroelectronics

Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
Copy link
Member

@msujew msujew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works well when using Theia the "normal" way.

However, I've noticed during my final testing round that this breaks the Remote SSH feature. Unfortunately I don't have time this week to see why it breaks. I can take a look at it next week, assuming you haven't found the issue yourself in the meantime.

@tsmaeder
Copy link
Contributor Author

I did test the remote ssh feature. Can you share your scenario, please?

@msujew
Copy link
Member

msujew commented Dec 14, 2023

I can consistently reproduce it by trying to remote onto a Gitpod workspace.

@msujew
Copy link
Member

msujew commented Dec 14, 2023

@tsmaeder Ok, so it doesn't break it completely:

image

image

The application seems to be in a broken state: It is indeed connected to the backend of the remote machine (as can be seen in the Linux terminal). But the frontend still thinks it is connected to the local backend:

  • Opening native file browser
  • The status bar doesn't display the remote
  • Trying to open a new remote fails

I assume the dual websocket thingy that I've implemented to target both the local and the remote backend broke somehow. It's now all targeting the remote backend as far as I can tell.

Signed-off-by: Thomas Mäder <t.s.maeder@gmail.com>
@tsmaeder
Copy link
Contributor Author

@msujew I think I have fixed the remote problem (1-liner). if you could poke this once again, please?

Copy link
Member

@msujew msujew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, seems like it was a simple fix. Looks good, and can confirm that everything is working as expected now 👍

@tsmaeder tsmaeder merged commit d2bc79b into eclipse-theia:master Dec 18, 2023
14 checks passed
@tsmaeder
Copy link
Contributor Author

Merged with great satisfaction ;-) Thx for the review, @msujew

@vince-fugnitto vince-fugnitto added this to the 1.45.0 milestone Dec 21, 2023
@safisa
Copy link
Contributor

safisa commented Dec 27, 2023

Hi @tsmaeder,

Many thanks for this great improvement. Theia now works much more smoothly on disconnections, no plugin reloads, and of course, the git issue is resolved, and the main issue for us was the custom editors (or any webview) that were reloaded now are not and keep their state.

I will open later some issues (or questions) that I am facing now which I think are related to this change.

Again many thanks for this great work.

@zhuanshenlikai
Copy link

First of all, thank you very much for this contribution, it helped me solve a lot of problems.I have a question,What are the negative impacts of setting the parameter frontendConnectionTimeout to -1?

@tsmaeder
Copy link
Contributor Author

The worst thing that may happen is that you leak connection contexts in the back end. If the front end never reconnects to the back end, the back end contexts remain forever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Git plugin issue after network disconnection
6 participants