Add steps to destroy documents that are ineligible for receiving message when posting message through broadcast channel #8972

lozy219 · 2023-03-02T07:26:20Z

As discussed in #7253, we should clearly spec that when posting a message through a broadcast channel, if the destination is not eligible for messaging, the associated document should be made unsalvageable. This PR adds steps to do this in the 9.5 Broadcasting to other browsing contexts section.

At least two implementers are interested (and none opposed):
- Gecko: How should BroadcastChannel and BFCache interact? #7253 (comment)
- Chromium: How should BroadcastChannel and BFCache interact? #7253 (comment)
Tests are written and can be reviewed and commented upon at:
- To be done with https://crbug.com/1420862
Implementation bugs are filed:
- Chromium: https://crbug.com/1420862

cc @fergald @rakina @domenic @annevk @smaug---- @asutherland please help to take a look 🙇

fergald · 2023-03-03T04:58:26Z

@domenic could you take a look at this please and advise on how to land it?

rakina · 2023-03-03T05:26:00Z

+    <ul>
+     <li><p>If <var>destination</var>'s <span>relevant global object</span> <var>o</var> is a
+     <code>Window</code>, <span data-x="destroy a document">destroy</span> <var>o</var>'s
+     <span data-x="concept-document-window">associated <code>Document</code></span>.</p></li>


I think it might be worth adding a note clarifying that it is OK to trigger "destroy a document" here because the non-fully active document is either detached or is a BFCached document, which shouldn't receive BroadcastChannel messages and can't be restored after missing this message (maybe also link to https://w3ctag.github.io/bfcache-guide/#gate-fully-active)

domenic · 2023-03-03T06:16:56Z

> @domenic could you take a look at this please and advise on how to land it?

Will do, but it will likely take a few days as my review backlog is quite large at the moment.

annevk

I discussed this with some colleagues and we're not really okay with destroying the cache for this case. ~~We currently deliver the messages, but we are open to not delivering them, if that would be better.~~ I suspect that really depends on how BroadcastChannel is used so it's probably not possible to pick something that works well everywhere. Though perhaps we could pick a default and let websites configure it in the constructor.

Edit: as per discussion below WebKit does not deliver the messages.

fergald · 2023-03-07T04:54:16Z

@annevk From my testing (see below) Mozilla and WebKit (desktop) both drop these if the receiver is in BFCache. Do you have a demonstration page where they are queued and delivered?

BroadcastChannel
- Chrome destroys (actually will not allow the page to enter BFCache but we want to relax this)
- FF/WebKit drops

My reading of the spec right now is that these should be dropped because they are only delivered to active documents. A big concern is that BroadcastChannel is used to log out all tabs when one tab logs out. Dropping these will result in logged-in state being restored from BFCache, which is a serious privacy problem.
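To make the logout concern concrete, here is a minimal in-memory sketch. It does not use the real BroadcastChannel API; `Tab`, `broadcastDropping` and the `"logout"` message are hypothetical names chosen only to model the "drop while in BFCache" behaviour.

```typescript
// Hypothetical model of two tabs that share a logout channel.
class Tab {
  loggedIn = true;
  inBFCache = false;

  onMessage(message: string): void {
    if (message === "logout") this.loggedIn = false;
  }
}

// Models the "drop" behaviour: tabs in BFCache never see the message.
function broadcastDropping(tabs: Tab[], message: string): void {
  for (const tab of tabs) {
    if (!tab.inBFCache) tab.onMessage(message);
  }
}

const visible = new Tab();
const cached = new Tab();
cached.inBFCache = true;

broadcastDropping([visible, cached], "logout");
cached.inBFCache = false; // restored by a back navigation

// The restored tab missed the logout and still believes it is logged in.
const staleLogin = cached.loggedIn;
```

In this model the visible tab logs out while the restored tab keeps its logged-in state, which is the failure mode described above.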

In general, an API that is reliable until BFCacheing occurs will lead to hard to reproduce bugs.

We're fine with queuing but it would be hard to spec limits on queue size. Leaving that implementation dependent is an option. @asutherland how do you feel about that?

If we queue them, I think the timing is a bit ambiguous. Should they run as soon as the document is active again, i.e. before visibilitychange and pageshow events? That's my fairly naive reading of the spec.

cdumez · 2023-03-07T17:05:16Z

WebKit doesn't drop b/f cache entries based on BroadcastChannel (or any other Web APIs for that matter). We also do not prevent entering the b/f cache based on BroadcastChannel.

B/f cache entries are very valuable and I personally do not think it is a good idea to discard its entries based on something like BroadcastChannel messages being broadcasted.

asutherland · 2023-03-07T18:32:11Z

Re: @fergald

> We're fine with queuing but it would be hard to spec (#7253 (comment)) limits on queue size. Leaving that implementation dependent is an option. @asutherland how do you feel about that?

I think we're eventually going to need to spec the size of structured serialization and storage actions, so I don't think we should view that as insurmountable (but I'm also unable to do the spec work at this time, so I won't presume we have that for this discussion).

A compromise if we are going to queue/buffer things is to set a minimum number of messages that will be buffered, like 3. @cdumez would webkit be interested in supporting buffering on a per-BroadcastChannel basis? (That is, each BroadcastChannel would have its own buffer as opposed to having a shared buffer across all BroadcastChannels associated with a global.) Buffering per-channel allows a page to partition the messages they really want reliable-ish transport for to happen on a rarely used channel, like for "log out" versus "spammy heartbeats".

The upper bound could be implementation dependent, which also makes sense because browsers are going to have implementation-dependent heuristics and underlying implementation costs that determine when a bfcached thing is no longer worth its cost.

> If we queue them, I think the timing is a bit ambiguous. Should they run as soon as the document is active again, i.e. before visibilitychange and pageshow events? That's my fairly naive reading of the spec.

It seems desirable that a page can unambiguously know if it's seeing buffered messages from BroadcastChannel (or other APIs; ex: some discussion re sessionStorage noting that storage APIs have it easier since they have a canonical backing store). It seems like there are some options for this:

- We add a flag to Events or their bufferable sub-classes that explicitly indicates if they're coming from buffer playback or not. It might be useful to note if the buffer overflowed or not, as one could imagine very complex pages could have a fast path and a slow path that varies on whether there was an overflow.
- We book-end the buffered events so that we schedule the "pageshow" event task, then all of the buffered tasks, then an "all-done-with-buffered-tasks" event task or a promise that we added to "pageshow" that resolves once all the tasks have been dispatched. This has the advantage of letting page logic defer actions until it knows it is caught up to realtime which seems important. I expect in the absence of this, pages may just use setTimeout(0) which probably would work but limits implementation flexibility in the future. (Although I'm not sure of a situation where it would be desirable to let a freshly created timer interleave with the buffered event playback).
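A rough sketch of the first option combined with per-channel buffering might look like the following. Everything here is hypothetical: `PerChannelBuffer`, the `buffered` and `overflowed` fields, and the drop-oldest overflow policy are assumptions made for illustration, not anything specified or implemented.

```typescript
// Hypothetical shape of a replayed message carrying the proposed flags.
interface ReplayedMessage<T> {
  data: T;
  buffered: boolean;   // true when delivered from buffer playback
  overflowed: boolean; // true when older messages were lost before playback
}

// One buffer per channel, so a noisy channel cannot push out a quiet one.
class PerChannelBuffer<T> {
  private readonly capacity: number;
  private pending: T[] = [];
  private overflowed = false;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Called for each message that arrives while the document is in BFCache.
  push(data: T): void {
    this.pending.push(data);
    if (this.pending.length > this.capacity) {
      this.pending.shift(); // assumption: the oldest message is dropped
      this.overflowed = true;
    }
  }

  // Called on restore: returns the buffered messages, flagged, and resets.
  drain(): ReplayedMessage<T>[] {
    const overflowed = this.overflowed;
    const replay = this.pending.map((data) => ({ data, buffered: true, overflowed }));
    this.pending = [];
    this.overflowed = false;
    return replay;
  }
}

const logout = new PerChannelBuffer<string>(3);
const heartbeat = new PerChannelBuffer<number>(3);
logout.push("log out");
for (let i = 0; i < 10; i++) heartbeat.push(i);

const logoutReplay = logout.drain();
const heartbeatReplay = heartbeat.drain();
```

Because each channel has its own buffer, the ten heartbeats overflow only the heartbeat buffer, and the single "log out" message survives with `overflowed` set to false.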

petervanderbeken · 2023-03-07T19:57:51Z

> * FF/WebKit drops

Firefox doesn't drop messages, it removes the page from the BFCache if a message is sent to the open BroadcastChannel while in the BFCache.

From what I can tell by testing, WebKit keeps the page in the BFCache but drops the messages sent to the open BroadcastChannel while in the BFCache. @annevk, could you confirm that that's actually WebKit's behaviour? That seems quite weird as a behaviour to me. I think it makes more sense to either remove the page from the BFCache, or to queue the messages. We chose the first one because queueing is definitely more complicated (need to find a sensible limit, …).

fergald · 2023-03-07T23:54:42Z

Clarifications. When I said "drops" I meant drops the message. So the current state is

- Chrome: no bfcache with BC
- Mozilla: evicts if message is received while in BFCache
- WebKit: drops message

I thought I saw Mozilla behaving the same as WebKit but I cannot reproduce it.

To make life a little easier, I have created a WPT that sends a message to a page in BFCache. It is NOTRUN on Chrome and Mozilla (as expected). I don't have a way to test it on WebKit but I expect it will FAIL.

fergald · 2023-03-08T00:15:55Z

Can we agree on something before discussing details?

If we restore a page from BFCache it must either

- receive all BC messages or
- have opted-in to dropping messages and can find out when that happened

@cdumez @annevk That is not WebKit's current behaviour, so I would particularly like your response on this.

@petervanderbeken I think you already agree but an explicit response would be helpful.

@asutherland this may be slightly different to your position because it requires an explicit opt-in. This doesn't prevent queuing by default however it means that if the queue overflows, we evict, unless the page explicitly told us that it's OK to drop.

cdumez · 2023-03-08T00:20:39Z

> Can we agree on something before discussing details?
>
> If we restore a page from BFCache it must either
>
> - receive all BC messages or
> - have opted-in to dropping messages and can find out when that happened
>
> @cdumez @annevk That is not WebKit's current behaviour, so I would particularly like your response on this.

I am not opposed to receiving all BC messages when restoring a page from the b/f cache (probably up to a certain limit).
I thought this was what we implemented, but it wouldn't surprise me if we were dropping messages instead.

Either way, we haven't yet noticed any breakage from our current behavior.

petervanderbeken · 2023-03-08T09:40:24Z

> @petervanderbeken I think you already agree but an explicit response would be helpful.

I discussed it a bit with @smaug----, we still prefer no queueing. Queueing is complicated to spec and implement, ideally you'd then want to specify the limits (taking into account message sizes too). We already have compat issues around limits of just the length of the session history itself for example.

So our position would be: no queueing, evict on message with an opt-in for dropping messages instead.

fergald · 2023-03-08T23:42:04Z

@petervanderbeken Do we need to spec the details of queueing?

I think you and I agree there should be only 2 outcomes:

- document is restored from BFCache without missing messages
- document is reloaded

To achieve that, we can just say that the messages are queued and delivered if the document becomes active again. If a browser wants to evict with no queueing they are in spec because the document never becomes active again.

fergald · 2023-04-18T01:01:37Z

Pinging this. Does anyone object to speccing the following?

- documents with an open BroadcastChannel can go into BFCache
- documents reactivated from BFCache should receive all the messages that would have arrived
- (doesn't really need to be said but for clarity here) browsers are free to destroy the document and evict the BFCache entry if they want to, e.g. if the queue of messages gets too big
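The three points above can be sketched as a small state model. This is only an illustration under stated assumptions: `ModelDocument`, `queueLimit` and the method names are hypothetical, and the limit stands in for whatever implementation-defined threshold a browser might choose.

```typescript
// Hypothetical model; none of these names come from the spec or a browser.
type DocState = "active" | "bfcached" | "evicted";

class ModelDocument {
  state: DocState = "active";
  readonly delivered: string[] = [];
  private readonly queueLimit: number;
  private queue: string[] = [];

  constructor(queueLimit: number) {
    this.queueLimit = queueLimit;
  }

  enterBFCache(): void {
    if (this.state === "active") this.state = "bfcached";
  }

  receive(message: string): void {
    if (this.state === "active") {
      this.delivered.push(message);
    } else if (this.state === "bfcached") {
      if (this.queue.length >= this.queueLimit) {
        // Queue is full: evict the entry rather than lose a message.
        this.state = "evicted";
        this.queue = [];
      } else {
        this.queue.push(message);
      }
    }
    // An evicted document receives nothing; it will be reloaded instead.
  }

  // Returns true when the document is usable without a reload.
  restore(): boolean {
    if (this.state !== "bfcached") return this.state === "active";
    this.state = "active";
    this.delivered.push(...this.queue);
    this.queue = [];
    return true;
  }
}

const patient = new ModelDocument(2);
patient.receive("a");
patient.enterBFCache();
patient.receive("b");
patient.receive("c");
const patientRestored = patient.restore();

const strict = new ModelDocument(0);
strict.enterBFCache();
strict.receive("x");
const strictRestored = strict.restore();
```

With a limit of 2 the document is restored and sees every message in order. With a limit of 0 the first message evicts the entry, so the only remaining outcome is a reload. In both cases a restored document never has a gap in its messages.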

petervanderbeken · 2023-04-18T13:57:48Z

We're ok with that proposal. We will probably keep our current behaviour at first, which is essentially a message queue length of 0, but the third bullet point allows for that.

I think the main question is if WebKit is interested in changing their behaviour of dropping messages.

annevk · 2023-05-08T15:44:38Z

That depends on how often it would mean a cache eviction where currently there is none. We don't really want to regress on that as we haven't seen issues with our current behavior.

fergald · 2023-06-01T07:47:35Z

@annevk

> That depends on how often it would mean a cache eviction where currently there is none. We don't really want to regress on that as we haven't seen issues with our current behavior.

Less than 0.4% of history navigations in Chrome have a BC open. I do not know what fraction of that would receive messages.

I can't find any discussion or documentation of BC being unreliable beyond this thread. Before #7296, the spec seemed to say that messages should always be delivered. I don't think devs are aware that it is not reliable, especially as it is reliable on Firefox and Chrome. Pages using BC as a heartbeat will be fine, but sites using BC to send incremental updates are definitely broken by this, probably in ways that are very hard to trace back to BC+BFCache.

I also don't know what guidance we could give to sites that want a reliable BC. Rebuilding it from other APIs seems hard. If you don't care about reliability but do care about BFCache, closing your BCs in pagehide is easy.
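A minimal sketch of the "close in pagehide" approach follows. `closeChannelWhileHidden`, `PageLike` and `Closable` are hypothetical names; the structural types exist only so the sketch can run outside a browser. In a real page the same logic would be attached with `window.addEventListener("pagehide", …)` and `window.addEventListener("pageshow", …)`, and `open` would create a `new BroadcastChannel(…)` and install its message handler.

```typescript
// Anything that can be closed, e.g. a BroadcastChannel.
interface Closable {
  close(): void;
}

// The small part of the page lifecycle this sketch relies on.
interface PageLike {
  addEventListener(
    type: "pagehide" | "pageshow",
    listener: (event: { persisted: boolean }) => void,
  ): void;
}

function closeChannelWhileHidden(
  page: PageLike,
  open: () => Closable,
): { current(): Closable | null } {
  let channel: Closable | null = open();

  // Close before the page can enter BFCache, so no message targets it there.
  page.addEventListener("pagehide", () => {
    if (channel !== null) channel.close();
    channel = null;
  });

  // Reopen only when the page really came back from BFCache.
  page.addEventListener("pageshow", (event) => {
    if (event.persisted && channel === null) channel = open();
  });

  return { current: () => channel };
}
```

The page gives up any messages sent while it was hidden, so after reopening it would typically need to refresh its state from some other source.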

domenic · 2023-06-01T08:38:39Z

We discussed this in the triage call (#9308) today. Summary:

- WebKit would be open to a queue with an implementation-defined limit while in bfcache, with implementation-defined behavior when the limit is reached. Others on the call were not excited about the implementation-defined post-limit behavior.
- To reach our standard of "2+ implementers support, no strong implementer objections" on requiring eviction when the limit is reached, @annevk would need to check with the rest of the WebKit team as to whether they would strongly object to evicting on exceeding a limit.
  - What would help here is more-specific numbers, which @fergald could help collect, e.g.: what percent of back navs have a BC channel open and receive 1+ messages, 2+ messages, N+ messages, etc. (Or maybe in KiB...)
  - I pointed out that if WebKit sets a limit on the same order of magnitude as "a bfcached web page", it's very unlikely that limit will ever be reached.
- There was some discussion of how this is similar to, or not similar to, storage events. It seems like previously the decision was to not fire any storage events for bfcached pages: Session storage and changing browsing contexts storage#119 (comment) although this was never reflected into the spec. @annevk would be comfortable with dropping all BroadcastChannel messages for bfcached pages in this same manner, but others on the call were not excited about making BroadcastChannel a lossy communications channel.
- In general @annevk was skeptical that sites using BroadcastChannel would correctly respond to queued-and-replayed messages after being taken out of bfcache. Examples of such sites, which would break their functionality with dropped messages but work correctly with replayed messages (especially 2+ replayed messages) would be helpful.

fergald · 2023-06-15T02:09:56Z

I don't have data on the fraction of restores that would be impacted by a queue size of N. Gathering that would require implementing this change. I also can't test that nothing fails with queueing because that requires implementing queueing, I can only look at the code.

In general in the examples below, I cannot see a reason why anything would end up in an incorrect state if messages were delivered after restore. I can see how some cases may result in a lot of work being performed if multiple events are delivered.

There is no timing guarantee on BC messages or guarantees of ordering with respect to other message channels. So I would say that if an app cannot correctly receive 2+ messages on restore, then it was already buggy without BFCache. Although for some apps, the day-to-day likelihood of hitting such a bug could be zero.

If we believe that 2+ messages is commonly problematic, that seems like an argument for a max queue size of 1.

Examples

TL;DR

- it's easy to find things that will break with dropped messages, including some widely used code (Firebase)
- there are some examples that looked like they would be OK with dropped messages
- I couldn't find any that would obviously break with a queue
- replaying a large queue on restore will be a performance problem for some pages

Sources

I got most of these from https://snyk.io/advisor/npm-package/broadcast-channel/example. This is basically a polyfill for BroadcastChannel, defaulting to native BroadcastChannel. In the first 5 examples, 4 look like a problem and 1 looked like it was using BroadcastChannel to broadcast within the same document, so would not be a problem. I don't know how popular these apps/packages are.

The Firebase example comes from searching httparchive, it is very widely used. I found it very hard to uncover other understandable usages of BC via httparchive, so I stopped.

I also searched Google's internal source but it's hard for me to know what's launched/public and everything gets minified and finding the public version of the JS is a chore, so I'm not sharing any examples from there but there were a couple that looked problematic.

Telegram

TDWeb is telegram.org's library for building web clients. BroadcastChannel is used in the closeOtherClients method. It appears that this is called whenever a client is constructed, I suppose it ensures that you only have 1 live client at a time. This would fail if there was a client in BFCache and the message was dropped. It seems like it would be fine to deliver 2+ of these.

Firebase

https://cs.opensource.google/firebase-sdk/firebase-js-sdk/+/master:packages/installations/src/helpers/fid-changed.ts;l=97?q=%22new%20BroadcastChannel%22
broadcasts changes to the FID which I believe is the user-id. A tab that misses this broadcast will remain on the old user ID. This sounds bad if all tabs are supposed to be on the same FID.
Again, I don't see why delivering 2+ of these would break anything but if dealing with them is costly, it could result in jank.

RXDB

RXDB is a JS database that uses BroadcastChannel to make sure all tabs using the DB are in sync. That's going to fail if messages are dropped.
I don't see any reason for it to fail if messages are delivered after BFCache restore.

Lektor

Lektor is some kind of CMS. It sends a signal to reload all open tabs when changes are made. This signal comes to a worker which then sends it via BroadcastChannel. A tab coming back from BFCache would stay out of sync.
Delivering 2+ of these seems like it would not break, just reload excessively.

rakina · 2023-09-21T03:46:01Z

+   browsing context</span>'s <code>WindowProxy</code>'s <span data-x="concept-windowproxy-window">
+   [[Window]]</span>.</p></li>
+
+   <li><p>Let <var>channels</var> be a list of <code>BroadcastChannel</code> objects whose


In step 9 of postMessage, it looks like the channels are sorted by creation order. Should we add that step here as well?

lozy219 · 2023-09-21

I think when broadcasting a message to multiple channels, we have to sort them in order to preserve the order, but here we are flushing all the messages received on different channels on page reactivation. It would be a bit subtle to define the correct order of the channels, given that the messages could have been received at different times.

Also, the sorting behavior does not seem to be compulsory, given this sentence:

> This does not define a complete ordering. Within this constraint, user agents may sort the list in any implementation-defined manner.

I will leave this open for now, and change the content if necessary.

rakina · 2023-09-21

The messages from each channel's queue might be from different times, yes. But for each message, the ordering of the channels that got it should be the same, right? (e.g. message X is received and there are channels A & B, and X should be received by the older channel first, regardless of whether X is sent immediately or later on after being queued). I think it would be good to preserve that property in this case.

lozy219 · 2023-09-22

Thanks. I have added a similar sentence here. Since all the channels here should have the same "relevant agents", maybe we just need to mention the order. Is that good enough?
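The ordering property discussed in this thread can be illustrated with a short sketch. It assumes the flush visits channels oldest first and empties each channel's queue in turn; `QueuedChannel`, `creationIndex` and `flushOnReactivation` are hypothetical names, not spec text.

```typescript
// Hypothetical stand-in for a BroadcastChannel that queued messages while
// its document was in BFCache.
interface QueuedChannel {
  label: string;
  creationIndex: number; // smaller means created earlier
  queue: string[];
}

// Flushes every queue on reactivation, oldest channel first.
// Returns a log of deliveries in the order they would be dispatched.
function flushOnReactivation(channels: QueuedChannel[]): string[] {
  const deliveries: string[] = [];
  const ordered = [...channels].sort((a, b) => a.creationIndex - b.creationIndex);
  for (const channel of ordered) {
    for (const message of channel.queue) {
      deliveries.push(`${channel.label} <- ${message}`);
    }
    channel.queue = [];
  }
  return deliveries;
}

const channelB: QueuedChannel = { label: "B", creationIndex: 2, queue: ["X", "Y"] };
const channelA: QueuedChannel = { label: "A", creationIndex: 1, queue: ["X", "Y"] };
const deliveries = flushOnReactivation([channelB, channelA]);
```

Here the older channel A receives X before the newer channel B does, matching the order the two channels would have seen X had it been delivered immediately.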

lozy219 · 2023-09-22T07:00:54Z

Hi @annevk, from #8972 (comment), it looks like we have reached agreement that specifying the queueing behavior with an implementation-defined capacity is acceptable. We have made some updates to this pull request; could you take a look and confirm whether there is any objection from the WebKit side?

Also, do @fergald's examples in #8972 (comment) resolve your concern about whether sites are able to handle dropped messages? Do you need us to collect any more data to support the proposal?

Thanks.

annevk · 2023-11-16T10:46:24Z

I feel like there's still no significant data either way that WebKit's existing behavior is problematic. And while we're open to adjusting it in terms of attempting to replay messages, we'd probably still err on the side of wanting to revive a document over replaying its messages. When forced to make a choice between those that is.

fergald · 2024-06-14T06:42:13Z

Getting back to this as we have someone working on it again.

> I feel like there's still no significant data either way that WebKit's existing behavior is problematic. And while we're open to adjusting it in terms of attempting to replay messages, we'd probably still err on the side of wanting to revive a document over replaying its messages. When forced to make a choice between those that is.

I've linked to code that definitely breaks: the Firebase code will leave a page in the logged-in state when it should be logged out. Nobody has ever debugged this hard-to-repro problem to the point of filing a bug against Safari, but the problem is still there.

If we adopt this behaviour then we need to document that BroadcastChannel is not a reliable communication channel.

I am curious what else Safari does. Does it drop or deliver storage events that occur while in BFCache?

lozy219 changed the title ~~Add steps to destroy the ineligible document when posting message through broadcast channel~~ Add steps to destroy documents that are ineligible for receiving message when posting message through broadcast channel Mar 2, 2023

rakina reviewed Mar 3, 2023

annevk reviewed Mar 7, 2023

annevk added topic: history topic: serialize and transfer labels Mar 7, 2023

This was referenced Mar 28, 2023

Interactions with the BFCache whatwg/fs#17

Open

Back-forward cache meta-bug #5880

Open

fergald mentioned this pull request Apr 18, 2023

Determine interaction with back forward cache w3c/webtransport#326

Closed

domenic added the agenda+ To be discussed at a triage meeting label Apr 20, 2023

past mentioned this pull request Apr 20, 2023

Upcoming HTML standard issue triage meeting on 4/20/2023 #9132

Closed

past mentioned this pull request May 4, 2023

Upcoming HTML standard issue triage meeting on 5/4/2023 #9193

Closed

rakina mentioned this pull request May 25, 2023

Make document unsalvageable rubberyuzu/html#2

Merged

past removed the agenda+ To be discussed at a triage meeting label Jun 2, 2023

past mentioned this pull request Jun 2, 2023

Upcoming HTML standard issue triage meeting on 6/1/2023 #9308

Closed

past mentioned this pull request Jun 15, 2023

Upcoming HTML standard issue triage meeting on 6/15/2023 #9375

Closed

Add queueing spec for broadcast channel

7a3925f

lozy219 force-pushed the destroy-ineligible-document-from-broadcast-channel branch from 202f9ae to 7a3925f Compare September 20, 2023 06:44

lozy219 added 5 commits September 20, 2023 17:39

Queue the serialized messages instead of the tasks to post the message

36a2d79

Fix wrong span attribute

ccca53a

Update the document reactivation section

9a9e3c9

Fix typo

461ec90

Fix wrong position of notes

f61e8bb

rakina reviewed Sep 21, 2023

lozy219 added 2 commits September 21, 2023 17:04

Update according to review

377e439

Sort the channels for the restored document's window

6208595

fergald mentioned this pull request Oct 31, 2023

Upcoming HTML standard issue triage meeting on 11/2/2023 #9874

Closed

past added agenda+ To be discussed at a triage meeting and removed agenda+ To be discussed at a triage meeting labels Oct 31, 2023

past mentioned this pull request Nov 16, 2023

Upcoming HTML standard issue triage meeting on 11/16/2023 #9909

Closed

past mentioned this pull request Dec 1, 2023

Upcoming WHATNOT meeting on 11/30/2023 #9937

Closed

nathanmemmott mentioned this pull request Jan 3, 2024

Define file locking in inactive pages whatwg/fs#154

Open

5 tasks
