Support OnDemandUpdate with multiple components at once. #23169

Merged on May 3, 2024 (3 commits)

goodov
Member

@goodov goodov commented Apr 19, 2024

Changes in this PR:

  1. The CrxInstaller now includes the overridable IsBraveComponent() function. This allows us to distinguish Brave components from non-Brave ones and lets the updater batch multiple update requests into a single request to go-updater. This is possible because Brave components are always managed by the go-updater and are never redirected to an upstream updater.
  2. The brave_component_updater::BraveOnDemandUpdater is now the sole entry point for all Brave-specific "OnDemand" component updates.
  3. The initialization sequence of BraveOnDemandUpdater has been altered to use the OnDemandUpdater interface directly.
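
To illustrate item 1, here is a minimal sketch of how an overridable IsBraveComponent() could separate Brave-managed components from the rest. The names InstallerPolicy, BraveFilterListPolicy, UpstreamPolicy and CanBatch are hypothetical stand-ins, not the actual brave-core or Chromium types; only the IsBraveComponent() name comes from this PR.

```cpp
#include <vector>

// Hypothetical stand-in for an installer policy interface; the real
// interfaces in Chromium have many more methods.
class InstallerPolicy {
 public:
  virtual ~InstallerPolicy() = default;
  // Defaults to false, so upstream-style components need no changes.
  virtual bool IsBraveComponent() const { return false; }
};

// Hypothetical Brave-owned policy that opts in to batching.
class BraveFilterListPolicy : public InstallerPolicy {
 public:
  bool IsBraveComponent() const override { return true; }
};

// Hypothetical upstream-style policy that keeps the default.
class UpstreamPolicy : public InstallerPolicy {};

// Returns true only if the list is non-empty and every policy in it is
// Brave-managed, i.e. the whole list could be sent in one request.
inline bool CanBatch(const std::vector<const InstallerPolicy*>& policies) {
  for (const InstallerPolicy* policy : policies) {
    if (!policy->IsBraveComponent()) {
      return false;
    }
  }
  return !policies.empty();
}
```

Defaulting the virtual method to false keeps the opt-in explicit: a component is batched only if its policy says it is a Brave component.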

This PR will be followed by a few more that will:

  1. Clean up component_updater::BraveOnDemandUpdate.
  2. Replace OnDemandUpdate with conditional Install calls when a component is not installed. Currently, we perform unnecessary on-demand updates at each startup. The ComponentUpdateService should manage most updates (and batch them) using the standard schedule.

Resolves brave/brave-browser#37370

Submitter Checklist:

  • I confirm that no security/privacy review is needed and no other type of reviews are needed, or that I have requested them
  • There is a ticket for my issue
  • Used Github auto-closing keywords in the PR description above
  • Wrote a good PR/commit description
  • Squashed any review feedback or "fixup" commits before merge, so that history is a record of what happened in the repo, not your PR
  • Added appropriate labels (QA/Yes or QA/No; release-notes/include or release-notes/exclude; OS/...) to the associated issue
  • Checked the PR locally:
    • npm run test -- brave_browser_tests, npm run test -- brave_unit_tests wiki
    • npm run presubmit wiki, npm run gn_check, npm run tslint
  • Ran git rebase master (if needed)

Reviewer Checklist:

  • A security review is not needed, or a link to one is included in the PR description
  • New files have MPL-2.0 license header
  • Adequate test coverage exists to prevent regressions
  • Major classes, functions and non-trivial code blocks are well-commented
  • Changes in component dependencies are properly reflected in gn
  • Code follows the style guide
  • Test plan is specified in PR before merging

After-merge Checklist:

Test Plan:

@goodov goodov force-pushed the issues/37370 branch 5 times, most recently from 21023c6 to d2bffa3 Compare April 22, 2024 08:59
@goodov goodov added the CI/run-upstream-tests Run upstream unit and browser tests on Linux and Windows (otherwise only on Linux) label Apr 22, 2024
@goodov goodov force-pushed the issues/37370 branch 5 times, most recently from cc517d5 to f2b9124 Compare April 23, 2024 04:28
#include "base/functional/callback.h"
#include "chrome/browser/browser_process.h"
#include "components/component_updater/component_updater_service.h"
#include "brave/components/brave_component_updater/browser/brave_on_demand_updater.h"

namespace component_updater {

void BraveOnDemandUpdate(const std::string& component_id) {
Member Author

This function will be removed in a follow-up PR. It is left in place for now to avoid overcomplicating this PR.

@goodov goodov marked this pull request as ready for review April 23, 2024 10:40
@goodov goodov requested review from a team as code owners April 23, 2024 10:40
Collaborator

@mherrmann mherrmann left a comment

Does upstream have the capability added by this PR already? If yes, is our reason for needing the extra code the special redirect logic in go-update? If the answer to both of these questions is yes, then maybe we could make changes in go-update and remove our customisations in brave-core.

Specifically: I know that go-update for example redirects update requests for Widevine to Google's server. This is due to a licensing restriction, which prevents us from serving Widevine. But what if go-update instead proxied the update request? So go-update would forward the request to Google's server, and then serve that as its response (for that one component). The response would still contain Google's binary download URL, so we would not be hosting Widevine. We would only be proxying the update request, much like a normal proxy. Perhaps that could avoid all the extra complexity?

@goodov
Member Author

goodov commented Apr 24, 2024

Does upstream have the capability added by this PR already?

Not exactly. The upstream OnDemandUpdater does not expose a function to run an on-demand check for multiple components. In non-on-demand scenarios, however, the Chromium implementation batches checks by default. We split those checks into one-by-one requests because of the separation between our components and external ones, which requires a redirect.

If yes, is our reason for needing the extra code the special redirect logic in go-update? If the answer to both of these questions is yes, then maybe we could make changes in go-update and remove our customisations in brave-core.

Specifically: I know that go-update for example redirects update requests for Widevine to Google's server. This is due to a licensing restriction, which prevents us from serving Widevine. But what if go-update instead proxied the update request? So go-update would forward the request to Google's server, and then serve that as its response (for that one component). The response would still contain Google's binary download URL, so we would not be hosting Widevine. We would only be proxying the update request, much like a normal proxy. Perhaps that could avoid all the extra complexity?

It seems there is now a fingerprinting concern that argues against batch-checking external components/extensions: brave/brave-browser#37370 (comment)

Not sure if that was the original reason to do the update checks one-by-one.

cc @diracdeltas to maybe shed some light here, i.e. what we can do and what we'd like to avoid. For example, can we proxy a batch-update-check request to Google or not?

const std::vector<std::string>& ids,
component_updater::OnDemandUpdater::Priority priority,
component_updater::Callback callback) {
CHECK(on_demand_updater_);
Collaborator

Could we add verification that all ids are Brave components?
For Chromium components, it is better to call OnDemandUpdate() with a single id.

Member Author

Could we add verification that all ids are Brave components? For Chromium components, it is better to call OnDemandUpdate() with a single id.

Don't think we need this.

component_updater::OnDemandUpdater::Priority priority,
component_updater::Callback callback) {
CHECK(on_demand_updater_);
on_demand_updater_->OnDemandUpdate(id, priority, std::move(callback));
Collaborator

Maybe on_demand_updater_->OnDemandUpdate({id}, priority, std::move(callback)); to use the one entry point?

Member Author

can't do that. The single-id method has a hidden property of running Install if priority == FOREGROUND.
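
A rough model of why the two entry points are not interchangeable, assuming the behaviour described in the reply above. FakeOnDemandUpdater, Action and the simplified signatures (no callback) are hypothetical. That the multi-id overload never installs is an assumption of this sketch and is not confirmed in this thread.

```cpp
#include <string>
#include <vector>

enum class Priority { BACKGROUND, FOREGROUND };
enum class Action { kInstall, kUpdate };

// Hypothetical model of the two overloads; it only records which path a
// call would take and performs no real update.
struct FakeOnDemandUpdater {
  std::vector<Action> log;

  // Single-id overload: per the reply above, a FOREGROUND request takes
  // the Install path instead of a plain update check.
  void OnDemandUpdate(const std::string& /*id*/, Priority priority) {
    log.push_back(priority == Priority::FOREGROUND ? Action::kInstall
                                                   : Action::kUpdate);
  }

  // Multi-id overload: assumed here to always take the update path,
  // regardless of priority.
  void OnDemandUpdate(const std::vector<std::string>& /*ids*/,
                      Priority /*priority*/) {
    log.push_back(Action::kUpdate);
  }
};
```

Under these assumptions, forwarding a single id through the multi-id overload would silently drop the Install behaviour for FOREGROUND requests, which is why the two paths are kept separate.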

for (auto id_it = remaining_ids_.begin(); id_it != remaining_ids_.end();) {
const auto& component = update_context_->components[*id_it];
if (!ids.empty() && !IsBraveComponent(component.get())) {
break;
Collaborator

@atuchin-m atuchin-m Apr 24, 2024

Does it work well if remaining_ids_ contains only one Chromium component?

Member Author

Does it work well if remaining_ids_ contains only one Chromium component?

it should. This break triggers only if !ids.empty(), so at least one item must be added no matter what.
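
The guard discussed above can be sketched in isolation as follows. TakeNextBatch, the std::deque container and the is_brave predicate are hypothetical; only the `!ids.empty() && !IsBraveComponent(...)` condition is taken from the diff. The second break, which sends a leading non-Brave id alone, is an assumption about the part of the loop not shown in this thread.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Pops the next batch of ids from the front of `remaining`.
// Consecutive Brave ids are grouped; a non-Brave id travels alone.
std::vector<std::string> TakeNextBatch(
    std::deque<std::string>& remaining,
    const std::function<bool(const std::string&)>& is_brave) {
  std::vector<std::string> ids;
  while (!remaining.empty()) {
    const std::string id = remaining.front();
    const bool brave = is_brave(id);
    // Same guard as in the diff: a non-Brave id ends the batch only when
    // the batch already holds something, so a batch is never empty.
    if (!ids.empty() && !brave) {
      break;
    }
    ids.push_back(id);
    remaining.pop_front();
    // Assumption: a non-Brave id that starts a batch is sent by itself.
    if (!brave) {
      break;
    }
  }
  return ids;
}
```

With a queue holding two Brave ids, one Chromium id and another Brave id, successive calls yield the two Brave ids together, then the Chromium id alone, then the last Brave id. A queue holding a single Chromium id yields that id, which matches the answer given above.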

@goodov
Member Author

goodov commented Apr 27, 2024

Wouldn't it be a lot easier to just reduce the normal background update check interval, and to revert our sequential update request logic in favor of an implementation in go-update that makes it proxy some bits of the update request to Google? Then we would only have one update request instead of the many we have now, and on a faster schedule.

It would, but there are some questions that need to be answered first:

  1. Can we actually proxy the batch-update check to Google? Expose APIs for updating a list of CRX components brave-browser#37370 (comment)
  2. Are we okay with increasing the check frequency for all components? This will also include installed extensions.
  3. Is the AdBlock team okay with this approach? If we're switching AdBlock components to default, but more frequent update schedule, it might still not be enough and, from what I heard, we want to have AdBlock components updated at the very start.

@mherrmann
Collaborator

Can we actually proxy the batch-update check to Google?

The linked comment in this line talks about fingerprinting. I see that concern, though I wonder how much better 20+ requests within one minute really are, given that I bet 99.99999% of them come from the same IP.

we want to have AdBlock components updated at the very start.

Maybe I'm missing something, but I don't see how my proposal would change this. At the moment, Brave performs an update check for all components and extensions at browser start. I believe this comes from logic from upstream. But even if it didn't, we could kick off a background update check at start. I'd be surprised if that increased the latency until updated components are available by more than a few seconds - especially with one batch request instead of many single ones.

Are we okay with increasing the check frequency for all components?

Besides the fingerprinting issue, I don't see what would be the harm.

This will also include installed extensions.

That's true. (I tested to confirm.)

If we're switching AdBlock components to default, but more frequent update schedule, it might still not be enough

In which situation can it not be enough? We can set the background update interval as low as we want. The only scenario I can think of where background update checks are not enough is when components should be updated infrequently, but then when they are updated they need to be updated asap. I don't suspect that's the case.

@goodov
Member Author

goodov commented Apr 29, 2024

Maybe I'm missing something, but I don't see how my proposal would change this. At the moment, Brave performs an update check for all components and extensions at browser start. I believe this comes from logic from upstream. But even if it didn't, we could kick off a background update check at start. I'd be surprised if that increased the latency until updated components are available by more than a few seconds - especially with one batch request instead of many single ones.
...
In which situation can it not be enough? We can set the background update interval as low as we want. The only scenario I can think of where background update checks are not enough is when components should be updated infrequently, but then when they are updated they need to be updated asap. I don't suspect that's the case.

Your proposal is fine as long as all those questions are answered and we're cool with that, but AdBlock-specifics require OnDemandUpdate at the very start of the browser to update all its components as early as possible. A few important things to note:

  1. There's a difference between updating all components/extensions and updating only AdBlock components at the start. The update request is considered "finished" when ALL components are updated.
  2. AdBlock components update on the start must be in a single batch to properly handle the re-building of the AdBlock engine only when ALL AdBlock-related components are downloaded. There's currently an issue where we re-build the engine on each OnComponentReady, which is very bad. This PR should allow the AdBlock engine re-build to be implemented in a perf-friendly way.
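
The "re-build only once" goal in item 2 can be sketched with a simple countdown. This is a different technique from the single batch callback this PR introduces and is shown only to make the problem concrete. RebuildOnceBarrier and its members are hypothetical names, and the sketch is single-threaded.

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper: runs `on_all_ready` exactly once, after `expected`
// components have reported ready, instead of once per component.
class RebuildOnceBarrier {
 public:
  RebuildOnceBarrier(std::size_t expected, std::function<void()> on_all_ready)
      : remaining_(expected), on_all_ready_(std::move(on_all_ready)) {}

  // Call from each component's ready notification.
  void OnComponentReady() {
    if (remaining_ == 0) {
      return;  // Already fired; ignore late or duplicate notifications.
    }
    if (--remaining_ == 0) {
      on_all_ready_();  // The expensive engine re-build would go here.
    }
  }

 private:
  std::size_t remaining_;
  std::function<void()> on_all_ready_;
};
```

With three expected components, the callback runs after the third notification and not before, so the engine would be re-built once rather than three times. A barrier constructed with zero expected components never fires in this sketch.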

The proposal to allow batch updates for all components is reasonable, but it's a task that aims to optimize the general component updater flow, not the task that solves the AdBlock components update problem. Please create a separate issue to work on this, I would be happy if we remove SequentialUpdateChecker entirely.

@mherrmann
Collaborator

but AdBlock-specifics require OnDemandUpdate at the very start of the browser to update all its components as early as possible

@antonok-edm wrote the following in Slack about the motivation for updating ad block components early:

I think that's a good idea too, but we have to be very careful about initial startup because if we delay too long users might see ads when they start the browser up

To me, this sounds like the motivation for updating ad components early is to make sure users don't see unwanted ads. But I don't think this requires OnDemandUpdate. On my machine, Chrome does a background update check for components within 60s of browser start. This means that I have the latest components after 61s. The 60s can be configured via update_client::Configurator::InitialDelay(). Instead of all the complexity we seem to be having around early on-demand update checks and batching, why don't we just set this to 0 or some other low value?

AdBlock components update on the start must be in a single batch to properly handle the re-building of the AdBlock engine only when ALL AdBlock-related components are downloaded.

With InitialDelay set to a low value, the components would naturally be updated in a single batch on browser start. Just like this PR, such an implementation could also serve as a foundation for re-building the ad block engine only once.

As a related note, some of the ad block performance problems came from code running on the UI thread. Maybe upstream has or will have more performance-friendly logic for background update checks vs. on-demand update checks that we could benefit from. Using an on-demand update check when a background update check would probably be enough is not a good idea in general in my opinion.

but it's a task that aims to optimize the general component updater flow, not the task that solves the AdBlock components update problem. Please create a separate issue to work on this, I would be happy if we remove SequentialUpdateChecker entirely.

The task is to update ad block components in a single batch. There is an easy, built-in mechanism for doing so. I feel we should use it. I have no doubt that the quality of this PR is excellent. But it seems to me that this is leading us down a wrong path. If we go ahead, future development work will build on the APIs introduced here. But I have the impression that we are working against upstream's design here, and I believe our lives will be easier if we instead work with it.

@goodov
Member Author

goodov commented Apr 29, 2024

Instead of all the complexity we seem to be having around early on-demand update checks and batching, why don't we just set this to 0 or some other low value?
<...>
Using an on-demand update check when a background update check would probably be enough is not a good idea in general in my opinion.

  1. This will trigger an update of all components, including extensions, which will not finish until all components are updated.
  2. Each component update will trigger OnComponentReady in an undefined order and time.

But I have the impression that we are working against upstream's design here

Upstream does not have the scenario of interlinked components.

The task is to update ad block components in a single batch.

... and to get the completion status of the update in a single callback.

There is an easy, built-in mechanism for doing so.

There is no public API in updater interfaces to request the update of specific components and get a single callback when all requested components are updated.

@mherrmann
Collaborator

Alright, I could make further points in response to your comments, but it seems your mind is made up and the discussion will not go anywhere. Feel free to merge without my approval. 👍

@goodov
Member Author

goodov commented Apr 29, 2024

Alright, I could make further points in response to your comments, but it seems your mind is made up and the discussion will not go anywhere. Feel free to merge without my approval. 👍

The decision is strong here: some part of this PR will most likely stay regardless of whether SequentialUpdateChecker exists or update timings are modified. We may remove this if AdBlock moves off the ComponentUpdater infrastructure or decides to pack all filters into a single component.

If you'd like to remove SequentialUpdateChecker and make the overall component update process more optimal, feel free to work on this: brave/brave-browser#37933

Collaborator

@antonok-edm antonok-edm left a comment

thanks @goodov! Your assessment of the situation looks accurate to me - there's some chance we can re-evaluate the approach after a generalized solution to #22188 is available for all the adblock-related components, but it's not realistic in the short term.

@@ -155,6 +156,10 @@ AdBlockComponentInstallerPolicy::GetInstallerAttributes() const {
return update_client::InstallerAttributes();
}

bool AdBlockComponentInstallerPolicy::IsBraveComponent() const {
Collaborator

to me the BraveComponent and BraveComponentInstallerPolicy look like a legacy thing that should be cleaned up.

Yep, this code is from before my time at Brave and I have no attachment to it. But agreed that it doesn't need to be handled here.

@goodov
Member Author

goodov commented May 2, 2024

@brave/sec-team do we need a security review here? There was a fingerprinting-related question that seems to be answered brave/brave-browser#37370 (comment)

If we do need a sec review here, I will create it. Otherwise, please review this PR and remove needs-security-review label.

@fmarier
Member

fmarier commented May 2, 2024

The needs-sec-review label was added because of the two uses of base::Unretained which you have already reviewed and resolved. No need to file a full sec review.

Contributor

github-actions bot commented May 3, 2024

[puLL-Merge] - brave/brave-core@23169

Description

This PR modifies the component updater system in Brave to allow Brave-specific components to be updated in batches rather than individually. It introduces a BraveOnDemandUpdater singleton that acts as an intermediary between Brave code and the Chromium OnDemandUpdater. Component IDs are partitioned so that Brave components can be updated together in a single request.

Changes

  • brave/browser/brave_browser_main_parts.cc and brave/browser/brave_browser_main_parts.h:

    • Register the BraveOnDemandUpdater with the Chromium OnDemandUpdater in PreMainMessageLoopRun().
  • brave/browser/brave_browser_process_impl.cc:

    • Remove registration of BraveOnDemandUpdate callback, as this is no longer needed.
  • brave/chromium_src/chrome/browser/component_updater/component_updater_utils.cc:

    • Modify BraveOnDemandUpdate to use BraveOnDemandUpdater instead of directly accessing the ComponentUpdateService.
  • brave/chromium_src/components/component_updater/component_installer.cc and component_installer.h:

    • Add IsBraveComponent() method to ComponentInstallerPolicy to identify Brave components.
  • brave/chromium_src/components/component_updater/component_updater_service.cc and component_updater_service.h:

    • Modify OnDemandUpdate to support updating multiple components in a single request.
  • brave/chromium_src/components/update_client/update_checker.cc:

    • Partition component IDs so that Brave components are grouped together for batch updates.
  • brave/components/brave_component_updater/browser/:

    • Implement BraveOnDemandUpdater to handle on-demand updates for Brave components.
  • Various Brave component installers:

    • Implement the IsBraveComponent() method to identify them as Brave components.
  • brave/components/brave_shields/browser/ad_block_component_service_manager.cc:

    • Update filter lists by calling BraveOnDemandUpdater with all filter list component IDs.

Security Hotspots

  1. Medium risk: The BraveOnDemandUpdater is a singleton that is accessed throughout the code. Make sure it is thread-safe and handles updates from multiple call sites properly.

  2. Low risk: Ensure that all Brave components correctly implement IsBraveComponent() so they are properly grouped for updates.

  3. Low risk: The callback passed to BraveOnDemandUpdater::OnDemandUpdate should handle errors properly and not assume updates always succeed.

Overall, the changes look well-structured and safe. The main point to verify is that the BraveOnDemandUpdater is robust to being used as a singleton. With proper synchronization and error handling there shouldn't be significant security risks from this change. Testing various update scenarios, especially with multiple components, would help validate its correctness.

@goodov goodov enabled auto-merge (squash) May 3, 2024 14:04
@goodov goodov merged commit 24303f6 into master May 3, 2024
19 checks passed
@goodov goodov deleted the issues/37370 branch May 3, 2024 19:53
@github-actions github-actions bot added this to the 1.67.x - Nightly milestone May 3, 2024
Labels
CI/run-upstream-tests Run upstream unit and browser tests on Linux and Windows (otherwise only on Linux) feature/web3/wallet/core feature/web3/wallet puLL-Merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expose APIs for updating a list of CRX components
10 participants