monitoring: apm-server only ships a subset of monitoring metrics #13475

Closed
Tracked by #13604
carsonip opened this issue Jun 24, 2024 · 3 comments

Comments

@carsonip
Member

carsonip commented Jun 24, 2024

Discovered by #13244

Elastic Agent (EA) managed apm-server ships only a subset of apm-server monitoring metrics, which limits its observability. The problem spans the following use cases:

  • APM server self-monitoring
    • metricsets state, stats
    • indices (not data stream) .monitoring-beats-7-*
    • does NOT use TSDS, but fields are unmapped due to dynamic: false
    • code in ES repo monitoring-beats.json
  • Stack monitoring: metricbeat beat-xpack module uses index template embedded in ES
    • metricsets state, stats
    • data stream .monitoring-beats-8-mb
    • use case: Elastic Cloud self-monitoring, metricbeat standalone beat-xpack
    • does NOT use TSDS, but fields are unmapped due to dynamic: false
    • code in ES repo monitoring-beats-mb.json
  • Metricbeat without xpack: metricbeat beat module
    • metricsets state, stats
    • data stream metricbeat-*
    • use case: metricbeat standalone
    • does NOT use TSDS, fields are mapped dynamically
    • code in beats repo
  • EA integration
    • metricsets stats only
    • data stream metrics-elastic_agent.apm_server.* (see code)
    • use case: elastic-agent standalone with monitoring
    • to see EA integration package content: POST kbn:/api/fleet/epm/packages/elastic_agent
    • uses TSDS, see "[Elastic Agent] Enable TSDB by default" (integrations#7214). The issue with TSDS is that unmapped fields are dropped, because it uses synthetic source.
    • code in integrations repo
    • it doesn't contain any apm-server fields as of today
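
To make the "unmapped fields" problem above concrete, here is a minimal sketch that lists which leaf fields of a monitoring document have no entry in a given mapping. With dynamic: false those fields stay unmapped, and per the TSDS note above they are dropped entirely. The sample mapping and document are hypothetical and only illustrate the shape of the data; they are not copied from any of the templates mentioned here.

```python
from typing import Any, Iterator


def leaf_paths(doc: dict[str, Any], prefix: tuple[str, ...] = ()) -> Iterator[tuple[str, ...]]:
    """Yield the key path of every non-object leaf value in a document."""
    for key, value in doc.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from leaf_paths(value, path)
        else:
            yield path


def is_mapped(mapping: dict[str, Any], path: tuple[str, ...]) -> bool:
    """Return True if every step of the path exists under nested "properties"."""
    node = mapping
    for part in path:
        props = node.get("properties", {})
        if part not in props:
            return False
        node = props[part]
    return True


def unmapped_fields(mapping: dict[str, Any], doc: dict[str, Any]) -> list[str]:
    """Return the dotted names of leaf fields in doc that the mapping lacks."""
    return sorted(".".join(p) for p in leaf_paths(doc) if not is_mapped(mapping, p))


# Hypothetical mapping with dynamic: false that only knows one libbeat field.
SAMPLE_MAPPING = {
    "dynamic": False,
    "properties": {
        "libbeat": {"properties": {"output": {"properties": {
            "events": {"properties": {"total": {"type": "long"}}}}}}},
    },
}

# Hypothetical stats document that also carries an apm-server specific field.
SAMPLE_DOC = {
    "libbeat": {"output": {"events": {"total": 10}}},
    "apm-server": {"sampling": {"transactions_dropped": 3}},
}

if __name__ == "__main__":
    print(unmapped_fields(SAMPLE_MAPPING, SAMPLE_DOC))
```

Running a check like this against a real document from each of the four use cases would show which apm-server fields each template is missing.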

Unknowns:

  • As the outdated mappings above contain fields that are no longer used, a mapping generated from apm-server:5066/stats will not be a pure addition. It is unclear whether dropping a field mapping will be problematic.
    • Should it be a superset of all historically available monitoring fields? That doesn't sound realistic either.
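
One way to size this unknown is to diff the existing mapping against a freshly generated one and look at what would be removed. The following is a rough sketch under that assumption; the OLD and NEW mappings and the field names in them are hypothetical placeholders, not the real templates.

```python
from typing import Any


def flatten_mapping(mapping: dict[str, Any], prefix: str = "") -> dict[str, str]:
    """Flatten nested "properties" into {dotted.field.name: type}."""
    fields: dict[str, str] = {}
    for name, spec in mapping.get("properties", {}).items():
        path = f"{prefix}.{name}" if prefix else name
        if "properties" in spec:
            fields.update(flatten_mapping(spec, path))
        else:
            fields[path] = spec.get("type", "object")
    return fields


def diff_mappings(old: dict[str, Any], new: dict[str, Any]) -> dict[str, list[str]]:
    """Report fields added, removed, or changed in type between two mappings."""
    old_fields = flatten_mapping(old)
    new_fields = flatten_mapping(new)
    shared = old_fields.keys() & new_fields.keys()
    return {
        "added": sorted(new_fields.keys() - old_fields.keys()),
        "removed": sorted(old_fields.keys() - new_fields.keys()),
        "retyped": sorted(k for k in shared if old_fields[k] != new_fields[k]),
    }


# Hypothetical outdated mapping that still contains a field no longer emitted.
OLD = {"properties": {
    "libbeat": {"properties": {"pipeline": {"properties": {"clients": {"type": "long"}}}}},
    "legacy": {"properties": {"count": {"type": "long"}}},
}}

# Hypothetical mapping generated from the current stats output.
NEW = {"properties": {
    "libbeat": {"properties": {"pipeline": {"properties": {"clients": {"type": "long"}}}}},
    "apm-server": {"properties": {"sampling": {"properties": {
        "transactions_dropped": {"type": "long"}}}}},
}}

if __name__ == "__main__":
    print(diff_mappings(OLD, NEW))
```

A non-empty "removed" list is exactly the case the question above is about; "retyped" would flag fields whose type changed, which is a separate compatibility concern.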
@carsonip carsonip added the bug label Jun 24, 2024
@carsonip carsonip changed the title EA managed apm-server is only shipping a subset of metrics EA managed apm-server is only shipping a subset of apm-server monitoring metrics Jun 25, 2024
@carsonip carsonip changed the title EA managed apm-server is only shipping a subset of apm-server monitoring metrics monitoring: EA managed apm-server only ships a subset of apm-server monitoring metrics Jul 8, 2024
@carsonip
Member Author

carsonip commented Jul 9, 2024

Testing

Completed local testing on all of elastic/beats#40127, elastic/elasticsearch#110568, and elastic/integrations#10414. Note that both elastic/elasticsearch#110568 and elastic/integrations#10414 require elastic/beats#40127 for the output.elasticsearch.* metrics to be parsed.

apm-server self monitoring (indices .monitoring-beats-7-*)

before

image

after

image

metricbeat beat-xpack (DS .monitoring-beats-8-mb)

before

image

after

image

metricbeat beat standalone (DS metricbeat-*)

before

image

after

image

EA agent monitoring (DS metrics-elastic_agent-apm_server-*)

before

image

after

image

Long term solution

While the Python script in #13638 generates the correct mapping for all the above use cases, this approach is not very maintainable. Ideally, setting dynamic: true, as metricbeat standalone does, would let apm-server monitoring fields be mapped dynamically with minimal maintenance. However, since the EA mapping already has TSDB enabled, disabling TSDB now and enabling dynamic mapping sounds like a step back.
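
For readers unfamiliar with the generate-a-mapping approach, the general idea can be sketched as follows. This is not the script from #13638, only an illustration of the technique under simple assumptions: every JSON object becomes a nested "properties" block, and leaf values are typed by their Python type. The sample stats payload is hypothetical.

```python
from typing import Any


def es_type(value: Any) -> str:
    """Pick an Elasticsearch field type for a JSON leaf value (simplified)."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    # Assumption: anything else (strings, lists, null) falls back to keyword.
    return "keyword"


def mapping_from_stats(stats: dict[str, Any]) -> dict[str, Any]:
    """Build a nested "properties" mapping mirroring the stats JSON structure."""
    properties: dict[str, Any] = {}
    for key, value in sorted(stats.items()):
        if isinstance(value, dict):
            properties[key] = mapping_from_stats(value)
        else:
            properties[key] = {"type": es_type(value)}
    return {"properties": properties}


# Hypothetical excerpt of a stats response; real output has many more fields.
SAMPLE_STATS = {
    "apm-server": {"sampling": {"transactions_dropped": 3}},
    "system": {"load": {"norm": 0.25}},
}

if __name__ == "__main__":
    import json
    print(json.dumps(mapping_from_stats(SAMPLE_STATS), indent=2))
```

The maintenance cost is visible here: the output only covers fields present in the sampled payload, so the mapping has to be regenerated whenever apm-server adds or removes a metric.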

@carsonip carsonip changed the title monitoring: EA managed apm-server only ships a subset of apm-server monitoring metrics monitoring: apm-server only ships a subset of monitoring metrics Jul 18, 2024
@lahsivjar
Contributor

lahsivjar commented Jul 18, 2024

Tested on 8.15 BC2, (UPDATE) followup testing with 8.15 BC3:

  • APM server self-monitoring: tested with ESS elasticsearch. ❌ The backport PR was not merged at the time. It has since been merged and required retesting in BC3. Retested with the updated BC and it works as expected.

    Snapshots from testing Screenshot 2024-07-23 at 11 10 27 AM Screenshot 2024-07-23 at 11 10 42 AM
  • APM server stack monitoring: tested on ESS. ❌ The backport PR was not merged at the time. It has since been merged and required retesting in BC3. Retested with the updated BC and it works as expected.

    Snapshots from testing Screenshot 2024-07-23 at 11 04 32 AM Screenshot 2024-07-23 at 11 04 46 AM
  • Metricbeat without x-pack, tested locally with apm-server 008b12a7e9929ce00e01040e67e83cf19c0b7de7 and beats 417ac43a6193bdf26da490d41532b2f2d5bc3f70

    Snapshots from testing Screenshot 2024-07-18 at 3 54 14 PM Screenshot 2024-07-18 at 3 54 32 PM
  • EA integration tested with locally running EA using elastic-agent-package 475d247d16a94af14b75d991a8da3d63d68e8042. ❌ While the test succeeded, I had to manually upgrade EA integration as the version bundled in Kibana is still older (v1.20.0) whereas the latest version is v2.0.1 - since this is not related to the current issue, marking this point as done. Confirmed that with BC3 the latest version v2.0.1 of elastic-agent integration is bundled.

    Snapshots from testing Screenshot 2024-07-18 at 4 56 56 PM Screenshot 2024-07-18 at 4 57 08 PM Screenshot 2024-07-23 at 11 07 42 AM

@carsonip
Member Author

Moved all remaining tasks to #13731. Closing this issue as the bug of missing metrics is now fixed.
