rgw collector not working #218

Open
stef97 opened this issue Aug 15, 2018 · 9 comments


stef97 commented Aug 15, 2018

Hi,

RGW socket is not discovered

I have changed rgw_sockets to this but still no luck

rgw_sockets = glob.glob('/var/run/ceph/ceph-client.rgw.*.asok')

I am using Mimic and the socket looks like this
/var/run/ceph/ceph-client.rgw.osd01.1519.94623440453632.asok
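
A quick way to rule out the pattern itself is to test it against the socket name as a plain string, without touching the filesystem. The sketch below uses the standard `fnmatch` module; the helper name `pattern_matches` is made up for illustration. Note that `fnmatch` lets `*` cross `/`, which `glob.glob` does not, but that makes no difference here because the wildcard sits only in the file name.

```python
import fnmatch

# Socket name reported in this issue (Mimic).
socket_path = '/var/run/ceph/ceph-client.rgw.osd01.1519.94623440453632.asok'

# Pattern from the modified rgw_sockets line above.
wide_pattern = '/var/run/ceph/ceph-client.rgw.*.asok'


def pattern_matches(path, pattern):
    """Return True if the shell-style pattern matches the path string."""
    return fnmatch.fnmatchcase(path, pattern)


print(pattern_matches(socket_path, wide_pattern))
```

If the pattern matches the string but `glob.glob` still returns an empty list on the host, the cause is more likely to be access to `/var/run/ceph` than the pattern.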

Any help will be truly appreciated as this is an AWESOME piece of software
Steven

Contributor

pcuzner commented Aug 15, 2018

For mimic onwards we've switched from collectd to the embedded prometheus exporter in ceph-mgr. All perf stats are sent to the mgr anyway, so switching to this approach eliminates a lot of 'moving parts'.

However, if you're happy with collectd we should be able to help too. Starting from the top, the collector probes the host to look for the radosgw binary, and if found enables stats gathering. So in your collectd log on your rgw host are you seeing "Roles detected .... rgw: True"?

By default the collector should be running in debug mode, which will give us some more info in /var/log/collectd-cephmetrics.log - could you upload that to pastebin and drop the link in here?

Another common problem with access is selinux - is it possible that selinux is blocking?
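
One way to answer the SELinux question from Python is to read the kernel's enforce flag. This is only a sketch: it assumes the standard `/sys/fs/selinux/enforce` file, and `selinux_mode` is a hypothetical helper, not part of the collector.

```python
def selinux_mode(enforce_path='/sys/fs/selinux/enforce'):
    """Report the SELinux mode based on the kernel's enforce flag file."""
    try:
        with open(enforce_path) as handle:
            value = handle.read().strip()
    except (IOError, OSError):
        # File missing or unreadable: SELinux is disabled or not present.
        return 'disabled or unavailable'
    return 'enforcing' if value == '1' else 'permissive'


print(selinux_mode())
```

A result of `enforcing` does not prove SELinux is blocking the socket, only that it could be; the audit log would still need checking.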

pcuzner self-assigned this Aug 15, 2018

Author

stef97 commented Aug 16, 2018 via email

Author

stef97 commented Aug 21, 2018

Any chance you can look into this?
Role discovery is not accurate on any of the Ceph servers (all are showing "true" for all roles except iSCSI).

Thanks
Steven

Contributor

pcuzner commented Aug 21, 2018

Role detection is done through the presence of the ceph binaries for a mon/osd/rgw...so if you deploy all of these components it will look as though all those roles are active.
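
To illustrate why every role can show as true, here is a minimal sketch of binary-presence detection. The paths in `ROLE_BINARIES` and the function `detect_roles` are assumptions for illustration; the real collector may use different names and locations.

```python
import os

# Hypothetical role -> binary mapping; the collector's actual paths may differ.
ROLE_BINARIES = {
    'mon': '/usr/bin/ceph-mon',
    'osd': '/usr/bin/ceph-osd',
    'rgw': '/usr/bin/radosgw',
}


def detect_roles(binaries=ROLE_BINARIES, exists=os.path.exists):
    """Mark a role as present when its binary is installed on the host."""
    return {role: exists(path) for role, path in binaries.items()}


print(detect_roles())
```

With this approach a host that has all the Ceph packages installed reports every role, whether or not the corresponding daemon is running.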

Is selinux active/blocking?

Author

stef97 commented Aug 21, 2018 via email

Author

stef97 commented Aug 21, 2018 via email

Contributor

pcuzner commented Aug 24, 2018

If selinux is not blocking then we need to check the output we're expecting. Can you run the following:

1. Check that the hostname resolves as expected (run from the dir where the collectors are installed)

from collectors.common import get_hostname
print(get_hostname())

If that's OK, move on to 2.

2. Check the content of the perf dump from the radosgw socket

from ceph_daemon import admin_socket
from collectors.common import get_hostname
import json
import glob

h = get_hostname()
print("host is {}".format(h))
sockets = glob.glob('/var/run/ceph/ceph-client.rgw.{}.*asok'.format(h))

print("{} Socket files found: {}".format(len(sockets), ','.join(sockets)))

# query the first socket found and decode the JSON perf dump
raw = admin_socket(sockets[0], ['perf','dump'], format='json')
resp = json.loads(raw)
print(resp.get(h))

should show something like
{u'qlen': 0, u'get': 492577570, u'failed_req': 0, u'keystone_token_cache_miss': 0, u'req': 990144796, u'put_b': 258254121730048, u'keystone_token_cache_hit': 0, u'qactive': 0, u'cache_miss': 3095675, u'put_initial_lat': {u'sum': 33889048.13001993, u'avgtime': 0.137597955, u'avgcount': 246290346}, u'put': 246290358, u'cache_hit': 995756713, u'get_initial_lat': {u'sum': 1447620.759260316, u'avgtime': 0.005877737, u'avgcount': 246288767}, u'get_b': 258252243857025}
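
As a sanity check on output like the above, the `avgtime` of each latency counter should be roughly `sum / avgcount`. The sketch below uses the two latency entries copied from the sample; `average_latency` is a hypothetical helper, not part of the collector.

```python
# Latency counters copied from the sample output above.
sample = {
    'put_initial_lat': {'sum': 33889048.13001993,
                        'avgtime': 0.137597955,
                        'avgcount': 246290346},
    'get_initial_lat': {'sum': 1447620.759260316,
                        'avgtime': 0.005877737,
                        'avgcount': 246288767},
}


def average_latency(counter):
    """Return sum / avgcount, or 0.0 when no samples were recorded."""
    count = counter.get('avgcount', 0)
    if not count:
        return 0.0
    return counter['sum'] / float(count)


for name in sorted(sample):
    counter = sample[name]
    print(name, average_latency(counter), counter['avgtime'])
```

If the computed value and the reported `avgtime` disagree noticeably, the dump is probably not the structure the collector expects.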

Author

stef97 commented Aug 24, 2018 via email

Author

stef97 commented Aug 27, 2018 via email
