Get User Timeout on Tenants with many users. #207

Open
IronBernd opened this issue Jul 5, 2023 · 8 comments
@IronBernd

Describe the bug

There is an issue with the get_user call.
This issue occurs only if there are many users.

To Reproduce
Steps to reproduce the behavior:
user = zia.users.get_user(user_id='82553113')
print(user)

The call uses zia.users.list_users, and as there are more than 50,000 users this results in too many requests.

My proposed solution:

         elif email:
-            user = (record for record in self.list_users(search=email) if record.email == email)
+            user = (record for record in self.list_users(name=email) if record.email == email)
             return next(user, None)

Expected behavior

Get the requested user.

Kind regards
Bernd Wollny

@IronBernd IronBernd added the bug Something isn't working label Jul 5, 2023
@IronBernd
Author

The diff was not displayed correctly.
I am not very familiar with Python, so I don't know whether my change would break anything.

@mitchos
Owner

mitchos commented Jul 6, 2023

Hi @IronBernd, you make a fair point that searching for a user via email takes some time. I would say this isn't the fault of pyZscaler though. Let me explain my reasoning:

  • Zscaler doesn't offer the ability to natively return a user via email, only the userId parameter.
  • The search via email functionality was added as part of a previous feature request.
  • The only way to retrieve a user via email is to pull the entire user list, store it in memory and then search on the email key, stopping when we find a match.
  • What you've highlighted is that I've got a redundant parameter in zia.users.list_users(search=email), namely the search parameter. It doesn't actually do anything because it isn't valid for this endpoint: some other API calls in the User Management suite do support a search parameter (e.g. departments), but the users endpoint doesn't.

So what we have here is that both the current implementation and your suggested approach don't really do anything additional; we're still pulling the entire list and using a generator expression to iterate through and stop when we find a match, returning that result.

The bottleneck here is the amount of time it takes for ZIA to return the list of users via API. The only way to fix this would be to log a request with Zscaler to either:

  • Find a way to reduce the amount of time server-side that it takes to return all users
  • Implement a search function to allow whole or partial matches on any field
  • Implement a parameter that allows using email directly to return a match instead of just userId

I'm going to leave this issue open as the only 'fix' here is to clean up the current implementation to move it from:

        elif email:
            user = (record for record in self.list_users(search=email) if record.email == email)
            return next(user, None)

to

        elif email:
            user = (record for record in self.list_users() if record.email == email)
            return next(user, None)
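
For illustration, here is a minimal self-contained sketch of the lookup pattern described above. The `User` record and the `list_users` stub are hypothetical stand-ins, not pyZscaler code. The point is that `next()` stops the in-memory scan at the first match, but the whole list still has to be fetched before the scan can start.

```python
from typing import List, NamedTuple, Optional


class User(NamedTuple):
    id: int
    name: str
    email: str


def list_users() -> List[User]:
    # Hypothetical stand-in for the API client: returns the full user list.
    return [
        User(1, "alice", "alice@example.com"),
        User(2, "bob", "bob@example.com"),
        User(3, "carol", "carol@example.com"),
    ]


def get_user_by_email(email: str) -> Optional[User]:
    # The full list is fetched first; the generator expression only saves
    # work during the in-memory scan, not during the API calls.
    matches = (record for record in list_users() if record.email == email)
    return next(matches, None)


found = get_user_by_email("bob@example.com")
missing = get_user_by_email("nobody@example.com")
```

With real data the cost is dominated by `list_users()`, which is why removing the unused `search` argument alone would not be expected to change the timing.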

Let me know what you think? I don't think there's anything we can do here.

@IronBernd
Author

Hello @mitchos,

Thank you for the excellent response.

The Zscaler API is providing the following search criteria for users:

  • ID (different request syntax)
  • Name
  • Group (should return all users in a group)
  • Department (should return all users of this department)

Two example users:
'name': 'a01006640', 'email': 'a01006640@schaeffler.com' .....
'name': 'FlipChart-Test-LGN-1', 'email': 'a1035058@schaeffler.com'

The interesting thing is that if I call:
self.list_users(name='a01006640@schaeffler.com')
self.list_users(name='a1035058@schaeffler.com')

the API will find the user.
It seems to search across the whole user record.

In the Zscaler Documentation, you will find this:
"The name search parameter performs a partial match."

Maybe this will help.
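
A rough sketch of how that behaviour could be used: narrow the list server-side with `name=`, then confirm the exact email client-side. The `list_users` stub below is a hypothetical in-memory stand-in, and its matching rule (partial match across name and email) is only an assumption modelled on the behaviour observed above, not documented API semantics.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for the ZIA user list (the two records above).
_USERS: List[Dict[str, str]] = [
    {"name": "a01006640", "email": "a01006640@schaeffler.com"},
    {"name": "FlipChart-Test-LGN-1", "email": "a1035058@schaeffler.com"},
]


def list_users(name: Optional[str] = None) -> List[Dict[str, str]]:
    # Assumption: the filter partially matches against name and email,
    # mirroring what was observed in this thread.
    if name is None:
        return list(_USERS)
    needle = name.lower()
    return [
        user for user in _USERS
        if needle in user["name"].lower() or needle in user["email"].lower()
    ]


def get_user_by_email(email: str) -> Optional[Dict[str, str]]:
    # The filter may return several partial matches, so the exact
    # email comparison is still required.
    candidates = list_users(name=email)
    return next((user for user in candidates if user["email"] == email), None)


hit = get_user_by_email("a1035058@schaeffler.com")
miss = get_user_by_email("unknown@schaeffler.com")
```

The exact comparison is kept because a partial match on `name` could return more than one record.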

Kind regards,
Bernd

@martinkiska

I recently troubleshot an issue where my app segments were returning "no resource found" from the API server. After I opened a TAC case, they fixed it globally. The troubleshooting led me to an interesting workaround, though: by default the pagesize parameter equals 20 (at that time the API call took around 16 seconds with 200 app segments). When I added the kwarg pagesize=499 (the maximum according to the documentation, although it accepts bigger values), it took only 3 seconds.

In ZPA the parameter is "pagesize"; for ZIA it should be entered as page_size, which should be automatically translated to the "pageSize" key.

You may try this workaround...
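
To make the effect concrete, here is a small arithmetic sketch of how page size changes the number of round trips, using the figures above (200 app segments, page sizes 20 and 499). The helper name `requests_needed` is made up for this example.

```python
import math


def requests_needed(total_records: int, page_size: int) -> int:
    # Each request returns at most page_size records.
    return math.ceil(total_records / page_size)


default_requests = requests_needed(200, 20)   # page size 20
larger_requests = requests_needed(200, 499)   # page size 499
```

Ten requests collapse into one, which fits the observation that most of the time is spent per request rather than per record.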

@mitchos
Owner

mitchos commented Jul 10, 2023

@martinkiska that's a great workaround and I didn't even think about the pagesize.

@IronBernd would you mind comparing the time for a zia.users.list_users() with default parameters vs zia.users.list_users(page_size=1000)? I imagine it should be around 50x faster if my maths hold up 😄

What I might do is change the API call to max out the page size to get some more efficiencies for the larger userDBs out there if you find that fixes the timeout issue.

e.g.

 elif email:
            user = (record for record in self.list_users(page_size=1000) if record.email == email)
            return next(user, None)

@IronBernd
Author

Hello,
just to show what the problem is.

Timings:
zia.users.list_users(): TIMEOUT (too many requests)
[429: GET] https://zsapi.zscaler.net/api/v1/users?page=220 body=b'{"message":"Rate Limit (400/HOUR) exceeded","Retry-After":"2449 seconds"}'

zia.users.list_users(page_size=499, max_pages=500):
time python3 ./zscaler2.py >/dev/null
real 9m25.436s
user 0m40.823s
sys 1m28.778s

zia.users.list_users(page_size=1000, max_pages=500)
time python3 ./zscaler2.py >/dev/null

real 6m24.488s
user 0m39.239s
sys 1m46.453s

And this is not an issue with the internet line.
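
A back-of-the-envelope sketch of how the page count relates to the 400/HOUR limit from the 429 response above. It assumes exactly 50,000 users (the issue only says "more than"), and the helper names are invented for this example.

```python
import math

TOTAL_USERS = 50_000        # assumption: the thread says "more than 50,000"
HOURLY_REQUEST_LIMIT = 400  # taken from the 429 response body above


def pages_needed(total_records: int, page_size: int) -> int:
    return math.ceil(total_records / page_size)


def fits_in_hourly_limit(total_records: int, page_size: int) -> bool:
    return pages_needed(total_records, page_size) <= HOURLY_REQUEST_LIMIT


pages_499 = pages_needed(TOTAL_USERS, 499)
pages_1000 = pages_needed(TOTAL_USERS, 1000)

# Smallest page size that keeps a full listing within one hourly budget.
min_page_size = math.ceil(TOTAL_USERS / HOURLY_REQUEST_LIMIT)
```

Under these assumptions both tested page sizes stay well under the hourly budget, so the remaining cost is the time each individual request takes.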

@mitchos
Owner

mitchos commented Jul 10, 2023

Oh wow that's a lot of time blocked waiting for I/O.

The only way we could get around this is to write an async method for any API calls that might return a huge number of entries... except we have a rate limit of 1 per second, which means we can't scale up the number of threads exponentially. We'd have to drip-feed our requests at a rate of 1 per second. Our efficiency over single-threading is capped by the amount of time it takes for an API call to return (I'm getting around 11-13 seconds consistently).

So to get 50k users as quick as possible using async, I'm calculating the below:

  • 1 request per-second @ 1000 users per request
  • 50 seconds for 50,000 users
  • plus an additional 11-13 seconds for the last call to return
  • totaling approx. 60-70 seconds

That's still a performance increase of roughly 6x if you're comparing against the last test with a page size of 1000 (6m24s) @IronBernd.
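
The estimate can be written out as a small calculation. This is only a sketch: it assumes exactly 50 pages, one request started per second with the first at t=0, and a constant 11-13 second latency per call, which is slightly tighter than the rounded bullet figures. The function name is invented for this example.

```python
def pipelined_seconds(pages: int, interval_s: float, latency_s: float) -> float:
    # The last request starts after (pages - 1) intervals,
    # then takes latency_s to return.
    return (pages - 1) * interval_s + latency_s


best_case = pipelined_seconds(50, 1.0, 11.0)
worst_case = pipelined_seconds(50, 1.0, 13.0)

# Measured wall-clock time for page_size=1000 earlier in this thread: 6m24.488s.
measured_s = 6 * 60 + 24.488
speedup_low = measured_s / worst_case
speedup_high = measured_s / best_case
```

Under these assumptions the total lands at about 60-62 seconds, or a bit over 6x faster than the measured sequential run.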

I don't think waiting 5min+ to search for a user by email across 50k users is a very good result. Having said that, I don't think it's the responsibility of pyZscaler to 'fix' this for Zscaler, but we did add a capability that wasn't there natively so it's a two-way street here.

I'm happy to investigate this option if we think this might be of value?
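
As a rough idea of what the drip-feed approach could look like, here is a self-contained asyncio sketch. It is not pyZscaler code: `fetch_page` is a hypothetical stand-in that only sleeps to simulate latency, and the interval and latency values are scaled down so the example finishes quickly.

```python
import asyncio
from typing import List


async def fetch_page(page: int, page_size: int, latency_s: float) -> List[int]:
    # Hypothetical stand-in for one API call: wait, then return fake record IDs.
    await asyncio.sleep(latency_s)
    start = (page - 1) * page_size
    return list(range(start, start + page_size))


async def fetch_all(pages: int, page_size: int,
                    interval_s: float, latency_s: float) -> List[int]:
    tasks = []
    for page in range(1, pages + 1):
        # Start the request now, but do not wait for it to finish.
        tasks.append(asyncio.create_task(fetch_page(page, page_size, latency_s)))
        if page < pages:
            # Pace request starts to respect the per-second rate limit.
            await asyncio.sleep(interval_s)
    # gather() returns results in task order, so pages stay sorted.
    chunks = await asyncio.gather(*tasks)
    return [record for chunk in chunks for record in chunk]


records = asyncio.run(
    fetch_all(pages=5, page_size=3, interval_s=0.01, latency_s=0.05)
)
```

For the real case the interval would be 1 second, and the number of pages would have to be known or discovered up front, which this sketch glosses over.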

@martinkiska

By the way, I just tried increasing the page_size parameter to 20k, as no limit is mentioned in the ZIA API reference guide, and it worked: 10k users in 16 seconds, which is much better ;). Without this customization it took around 2 min 50 sec. @IronBernd, you may try increasing it even to 50k so that everything is downloaded in one batch. Then it should only be a matter of TCP delays (for me the delay between SYN and SYN+ACK is 100 ms, which is quite a lot [zsapi.zscaler.net]).
