Big SSL/TLS connect performance regression in AmiSSL 5.x #67
Could you rerun the test with 5.3? Particularly for RAND_seed(). It is just that I did mess with everything that uses timer.device in 5.4, so maybe I broke something. I'll try to find it later, but loading of certificates is slower - there is an issue open about it at OpenSSL somewhere (it is obvious to see when opening the CA certs page in IBrowse). |
AmiSSL 5.3:
There appears to be no discernible difference. Regarding certificates, at which point are those loaded? In this AmiSSLTest I am not doing any explicit certificate validation, as you can see from the function calls. I am wondering because if you compare the times between runs of AmiSSLTest in the test script, you see that for 4.12 the difference between the first and second run is ~2s and with 5.4 and 5.3 it is ~3s. So I am wondering whether that added delay might have anything to do with the certificates. The big offender for the first run is OpenAmiSSL(), and it accounts for most of the 1s difference in the first run case:
4.12 OpenAmiSSL() for first and second run after cold:
5.4 OpenAmiSSL() for first and second run after cold:
|
Each time you run amissltest, it gets its own AmiSSLBase (and own completely different OpenSSL instance), so the difference between first and second run can only really be the time it is taking for the OS to physically load amissl_v*.library from disk (the library is about 50% bigger in v5, IIRC, so that makes those times about right) and initialise the library. I think certificates are only loaded as and when required, during the connection phase - that's partly why in AmiSSL we have a separate file for each certificate in AmiSSL:Certs, rather than a single .crt bundle file. The filenames are the hashes, so when OpenSSL needs to load a certificate, it calculates the hash, then knows which file it needs to look for. The certs in AmiSSL:Certs are actually duplicated - one copy with MD5 hashed names (to keep AmiSSL v1/v2/v3 happy) and one copy for SHA hashed names (as used by AmiSSL v4+). The issue I remembered about certificates is at openssl/openssl#16871 and I think it may have been mentioned before then too. Thanks for the 5.3 test. At least I didn't break the random stuff in 5.4, but perhaps in 5.1! My main concern is with RAND_seed() at the moment, as nothing should have really changed there, at least in the Amiga specific code. I thought this would be the easiest to track down, but trying to figure out exactly what code is actually called by RAND_seed() is proving more difficult than expected. |
AmiSSL 5.1:
5.1 is actually slower than 5.4 with SSL_connect being the biggest offender. Did not find 5.0 amongst the releases, but if you provide it, I can test it. |
There was no public 5.0 version - 5.0 was only used during development and testing. I still can't figure out what the difference is with RAND_seed() in OpenSSL 3.0. The Amiga-specific code (now in https://github.com/jens-maus/amissl/blob/0467dc10ef88e20245b979581f50ea8238cf91d6/openssl/providers/implementations/rands/seeding/rand_amiga.c) appears to still be ok, so I'm guessing OpenSSL 3.0 must be doing something different compared to 1.1.1. If you call RAND_seed() twice in your test program, does the second call still take as long as the first? And have you tried the "OpenSSL rand" command? |
The second and the following calls are fast:
The size does not seem to matter, second and following time is still fast with a small size:
However, it does not make any change to the SSL_connect() time. I have not tried the "OpenSSL rand" command, but here it is:
Looks somewhat similar to the download times that got me started with this: the initial time in 5.4 is a bit less than double that of 4.12, but the actual speed of random data generation looks roughly the same. Attaching updated test program, with source, Makefile and everything this time; building it only needs a stock vbcc with the m68k-amigaos target installed, plus wget, lha and make: |
Ok, so it must be some autoinit of something. Technically, you don't need to call |
As I understood the docs, you have not needed to explicitly call |
Btw, there is no reduction in total time by not calling
|
Yes, as expected the inits just happen at different times. Somehow, I will eventually figure out what the extra initialisation is when calling |
After adding a bunch of debug output to see when the one-time initialisations are happening, I found that after your |
Unfortunately, it is not so good news with I'm curious if changing your
The only thing left to look at is |
Unfortunately, there is no change with the minimalistic openssl.cnf:
When doing several connects,
|
Actually, I noticed that with the minimal openssl.cnf, Total time is unchanged, which is what I was looking at. For the record, the multiple-connect test was using the minimal openssl.cnf also. |
If I re-run the multiple-connect tests with the original openssl.cnf,
|
You could try maybe deactivating the legacy provider in openssl.cnf, as I think maybe your test program will only need the default provider, although it still might not make any difference. The only good news is at least others are noticing and reporting specific performance issues (https://github.com/openssl/openssl/issues?q=is%3Aissue+is%3Aopen+performance) and it seems some are already fixed, but will maybe not reach us until OpenSSL 3.1. |
Noticed that you made some rand seed changes in the 5.5 release notes, so made a test to see if it made any difference:
RAND_seed() is measurably faster now, but SSL_CTX_new() is slower by about the same amount for some unknown reason, so the total time is more or less identical. |
Yes, most likely some of the indirect initialisation simply moved. The seed code used to call The seed code yielded mostly non-random code here as it measured the time difference between Let's hope OpenSSL 3.1 is ready soon, as it seems that does already fix some speed issues. |
@patrikaxelsson I would be interested in your test results using the following: Even on my 68060, there was a measurable performance increase, albeit small, using your test program compared with 5.7. There are two reasons. Firstly, OpenSSL 3.1 contains some performance improvements. These alone tested faster on my 68060. I then also added atomic assembly operations, which replace the Exec semaphore protection used in previous versions - this also increased performance a little. I'm hoping this is almost ready to go. I'm just letting the IBrowse beta testers try it first, as there is a chance I could have broken the multithreading stuff. I also just need to work through all the OpenSSL 3.1 changes to make sure I didn't miss anything. |
I am apparently lagging behind on the benchmarking, so will have to do 5.7 first :)
Then 5.8-rc1:
Nice - big improvement in SSL_connect, and measurable improvements in RAND_seed and SSL_CTX_new. |
Also, let's compare the above 5.8-rc1 result against a run of 4.12 to see the current status of the regression:
When comparing to the difference seen from 4.12->5.3 - RAND_seed, SSL_connect and SSL_CTX_new are still the biggest offenders, but all of them have improved and SSL_connect has improved the most, followed by RAND_seed and last SSL_CTX_new. |
Thanks for that - glad to see the improvements have had a positive impact on performance. Fortunately, others often report performance regressions to the OpenSSL team and when they eventually get fixed, it will of course benefit slower systems the most. It is a bit hard to trace exactly what the performance improvements were in OpenSSL 3.1, although I did note that one of them is only a partial fix - presumably, more speed to come from that later. I will hopefully release AmiSSL 5.8 later this coming week - so far, the semaphore->atomic switch appears to be working correctly. |
Figured I should do a comparison of 5.8-rc1 -> 5.12 to get a current measurement:
Slightly faster at repeated connecting compared to 5.8-rc1, but pretty much the same. |
A general performance tip when on a 68030 system: don't use mmu.library unless you use some software that needs it, or you use it to avoid issues with the combination of the 68030 and some expansion boards like the C= bridge boards. Above test without mmu.library:
A ~11% performance increase in repeated connection time (~9.4s -> ~8.5s) simply by not loading mmu.library. For reference: all the above tests since the start of this thread have been run with mmu.library enabled, so I will continue to do so to be consistent. |
Thanks for the update - next big test will be OpenSSL 3.2, which is now in beta testing, so I expect not too far from release. |
When adding AmiSSL 5.x compatibility to the aget download tool a while back, I noticed a big increase in download time when https was used.
Doing some comparisons with aget, alternating between AmiSSL 4.12 and 5.x and downloading both small and large files, revealed that there is no significant difference in the actual download speed once it has started reading the data. However, the small-file tests indicate that the time needed to reach the state where payload data can be read has about doubled in AmiSSL 5.x compared to 4.12.
As this does not show what part/function of AmiSSL has changed, I wrote a tool to time every function needed to establish and shut down the SSL/TLS connection:
AmiSSLTest.zip
This test includes connecting to a server, so choosing different servers will affect the total time somewhat, but using the same server and comparing the results will give a more than consistent enough result to show the difference when the AmiSSL version is the only variable. Especially given how big the performance regression is.
I will be using the test script below to (1) show that, apart from the first run where AmiSSL is loaded and initialized for the first time, the total runtime of AmiSSLTest is consistent, and (2) let the final run show the individual time spent in each AmiSSL function so the differing ones can be spotted:
All tests are done on a low-to-medium performance Amiga to make the differences more pronounced, unless otherwise noted. The machine in question is an A3000 030@25MHz with OS3.2.1. The AmiSSL installs are clean installs of each version, so for example if
version amisslmaster.library
says 5.4, it is 5.4 only.
So, let's begin with AmiSSL 4.12:
And continue with AmiSSL 5.4 to represent 5.x:
Or the results sorted by the largest 5.4 time - 4.12 time to get a better overview which function contributes most to the overall increase in time:
This shows that the biggest offenders in 5.4 are RAND_seed(), SSL_connect() and SSL_CTX_new(). Maybe the underlying issue is connected?