Scale Testing: Now with profiling!

John Arbash Meinel john at arbash-meinel.com
Thu Oct 31 07:11:30 UTC 2013



So I managed to instrument a jujud with both CPU and Mem profiling
dumps. I then brought up 1000 units and did some poking around.
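
(For reference, the standard way to pull this kind of data out of a
running Go daemon is the net/http/pprof handlers. A minimal sketch of
that shape, with a made-up debug port, not the exact jujud patch:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers
)

func main() {
	// Serve profiles on a side port; they can then be fetched with e.g.
	//   go tool pprof ./jujud http://localhost:6060/debug/pprof/profile
	//   go tool pprof ./jujud http://localhost:6060/debug/pprof/heap
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	// ... the rest of the daemon would run here ...
	select {}
}
)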

The results were actually pretty enlightening.


1) Guess what the #1 consumer of CPU time was. I know I was surprised:
Total: 25469 samples
   14380  56.5%  56.5%    14404  56.6% crypto/sha512.block
    1261   5.0%  61.4%     1261   5.0% crypto/hmac.(*hmac).tmpPad
    1219   4.8%  66.2%    15737  61.8% crypto/sha512.(*digest).Write
    1208   4.7%  70.9%     9548  37.5% crypto/sha512.(*digest).Sum
     439   1.7%  72.7%    19046  74.8% launchpad.net/juju-core/thirdparty/pbkdf2

So we spend most of our CPU time on sha512 work, with about 75% of the
total samples sitting under pbkdf2. Essentially, every Login attempt
has to call PasswordHash, which runs many iterations of sha512. (Which
means we can DoS jujud just by sending it lots of Login requests.)

I know that pbkdf2 is great for user passwords, because it makes
brute-force search very expensive. However, agent passwords aren't
user input; we hand the agents nice long random strings. I don't know
if we can change it at this point, but it does become the bottleneck
when dealing with lots of agents. The login rush dies down once the
agents are connected, but any time you restart/upgrade/etc you spend
*minutes* just verifying another 1000 PasswordHashes.
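
To make the cost concrete, here is a rough sketch of the shape of that
work. The iteration count, salt and key length are made-up illustration
values, not the real PasswordHash parameters, and the import path is
the upstream package rather than our thirdparty copy:

package main

import (
	"crypto/sha512"
	"encoding/base64"
	"fmt"
	"time"

	"golang.org/x/crypto/pbkdf2" // vendored in juju-core as thirdparty/pbkdf2
)

func passwordHash(password string) string {
	// Hypothetical parameters, for illustration only.
	salt := []byte("fixed-salt")
	const iterations = 8192
	const keyLen = 18
	return base64.StdEncoding.EncodeToString(
		pbkdf2.Key([]byte(password), salt, iterations, keyLen, sha512.New))
}

func main() {
	// Every Login reruns the full PBKDF2 loop, so a burst of 1000 agents
	// logging in is 1000 complete computations like this one.
	start := time.Now()
	for i := 0; i < 1000; i++ {
		passwordHash(fmt.Sprintf("agent-password-%d", i))
	}
	fmt.Printf("1000 password hashes took %v\n", time.Since(start))
}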

2) And what about memory consumption:
(pprof) top5
Total: 94.1 MB
    31.5  33.5%  33.5%     31.5  33.5% newdefer
    28.1  29.8%  63.3%     28.1  29.8% cnew
     5.0   5.3%  68.6%      5.0   5.3% runtime.malg
     4.5   4.8%  73.4%      4.5   4.8% crypto/rc4.NewCipher
     2.0   2.1%  75.6%      2.0   2.1% compress/flate.NewReader

The best I could dig up for that is:
https://codereview.appspot.com/10784043/
and
https://groups.google.com/forum/#!topic/golang-nuts/I2c7wO0SR3I

The former is about deferred functions trapping their references and
preventing garbage collection. To quote the patch:

  It should help with a somewhat common idiom:
  x, cleanup := foo()
  defer cleanup()
  Currently anything referenced by cleanup is not collected with high
  probability.

This has landed in go trunk, but I don't think it is in 1.1.1.
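
A contrived sketch of that idiom, just to show where we could be
holding memory (the names are invented, this is not code from our
tree):

package main

import "fmt"

func foo() ([]byte, func()) {
	buf := make([]byte, 64<<20) // 64 MB, stands in for any big allocation
	cleanup := func() {
		// cleanup closes over buf, so the defer record keeps it reachable
		fmt.Println("cleaning up", len(buf), "bytes")
	}
	return buf, cleanup
}

func work() {
	x, cleanup := foo()
	defer cleanup()

	_ = x[0] // x is only needed briefly...
	// ...but on affected Go versions the 64 MB stays live until work()
	// returns, because the deferred closure still references it.
}

func main() {
	work()
}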

For 'cnew', a strong possibility is the second link, which mentions:
> Ah, that makes sense.
> 
> For what it's worth, I think I found the issue.
> 
> If you look at http.Transport:
> http://golang.org/src/pkg/net/http/transport.go?m=text
> 
> getIdleConnCh() is called for every round trip (called by getConn,
> which is called by RoundTrip), regardless of whether keepAlives are
> on. In my case, with an app that's connecting to hundreds of
> thousands of different hosts, that idleConnCh map just keeps
> filling up and filling up forever.
> 
> 
> To verify I'm going to build my own, very simple and specialized
> RoundTripper, with pretty much no state stored between requests,
> and see if that takes care of the issue.

I don't know if that is what is actually going on.
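
If it is, and assuming our usage is even affected, the blunt knobs
net/http already exposes are bounding the idle pool and flushing it
periodically. A sketch with made-up numbers, not a confirmed fix:

package main

import (
	"net/http"
	"time"
)

func newClient() *http.Client {
	tr := &http.Transport{
		// Keep the per-host idle pool small instead of letting the
		// idle-connection bookkeeping grow without bound.
		MaxIdleConnsPerHost: 2,
		// Or, more bluntly: DisableKeepAlives: true,
	}

	// Periodically drop whatever idle connections have piled up.
	go func() {
		for range time.Tick(time.Minute) {
			tr.CloseIdleConnections()
		}
	}()

	return &http.Client{Transport: tr}
}

func main() {
	c := newClient()
	if resp, err := c.Get("https://example.com/"); err == nil {
		resp.Body.Close()
	}
}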

I was a bit surprised that all of our TLS connections from the agents
back to the API server are actually using RC4, and that rc4.NewCipher
accounts for about 5% of our total memory consumption.
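
If we decided that was worth acting on, crypto/tls lets the API server
pin the cipher suites it will accept. A sketch only (the cert paths are
placeholders, and whether dropping RC4 is the right trade-off is a
separate question):

package main

import "crypto/tls"

func serverTLSConfig(cert tls.Certificate) *tls.Config {
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		// Offer only AES suites, so rc4.NewCipher never gets involved.
		CipherSuites: []uint16{
			tls.TLS_RSA_WITH_AES_128_CBC_SHA,
			tls.TLS_RSA_WITH_AES_256_CBC_SHA,
		},
		PreferServerCipherSuites: true,
	}
}

func main() {
	// Hypothetical file names, just to make the sketch compile and run.
	cert, err := tls.LoadX509KeyPair("cert.pem", "key.pem")
	if err != nil {
		panic(err)
	}
	_ = serverTLSConfig(cert)
}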



I'm still looking into the PasswordHash stuff, because the CPU profile
is hinting that it is being called by SetPassword (4000 times), but
that doesn't make sense, since all of those agents are running externally.

At the least, we might have a memory leak in some of our defer()
calls. Dave Cheney approved the patch that fixed it, so he might know
which version of Go it landed in.

John
=:->


