Notes from Scale testing
John Arbash Meinel
john at arbash-meinel.com
Wed Oct 30 14:57:39 UTC 2013
...
>
>> - From what I can tell, all units take out a watch on their
>> service so that they can monitor its Life and CharmURL. However,
>> adding a unit to a service triggers a change on that service,
>> even though Life and CharmURL haven't changed. If we split out
>> Watching the units-on-a-service from the lifetime and URL of a
>> service, we could avoid the thundering N^2 herd problem while
>> starting up a bunch of units. Though UpgradeCharm is still going
>> to cause a thundering herd.
>
> Where is N^2 coming from?
If you add N units one by one, each new add triggers all existing units
to wake up and ask for the Life and CharmURL of the service again. So
the first unit asks; the 2nd unit asks and causes the first unit to ask
again; the 3rd unit asks and causes the first 2 to ask again; the Nth
unit asks and causes N-1 units to ask. Thus N adds = 1+2+...+N =
N*(N+1)/2 requests for CharmURL, which grows as N^2.
In theory it is gated at a 5-sec delay between add unit and triggering
requests. In practice it took 3+s for add-unit to complete, so it was
pretty much 1-add => 1-trigger.
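To make the growth concrete, here is a toy model of the behaviour
described above. It is only a sketch: the function name is made up for
illustration, it is not Juju code, and it assumes every add wakes every
existing unit with no coalescing (the 1-add => 1-trigger case).

```python
def charm_url_requests(n_units: int) -> int:
    """Count CharmURL requests when units are added one at a time.

    Assumption: the k-th add costs one request from the new unit plus
    one from each of the k-1 units already watching the service.
    """
    total = 0
    for k in range(1, n_units + 1):
        total += 1 + (k - 1)
    return total


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The closed form N*(N+1)/2 is printed alongside as a cross-check.
        print(n, charm_url_requests(n), n * (n + 1) // 2)
```

Under these assumptions the count is quadratic in the number of units,
so doubling the units roughly quadruples the requests. Any trigger other
than add-unit would come on top of this.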
The log I have for bringing up 1000 nodes contains the CharmURL result
2,183,716 times. There are other triggers for this, but the N^2 term
alone is on the order of 1000*1000 = 1M.
Put another way, in a file with 16M lines, 2.1M lines are the Request
of CharmURL (another 2.1M the response), and 2.5M lines are the
Request for Life and another 2.5M response lines.
So 9.2M lines of it are just busy work: adding units to a service
makes the units of that service ask whether the Life or CharmURL of
that service has changed.
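As a quick check of that tally, the snippet below adds up the rounded
line counts quoted above. The variable names are only for illustration,
and the inputs are the rounded figures from this message, not exact
counts from the log.

```python
# Rounded line counts quoted in this message (not exact log figures).
TOTAL_LINES = 16_000_000
busy_lines = {
    "CharmURL request": 2_100_000,
    "CharmURL response": 2_100_000,
    "Life request": 2_500_000,
    "Life response": 2_500_000,
}

busy_total = sum(busy_lines.values())
busy_share = busy_total / TOTAL_LINES

if __name__ == "__main__":
    print(f"{busy_total:,} of {TOTAL_LINES:,} lines ({busy_share:.1%})")
```

With these rounded inputs, the Life and CharmURL traffic comes to a bit
more than half of the file.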
John
=:->