PingPeriod
John Arbash Meinel
john at arbash-meinel.com
Wed Jul 3 06:33:47 UTC 2013
Regarding this change: https://codereview.appspot.com/10620043/patch/11001/8003
I came to this late, but I wanted to mention that I think this will
break the logic in status as to what machines are alive and what
machines are not.
Specifically, if you look at state/presence/presence.go, the
Watcher.sync() loop looks for beings that have been active in either
the current slot or the previous slot.
The slot period is defined as 30s. So agents must produce a ping
within that time, or we will perceive them as missing/dead/etc.
Consider the case where we have just rolled into a new slot, so the
previous slot started 31s before.
That means that on average ~half of the pingers would be seen as
inactive (not active in the current slot, and no ping in the previous one).
I would probably rather see PingPeriod == slot period. So every 30s.
On average, every pinger hits every slot, and occasionally we will
skip a slot, or hit a slot twice.
Alternatively we can back off the presence code, and have it check
more slots.
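
To make the slot arithmetic above concrete, here is a rough sketch. It is not
the code in presence.go: slotIndex and aliveCount are hypothetical helpers,
times are plain seconds, and the 60s ping period is only an illustrative value
for "longer than one slot", not something taken from the change itself.

```go
package main

import "fmt"

// slotPeriod is the 30s slot length described above.
const slotPeriod = 30

// slotIndex is a hypothetical helper mapping a time in seconds to a slot number.
func slotIndex(t int64) int64 { return t / slotPeriod }

// aliveCount models one pinger per second of phase offset (pingPeriod pingers
// in total) and returns how many of them have a ping in one of the
// slotsChecked most recent slots at time now. It assumes now >= pingPeriod.
func aliveCount(pingPeriod, slotsChecked, now int64) int64 {
	oldest := slotIndex(now) - (slotsChecked - 1)
	var alive int64
	for phase := int64(0); phase < pingPeriod; phase++ {
		// Most recent ping at or before now for a pinger that fires at
		// phase, phase+pingPeriod, phase+2*pingPeriod, ...
		lastPing := now - (now-phase)%pingPeriod
		if slotIndex(lastPing) >= oldest {
			alive++
		}
	}
	return alive
}

func main() {
	now := int64(1201) // one second after a slot boundary at t=1200
	fmt.Println("60s pings, 2 slots checked:", aliveCount(60, 2, now), "of 60")
	fmt.Println("30s pings, 2 slots checked:", aliveCount(30, 2, now), "of 30")
	fmt.Println("60s pings, 3 slots checked:", aliveCount(60, 3, now), "of 60")
}
```

Under these assumptions, a 60s ping period leaves only 32 of 60 pingers
visible just after a slot boundary, while either a 30s ping period or a
watcher that checks three slots sees all of them.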
Also, in the current system, it doesn't actually hammer the servers
because it has:
    if lastSlot == slot {
        return nil
    }
So it only ever pings 1 time per slot anyway.
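
A small sketch of that guard, to show why frequent pings do not turn into
frequent writes. The pinger type, its fields, and the writes counter are
hypothetical stand-ins for illustration, not the actual presence.Pinger.

```go
package main

import "fmt"

// slotPeriod is the 30s slot length described above.
const slotPeriod = 30

// pinger is a hypothetical stand-in that records the slot of its last write.
type pinger struct {
	lastSlot int64
	writes   int // number of pings that would actually reach the server
}

// newPinger starts with a sentinel slot so the first ping always writes.
func newPinger() *pinger { return &pinger{lastSlot: -1} }

// ping reports whether a write was issued for the time now (in seconds).
func (p *pinger) ping(now int64) bool {
	slot := now - now%slotPeriod
	if p.lastSlot == slot {
		return false // already pinged in this slot, nothing to do
	}
	p.lastSlot = slot
	p.writes++
	return true
}

func main() {
	p := newPinger()
	// Ping every 5s for one minute, starting on a slot boundary.
	for t := int64(1200); t < 1260; t += 5 {
		p.ping(t)
	}
	fmt.Println("calls: 12, writes:", p.writes)
}
```

Twelve calls over one minute cover two slots, so only two writes are issued
in this model.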
I realize in the API case, we are likely (certainly in my design) to
compute the slot in the API server, so it would add some load to the
API server to see and then discard those 5s pings.
Thoughts?
John
=:->