<div dir="ltr">I'm running a test now to see if I can set up HA manually.<div>One very surprising thing is that I ran "juju bootstrap" from a Trusty machine, and it gave me a Precise bootstrap node. I thought we were trying to default to the latest LTS when possible. Did some behavior change there? (I'm wondering if somewhere we changed from a single hardcoded value to taking a list of all possible LTS targets, and that ended up with us picking Precise first.)</div>
<div><br></div><div>I did manage to reproduce the bug: on machine-1 I see an endless series of</div><div><div>2014-06-14 04:45:27 INFO juju.mongo open.go:90 dialled mongo successfully<br></div></div><div><br></div><div>At 5 minutes in, I have >1000 lines of "I successfully connected", and *no* failure messages indicating why we are trying again.</div>
<div><br></div><div>I'll post more of my findings to the bug.</div><div><br></div><div>As a Juju process-level question: when people change things around HA, are you actually bringing up a live system and seeing it work before you submit your changes to trunk?</div>
<div><br></div><div>John</div><div>=:-></div><div> <br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Jun 13, 2014 at 10:42 PM, Curtis Hovey-Canonical <span dir="ltr"><<a href="mailto:curtis@canonical.com" target="_blank">curtis@canonical.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">CI is regularly failing because HA and upgrade tests time out. They do<br>
not complete. I have extended timeouts from 5 minutes to 15, but the<br>
tests still fail. I appended -devel to some tests to remove their<br>
vote. I think that was a mistake...the problem is juju, not the cloud.<br>
I reported 2 bugs about upgrade-juju and HA:<br>
<br>
HA performance degradation<br>
<a href="https://bugs.launchpad.net/juju-core/+bug/1329544" target="_blank">https://bugs.launchpad.net/juju-core/+bug/1329544</a><br>
<br>
major performance degradation upgrading juju<br>
<a href="https://bugs.launchpad.net/juju-core/+bug/1329899" target="_blank">https://bugs.launchpad.net/juju-core/+bug/1329899</a><br>
<br>
I suspect there is a single root cause for both bugs. We saw performance<br>
deteriorate because mongodb is doing more work. Maybe HA and upgrades<br>
need to take 30 minutes, or an hour, because mongo cannot do what we<br>
once required to happen in 5 minutes.<br>
<br>
On Fri, Jun 13, 2014 at 7:12 AM, CI & CD Jenkins<br>
<<a href="mailto:aaron.bentley%2Bcjqa@canonical.com">aaron.bentley+cjqa@canonical.com</a>> wrote:<br>
> Build: #1479 Revision: gitbranch:master:<a href="http://github.com/juju/juju" target="_blank">github.com/juju/juju</a> ead2e2d6 Version: 1.19.4<br>
><br>
> Failed tests<br>
> functional-ha-recovery build #357 <a href="http://juju-ci.vapour.ws:8080/job/functional-ha-recovery/357/console" target="_blank">http://juju-ci.vapour.ws:8080/job/functional-ha-recovery/357/console</a><br>
> hp-upgrade-precise-amd64 build #1324 <a href="http://juju-ci.vapour.ws:8080/job/hp-upgrade-precise-amd64/1324/console" target="_blank">http://juju-ci.vapour.ws:8080/job/hp-upgrade-precise-amd64/1324/console</a><br>
<span class="HOEnZb"><font color="#888888"><br>
<br>
--<br>
Curtis Hovey<br>
Canonical Cloud Development and Operations<br>
<a href="http://launchpad.net/~sinzui" target="_blank">http://launchpad.net/~sinzui</a><br>
<br>
</font></span></blockquote></div><br></div>