Testing Leader Election reconfiguration
Tom Barber
tom at analytical-labs.com
Tue Mar 15 20:37:17 UTC 2016
Okay, 4m30s is probably short, thanks for that Tim, I'll crank it up and
see what happens.
Tom
--------------
Director Meteorite.bi - Saiku Analytics Founder
Tel: +44(0)5603641316
(Thanks to the Saiku community we reached our Kickstart
<http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
goal, but you can always help by sponsoring the project
<http://www.meteorite.bi/products/saiku/sponsorship>)
On 15 March 2016 at 17:02, Tim Van Steenburgh <
tim.van.steenburgh at canonical.com> wrote:
>
>
> On Tue, Mar 15, 2016 at 12:30 PM, Tom Barber <tom at analytical-labs.com>
> wrote:
>
>> Hi Tim,
>>
>> Why would I need to increase the timeout when the status says all the
>> units are operational?
>>
>
> The default wait time is 300s, with an "idle threshold" of 30s, which
> means it waits for everything to be idle for 30s before returning from the
> wait. So with the default timeout, if the env doesn't settle
> within 4m30s, it'll time out. This may not be what's happening in your
> case, but it's worth trying a longer timeout value to make sure.
>
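The arithmetic in the paragraph above can be sketched in a few lines. This only illustrates the numbers quoted there (300s timeout, 30s idle threshold); the helper name `settle_budget` is made up for this note and is not an amulet function.

```python
def settle_budget(timeout=300, idle_threshold=30):
    """Seconds the environment has to settle before the wait would time out.

    Hypothetical helper, not part of amulet. It encodes the reasoning from
    the thread: the env must already have been idle for `idle_threshold`
    seconds by the time `timeout` expires.
    """
    if idle_threshold > timeout:
        raise ValueError("idle threshold cannot exceed the timeout")
    return timeout - idle_threshold


default_budget = settle_budget()              # 300 - 30 = 270s, i.e. 4m30s
longer_budget = settle_budget(timeout=1200)   # the value suggested in the thread
print(default_budget, longer_budget)
```

With the 1200s timeout suggested elsewhere in the thread, the same reasoning leaves 1170s for the environment to settle.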
>
>> The status dump came out of bundletester which said that it failed on the
>> first wait(), I assume the status dump arrived at the same time?
>> Bugs are allowed, the test was hacked up from a previous one, it doesn't
>> do anything yet, I'm trying to make sure the logic works first.
>>
>> Tom
>>
>> On 15 March 2016 at 16:27, Tim Van Steenburgh <
>> tim.van.steenburgh at canonical.com> wrote:
>>
>>> Hey Tom,
>>>
>>> 1. You can increase the wait time until it doesn't time out:
>>> self.d.sentry.wait(timeout=1200)
>>> 2. At what point in this sequence of commands was the status dump
>>> captured?
>>> 3. There is a bug here. You take a reference to the pdi/0 info dict on
>>> line 1. It's the same object you use to get message2 and message3 later,
>>> so you'll get the same message that you got on line 1. You need
>>> `message3 = self.d.sentry['pdi'][0].info['workload-status'].get('message')`
>>> instead.
>>>
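The stale-reference bug in point 3 above can be reproduced without a Juju environment. The sketch below uses a stand-in class, `FakeUnitSentry`, which is not the real amulet sentry. It assumes, as the explanation in point 3 implies, that `info` hands back a dict reflecting the status at the time it was read. The `'leader: <ip>'` message format and the extra `.strip()` are also assumptions made for illustration.

```python
class FakeUnitSentry:
    """Stand-in for a unit sentry; NOT the real amulet class."""

    def __init__(self, message):
        self._message = message

    def set_message(self, message):
        # Simulates the charm updating its workload status message.
        self._message = message

    @property
    def info(self):
        # Assumption: each access builds a fresh dict from the current status.
        return {'workload-status': {'message': self._message}}


def leader_ip(info):
    # Same parsing as the test in the thread, plus strip() for the space.
    message = info['workload-status'].get('message')
    return message.split(':', 1)[-1].strip()


sentry = FakeUnitSentry('leader: 10.0.0.1')   # hypothetical message format
stale = sentry.info                           # reference taken once, up front
sentry.set_message('leader: 10.0.0.2')        # leadership changes afterwards

stale_ip = leader_ip(stale)        # still reports the old leader
fresh_ip = leader_ip(sentry.info)  # re-reading info picks up the change
print(stale_ip, fresh_ip)
```

Under these assumptions, the reference held from before the change keeps returning the old address, so each check in the test needs to read `info` again after the wait.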
>>> Hope this helps.
>>>
>>> On Tue, Mar 15, 2016 at 11:41 AM, Tom Barber <tom at analytical-labs.com>
>>> wrote:
>>>
>>>> Okay back here again, so my nice leader election function looks like:
>>>>
>>>> def test_leader_election_failover(self):
>>>>     unit = self.d.sentry['pdi'][0].info
>>>>     message = unit['workload-status'].get('message')
>>>>     ip = message.split(':', 1)[-1]
>>>>     self.d.add_unit('pdi', 2)
>>>>     self.d.sentry.wait()
>>>>     message2 = unit['workload-status'].get('message')
>>>>     ip2 = message2.split(':', 1)[-1]
>>>>     self.assertEqual(ip, ip2)
>>>>     self.d.remove_unit('pdi/0')
>>>>     self.d.sentry.wait()
>>>>     message3 = unit['workload-status'].get('message')
>>>>     ip3 = message3.split(':', 1)[-1]
>>>>
>>>>     self.assertNotEqual(ip3, ip2)
>>>>
>>>> I know there's no logic in there, but I need to make sure the stuff
>>>> actually functions.
>>>>
>>>> So Tim says wait() should work, but when I tested this last night,
>>>> I got a timeout error on the wait right after add_unit.
>>>>
>>>> https://gist.github.com/buggtb/c271dd79d782af57dea6
>>>>
>>>> Yet in the status dump you can see all 3 units sat there seemingly
>>>> happy.
>>>>
>>>> Tom
>>>>
>>>> On 9 March 2016 at 18:31, Tom Barber <tom at analytical-labs.com> wrote:
>>>>
>>>>> Oh really?
>>>>>
>>>>> /me strokes his invisible beard.
>>>>>
>>>>>
>>>>> Okay I'll go back and try again.
>>>>>
>>>>> Tom
>>>>>
>>>>> On 9 March 2016 at 16:56, Tim Van Steenburgh <
>>>>> tim.van.steenburgh at canonical.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 9, 2016 at 6:31 AM, Tom Barber <tom at analytical-labs.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks Stuart.
>>>>>>>
>>>>>>> I do put a note in my charm message indicating the leader IP address
>>>>>>> so that users know which to connect to.
>>>>>>>
>>>>>>> So with juju wait, would I destroy a unit then execute juju wait? At
>>>>>>> which point it will hang until the leader election stuff is over and all
>>>>>>> becomes stable again?
>>>>>>>
>>>>>>>
>>>>>> Since you're already using amulet, there's no need to use the juju-wait
>>>>>> plugin, since d.sentry.wait() does the same thing. So yes, you would do
>>>>>> d.remove_unit(...) and then call d.sentry.wait().
>>>>>>
>>>>>>
>>>>>>> Also, will this work if I push it upstream to the charmers and the
>>>>>>> automated tests up there?
>>>>>>>
>>>>>>>
>>>>>> Yes.
>>>>>>
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Tom
>>>>>>>
>>>>>>> On 9 March 2016 at 11:00, Stuart Bishop <stuart.bishop at canonical.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> On 9 March 2016 at 20:31, Tom Barber <tom at analytical-labs.com>
>>>>>>>> wrote:
>>>>>>>> > Morning all
>>>>>>>> >
>>>>>>>> > I'm trying to test for charm reconfiguration if the leader goes
>>>>>>>> > AWOL.
>>>>>>>>
>>>>>>>> I put the role of the unit in its workload status, so it is easy for
>>>>>>>> operators to see which unit is master. And this also makes it easy
>>>>>>>> for tests to tell.
>>>>>>>>
>>>>>>>>
>>>>>>>> > Adam suggested that I watch the status waiting for the next leader
>>>>>>>> > election, hook the wait on that, and then check my service configs.
>>>>>>>>
>>>>>>>> You are best off waiting for all the hooks to complete and a steady
>>>>>>>> state, not just leader elected (since things will still be in flux
>>>>>>>> when that hook fires, such as the leader-settings-changed hooks it
>>>>>>>> will probably trigger and the relation changes those hooks will
>>>>>>>> likely trigger). Use the juju-wait plugin, and maybe add support to
>>>>>>>> https://bugs.launchpad.net/juju-core/+bug/1488777 to get this into
>>>>>>>> core.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Stuart Bishop <stuart.bishop at canonical.com>
>>>>>>>>