Mongo experts - help needed please

Ian Booth ian.booth at canonical.com
Thu Jul 24 07:03:11 UTC 2014


So, we have some intermittently failing tests in state/watcher.
The tests fail when the machine on which they are run is heavily loaded.

The watcher infrastructure essentially polls the txns.log collection every
5 seconds to see what new transactions have been written, and turns any new
transactions into events to send out to listeners.

Like this:

iter := w.log.Find(nil).Batch(10).Sort("-$natural").Iter()
var entry bson.D
for iter.Next(&entry) {
   // do stuff with entry
}

where w.log is a Mongo collection db.C("txns.log")
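
For anyone not familiar with the watcher, here is a minimal in-memory sketch
of that polling step (no mgo, no real collection). It assumes the watcher
remembers the id of the last entry it processed and reads newest-first until
it reaches that id. logEntry, newEntries and lastID are hypothetical names
used only for illustration; this is not the actual watcher code.

```go
package main

import "fmt"

// logEntry is a hypothetical stand-in for one txns.log document.
type logEntry struct {
	ID      int
	Changes string
}

// newEntries walks the log newest-first (the in-memory equivalent of
// Sort("-$natural")) and stops when it reaches lastID. It returns the
// unseen entries oldest-first, so events go out in write order.
// Assumption: the caller tracks lastID between polls.
func newEntries(log []logEntry, lastID int) []logEntry {
	var unseen []logEntry
	for i := len(log) - 1; i >= 0; i-- {
		if log[i].ID == lastID {
			break
		}
		unseen = append(unseen, log[i])
	}
	// Reverse in place: unseen was collected newest-first.
	for i, j := 0, len(unseen)-1; i < j; i, j = i+1, j-1 {
		unseen[i], unseen[j] = unseen[j], unseen[i]
	}
	return unseen
}

func main() {
	log := []logEntry{{1, "a"}, {2, "b"}, {3, "c"}}
	// Entry 1 was handled on the previous poll; 2 and 3 are new.
	for _, e := range newEntries(log, 1) {
		fmt.Println(e.ID, e.Changes)
	}
}
```

In this sketch, if lastID is never found in the log, every entry is treated
as new.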

To stop the Mongo session going stale and losing its connection, which results
in i/o timeout errors (e.g. bug 1307434), the session can be copied before each use:

session := w.log.Database.Session.Copy()
defer session.Close()
log := w.log.With(session)
iter := log.Find(nil).Batch(10).Sort("-$natural").Iter()
var entry bson.D
for iter.Next(&entry) {
   // do stuff with entry
}

However, doing a session.Copy() each time the transaction log collection is
queried (every 5 seconds) causes a number of test failures when the host machine
is heavily loaded. Either extra events are received or events are missed.

Can anyone explain what's going on here? Why does copying a session affect
what's read from the txn log collection?




