Pruning the txns collection

John Meinel john at arbash-meinel.com
Thu May 14 04:45:16 UTC 2015


So there is one small wrench in the works that we will want to investigate.

While we were trying to clean up the database, we found a lot of documents
with transaction ids in txn-queue that reference txn documents that were
already in the APPLIED (6) state.
Looking at the TXN code, the act of applying the changes to a document
also does a $pullAll of the transactions it is applying, and only after the
documents have been updated does it update the txn document to APPLIED.
If that reading is right, a document should not normally still reference a
transaction that has already reached APPLIED.
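
To make the ordering concrete, here is a rough in-memory sketch of it. This
is not the mgo/txn code: the dict layout, the "state" and "docs" field
names, the PREPARED value and the crash_before_mark flag are all made up
for illustration. Only the APPLIED (6) value and the txn-queue field name
come from what we saw in the database.

```python
APPLIED = 6   # state number seen in the database for applied transactions
PREPARED = 2  # hypothetical "not yet applied" state, for illustration only


def apply_txn(txn, docs, crash_before_mark=False):
    """Apply in the order described above: pull from queues, then mark APPLIED."""
    for doc_id in txn["docs"]:
        queue = docs[doc_id]["txn-queue"]
        # Stand-in for the $pullAll on the document's txn-queue.
        docs[doc_id]["txn-queue"] = [t for t in queue if t != txn["_id"]]
    if crash_before_mark:
        return  # simulate dying between the two steps
    txn["state"] = APPLIED


def stale_refs(docs, txns):
    """List (doc_id, txn_id) pairs whose txn-queue entry points at an APPLIED txn."""
    applied = {t["_id"] for t in txns if t["state"] == APPLIED}
    return sorted(
        (doc_id, txn_id)
        for doc_id, doc in docs.items()
        for txn_id in doc["txn-queue"]
        if txn_id in applied
    )


docs = {"machine-0": {"txn-queue": ["t1", "t2"]}}
t1 = {"_id": "t1", "state": PREPARED, "docs": ["machine-0"]}
t2 = {"_id": "t2", "state": PREPARED, "docs": ["machine-0"]}

apply_txn(t1, docs)                          # clean apply
apply_txn(t2, docs, crash_before_mark=True)  # dies before marking APPLIED
print(stale_refs(docs, [t1, t2]))            # prints []
```

In this toy model neither the clean apply nor the interrupted one leaves a
txn-queue entry pointing at an APPLIED transaction, which is why the
documents we found look surprising.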

Now, this db did get into a state where it was unhappy, so it is possible
that all of these only appeared after it ran out of disk space or hit some
other unhappy behavior. We do have some backups where the data is broken,
and we could try to dig into those, though it will be hard to know
causality (was it already broken, which caused undefined behavior, or are
we missing some bit of behavior that caused it to become broken?).

John
=:->

On Thu, May 14, 2015 at 5:08 AM, Menno Smits <menno.smits at canonical.com>
wrote:

>
> On 14 May 2015 at 06:41, Gustavo Niemeyer <gustavo.niemeyer at canonical.com>
> wrote:
>
>>
>> You are right that it's not that simple, but it's not that complex either
>> once you understand the background.
>>
>> Transactions are applied by the txn package by tagging each one of the
>> documents that will participate in the transaction with the transaction id
>> they are participating in. When mgo goes to apply a transaction in that
>> same document, it will tag the document with the new transaction id, and
>> then evaluate all the transactions it is part of. If you drop one of the
>> transactions that a document claims to be participating in, then the txn
>> package will rightfully complain since it cannot tell the state of a
>> transaction that explicitly asked to be considered for the given document.
>>
>> That means the solution is to make sure removed transactions are 1) in a
>> final state; and 2) not being referenced by any tagged documents.
>>
>
> Thanks. This explanation clarifies things a lot.
>
>
>>
>> The txn package itself collects garbage from old transactions as new
>> transactions are applied, but it doesn't guarantee that right after a
>> transaction reaches a final state it will be collected. This can lead to
>> pretty old transactions being referenced, if these documents are never
>> touched again.
>>
>
> I was confused by this part when I read it because I don't see anywhere in
> the mgo/txn code where cleanup of the txn collection already occurs. To
> summarise our later IRC conversation for anyone who might be interested:
> mgo/txn doesn't currently prune the txns collection, but it *does* prune
> references to applied transactions from the txn-queue fields on documents.
>
>
>>
>> So, you have two choices to collect these old documents:
>>
>> 1. Clean up the transaction references from all documents
>>
>> or
>>
>> 2. Just make sure the transaction being removed is not referenced anywhere
>>
>> I would personally go for 2, as it is a read-only operation everywhere
>> but in the transactions collection itself, to drop the transaction document.
>>
>
> I agree that #2 is preferable and I have a fairly straightforward strategy
> in mind to make this happen. I'll work on that today.
>
>
>> Note that the same rules here apply to the stash collection as well.
>>
>
> Noted. I know how this hangs together from my work with PurgeMissing.
>
> Thanks,
> Menno
>
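
The rule quoted above (only remove transactions that are in a final state
and are not referenced by any tagged document, stash included) could be
sketched roughly as below. This is a plain in-memory illustration and not
the mgo/txn API: the function names, the "state" field and the list-of-dicts
layout are assumptions, and treating 5 as the aborted state is also an
assumption; only APPLIED (6) and the txn-queue field appear in this thread.

```python
# 6 = APPLIED, as mentioned in this thread; 5 = aborted is an assumption.
FINAL_STATES = {5, 6}


def referenced_ids(collections):
    """Collect every transaction id found in any document's txn-queue."""
    refs = set()
    for docs in collections:
        for doc in docs:
            refs.update(doc.get("txn-queue", []))
    return refs


def prunable_txns(txns, collections):
    """Return ids of txns that are final and not referenced by any document."""
    refs = referenced_ids(collections)  # read-only pass over the documents
    return [
        t["_id"]
        for t in txns
        if t["state"] in FINAL_STATES and t["_id"] not in refs
    ]


txns = [
    {"_id": "t1", "state": 6},  # applied, unreferenced
    {"_id": "t2", "state": 6},  # applied, still in a txn-queue
    {"_id": "t3", "state": 2},  # not final
    {"_id": "t4", "state": 5},  # final, but referenced from the stash
]
machines = [{"_id": "machine-0", "txn-queue": ["t2", "t3"]}]
stash = [{"_id": "stash-0", "txn-queue": ["t4"]}]

print(prunable_txns(txns, [machines, stash]))  # prints ['t1']
```

This follows option 2: the documents are only read, and the only writes
would be the removal of the returned ids from the transactions collection.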