RFC: mongo "_id" fields in the multi-environment juju server world
roger peppe
roger.peppe at canonical.com
Mon Jul 7 17:03:07 UTC 2014
On 7 July 2014 16:59, Gustavo Niemeyer <gustavo at niemeyer.net> wrote:
> On Mon, Jul 7, 2014 at 12:26 PM, roger peppe <roger.peppe at canonical.com> wrote:
>> On 7 July 2014 14:27, Gustavo Niemeyer <gustavo at niemeyer.net> wrote:
>>> On Mon, Jul 7, 2014 at 10:09 AM, roger peppe <roger.peppe at canonical.com> wrote:
>>>> I had assumed that because every client needs to see every transaction
>>>> there would likely be no benefit to sharding the log, although
>>>> technically you could shard on transaction id. I'd be
>>>
>>> Clients don't need to see every transaction. Only those that affect
>>> the documents they are acting on.
>>
>> Is it actually possible to shard the transaction log based on the documents
>> the transactions are acting on?
>
> That's unrelated to what you said above, or to my response.
>
> Either way, we can shard transaction documents, and we can add a shard
> key to them if necessary.
The latter might turn out to be quite awkward, though there's
probably a nice solution I don't see.
Suppose we've got three environments, A, B and C.
We have transactions that span {A, B}, {B, C} and {C, A}.
How can we choose a consistent shard key for all those
transactions?
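To make the difficulty concrete, here is a minimal Go sketch. It rests on one assumption that is not established anywhere in this thread: that a transaction document has to carry the same shard key as every environment it touches. The colocate function and its inputs are hypothetical names used for illustration only; nothing here is part of the txn package.

```go
package main

import "fmt"

// colocate returns, for each environment, a representative shard key.
// ASSUMPTION (illustrative only): a transaction must share a shard key
// with every environment it touches, so all environments named in one
// transaction are forced into the same group. Implemented as a simple
// union-find over environment names.
func colocate(envs []string, txns [][]string) map[string]string {
	parent := make(map[string]string)
	find := func(e string) string {
		// Register environments that appear only inside a transaction.
		if _, ok := parent[e]; !ok {
			parent[e] = e
		}
		for parent[e] != e {
			e = parent[e]
		}
		return e
	}
	for _, e := range envs {
		find(e)
	}
	for _, t := range txns {
		if len(t) == 0 {
			continue
		}
		for _, e := range t[1:] {
			// Merge the group of e into the group of the first environment.
			parent[find(e)] = find(t[0])
		}
	}
	key := make(map[string]string)
	for e := range parent {
		key[e] = find(e)
	}
	return key
}

func main() {
	envs := []string{"A", "B", "C"}
	txns := [][]string{{"A", "B"}, {"B", "C"}, {"C", "A"}}
	fmt.Println(colocate(envs, txns))
}
```

Under that assumption, the three pairwise transactions pull A, B and C into a single group, so they all end up with the same key and nothing is actually spread across shards. A different scheme (for example, keying on transaction id and accepting cross-shard reads) would avoid this, at the cost of the locality the sketch tries to preserve.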
>>>> Thanks for pointing this out. If we manage to hugely scale juju using mongodb
>>>> I will be very happy. I still think we should do some measurements to
>>>> convince us that we actually have some hope of doing so though.
>>>> My own measurements left me less than convinced of the
>>>> possibility, although it's been a while since I did them.
>>>
>>> When you measured a sharded setup, what was the outcome?
>>
>> I simply measured operation rate (of some actual juju operations)
>> on a non-sharded setup.
>
> Okay, so the measurements that left you unconvinced that sharding
> might help to scale up were not using sharding.
If we struggle to meet the requirements for a single environment,
we're unlikely to meet them when we're running several environments
per shard, which is surely necessary if we're to scale up.
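As a rough back-of-envelope sketch of that concern, the Go snippet below divides a shard's operation rate evenly among the environments it hosts. The 60 operations per second is the figure measured on the non-sharded setup discussed in this thread; the environment counts, the even split, and the perEnvRate helper are hypothetical and only there to show the shape of the arithmetic.

```go
package main

import "fmt"

// perEnvRate returns the share of a shard's operation rate available to
// each environment, ASSUMING the load is spread evenly across them.
// It returns 0 for a non-positive environment count.
func perEnvRate(shardOpsPerSec float64, envsPerShard int) float64 {
	if envsPerShard <= 0 {
		return 0
	}
	return shardOpsPerSec / float64(envsPerShard)
}

func main() {
	// 60 ops/s is the measured non-sharded figure from this thread;
	// the environment counts below are hypothetical.
	for _, n := range []int{1, 2, 5, 10} {
		fmt.Printf("%2d environments per shard: %.1f ops/s each\n", n, perEnvRate(60, n))
	}
}
```

Real workloads will not split evenly, and a faster per-shard rate changes the numbers proportionally, so this only illustrates why the single-environment figure matters before sharding is considered.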
>> I saw around 60 operations per second.
>> It may well have been that I was testing an inefficient setup, or
>> that my mongo settings were inadequate.
>
> I cannot really comment on that. What I can say is:
>
> 1. The txn package can run transactions on the order of a few hundred
> per second in my measurements on MongoDB 2.2
>
> 2. Sharding allows sending load to independent replica sets
>
> 3. MongoDB performance is improving release over release, and there's
> more coming (http://goo.gl/qPE9LB)
>
> 4. Nothing will work without effort.
I hope it can work for us.
I really do.
I just worry that without actually doing some measurement in advance,
we may spend a lot of time working on this stuff and find that it was all for
nought because we're fundamentally bottlenecked somewhere
we didn't anticipate.