Why is spark connected to zookeeper in the "Hadoop Spark" bundle?

Tue Jan 3 14:16:54 UTC 2017

And a third question: is there a repo for this charm outside of bigtop? I'd
like to put this info in the readme but don't really want to go through the
bigtop process for doing this.

2017-01-03 15:15 GMT+01:00 Merlijn Sebrechts <merlijn.sebrechts at gmail.com>:

> Hi Konstantinos
>
> Thanks for the explanation. Another question. The icon of the "hadoop
> plugin" doesn't include the bigtop tag. Doesn't the plugin supply
> bigtop-specific libraries?
>
> I'm not sure you can really change the execution mode at job submission.
> When you submit applications to Spark in YARN mode, you can choose between
> two "deploy modes": `client` or `cluster`[1]. These are different from the
> execution modes (YARN, Standalone, Mesos). In `client` mode on YARN, Spark
> jobs still run on YARN, but the Spark driver runs on the Spark node. This
> means that the Spark node needs to stay alive and connected for the
> duration of the job. In `cluster` mode on YARN, the Spark driver itself
> also runs on YARN, so when the Spark node goes down, the job keeps
> running.
>
> From what I gather, `cluster` mode is meant for long-running
> applications, while `client` mode is meant for interactive querying and
> development of Spark applications.
>
> [1]: http://spark.apache.org/docs/latest/running-on-yarn.html
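To make the `client` versus `cluster` distinction above concrete, here is a minimal Python sketch that assembles the two spark-submit invocations. The helper name and `my_job.py` are hypothetical; `--master yarn` and `--deploy-mode` are the spark-submit flags described on the running-on-yarn page [1]. It only builds the command line and does not launch anything.

```python
import shlex

# The two deploy modes spark-submit accepts when the master is YARN.
VALID_DEPLOY_MODES = ("client", "cluster")


def spark_submit_command(app, deploy_mode, master="yarn", app_args=()):
    """Return the spark-submit argument list for the given deploy mode.

    Hypothetical helper for illustration; it does not run the command.
    """
    if deploy_mode not in VALID_DEPLOY_MODES:
        raise ValueError(f"unknown deploy mode: {deploy_mode!r}")
    return [
        "spark-submit",
        "--master", master,            # execution mode (here: YARN)
        "--deploy-mode", deploy_mode,  # where the driver runs
        app,
        *app_args,
    ]


if __name__ == "__main__":
    # "my_job.py" is a placeholder application name.
    for mode in VALID_DEPLOY_MODES:
        print(shlex.join(spark_submit_command("my_job.py", mode)))
```

In both cases the executors run on YARN; only the location of the driver changes, which is why `client` mode needs the submitting node to stay up.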
>
>
> 2017-01-03 14:44 GMT+01:00 Konstantinos Tsakalozos <
> kos.tsakalozos at canonical.com>:
>
>> Hi Merlijn,
>>
>> You are right, Zookeeper is used only for HA in the standalone mode of
>> Spark. In the initial setup of the hadoop-spark bundle, Zookeeper is not
>> used. However, we decided to have Zookeeper there to accommodate the case
>> where you start with the hadoop-spark bundle and then switch to the
>> standalone execution mode. If I remember correctly, you can also override
>> the execution mode during job submission.
>>
>> Thanks,
>> Konstantinos
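For reference, a rough sketch of what standalone HA through Zookeeper involves on the Spark side. The three `spark.deploy.*` properties are the ones the Spark standalone documentation lists for Zookeeper-based master recovery, usually passed through `SPARK_DAEMON_JAVA_OPTS`. The helper name, the host names and the `/spark` directory are assumptions for illustration, and this is not necessarily how the charm itself configures it.

```python
def standalone_ha_java_opts(zk_hosts, zk_dir="/spark"):
    """Build a SPARK_DAEMON_JAVA_OPTS value for Zookeeper-based master recovery.

    zk_hosts: list of "host:port" strings (hypothetical values in the example).
    zk_dir:   Zookeeper directory used to store recovery state (assumed here).
    """
    if not zk_hosts:
        raise ValueError("at least one Zookeeper host:port is required")
    props = {
        "spark.deploy.recoveryMode": "ZOOKEEPER",
        "spark.deploy.zookeeper.url": ",".join(zk_hosts),
        "spark.deploy.zookeeper.dir": zk_dir,
    }
    # Each property becomes a -Dkey=value JVM option, in insertion order.
    return " ".join(f"-D{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Placeholder Zookeeper endpoints.
    print(standalone_ha_java_opts(["zk1:2181", "zk2:2181"]))
```

None of this applies while the bundle runs in a YARN execution mode, which matches the point that Zookeeper is unused in the initial setup.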
>>
>> On Tue, Jan 3, 2017 at 3:25 PM, Merlijn Sebrechts <
>> merlijn.sebrechts at gmail.com> wrote:
>>
>>> Hi all
>>>
>>>
>>> According to the bigtop spark charm[1], zookeeper is used for standalone
>>> HA mode. If this is the case, then why does the hadoop-spark[2] bundle
>>> include a relationship between Spark and Zookeeper
>>> when spark_execution_mode is "yarn-client"?
>>>
>>> [1] https://jujucharms.com/spark/xenial
>>> [2] https://jujucharms.com/hadoop-spark/
>>>
>>>
>>>
>>> Kind regards
>>> Merlijn
>>>
>>> --
>>> Bigdata mailing list
>>> Bigdata at lists.ubuntu.com
>>> Modify settings or unsubscribe at:
>>> https://lists.ubuntu.com/mailman/listinfo/bigdata
>>>
>>>
>>
>