Why is spark connected to zookeeper in the "Hadoop Spark" bundle?

Merlijn Sebrechts merlijn.sebrechts at gmail.com
Tue Jan 3 14:53:31 UTC 2017


Thanks! The plugin icon is indeed correct, my brain must be playing tricks
on me.. :)

2017-01-03 15:39 GMT+01:00 Konstantinos Tsakalozos <
kos.tsakalozos at canonical.com>:

> Hi Merlijn,
>
> Are we talking about this hadoop-plugin
> https://jujucharms.com/hadoop-plugin/xenial/6 ? I see the bigtop tag there.
> Yes, the apache bigtop project is where the charms are. However, if you do
> not want to go through bigtop you can submit a PR against our fork of
> apache-bigtop here: https://github.com/juju-solutions/bigtop and we will try
> to push any contributions you have as soon as possible. The bigtop charms
> can be found here:
> https://github.com/juju-solutions/bigtop/tree/master/bigtop-packages/src/charm
>
> Regarding Spark modes: during spark-submit, if you choose "--deploy-mode"
> cluster and pass the list of Spark masters as "--master" (something like
> spark://<master1>:7077), you are effectively in standalone mode. But as I
> see here:
> https://github.com/juju-solutions/bigtop/blob/master/bigtop-packages/src/charm/spark/layer-spark/lib/charms/layer/bigtop_spark.py#L304
> we only start the master and slaves in standalone execution mode. So
> bypassing the execution mode is not safe.
>
> Thanks,
> Konstantinos
>
> On Tue, Jan 3, 2017 at 4:16 PM, Merlijn Sebrechts <
> merlijn.sebrechts at gmail.com> wrote:
>
>> And a third question: is there a repo for this charm outside of bigtop?
>> I'd like to put this info in the readme but don't really want to go through
>> the bigtop process for doing this.
>>
>> 2017-01-03 15:15 GMT+01:00 Merlijn Sebrechts <merlijn.sebrechts at gmail.com>:
>>
>>> Hi Konstantinos
>>>
>>> Thanks for the explanation. Another question. The icon of the "hadoop
>>> plugin" doesn't include the bigtop tag. Doesn't the plugin supply
>>> bigtop-specific libraries?
>>>
>>> I'm not sure if you can really change the execution modes at job deploy.
>>> When you submit applications to Spark in YARN mode, you can choose from two
>>> "deploy modes": `client` or `cluster`[1]. This is different from the
>>> execution modes (YARN, Standalone, Mesos). In `client` mode on YARN, Spark
>>> jobs still run on YARN, but the Spark driver runs on the Spark node. This
>>> means that the Spark node needs to stay alive and connected for the
>>> duration of the job. In `cluster` mode on YARN, the Spark driver itself
>>> also runs on YARN, so when the Spark node goes down, the job will keep
>>> running.
>>>
>>> From what I gather, `cluster` mode is for when you have long-running
>>> applications. `client` mode is for interactive querying and development of
>>> spark applications.
>>>
>>> [1]: http://spark.apache.org/docs/latest/running-on-yarn.html
>>>
>>>
>>> 2017-01-03 14:44 GMT+01:00 Konstantinos Tsakalozos <
>>> kos.tsakalozos at canonical.com>:
>>>
>>>> Hi Merlijn,
>>>>
>>>> You are right, Zookeeper is used only for HA in the standalone mode of
>>>> Spark. In the initial setup of the hadoop-spark bundle Zookeeper is
>>>> not used. However, we decided to have Zookeeper there to accommodate the
>>>> case where you start with hadoop-spark bundle and you switch to the
>>>> standalone execution mode. If I remember correctly you can also override
>>>> the execution mode during job submission.
>>>>
>>>> Thanks,
>>>> Konstantinos
>>>>
>>>> On Tue, Jan 3, 2017 at 3:25 PM, Merlijn Sebrechts <
>>>> merlijn.sebrechts at gmail.com> wrote:
>>>>
>>>>> Hi all
>>>>>
>>>>>
>>>>> According to the bigtop spark charm[1] zookeeper is used for
>>>>> standalone HA mode. If this is the case, then why does the hadoop-spark[2]
>>>>> bundle include a relationship between Spark and Zookeeper
>>>>> when spark_execution_mode is "yarn-client"?
>>>>>
>>>>> [1] https://jujucharms.com/spark/xenial
>>>>> [2] https://jujucharms.com/hadoop-spark/
>>>>>
>>>>>
>>>>>
>>>>> Kind regards
>>>>> Merlijn
>>>>>
>>>>> --
>>>>> Bigdata mailing list
>>>>> Bigdata at lists.ubuntu.com
>>>>> Modify settings or unsubscribe at:
>>>>> https://lists.ubuntu.com/mailman/listinfo/bigdata
>>>>>
>>>>>
>>>>
>>>
>>
>
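
For anyone reading this thread later, the distinction between the execution mode (the cluster manager selected by "--master") and the deploy mode (`client` or `cluster`, selected by "--deploy-mode") can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the charm: the helper names `cluster_manager` and `spark_submit_args`, the host `master1`, and the application name `app.jar` are hypothetical. The only things taken from Spark itself are the `--master` and `--deploy-mode` flags of spark-submit and the usual master URL schemes.

```python
import shlex


def cluster_manager(master):
    """Classify a Spark master URL by cluster manager.

    This corresponds to what the thread calls the "execution mode".
    Hypothetical helper for illustration; not part of Spark or the charm.
    """
    if master.startswith("spark://"):
        return "standalone"
    if master.startswith("mesos://"):
        return "mesos"
    # "yarn-client" / "yarn-cluster" are the older combined spellings.
    if master == "yarn" or master.startswith("yarn-"):
        return "yarn"
    if master.startswith("local"):
        return "local"
    raise ValueError("unrecognised master URL: %r" % master)


def spark_submit_args(master, deploy_mode, app):
    """Build a spark-submit argument list; the deploy mode is chosen
    independently of the cluster manager."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy mode must be 'client' or 'cluster'")
    return ["spark-submit", "--master", master,
            "--deploy-mode", deploy_mode, app]


if __name__ == "__main__":
    # "master1" and "app.jar" are placeholders, not real hosts or files.
    examples = [
        ("yarn", "client"),                   # driver runs on the Spark node
        ("yarn", "cluster"),                  # driver runs inside YARN
        ("spark://master1:7077", "cluster"),  # standalone cluster manager
    ]
    for master, mode in examples:
        args = spark_submit_args(master, mode, "app.jar")
        print(cluster_manager(master), mode,
              " ".join(shlex.quote(a) for a in args))
```

Note that the last example only helps if the standalone master and workers are actually running, which, as Konstantinos points out, the charm only starts in standalone execution mode.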