[ec2] [ubuntu-cloud] RFC: server-lucid-ec2-config: user-data configuration file

Eric Hammond esh at ubuntu.com
Mon Dec 28 06:45:52 GMT 2009


Mathias Gug wrote:
> It seems that running apt-get update is an action that needs to be done on
> *every* first boot.

The problem is that a user might want apt-get update to run after they
have set things up, not before.  Worse, the automatic apt-get update may
still be running when the user tries to run their own.

My other concern is the dependency on the Canonical Ubuntu archives
hosted on EC2.  Given their current architecture, I do not wish to rely
on them or have an outage there slow down my instance startup.

> Thus it seems like a good candidate for defaulting to true, with an
> option to not be done.

A user-data script has no way to opt out of the update if the script
only runs after the update has already been done.

> Upstart dependencies could be used to make sure that the user-data script runs
> only when the apt-get update job has finished.

That's a partial solution.  It does not remove the dependency on the
Canonical apt mirrors.

> I'm not sure I fully understand what you mean here. If existing apt mirrors are
> removed from the sources.list (in the case of a customized instance), apt-get
> update will not fetch them.

It's a matter of timing.  It sounds like this order is being proposed:

1. ec2init scripts install Canonical mirrors on EC2
2. ec2init scripts run apt-get update
3. user-data script is run
4. user-data script installs other apt mirrors
5. user-data script runs apt-get update

This doesn't work for me, as I'd like to avoid having (2) run
automatically before the user-data script gets a chance to set things up.

> Could you elaborate on the use case you refer to here?

I've attached a sample user-data script.  It sets up the RightScale
Ubuntu mirrors on EC2, which have historically been more reliable than
the Canonical apt mirrors.

It also uses a particular point-in-time snapshot of the Ubuntu mirrors
(2009/12/01 in this example).  This is a feature the Canonical mirrors do
not currently offer, and it helps with running only tested software in
production EC2 environments.

The script then updates and upgrades to that point in time and installs
some sample software.
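
As a rough illustration of the idea (this is not the attached script),
the sketch below builds sources.list entries for a dated snapshot.  The
host mirror.example.com, the YYYY/MM/DD path layout, the release name and
the function name are all assumptions made up for the example; a real
snapshot mirror may lay out its URLs differently.

```python
from datetime import date


def snapshot_sources(mirror_base, snapshot, release,
                     components=("main", "universe")):
    """Return sources.list text pointing at a dated mirror snapshot.

    Hypothetical layout: <mirror_base>/<YYYY>/<MM>/<DD>.
    """
    url = mirror_base.rstrip("/") + "/" + snapshot.strftime("%Y/%m/%d")
    comps = " ".join(components)
    lines = [f"{kind} {url} {release} {comps}" for kind in ("deb", "deb-src")]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder mirror host and release, for illustration only.
    print(snapshot_sources("http://mirror.example.com/ubuntu",
                           date(2009, 12, 1), "karmic"), end="")
```

A user-data script would write this text to /etc/apt/sources.list before
running its own apt-get update and apt-get upgrade, so the instance only
ever sees packages as of the snapshot date.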

> To me it looks like we would just delay the instance boot. Considering that the
> default mirrors used by the base AMIs are located in the relevant EC2 zone, I'm
> not sure how long we would delay the boot process.

In the default case when everything is working well, I agree that the
delay is minimal.  The longest part of the delay tends to be the
security updates which come from outside of EC2.

However, in the failure modes I have seen with the apt archives on EC2,
the startup delay can be significant while apt-get waits for network
timeouts.

Proposal:

  ec2init automatically runs apt-get update on first boot, UNLESS:

  1. a user-data script is provided by the user (starting with #!), OR
  2. the advanced user-data configuration format is provided by the user
     AND that configuration specifies that apt-get update should not be
     run.

This provides backwards compatibility, flexibility, and control for
advanced users, whether they are using user-data scripts or the new
config format, and still gives normal users the apt-get update that they
might not have realized was needed.

The only use case this proposal does not support is backwards
compatibility for users who configure their instances over ssh
immediately after the instance comes up and who don't want apt-get
update to run automatically.  The workaround for this (likely rare) edge
case would be to pass in a stub user-data script which does nothing.
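
The decision rule in the proposal can be sketched roughly as follows.
The advanced configuration format has not been defined in this thread,
so the "key: value" syntax and the apt_update key below are purely
hypothetical stand-ins, as are the function names.

```python
def parse_config(text):
    """Parse hypothetical 'key: value' lines (not the real config format)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        config[key.strip()] = value.strip()
    return config


def should_run_apt_update(user_data):
    """Return True if apt-get update should run automatically on first boot."""
    if not user_data:
        return True   # no user-data at all: normal users get the update
    if user_data.startswith("#!"):
        return False  # case 1: a user-data script takes full control
    # case 2: the config format may explicitly disable the update
    # ("apt_update" is an assumed key name).
    value = parse_config(user_data).get("apt_update", "true")
    return value.lower() not in ("false", "no", "0")
```

With this rule a stub script such as "#!/bin/sh" followed by nothing else
is enough to suppress the automatic update, which is the workaround
described above.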

--
Eric Hammond
-------------- next part --------------
A non-text attachment was scrubbed...
Name: user-data.sh
Type: application/x-sh
Size: 1182 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/ubuntu-ec2/attachments/20091227/a6a1899f/attachment.sh 

