[Bug 954197] Re: base system installation is not robust against transient network failures
Colin Watson
cjwatson at canonical.com
Tue Mar 13 16:49:50 UTC 2012
This is a complex problem that has been known in the Debian installer
since at least 2004. I'm going to try to break it down here in the hope
of making some progress on it.
1. Download error handling in debootstrap is arranged wrongly
In particular, it doesn't deal correctly with corrupted files, and
will tend to muddle on until something fails as a consequence of the
corruption. In some cases it's possible for debootstrap to complete
successfully despite a corrupted download! There's a patch in
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=618920 that improves
things, although I've been working on a better version of it.
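To illustrate the failure mode in item 1, here is a minimal Python sketch of checking each downloaded file against its expected size and SHA-256 before anything is unpacked. It is not debootstrap's code (debootstrap is a shell script) and not the patch in the bug above; the function names `verify_download` and `accept_or_discard` are hypothetical, and the expected size and checksum are assumed to come from the archive's package index.

```python
import hashlib
import os


def verify_download(path, expected_size, expected_sha256):
    """Return True only if the file at `path` matches both size and SHA-256."""
    try:
        # A cheap size check catches truncated downloads before hashing.
        if os.path.getsize(path) != expected_size:
            return False
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            # Hash in chunks so large packages are not read into memory at once.
            for chunk in iter(lambda: f.read(65536), b""):
                digest.update(chunk)
    except OSError:
        # A missing or unreadable file counts as a failed download.
        return False
    return digest.hexdigest() == expected_sha256


def accept_or_discard(path, expected_size, expected_sha256):
    """Keep a verified file; remove a corrupted one so it cannot be unpacked."""
    if verify_download(path, expected_size, expected_sha256):
        return True
    if os.path.exists(path):
        os.remove(path)
    return False
```

The point of discarding the file immediately is that a corrupted download becomes a reported retrieval error instead of a later, harder-to-diagnose unpack failure.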
2. No retry option
As Joey notes in
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=283600, there's only a
fairly limited
communication channel between debootstrap (which is a separate tool
invoked by the installer to do the hard work) and the parts of the
installer that can actually interact with the user. This means that
it's hard to set up a "retry" option that just retries a single
download, because debootstrap doesn't wait for user interaction on
errors and it would be a substantial amount of work to rearrange it to
do so.
What we might be able to do is as follows: if debootstrap fails at the
retrieval stage before it actually starts unpacking anything, then we
could offer an option that simply tries the whole thing again, keeping
the previous contents of /target (that would preserve anything you'd
fetched by hand with wget, while still trying to redownload any other
missing files itself). This is a little less neat, but would do the
job. In fact, if we borrowed some ideas from net-retriever, we could
even let you choose a different mirror.
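A rough Python sketch of that "retry the whole retrieval" idea, under stated assumptions: `retrieve_all`, `fetch` and `is_valid` are hypothetical names, not part of debootstrap or base-installer, and the loop stands in for what would really be the user choosing "retry" (and possibly a different mirror) after each failed pass. Files that are already present and valid in the cache directory are left alone.

```python
import os


def retrieve_all(packages, cache_dir, mirrors, fetch, is_valid, max_passes=3):
    """Repeat the whole retrieval pass until nothing is missing.

    packages  : list of file names to obtain (hypothetical input)
    cache_dir : directory whose existing contents are kept between passes
    mirrors   : mirror names or URLs to rotate through, one per pass
    fetch     : callable(mirror, name, dest) that downloads one file and
                raises OSError on failure (stand-in for the real downloader)
    is_valid  : callable(path) -> bool, for example a checksum check
    Returns the list of names still missing after the final pass.
    """
    missing = list(packages)
    for attempt in range(max_passes):
        mirror = mirrors[attempt % len(mirrors)]
        still_missing = []
        for name in missing:
            dest = os.path.join(cache_dir, name)
            # Keep anything already present and valid, including files the
            # user fetched by hand between passes.
            if os.path.exists(dest) and is_valid(dest):
                continue
            try:
                fetch(mirror, name, dest)
            except OSError:
                still_missing.append(name)
                continue
            if not is_valid(dest):
                still_missing.append(name)
        missing = still_missing
        if not missing:
            break
    return missing
```

Because nothing is unpacked until the returned list is empty, repeating the pass is safe: it only ever adds files to the cache directory.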
** Bug watch added: Debian Bug tracker #618920
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=618920
** Bug watch added: Debian Bug tracker #283600
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=283600
** Also affects: debootstrap (Debian) via
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=618920
Importance: Unknown
Status: Unknown
** Also affects: base-installer (Debian) via
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=283600
Importance: Unknown
Status: Unknown
** Changed in: base-installer (Ubuntu)
Status: New => Triaged
** Changed in: base-installer (Ubuntu)
Importance: Undecided => Medium
** Changed in: debootstrap (Ubuntu)
Status: New => Triaged
** Changed in: debootstrap (Ubuntu)
Importance: Undecided => Medium
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to base-installer in Ubuntu.
https://bugs.launchpad.net/bugs/954197
Title:
base system installation is not robust against transient network
failures
Status in “base-installer” package in Ubuntu:
Triaged
Status in “debootstrap” package in Ubuntu:
Triaged
Status in “base-installer” package in Debian:
Unknown
Status in “debootstrap” package in Debian:
Unknown
Bug description:
[This bug was originally reported by Gary Potwin in
https://answers.launchpad.net/ubuntu/+source/ubiquity/+question/158404]
I have a Supermicro P6SBA motherboard with a 700 MHz Pentium III, 512
MB of RAM, 20 GB and 120 GB hard drives, and a DSL Internet connection.
This system has been running Windows 98 for years, and I wanted to try
Ubuntu 10.04.
Using the network kernel and initrd, the system boots OK and
downloads the rest of the installer OK from the default mirror
(us.archive.ubuntu.com) except for one small problem (see below).
Everything seems OK until I try to load the base system.
After downloading files for about 3.5 minutes (often when it is trying
to get the file libklibc), I see the network activity stop, and soon
after I get an error message stating that it has failed to load that
file (all others up to that point were OK).
After about 2 more minutes, during which time one or more additional
files fail to load, the network activity goes back to normal, and all
the remaining files for the base system download OK.
Due to the failed files, I get the error message that the base system
has failed to install.
From a console, I did successfully download the failed files with wget
into /target/var/cache/apt/archives while the automatic download was
still in progress, but after the above failure.
The system still thinks that the files were not successfully
downloaded, and I don't know how to tell the system that they are
there and OK. I have tried using many different mirrors at different
times of the day, and both http and ftp, all fail as above.
Using a similar technique, I was able to successfully load Debian
5.0.8, so I think the hardware is OK, but I would really like to try
Ubuntu.
To try to rule out any problem with the DSL, I downloaded a very large
file that took 10 minutes of continuous running under Windows.
Then I went back to the 10.04 install and did the same thing using
wget at a console, just after partitioning the hard drive, and just
before starting the base install, and it worked fine. While loading
the installer, one file did fail to load, but I was given the
opportunity to retry, which took care of the problem.
I wish the base install allowed retries instead of just "go back" and
"continue", which don't seem to make any additional attempt to retry.
Any help would be appreciated.
Gary
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/base-installer/+bug/954197/+subscriptions