InvalidEntryName: Invalid entry name: te\*st
John Arbash Meinel
john at arbash-meinel.com
Tue Jan 2 23:30:05 GMT 2007
Lars Wirzenius wrote:
> Mostly just for kicks (not because I expect to use the result) I thought
> I'd see if I can import the entire Debian source archive, unpacked, into
> a single bzr branch. It's almost 60 gigabytes of source code, in about 4
> million files. After over 24 hours of real time and about 15 hours of
> CPU time, it crashed with the stack trace below. I'm sending it to the
> developers in the hope it is useful in finding and fixing a bug.
This isn't strictly a bug, though it probably shouldn't be giving a
traceback but rather a more helpful error. Basically, we disallow '\'
as a character in filenames. We've talked about allowing it on
platforms that can handle it, and it might actually work now if we
just removed the check, but for the moment '\' is an illegal character
in files versioned by bzr. It isn't the only one (we only support
Unicode names, so there are some byte sequences that would also be
considered illegal).
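For illustration only (a sketch, not bzrlib's actual code, and the
helper name here is made up), the check amounts to something like:

    def validate_entry_name(name):
        # bzr refuses '\' in versioned file names; the real check
        # lives in bzrlib and raises errors.InvalidEntryName.
        if '\\' in name:
            raise ValueError('Invalid entry name: %s' % name)
        return name

    try:
        validate_entry_name(r'te\*st')
    except ValueError as e:
        print(e)   # Invalid entry name: te\*st, as in the traceback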
>
> I used a checkout of the bzr.dev branch, latest revision is 2215,
> according to bzr log. I hope you'll excuse me if I don't try to update
> and re-run the test before reporting. I'm willing to re-run it with a
> newer version if it helps find the problem.
>
> Incidentally, the test was run on a Linux (Debian, not entirely
> up-to-date sid) machine with 1 GB of physical memory, and five gigabytes
> of swap space. During the test, I increased swap space from one to five
> gigabytes, since space was running out. The bzr RSS size was hovering at
> around 800 megabytes for at least 12 hours, the VIRT slowly grew to two
> gigabytes. There was no space for disk caching, so I assume the test
> would run significantly faster if bzr used less memory. I'm not faulting
> bzr for using so much memory, though; this is a pretty extreme case.
>
I'm not sure why we are using that much memory. I know we have a policy
of being able to hold at least 3 copies of a file's text in memory (1
original, 1 current, and ~1 to record diffs or merges). But I don't think
we have a specific design that says we need to hold the contents of all
files in memory at the same time. I can see the inventory getting large,
but I don't think it should get large enough to cause the problems you
were seeing.
I'm curious where the specific memory consumption came from.
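For a rough sense of scale (a sketch with assumed, not measured,
numbers; the class below is a stand-in for a real inventory entry,
which carries more state than this):

    import sys

    class Entry(object):
        # Stand-in for an inventory entry; treat the resulting
        # figure as a floor, not a measurement of bzr itself.
        def __init__(self, file_id, name, parent_id):
            self.file_id = file_id
            self.name = name
            self.parent_id = parent_id

    e = Entry('file-id-0001', 'example.c', 'dir-id-0001')
    # getsizeof() does not follow references, so count the pieces:
    size = sys.getsizeof(e) + sys.getsizeof(e.__dict__) + sum(
        sys.getsizeof(v) for v in (e.file_id, e.name, e.parent_id))
    print('%d bytes per entry (rough floor)' % size)
    print('%.0f MiB for ~4 million entries' % (size * 4e6 / 2**20))

Even a couple hundred bytes per entry across 4 million files puts the
inventory alone in the hundreds-of-megabytes range, so it would be
worth measuring where the memory actually goes.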
John
=:->