<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffcc" text="#000000">
Robert Collins wrote:
<blockquote cite="mid:1239838846.2892.614.camel@lifeless-64" type="cite">
<pre wrap="">On Thu, 2009-04-16 at 00:40 +1000, Ian Clatworthy wrote:

I do like the docs. However I have some concerns too.
</pre>
<blockquote type="cite">
<pre wrap="">1. checkouts should be lightweight by default (à la svn)
</pre>
</blockquote>
<pre wrap="">
There are serious performance implications in this. For starters, right now this scales with tree size * latency. I.e. it's about as slow as it's possible to be. Is it something we could aim for? Yes, but then we'll have to download data over the network for every diff operation. Lightweight checkouts are really only suitable for high-bandwidth, low-latency environments. Fixing them to have a local cache is ~= to using a stacked bound branch with sufficient fixes to allow disconnected use. I don't think we have the manpower to achieve those fixes in the sort of time frame we'd like to have a 2.0 happening in.
</pre>
</blockquote>
Nevertheless, it makes "quick fixes to code that lives elsewhere" possible. We want a checkout time that is ~= "download the release tarball". I agree that network-diff is toxic, but am determined that we get the quick-checkout piece right, and for 2.0.
<blockquote cite="mid:1239838846.2892.614.camel@lifeless-64" type="cite">
<blockquote type="cite">
<pre wrap="">2. a branch could reuse the repository of a dependent branch without needing to explicitly create a shared repo in a parent directory.
</pre>
</blockquote>
<pre wrap="">
I think this would make it harder for people to reason about 'where is my data' - which is important when deleting branches. If users get it wrong they will delete data they care about. It's not /so/ different from a lightweight checkout, so perhaps the change isn't that large.
However, I think the root cause of the issue is that we have an expensive resource (the repository) and we want to make more reuse of it, but we don't by default make it explicit - so people need to learn about it through trial and error, and our toolchain isn't as polished for working with one repo, one work area, and multiple branches as it should be.

Misquoting Martin from a couple of days ago, we can either tackle the root cause, or we can tackle specific symptoms. Tackling the root cause requires effort and thought but should lead to a better result - consider the brisbane-core project, which tackled repository performance issues, and has a really good result - just tackling serialisation of inventories, or data compression alone wouldn't have gotten the same result [and there are benchmarks from the past showing that :)]. It took us focusing down and fixing the actual root cause (comparing trees is O(tree) not O(delta)) to get a good solution.

I think we have to fix the root cause here - our [well-intentioned] hiding of repositories by default. As an expensive resource, repositories are a performance optimisation. It is fine (and correct in my opinion) to keep them as just a performance optimisation, but we should make them available for reuse by default.

Now, as to how we do that, we can look at the suggestion in this story of having some sort of pointer to the repository in arbitrary branches [but this invalidates some assumptions repositories hold dear, so we would not be able to do garbage collection at all anymore]. We already violate this for stacked-upon repositories - they cannot be garbage collected, *but* we also only stack on branches which means that it's fairly hard to get references to garbage-collectable-revisions.

We can have a list of branches in the repository - we currently have that by 'ls -lR' under the repository - it's the 'branches are contained' rule.
This is important because it means a repository can reason about what data is transient and removable, and what is used. Jelmer's colocated branches plugin is a way of extending this without adding directories where a user has their source code. There are other ways to do this - a list of named heads, or references, can also manage this, and would be more efficient at a number of operations (list-branches would be _much_ faster). And we can stop creating repository+branch at the same place, which makes the repository immediately reusable with all of our normal tools.

In short I think it's important we solve the 'reuse the repository' use case, but we should aim to do so by reducing the permutations available, not increasing them. So some design is needed, including evaluating the current proposals on the table, and things like the repository design Aaron favoured back when we first added shared repositories; then we pick one, and run with it.
</pre>
</blockquote>
All sounds fine, and compatible with my suggestions. As you wish on the internals, but please don't let the UI design drift in the process.
Mark
</body>
</html>