Packaging large Java software stacks ?

Thierry Carrez thierry.carrez at ubuntu.com
Tue Jan 27 09:17:37 GMT 2009


Emmet Hikory wrote:
> Thierry Carrez wrote:
>> (1) Need precisely-versioned artifacts
>> Those stacks need, as build dependencies and as runtime dependencies,
>> very precise versions of JARs. They won't build or run with a different
>> one. Using a more recent one might break functionality in a creative
>> way. A maven-based build will sometimes require 6 different versions of
>> the same JAR. In our packages, we usually offer only one version,
>> in corner cases one minor version for each major version. In some cases
>> the Java software will run/build with ours, in most cases it won't.
> 
>     In the case of libraries in other languages (most commonly C), we
> regularly port applications to work with the preferred version of the
> libraries we ship.  When preparing new versions of C libraries, the API
> and ABI are checked, with the binary package name changed where they
> differ, to better ensure compatibility.  Perhaps we could use the Java
> Introspection methods to generate some API report, and version Java
> libraries based on changing APIs?  

Yes, that would be an option. I know Sun wants to go in a more
stable-API-oriented direction in the future, having acknowledged that
the current JAR version numbers don't really mean anything useful.
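
To make the introspection idea a bit more concrete, here is a rough
sketch of what such an API report could look like. It is only an
illustration: the ApiReport class and its output format are made up for
this mail, it only covers methods (not fields, constructors or
generics), and it uses nothing beyond the standard java.lang.reflect
calls. Two versions of a library could then be compared by diffing
their reports.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not an existing tool: prints a sorted, diffable
// list of the public and protected methods a class declares.
class ApiReport {

    // One line per visible method: Class#name(paramTypes):returnType
    static List<String> describe(Class<?> cls) {
        List<String> lines = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            int mod = m.getModifiers();
            boolean visible = Modifier.isPublic(mod) || Modifier.isProtected(mod);
            // Skip compiler-generated methods and anything not part of the API.
            if (m.isSynthetic() || !visible) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(cls.getName()).append('#').append(m.getName()).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(',');
                }
                // getName() yields JVM-style names for arrays, which is
                // good enough for a textual comparison.
                sb.append(params[i].getName());
            }
            sb.append("):").append(m.getReturnType().getName());
            lines.add(sb.toString());
        }
        // Sort so that the report does not depend on reflection order.
        Collections.sort(lines);
        return lines;
    }

    // Usage: java ApiReport some.class.Name another.class.Name
    public static void main(String[] args) throws ClassNotFoundException {
        for (String name : args) {
            for (String line : describe(Class.forName(name))) {
                System.out.println(line);
            }
        }
    }
}
```

Running it against the same class from two JAR versions and diffing
the output would show added or removed signatures, though it would not
catch behavior changes behind an unchanged signature.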

> In the case where an incompatibility
> is not an API change, what sort of differences are encountered?  Might
> these be considered bugs?  Is there a case where two applications depend
> on the same API, but would break if used with different versions
> providing that API?

When a project requires a precise version of a JAR, you don't really
know whether that is:
- because it relies on a specific API
- because there is a bugfix they rely on
- because newer features introduce a regression
- because Maven uses =JAR_VERSION types of dependencies by default

Some Java library projects may follow strict versioning rules, like
x.y.z where x.y defines the API and .z denotes bugfixes inside a stable
API. But in the general case they are allowed to (and will) break API
and behavior in any version (no matter how minor), since (1) Maven makes
it easy for developers to depend on a specific JAR version and (2)
runtime JARs are shipped within the binary release tarball.
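
For the few projects that do follow the strict x.y.z rule, the
compatibility check is trivial. A minimal sketch, assuming purely
numeric dotted versions (the VersionCompat class and its method names
are invented for illustration, and real-world version strings are
often messier than this):

```java
// Hypothetical helper: treats the "x.y" prefix of an "x.y.z" version as
// the API line, so two versions are interchangeable if x.y matches.
class VersionCompat {

    // Returns the "x.y" part of a dotted version string.
    static String apiLine(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("expected x.y or x.y.z, got: " + version);
        }
        return parts[0] + "." + parts[1];
    }

    // True if both versions claim the same API under the x.y.z rule.
    static boolean sameApi(String a, String b) {
        return apiLine(a).equals(apiLine(b));
    }
}
```

Under that rule, sameApi("1.2.3", "1.2.9") holds while
sameApi("1.2.3", "1.3.0") does not. The catch is that this only means
something for projects that actually honor the rule.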

>> A solution to workaround both problems would be to avoid targeting our
>> built-from-source repositories for such Java software and pack them with
>> their binary dependencies (or as binary directly)...
> 
>     This has historically been acceptable for some packages in
> multiverse, but without using the built-from-source method, it is hard
> for us to comply with licenses that insist that source code be provided
> on request as we cannot know that the source provided generates the
> binary provided.  This is further complicated in terms of defect
> management: without being able to ourselves collect a source tree we
> know to be capable of generating a given binary object, it becomes very
> difficult for us to address any user-reported problems.  This is
> compounded if there are embedded libraries, as a patch to fix a bug in a
> library may need to be repeated for several packages embedding that
> library.  Even if such patches are generated, but the use of our build
> tools to generate the package has not been tested, we cannot be
> confident of our ability to regenerate a suitable package to distribute
> to users.

There is a middle alternative, where we would still build from source
everything we *distribute*. There are a lot more Java build dependencies
than runtime dependencies: most of the Java build dependencies are just
used to check external method signatures during the bytecode
compilation, much like an API description. So we could still produce the
needed runtime JARs from their source by shipping those build-dependency
JARs as part of the source package. That way we close the source code
provision gap, and we can have clean defect management, for a much
smaller packaging cost.

Taking numbers from a project I'm working on:
- binary distribution: 1 package
- build from source all runtime dependencies: 15 packages
- build from source all runtime/build deps: ~200 packages (counting just
one level deep; a deeper analysis is under way, and conservative
estimates are ~500 packages needed)

So you can see that we are definitely still in the "doable" area with
the middle solution. If someone knew a way of turning off external
method checking during bytecode compilation, or a way of providing a
text-based API description to compile against... that would avoid the
"binary" thing altogether.

-- 
Thierry Carrez
Ubuntu server team


