Re: XZ Migration discussion

From: H. Peter Anvin
Date: Thu Feb 11 2010 - 14:57:46 EST


On 02/11/2010 10:36 AM, J.H. wrote:
>
> Option 1)
>
> Leave gz as the master, and migrate bz2 to xz. This will happen in
> stages, obviously, with bz2 ultimately being phased out.
>
> Migration option 1)
>
> All new content would be provided in .bz2 and .xz with
> an ultimate date set that the .bz2 files would stop
> being generated with new content. This would leave all
> existing content alone and it would not be a migration
> of the current .bz2 files to xz
>
> Migration option 2)
>
> At some point there would be a mass conversion of all
> existing content to include .bz2 and .xz. These would
> be run in parallel for a time period until it was
> determined that .bz2 was no longer needed and it would
> be removed from the servers leaving .gz and .xz
>
> Option 2)
>
> Convert the master data from gz to bz2 and use xz as the new file
> format. This has the downside of causing more tool churn as it means
> the kernel developers will have to eventually convert from gz to bz2,
> which means for a time there will be nag e-mails if you upload gz
> instead of bz2 and such. It would also mean that we (kernel.org) would
> need to be able to support .gz and .bz2 as master data for a time.
>
> Migration options are identical to Option 1 more or less, with either
> just new content getting converted, or all content getting converted.
>
> ========================================================================
>
> I'm personally leaning towards option 1, though I don't really
> have a preference on the migration options, as both obviously offer
> different advantages, and again this e-mail is more to spur on the
> discussion and come to some general consensus across all of the groups
> concerned before moving forward with a more specific plan.
>
> So I'm inviting discussion, questions and comments on this so we know
> which way to ultimately go.

My personal recommendation would be for Option 1, Migration option 2.

I think the idea of having two "best" file formats (Migration Option 1)
in use indefinitely is rather pointless.
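
For anyone wondering what the mass conversion in Migration option 2 might
involve per file, here is a minimal sketch. It assumes Python's standard
bz2 and lzma modules; the function name recompress_bz2_to_xz is made up
for illustration and is not existing kernel.org tooling. It streams the
data so the whole tarball never has to sit in memory, and it leaves the
.bz2 file in place so both formats can run in parallel.

```python
import bz2
import lzma
import shutil
from pathlib import Path


def recompress_bz2_to_xz(src: Path) -> Path:
    """Write an .xz copy next to src; the original .bz2 is left untouched."""
    # Replaces only the final suffix: foo.tar.bz2 becomes foo.tar.xz.
    dst = src.with_suffix(".xz")
    with bz2.open(src, "rb") as fin, lzma.open(dst, "wb") as fout:
        # Copy in chunks rather than reading the whole file at once.
        shutil.copyfileobj(fin, fout)
    return dst
```

A real conversion run would also need to regenerate signatures and
checksums for the new files, which this sketch does not attempt.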

As for the motivations for Option 1:

a) Currently, .gz content is original, whereas .bz2 content is
automatically generated. Flushing the .gz files would mean flushing
original content.

b) .gz is one of the most widely supported compression formats ever
created, plus it is very fast, especially on the decompression side.
.xz is reasonably fast but very memory intensive; .bz2 is moderately
fast and still fairly memory intensive (but not as much as .xz). In
other words, .gz provides an option for small, slow or old systems, or
systems running inferior operating systems. .xz provides the best
compression. .bz2 is in between, but it doesn't serve either purpose as
well as the other two.
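
If you want to check the size side of this tradeoff on your own data, a
rough sketch follows. It assumes Python's standard gzip, bz2 and lzma
modules at their default settings; the helper name compressed_sizes and
the sample input are made up for illustration. It measures only output
size, not speed or memory use, and the numbers will vary with the input.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes for each of the three formats."""
    return {
        "gz": len(gzip.compress(data)),
        "bz2": len(bz2.compress(data)),
        "xz": len(lzma.compress(data)),  # lzma defaults to the .xz container
    }


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; substitute a real tarball's bytes.
    sample = b"static int example(void) { return 0; }\n" * 20000
    for name, size in compressed_sizes(sample).items():
        print(f"{name}: {size} bytes ({size / len(sample):.2%} of original)")
```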

Realistically speaking, kernel.org itself will carry all three formats
for an extended transition time (at least a year); we will probably
discontinue the victim format only when we start running short of disk
space. However, we obviously don't want to push this burden onto all
the mirrors. Therefore, I would really appreciate feedback from mirror
admins as to how you would prefer to see the transition -- either
transition -- happen.

-hpa