hadoop-xz

XZ (LZMA/LZMA2) Codec for Apache Hadoop

License	Apache License, Version 2.0
Group	io.sensesecure
Artifact ID	hadoop-xz
Latest version	1.4
Date	Apr 12, 2015
Type	jar
Description	XZ (LZMA/LZMA2) Codec for Apache Hadoop
Website	https://github.com/yongtang/hadoop-xz
Source control	https://github.com/yongtang/hadoop-xz

Download hadoop-xz

File name	Size
hadoop-xz-1.4.pom
hadoop-xz-1.4.jar	6 KB
hadoop-xz-1.4-sources.jar	3 KB
hadoop-xz-1.4-javadoc.jar	42 KB

How to add the latest version

Apache Maven

<!-- https://jarcasting.com/artifacts/io.sensesecure/hadoop-xz/ -->
<dependency>
    <groupId>io.sensesecure</groupId>
    <artifactId>hadoop-xz</artifactId>
    <version>1.4</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/io.sensesecure/hadoop-xz/
implementation 'io.sensesecure:hadoop-xz:1.4'

Gradle Kotlin

// https://jarcasting.com/artifacts/io.sensesecure/hadoop-xz/
implementation ("io.sensesecure:hadoop-xz:1.4")

Apache Buildr

'io.sensesecure:hadoop-xz:jar:1.4'

Apache Ivy

<dependency org="io.sensesecure" name="hadoop-xz" rev="1.4">
  <artifact name="hadoop-xz" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='io.sensesecure', module='hadoop-xz', version='1.4')
)

Scala SBT

libraryDependencies += "io.sensesecure" % "hadoop-xz" % "1.4"

Leiningen

[io.sensesecure/hadoop-xz "1.4"]

Dependencies

compile (2)

Library ID	Type	Version
org.apache.hadoop : hadoop-common	jar	2.6.0
org.tukaani : xz	jar	1.5

test (1)

Library ID	Type	Version
junit : junit	jar	4.11

Project Modules

This project has no modules.

Hadoop-XZ

XZ (LZMA/LZMA2) Codec for Apache Hadoop

Hadoop-XZ is a project that adds the XZ compression codec to Hadoop. XZ is a lossless data compression file format that uses the LZMA/LZMA2 compression algorithms. It offers an excellent compression ratio at the expense of longer compression time compared with other codecs such as gzip, lzo, or bzip2. Its decompression time, however, is comparable with that of other codecs, and is in fact much better than that of bzip2. This makes XZ an ideal compression format when longer compression time is tolerable. The data can be divided into independently compressed blocks, with the index of the blocks contained in the XZ file, which makes XZ a natively splittable file format.

This library is built on top of the XZ Java library provided by http://tukaani.org (XZ Utils). It supports the SplittableCompressionCodec interface, so an individual XZ file can be split and processed by distributed tasks. Keep in mind that the xz program tends to choose a large block size if none is specified (--block-size=size). That often results in a single block within a huge compressed file, which does not help distributed tasks. Always specify an appropriate block size when compressing.
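The effect of the block size on parallelism can be sketched with a little arithmetic in Java. This is only an illustration, assuming that xz starts a new block every `--block-size` bytes of uncompressed input and that a job can run at most one task per block. The class and method names (`XzBlockPlan`, `blockCount`) are hypothetical and not part of hadoop-xz, and the 64 MiB figure is an arbitrary example value, not a recommendation from the library.

```java
// Hypothetical helper, not part of hadoop-xz: estimates how many independently
// compressed blocks an .xz file contains for a given --block-size.
final class XzBlockPlan {

    /** Number of blocks when the input is cut every blockSize uncompressed bytes. */
    static long blockCount(long uncompressedBytes, long blockSize) {
        if (uncompressedBytes < 0 || blockSize <= 0) {
            throw new IllegalArgumentException(
                "uncompressedBytes must be >= 0 and blockSize must be > 0");
        }
        // Ceiling division without floating point.
        return (uncompressedBytes + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long input = 10L * 1024L * mib; // 10 GiB of uncompressed text

        // A single block covering the whole input: only one task can decompress it.
        System.out.println(blockCount(input, input));     // prints 1

        // 64 MiB blocks: up to 160 tasks could work in parallel.
        System.out.println(blockCount(input, 64L * mib)); // prints 160
    }
}
```

Under these assumptions, the block count is an upper bound on the number of tasks that can read the file concurrently, so a file compressed as one block is effectively unsplittable.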

Installation

Add hadoop-xz to a Maven project's POM with:

<dependency>
  <groupId>io.sensesecure</groupId>
  <artifactId>hadoop-xz</artifactId>
  <version>1.4</version>
</dependency>

Or add it to an SBT build with:

libraryDependencies += "io.sensesecure" % "hadoop-xz" % "1.4"

Usage

Using the XZ codec in Hadoop-related programs is fairly simple. For example, the following Apache Spark program counts the lines of an XZ-compressed text file:

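import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().setAppName("Simple Application")
val sparkContext = new SparkContext(sparkConf)
// Register the XZ codec so Hadoop input formats can decompress .xz files
val configuration = new Configuration()
configuration.set("io.compression.codecs", "io.sensesecure.hadoop.xz.XZCodec")
// Read the compressed text file through the new Hadoop API
val rdd = sparkContext.newAPIHadoopFile("sample.text.xz",
            classOf[TextInputFormat], classOf[LongWritable], classOf[Text],
            configuration)

// Count the lines of the decompressed text
println(rdd.count())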

Contact

If you have trouble with the library or have questions, check out the GitHub repository at http://github.com/yongtang/hadoop-xz.

Library versions

Version	Date
1.4	Apr 12, 2015
1.3	Apr 11, 2015
1.2	Apr 10, 2015
1.1	Apr 3, 2015
1.0	Mar 30, 2015
0.9	Mar 28, 2015
