com.groupon.dse:baryon

A library for building Spark streaming applications that consume data from Kafka.

License	The BSD 3-Clause License
Group	com.groupon.dse
Artifact ID	baryon
Latest version	1.0
Date	Jul 5, 2016
Type	jar
Description	A library for building Spark streaming applications that consume data from Kafka.
Website	https://github.com/groupon/baryon
Version control	https://github.com/groupon/baryon

Download baryon

File name	Size
baryon-1.0.pom
baryon-1.0.jar	248 KB
baryon-1.0-tests.jar	417 KB
baryon-1.0-test-sources.jar	55 KB
baryon-1.0-sources.jar	77 KB
baryon-1.0-javadoc.jar	610 KB
Overview

How to add the latest version

Apache Maven

<!-- https://jarcasting.com/artifacts/com.groupon.dse/baryon/ -->
<dependency>
    <groupId>com.groupon.dse</groupId>
    <artifactId>baryon</artifactId>
    <version>1.0</version>
</dependency>

Gradle Groovy

// https://jarcasting.com/artifacts/com.groupon.dse/baryon/
implementation 'com.groupon.dse:baryon:1.0'

Gradle Kotlin

// https://jarcasting.com/artifacts/com.groupon.dse/baryon/
implementation ("com.groupon.dse:baryon:1.0")

Apache Buildr

'com.groupon.dse:baryon:jar:1.0'

Apache Ivy

<dependency org="com.groupon.dse" name="baryon" rev="1.0">
  <artifact name="baryon" type="jar" />
</dependency>

Groovy Grape

@Grapes(
@Grab(group='com.groupon.dse', module='baryon', version='1.0')
)

Scala SBT

libraryDependencies += "com.groupon.dse" % "baryon" % "1.0"

Leiningen

[com.groupon.dse/baryon "1.0"]

Dependencies

compile (14)

Library ID	Type	Version
org.apache.kafka : kafka_2.10	jar	0.8.1.1
com.101tec : zkclient	jar	0.7
com.groupon.dse : spark-metrics	jar	1.0
org.apache.zookeeper : zookeeper	jar	3.4.6
org.json4s : json4s-core_2.10	jar	3.2.10
org.json4s : json4s-jackson_2.10	jar	3.2.10
org.scala-lang : scala-library	jar	2.10.4
org.slf4j : slf4j-api	jar	1.7.10
com.typesafe.play : play-ws_2.10	jar	2.3.10
com.typesafe.play : play-json_2.10	jar	2.3.10
com.fasterxml.jackson.core : jackson-databind	jar	2.4.4
com.fasterxml.jackson.module : jackson-module-scala_2.10	jar	2.4.4
com.fasterxml.jackson.core : jackson-core	jar	2.4.4
com.ning : async-http-client	jar	1.9.21

provided (4)

Library ID	Type	Version
org.apache.spark : spark-core_2.10	jar	1.5.2
org.apache.spark : spark-streaming_2.10	jar	1.5.2
log4j : log4j	jar	1.2.17
org.apache.hadoop : hadoop-common	jar	2.2.0

test (2)

Library ID	Type	Version
org.mockito : mockito-all	jar	1.10.8
org.scalatest : scalatest_2.10	jar	2.2.4

Project Modules

This project has no modules.

Baryon

Baryon is a library for building Spark streaming applications that consume data from Kafka.

Baryon abstracts away all the bookkeeping involved in reliably connecting to a Kafka cluster and fetching data from it, so that users only need to focus on the logic to process this data.

For a detailed guide on getting started with Baryon, take a look at the wiki.

Why Baryon?

Spark itself also has libraries for interacting with Kafka, as documented in its Kafka integration guide. These libraries are well developed, but they have certain limitations that Baryon intends to address:

Code-independent checkpointing

Baryon's Kafka state management system allows Kafka consumption state to be stored across multiple runs of an application, even when there are code changes. Spark's checkpointing system does not support maintaining state across changes in code, so users of Spark's Kafka libraries must implement the offset management logic themselves.

Improved error handling

Baryon handles errors related to Kafka much more thoroughly than Spark's Kafka libraries, so users don't need to worry about handling Kafka problems in their code.
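
To make the idea behind code-independent checkpointing concrete, here is a minimal sketch. It is not Baryon's implementation, and the file-based store and the helper names `save_offsets` and `load_offsets` are hypothetical. It only illustrates the principle: if consumption state is kept as plain data keyed by topic and partition, rather than as serialized program state, a rewritten application can still resume from it.

```python
import json
import os
import tempfile

def save_offsets(path, offsets):
    """Persist {(topic, partition): offset} as plain JSON (hypothetical helper)."""
    # Plain records, not serialized program state, so they stay readable after code changes.
    records = [
        {"topic": topic, "partition": partition, "offset": offset}
        for (topic, partition), offset in sorted(offsets.items())
    ]
    # Write to a temporary file and rename it, so a crash never leaves a half-written file.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(records, f)
    os.replace(tmp_path, path)

def load_offsets(path):
    """Return the saved offsets, or an empty dict on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return {(r["topic"], r["partition"]): r["offset"] for r in json.load(f)}

# Usage: one run saves its position, a later run resumes from it.
state_file = os.path.join(tempfile.mkdtemp(), "offsets.json")
save_offsets(state_file, {("clicks", 0): 120, ("clicks", 1): 87})
resumed = load_offsets(state_file)
```

A real deployment would keep this state in a shared, durable store rather than a local file; the local file just keeps the sketch self-contained.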

In addition to the above, there are a handful of additional features unique to Baryon:

Multiple consumption modes

Baryon has two modes of consumption, the blocking mode and the non-blocking mode, which can be switched without any code changes. The blocking mode more or less corresponds to the consumption behavior of Spark's "direct" approach, while the non-blocking mode has consumption behavior similar to Spark's receiver-based approach.

Dynamically configured topics

Baryon supports changes to the set of Kafka topics that are consumed while the application is running. Alongside this, configurations can be set at a per-topic level, which makes it easier to build a single application to process multiple, heterogeneous data streams.

Aggregated metrics

Baryon uses the spark-metrics library to collect and aggregate useful metrics across the driver and executors. These include offset lag, throughput, and error rates, as well as augmented versions of existing metrics that Spark provides. The metrics are integrated with Spark's metrics system, so they are compatible with the reporting system that comes with Spark.
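
As an illustration of one of these metrics, offset lag is commonly understood as the gap between the newest offset available in a partition and the offset the consumer has already processed. The sketch below sums per-partition lag into per-topic totals. The function name `topic_lag` and the input shapes are assumptions made for this example; they are not part of the spark-metrics or Baryon API.

```python
from collections import defaultdict

def topic_lag(latest, consumed):
    """Sum per-partition lag (latest - consumed) into a per-topic total."""
    lag = defaultdict(int)
    for (topic, partition), newest in latest.items():
        # A partition with no recorded progress is counted from offset 0.
        done = consumed.get((topic, partition), 0)
        # Clamp at zero in case the "latest" reading is older than the consumed offset.
        lag[topic] += max(newest - done, 0)
    return dict(lag)

# Usage: two topics, one of which has not been consumed at all yet.
latest = {("clicks", 0): 150, ("clicks", 1): 90, ("views", 0): 40}
consumed = {("clicks", 0): 120, ("clicks", 1): 87}
lags = topic_lag(latest, consumed)
```

Aggregating per topic rather than per partition keeps the number of reported series small, at the cost of hiding a single slow partition.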

Quick Start

Add Baryon as a dependency:

<dependency>
    <groupId>com.groupon.dse</groupId>
    <artifactId>baryon</artifactId>
    <version>1.0</version>
</dependency>

If you want to add custom metrics that are integrated with Spark, use the spark-metrics library, which Baryon also uses:

<dependency>
    <groupId>com.groupon.dse</groupId>
    <artifactId>spark-metrics</artifactId>
    <version>1.0</version>
</dependency>

Take a look at the examples to see how to write the driver and a ReceiverPlugin.

Groupon

Library versions

Version
1.0 Jul 5, 2016
