com.cldellow:manu-format

Utilities to manage timeseries data.

License: (not listed)
Categories: ORM Data
Group: com.cldellow
Artifact ID: manu-format
Latest version: 0.2.2
Date: (not listed)
Type: jar
Description: Utilities to manage timeseries data.
Website: https://github.com/cldellow/manu

Download manu-format

How to add the latest version

Maven:

<!-- https://jarcasting.com/artifacts/com.cldellow/manu-format/ -->
<dependency>
    <groupId>com.cldellow</groupId>
    <artifactId>manu-format</artifactId>
    <version>0.2.2</version>
</dependency>

Gradle (Groovy):

// https://jarcasting.com/artifacts/com.cldellow/manu-format/
implementation 'com.cldellow:manu-format:0.2.2'

Gradle (Kotlin):

// https://jarcasting.com/artifacts/com.cldellow/manu-format/
implementation("com.cldellow:manu-format:0.2.2")

Buildr:

'com.cldellow:manu-format:jar:0.2.2'

Ivy:

<dependency org="com.cldellow" name="manu-format" rev="0.2.2">
  <artifact name="manu-format" type="jar" />
</dependency>

Grape:

@Grapes(
    @Grab(group='com.cldellow', module='manu-format', version='0.2.2')
)

SBT:

libraryDependencies += "com.cldellow" % "manu-format" % "0.2.2"

Leiningen:

[com.cldellow/manu-format "0.2.2"]

Dependencies

compile (4)

Library identifier                           Type  Version
com.cldellow : manu-common                   jar   0.2.2
joda-time : joda-time                        jar   2.9.9
org.xerial : sqlite-jdbc                     jar   3.21.0.1
me.lemire.integercompression : JavaFastPFOR  jar   0.1.11

test (4)

Library identifier                           Type  Version
junit : junit                                jar   4.12
com.pholser : junit-quickcheck-core          jar   0.7
com.pholser : junit-quickcheck-generators    jar   0.7
org.hamcrest : hamcrest-library              jar   1.3

Project modules

This project has no modules.

Manu: "Mostly archived, not updated"


A time series storage format for integers and floats, using efficient delta encodings from FastPFOR.
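
To make the idea of delta encoding concrete, here is a minimal sketch in plain Java. It is not the FastPFOR codec and not manu-format's on-disk layout; the class name `DeltaSketch` and the sample pageview numbers are made up for illustration. It only shows why correlated series produce small numbers that a bit-packing codec can then store compactly.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of manu-format: plain delta encoding of an int series. */
final class DeltaSketch {
    /** Replaces each value with its difference from the previous one (the first is taken relative to 0). */
    static int[] encode(int[] values) {
        int[] deltas = new int[values.length];
        int previous = 0;
        for (int i = 0; i < values.length; i++) {
            deltas[i] = values[i] - previous;
            previous = values[i];
        }
        return deltas;
    }

    /** Rebuilds the original series by taking a running sum of the deltas. */
    static int[] decode(int[] deltas) {
        int[] values = new int[deltas.length];
        int running = 0;
        for (int i = 0; i < deltas.length; i++) {
            running += deltas[i];
            values[i] = running;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] pageviews = {1000, 1004, 1001, 1010, 1012}; // invented sample data
        int[] deltas = encode(pageviews);
        System.out.println(Arrays.toString(deltas));         // [1000, 4, -3, 9, 2]
        System.out.println(Arrays.toString(decode(deltas))); // the original series again
    }
}
```

After the first element, every delta in this sample fits in a handful of bits, whereas the raw values each need more than ten.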

Examples: pageviews by article in Wikipedia, stock open/close/high/low prices, weather temperatures.

Components

  • manu-format, a library for maintaining the data on disk
  • manu-cli, a command-line tool for ingesting data into the format
  • manu-serve, a web server to expose the data over REST

Design criteria

Priorities

  • Cheap
    • I'm doing this to drive a hobby project; my dream would be to host a variety of datasets for $10/month.
    • A Fermi estimate suggests Wikipedia pageviews amount to roughly 100B datapoints over the last 10 years. At that volume, storage costs will dominate.
  • Doesn’t need to be always-on
    • This follows from being cheap: loading only subsets of the data, or running on spot instances, will be a useful way to cut costs.
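
The storage side of that Fermi estimate can be sketched as simple arithmetic. Only the 100B figure comes from the text above; the bytes-per-datapoint values are assumptions chosen to show the range, not measurements of manu-format, and `StorageEstimate` is a hypothetical name.

```java
/** Back-of-the-envelope sizing; the bytes-per-datapoint figures are assumptions. */
final class StorageEstimate {
    static final long DATAPOINTS = 100_000_000_000L; // 100B, the estimate quoted above

    /** Size in gigabytes (10^9 bytes) for a given average encoded size per datapoint. */
    static double gigabytes(double bytesPerDatapoint) {
        return DATAPOINTS * bytesPerDatapoint / 1e9;
    }

    public static void main(String[] args) {
        System.out.printf("raw 32-bit ints: %.0f GB%n", gigabytes(4.0)); // 400 GB
        System.out.printf("1 byte/point:    %.0f GB%n", gigabytes(1.0)); // 100 GB
        System.out.printf("4 bits/point:    %.0f GB%n", gigabytes(0.5)); // 50 GB
    }
}
```

Under these assumptions, the gap between 400 GB uncompressed and 50 GB at 4 bits per datapoint is what makes the encoding the main lever on cost.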

Non-priorities

  • Concurrent / fast writes
    • These can happen offline.
  • Fast reads
    • The Pareto principle will likely apply to queries: 1% of keys will get 99% of reads. We can use Varnish or a similar cache in front of the application.
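
Varnish is an HTTP cache that sits in front of the server, so no code is needed for it here. Purely to illustrate why a small cache goes a long way when reads are skewed, the following is an in-process least-recently-used cache built on the standard `LinkedHashMap`. The class name `LruCache` is an assumption for this sketch, not something manu-serve provides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-process LRU cache; a stand-in to illustrate caching hot keys. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration order is least to most recently used
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, int[]> cache = new LruCache<>(2);
        cache.put("Main_Page", new int[]{1, 2, 3});
        cache.put("Rare_Article", new int[]{4});
        cache.get("Main_Page");                 // touch the hot key
        cache.put("Another_Article", new int[]{5});
        System.out.println(cache.keySet());     // [Main_Page, Another_Article]
    }
}
```

Frequently read keys keep getting touched and stay in the cache, while rarely read keys fall out first.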

Assumptions

  • Dense datasets
    • Keys: if we see a key once, we expect to see it again.
    • Values: if key X has a datapoint at T1, we expect most other keys will as well.
  • Correlated values
    • The value for key X at T1 is likely related to its value at T2.
  • Some datasets can be lossy
    • Wikipedia pageviews, e.g., are likely insensitive to precision so long as the trend is generally correct.
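
One way a lossy dataset could trade precision for regularity is to round counts to a few significant digits before storing them. The sketch below is an assumption about how that might look, not manu-format's actual behaviour; `LossySketch` and `roundToSignificant` are hypothetical names. Whether the rounding shrinks the stored size depends on the encoding applied afterwards.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of manu-format: lossy rounding of non-negative counts. */
final class LossySketch {
    /**
     * Rounds value to at most `digits` significant decimal digits (halves round up).
     * Assumes value is well below Long.MAX_VALUE, which holds for pageview-style counts.
     */
    static long roundToSignificant(long value, int digits) {
        if (value < 0 || digits < 1 || digits > 18) {
            throw new IllegalArgumentException("expected value >= 0 and 1 <= digits <= 18");
        }
        long limit = 1;
        for (int i = 0; i < digits; i++) {
            limit *= 10; // limit = 10^digits
        }
        long scale = 1;
        while (value / scale >= limit) {
            scale *= 10; // grow until value / scale has at most `digits` digits
        }
        return (value + scale / 2) / scale * scale;
    }

    public static void main(String[] args) {
        long[] daily = {1_234_567, 1_241_002, 1_198_450}; // invented sample data
        long[] rounded = Arrays.stream(daily).map(v -> roundToSignificant(v, 3)).toArray();
        System.out.println(Arrays.toString(rounded));
    }
}
```

The rounded series still rises and falls with the original, which is all a trend chart needs.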

Obligatory

[Comic: Manu]

Credit: Our Greatest Asset, Saturday Morning Breakfast Cereal

Library versions

Version
0.2.2
0.2.1
0.2.0
0.1.0