com.github.monnetproject.bliss.sparsemath

com.github.monnetproject.bliss.sparsemath from the Monnet Project's bliss project.

Categories: Network
Group: com.github.monnetproject
Artifact ID: bliss.sparsemath
Latest version: 1.18.4
Type: jar
Description: com.github.monnetproject.bliss.sparsemath from the Monnet Project's bliss project.
Website: https://github.com/monnetproject/bliss
Source control: http://github.com/monnetproject/bliss/tree/master


How to add the latest version

Maven:

<!-- https://jarcasting.com/artifacts/com.github.monnetproject/bliss.sparsemath/ -->
<dependency>
    <groupId>com.github.monnetproject</groupId>
    <artifactId>bliss.sparsemath</artifactId>
    <version>1.18.4</version>
</dependency>

Gradle (Groovy):

// https://jarcasting.com/artifacts/com.github.monnetproject/bliss.sparsemath/
implementation 'com.github.monnetproject:bliss.sparsemath:1.18.4'

Gradle (Kotlin):

// https://jarcasting.com/artifacts/com.github.monnetproject/bliss.sparsemath/
implementation ("com.github.monnetproject:bliss.sparsemath:1.18.4")

Buildr:

'com.github.monnetproject:bliss.sparsemath:jar:1.18.4'

Ivy:

<dependency org="com.github.monnetproject" name="bliss.sparsemath" rev="1.18.4">
  <artifact name="bliss.sparsemath" type="jar" />
</dependency>

Grape:

@Grapes(
@Grab(group='com.github.monnetproject', module='bliss.sparsemath', version='1.18.4')
)

sbt:

libraryDependencies += "com.github.monnetproject" % "bliss.sparsemath" % "1.18.4"

Leiningen:

[com.github.monnetproject/bliss.sparsemath "1.18.4"]

Dependencies

compile (2)

Library identifier                       Type  Version
junit : junit                            jar   4.10
it.unimi.dsi : fastutil                  jar   6.4.4

test (3)

Library identifier                       Type  Version
org.apache.commons : commons-math        jar   2.2
org.apache.commons : commons-compress    jar   1.4.1
colt : colt                              jar   1.2.0

Project Modules

This project has no modules.

Bilingual Similarity Suite (BLISS)

This package provides a set of tools for topic modelling, in particular in the cross-lingual case, and for its application to machine translation. The following algorithms are implemented:

  • Latent Dirichlet Allocation
  • Cross-Lingual Explicit Semantic Analysis

The following are planned:

  • Kernel Explicit Semantic Analysis
  • Latent Semantic Analysis
  • Coupled Probabilistic Latent Semantic Analysis

Building

Translation Topics uses Maven to build, and can be installed with the following command:

mvn install

Building a corpus

Existing scripts download the corpus data from Wikipedia. For example, for English to German, run:

./build-wikipedia-article.sh en de

Mate-finding trials

Mate-finding trials can be run with the following command, from the experiments sub-folder:

mvn exec:java -Dexec.mainClass=eu.monnetproject.bliss.experiments.MateFindingTrial \
       -Dexec.args="trainFile metricFactory W testFile"

Where W is the number of distinct tokens in the corpus and metricFactory is one of:

  • eu.monnetproject.bliss.clesa.CLESA: For CL-ESA
  • (More to come)
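The source does not spell out how a mate-finding trial is scored. The following is a minimal sketch, assuming the usual meaning of mate retrieval in the CL-ESA literature: document i in one language has exactly one translation ("mate"), document i in the other language, and the metric is judged by how highly it ranks that mate. The class name MateFindingSketch, both method names, and the similarity-matrix input are hypothetical and are not part of BLISS.

```java
/** Illustrative mate-finding scores; an assumption-based sketch, not the BLISS code. */
final class MateFindingSketch {

    /**
     * sim[i][j] is the similarity of source document i to target document j.
     * Assumes the mate of source document i is target document i.
     * Returns the fraction of source documents whose mate is ranked first.
     */
    static double precisionAtOne(double[][] sim) {
        int hits = 0;
        for (int i = 0; i < sim.length; i++) {
            int best = 0;
            for (int j = 1; j < sim[i].length; j++) {
                if (sim[i][j] > sim[i][best]) {
                    best = j;
                }
            }
            if (best == i) {
                hits++;
            }
        }
        return sim.length == 0 ? 0.0 : (double) hits / sim.length;
    }

    /** Mean of 1 / rank, where rank is the position of the mate among all targets. */
    static double meanReciprocalRank(double[][] sim) {
        double sum = 0.0;
        for (int i = 0; i < sim.length; i++) {
            int rank = 1;
            for (int j = 0; j < sim[i].length; j++) {
                // Every target scoring strictly above the mate pushes the mate down one place.
                if (j != i && sim[i][j] > sim[i][i]) {
                    rank++;
                }
            }
            sum += 1.0 / rank;
        }
        return sim.length == 0 ? 0.0 : sum / sim.length;
    }
}
```

With this layout, a perfect metric puts the largest value of every row on the diagonal and both scores equal 1.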

Language model adaptation

Language models can be trained with the following command (from the betalm folder):

mvn exec:java -Dexec.mainClass="betalm.compile" -Dexec.args="corpus.gz N wordMap W lmFile"

Where N is the order of the n-gram model and W is the number of distinct tokens. To adapt the model to a specific document, pass the following flags inside -Dexec.args:

    -Dexec.args="-b METHOD -f file[.gz] ..." 

Where METHOD is one of:

  • COS_SIM
  • NORMAL_COS_SIM
  • KLD
  • JACCARD
  • DICE
  • ROGERS_TANIMOTO
  • DF_JACCARD
  • DF_DICE
  • WxWCLESA
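Several of the method names above correspond to standard similarity measures. As a rough illustration only, here are the textbook definitions of cosine, Jaccard and Dice similarity over sparse vectors stored as index-to-value maps. The class SparseSimilaritySketch and its methods are hypothetical; how BLISS defines COS_SIM, JACCARD and DICE internally, and what the DF_ and NORMAL_ variants do, is not stated in the source and is not sketched here.

```java
import java.util.HashMap;
import java.util.Map;

/** Textbook sparse-vector similarities; an illustration, not the BLISS implementation. */
final class SparseSimilaritySketch {

    /** Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector has zero norm. */
    static double cosine(Map<Integer, Double> a, Map<Integer, Double> b) {
        // Iterate over the smaller map so the cost depends on the sparser vector.
        Map<Integer, Double> small = a.size() <= b.size() ? a : b;
        Map<Integer, Double> large = small == a ? b : a;
        double dot = 0.0;
        for (Map.Entry<Integer, Double> e : small.entrySet()) {
            Double other = large.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double norms = norm(a) * norm(b);
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    /** Jaccard on the sets of stored indices: shared / (sizeA + sizeB - shared). */
    static double jaccard(Map<Integer, Double> a, Map<Integer, Double> b) {
        int shared = sharedKeys(a, b);
        int union = a.size() + b.size() - shared;
        return union == 0 ? 0.0 : (double) shared / union;
    }

    /** Dice on the sets of stored indices: 2 * shared / (sizeA + sizeB). */
    static double dice(Map<Integer, Double> a, Map<Integer, Double> b) {
        int total = a.size() + b.size();
        return total == 0 ? 0.0 : 2.0 * sharedKeys(a, b) / total;
    }

    private static double norm(Map<Integer, Double> v) {
        double sumOfSquares = 0.0;
        for (double x : v.values()) {
            sumOfSquares += x * x;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Assumes the maps store only non-zero entries, so a key means "term present".
    private static int sharedKeys(Map<Integer, Double> a, Map<Integer, Double> b) {
        int shared = 0;
        for (Integer key : a.keySet()) {
            if (b.containsKey(key)) {
                shared++;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<>();
        a.put(0, 1.0);
        a.put(2, 1.0);
        Map<Integer, Double> b = new HashMap<>();
        b.put(2, 1.0);
        b.put(5, 1.0);
        System.out.println("cosine  = " + cosine(a, b));
        System.out.println("jaccard = " + jaccard(a, b));
        System.out.println("dice    = " + dice(a, b));
    }
}
```

Plain maps keep the sketch dependency-free; a library that lists fastutil as a dependency would more plausibly use primitive-keyed maps to avoid boxing, but that is an assumption about the design rather than something the source confirms.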
com.github.monnetproject

Monnet Project

Library versions

Version
1.18.4