GloVe Word Embedding and Document Embedding in Java

Java implementation of GloVe word embedding and document embedding

License: MIT
Categories: Java Programming Languages
Group: com.github.chen0040
Artifact ID: java-text-embedding
Latest version: 1.0.1
Type: jar
Description: Java implementation of GloVe word embedding and document embedding
Project URL: https://github.com/chen0040/java-text-embedding
Source control: https://github.com/chen0040/java-text-embedding

Download java-text-embedding

How to add the latest version

Maven:

<!-- https://jarcasting.com/artifacts/com.github.chen0040/java-text-embedding/ -->
<dependency>
    <groupId>com.github.chen0040</groupId>
    <artifactId>java-text-embedding</artifactId>
    <version>1.0.1</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-text-embedding/
implementation 'com.github.chen0040:java-text-embedding:1.0.1'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-text-embedding/
implementation ("com.github.chen0040:java-text-embedding:1.0.1")

Buildr:

'com.github.chen0040:java-text-embedding:jar:1.0.1'

Ivy:

<dependency org="com.github.chen0040" name="java-text-embedding" rev="1.0.1">
  <artifact name="java-text-embedding" type="jar" />
</dependency>

Grape:

@Grapes(
    @Grab(group='com.github.chen0040', module='java-text-embedding', version='1.0.1')
)

sbt:

libraryDependencies += "com.github.chen0040" % "java-text-embedding" % "1.0.1"

Leiningen:

[com.github.chen0040/java-text-embedding "1.0.1"]

Dependencies

compile (6)

Library identifier                       Type  Version
com.google.guava : guava                 jar   20.0
com.alibaba : fastjson                   jar   1.2.33
org.slf4j : slf4j-api                    jar   1.7.20
org.slf4j : slf4j-simple                 jar   1.7.20
org.apache.httpcomponents : httpclient   jar   4.5.2
net.lingala.zip4j : zip4j                jar   1.3.2

provided (1)

Library identifier                       Type  Version
org.projectlombok : lombok               jar   1.16.10

Project modules

This project has no modules.

java-word-embedding

Word embedding in Java

This project provides GloVe word embeddings that developers can use directly in their own projects.

Install

Add the following dependency to your POM file:

<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-text-embedding</artifactId>
  <version>1.0.1</version>
</dependency>

Usage

The sample code below shows how to use GloVeModel to load GloVe word embeddings of a given dimension (e.g., 50, 100, 200, 300) and encode words and documents:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.github.chen0040.embeddings.GloVeModel;

public class GloVeModelDemo {

    private static final Logger logger = LoggerFactory.getLogger(GloVeModelDemo.class);

    public static void main(String[] args) {
        // Load the 100-dimensional GloVe vectors.
        GloVeModel model = new GloVeModel();
        model.load100();

        // Log the model size and the dimension of each word vector.
        logger.info("word2em size: {}", model.size());
        logger.info("word2em dimension for individual word: {}", model.getWordVecDimension());

        // Encode individual words as embedding vectors.
        logger.info("father: {}", model.encodeWord("father"));
        logger.info("mother: {}", model.encodeWord("mother"));
        logger.info("man: {}", model.encodeWord("man"));
        logger.info("woman: {}", model.encodeWord("woman"));
        logger.info("boy: {}", model.encodeWord("boy"));
        logger.info("girl: {}", model.encodeWord("girl"));

        // Compare two words by the distance between their embeddings.
        logger.info("distance between boy and girl: {}", model.distance("boy", "girl"));

        // Encode a whole document as a single embedding vector.
        String doc = "The Zen of Python. Beautiful is better than ugly. Explicit is better than implicit. Simple is better than complex. Complex is better than complicated. Flat is better than nested. Sparse is better than dense. Readability counts. Special cases aren't special enough to break the rules.";

        logger.info("doc: {}", model.encodeDocument(doc));
    }
}
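
The example above does not say which metric model.distance uses or how encodeDocument combines word vectors. As a generic illustration only, the sketch below shows two common operations on embedding vectors: cosine similarity between two vectors, and a document vector formed as the mean of its word vectors. The class EmbeddingMath and its methods are hypothetical helpers, not part of java-text-embedding, and the sketch assumes that embeddings are plain float[] arrays of equal length.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; NOT part of java-text-embedding.
// Assumes each embedding is a float[] and all vectors share one dimension.
final class EmbeddingMath {

    private EmbeddingMath() {}

    // Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // convention chosen here: a zero vector is similar to nothing
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Element-wise mean of word vectors, one simple way to build a document vector.
    static float[] average(List<float[]> vectors) {
        if (vectors.isEmpty()) {
            throw new IllegalArgumentException("at least one vector is required");
        }
        int dim = vectors.get(0).length;
        float[] mean = new float[dim];
        for (float[] v : vectors) {
            if (v.length != dim) {
                throw new IllegalArgumentException("vectors must have the same dimension");
            }
            for (int i = 0; i < dim; i++) {
                mean[i] += v[i];
            }
        }
        for (int i = 0; i < dim; i++) {
            mean[i] /= vectors.size();
        }
        return mean;
    }

    public static void main(String[] args) {
        // Toy values, not real GloVe vectors.
        float[] first = {1.0f, 2.0f, 0.0f};
        float[] second = {2.0f, 4.0f, 0.0f};
        System.out.println("cosine similarity: " + cosineSimilarity(first, second));
        System.out.println("mean vector: " + Arrays.toString(average(Arrays.asList(first, second))));
    }
}
```

If encodeWord does return float[] in your version of the library, its results could be passed to these helpers directly; check the actual return type first.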

Library versions

Version
1.0.1