General Sequential Pattern Mining

Java implementation of GSP for frequent pattern mining

License: MIT
Categories: Java, Programming languages
Group: com.github.chen0040
Artifact ID: java-sequential-pattern-mining
Latest version: 1.0.1
Type: jar
Website: https://github.com/chen0040/java-sequential-pattern-mining
Version control: https://github.com/chen0040/java-sequential-pattern-mining

Download java-sequential-pattern-mining

How to add the latest version

Maven:

<!-- https://jarcasting.com/artifacts/com.github.chen0040/java-sequential-pattern-mining/ -->
<dependency>
    <groupId>com.github.chen0040</groupId>
    <artifactId>java-sequential-pattern-mining</artifactId>
    <version>1.0.1</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-sequential-pattern-mining/
implementation 'com.github.chen0040:java-sequential-pattern-mining:1.0.1'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/com.github.chen0040/java-sequential-pattern-mining/
implementation("com.github.chen0040:java-sequential-pattern-mining:1.0.1")

Buildr:

'com.github.chen0040:java-sequential-pattern-mining:jar:1.0.1'

Ivy:

<dependency org="com.github.chen0040" name="java-sequential-pattern-mining" rev="1.0.1">
  <artifact name="java-sequential-pattern-mining" type="jar" />
</dependency>

Grape:

@Grapes(
    @Grab(group='com.github.chen0040', module='java-sequential-pattern-mining', version='1.0.1')
)

SBT:

libraryDependencies += "com.github.chen0040" % "java-sequential-pattern-mining" % "1.0.1"

Leiningen:

[com.github.chen0040/java-sequential-pattern-mining "1.0.1"]

Dependencies

compile (3)

| Library identifier | Type | Version |
|---|---|---|
| org.slf4j : slf4j-api | jar | 1.7.20 |
| org.slf4j : slf4j-log4j12 | jar | 1.7.20 |
| org.apache.commons : commons-math3 | jar | 3.2 |

provided (1)

| Library identifier | Type | Version |
|---|---|---|
| org.projectlombok : lombok | jar | 1.16.6 |

test (10)

| Library identifier | Type | Version |
|---|---|---|
| org.testng : testng | jar | 6.9.10 |
| org.hamcrest : hamcrest-core | jar | 1.3 |
| org.hamcrest : hamcrest-library | jar | 1.3 |
| org.assertj : assertj-core | jar | 3.5.2 |
| org.powermock : powermock-core | jar | 1.6.5 |
| org.powermock : powermock-api-mockito | jar | 1.6.5 |
| org.powermock : powermock-module-junit4 | jar | 1.6.5 |
| org.powermock : powermock-module-testng | jar | 1.6.5 |
| org.mockito : mockito-core | jar | 2.0.2-beta |
| org.mockito : mockito-all | jar | 2.0.2-beta |

Project Modules

This project has no modules.

java-sequential-pattern-mining

This package provides a Java implementation of the GSP sequential pattern mining algorithm.


Overview of GSP

The implementation of the algorithm is based on Srikant & Agrawal, 1996.

The algorithm makes multiple passes over the data. The first pass determines the support of each item, that is, the number of data-sequences that include the item. At the end of the first pass, the algorithm knows which items are frequent, that is, have at least the minimum support. Each such item yields a 1-element frequent sequence consisting of that item.
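The first pass can be illustrated with a small standalone sketch. It is not the library's code: the class FirstPassSupport and its methods are hypothetical names chosen for illustration, and a data-sequence is assumed to be modelled as a list of item sets. It counts, for each item, how many data-sequences contain it and keeps the items that reach the minimum support.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper class for illustration only; not part of the library's API.
final class FirstPassSupport {

    // Builds a data-sequence from comma-separated itemsets, e.g. seq("1", "1,2,3").
    public static List<Set<String>> seq(String... elements) {
        List<Set<String>> sequence = new ArrayList<>();
        for (String element : elements) {
            sequence.add(new TreeSet<>(Arrays.asList(element.split(","))));
        }
        return sequence;
    }

    // Support of an item = number of data-sequences that contain it at least once.
    public static Map<String, Integer> countItemSupport(List<List<Set<String>>> database) {
        Map<String, Integer> support = new TreeMap<>();
        for (List<Set<String>> sequence : database) {
            // Collect distinct items first so an item counts once per data-sequence.
            Set<String> seen = new HashSet<>();
            for (Set<String> element : sequence) {
                seen.addAll(element);
            }
            for (String item : seen) {
                support.merge(item, 1, Integer::sum);
            }
        }
        return support;
    }

    // Items whose support reaches minSupport; each one is a 1-element frequent sequence.
    public static List<String> frequentItems(Map<String, Integer> support, int minSupport) {
        List<String> frequent = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : support.entrySet()) {
            if (entry.getValue() >= minSupport) {
                frequent.add(entry.getKey());
            }
        }
        return frequent;
    }

    // Small example database of four data-sequences.
    public static List<List<Set<String>>> exampleDatabase() {
        return Arrays.asList(
                seq("1", "1,2,3", "1,3", "4", "3,6"),
                seq("1,4", "3", "2,3", "1,5"),
                seq("5,6", "1,2", "4,6", "3", "2"),
                seq("5", "7", "1,6", "3", "2", "3"));
    }

    public static void main(String[] args) {
        Map<String, Integer> support = countItemSupport(exampleDatabase());
        System.out.println("support: " + support);
        System.out.println("frequent items: " + frequentItems(support, 2));
    }
}
```

With a minimum support of 2, item 7 occurs in only one data-sequence and is dropped, while items 1 to 6 survive as seeds for the next pass.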

Each subsequent pass starts with a seed set: the frequent sequences found in the previous pass. The seed set is used to generate new potentially frequent sequences, called candidate sequences. Each candidate sequence has one more item than a seed sequence; so all the candidate sequences in a pass will have the same number of items. The support for these candidate sequences is found during the pass over the data. At the end of the pass, the algorithm determines which of the candidate sequences are actually frequent. These frequent candidates become the seed for the next pass.
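The counting step of a later pass can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: CandidateSupport and its methods are hypothetical names, candidate generation from the seed set is left out (the candidates are simply given), and the time constraints and sliding window of the original GSP paper are ignored. A candidate is treated as contained in a data-sequence when each of its elements is a subset of a distinct element of the data-sequence, in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class for illustration only; not part of the library's API.
final class CandidateSupport {

    // Builds a sequence from comma-separated itemsets, e.g. seq("1,2", "3").
    public static List<Set<String>> seq(String... elements) {
        List<Set<String>> sequence = new ArrayList<>();
        for (String element : elements) {
            sequence.add(new TreeSet<>(Arrays.asList(element.split(","))));
        }
        return sequence;
    }

    // Greedy check: match each candidate element to the earliest later data element containing it.
    public static boolean contains(List<Set<String>> dataSequence, List<Set<String>> candidate) {
        int pos = 0;
        for (Set<String> wanted : candidate) {
            while (pos < dataSequence.size() && !dataSequence.get(pos).containsAll(wanted)) {
                pos++;
            }
            if (pos == dataSequence.size()) {
                return false;
            }
            pos++; // the next candidate element must match a strictly later data element
        }
        return true;
    }

    // Support of a candidate = number of data-sequences that contain it.
    public static int support(List<List<Set<String>>> database, List<Set<String>> candidate) {
        int count = 0;
        for (List<Set<String>> dataSequence : database) {
            if (contains(dataSequence, candidate)) {
                count++;
            }
        }
        return count;
    }

    // Keeps the candidates that reach minSupport; these would seed the next pass.
    public static List<List<Set<String>>> frequent(List<List<Set<String>>> database,
                                                   List<List<Set<String>>> candidates,
                                                   int minSupport) {
        List<List<Set<String>>> result = new ArrayList<>();
        for (List<Set<String>> candidate : candidates) {
            if (support(database, candidate) >= minSupport) {
                result.add(candidate);
            }
        }
        return result;
    }

    // Small example database of four data-sequences.
    public static List<List<Set<String>>> exampleDatabase() {
        return Arrays.asList(
                seq("1", "1,2,3", "1,3", "4", "3,6"),
                seq("1,4", "3", "2,3", "1,5"),
                seq("5,6", "1,2", "4,6", "3", "2"),
                seq("5", "7", "1,6", "3", "2", "3"));
    }

    public static void main(String[] args) {
        List<List<Set<String>>> candidates = Arrays.asList(
                seq("1,2", "3"), seq("1", "3", "2"), seq("7", "3"));
        for (List<Set<String>> candidate : candidates) {
            System.out.println(candidate + " support=" + support(exampleDatabase(), candidate));
        }
        System.out.println("frequent: " + frequent(exampleDatabase(), candidates, 2));
    }
}
```

The greedy earliest match is sufficient here because no gap or window constraints are applied; with such constraints a backtracking search would be needed instead.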

Install

Add the following dependency to your POM file:

<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-sequential-pattern-mining</artifactId>
  <version>1.0.1</version>
</dependency>

Usage

The sample code below illustrates how to use GSP to find frequent sequential patterns in a simple sequence database.

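import java.util.ArrayList;
import java.util.List;
// Sequence, GSP, MetaData and Sequences are classes provided by this library.

List<Sequence> database = new ArrayList<>();

// Below are the 4 sequences of transactions stored in the database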
/*
S1 	(1), (1 2 3), (1 3), (4), (3 6)
S2 	(1 4), (3), (2 3), (1 5)
S3 	(5 6), (1 2), (4 6), (3), (2)
S4 	(5), (7), (1 6), (3), (2), (3)
*/

database.add(Sequence.make("1", "1,2,3", "1,3", "4", "3,6"));
database.add(Sequence.make("1,4", "3", "2,3", "1,5"));
database.add(Sequence.make("5,6", "1,2", "4,6", "3", "2"));
database.add(Sequence.make("5", "7", "1,6", "3", "2", "3"));

GSP method = new GSP();
method.setMinSupportLevel(2);
List<String> uniqueItems = new MetaData(database).getUniqueItems();
Sequences result = method.minePatterns(database, uniqueItems, -1);

// Print every frequent sequential pattern that was found
result.getSequences().forEach(sequence -> System.out.println("sequence: " + sequence));

Library versions

Version
1.0.1