Latest version

SCIE PDF Text Extractor 2.0.1

This is an optimized version of Apache PDFBox. It extracts the rough structure of a document (pages, blocks of text, and paragraphs, as well as formatting information) and was built to improve text extraction results for scientific papers. The output can easily be converted to plain text (toString) or to an XML format (toXML).

License

Categories

PDF Data

Group

de.cit-ec.scie

Artifact ID

pdf-extractor

Version

2.0.1

Type

jar

Website

http://openresearch.cit-ec.de/projects/scie/

Version control

https://opensource.cit-ec.de/projects/scie/repository/revisions/master/show/modules/pdf-extractor

Download pdf-extractor 2.0.1


Maven:
<!-- https://jarcasting.com/artifacts/de.cit-ec.scie/pdf-extractor/ -->
<dependency>
    <groupId>de.cit-ec.scie</groupId>
    <artifactId>pdf-extractor</artifactId>
    <version>2.0.1</version>
</dependency>

Gradle (Groovy DSL):
// https://jarcasting.com/artifacts/de.cit-ec.scie/pdf-extractor/
implementation 'de.cit-ec.scie:pdf-extractor:2.0.1'

Gradle (Kotlin DSL):
// https://jarcasting.com/artifacts/de.cit-ec.scie/pdf-extractor/
implementation("de.cit-ec.scie:pdf-extractor:2.0.1")

Buildr:
'de.cit-ec.scie:pdf-extractor:jar:2.0.1'

Ivy:
<dependency org="de.cit-ec.scie" name="pdf-extractor" rev="2.0.1">
  <artifact name="pdf-extractor" type="jar" />
</dependency>

Grape:
@Grapes(
    @Grab(group='de.cit-ec.scie', module='pdf-extractor', version='2.0.1')
)

sbt:
libraryDependencies += "de.cit-ec.scie" % "pdf-extractor" % "2.0.1"

Leiningen:
[de.cit-ec.scie/pdf-extractor "2.0.1"]
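
For readers unsure where the Maven dependency element goes, here is a minimal sketch of a complete pom.xml that declares it. The project coordinates (com.example, pdf-demo, 1.0-SNAPSHOT) are hypothetical placeholders, and the sketch assumes the artifact can be resolved from a repository your build already uses, such as Maven Central.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates for the consuming project; replace with your own -->
  <groupId>com.example</groupId>
  <artifactId>pdf-demo</artifactId>
  <version>1.0-SNAPSHOT</version>

  <dependencies>
    <!-- Coordinates taken from this page -->
    <dependency>
      <groupId>de.cit-ec.scie</groupId>
      <artifactId>pdf-extractor</artifactId>
      <version>2.0.1</version>
    </dependency>
  </dependencies>
</project>
```

With a declaration like this, Maven would normally also resolve the artifact's compile-scope dependency (org.apache.pdfbox:pdfbox 1.8.2) transitively, so it should not need to be declared separately. Test-scope dependencies such as junit are not passed on to consuming projects.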

Dependencies

compile (1)

Library identifier          Type  Version
org.apache.pdfbox : pdfbox  jar   1.8.2

test (1)

Library identifier  Type  Version
junit : junit       jar   4.11

Project Modules

This project has no modules.