jtesseract

jtesseract provides Java wrappers (JNA and BridJ) for Tesseract OCR

Group: de.vorb
Artifact ID: jtesseract
Latest version: 0.0.4
Type: jar
Description: jtesseract provides Java wrappers (JNA and BridJ) for Tesseract OCR
Website: https://github.com/pvorb/jtesseract
Version control: https://github.com/pvorb/jtesseract

Download jtesseract

How to add the latest version

Maven:

<!-- https://jarcasting.com/artifacts/de.vorb/jtesseract/ -->
<dependency>
    <groupId>de.vorb</groupId>
    <artifactId>jtesseract</artifactId>
    <version>0.0.4</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/de.vorb/jtesseract/
implementation 'de.vorb:jtesseract:0.0.4'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/de.vorb/jtesseract/
implementation ("de.vorb:jtesseract:0.0.4")

Buildr:

'de.vorb:jtesseract:jar:0.0.4'

Ivy:

<dependency org="de.vorb" name="jtesseract" rev="0.0.4">
  <artifact name="jtesseract" type="jar" />
</dependency>

Grape:

@Grapes(
@Grab(group='de.vorb', module='jtesseract', version='0.0.4')
)

SBT:

libraryDependencies += "de.vorb" % "jtesseract" % "0.0.4"

Leiningen:

[de.vorb/jtesseract "0.0.4"]

Dependencies

compile (2)

Library identifier              Type  Version
com.nativelibs4java : bridj     jar   0.6.2
de.vorb : jleptonica            jar   0.0.2

Project modules

This project has no modules.

BridJ bindings for Tesseract 3.03

This library is no longer maintained! Consider using javacpp-presets/tesseract instead.

jtesseract is a Java library for accessing Tesseract's C API through BridJ. It provides interfaces that cover all of Tesseract's C API. The BridJ classes were generated automatically by running JNAerator on Tesseract's capi.h.

This library covers only the C API. If you are looking for a more convenient way to use Tesseract from Java, have a look at Tess4J.

Get

Information on how to get binary releases, documentation and source distributions of this software can be found on search.maven.org.

Usage example

The following lines show how to use this library to recognize text in a PNG image file.

package de.vorb.tesseract.example;

import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferByte;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;

import javax.imageio.ImageIO;

import org.bridj.BridJ;
import org.bridj.Pointer;

import de.vorb.tesseract.bridj.Tesseract;
import de.vorb.tesseract.bridj.Tesseract.TessBaseAPI;
import de.vorb.tesseract.bridj.OCREngineMode;
import de.vorb.tesseract.bridj.PageSegMode;

public class BridJExample {
  public static void main(String[] args) throws IOException {
    // tell BridJ where to find the native Tesseract library
    // (adjust the file name and path for your platform)
    BridJ.setNativeLibraryFile("tesseract", new File("libtesseract303.dll"));

    long start = System.currentTimeMillis();
    // create a reference to an execution handle
    final Pointer<TessBaseAPI> handle = Tesseract.TessBaseAPICreate();

    // init Tesseract with data path, language and OCR engine mode
    // (the tessdata path and the language are specific to the author's setup)
    Tesseract.TessBaseAPIInit2(handle,
        Pointer.pointerToCString("E:\\Masterarbeit\\Ressourcen\\tessdata"),
        Pointer.pointerToCString("deu-frak"), OCREngineMode.DEFAULT);

    // set page segmentation mode
    Tesseract.TessBaseAPISetPageSegMode(handle, PageSegMode.AUTO);

    // read the image into memory
    final BufferedImage inputImage = ImageIO.read(new File("input.png"));

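    // get the raw image data; the cast assumes a byte-backed image (for example
    // 8-bit grayscale or 3-byte BGR) and fails with a ClassCastException otherwise
    final DataBuffer imageBuffer = inputImage.getRaster().getDataBuffer();
    final byte[] imageData = ((DataBufferByte) imageBuffer).getData();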

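    // image properties
    final int width = inputImage.getWidth();
    final int height = inputImage.getHeight();
    final int bitsPerPixel = inputImage.getColorModel().getPixelSize();
    // integer division: evaluates to 0 for images with fewer than 8 bits per pixel
    final int bytesPerPixel = bitsPerPixel / 8;
    // round up so that a partially filled byte at the end of a line is counted
    final int bytesPerLine = (int) Math.ceil(width * bitsPerPixel / 8.0);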

    // set the image
    Tesseract.TessBaseAPISetImage(handle,
        Pointer.pointerToBytes(ByteBuffer.wrap(imageData)), width, height,
        bytesPerPixel, bytesPerLine);

    // get the text result
    final String txt = Tesseract.TessBaseAPIGetUTF8Text(handle).getCString();

    // print the result
    System.out.println(txt);

    // calculate the time
    System.out.println("time: " + (System.currentTimeMillis() - start) + "ms");

    // delete handle
    Tesseract.TessBaseAPIDelete(handle);
  }
}
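
The example above casts the image's data buffer to DataBufferByte, which only works when ImageIO happens to return a byte-backed image. One way to make that step independent of the input format is to copy the image into an 8-bit grayscale image first. The following is a minimal sketch using only the JDK; the class GrayImagePrep and its method names are hypothetical helpers and are not part of jtesseract, and whether a grayscale conversion suits your input is an assumption you should check.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.ComponentSampleModel;
import java.awt.image.DataBufferByte;

// Hypothetical helper, not part of jtesseract.
public class GrayImagePrep {

    /** Returns a byte-backed, 8-bit grayscale copy of the given image. */
    public static BufferedImage toByteGray(BufferedImage src) {
        BufferedImage gray = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            // drawing into the gray image converts the pixel format
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return gray;
    }

    /** Raw pixel bytes; the cast is safe for TYPE_BYTE_GRAY images. */
    public static byte[] pixels(BufferedImage gray) {
        return ((DataBufferByte) gray.getRaster().getDataBuffer()).getData();
    }

    /** Bytes per pixel, derived from the color model's pixel size in bits. */
    public static int bytesPerPixel(BufferedImage gray) {
        return gray.getColorModel().getPixelSize() / 8;
    }

    /** Bytes per image line, taken from the raster's scanline stride. */
    public static int bytesPerLine(BufferedImage gray) {
        ComponentSampleModel model =
                (ComponentSampleModel) gray.getRaster().getSampleModel();
        return model.getScanlineStride();
    }

    public static void main(String[] args) {
        // an int-backed image, for which the DataBufferByte cast would fail
        BufferedImage rgb = new BufferedImage(5, 3, BufferedImage.TYPE_INT_RGB);
        BufferedImage gray = toByteGray(rgb);
        System.out.println(pixels(gray).length + " bytes, "
                + bytesPerPixel(gray) + " byte(s) per pixel, "
                + bytesPerLine(gray) + " bytes per line");
    }
}
```

The returned byte array, width, height, bytes per pixel and bytes per line correspond to the values that the example passes to TessBaseAPISetImage.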

Other example files can be found at src/test/java/de/vorb/tesseract/example.

Library versions

Version
0.0.4
0.0.3
0.0.2
0.0.1