html-encoding-sniffer

WebJar for html-encoding-sniffer

License: MIT
Group: org.webjars.npm
Artifact ID: html-encoding-sniffer
Latest version: 2.0.1
Type: jar
Description: WebJar for html-encoding-sniffer
Website: https://www.webjars.org
Source control: https://github.com/jsdom/html-encoding-sniffer

Download html-encoding-sniffer

How to add the latest version

Maven:

<!-- https://jarcasting.com/artifacts/org.webjars.npm/html-encoding-sniffer/ -->
<dependency>
    <groupId>org.webjars.npm</groupId>
    <artifactId>html-encoding-sniffer</artifactId>
    <version>2.0.1</version>
</dependency>

Gradle (Groovy DSL):

// https://jarcasting.com/artifacts/org.webjars.npm/html-encoding-sniffer/
implementation 'org.webjars.npm:html-encoding-sniffer:2.0.1'

Gradle (Kotlin DSL):

// https://jarcasting.com/artifacts/org.webjars.npm/html-encoding-sniffer/
implementation("org.webjars.npm:html-encoding-sniffer:2.0.1")

Buildr:

'org.webjars.npm:html-encoding-sniffer:jar:2.0.1'

Ivy:

<dependency org="org.webjars.npm" name="html-encoding-sniffer" rev="2.0.1">
  <artifact name="html-encoding-sniffer" type="jar" />
</dependency>

Grape:

@Grapes(
    @Grab(group='org.webjars.npm', module='html-encoding-sniffer', version='2.0.1')
)

sbt:

libraryDependencies += "org.webjars.npm" % "html-encoding-sniffer" % "2.0.1"

Leiningen:

[org.webjars.npm/html-encoding-sniffer "2.0.1"]

Dependencies

compile (1)

Library identifier                 Type  Version
org.webjars.npm : whatwg-encoding  jar   [1.0.5,2)

Project Modules

This project has no modules.

Determine the Encoding of an HTML Byte Stream

This package implements the HTML Standard's encoding sniffing algorithm in all its glory. The most interesting part of this is how it pre-scans the first 1024 bytes in order to search for certain <meta charset>-related patterns.

const htmlEncodingSniffer = require("html-encoding-sniffer");
const fs = require("fs");

const htmlBuffer = fs.readFileSync("./html-page.html");
const sniffedEncoding = htmlEncodingSniffer(htmlBuffer);

The returned value will be a canonical encoding name (not a label). You might then combine this with the whatwg-encoding package to decode the result:

const whatwgEncoding = require("whatwg-encoding");
const htmlString = whatwgEncoding.decode(htmlBuffer, sniffedEncoding);

Options

You can pass two potential options to htmlEncodingSniffer:

const sniffedEncoding = htmlEncodingSniffer(htmlBuffer, {
  transportLayerEncodingLabel,
  defaultEncoding
});

These represent two possible inputs into the encoding sniffing algorithm:

  • transportLayerEncodingLabel is an encoding label that is obtained from the "transport layer" (probably an HTTP Content-Type header), which overrides everything but a BOM.
  • defaultEncoding is the ultimate fallback encoding used if no valid encoding is supplied by the transport layer, and no encoding is sniffed from the bytes. It defaults to "windows-1252", as recommended by the algorithm's table of suggested defaults for "All other locales" (including the en locale).
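
The precedence described by these two options can be sketched as follows. This is an illustration only, not the package's code: detectBom and pickEncoding are hypothetical helpers, and the sketch assumes that transportLayerEncoding and prescanned have already been resolved to encoding names (or are null when absent).

```javascript
// Hypothetical helpers that only illustrate the order in which the
// inputs are consulted: BOM, then transport layer, then pre-scan, then default.
function detectBom(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "UTF-8";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "UTF-16BE";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "UTF-16LE";
  }
  return null;
}

function pickEncoding(
  bytes,
  { transportLayerEncoding = null, prescanned = null, defaultEncoding = "windows-1252" } = {}
) {
  // The first non-null source wins.
  return detectBom(bytes) || transportLayerEncoding || prescanned || defaultEncoding;
}
```

For example, a UTF-8 BOM wins even when the transport layer reports something else, and "windows-1252" is returned only when every other source comes up empty.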

Credits

This package was originally based on the excellent work of @nicolashenry, in jsdom. It has since been pulled out into this separate package.

Library versions

Version
2.0.1
1.0.2
1.0.1