Tesseract.js (v5.1.1) Fails on Angled Text Recognition #963

@rajkumardongre

I am using Tesseract.js version 5.1.1 to build a simple OCR solution. It works well on images with horizontal text or vertical text written top to bottom, but I run into issues when the text is at an angle or written bottom to top.

Observations:

  1. Text orientation issues:

    • The OCR works fine with horizontal and top-to-bottom vertical text.
    • It fails or performs inconsistently with text written at certain angles, especially:
      • Angled text: text rotated between 0° and 90° clockwise.
      • Bottom-to-top vertical text: text written from bottom to top, i.e., rotated between 90° and 270° clockwise, is not detected.
  2. Settings applied:

    • I am setting rotateAuto: true, which works in some cases but fails to detect text properly at certain angles.
    • The orientation detection setting is enabled but does not seem to improve recognition accuracy at problematic angles.
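import { createWorker, PSM } from 'tesseract.js';

// storedLangCodeList: language codes to load; 1 selects the LSTM-only engine (OEM.LSTM_ONLY)
const worker = await createWorker(storedLangCodeList, 1);
await worker.setParameters({
    tessedit_pageseg_mode: PSM.AUTO_OSD, // automatic page segmentation with orientation and script detection
});
// rotateAuto asks Tesseract.js to estimate and correct the page rotation before recognition
const ret = await worker.recognize(imgURL, {rotateAuto: true});
console.log(ret.data.text);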
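
To help narrow down which angles fail, the same image could be recognized at several fixed rotations and the confidence values compared. The following is only a minimal sketch and not part of my actual code: `recognizeBestRotation` and its `recognizeAt` callback are hypothetical names, and the commented usage assumes that `worker.recognize` accepts a `rotateRadians` option and that the result exposes `data.confidence`.

```javascript
// Hypothetical helper: try a set of fixed rotations and keep the most confident result.
// `recognizeAt` is an assumed callback of the form (radians) => Promise<{ text, confidence }>.
async function recognizeBestRotation(recognizeAt, anglesDeg = [0, 90, 180, 270]) {
  let best = null;
  for (const deg of anglesDeg) {
    const radians = (deg * Math.PI) / 180; // convert degrees to radians
    const result = await recognizeAt(radians);
    // Keep the candidate with the highest reported confidence.
    if (best === null || result.confidence > best.confidence) {
      best = { ...result, angleDeg: deg };
    }
  }
  return best;
}

// Possible usage with the worker above (assumes the `rotateRadians` option
// and the `data.confidence` field behave as described in the lead-in):
//
// const best = await recognizeBestRotation(async (radians) => {
//   const { data } = await worker.recognize(imgURL, { rotateRadians: radians });
//   return { text: data.text, confidence: data.confidence };
// });
// console.log(best.angleDeg, best.confidence, best.text);
```

Logging the confidence per angle this way would show whether the failures come from the rotation estimate or from recognition itself. It only covers multiples of 90° by default, so text at intermediate angles would need a finer list of angles.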

Expected Behavior:

OCR should be able to correctly detect and process text, regardless of its orientation or the angle at which it appears in the image.

Actual Behavior:

Text at an angle or written from bottom to top is either not recognized or inaccurately detected.

Steps to Reproduce:

  1. Use Tesseract.js v5.1.1 with an image containing angled or bottom-to-top vertical text.
  2. Set rotateAuto: true and enable orientation detection.
  3. Attempt OCR on the image and observe inconsistent results.

Additional Information:

  • Browser: Chrome
  • Tesseract.js Version: 5.1.1
  • Issue occurs on both local development and production environments.

Images for Reference:

[Four images of text where recognition fails]
