Description
I am using Tesseract.js version 5.1.1 to build a simple OCR solution. It works well on images with horizontal text or vertical text written top to bottom, but I run into issues when the text is at an angle or written bottom to top.
Observations:
- Text orientation issues:
  - The OCR works fine with horizontal and top-to-bottom vertical text.
  - It fails or performs inconsistently with text written at certain angles, especially:
    - Angled text: between 0° and 90° in the clockwise direction.
    - Bottom-to-top vertical text: text written from bottom to top, i.e., between 90° and 270° clockwise, is not detected.
- Settings applied:
  - I am setting `rotateAuto: true`, which works in some cases but fails to detect text properly at certain angles.
  - The orientation detection setting is enabled but does not seem to improve recognition accuracy at problematic angles.
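
```javascript
import { createWorker, PSM } from 'tesseract.js';

// storedLangCodeList and imgURL are defined elsewhere in my app.
// The second argument (1) selects the LSTM-only OCR engine mode.
const worker = await createWorker(storedLangCodeList, 1);
await worker.setParameters({
  // Automatic page segmentation with orientation and script detection (OSD).
  tessedit_pageseg_mode: PSM.AUTO_OSD,
});
const ret = await worker.recognize(imgURL, { rotateAuto: true });
console.log(ret.data.text);
```
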
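
One possible workaround, sketched here under assumptions and not confirmed as a fix, is to run OCR on the image at several fixed rotations and keep the result with the highest confidence. The helper name `recognizeBestRotation` and the callback `recognizeAt` are hypothetical. The callback is assumed to rotate the image by the given angle (for example on a canvas), run recognition, and resolve to an object with `text` and `confidence` fields.

```javascript
// Hypothetical helper, not part of Tesseract.js.
// `recognizeAt(angle)` is an assumed callback that rotates the image by
// `angle` degrees, runs OCR, and resolves to { text, confidence }.
async function recognizeBestRotation(recognizeAt, angles = [0, 90, 180, 270]) {
  let best = null; // stays null if `angles` is empty
  for (const angle of angles) {
    const { text, confidence } = await recognizeAt(angle);
    // Keep the candidate with the highest reported confidence.
    if (best === null || confidence > best.confidence) {
      best = { angle, text, confidence };
    }
  }
  return best;
}
```

With Tesseract.js, `recognizeAt` could presumably return `ret.data.text` and `ret.data.confidence` from `worker.recognize()`. This multiplies the OCR time by the number of angles tried, and it only covers the four right-angle orientations unless more angles are passed in.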
Expected Behavior:
OCR should be able to correctly detect and process text, regardless of its orientation or the angle at which it appears in the image.
Actual Behavior:
Text at an angle or written from bottom to top is either not recognized or inaccurately detected.
Steps to Reproduce:
- Use Tesseract.js v5.1.1 with an image containing angled or bottom-to-top vertical text.
- Set `rotateAuto: true` and enable orientation detection.
- Attempt OCR on the image and observe inconsistent results.
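
To narrow down whether the failure lies in orientation detection or in recognition itself, the steps above could be extended with a small diagnostic like the following. It is a sketch under assumptions: `reportOrientations` is a hypothetical helper, and `worker` is assumed to expose a `detect(image)` method resolving to `{ data: { orientation_degrees, orientation_confidence } }`. As far as I know, Tesseract.js workers offer this only when created with legacy support (`legacyCore: true, legacyLang: true`), so please treat that requirement as something to verify.

```javascript
// Hypothetical diagnostic helper, not part of Tesseract.js.
// Assumes `worker.detect(image)` resolves to
// { data: { orientation_degrees, orientation_confidence } }.
async function reportOrientations(worker, images) {
  const rows = [];
  for (const image of images) {
    const { data } = await worker.detect(image);
    rows.push({
      image,
      degrees: data.orientation_degrees,
      confidence: data.orientation_confidence,
    });
  }
  return rows;
}
```

Comparing the reported degrees against the known rotation of each test image would show whether the bottom-to-top cases are being misdetected before recognition even starts.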
Additional Information:
- Browser: Chrome
- Tesseract.js Version: 5.1.1
- Issue occurs on both local development and production environments.
Images for Reference:
apexkid



