Terrible OCR results with Channel 5 (UK) #384

Open
opened 2026-01-29 16:42:32 +00:00 by claunia · 0 comments

Originally created by @cfsmp3 on GitHub (Feb 13, 2018).

(current master, pre 0.87)

This file (and, really, everything from Channel 5):
https://drive.google.com/open?id=1Etq-pv5G3jGqVhhRl7cNrfuw4gaKkLoV

produces terrible OCR results, even though the bitmaps look normal. What's going on?

833
00:59:26,021 --> 00:59:29,340
(CriminaLs den't.just. ’cemmit
I one type of offence. I

834
00:59:29,341 --> 00:59:31,700
Just C05 they SitQLe same petrQL
- that day‘ dQeSn't mean -

835
00:59:31,701 --> 00:59:34,740
m
pietes Of pLLa nt,
claunia added the difficulty: hard, OCR, GSoC-related labels 2026-01-29 16:42:32 +00:00
Reference: starred/ccextractor#384