Version 0.85 became much slower #246

Closed
opened 2026-01-29 16:38:51 +00:00 by claunia · 8 comments
Owner

Originally created by @maxkoryukov on GitHub (Jan 15, 2017).

As a result of the latest few commits, the current version (in the master branch) runs much slower (x10):

...
Min PTS:         19:04:12:762
Max PTS:         19:14:14:004
Length:          00:10:01:242
Done, processing time = 227 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

This is the result of processing BBC_NEWS_2016-06-21.ts from sample-videos. Previously this file took about 0:57 (about one minute).

PS: I have disabled my sbs feature for this speed test.
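
For reference, the figures quoted above can be turned into ratios. This is only a rough sketch: the 57-second "before" value is the approximate 0:57 from the report, not a fresh measurement, and the variable names are made up for illustration.

```python
# Timings as reported above; the "before" figure is the approximate 0:57 quoted in the report.
current_seconds = 227
previous_seconds = 57

# Clip length taken from the "Length: 00:10:01:242" line (hh:mm:ss:ms).
clip_seconds = 10 * 60 + 1 + 0.242

# How much slower the current build is than the previous one.
slowdown = current_seconds / previous_seconds

# How many seconds of video are processed per second of wall-clock time.
realtime_factor = clip_seconds / current_seconds

print(f"slowdown vs. previous build: {slowdown:.1f}x")
print(f"processing speed: {realtime_factor:.1f}x real time")
```

On these two numbers alone the slowdown works out to roughly 4x.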

Author
Owner

@maxkoryukov commented on GitHub (Jan 15, 2017):

I see that CCE output now contains <font> tags for most subtitles. This is the only difference I can see as a consumer of CCE))))

Author
Owner

@cfsmp3 commented on GitHub (Jan 16, 2017):

What happens with -nodvbcolor?

Author
Owner

@maxkoryukov commented on GitHub (Jan 16, 2017):

@cfsmp3, -nodvbcolor works as expected (I think): with this option ccextractor produces plain subtitles without <font> tags (I use this for the SBS // sentence-split feature). But even with -nodvbcolor, ccextractor 0.8.5 runs at the same speed: 10 minutes for a 10-minute movie.

I have only tested on my laptop; I haven't built or run ccextractor on other hosts.

Author
Owner

@cfsmp3 commented on GitHub (Jan 16, 2017):

dvbcolor being the default is correct, but of course if sbs doesn't support
colors you can just turn colors off automatically with -sbs.
About speed: is this a Windows or Linux build?
On Windows we now have Tesseract 4, which seems a lot slower than 3...

Author
Owner

@maxkoryukov commented on GitHub (Jan 16, 2017):

I am on Linux.
I downloaded the latest commits from the master branch and built it with the /linux/build script.

Author
Owner

@cfsmp3 commented on GitHub (Jan 17, 2017):

Seems like it's Tesseract. @izaron is in charge of that. For now we'll
revert to Tesseract 3 as the default.

Author
Owner

@Izaron commented on GitHub (Jan 18, 2017):

Yes, I saw that Tesseract 4.0 is slower because of its new algorithm, and I'll fix it (by changing the accuracy settings, or by downgrading to the old .lib files and waiting for a stable 4.0). But that applies to Windows. @maxkoryukov, can you check your Tesseract version? As far as I know, the current tesseract-ocr-dev package from Debian/Ubuntu is version 3.0.4 or 3.0.5. Or did you install the library from source?

Author
Owner

@maxkoryukov commented on GitHub (Jan 18, 2017):

@Izaron

$ tesseract -v
tesseract 3.03
 leptonica-1.70
  libgif 4.1.6(?) : libjpeg 8d : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8 : webp 0.4.0

As far as I remember, it was installed with apt-get, not from source.
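
For anyone scripting the same check, here is a minimal sketch that pulls the major version out of `tesseract -v` style output. It only parses a string shaped like the output above; `tesseract_major` is a hypothetical helper name, not part of CCExtractor or Tesseract.

```python
import re


def tesseract_major(version_output: str) -> int:
    """Return the major version from the first 'tesseract X.Y' line."""
    match = re.search(r"^tesseract\s+(\d+)\.", version_output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'tesseract X.Y' line found")
    return int(match.group(1))


# First two lines of the `tesseract -v` output quoted above.
sample = "tesseract 3.03\n leptonica-1.70\n"
print(tesseract_major(sample))
```

A result of 3 here would mean the build is linked against the older Tesseract 3.x series rather than 4.0, assuming the command-line tool and the linked library come from the same installation.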

Reference: starred/ccextractor#246