Extraction from bin file does not honor -unixts and -UCLA #259

Closed
opened 2026-01-29 16:39:14 +00:00 by claunia · 11 comments
Owner

Originally created by @Liontooth on GitHub (Jan 25, 2017).

Create a bin file from a DVB transport stream:

ccextractor -ts -pn $PN -out=bin -o $FIL.bin $DIR/$FIL.$EXT

Extracting the text from this bin file:

ccextractor -in=bin -pn 53007 -tpage 891 -datets -ttxt -UCLA -noru -utf8 -parsepat -parsepmt -unixts 1485198721 -o 2017-01-23_1912_FR_TV5_Géopolitis.ccx.out 2017-01-23_1912_FR_TV5_Géopolitis.bin

results in wrong timestamps, a messed up third field, and an extra |:

19700101000109.360|19700101000112.520|CC?||Bonjour, bienvenue dans cette edition de Geopolitis.

while extraction from the transport stream produces the correct output:

20170123191310.360|20170123191313.520|891|Bonjour, bienvenue dans cette edition de Geopolitis.

Let me know if you need samples; this likely holds for any file.
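
For reference, the expected timestamps follow from adding each caption's relative time to the value passed with -unixts. The Python sketch below only illustrates that arithmetic; it is not CCExtractor's code, and the function name `ttxt_timestamp` is made up for the example. It assumes the times are rendered in UTC.

```python
from datetime import datetime, timedelta, timezone


def ttxt_timestamp(unix_base_s: int, offset_ms: int) -> str:
    """Format (base + offset) as YYYYMMDDHHMMSS.mmm in UTC.

    unix_base_s: the reference time given with -unixts (0 if it is ignored).
    offset_ms:   the caption time relative to the start of the stream.
    """
    t = datetime.fromtimestamp(unix_base_s, tz=timezone.utc)
    t += timedelta(milliseconds=offset_ms)
    return t.strftime("%Y%m%d%H%M%S") + f".{t.microsecond // 1000:03d}"


# With the base honored, 00:01:09.360 into the stream gives the TS result.
print(ttxt_timestamp(1485198721, 69360))  # 20170123191310.360
# With the base treated as 0, the same offset gives the 1970 date seen from the bin file.
print(ttxt_timestamp(0, 69360))           # 19700101000109.360
```

The two printed values match the start times in the correct and the wrong output lines above, which is consistent with the -unixts value not being applied when reading the bin file.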

Author
Owner

@cfsmp3 commented on GitHub (Jan 25, 2017):

Should be easy to fix.
GSoC qualification: Solving this issue gives 2 points.

Author
Owner

@barun511 commented on GitHub (Jan 28, 2017):

Could I have samples please? I'll give this a shot.

Author
Owner

@cfsmp3 commented on GitHub (Jan 28, 2017):

You can probably use any of the teletext ones from here:

http://ccextractor.org/doku.php?id=public:general:tvsamples

Author
Owner

@barun511 commented on GitHub (Jan 30, 2017):

I can't seem to reproduce this. Is there a particular sample that you noticed this on?

Author
Owner

@Liontooth commented on GitHub (Jan 30, 2017):

http://vrnewsscape.ucla.edu/dropbox/2017-01-23_1912_FR_TV5_G%c3%a9opolitis.bin
Author
Owner

@cfsmp3 commented on GitHub (Jan 31, 2017):

Confirmed. I'll let GSoC applicants give it a go though since it's not too hard.

Author
Owner

@saurabhshri commented on GitHub (Feb 21, 2017):

Also, in this case (teletext), when extracting from bin it says `No captions were found in input.` and yields return code 10 even when they are extracted properly.
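
A script that drives ccextractor could flag this inconsistency by checking the exit status against the output file. The Python sketch below is only an illustration: the function name is hypothetical, and the value 10 is taken from the comment above rather than from CCExtractor's source.

```python
import os

# Exit status reported in the comment above for "No captions were found in input."
# Treated here as an assumption, not a documented constant.
NO_CAPTIONS_EXIT = 10


def contradicts_no_captions(returncode: int, output_path: str) -> bool:
    """Return True when the exit status claims no captions were found,
    yet the output file exists and is not empty."""
    has_output = os.path.isfile(output_path) and os.path.getsize(output_path) > 0
    return returncode == NO_CAPTIONS_EXIT and has_output
```

The caller would pass the process's return code and the path given with -o; a True result corresponds to the behavior described here.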

Author
Owner

@cfsmp3 commented on GitHub (Feb 22, 2017):

Please send fix for that :-)

Author
Owner

@alexandrumc commented on GitHub (Feb 23, 2017):

@Liontooth, can you post here the DVB transport stream?

Author
Owner

@saurabhshri commented on GitHub (Feb 26, 2017):

@cfsmp3 @Liontooth While fixing this, I am running into timing issues. This is what I mean:

From TS :

20170123191246.080|20170123191249.060|801|<font color="#00ffff">Disappearing? Are you sure, Mofy</font>
20170123191249.160|20170123191253.020|801|I've just seen it! I couldn't believe my eyes.
20170123191253.120|20170123191255.260|801|Mogu, your bag!

From .bin

20170123191245.980|20170123191248.960|801|<font color="#00ffff">Disappearing? Are you sure, Mofy?</font>
20170123191249.060|20170123191252.920|801|I've just seen it! I couldn't believe my eyes.
20170123191253.020|20170123191255.160|801|Mogu, your bag!

But then I found out that when using the .bin, a few lines are missing too (see https://github.com/CCExtractor/ccextractor/issues/699).

Since the timings are correct when extracting without -unixts, it must be something wrong on my part. I am trying to fix it. :)
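
The two listings above can be compared mechanically. The following is a minimal Python sketch, with illustrative helper names that are not part of CCExtractor, that parses the start and end fields of each line and reports the TS-minus-bin difference in milliseconds.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H%M%S.%f"  # e.g. 20170123191246.080


def start_end(line):
    """Parse the first two pipe-separated fields of an output line."""
    start, end = line.split("|")[:2]
    return datetime.strptime(start, FMT), datetime.strptime(end, FMT)


def deltas_ms(ts_lines, bin_lines):
    """TS-minus-bin difference in ms for every start and end time."""
    out = []
    for ts_line, bin_line in zip(ts_lines, bin_lines):
        for a, b in zip(start_end(ts_line), start_end(bin_line)):
            out.append((a - b) // timedelta(milliseconds=1))
    return out


ts_lines = [
    '20170123191246.080|20170123191249.060|801|<font color="#00ffff">Disappearing? Are you sure, Mofy</font>',
    "20170123191249.160|20170123191253.020|801|I've just seen it! I couldn't believe my eyes.",
    "20170123191253.120|20170123191255.260|801|Mogu, your bag!",
]
bin_lines = [
    '20170123191245.980|20170123191248.960|801|<font color="#00ffff">Disappearing? Are you sure, Mofy?</font>',
    "20170123191249.060|20170123191252.920|801|I've just seen it! I couldn't believe my eyes.",
    "20170123191253.020|20170123191255.160|801|Mogu, your bag!",
]

print(deltas_ms(ts_lines, bin_lines))  # [100, 100, 100, 100, 100, 100]
```

For these three lines every start and end time from the .bin is exactly 100 ms earlier than the one from the TS, so the discrepancy is a constant offset rather than drift.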

Author
Owner

@saurabhshri commented on GitHub (Feb 26, 2017):

I was unnecessarily calculating deltas, and there was a mistake somewhere in that calculation. The solution was staring me right in the face :P Timing is correct now (in PR #700).

Reference: starred/ccextractor#259