Apparently nasty corruption in some files #63

Closed
opened 2026-01-29 16:34:15 +00:00 by claunia · 1 comment
Owner

Originally created by @cfsmp3 on GitHub (Jun 29, 2015).

Originally assigned to: @canihavesomecoffee on GitHub.

A hard one. The file

/repository/UCLACorruption/2015-06-25_1800_US_KNBC_Access_Hollywood_Live.mpg

produces garbage under some conditions.

Report says that it happens with these parameters:

" 463 F=2015-06-25_1800_US_KNBC_Access_Hollywood_Live
464 cx=ccextractor-0.78-alpha1
466 $cx -ts -autoprogram -UCLA -12 -noru -out=ttxt -utf8 -unixts 0 -o $F.test2 $F.mpg

Note that the problem goes away if you use -1 instead of -12. It's looking for the second channel that triggers the junk inclusion, so it's finding this somewhere."

To make matters worse, I can't reproduce it on Windows.

Output looks like this:
19700101000012.412|19700101000014.180|CC1|RU2|>>> AH, TODAY ON "ACCESS
19700101000014.247|19700101000015.215|CC1|RU2|HOLLYWOOD LIVE," I AM SO
19700101000015.281|19700101000015.482|CC1|RU2|EXCITED.
1061371091103224844.814|1978833380815014111.993|CC-1956779514|???|�^�^��^֠^@^@xm^N^E�^�^��^֠^@^@?1 ^A^@^@^@^@?1 ^A^@^@^@^@?¢üí
1061371091103224844.814|1978833380815014111.993|CC-1956779514|???|^K�^�/F?�^�?EI??D^B�^�
^Y�^�^�ú2]ëäSs?^C?~^Q?�^�^�0

For more details, contact David at UCLA.
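
For anyone triaging similar reports, here is a minimal sketch of a checker that flags suspect lines in a ttxt transcript. It assumes the pipe-delimited layout seen in the sample output above (start|end|channel|mode|text), a 14-digit timestamp followed by milliseconds, and channel names CC1 to CC4. The function names are hypothetical and not part of ccextractor.

```python
import re

# Assumed layout, inferred from the sample output in this report:
#   start|end|channel|mode|text
TIMESTAMP = re.compile(r"\d{14}\.\d{3}")  # YYYYMMDDHHMMSS.mmm
CHANNEL = re.compile(r"CC[1-4]")          # assumption: only CC1..CC4 are valid


def is_suspect(line):
    """Return True if a line does not match the assumed field layout."""
    fields = line.rstrip("\n").split("|", 4)
    if len(fields) < 5:
        # Continuation junk with no field separators ends up here.
        return True
    start, end, channel, _mode, _text = fields
    return not (
        TIMESTAMP.fullmatch(start)
        and TIMESTAMP.fullmatch(end)
        and CHANNEL.fullmatch(channel)
    )


def find_suspect_lines(lines):
    """Return the 1-based numbers of lines that look corrupted."""
    return [n for n, line in enumerate(lines, start=1) if is_suspect(line)]


sample = [
    '19700101000012.412|19700101000014.180|CC1|RU2|>>> AH, TODAY ON "ACCESS',
    "1061371091103224844.814|1978833380815014111.993|CC-1956779514|???|junk",
]
print(find_suspect_lines(sample))  # prints [2]
```

Running something like this over the output of the -1 and -12 runs on each platform could help narrow down where the junk first appears.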

Author
Owner

@cfsmp3 commented on GitHub (Aug 8, 2016):

Can't reproduce with the current git version, so I'm assuming it's fixed.


Reference: starred/ccextractor#63