Ffmpeg Cea 708


#726 closed enhancement (fixed)

Reported by: Owned by:
Priority: wish Component: undetermined
Version: git-master Keywords:
Cc: Blocked By:
Blocking: Reproduced by developer: yes
Analyzed by developer: no

Igor Brezac ypass.net writes: Is there a way to remove embedded Closed Caption (cea-708) from an mp4 using ffmpeg? Codec copy does not work; ffmpeg copies the user data too. Hard coding is not the solution for me. When I said closed captions I meant captions that are according to the EIA 608/708 format. The extracting subtitles method works for the dvd player but not for the decoder we are using. I want to convert a subtitle file to closed captions which will be muxed inside the video stream. Hope that clears it up.


Attachments (2)

AALE0021_first2_5MB.mxf (2.4 MB) - added by dericed 9 years ago.
first 2.5 MB of an MXF file with SMPTE 436M caption track
_0002-G17010201.Stream (2.2 MB) - added by dericed 9 years ago.
extract of SMPTE 436M VBI caption track extracted from an XDCam disc's MXF with mxfsplit

Change History (19)

Changed 9 years ago by dericed

  • Attachment AALE0021_first2_5MB.mxf added

comment:1 follow-up: ↓ 14 Changed 9 years ago by cehoyos

comment:3 Changed 9 years ago by reimar

Changed 9 years ago by dericed

  • Attachment _0002-G17010201.Stream added

comment:5 Changed 9 years ago by dericed

comment:6 Changed 8 years ago by cehoyos

  • Keywords cc added
  • Reproduced by developer set
  • Status changed from new to open

comment:7 Changed 8 years ago by cehoyos

  • Keywords mxf added

comment:9 Changed 7 years ago by helmboy

comment:10 follow-up: ↓ 12 Changed 6 years ago by funman

comment:12 in reply to: ↑ 10 Changed 5 years ago by da8eat

comment:13 Changed 3 years ago by yojimbo69

comment:14 in reply to: ↑ 1 Changed 3 years ago by yojimbo69

comment:16 Changed 2 years ago by dericed

comment:17 Changed 2 years ago by richardpl

  • Resolution set to fixed
  • Status changed from open to closed
EIA 608 closed caption data on an NTSC analog television signal

EIA-608, also known as 'line 21 captions' and 'CEA-608',[1] was once the standard for closed captioning for NTSC TV broadcasts in the United States, Canada and Mexico. It also specifies an 'Extended Data Service', which is a means for including a VCR control service with an electronic program guide for NTSC transmissions that operates on the even line 21 field, similar to the TeleText based VPS that operates on line 16 which is used in PAL countries.

It was developed by the Electronic Industries Alliance and required by law to be implemented in most television receivers made in the United States.

EIA-608 captions are transmitted on either the odd or even fields of line 21, with an odd parity bit, in the non-visible active video data area of NTSC broadcasts, and are also sometimes present in the picture user data in ATSC transmissions. It uses a fixed bandwidth of 480 bit/s per line 21 field for a maximum of 32 characters per line per caption (maximum four captions) for a 30 frame broadcast.[2] The odd field captions relate to the primary audio track and the even field captions relate to the SAP or secondary audio track, which is generally a second language translation of the primary audio, such as a French or Spanish translation of an English-speaking TV show.
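As a minimal sketch of the two numbers in the paragraph above, the following Python computes the odd parity bit for a 7-bit caption code and reproduces the 480 bit/s figure. The function and constant names are illustrative, and the nominal rate of 30 fields of each parity per second is an assumption (NTSC actually runs at 30/1.001).

```python
def with_odd_parity(value: int) -> int:
    """Set bit 7 so that the byte carries an odd number of 1 bits."""
    value &= 0x7F
    ones = bin(value).count("1")
    return value | 0x80 if ones % 2 == 0 else value


def has_odd_parity(byte: int) -> bool:
    """Check that a received byte has an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


# Each line 21 field carries one byte pair: 2 bytes x 8 bits, parity included.
BITS_PER_FIELD = 2 * 8
# Assumed nominal rate: 30 odd (or 30 even) fields per second.
FIELDS_PER_SECOND = 30
BITS_PER_SECOND = BITS_PER_FIELD * FIELDS_PER_SECOND

if __name__ == "__main__":
    print(hex(with_odd_parity(0x14)), hex(with_odd_parity(0x20)))
    print(BITS_PER_SECOND, "bit/s per field")
```

A null byte (0x00) becomes 0x80 once parity is applied, which is why 0x80 appears as the default caption byte in the insertion structures later in the article.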

Raw EIA-608 caption byte pairs are becoming less prevalent as digital television replaces analog. ATSC broadcasts instead use the EIA-708 caption protocol to encapsulate the EIA-608 caption pairs and to add a native EIA-708 stream. EIA-608 has been revised to add extended character sets that fully support Spanish, French, German, and a cross section of other Western European languages. EIA-608 was also extended to support two byte characters for the Korean and Japanese markets. The full version of EIA-708 has support for more character sets and better caption positioning options; however, because of existing EIA-608 hardware and revisions to the format, there has been little or no real world use of the format besides simple 608 to 708 inline conversions.


EIA-608 defines four channels of caption information, so that a program could, for example, have captions in four different languages. There are two channels, called 1 and 2 by the standard, in each of the two fields of a frame. However, the channels are often presented to users numbered simply as CC1-2 for the odd field and CC3-4 for the even field. Because of bandwidth limitations on either field, CC1 and CC3 are the only ones commonly used, so there has been little use for the second channel of each field. Early Spanish SAP captioned broadcasts first used the second channel CC2 because the original caption decoders only read the odd field, but later switched to using CC3 for bandwidth reasons. For the same bandwidth reasons XDS was never used by Spanish-speaking stations.

Within each channel, there are two streams of information which might be considered sub-channels: one carries 'captions' and the other 'text.' The latter is not in common use due to the lack of hardware support and bandwidth available. Text is signaled by the use of text commands and can be used for a formatted URL string with a 16-bit checksum that designates a web site that the captions relate to or a local station communication channel.

This layering is based on the OSI Protocol Reference Model:

CC Layers | OSI Layers | DVB/MXF Layers | Comments
Application | Interpretation | Issuing commands and appending text to rows
Presentation | Coding | Breaking up individual commands and characters
Session | Channel | Channel Byte Stream
- | - | Selection | CC channel assembly from CC byte pairs
Injection | Transport | Synchronization | CC byte pairs extracted/synchronized with/from video frames
Network | unused | directly connected link
Link | video frames or VBI data split from link format
Physical | link format demodulated/retrieved from transmission/source

DVD GOP User Data Insertion

The user data structure that follows an H.262 GOP header is as follows (the same would apply after an ISO/IEC 14496-2 GOP header):

Length | Name | Type | Value
32 bits | user_data_start_code | patterned bslbf | 0x000001B2
16 bits | user_identifier | ASCII bslbf | CC
8 bits | user_data_type_code | uimsbf | 1
8 bits | caption_block_size | inverted uimsbf | 0xf8
1 bit | caption_odd_field_first | flag | 1
1 bit | caption_filler | alignment | 0
5 bits | caption_block_count | uimsbf | 15
1 bit | caption_extra_field_added | flag | 0
X*24 bits | caption_block | binary | free form

bslbf: bit string, left bit first ; uimsbf: unsigned integer, most significant bit first
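The field layout above, together with the 24-bit DVD caption block described later in this section, can be read with a few shifts and masks. The following Python is only a sketch under stated assumptions: the function and type names are hypothetical, the input is assumed to start at the user_data_start_code, and every 3-byte block whose top seven bits equal the 0x7f filler is treated as a caption block rather than trusting caption_block_count.

```python
from typing import List, NamedTuple, Tuple


class DvdCaptionData(NamedTuple):
    odd_field_first: bool
    block_count: int
    extra_field_added: bool
    # (odd_field flag, first byte, second byte), parity bits stripped
    blocks: List[Tuple[int, int, int]]


def parse_dvd_cc_user_data(buf: bytes) -> DvdCaptionData:
    if len(buf) < 9:
        raise ValueError("buffer too short for the caption header")
    if buf[0:4] != b"\x00\x00\x01\xb2":
        raise ValueError("missing user_data_start_code")
    if buf[4:6] != b"CC":
        raise ValueError("user_identifier is not 'CC'")
    # buf[6] is user_data_type_code, buf[7] is caption_block_size (unused here)
    flags = buf[8]
    odd_field_first = bool(flags >> 7)       # 1 bit
    block_count = (flags >> 1) & 0x1F        # 5 bits, after 1 filler bit
    extra_field_added = bool(flags & 0x01)   # 1 bit

    blocks = []
    pos = 9
    while pos + 3 <= len(buf):
        marker, first, second = buf[pos], buf[pos + 1], buf[pos + 2]
        if marker >> 1 != 0x7F:              # 7-bit filler must be 0x7f
            break
        blocks.append((marker & 1, first & 0x7F, second & 0x7F))
        pos += 3
    return DvdCaptionData(odd_field_first, block_count, extra_field_added, blocks)


if __name__ == "__main__":
    sample = (b"\x00\x00\x01\xb2CC\x01\xf8\x9e"
              b"\xff\x94\x20"    # odd field: 0x14 0x20 with parity
              b"\xfe\x80\x80")   # even field: null padding
    print(parse_dvd_cc_user_data(sample))
```

With the default values from the table (odd field first, count 15, no extra field) the flags byte works out to 0x9E, which is what the sample buffer uses.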

Caption blocks are inserted after the sequence and GOP headers, so each block covers one second of video, which ends up being one or two long lines or three to four short lines of text. It also means that if the caption_block_count is greater than 30, the block contains both interleaved caption fields, and one could derive the frame rate from the caption_block_count. However, since the data is grouped together, the frame rate will almost always be 30/1.001, unlike the ATSC method, which inserts one byte pair for each field after the picture header and so makes frame rates of 24/1.001 possible for HD content. When a decoder does a 3:2 pull-down for NTSC output, the captions remain in sync.

DVD Caption Block
Length | Name | Type | Value
7 bits | caption_filler | alignment | 0x7f
1 bit | caption_odd_field | uimsbf | 1 or 0
8 bits | caption_first_byte | odd parity uimsbf | 0x80
8 bits | caption_second_byte | odd parity uimsbf | 0x80

DVB Transport Insertion

The packetized structure that is inserted before the H.222 video packet is as follows for a frame of associated video:

Length | Name | Type | Value
32 bits | private_stream_1_start_code | patterned bslbf | 0x000001BD
16 bits | PES_packet_length | uimsbf | 176
2 bits | PES_version | uimsbf | 2
1 bit | PES_priority | flag | 0
2 bits | PES_scrambling_control | uimsbf | 0
1 bit | data_alignment_indicator | flag | 1
2 bits | copyright | |
2 bits | PTS_DTS_flag | uimsbf | 2
6 bits | various_PES_flags | uimsbf | 0
8 bits | PES_header_data_length | uimsbf | 36
40 bits | PTS | uimsbf | varies
248 bits | stuffing_bytes | uimsbf | 255
8 bits | data_identifier | uimsbf | 153
8 bits | data_unit_id | uimsbf | 197
8 bits | data_unit_length | uimsbf | 3
2 bits | reserved_future_use | uimsbf | 3
1 bit | field_parity (CC1/2) | flag | 0
5 bits | line_offset | uimsbf | 21
16 bits | closed_captioning_data_block | uimsbf | 608 caption
8 bits | data_unit_id | uimsbf | 197
8 bits | data_unit_length | uimsbf | 3
2 bits | reserved_future_use | uimsbf | 3
1 bit | field_parity (CC3/4/XDS) | flag | 1
5 bits | line_offset | uimsbf | 21
16 bits | closed_captioning_data_block | uimsbf | 608 caption
8 bits | data_unit_id | uimsbf | 255
8 bits | data_unit_length | uimsbf | 124
124*8 bits | stuffing_bytes | uimsbf | 255

bslbf: bit string, left bit first ; uimsbf: unsigned integer, most significant bit first
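As a cross-check of the table above, the sketch below assembles the data field (data_identifier, two caption data units and one stuffing unit) and confirms that it adds up to the listed PES_packet_length of 176. The helper names are hypothetical and the PES header itself is not built; only its byte counts from the table are used.

```python
def dvb_cc_data_unit(field_parity: int, first: int, second: int,
                     line_offset: int = 21) -> bytes:
    """One caption data unit: id 197, length 3, flags byte, byte pair."""
    flags = (0b11 << 6) | ((field_parity & 1) << 5) | (line_offset & 0x1F)
    return bytes([197, 3, flags, first & 0xFF, second & 0xFF])


def dvb_cc_pes_payload(pair_a, pair_b) -> bytes:
    """data_identifier + two caption units + one stuffing unit."""
    payload = bytes([153])                      # data_identifier
    payload += dvb_cc_data_unit(0, *pair_a)     # field_parity 0 (CC1/2)
    payload += dvb_cc_data_unit(1, *pair_b)     # field_parity 1 (CC3/4/XDS)
    payload += bytes([255, 124]) + b"\xff" * 124
    return payload


def pes_packet_length(payload: bytes) -> int:
    """Bytes that follow the PES_packet_length field."""
    flag_bytes = 2              # version..copyright byte, PTS_DTS/flags byte
    header_length_byte = 1      # PES_header_data_length
    header_data = 36            # 5 PTS bytes + 31 stuffing bytes (248 bits)
    return flag_bytes + header_length_byte + header_data + len(payload)


if __name__ == "__main__":
    payload = dvb_cc_pes_payload((0x94, 0x20), (0x80, 0x80))
    print(len(payload), pes_packet_length(payload))
```

The stuffing unit pads every packet to the same size, so the length stays constant regardless of which caption bytes are carried.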

This structure was designed for any digital VBI data and was optimized to carry three or more 43-byte Teletext packets, e.g. a page header and two associated lines. For Teletext subtitles, the data_unit_id is set to 3. In this form, captions have to be separated into byte pairs spread over the frames in one second of video rather than grouped into one block as with the DVD structure. The same is true for Teletext subtitles with more than one line of text.

SDI/MXF SMPTE 291M Insertion

The packetized structure that is inserted before the SMPTE 259M active video frame or MXF essence video packet is coded as follows for a frame of associated video:

Length | Name | Type | Value
16 or 128 bits | ancillary_flag or | patterned bslbf or 7 uimsbf | 0xFFFF or
8 bits | data_id | uimsbf | 97
8 bits | secondary_data_id | uimsbf | 2
8 bits | data_count | uimsbf | varies
X*24 bits | caption_data_block | binary | free form

bslbf: bit string, left bit first ; uimsbf: unsigned integer, most significant bit first


This structure was designed for any digital audio or metadata that is to be synchronized with a video frame. SDI transports every eight bits in a 10-bit aligned packet, whereas MXF is byte aligned and replaces the ancillary flag bytes with a 128-bit header.

SDI/MXF Caption Block
Length | Name | Type | Value
1 bit | caption_odd_field (CC1/2 = 1; CC3/4 = 0) | |
2 bits | caption_reserved | uimsbf | 0
5 bits | caption_line_offset | uimsbf | 15
8 bits | caption_first_byte | odd parity uimsbf | 0x80
8 bits | caption_second_byte | odd parity uimsbf | 0x80

Extended Data Service

The EIA-608 data stream format includes Extended Data Service (XDS), which carries a variety of information about the transmission. It is all optional and includes:

  • program name
  • offensiveness rating (violence, sex, etc.)
  • program category (drama, game show, etc.)


There are three sets of characters that the EIA-608 stream can direct the receiver to display: basic characters, special characters, and extended characters. A single two-byte EIA-608 command (represented by a single VBI line) can specify two basic characters, one special character, or one extended character.

Extended characters are a later addition to the standard and their decoding is optional.

EIA-608 provides controls for the color of the foreground and background of the text, underlining, blinking, and italics. The default color scheme is white characters on a black background, all opaque.

The Transparent Space special character implies a transparent background even in the absence of any background control commands. As the foreground of this character is a blank space, it effectively produces a gap in the closed caption text.

Non-Caption Data

This is used to either pad out the field line when no captions are sent or for the eXtended Data Service.

Basic North American character set

A command with bits 13 or 14 on directs the receiver to display two basic characters at the current cursor position for the current mode (closed caption or text). Each character is a code point (identifies the character to display), as follows.

The code is almost identical to ASCII; the exceptions are code points 2A, 5C, 5E, 5F, 60 and 7B through 7F.

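Binary | Decimal | Hex | Character
0010 0000 | 32 | 20 | (SP)
0010 0001 | 33 | 21 | !
0010 0010 | 34 | 22 | "
0010 0011 | 35 | 23 | #
0010 0100 | 36 | 24 | $
0010 0101 | 37 | 25 | %
0010 0110 | 38 | 26 | &
0010 0111 | 39 | 27 | '
0010 1000 | 40 | 28 | (
0010 1001 | 41 | 29 | )
0010 1010 | 42 | 2A | á
0010 1011 | 43 | 2B | +
0010 1100 | 44 | 2C | ,
0010 1101 | 45 | 2D | -
0010 1110 | 46 | 2E | .
0010 1111 | 47 | 2F | /
0011 0000 | 48 | 30 | 0
0011 0001 | 49 | 31 | 1
0011 0010 | 50 | 32 | 2
0011 0011 | 51 | 33 | 3
0011 0100 | 52 | 34 | 4
0011 0101 | 53 | 35 | 5
0011 0110 | 54 | 36 | 6
0011 0111 | 55 | 37 | 7
0011 1000 | 56 | 38 | 8
0011 1001 | 57 | 39 | 9
0011 1010 | 58 | 3A | :
0011 1011 | 59 | 3B | ;
0011 1100 | 60 | 3C | <
0011 1101 | 61 | 3D | =
0011 1110 | 62 | 3E | >
0011 1111 | 63 | 3F | ?
0100 0000 | 64 | 40 | @
0100 0001 | 65 | 41 | A
0100 0010 | 66 | 42 | B
0100 0011 | 67 | 43 | C
0100 0100 | 68 | 44 | D
0100 0101 | 69 | 45 | E
0100 0110 | 70 | 46 | F
0100 0111 | 71 | 47 | G
0100 1000 | 72 | 48 | H
0100 1001 | 73 | 49 | I
0100 1010 | 74 | 4A | J
0100 1011 | 75 | 4B | K
0100 1100 | 76 | 4C | L
0100 1101 | 77 | 4D | M
0100 1110 | 78 | 4E | N
0100 1111 | 79 | 4F | O
0101 0000 | 80 | 50 | P
0101 0001 | 81 | 51 | Q
0101 0010 | 82 | 52 | R
0101 0011 | 83 | 53 | S
0101 0100 | 84 | 54 | T
0101 0101 | 85 | 55 | U
0101 0110 | 86 | 56 | V
0101 0111 | 87 | 57 | W
0101 1000 | 88 | 58 | X
0101 1001 | 89 | 59 | Y
0101 1010 | 90 | 5A | Z
0101 1011 | 91 | 5B | [
0101 1100 | 92 | 5C | é
0101 1101 | 93 | 5D | ]
0101 1110 | 94 | 5E | í
0101 1111 | 95 | 5F | ó
0110 0000 | 96 | 60 | ú
0110 0001 | 97 | 61 | a
0110 0010 | 98 | 62 | b
0110 0011 | 99 | 63 | c
0110 0100 | 100 | 64 | d
0110 0101 | 101 | 65 | e
0110 0110 | 102 | 66 | f
0110 0111 | 103 | 67 | g
0110 1000 | 104 | 68 | h
0110 1001 | 105 | 69 | i
0110 1010 | 106 | 6A | j
0110 1011 | 107 | 6B | k
0110 1100 | 108 | 6C | l
0110 1101 | 109 | 6D | m
0110 1110 | 110 | 6E | n
0110 1111 | 111 | 6F | o
0111 0000 | 112 | 70 | p
0111 0001 | 113 | 71 | q
0111 0010 | 114 | 72 | r
0111 0011 | 115 | 73 | s
0111 0100 | 116 | 74 | t
0111 0101 | 117 | 75 | u
0111 0110 | 118 | 76 | v
0111 0111 | 119 | 77 | w
0111 1000 | 120 | 78 | x
0111 1001 | 121 | 79 | y
0111 1010 | 122 | 7A | z
0111 1011 | 123 | 7B | ç
0111 1100 | 124 | 7C | ÷
0111 1101 | 125 | 7D | Ñ
0111 1110 | 126 | 7E | ñ
0111 1111 | 127 | 7F | SB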

In the table above, SB represents a solid block. The apostrophe (code 27), which may originally have been intended to be a neutral apostrophe as in ASCII, is now recommended to be rendered as a right single quotation mark (Unicode U+2019). For a neutral single quote/apostrophe, the plain single quote from the extended character set should be used.[3]
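A decoder for the basic set only needs ASCII plus the handful of exceptions in the table. The sketch below is illustrative: the names are hypothetical, the solid block is rendered as U+2588, and the apostrophe is mapped to U+2019 following the rendering recommendation mentioned above. Null bytes are treated as padding and skipped.

```python
# Code points that differ from ASCII in the basic North American set.
BASIC_EXCEPTIONS = {
    0x27: "\u2019",  # apostrophe, rendered as right single quotation mark
    0x2A: "á",
    0x5C: "é",
    0x5E: "í",
    0x5F: "ó",
    0x60: "ú",
    0x7B: "ç",
    0x7C: "÷",
    0x7D: "Ñ",
    0x7E: "ñ",
    0x7F: "\u2588",  # solid block (SB)
}


def decode_basic_pair(first: int, second: int) -> str:
    """Decode a byte pair that carries two basic characters."""
    text = []
    for byte in (first & 0x7F, second & 0x7F):   # drop the parity bit
        if byte == 0x00:
            continue                             # null padding
        if byte < 0x20:
            raise ValueError("control code, not a basic character pair")
        text.append(BASIC_EXCEPTIONS.get(byte, chr(byte)))
    return "".join(text)


if __name__ == "__main__":
    # 'H' and 'i' with odd parity applied are 0xC8 and 0xE9.
    print(decode_basic_pair(0xC8, 0xE9))
```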

Special North American character set

The only real use in North America of this set is the use of the Eighth note character to denote changes from spoken dialogue to singing or musical only scenes.

It is an acceptable broadcast engineering practice, when translating EIA-608 to Teletext for PAL compatible countries, to substitute a number sign (#) for this character because of its similarity to a musical sharp.

A command to display a special character has a first byte of 0x11 or 0x19 (depending upon channel). The second byte is a code point in the range 0x30-0x3F as follows.

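Binary | Decimal | Hex | Character
0011 0000 | 48 | 30 | ®
0011 0001 | 49 | 31 | °
0011 0010 | 50 | 32 | ½
0011 0011 | 51 | 33 | ¿
0011 0100 | 52 | 34 | ™
0011 0101 | 53 | 35 | ¢
0011 0110 | 54 | 36 | £
0011 0111 | 55 | 37 | ♪
0011 1000 | 56 | 38 | à
0011 1001 | 57 | 39 | TS
0011 1010 | 58 | 3A | è
0011 1011 | 59 | 3B | â
0011 1100 | 60 | 3C | ê
0011 1101 | 61 | 3D | î
0011 1110 | 62 | 3E | ô
0011 1111 | 63 | 3F | û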

TM is short for unregistered trademark and should be represented in superscript (™). TS in the table above represents a 'transparent space' or non-breaking space. Finally, the Eighth note (♪) is used to denote singing or background music in captions.
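The special set can be decoded with a 16-entry lookup indexed by the second byte. This is a sketch with hypothetical names; the transparent space is represented here as a non-breaking space (U+00A0), which is one of the two readings given above and only an approximation of a truly transparent cell.

```python
# Index 0 corresponds to second byte 0x30, index 15 to 0x3F.
SPECIAL_CHARS = "®°½¿™¢£♪à\u00a0èâêîôû"


def decode_special(first: int, second: int):
    """Return the special character for a byte pair, or None if it is not one."""
    first &= 0x7F                      # drop the parity bits
    second &= 0x7F
    if first not in (0x11, 0x19):      # first byte depends on the channel
        return None
    if not 0x30 <= second <= 0x3F:
        return None
    return SPECIAL_CHARS[second - 0x30]


if __name__ == "__main__":
    print(decode_special(0x11, 0x37))  # eighth note on the first channel
```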

Extended Western European character set

These extended character sets are rarely used due to most European countries using the BBC Ceefax based Teletext system.

The Ceefax system is more prone to character errors because of the greater number of data bits (337 versus 16) encoded per VBI field; these errors occur on noise-prone analog transmissions or connections.

  • A command to display an extended Spanish/French or miscellaneous character has a first byte of 0x12 or 0x1A (depending upon channel).
  • A command to display an extended Portuguese/German/Danish character has a first byte of 0x13 or 0x1B (depending upon channel).

The second byte is a code point in the range 0x20-0x3F, as follows:

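Extended Spanish/Miscellaneous
Binary | Decimal | Hex | Character
0010 0000 | 32 | 20 | Á
0010 0001 | 33 | 21 | É
0010 0010 | 34 | 22 | Ó
0010 0011 | 35 | 23 | Ú
0010 0100 | 36 | 24 | Ü
0010 0101 | 37 | 25 | ü
0010 0110 | 38 | 26 | ´
0010 0111 | 39 | 27 | ¡
0010 1000 | 40 | 28 | *
0010 1001 | 41 | 29 | '
0010 1010 | 42 | 2A |
0010 1011 | 43 | 2B | ©
0010 1100 | 44 | 2C | SM
0010 1101 | 45 | 2D | ·
0010 1110 | 46 | 2E | “
0010 1111 | 47 | 2F | ”

Extended French
Binary | Decimal | Hex | Character
0011 0000 | 48 | 30 | À
0011 0001 | 49 | 31 | Â
0011 0010 | 50 | 32 | Ç
0011 0011 | 51 | 33 | È
0011 0100 | 52 | 34 | Ê
0011 0101 | 53 | 35 | Ë
0011 0110 | 54 | 36 | ë
0011 0111 | 55 | 37 | Î
0011 1000 | 56 | 38 | Ï
0011 1001 | 57 | 39 | ï
0011 1010 | 58 | 3A | Ô
0011 1011 | 59 | 3B | Ù
0011 1100 | 60 | 3C | ù
0011 1101 | 61 | 3D | Û
0011 1110 | 62 | 3E | «
0011 1111 | 63 | 3F | »

Extended Portuguese
Binary | Decimal | Hex | Character
0010 0000 | 32 | 20 | Ã
0010 0001 | 33 | 21 | ã
0010 0010 | 34 | 22 | Í
0010 0011 | 35 | 23 | Ì
0010 0100 | 36 | 24 | ì
0010 0101 | 37 | 25 | Ò
0010 0110 | 38 | 26 | ò
0010 0111 | 39 | 27 | Õ
0010 1000 | 40 | 28 | õ
0010 1001 | 41 | 29 | {
0010 1010 | 42 | 2A | }
0010 1011 | 43 | 2B |
0010 1100 | 44 | 2C | ^
0010 1101 | 45 | 2D | _
0010 1110 | 46 | 2E |
0010 1111 | 47 | 2F | ~

Extended German/Danish
Binary | Decimal | Hex | Character
0011 0000 | 48 | 30 | Ä
0011 0001 | 49 | 31 | ä
0011 0010 | 50 | 32 | Ö
0011 0011 | 51 | 33 | ö
0011 0100 | 52 | 34 | ß
0011 0101 | 53 | 35 | ¥
0011 0110 | 54 | 36 | ¤
0011 0111 | 55 | 37 |
0011 1000 | 56 | 38 | Å
0011 1001 | 57 | 39 | å
0011 1010 | 58 | 3A | Ø
0011 1011 | 59 | 3B | ø
0011 1100 | 60 | 3C |
0011 1101 | 61 | 3D |
0011 1110 | 62 | 3E |
0011 1111 | 63 | 3F |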

SM is short for service mark and should be represented in superscript. The single quote mark is a curly left and double quote marks are curly left and right. The plus signs refer to top left, top right, lower left and lower right corners for box drawing.

Non-Western Norpak Character Sets

When these sets are used, the standard and extended character sets are unavailable in favor of the following predefined sets, and care must be taken not to emulate any control commands. This is an extension submitted to the CEC by Norpak, who made a similar extension to the Teletext format for the Chinese market. The main use has been to provide double byte code point captioning to the Japanese, Taiwanese and South Korean markets. A command to switch character sets has a first byte of 0x17 or 0x1F (depending upon channel). The second byte is a character set reference in the range 0x24-0x2A, as follows:

Binary | Decimal | Hex | Set in Use
0010 0100 | 36 | 24 | Standard
0010 0101 | 37 | 25 | Standard Double Height
0010 0110 | 38 | 26 | Decoder Specific 1
0010 0111 | 39 | 27 | Decoder Specific 2
0010 1000 | 40 | 28 | China's GB 2312 (1980)
0010 1001 | 41 | 29 | Korea's KS C 5601 (1987)
0010 1010 | 42 | 2A | Loadable

Control commands

Bits 15 and 7 are always odd parity bits. Bit 11 is always the channel bit.

A preamble address code, with bits 15, 11 and 7 masked as defined above, can be interpreted from the following table:

Bits | Meaning
14-13 | always 0
12 | always 1
10-8 | row position indicator
6 | always 1
5 | row position indicator
4-1 | text attribute indicator
0 | underline indicator

The row bits specify which of the fifteen screen rows should contain the caption text: row 11 (0000), 1 (0010), 2 (0011), 3, 4, 12, 13, 14, 15, 5, 6, 7, 8, 9, or 10 (1111).

The attributes bits allow 16 possibilities, which are: white (0000), green, blue, cyan, red, yellow, magenta, italics, indent 0, indent 4, indent 8, indent 12, indent 16, indent 20, indent 24, indent 28 (1111).
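The preamble address code layout above can be unpacked as follows. This is a sketch rather than a reference decoder: the names are hypothetical, the two bytes are taken as bits 15-8 and 7-0 of the 16-bit word, and the row table is transcribed from the list above (code 0001 is not assigned a row there, so it is rejected).

```python
# 4-bit row code (word bits 10-8 followed by bit 5) -> screen row
PAC_ROWS = {
    0b0000: 11, 0b0010: 1, 0b0011: 2, 0b0100: 3, 0b0101: 4,
    0b0110: 12, 0b0111: 13, 0b1000: 14, 0b1001: 15, 0b1010: 5,
    0b1011: 6, 0b1100: 7, 0b1101: 8, 0b1110: 9, 0b1111: 10,
}

# 4-bit attribute code (word bits 4-1), 0000 through 1111
PAC_ATTRS = (["white", "green", "blue", "cyan", "red", "yellow",
              "magenta", "italics"]
             + ["indent %d" % n for n in range(0, 32, 4)])


def decode_pac(first: int, second: int):
    """Decode a preamble address code, or return None if the pair is not one."""
    first &= 0x7F                        # drop parity (word bit 15)
    second &= 0x7F                       # drop parity (word bit 7)
    if (first & 0x70) != 0x10:           # bits 14-13 are 0, bit 12 is 1
        return None
    if not second & 0x40:                # bit 6 is always 1
        return None
    row_code = ((first & 0x07) << 1) | ((second >> 5) & 1)
    if row_code not in PAC_ROWS:
        return None
    return {
        "channel_bit": (first >> 3) & 1,             # word bit 11
        "row": PAC_ROWS[row_code],
        "attribute": PAC_ATTRS[(second >> 1) & 0x0F],
        "underline": bool(second & 1),
    }


if __name__ == "__main__":
    print(decode_pac(0x94, 0x70))        # row 15, indent 0
```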

For a midrow code these are as follows: Bits 14, 13, 10, 9, 6 and 4 are always 0; bits 12, 8 and 5 are always 1. Bits 3, 2 and 1 form the color attribute 0001X10X (see the listing of attributes). Bit 0 indicates underline.

For other control codes these are as follows: Bits 14, 13, 9, 6 and 4 are always 0, bits 12, 10 and 5 are always 1. Bit 8 chooses between line 21 and 284. Bits 3, 2, 1 and 0 identify the particular action.

The command bits allow 16 possibilities, which are: resume caption loading (0000), backspace (0001), alarm off (0010), alarm on (0011), delete to end of row (0100), roll-up captions 2 rows, roll-up captions 3 rows, roll-up captions 4 rows, flash on (0.25 seconds once per second), resume direct captioning, text restart, resume text display, erase displayed memory, carriage return, erase nondisplayed memory, end of caption (1111).

For tabs these are as follows: Bits 14, 13, 6, 4, 3, 2 are always 0, bits 12, 10, 9, 8, 5 are always 1. Bits 1 and 0 determine the number of tab offsets.

With the parity bits already masked off, the two data bytes map to the following commands:

cc_data 0 (hex) | cc_data 0 (binary) | cc_data 1 (hex) | cc_data 1 (binary) | Command
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x20 | 00100000 | resume caption loading (start buffered caption text)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x21 | 00100001 | backspace (overwrite last char)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x22 | 00100010 | alarm off
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x23 | 00100011 | alarm on
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x24 | 00100100 | delete to end of row (clear line)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x25 | 00100101 | roll up 2 (scroll size)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x26 | 00100110 | roll up 3 (scroll size)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x27 | 00100111 | roll up 4 (scroll size)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x28 | 00101000 | flashes captions on (0.25 seconds once per second)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x29 | 00101001 | resume direct captioning (start caption text)
0x14 (TXT1) or 0x1C (TXT2) or 0x15 (TXT3) or 0x1D (TXT4) | 0001C10F | 0x2A | 00101010 | text restart (start non-caption text)
0x14 (TXT1) or 0x1C (TXT2) or 0x15 (TXT3) or 0x1D (TXT4) | 0001C10F | 0x2B | 00101011 | resume text display (resume non-caption text)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x2C | 00101100 | erase display memory (clear screen)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x2D | 00101101 | carriage return (scroll lines up)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x2E | 00101110 | erase non displayed memory (clear buffer)
0x14 (CC1) or 0x1C (CC2) or 0x15 (CC3) or 0x1D (CC4) | 0001C10F | 0x2F | 00101111 | end of caption (display buffer)
0x17 (CC1/3) or 0x1F (CC2/4) | 0001C111 | 0x21 | 00100001 | tab offset 1 (add spacing)
0x17 (CC1/3) or 0x1F (CC2/4) | 0001C111 | 0x22 | 00100010 | tab offset 2 (add spacing)
0x17 (CC1/3) or 0x1F (CC2/4) | 0001C111 | 0x23 | 00100011 | tab offset 3 (add spacing)
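The command table above reduces to a mask on the first byte and a lookup on the low four bits of the second. The following sketch uses hypothetical names and follows the table's bit patterns 0001C10F and 0001C111, where C is the channel bit and F the field bit; it does not handle preamble address codes or midrow codes.

```python
# Second byte 0x20..0x2F, indexed by its low four bits.
COMMANDS = [
    "resume caption loading", "backspace", "alarm off", "alarm on",
    "delete to end of row", "roll up 2", "roll up 3", "roll up 4",
    "flash on", "resume direct captioning", "text restart",
    "resume text display", "erase displayed memory", "carriage return",
    "erase non-displayed memory", "end of caption",
]


def decode_control(first: int, second: int):
    """Decode a miscellaneous control code or tab offset, else return None."""
    first &= 0x7F                                # drop the parity bits
    second &= 0x7F
    channel_bit = (first >> 3) & 1               # the C bit
    # 0001C10F: ignore the C bit (0x08) and the F bit (0x01)
    if (first & 0x76) == 0x14 and 0x20 <= second <= 0x2F:
        return {"command": COMMANDS[second & 0x0F],
                "channel_bit": channel_bit,
                "field_bit": first & 1}
    # 0001C111: tab offsets, second byte 0x21..0x23
    if (first & 0x77) == 0x17 and 0x21 <= second <= 0x23:
        return {"command": "tab offset %d" % (second & 0x03),
                "channel_bit": channel_bit,
                "field_bit": None}
    return None


if __name__ == "__main__":
    # 0x94 0x20 is 0x14 0x20 with odd parity applied.
    print(decode_control(0x94, 0x20))
    print(decode_control(0x94, 0x2F))
```

Masking the parity bit first keeps the lookup independent of whether the caller has already stripped it.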


  1. "SCTE 21 2012 - STANDARD FOR CARRIAGE OF VBI DATA IN CABLE DIGITAL TRANSPORT STREAMS" (PDF). Society of Cable Telecommunications Engineers. SCTE 21: 13. 2012. Retrieved 4 October 2012.
  2. https://evertz.com/resources/eia_608_708_cc.pdf
  3. CEA-608-E R-2014 standard

External links

  • Closed caption decoder requirements for analog television receivers – 47 C.F.R. 15.119 – From the F.C.C.
  • Index of requirements documents in text and PDF for 47 C.F.R. 15 – use the 119 link – From the F.C.C.
  • CEA-608-E R-2014 – Latest revision of the standard from the Consumer Electronics Association
Retrieved from 'https://en.wikipedia.org/w/index.php?title=EIA-608&oldid=986512023'