Department of Information Engineering (DEI), University of Padova, Italy

Joint source-channel video coding for H.264 using FEC

Simone Milani (simone.milani@dei.unipd.it), DEI, University of Padova
Gian Antonio Mian (mian@dei.unipd.it), DEI, University of Padova
Luca Celetto (luca.celetto@st.com), STMicroelectronics
Andrea Vitali (andrea.vitali@st.com), STMicroelectronics
Outline

- Video transmission over lossy channels
- The zeros model for the H.264 encoder
- FEC matrices of RTP packets
- An adaptive FEC optimization strategy
- Experimental results
- Conclusions
Video Transmission Over Packet-Switching Networks

To transmit a video sequence over a limited-bandwidth channel (e.g. a radio channel), the amount of data must be reduced with a lossy coding algorithm.

[Figure: coding (7.6 Mbit/s down to 64 kbit/s), 64 kbit/s channel, decoding]

Different rate control algorithms provide different video quality at the same transmission rate.
The H.264 video coding standard

H.264 standardizes a hybrid transform video coder with spatial and temporal prediction.

Frame format    Size
QCIF            176 X 144
CIF             352 X 288

Input signal: YUV 4:2:0, 8 bit/sample

Macroblock: 16 X 16 samples, which can be split into 4 blocks of 8 X 8 samples or into 16 smaller blocks of 4 X 4 samples.
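As a worked check of the numbers on these slides, the sketch below computes the uncompressed bit rate of a QCIF sequence in YUV 4:2:0 at 8 bit/sample. The frame rate of 25 frames/s is an assumption (the slides do not state it); with that value the result matches the 7.6 Mbit/s quoted for the uncoded source. The function names are illustrative only.

```python
def raw_bitrate(width, height, fps, bits_per_sample=8):
    """Uncompressed bit rate (bit/s) of a YUV 4:2:0 sequence."""
    luma = width * height
    # U and V planes are each subsampled by 2 horizontally and vertically.
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bits_per_sample * fps


def macroblocks_per_frame(width, height, mb_size=16):
    """Number of 16x16 macroblocks covering a frame."""
    return (width // mb_size) * (height // mb_size)


# Assumption: 25 frames/s (not stated in the slides).
qcif_rate = raw_bitrate(176, 144, fps=25)
compression_ratio = qcif_rate / 64_000  # target channel rate: 64 kbit/s

print(qcif_rate)                        # bit/s of the uncoded QCIF source
print(round(compression_ratio, 1))      # reduction needed to fit 64 kbit/s
print(macroblocks_per_frame(176, 144))  # macroblocks in one QCIF frame
```

Under this assumption the coder has to shrink the source by more than two orders of magnitude, which is why a lossy algorithm is needed.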
Slice Partitioning

Each picture is packetized differently depending on the coded frame size and the slice partition. Slice partitioning can be performed in two ways:

- Fixed number of MBs in a slice
- Nearly fixed number of bytes in a slice

[Figure: input frame, partitioned with a fixed # of MBs and with a fixed # of bytes]
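The two partitioning modes can be illustrated with a minimal sketch. It assumes the coded size of each macroblock is already known; the sizes below are hypothetical values, and the greedy byte-limited rule is one simple way to obtain a nearly fixed slice size, not necessarily the one used by the encoder in these experiments.

```python
def slices_fixed_mbs(mb_bytes, mbs_per_slice):
    """Group macroblocks into slices that hold a fixed number of MBs."""
    return [mb_bytes[i:i + mbs_per_slice]
            for i in range(0, len(mb_bytes), mbs_per_slice)]


def slices_fixed_bytes(mb_bytes, max_bytes):
    """Greedily close a slice before adding an MB would exceed max_bytes."""
    slices, current, size = [], [], 0
    for b in mb_bytes:
        if current and size + b > max_bytes:
            slices.append(current)
            current, size = [], 0
        current.append(b)
        size += b
    if current:
        slices.append(current)
    return slices


# Hypothetical coded sizes (bytes) of nine consecutive macroblocks.
mb_bytes = [40, 10, 10, 80, 5, 5, 5, 60, 20]

by_count = slices_fixed_mbs(mb_bytes, 3)
by_bytes = slices_fixed_bytes(mb_bytes, 100)

print([sum(s) for s in by_count])  # slice sizes in bytes vary with content
print([len(s) for s in by_bytes])  # MB count per slice varies instead
```

With a fixed MB count the packet length follows the local complexity of the picture; with a byte limit the packet length is roughly constant and the area covered by each slice changes.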
Packetization

After the input picture is coded, the bit stream is split into packets that are sent across the network using RTP (Real-time Transport Protocol).

- Usually, each packet carries one slice
- In the simplest case, each packet carries one frame

[Figure: RTP packets sent across the network]
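A minimal sketch of the "one packet per slice" case follows. It builds only the 12-byte fixed RTP header (version 2, no padding, no extension, no CSRC entries) and ignores the H.264-specific payload format. The payload type 96 (a value from the dynamic range), the SSRC, the timestamp, and the choice to set the marker bit on the last slice of a frame are assumptions made for illustration.

```python
import struct


def rtp_packet(payload, seq, timestamp, ssrc, payload_type=96, marker=False):
    """Prepend the 12-byte fixed RTP header to a payload."""
    byte0 = 2 << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def packetize(slices, timestamp, ssrc, first_seq=0):
    """One RTP packet per slice; all slices of a frame share one timestamp."""
    last = len(slices) - 1
    return [rtp_packet(s, first_seq + i, timestamp, ssrc, marker=(i == last))
            for i, s in enumerate(slices)]


# Hypothetical coded slices of one frame.
packets = packetize([b"slice-0", b"slice-1", b"slice-2"],
                    timestamp=90_000, ssrc=0x1234)
print([len(p) for p in packets])
```

The sequence number in each header is what lets the receiver detect which packets, and therefore which slices, were lost.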
Joint source-channel video coding

Packets may get lost in the network because of:

- delays
- transmission errors
- discarding (network congestion, ...)

The end user perceives a packet loss as a loss of visual information (i.e. a slice, a frame, or more than one).

[Block diagram: Transmission Control, Source Coding, Channel Coding, channel, Channel Decoding, Source Decoding]
Error concealment at decoder side

Error concealment techniques estimate the lost part of the picture using temporal and spatial correlation. Slice partitioning allows the decoder to perform better error concealment, since the lost portion of an image can be recovered through interpolation techniques.

The types of interpolation are:

- Temporal: estimation performed from the corresponding MBs of the previous frames
- Spatial: estimation performed using the neighboring blocks in the current frame
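The two kinds of concealment can be sketched on a toy luma plane stored as a list of rows. This is a simplified illustration, not the concealment used by the decoder in these experiments: the temporal variant copies the co-located rows of the previous frame without motion compensation, and the spatial variant interpolates linearly between the nearest received rows above and below the lost region.

```python
def conceal_temporal(frame, prev_frame, lost_rows):
    """Replace lost rows with the co-located rows of the previous frame."""
    out = [row[:] for row in frame]
    for r in lost_rows:
        out[r] = prev_frame[r][:]
    return out


def conceal_spatial(frame, first, last):
    """Fill rows first..last (inclusive) from the received rows around them."""
    top, bottom = first - 1, last + 1
    if top < 0 and bottom >= len(frame):
        raise ValueError("no received rows to interpolate from")
    out = [row[:] for row in frame]
    for r in range(first, last + 1):
        for c in range(len(frame[0])):
            if top < 0:                    # lost region touches the top edge
                out[r][c] = frame[bottom][c]
            elif bottom >= len(frame):     # lost region touches the bottom edge
                out[r][c] = frame[top][c]
            else:                          # distance-weighted average
                a, b = frame[top][c], frame[bottom][c]
                out[r][c] = round(((bottom - r) * a + (r - top) * b)
                                  / (bottom - top))
    return out


# Toy 4x2 frame whose two middle rows were lost (shown here as zeros).
frame = [[10, 10], [0, 0], [0, 0], [40, 40]]
prev = [[9, 9], [19, 21], [29, 31], [39, 41]]

print(conceal_temporal(frame, prev, [1, 2]))
print(conceal_spatial(frame, 1, 2))
```

Spatial interpolation needs correctly received neighbors in the same frame, which is why losing a single slice is easier to conceal than losing a whole frame.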
Channel coding using FEC

To increase robustness, the coded stream is protected with channel coding.

In general, a FEC code can be seen as a function that maps k-tuples of symbols into n-tuples of symbols. The decoder chooses the codeword $\hat{c}$ that minimizes the Hamming distance $d_m(\hat{c}, c_r)$, where $c_r$ is the received word.

[Figure: example with the Hamming code (7,4), word 1 0 1 0 1 0 1]
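Minimum-distance decoding can be illustrated with a small (7,4) Hamming code. The generator matrix below is one common systematic choice and is an assumption; the slides do not specify which form of the code they use. The decoder searches all 16 codewords exhaustively, which is practical only for toy codes.

```python
from itertools import product

# Assumed systematic generator matrix: 4 data bits followed by 3 parity bits.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(msg):
    """Map a 4-tuple of bits to a 7-tuple codeword (arithmetic modulo 2)."""
    return tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))


# All 16 codewords, each mapped back to its message.
CODEBOOK = {encode(m): m for m in product((0, 1), repeat=4)}


def hamming_distance(a, b):
    """Number of positions in which two words differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(received):
    """Return the codeword closest to the received word, and its message."""
    best = min(CODEBOOK, key=lambda c: hamming_distance(c, received))
    return best, CODEBOOK[best]


codeword = encode((1, 0, 1, 0))
corrupted = list(codeword)
corrupted[2] ^= 1                      # flip one bit in the channel
print(codeword)                        # (1, 0, 1, 0, 1, 0, 1)
print(decode(tuple(corrupted)))        # recovers the codeword and (1, 0, 1, 0)
```

Because the minimum distance of this code is 3, any single bit error leaves the transmitted codeword as the unique closest one.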
FEC Matrix codes

FEC can be performed by:

- Filling a matrix with RTP packets
- Performing a convolutional code computation
- Decoding at the receiver whenever the number of losses is lower than a threshold

[Figure: packet video source feeding a matrix of source code packets and FEC code packets]
FEC decoding at the receiver

Some information can be lost before it reaches the receiver. The FEC decoder therefore:

- Reconstructs the matrix
- Checks which information was lost
- Tries to recover the lost packets by applying a FEC decoding algorithm column-wise

Error concealment techniques then improve the visual quality by estimating the part of the picture that is still missing, using temporal and spatial correlation:

- Temporal: estimation performed from the corresponding MBs of the previous frames
- Spatial: estimation performed using the neighboring blocks in the current frame
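The matrix construction and the column-wise recovery described on these two slides can be sketched end to end. To keep the example short, a single XOR parity row stands in for the code used in the experiments: it is a much weaker code that can rebuild at most one lost row per matrix, so the loss threshold here is 1. Packets are zero-padded to a common length; how the original lengths are signalled is left out.

```python
def build_matrix(packets):
    """Place one packet per row, zero-padded to the longest packet."""
    width = max(len(p) for p in packets)
    return [p + bytes(width - len(p)) for p in packets]


def parity_row(rows):
    """Column-wise XOR of all rows (stand-in for a real FEC code)."""
    out = bytearray(len(rows[0]))
    for row in rows:
        for i, b in enumerate(row):
            out[i] ^= b
    return bytes(out)


def fec_encode(packets):
    """Return the source rows followed by one FEC row."""
    rows = build_matrix(packets)
    return rows + [parity_row(rows)]


def fec_decode(received):
    """received: rows in order, with None marking a lost packet.

    Returns the padded source rows, or None when more rows are lost than
    the code can recover (error concealment has to take over).
    """
    lost = [i for i, r in enumerate(received) if r is None]
    if len(lost) > 1:
        return None
    rows = list(received)
    if lost:
        present = [r for r in rows if r is not None]
        rows[lost[0]] = parity_row(present)  # XOR of survivors = missing row
    return rows[:-1]                         # drop the FEC row


packets = [b"slice-0", b"s1", b"slice-two"]  # hypothetical payloads
coded = fec_encode(packets)

one_loss = coded[:1] + [None] + coded[2:]
print(fec_decode(one_loss) == build_matrix(packets))  # True

two_losses = [None, None] + coded[2:]
print(fec_decode(two_losses))                         # None
```

A stronger code changes only `parity_row` and the threshold test; the matrix layout and the column-wise processing stay the same.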
Experimental Results (1/4)

[Plot: results for RS(239,255). PSNR Luma (dB) versus Rate (kbit/s), 0 to 5000 kbit/s, with one fitted curve per loss probability: P_loss = 0, 0.005, 0.01, 0.05, 0.1]

Parameter            Value
Sequence             Teeny
Frame size           352 X 288 (CIF)
GOP                  15 frames
GOP Struct           IPBBPBB
Coded frame          60
#MB/slice            14
QP                   Fixed
SDU size             3/5 E[pkt_length]
Channel              Random SDU loss
Matrix size          255 X 3 X E[pkt_length]
Error Concealment    Bilinear Interpolation
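The quality measure on the vertical axis of these plots is the PSNR of the luma component. A minimal sketch of the usual definition for 8-bit samples follows; the sample values are hypothetical, and the exact averaging over frames used for the curves is not stated in the slides.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized sequences of 8-bit samples."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical luma samples: every decoded sample is off by 10 levels.
print(round(psnr([100, 100, 100, 100], [110, 90, 110, 90]), 2))
```

A lost slice that is badly concealed raises the mean squared error sharply, which is why the curves drop as P_loss grows.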
Experimental Results (2/4)

[Plot: results with P_{SDU loss} = 0.01. PSNR Luma (dB) versus Rate (kbit/s), 0 to 5000 kbit/s, with fitted curves for RS(223,255), RS(231,255), RS(239,255), RS(247,255) and no FEC]

[Plot: results with P_{SDU loss} = 0.05. PSNR Luma (dB) versus Rate (kbit/s), 0 to 5000 kbit/s, with fitted curves for RS(223,255), RS(231,255), RS(239,255), RS(247,255) and no FEC]
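The trade-off between the four codes can be made concrete with a short calculation. The sketch reads RS(k,255) as k information symbols in a 255-symbol codeword, which can recover up to 255 - k erased symbols. It further assumes that each symbol of a codeword travels in a different packet and that packets are lost independently with probability p; this is a simplification of the matrix layout used in the experiments.

```python
from math import comb


def overhead(k, n=255):
    """Fraction of the transmitted symbols spent on FEC."""
    return (n - k) / n


def p_decode_failure(k, p_loss, n=255):
    """Probability that more than n - k of the n symbols are erased."""
    t = n - k
    p_ok = sum(comb(n, i) * p_loss ** i * (1 - p_loss) ** (n - i)
               for i in range(t + 1))
    return max(0.0, 1.0 - p_ok)  # clamp tiny negative rounding errors


for k in (223, 231, 239, 247):
    print(f"RS({k},255): overhead {overhead(k):.1%}, "
          f"failure at p=0.01: {p_decode_failure(k, 0.01):.2e}, "
          f"at p=0.05: {p_decode_failure(k, 0.05):.2e}")
```

Under these assumptions a stronger code fails less often but leaves fewer bits for the source coder at a given total rate, which is the tension the plots explore.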
Experimental Results (3/4)

[Plot: results with P_{SDU loss} = 0.1. PSNR Luma (dB) versus Rate (kbit/s), 0 to 800 kbit/s, with fitted curves for RS(223,255), RS(231,255), RS(239,255), RS(247,255) and no FEC]

Parameter        Value
Sequence         Foreman
Frame size       176 X 144 (QCIF)
GOP Length       15 frames
GOP Structure    IBBP
#MB/slice        99
QP               Fixed
Coded frame      60
Experimental Results (4/4)

[Plot: results for RS(239,255). PSNR Luma (dB) versus P_loss, 0 to 0.1, with one curve per quantization parameter: QP = 18, 24, 30, 36, 42]

[Plot: results for RS(223,255). PSNR Luma (dB) versus P_loss, 0 to 0.1, with one curve per quantization parameter: QP = 18, 24, 30, 36, 42]
Conclusion

- Rate modeling through the (ρ, e_q) model is an interesting solution for rate control, in terms of both reduced computational complexity and coding performance.
- The proposed algorithm allows the encoder to increase the average coding quality and to reduce the artifacts caused by excessive variability of the quantization parameter QP.
- Channel coding with FEC codes:
  - permits transmitting a video sequence over a lossy channel
  - is able to recover lost packets
  - increases the performance of the error concealment techniques
- The choice of the optimal code depends on many time-varying factors, such as the input video sequence, the channel loss percentage, and quality or delay requirements.
Next Steps

- In the future, we will study an adaptive approach that should be able to partition the available bandwidth according to:
  - the application
  - the FEC overhead
  - the type of sequence (i.e. the type of scene)
  - the channel loss percentage
- Implementation of a low-cost RD-optimization technique to reduce the computational requirements.
- Comparison with other robust transmission techniques, such as MD, retransmission of parts of the information, ...