by Kai Lu  |  Aug 17, 2016  |  Filed in: Security Research

Google patched some Android security vulnerabilities in early August. One of them was a remote code execution vulnerability in Mediaserver (CVE-2016-3820), which I discovered. This vulnerability could enable an attacker using a specially crafted file to cause memory corruption during media file and data processing. Google rated the issue as Critical due to the possibility of remote code execution within the context of the Mediaserver process. The Mediaserver process has access to audio and video streams, as well as to privileges that third-party apps cannot normally access. The affected functionality is provided as a core part of Android, and multiple applications allow it to be reached with remote content, most notably MMS and browser playback of media.

In this blog, we want to share our analysis of this vulnerability.

Proof of Concept

The vulnerability exists in the software-based H.264 decoder. Mediaserver normally prefers the hardware-based H.264 decoder shipped with most Android devices over the vulnerable software-based one. If the hardware-based H.264 decoder is chosen to parse the PoC file, the vulnerability is not triggered. Applications supporting H.264 media, however, could be vulnerable depending on which decoder they choose.

The testing was conducted on the following device and software setup:

- google/hammerhead/hammerhead:6.0.1/MOB30H/root08012302:userdebug/test-keys
- Android/aosp_hammerhead/hammerhead/android-6.0.1_r41

The standalone stagefright command can be used to trigger the vulnerability with the software codec option (-s), as shown below:

/system/bin/stagefright -s /sdcard/FG-VD-16-030_PoC_minimized.mp4

The crash log is shown below:

--------- beginning of crash
05-05 17:45:17.428  2054  2325 F libc    : Fatal signal 11 (SIGSEGV), code 2, fault addr 0xb5cd47c8 in tid 2325 (le.h264.decoder)
05-05 17:45:17.522  6319  6319 W debuggerd: type=1400 audit(0.0:7279): avc: denied { search } for name="tmp" dev="mmcblk0p28" ino=627090 scontext=u:r:debuggerd:s0 tcontext=u:object_r:shell_data_file:s0 tclass=dir permissive=0
05-05 17:45:17.529  6319  6319 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
05-05 17:45:17.529  6319  6319 F DEBUG   : Build fingerprint: 'google/hammerhead/hammerhead:6.0.1/MMB29V/2554798:user/release-keys'
--------- beginning of system
05-05 17:45:17.529   820   943 W NativeCrashListener: Couldn't find ProcessRecord for pid 2054
05-05 17:45:17.529  6319  6319 F DEBUG   : Revision: '0'
05-05 17:45:17.529  6319  6319 F DEBUG   : ABI: 'arm'
05-05 17:45:17.529  6319  6319 E DEBUG   : AM write failed: Broken pipe
05-05 17:45:17.530  6319  6319 F DEBUG   : pid: 2054, tid: 2325, name: le.h264.decoder  >>> /data/local/tmp/stagefright <<<
05-05 17:45:17.530  6319  6319 F DEBUG   : signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0xb5cd47c8
05-05 17:45:17.536  6319  6319 F DEBUG   :     r0 b49807e8  r1 b4b09630  r2 00000001  r3 000001a0
05-05 17:45:17.536  6319  6319 F DEBUG   :     r4 00000081  r5 00000000  r6 b5cd47c8  r7 00000000
05-05 17:45:17.536  6319  6319 F DEBUG   :     r8 00000000  r9 b4b09630  sl b5cba5dc  fp 000001a0
05-05 17:45:17.536  6319  6319 F DEBUG   :     ip b4b09fef  sp b49806a8  lr b5c33aa3  pc b5cd47c8  cpsr 20070010
05-05 17:45:17.547  6319  6319 F DEBUG   :
05-05 17:45:17.547  6319  6319 F DEBUG   : backtrace:
05-05 17:45:17.547  6319  6319 F DEBUG   :     #00 pc 000547c8  [anon:libc_malloc]
05-05 17:45:17.547  6319  6319 F DEBUG   :     #01 pc 00018aa1  /system/lib/libstagefright_soft_avcdec.so (ih264d_process_intra_mb+4448)
05-05 17:45:17.547  6319  6319 F DEBUG   :     #02 pc 0000d67d  /system/lib/libstagefright_soft_avcdec.so (ih264d_recon_deblk_slice+616)
05-05 17:45:17.547  6319  6319 F DEBUG   :     #03 pc 0000d949  /system/lib/libstagefright_soft_avcdec.so (ih264d_recon_deblk_thread+64)
05-05 17:45:17.548  6319  6319 F DEBUG   :     #04 pc 0003f45f  /system/lib/libc.so (__pthread_start(void*)+30)
05-05 17:45:17.548  6319  6319 F DEBUG   :     #05 pc 00019b43  /system/lib/libc.so (__start_thread+6)
05-05 17:45:17.592  6319  6319 W debuggerd: type=1400 audit(0.0:7280): avc: denied { search } for name="tmp" dev="mmcblk0p28" ino=627090 scontext=u:r:debuggerd:s0 tcontext=u:object_r:shell_data_file:s0 tclass=dir permissive=0
05-05 17:45:17.602  6319  6319 W debuggerd: type=1400 audit(0.0:7281): avc: denied { search } for name="tmp" dev="mmcblk0p28" ino=627090 scontext=u:r:debuggerd:s0 tcontext=u:object_r:shell_data_file:s0 tclass=dir permissive=0
05-05 17:45:17.649  6319  6319 F DEBUG   :
05-05 17:45:17.649  6319  6319 F DEBUG   : Tombstone written to: /data/tombstones/tombstone_05
05-05 17:45:17.649   820   836 I BootReceiver: Copying /data/tombstones/tombstone_05 to DropBox (SYSTEM_TOMBSTONE)

Analysis

The vulnerability exists in the libavc H.264 decoder invoked by libstagefright. Mediaserver uses the libstagefright library to handle audio and video streams. Let’s look into the specially crafted .mp4 file first. A comparison between the original MP4 file and the minimized PoC file is shown below.

Figure 1. PoC File vs The Original MP4 File

Figure 2. Parsing of the PoC File with 010 Editor

From Figure 1 and Figure 2, we can see that the only difference is a single byte at offset 0x1a65f, and this byte is located in atom ‘mdat’. The atom ‘mdat’ stores the H.264 media data. For H.264 specifications, please refer to https://www.itu.int/rec/T-REC-H.264.
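
A comparison like the one in Figure 1 can also be done programmatically. The following C sketch is only an illustration, not the tool used for this analysis: the helper names first_diff_offset and count_diff_bytes are hypothetical, and both files are assumed to be already loaded into equally sized buffers.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the first byte that differs between two buffers
 * of equal length, or -1 if the buffers are identical. */
long first_diff_offset(const uint8_t *a, const uint8_t *b, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            return (long)i;
    }
    return -1;
}

/* Counts how many byte positions differ between the two buffers. */
size_t count_diff_bytes(const uint8_t *a, const uint8_t *b, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i])
            count++;
    }
    return count;
}
```

For a PoC that differs from the original file in a single byte, count_diff_bytes would return 1 and first_diff_offset would return the offset of that byte.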

The Network Abstraction Layer (NAL) and Video Coding Layer (VCL) are the two main concepts in H.264. An H.264 stream consists of a number of NAL units (NALU), and each NALU can be classified as VCL or non-VCL. Video data is processed by the codec and packed into NAL units. For details, please refer to https://tools.ietf.org/html/rfc6184.

In an MP4 file, the H.264 media data is stored in the following format in the atom ‘mdat’.

|Len(4 bytes)|Type 'mdat'|NALU len|NALU(header+payload)|NALU len|NALU(header+payload)|...
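
To make the layout above concrete, here is a minimal C sketch that walks the length-prefixed NAL units in an 'mdat' payload and locates the unit containing a given offset. It assumes that each "NALU len" field is a 4-byte big-endian integer, which is common but is actually determined by the file's decoder configuration, and that the payload buffer starts right after the 8-byte atom header. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads a 32-bit big-endian value from p. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Walks |NALU len|NALU| records and finds the NAL unit whose data
 * contains `target` (an offset relative to the start of the payload).
 * Returns 0 and fills nalu_start/nalu_len on success, -1 if the offset
 * is not inside any NAL unit or a length field is malformed. */
int find_nalu_at(const uint8_t *payload, size_t payload_len, size_t target,
                 size_t *nalu_start, uint32_t *nalu_len)
{
    size_t pos = 0;

    while (payload_len - pos >= 4) {
        uint32_t len = read_be32(payload + pos);
        size_t start = pos + 4;

        if (len > payload_len - start)
            return -1; /* length field runs past the end of the payload */

        if (target >= start && target < start + len) {
            *nalu_start = start;
            *nalu_len = len;
            return 0;
        }
        pos = start + len;
    }
    return -1;
}
```

To use it with a file offset such as 0x1a65f, first subtract the file offset of the first payload byte of the 'mdat' atom.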

We can extract the NAL unit that contains the byte at offset 0x1a65f from the PoC file and the original MP4 file as follows:

Figure 3. The NAL unit extracted from PoC

Figure 4. The NAL unit extracted from the original MP4 file

From Figure 3 and Figure 4, we know the length of the NAL unit is equal to 0x65 (101 bytes).

The H.264 stream layer structure is shown below.

Figure 5. H.264 Stream Layer Structure

Next, we trace this NAL unit through dynamic debugging in GDB. 

The backtrace in the ‘Proof of Concept’ section shows that the SIGSEGV signal is raised inside a decoder thread (ih264d_recon_deblk_thread), so this vulnerability is triggered in a multithreaded environment. This makes debugging more complex.

Let’s enter into the debugging world!

First, here is our debugging environment.

- google/hammerhead/hammerhead:6.0.1/MOB30H/root08012302:userdebug/test-keys
- Android/aosp_hammerhead/hammerhead/android-6.0.1_r41
- Ubuntu 14.04 LTS Desktop (64-bit)

The function ih264d_parse_nal_unit (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_parse_headers.c) is used to parse the NAL unit. Its definition is shown below.

WORD32 ih264d_parse_nal_unit(iv_obj_t *dec_hdl,
                          ivd_video_decode_op_t *ps_dec_op,
                          UWORD8 *pu1_buf,
                          UWORD32 u4_length)
{…
}

The third parameter, pu1_buf, points to the buffer of NAL unit data, while the fourth parameter, u4_length, is the length of the NAL unit data. This allows us to set the following conditional breakpoint on this function to trace the NAL unit whose length is 0x65.

b ih264d_parse_headers.c:ih264d_parse_nal_unit if u4_length==0x65

As execution continues, the conditional breakpoint is eventually hit, and the debug info is shown below.

(gdb) c
Continuing.
[New Thread 2533]
[Switching to Thread 2533]
Breakpoint 1, ih264d_parse_nal_unit (dec_hdl=dec_hdl@entry=0xb608e000, ps_dec_op=ps_dec_op@entry=0xb5ea2580,
    pu1_buf=pu1_buf@entry=0xb5200000 "e\210\200\025\200\002o#k\177\322", , u4_length=101)
    at external/libavc/decoder/ih264d_parse_headers.c:1011
1011        {
(gdb) info args
dec_hdl = 0xb608e000
ps_dec_op = 0xb5ea2580
pu1_buf = 0xb5200000 "e\210\200\025\200\002o#k\177\322",
u4_length = 101
(gdb) x/101b pu1_buf
0xb5200000:           0x65         0x88         0x80         0x15         0x80         0x02         0x6f          0x23
0xb5200008:           0x6b         0x7f          0xd2         0xc4         0x00         0x00         0x03         0x00
0xb5200010:           0x00         0x09         0x69         0x0f          0x71         0x2a         0xd7         0x1c
0xb5200018:           0x18         0x26         0x77         0x84         0x49         0x58         0x26         0x91
0xb5200020:           0x9d         0xee          0xad         0xcc          0x0b         0xad         0x81         0x30
0xb5200028:           0x26         0xa2         0x96         0xf9          0x3a         0x12         0x80         0xfb
0xb5200030:           0x51         0x5a         0x08         0x3c         0xa2         0x48         0x1c         0xdc
0xb5200038:           0x1d         0x75         0xae         0x82         0x22         0x6d         0xfb          0x57
0xb5200040:           0x37         0x7c         0xa5         0xaa         0xad         0x23         0x6d         0xcd
0xb5200048:           0xe1         0x40         0xe5         0xae         0xe3         0xe6         0x69         0xc5
0xb5200050:           0xe9         0xeb         0xcb         0x48         0xef          0x58         0x4c         0xa4
0xb5200058:           0x6b         0xb5         0x29         0xa5         0xb4         0xe7         0xf4          0x17
0xb5200060:           0x53         0x8e         0x2a         0x19         0xc0

It matches the NAL unit data in Figure 3. We then continue to debug in GDB.

(gdb)
1040                    u1_nal_unit_type = NAL_UNIT_TYPE(u1_first_byte);
(gdb)
1044                    switch(u1_nal_unit_type)
(gdb) p/x u1_nal_unit_type
$18 = 0x5
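
The NAL unit type is 0x5 (IDR_SLICE_NAL), so the IDR_SLICE_NAL case of the following switch statement is taken.

            switch(u1_nal_unit_type)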
            {
                case SLICE_DATA_PARTITION_A_NAL:
                case SLICE_DATA_PARTITION_B_NAL:
                case SLICE_DATA_PARTITION_C_NAL:
                    if(!ps_dec->i4_decode_header)
                        ih264d_parse_slice_partition(ps_dec, ps_bitstrm);
                    break;
                case IDR_SLICE_NAL:
                case SLICE_NAL:
                    /* ! */
                    DEBUG_THREADS_PRINTF("Decoding  a slice NAL\n");
                    if(!ps_dec->i4_decode_header)
                    {
                        if(ps_dec->i4_header_decoded == 3)
                        {
                            /* ! */
                            ps_dec->u4_slice_start_code_found = 1;
                            ih264d_rbsp_to_sodb(ps_dec->ps_bitstrm);
                            i_status = ih264d_parse_decode_slice(                             //enter here.
                                            (UWORD8)(u1_nal_unit_type
                                                            == IDR_SLICE_NAL),
                                            u1_nal_ref_idc, ps_dec);
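
The type value 0x5 printed by GDB comes from the first byte of the NAL unit (0x65). As defined by the H.264 specification, that byte packs a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc, and a 5-bit nal_unit_type. The following C sketch decodes it; the struct and function names are hypothetical and are not taken from libavc.

```c
#include <stdint.h>

/* The three fields of the one-byte H.264 NAL unit header. */
typedef struct {
    uint8_t forbidden_zero_bit; /* bit 7, must be 0 in a valid stream */
    uint8_t nal_ref_idc;        /* bits 6..5 */
    uint8_t nal_unit_type;      /* bits 4..0, 5 means IDR slice */
} nal_header_t;

/* Splits the first byte of a NAL unit into its header fields. */
nal_header_t decode_nal_header(uint8_t first_byte)
{
    nal_header_t h;
    h.forbidden_zero_bit = (uint8_t)((first_byte >> 7) & 0x01);
    h.nal_ref_idc = (uint8_t)((first_byte >> 5) & 0x03);
    h.nal_unit_type = (uint8_t)(first_byte & 0x1f);
    return h;
}
```

For the byte 0x65, decode_nal_header returns nal_unit_type 5 and nal_ref_idc 3.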

The function ih264d_parse_decode_slice (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_parse_slice.c) parses the slice. Its definition is shown below.

1014 WORD32 ih264d_parse_decode_slice(UWORD8 u1_is_idr_slice,
1015                                 UWORD8 u1_nal_ref_idc,
1016                                 dec_struct_t *ps_dec /* Decoder parameters */
1017                                 )
1018 {
...
1599        if(ps_dec->u1_separate_parse == 1)
1600        {
1601            if(ps_dec->u4_dec_thread_created == 0)
1602            {
1603                ithread_create(ps_dec->pv_dec_thread_handle, NULL,
1604                               (void *)ih264d_decode_picture_thread,
1605                               (void *)ps_dec);  //create a thread
1606
1607                ps_dec->u4_dec_thread_created = 1;
1608            }
1609
1610            if((ps_dec->u4_num_cores == 3) &&
1611                            ((ps_dec->u4_app_disable_deblk_frm == 0) || ps_dec->i1_recon_in_thread3_flag)
1612                            && (ps_dec->u4_bs_deblk_thread_created == 0))
1613            {
1614                ps_dec->u4_start_recon_deblk = 0;
1615                ithread_create(ps_dec->pv_bs_deblk_thread_handle, NULL,
1616                               (void *)ih264d_recon_deblk_thread,
1617                               (void *)ps_dec);    //create a thread
1618                ps_dec->u4_bs_deblk_thread_created = 1;
1619            }
1620        }
...
1873    if(u1_slice_type == I_SLICE)
1874    {
1875        ps_dec->ps_cur_pic->u4_pack_slc_typ |= I_SLC_BIT;
1876
1877        ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice);  //enter here
1878
1879        if(ps_dec->i4_pic_type != B_SLICE && ps_dec->i4_pic_type != P_SLICE)
1880            ps_dec->i4_pic_type = I_SLICE;
1881
1882    }
}

Continue to run until line 1599.

(gdb) until 1599
ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
    ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1599
1599                if(ps_dec->u1_separate_parse == 1)
(gdb) p/x ps_dec->u1_separate_parse
$11 = 0x1
...
 (gdb)
1632                                && (u1_slice_type != B_SLICE)

(gdb) info threads
[New Thread 2531]
[New Thread 2532]
[New Thread 2534]
[New Thread 3160]
[New Thread 3163]
  Id   Target Id         Frame
  7    Thread 3163       ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000,
    ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc) at external/libavc/decoder/ih264d_thread_compute_bs.c:420
  6    Thread 3160       ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
    at external/libavc/decoder/ih264d_thread_parse_decode.c:468
  5    Thread 2534       syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
  4    Thread 2532       __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
  3    Thread 2531       __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
* 2    Thread 2533       ih264d_parse_decode_slice (u1_is_idr_slice=,
    u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
    ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1632
  1    Thread 2501       syscall () at bionic/libc/arch-arm/bionic/syscall.S:44

(gdb) thread 6
[Switching to thread 6 (Thread 3160)]
#0  ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
    at external/libavc/decoder/ih264d_thread_parse_decode.c:468
468                           NOP(128);
(gdb) bt
#0  ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
    at external/libavc/decoder/ih264d_thread_parse_decode.c:468
#1  0xb5ed3560 in ih264d_decode_picture_thread (ps_dec=0xb608f000)
    at external/libavc/decoder/ih264d_thread_parse_decode.c:602
#2  0xb6b0e460 in __pthread_start (arg=0xb4cff930,
    arg@entry=)
    at bionic/libc/bionic/pthread_create.cpp:199
#3  0xb6ae8b44 in __start_thread (fn=, arg=)
    at bionic/libc/bionic/clone.cpp:41
#4  0x00000000 in ?? ()

(gdb) thread 7
[Switching to thread 7 (Thread 3163)]
#0  ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000, ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc)
    at external/libavc/decoder/ih264d_thread_compute_bs.c:420
420                           NOP(128);
(gdb) bt
#0  ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000, ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc)
    at external/libavc/decoder/ih264d_thread_compute_bs.c:420
#1  0xb5eb09f0 in ih264d_recon_deblk_thread (ps_dec=0xb608f000)
    at external/libavc/decoder/ih264d_thread_compute_bs.c:702
#2  0xb6b0e460 in __pthread_start (arg=0xb4bfd930,
    arg@entry=)
    at bionic/libc/bionic/pthread_create.cpp:199
#3  0xb6ae8b44 in __start_thread (fn=, arg=)
    at bionic/libc/bionic/clone.cpp:41
#4  0x00000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 2533)]
#0  ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001',
    u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000', ps_dec=ps_dec@entry=0xb608f000)
    at external/libavc/decoder/ih264d_parse_slice.c:1632
1632                                && (u1_slice_type != B_SLICE)

We can now see that the call to ithread_create on line 1603 creates thread 3160, and the call to ithread_create on line 1615 creates thread 3163.

Continue to run until line 1873.

(gdb) until 1873
ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
    ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1873
1873            if(u1_slice_type == I_SLICE)
...
(gdb)
1877                ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice);
(gdb) s
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=0) at external/libavc/decoder/ih264d_parse_islice.c:1360
1360            dec_slice_params_t * ps_slice = ps_dec->ps_cur_slice;
(gdb) info args
ps_dec = 0xb608f000
u2_first_mb_in_slice = 0

Because the slice type is I_SLICE, execution enters the function ih264d_parse_islice (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_parse_islice.c). Its definition is shown below.

1356        WORD32 ih264d_parse_islice(dec_struct_t *ps_dec,
1357                                    UWORD16 u2_first_mb_in_slice)
1358        {
...
1446            if(ps_pps->u1_entropy_coding_mode)
1447            {
1448                SWITCHOFFTRACE; SWITCHONTRACECABAC;
1449                if(ps_dec->ps_cur_slice->u1_mbaff_frame_flag)
1450                {
1451                    ps_dec->pf_get_mb_info = ih264d_get_mb_info_cabac_mbaff;
1452                }
1453                else
1454                    ps_dec->pf_get_mb_info = ih264d_get_mb_info_cabac_nonmbaff;
1455       
1456                ret = ih264d_parse_islice_data_cabac(ps_dec, ps_slice,
1457                                                     u2_first_mb_in_slice);  // enter here.
1458                if(ret != OK)
1459                    return ret;
1460                SWITCHONTRACE; SWITCHOFFTRACECABAC;
1461            }
1462            else
1463            {
1464                if(ps_dec->ps_cur_slice->u1_mbaff_frame_flag)
1465                {
1466                    ps_dec->pf_get_mb_info = ih264d_get_mb_info_cavlc_mbaff;
1467                }
1468                else
1469                    ps_dec->pf_get_mb_info = ih264d_get_mb_info_cavlc_nonmbaff;
1470                ret = ih264d_parse_islice_data_cavlc(ps_dec, ps_slice,
1471                                               u2_first_mb_in_slice);
1472                if(ret != OK)
1473                    return ret;
1474            }
1475       
1476            return OK;
                  }

Continue to run until line 1446.

(gdb) until 1446
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=0) at external/libavc/decoder/ih264d_parse_islice.c:1446
1446            if(ps_pps->u1_entropy_coding_mode)
(gdb) p/x ps_pps->u1_entropy_coding_mode
$22 = 0x1

Because u1_entropy_coding_mode is 0x1, execution next enters the function ih264d_parse_islice_data_cabac. Its definition is shown below.

972 WORD32 ih264d_parse_islice_data_cabac(dec_struct_t * ps_dec,
973                                      dec_slice_params_t * ps_slice,
974                                      UWORD16 u2_first_mb_in_slice)
975 {
...
1010    do
1011    {
...
1064            /* Parse Macroblock Data */
1065            if(25 == u1_mb_type)
1066            {
1067                /* I_PCM_MB */
1068                ps_cur_mb_info->ps_curmb->u1_mb_type = I_PCM_MB;
1069                ret = ih264d_parse_ipcm_mb(ps_dec, ps_cur_mb_info, u1_num_mbs);
1070                if(ret != OK)
1071                    return ret;
1072                ps_cur_deblk_mb->u1_mb_qp = 0;
1073            }
1074            else
1075            {
1076                ret = ih264d_parse_imb_cabac(ps_dec, ps_cur_mb_info, u1_mb_type);
1077                if(ret != OK)                        // trace it.
1078                    return ret;
1079                ps_cur_deblk_mb->u1_mb_qp = ps_dec->u1_qp;
1080            }
...
1154    }
1155    while(uc_more_data_flag);
...
1162        return ret;
1163  }

Next, set a conditional breakpoint on line 1077 as follows.

b ih264d_parse_islice.c:1077 if ret!=0x0

The debug info is shown below.

(gdb) b ih264d_parse_islice.c:1077 if ret!=0x0
Breakpoint 9 at 0xb5ebeeb2: file external/libavc/decoder/ih264d_parse_islice.c, line 1077.

Continue to run until the conditional breakpoint is hit.

(gdb) c
Continuing.

Breakpoint 9, ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000, ps_slice=0xb52c2000, u2_first_mb_in_slice=)
    at external/libavc/decoder/ih264d_parse_islice.c:1077
1077                        if(ret != OK)
(gdb) p/x ret
$13 = 0x6e

The value of ret is 0x6E (ERROR_EOB_TERMINATE_T). This means that parsing fails on the slice in the NAL unit that we crafted in the PoC file, and the function returns the error code ERROR_EOB_TERMINATE_T. Meanwhile, we can check the status of the two worker threads.

(gdb) info threads
  Id   Target Id         Frame
  7    Thread 3163       sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
  6    Thread 3160       sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
  5    Thread 2534       syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
  4    Thread 2532       __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
  3    Thread 2531       __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
* 2    Thread 2533       ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000,
    ps_slice=0xb52c2000, u2_first_mb_in_slice=)
    at external/libavc/decoder/ih264d_parse_islice.c:1077
  1    Thread 2501       syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
(gdb) thread 6
[Switching to thread 6 (Thread 3160)]
#0  sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
9                    mov     r7, ip
(gdb) bt
#0  sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
#1  0xb5eba0d0 in ithread_yield () at external/libavc/common/ithread.c:116
#2  0xb5ed2ffc in ih264d_decode_recon_tfr_nmb_thread (ps_dec=ps_dec@entry=0xb608f000,
    u1_num_mbs=, u1_num_mbs_next=, u1_end_of_row=)
    at external/libavc/decoder/ih264d_thread_parse_decode.c:265
#3  0xb5ed34f4 in ih264d_decode_slice_thread (ps_dec=ps_dec@entry=0xb608f000)
    at external/libavc/decoder/ih264d_thread_parse_decode.c:585
#4  0xb5ed3560 in ih264d_decode_picture_thread (ps_dec=0xb608f000)
    at external/libavc/decoder/ih264d_thread_parse_decode.c:602
#5  0xb6b0e460 in __pthread_start (arg=0xb4cff930,
    arg@entry=)
    at bionic/libc/bionic/pthread_create.cpp:199
#6  0xb6ae8b44 in __start_thread (fn=, arg=)
    at bionic/libc/bionic/clone.cpp:41
#7  0x00000000 in ?? ()
(gdb) thread 7
[Switching to thread 7 (Thread 3163)]
#0  sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
9                    mov     r7, ip
(gdb) bt
#0  sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
#1  0xb5eba0d0 in ithread_yield () at external/libavc/common/ithread.c:116
#2  0xb5eb06a8 in ih264d_recon_deblk_slice (ps_dec=ps_dec@entry=0xb608f000,
    ps_tfr_cxt=ps_tfr_cxt@entry=0xb4bfd8bc) at external/libavc/decoder/ih264d_thread_compute_bs.c:572
#3  0xb5eb09f0 in ih264d_recon_deblk_thread (ps_dec=0xb608f000)
    at external/libavc/decoder/ih264d_thread_compute_bs.c:702
#4  0xb6b0e460 in __pthread_start (arg=0xb4bfd930,
    arg@entry=)
    at bionic/libc/bionic/pthread_create.cpp:199
#5  0xb6ae8b44 in __start_thread (fn=, arg=)
    at bionic/libc/bionic/clone.cpp:41
#6  0x00000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 2533)]
#0  ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000, ps_slice=0xb52c2000,
    u2_first_mb_in_slice=) at external/libavc/decoder/ih264d_parse_islice.c:1077
1077                        if(ret != OK)

We can see that both threads, 3160 and 3163, are waiting in sched_yield, which forces the calling thread to relinquish the processor until it again becomes the head of its thread list.
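
The backtraces show only that both workers are inside ithread_yield. As a rough illustration of this kind of poll-and-yield waiting, the C sketch below spins on a per-macroblock flag and calls sched_yield between polls. It is an assumption about the general pattern and not libavc code: the function name wait_for_mb and the max_spins limit are hypothetical, and the limit exists only so that the example terminates.

```c
#include <sched.h>
#include <stddef.h>
#include <stdint.h>

/* Polls dec_mb_map[mb_num] until another thread marks it as parsed,
 * giving up the processor between polls.
 * Returns the number of yields performed, or -1 if the flag was still
 * clear after max_spins yields. */
int wait_for_mb(const volatile uint8_t *dec_mb_map, size_t mb_num,
                int max_spins)
{
    int spins = 0;

    while (dec_mb_map[mb_num] == 0) {
        if (spins >= max_spins)
            return -1;   /* bounded wait, added for this sketch only */
        sched_yield();   /* let other runnable threads make progress */
        spins++;
    }
    return spins;
}
```

A worker that waits this way depends entirely on the parser thread to keep the shared map consistent, which is why stale entries in such maps matter later in this analysis.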

The function ih264d_parse_islice_data_cabac returns the error code ERROR_EOB_TERMINATE_T, which is passed unchanged up the call chain until it reaches line 2013 in the ih264d_video_decode function.

(gdb) n
ih264d_video_decode (dec_hdl=0xb608e000, pv_api_ip=0xb5ea25f0, pv_api_op=0xb5ea2580) at external/libavc/decoder/ih264d_api.c:2013
2013                if(ret != OK)
(gdb) p/x ret
$14 = 0x6e
(gdb)

The following is the code snippet around line 2013.

1864    do
1865    {
1866        WORD32 buf_size;
1867        pu1_buf = (UWORD8*)ps_dec_ip->pv_stream_buffer
1868                        + ps_dec_op->u4_num_bytes_consumed;
1869        u4_max_ofst = ps_dec_ip->u4_num_Bytes
1870                        - ps_dec_op->u4_num_bytes_consumed;
...
2010        ps_dec->u4_return_to_app = 0;
2011        ret = ih264d_parse_nal_unit(dec_hdl, ps_dec_op,
2012                                pu1_bitstrm_buf, buflen);
2013        if(ret != OK)
2014        {
...
2053            if(ps_dec->u4_num_cores == 3)
2054            {
2055                ih264d_signal_bs_deblk_thread(ps_dec);
2056            }
2057           return (IV_FAIL);
2058
2059  }

As shown above, line 2011 is inside a loop. Because the value of ret does not satisfy the loop's exit condition, the loop proceeds to its next iteration. Continue to run until line 2011 is reached again, where the next NAL unit in the PoC file is parsed.

(gdb)
2011                ret = ih264d_parse_nal_unit(dec_hdl, ps_dec_op,
(gdb) s
ih264d_parse_nal_unit (dec_hdl=0xb608e000, ps_dec_op=0xb5ea2580, pu1_buf=0xb5200000 "e\002Ȉ\001X", u4_length=447)
    at external/libavc/decoder/ih264d_parse_headers.c:1027
1027                if(u4_length)
(gdb) info args
dec_hdl = 0xb608e000
ps_dec_op = 0xb5ea2580
pu1_buf = 0xb5200000 "e\002Ȉ\001X"
u4_length = 447
(gdb) x/447b pu1_buf
0xb5200000:           0x65         0x02         0xc8         0x88         0x01         0x58         0x00         0x26
0xb5200008:           0xff           0xf5          0x9c         0x39         0x86         0x3f          0x47         0x0b
0xb5200010:           0xa2         0x47         0xf6          0x5c         0x1d         0x87         0x90         0x50
0xb5200018:           0x0a         0x0d         0x3f          0x88         0xcc          0x32         0x05         0xc4
0xb5200020:           0x53         0xda         0xe5         0x55         0x75         0xca         0x83         0xf6

0xb52001a0:           0xe8         0xd2         0x85         0x8a         0xf7          0xe9         0x64         0x3e
0xb52001a8:           0xa4         0x90         0xf5          0x77         0xec          0xd6         0xc8         0x7e
0xb52001b0:           0x44         0xe8         0xb7         0xb7         0x55         0x75         0x86         0xd2
0xb52001b8:           0xf5          0xff           0xe1         0x7b         0x14         0x08         0xce

The buffer pointed to by pu1_buf stores the next NAL unit data. It matches the NAL unit data in the PoC file. Continue to trace how the program handles it.

(gdb)
1068                                    i_status = ih264d_parse_decode_slice(
(gdb) s
ih264d_parse_decode_slice (u1_is_idr_slice=1 '\001', u1_nal_ref_idc=3 '\003', ps_dec=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1025
1025            WORD32 i4_poc = 0;
(gdb) p/x ps_dec->u1_separate_parse
$32 = 0x1
(gdb) p/x ps_dec->u4_dec_thread_created
$33 = 0x1
(gdb) p/x ps_dec->u4_bs_deblk_thread_created
$34 = 0x1

Because both ps_dec->u4_dec_thread_created and ps_dec->u4_bs_deblk_thread_created are 0x1, this time the program does not call ithread_create on lines 1603 and 1615 to create new threads. Now, continue to run until line 1873 in the function ih264d_parse_decode_slice.
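
The create-once behavior seen here can be modeled with a simplified C sketch. The struct below mirrors only the two flags and the core count from the listing of ih264d_parse_decode_slice. The launches counter is a hypothetical stand-in for ithread_create, and the extra deblocking conditions on line 1611 are omitted.

```c
#include <stdint.h>

/* Toy model of the thread bookkeeping fields in the decoder context. */
typedef struct {
    uint32_t dec_thread_created;      /* mirrors u4_dec_thread_created */
    uint32_t bs_deblk_thread_created; /* mirrors u4_bs_deblk_thread_created */
    uint32_t num_cores;               /* mirrors u4_num_cores */
    int launches;                     /* hypothetical: counts thread starts */
} toy_dec_threads_t;

/* Starts each worker at most once per decoder instance. */
void toy_start_workers(toy_dec_threads_t *dec)
{
    if (dec->dec_thread_created == 0) {
        dec->launches++;              /* stands in for ithread_create() */
        dec->dec_thread_created = 1;
    }
    if (dec->num_cores == 3 && dec->bs_deblk_thread_created == 0) {
        dec->launches++;              /* stands in for ithread_create() */
        dec->bs_deblk_thread_created = 1;
    }
}
```

Calling toy_start_workers a second time on the same context changes nothing, so the workers created while handling the first NAL unit keep running while the second one is parsed.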

(gdb) until 1873
ih264d_parse_decode_slice (u1_is_idr_slice=, u1_is_idr_slice@entry=1 '\001', u1_nal_ref_idc=u1_nal_ref_idc@entry=0 '\000',
    ps_dec=ps_dec@entry=0xb608f000) at external/libavc/decoder/ih264d_parse_slice.c:1873
1873            if(u1_slice_type == I_SLICE)
...
1877                ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice);
(gdb) s
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=88) at external/libavc/decoder/ih264d_parse_islice.c:1360
1360            dec_slice_params_t * ps_slice = ps_dec->ps_cur_slice;
(gdb) n
1361            UWORD32 *pu4_bitstrm_buf = ps_dec->ps_bitstrm->pu4_buffer;

Continue to run until line 1446 in function ih264d_parse_islice.

(gdb) until 1446
ih264d_parse_islice (ps_dec=ps_dec@entry=0xb608f000, u2_first_mb_in_slice=u2_first_mb_in_slice@entry=88) at external/libavc/decoder/ih264d_parse_islice.c:1446
1446        if(ps_pps->u1_entropy_coding_mode)
...
(gdb) s
1456            ret = ih264d_parse_islice_data_cabac(ps_dec, ps_slice,
(gdb) s
ih264d_parse_islice_data_cabac (ps_dec=ps_dec@entry=0xb608f000, ps_slice=0xb52c2000, u2_first_mb_in_slice=88) at external/libavc/decoder/ih264d_parse_islice.c:975
975    {
(gdb)

In the function ih264d_parse_islice_data_cabac, the program will call the function ih264d_parse_tfr_nmb on line 1137.

1135           if(ps_dec->u1_separate_parse)
1136            {
1137                ih264d_parse_tfr_nmb(ps_dec, u1_mb_idx, u1_num_mbs,
1138                                     u1_num_mbs_next, u1_tfr_n_mb, u1_end_of_row);   //trace it.
1139                ps_dec->ps_nmb_info +=  u1_num_mbs;
1140            }

So set the following breakpoint.

b ih264d_thread_parse_decode.c:ih264d_parse_tfr_nmb

The debug info is shown below when the above breakpoint is hit.

(gdb) c
Continuing.

Breakpoint 4, ih264d_parse_tfr_nmb (ps_dec=ps_dec@entry=0xb608f000, u1_mb_idx=u1_mb_idx@entry=0 '\000', u1_num_mbs=u1_num_mbs@entry=22 '\026',
    u1_num_mbs_next=0 '\000', u1_tfr_n_mb=1 '\001', u1_end_of_row=1 '\001') at external/libavc/decoder/ih264d_thread_parse_decode.c:68
68             {
(gdb)

The definition of the function ih264d_parse_tfr_nmb is shown below.

62 void ih264d_parse_tfr_nmb(dec_struct_t * ps_dec,
63                          UWORD8 u1_mb_idx,
64                          UWORD8 u1_num_mbs,
65                          UWORD8 u1_num_mbs_next,
66                          UWORD8 u1_tfr_n_mb,
67                          UWORD8 u1_end_of_row)
68 {
69    WORD32 i, u4_mb_num;
70
71    const UWORD32 u1_mbaff = ps_dec->ps_cur_slice->u1_mbaff_frame_flag;
72    UWORD32 u4_n_mb_start;
73
74    UNUSED(u1_mb_idx);
75    UNUSED(u1_num_mbs_next);
76    if(u1_tfr_n_mb)
77    {
78
79
80        u4_n_mb_start = (ps_dec->u2_cur_mb_addr + 1) - u1_num_mbs;
81
82        // copy into s_frmMbInfo
83
84        u4_mb_num = u4_n_mb_start;
85        u4_mb_num = (ps_dec->u2_cur_mb_addr + 1) - u1_num_mbs; //u4_mb_num is 0x58
86
87        for(i = 0; i < u1_num_mbs; i++)       // u1_num_mbs is 0x16
88        {
89            UPDATE_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, u4_mb_num,
90                                 ps_dec->u2_cur_slice_num);

91            DATA_SYNC();
92            UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_dec_mb_map, u4_mb_num);
93
94            u4_mb_num++;
95        }
96
...
164    }
165 }

On line 89, UPDATE_SLICE_NUM_MAP updates the buffer pointed to by ps_dec->pu2_slice_num_map with ps_dec->u2_cur_slice_num on each loop iteration. The value of ps_dec->u2_cur_slice_num is 0x0. Let’s go back to see why it is 0x0.

1873    if(u1_slice_type == I_SLICE)
1874    {
1875        ps_dec->ps_cur_pic->u4_pack_slc_typ |= I_SLC_BIT;
1876
1877        ret = ih264d_parse_islice(ps_dec, u2_first_mb_in_slice); //ret is ERROR_EOB_TERMINATE_T
1878
1879        if(ps_dec->i4_pic_type != B_SLICE && ps_dec->i4_pic_type != P_SLICE)
1880            ps_dec->i4_pic_type = I_SLICE;
1881
1882    }

1909    if(ret != OK)
1910        return ret;    // return here
1911
1912    ps_dec->u2_cur_slice_num++;  // never reached, so ps_dec->u2_cur_slice_num is still 0x0
1913    /* storing last Mb X and MbY of the slice */

When the program handled the slice in the previous NAL unit, ih264d_parse_islice returned an error code (ERROR_EOB_TERMINATE_T), so the function returned on line 1910 before reaching the increment on line 1912. The slice number is not incremented for erroneous clips, so ps_dec->u2_cur_slice_num is still 0x0. On line 92 of ih264d_parse_tfr_nmb, UPDATE_MB_MAP_MBNUM_BYTE updates the buffer pointed to by ps_dec->pu1_dec_mb_map with 0x01.
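The effect of the early return on the slice counter can be sketched in isolation. This is a simplified model, not the libavc code: toy_dec_t, toy_parse_slice, and the TOY_* constants are hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the decoder's return codes. */
enum { TOY_OK = 0, TOY_ERROR_EOB = 1 };

/* Hypothetical, minimal decoder state: only the slice counter. */
typedef struct {
    uint16_t u2_cur_slice_num;
} toy_dec_t;

/*
 * Models the control flow around lines 1909-1912: when slice parsing
 * fails, the function returns before the counter is incremented.
 */
int toy_parse_slice(toy_dec_t *dec, int slice_parse_result)
{
    if (slice_parse_result != TOY_OK)
        return slice_parse_result;   /* early return, counter untouched */

    dec->u2_cur_slice_num++;         /* only reached on success */
    return TOY_OK;
}
```

Calling toy_parse_slice with TOY_ERROR_EOB on a zero-initialized toy_dec_t leaves u2_cur_slice_num at 0, which is the state observed in the debugger.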

We can next check some variables and the status of threads.

(gdb) p/x ps_dec->u2_cur_slice_num
$37 = 0x0
87            for(i = 0; i < u1_num_mbs; i++)
(gdb) 
89                UPDATE_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, u4_mb_num,
(gdb) p/x u1_num_mbs
$38 = 0x16
(gdb) p/x u4_mb_num 
$39 = 0x58
(gdb) x/128b ps_dec->pu1_dec_mb_map 
0xb60ff200:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff208:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff210:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff218:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff220:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff228:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff230:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff238:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff240:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff248:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff250:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff258:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff260:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff268:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff270:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff278:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb) p/x ps_dec->pu1_recon_mb_map 
$31 = 0xb60ff400
(gdb) x/128b 0xb60ff400
0xb60ff400:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff408:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff410:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff418:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff420:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff428:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff430:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff438:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff440:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff448:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff450:    0x01    0x01    0x01    0x01    0x01    0x01    0x01    0x01
0xb60ff458:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff460:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff468:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff470:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0xb60ff478:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
(gdb) info threads
  Id   Target Id         Frame 
  7    Thread 3163       sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
  6    Thread 3160       sched_yield () at bionic/libc/arch-arm/syscalls/sched_yield.S:9
  5    Thread 2534       syscall () at bionic/libc/arch-arm/bionic/syscall.S:44
  4    Thread 2532       __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
  3    Thread 2531       __ioctl () at bionic/libc/arch-arm/syscalls/__ioctl.S:8
* 2    Thread 2533       ih264d_parse_tfr_nmb (ps_dec=ps_dec@entry=0xb608f000, 
    u1_mb_idx=u1_mb_idx@entry=0 '\000', u1_num_mbs=u1_num_mbs@entry=22 '\026', 
    u1_num_mbs_next=
, u1_tfr_n_mb=1 '\001', u1_end_of_row=1 '\001')
    at external/libavc/decoder/ih264d_thread_parse_decode.c:80
  1    Thread 2501       syscall () at bionic/libc/arch-arm/bionic/syscall.S:44

We need to monitor the buffers pointed to by ps_dec->pu1_dec_mb_map, ps_dec->pu2_slice_num_map, and ps_dec->pu1_recon_mb_map.

Next, we look at why threads 3160 and 3163 are still yielding, and what allows them to continue running.

Thread 3160 yields in the function ih264d_decode_recon_tfr_nmb_thread (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_thread_parse_decode.c).

200 WORD32 ih264d_decode_recon_tfr_nmb_thread(dec_struct_t * ps_dec,
201                                          UWORD8 u1_num_mbs,
202                                          UWORD8 u1_num_mbs_next,
203                                          UWORD8 u1_end_of_row)
204 {
205    WORD32 i,j;
...
228    while(1)
229    {
230
231        UWORD32 u4_max_mb = (UWORD32)(ps_dec->i2_dec_thread_mb_y + (1 << u1_mbaff)) * ps_dec->u2_frm_wd_in_mbs - 1;
232        u4_mb_num = u2_cur_dec_mb_num;
233        /*introducing 1 MB delay*/
234        u4_mb_num = MIN(u4_mb_num + u1_num_mbs + 1, u4_max_mb);
235
236        CHECK_MB_MAP_BYTE(u4_mb_num, ps_dec->pu1_dec_mb_map, u4_cond); // reads the byte at (ps_dec->pu1_dec_mb_map + u4_mb_num); line 92 of ih264d_parse_tfr_nmb sets that byte to 0x01, which makes u4_cond 0x01
237        if(u4_cond)           // if u4_cond is 0x01, break the loop and stop yielding
238        {
239            break;  
240        }
241        else
242        {
243            if(nop_cnt > 0)
244            {
245                nop_cnt -= 128;
246                NOP(128);
247            }
248            else
249            {
250                if(ps_dec->u4_output_present && (2 == ps_dec->u4_num_cores) &&
251                   (ps_dec->u4_fmt_conv_cur_row < ps_dec->s_disp_frame_info.u4_y_ht))
252                {
253                    ps_dec->u4_fmt_conv_num_rows =
254                                MIN(FMT_CONV_NUM_ROWS,
255                                    (ps_dec->s_disp_frame_info.u4_y_ht
256                                                    - ps_dec->u4_fmt_conv_cur_row));
257                    ih264d_format_convert(ps_dec, &(ps_dec->s_disp_op),
258                                          ps_dec->u4_fmt_conv_cur_row,
259                                          ps_dec->u4_fmt_conv_num_rows);
260                    ps_dec->u4_fmt_conv_cur_row += ps_dec->u4_fmt_conv_num_rows;
261                }
262                else
263                {
264                    nop_cnt = 8*128;
265                    ithread_yield();
266                }
267            }
268        }
269    }
270    /* N Mb MC Loop */
...
342    /* N Mb IQ IT RECON  Loop */
343    for(j = 0; j < i; j++)
344    {
345        ps_cur_mb_info = &ps_dec->ps_frm_mb_info[ps_dec->cur_dec_mb_num];
346
347        if((ps_dec->u4_num_cores == 2) || !ps_dec->i1_recon_in_thread3_flag)
348        {
349            if(ps_cur_mb_info->u1_mb_type <= u1_skip_th)
350            {
351                ih264d_process_inter_mb(ps_dec, ps_cur_mb_info, j);
352            }
353            else if(ps_cur_mb_info->u1_mb_type != MB_SKIP)
354            {
355                if((u1_ipcm_th + 25) != ps_cur_mb_info->u1_mb_type)
356                {
357                    ps_cur_mb_info->u1_mb_type -= (u1_skip_th + 1);
358                    ih264d_process_intra_mb(ps_dec, ps_cur_mb_info, j);
359                }
360            }
361
362
363         if(ps_dec->u4_use_intrapred_line_copy == 1)
364                ih264d_copy_intra_pred_line(ps_dec, ps_cur_mb_info, j);
365        }
366
367        DATA_SYNC();
368
369        if(u1_mbaff)
370        {
371            if(u4_update_mbaff)
372            {
373                UWORD32 u4_mb_num = ps_cur_mb_info->u2_mbx
374                                + ps_dec->u2_frm_wd_in_mbs
375                                                * (ps_cur_mb_info->u2_mby >> 1);
376                UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_recon_mb_map, u4_mb_num);   // update the byte at (ps_dec->pu1_recon_mb_map + u4_mb_num)
377                u4_update_mbaff = 0;
378            }
379            else
380            {
381                u4_update_mbaff = 1;
382            }
383        }
384        else
385        {
386            UWORD32 u4_mb_num = ps_cur_mb_info->u2_mbx
387                            + ps_dec->u2_frm_wd_in_mbs * ps_cur_mb_info->u2_mby;
388            UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_recon_mb_map, u4_mb_num);   // update the byte at (ps_dec->pu1_recon_mb_map + u4_mb_num)
389        }
390        ps_dec->cur_dec_mb_num++;
391     }
...
...
...
435    return OK;
436 }

From the code above, we can see that CHECK_MB_MAP_BYTE on line 236 checks the buffer pointed to by ps_dec->pu1_dec_mb_map. Line 92 in the function ih264d_parse_tfr_nmb sets bytes in that buffer to 0x01. Once the byte at (ps_dec->pu1_dec_mb_map + u4_mb_num) has been set to 0x01, u4_cond becomes 0x01, the loop breaks, and the thread continues to run.
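The handshake between the parsing thread and the waiting thread can be modeled with a plain byte map. The sketch below is single-threaded and deliberately ignores memory ordering (the role of DATA_SYNC in the real code). The toy_* functions are hypothetical, simplified stand-ins for UPDATE_MB_MAP_MBNUM_BYTE and CHECK_MB_MAP_BYTE, not the library's definitions, and TOY_MAP_SIZE is an arbitrary size chosen to match the 128 bytes dumped in GDB.

```c
#include <stdint.h>

#define TOY_MAP_SIZE 128   /* illustrative size, not taken from libavc */

/* Stand-in for UPDATE_MB_MAP_MBNUM_BYTE: publish one macroblock as parsed. */
void toy_update_mb_map(volatile uint8_t *map, uint32_t mb_num)
{
    map[mb_num] = 1;
}

/* Stand-in for CHECK_MB_MAP_BYTE: returns the flag the waiter polls on. */
uint32_t toy_check_mb_map(const volatile uint8_t *map, uint32_t mb_num)
{
    return map[mb_num];
}

/* Models the loop on lines 87-95: mark num_mbs entries starting at first_mb. */
void toy_parse_tfr_nmb(volatile uint8_t *map, uint32_t first_mb, uint8_t num_mbs)
{
    uint32_t i;
    for (i = 0; i < num_mbs; i++)
        toy_update_mb_map(map, first_mb + i);
}

/* Counts how many entries have been published so far. */
uint32_t toy_count_ready(const volatile uint8_t *map)
{
    uint32_t i, n = 0;
    for (i = 0; i < TOY_MAP_SIZE; i++)
        n += map[i];
    return n;
}
```

With the values from the GDB session (u4_mb_num = 0x58, u1_num_mbs = 0x16), a waiter polling entry 0x58 reads 0 before toy_parse_tfr_nmb runs and 1 afterwards, at which point 0x58 + 0x16 = 0x6e entries are set.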

On lines 376 and 388, UPDATE_MB_MAP_MBNUM_BYTE updates the buffer pointed to by ps_dec->pu1_recon_mb_map at offset u4_mb_num.

Thread 3163 yields in the function ih264d_recon_deblk_slice (in https://android.googlesource.com/platform/external/libavc/+/android-6.0.1_r41/decoder/ih264d_thread_compute_bs.c).

378 void ih264d_recon_deblk_slice(dec_struct_t *ps_dec, tfr_ctxt_t *ps_tfr_cxt)
379 {
...
...
...
531        while(1)
532        {
533            UWORD32 u4_cond = 0;
534            UWORD32 u4_mb_num = ps_dec->cur_recon_mb_num + recon_mb_grp - 1;
535
536            /*
537             * Wait for one extra mb of MC, because some chroma IQ-IT functions
538             * sometimes loads the pixels of the right mb and stores with the loaded
539             * values.
540             */
541            u4_mb_num = MIN(u4_mb_num + 1, (ps_dec->i2_recon_thread_mb_y + 1) * i2_pic_wdin_mbs - 1);
542
543            CHECK_MB_MAP_BYTE(u4_mb_num, ps_dec->pu1_recon_mb_map, u4_cond); // checks the byte at (ps_dec->pu1_recon_mb_map + u4_mb_num); if u4_cond is 0x01, the loop breaks and the thread continues to run. ps_dec->pu1_recon_mb_map is updated by the function ih264d_decode_recon_tfr_nmb_thread in thread 3160.
544            if(u4_cond)
545            {
546                break;
547            }
548            else
549            {
550                if(nop_cnt > 0)
551                {
552                    nop_cnt -= 128;
553                    NOP(128);
554                }
555                else
556                {
557                    if(ps_dec->u4_output_present &&
558                       (ps_dec->u4_fmt_conv_cur_row < ps_dec->s_disp_frame_info.u4_y_ht))
559                    {
560                        ps_dec->u4_fmt_conv_num_rows =
561                                        MIN(FMT_CONV_NUM_ROWS,
562                                            (ps_dec->s_disp_frame_info.u4_y_ht
563                                                            - ps_dec->u4_fmt_conv_cur_row));
564                        ih264d_format_convert(ps_dec, &(ps_dec->s_disp_op),
565                                              ps_dec->u4_fmt_conv_cur_row,
566                                              ps_dec->u4_fmt_conv_num_rows);
567                        ps_dec->u4_fmt_conv_cur_row += ps_dec->u4_fmt_conv_num_rows;
568                    }
569                    else
570                    {
571                        nop_cnt = 8*128;
572                        ithread_yield();   
573                    }
574                }
575            }
576        }
577
578        for(j = 0; j < recon_mb_grp; j++)
579        {
580            GET_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, ps_dec->cur_recon_mb_num,
581                              u2_slice_num); // reads the slice number from ps_dec->pu2_slice_num_map, which was updated on line 89 (UPDATE_SLICE_NUM_MAP) of ih264d_parse_tfr_nmb. u2_slice_num is always 0x0 because it comes from ps_dec->u2_cur_slice_num.
582
583            if(u2_slice_num != ps_dec->u2_cur_slice_num_bs) // ps_dec->u2_cur_slice_num_bs is 0x0 and u2_slice_num is also 0x0, so the loop does not break and the thread continues to run.
584            {
585                u4_slice_end = 1;
586                break;
587            }
588            if(ps_dec->i1_recon_in_thread3_flag)
589            {
590                ps_cur_mb_info = &ps_dec->ps_frm_mb_info[ps_dec->cur_recon_mb_num];
591
592                if(ps_cur_mb_info->u1_mb_type <= u1_skip_th)
593                {
594                    ih264d_process_inter_mb(ps_dec, ps_cur_mb_info, j);
595                }
596                else if(ps_cur_mb_info->u1_mb_type != MB_SKIP)
597                {
598                    if((u1_ipcm_th + 25) != ps_cur_mb_info->u1_mb_type)
599                    {
600                        ps_cur_mb_info->u1_mb_type -= (u1_skip_th + 1);
601                        ih264d_process_intra_mb(ps_dec, ps_cur_mb_info, j); // trace it.
602                    }
603                }
604
605                ih264d_copy_intra_pred_line(ps_dec, ps_cur_mb_info, j);
606            }
607            ps_dec->cur_recon_mb_num++;
608        }

From the code above, we see that CHECK_MB_MAP_BYTE on line 543 checks the byte at (ps_dec->pu1_recon_mb_map + u4_mb_num). If u4_cond is 0x01, the loop breaks and the thread continues to run. The buffer pointed to by ps_dec->pu1_recon_mb_map is updated by the function ih264d_decode_recon_tfr_nmb_thread in thread 3160.

On line 580, GET_SLICE_NUM_MAP reads the slice number from ps_dec->pu2_slice_num_map. The slice number map is updated on line 89 (UPDATE_SLICE_NUM_MAP) of the function ih264d_parse_tfr_nmb. u2_slice_num is always 0x0 because it comes from ps_dec->u2_cur_slice_num.

On line 583, ps_dec->u2_cur_slice_num_bs is 0x0 and u2_slice_num is also 0x0, so the loop does not break and the thread continues to run.

Through the above analysis, we now know how to have these two threads continue to run.

Next, go back to GDB.

Set the following conditional breakpoint to detect when the loop is broken in the function ih264d_decode_recon_tfr_nmb_thread.

b ih264d_thread_parse_decode.c:237 if u4_cond==1

Continue to run until the conditional breakpoint above is hit. Then run the following command.

set scheduler-locking on

Continue to run until line 388 (UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_recon_mb_map, u4_mb_num);) is reached. In a loop, this line sets bytes in the buffer pointed to by ps_dec->pu1_recon_mb_map to 0x01.

In the function ih264d_recon_deblk_slice, line 543 (CHECK_MB_MAP_BYTE(u4_mb_num, ps_dec->pu1_recon_mb_map, u4_cond);) runs in a loop and keeps checking whether the byte at (ps_dec->pu1_recon_mb_map + u4_mb_num) is 0x01. Once u4_cond becomes 0x01, the loop breaks and this thread continues to run.

Next, set the following breakpoint when debugging in the function ih264d_decode_recon_tfr_nmb_thread.

b ih264d_process_intra_mb.c:ih264d_process_intra_mb

Run the following command to disable scheduler-locking:

set scheduler-locking off

Continue to run until the above breakpoint is hit. The debug info is shown below:

(gdb) c
Continuing.

Breakpoint 6, ih264d_process_intra_mb (ps_dec=ps_dec@entry=0xb608f000, 
    ps_cur_mb_info=ps_cur_mb_info@entry=0xb52b44dc, u1_mb_num=u1_mb_num@entry=0 '\000')
    at external/libavc/decoder/ih264d_process_intra_mb.c:725
725 {
(gdb) x/16b ps_dec->pv_proc_tu_coeff_data
0xb5140600: 0x01  0x01  0x01  0x01  0xff  0xff  0xff  0xff
0xb5140608: 0x09  0x20  0x00  0x00  0x00  0x00  0x00  0x00
(gdb) p/x ps_dec->pu1_recon_mb_map 
$31 = 0xb60ff400
(gdb) x/128b 0xb60ff400
0xb60ff400: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff408: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff410: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff418: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff420: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff428: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff430: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff438: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff440: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff448: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff450: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff458: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff460: 0x01  0x01  0x01  0x01  0x01  0x01  0x01  0x01
0xb60ff468: 0x01  0x01  0x01  0x01  0x01  0x01  0x00  0x00
0xb60ff470: 0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00
0xb60ff478: 0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00

From the output above, we can see that ih264d_process_intra_mb is called from the loop that starts on line 578 of the function ih264d_recon_deblk_slice. The debug info below shows the state when the breakpoint is hit for the 11th time.

(gdb) c
Continuing.

Breakpoint 6, ih264d_process_intra_mb (ps_dec=ps_dec@entry=0xb608f000, 
    ps_cur_mb_info=ps_cur_mb_info@entry=0xb52b46f8, u1_mb_num=u1_mb_num@entry=10 '\n')
    at external/libavc/decoder/ih264d_process_intra_mb.c:725
725 {
(gdb) x/16b ps_dec->pv_proc_tu_coeff_data
0xb5140828: 0x00  0x00  0x40  0x03  0x80  0xfc  0x80  0xfc
0xb5140830: 0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00
...
(gdb)
764     UWORD8 *pu1_prev_intra4x4_pred_mode_data = (UWORD8 *)ps_dec->pv_proc_tu_coeff_data;                 //Pointer to keep track of intra4x4_pred_mode data in pv_proc_tu_coeff_data buffer
(gdb) n
767     u4_num_pmbair = (u1_mb_num >> u1_mbaff);
(gdb) p/x pu1_prev_intra4x4_pred_mode_data
$20 = 0xb5140828
(gdb) x/16b pu1_prev_intra4x4_pred_mode_data
0xb5140828: 0x00  0x00  0x40  0x03  0x80  0xfc  0x80  0xfc
0xb5140830: 0x00  0x00  0x00  0x00  0x00  0x00  0x00  0x00
...
(gdb) p/x pu1_prev_intra4x4_pred_mode_flag
$21 = 0xb5140828
(gdb) p/x pu1_rem_intra4x4_pred_mode
$22 = 0xb514082c
(gdb) x/4b pu1_rem_intra4x4_pred_mode
0xb514082c: 0x80  0xfc  0x80  0xfc
(gdb) x/4b pu1_prev_intra4x4_pred_mode_flag
0xb5140828: 0x00  0x00  0x40  0x03
1652              i1_intra_pred = ((i1_left_pred_mode < 0) | (i1_top_pred_mode < 0)) ?
(gdb) 
1665                  if(!pu1_prev_intra4x4_pred_mode_flag[u1_sub_mb_num])
(gdb) 
1669                                                      >= i1_intra_pred);
(gdb) p/x i1_intra_pred 
$25 = 0x2
(gdb) n
1668                                      + (pu1_rem_intra4x4_pred_mode[u1_sub_mb_num]
(gdb) 
1667                      i1_intra_pred = pu1_rem_intra4x4_pred_mode[u1_sub_mb_num]
(gdb) n
1671        if(i1_intra_pred<0)
(gdb) p/x i1_intra_pred 
$26 = 0x81
(gdb) p i1_intra_pred 
$27 = -127 '\201'
(gdb) p/x u1_sub_mb_num 
$28 = 0x0

Go back to the source code. Line 1563 is the start of a loop in which line 1634 calculates the value of i1_intra_pred, which is 0x02. Next, at line 1647, pu1_prev_intra4x4_pred_mode_flag points to the buffer |00 00 40 03| and u1_sub_mb_num is 0x0, so pu1_prev_intra4x4_pred_mode_flag[u1_sub_mb_num] is 0x00 and the ‘if’ condition is true. Line 1649 then recalculates i1_intra_pred. pu1_rem_intra4x4_pred_mode points to the buffer |80 fc 80 fc|, so i1_intra_pred = 0x80 + (0x80 >= 0x02) = 0x81.
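This arithmetic, and the value -127 that GDB prints for i1_intra_pred, can be reproduced outside the decoder. The sketch assumes that WORD8 and UWORD8 correspond to int8_t and uint8_t; toy_intra_pred_mode is a hypothetical helper, not a libavc function.

```c
#include <stdint.h>

/*
 * Mirrors the recalculation of i1_intra_pred:
 *   rem + (rem >= predicted)
 * with the result stored in a signed 8-bit variable.
 */
int8_t toy_intra_pred_mode(uint8_t rem_mode, int8_t predicted)
{
    /* Both operands are promoted to int, so the sum itself cannot overflow. */
    int sum = rem_mode + (rem_mode >= predicted);

    /*
     * Narrowing to int8_t: for sum > 127 the result is implementation-defined
     * in C; GCC and Clang wrap modulo 256, so 129 becomes -127.
     */
    return (int8_t)sum;
}
```

For rem_mode = 0x80 and predicted = 0x02, the sum is 129 (0x81), which GCC and Clang narrow to -127. A well-formed input such as rem_mode = 3 gives 4 and stays in range.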

The definition of pu1_rem_intra4x4_pred_mode is on line 1359.

1359 UWORD8 *pu1_rem_intra4x4_pred_mode = pu1_prev_intra4x4_pred_mode_data + 4;

The definition of i1_intra_pred is on line 1357.

1357 WORD8 i1_intra_pred;

pu1_rem_intra4x4_pred_mode is a pointer to unsigned char, and i1_intra_pred is a signed char. The value of pu1_rem_intra4x4_pred_mode[0] is therefore an unsigned char, and the computed result 0x81 (129) does not fit in a signed char, whose maximum value is 127. When the result is assigned to i1_intra_pred, it overflows and i1_intra_pred becomes -127. Next, go to line 1681.

1681  ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
                                    au1_ngbr_pels, pu1_luma_rec_buffer, 1,
                                    ui_rec_width,
                                    ((u1_is_top_sub_block << 2) | u1_is_left_sub_block));
                }

ps_dec->apf_intra_pred_luma_8x8 is an array of function pointers whose length is 0x09. C does not check array bounds, so a negative index is accepted, but it reads memory located before the start of the array. Here the program loads a function pointer from an unexpected memory address and jumps to it. The target address could be in the code segment or the data segment.
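With 4-byte pointers on this 32-bit ARM target, an index of -127 makes the lookup read the word located 127 * 4 = 508 bytes before the start of the table. The sketch below shows one generic way to reject such an index before dispatching. It is only an illustration and is not the fix Google applied, which addresses the slice number handling discussed later in this post. toy_pred_fn, toy_lookup_pred, and toy_pred_dc are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_LUMA_MODES 9   /* table length stated above (0x09) */

/* Hypothetical handler type; the real handlers take several arguments. */
typedef int (*toy_pred_fn)(void);

/* Returns the handler for `mode`, or NULL when the index is out of range. */
toy_pred_fn toy_lookup_pred(const toy_pred_fn table[TOY_NUM_LUMA_MODES],
                            int8_t mode)
{
    if (mode < 0 || mode >= TOY_NUM_LUMA_MODES)
        return NULL;   /* reject instead of reading outside the table */
    return table[mode];
}

/* Dummy handler used to populate a table when trying the lookup. */
int toy_pred_dc(void)
{
    return 2;
}
```

A caller that receives NULL can treat the macroblock as corrupt instead of branching to whatever word happens to precede the table.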

The following is the debug info.

(gdb) n
1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) si
0xb5efbc00  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) si
0xb5efbc04  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) x/10i $pc
=> 0xb5efbc04:   movs    r2, #1
   0xb5efbc06:   ldr r0, [sp, #84]   ; 0x54
   0xb5efbc08:   ldr r3, [sp, #32]
   0xb5efbc0a:   add.w   r8, r1, r7, lsl #2
   0xb5efbc0e:   mov r1, r9
   0xb5efbc10:   ldr.w   r7, [r8, #4]
   0xb5efbc14:   blx r7
   0xb5efbc16:   ldr r0, [sp, #60]   ; 0x3c
   0xb5efbc18:   ldrb    r3, [r0, #2]
   0xb5efbc1a:   asrs    r3, r6
(gdb) si
0xb5efbc06  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) 
0xb5efbc08  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) 
0xb5efbc0a  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) si
0xb5efbc0e  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) x/10i $pc
=> 0xb5efbc0e:   mov r1, r9
   0xb5efbc10:   ldr.w   r7, [r8, #4]
   0xb5efbc14:   blx r7
   0xb5efbc16:   ldr r0, [sp, #60]   ; 0x3c
   0xb5efbc18:   ldrb    r3, [r0, #2]
   0xb5efbc1a:   asrs    r3, r6
   0xb5efbc1c:   lsls    r0, r3, #31
   0xb5efbc1e:   bpl.n   0xb5efbc78
   0xb5efbc20:   ldr r7, [sp, #96]   ; 0x60
   0xb5efbc22:   movs    r1, #1
(gdb) si
0xb5efbc10  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) 
0xb5efbc14  1713                        ps_dec->apf_intra_pred_luma_8x8[i1_intra_pred](
(gdb) i r
r0             0xb4c407e8   3032745960
r1             0xb4e2adc0   3034754496
r2             0x1  1
r3             0x1a0    416
r4             0x0  0
r5             0x81 129
r6             0x0  0
r7             0xb60f8fc8   3054473160
r8             0xb60d05dc   3054306780
r9             0xb4e2adc0   3034754496
r10            0xb60cf4a1   3054302369
r11            0x0  0
r12            0xb4e2b77f   3034756991
sp             0xb4c40688   0xb4c40688
lr             0x0  0
pc             0xb5efbc14   0xb5efbc14

cpsr           0x70030  458800
(gdb) si
Cannot access memory at address 0x0
0xb60f8fc8 in ?? ()
(gdb) x/8i $pc
=> 0xb60f8fc8:  ldrbtlt r12, [r10], #3104   ; 0xc20
   0xb60f8fcc:  ldrbtlt r0, [sp], #400  ; 0x190
   0xb60f8fd0:  muleq   r0, r0, r0
   0xb60f8fd4:  andeq   r0, r0, r0
   0xb60f8fd8:  andeq   r0, r0, r0
   0xb60f8fdc:  lsreq   r0, r0, #3
   0xb60f8fe0:  adcseq  r0, r8, r0, ror r1
   0xb60f8fe4:  andeq   r0, r0, r0

We can use the command "cat /proc/[pid]/maps" to check the memory map.

b60c0000-b6100000 rw-p 00000000 00:00 0          [anon:libc_malloc]

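The address 0xb60f8fc8 lies between 0xb60c0000 and 0xb6100000, a heap mapping with permissions rw-p, that is, readable and writable but not executable. Attempting to execute code at this address therefore fails, and the process crashes with SIGSEGV.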
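The same check can be done programmatically. The following sketch parses a single line in the /proc/[pid]/maps format and reports whether an address falls inside that mapping and whether the mapping is executable. toy_addr_in_mapping is a hypothetical helper written for this illustration; it only reads the address range and the permission field.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parses one line of /proc/[pid]/maps.
 * Returns 1 if addr is inside the mapping (and sets *is_exec),
 * 0 if it is outside, and -1 if the line cannot be parsed.
 */
int toy_addr_in_mapping(const char *line, unsigned long addr, int *is_exec)
{
    unsigned long start, end;
    char perms[5];   /* e.g. "rw-p" plus the terminating NUL */

    if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
        return -1;
    if (addr < start || addr >= end)   /* the end address is exclusive */
        return 0;

    *is_exec = (strchr(perms, 'x') != NULL);
    return 1;
}
```

For the mapping line shown above and the address 0xb60f8fc8, the function returns 1 and sets *is_exec to 0, matching the manual reading of the map.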

In summary, we have drawn the code execution flow chart below to show how the vulnerability is triggered in a multithread environment. 

Figure 6. The code execution flow to trigger vulnerability

Finally, let’s look at Google’s patch for this issue (see https://android.googlesource.com/platform/external/libavc/+/a78887bcffbc2995cf9ed72e0697acf560875e9e). Google fixed the slice number increment for error clips. The changes to ih264d_parse_slice.c are shown below.

diff --git a/decoder/ih264d_parse_slice.c b/decoder/ih264d_parse_slice.c
index 5ff92f8..73bc45d 100644
--- a/decoder/ih264d_parse_slice.c
+++ b/decoder/ih264d_parse_slice.c
@@ -374,6 +374,7 @@
     ps_dec->ps_parse_cur_slice = &(ps_dec->ps_dec_slice_buf[0]);
     ps_dec->ps_decode_cur_slice = &(ps_dec->ps_dec_slice_buf[0]);
     ps_dec->ps_computebs_cur_slice = &(ps_dec->ps_dec_slice_buf[0]);
+    ps_dec->u2_cur_slice_num = 0;

     /* Initialize all the HP toolsets to zero */
     ps_dec->s_high_profile.u1_scaling_present = 0;
@@ -573,7 +574,6 @@
     ps_dec->u2_mv_2mb[1] = 0;
     ps_dec->u1_last_pic_not_decoded = 0;
-    ps_dec->u2_cur_slice_num = 0;
     ps_dec->u2_cur_slice_num_dec_thread = 0;
     ps_dec->u2_cur_slice_num_bs = 0;
     ps_dec->u4_intra_pred_line_ofst = 0;
@@ -1425,7 +1425,10 @@
     }

     if (ps_dec->u4_first_slice_in_pic == 0)
+    {
         ps_dec->ps_parse_cur_slice++;
+        ps_dec->u2_cur_slice_num++;
+    }
     ps_dec->u1_slice_header_done = 0;

@@ -1908,7 +1911,6 @@
     if(ret != OK)
         return ret;
-    ps_dec->u2_cur_slice_num++;
     /* storing last Mb X and MbY of the slice */
     ps_dec->i2_prev_slice_mbx = ps_dec->u2_mbx;
     ps_dec->i2_prev_slice_mby = ps_dec->u2_mby;

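The patch removes the increment that followed the error check (the last hunk) and instead increments ps_dec->u2_cur_slice_num together with ps_dec->ps_parse_cur_slice whenever the slice is not the first one in the picture. The reset of ps_dec->u2_cur_slice_num to 0 is also moved earlier.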

Combined with our analysis, after patching, ps_dec->u2_cur_slice_num will be 0x01 after handling the specially crafted NAL unit. When the program handles the next NAL unit, it executes in the function ih264d_parse_tfr_nmb.

87        for(i = 0; i < u1_num_mbs; i++)       // u1_num_mbs is 0x16
88        {
89            UPDATE_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, u4_mb_num,
90                                 ps_dec->u2_cur_slice_num);

91            DATA_SYNC();
92            UPDATE_MB_MAP_MBNUM_BYTE(ps_dec->pu1_dec_mb_map, u4_mb_num);
93
94            u4_mb_num++;
95        }

The buffer pointed to by ps_dec->pu2_slice_num_map at offset u4_mb_num will be updated with 0x01 in a loop. Next, the program continues to run in the function ih264d_recon_deblk_slice.

578        for(j = 0; j < recon_mb_grp; j++)
579        {
580            GET_SLICE_NUM_MAP(ps_dec->pu2_slice_num_map, ps_dec->cur_recon_mb_num,
581                              u2_slice_num);
582
583            if(u2_slice_num != ps_dec->u2_cur_slice_num_bs) // here ps_dec->u2_cur_slice_num_bs is 0x0
584            {
585                u4_slice_end = 1;
586                break;
587            }
588            if(ps_dec->i1_recon_in_thread3_flag)

598                    if((u1_ipcm_th + 25) != ps_cur_mb_info->u1_mb_type)
599                    {
600                        ps_cur_mb_info->u1_mb_type -= (u1_skip_th + 1);
601                        ih264d_process_intra_mb(ps_dec, ps_cur_mb_info, j);
602                    }
603                }

u2_slice_num will be 0x01 via line 580 GET_SLICE_NUM_MAP. ps_dec->u2_cur_slice_num_bs is 0x0, so it breaks the loop. Then the function ih264d_process_intra_mb on line 601 will not be called, and the vulnerability will not be triggered.
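The difference between the unpatched and patched behavior comes down to the comparison on line 583. The sketch below models only that comparison. toy_recon_mbs and toy_fill_slice_map are hypothetical helpers, and the macroblock range used in the usage note is illustrative, since the real loop runs over recon_mb_grp macroblocks at a time.

```c
#include <stdint.h>

/*
 * Models the loop at lines 578-608: walk `count` macroblocks and stop at the
 * first one whose slice number differs from the slice being reconstructed.
 * Returns how many macroblocks would reach the intra-prediction step.
 */
uint32_t toy_recon_mbs(const uint16_t *slice_num_map, uint32_t first_mb,
                       uint32_t count, uint16_t cur_slice_num_bs)
{
    uint32_t j;
    for (j = 0; j < count; j++) {
        if (slice_num_map[first_mb + j] != cur_slice_num_bs)
            break;   /* slice boundary: stop processing */
        /* ih264d_process_intra_mb would run here in the real decoder */
    }
    return j;
}

/* Fills `count` map entries with `slice_num`, like the loop at lines 87-95. */
void toy_fill_slice_map(uint16_t *slice_num_map, uint32_t first_mb,
                        uint32_t count, uint16_t slice_num)
{
    uint32_t i;
    for (i = 0; i < count; i++)
        slice_num_map[first_mb + i] = slice_num;
}
```

When the map entries for macroblocks 0x58 onward hold 0 (unpatched), toy_recon_mbs with cur_slice_num_bs = 0 processes every macroblock. When they hold 1 (patched), it returns 0 and the intra-prediction step is never reached.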

Demo

As mentioned in the “Proof of Concept” section, this vulnerability exists in the software-based H.264 decoder. Mediaserver normally prefers the hardware-based H.264 decoder shipped with most Android devices over the vulnerable software-based one. If the hardware-based H.264 decoder is chosen to parse the PoC file, the vulnerability is not triggered. Applications supporting H.264 media, however, could be affected, depending on which decoder they choose.

We developed an Android app that can demonstrate this vulnerability. From the video below, you can see that the Mediaserver crashed and restarted. 

 

 

 

Mitigation

All users of Google Android are encouraged to upgrade to the latest version of the software. Additionally, organizations that have deployed Fortinet IPS solutions are already protected from this vulnerability with the signature Google.Android.Mediaserver.Remote.Code.Execution.

Timeline

2016-05-06: Kai Lu of Fortinet's FortiGuard Labs reported this vulnerability to Google
2016-05-31: Google confirmed this vulnerability and set the severity to Critical
2016-08-01: Google released the patch
2016-08-05: Advisory posted by Fortinet's FortiGuard

 
