An Analysis of Audio/Video Synchronization Principles

Last updated: 2022-04-01 15:57:51

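Posted 2013-04-18 15:21:11. Tags: [audio/video](http://blog.51cto.com/tag-%E9%9F%B3%E9%A2%91%E8%A7%86%E9%A2%91.html)

### What exactly are DTS and PTS in a video stream?

DTS (Decoding Time Stamp) and PTS (Presentation Time Stamp) are the times, relative to the SCR (System Clock Reference), at which the decoder should decode and display a frame. The SCR can be thought of as the time at which the decoder should start reading the data from disk.

Every pack in an MPEG file carries an SCR timestamp, which is the system time at which that pack is read. Normally the decoder starts its system clock when it begins reading the MPEG stream (the clock's initial value is the SCR of the first pack, usually 0, although it does not have to start at 0).

The DTS tells the decoder to decode when the SCR time equals the DTS; the PTS works the same way for display. DTS/PTS normally indicate a time later than the SCR of the pack that carries the audio or video data. For example, if a video pack has an SCR of 100 ms (meaning it is read from disk 100 ms after playback starts), its DTS/PTS would be roughly 200/280 ms: the video data should be decoded when the SCR reaches 200 ms and displayed 80 ms after that (the data is held in a buffer until decoding starts).

Underflow usually occurs when the mux rate is set too low relative to the video data rate. If the mux rate is 1000000 bits/sec (meaning the decoder reads the file at 1000000 bits/sec) but the video rate is 2000000 bits/sec (meaning video data must be displayed at 2000000 bits/sec), the video data cannot be read from disk fast enough: one second of reading does not deliver one second of video. In that case the DTS/PTS timestamps would ask for the video to be decoded or displayed before it has been read from disk (the DTS/PTS would be earlier than the SCR of the pack that contains them).

Nowadays, depending on the decoder, this is mostly not a problem (although an MPEG file with underflows does not fully comply with the MPEG standard, which says there should be none). Some decoders, including many well-known PC-based players, read the file as fast as they need to in order to display the video and ignore the SCR where they can.

Note that in the listing you provided, the average video bit rate is about 3 Mbps (3000000 bits/sec) but the peak reaches 14 Mbps (which is very high; DVD is limited to 9.8 Mbps). This means the mux rate has to be set high enough to handle the 14 Mbps sections, and the mux rate that bbMPEG calculates is sometimes too low, which causes underflows.

Do you intend the video bit rate to be that high? It exceeds the DVD specification, and the file will probably not play on most standalone players. If that is not what you intend, I would raise mquant from 1 and set the maximum bit rate to 9 Mbps in the video settings to keep the stream smaller.

If you really do want the video bit rate that high, you need to increase the mux rate. The listing shows that bbMPEG is using a mux rate of 14706800 bits/sec, or 1838350 bytes/sec (the line reporting the total data rate as 1838350 bytes/sec (14706800 bits/sec)). The value in the forced mux rate field is expressed in bytes/sec divided by 50. So I would start at 36767 (1838350 / 50) and keep increasing it until the underflow errors stop.

### Audio/video synchronization principles [ffmpeg]

The rough flow for decoding a video file with ffmpeg is listed below. The function names come from the old FFmpeg API in use when this was written; several have since been renamed or removed (for example, `av_open_input_file()` was replaced by `avformat_open_input()`).

1. Register all container formats and codecs: av_register_all()
2. Open the file: av_open_input_file()
3. Extract stream information from the file: av_find_stream_info()
4. Iterate over all streams and find the one whose type is CODEC_TYPE_VIDEO
5. Find the corresponding decoder: avcodec_find_decoder()
6. Open the codec: avcodec_open()
7. Allocate memory for the decoded frame: avcodec_alloc_frame()
8. Keep reading frame data from the stream: av_read_frame()
9. Check the frame type and, for video frames, call: avcodec_decode_video()
10. When decoding is finished, release the decoder: avcodec_close()
11. Close the input file: av_close_input_file()

The A/V sync code in output_example.c is as follows (my code has some modifications). The implementation is quite simple, but it illustrates the point well.

### Audio/video synchronization: timestamps

When media content is played, nothing is more of a headache than audio and video drifting out of sync. Technically, the best solution to A/V synchronization is timestamps:

1. Choose a reference clock (its time must increase linearly).
2. When generating the stream, stamp every data block with a time taken from the reference clock (usually a start time and an end time).
3. During playback, read the timestamp on each block and schedule playback against the current reference clock time. If the block's start time is later than the current reference time, do not rush to play it; wait until the reference clock reaches the start time. If the block's start time is earlier than the current reference time, play it "as soon as possible" or simply "drop" it so that playback catches up with the reference clock.

There are therefore two keys to avoiding A/V desynchronization. The first is to apply correct timestamps when the stream is generated. If the timestamps on the data blocks are wrong, no amount of adjustment during playback will help. Suppose the video stream starts at 0 s and someone starts speaking at 10 s, so an audio stream has to be added: the audio stream's start time should be 10 s. If its timestamps start from 0 s or any other time, the muxed stream is already out of sync. When timestamping, the video and audio streams each refer to the reference clock and never to each other; in other words, they are synchronized through a neutral third party, the reference clock.

The second key is controlling the streams during playback based on the timestamps, that is, handling blocks that arrive early or late differently. In Figure 2.8, while the video content is playing during 0-10 s of reference clock time, an audio block that has already arrived must not be played immediately; it has to wait until the reference clock reaches 10 s, otherwise audio and video will be out of sync.

With timestamp-based playback, merely waiting on early blocks or rushing late ones is sometimes not enough. To adjust playback more actively and effectively, a feedback mechanism is needed: the fact that the stream is currently too fast or too slow is reported back to the "source", which then slows down or speeds up the stream. Readers familiar with DirectShow will know that its Quality Control is exactly such a feedback mechanism, and DirectShow's solution to A/V synchronization is quite good. The WMF SDK, however, only reads and decodes the ASF stream during playback and is not responsible for the final presentation of the audio and video, so it lacks such a feedback mechanism.

Audio/video synchronized communication SDK source packages:

- Android: http://down.51cto.com/data/711001
- Windows: http://down.51cto.com/data/715497
- Linux: http://download.csdn.net/detail/weixiaowenrou/5169796
- iOS: http://down.51cto.com/data/715486
- Web: http://down.51cto.com/data/710983

http://6352513.blog.51cto.com/6342513/1180742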
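The wait / play / drop rule described in the timestamp section can be sketched in C as follows. This is only an illustration under stated assumptions: `SyncAction`, `decide_action`, and the `tolerance` and `drop_after` thresholds are hypothetical names chosen for this example and are not part of FFmpeg, DirectShow, or any SDK mentioned here.

```c
#include <assert.h>

/* Hypothetical types and names, for illustration only. */
typedef enum { ACTION_WAIT, ACTION_PLAY, ACTION_DROP } SyncAction;

/*
 * Decide what to do with a data block whose start timestamp is start_ts,
 * given the current reference-clock time clock_now (both in seconds).
 *   tolerance:  how far ahead of the clock a block may be and still be played
 *   drop_after: how late a block may be before it is discarded
 */
static SyncAction decide_action(double start_ts, double clock_now,
                                double tolerance, double drop_after)
{
    assert(tolerance >= 0.0 && drop_after >= 0.0);

    double lateness = clock_now - start_ts; /* > 0 means the block is late */

    if (lateness < -tolerance) {
        return ACTION_WAIT;  /* early: hold until the clock catches up */
    }
    if (lateness > drop_after) {
        return ACTION_DROP;  /* too late: discard so playback can catch up */
    }
    return ACTION_PLAY;      /* on time, or only slightly late: play now */
}
```

For the audio block stamped at 10 s in the example above, `decide_action(10.0, 0.0, 0.01, 0.5)` says to wait, `decide_action(10.0, 10.0, 0.01, 0.5)` says to play, and at a clock of 12 s the block would be dropped. A real player would also have to choose the two thresholds to suit its buffering; the values here are arbitrary.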

ffmpeg tutorial 6: Audio Synchronization in Practice

Last updated: 2022-04-01 15:57:48

## Audio synchronization

First impression: playback speed is finally even, although it feels rather fast. The new functions follow the same scheme as video synchronization, and the large added functions all have "audio" in their names. I hope to analyze synchronize_audio again on the next read-through.

## Comparing tutorial 5 and tutorial 6

The structure is a little messy. Roughly, the code added is as follows.

### Clock-selection interface functions

Newly added: double get_video_clock(VideoState *is), together with get_external_clock and get_master_clock.

~~~
double get_video_clock(VideoState *is) {
  double delta;

  delta = (av_gettime() - is->video_current_pts_time) / 1000000.0;
  return is->video_current_pts + delta;
}

double get_external_clock(VideoState *is) {
  return av_gettime() / 1000000.0;
}

double get_master_clock(VideoState *is) {
  if(is->av_sync_type == AV_SYNC_VIDEO_MASTER) {
    return get_video_clock(is);
  } else if(is->av_sync_type == AV_SYNC_AUDIO_MASTER) {
    return get_audio_clock(is);
  } else {
    return get_external_clock(is);
  }
}
~~~

### The key addition: synchronize_audio

The second addition is a function analogous to the previous chapter's video synchronization function: one that synchronizes audio. It deserves repeated reading. Given its importance, it is covered in two parts, the code listing and the code analysis. The points I find hardest are marked with comments in the listing.

~~~
/* Add or subtract samples to get a better sync, return new audio buffer size */
int synchronize_audio(VideoState *is, short *samples, int samples_size, double pts) {
  int n;
  double ref_clock;

  n = 2 * is->audio_st->codec->channels; /* bytes per sample frame (16-bit samples) */

  if(is->av_sync_type != AV_SYNC_AUDIO_MASTER) {
    double diff, avg_diff;
    int wanted_size, min_size, max_size, nb_samples;

    ref_clock = get_master_clock(is);
    diff = get_audio_clock(is) - ref_clock;
    if(diff < AV_NOSYNC_THRESHOLD) {
      // accumulate the diffs
      is->audio_diff_cum = diff + is->audio_diff_avg_coef * is->audio_diff_cum;
      if(is->audio_diff_avg_count < AUDIO_DIFF_AVG_NB) { // a formula is involved here
        is->audio_diff_avg_count++;
      } else {
        avg_diff = is->audio_diff_cum * (1.0 - is->audio_diff_avg_coef);
        // this whole function is hard to understand
        if(fabs(avg_diff) >= is->audio_diff_threshold) {
          wanted_size = samples_size + ((int)(diff * is->audio_st->codec->sample_rate) * n);
          /* NOTE: the inner expressions are integer divisions. For any
             SAMPLE_CORRECTION_PERCENT_MAX from 1 to 99 they evaluate to 0 and 1,
             so min_size is 0 and max_size equals samples_size. */
          min_size = samples_size * ((100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100);
          max_size = samples_size * ((100 + SAMPLE_CORRECTION_PERCENT_MAX) / 100);
          if(wanted_size < min_size) {
            wanted_size = min_size;
          } else if (wanted_size > max_size) {
            wanted_size = max_size;
          }
          if(wanted_size < samples_size) {
            /* remove samples */
            samples_size = wanted_size;
          } else if(wanted_size > samples_size) {
            uint8_t *samples_end, *q;
            int nb;

            /* add samples by copying final sample */
            /* NOTE: nb is negative on this branch, so the loop below never
               runs; wanted_size - samples_size was presumably intended. */
            nb = (samples_size - wanted_size);
            samples_end = (uint8_t *)samples + samples_size - n;
            q = samples_end + n;
            while(nb > 0) {
              memcpy(q, samples_end, n);
              q += n;
              nb -= n;
            }
            samples_size = wanted_size;
          }
        }
      }
    } else {
      /* difference is TOO big; reset diff stuff */
      is->audio_diff_avg_count = 0;
      is->audio_diff_cum = 0;
    }
  }
  return samples_size;
}
~~~

### One of the most-changed functions: audio_callback

~~~
void audio_callback(void *userdata, Uint8 *stream, int len) {
  VideoState *is = (VideoState *)userdata;
  int len1, audio_size;
  double pts;

  while(len > 0) {
    if(is->audio_buf_index >= is->audio_buf_size) {
      /* We have already sent all our data; get more */
      audio_size = audio_decode_frame(is, is->audio_buf, sizeof(is->audio_buf), &pts);
      if(audio_size < 0) {
        /* If error, output silence */
        is->audio_buf_size = 1024;
        memset(is->audio_buf, 0, is->audio_buf_size);
      } else {
        audio_size = synchronize_audio(is, (int16_t *)is->audio_buf, audio_size, pts);
        is->audio_buf_size = audio_size;
      }
      is->audio_buf_index = 0;
    }
    len1 = is->audio_buf_size - is->audio_buf_index;
    if(len1 > len)
      len1 = len;
    memcpy(stream, (uint8_t *)is->audio_buf + is->audio_buf_index, len1);
    len -= len1;
    stream += len1;
    is->audio_buf_index += len1;
  }
}
~~~

That is the code. Here is a side-by-side comparison of the two versions of the callback function:

![](https://docs.gechiui.com/gc-content/uploads/sites/kancloud/2016-02-22_56cae4b84da80.jpg)

As the figure shows, the key change is in the processing after decoding. Pay particular attention to the parts marked in red. By comparison, the video synchronization code lives mainly in video_thread.

### Summary: how to read the code

For audio, the entry point is the callback function; start there and then move on to the synchronization function. For video, by contrast, start from the thread and read it together with the refresh timer in the main function.

## Code in practice

The following works directly from the log: here is the output produced by one run of the code, to be matched against the code later.

~~~
Function: synchronize_audio(VideoState *, short *, int, double), diff=0.027000000000000000
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= 0.027000000000000000
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=0
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
// This is one round of fetching, decoding and playing audio (round 0). The same process continues through round 18; see the comments below for what changes.
// The 18 repeated rounds in between are omitted.
// This is the 20th audio fetch-and-play; 20 is a manually chosen threshold (see the macro definition). avg_diff is now computed; in other words, a different code path is taken.
// For the first time (note: the first time) the sample-correction branch is entered. The results that follow seem to mean we want to make the sample buffer smaller??
Function: synchronize_audio(VideoState *, short *, int, double), diff=-13.566370441424555
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -13.573071923920748
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061, 【when】is->audio_diff_avg_count:20 >10 //
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size ：-216580 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
~~~

First impression: the video finished playing, but the audio had not.

Two details in this log help explain the sudden drop to 0. The log prints min_size as 0: in synchronize_audio the factor (100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100 is an integer division that yields 0, so once correction starts, wanted_size is clamped to 0 and the whole buffer is discarded. The logged avg_diff of -9.2559631349317831e+061 is what the byte 0xCC repeated eight times looks like when read as a double, which is the fill value MSVC debug builds use for uninitialized local variables, so that log line was most likely printed before avg_diff was assigned.

**Table 1:** audio_size (the value returned by synchronize_audio)

Compared with the old code, is this value larger or smaller? Todo: work out what the value was in the old version. Could it be analyzed from the audio bit rate? There are 20 entries in all, 20*480.

<table border="1" cellspacing="0" cellpadding="7" width="569"><colgroup><col width="553"/></colgroup><tbody><tr><td valign="top" width="553">
Line 4: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 8: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 12: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 16: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 20: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 24: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 28: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 32: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 36: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 40: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 44: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 48: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 52: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 56: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 60: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 64: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 68: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 72: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 76: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 80: Function: audio_callback(void *, unsigned char *, int),auido_size= 480<br/>
Line 86: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 92: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 98: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 104: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 110: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 116: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 122: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 128: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 134: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 140: Function: audio_callback(void *, unsigned char *, int),auido_size= 0<br/>
Line 146: Function: audio_callback(void *, unsigned char *, int),auido_size= 0
</td></tr></tbody></table>

### Appendix: the complete log with analysis notes

~~~
Function: synchronize_audio(VideoState *, short *, int, double), diff=0.027000000000000000
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= 0.027000000000000000
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=0
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
// This is one round of fetching, decoding and playing audio (round 0). The same process continues through round 18; see the comments below for what changes.
Function: synchronize_audio(VideoState *, short *, int, double), diff=-0.14302037207122775
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -0.14300687207122775
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=1
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-0.33604274414245550
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -0.33611424757849112
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=2
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-0.50606311621368327
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -0.50623117333747247
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=3
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-0.67608448828491097
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -0.67633760387157971
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=4
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-10.821046860356137
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -10.821385029158073
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=5
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-10.991067232427365
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -10.996477924941944
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=6
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-11.173088604498595
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -11.178586843461066
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=7
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-11.343109976569822
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -11.348699269991553
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=8
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-11.565133348641048
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -11.570807698276043
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=9
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-11.735154720712277
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -11.740940124561416
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=10
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-11.905175092783505
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -11.911045562845786
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=11
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-12.076196464854734
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -12.082151987636157
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=12
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-12.277218836925961
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -12.283259912919778
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=13
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-12.447239208997187
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -12.453380838953647
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=14
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-12.639261581068416
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -12.645488271487892
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=15
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-12.856284953139644
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -12.862607697275388
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=16
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-13.028306325210870
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -13.034737629059508
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=17
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-13.225328697282098
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -13.231846066096628
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=18
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
Function: synchronize_audio(VideoState *, short *, int, double), diff=-13.396349069353327
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -13.402964992386377
Function: synchronize_audio(VideoState *, short *, int, double), audio_diff_avg_count=19
Function: audio_callback(void *, unsigned char *, int),auido_size= 480
// This is the 20th audio fetch-and-play; 20 is a manually chosen threshold. avg_diff is now computed.
// For the first time (note: the first time) the sample-correction branch is entered. The results that follow seem to mean we want to make the sample buffer smaller??
Function: synchronize_audio(VideoState *, short *, int, double), diff=-13.566370441424555
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -13.573071923920748
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10 //
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-216580 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-13.836396813495783
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -13.843183349457744
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-220902 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-14.106423185567010
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -14.113344777241739
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-225222 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-14.376449557638237
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -14.383506230026859
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-229542 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-14.646475929709466
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -14.653667682824478
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-233862 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-14.926503301780693
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -14.933830135622106
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-238344 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-15.196529673851920
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -15.203996588919731
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-242664 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-15.466556045923149
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -15.474158044217608
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-246984 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-15.736582417994377
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -15.744319497016486
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-251304 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-16.040610790065607
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -16.048482949814115
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-256168 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-16.310638162136833
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -16.318662403611739
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-260490 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-16.582664534208060
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -16.590823865409867
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-264842 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
Function: synchronize_audio(VideoState *, short *, int, double), diff=-16.868691906279288
Function: synchronize_audio(VideoState *, short *, int, double), // accumulate the diffs= -16.876987318211992
Function: synchronize_audio(VideoState *, short *, int, double), avg_diff=-9.2559631349317831e+061,【when】is->audio_diff_avg_count:20 >10
Function: synchronize_audio(VideoState *, short *, int, double), wanted_size < min_size：-269418 < 0
Function: synchronize_audio(VideoState *, short *, int, double), /* remove samples */
Function: audio_callback(void *, unsigned char *, int),auido_size= 0
~~~

## Summary

Algorithms matter. Practice matters. Learn by combining study with code coverage.
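The log above shows `wanted_size < min_size` with `min_size` equal to 0, which follows from the integer division in the `min_size` and `max_size` expressions of `synchronize_audio`. A minimal sketch of a clamp that keeps the percentage intact is given below. It is an illustration, not the tutorial's or ffplay's code: `clamp_wanted_size` and its parameters are hypothetical names, and the value 10 for `SAMPLE_CORRECTION_PERCENT_MAX` is an assumption, since the macro's definition does not appear in this post.

```c
#include <assert.h>

/* Assumed value; the post does not show the macro's definition. */
#define SAMPLE_CORRECTION_PERCENT_MAX 10

/*
 * Hypothetical helper, for illustration only.
 *   samples_size:    current buffer size in bytes
 *   diff:            audio clock minus master clock, in seconds
 *   sample_rate:     samples per second
 *   bytes_per_frame: bytes per sample frame (2 * channels for 16-bit audio)
 * Returns the corrected buffer size, limited to +/- the allowed percentage.
 * Like the original expression, this assumes diff * sample_rate fits in int.
 */
static int clamp_wanted_size(int samples_size, double diff,
                             int sample_rate, int bytes_per_frame)
{
    assert(samples_size >= 0 && sample_rate > 0 && bytes_per_frame > 0);

    int wanted = samples_size + (int)(diff * sample_rate) * bytes_per_frame;

    /* Multiply before dividing so the percentage is not truncated to 0 or 1. */
    int min_size = samples_size * (100 - SAMPLE_CORRECTION_PERCENT_MAX) / 100;
    int max_size = samples_size * (100 + SAMPLE_CORRECTION_PERCENT_MAX) / 100;

    if (wanted < min_size) {
        wanted = min_size;
    } else if (wanted > max_size) {
        wanted = max_size;
    }
    return wanted;
}
```

With the first corrected round from the log (a 480-byte buffer and diff = -13.566370441424555) and an assumed 8000 Hz, 16-bit mono stream, which reproduces the logged wanted_size of -216580, this clamp returns 432 bytes instead of 0, so the buffer shrinks by 10% rather than being discarded. Whether a 13-second offset should be corrected at all is a separate question: a check on the absolute value of `diff` against `AV_NOSYNC_THRESHOLD` would instead reset the averaging for a gap that large.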

tutorial 5: First Impressions of Audio/Video Synchronization

Last updated: 2022-04-01 15:57:46

### 同步视频基础时间戳 dts解码时间戳 pts显示时间戳 ### 动手实践目标 1. 想不想动手实践看看**p**ts第一次被赋值。 1. 重点研究那个video_fresh-timer函数 1. 鉴于分析的视频只有12帧， - 理解actual_delay 期望值 - Audio的时间 - 理解一下那个时间戳 **vs**音频时间的言论** 1. video_fresh-timer关注delay :（现在的时间戳-上一个时间戳）同步阈值两个变量 1. 提前告知 tutorial同步音频到视频， 1. 认识校正方法，是修正delay,要么加倍延迟，要么没有延迟。下一节对于音频来说，要么插值，要么删减？一句话，同步的手段各有。 ### 表面变化：代码变化（tutorial 4 vs tutorial 5）增加的宏如下： ~~~ #define AV_SYNC_THRESHOLD 0.010 //minimum sync time #define AV_NOSYNC_THRESHOLD 10 //the differ of video and audio exceeds the time, do not sync with audio ~~~ 值得注意的数据结构变化如下： ~~~ struct VideoState{ ... double video_clock; double audio_clock; double frame_last_pts; double frame_last_delay; double frame_timer; } VideoState; ~~~ 关于音频当前时钟字段，暂时忽略，详见官方tutorial ### 视频线程代码及变迁 ~~~ int video_thread(void *arg) { VideoState *is = (VideoState *) arg; AVPacket pkt1, *packet = &pkt1; int len1, frameFinished; AVFrame *pFrame; double pts = 0; pFrame = avcodec_alloc_frame(); for (;;) { if (packet_queue_get(&is->videoq, packet, 1) < 0) { // means we quit getting packets break; } pts = 0; //save global pts to be stored in pFrame in first call global_video_pkt_pts = packet->pts; // printf("packet->pts %d\n", packet->pts ); // printf("packet->dts %d\n", packet->dts ); // Decode video frame len1 = avcodec_decode_video2(is->video_st->codec, pFrame, &frameFinished, packet); if (packet->dts == AV_NOPTS_VALUE && pFrame->opaque && *(uint64_t*) pFrame->opaque != AV_NOPTS_VALUE) { pts = *(uint64_t*) pFrame->opaque; } else if (packet->dts != AV_NOPTS_VALUE) { pts = packet->dts; } else { pts = 0; } // printf("pts %d\n", pts ); pts *= av_q2d(is->video_st->time_base); // Did we get a video frame? 
    if (frameFinished) {
      pts = synchronize_video(is, pFrame, pts);
      if (queue_picture(is, pFrame, pts) < 0) {
        break;
      }
    }
    av_free_packet(packet);
  }
  av_free(pFrame);
  return 0;
}
~~~

The differences are as follows. The first is the initialization of the pts value; in the figure below, the parts marked in red are the added dts handling.

![](https://docs.gechiui.com/gc-content/uploads/sites/kancloud/2016-02-22_56cae4b827038.jpg)

The next figure shows that a video synchronization function has also been added.

![](https://docs.gechiui.com/gc-content/uploads/sites/kancloud/2016-02-22_56cae4b839dea.jpg)

Those are the two major changes.

### Video refresh timer: video_refresh_timer

In short: read the English comments in the code, which explain everything in one pass.

<table border="1" cellspacing="0" cellpadding="7" width="569"><colgroup><col width="553"/></colgroup><tbody><tr><td valign="top" width="553">void video_refresh_timer(void *userdata) {
  VideoState *is = (VideoState *) userdata;
  VideoPicture *vp;
  double actual_delay, delay, sync_threshold, ref_clock, diff;
  if (is->video_st) {
    if (is->pictq_size == 0) {
      schedule_refresh(is, 1);
    } else {
      vp = &is->pictq[is->pictq_rindex];
      /* Timing code goes here */
      delay = vp->pts - is->frame_last_pts; // the pts from last
      if (delay <= 0 || delay >= 1.0) {
        // if incorrect delay (non-positive, or 1 s and more, which is unreasonable), use previous one
        delay = is->frame_last_delay;
      }
      /* save for next time */
      is->frame_last_delay = delay;
      is->frame_last_pts = vp->pts;
      /* update delay to sync to audio */
      ref_clock = get_audio_clock(is);
      diff = vp->pts - ref_clock;
      /* Skip or repeat the frame. Take delay into account
         FFPlay still doesn't "know if this is the best guess." */
      sync_threshold = (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;
      if (fabs(diff) < AV_NOSYNC_THRESHOLD) {
        if (diff <= -sync_threshold) {
          delay = 0; // video is slow, do not delay (predicted video time minus current time is negative, so show the frame right away)
        } else if (diff >= sync_threshold) {
          delay = 2 * delay; // video is fast, double the delay so that the refresh slows down
        }
      }
      is->frame_timer += delay;
      /* compute the REAL delay */
      actual_delay = is->frame_timer - (av_gettime() / 1000000.0);
      if (actual_delay < 0.010) {
        /* Really it should skip the picture instead */
        actual_delay = 0.010;
      }
      schedule_refresh(is, (int) (actual_delay * 1000 + 0.5));
      // schedule_refresh(is, 80);
      /* show the picture! */
      video_display(is);
      /* update queue for next picture! */
      /* rest omitted */
} </td></tr></tbody></table>

### Analysis of the experiment results

#### video_thread code

All the key points were marked in color in the original post.

~~~
Video_thread {
  ...
  if (packet->dts == AV_NOPTS_VALUE && pFrame->opaque
      && *(uint64_t*) pFrame->opaque != AV_NOPTS_VALUE) {
    pts = *(uint64_t*) pFrame->opaque;
  } else if (packet->dts != AV_NOPTS_VALUE) {
    pts = packet->dts; // this is the branch taken in the 12-frame test case
  } else {
    pts = 0;
  }
  pts *= av_q2d(is->video_st->time_base);
  // Did we get a video frame?
  if (frameFinished) {
    pts = synchronize_video(is, pFrame, pts);
    // every frame except the first is given a pts; note that this test case has no repeated frames
    if (queue_picture(is, pFrame, pts) < 0) {
      break;
    }
  }
}
~~~

### Results of running the video_refresh_timer code

There is a lot of data here, in particular because this version replaces the fixed-interval refresh of the previous section; it is worth going over more than once.

~~~
void video_refresh_timer(void *userdata) {
  ...
  if (is->video_st) {
    if (is->pictq_size == 0) {
    } else {
      vp = &is->pictq[is->pictq_rindex];
      /* Timing code goes here */
      delay = vp->pts - is->frame_last_pts; // the pts from last
      // the test data is very regular: this value is 1.0 for every frame after the first
      if (delay <= 0 || delay >= 1.0) {
        // if incorrect delay, use previous one
        delay = is->frame_last_delay;
      }
      /* save for next time */
      is->frame_last_delay = delay;
      is->frame_last_pts = vp->pts;
      /* update delay to sync to audio */
      ref_clock = get_audio_clock(is);
      diff = vp->pts - ref_clock; // see the table below for the actual values
      /* Skip or repeat the frame. Take delay into account
         FFPlay still doesn't "know if this is the best guess." */
      sync_threshold = (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;
      if (fabs(diff) < AV_NOSYNC_THRESHOLD) {
        if (diff <= -sync_threshold) {
          delay = 0; // video is slow, do not delay
          // very consistent in this run: every frame takes this branch, i.e. the video is behind
        } else if (diff >= sync_threshold) {
          delay = 2 * delay; // video is fast, delay double
        }
      }
      is->frame_timer += delay; // delay is 0 for all 12 frames, so this value never grows
      /* compute the REAL delay */
      actual_delay = is->frame_timer - (av_gettime() / 1000000.0);
      if (actual_delay < 0.010) { // the actual values are listed below; all are far below this threshold
        /* Really it should skip the picture instead */
        actual_delay = 0.010;
      }
      logtime = actual_delay * 1000 + 0.5;
      schedule_refresh(is, (int) (actual_delay * 1000 + 0.5)); // every value computed here is 10.5 (10 after the int cast)
      // schedule_refresh(is, 80);
      /* show the picture! */
      video_display(is);
      // the rest of the code is omitted
    }
  } else {
    schedule_refresh(is, 100);
  }
}
~~~

### Delay analysis table for the 12 frames

A first look at the delay results for all 12 frames shows that every frame is behind the audio clock, so no delay is applied. **Visually, however, the video still plays too slowly**, which suggests that the audio also has to be synchronized to the video; that is left for the next section.

<table border="1" cellspacing="0" cellpadding="7" width="569"><colgroup><col width="553"/></colgroup><tbody><tr><td valign="top" width="553">// 10: when the difference between video and audio exceeds this time, do not sync with audio
// if (diff <= -sync_threshold): sync_threshold is 0.04 every time it appears
diff=-2.0393626991565137, //video is slow, do not delay
diff=-5.8978444236176193, //video is slow, do not delay
diff=-9.8762886597938149, //video is slow, do not delay
diff=-8.9662605435801321, //video is slow, do not delay
diff=-7.9662605435801321, //video is slow, do not delay
diff=-6.9662605435801321, //video is slow, do not delay
diff=-5.9662605435801321, //video is slow, do not delay
diff=-4.9662605435801321, //video is slow, do not delay
diff=-3.9662605435801321, //video is slow, do not delay
diff=-2.9662605435801321, //video is slow, do not delay
diff=-1.9662605435801321, //video is slow, do not delay
diff=-0.96626054358013214, //video is slow, do not delay</td></tr></tbody></table>

The values involved in computing the actual delay are listed in the following table:

<table border="1" cellspacing="0" cellpadding="7" width="569"><colgroup><col width="553"/></colgroup><tbody><tr><td valign="top"
width="553">actual_delay= -3.2161841392517090 ,frame_timer= 1401626707.2479169
actual_delay= -8.3304760456085205 ,frame_timer= 1401626707.2479169
actual_delay= -13.639780044555664 ,frame_timer= 1401626707.2479169
actual_delay= -14.218813180923462 ,frame_timer= 1401626707.2479169
actual_delay= -14.572834014892578 ,frame_timer= 1401626707.2479169
actual_delay= -14.932854175567627 ,frame_timer= 1401626707.2479169
actual_delay= -15.305875062942505 ,frame_timer= 1401626707.2479169
actual_delay= -15.662896156311035 ,frame_timer= 1401626707.2479169
actual_delay= -16.020915985107422 ,frame_timer= 1401626707.2479169
actual_delay= -16.381937026977539 ,frame_timer= 1401626707.2479169
actual_delay= -16.737957000732422 ,frame_timer= 1401626707.2479169
actual_delay= -16.947968959808350 ,frame_timer= 1401626707.2479169 </td></tr></tbody></table>

## Other notes

When playback ends, the refresh function is not called any extra times, unlike in the previous version. The results also show some nondeterminism from multithreading, as expected. The analysis of where one round of the log begins is unfinished and will be added later. My understanding is still shallow, for example of the I, B, B, P frame order.

## Summary

The logic is complex, the test case is an idealized one, and I am still getting started. Do not forget to read the English comments. Even with synchronization in place, the video still lags behind the audio. When reviewing, focus on video_thread and the playback timer.

### Appendix: simplified full log (without the correction entries)

Function: video_thread(void *), [pts 2]=0
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=0.00000000000000000
Function: video_refresh_timer(void *), delay from pts=0.00000000000000000
Function: video_refresh_timer(void *), diff=-2.0393626991565137, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=1
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=1.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-5.8978444236176193, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=2
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=2.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-9.8762886597938149, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=3
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=3.0000000000000000
Function: video_thread(void *), [pts 2]=4
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=4.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-8.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=5
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=5.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-7.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=6
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=6.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-6.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=7
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=7.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-5.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=8
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=8.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-4.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=9
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=9.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-3.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=10
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=10.000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-2.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=11
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=11.000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-1.9662605435801321, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-0.96626054358013214, //video is slow, do not delay
Function: video_refresh_timer(void *), [core time]=10.500000000000000

### The log finally used

Function: our_get_buffer(AVCodecContext *, AVFrame *), pts initialized in the custom alloc, pts=0
Function: video_thread(void *), [pts 2]=0
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we aren't given a pts, set it to the clock */, 0.00000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=0.00000000000000000
Function: video_refresh_timer(void *), delay from pts=0.00000000000000000
Function: video_refresh_timer(void *), diff=-2.0393626991565137, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -3.2161841392517090 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=1
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 1.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=1.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-5.8978444236176193, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -8.3304760456085205 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=2
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 2.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=2.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-9.8762886597938149, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -13.639780044555664 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=3
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 3.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=3.0000000000000000
Function: video_thread(void *), [pts 2]=4
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 4.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=4.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-8.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -14.218813180923462 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=5
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 5.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=5.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-7.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -14.572834014892578 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=6
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 6.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=6.0000000000000000
Function: video_refresh_timer(void *), diff=-6.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -14.932854175567627 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=7
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 7.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=7.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-5.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -15.305875062942505 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=8
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 8.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=8.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-4.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -15.662896156311035 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=9
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 9.0000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=9.0000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-3.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -16.020915985107422 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=10
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 10.000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=10.000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-2.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -16.381937026977539 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_thread(void *), [pts 2]=11
Function: synchronize_video(VideoState *, AVFrame *, double), /* if we have pts, set video clock to it */ 11.000000000000000
Function: video_thread(void *), [pts after sync?, accounting for repeated frames etc.]=11.000000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-1.9662605435801321, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -16.737957000732422 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
Function: video_refresh_timer(void *), delay from pts=1.0000000000000000
Function: video_refresh_timer(void *), diff=-0.96626054358013214, //video is slow, do not delay, 0.040000000000000001 must>=10ms
Function: video_refresh_timer(void *), /* computer the REAL delay */ actual_delay= -16.947968959808350 ,frame_timer= 1401626707.2479169
Function: video_refresh_timer(void *), [core time]=10.500000000000000
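
The delay logic traced in the logs above can be isolated as a pure function and replayed by hand. The following is a minimal sketch, not the tutorial's code: `SyncState` and `next_refresh_delay` are hypothetical names, the audio clock and the wall-clock time are passed in as plain numbers instead of coming from `get_audio_clock` and `av_gettime`, and the two threshold values are assumptions. It also assumes that `frame_last_delay` starts at 0.04, which is what the logged `sync_threshold` of 0.04 suggests: with one frame per second the raw pts difference is 1.0, which trips the `delay >= 1.0` guard and falls back to the previous delay.

```c
/* Assumed threshold values; the post only shows that the no-sync limit is 10. */
#define SYNC_THRESHOLD   0.01
#define NOSYNC_THRESHOLD 10.0

/* Hypothetical container for the three fields the timing code touches. */
typedef struct {
    double frame_last_pts;   /* pts of the previously shown frame */
    double frame_last_delay; /* delay used for the previous frame */
    double frame_timer;      /* running target time for the next frame */
} SyncState;

/* Returns the number of seconds to wait before the next refresh.
 * audio_clock and now stand in for get_audio_clock() and av_gettime(). */
double next_refresh_delay(SyncState *s, double pts, double audio_clock, double now)
{
    double delay = pts - s->frame_last_pts;
    if (delay <= 0 || delay >= 1.0) {
        delay = s->frame_last_delay;   /* implausible value: reuse the last one */
    }
    s->frame_last_delay = delay;
    s->frame_last_pts = pts;

    double diff = pts - audio_clock;
    double abs_diff = diff < 0 ? -diff : diff;
    double sync_threshold = delay > SYNC_THRESHOLD ? delay : SYNC_THRESHOLD;
    if (abs_diff < NOSYNC_THRESHOLD) {
        if (diff <= -sync_threshold) {
            delay = 0;                 /* video is behind: show immediately */
        } else if (diff >= sync_threshold) {
            delay = 2 * delay;         /* video is ahead: wait twice as long */
        }
    }

    s->frame_timer += delay;
    double actual_delay = s->frame_timer - now;
    if (actual_delay < 0.010) {
        actual_delay = 0.010;          /* floor of 10 ms */
    }
    return actual_delay;
}
```

Fed with numbers like those of the second logged frame (pts 1.0, diff of about -5.9, wall clock already past `frame_timer`), the function leaves `frame_timer` unchanged and returns the 10 ms floor. That is consistent with the constant `core time` of 10.5 and the steadily more negative `actual_delay` in the log.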

(Unfinished) ffmpeg tutorial 4 (playing video): a walkthrough

Last updated: 2022-04-01 15:57:44

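### Basics: the display process

1) Set up the display area (the video mode).
2) Create the YUV overlay.
3) Display the image: convert from RGB to YUV; in practice the code uses YV12 to fill YUV420.
4) Draw the image: the rectangle parameters for position, height, width and scaling.

One way to picture it: a table, a white tablecloth, the original pattern, and the changing pattern. [To be added: the SDL display process, on CSDN.]

### Summary: playing sound

1) The audio callback function, especially how its parameters are set up (get decode...).
2) Events, which tie into the producer/consumer problem. **Other handling: for example, to avoid spinning in a loop, the code sleeps so that the system can schedule other work.**
3) One data structure: audioq, a queue.
4) The functions that actually play the sound.

### Summary: playing images

1) The call structure of the program:

video_thread ----call---> possibly queue_picture (allocates space, etc.) ... sends an event ... on receiving the event, alloc_picture runs ---> (event handling once the space has been allocated) [to be added]

main timer ---> schedule_refresh ----> sends an event ... on receiving the event, video_refresh_timer runs ---> the display function video_display

A proper call-graph diagram is still to be added.

2) Heavy use of the event mechanism:

> a) The refresh event. Look closely and you will see that schedule_refresh triggers the timer event, which in turn calls schedule_refresh again, forming a loop; from this I infer that each timer normally fires only once.
> b) The quit event.
> c) Handling the producer/consumer problem. Look closely and **you will find two places where video_q is locked and two where it is unlocked**; the specific scenarios are analyzed later.

3) Image processing here is "loose in form but unified in spirit". Put plainly, unlike tutorial 2, requesting the buffer, requesting the YUV overlay, copying the buffer into the YUV overlay and so on are each split out into their own functions.
4) There is one global data structure, VideoState, the so-called big struct, which holds most of the information.
5) Building on the audio queue, a video queue is introduced as well.

### Scenario analysis: a brief interaction sequence

Function: queue_picture(VideoState *, AVFrame *), Thread: 0x3197C video_thread, [wait] while (!vp->allocated && !is->quit)
Function: alloc_picture(void *), Thread: 0x31844 main thread, [signal], allocation ok
Function: queue_picture(VideoState *, AVFrame *), Thread: 0x3197C video_thread, check: written until full?
Function: video_display(VideoState *), Thread: 0x31844 main thread, play core, SDL_DisplayYUVOverlay
Function: video_refresh_timer(void *), Thread: 0x31844 main thread, size=1, check whether the queue is consumed down to empty
Function: video_refresh_timer(void *), Thread: 0x31844 main thread, [signal] queue count decremented in timer
Function: queue_picture(VideoState *, AVFrame *), Thread: 0x3197C video_thread, [wait] while (!vp->allocated && !is->quit)
Function: alloc_picture(void *), Thread: 0x31844 main thread, [signal], allocation ok
Function: queue_picture(VideoState *, AVFrame *), Thread: 0x3197C video_thread, check: written until full?
**[Thread scheduling: the picture is again found not allocated, as the next record shows]**
//Function: queue_picture(VideoState *, AVFrame *), Thread: 0x3197C video_thread, [wait] while (!vp->allocated && !is->quit)
Function: video_display(VideoState *), Thread: 0x31844 main thread, play core, SDL_DisplayYUVOverlay
Function: video_refresh_timer(void *), Thread: 0x31844 main thread, size=1, check whether the queue is consumed down to empty
Function: video_refresh_timer(void *), Thread: 0x31844 main thread, [signal] queue count decremented in timer

In the picture below I have marked the process for each frame. Look closely and you will even notice a small memory-allocation episode when the 4th frame appears.

![](https://docs.gechiui.com/gc-content/uploads/sites/kancloud/2016-02-22_56cae4b7e4165.jpg)

### Scenario analysis 2: a complete interaction sequence

Straight to the figure.

![](https://docs.gechiui.com/gc-content/uploads/sites/kancloud/2016-02-22_56cae4b80e031.jpg)

The analysis is the same as above. The main difference is that the timer has not fired yet, so the code goes on to the next wait. The scenario switches are really just among video_thread, allocation on the main thread, and playback from the timer.

P.S. If the wait condition is already satisfied, the next step still executes on the same thread; a real wait can be recognized by the next step executing on a different thread. In short, [wait] in the log means that execution reached this piece of code, not that it necessarily waited.

**The number of pictures in the queue never exceeded 1. This is what the tutorial author means by "use it as soon as it is there, fetch one when there is none". Recall that the synchronization primitive used here is a mutex.**

Also: the program contains two waits, but the one below was never executed; this can be analyzed further later.

~~~
while (is->pictq_size >= VIDEO_PICTURE_QUEUE_SIZE && !is->quit) {
  SDL_CondWait(is->pictq_cond, is->pictq_mutex);
}
~~~

The only wait that is executed regularly is the following:

~~~
/* wait until we have a picture allocated */
SDL_LockMutex(is->pictq_mutex);
while (!vp->allocated && !is->quit) {
  SDL_CondWait(is->pictq_cond, is->pictq_mutex);
}
SDL_UnlockMutex(is->pictq_mutex);
~~~

### Handling of the last frame

That is, how the code behaves once all 12 frames have been played. The log is below. Analysis shows that this is the last frame of the video; the current video picture has already been allocated, so no allocation takes place.

Function: queue_picture(VideoState *, AVFrame *), Thread: 0x3AD14 video_thread, queue grows, size=1
Function: video_display(VideoState *), Thread: 0x3ABEC main thread, play core, SDL_DisplayYUVOverlay
Function: video_refresh_timer(void *), Thread: 0x3ABEC main thread, size=1, check whether the queue is consumed down to empty
Function: video_refresh_timer(void *), Thread: 0x3ABEC main thread, [signal] queue count decremented in timer
Function: queue_picture(VideoState *, AVFrame *), Thread: 0x3AD14 video_thread, queue grows, size=1
Function: video_display(VideoState *), Thread: 0x3ABEC main thread, play core, SDL_DisplayYUVOverlay
Function: video_refresh_timer(void *), Thread: 0x3ABEC main thread, size=1, check whether the queue is consumed down to empty
Function: video_refresh_timer(void *), Thread: 0x3ABEC main thread, [signal] queue count decremented in timer

### Other questions

A colleague suggested that some lock might never be released; this is left for later analysis. Specifically, the lock is taken where vp has not been allocated, and it is released when allocation completes or when playback completes; how completing playback releases the lock still has to be analyzed. Approach: log at the locking site and check whether the log entry from the locking code and the following critical-section log entry appear back to back. If they do, queue_picture never had to wait on the mutex (a mutex is either 0 or 1).

The video refresh interval might disturb the playback order; this is unverified. Approach: change the playback refresh interval.

### Summary

This was my first time using logging to debug asynchronous code. Placing breakpoints around critical sections is a skill in itself, and reading variables and polling locks call for different approaches. The complete sequence for one video stream (ignoring differences in detail) is shown below; in short, find the beginning and the end.

Function: queue_picture(VideoState *, AVFrame *), Thread: 0x48CB0 video_thread, [wait] while (!vp->allocated && !is->quit)
... (omitted)
Function: video_refresh_timer(void *), Thread: 0x48AEC main thread, [signal] queue count decremented in timer

Attachment: the complete debug log (to be added on the CSDN network drive).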
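
The observation that the queue count never exceeded 1 is easier to reason about with the queue bookkeeping pulled out on its own. Below is a single-threaded sketch of a ring-buffer picture queue. `PictQueue`, `pictq_push` and `pictq_pop` are hypothetical names, the capacity of 1 is an assumption chosen to match the observation, and the mutex and condition variable used by the real code are deliberately left out. A return value of 0 marks the point where the real producer or consumer would have to wait.

```c
#include <assert.h>

#define PICTQ_CAPACITY 1  /* assumed capacity, chosen to match the observed maximum */

/* Hypothetical, lock-free model of the picture queue bookkeeping. */
typedef struct {
    double pts[PICTQ_CAPACITY]; /* stands in for the queued pictures */
    int size;                   /* number of pictures currently queued */
    int rindex;                 /* next slot to read */
    int windex;                 /* next slot to write */
} PictQueue;

/* Producer side. Returns 0 when the queue is full, i.e. where the
 * real code would block on the condition variable. */
int pictq_push(PictQueue *q, double pts)
{
    if (q->size >= PICTQ_CAPACITY) {
        return 0;
    }
    q->pts[q->windex] = pts;
    if (++q->windex == PICTQ_CAPACITY) {
        q->windex = 0;          /* wrap around */
    }
    q->size++;
    assert(q->size <= PICTQ_CAPACITY);
    return 1;
}

/* Consumer side. Returns 0 when there is nothing to show yet. */
int pictq_pop(PictQueue *q, double *pts)
{
    if (q->size == 0) {
        return 0;
    }
    *pts = q->pts[q->rindex];
    if (++q->rindex == PICTQ_CAPACITY) {
        q->rindex = 0;          /* wrap around */
    }
    q->size--;
    return 1;
}
```

With a capacity of 1, a second push before any pop is refused, so the count cannot exceed 1 no matter how the two threads interleave. Whether the tutorial's `VIDEO_PICTURE_QUEUE_SIZE` really is 1 should be checked against the source being tested.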

ffmpeg tutorial 4: first impressions

Last updated: 2022-04-01 15:57:42

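Goal: list the changes (v3 vs v4) in outline form.

Overview:

1. The biggest improvement is the introduction of one big structure, VideoState.
1. A message-like mechanism is introduced: SDL_PushEvent.
1. Synchronization variables are used heavily in this tutorial.
1. Video queue handling: modeled on the audio queue of the previous tutorial, a video queue is added in the corresponding places.
1. **A new video playback thread, video_thread, is created; it looks simple but is not.**
1. A new decoding thread, decode_thread, is created; it takes over the video playback work that used to live in main.
1. Opening the codec is now handled by a single function.
1. The packet feedback flow (reading packets).

### Note the use of the event mechanism

Straight to the code; this shows how the timer is used.

~~~
static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque) {
  SDL_Event event;
  event.type = FF_REFRESH_EVENT;
  event.user.data1 = opaque;
  SDL_PushEvent(&event); // push onto what is effectively a message queue
  return 0; /* 0 means stop timer */
}

/* schedule a video refresh in 'delay' ms */
static void schedule_refresh(VideoState *is, int delay) {
  SDL_AddTimer(delay, sdl_refresh_timer_cb, is); // add the timer
}
~~~

### Details (audio API, VideoState)

Some of the detail changes:

1) The audio API has changed: avcodec_decode_audio3 replaces the earlier version.

![](https://docs.gechiui.com/gc-content/uploads/sites/kancloud/2016-02-22_56cae4af438bb.jpg)

2) The API now uses the VideoState "big struct" widely in place of the earlier structures, simply so that everything is managed in one place. The example below uses audio_callback.

![](https://docs.gechiui.com/gc-content/uploads/sites/kancloud/2016-02-22_56cae4b7d1f77.jpg)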
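
The one-shot timer pattern in `sdl_refresh_timer_cb` and `schedule_refresh` can be modeled without SDL to see how the refresh keeps itself going. This is a sketch under assumptions: `Player`, `OneShotTimer`, `run_player` and the `model_*` functions are hypothetical, time is a virtual millisecond counter, and a flag stands in for the queued `FF_REFRESH_EVENT`. Only the control flow is meant to match: the callback posts an event and returns 0 so that the timer does not repeat, and the event handler arms the next timer.

```c
/* Hypothetical stand-in for a single pending SDL timer. */
typedef struct {
    int armed;       /* 1 while a one-shot timer is pending */
    int fire_at_ms;  /* virtual time at which it fires */
} OneShotTimer;

/* Hypothetical stand-in for the player state and its event queue. */
typedef struct {
    int now_ms;           /* virtual clock */
    OneShotTimer timer;
    int refresh_pending;  /* models one queued refresh event */
    int frames_shown;
} Player;

/* Models schedule_refresh: arm a timer 'delay_ms' from now. */
void model_schedule_refresh(Player *p, int delay_ms)
{
    p->timer.armed = 1;
    p->timer.fire_at_ms = p->now_ms + delay_ms;
}

/* Models the timer callback: post the event, return 0 to stop the timer. */
int model_timer_cb(Player *p)
{
    p->refresh_pending = 1;
    return 0;
}

/* Models the refresh event handler: show a frame, then re-arm. */
void model_refresh_handler(Player *p, int next_delay_ms)
{
    p->frames_shown++;
    model_schedule_refresh(p, next_delay_ms);
}

/* Runs the virtual clock until 'frames' refreshes have happened.
 * Returns the virtual time at which the loop stopped. */
int run_player(Player *p, int frames, int frame_delay_ms)
{
    model_schedule_refresh(p, frame_delay_ms);  /* the first timer is armed once */
    while (p->frames_shown < frames) {
        if (p->timer.armed && p->now_ms >= p->timer.fire_at_ms) {
            /* a return value of 0 means the timer is not repeated */
            p->timer.armed = (model_timer_cb(p) != 0);
        }
        if (p->refresh_pending) {
            p->refresh_pending = 0;
            model_refresh_handler(p, frame_delay_ms);
        }
        p->now_ms++;
    }
    return p->now_ms;
}
```

Each timer fires exactly once, and the chain continues only because the handler schedules the next one. If the handler stopped calling the scheduling function, refreshing would simply stop.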

ffmpeg tutorial 3: analysis of the results

Last updated: 2022-04-01 15:57:39

Goals: examine the relationship between the audio callback function and the audio decoding function; examine the video frame call sequence; list what is unfinished.

Test case:

Test platform:

Test code:

First of all, why does any sound come out at all? I believe it is due to the behavior of the function SDL_PauseAudio.

### The relationship between the audio callback and decoding

Let the data speak. Breakpoints are set in two places: the first in the audio_callback function, the second in the audio_decode_frame decoding function. Here is the actual code:

~~~
void audio_callback(void *userdata, Uint8 *stream, int len) {
  AVCodecContext *aCodecCtx = (AVCodecContext *)userdata;
  int len1, audio_size;

  static uint8_t audio_buf[(MAX_AUDIO_FRAME_SIZE * 3) / 2];
  static unsigned int audio_buf_size = 0;
  static unsigned int audio_buf_index = 0;

  while(len > 0) {
    if(audio_buf_index >= audio_buf_size) {
      /* We have already sent all our data; get more */
      audio_size = audio_decode_frame(aCodecCtx, audio_buf, audio_buf_size); // pay special attention to this call
      if(audio_size < 0) {
        /* If error, output silence */
        audio_buf_size = 1024; // arbitrary?
        memset(audio_buf, 0, audio_buf_size);
      } else {
        audio_buf_size = audio_size;
      }
      audio_buf_index = 0;
    }
    len1 = audio_buf_size - audio_buf_index;
    if(len1 > len)
      len1 = len;
    memcpy(stream, (uint8_t *)audio_buf + audio_buf_index, len1);
    len -= len1;
    stream += len1;
    audio_buf_index += len1;
  }
}
~~~

The result is below. Each audio_callback is followed by at least 4 calls to audio_decode_frame before the system invokes the callback again, and this repeats periodically. Why 4? My guess is that either the audio buffer handed to the callback has been filled, or audio_callback is invoked on a timer.

~~~
Function: audio_callback(void *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_callback(void *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_callback(void *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_callback(void *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_callback(void *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
Function: audio_decode_frame(AVCodecContext *, unsigned char *, int), Thread: 0x81B8 Win32 thread
~~~

### The video frame call sequence

I am posting this here so that I can analyze it later and then give the result. In theory, 12 points in time should correspond to 12 different frames, but the result says otherwise. See the output below (processed with Notepad++ to count occurrences).

~~~
Search "video" (11 hits in 1 file)
  new 2 (11 hits)
    Line 120: read one video frame
    Line 286: read one video frame
    Line 291: read one video frame
    Line 292: read one video frame
    Line 293: read one video frame
    Line 294: read one video frame
    Line 295: read one video frame
    Line 296: read one video frame
    Line 297: read one video frame
    Line 298: read one video frame
    Line 299: read one video frame
Search "audio" (288 hits in 1 file)
~~~

### Unfinished

Other: check whether the first call to audio_callback runs into an empty return.

Other: print the size of the buffer produced by each decode, paying particular attention to the stretch in the middle where decoding is called five times in a row.

Other: see whether the dumped file information can be used to analyze the process quantitatively.
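
The "buffer filled" guess can be explored with a small model of the callback's copy loop. The sketch below mirrors only the index arithmetic of `audio_callback`; `AudioBufState` and `simulate_audio_callback` are hypothetical names, no audio is decoded or copied, and the byte counts used with it are made-up values, not measurements from this test. It returns how many decode calls one callback needs in order to deliver `len` bytes when every decode yields `decoded_chunk` bytes.

```c
/* Hypothetical stand-in for the static buffer bookkeeping in audio_callback. */
typedef struct {
    unsigned int buf_size;   /* bytes produced by the last decode */
    unsigned int buf_index;  /* bytes of that buffer already consumed */
} AudioBufState;

/* Models one callback invocation that must deliver 'len' bytes.
 * Returns the number of decode calls made, or -1 for a zero chunk size. */
int simulate_audio_callback(AudioBufState *st, int len, unsigned int decoded_chunk)
{
    int decode_calls = 0;
    if (decoded_chunk == 0) {
        return -1;                      /* would never make progress */
    }
    while (len > 0) {
        if (st->buf_index >= st->buf_size) {
            /* buffer exhausted: this is where audio_decode_frame would run */
            st->buf_size = decoded_chunk;
            st->buf_index = 0;
            decode_calls++;
        }
        int len1 = (int)(st->buf_size - st->buf_index);
        if (len1 > len) {
            len1 = len;
        }
        len -= len1;                    /* bytes handed to the audio device */
        st->buf_index += (unsigned int)len1;
    }
    return decode_calls;
}
```

With a made-up request of 4096 bytes and 1024 bytes per decode, every callback needs exactly 4 decodes. With 1000 bytes per decode the first callback needs 5 and the next one 4, because leftover bytes carry over between callbacks. A pattern of mostly 4 with an occasional 5 is therefore what the buffer-filling explanation would predict, but confirming it requires printing the real `len` and decoded sizes.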

ffmpeg Tutorial 2: first impressions

Last updated: 2022-04-01 15:57:37

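## Tutorial 2: first impressions

Of all the tutorials, this is the one where the picture runs **fastest**. The reason is clear once you have read the later code: here a frame is shown as soon as its buffer is ready; no timer is used to pace the frames, let alone to synchronize with the audio.

### Test case A: the 野生动物.wmv (Wildlife) sample that ships with Windows

Result and brief analysis: failed, because the Chinese file name is not supported.

### Test case B: the wildlife sample that ships with Windows

Result and brief analysis: plays successfully. A breakpoint was set on frameFinished; the results are listed below:

~~~
Function: SDL_main(int, char * *), Thread: 0x841C main thread, frameFinished=-858993460
Function: SDL_main(int, char * *), Thread: 0x841C main thread, frameFinished=0
Function: SDL_main(int, char * *), Thread: 0x841C main thread, frameFinished=280
Function: SDL_main(int, char * *), Thread: 0x841C main thread, frameFinished=280
Function: SDL_main(int, char * *), Thread: 0x841C main thread, frameFinished=280
Function: SDL_main(int, char * *), Thread: 0x841C main thread, frameFinished=280
(the remaining entries, more than a thousand, are omitted)
~~~

### Test case C: the ffmpeg sample CLOCKTXT_320.avi

Test results and analysis:

1. Understanding the formats: YV12 on the overlay versus the YUV420 that is finally displayed.
1. First complete picture of the SDL display process: set up the surface, set up the overlay, copy into the overlay, display (SDL_UpdateRect).
1. First use of the dump function; its output is as follows:

~~~
[avi @ 003a3000] max_analyze_duration 5000000 reached at 5000000
Input #0, avi, from 'CLOCKTXT_320.avi':
  Duration: 00:00:12.00, start: 0.000000, bitrate: 42 kb/s
    Stream #0.0: Video: msrle, pal8, 320x320, 1 fps, 1 tbr, 1 tbn, 1 tbc
    Stream #0.1: Audio: truespeech, 8000 Hz, 1 channels, s16, 8 kb/s
~~~

Regarding the above: on the first packet read, my intuition is that header information is being read. Because the sample is a clock, its frames are simply the images for 12 points in time, and stepping through the changes frame by frame helps a great deal in understanding how the picture is produced. At present only video frames are handled; audio handling, if added, would go at the place shown below:

~~~
while (av_read_frame(pFormatCtx, &packet) >= 0) { // sometimes audio, sometimes video
  if (packet.stream_index == videoStream) {
~~~

### Summary

**The test cases are now standardized.** **For getting started, a simple test case is recommended.**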
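
The first value logged in test case B, `frameFinished=-858993460`, is worth a second look. It is the bit pattern 0xCCCCCCCC read as a signed 32-bit integer, which is the fill value that Microsoft's compiler writes into uninitialized stack variables in debug builds. Assuming the test was built that way (the post does not say), the first breakpoint hit most likely shows the variable before the decoder has written to it. The short sketch below only checks the arithmetic; `reinterpret_as_int32` is a hypothetical helper and not part of the tutorial.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 32-bit pattern as a signed integer.
 * memcpy avoids relying on implementation-defined integer conversion. */
int32_t reinterpret_as_int32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Calling `reinterpret_as_int32(0xCCCCCCCCu)` yields -858993460 on the usual two's-complement platforms. Declaring the flag as `int frameFinished = 0;` would make the first log entry read 0 instead.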

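Preface

Last updated: 2022-04-01 15:57:35

> Original source: [FFmpeg入门实践与分析](http://blog.csdn.net/column/details/ffmpegtutorial.html) (FFmpeg: Getting Started in Practice, with Analysis)
> Author: [titer1](http://blog.csdn.net/titer1)
> **This series is compiled and published on 看云 (Kancloud) with the author's permission. Do not repost without the author's permission.**

# FFmpeg入门实践与分析 (FFmpeg: Getting Started in Practice, with Analysis)

> Discover the fun of ffmpeg by getting your hands dirty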