<?xml version="1.0"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>GDCL News</title>
    <link>http://www.gdcl.co.uk/</link>
    <atom:link href="http://www.gdcl.co.uk/rss.xml" rel="self" type="application/rss+xml" />
    <description>Provides information on updated content at GDCL for DirectShow programming</description>
    <language>en-gb</language>
    <pubDate>Tue, 10 Mar 2015 14:27:32 +0000</pubDate>
    <lastBuildDate>Tue, 10 Mar 2015 14:27:32 +0000</lastBuildDate>

    
    <item>
      <title>Frame Re-ordering Support in iOS Video Encoding</title>
      <link>http://www.gdcl.co.uk/2014/04/22/Frame-Reordering.html</link>
      <pubDate>Tue, 22 Apr 2014 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2014/04/22/Frame-Reordering.html</guid>
      <description>
&lt;h1 id=&quot;frame-re-ordering-support-in-ios-video-encoding&quot;&gt;Frame Re-ordering Support in iOS Video Encoding&lt;/h1&gt;

&lt;p&gt;Last year, I &lt;a href=&quot;/2013/02/20/iOS-Video-Encoding.html&quot;&gt;published a sample&lt;/a&gt; that showed how to use hardware accelerated video encoding on the iPhone by extracting the encoded data from the file during encoding. I’ve now updated the sample &lt;a href=&quot;/iOS/encoderdemo.zip&quot;&gt;here&lt;/a&gt; to support frame re-ordering.&lt;/p&gt;

&lt;p&gt;Frame re-ordering can improve compression ratios by allowing bi-directional prediction, and was enabled by Apple in iOS 7 for some devices. However, it means that the timestamps (received from the AVCaptureSession) are not in the same order as the output frames. I’ve previously published a fix for my demo that simply disabled frame re-ordering to ensure that the demo continued to work. I’ve now had the time to fix the problem properly.&lt;/p&gt;

&lt;p&gt;The frames in the file are in decoding order. Each frame has a &lt;em&gt;picture order count&lt;/em&gt; (POC) which indicates the presentation order — this is the original capture order and thus corresponds to the timestamp order. There are several ways to encode the POC in a slice header; my sample now correctly decodes type 0, which is the type used by the Apple encoder. The iPhone encoder creates POC values that increase by 2 for each frame. Rather than hardwire this knowledge into the code, I use the order of the POC values, not their absolute values. This means delaying output by up to 2 frames, increasing the latency by around 10%. &lt;/p&gt;
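
&lt;p&gt;As an illustration of the approach (a sketch with my own names, not the shipping code, and with POC wrap-around handling omitted), the reordering reduces to a small structure that holds decode-order frames keyed by POC and pairs each emitted frame with the next capture-order timestamp:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;cstdio&amp;gt;
#include &amp;lt;deque&amp;gt;
#include &amp;lt;map&amp;gt;

struct Frame { int poc; /* encoded NALUs omitted */ };

class Reorderer {
    std::map&amp;lt;int, Frame&amp;gt; m_pending;  // decode-order arrivals, keyed by POC
    std::deque&amp;lt;double&amp;gt; m_pts;        // capture-order timestamps (FIFO)
public:
    void onTimestamp(double pts) { m_pts.push_back(pts); }
    void onFrame(const Frame&amp;amp; f) {
        m_pending[f.poc] = f;
        while (m_pending.size() &amp;gt; 2)   // never more than a 2-frame delay
            emitSmallest();
    }
    void flush() { while (!m_pending.empty()) emitSmallest(); }
private:
    void emitSmallest() {
        // the smallest pending POC is next in presentation order, so it
        // is paired with the next capture timestamp
        std::map&amp;lt;int, Frame&amp;gt;::iterator it = m_pending.begin();
        std::printf(&quot;frame poc=%d pts=%.3f\n&quot;, it-&amp;gt;first, m_pts.front());
        m_pts.pop_front();
        m_pending.erase(it);
    }
};
&lt;/code&gt;&lt;/pre&gt;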

&lt;p&gt;The updated sample is available &lt;a href=&quot;/iOS/encoderdemo.zip&quot;&gt;here&lt;/a&gt;. Please get in touch if you have any comments or queries.&lt;/p&gt;

&lt;p&gt;22nd April 2014&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Video Encoding on iOS 7</title>
      <link>http://www.gdcl.co.uk/2014/01/21/Video-Encoding-iOS7.html</link>
      <pubDate>Tue, 21 Jan 2014 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2014/01/21/Video-Encoding-iOS7.html</guid>
      <description>
&lt;h1 id=&quot;video-encoding-on-ios-7&quot;&gt;Video Encoding on iOS 7&lt;/h1&gt;

&lt;p&gt;Last year, I &lt;a href=&quot;/2013/02/20/iOS-Video-Encoding.html&quot;&gt;published a sample&lt;/a&gt; that showed how to use hardware accelerated video encoding on the iPhone for uses such as network streaming. Changes made to video encoding in iOS 7, and on the iPhone 5s, broke this mechanism, and I’ve today updated the source &lt;a href=&quot;/iOS/encoderdemo.zip&quot;&gt;here&lt;/a&gt; to fix this.&lt;/p&gt;

&lt;p&gt;One issue was pretty trivial: my code to divide up NALUs into frames was not quite correct. Apple added an SEI NALU to the stream, and this broke my scheme. I’ve fixed the scheme so this works correctly now.&lt;/p&gt;
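
&lt;p&gt;For the curious, here is a rough sketch of a frame-separation test of this kind (one common heuristic, with my names; not necessarily the exact logic in the sample): non-VCL NALUs such as SEI, SPS and PPS attach to the frame that follows them, and a VCL slice starts a new frame when its first_mb_in_slice field is zero.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

// true if this NALU (without its start code or length prefix) begins
// a new frame
bool startsNewFrame(const unsigned char* nalu, size_t len) {
    if (len &amp;lt; 2) return false;
    int type = nalu[0] &amp;amp; 0x1F;              // nal_unit_type
    if (type &amp;lt; 1 || type &amp;gt; 5) return false; // not a VCL slice (SEI is 6)
    // first_mb_in_slice is the first field of the slice header, coded as
    // ue(v); the single bit '1' encodes zero, so the top bit of the first
    // payload byte is set for the first slice of a frame
    return (nalu[1] &amp;amp; 0x80) != 0;
}
&lt;/code&gt;&lt;/pre&gt;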

&lt;p&gt;The second problem is somewhat more difficult. On the iPhone 5s, the encoder now supports frame re-ordering: every other frame uses bi-directional prediction, and so needs to be decoded before the other frame of the pair. So the (decoding) order of the frames in the file is not the same as the timestamps that came from the incoming frames.&lt;/p&gt;

&lt;p&gt;There is a way to resolve this, by decoding the ‘picture-order-count’ values from the H264 elementary stream. This is tediously difficult to do, but I have added code to do this for the subset of POC types that Apple uses. However, it appears that there is a bug.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Update 22nd April 2014&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I’ve looked at this again, and found the bug (which was in the frame separation logic in fact). With this fix, frame re-ordering now works correctly, using the picture order count to select the correct timestamp. This version is now published &lt;a href=&quot;/iOS/encoderdemo.zip&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
    
    <item>
      <title>DirectShow RTSP filter and Port Settings</title>
      <link>http://www.gdcl.co.uk/2013/11/12/RTSP-Ports.html</link>
      <pubDate>Tue, 12 Nov 2013 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/11/12/RTSP-Ports.html</guid>
      <description>
&lt;h1 id=&quot;directshow-rtsp-filter-and-port-settings&quot;&gt;DirectShow RTSP filter and Port Settings&lt;/h1&gt;

&lt;p&gt;The RTSP Jukebox sample that I published &lt;a href=&quot;/2013/05/16/RTSP-Jukebox.html&quot;&gt;here&lt;/a&gt; hardwires the port to the default, 554. A number of people have reported problems with this, particularly as it appears that some Windows Media Player components reserve that port.&lt;/p&gt;

&lt;p&gt;I’ve created an updated version of the filter and test app that allows you to change the port used, and reports better error messages if the port chosen is already in use.&lt;/p&gt;

&lt;p&gt;Download the source and binaries &lt;a href=&quot;/RTSPJukebox.zip&quot;&gt;here&lt;/a&gt;. Feel free to get in touch if you are interested in extending this.&lt;/p&gt;

&lt;p&gt;12 November 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Decline and Competition</title>
      <link>http://www.gdcl.co.uk/2013/09/29/DeclineandCompetition.html</link>
      <pubDate>Sun, 29 Sep 2013 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/09/29/DeclineandCompetition.html</guid>
      <description>&lt;h1 id=&quot;decline-and-competition&quot;&gt;Decline and Competition&lt;/h1&gt;

&lt;p&gt;The retirement of Steve Ballmer has led to a number of observations on Microsoft’s increasing irrelevance in the software world and discussion about the reasons for it. There are two popular explanations for this decline: the ‘stack ranking’ employee review system that divided staff rather than creating successful teams, or simply that ‘Ballmer has no taste’. &lt;/p&gt;

&lt;p&gt;Both of these are true to some extent, but neither really get to the bottom of the issue. There was certainly a lot of competition within Microsoft. It was widely believed that Bill Gates liked to encourage it, and there would often be several alternative solutions to a given problem under development. When I was leading the design for DirectShow, there were nine competing next-generation multimedia architectures. I spent half my time negotiating with these other groups to get them to switch to DirectShow. But while this was frustrating at the time, and seems inefficient, other companies have made this work.&lt;/p&gt;

&lt;p&gt;Microsoft understood the problems that needed solving, and had ideas for many of the things that later became important, but failed to create effective solutions. I used a keyboardless tablet computer in 1991, twenty years before the iPad, but I only used it once. It worked, but it wasn’t really a product that fired the imagination.&lt;/p&gt;

&lt;p&gt;I’ve heard people observe that Microsoft succeeded by creating things that were ‘good enough’. When I worked there,  I was told on more than one occasion by senior managers that we needed to ship quickly and not worry about creating a good product. If it was successful, we would have time to rewrite the software for a second release, from a position of strength. If it failed, the quality of the software would not matter. &lt;/p&gt;

&lt;p&gt;And there, in a single anecdote, is the essence of the thing. Microsoft was a company that competed. Competition was what it did best. There was no prize for creating an elegant solution to a problem, or going the extra mile to ensure that the details were polished. What counted was competing. &lt;/p&gt;

&lt;p&gt;I’m sure Steve Jobs was frustrating in a lot of ways, but when he died, the observation that brought tears to my eyes was this: he showed that you could be successful by creating a product that you were proud of. Commercial success does not require a shoddy compromise. &lt;/p&gt;

</description>
    </item>
    
    <item>
      <title>DirectShow RTSP Server filter and RTSP Jukebox</title>
      <link>http://www.gdcl.co.uk/2013/05/16/RTSP-Jukebox.html</link>
      <pubDate>Thu, 16 May 2013 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/05/16/RTSP-Jukebox.html</guid>
      <description>
&lt;h1 id=&quot;directshow-rtsp-server-filter-and-rtsp-jukebox&quot;&gt;DirectShow RTSP Server filter and RTSP Jukebox&lt;/h1&gt;

&lt;p&gt;A few weeks ago, I was investigating RTSP playback (see my iOS server prototype &lt;a href=&quot;/2013/02/20/iOS-Video-Encoding.html&quot;&gt;here&lt;/a&gt;). At the same time, I wrote an RTSP server as a DirectShow filter, together with an RTSP Jukebox app that loops multiple clips using &lt;a href=&quot;/gmfbridge/&quot;&gt;GMFBridge&lt;/a&gt;. I’ve decided to release it under an &lt;a href=&quot;/license.htm&quot;&gt;attribution licence&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The RTSP server is a DirectShow renderer filter, with an input pin for video (that must be connected) and an optional audio input. It supports only H264 and AAC input. When the graph is running, the input is fed to an RTSP server, which will transmit as a live stream.&lt;/p&gt;

&lt;p&gt;The Jukebox player app is adapted from the GMFPlay demo app. It allows you to create a playlist of clips, and the whole playlist is looped through the RTSP server filter. This uses GMFBridge to manage separate graphs for the source clips, with a single graph for the RTSP server. It’s a little experimental; the main limitation is that no type changing is supported, so all clips that are part of a playlist must be in exactly the same format.  You will need to download and register GMFBridge.dll (from &lt;a href=&quot;/gmfbridge/&quot;&gt;here&lt;/a&gt;) for the app to work.&lt;/p&gt;

&lt;p&gt;Download the source &lt;a href=&quot;/RTSPJukebox.zip&quot;&gt;here&lt;/a&gt;. Feel free to get in touch if you are interested in extending this.&lt;/p&gt;

&lt;p&gt;16th May 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Motion JPEG and CC608 Captions in MP4</title>
      <link>http://www.gdcl.co.uk/2013/05/16/MP4-Update.html</link>
      <pubDate>Thu, 16 May 2013 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/05/16/MP4-Update.html</guid>
      <description>
&lt;h1 id=&quot;motion-jpeg-and-cc608-captions-in-mp4&quot;&gt;Motion JPEG and CC608 Captions in MP4&lt;/h1&gt;

&lt;p&gt;I’ve updated the DirectShow MP4 mux and demux filters. You can download the latest source and binaries &lt;a href=&quot;/mpeg4/mpeg4.zip&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This update includes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Support for Motion JPEG in mux and demux (thanks to Steve Sexton) — see further discussion on this &lt;a href=&quot;/2013/05/02/Motion-JPEG.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Support for CC608 caption data (Line-21 byte pairs). This is &lt;em&gt;not&lt;/em&gt; compatible with QuickTime or other players (the data will be ignored). However, it is supported by both the multiplexor and demultiplexor.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;A fix for incorrect PCM alignment and a fix to the IStream handling (thanks to Maxim Kartavenkov)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;16th May 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Adventures with Motion JPEG</title>
      <link>http://www.gdcl.co.uk/2013/05/02/Motion-JPEG.html</link>
      <pubDate>Thu, 02 May 2013 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/05/02/Motion-JPEG.html</guid>
      <description>
&lt;h1 id=&quot;adventures-with-motion-jpeg&quot;&gt;Adventures with Motion JPEG&lt;/h1&gt;

&lt;p&gt;I recently added support for Motion JPEG streams to my MP4 mux and demux. (&lt;em&gt;Update:&lt;/em&gt; these changes are now available &lt;a href=&quot;/mpeg4/mpeg4.zip&quot;&gt;here&lt;/a&gt;). There were a few interesting little twists in getting this to work correctly, so I thought I would write these down in case they were of use to someone. I realise that the set of people interested in this who don’t already know it is quite small, but since my dog retired, I have to tell someone else.&lt;/p&gt;

&lt;p&gt;I want the files that my MP4 mux creates to be compatible with QuickTime. It’s not enough to create files that can be played back by my demux (although that in itself is useful in some cases, such as CC 608 closed caption data). So I’ve been testing these files with QuickTime, VLC and a handful of different decoders on DirectShow, including the BlackMagic decoder.&lt;/p&gt;

&lt;p&gt;Supporting progressive images was straightforward. I found I could use the FOURCC ‘jpeg’ and just put the elementary stream in the file without any changes, and it worked in all players. The issues arose with interlaced Motion JPEG, in which each frame contains two separate JPEG images and a header is needed to identify the boundary between them. There are two types of header structure used by different companies to mark this boundary, and I needed to include both for it to work correctly.&lt;/p&gt;

&lt;p&gt;First, a quick refresher on JPEG. The data stream consists of a set of tags, each beginning with a 0xFF byte, with the compressed data in the middle. Any 0xFF in the compressed data is escaped, so you can parse it easily, but unfortunately the length of the compressed data is not marked in a header. To parse it, you follow these rules:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;0xFF followed by another 0xFF (or at the end of the data) is a padding byte.&lt;/li&gt;
  &lt;li&gt;0xFF followed by a 0 is an escaped 0xFF that is part of the compressed data.&lt;/li&gt;
  &lt;li&gt;0xFF followed by a byte in the range 0xD0..0xD9 is a two-byte code.&lt;/li&gt;
  &lt;li&gt;0xFF followed by any other code will have a two-byte length field followed by header data. The length field does not include the FF xx, but it does include the length field itself.&lt;/li&gt;
  &lt;li&gt;0xFF DA (Start Of Scan) marks the start of the compressed data. There is a length field, which only describes the length of the SOS record itself. The compressed data which follows is of unknown length.&lt;/li&gt;
  &lt;li&gt;The frame starts with an SOI (FF D8) and ends with an EOI (FF D9).&lt;/li&gt;
&lt;/ul&gt;
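
&lt;p&gt;Here is a minimal sketch of those rules (my illustration, not the code in typehandler.cpp): it walks the markers of a single image and returns the offset just past the EOI, which is exactly the frame length that the headers discussed below exist to avoid computing.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

// returns the length of one JPEG image, or 0 if malformed or incomplete
size_t jpegFrameLength(const unsigned char* p, size_t len) {
    size_t i = 0;
    while (i + 1 &amp;lt; len) {
        if (p[i] != 0xFF) return 0;                // expected a marker
        unsigned char m = p[i + 1];
        if (m == 0xFF) { i++; continue; }          // padding byte
        if (m == 0xD8) { i += 2; continue; }       // SOI
        if (m == 0xD9) return i + 2;               // EOI: end of frame
        if (m &amp;gt;= 0xD0 &amp;amp;&amp;amp; m &amp;lt;= 0xD7) { i += 2; continue; } // two-byte code
        if (i + 3 &amp;gt;= len) return 0;
        size_t seg = ((size_t)p[i + 2] &amp;lt;&amp;lt; 8) | p[i + 3]; // includes itself
        i += 2 + seg;
        if (m == 0xDA) {
            // SOS: skip compressed data; FF 00 is an escaped 0xFF and
            // FF D0..D7 are restart codes, anything else ends the scan
            while (i + 1 &amp;lt; len) {
                if (p[i] != 0xFF) { i++; continue; }
                unsigned char n = p[i + 1];
                if (n == 0x00 || (n &amp;gt;= 0xD0 &amp;amp;&amp;amp; n &amp;lt;= 0xD7)) { i += 2; continue; }
                break;
            }
        }
    }
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;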

&lt;p&gt;In the MP4 index, each frame is indexed as a single chunk of data. For interlaced frames, the decoder needs to separate the two images, but as you can see from that brief description, that means parsing the whole scan data to find the EOI code. To avoid this, decoders require an additional header at the start of each image, giving the length of the whole image.&lt;/p&gt;

&lt;p&gt;Motion JPEG AVI files typically have an APP0 header (FF E0), with the tag ‘AVI1’. The structure of this data can be seen as the APP0 struct in typehandler.cpp, but essentially it just contains the fieldsize for the whole JPEG image. Quicktime requires an APP1 header (FF E1), with a ‘mjpg’ tag. This structure (struct APP1 in typehandler.cpp) is more complicated, as it contains offsets to key fields within the image header as well as the overall length. &lt;/p&gt;

&lt;p&gt;My mux will create or fix up both APP0 and APP1 headers, creating files that are playable by all the players I tried. If one of them is present, I use it to determine the length of the image. If neither is present, the mux will scan the entire image to find the EOI. This works correctly, but slows down the multiplexing process considerably.&lt;/p&gt;

&lt;p&gt;There’s another catch.  Motion JPEG files normally omit the Huffman table if the default table is used. However, Quicktime will not decompress images that omit the Huffman table, so the multiplexor will insert the default table if it is not present. This, unfortunately, adds 420 bytes to every image (840 bytes for interlaced frames). &lt;/p&gt;

&lt;p&gt;The final point is that any MP4 containing Motion JPEG will be rejected by QuickTime. It needs to be labelled as a MOV file. I had already met this issue when adding support for uncompressed PCM audio, and fortunately the fix is very simple. The first four bytes of the ‘ftyp’ atom at the start of the file contain the file type. I’ve been using ‘mp42’ for MP4 files. Changing this to ‘qt  ’ means that QuickTime will accept it as a valid MOV file. My mux will do this if any of the contained streams are in a non-MP4 format (that is, Motion JPEG or uncompressed PCM audio).&lt;/p&gt;
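
&lt;p&gt;The relabelling itself is tiny; a sketch (the function name is mine): the major brand is the four bytes immediately after the ‘ftyp’ tag, which itself follows the four-byte atom size at the very start of the file.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;string.h&amp;gt;

// relabel an MP4 as a QuickTime MOV by rewriting the major brand
bool relabelAsQuickTime(unsigned char* file, size_t len) {
    if (len &amp;lt; 12 || memcmp(file + 4, &quot;ftyp&quot;, 4) != 0)
        return false;                   // no leading ftyp atom
    memcpy(file + 8, &quot;qt  &quot;, 4);        // was &quot;mp42&quot; for plain MP4
    return true;
}
&lt;/code&gt;&lt;/pre&gt;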

&lt;p&gt;2nd May 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Improved Interleaving in MP4 Multiplexor</title>
      <link>http://www.gdcl.co.uk/2013/04/03/MP4-Interleaving.html</link>
      <pubDate>Wed, 03 Apr 2013 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/04/03/MP4-Interleaving.html</guid>
      <description>
&lt;h1 id=&quot;improved-interleaving-in-mp4-multiplexor&quot;&gt;Improved Interleaving in MP4 Multiplexor&lt;/h1&gt;
&lt;p&gt;I’ve improved the interleaving in the MP4 multiplexor to limit the maximum offset between the same timestamp in different streams. Previously, the multiplexor would divide up streams into chunks of about 1 second, and the offset between the same timestamp in different streams could be as much as two seconds of data.&lt;/p&gt;

&lt;p&gt;In this version, there is a 400kb limit on the interleave size, and a few other changes to ensure that one stream does not run ahead of the others. Search for &lt;code&gt;MAX_INTERLEAVE_SIZE&lt;/code&gt; if you want to change this limit from 400kb. The previous one second limit is applied as well, so each interleave piece must be smaller than both the time and byte size limits.&lt;/p&gt;
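
&lt;p&gt;The decision itself is simple enough to sketch (the constant name is from the source; the function around it is my illustration): the current chunk is closed once adding the next sample would exceed either limit.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;const long long MAX_INTERLEAVE_SIZE = 400 * 1024;    // bytes (400kb)
const long long MAX_INTERLEAVE_DURATION = 10000000;  // 1 second in 100ns units

// close the current chunk before adding this sample?
bool shouldCloseChunk(long long chunkBytes, long long chunkDuration,
                      long long sampleBytes, long long sampleDuration) {
    return (chunkBytes + sampleBytes) &amp;gt; MAX_INTERLEAVE_SIZE ||
           (chunkDuration + sampleDuration) &amp;gt; MAX_INTERLEAVE_DURATION;
}
&lt;/code&gt;&lt;/pre&gt;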

&lt;p&gt;I’ve also incorporated some comments on NALU parsing from Dmitri Vasilyev and added support for ADTS AAC audio. &lt;/p&gt;

&lt;p&gt;Download the latest source and binary &lt;a href=&quot;/mpeg4/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;3rd April 2013&lt;/p&gt;

</description>
    </item>
    
    <item>
      <title>Fix for repeated pause/resume in CapturePause</title>
      <link>http://www.gdcl.co.uk/2013/04/03/Fix-CapturePause.html</link>
      <pubDate>Wed, 03 Apr 2013 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/04/03/Fix-CapturePause.html</guid>
      <description>
&lt;h1 id=&quot;fix-for-repeated-pauseresume-in-capturepause&quot;&gt;Fix for repeated pause/resume in CapturePause&lt;/h1&gt;

&lt;p&gt;I’ve updated the iPhone CapturePause sample to fix a problem that would occur if you repeatedly clicked on pause and resume during recording. The sample calculates the length of the pause and adjusts subsequent timestamps by that amount so that the resulting video is consistent. However, in some cases, the length of the pause would be calculated wrongly, which would result in errors from the asset writer (because the timestamps were prior to previously submitted timestamps).&lt;/p&gt;

&lt;p&gt;Download the sample &lt;a href=&quot;/iOS/capturepause.zip&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;3rd April 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>iPhone Recording — Resume after Background</title>
      <link>http://www.gdcl.co.uk/2013/03/22/iPhone-Resume-Recording.html</link>
      <pubDate>Fri, 22 Mar 2013 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/03/22/iPhone-Resume-Recording.html</guid>
      <description>&lt;h1 id=&quot;iphone-recording-resume-after-background&quot;&gt;iPhone Recording: Resume after Background&lt;/h1&gt;

&lt;p&gt;A reader asks if it is possible to adapt &lt;a href=&quot;/2013/02/20/iPhone-Pause.html&quot;&gt;this sample&lt;/a&gt; (demonstrating pausing and resuming iPhone video recording) so that it can resume when the app returns from the background.&lt;/p&gt;

&lt;p&gt;The short answer is that (as far as I can see) it is possible, but fiddly.&lt;/p&gt;

&lt;p&gt;The AVFoundation asset writer does not have the option to append to an existing file. So if you want to resume recording to the same file, you’ll need to keep that open while in the background. The MP4 file format has an index that is written out to the file at the end of recording. Without this index, it is not possible to play the file. So if you leave the file open, it is not playable at all, and if you close it, you cannot write any more to it.&lt;/p&gt;

&lt;p&gt;The asset writer writes to the file asynchronously. So when your app is sent to the background, even if you terminate the camera session, there will still be data to write, and these writes will be terminated, causing an error. There’s no simple way to wait for the pending writes to complete, unless you close the whole file. So option 1 is to request background time and just keep the app active in the background. To be honest, this seems very unsatisfactory.&lt;/p&gt;

&lt;p&gt;The only other alternative is to start a new file, and combine it with the old file. You could either write the old file to the new file before resuming recording (but that makes resuming a very slow operation) or you could create a different file each time and combine them when you stop recording (but that makes stopping recording a slow operation). &lt;/p&gt;

&lt;p&gt;My suggestion is this:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Before suspending, you are recording to a file (file A) which you close on entering the background.&lt;/li&gt;
  &lt;li&gt;When you resume, start recording to a new file (file B).&lt;/li&gt;
  &lt;li&gt;In the background, create a new file C.&lt;/li&gt;
  &lt;li&gt;Write everything from file A to file C and delete file A.&lt;/li&gt;
  &lt;li&gt;Read the data from file B as it is being written, using &lt;a href=&quot;/2013/02/20/iOS-Video-Encoding.html&quot;&gt;this sample&lt;/a&gt;, and write to file C.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I think that, whichever approach you take, you will need to copy all the recorded data after resuming. If you write a separate file each time and combine all the files at the end, you only need to do this copy once, but it will be potentially a slow operation. This approach copies all the data every time you resume, but it does it in the background and should make both resuming and finishing rapid.&lt;/p&gt;

&lt;p&gt;For the time being, I’ll leave this as an exercise to the reader.&lt;/p&gt;

&lt;p&gt;22nd March 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Pause Recording on iPhone</title>
      <link>http://www.gdcl.co.uk/2013/02/20/iPhone-Pause.html</link>
      <pubDate>Wed, 20 Feb 2013 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/02/20/iPhone-Pause.html</guid>
      <description>
&lt;h1 id=&quot;pause-recording-on-iphone&quot;&gt;Pause Recording on iPhone&lt;/h1&gt;
&lt;p&gt;I’ve written an example iPhone app that shows how to pause and resume video recording on an iPhone. Instead of using a movie file output that would record directly from the camera, this sample uses a Data Output which forwards the captured video and audio to the app. The app then forwards this to an AVAssetWriter when recording is enabled.&lt;/p&gt;

&lt;p&gt;After a pause and resume, the timestamps need to be adjusted (or there will be a pause in the video). The sample does this using &lt;code&gt;CMSampleBufferCreateCopyWithNewTiming&lt;/code&gt;, since there isn’t any way to change the sample times for audio (for video, you can use a pixel buffer adapter).&lt;/p&gt;
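
&lt;p&gt;The retiming step looks roughly like this (a sketch using the CoreMedia C API; error handling is omitted and the offset parameter is my name): subtract the accumulated pause duration from each buffer’s presentation time and emit a retimed copy.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;CoreMedia/CoreMedia.h&amp;gt;

// return a copy of 'buf' with the pause offset removed from its timing
CMSampleBufferRef retimed(CMSampleBufferRef buf, CMTime pauseOffset) {
    CMItemCount count = 0;
    CMSampleBufferGetSampleTimingInfoArray(buf, 0, NULL, &amp;amp;count);
    CMSampleTimingInfo* timing = new CMSampleTimingInfo[count];
    CMSampleBufferGetSampleTimingInfoArray(buf, count, timing, &amp;amp;count);
    for (CMItemCount i = 0; i &amp;lt; count; i++) {
        timing[i].presentationTimeStamp =
            CMTimeSubtract(timing[i].presentationTimeStamp, pauseOffset);
        timing[i].decodeTimeStamp = kCMTimeInvalid;  // no re-ordering here
    }
    CMSampleBufferRef out = NULL;
    CMSampleBufferCreateCopyWithNewTiming(kCFAllocatorDefault, buf,
                                          count, timing, &amp;amp;out);
    delete [] timing;
    return out;
}
&lt;/code&gt;&lt;/pre&gt;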

&lt;p&gt;The app is very basic, but it is fully functional and is published in source form under an &lt;a href=&quot;/license.htm&quot;&gt;attribution license&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Download the sample &lt;a href=&quot;/iOS/capturepause.zip&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Update:&lt;/em&gt; See also &lt;a href=&quot;/2013/03/22/iPhone-Resume-Recording.html&quot;&gt;this note&lt;/a&gt;  on resuming after the app enters background mode.&lt;/p&gt;

&lt;p&gt;For another iPhone capture example, see &lt;a href=&quot;/2013/02/20/iOS-Video-Encoding.html&quot;&gt;this sample&lt;/a&gt;, which demonstrates how to get the compressed video and audio from AVAssetWriter for network streaming:&lt;/p&gt;

&lt;p&gt;27th February 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Hardware Video Encoding on iPhone — RTSP Server example</title>
      <link>http://www.gdcl.co.uk/2013/02/20/iOS-Video-Encoding.html</link>
      <pubDate>Wed, 20 Feb 2013 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/02/20/iOS-Video-Encoding.html</guid>
      <description>
&lt;h1 id=&quot;hardware-video-encoding-on-iphone--rtsp-server-example&quot;&gt;Hardware Video Encoding on iPhone — RTSP Server example&lt;/h1&gt;

&lt;p&gt;On iOS, the only way to use hardware acceleration when encoding video is to use AVAssetWriter, and that means writing the compressed video to file. If you want to stream that video over the network, for example, it needs to be read back out of the file.&lt;/p&gt;

&lt;p&gt;I’ve written an example application that demonstrates how to do this, as part of an RTSP server that streams H264 video from the iPhone or iPad camera to remote clients. The end-to-end latency, measured using a low-latency DirectShow client, is under a second. Latency with VLC and QuickTime playback is a few seconds, since these clients buffer somewhat more data at the client side.&lt;/p&gt;

&lt;p&gt;The whole example app is available in source form &lt;a href=&quot;/iOS/encoderdemo.zip&quot;&gt;here&lt;/a&gt; under an &lt;a href=&quot;/license.htm&quot;&gt;attribution license&lt;/a&gt;. It’s a very basic app,  but is fully functional. Build and run the app on an iPhone or iPad, then use Quicktime Player or VLC to play back the URL that is displayed in the app.&lt;/p&gt;

&lt;h2 id=&quot;details-details&quot;&gt;Details, Details&lt;/h2&gt;

&lt;p&gt;When the compressed video data is written to a MOV or MP4 file, it is written to an &lt;em&gt;mdat&lt;/em&gt; atom and indexed in the &lt;em&gt;moov&lt;/em&gt; atom. However, the &lt;em&gt;moov&lt;/em&gt; atom is not written out until the file is closed, and without that index, the data in &lt;em&gt;mdat&lt;/em&gt; is not easily accessible. There are no boundary markers or sub-atoms, just raw elementary stream. Moreover, the data in the &lt;em&gt;mdat&lt;/em&gt; cannot be extracted or used without the data from the &lt;em&gt;moov&lt;/em&gt; atom (specifically the lengthSize and SPS and PPS param sets). &lt;/p&gt;

&lt;p&gt;My example code takes the following approach to this problem:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Only video is written using the AVAssetWriter instance, or it would be impossible to distinguish video from audio in the &lt;em&gt;mdat&lt;/em&gt; atom.&lt;/li&gt;
  &lt;li&gt;Initially, I create two AVAssetWriter instances. The first frame is written to both, and then one instance is closed. Once the &lt;em&gt;moov&lt;/em&gt; atom has been written to that file, I parse the file and assume that the parameters apply to both instances, since the initial conditions were the same.&lt;/li&gt;
  &lt;li&gt;Once I have the parameters, I use a dispatch_source object to trigger reads from the file whenever new data is written. The body of the &lt;em&gt;mdat&lt;/em&gt; chunk consists of H264 NALUs, each preceded by a length field (a minimal reader for this layout is sketched after this list). Although the length of the &lt;em&gt;mdat&lt;/em&gt; chunk is not known, we can safely assume that it will continue to the end of the file (until we finish the output file and the &lt;em&gt;moov&lt;/em&gt; is added).&lt;/li&gt;
  &lt;li&gt;For RTP delivery of the data, we group the NALUs into frames by parsing the NALU headers. Since there are no AUDs marking the frame boundaries, this requires looking at several different elements of the NALU header.&lt;/li&gt;
  &lt;li&gt;Timestamps arrive with the uncompressed frames from the camera and are stored in a FIFO. These timestamps are applied to the compressed frames in the same order. Fortunately, the AVAssetWriter live encoder does not require re-ordering of frames. &lt;em&gt;Update:&lt;/em&gt; this is no longer true, and I now have a version that supports re-ordered frames.&lt;/li&gt;
  &lt;li&gt;When the file gets too large, a new instance of AVAssetWriter is used, so that the old temporary file can be deleted. Transition code must then wait for the old instance to be closed so that the remaining NALUs can be read from the &lt;em&gt;mdat&lt;/em&gt; atom without reading past the end of that atom into the subsequent metadata. Finally, the new file is opened and timestamps are adjusted. The resulting compressed output is seamless.&lt;/li&gt;
&lt;/ul&gt;
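
&lt;p&gt;A sketch of that length-prefixed read (my names; the lengthSize value comes from the parsed &lt;em&gt;moov&lt;/em&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

// read one NALU from the growing mdat body; returns bytes consumed,
// or 0 if the NALU has not yet been completely written to the file
size_t readNALU(const unsigned char* p, size_t avail, size_t lengthSize,
                const unsigned char** nalu, size_t* naluLen) {
    if (avail &amp;lt; lengthSize) return 0;
    size_t len = 0;
    for (size_t i = 0; i &amp;lt; lengthSize; i++)
        len = (len &amp;lt;&amp;lt; 8) | p[i];           // big-endian length prefix
    if (avail &amp;lt; lengthSize + len) return 0;  // wait for more data
    *nalu = p + lengthSize;
    *naluLen = len;
    return lengthSize + len;
}
&lt;/code&gt;&lt;/pre&gt;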

&lt;p&gt;A little experimentation suggests that we are able to read compressed frames from the file about 500ms after they are captured, and these frames then arrive at the client app around 200ms after that.&lt;/p&gt;

&lt;h2 id=&quot;rotation&quot;&gt;Rotation&lt;/h2&gt;

&lt;p&gt;For modern graphics hardware, it is very straightforward to rotate an image when displaying it, and this is the method used by AVFoundation to handle rotation of the camera. The buffers are captured, encoded and written to file in landscape orientation. If the device is rotated to portrait mode, a transform matrix is written out to the file to indicate that the video should be rotated for playback. At the same time, the preview layer is also rotated to match the device orientation.&lt;/p&gt;

&lt;p&gt;This is efficient and works in most cases. However, there isn’t a way to pass this transform matrix to an RTP client, so the view on a remote player will not match the preview on the device if it is rotated away from the base camera orientation.&lt;/p&gt;

&lt;p&gt;The solution is to rotate the pixel buffers after receiving them from the capture output and before delivering them to the encoder. There is a cost to this processing, and this example code does not include this extra step.&lt;/p&gt;

&lt;h2 id=&quot;update-april-2014&quot;&gt;Update (April 2014)&lt;/h2&gt;

&lt;p&gt;On iOS 7, and particularly with an iPhone 5s, the output written to the file has changed slightly and I needed to update the code to fix this. The updated code is available &lt;a href=&quot;/iOS/encoderdemo.zip&quot;&gt;here&lt;/a&gt;, and there is a complete explanation &lt;a href=&quot;/2014/04/22/Frame-Reordering.html&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Improved PCM support in MP4 Multiplexor</title>
      <link>http://www.gdcl.co.uk/2013/02/15/PCM-in-MP4.html</link>
      <pubDate>Fri, 15 Feb 2013 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/02/15/PCM-in-MP4.html</guid>
      <description>&lt;h1 id=&quot;improved-pcm-support-in-mp4-multiplexor&quot;&gt;Improved PCM support in MP4 Multiplexor&lt;/h1&gt;

&lt;p&gt;I’ve changed the uncompressed audio support in my MP4 Multiplexor filter at &lt;a href=&quot;http://www.gdcl.co.uk/mpeg4/&quot;&gt;gdcl.co.uk/mpeg4&lt;/a&gt; to provide better compatibility with Quicktime.&lt;/p&gt;

&lt;p&gt;The MP4 format does not provide good support for uncompressed PCM audio. Quicktime stores PCM in MP4 files using an old format, dating from before the file format was modified to support compressed audio. My &lt;a href=&quot;http://www.gdcl.co.uk/mpeg4/&quot;&gt;MP4 Demultiplexor&lt;/a&gt; has support for playback of these old index formats. With this update, I’ve added support for creating these types to the multiplexor.&lt;/p&gt;

&lt;p&gt;In the normal MP4 index, blocks of audio are indexed. The sample size, sample duration and sample-to-chunk tables (&lt;em&gt;stsz&lt;/em&gt;, &lt;em&gt;stts&lt;/em&gt;, &lt;em&gt;stsc&lt;/em&gt;) all contain one entry for each block of audio. For this case, the MP4 multiplexor creates one index entry for each IMediaSample object that arrives at the input.&lt;/p&gt;

&lt;p&gt;For uncompressed audio, the unit of indexing is the audio sample. So for 16-bit stereo, the sample size is 4 bytes. The sample-to-chunk table (&lt;em&gt;stsc&lt;/em&gt;) reports how many (4-byte) samples are in each chunk. The sample duration table (&lt;em&gt;stts&lt;/em&gt;) reports the duration of each sample, but since the scale is always set to the sampling frequency, the duration is always 1. The sample size table (&lt;em&gt;stsz&lt;/em&gt;), for some unexplained reason, reports the size of each sample as a count of samples (huh?), so every entry is 1.&lt;/p&gt;
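
&lt;p&gt;Concretely, the entries for one chunk of uncompressed audio work out as follows (a worked sketch with my own names):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;struct PCMIndexEntry {
    long samplesInChunk;  // stsc: count of audio samples in the chunk
    long sampleDuration;  // stts: always 1, since timescale = sample rate
    long sampleSize;      // stsz: reported as 1 per sample (sic)
};

PCMIndexEntry indexPCMChunk(long chunkBytes, int channels, int bitsPerSample) {
    int bytesPerSample = channels * (bitsPerSample / 8); // 4 for 16-bit stereo
    PCMIndexEntry e;
    e.samplesInChunk = chunkBytes / bytesPerSample;
    e.sampleDuration = 1;
    e.sampleSize = 1;
    return e;
}
&lt;/code&gt;&lt;/pre&gt;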

&lt;p&gt;There are a few other changes when PCM audio is stored and the old index format is used:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;For Quicktime to accept these files without complaint, it is necessary to use ‘qt  ’ instead of ‘mp42’ in the ‘ftyp’ file type atom. However, these files, still labelled as ‘.mp4’ files, are playable in VLC and DirectShow as well as Quicktime.&lt;/li&gt;
  &lt;li&gt;In a normal MP4, the multiplexor combines IMediaSample objects up to a second’s worth of data and writes that as a single chunk. For uncompressed audio, each chunk is about a quarter-second of audio.&lt;/li&gt;
  &lt;li&gt;Timing of the audio stream is based solely on the sampling frequency of the audio. Any timestamps associated with the audio are essentially ignored.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;15th February 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>DirectShow and Beyond</title>
      <link>http://www.gdcl.co.uk/2013/01/10/DirectShow-and-beyond.html</link>
      <pubDate>Thu, 10 Jan 2013 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/01/10/DirectShow-and-beyond.html</guid>
      <description>&lt;h1 id=&quot;directshow-and-beyond&quot;&gt;DirectShow and Beyond&lt;/h1&gt;

&lt;p&gt;When I wrote my first filter in the autumn of 1994, I really did not think that I would still be creating DirectShow filters 18 years later. DirectShow has not just lasted well, it has thrived. Present on hundreds of millions of computers, it is used by a wide range of developers, in cars, in lecture rooms and in brain surgery as well as more traditional entertainment. It has scaled well, despite the fact that, for most of the initial development cycle, we had no multiprocessor systems and no hardware acceleration to test it with.&lt;/p&gt;

&lt;p&gt;But in recent months, most of my development work has been on other platforms. Not because DirectShow is less capable, but simply because it is not present on the platforms that count.&lt;/p&gt;

&lt;h2 id=&quot;media-foundation&quot;&gt;Media Foundation&lt;/h2&gt;
&lt;p&gt;Microsoft began its move away from DirectShow with the release of Windows Vista in 2007, which included an alternative multimedia platform: Media Foundation. This borrows heavily from the architecture of DirectShow, but is less flexible. Instead of the open graph, it aims to provide a more centrally-controlled model in which the components are more plug-ins than primary agents. This provides a more secure environment for playback of DRM-protected media. &lt;/p&gt;

&lt;p&gt;However, the initial release of MF had far fewer features than DirectShow, and proved harder to use for many simple tasks — developers who’ve long complained about DirectShow’s steep learning curve will appreciate the irony of that. The result was that MF was little used and, for the life of Vista and Windows 7, DirectShow continued to be the architecture of choice for third-party developers.&lt;/p&gt;

&lt;p&gt;With the benefit of hindsight, it looks as though Microsoft bet the wrong way. During the development of Vista, Microsoft tried to placate content owners by offering a secure system that would protect the content owners’ rights. The effect this had on third-party developers was ignored — Microsoft assumed that they would have no choice but to follow its edicts (or perhaps thought that third-party developers no longer mattered). Five years later, DRM seems to be of much less importance while platforms stand or fall on the strength of third-party software support.&lt;/p&gt;

&lt;h3 id=&quot;windows-8&quot;&gt;Windows 8&lt;/h3&gt;
&lt;p&gt;In Vista and Windows 7, MF was an alternative to DirectShow. Windows 8 goes one step further. DirectShow is not available on Windows RT (the ARM version) or to any &lt;em&gt;Windows Store&lt;/em&gt; apps (formerly &lt;em&gt;Metro&lt;/em&gt;) on either ARM or Intel systems. These apps have access to a cut-down MF API. Only the &lt;em&gt;Desktop&lt;/em&gt; apps, on Intel systems, can still use DirectShow.&lt;/p&gt;

&lt;p&gt;Developers writing desktop apps for Windows 8 will have a choice between the &lt;em&gt;Windows Store&lt;/em&gt; environment and the traditional desktop. It’s far from clear which will see the bulk of development effort. To me, as a user, the desktop environment feels like a legacy platform; users who drop into the desktop environment will feel as though they have been exposed to Windows’ raw underbelly. But on the other hand, I don’t really want to run all my apps full-screen on my 27-inch display, and the gestures designed for touch-screen tablets do not work on a desktop.&lt;/p&gt;

&lt;p&gt;It’s too early to say how much impact this will have on developers’ priorities. Upgrade rates for new versions of Windows are often quite slow: by June 2012, nearly three years after release, Windows 7 was installed on 600 million systems, compared to about 675 million systems still running Windows XP.  And sales of Windows 8 are reported to be rather slower. So it’s likely that, for some considerable time to come, most Windows systems will be running XP or Windows 7, and DirectShow will be a useful tool for some years to come, even on Windows 8. But, ultimately, the writing is on the wall.&lt;/p&gt;

&lt;h3 id=&quot;mixing-mf-and-directshow&quot;&gt;Mixing MF and DirectShow&lt;/h3&gt;

&lt;p&gt;There is a lot of similarity between MF and DirectShow, and it is certainly possible to mix components, with a certain amount of code. I’ve not had the need to run DirectShow filters in MF, but I’ve written wrapper filters for MF objects and it is not that difficult.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The time model is similar — data from the source object is timestamped in 100ns units in both cases, but MF timestamps are absolute, not relative to the seek point (see the sketch after this list).&lt;/li&gt;
  &lt;li&gt;The media type in MF is an object, not a structure, but it has methods to get and set the media type using a DirectShow representation. You may still need to apply a few tweaks, though. For example, AAC audio media types in MF use a structure that is not recognised by DirectShow decoders, so you will need to convert this.&lt;/li&gt;
  &lt;li&gt;Media Foundation MFT objects are very similar to DirectShow DMOs. You can easily build a transform object that can work in both environments.&lt;/li&gt;
  &lt;li&gt;Sink objects and renderers are, apparently, rather harder to mix.&lt;/li&gt;
&lt;/ul&gt;
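
&lt;p&gt;For the first point, the conversion in a wrapper filter amounts to little more than this (a sketch, not a published wrapper; the seek-start parameter is my name):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;mfobjects.h&amp;gt;
#include &amp;lt;strmif.h&amp;gt;

// copy an MF sample's time onto a DirectShow sample; MF times are
// absolute 100ns units, so subtract the stream's seek start first
HRESULT CopySampleTime(IMFSample* pMF, IMediaSample* pDS,
                       LONGLONG hnsSeekStart) {
    LONGLONG hnsTime = 0, hnsDuration = 0;
    HRESULT hr = pMF-&amp;gt;GetSampleTime(&amp;amp;hnsTime);
    if (FAILED(hr)) return hr;
    pMF-&amp;gt;GetSampleDuration(&amp;amp;hnsDuration);  // duration is optional in MF
    REFERENCE_TIME rtStart = hnsTime - hnsSeekStart;
    REFERENCE_TIME rtEnd = rtStart + hnsDuration;
    return pDS-&amp;gt;SetTime(&amp;amp;rtStart, &amp;amp;rtEnd);
}
&lt;/code&gt;&lt;/pre&gt;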

&lt;h2 id=&quot;apple&quot;&gt;Apple&lt;/h2&gt;
&lt;p&gt;Of course one of the most significant reasons for the decline in importance of DirectShow is the return to prominence of Apple. Apple’s importance as a computer manufacturer, and the success of iOS, means that many developers are writing for Apple platforms as a first choice. &lt;/p&gt;

&lt;p&gt;Even in the darkest days of the previous century, Apple computers were the preferred tool of most people dealing with computer-based video and audio, and the Quicktime framework long predates anything created by Microsoft. However, the move to 64-bit has not been executed flawlessly, and the divergence and subsequent realignment of OS X and iOS means that there are multiple choices of API frameworks for digital video software and, surprisingly, none are ideal.&lt;/p&gt;

&lt;p&gt;There are three options: QuickTime, QTKit and AV Foundation.&lt;/p&gt;

&lt;h3 id=&quot;quicktime&quot;&gt;QuickTime&lt;/h3&gt;
&lt;p&gt;The QuickTime API has been around since 1991 and supports a wide range of formats, including third-party container format support and third-party video and audio codecs. However, it is limited to 32-bit applications —no 64-bit API is (or will be) provided for QuickTime. &lt;/p&gt;

&lt;p&gt;QuickTime X is available in both 32-bit and 64-bit versions, and includes support for MPEG PS and TS containers and mpeg-2 video decoding. However, there is no API to access QuickTime X directly. The C API provides access only to the QuickTime 7 framework .&lt;/p&gt;

&lt;h3 id=&quot;qtkit&quot;&gt;QTKit&lt;/h3&gt;
&lt;p&gt;QTKit is an Objective-C framework that wraps QuickTime functionality.  It looks as if Apple intended QTKit to develop into a replacement for QuickTime as an API, but has since abandoned that in favour of AV Foundation.&lt;/p&gt;

&lt;p&gt;QTKit supports both 64-bit and 32-bit apps. In a 64-bit process, older 32-bit QuickTime codecs are still available: QTKit launches a 32-bit proxy process and opens the clip in that process. It will use both QuickTime 7 and QuickTime X, but QuickTime X and 64-bit functionality is only available for playback, not any sort of editing, transcoding or processing.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If you specify a &lt;em&gt;playback-only&lt;/em&gt; attribute on opening the media file, you can use both QT X and QT 7 functionality: in a 64-bit app, third-party containers and codecs are opened in a proxy 32-bit process and everything else is opened in 64-bits using QT X. &lt;/li&gt;
  &lt;li&gt;However, if you specify &lt;em&gt;playback-only&lt;/em&gt;, you can only pass the handle to a QTMovieView object to play it back. You can do nothing else, not even enumerate the tracks in the movie (and the size is reported as 0 x 0). &lt;/li&gt;
  &lt;li&gt;If you open without &lt;em&gt;playback-only&lt;/em&gt;, you are limited to QT 7 functionality. All files are opened in a 32-bit process, and you cannot open MPEG-2 TS or PS files.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;av-foundation&quot;&gt;AV Foundation&lt;/h3&gt;
&lt;p&gt;Both QTKit and QuickTime are only available on OS X. AV Foundation is the media framework for iOS, and has been brought to OS X starting with 10.7 (Lion). It is an Objective-C framework that supports a range of playback, editing and transcoding features, for 32-bit and 64-bit applications.&lt;/p&gt;

&lt;p&gt;However, it is limited in the containers and codecs it supports, and there is no option for third-party extensions to support other formats. It supports MPEG-2 PS and TS files (on input) and MOV/MP4 files with common codec formats (H264/mpeg-4, AAC/MP3), but there is no way to open MKV files or decode DNxHD streams, for example. &lt;/p&gt;

&lt;p&gt;You also cannot access the encoders and decoders separately, with third-party container support. So, for example, if you want to encode video and write to a TS file, you must write to an MP4 file (which will take advantage of hardware-accelerated encoding), but then you must read the elementary stream data from the MP4 file to feed to your own TS multiplexor (it is possible, with a little work, to do this during encoding before the file is finalised, but it’s certainly a little fiddly).&lt;/p&gt;

&lt;p&gt;If you need support for 3rd party containers or codecs, you will need to use Quicktime 7, either in a 32-bit C application or in Objective-C in 32-bit (or 64-bit, with a 32-bit proxy process for the decoding). Otherwise, you will probably choose AV Foundation (from Objective-C).&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>XVID encoder support in MP4 Multiplexor</title>
      <link>http://www.gdcl.co.uk/2013/01/09/XVID-Support-in-MP4-Mux.html</link>
      <pubDate>Wed, 09 Jan 2013 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2013/01/09/XVID-Support-in-MP4-Mux.html</guid>
      <description>&lt;h1 id=&quot;xvid-encoder-support-in-mp4-multiplexor&quot;&gt;XVID Encoder support in MP4 Multiplexor&lt;/h1&gt;

&lt;p&gt;The MP4 Multiplexor filter at &lt;a href=&quot;http://www.gdcl.co.uk/mpeg4/&quot;&gt;gdcl.co.uk/mpeg4&lt;/a&gt; has been updated with support for the XVID mpeg-4 video encoder.&lt;/p&gt;

&lt;p&gt;The multiplexor filter queues samples at its inputs so that the interleaving can be managed correctly. This requires an allocator at each input that provides a significant number of samples. If the allocator chosen by the source pin does not provide enough buffers, the multiplexor creates a private allocator and copies all buffers to this allocator on Receive. &lt;/p&gt;

&lt;p&gt;For this private allocator, older versions of the filter used a pool of fixed size buffers, where the size was determined by the cbSize field reported by the source pin’s allocator. The XVID mpeg-4 video encoder reports an output buffer size of 12MB. When the multiplexor tried to allocate a large number of 12MB buffers, it would often fail to allocate the memory, and this error was compounded by an error-handling bug that resulted in a silent failure to capture any video.&lt;/p&gt;

&lt;p&gt;In any case, the fixed-size buffer pool is inefficient with compressed video, where each frame is in a separate buffer but the size of each frame varies considerably. I’ve implemented a new allocator that uses a single pool of memory, with IMediaSample objects created dynamically, and locked ranges used to manage the refcounts. As well as fixing this particular problem, this new allocator will be a useful sample class for those wishing to use variable length buffers in DirectShow filters.&lt;/p&gt;
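
&lt;p&gt;The core of the idea can be sketched without any of the DirectShow plumbing (the real filter wraps this in IMemAllocator and IMediaSample; the class below is my simplification). Because downstream filters release samples in roughly FIFO order, a single ring of locked ranges over one pool is enough:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;deque&amp;gt;
#include &amp;lt;utility&amp;gt;
#include &amp;lt;vector&amp;gt;

class PoolAllocator {
    std::vector&amp;lt;unsigned char&amp;gt; m_pool;
    std::deque&amp;lt;std::pair&amp;lt;size_t, size_t&amp;gt; &amp;gt; m_locked; // (offset, length) FIFO
    size_t m_head;                                        // next write offset
public:
    explicit PoolAllocator(size_t bytes) : m_pool(bytes), m_head(0) {}

    // returns space for 'len' bytes, or NULL if the pool is exhausted
    unsigned char* alloc(size_t len) {
        if (len == 0 || len &amp;gt; m_pool.size()) return NULL;
        if (m_locked.empty()) m_head = 0;           // whole pool is free
        size_t tail = m_locked.empty() ? 0 : m_locked.front().first;
        if (!m_locked.empty() &amp;amp;&amp;amp; m_head &amp;lt;= tail) {
            // the free space is the single gap [m_head, tail)
            if (tail - m_head &amp;lt; len) return NULL;
        } else if (m_pool.size() - m_head &amp;lt; len) {
            // free space is [m_head, end) plus [0, tail): wrap if possible
            if (tail &amp;lt; len) return NULL;
            m_head = 0;                             // wastes the tail end
        }
        m_locked.push_back(std::make_pair(m_head, len));
        unsigned char* p = &amp;amp;m_pool[m_head];
        m_head += len;
        return p;
    }
    // called when the oldest outstanding sample is released
    void freeOldest() { if (!m_locked.empty()) m_locked.pop_front(); }
};
&lt;/code&gt;&lt;/pre&gt;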

&lt;p&gt;9th January 2013&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Windows 8 Support in MP4 filters</title>
      <link>http://www.gdcl.co.uk/2012/11/16/Win8-Mp4filters.html</link>
      <pubDate>Fri, 16 Nov 2012 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2012/11/16/Win8-Mp4filters.html</guid>
      <description>&lt;h1 id=&quot;windows-8-support-in-mp4-filters&quot;&gt;Windows 8 Support in MP4 filters&lt;/h1&gt;

&lt;p&gt;The MP4 filters at &lt;a href=&quot;http://www.gdcl.co.uk/mpeg4/&quot;&gt;www.gdcl.co.uk/mpeg4/&lt;/a&gt; have been updated with the following changes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;There was a small bug in the demux that prevented connection to the Microsoft DTV-DVD video decoder on Windows 8 (the biSize field was left zero). This is now fixed.&lt;/li&gt;
  &lt;li&gt;Encoders that supplied H264 byte stream format with an MPEG2VIDEOINFO descriptor would cause the MP4 Mux to think that the data was length-prepended format and create an invalid MP4 file. This is also fixed (one way to distinguish the two layouts is sketched after this list).&lt;/li&gt;
  &lt;li&gt;Several people reported that the MP4 mux would fail if upstream filters returned an error from GetPreroll.&lt;/li&gt;
&lt;/ul&gt;
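
&lt;p&gt;For reference, a sketch of one way to tell the two layouts apart (my heuristic, not necessarily the filter’s exact test): byte stream data begins with an Annex B start code, whereas length-prepended data begins with a big-endian NALU length.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

// true if the buffer starts with an Annex B start code (00 00 01 or
// 00 00 00 01) rather than a NALU length prefix
bool looksLikeByteStream(const unsigned char* p, size_t len) {
    if (len &amp;lt; 4) return false;
    if (p[0] == 0 &amp;amp;&amp;amp; p[1] == 0 &amp;amp;&amp;amp; p[2] == 1) return true;
    return p[0] == 0 &amp;amp;&amp;amp; p[1] == 0 &amp;amp;&amp;amp; p[2] == 0 &amp;amp;&amp;amp; p[3] == 1;
}
&lt;/code&gt;&lt;/pre&gt;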

&lt;p&gt;16th November 2012&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Update to Monitor filter</title>
      <link>http://www.gdcl.co.uk/2012/06/09/Monitor.html</link>
      <pubDate>Sat, 09 Jun 2012 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2012/06/09/Monitor.html</guid>
      <description>&lt;h1 id=&quot;update-to-monitor-filter&quot;&gt;Update to Monitor Filter&lt;/h1&gt;

&lt;p&gt;I’ve posted an update to the Monitor filter &lt;a href=&quot;/mobile/monitor_source.zip&quot;&gt;source&lt;/a&gt; and &lt;a href=&quot;/mobile/win32.zip&quot;&gt;binary&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Monitor is a pass-through filter for DirectShow that logs all the timestamps and media types of the data passing through it. It’s a very useful debugging tool that I originally published for Windows Mobile &lt;a href=&quot;/mobile&quot;&gt;here&lt;/a&gt;, but I’m now maintaining it only for Win32.&lt;/p&gt;

&lt;p&gt;It supports IFileSinkFilter for the log filename, so when you insert it into a graph, GraphEdt will ask you for the name of the file. This is used to write log output.&lt;/p&gt;

&lt;p&gt;The fixes in this release include improved support for Unicode, multithread-safe log output, and a fix to allow multiple independent monitor filters in the same graph.&lt;/p&gt;

&lt;p&gt;June 2012&lt;/p&gt;

</description>
    </item>
    
    <item>
      <title>Seeking Problem with MPEG-4 Demux</title>
      <link>http://www.gdcl.co.uk/2012/05/21/Mpeg4Seeking.html</link>
      <pubDate>Mon, 21 May 2012 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2012/05/21/Mpeg4Seeking.html</guid>
      <description>&lt;h1 id=&quot;seeking-problem-with-mpeg-4-demux&quot;&gt;Seeking Problem with Mpeg-4 Demux&lt;/h1&gt;

&lt;p&gt;Todor Todorov has tracked down a bug when using the mpeg4 demux filter in combination with Microsoft’s DTV-DVD decoders. When seeking, the demux will issue a Flush operation even if the graph is inactive, and this causes the decoder to malfunction and no video data is decoded after the seek.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;http://www.gdcl.co.uk/mpeg4/mpeg4.zip&quot;&gt;latest version&lt;/a&gt;  (1.0.0.11) of the demux fixes this.&lt;/p&gt;

&lt;p&gt;21st May 2012&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>GMFBridge Update to 1.0.0.20</title>
      <link>http://www.gdcl.co.uk/2012/May/livetiming.htm</link>
      <pubDate>Tue, 15 May 2012 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2012/May/livetiming.htm</guid>
      <description>&lt;h2 id=&quot;live-timing-support&quot;&gt;Live Timing Support&lt;/h2&gt;

&lt;p&gt;GMFBridge 1.0.0.20 includes support for Live Timing via an extended interface, IGMFBridgeController3. This option prevents the bridge from adjusting timestamps at all, and is required in some cases when recording from live sources, with the audio in a separate graph. This is discussed further &lt;a href=&quot;../../2011/July/GMFBridgeTimestamps.htm&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It also includes other changes collected over the last few months:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Visual Studio 2010 support&lt;/li&gt;
  &lt;li&gt;The logfile (gmfbridge.txt) is now looked for in My Documents instead of c:\. Create this file before running to make a log of the bridge’s operations.&lt;/li&gt;
  &lt;li&gt;When copying audio samples, nBlockAlign is now respected correctly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;15 May 2012&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>MP4 Filters updated</title>
      <link>http://www.gdcl.co.uk/2012/05/07/mp4-filters-10010.html</link>
      <pubDate>Mon, 07 May 2012 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2012/05/07/mp4-filters-10010.html</guid>
      <description>&lt;h1 id=&quot;update-to-mp4-filters&quot;&gt;Update to MP4 Filters&lt;/h1&gt;

&lt;p&gt;The MP4 filters at &lt;a href=&quot;http://www.gdcl.co.uk/mpeg4/&quot;&gt;www.gdcl.co.uk/mpeg4/&lt;/a&gt; have been updated with the following changes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for DV, mpeg-2 video, xdcam and I420 video types in demux&lt;/li&gt;
  &lt;li&gt;Support for 24-bit PCM, 32-bit PCM and ‘raw ‘ audio in demux&lt;/li&gt;
  &lt;li&gt;Changes to H-264 BSF/length-prepended detection in demux&lt;/li&gt;
  &lt;li&gt;Visual Studio 2010 support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;7th May 2012&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>The GMFBridge Toolkit</title>
      <link>http://www.gdcl.co.uk/gmfbridge/index.htm</link>
      <pubDate>Sun, 01 Jan 2012 00:00:00 +0000</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/gmfbridge/index.htm</guid>
      <description>&lt;h1 id=&quot;gmfbridge&quot;&gt;GMFBridge&lt;/h1&gt;
&lt;h2 id=&quot;multiple-graphs-in-directshow&quot;&gt;Multiple Graphs in DirectShow&lt;/h2&gt;
&lt;p&gt;Applications sometimes need to start and stop some filters independently of others, and to switch connections dynamically. GMFBridge is a multi-graph toolkit that shows how to use multiple DirectShow graphs to solve these problems. There are two example applications (in C++ and VB6).&lt;/p&gt;

&lt;p&gt;In this article, Geraint Davies shows how to solve these problems using multiple, related graphs of filters. The accompanying source code includes the &lt;strong&gt;GMFBridge&lt;/strong&gt; components and two example applications: &lt;strong&gt;GMFPlay&lt;/strong&gt; shows how to view multiple clips as a single movie, and &lt;strong&gt;GMFPreview&lt;/strong&gt; demonstrates how to keep showing the preview stream from a video capture device while starting and stopping capture into different files.&lt;/p&gt;

&lt;p&gt;Published: November 2004. Latest update May 2012.&lt;/p&gt;

&lt;p&gt;Download here:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;../articles/MultipleGraphs.pdf&quot;&gt;Multiple Graphs in DirectShow&lt;/a&gt; article (PDF)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;GMFBridge.zip&quot;&gt;Source code&lt;/a&gt; to GMFBridge tool and example applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some &lt;strong&gt;FAQs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Don’t forget to register the DLLs with &lt;strong&gt;regsvr32&lt;/strong&gt;.&lt;/li&gt;
  &lt;li&gt;The project files are for VS2010 or VS2008. For the sample apps, you’ll need &lt;strong&gt;WTL 8.0&lt;/strong&gt; which you can download &lt;a href=&quot;http://www.microsoft.com/en-us/download/details.aspx?id=9668&quot;&gt;here&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;With VC6, if you see an error with &lt;strong&gt;LONG_PTR&lt;/strong&gt; in building GMFBridge_h.h from the idl file, make sure that your DirectX 9 SDK include files are higher on the include path than the VC98 include files.&lt;/li&gt;
  &lt;li&gt;If you want to view the graphs used, re-enable the Running Object Table code in BridgeSink::JoinFilterGraph and BridgeSource::JoinFilterGraph and then use GraphEdt’s Connect To Remote Graph feature. This is disabled by default as it can lead to refcount leaks. &lt;/li&gt;
  &lt;li&gt;If you want to debug problems, a good first step is to create an empty text file &quot;gmfbridge.txt&quot; in My Documents. If this file is present, GMFBridge will write log text to this file during execution.&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    
    <item>
      <title>Customising Rendering with IStreamBuilder</title>
      <link>http://www.gdcl.co.uk/2011/August/istreambuilder.htm</link>
      <pubDate>Sat, 06 Aug 2011 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2011/August/istreambuilder.htm</guid>
      <description>&lt;h1 id=&quot;customising-graph-building-with-istreambuilder&quot;&gt;Customising Graph Building with IStreamBuilder&lt;/h1&gt;

&lt;p&gt;DirectShow’s automated graph building works surprisingly well most of the time, but there are cases where you need to customise the graph-building logic a little. Doing this when you are building the graph in your application is quite straightforward (one of the simplest approaches is to insert a preferred filter into the graph before calling RenderFile). But from within a filter, it’s quite a bit harder.&lt;/p&gt;

&lt;p&gt;Consider a filter that outputs subtitles. By default, the graph manager will render this output pin to a separate renderer filter. However, you really want it connected to a secondary pin on the VMR, so that the subtitles are overlaid correctly on the video. How can you do this from within your filter, so that subtitles will be rendered even in other apps, such as Windows Media Player?&lt;/p&gt;

&lt;p&gt;The answer is to implement IStreamBuilder, which allows you to control the rendering of your output pin. I’ve used this on several occasions, and only recently discovered a couple of bugs in the DirectShow implementation. First, I’d like to outline the correct use of IStreamBuilder before describing the bugs.&lt;/p&gt;

&lt;h2 id=&quot;rendering-a-pin&quot;&gt;Rendering a pin&lt;/h2&gt;

&lt;p&gt;When DirectShow is trying to render a pin, it will query the pin for the IStreamBuilder interface. This is true for calls to RenderPin as well as RenderFile, but not for Connect operations (so you can control the rendering of your subtitle output without getting in the way when an app wants to connect your pin itself). &lt;/p&gt;

&lt;p&gt;If the pin implements IStreamBuilder, the graph manager will call its Render method. In your Render method, you can add filters, connect and disconnect pins, and, most importantly, you can delegate back to the graph manager without causing an infinite loop. So, for example, some demux filters will implement IStreamBuilder so that they can add a preferred decoder filter to the graph, and then call the graph manager’s Render method to complete the rendering.&lt;/p&gt;

&lt;h2 id=&quot;backout&quot;&gt;Backout&lt;/h2&gt;

&lt;p&gt;If you implement Render, you must also implement the Backout method to undo everything you did in Render. This includes removing any filters you added and undoing any connect or disconnect operations. Since the graph manager backs out in reverse order, you should find that when your Backout method is called, the graph state is the same as it was at the end of your Render method. In fact, this means that it is safe to mess with connections that the graph manager has made, provided that you undo these operations in your Backout method.&lt;/p&gt;
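
&lt;p&gt;To make the pattern concrete, here is a minimal sketch of the Render/Backout pair for the subtitle example. It is illustrative rather than production code: in a real filter these methods live on the output pin (which answers QueryInterface for IID_IStreamBuilder), and CLSID_VideoMixingRenderer stands in for whatever downstream filter you prefer.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;dshow.h&amp;gt;
#include &amp;lt;comdef.h&amp;gt;
_COM_SMARTPTR_TYPEDEF(IBaseFilter, __uuidof(IBaseFilter));

class CSubtitleRenderLogic
{
	IBaseFilterPtr m_pAdded;	// the filter we inserted, remembered for Backout

public:
	// called by the graph manager in place of its own rendering logic
	HRESULT Render(IPin* pPin, IGraphBuilder* pGraph)
	{
		// add our preferred downstream filter to the graph
		HRESULT hr = m_pAdded.CreateInstance(CLSID_VideoMixingRenderer);
		if (SUCCEEDED(hr))
		{
			hr = pGraph-&amp;gt;AddFilter(m_pAdded, L&quot;Preferred Renderer&quot;);
		}
		if (SUCCEEDED(hr))
		{
			// delegate back to the graph manager: it will try the filters
			// already in the graph (including ours) without re-entering
			// this pin's Render, so there is no infinite loop
			hr = pGraph-&amp;gt;Render(pPin);
		}
		if (FAILED(hr))
		{
			// clean up our own partial work; Backout is idempotent
			Backout(pPin, pGraph);
		}
		return hr;
	}

	// must undo everything Render did, in reverse order
	HRESULT Backout(IPin* pPin, IGraphBuilder* pGraph)
	{
		if (m_pAdded != NULL)
		{
			pGraph-&amp;gt;RemoveFilter(m_pAdded);	// disconnects its pins too
			m_pAdded = NULL;
		}
		return S_OK;
	}
};
&lt;/code&gt;&lt;/pre&gt;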

&lt;h2 id=&quot;directshow-bugs&quot;&gt;DirectShow Bugs&lt;/h2&gt;

&lt;p&gt;I was prompted to write this note by the recent discovery of a couple of bugs in the DirectShow handling of IStreamBuilder.&lt;/p&gt;

&lt;p&gt;While the graph manager is working down the graph rendering pins, it keeps track of what fraction of the stream each pin represents. So, if the demux has two outputs (for video and audio), each output pin represents 50% of the original source. If the video decoder then splits into a video and a subtitle output, then each of those is 25% of the original source (whereas the single audio output is still 50%). If a particular configuration of filters is found that renders some outputs but not others, the graph manager keeps a note of what fraction of the stream is rendered. If it fails to find any configuration that renders all the outputs, it will go back to the configuration that rendered the highest fraction, and rebuild that graph, before reporting a “partial success” code.&lt;/p&gt;

&lt;p&gt;When recreating a previous “best so far” configuration, the graph manager pulls &lt;em&gt;the same instance&lt;/em&gt; of the filter from its cache, and reconnects its input. It will then attempt to repeat the previous render operations on the output pins. Now, sometimes, the filter will not create the same output pins when it is reconnected. In my case, a problem with cleaning up in BreakConnect meant that no audio pins were created when the demux filter was connected a second time.&lt;/p&gt;

&lt;p&gt;In the regular Render code, this is detected and results in a simple error return. But when the graph builder is using IStreamBuilder to render an output pin, the code is in a slightly different order and a missing pin causes DirectShow to crash. Since the crash is deep in the filter graph manager, it’s very hard to relate this to the bug in the filter that caused it.&lt;/p&gt;

&lt;p&gt;There’s another problem in this same code, unfortunately. The graph manager code to recreate a partially-successful graph doesn’t work when IStreamBuilder is used. The pin render failure that should result in a partial-success code actually ends up being returned as an E_FAIL, so the whole render fails. So if you implement IStreamBuilder for all your demux output pins, and two of them can be rendered, but the third fails, you will not get a partial graph built in this case, as you would normally.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>GMFBridge and Timestamps</title>
      <link>http://www.gdcl.co.uk/2011/July/GMFBridgeTimestamps.htm</link>
      <pubDate>Fri, 01 Jul 2011 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2011/July/GMFBridgeTimestamps.htm</guid>
      <description>&lt;h1 id=&quot;gmfbridge-and-timestamps&quot;&gt;GMFBridge and Timestamps&lt;/h1&gt;

&lt;p&gt;GMFBridge is a sample that shows how to separate DirectShow tasks into multiple graphs, so that some filters can be started and stopped independently from others and so that you can switch sources or renderers without stopping the graph. &lt;/p&gt;

&lt;p&gt;Unless the graphs are all started and stopped in sync and share the same clock, they will have different time bases. Every sample that crosses between graphs will need its timestamps adjusted for the new graph. However, there are several different ways to do this, appropriate to different situations. In this article, I’ve tried to describe the issues and explain how GMFBridge solves them.&lt;/p&gt;

&lt;h2 id=&quot;stream-time&quot;&gt;Stream Time&lt;/h2&gt;

&lt;p&gt;As a little background, let me summarise the time handling in DirectShow very briefly. &lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;a &lt;em&gt;reference clock&lt;/em&gt; provides an absolute time; that is, a measure of wall-clock time according to some device. This time is always increasing and is not affected by a graph starting or stopping.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;the &lt;em&gt;stream time&lt;/em&gt; in a graph is the time that is currently being rendered. A sample that is timestamped 100ms will be due for rendering when the &lt;em&gt;stream time&lt;/em&gt; reaches 100ms.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;the &lt;em&gt;stream time offset&lt;/em&gt; is the offset from the &lt;em&gt;reference clock&lt;/em&gt; that gives stream time 0. This is given to each filter in the &lt;code&gt;Run()&lt;/code&gt; method. Subtracting the &lt;em&gt;stream time offset&lt;/em&gt; from the current reference clock time will give you the current stream time (see the sketch after this list).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
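
&lt;p&gt;Expressed as code, the relationship is a single subtraction. A trivial sketch (the function and parameter names are mine, not part of any API):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// current stream time = reference clock time - stream time offset
REFERENCE_TIME GetStreamTime(IReferenceClock* pClock, REFERENCE_TIME tOffset)
{
	REFERENCE_TIME tNow = 0;
	pClock-&amp;gt;GetTime(&amp;amp;tNow);	// absolute wall-clock reading
	return tNow - tOffset;		// the time currently being rendered
}
&lt;/code&gt;&lt;/pre&gt;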

&lt;h2 id=&quot;seamless-file-playback&quot;&gt;Seamless File Playback&lt;/h2&gt;

&lt;p&gt;One common use for GMFBridge is to play a sequence of clips seamlessly. Each clip is opened in a separate source graph, and these are connected in turn to a single render graph. Within each source graph, the timestamps will go from zero to the clip’s duration,  but in the render graph, the timestamps cannot suddenly go back to zero without a pause or seek. So we need to add on the previous timestamp so that they continue in sequence:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;source		render
(clip 0)
0			0
33			33
66			66
100			100
...			...
10,000		10,000
(clip 1)
0			10,033
33			10,066
66			10,099
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You can see this code in &lt;code&gt;BridgeSource::OnNewConnection()&lt;/code&gt;, which will find the latest stop time used by the previous clip and use that as the baseline start time for the new clip. Each time there’s a new connection, we record the previous stop time and we add that on to all clips from then on.&lt;/p&gt;

&lt;p&gt;Incidentally, if you stop and start the source graph while still connected to the bridge, this is also treated as a new connection, since the timestamps will start from 0 again.&lt;/p&gt;

&lt;h2 id=&quot;switching-live-sources&quot;&gt;Switching Live Sources&lt;/h2&gt;

&lt;p&gt;If you are switching between live sources, rather than files, this might not work so well. If the timestamps are adjusted to fit after the last timestamp of the previous source, then each time you switch sources, you might increase the latency slightly. In any case, this behaviour is not what you normally want with live sources. You want to switch to the next source &lt;em&gt;now&lt;/em&gt;. It does not make sense to fit the current source after the previous source if both are live.&lt;/p&gt;

&lt;p&gt;In this case, you can use BridgeAtDiscont to indicate to GMFBridge that the next clip is not seamless, but instead starts immediately. The bridge will get the current &lt;em&gt;stream time&lt;/em&gt;, add on a latency to allow time for the samples to pass through the graph, and use that as the baseline time for the new graph. So all samples will be offset by this baseline.&lt;/p&gt;

&lt;p&gt;You can review this code also, in &lt;code&gt;BridgeSource::OnNewConnection()&lt;/code&gt;. If this is a discontinuity rather than a seamless change, the baseline time is set to the current stream time plus 300ms (as an allowance for how long the samples will take to flow through the graph).&lt;/p&gt;
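
&lt;p&gt;In outline, the baseline selection amounts to something like the following sketch (condensed, with illustrative names; this is not the actual GMFBridge source):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// the baseline added to all timestamps from the newly-connected source graph
REFERENCE_TIME NewBaseline(bool bDiscont,
                           REFERENCE_TIME tStreamNow,	// downstream stream time now
                           REFERENCE_TIME tPrevStop)	// stop time of the previous clip
{
	// 300ms (in 100ns units): allowance for samples to flow through the graph
	const REFERENCE_TIME LATENCY = 300 * 10000;

	// discontinuity: play from now; seamless: continue after the previous clip
	return bDiscont ? (tStreamNow + LATENCY) : tPrevStop;
}
&lt;/code&gt;&lt;/pre&gt;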

&lt;p&gt;As an aside, note that the base class StreamTime method only works when the graph is running. It simply subtracts the &lt;em&gt;stream time offset&lt;/em&gt; from the current reference clock time. However, when the graph is paused, the stream time is not advancing, so you need to record the stream time at the moment of pause and use that until the next &lt;code&gt;Run()&lt;/code&gt; call. &lt;/p&gt;

&lt;h2 id=&quot;preview-then-record&quot;&gt;Preview Then Record&lt;/h2&gt;

&lt;p&gt;GMFBridge is often used to turn on and off the recording of a video without affecting the preview. A source graph contains the device capture filter, with the preview pin connected to a renderer. The source’s capture pin is connected to the bridge. When only preview is required, the graph is run and the capture pin’s samples are discarded at the bridge. When recording is turned on, the bridge is connected to a recording graph (containing multiplexor and file writer). Capture samples are then delivered over the bridge to the multiplexor and recording starts without the preview being affected at all.&lt;/p&gt;

&lt;p&gt;When recording starts, the preview graph may have been running for a while. So the first timestamp arriving at the bridge will be well past zero. But the capture graph has just started, and it is expecting samples starting at zero. All the logic described above, whether using contiguous or discontiguous time bases, relies on the first sample from a new clip being based at zero.&lt;/p&gt;

&lt;p&gt;To resolve this, the bridge sink (at the end of the source graph) will convert the timestamps to zero-based when connected to the bridge; the other end of the bridge will then convert the timestamps into the time base used in the downstream graph. This code is in &lt;code&gt;BridgeSink::AdjustTime()&lt;/code&gt;.&lt;/p&gt;
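
&lt;p&gt;Putting the two halves together, the per-sample adjustment reduces to a pair of shifts, sketched here with illustrative names (the shipped logic is in &lt;code&gt;BridgeSink::AdjustTime()&lt;/code&gt;):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// sink side: rebase the source graph's timestamps to zero at connect time
REFERENCE_TIME ToBridgeTime(REFERENCE_TIME tSample, REFERENCE_TIME tFirstAfterConnect)
{
	return tSample - tFirstAfterConnect;	// e.g. 133 becomes 0 in the table below
}

// source side: shift the zero-based time into the downstream graph's time base
REFERENCE_TIME ToRenderTime(REFERENCE_TIME tBridge, REFERENCE_TIME tBaseline)
{
	return tBridge + tBaseline;		// e.g. 0 becomes 1000 + 300
}
&lt;/code&gt;&lt;/pre&gt;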

&lt;p&gt;To demonstrate, the following table shows the timestamps that might be generated if two graphs were bridged when both graphs were already running, assuming that we use BridgeAtDiscont(true).&lt;/p&gt;

&lt;p&gt;At first the graph is disconnected, so the samples are dropped:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;source	in bridge	render
0
33
66
100
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then the graph is connected. So the next sample is passed to the bridge, but adjusted to zero as the start of a new sequence. In the render graph at this point, the stream time is 1000ms, so the first sample is timestamped at 1000ms (now) plus 300ms latency.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;133		0			1000 + 300 + 0  = 1300
166		33			1000 + 300 + 33 = 1333
200		67			1000 + 300 + 67 = 1367
...
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;live-timing&quot;&gt;Live Timing&lt;/h2&gt;

&lt;p&gt;Despite all these timestamp adjustments, there are still cases that GMFBridge does not handle well. Consider this: you have two cameras, and a single microphone, and you are streaming the output over the network. You need to switch between the cameras, so each of those is in a separate graph connected by a bridge. In the downstream graph, the video stream from the bridge is connected to the encoder and transmitter. You can switch seamlessly between cameras by bridging one or other of your source graphs to the encoder graph. You only have a single microphone, so that is connected directly in the encoder graph.&lt;/p&gt;

&lt;p&gt;The problem is that every time the bridge connection is changed, the time base will change slightly. If the audio were being switched as well as the video, this would not matter, since the adjustments would be made to both audio and video timestamps. But with the audio in the encoder graph (staying unchanged), the sync between the audio and the video is drifting off, getting worse each time the bridge is changed.&lt;/p&gt;

&lt;p&gt;Since the video and audio are live, we don’t really want to alter the timestamps at all. This can be done if all the graphs use the same clock. Then all that’s needed is to adjust for the &lt;em&gt;stream time offset&lt;/em&gt;, to allow for the graphs starting at slightly different times. This can be done by adding on the &lt;em&gt;stream time offset&lt;/em&gt; in the sink filter when entering the bridge, which converts the timestamp to an absolute reference clock time, and then subtracting the &lt;em&gt;stream time offset&lt;/em&gt; in the source filter on leaving the bridge.&lt;/p&gt;
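
&lt;p&gt;With a shared clock, the conversion on each side of the bridge reduces to adding and subtracting the respective offsets (again a sketch with illustrative names):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// sink filter, entering the bridge: stream time to absolute reference time
REFERENCE_TIME SinkToReference(REFERENCE_TIME tSample, REFERENCE_TIME tSinkOffset)
{
	return tSample + tSinkOffset;
}

// source filter, leaving the bridge: reference time to downstream stream time
REFERENCE_TIME SourceFromReference(REFERENCE_TIME tRef, REFERENCE_TIME tSourceOffset)
{
	return tRef - tSourceOffset;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because both offsets are measured against the same reference clock, the timestamps are never rescaled relative to the audio, so sync is preserved across switches.&lt;/p&gt;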

&lt;p&gt;I have a version of GMFBridge that implements this alternative timing model, that is currently in testing. I will publish it once it has been tested.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update&lt;/strong&gt;: this is now available &lt;a href=&quot;/2012/May/livetiming.htm&quot;&gt;here&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;1st July 2011&lt;/p&gt;

</description>
    </item>
    
    <item>
      <title>GMFBridge, Inftee and Copies</title>
      <link>http://www.gdcl.co.uk/2011/June/BridgeAndInfTee.htm</link>
      <pubDate>Mon, 27 Jun 2011 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2011/June/BridgeAndInfTee.htm</guid>
      <description>&lt;h1 id=&quot;gmfbridge-inftee-and-copies&quot;&gt;GMFBridge, InfTee and Copies&lt;/h1&gt;

&lt;p&gt;The GMFBridge tool shows how to divide DirectShow tasks into a number of separate graphs. This allows you to make changes while the graph is running, such as switching between different source files or input devices, and changing output files without stopping the graph.  GMFBridge shows that you can use multiple graphs without imposing much overhead, since it does not introduce any thread switches or copies of the data. &lt;/p&gt;

&lt;p&gt;This efficiency comes at the price of added complexity. Avoiding deadlocks can be quite complicated when you have a single thread calling across graphs that are in different states. In this article, I want to look at one aspect of that complexity: the difficulties of copying video frames. &lt;/p&gt;

&lt;h2 id=&quot;inftee-and-smart-tee&quot;&gt;Inftee and Smart Tee&lt;/h2&gt;

&lt;p&gt;One of the most common uses for GMFBridge is during video capture, to allow control over the multiplexor separately from the capture filter. This allows you to start and stop capture and change capture files without interrupting the preview display. To do this, you need to send the frames both to the preview renderer and to the multiplexor. This is most commonly done with either the &lt;em&gt;Infinite Pin Tee&lt;/em&gt; or the &lt;em&gt;Smart Tee&lt;/em&gt; filter.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;inftee&lt;/em&gt; filter sends the same buffer, with the same IMediaSample object, to all of its output pins. The buffer is returned to the allocator’s pool only when the downstream filters on all of the output pins have finished with it. Since the IMediaSample object is the same, all the output pins get the same timestamp and changes to the metadata affect all the filters.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;smart tee&lt;/em&gt; filter is designed specifically for use in video capture. In a video capture graph, frames are stamped with the time that they are captured; if these frames are delivered to a renderer, they will be late, since that time has already passed. The smart tee solves this by simply stripping off the timestamps, so the frames will be rendered immediately. It does this by wrapping another IMediaSample object around the same buffer. The original buffer, with timestamps intact, is sent via the capture pin to the multiplexor. The other IMediaSample object has no timestamps, but refers to the same physical buffer, and has a refcount on the original. The original buffer will not be returned to the allocator’s pool until both downstream filters have finished with it.&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;smart tee&lt;/em&gt; has another feature that is less popular. The preview pin drops frames if it considers that the recording is falling behind, and the mechanism used to decide this is fairly primitive, and based on the behaviour of capture graphs in typical systems of the mid-90s. For this reason, &lt;em&gt;inftee&lt;/em&gt; is often used instead.&lt;/p&gt;

&lt;h2 id=&quot;timestamp-correction&quot;&gt;Timestamp Correction&lt;/h2&gt;

&lt;p&gt;GMFBridge connects two separate filter graphs. These graphs will have separate time bases, possibly using separate clocks and almost certainly using a different stream time offset. So when samples are transferred between graphs, the timestamps need to be adjusted to fit the new graph’s time base. (As an aside, there are several different ways to adjust the timestamps, which is the subject of a separate note to be published shortly).&lt;/p&gt;

&lt;p&gt;The bridge adjusts the timestamps simply by setting the new timestamp on the sample. This means that any other filter using the same IMediaSample object (on another &lt;em&gt;inftee&lt;/em&gt; output pin) will see the timestamp correction. Of course, in the original graph, this timestamp correction can cause significant problems in lipsync or playback delays.&lt;/p&gt;

&lt;h2 id=&quot;multiplexor-queues&quot;&gt;Multiplexor Queues&lt;/h2&gt;

&lt;p&gt;When a buffer is sent to multiple output pins by the &lt;em&gt;inftee&lt;/em&gt; or &lt;em&gt;smart tee&lt;/em&gt; filters, it is not returned to the free pool until all of the downstream filters have finished with it. Some multiplexors require a long buffer queue in order to interleave audio and video in the correct ratio, and this can be a problem for other output pins using the same buffers. Typically, preview rendering will freeze since all the buffers are queued at the multiplexor, and in some cases, this will have a knock-on effect preventing audio delivery, which in turn prevents the multiplexor advancing, and thus the whole graph will deadlock.&lt;/p&gt;

&lt;h2 id=&quot;copying-frames&quot;&gt;Copying Frames&lt;/h2&gt;
&lt;p&gt;The simplest solution to these problems is to copy the frames if you are using a tee filter with GMFBridge. Since the data is usually uncompressed video, this can often be done simply by introducing a &lt;em&gt;Colour Converter&lt;/em&gt; filter into the graph between the &lt;em&gt;inftee&lt;/em&gt; and the bridge, but any transform filter (other than in-place transforms) will do. This will ensure that the timestamp modifications do not get carried over to other output pins, and will also make sure that the multiplexor is using buffers from a different allocator, which should prevent the long multiplexor queue from affecting preview rendering.&lt;/p&gt;
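
&lt;p&gt;As an illustration, the converter can be inserted explicitly when building the graph. In this sketch, &lt;code&gt;pTee&lt;/code&gt; and &lt;code&gt;pSink&lt;/code&gt; are assumed to be the tee and bridge sink filters already in the graph, and &lt;code&gt;GetUnconnectedPin&lt;/code&gt; is a local helper, not a DirectShow API:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;dshow.h&amp;gt;
#include &amp;lt;comdef.h&amp;gt;
_COM_SMARTPTR_TYPEDEF(IBaseFilter, __uuidof(IBaseFilter));
_COM_SMARTPTR_TYPEDEF(IEnumPins, __uuidof(IEnumPins));
_COM_SMARTPTR_TYPEDEF(IPin, __uuidof(IPin));

// find the first unconnected pin of the given direction on a filter
HRESULT GetUnconnectedPin(IBaseFilter* pFilter, PIN_DIRECTION dir, IPin** ppPin)
{
	IEnumPinsPtr pEnum;
	HRESULT hr = pFilter-&amp;gt;EnumPins(&amp;amp;pEnum);
	if (FAILED(hr))
	{
		return hr;
	}
	IPinPtr pPin;
	while (pEnum-&amp;gt;Next(1, &amp;amp;pPin, NULL) == S_OK)
	{
		PIN_DIRECTION pinDir;
		pPin-&amp;gt;QueryDirection(&amp;amp;pinDir);
		IPinPtr pPeer;
		if ((pinDir == dir) &amp;amp;&amp;amp; (pPin-&amp;gt;ConnectedTo(&amp;amp;pPeer) == VFW_E_NOT_CONNECTED))
		{
			*ppPin = pPin.Detach();
			return S_OK;
		}
	}
	return VFW_E_NOT_FOUND;
}

// insert the Colour Space Converter between the tee and the bridge sink, so
// that the bridge branch receives copies in the converter's own buffers
HRESULT InsertCopy(IGraphBuilder* pGraph, IBaseFilter* pTee, IBaseFilter* pSink)
{
	IBaseFilterPtr pCSC;
	HRESULT hr = pCSC.CreateInstance(CLSID_Colour);	// Colour Space Converter
	if (SUCCEEDED(hr)) hr = pGraph-&amp;gt;AddFilter(pCSC, L&quot;Copy&quot;);

	IPinPtr pTeeOut, pCSCIn, pCSCOut, pSinkIn;
	if (SUCCEEDED(hr)) hr = GetUnconnectedPin(pTee, PINDIR_OUTPUT, &amp;amp;pTeeOut);
	if (SUCCEEDED(hr)) hr = GetUnconnectedPin(pCSC, PINDIR_INPUT, &amp;amp;pCSCIn);
	if (SUCCEEDED(hr)) hr = pGraph-&amp;gt;ConnectDirect(pTeeOut, pCSCIn, NULL);
	if (SUCCEEDED(hr)) hr = GetUnconnectedPin(pCSC, PINDIR_OUTPUT, &amp;amp;pCSCOut);
	if (SUCCEEDED(hr)) hr = GetUnconnectedPin(pSink, PINDIR_INPUT, &amp;amp;pSinkIn);
	if (SUCCEEDED(hr)) hr = pGraph-&amp;gt;ConnectDirect(pCSCOut, pSinkIn, NULL);
	return hr;
}
&lt;/code&gt;&lt;/pre&gt;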

&lt;p&gt;The timestamp modification issue alone could also be prevented by using a different IMediaSample object wrapping the same physical buffer memory. That is, instead of copying the whole buffer, a new IMediaSample can be allocated but pointing to the same buffer (and holding a refcount on the original sample object). It’s possible that a future release of GMFBridge will incorporate this feature.&lt;/p&gt;
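
&lt;p&gt;To make the idea concrete, a wrapper might look something like the sketch below. This is hypothetical (it is not taken from any GMFBridge release): it holds a refcount on the inner sample, shares its buffer and flags, but keeps its own timestamps, so retimestamping in one graph cannot disturb other branches of the tee.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;dshow.h&amp;gt;

// shares the original sample's buffer, but carries its own timestamps
class CTimestampWrapper : public IMediaSample
{
	LONG m_cRef;
	IMediaSample* m_pInner;			// the original sample, refcounted
	REFERENCE_TIME m_tStart, m_tStop;	// private timestamps
	bool m_bTimeSet;

public:
	CTimestampWrapper(IMediaSample* pInner)
	: m_cRef(1), m_pInner(pInner), m_tStart(0), m_tStop(0), m_bTimeSet(false)
	{
		// keep the underlying buffer out of the allocator pool until released
		m_pInner-&amp;gt;AddRef();
	}
	~CTimestampWrapper()
	{
		m_pInner-&amp;gt;Release();
	}

	// IUnknown
	STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
	{
		if ((riid == IID_IUnknown) || (riid == IID_IMediaSample))
		{
			*ppv = static_cast&amp;lt;IMediaSample*&amp;gt;(this);
			AddRef();
			return S_OK;
		}
		*ppv = NULL;
		return E_NOINTERFACE;
	}
	STDMETHODIMP_(ULONG) AddRef()
	{
		return InterlockedIncrement(&amp;amp;m_cRef);
	}
	STDMETHODIMP_(ULONG) Release()
	{
		LONG c = InterlockedDecrement(&amp;amp;m_cRef);
		if (c == 0) delete this;
		return c;
	}

	// timestamps are private: changing them does not affect the original
	STDMETHODIMP GetTime(REFERENCE_TIME* pStart, REFERENCE_TIME* pStop)
	{
		if (!m_bTimeSet) return VFW_E_SAMPLE_TIME_NOT_SET;
		*pStart = m_tStart;
		*pStop = m_tStop;
		return S_OK;
	}
	STDMETHODIMP SetTime(REFERENCE_TIME* pStart, REFERENCE_TIME* pStop)
	{
		m_bTimeSet = (pStart != NULL);
		if (pStart) m_tStart = *pStart;
		if (pStop) m_tStop = *pStop;
		return S_OK;
	}

	// everything else forwards to the original (same physical buffer)
	STDMETHODIMP GetPointer(BYTE** pp) { return m_pInner-&amp;gt;GetPointer(pp); }
	STDMETHODIMP_(LONG) GetSize() { return m_pInner-&amp;gt;GetSize(); }
	STDMETHODIMP IsSyncPoint() { return m_pInner-&amp;gt;IsSyncPoint(); }
	STDMETHODIMP SetSyncPoint(BOOL b) { return m_pInner-&amp;gt;SetSyncPoint(b); }
	STDMETHODIMP IsPreroll() { return m_pInner-&amp;gt;IsPreroll(); }
	STDMETHODIMP SetPreroll(BOOL b) { return m_pInner-&amp;gt;SetPreroll(b); }
	STDMETHODIMP_(LONG) GetActualDataLength() { return m_pInner-&amp;gt;GetActualDataLength(); }
	STDMETHODIMP SetActualDataLength(LONG len) { return m_pInner-&amp;gt;SetActualDataLength(len); }
	STDMETHODIMP GetMediaType(AM_MEDIA_TYPE** pp) { return m_pInner-&amp;gt;GetMediaType(pp); }
	STDMETHODIMP SetMediaType(AM_MEDIA_TYPE* p) { return m_pInner-&amp;gt;SetMediaType(p); }
	STDMETHODIMP IsDiscontinuity() { return m_pInner-&amp;gt;IsDiscontinuity(); }
	STDMETHODIMP SetDiscontinuity(BOOL b) { return m_pInner-&amp;gt;SetDiscontinuity(b); }
	STDMETHODIMP GetMediaTime(LONGLONG* pStart, LONGLONG* pStop) { return m_pInner-&amp;gt;GetMediaTime(pStart, pStop); }
	STDMETHODIMP SetMediaTime(LONGLONG* pStart, LONGLONG* pStop) { return m_pInner-&amp;gt;SetMediaTime(pStart, pStop); }
};
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A real implementation would also need to decide how such wrappers are pooled and handed out, which is part of why this remains a possible future feature rather than a drop-in fix.&lt;/p&gt;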

&lt;h2 id=&quot;looser-coupling&quot;&gt;Looser Coupling&lt;/h2&gt;

&lt;p&gt;As I mentioned earlier, GMFBridge was designed to show that you can divide tasks into multiple graphs without any loss of efficiency. To achieve that, it keeps the graphs very tightly coupled, without thread switches or buffer copies.&lt;/p&gt;

&lt;p&gt;There are many cases where this is not the best route. For example, if you need to send the video frames to multiple outputs, you may need to copy the samples as I have discussed above, and as a result you will get the complexity of using a single thread in multiple graphs without most of the efficiency gains.&lt;/p&gt;

&lt;p&gt;In these cases, a more loosely coupled approach would be more appropriate. For example, the bridge sink filters can place the frames into a pool of buffers, from which multiple render or recording graphs can copy the frames. This allows the benefits of a multiple graph architecture, together with one-to-many delivery of frames, and avoids many of the complications of GMFBridge. The cost is a copy of the frame for each output, and a separate thread for each output.&lt;/p&gt;

&lt;p&gt;I’ve developed solutions like this for clients and found that the greater flexibility and simplicity easily outweighs the downside of a few memory copies. &lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Using Filters Without Registration</title>
      <link>http://www.gdcl.co.uk/2011/June/UnregisteredFilters.htm</link>
      <pubDate>Tue, 07 Jun 2011 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2011/June/UnregisteredFilters.htm</guid>
      <description>&lt;h1 id=&quot;using-filters-without-registration&quot;&gt;Using Filters Without Registration&lt;/h1&gt;

&lt;p&gt;Sometimes it’s useful to use filters without COM registration. COM registration requires admin rights, and allows other apps to use the filters. You can bypass this and use the filters directly, if you create them yourself and insert them into the graph.&lt;/p&gt;

&lt;h2 id=&quot;create-using-new&quot;&gt;Create using &lt;em&gt;new&lt;/em&gt;&lt;/h2&gt;

&lt;p&gt;There are two ways to create a filter yourself. If you have the source to the filter, or a private API, you can just construct the filter with &lt;em&gt;new&lt;/em&gt;. The important thing to remember is that you need to &lt;code&gt;AddRef()&lt;/code&gt; the filter immediately, and then treat it like a normal COM object. When you are finished with it, &lt;code&gt;Release()&lt;/code&gt; it rather than deleting it. If you use a smart pointer, this will all be taken care of for you.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#include &amp;lt;comdef.h&amp;gt;
_COM_SMARTPTR_TYPEDEF(IBaseFilter, __uuidof(IBaseFilter));

// the smart pointer AddRefs the newly-constructed filter (refcount 0 to 1)
// and Releases it automatically when the pointer goes out of scope
IBaseFilterPtr pFilter = new CMyFilterClass();
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;create-using-a-private-cocreateinstance&quot;&gt;Create using a private &lt;em&gt;CoCreateInstance&lt;/em&gt;&lt;/h2&gt;

&lt;p&gt;Of course, in many cases, you will only have the filter DLL. Since the DLL is not registered, there is no way of mapping the filter’s CLSID to the DLL that contains it, so you can’t call &lt;em&gt;CoCreateInstance&lt;/em&gt;. But if you &lt;strong&gt;know&lt;/strong&gt; which DLL the filter is in, you can bypass that lookup and simply instantiate the filter exactly as &lt;em&gt;CoCreateInstance&lt;/em&gt; does.&lt;/p&gt;

&lt;p&gt;This function will create an &lt;em&gt;in-proc&lt;/em&gt; server for a COM object, given the filename and the CLSID.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// define the prototype of the class factory entry point in a COM dll
typedef HRESULT (STDAPICALLTYPE* FN_DLLGETCLASSOBJECT)(REFCLSID clsid, REFIID iid, void** ppv);

HRESULT CreateObjectFromPath(TCHAR* pPath, REFCLSID clsid, IUnknown** ppUnk)
{
	// load the target DLL directly
	HMODULE lib = LoadLibrary(pPath);
	if (!lib)
	{
		return HRESULT_FROM_WIN32(GetLastError());
	}

	// the entry point is an exported function
	FN_DLLGETCLASSOBJECT fn = (FN_DLLGETCLASSOBJECT)GetProcAddress(lib, &quot;DllGetClassObject&quot;);
	if (fn == NULL)
	{
		return HRESULT_FROM_WIN32(GetLastError());
	}

	// create a class factory
	IUnknownPtr pUnk;
	HRESULT hr = fn(clsid,  IID_IUnknown,  (void**)(IUnknown**)&amp;amp;pUnk);
	if (SUCCEEDED(hr))
	{
		IClassFactoryPtr pCF = pUnk;
		if (pCF == NULL)
		{
			hr = E_NOINTERFACE;
		}
		else
		{
			// ask the class factory to create the object
			hr = pCF-&amp;gt;CreateInstance(NULL, IID_IUnknown, (void**)ppUnk);
		}
	}

	return hr;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;graph-building-with-private-filters&quot;&gt;Graph-building with private filters&lt;/h2&gt;

&lt;p&gt;In most cases, you can use an unregistered filter simply by adding it to the graph before calling RenderFile (or any other intelligent connect or render operation). The graph manager will try all the filters in the graph before looking through the registry to find new filters. If you have a custom demux, decoder or renderer, you can use it like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;IUnknownPtr pUnk;
// note: the second parameter is the filter's CLSID, not an interface IID
HRESULT hr = CreateObjectFromPath(TEXT(&quot;c:\\path\\to\\myfilter.dll&quot;), CLSID_MyFilter, &amp;amp;pUnk);
if (SUCCEEDED(hr))
{
	IBaseFilterPtr pFilter = pUnk;
	pGraph-&amp;gt;AddFilter(pFilter, L&quot;Private Filter&quot;);
	pGraph-&amp;gt;RenderFile(pMediaClip, NULL);
}
&lt;/code&gt;&lt;/pre&gt;
</description>
    </item>
    
    <item>
      <title>Roll-up of Recent Bug Fixes</title>
      <link>http://www.gdcl.co.uk/2010/June/bridge10019.htm</link>
      <pubDate>Thu, 17 Jun 2010 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2010/June/bridge10019.htm</guid>
      <description>&lt;h1 id=&quot;gmfbridge&quot;&gt;GMFBridge&lt;/h1&gt;
&lt;h2 id=&quot;roll-up-of-recent-bug-fixes&quot;&gt;Roll-up of Recent Bug Fixes&lt;/h2&gt;

&lt;p&gt;I’ve released GMFBridge 1.0.0.19. This includes a number of bugfixes from the last few months. There are three main items:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A change to flushing on disconnect. If the graph is paused when the bridge is disconnected, we need to flush to ensure that a worker thread is not blocked inside the bridge. However, if we are at end of stream, then we know that the worker thread is not blocked and we can disconnect without flushing. This enables some single-step situations where previously the frames were being discarded by the flush.&lt;/li&gt;
  &lt;li&gt;Some DV decoders will start decoding a little before the requested start point, marking the early frames with negative timestamps. When this happens on the second or subsequent clip of a playlist, the negative timestamps prevent the two clips from being joined seamlessly. This is fixed by discarding preroll and negative timed data at a bridge change.&lt;/li&gt;
  &lt;li&gt;The GMFPreview sample now supports pause and resume of the recording without pausing the live source preview. This required a fix to the media timestamps at the bridge: these are now mapped by the same timestamp-mapping logic used for stream timestamps.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;17 June 2010&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Rollup of MP4 bugfixes</title>
      <link>http://www.gdcl.co.uk/2010/June/mp4updates.htm</link>
      <pubDate>Tue, 08 Jun 2010 00:00:00 +0100</pubDate>
      <author>geraintd@gdcl.co.uk(Geraint Davies)</author>
      <guid>http://www.gdcl.co.uk/2010/June/mp4updates.htm</guid>
      <description>&lt;h1 id=&quot;rollup-of-bug-fixes-to-mp4-mux-and-demux&quot;&gt;Rollup of bug fixes to MP4 mux and demux&lt;/h1&gt;

&lt;p&gt;Version 1.0.0.9 of the demux includes a number of fixes accumulated over the last year. These include&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Support for ‘sowt’ and ‘twos’ PCM audio in big-endian and little-endian variants in the demux&lt;/li&gt;
  &lt;li&gt;Support for files &amp;gt; 2GB&lt;/li&gt;
  &lt;li&gt;Improvement in performance for index lookups&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’ve also updated the MP4 mux to version 1.0.0.7, including a fix to H264 NAL Unit parsing and a fix for a crash on Stop if no data has been received.&lt;/p&gt;

&lt;p&gt;8 June 2010&lt;/p&gt;
</description>
    </item>
    

  </channel> 
</rss>
