Insights Into macOS Video Editing Performance
 


mac_editor
(@mac_editor)
Famed Member Moderator
Joined: 3 years ago
 

Introduction

Apple has recently made some compelling changes to its Mac hardware lineup: the new 16-inch MacBook Pro with Navi-based Radeon Pro GPUs and better cooling, and the much-anticipated Mac Pro with standard I/O expansion and performant, albeit expensive, GPUs. Final Cut Pro, Motion, and Compressor have undergone some important changes as well. The apps now use a new Metal engine, support multiple GPUs (Mac Pro only), and add in-app GPU selection. For more information, take a look at the Final Cut Pro changelog.

The aim of this analysis is to understand how Final Cut Pro attempts to utilize eGPUs in macOS. I wanted to see how the new version of Final Cut Pro behaves compared with previous versions, and hopefully make some sense of Final Cut performance. The tests use a simple project throughout. It is important to understand that these tests do not cover timeline performance, scrubbing, and the like, which are far more crucial to video editing than exporting projects. This analysis is not about that.

 

Host System

I will be testing with a 2018 15-inch MacBook Pro that has the following specifications:

  • CPU: 2.6 GHz 6-Core Intel Core i7 (i7-8850H)
  • Integrated GPU: Intel UHD 630
  • Discrete GPU: Radeon Pro 560X

 

Methodology

The goal is to observe performance as we vary the following parameters:

Parameter | Description
Version of macOS | I will be testing against both Mojave (10.14.6) and Catalina (10.15.2).
Version of Final Cut Pro | I will be using 10.3.4, 10.4.6, and 10.4.8 for testing.
External GPUs | I will be testing with a Mantiz Venus with the RX Vega 64 and RX 5700 XT.
Special Scenarios | I will try disabling GPUs, simulating eGPUs as internal GPUs, and testing multiple eGPUs.

 

I will not dive too deep into different codecs. Unless otherwise stated, all encodes are 8bpc. A missing entry in any benchmark table implies that either I was unable to run the test for that specific configuration or haven't yet.

 

Test Legend

The following table describes all the tests I have performed. I use abbreviated test names for brevity, since we care more about the performance deltas than about the tests themselves.

Test | Description
HBVT H.264 | 4m 22s 1080p 30fps Handbrake encode using H.264 VideoToolbox at default settings.
HBVT H.265 | 4m 22s 1080p 30fps Handbrake encode using H.265 VideoToolbox at default settings.
CMPR H.264 | 4m 22s 1080p 30fps Compressor encode using H.264 at default settings except bitrate=6000kbps.
CMPR H.265 | 4m 22s 1080p 30fps Compressor encode using HEVC at default settings with encoder set to fast.
FCP.8 OPFL | Final Cut Pro 10.4.8 background render of 2m 16s 1080p24 video in 4K60 timeline with optical flow.
FCP.8 4K60 | Final Cut Pro 10.4.8, export 2m 16s 1080p24 video in 4K timeline with optical flow using H.264.
FCP.8 4EFX | Final Cut Pro 10.4.8, export 2m 16s 1080p24 video in 4K timeline with 4 effects + optical flow using H.264.
FCP.8 6EFX | Final Cut Pro 10.4.8, export 2m 16s 1080p24 video in 4K timeline with 6 effects using H.264.
FCP.8 PRHQ | Final Cut Pro 10.4.8, export 2m 16s 1080p24 video in 4K timeline with effects using ProRes 422 HQ.
FCP.6 6EFX | Final Cut Pro 10.4.6, export 2m 16s 1080p24 video in 4K timeline with 6 effects using H.264.
FCP.6 PRHQ | Final Cut Pro 10.4.6, export 2m 16s 1080p24 video in 4K timeline with effects using ProRes 422 HQ.

 

The performance legend will be as follows:

  • Green represents fastest performance across a particular test horizontally in benchmark tables.
  • Yellow represents next-best performance across a particular test horizontally in benchmark tables.
  • For times, lower is better, and for fps, higher is better.
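The color rule above can be written down as a small helper. This is only a sketch: `to_seconds` and `rank_row` are hypothetical names introduced here for illustration, and the sample row is the FCP.8 OPFL line from the Catalina results. Entries such as `-` or `TODO` are treated as missing.

```python
import re

def to_seconds(text):
    """Convert a duration such as '2m 58s' or '53s' to seconds; '-' or 'TODO' gives None."""
    match = re.fullmatch(r"(?:(\d+)m)?\s*(?:(\d+)s)?", text.strip())
    if not match or not any(match.groups()):
        return None
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return 60 * minutes + seconds

def rank_row(row):
    """Return {gpu: 'green' | 'yellow' | None} for one row of times (lower is better)."""
    timed = {gpu: to_seconds(t) for gpu, t in row.items()}
    # Sort by time; ties fall back to alphabetical GPU name.
    ordered = sorted((s, gpu) for gpu, s in timed.items() if s is not None)
    colors = {gpu: None for gpu in row}
    if ordered:
        colors[ordered[0][1]] = "green"
    if len(ordered) > 1:
        colors[ordered[1][1]] = "yellow"
    return colors

opfl = {"Radeon Pro 560X": "2m 58s", "Radeon Vega 64": "2m 50s", "Radeon 5700 XT": "2m 22s"}
print(rank_row(opfl))
```

For fps-based rows the sort order would need to be reversed, since higher is better there.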

Inferences and hypotheses regarding GPU utilization are discussed in a separate section, after all results are laid out.

 

macOS Catalina

VideoToolbox is Apple's API for encoding and decoding video. HandBrake offers VideoToolbox presets with very limited adjustment settings because of the one-size-fits-all nature of the API, while Compressor seems to allow for some more fine-tuning:

Test (10.15.2) | Intel UHD 630 | Radeon Pro 560X | Apple T2 | Radeon Vega 64 | Radeon 5700 XT
HBVT H.264 | 53s | - | - | 55s | 52s
HBVT H.265 | - | - | 52s | - | -
CMPR H.264 | 1m 22s | - | - | 1m 57s | 1m 40s
CMPR H.265 | 52s | - | - | - | -

 

The Navi GPU outperformed the Vega 64 and UHD 630 by a negligible margin in HandBrake. On the other hand, the Intel GPU beat both eGPUs by 18 to 35 seconds when transcoding in Compressor. What's more interesting is that for the HEVC encode, Compressor uses the Intel GPU instead of the T2 encoder.

Let's look at some Final Cut Pro results next:

Test (10.15.2) | Radeon Pro 560X* | Radeon Vega 64 | Radeon 5700 XT
FCP.8 4K60 | 2m 19s | 3m 11s | 2m 40s
FCP.8 OPFL | 2m 58s | 2m 50s | 2m 22s
FCP.8 4EFX | 11m 6s | 6m 38s | 6m 53s
FCP.8 4EFB | 11m 28s | 7m 47s | 8m 21s
FCP.8 6EFX | 11m 36s | 7m 4s | 7m 44s
FCP.8 PRHQ | 9m 52s | 8m 29s | 8m 46s
FCP.6 6EFX | 5m 07s | 3m 59s | 5m 14s
FCP.6 PRHQ | 5m 45s | 5m 31s | 6m 03s

 

*It is worth noting that while testing with the Radeon Pro 560X discrete GPU, in some tests, the Intel integrated GPU was also in use, but minimally. In terms of measuring outcome, it is still a valid result, as we care about the performance value of eGPU versus a system without it.

The results here are interesting, with the Vega GPU edging out the others overall. However, the really interesting results are when we look at FCP.8 vs. FCP.6, which I will address in a later section. Another interesting result is poor Navi performance on FCP.6, which is a likely indicator that Apple might be micro-optimizing Metal on Final Cut Pro, GPU by GPU, or Mac by Mac.

 

macOS Mojave

Since Navi support is absent in Mojave, I have substituted the 5700 XT with an R9 Fury, predecessor to the Vega GPUs. Additionally, with Mojave, we can test Final Cut Pro 10.3.4, which cannot run on Catalina as it contains some 32-bit components or plugins. Let's start with Compressor and Handbrake encodes:

Test (10.14.6) | Intel UHD 630 | Radeon Pro 560X | Apple T2 | Radeon Vega 64 | Radeon R9 Fury
HBVT H.264 | 55s | - | - | 55s | TODO
HBVT H.265 | - | - | 53s | - | -
CMPR H.264 | 1m 22s | - | - | 1m 38s | TODO
CMPR H.265 | 54s | - | - | - | -

 

The differences versus Catalina are negligible and within margin of error. We can safely conclude that nothing odd is going on with the VideoToolbox API itself across macOS updates. I was unable to force encoding onto the configurations left blank above. I will experiment further in the future.

Let's look at Final Cut Pro results next:

Test (10.14.6) | Radeon Pro 560X* | Radeon Vega 64* | Radeon R9 Fury
FCP.8 4K60 | 2m 18s | 2m 18s | TODO
FCP.8 OPFL | 2m 07s | 2m 03s | TODO
FCP.8 4EFX | 5m 13s | 5m 49s | TODO
FCP.8 4EFB | 5m 34s | 6m 50s | TODO
FCP.8 6EFX | 5m 31s | 5m 47s | TODO
FCP.8 PRHQ | 5m 27s | 7m 13s | TODO
FCP.6 6EFX | 4m 59s | 3m 57s | TODO
FCP.6 PRHQ | 5m 45s | 5m 27s | TODO
FCP.3 6EFX | 4m 24s | 3m 55s | TODO
FCP.3 PRHQ | 5m 00s | 4m 53s | TODO

 

*Notably, during H.264 export, the Intel iGPU was active alongside the render GPU. This means that the d/eGPU performed rendering and the iGPU encoded the final result - in parallel.
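The render-on-one-GPU, encode-on-another arrangement described in the note above has the shape of a two-stage pipeline. The toy model below is an assumption about that general shape only, not Final Cut Pro's implementation: `render_frame` and `encode_frame` are hypothetical stand-ins, and two threads stand in for the two GPUs.

```python
import queue
import threading

def render_frame(index):
    # Stand-in for effects rendering on the discrete or external GPU.
    return f"frame-{index}"

def encode_frame(frame):
    # Stand-in for H.264 encoding on the integrated GPU.
    return frame.upper()

def export(frame_count, depth=4):
    """Render and encode concurrently, handing frames over through a bounded queue."""
    handoff = queue.Queue(maxsize=depth)
    encoded = []

    def renderer():
        for index in range(frame_count):
            handoff.put(render_frame(index))  # blocks when the encoder falls behind
        handoff.put(None)  # sentinel: no more frames

    def encoder():
        while (frame := handoff.get()) is not None:
            encoded.append(encode_frame(frame))

    workers = [threading.Thread(target=renderer), threading.Thread(target=encoder)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return encoded

print(export(3))
```

In a pipeline like this, total time is governed by the slower stage, so a faster render GPU only helps until the encoder becomes the bottleneck.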

 

Mojave vs. Catalina

Let's compare performance with Final Cut Pro as we switch operating systems. The results are interesting:

Test (Pro 560X) | 10.14.6 | 10.15.2
FCP.8 4K60 | 2m 18s | 2m 19s
FCP.8 OPFL | 2m 07s | 2m 58s
FCP.8 4EFX | 5m 13s | 11m 6s
FCP.8 4EFB | 5m 34s | 11m 28s
FCP.8 6EFX | 5m 31s | 11m 36s
FCP.8 PRHQ | 5m 27s | 9m 52s
FCP.6 6EFX | 4m 59s | 5m 07s
FCP.6 PRHQ | 5m 45s | 5m 45s
FCP.3 6EFX | 4m 24s | -
FCP.3 PRHQ | 5m 00s | -

 

Note that Mojave also seemed to utilize the Intel GPU for some of the above encodes, which may have contributed to the observed gains. Even from this limited test suite, Mojave clearly fares better, in some FCP.8 tests by a factor of two. Only the ProRes export in FCP.6 was tied. Other codecs may yield other results. Let's take a look at eGPU results next:

Test (Vega 64) | 10.14.6 | 10.15.2
FCP.8 4K60 | 2m 18s | 3m 11s
FCP.8 OPFL | 2m 03s | 2m 50s
FCP.8 4EFX | 5m 49s | 6m 38s
FCP.8 4EFB | 6m 50s | 7m 47s
FCP.8 6EFX | 5m 47s | 7m 4s
FCP.8 PRHQ | 7m 13s | 8m 29s
FCP.6 6EFX | 3m 57s | 3m 59s
FCP.6 PRHQ | 5m 27s | 5m 31s
FCP.3 6EFX | 3m 55s | -
FCP.3 PRHQ | 4m 53s | -

 

Even for eGPUs, Mojave seems to be faring better. Again, I did see iGPU activity for some of the exports, which may explain the improvements. However, from what I could tell, eGPU was utilized more in Mojave than in Catalina.
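To put numbers on the two comparisons above, the times can be converted to seconds and expressed as a Catalina-to-Mojave ratio. This is a minimal sketch: the helper names `secs` and `slowdown` are made up for illustration, and the values are copied from the tables in this section (the FCP.3 rows are left out because they have no Catalina result).

```python
def secs(minutes, seconds):
    """Duration in seconds."""
    return 60 * minutes + seconds

# (test, Mojave 10.14.6, Catalina 10.15.2), copied from the tables above
PRO_560X = [
    ("FCP.8 4K60", secs(2, 18), secs(2, 19)),
    ("FCP.8 OPFL", secs(2, 7), secs(2, 58)),
    ("FCP.8 4EFX", secs(5, 13), secs(11, 6)),
    ("FCP.8 4EFB", secs(5, 34), secs(11, 28)),
    ("FCP.8 6EFX", secs(5, 31), secs(11, 36)),
    ("FCP.8 PRHQ", secs(5, 27), secs(9, 52)),
    ("FCP.6 6EFX", secs(4, 59), secs(5, 7)),
    ("FCP.6 PRHQ", secs(5, 45), secs(5, 45)),
]
VEGA_64 = [
    ("FCP.8 4K60", secs(2, 18), secs(3, 11)),
    ("FCP.8 OPFL", secs(2, 3), secs(2, 50)),
    ("FCP.8 4EFX", secs(5, 49), secs(6, 38)),
    ("FCP.8 4EFB", secs(6, 50), secs(7, 47)),
    ("FCP.8 6EFX", secs(5, 47), secs(7, 4)),
    ("FCP.8 PRHQ", secs(7, 13), secs(8, 29)),
    ("FCP.6 6EFX", secs(3, 57), secs(3, 59)),
    ("FCP.6 PRHQ", secs(5, 27), secs(5, 31)),
]

def slowdown(rows):
    """Map each test to Catalina time / Mojave time (above 1.0 means Catalina is slower)."""
    return {test: catalina / mojave for test, mojave, catalina in rows}

for gpu, rows in (("Pro 560X", PRO_560X), ("Vega 64", VEGA_64)):
    print(gpu)
    for test, ratio in slowdown(rows).items():
        print(f"  {test}: {ratio:.2f}x")
```

On these numbers, the FCP.8 effects exports on the 560X come out at roughly 2.1x, while the FCP.6 rows on both GPUs stay within a few percent of 1.0x.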

 

Final Cut vs. Final Cut

I only have two tests so far that are shared across all the Final Cut versions. I may add more in the future:

Test (Pro 560X, 10.14.6) | FCP.8 (10.4.8) | FCP.6 (10.4.6) | FCP.3 (10.3.4)
6EFX | 5m 31s | 4m 59s | 4m 24s
PRHQ | 5m 27s | 5m 45s | 5m 00s

 

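The regression in performance becomes obvious in 6EFX, where each newer version is slower than the last (PRHQ is mixed: FCP.8 beats FCP.6 here but trails FCP.3). It also says a lot about GPU utilization: the newer versions of Final Cut utilize less of the GPU. We only have FCP.8 and FCP.6 in Catalina: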

Test (Pro 560X, 10.15.2) | FCP.8 (10.4.8) | FCP.6 (10.4.6)
6EFX | 11m 36s | 5m 07s
PRHQ | 9m 52s | 5m 45s

 

The regression is more than obvious, with 10.4.6 taking less than half the time to finish 6EFX. Once again, the older version of Final Cut Pro utilized the GPU(s) much better. Let's look at eGPU results next:

Test (Vega 64, 10.14.6) | FCP.8 (10.4.8) | FCP.6 (10.4.6) | FCP.3 (10.3.4)
6EFX | 5m 47s | 3m 57s | 3m 55s
PRHQ | 7m 13s | 5m 27s | 4m 53s

 

Once again, the trend we saw with the dGPU holds. FCP.3 holds a significant advantage versus FCP.8, with FCP.6 slotting in cozily in the middle, tending towards FCP.3 performance levels. On Catalina, the situation is similar:

Test (Vega 64, 10.15.2) | FCP.8 (10.4.8) | FCP.6 (10.4.6)
6EFX | 7m 4s | 3m 59s
PRHQ | 8m 29s | 5m 31s

 

Once again, the advantage with FCP.6 is significant. Neither version utilizes the eGPU to its full potential, but the older renderer still fares significantly better than Metal.
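The version-to-version gap in this section can be checked with a few lines of arithmetic. The sketch below is illustrative: `TIMES` and `older_share` are hypothetical names, and the seconds are converted by hand from the four FCP.8 versus FCP.6 tables above.

```python
# Export times in seconds as (FCP 10.4.8, FCP 10.4.6), converted from the tables above.
TIMES = {
    ("Pro 560X", "10.14.6", "6EFX"): (331, 299),  # 5m 31s vs 4m 59s
    ("Pro 560X", "10.14.6", "PRHQ"): (327, 345),  # 5m 27s vs 5m 45s
    ("Pro 560X", "10.15.2", "6EFX"): (696, 307),  # 11m 36s vs 5m 07s
    ("Pro 560X", "10.15.2", "PRHQ"): (592, 345),  # 9m 52s vs 5m 45s
    ("Vega 64", "10.14.6", "6EFX"): (347, 237),   # 5m 47s vs 3m 57s
    ("Vega 64", "10.14.6", "PRHQ"): (433, 327),   # 7m 13s vs 5m 27s
    ("Vega 64", "10.15.2", "6EFX"): (424, 239),   # 7m 4s vs 3m 59s
    ("Vega 64", "10.15.2", "PRHQ"): (509, 331),   # 8m 29s vs 5m 31s
}

def older_share(times):
    """Fraction of the 10.4.8 export time that 10.4.6 needs (below 1.0 means 10.4.6 is faster)."""
    return {key: old / new for key, (new, old) in times.items()}

for (gpu, macos, test), share in older_share(TIMES).items():
    print(f"{gpu} on {macos}, {test}: 10.4.6 takes {share:.0%} of the 10.4.8 time")
```

With these inputs, the Catalina 6EFX export on the 560X is the only case under 50%, and the Mojave PRHQ export on the 560X is the only case where 10.4.6 is slower than 10.4.8.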

 

Special Scenarios

Todo.

 

Inferences

The above results may indicate that if one is using macOS Mojave, an eGPU has negligible value. After all, the Vega 64 is computationally over 4 times more powerful than the Radeon Pro 560X, yet we see no significant margins. The reality is more complicated, however. Timeline performance and scrubbing still benefit heavily from eGPUs. Even with my projects at 1080p, I can easily saturate the discrete GPU such that unrendered content is nearly unplayable. Background render needs to kick in for smooth playback. However, IIRC there was one YouTube video where a creator noticed worse timeline performance as well - measured by number of effects applied and unrendered clip. Since I have not tested this, no comments on that for now.

Parallelization is one of the most challenging aspects of programming and it is clear that Apple needs to beef up their Metal engine with regard to exports. At lower resolutions and with low-complexity projects, there may not be sufficient compute load to parallelize. However, the point here is that the older engine/FCPs do much better in video exports than the new one. The new Metal engine has brought improvements to other high-end codecs such as Canon C200, RED, and the like, which I don't work with as a hobbyist. For those codecs, check YouTube for comparisons.
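One way to see why a much faster GPU need not shorten exports proportionally is Amdahl's law. The sketch below is illustrative only: the GPU-bound fractions are made-up inputs, not measurements from these tests, and the 4x figure is the rough compute ratio quoted above.

```python
def amdahl_speedup(gpu_fraction, gpu_ratio):
    """Overall speedup when only gpu_fraction of the work runs gpu_ratio times faster."""
    return 1.0 / ((1.0 - gpu_fraction) + gpu_fraction / gpu_ratio)

# Hypothetical shares of export time spent on the GPU.
for gpu_fraction in (0.25, 0.50, 0.75, 0.95):
    print(f"{gpu_fraction:.0%} GPU-bound -> {amdahl_speedup(gpu_fraction, 4.0):.2f}x")
```

Under this model, a 4x faster GPU yields only 1.6x overall if half of the export time is GPU-bound. That would explain modest eGPU gains, but not why a newer Final Cut Pro is slower than an older one on identical hardware.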

One of the hypotheses that I came up with regarding low GPU utilization was that Metal-equipped Final Cut Pro is micro-optimizing on a Mac-by-Mac basis. @iphone4tw already showcased this in a sense by changing the Hackintosh model identifier in their tests and seeing magical performance improvements. So my hypothesis was that FCP does not parallelize efficiently for GPUs that have far more CUs than the Radeon Pro 560X. But this should not be the case either, because if we assume parallelization on the eGPU is as good as on the 560X, then by virtue of the eGPU cores being clocked far higher, the eGPU should have been generally faster in every test - which it is on Final Cut Pro 10.3.4.

What surprised me most was the performance difference between Mojave and Catalina on the discrete GPU. The deltas are unexpectedly significant. Same with eGPU, to a lesser degree. The next steps on my end are to do more tests to compare across the Final Cut Pro versions, and submit a bug report with this informal analysis linked, along with a sysdiagnose while rendering. Overall, this analysis took unexpected turns and it was astonishing to see.

This topic was modified 6 days ago

purge-wrangler | purge-nvda | set-eGPU
Insights Into macOS Video Editing Performance
2018 MacBook Pro 15" RP560X + RX 5700 XT (Mantiz Venus)

Master Threads:
2014 15-inch MacBook Pro 750M
2018 15-inch MacBook Pro


tsakal
(@tsakal)
Estimable Member
Joined: 1 year ago
 

Amazing work, thank you. Apple are the masters of micromanaging.

A. 2.7 GHz I7 4 Cores, 16Gb, 1TB MBP 13 2018 TB3 , EGPU Razer Core X, Nitro+ RX5700 xt 8Gb, Samsung 4K U28E590,
Mac OS Catalina 10.15.2, Ext SSD Windows 10 1803

B. 2.7 GHz I7 4 Cores, 16Gb, 1TB MBP 13 2018 TB3 , EGPU Gigabyte Gaming Box RX580 8Gb, Mac OS Catalina 10.15.2, Ext SSD Windows 10 1803

C. 3.1 GHz I7, 16Gb, 1TB MBP 13 2015 TB2 , EGPU Gigabyte Gaming Box RX580 8Gb


(@massimo_franzese)
Trusted Member
Joined: 1 month ago
 

First, a question: did you have background rendering active or not in your export tests? That is the way Final Cut is designed to work, and H.264 exports will encode from ProRes 422 render files, if you have them, with limited rework. ProRes 422 HQ will always re-render unless 422 HQ is your timeline render default.

It seems that Catalina, which is at x.2, does not perform as well as Mojave at x.6. I would think this is a normal issue with upgrades, and Catalina has a lot of security features that no doubt can slow down the whole setup.

It would also seem that Final Cut 10.4.8 is not optimised for Catalina as much as it is for Mojave, which makes you wonder how the app and OS development coordination happens, or indeed does not.

I do not work with HD; I work in 4K. My scenario is typically a camera LUT, a look LUT, a color wheel and 1-2 other effects, plus of course transitions, speed ramps, etc. I export to ProRes 422 HQ because there are many reports of issues using H.264 in Final Cut and Compressor, including pixelation and posterisation in transitions and more, so I identify with the ProRes HQ scenario. However, I only have Macs with no discrete GPU (MBP 13 and Mini 2018), so the benefit of the eGPU for me is that without it I can't even play back my 4K files in the timeline in any version of FCP.

Catalina is due a 10.15.3 soon. I am not aware of imminent Final Cut builds, but I would hope they start working better with Catalina.

Apple is not in the business of selling eGPUs, and if you have a discrete GPU on board you will also have an integrated one on the chip. Your testing proves the value of an eGPU is mostly for units that do not have a discrete GPU on board. This is not surprising to me, but if you invested in an eGPU, have a unit with a discrete GPU on board, and make no use of background render, you may be annoyed with all of this.

We also have to be cognisant that a typical Final Cut Pro user is working on complex timelines and demanding files and spends far more time in the edit than anything else. Waiting a few hours for your encode is not a big issue in the scheme of things. I encode projects in HEVC and they compute at 1 fps on average in HandBrake. I tried Compressor and there were so many artifacts that I asked Apple for a refund and got it.

Mac Mini 2018 3.2 Ghz 6 cores
Razer X enclosure with Sapphire Vega 64 Nitro
Benq PD2720U


mac_editor
(@mac_editor)
Famed Member Moderator
Joined: 3 years ago
 
Posted by: @massimo_franzese

First, a question: did you have background rendering active or not in your export tests? That is the way Final Cut is designed to work, and H.264 exports will encode from ProRes 422 render files, if you have them, with limited rework.

It’s not active for exports in these tests, but that’s not the point. The point is lower GPU utilization, period. Even if you have background rendering enabled, that itself will be slower than in older versions of Final Cut Pro. The only thing that happens when you disable background rendering and export is that the video is rendered and encoded simultaneously. So if I’m seeing lower utilization in this configuration, then for background rendering, utilization will be even lower.

Posted by: @massimo_franzese

It seems that Catalina, which is at x.2, does not perform as well as Mojave at x.6. I would think this is a normal issue with upgrades, and Catalina has a lot of security features that no doubt can slow down the whole setup.

Security features don’t slow down the system. Apple software teams generally have strict policies about performance (I know since I’ve spoken to Apple engineers, know one, and am a software engineer elsewhere) and exclude any features that impact performance (the Safari team especially). Ironically, they have faltered a bit with Final Cut Pro/Catalina. Sure, maybe updates to Catalina can fix the issue - which is the point - to bring this up so that they fix the issue.

Posted by: @massimo_franzese

Apple is not in the business of selling eGPUs, and if you have a discrete GPU on board you will also have an integrated one on the chip. Your testing proves the value of an eGPU is mostly for units that do not have a discrete GPU on board. This is not surprising to me, but if you invested in an eGPU, have a unit with a discrete GPU on board, and make no use of background render, you may be annoyed with all of this.

So? They advertise native support and even allow for GPU selection. I’ve been using Final Cut Pro for the last 5 years on Macs with discrete GPUs, and the performance gains with an eGPU (which was easily doable before High Sierra as well) were meaningful at the time (used an RX 480 - same delta versus older dGPU). I’m not asking for a 4x more powerful GPU to give me 4x gains. I know firsthand that parallel programming is a challenge. But a performance regression with the new engine is absurd.

My testing proves that older versions of FCP utilized the eGPU better. Simple as that. Obviously an iGPU-only Mac will benefit from an eGPU more than a Mac with discrete graphics by virtue of the overwhelmingly massive compute difference. Note that the Vega 64 is still 4x more powerful than a 560X. Clearly Apple is not taking advantage of that (when they can and do in the Mac Pro - hence the argument about micromanaging).

Also, modern iMacs with AMD GPUs don’t have iGPUs (completely disabled and cannot be muxed to) so no - having a discrete GPU does not imply iGPU. Anyway, the iGPU was never used in Catalina with eGPU but was in Mojave. Fact remains that with the same hardware, older FCP is quicker. I also mentioned that timeline performance is still a big plus of eGPUs (which is why I use one).

Posted by: @massimo_franzese

We also have to be cognisant that a typical Final Cut Pro user is working on complex timelines and demanding files and spends far more time in the edit than anything else. Waiting a few hours for your encode is not a big issue in the scheme of things.

I already said this in my analysis. I am only looking at performance differences. I still use FCP even if it is slower than before, simply because I like the UI (and it’s still mostly faster than other editors) and the editing experience (timeline) is great. The point is that if background rendering is slower than before (it is, for the reason explained above), then with more complex projects, one is more likely to hit GPU limits. At least when I edit, I can be in a situation where there is a lot of background rendering left, and as I add more and more things, eventually it piles up. Then everything rests on the GPU to play back the unrendered timeline smoothly. Here the eGPU is massively beneficial, and a dGPU (even this one) will choke.

Also note that whenever you are scrubbing or playing back video (not just in timeline, but also library), background rendering does nothing. ‘Background rendering’ only occurs when you do absolutely nothing in FCP (which I don’t when I’m editing).

Posted by: @massimo_franzese

I encode projects in HEVC and they compute at 1 fps on average in HandBrake.

VideoToolbox presets will do 150fps or thereabouts - of course, they are handicapped in terms of tuning.

This post was modified 6 days ago

(@massimo_franzese)
Trusted Member
Joined: 1 month ago
 

In any productive process, upgrades are a risk. I tend to upgrade my Mini almost immediately and keep my MBP on an older release because of legacy issues in photo editing. So I kept my laptop on Mojave for a while. I then upgraded the Mini, ran it for 2-3 weeks and experienced no visible problem, so I upgraded the MacBook. Then I realised that I had a bunch of 32-bit applications no longer working, but I had already abandoned them (MPEG Streamclip and ffmpegX, for example), so I was not too bothered. To be honest, Catalina encrypting my Time Machine disk over a week bothers me more.

If you use a professional application, you treat both OS and application updates carefully, especially as once your libraries go forward they don't go back. Final Cut Pro does allow you to run both 10.4.6 and 10.4.8, so you can keep going with that too if you wish and wait for better times. Catalina, instead, requires a full Time Machine rollback of your productive machine and is a right PITA.

I would recommend that if you submit something to Apple for Final Cut, you take the OS out of the equation, otherwise things can get mixed up easily. Your tests show that performance of 10.4.8 is consistently worse, so that is a point worth raising for all those features that Apple says today would benefit from an eGPU (live tasks are not one). Apple is already getting tickets for features that have regressed and do not relate to eGPUs; playback, for example, is rubbish.

(@megaseppl)
Eminent Member
Joined: 1 year ago
 

One little question: How did you change the GPU/UHD 630/T2 in Handbrake? I can't find any option for this.

Mac mini 2018 i7, Razer Core X with PowerColor Red Devil 5700 XT


mac_editor
(@mac_editor)
Famed Member Moderator
Joined: 3 years ago
 

@megaseppl

There is no option for that. When you use VideoToolbox presets, it decides the GPU automatically. For example, for H.264, it chose the eGPU if one was connected.

(@megaseppl)
Eminent Member
Joined: 1 year ago
 

@mac_editor

Thank you! So H.265 with VideoToolbox is always T2 on your MacBook Pro and H.264 with VideoToolbox is encoded on your eGPU (if connected)?
Never knew that VideoToolbox is that fast... I think I may have lost days of my life because I encoded with my CPU... 😉

Just did a little test: H.264 is about 5x faster on my eGPU compared to CPU, H.265 (on T2?) about 10x faster than on CPU.

mac_editor
(@mac_editor)
Famed Member Moderator
Joined: 3 years ago
 

@megaseppl

VideoToolbox (VT) is generally very fast but does not allow for much configuration and typically yields large file sizes. I prefer doing CPU encodes for quality reasons - VT does not allow for constant quality settings (which I prefer).

I believe H.265 VT uses the T2 chip because I can't imagine CPU encoding being that fast - even on the ultrafast preset, it'll max out at around 50-70fps, while VT does 150fps - plus the fact that no GPU was active during this encode (although CPU usage was still very high).

I also tried some tricks to force GPU selection in VT (such as for H.265 to compare), but so far failed.
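The 150fps figure can be cross-checked against the HBVT H.265 results in the first post. This is a rough sketch that assumes the 4m 22s test clip runs at a constant 30 fps; `encode_fps` is a hypothetical helper name.

```python
def encode_fps(clip_seconds, source_fps, encode_seconds):
    """Average encode throughput in frames per second."""
    return clip_seconds * source_fps / encode_seconds

CLIP_SECONDS = 4 * 60 + 22  # the 4m 22s test clip
SOURCE_FPS = 30             # 1080p 30fps source (assumed constant frame rate)

# Wall-clock encode times for HBVT H.265 from the benchmark tables.
for label, wall in (("H.265 on Catalina", 52), ("H.265 on Mojave", 53)):
    print(f"{label}: {encode_fps(CLIP_SECONDS, SOURCE_FPS, wall):.0f} fps")
```

This works out to about 151 fps on Catalina and 148 fps on Mojave, consistent with the figure quoted above.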

This post was modified 5 days ago
