POSTED: Jun 20, 2018
TAGS: Developer Services, Linker, Toolchain
When developing console games, linking is an unavoidable part of every iteration as changes are made and bugs are fixed. Whilst a C++ file only needs to be re-compiled when it is modified, all of the intermediate files are re-linked in every build to generate the final executable. Over the course of the development life cycle this can mean re-linking a game many thousands of times before it finally reaches the consumer. Reducing the time each link takes therefore has a significant impact on the game's development process.
Typically, a large proportion of file input/output (I/O) when linking comes from reading into memory all the intermediate object files that constitute the game. These files can vary significantly in both their number and size depending on the game. In this experiment we look at three different storage technologies (SATA HDD, SATA SSD and PCI Express SSD) and four games of varying size to see how these impact link time performance.
Additionally, consideration was given to the impact of the Windows file system I/O cache being populated (warm) (*) against the cache being depleted (cold) prior to linking. All subsequent timings were taken immediately after the game had already been linked at least once.
(*) Note: The PC has enough physical memory to have all of the object files resident in memory.
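To check whether the same holds on another machine, one rough approach is to total the size of the link inputs and compare it with the installed RAM. The following is a minimal sketch only: the function name, the list of file extensions and the idea of walking a single build directory are assumptions for illustration, not part of the experiment.

```python
import os

def total_input_bytes(root, extensions=(".o", ".obj", ".a", ".lib")):
    """Sum the sizes of likely linker input files found under root.

    The extension list is an assumption; adjust it to match the toolchain.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(extensions):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total
```

Calling `total_input_bytes` on a build output directory gives a byte count that can be compared with physical memory; if the inputs are a small fraction of RAM, they have a reasonable chance of staying resident in the cache between links.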
The Storage Devices Used
- SATA HDD
  - Model: Toshiba HDWE140 (X300)
  - Size: 4TB (128MB data buffer)
  - Speed: 7200 RPM
  - Interface: 6Gb/s SATA 3.0
- SATA SSD
  - Model: Kingston SUV400S37480G (SSDNow UV400)
  - Size: 480GB
  - Interface: 6Gb/s SATA 3.0
- PCI Express SSD
  - Model: Intel NVMe SSDPED1D28 (Optane SSD 900)
  - Size: 280GB
  - Interface: PCI Express NVMe 3.0 x4
All drives were formatted with NTFS using the default allocation unit size.
The PC Used
- Motherboard: Supermicro X10DAi
- CPU: Intel Xeon E5-2630 v3 @2.4GHz
- OS: Windows 10 64-bit
- RAM: 256GB – 8x32GB DDR4 PC4-17000 LRDIMM ECC (Samsung M386A4G40DM0-CPB00)
- I/O Controller: Intel C612 – 10x 6Gb/s SATA 3.0
Cold File I/O Cache vs. Warm I/O Cache
In this experiment we used the readily available Sysinternals RAMMap tool to clear the Windows file system I/O cache prior to each cold cache timing test.
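A cold/warm timing pair of this kind can be scripted. The sketch below is one possible harness, not the one used for these measurements: the function names are hypothetical, and both the link command and the cache-clearing command are supplied by the caller, since the exact command lines are not given here. It also assumes the machine has enough RAM for the inputs to remain cached after the first link.

```python
import subprocess
import time

def time_command(command):
    """Run command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start

def time_cold_and_warm(link_command, clear_cache_command):
    """Return (cold_seconds, warm_seconds) for one link command.

    clear_cache_command is whatever empties the file system I/O cache on the
    test machine (an assumption: the caller knows how to invoke it).
    """
    subprocess.run(clear_cache_command, check=True)
    cold = time_command(link_command)
    # The cold run has just read every input, so the cache should now be warm.
    warm = time_command(link_command)
    return cold, warm
```

Repeating each measurement several times and reporting the spread would make the comparison more robust than a single pair of runs.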
Here are the results of the cold Windows file system I/O cache link times for each game and storage device:
- The SATA SSD device links approximately 2-5 times faster than the SATA HDD device.
- The PCIe SSD device links approximately twice as fast as the SATA SSD device, and 4-10 times faster than the SATA HDD device.
- The PCIe SSD device with a cold Windows file system I/O cache is close to the best case warm cache link time.
Cold File I/O Cache vs. Warm I/O Cache With Prefetching
We then repeated the same test but this time with the inclusion of the “--prefetch” linker switch. The “--prefetch” switch causes the linker to create an extra thread which proactively populates the Windows file system I/O cache with the link input files before they are required for processing by the main thread.
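The idea behind such a prefetch thread can be illustrated with a short sketch. This is not the linker's implementation, which is not described here; it only shows the general technique of reading files on a background thread and discarding the data so that the operating system caches their contents. The function names and the chunk size are assumptions.

```python
import threading

def prefetch_files(paths, chunk_size=1 << 20):
    """Read every file once and discard the data so the OS caches it.

    Returns the number of bytes read.
    """
    total = 0
    for path in paths:
        try:
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
        except OSError:
            # A missing or unreadable input is left for the main thread to report.
            continue
    return total

def start_prefetch(paths):
    """Start prefetching on a background thread and return the thread."""
    worker = threading.Thread(
        target=prefetch_files, args=(list(paths),), daemon=True
    )
    worker.start()
    return worker
```

The main thread would call `start_prefetch(inputs)` and then process the inputs in the same order. Reads that find their data already cached avoid waiting on the storage device, which is consistent with the benefit being largest on the slowest drive.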
Here are the results of this test:
Comparing the regular link times against link times with prefetching enabled:
- The SATA SSD device links approximately 2-4 times faster than the SATA HDD device.
- The PCIe SSD device links approximately twice as fast as the SATA SSD device, and 3-7 times faster than the SATA HDD device.
- The PCIe SSD device with a cold Windows file system I/O cache is very close to the best warm cache link time.
- With a warm Windows file system I/O cache the link time for all storage devices is roughly the same with or without prefetching enabled.
- For all cases where the Windows file system I/O cache is cold, the "--prefetch" linker option always improves performance.
- The performance gain from the "--prefetch" linker option reduces as the speed of the storage device increases. This helps to narrow the performance difference between inexpensive and expensive storage devices. We saw the following performance improvement:
  - SATA HDD: 23-35%
  - SATA SSD: 9-29%
  - PCIe SSD: 8-11%
- When the Windows file system I/O cache is warm, in general the "--prefetch" linker option makes little difference to link times.
- If budgets allow, invest in a PCIe SSD device to improve link performance. SSD devices other than the one tested that claim to offer higher performance may yield similar results, but your mileage may vary.
- The biggest performance improvement comes from keeping the Windows file system I/O cache as populated as possible. Therefore, add as much RAM to your PC as possible.
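The "--prefetch" gains above are given as percentages, while the device comparisons are given as "times faster" factors. To put the two on the same scale, the sketch below converts between them. It assumes the percentages are reductions in link time relative to the same link without prefetching; the exact formula used is not stated, so treat this as one plausible reading. The function names are hypothetical.

```python
def percent_reduction(baseline_seconds, new_seconds):
    """Percentage by which new_seconds is shorter than baseline_seconds."""
    return 100.0 * (baseline_seconds - new_seconds) / baseline_seconds

def speedup_from_reduction(reduction_percent):
    """Convert a percentage reduction in link time into a speed-up factor."""
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - reduction_percent / 100.0)
```

Under that assumption, a 35% reduction corresponds to a factor of roughly 1.54 and an 8% reduction to roughly 1.09, both well below the 2x or greater factors seen when moving between storage devices.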